Koichiro Den [Sun, 22 Oct 2017 04:13:16 +0000 (13:13 +0900)]
tcp: do tcp_mstamp_refresh before retransmits on TSQ handler
When retransmission on TSQ handler was introduced in the commit f9616c35a0d7 ("tcp: implement TSQ for retransmits"), the retransmitted
skbs' timestamps were updated on the actual transmission. In the later
commit 385e20706fac ("tcp: use tp->tcp_mstamp in output path"), it stops
being done so. In the commit, the comment says "We try to refresh
tp->tcp_mstamp only when necessary", and at present tcp_tsq_handler and
tcp_v4_mtu_reduced applies to this. About the latter, it's okay since
it's rare enough.
About the former, even though possible retransmissions on the tasklet
comes just after the destructor run in NET_RX softirq handling, the time
between them could be nonnegligibly large to the extent that
tcp_rack_advance or rto rearming be affected if other (remaining) RX,
BLOCK and (preceding) TASKLET sofirq handlings are unexpectedly heavy.
So in the same way as tcp_write_timer_handler does, doing tcp_mstamp_refresh
ensures the accuracy of algorithms relying on it.
Fixes: 385e20706fac ("tcp: use tp->tcp_mstamp in output path") Signed-off-by: Koichiro Den <den@klaipeden.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"A little more than usual this time around. Been travelling, so that is
part of it.
Anyways, here are the highlights:
1) Deal with memcontrol races wrt. listener dismantle, from Eric
Dumazet.
2) Handle page allocation failures properly in nfp driver, from Jaku
Kicinski.
3) Fix memory leaks in macsec, from Sabrina Dubroca.
4) Fix crashes in pppol2tp_session_ioctl(), from Guillaume Nault.
5) Several fixes in bnxt_en driver, including preventing potential
NVRAM parameter corruption from Michael Chan.
6) Fix for KRACK attacks in wireless, from Johannes Berg.
7) rtnetlink event generation fixes from Xin Long.
8) Deadlock in mlxsw driver, from Ido Schimmel.
9) Disallow arithmetic operations on context pointers in bpf, from
Jakub Kicinski.
10) Missing sock_owned_by_user() check in sctp_icmp_redirect(), from
Xin Long.
11) Only TCP is supported for sockmap, make that explicit with a
check, from John Fastabend.
12) Fix IP options state races in DCCP and TCP, from Eric Dumazet.
13) Fix panic in packet_getsockopt(), also from Eric Dumazet.
14) Add missing locked in hv_sock layer, from Dexuan Cui.
15) Various aquantia bug fixes, including several statistics handling
cures. From Igor Russkikh et al.
16) Fix arithmetic overflow in devmap code, from John Fastabend.
17) Fix busted socket memory accounting when we get a fault in the tcp
zero copy paths. From Willem de Bruijn.
18) Don't leave opt->tot_len uninitialized in ipv6, from Eric Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
ipv6: flowlabel: do not leave opt->tot_len with garbage
of_mdio: Fix broken PHY IRQ in case of probe deferral
textsearch: fix typos in library helpers
rxrpc: Don't release call mutex on error pointer
net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
net: stmmac: Fix stmmac_get_rx_hwtstamp()
net: stmmac: Add missing call to dev_kfree_skb()
mlxsw: spectrum_router: Configure TIGCR on init
mlxsw: reg: Add Tunneling IPinIP General Configuration Register
net: ethtool: remove error check for legacy setting transceiver type
soreuseport: fix initialization race
net: bridge: fix returning of vlan range op errors
sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
bpf: add test cases to bpf selftests to cover all access tests
bpf: fix pattern matches for direct packet access
bpf: fix off by one for range markings with L{T, E} patterns
bpf: devmap fix arithmetic overflow in bitmap_size calculation
net: aquantia: Bad udp rate on default interrupt coalescing
net: aquantia: Enable coalescing management via ethtool interface
...
Bernd Edlinger [Sat, 21 Oct 2017 06:51:30 +0000 (06:51 +0000)]
stmmac: Don't access tx_q->dirty_tx before netif_tx_lock
This is the possible reason for different hard to reproduce
problems on my ARMv7-SMP test system.
The symptoms are in recent kernels imprecise external aborts,
and in older kernels various kinds of network stalls and
unexpected page allocation failures.
My testing indicates that the trouble started between v4.5 and v4.6
and prevails up to v4.14.
Using the dirty_tx before acquiring the spin lock is clearly
wrong and was first introduced with v4.6.
Fixes: e3ad57c96715 ("stmmac: review RX/TX ring management") Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 21 Oct 2017 19:26:23 +0000 (12:26 -0700)]
ipv6: flowlabel: do not leave opt->tot_len with garbage
When syzkaller team brought us a C repro for the crash [1] that
had been reported many times in the past, I finally could find
the root cause.
If FlowLabel info is merged by fl6_merge_options(), we leave
part of the opt_space storage provided by udp/raw/l2tp with random value
in opt_space.tot_len, unless a control message was provided at sendmsg()
time.
Then ip6_setup_cork() would use this random value to perform a kzalloc()
call. Undefined behavior and crashes.
Fix is to properly set tot_len in fl6_merge_options()
At the same time, we can also avoid consuming memory and cpu cycles
to clear it, if every option is copied via a kmemdup(). This is the
change in ip6_setup_cork().
Depending on whether the PHY driver will fall back to polling, Ethernet
may or may not work.
To fix this:
1. Switch of_mdiobus_register_phy() from irq_of_parse_and_map() to
of_irq_get().
Unlike the former, the latter returns -EPROBE_DEFER if the
interrupt controller is not yet available, so this condition can be
detected.
Other errors are handled the same as before, i.e. use the passed
mdio->irq[addr] as interrupt.
2. Propagate and handle errors from of_mdiobus_register_phy() and
of_mdiobus_register_device().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Fri, 20 Oct 2017 16:01:22 +0000 (17:01 +0100)]
rxrpc: Don't release call mutex on error pointer
Don't release call mutex at the end of rxrpc_kernel_begin_call() if the
call pointer actually holds an error value.
Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu [Fri, 20 Oct 2017 13:37:36 +0000 (14:37 +0100)]
net: stmmac: Prevent infinite loop in get_rx_timestamp_status()
Prevent infinite loop by correctly setting the loop condition to
break when i == 10.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu [Fri, 20 Oct 2017 13:37:35 +0000 (14:37 +0100)]
net: stmmac: Fix stmmac_get_rx_hwtstamp()
When using GMAC4 the valid timestamp is from CTX next desc but
we are passing the previous desc to get_rx_timestamp_status()
callback.
Fix this and while at it rework a little bit the function logic.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jose Abreu [Fri, 20 Oct 2017 13:37:34 +0000 (14:37 +0100)]
net: stmmac: Add missing call to dev_kfree_skb()
When RX HW timestamp is enabled and a frame is discarded we are
not freeing the skb but instead only setting to NULL the entry.
Add a call to dev_kfree_skb_any() so that skb entry is correctly
freed.
Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 22 Oct 2017 01:46:39 +0000 (21:46 -0400)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
- joydev now implements a blacklist to avoid creating joystick nodes
for accelerometers found in composite devices such as PlaStation
controllers
- assorted driver fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ims-psu - check if CDC union descriptor is sane
Input: joydev - blacklist ds3/ds4/udraw motion sensors
Input: allow matching device IDs on property bits
Input: factor out and export input_device_id matching code
Input: goodix - poll the 'buffer status' bit before reading data
Input: axp20x-pek - fix module not auto-loading for axp221 pek
Input: tca8418 - enable interrupt after it has been requested
Input: stmfts - fix setting ABS_MT_POSITION_* maximum size
Input: ti_am335x_tsc - fix incorrect step config for 5 wire touchscreen
Input: synaptics - disable kernel tracking on SMBus devices
Linus Torvalds [Sun, 22 Oct 2017 01:39:18 +0000 (21:39 -0400)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"MS_I_VERSION fixes - Mimi's fix + missing bits picked from Matthew
(his patch contained a duplicate of the fs/namespace.c fix as well,
but by that point the original fix had already been applied)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Convert fs/*/* to SB_I_VERSION
vfs: fix mounting a filesystem with i_version
David S. Miller [Sun, 22 Oct 2017 01:19:03 +0000 (02:19 +0100)]
Merge branch 'mlxsw-fixes'
Jiri Pirko says:
====================
mlxsw: spectrum: Configure TTL of "inherit" for offloaded tunnels
Petr says:
Currently mlxsw only offloads tunnels that are configured with TTL of "inherit"
(which is the default). However, Spectrum defaults to 255 and the driver
neglects to change the configuration. Thus the tunnel packets from offloaded
tunnels always have TTL of 255, even though tunnels with explicit TTL of 255 are
never actually offloaded.
To fix this, introduce support for TIGCR, the register that keeps the related
bits of global tunnel configuration, and use it on first offload to properly
configure inheritance of TTL of tunnel packets from overlay packets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 20 Oct 2017 07:16:16 +0000 (09:16 +0200)]
mlxsw: spectrum_router: Configure TIGCR on init
Spectrum tunnels do not default to ttl of "inherit" like the Linux ones
do. Configure TIGCR on router init so that the TTL of tunnel packets is
copied from the overlay packets.
Fixes: ee954d1a91b2 ("mlxsw: spectrum_router: Support GRE tunnels") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 20 Oct 2017 07:16:15 +0000 (09:16 +0200)]
mlxsw: reg: Add Tunneling IPinIP General Configuration Register
The TIGCR register is used for setting up the IPinIP Tunnel
configuration.
Fixes: ee954d1a91b2 ("mlxsw: spectrum_router: Support GRE tunnels") Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Söderlund [Thu, 19 Oct 2017 23:32:08 +0000 (01:32 +0200)]
net: ethtool: remove error check for legacy setting transceiver type
Commit 9cab88726929605 ("net: ethtool: Add back transceiver type")
restores the transceiver type to struct ethtool_link_settings and
convert_link_ksettings_to_legacy_settings() but forgets to remove the
error check for the same in convert_legacy_settings_to_link_ksettings().
This prevents older versions of ethtool to change link settings.
# ethtool --version
ethtool version 3.16
# ethtool -s eth0 autoneg on speed 100 duplex full
Cannot set new settings: Invalid argument
not setting speed
not setting duplex
not setting autoneg
While newer versions of ethtool works.
# ethtool --version
ethtool version 4.10
# ethtool -s eth0 autoneg on speed 100 duplex full
[ 57.703268] sh-eth ee700000.ethernet eth0: Link is Down
[ 59.618227] sh-eth ee700000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
Fixes: 19cab88726929605 ("net: ethtool: Add back transceiver type") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reported-by: Renjith R V <renjith.rv@quest-global.com> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Craig Gallek [Thu, 19 Oct 2017 19:00:29 +0000 (15:00 -0400)]
soreuseport: fix initialization race
Syzkaller stumbled upon a way to trigger
WARNING: CPU: 1 PID: 13881 at net/core/sock_reuseport.c:41
reuseport_alloc+0x306/0x3b0 net/core/sock_reuseport.c:39
There are two initialization paths for the sock_reuseport structure in a
socket: Through the udp/tcp bind paths of SO_REUSEPORT sockets or through
SO_ATTACH_REUSEPORT_[CE]BPF before bind. The existing implementation
assumedthat the socket lock protected both of these paths when it actually
only protects the SO_ATTACH_REUSEPORT path. Syzkaller triggered this
double allocation by running these paths concurrently.
This patch moves the check for double allocation into the reuseport_alloc
function which is protected by a global spin lock.
Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection") Fixes: c125e80b8868 ("soreuseport: fast reuseport TCP socket selection") Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: fix returning of vlan range op errors
When vlan tunnels were introduced, vlan range errors got silently
dropped and instead 0 was returned always. Restore the previous
behaviour and return errors to user-space.
Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Thu, 19 Oct 2017 16:40:39 +0000 (12:40 -0400)]
sock: correct sk_wmem_queued accounting on efault in tcp zerocopy
Syzkaller hits WARN_ON(sk->sk_wmem_queued) in sk_stream_kill_queues
after triggering an EFAULT in __zerocopy_sg_from_iter.
On this error, skb_zerocopy_stream_iter resets the skb to its state
before the operation with __pskb_trim. It cannot kfree_skb like
datagram callers, as the skb may have data from a previous send call.
__pskb_trim calls skb_condense for unowned skbs, which adjusts their
truesize. These tcp skbuffs are owned and their truesize must add up
to sk_wmem_queued. But they match because their skb->sk is NULL until
tcp_transmit_skb.
Temporarily set skb->sk when calling __pskb_trim to signal that the
skbuffs are owned and avoid the skb_condense path.
Fixes: 52267790ef52 ("sock: add MSG_ZEROCOPY") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Oct 2017 23:56:10 +0000 (00:56 +0100)]
Merge branch 'bpf-range-marking-fixes'
Daniel Borkmann says:
====================
Two BPF fixes for range marking
The set contains two fixes for direct packet access range
markings and test cases for all direct packet access patterns
that the verifier matches on.
They are targeted for net tree, note that once net gets merged
into net-next, there will be a minor merge conflict due to
signature change of the function find_good_pkt_pointers() as
well as data_meta patterns present in net-next tree. You can
just add bool false to the data_meta patterns and I will
follow-up with properly converting the patterns for data_meta
in a similar way.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 21 Oct 2017 00:34:23 +0000 (02:34 +0200)]
bpf: add test cases to bpf selftests to cover all access tests
Lets add test cases to cover really all possible direct packet
access tests for good/bad access cases so we keep tracking them.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 21 Oct 2017 00:34:22 +0000 (02:34 +0200)]
bpf: fix pattern matches for direct packet access
Alexander had a test program with direct packet access, where
the access test was in the form of data + X > data_end. In an
unrelated change to the program LLVM decided to swap the branches
and emitted code for the test in form of data + X <= data_end.
We hadn't seen these being generated previously, thus verifier
would reject the program. Therefore, fix up the verifier to
detect all test cases, so we don't run into such issues in the
future.
Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier") Reported-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Sat, 21 Oct 2017 00:34:21 +0000 (02:34 +0200)]
bpf: fix off by one for range markings with L{T, E} patterns
During review I noticed that the current logic for direct packet
access marking in check_cond_jmp_op() has an off by one for the
upper right range border when marking in find_good_pkt_pointers()
with BPF_JLT and BPF_JLE. It's not really harmful given access
up to pkt_end is always safe, but we should nevertheless correct
the range marking before it becomes ABI. If pkt_data' denotes a
pkt_data derived pointer (pkt_data + X), then for pkt_data' < pkt_end
in the true branch as well as for pkt_end <= pkt_data' in the false
branch we mark the range with X although it should really be X - 1
in these cases. For example, X could be pkt_end - pkt_data, then
when testing for pkt_data' < pkt_end the verifier simulation cannot
deduce that a byte load of pkt_data' - 1 would succeed in this
branch.
Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Thu, 19 Oct 2017 16:03:52 +0000 (09:03 -0700)]
bpf: devmap fix arithmetic overflow in bitmap_size calculation
An integer overflow is possible in dev_map_bitmap_size() when
calculating the BITS_TO_LONG logic which becomes, after macro
replacement,
(((n) + (d) - 1)/ (d))
where 'n' is a __u32 and 'd' is (8 * sizeof(long)). To avoid
overflow cast to u64 before arithmetic.
Reported-by: Richard Weinberger <richard@nod.at> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 21 Oct 2017 18:32:46 +0000 (14:32 -0400)]
Merge tag 'dmaengine-fix-4.14-rc6' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fix from Vinod Koul:
"Late fix for altera driver which fixes the locking in driver"
* tag 'dmaengine-fix-4.14-rc6' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: altera: Use IRQ-safe spinlock calls in the error paths as well
Igor Russkikh [Thu, 19 Oct 2017 15:23:59 +0000 (18:23 +0300)]
net: aquantia: Bad udp rate on default interrupt coalescing
Default Tx rates cause very long ISR delays on Tx.
0xff is 510us delay, giving only ~ 2000 interrupts per seconds for
Tx rings cleanup. With these settings udp tx rate was never higher than
~800Mbps on a single stream. Changing min delay to 0xF makes it
way better with ~6Gbps
TCP stream performance is almost unaffected by this change, since LSO
optimizations play important role.
CPU load is affected insignificantly by this change.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:58 +0000 (18:23 +0300)]
net: aquantia: Enable coalescing management via ethtool interface
Aquantia NIC allows both TX and RX interrupt throttle rate (ITR)
management, but this was used in a very limited way via predefined
values. This patch allows to setup ITR default values via module
command line arguments and via standard ethtool coalescing settings.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:57 +0000 (18:23 +0300)]
net: aquantia: mmio unmap was not performed on driver removal
That may lead to mmio resource leakage.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:56 +0000 (18:23 +0300)]
net: aquantia: Limit number of MSIX irqs to the number of cpus
There is no much practical use from having MSIX vectors more that number
of cpus, thus cap this first with preconfigured limit, then with number
of cpus online.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:55 +0000 (18:23 +0300)]
net: aquantia: Fixed transient link up/down/up notification
When doing ifconfig down/up, driver did not reported carrier_off neither
in nic_stop nor in nic_start. That caused link to be visible as "up"
during couple of seconds immediately after "ifconfig up".
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:54 +0000 (18:23 +0300)]
net: aquantia: Add queue restarts stats counter
Queue stat strings are cleaned up, duplicate stat name strings removed,
queue restarts counter added
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Thu, 19 Oct 2017 15:23:53 +0000 (18:23 +0300)]
net: aquantia: Reset nic statistics on interface up/down
Internal statistics system on chip never gets reset until hardware
reboot. This is quite inconvenient in terms of ethtool statistics usage.
This patch implements incremental statistics update inside of
service callback.
Upon nic initialization, first request is done to fetch
initial stat data, current collected stat data gets cleared.
Internal statistics mailbox readout is improved to save space and
increase readability
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Matteo Croce [Thu, 19 Oct 2017 12:22:17 +0000 (14:22 +0200)]
udp: make some messages more descriptive
In the UDP code there are two leftover error messages with very few meaning.
Replace them with a more descriptive error message as some users
reported them as "strange network error".
Signed-off-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Thu, 19 Oct 2017 11:31:28 +0000 (13:31 +0200)]
geneve: Fix function matching VNI and tunnel ID on big-endian
On big-endian machines, functions converting between tunnel ID
and VNI use the three LSBs of tunnel ID storage to map VNI.
The comparison function eq_tun_id_and_vni(), on the other hand,
attempted to map the VNI from the three MSBs. Fix it by using
the same check implemented on LE, which maps VNI from the three
LSBs of tunnel ID.
Fixes: 2e0b26e10352 ("geneve: Optimize geneve device lookup.") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Jakub Sitnicki <jkbs@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 21 Oct 2017 01:30:31 +0000 (02:30 +0100)]
Merge tag 'linux-can-fixes-for-4.14-20171019' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2017-10-19
this is a pull request of 11 patches for the upcoming 4.14 release.
There are 6 patches by ZHU Yi for the flexcan driver, that work around
the CAN error handling state transition problems found in various
incarnations of the flexcan IP core.
The patch by Colin Ian King fixes a potential NULL pointer deref in the
CAN broad cast manager (bcm). One patch by me replaces a direct deref of a RCU
protected pointer by rcu_access_pointer. My second patch adds missing
OOM error handling in af_can. A patch by Stefan Mätje for the esd_usb2
driver fixes the dlc in received RTR frames. And the last patch is by
Wolfgang Grandegger, it fixes a busy loop in the gs_usb driver in case
it runs out of TX contexts.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dexuan Cui [Thu, 19 Oct 2017 03:33:14 +0000 (03:33 +0000)]
hv_sock: add locking in the open/close/release code paths
Without the patch, when hvs_open_connection() hasn't completely established
a connection (e.g. it has changed sk->sk_state to SS_CONNECTED, but hasn't
inserted the sock into the connected queue), vsock_stream_connect() may see
the sk_state change and return the connection to the userspace, and next
when the userspace closes the connection quickly, hvs_release() may not see
the connection in the connected queue; finally hvs_open_connection()
inserts the connection into the queue, but we won't be able to purge the
connection for ever.
Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Cathy Avery <cavery@redhat.com> Cc: Rolf Neugebauer <rolf.neugebauer@docker.com> Cc: Marcelo Cerri <marcelo.cerri@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 19 Oct 2017 02:43:08 +0000 (13:43 +1100)]
net/ncsi: Enforce failover on link monitor timeout
The NCSI channel has been configured to provide service if its link
monitor timer is enabled, regardless of its state (inactive or active).
So the timeout event on the link monitor indicates the out-of-service
on that channel, for which a failover is needed.
This sets NCSI_DEV_RESHUFFLE flag to enforce failover on link monitor
timeout, regardless the channel's original state (inactive or active).
Also, the link is put into "down" state to give the failing channel
lowest priority when selecting for the active channel. The state of
failing channel should be set to active in order for deinitialization
and failover to be done.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gavin Shan [Thu, 19 Oct 2017 02:43:07 +0000 (13:43 +1100)]
net/ncsi: Disable HWA mode when no channels are found
When there are no NCSI channels probed, HWA (Hardware Arbitration)
mode is enabled. It's not correct because HWA depends on the fact:
NCSI channels exist and all of them support HWA mode. This disables
HWA when no channels are probed.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/ncsi: Stop monitor if channel times out or is inactive
ncsi_channel_monitor() misses stopping the channel monitor in several
places that it should, causing a WARN_ON_ONCE() to trigger when the
monitor is re-started later, eg:
[ 459.040000] WARNING: CPU: 0 PID: 1093 at net/ncsi/ncsi-manage.c:269 ncsi_start_channel_monitor+0x7c/0x90
[ 459.040000] CPU: 0 PID: 1093 Comm: kworker/0:3 Not tainted 4.10.17-gaca2fdd #140
[ 459.040000] Hardware name: ASpeed SoC
[ 459.040000] Workqueue: events ncsi_dev_work
[ 459.040000] [<80010094>] (unwind_backtrace) from [<8000d950>] (show_stack+0x20/0x24)
[ 459.040000] [<8000d950>] (show_stack) from [<801dbf70>] (dump_stack+0x20/0x28)
[ 459.040000] [<801dbf70>] (dump_stack) from [<80018d7c>] (__warn+0xe0/0x108)
[ 459.040000] [<80018d7c>] (__warn) from [<80018e70>] (warn_slowpath_null+0x30/0x38)
[ 459.040000] [<80018e70>] (warn_slowpath_null) from [<803f6a08>] (ncsi_start_channel_monitor+0x7c/0x90)
[ 459.040000] [<803f6a08>] (ncsi_start_channel_monitor) from [<803f7664>] (ncsi_configure_channel+0xdc/0x5fc)
[ 459.040000] [<803f7664>] (ncsi_configure_channel) from [<803f8160>] (ncsi_dev_work+0xac/0x474)
[ 459.040000] [<803f8160>] (ncsi_dev_work) from [<8002d244>] (process_one_work+0x1e0/0x450)
[ 459.040000] [<8002d244>] (process_one_work) from [<8002d510>] (worker_thread+0x5c/0x570)
[ 459.040000] [<8002d510>] (worker_thread) from [<80033614>] (kthread+0x124/0x164)
[ 459.040000] [<80033614>] (kthread) from [<8000a5e8>] (ret_from_fork+0x14/0x2c)
This also updates the monitor instead of just returning if
ncsi_xmit_cmd() fails to send the get-link-status command so that the
monitor properly times out.
Fixes: e6f44ed6d04d3 "net/ncsi: Package and channel management" Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the value of the HNCDSC AEN packet. Fixes: 7a82ecf4cfb85 "net/ncsi: NCSI AEN packet handler" Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 18 Oct 2017 23:14:52 +0000 (16:14 -0700)]
packet: avoid panic in packet_getsockopt()
syzkaller got crashes in packet_getsockopt() processing
PACKET_ROLLOVER_STATS command while another thread was managing
to change po->rollover
Using RCU will fix this bug. We might later add proper RCU annotations
for sparse sake.
In v2: I replaced kfree(rollover) in fanout_add() to kfree_rcu()
variant, as spotted by John.
Fixes: a9b6391814d5 ("packet: rollover statistics") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: John Sperbeck <jsperbeck@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 20 Oct 2017 16:04:13 +0000 (09:04 -0700)]
tcp/dccp: fix ireq->opt races
syzkaller found another bug in DCCP/TCP stacks [1]
For the reasons explained in commit ce1050089c96 ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
ireq->opt unless we own the request sock.
Note the opt field is renamed to ireq_opt to ease grep games.
[1]
BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295
Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets") Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 20 Oct 2017 20:24:48 +0000 (22:24 +0200)]
Merge tag 'sunxi-fixes-for-4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into fixes
Pull "Allwinner fixes for 4.14" from Maxime Ripard:
Two fixes, one for the A31 DRM binding, and one for a missing regulator on
the pine MMC controller.
* tag 'sunxi-fixes-for-4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
ARM: dts: sun6i: Fix endpoint IDs in second display pipeline
arm64: allwinner: a64: pine64: Use dcdc1 regulator for mmc0
Kees Cook [Fri, 20 Oct 2017 14:36:05 +0000 (07:36 -0700)]
waitid(): Avoid unbalanced user_access_end() on access_ok() error
As pointed out by Linus and David, the earlier waitid() fix resulted in
a (currently harmless) unbalanced user_access_end() call. This fixes it
to just directly return EFAULT on access_ok() failure.
David S. Miller [Fri, 20 Oct 2017 12:01:30 +0000 (13:01 +0100)]
Merge branch 'sockmap-fixes'
John Fastabend says:
====================
sockmap fixes for net
The following implements a set of fixes for sockmap and changes the
API slightly in a few places to reduce preempt_disable/enable scope.
We do this here in net because it requires an API change and this
avoids getting stuck with legacy API going forward.
The short description:
Access to skb mark is removed, it is problematic when we add
features in the future because mark is a union and used by the
TCP/socket code internally. We don't want to expose this to the
BPF programs or let programs change the values.
The other change is caching metadata in the skb itself between
when the BPF program returns a redirect code and the core code
implements the redirect. This avoids having per cpu metadata.
Finally, tighten restriction on using sockmap to CAP_NET_ADMIN and
only SOCK_STREAM sockets.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 18 Oct 2017 14:11:44 +0000 (07:11 -0700)]
bpf: require CAP_NET_ADMIN when using devmap
Devmap is used with XDP which requires CAP_NET_ADMIN so lets also
make CAP_NET_ADMIN required to use the map.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 18 Oct 2017 14:11:22 +0000 (07:11 -0700)]
bpf: require CAP_NET_ADMIN when using sockmap maps
Restrict sockmap to CAP_NET_ADMIN.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 18 Oct 2017 14:10:58 +0000 (07:10 -0700)]
bpf: remove mark access for SK_SKB program types
The skb->mark field is a union with reserved_tailroom which is used
in the TCP code paths from stream memory allocation. Allowing SK_SKB
programs to set this field creates a conflict with future code
optimizations, such as "gifting" the skb to the egress path instead
of creating a new skb and doing a memcpy.
Because we do not have a released version of SK_SKB yet lets just
remove it for now. A more appropriate scratch pad to use at the
socket layer is dev_scratch, but lets add that in future kernels
when needed.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 18 Oct 2017 14:10:36 +0000 (07:10 -0700)]
bpf: avoid preempt enable/disable in sockmap using tcp_skb_cb region
SK_SKB BPF programs are run from the socket/tcp context but early in
the stack before much of the TCP metadata is needed in tcp_skb_cb. So
we can use some unused fields to place BPF metadata needed for SK_SKB
programs when implementing the redirect function.
This allows us to drop the preempt disable logic. It does however
require an API change so sk_redirect_map() has been updated to
additionally provide ctx_ptr to skb. Note, we do however continue to
disable/enable preemption around actual BPF program running to account
for map updates.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Wed, 18 Oct 2017 14:10:15 +0000 (07:10 -0700)]
bpf: enforce TCP only support for sockmap
Only TCP sockets have been tested and at the moment the state change
callback only handles TCP sockets. This adds a check to ensure that
sockets actually being added are TCP sockets.
For net-next we can consider UDP support.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 18 Oct 2017 13:37:49 +0000 (21:37 +0800)]
sctp: add the missing sock_owned_by_user check in sctp_icmp_redirect
Now sctp processes icmp redirect packet in sctp_icmp_redirect where
it calls sctp_transport_dst_check in which tp->dst can be released.
The problem is before calling sctp_transport_dst_check, it doesn't
check sock_owned_by_user, which means tp->dst could be freed while
a process is accessing it with owning the socket.
An use-after-free issue could be triggered by this.
This patch is to fix it by checking sock_owned_by_user before calling
sctp_transport_dst_check in sctp_icmp_redirect, so that it would not
release tp->dst if users still hold sock lock.
Besides, the same issue fixed in commit 45caeaa5ac0b ("dccp/tcp: fix
routing redirect race") on sctp also needs this check.
Fixes: 55be7a9c6074 ("ipv4: Add redirect support to all protocol icmp error handlers") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 20 Oct 2017 10:58:43 +0000 (06:58 -0400)]
Merge tag 'for-linus-4.14c-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A fix for the Xen pv network drivers (frontend and backend) avoiding
the network connection to become unusable due to an illegal MTU"
* tag 'for-linus-4.14c-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen-netfront, xen-netback: Use correct minimum MTU values
Linus Torvalds [Fri, 20 Oct 2017 10:38:56 +0000 (06:38 -0400)]
Merge tag 'drm-fixes-for-v4.14-rc6' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Standard fixes pull for rc6: one regression fix for amdgpu, a bunch of
nouveau fixes that I'd missed a pull req for from Ben last week, some
exynos regression fixes, and a few fixes for i915"
* tag 'drm-fixes-for-v4.14-rc6' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/fbcon: fix oops without fbdev emulation
Revert "drm/amdgpu: discard commands of killed processes"
drm/i915: Use a mask when applying WaProgramL3SqcReg1Default
drm/i915: Report -EFAULT before pwrite fast path into shmemfs
drm/i915/cnl: Fix PLL initialization for HDMI.
drm/i915/cnl: Fix PLL mapping.
drm/i915: Use bdw_ddi_translations_fdi for Broadwell
drm/i915: Fix eviction when the GGTT is idle but full
drm/i915/gvt: Fix GPU hang after reusing vGPU instance across different guest OS
drm/exynos: Clear drvdata after component unbind
drm/exynos: Fix potential NULL pointer dereference in suspend/resume paths
drm/nouveau/kms/nv50: fix oops during DP IRQ handling on non-MST boards
drm/nouveau/bsp/g92: disable by default
drm/nouveau/mmu: flush tlbs before deleting page tables
Linus Torvalds [Fri, 20 Oct 2017 10:32:26 +0000 (06:32 -0400)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A couple of bugfixes for I2C drivers.
Because the changes for the piix4 driver are larger than usual, the
patches have been in linux-next for more than a week with no reports
coming in. The rest is usual stuff"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: omap: Fix error handling for clk_get()
i2c: piix4: Disable completely the IMC during SMBUS_BLOCK_DATA
i2c: piix4: Fix SMBus port selection for AMD Family 17h chips
i2c: imx: fix misleading bus recovery debug message
i2c: imx: use IRQF_SHARED mode to request IRQ
i2c: ismt: Separate I2C block read from SMBus block read
Linus Torvalds [Fri, 20 Oct 2017 10:19:38 +0000 (06:19 -0400)]
Merge branch 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull key handling fixes from James Morris:
"This includes a fix for the capabilities code from Colin King, and a
set of further fixes for the keys subsystem. From David:
- Fix a bunch of places where kernel drivers may access revoked
user-type keys and don't do it correctly.
- Fix some ecryptfs bits.
- Fix big_key to require CONFIG_CRYPTO.
- Fix a couple of bugs in the asymmetric key type.
- Fix a race between updating and finding negative keys.
- Prevent add_key() from updating uninstantiated keys.
- Make loading of key flags and expiry time atomic when not holding
locks"
* 'fixes-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
commoncap: move assignment of fs_ns to avoid null pointer dereference
pkcs7: Prevent NULL pointer dereference, since sinfo is not always set.
KEYS: load key flags and expiry time atomically in proc_keys_show()
KEYS: Load key expiry time atomically in keyring_search_iterator()
KEYS: load key flags and expiry time atomically in key_validate()
KEYS: don't let add_key() update an uninstantiated key
KEYS: Fix race between updating and finding a negative key
KEYS: checking the input id parameters before finding asymmetric key
KEYS: Fix the wrong index when checking the existence of second id
security/keys: BIG_KEY requires CONFIG_CRYPTO
ecryptfs: fix dereference of NULL user_key_payload
fscrypt: fix dereference of NULL user_key_payload
lib/digsig: fix dereference of NULL user_key_payload
FS-Cache: fix dereference of NULL user_key_payload
KEYS: encrypted: fix dereference of NULL user_key_payload
Stefan Roese [Mon, 16 Oct 2017 06:13:53 +0000 (08:13 +0200)]
dmaengine: altera: Use IRQ-safe spinlock calls in the error paths as well
The patch edf10919 [dmaengine: altera: fix spinlock usage] missed to
change 2 occurrences of spin_unlock_bh() to spin_unlock_irqrestore().
This patch fixes this by moving to the IRQ-safe call in the error
paths as well.
Fixes: edf10919 (dmaengine: altera: fix spinlock usage) Signed-off-by: Stefan Roese <sr@denx.de> Reviewed-by: Sylvain Lesne <lesne@alse-fr.com>
[add fixes tag and fix typo in log] Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Linus Torvalds [Fri, 20 Oct 2017 02:49:21 +0000 (22:49 -0400)]
Merge tag 'pm-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"This reverts a problematic commit modifying the turbostat utility that
went in during the 4.13 cycle (Len Brown)"
* tag 'pm-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "tools/power turbostat: stop migrating, unless '-m'"
Paul E. McKenney [Thu, 19 Oct 2017 21:26:20 +0000 (14:26 -0700)]
doc: Fix RCU's docbook options
Commit 764f80798b95 ("doc: Add RCU files to docbook-generation files")
added :external: options for RCU source files in the file
Documentation/core-api/kernel-api.rst. However, this now means nothing,
so this commit removes them.
Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
membarrier: Provide register expedited private command
This introduces a "register private expedited" membarrier command which
allows eventual removal of important memory barrier constraints on the
scheduler fast-paths. It changes how the "private expedited" membarrier
command (new to 4.14) is used from user-space.
This new command allows processes to register their intent to use the
private expedited command. This affects how the expedited private
command introduced in 4.14-rc is meant to be used, and should be merged
before 4.14 final.
Processes are now required to register before using
MEMBARRIER_CMD_PRIVATE_EXPEDITED, otherwise that command returns EPERM.
This fixes a problem that arose when designing requested extensions to
sys_membarrier() to allow JITs to efficiently flush old code from
instruction caches. Several potential algorithms are much less painful
if the user register intent to use this functionality early on, for
example, before the process spawns the second thread. Registering at
this time removes the need to interrupt each and every thread in that
process at the first expedited sys_membarrier() system call.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dmitry Torokhov [Sat, 7 Oct 2017 18:07:47 +0000 (11:07 -0700)]
Input: ims-psu - check if CDC union descriptor is sane
Before trying to use CDC union descriptor, try to validate whether that it
is sane by checking that intf->altsetting->extra is big enough and that
descriptor bLength is not too big and not too small.
Introduce a device table used for blacklisting devices. We currently
blacklist the motion sensor subdevice of THQ Udraw and Sony ds3/ds4.
Signed-off-by: Roderick Colenbrander <roderick.colenbrander@sony.com>
[dtor: siwtched to blacklist built on input_device_id and using
input_match_device_id()] Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Dmitry Torokhov [Mon, 9 Oct 2017 18:09:33 +0000 (11:09 -0700)]
Input: factor out and export input_device_id matching code
Factor out and export input_match_device_id() so that modules may use it.
It will be needed by joydev to blacklist accelerometers in composite
devices.
Paul Cercueil [Fri, 13 Oct 2017 18:04:48 +0000 (11:04 -0700)]
Input: goodix - poll the 'buffer status' bit before reading data
The Goodix panel triggers an interrupt on touch events. However, its
registers will contain the valid values a short time after the
interrupt, and not when it's raised. At that moment, the 'buffer status'
bit is set.
Previously, if the 'buffer status' bit was not set when the registers
were read, the data was discarded and no input event was emitted,
causing "finger down" or "finger up" events to be missed sometimes.
This went unnoticed until v4.9, as the DesignWare I2C driver commonly
used with this driver had enough latency for that bug to never trigger
until commit 2702ea7dbec5 ("i2c: designware: wait for disable/enable only
if necessary").
Now, in the IRQ handler we will poll (with a timeout) the 'buffer status'
bit and process the data of the panel as soon as this bit gets set.
Note that the Goodix panel will send a few spurious interrupts after the
'finger up' event, in which the 'buffer status' bit will never be set.
Cc: Bastien Nocera <hadess@hadess.net> Cc: russianneuromancer@ya.ru Signed-off-by: Paul Cercueil <paul@crapouillou.net>
[hdegoede@redhat.com: Change poll loop to use jiffies,
add comment about typical poll time] Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[dtor: rearranged control flow a bit to avoid explicit goto and double
check] Reviewed-by: Bastien Nocera <hadess@hadess.net> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Hans de Goede [Thu, 19 Oct 2017 22:38:50 +0000 (15:38 -0700)]
Input: axp20x-pek - fix module not auto-loading for axp221 pek
Now that we have a platform_device_id table and multiple supported ids
we should be using MODULE_DEVICE_TABLE instead of MODULE_ALIAS.
This fixes a regression on Bay and Cherry Trail devices, where the power
button is now enumerated as an "axp221-pek" and it was impossible to
wakeup these devices from suspend since the module did not load.
Fixes: c3cc94470bd3 ("Input: axp20x-pek - add support for AXP221 PEK") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Damien Riegel [Thu, 19 Oct 2017 22:34:55 +0000 (15:34 -0700)]
Input: tca8418 - enable interrupt after it has been requested
Currently, enabling keypad interrupts is one of the first operations
done on the keypad, even before the interrupt is requested, so there is
a small time window where the keypad can fire interrupts but the driver
is not yet ready to handle them. It's fine for level interrupts because
they will be handled anyway, but not so much for edge ones.
This commit modifies and moves the function in charge of configuring the
keypad. Enabling interrupts is now the last thing done on the keypad,
and after the interrupt has been requested by the driver.
Writing to the config register was also used to determine if the device
was indeed present on the bus or not, this has been replaced by reading
the lock/event count register to keep the same functionality.
Linus Torvalds [Thu, 19 Oct 2017 20:18:58 +0000 (16:18 -0400)]
Merge branch 'parisc-4.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Three small important fixes for the parisc architecture:
- Export __cmpxchg_u64() symbol on 32bit kernel too. This unbreaks
building the kernel with ixgbe kernel module. From Guenter Roeck.
- Fix 64-bit atomic cmpxchg kernel helper function for 32-bit kernel
in LWS code for userspace. This unbreaks e.g. the 64-bit variant of
the glibc function __sync_fetch_and_add() with a 32-bit parisc
kernel. From John David Anglin, tagged for backport to v3.13+.
- Detect nonsynchronous CPU-internal cr16 cycle counters more
reliable. This avoids stalled CPU warnings by the kernel soft
lockup detector. From me, tagged for backport to v4.13+"
* 'parisc-4.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix detection of nonsynchronous cr16 cycle counters
parisc: Export __cmpxchg_u64 unconditionally
parisc: Fix double-word compare and exchange in LWS code on 32-bit kernels
Linus Torvalds [Thu, 19 Oct 2017 20:15:17 +0000 (16:15 -0400)]
Merge tag 'sound-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"We've got slightly more fixes than wished, but heading to a good
shape. Most of changes are about HD-audio fixes, one for a buggy code
that went into 4.13, and another for avoiding a crash due to buggy
BIOS.
Apart from HD-audio, a sequencer core change that is only for UP
config (which must be pretty rare nowadays), and a USB-audio quirk as
usual"
* tag 'sound-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal
ALSA: hda: Remove superfluous '-' added by printk conversion
ALSA: hda: Abort capability probe at invalid register read
ALSA: seq: Enable 'use' locking in all configurations
ALSA: usb-audio: Add native DSD support for Pro-Ject Pre Box S2 Digital
Arnd Bergmann [Thu, 19 Oct 2017 15:58:13 +0000 (17:58 +0200)]
Merge tag 'renesas-fixes-for-v4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes
Pull "Renesas ARM Based SoC Fixes for v4.14" from Simon Horman:
Add 12V regulator to backlight allowing the power supply
for the backlight to be found.
* tag 'renesas-fixes-for-v4.14' of https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
arm64: dts: salvator-common: add 12V regulator to backlight
Ulf Hansson [Fri, 6 Oct 2017 04:20:25 +0000 (06:20 +0200)]
ARM: ux500: Fix regression while init PM domains
The commit afece3ab9a36 ("PM / Domains: Add time accounting to various
genpd states") causes a boot regression for ux500.
The problem occurs when the ux500 machine code calls pm_genpd_init(), which
since the above change triggers a call to ktime_get(). More precisely,
because ux500 initializes PM domains in the init_IRQ() phase of the boot,
timekeeping has not yet been initialized.
Fix the problem by moving the initialization of the PM domains to after
timekeeping has been initialized.
Fixes: afece3ab9a36 ("PM / Domains: Add time accounting to various genpd..") Cc: Thara Gopinath <thara.gopinath@linaro.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 19 Oct 2017 15:42:30 +0000 (17:42 +0200)]
Merge tag 'v4.14-rockchip-dts64fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into fixes
Pull "Rockchip dts64 Fixes for 4.14 part 2" from Heiko Stübner:
The vqmmc voltages on rk3399 pose a risk for the chip if they
exceed 3.0V, so they got fixed to not be at 3.3V
And Arnd found a typo in the recently added iommu nodes.
* tag 'v4.14-rockchip-dts64fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
arm64: dts: rockchip: fix typo in iommu nodes
arm64: dts: rockchip: correct vqmmc voltage for rk3399 platforms
Arnd Bergmann [Thu, 19 Oct 2017 15:41:22 +0000 (17:41 +0200)]
Merge tag 'imx-fixes-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes
Pull "i.MX fixes for 4.14" from Shawn Guo:
- Fix the legacy PCI interrupt numbers for i.MX7. The numbers were
wrongly coded in an inverted order than what Reference Manual tells.
It causes problem for PCI devices using legacy interrupt.
* tag 'imx-fixes-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux:
ARM: dts: imx7d: Invert legacy PCI irq mapping
Arnd Bergmann [Thu, 19 Oct 2017 15:40:11 +0000 (17:40 +0200)]
Merge tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu into fixes
Pull "mvebu fixes for 4.14 (part 2)" from Gregory CLEMENT
Two device tree related fixes:
- One on Armada 38x using a other compatible string for I2C in order
to cover an errata.
- One for Armada 7K/8K fixing a typo on interrupt-map property for
PCIe leading to fail PME and AER root port service initialization
And the last one for the mbus fixing the window size calculation when
it exceed 32bits
* tag 'mvebu-fixes-4.14-2' of git://git.infradead.org/linux-mvebu:
bus: mbus: fix window size calculation for 4GB windows
ARM: dts: Fix I2C repeated start issue on Armada-38x
arm64: dts: marvell: fix interrupt-map property for Armada CP110 PCIe controller
Arnd Bergmann [Thu, 19 Oct 2017 15:36:08 +0000 (17:36 +0200)]
Merge tag 'at91-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91 into fixes
Fixes: second batch for 4.14:
- one DT phy address fix for the new sama5d27 som1 ek
- two DT ADC patches that were forgotten while moving to
hardware triggers for sama5d2 (iio changes already applied)
* tag 'at91-fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91:
ARM: dts: at91: sama5d2: add ADC hw trigger edge type
ARM: dts: at91: sama5d2_xplained: enable ADTRG pin
ARM: dts: at91: at91-sama5d27_som1: fix PHY ID
Arnd Bergmann [Thu, 19 Oct 2017 15:30:31 +0000 (17:30 +0200)]
Merge tag 'arm-soc/for-4.14/devicetree-fixes' of http://github.com/Broadcom/stblinux into fixes
Pull "Broadcom devicetree fixes for 4.14" from Florian Fainelli:
This pull request contains Broadcom ARM-based SoC Device Tree fixes for 4.14,
please pull the following:
- Loic fixes the console path on the Raspberry Pi 3 which was not correctly set
and would cause all sorts of confusion between the Bluetooth controller and the
kernel console
* tag 'arm-soc/for-4.14/devicetree-fixes' of http://github.com/Broadcom/stblinux:
ARM: dts: bcm283x: Fix console path on RPi3
Xin Long [Tue, 17 Oct 2017 15:26:10 +0000 (23:26 +0800)]
sctp: do not peel off an assoc from one netns to another one
Now when peeling off an association to the sock in another netns, all
transports in this assoc are not to be rehashed and keep use the old
key in hashtable.
As a transport uses sk->net as the hash key to insert into hashtable,
it would miss removing these transports from hashtable due to the new
netns when closing the sock and all transports are being freeed, then
later an use-after-free issue could be caused when looking up an asoc
and dereferencing those transports.
This is a very old issue since very beginning, ChunYu found it with
syzkaller fuzz testing with this series:
This patch is to block this call when peeling one assoc off from one
netns to another one, so that the netns of all transport would not
go out-sync with the key in hashtable.
Note that this patch didn't fix it by rehashing transports, as it's
difficult to handle the situation when the tuple is already in use
in the new netns. Besides, no one would like to peel off one assoc
to another netns, considering ipaddrs, ifaces, etc. are usually
different.
Reported-by: ChunYu Wang <chunwang@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
bpf: Fix for BPF devmap percpu allocation splat
The set fixes a splat in devmap percpu allocation when we alloc
the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
patch 1 is rather small, so if this could be routed via -net, for
example, with Tejun's Ack that would be good. Patch 3 gets rid of
remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
internals and should not be used.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 17 Oct 2017 14:55:54 +0000 (16:55 +0200)]
bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
PCPU_MIN_UNIT_SIZE is an implementation detail of the percpu
allocator. Given we support __GFP_NOWARN now, lets just let
the allocation request fail naturally instead. The two call
sites from BPF mistakenly assumed __GFP_NOWARN would work, so
no changes needed to their actual __alloc_percpu_gfp() calls
which use the flag already.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 17 Oct 2017 14:55:53 +0000 (16:55 +0200)]
bpf: fix splat for illegal devmap percpu allocation
It was reported that syzkaller was able to trigger a splat on
devmap percpu allocation due to illegal/unsupported allocation
request size passed to __alloc_percpu():
This was due to too large max_entries for the map such that we
surpassed the upper limit of PCPU_MIN_UNIT_SIZE. It's fine to
fail naturally here, so switch to __alloc_percpu_gfp() and pass
__GFP_NOWARN instead.
Fixes: 11393cc9b9be ("xdp: Add batching support to redirect map") Reported-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Shankara Pailoor <sp3485@columbia.edu> Reported-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Tue, 17 Oct 2017 14:55:52 +0000 (16:55 +0200)]
mm, percpu: add support for __GFP_NOWARN flag
Add an option for pcpu_alloc() to support __GFP_NOWARN flag.
Currently, we always throw a warning when size or alignment
is unsupported (and also dump stack on failed allocation
requests). The warning itself is harmless since we return
NULL anyway for any failed request, which callers are
required to handle anyway. However, it becomes harmful when
panic_on_warn is set.
The rationale for the WARN() in pcpu_alloc() is that it can
be tracked when larger than supported allocation requests are
made such that allocations limits can be tweaked if warranted.
This makes sense for in-kernel users, however, there are users
of pcpu allocator where allocation size is derived from user
space requests, e.g. when creating BPF maps. In these cases,
the requests should fail gracefully without throwing a splat.
The current work-around was to check allocation size against
the upper limit of PCPU_MIN_UNIT_SIZE from call-sites for
bailing out prior to a call to pcpu_alloc() in order to
avoid throwing the WARN(). This is bad in multiple ways since
PCPU_MIN_UNIT_SIZE is an implementation detail, and having
the checks on call-sites only complicates the code for no
good reason. Thus, lets fix it generically by supporting the
__GFP_NOWARN flag that users can then use with calling the
__alloc_percpu_gfp() helper instead.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Tue, 17 Oct 2017 07:33:05 +0000 (07:33 +0000)]
net: ena: fix wrong max Tx/Rx queues on ethtool
ethtool ena_get_channels() expose the max number of queues as the max
number of queues ENA supports (128 queues) and not the actual number
of created queues.
Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
can: gs_usb: fix busy loop if no more TX context is available
If sending messages with no cable connected, it quickly happens that
there is no more TX context available. Then "gs_can_start_xmit()"
returns with "NETDEV_TX_BUSY" and the upper layer does retry
immediately keeping the CPU busy. To fix that issue, I moved
"atomic_dec(&dev->active_tx_urbs)" from "gs_usb_xmit_callback()" to
the TX done handling in "gs_usb_receive_bulk_callback()". Renaming
"active_tx_urbs" to "active_tx_contexts" and moving it into
"gs_[alloc|free]_tx_context()" would also make sense.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Stefan Mätje [Wed, 18 Oct 2017 11:25:17 +0000 (13:25 +0200)]
can: esd_usb2: Fix can_dlc value for received RTR, frames
The dlc member of the struct rx_msg contains also the ESD_RTR flag to
mark received RTR frames. Without the fix the can_dlc value for received
RTR frames would always be set to 8 by get_can_dlc() instead of the
received value.
Fixes: 96d8e90382dc ("can: Add driver for esd CAN-USB/2 device") Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Colin Ian King [Fri, 8 Sep 2017 15:02:35 +0000 (16:02 +0100)]
can: bcm: check for null sk before deferencing it via the call to sock_net
The assignment of net via call sock_net will dereference sk. This
is performed before a sanity null check on sk, so there could be
a potential null dereference on the sock_net call if sk is null.
Fix this by assigning net after the sk null check. Also replace
the sk == NULL with the more usual !sk idiom.
Detected by CoverityScan CID#1431862 ("Dereference before null check")
Fixes: 384317ef4187 ("can: network namespace support for CAN_BCM protocol") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Enable FLEXCAN_QUIRK_BROKEN_WERR_STATE and
FLEXCAN_QUIRK_BROKEN_PERR_STATE for p1010 to report correct state
transitions.
Signed-off-by: Zhu Yi <yi.zhu5@cn.bosch.com> Signed-off-by: Mark Jonas <mark.jonas@de.bosch.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Cc: linux-stable <stable@vger.kernel.org> # >= v4.11 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Enable FLEXCAN_QUIRK_BROKEN_PERR_STATE for i.MX28 to report correct
state transitions, especially to error passive.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Zhu Yi <yi.zhu5@cn.bosch.com> Signed-off-by: Mark Jonas <mark.jonas@de.bosch.com> Cc: linux-stable <stable@vger.kernel.org> # >= v4.11 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>