David Howells [Thu, 15 Feb 2018 22:59:00 +0000 (22:59 +0000)]
rxrpc: Work around usercopy check
Due to a check recently added to copy_to_user(), it's now not permitted to
copy from slab-held data to userspace unless the slab is whitelisted. This
affects rxrpc_recvmsg() when it attempts to place an RXRPC_USER_CALL_ID
control message in the userspace control message buffer. A warning is
generated by usercopy_warn() because the source is the copy of the
user_call_ID retained in the rxrpc_call struct.
Work around the issue by copying the user_call_ID to a variable on the
stack and passing that to put_cmsg().
Reported-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Tested-by: Jonathan Billings <jsbillings@jsbillings.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 15 Feb 2018 22:47:15 +0000 (14:47 -0800)]
tun: fix tun_napi_alloc_frags() frag allocator
<Mark Rutland reported>
While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a
misaligned atomic in __skb_clone:
atomic_inc(&(skb_shinfo(skb)->dataref));
where dataref doesn't have the required natural alignment, and the
atomic operation faults. e.g. i often see it aligned to a single
byte boundary rather than a four byte boundary.
AFAICT, the skb_shared_info is misaligned at the instant it's
allocated in __napi_alloc_skb() __napi_alloc_skb()
</end of report>
Problem is caused by tun_napi_alloc_frags() using
napi_alloc_frag() with user provided seg sizes,
leading to other users of this API getting unaligned
page fragments.
Since we would like to not necessarily add paddings or alignments to
the frags that tun_napi_alloc_frags() attaches to the skb, switch to
another page frag allocator.
As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations,
meaning that we can not deplete memory reserves as easily.
Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kodanev [Thu, 15 Feb 2018 17:18:43 +0000 (20:18 +0300)]
udplite: fix partial checksum initialization
Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:
udp4_csum_init() or udp6_csum_init()
skb_checksum_init_zero_check()
__skb_checksum_validate_complete()
The problem can appear if skb->len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.
It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.
Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4") Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), the caller of nf_{get/set}sockopt() must
not hold any lock, but, in such changeset, I forgot to cope with DECnet.
This commit addresses the issue moving the nf call outside the lock,
in the dn_{get,set}sockopt() with the same schema currently used by
ipv4 and ipv6. Also moves the unhandled sockopts of the end of the main
switch statements, to improve code readability.
Reported-by: Petr Vandrovec <petr@vandrovec.name> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2 Fixes: 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock only in the required scope") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Casey Leedom [Thu, 15 Feb 2018 14:33:18 +0000 (20:03 +0530)]
PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
We've run into a problem where our device is attached
to a Virtual Machine and the use of the new pci_set_vpd_size()
API doesn't help. The VM kernel has been informed that
the accesses are okay, but all of the actual VPD Capability
Accesses are trapped down into the KVM Hypervisor where it
goes ahead and imposes the silent denials.
The right idea is to follow the kernel.org
commit 1c7de2b4ff88 ("PCI: Enable access to non-standard VPD for
Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored
to establish a PCI Quirk for our T3-based adapters. This commit
extends that PCI Quirk to cover Chelsio T4 devices and later.
The advantage of this approach is that the VPD Size gets set early
in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
driver even be available in the Base OS/Hypervisor. Thus PF4 can
be exported to a Virtual Machine and everything should work.
Fixes: 67e658794ca1 ("cxgb4: Set VPD size so we can read both VPD structures") Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rahul Lakkireddy [Thu, 15 Feb 2018 12:50:01 +0000 (18:20 +0530)]
cxgb4: fix trailing zero in CIM LA dump
Set correct size of the CIM LA dump for T6.
Fixes: 27887bc7cb7f ("cxgb4: collect hardware LA dumps") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 15 Feb 2018 12:46:57 +0000 (18:16 +0530)]
cxgb4: free up resources of pf 0-3
free pf 0-3 resources, commit baf5086840ab ("cxgb4:
restructure VF mgmt code") erroneously removed the
code which frees the pf 0-3 resources, causing the
probe of pf 0-3 to fail in case of driver reload.
Fixes: baf5086840ab ("cxgb4: restructure VF mgmt code") Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Thu, 15 Feb 2018 08:46:03 +0000 (09:46 +0100)]
fib_semantics: Don't match route with mismatching tclassid
In fib_nh_match(), if output interface or gateway are passed in
the FIB configuration, we don't have to check next hops of
multipath routes to conclude whether we have a match or not.
However, we might still have routes with different realms
matching the same output interface and gateway configuration,
and this needs to cause the match to fail. Otherwise the first
route inserted in the FIB will match, regardless of the realms:
# ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
# ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev eth0 scope link realms 1/2
1.1.1.1 dev eth0 scope link realms 3/4
# ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4
# ip route list table 1234
1.1.1.1 dev ens3 scope link realms 3/4
whereas route with realms 3/4 should have been deleted instead.
Explicitly check for fc_flow passed in the FIB configuration
(this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
fail matching if it differs from nh_tclassid.
The handling of RTA_FLOW for multipath routes later in
fib_nh_match() is still needed, as we can have multiple RTA_FLOW
attributes that need to be matched against the tclassid of each
next hop.
v2: Check that fc_flow is set before discarding the match, so
that the user can still select the first matching rule by
not specifying any realm, as suggested by David Ahern.
Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kees Cook [Wed, 14 Feb 2018 23:45:07 +0000 (15:45 -0800)]
NFC: llcp: Limit size of SDP URI
The tlv_len is u8, so we need to limit the size of the SDP URI. Enforce
this both in the NLA policy and in the code that performs the allocation
and copy, to avoid writing past the end of the allocated buffer.
Fixes: d9b8d8e19b073 ("NFC: llcp: Service Name Lookup netlink interface") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Wed, 14 Feb 2018 08:46:07 +0000 (10:46 +0200)]
tls: reset the crypto info if copy_from_user fails
copy_from_user could copy some partial information, as a result
TLS_CRYPTO_INFO_READY(crypto_info) could be true while crypto_info is
using uninitialzed data.
This patch resets crypto_info when copy_from_user fails.
fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Pismenny [Wed, 14 Feb 2018 08:46:06 +0000 (10:46 +0200)]
tls: retrun the correct IV in getsockopt
Current code returns four bytes of salt followed by four bytes of IV.
This patch returns all eight bytes of IV.
fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 14 Feb 2018 19:52:39 +0000 (14:52 -0500)]
Merge branch 'net-segmentation-offload-doc-fixes'
Daniel Axtens says:
====================
Updates to segmentation-offloads.txt
I've been trying to wrap my head around GSO for a while now. This is a
set of small changes to the docs that would probably have been helpful
when I was starting out.
I realise that GSO_DODGY is still a notable omission - I'm hesitant to
write too much on it just yet as I don't understand it well and I
think it's in the process of changing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens [Wed, 14 Feb 2018 07:05:33 +0000 (18:05 +1100)]
docs: segmentation-offloads.txt: add SCTP info
Most of this is extracted from 90017accff61 ("sctp: Add GSO support"),
with some extra text about GSO_BY_FRAGS and the need to check for it.
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens [Wed, 14 Feb 2018 07:05:32 +0000 (18:05 +1100)]
docs: segmentation-offloads.txt: Fix ref to SKB_GSO_TUNNEL_REMCSUM
The doc originally called it SKB_GSO_REMCSUM. Fix it.
Fixes: f7a6272bf3cb ("Documentation: Add documentation for TSO and GSO features") Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Axtens [Wed, 14 Feb 2018 07:05:31 +0000 (18:05 +1100)]
docs: segmentation-offloads.txt: update for UFO depreciation
UFO is deprecated except for tuntap and packet per 0c19f846d582,
("net: accept UFO datagrams from tuntap and packet"). Update UFO
docs to reflect this.
Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 14 Feb 2018 19:46:33 +0000 (14:46 -0500)]
Merge branch 'tipc-locking-fixes'
Ying Xue says:
====================
tipc: Fix missing RTNL lock protection during setting link properties
At present it's unsafe to configure link properties through netlink
as the entire setting process is not under RTNL lock protection. Now
TIPC supports two different sets of netlink APIs at the same time, and
they share the same set of backend functions to configure bearer,
media and net properties. In order to solve the missing RTNL issue,
we have to make the whole __tipc_nl_compat_doit() protected by RTNL,
which means any function called within it cannot take RTNL any more.
So in the series we first introduce the following new functions which
doesn't hold RTNl lock:
Meanwhile, __tipc_nl_compat_doit() has been reconstructed to minimize
the time of holding RTNL lock.
Changes in v4:
- Per suggestion of Kirill Tkhai, divided original big one patch into
seven small ones so that they can be easily reviewed.
Changes in v3:
- Optimized return method of __tipc_nl_bearer_enable() regarding
the comments from David M and Kirill Tkhai
- Moved the allocations of memory in __tipc_nl_compat_doit() out
of RTNL lock to minimize the time of holding RTNL lock according
to the suggestion of Kirill Tkhai.
Changes in v2:
- The whole operation of setting bearer/media properties has been
protected under RTNL, as per feedback from David M.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Wed, 14 Feb 2018 05:38:04 +0000 (13:38 +0800)]
tipc: Fix missing RTNL lock protection during setting link properties
Currently when user changes link properties, TIPC first checks if
user's command message contains media name or bearer name through
tipc_media_find() or tipc_bearer_find() which is protected by RTNL
lock. But when tipc_nl_compat_link_set() conducts the checking with
the two functions, it doesn't hold RTNL lock at all, as a result,
the following complaints were reported:
In order to correct the mistake, __tipc_nl_compat_doit() has been
protected by RTNL lock, which means the whole operation of setting
bearer/media properties is under RTNL protection.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Reported-by: syzbot <syzbot+6345fd433db009b29413@syzkaller.appspotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Wed, 14 Feb 2018 05:37:58 +0000 (13:37 +0800)]
tipc: Refactor __tipc_nl_compat_doit
As preparation for adding RTNL to make (*cmd->transcode)() and
(*cmd->transcode)() constantly protected by RTNL lock, we move out of
memory allocations existing between them as many as possible so that
the time of holding RTNL can be minimized in __tipc_nl_compat_doit().
Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 14 Feb 2018 19:39:11 +0000 (14:39 -0500)]
Merge branch 'ibmvnic-leaks'
Thomas Falcon says:
====================
ibmvnic: Fix memory leaks in the driver
This patch set is pretty self-explanatory. It includes
a number of patches that fix memory leaks found with
kmemleak in the ibmvnic driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 14 Feb 2018 00:23:43 +0000 (18:23 -0600)]
ibmvnic: Clean RX pool buffers during device close
During device close or reset, there were some cases of outstanding
RX socket buffers not being freed. Include a function similar to the
one that already exists to clean TX socket buffers in this case.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Wed, 14 Feb 2018 00:23:40 +0000 (18:23 -0600)]
ibmvnic: Fix login buffer memory leaks
During device bringup, the driver exchanges login buffers with
firmware. These buffers contain information such number of TX
and RX queues alloted to the device, RX buffer size, etc. These
buffers weren't being properly freed on device reset or close.
We can free the buffer we send to firmware as soon as we get
a response. There is information in the response buffer that
the driver needs for normal operation so retain it until the
next reset or removal.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Tue, 13 Feb 2018 21:32:50 +0000 (15:32 -0600)]
ibmvnic: Wait until reset is complete to set carrier on
Pushes back setting the carrier on until the end of the reset
code. This resolves a bug where a watchdog timer was detecting
that a TX queue had stalled before the adapter reset was complete.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
As I previously[1] pointed out this implementation of XDP_REDIRECT is
wrong. XDP_REDIRECT is a facility that must work between different
NIC drivers. Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit,
but your driver patch assumes payload data (at top of page) will
contain a queue index and a DMA addr, this is not true and worse will
likely contain garbage.
Given you have not fixed this in due time (just reached v4.16-rc1),
the only option I see is a revert.
Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Christina Jacob <cjacob@caviumnetworks.com> Cc: Aleksey Makarov <aleksey.makarov@cavium.com> Fixes: aa136d0c82fc ("net: thunderx: Add support for xdp redirect") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Tue, 13 Feb 2018 11:29:13 +0000 (19:29 +0800)]
sctp: fix some copy-paste errors for file comments
This patch is to fix the file comments in stream.c and
stream_interleave.c
v1->v2:
rephrase the comment for stream.c according to Neil's suggestion.
Fixes: a83863174a61 ("sctp: prepare asoc stream for stream reconf") Fixes: 0c3f6f655487 ("sctp: implement make_datafrag for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 13 Feb 2018 05:35:31 +0000 (21:35 -0800)]
net: fix race on decreasing number of TX queues
netif_set_real_num_tx_queues() can be called when netdev is up.
That usually happens when user requests change of number of
channels/rings with ethtool -L. The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->num_tx_queues to the new value. When the new value is
lower than the old one, extra care has to be taken to ensure
ordering of accesses to the number of queues vs qdisc reset.
Currently the queues are reset before new dev->num_tx_queues
is assigned, leaving a window of time where packets can be
enqueued onto the queues going down, leading to a likely
crash in the drivers, since most drivers don't check if TX
skbs are assigned to an active queue.
Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Tue, 13 Feb 2018 17:46:16 +0000 (09:46 -0800)]
rds: do not call ->conn_alloc with GFP_KERNEL
Commit ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management")
adds an rcu read critical section to __rd_conn_create. The
memory allocations in that critcal section need to use
GFP_ATOMIC to avoid sleeping.
This patch was verified with syzkaller reproducer.
Reported-by: syzbot+a0564419941aaae3fe3c@syzkaller.appspotmail.com Fixes: ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 13 Feb 2018 11:00:17 +0000 (12:00 +0100)]
net: sched: fix tc_u_common lookup
The offending commit wrongly assumes 1:1 mapping between block and q.
However, there are multiple blocks for a single q for classful qdiscs.
Since the obscure tc_u_common sharing mechanism expects it to be shared
among a qdisc, fix it by storing q pointer in case the block is not
shared.
Reported-by: Paweł Staszewski <pstaszewski@itcare.pl> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Fixes: 7fa9d974f3c2 ("net: sched: cls_u32: use block instead of q in tc_u_common") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 13 Feb 2018 11:00:16 +0000 (12:00 +0100)]
net: sched: don't set q pointer for shared blocks
It is pointless to set block->q for block which are shared among
multiple qdiscs. So remove the assignment in that case. Do a bit of code
reshuffle to make block->index initialized at that point so we can use
tcf_block_shared() helper.
Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Fixes: 4861738775d7 ("net: sched: introduce shared filter blocks infrastructure") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Tue, 13 Feb 2018 10:22:42 +0000 (11:22 +0100)]
mlxsw: spectrum_router: Fix error path in mlxsw_sp_vr_create
Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create()
use ERR_PTR macro to propagate int err through return of a pointer,
the return value is not NULL in case of failure. So if one
of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table
is not NULL and mlxsw_sp_vr_is_used wrongly assumes
that vr is in use which leads to crash like following one:
[ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9
[ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum]
Fix this by using local variables to hold the pointers and set vr->*
only in case everything went fine.
This fixes a compile problem of some user space applications by not
including linux/libc-compat.h in uapi/if_ether.h.
linux/libc-compat.h checks which "features" the header files, included
from the libc, provide to make the Linux kernel uapi header files only
provide no conflicting structures and enums. If a user application mixes
kernel headers and libc headers it could happen that linux/libc-compat.h
gets included too early where not all other libc headers are included
yet. Then the linux/libc-compat.h would not prevent all the
redefinitions and we run into compile problems.
This patch removes the include of linux/libc-compat.h from
uapi/if_ether.h to fix the recently introduced case, but not all as this
is more or less impossible.
It is no problem to do the check directly in the if_ether.h file and not
in libc-compat.h as this does not need any fancy glibc header detection
as glibc never provided struct ethhdr and should define
__UAPI_DEF_ETHHDR by them self when they will provide this.
The following test program did not compile correctly any more:
Jan Glauber [Mon, 12 Feb 2018 17:20:11 +0000 (18:20 +0100)]
net: cavium: fix NULL pointer dereference in cavium_ptp_put
Prevent a kernel panic on reboot if ptp_clock is NULL by checking
the ptp pointer before using it.
Signed-off-by: Jan Glauber <jglauber@cavium.com> Fixes: 8c56df372bc1 ("net: add support for Cavium PTP coprocessor") Cc: Radoslaw Biernacki <rad@semihalf.com> Cc: Aleksey Makarov <aleksey.makarov@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mika Westerberg [Mon, 12 Feb 2018 14:10:20 +0000 (17:10 +0300)]
net: thunderbolt: Run disconnect flow asynchronously when logout is received
The control channel calls registered callbacks when control messages
such as XDomain protocol messages are received. The control channel
handling is done in a worker running on system workqueue which means the
networking driver can't run tear down flow which includes sending
disconnect request and waiting for a reply in the same worker. Otherwise
reply is never received (as the work is already running) and the
operation times out.
To fix this run disconnect ThunderboltIP flow asynchronously once
ThunderboltIP logout message is received.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Mika Westerberg [Mon, 12 Feb 2018 14:10:19 +0000 (17:10 +0300)]
net: thunderbolt: Tear down connection properly on suspend
When suspending to mem or disk the Thunderbolt controller typically goes
down as well tearing down the connection automatically. However, when
suspend to idle is used this does not happen so we need to make sure the
connection is properly disconnected before it can be re-established
during resume.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
sh_eth: Remove obsolete explicit clock handling for WoL
Currently, if Wake-on-LAN is enabled, the SH-ETH device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commits 91c719f5ec6671f7 ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") and 744dddcae84441b1 ("clk:
renesas: mstp: Keep wakeup sources active during system suspend"), this
workaround is no longer needed. Hence remove all explicit clock
handling to keep the device active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ravb: Remove obsolete explicit clock handling for WoL
Currently, if Wake-on-LAN is enabled, the EtherAVB device's module clock
is manually kept running during system suspend, to make sure the device
stays active.
Since commit 91c719f5ec6671f7 ("soc: renesas: rcar-sysc: Keep wakeup
sources active during system suspend") , this workaround is no longer
needed. Hence remove all explicit clock handling to keep the device
active.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo van Lil [Mon, 12 Feb 2018 11:02:52 +0000 (12:02 +0100)]
net: phy: fix wrong mask to phy_modify()
When forcing a specific link mode, the PHY driver must clear the
existing speed and duplex bits in BMCR while preserving some other
control bits. This logic was accidentally inverted with the introduction
of phy_modify().
Fixes: fea23fb591cc ("net: phy: convert read-modify-write to phy_modify()") Signed-off-by: Ingo van Lil <inguin@gmx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilya Lesokhin [Mon, 12 Feb 2018 10:57:04 +0000 (12:57 +0200)]
tcp: Honor the eor bit in tcp_mtu_probe
Avoid SKB coalescing if eor bit is set in one of the relevant
SKBs.
Fixes: c134ecb87817 ("tcp: Make use of MSG_EOR in tcp_sendmsg") Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 12 Feb 2018 10:31:24 +0000 (18:31 +0800)]
sctp: remove the useless check in sctp_renege_events
Remove the 'if (chunk)' check in sctp_renege_events for idata process,
as all renege commands are generated in sctp_eat_data and it can't be
NULL.
The same thing we already did for common data in sctp_ulpq_renege.
Fixes: 94014e8d871a ("sctp: implement renege_events for sctp_stream_interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 12 Feb 2018 10:29:51 +0000 (18:29 +0800)]
sctp: add SCTP_CID_I_DATA and SCTP_CID_I_FWD_TSN conversion in sctp_cname
After the support for SCTP_CID_I_DATA and SCTP_CID_I_FWD_TSN chunks,
the corresp conversion in sctp_cname should also be added. Otherwise,
in some places, pr_debug will print them as "unknown chunk".
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 12 Feb 2018 10:29:06 +0000 (18:29 +0800)]
sctp: do not pr_err for the duplicated node in transport rhlist
The pr_err in sctp_hash_transport was supposed to report a sctp bug
for using rhashtable/rhlist.
The err '-EEXIST' introduced in Commit cd2b70875058 ("sctp: check
duplicate node before inserting a new transport") doesn't belong
to that case.
So just return -EEXIST back without pr_err any kmsg.
Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new transport") Reported-by: Wei Chen <weichen@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 12 Feb 2018 09:15:40 +0000 (17:15 +0800)]
bridge: check brport attr show in brport_show
Now br_sysfs_if file flush doesn't have attr show. To read it will
cause kernel panic after users chmod u+r this file.
Xiong found this issue when running the commands:
ip link add br0 type bridge
ip link add type veth
ip link set veth0 master br0
chmod u+r /sys/devices/virtual/net/veth0/brport/flush
timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush
kernel crashed with NULL a pointer dereference call trace.
This patch is to fix it by return -EINVAL when brport_attr->show
is null, just the same as the check for brport_attr->store in
brport_store().
Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table") Reported-by: Xiong Zhou <xzhou@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Sun, 11 Feb 2018 03:28:12 +0000 (11:28 +0800)]
ptr_ring: prevent integer overflow when calculating size
Switch to use dividing to prevent integer overflow when size is too
big to calculate allocation size properly.
Reported-by: Eric Biggers <ebiggers3@gmail.com> Fixes: 6e6e41c31122 ("ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 11 Feb 2018 22:34:03 +0000 (14:34 -0800)]
vfs: do bulk POLL* -> EPOLL* replacement
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sun, 11 Feb 2018 21:57:19 +0000 (13:57 -0800)]
Merge branch 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more poll annotation updates from Al Viro:
"This is preparation to solving the problems you've mentioned in the
original poll series.
After this series, the kernel is ready for running
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
as a for bulk search-and-replace.
After that, the kernel is ready to apply the patch to unify
{de,}mangle_poll(), and then get rid of kernel-side POLL... uses
entirely, and we should be all done with that stuff.
Basically, that's what you suggested wrt KPOLL..., except that we can
use EPOLL... instead - they already are arch-independent (and equal to
what is currently kernel-side POLL...).
After the preparations (in this series) switch to returning EPOLL...
from ->poll() instances is completely mechanical and kernel-side
POLL... can go away. The last step (killing kernel-side POLL... and
unifying {de,}mangle_poll() has to be done after the
search-and-replace job, since we need userland-side POLL... for
unified {de,}mangle_poll(), thus the cherry-pick at the last step.
After that we will have:
- POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
->poll() still using those will be caught by sparse.
- eventpoll.c and select.c warning-free wrt __poll_t
- no more kernel-side definitions of POLL... - userland ones are
visible through the entire kernel (and used pretty much only for
mangle/demangle)
- same behavior as after the first series (i.e. sparc et.al. epoll(2)
working correctly)"
* 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
annotate ep_scan_ready_list()
ep_send_events_proc(): return result via esed->res
preparation to switching ->poll() to returning EPOLL...
add EPOLLNVAL, annotate EPOLL... and event_poll->event
use linux/poll.h instead of asm/poll.h
xen: fix poll misannotation
smc: missing poll annotations
Linus Torvalds [Sun, 11 Feb 2018 21:52:32 +0000 (13:52 -0800)]
Merge tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull nios2 update from Ley Foon Tan:
- clean up old Kconfig options from defconfig
- remove leading 0x and 0s from bindings notation in dts files
* tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: defconfig: Cleanup from old Kconfig options
nios2: dts: Remove leading 0x and 0s from bindings notation
Max Filippov [Sun, 11 Feb 2018 09:07:54 +0000 (01:07 -0800)]
xtensa: fix build with KASAN
The commit 917538e212a2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
include/linux/kasan.h and added it to architecture-specific headers,
except for xtensa. This broke the xtensa build with KASAN enabled.
Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h
Reported by: kbuild test robot <fengguang.wu@intel.com> Fixes: 917538e212a2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage") Acked-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
For simplicity, two sed expressions were used to solve each warnings separately.
To make the regex expression more robust a few other issues were resolved,
namely setting unit-address to lower case, and adding a whitespace before the
the opening curly brace:
Linus Torvalds [Sat, 10 Feb 2018 22:05:11 +0000 (14:05 -0800)]
Merge tag 'for-linus-20180210' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few fixes to round off the merge window on the block side:
- a set of bcache fixes by way of Michael Lyle, from the usual bcache
suspects.
- add a simple-to-hook-into function for bpf EIO error injection.
- fix blk-wbt that mischarectized flushes as reads. Improve the logic
so that flushes and writes are accounted as writes, and only reads
as reads. From me.
- fix requeue crash in BFQ, from Paolo"
* tag 'for-linus-20180210' of git://git.kernel.dk/linux-block:
block, bfq: add requeue-request hook
bcache: fix for data collapse after re-attaching an attached device
bcache: return attach error when no cache set exist
bcache: set writeback_rate_update_seconds in range [1, 60] seconds
bcache: fix for allocator and register thread race
bcache: set error_limit correctly
bcache: properly set task state in bch_writeback_thread()
bcache: fix high CPU occupancy during journal
bcache: add journal statistic
block: Add should_fail_bio() for bpf error injection
blk-wbt: account flush requests correctly
Linus Torvalds [Sat, 10 Feb 2018 21:55:33 +0000 (13:55 -0800)]
Merge tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86
Pull x86 platform driver updates from Darren Hart:
"Mellanox fixes and new system type support.
Mostly data for new system types with a correction and an
uninitialized variable fix"
[ Pulling from github because git.infradead.org currently seems to be
down for some reason, but Darren had a backup location - Linus ]
* tag 'platform-drivers-x86-v4.16-3' of git://github.com/dvhart/linux-pdx86:
platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems
platform/x86: mlx-platform: Add support for new msn201x system type
platform/x86: mlx-platform: Add support for new msn274x system type
platform/x86: mlx-platform: Fix power cable setting for msn21xx family
platform/x86: mlx-platform: Add define for the negative bus
platform/x86: mlx-platform: Use defines for bus assignment
platform/mellanox: mlxreg-hotplug: Fix uninitialized variable
Linus Torvalds [Sat, 10 Feb 2018 21:50:23 +0000 (13:50 -0800)]
Merge tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform
Pull chrome platform updates from Benson Leung:
- move cros_ec_dev to drivers/mfd
- other small maintenance fixes
[ The cros_ec_dev movement came in earlier through the MFD tree - Linus ]
* tag 'chrome-platform-for-linus-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/bleung/chrome-platform:
platform/chrome: Use proper protocol transfer function
platform/chrome: cros_ec_lpc: Add support for Google Glimmer
platform/chrome: cros_ec_lpc: Register the driver if ACPI entry is missing.
platform/chrome: cros_ec_lpc: remove redundant pointer request
cros_ec: fix nul-termination for firmware build info
platform/chrome: chromeos_laptop: make chromeos_laptop const
Linus Torvalds [Sat, 10 Feb 2018 21:16:35 +0000 (13:16 -0800)]
Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"ARM:
- icache invalidation optimizations, improving VM startup time
- support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- a small fix for power-management notifiers, and some cosmetic
changes
PPC:
- add MMIO emulation for vector loads and stores
- allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- improve the handling of escalation interrupts with the XIVE
interrupt controller
- support decrement register migration
- various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- exitless interrupts for emulated devices
- cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
AVX512 features
- show vcpu id in its anonymous inode name
- many fixes and cleanups
- per-VCPU MSR bitmaps (already merged through x86/pti branch)
- stable KVM clock when nesting on Hyper-V (merged through
x86/hyperv)"
* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
KVM: PPC: Book3S HV: Branch inside feature section
KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
KVM: PPC: Book3S PR: Fix broken select due to misspelling
KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
KVM: PPC: Book3S HV: Drop locks before reading guest memory
kvm: x86: remove efer_reload entry in kvm_vcpu_stat
KVM: x86: AMD Processor Topology Information
x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
kvm: embed vcpu id to dentry of vcpu anon inode
kvm: Map PFN-type memory regions as writable (if possible)
x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
KVM: arm/arm64: Fixup userspace irqchip static key optimization
KVM: arm/arm64: Fix userspace_irqchip_in_use counting
KVM: arm/arm64: Fix incorrect timer_is_pending logic
MAINTAINERS: update KVM/s390 maintainers
MAINTAINERS: add Halil as additional vfio-ccw maintainer
MAINTAINERS: add David as a reviewer for KVM/s390
...
59f47eff03a0 ("powerpc/pci: Use of_irq_parse_and_map_pci() helper")
replaced of_irq_parse_pci() + irq_create_of_mapping() with
of_irq_parse_and_map_pci(), but neglected to capture the virq
returned by irq_create_of_mapping(), so virq remained zero, which
caused INTx configuration to fail.
Save the virq value returned by of_irq_parse_and_map_pci() and correct
the virq declaration to match the of_irq_parse_and_map_pci() signature.
Linus Torvalds [Sat, 10 Feb 2018 03:32:41 +0000 (19:32 -0800)]
Merge tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
"Makefile changes:
- enable unused-variable warning that was wrongly disabled for clang
Kconfig changes:
- warn about blank 'help' and fix existing instances
- fix 'choice' behavior to not write out invisible symbols
- fix misc weirdness
Coccinell changes:
- fix false positive of free after managed memory alloc detection
- improve performance of NULL dereference detection"
* tag 'kbuild-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (21 commits)
kconfig: remove const qualifier from sym_expand_string_value()
kconfig: add xrealloc() helper
kconfig: send error messages to stderr
kconfig: echo stdin to stdout if either is redirected
kconfig: remove check_stdin()
kconfig: remove 'config*' pattern from .gitignnore
kconfig: show '?' prompt even if no help text is available
kconfig: do not write choice values when their dependency becomes n
coccinelle: deref_null: avoid useless computation
coccinelle: devm_free: reduce false positives
kbuild: clang: disable unused variable warnings only when constant
kconfig: Warn if help text is blank
nios2: kconfig: Remove blank help text
arm: vt8500: kconfig: Remove blank help text
MIPS: kconfig: Remove blank help text
MIPS: BCM63XX: kconfig: Remove blank help text
lib/Kconfig.debug: Remove blank help text
Staging: rtl8192e: kconfig: Remove blank help text
Staging: rtl8192u: kconfig: Remove blank help text
mmc: kconfig: Remove blank help text
...
Linus Torvalds [Sat, 10 Feb 2018 03:22:17 +0000 (19:22 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs fixes from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
seq_file: fix incomplete reset on read from zero offset
kernfs: fix regression in kernfs_fop_write caused by wrong type
Masahiro Yamada [Thu, 8 Feb 2018 16:19:08 +0000 (01:19 +0900)]
kconfig: remove const qualifier from sym_expand_string_value()
This function returns realloc'ed memory, so the returned pointer
must be passed to free() when done. So, 'const' qualifier is odd.
It is allowed to modify the expanded string.
Vadim Pasternak [Fri, 9 Feb 2018 23:59:32 +0000 (23:59 +0000)]
platform/x86: mlx-platform: Add support for new 200G IB and Ethernet systems
It adds support for new Mellanox system types of basic classes qmb7, sn34,
sn37, containing systems QMB700 (40x200GbE InfiniBand switch), SN3700
(32x200GbE and 16x400GbE Ethernet switch) and SN3410 (6x400GbE plus
48x50GbE Ethernet switch). These are the Top of the Rack systems, equipped
with Mellanox COM-Express carrier board and switch board with Mellanox
Quantum device, which supports InfiniBand switching with 40X200G ports and
line rate of up to HDR speed or with Mellanox Spectrum-2 device, which
supports Ethernet switching with 32X200G ports line rate of up to HDR
speed.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Vadim Pasternak [Fri, 9 Feb 2018 23:59:31 +0000 (23:59 +0000)]
platform/x86: mlx-platform: Add support for new msn201x system type
It adds support for new Mellanox system types of basic half unit size
class msn201x, containing system MSN2010 (18x10GbE plus 4x4x25GbE) half
and its derivatives. This is the Top of the Rack system, equipped with
Mellanox Small Form Factor carrier board and switch board with Mellanox
Spectrum device, which supports Ethernet switching with 32X100G ports line
rate of up to EDR speed.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Vadim Pasternak [Fri, 9 Feb 2018 23:59:30 +0000 (23:59 +0000)]
platform/x86: mlx-platform: Add support for new msn274x system type
It adds support for new Mellanox system types of basic class msn274x,
containing system MSN2740 (32x100GbE Ethernet switch with cost reduction)
and its derivatives. These are the Top of the Rack system, equipped with
Mellanox Small Form Factor carrier board and switch board with Mellanox
Spectrum device, which supports Ethernet switching with 32X100G ports line
rate of up to EDR speed.
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
1) Make allocations less aggressive in x_tables, from Minchal Hocko.
2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
3) Fix connection loss problems in rtlwifi, from Larry Finger.
4) Correct DRAM dump length for some chips in ath10k driver, from Yu
Wang.
5) Fix ABORT handling in rxrpc, from David Howells.
6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
7) Some ipv6 onlink handling fixes, from David Ahern.
8) Netem packet scheduler interval calcualtion fix from Md. Islam.
9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
10) Fix handling of error non-delivery status in netlink multicast
delivery over multiple namespaces, from Nicolas Dichtel.
11) Missing xdp flush in tuntap driver, from Jason Wang.
12) Synchonize RDS protocol netns/module teardown with rds object
management, from Sowini Varadhan.
13) Add nospec annotations to mpls, from Dan Williams.
14) Fix SKB truesize handling in TIPC, from Hoang Le.
15) Interrupt masking fixes in stammc from Niklas Cassel.
16) Don't allow ptr_ring objects to be sized outside of kmalloc's
limits, from Jason Wang.
17) Don't allow SCTP chunks to be built which will have a length
exceeding the chunk header's 16-bit length field, from Alexey
Kodanev.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
bpf: fix rlimit in reuseport net selftest
sctp: verify size of a new chunk in _sctp_make_chunk()
s390/qeth: fix SETIP command handling
s390/qeth: fix underestimated count of buffer elements
ptr_ring: try vmalloc() when kmalloc() fails
ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
net: stmmac: remove redundant enable of PMT irq
net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
net: stmmac: discard disabled flags in interrupt status register
ibmvnic: Reset long term map ID counter
tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
selftests/bpf: add selftest that use test_libbpf_open
selftests/bpf: add test program for loading BPF ELF files
tools/libbpf: improve the pr_debug statements to contain section numbers
bpf: Sync kernel ABI header with tooling header for bpf_common.h
net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
net: thunder: change q_len's type to handle max ring size
tipc: fix skb truesize/datasize ratio control
net/sched: cls_u32: fix cls_u32 on filter replace
...
Linus Torvalds [Fri, 9 Feb 2018 22:55:30 +0000 (14:55 -0800)]
Merge tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull more NFS client updates from Trond Myklebust:
"A few bugfixes and some small sunrpc latency/performance improvements
before the merge window closes:
Stable fixes:
- fix an incorrect calculation of the RDMA send scatter gather
element limit
- fix an Oops when attempting to free resources after RDMA device
removal
Bugfixes:
- SUNRPC: Ensure we always release the TCP socket in a timely fashion
when the connection is shut down.
- SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
Latency/Performance:
- SUNRPC: Queue latency sensitive socket tasks to the less contended
xprtiod queue
- SUNRPC: Make the xprtiod workqueue unbounded.
- SUNRPC: Make the rpciod workqueue unbounded"
* tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
fix parallelism for rpc tasks
Make the xprtiod workqueue unbounded.
SUNRPC: Queue latency-sensitive socket tasks to xprtiod
SUNRPC: Ensure we always close the socket after a connection shuts down
xprtrdma: Fix BUG after a device removal
xprtrdma: Fix calculation of ri_max_send_sges
Linus Torvalds [Fri, 9 Feb 2018 22:49:46 +0000 (14:49 -0800)]
Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
Pull SCSI target updates from Nicholas Bellinger:
"The highlights include:
- numerous target-core-user improvements related to queue full and
timeout handling. (MNC)
- prevent target-core-user corruption when invalid data page is
requested. (MNC)
- add target-core device action configfs attributes to allow
user-space to trigger events separate from existing attributes
exposed to end-users. (MNC)
- fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
error path. (David Disseldorp)
- avoid target-core backend UNMAP callbacks if range is zero. (Andrei
Vagin)
- fix a iscsi-target 4.14+ regression related multiple PDU logins,
that was exposed due to removal of TCP prequeue support. (Florian
Westphal + MNC)
Also, there is a iser-target bug still being worked on for post -rc1
code to address a long standing issue resulting in persistent
ib_post_send() failures, for RNICs with small max_send_sge"
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
iscsi-target: make sure to wake up sleeping login worker
tcmu: Fix trailing semicolon
tcmu: fix cmd user after free
target: fix destroy device in target_configure_device
tcmu: allow userspace to reset ring
target core: add device action configfs files
tcmu: fix error return code in tcmu_configure_device()
target_core_user: add cmd id to broken ring message
target: add SAM_STAT_BUSY sense reason
tcmu: prevent corruption when invalid data page requested
target: don't call an unmap callback if a range length is zero
target/iscsi: avoid NULL dereference in CHAP auth error path
cxgbit: call neigh_event_send() to update MAC address
target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
target: tcm_loop: Combine substrings for 26 messages
target: tcm_loop: Improve a size determination in two functions
target: tcm_loop: Delete an error message for a failed memory allocation in four functions
sbp-target: Delete an error message for a failed memory allocation in three functions
...
Linus Torvalds [Fri, 9 Feb 2018 22:47:09 +0000 (14:47 -0800)]
Merge tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Al Viro discovered some breakage with the parsing of the
set_ftrace_filter as well as the removing of function probes.
This fixes the code with Al's suggestions. I also added a few
selftests to test the broken cases such that they wont happen
again"
* tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
selftests/ftrace: Add more tests for removing of function probes
selftests/ftrace: Add some missing glob checks
selftests/ftrace: Have reset_ftrace_filter handle multiple instances
selftests/ftrace: Have reset_ftrace_filter handle modules
tracing: Fix parsing of globs with a wildcard at the beginning
ftrace: Remove incorrect setting of glob search field
Linus Torvalds [Fri, 9 Feb 2018 22:42:57 +0000 (14:42 -0800)]
Merge tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"There are a couple additional security fixes that are still being
tested that are not in this set."
* tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
Add missing structs and defines from recent SMB3.1.1 documentation
address lock imbalance warnings in smbdirect.c
cifs: silence compiler warnings showing up with gcc-8.0.0
Add some missing debug fields in server and tcon structs
Radim Krčmář [Fri, 2 Feb 2018 17:26:58 +0000 (18:26 +0100)]
Merge branch 'msr-bitmaps' of git://git.kernel.org/pub/scm/virt/kvm/kvm
This topic branch allocates separate MSR bitmaps for each VCPU.
This is required for the IBRS enablement to choose, on a per-VM
basis, whether to intercept the SPEC_CTRL and PRED_CMD MSRs;
the IBRS enablement comes in through the tip tree.
John Allen [Fri, 9 Feb 2018 19:19:46 +0000 (13:19 -0600)]
ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
Having these checks in ibmvnic_xmit causes problems with VLAN
tagging and balance-alb/tlb bonding modes. The restriction they
imposed can be removed.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
For the former adjust rlimit since this was the cause of
failure for loading the BPF prog, and for the latter add
SO_REUSEADDR.
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://bugs.linaro.org/show_bug.cgi?id=3502 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kodanev [Fri, 9 Feb 2018 14:35:23 +0000 (17:35 +0300)]
sctp: verify size of a new chunk in _sctp_make_chunk()
When SCTP makes INIT or INIT_ACK packet the total chunk length
can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
transmitting these packets, e.g. the crash on sending INIT_ACK:
Here the chunk size for INIT_ACK packet becomes too big, mostly
because of the state cookie (INIT packet has large size with
many address parameters), plus additional server parameters.
Later this chunk causes the panic in skb_put_data():
'nskb' (head skb) was previously allocated with packet->size
from u16 'chunk->chunk_hdr->length'.
As suggested by Marcelo we should check the chunk's length in
_sctp_make_chunk() before trying to allocate skb for it and
discard a chunk if its size bigger than SCTP_MAX_CHUNK_LEN.
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leinter@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 9 Feb 2018 19:30:23 +0000 (14:30 -0500)]
Merge branch 's390-qeth-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2018-02-09
please apply the following two qeth patches for 4.16 and stable.
One restricts a command quirk to the intended commandd type,
while the other fixes an off-by-one during data transmission
that can cause qeth to build malformed buffer descriptors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Fri, 9 Feb 2018 10:03:50 +0000 (11:03 +0100)]
s390/qeth: fix SETIP command handling
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.
Fixes: 5b54e16f1a54 ("qeth: do not spin for SETIP ip assist command") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ursula Braun [Fri, 9 Feb 2018 10:03:49 +0000 (11:03 +0100)]
s390/qeth: fix underestimated count of buffer elements
For a memory range/skb where the last byte falls onto a page boundary
(ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
calculation currently doesn't round up to the next PFN due to an
off-by-one error.
Thus qeth believes that the skb occupies one page less than it
actually does, and may select a IO buffer that doesn't have enough spare
buffer elements to fit all of the skb's data.
HW detects this as a malformed buffer descriptor, and raises an
exception which then triggers device recovery.
Fixes: 2863c61334aa ("qeth: refactor calculation of SBALE count") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 9 Feb 2018 09:45:50 +0000 (17:45 +0800)]
ptr_ring: try vmalloc() when kmalloc() fails
This patch switch to use kvmalloc_array() for using a vmalloc()
fallback to help in case kmalloc() fails.
Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Fri, 9 Feb 2018 09:45:49 +0000 (17:45 +0800)]
ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
To avoid slab to warn about exceeded size, fail early if queue
occupies more than KMALLOC_MAX_SIZE.
Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com Fixes: 2e0ab8ca83c12 ("ptr_ring: array based FIFO for pointers") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Fri, 9 Feb 2018 16:22:47 +0000 (17:22 +0100)]
net: stmmac: remove redundant enable of PMT irq
For dwmac4, GMAC_INT_DEFAULT_ENABLE already includes
GMAC_INT_PMT_EN, so it is redundant to check if hw->pmt
is set, and if so, setting the bit again.
For dwmac1000, GMAC_INT_DEFAULT_MASK does not include
GMAC_INT_DISABLE_PMT, so it is redundant to check if
hw->pmt is set, and if so, clearing an already cleared bit.
Improve code readability by removing this redundant code.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Fri, 9 Feb 2018 16:22:46 +0000 (17:22 +0100)]
net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
GMAC_INT_DEFAULT_MASK is written to the interrupt enable register.
In previous versions of the IP (e.g. dwmac1000), this register was
instead an interrupt mask register.
To improve clarity and reflect reality, rename GMAC_INT_DEFAULT_MASK
to GMAC_INT_DEFAULT_ENABLE.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Niklas Cassel [Fri, 9 Feb 2018 16:22:45 +0000 (17:22 +0100)]
net: stmmac: discard disabled flags in interrupt status register
The interrupt status register in both dwmac1000 and dwmac4 ignores
interrupt enable (for dwmac4) / interrupt mask (for dwmac1000).
Therefore, if we want to check only the bits that can actually trigger
an irq, we have to filter the interrupt status register manually.
Commit 0a764db10337 ("stmmac: Discard masked flags in interrupt status
register") fixed this for dwmac1000. Fix the same issue for dwmac4.
Just like commit 0a764db10337 ("stmmac: Discard masked flags in
interrupt status register"), this makes sure that we do not get
spurious link up/link down prints.
Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Falcon [Fri, 9 Feb 2018 17:41:09 +0000 (11:41 -0600)]
ibmvnic: Reset long term map ID counter
When allocating RX or TX buffer pools, the driver needs to provide a
unique mapping ID to firmware for each pool. This value is assigned
using a counter which is incremented after a new pool is created. The
ID can be an integer ranging from 1-255. When migrating to a device
that requests a different number of queues, this value was not being
reset properly. As a result, after enough migrations, the counter
exceeded the upper bound and pool creation failed. This is fixed by
resetting the counter to one in this case.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Two fixes for BPF sockmap in order to break up circular map references
from programs attached to sockmap, and detaching related sockets in
case of socket close() event. For the latter we get rid of the
smap_state_change() and plug into ULP infrastructure, which will later
also be used for additional features anyway such as TX hooks. For the
second issue, dependency chain is broken up via map release callback
to free parse/verdict programs, all from John.
2) Fix a libbpf relocation issue that was found while implementing XDP
support for Suricata project. Issue was that when clang was invoked
with default target instead of bpf target, then various other e.g.
debugging relevant sections are added to the ELF file that contained
relocation entries pointing to non-BPF related sections which libbpf
trips over instead of skipping them. Test cases for libbpf are added
as well, from Jesper.
3) Various misc fixes for bpftool and one for libbpf: a small addition
to libbpf to make sure it recognizes all standard section prefixes.
Then, the Makefile in bpftool/Documentation is improved to explicitly
check for rst2man being installed on the system as we otherwise risk
installing empty man pages; the man page for bpftool-map is corrected
and a set of missing bash completions added in order to avoid shipping
bpftool where the completions are only partially working, from Quentin.
4) Fix applying the relocation to immediate load instructions in the
nfp JIT which were missing a shift, from Jakub.
5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y
gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL
in this case; and explicitly delete the veth devices in the two tests
test_xdp_{meta,redirect}.sh before dismantling the netnses as when
selftests are run in batch mode, then workqueue to handle destruction
might not have finished yet and thus veth creation in next test under
same dev name would fail, from Yonghong.
6) Fix test_kmod.sh to check the test_bpf.ko module path before performing
an insmod, and fallback to modprobe. Especially the latter is useful
when having a device under test that has the modules installed instead,
from Naresh.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 9 Feb 2018 18:07:39 +0000 (10:07 -0800)]
Merge tag 'for-linus-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Only five small fixes for issues when running under Xen"
* tag 'for-linus-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix {set,clear}_foreign_p2m_mapping on autotranslating guests
pvcalls-back: do not return error on inet_accept EAGAIN
xen-netfront: Fix race between device setup and open
xen/grant-table: Use put_page instead of free_page
x86/xen: init %gs very early to avoid page faults with stack protector