net: ethernet: ti: ale: fix allmulti for nu type ale
On AM65xx MCU CPSW2G NUSS and 66AK2E/L NUSS allmulti setting does not allow
unregistered mcast packets to pass.
This happens, because ALE VLAN entries on these SoCs do not contain port
masks for reg/unreg mcast packets, but instead store indexes of
ALE_VLAN_MASK_MUXx_REG registers which intended for store port masks for
reg/unreg mcast packets.
This path was missed by commit 9d1f6447274f ("net: ethernet: ti: ale: fix
seeing unreg mcast packets with promisc and allmulti disabled").
Hence, fix it by taking into account ALE type in cpsw_ale_set_allmulti().
Fixes: 9d1f6447274f ("net: ethernet: ti: ale: fix seeing unreg mcast packets with promisc and allmulti disabled") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
The ALE parameters structure is created on stack, so it has to be reset
before passing to cpsw_ale_create() to avoid garbage values.
Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko [Fri, 12 Jun 2020 19:45:04 +0000 (12:45 -0700)]
libbpf: Support pre-initializing .bss global variables
Remove invalid assumption in libbpf that .bss map doesn't have to be updated
in kernel. With addition of skeleton and memory-mapped initialization image,
.bss doesn't have to be all zeroes when BPF map is created, because user-code
might have initialized those variables from user-space.
Andrii Nakryiko [Fri, 12 Jun 2020 20:16:03 +0000 (13:16 -0700)]
tools/bpftool: Fix skeleton codegen
Remove unnecessary check at the end of codegen() routine which makes codegen()
to always fail and exit bpftool with error code. Positive value of variable
n is not an indicator of a failure.
Fixes: 2c4779eff837 ("tools, bpftool: Exit on error in function codegen") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Link: https://lore.kernel.org/bpf/20200612201603.680852-1-andriin@fb.com
Sabrina Dubroca [Wed, 10 Jun 2020 10:19:43 +0000 (12:19 +0200)]
bpf: tcp: Recv() should return 0 when the peer socket is closed
If the peer is closed, we will never get more data, so
tcp_bpf_wait_data will get stuck forever. In case we passed
MSG_DONTWAIT to recv(), we get EAGAIN but we should actually get
0.
>From man 2 recv:
RETURN VALUE
When a stream socket peer has performed an orderly shutdown, the
return value will be 0 (the traditional "end-of-file" return).
This patch makes tcp_bpf_wait_data always return 1 when the peer
socket has been shutdown. Either we have data available, and it would
have returned 1 anyway, or there isn't, in which case we'll call
tcp_recvmsg which does the right thing in this situation.
Thomas Falcon [Fri, 12 Jun 2020 18:34:41 +0000 (13:34 -0500)]
ibmvnic: Flush existing work items before device removal
Ensure that all scheduled work items have completed before continuing
with device removal and after further event scheduling has been
halted. This patch fixes a bug where a scheduled driver reset event
is processed following device removal.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Fri, 12 Jun 2020 07:16:55 +0000 (00:16 -0700)]
genetlink: clean up family attributes allocations
genl_family_rcv_msg_attrs_parse() and genl_family_rcv_msg_attrs_free()
take a boolean parameter to determine whether allocate/free the family
attrs. This is unnecessary as we can just check family->parallel_ops.
More importantly, callers would not need to worry about pairing these
parameters correctly after this patch.
And this fixes a memory leak, as after commit c36f05559104
("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()")
we call genl_family_rcv_msg_attrs_parse() for both parallel and
non-parallel cases.
Fixes: c36f05559104 ("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()") Reported-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Tested-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This series fixes four bugs in the configuration of IPA endpoints.
See the description of each for more information.
In this version I have dropped the last patch from the series, and
restored a "static" keyword that had inadvertently gotten removed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:33 +0000 (14:48 -0500)]
net: ipa: header pad field only valid for AP->modem endpoint
Only QMAP endpoints should be configured to find a pad size field
within packet headers. They are found in the first byte of the QMAP
header (and the hardware fills only the 6 bits in that byte that
constitute the pad_len field).
The RMNet driver assumes the pad_len field is valid for received
packets, so we want to ensure the pad_len field is filled in that
case. That driver also assumes the length in the QMAP header
includes the pad bytes.
The RMNet driver does *not* pad the packets it sends, so the pad_len
field can be ignored.
Fix ipa_endpoint_init_hdr_ext() so it only marks the pad field
offset valid for QMAP RX endpoints, and in that case indicates
that the length field in the header includes the pad bytes.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:32 +0000 (14:48 -0500)]
net: ipa: program upper nibbles of sequencer type
The upper two nibbles of the sequencer type were not used for
SDM845, and were assumed to be 0. But for SC7180 they are used, and
so they must be programmed by ipa_endpoint_init_seq(). Fix this bug.
IPA_SEQ_PKT_PROCESS_NO_DEC_NO_UCP_DMAP doesn't have a descriptive
comment, so add one.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:31 +0000 (14:48 -0500)]
net: ipa: fix modem LAN RX endpoint id
The endpoint id assigned to the modem LAN RX endpoint for the SC7180 SoC
is incorrect. The erroneous value might have been copied from SDM845 and
never updated. The correct endpoint id to use for this SoC is 11.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Thu, 11 Jun 2020 19:48:30 +0000 (14:48 -0500)]
net: ipa: program metadata mask differently
The way the mask value is programmed for QMAP RX endpoints was based
on some wrong assumptions about the way metadata containing the QMAP
mux_id value is formatted. The metadata value supplied by the
modem is *not* in QMAP format, and in fact contains the mux_id we
want in its (big endian) low-order byte. That byte must be written
by the IPA into offset 1 of the QMAP header it inserts before the
received packet.
QMAP TX endpoints *do* use a QMAP header as the metadata sent with
each packet. The modem assumes this, and based on that assumes the
mux_id is in the second byte. To match those assumptions we must
program the modem TX (QMAP) endpoint HDR register to indicate the
metadata will be found at offset 0 in the message header.
The previous configuration managed to work, but it was not working
correctly. This patch fixes a bug whose symptom was receipt of
messages containing the wrong QMAP mux_id.
In fixing this, get rid of ipa_rmnet_mux_id_metadata_mask(), which
was more or less defined so there was a separate place to explain
what was happening as we generated the mask value. Instead, put a
longer description of how this works above ipa_endpoint_init_hdr(),
and define the metadata mask to use as a simple constant.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Fri, 12 Jun 2020 00:18:15 +0000 (17:18 -0700)]
ionic: add pcie_print_link_status
Print the PCIe link information for our device.
Fixes: 77f972a7077d ("ionic: remove support for mgmt device") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Jun 2020 01:25:20 +0000 (18:25 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2020-06-11
This series contains fixes to the iavf driver.
Brett fixes the supported link speeds in the iavf driver, which was only
able to report speeds that the i40e driver supported and was missing the
speeds supported by the ice driver. In addition, fix how 2.5 and 5.0
GbE speeds are reported.
Alek fixes a enum comparison that was comparing two different enums that
may have different values, so update the comparison to use matching
enums.
Paul increases the time to complete a reset to allow for 128 VFs to
complete a reset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Thu, 11 Jun 2020 20:57:00 +0000 (21:57 +0100)]
rxrpc: Fix race between incoming ACK parser and retransmitter
There's a race between the retransmission code and the received ACK parser.
The problem is that the retransmission loop has to drop the lock under
which it is iterating through the transmission buffer in order to transmit
a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
window round and discard the packets from the buffer.
The retransmission code then updated the annotations for the wrong packet
and a later retransmission thought it had to retransmit a packet that
wasn't there, leading to a NULL pointer dereference.
Fix this by:
(1) Moving the annotation change to before we drop the lock prior to
transmission. This means we can't vary the annotation depending on
the outcome of the transmission, but that's fine - we'll retransmit
again later if it failed now.
(2) Skipping the packet if the skb pointer is NULL.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 3 Jun 2020 17:54:36 +0000 (20:54 +0300)]
net/mlx5: E-Switch, Fix some error pointer dereferences
We can't leave "counter" set to an error pointer. Otherwise either it
will lead to an error pointer dereference later in the function or it
leads to an error pointer dereference when we call mlx5_fc_destroy().
Leon Romanovsky [Tue, 2 Jun 2020 12:28:37 +0000 (15:28 +0300)]
net/mlx5: Don't fail driver on failure to create debugfs
Clang warns:
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:6: warning: variable
'err' is used uninitialized whenever 'if' condition is true
[-Wsometimes-uninitialized]
if (!priv->dbg_root) {
^~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1303:9: note:
uninitialized use occurs here
return err;
^~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1278:2: note: remove the
'if' if its condition is always false
if (!priv->dbg_root) {
^~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/mellanox/mlx5/core/main.c:1259:9: note: initialize
the variable 'err' to silence this warning
int err;
^
= 0
1 warning generated.
The check of returned value of debugfs_create_dir() is wrong because
by the design debugfs failures should never fail the driver and the
check itself was wrong too. The kernel compiled without CONFIG_DEBUG_FS
will return ERR_PTR(-ENODEV) and not NULL as expected.
Parav Pandit [Fri, 15 May 2020 07:44:06 +0000 (02:44 -0500)]
net/mlx5: Fix devlink objects and devlink device unregister sequence
Current below problems exists.
1. devlink device is registered by mlx5_load_one(). But it is
not unregistered by mlx5_unload_one(). This is incorrect.
2. Above issue leads to,
When mlx5 PCI device is removed, currently devlink device is
unregistered before devlink ports are unregistered in below ladder
diagram.
remove_one()
mlx5_devlink_unregister()
[..]
devlink_unregister() <- ports are still registered!
mlx5_unload_one()
mlx5_unregister_device()
mlx5_remove_device()
mlx5e_remove()
mlx5e_devlink_port_unregister()
devlink_port_unregister()
3. Condition checking for registering and unregister device are not
symmetric either in these routines.
Hence, fix the sequence by having load and unload routines symmetric
and in right order.
i.e.
(a) register devlink device followed by registering devlink ports
(b) unregister devlink ports followed by devlink device
Do this based on boot and cleanup flags instead of different
conditions.
Fixes: c6acd629eec7 ("net/mlx5e: Add support for devlink-port in non-representors mode") Fixes: f60f315d339e ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parav Pandit [Thu, 14 May 2020 10:12:56 +0000 (05:12 -0500)]
net/mlx5: Disable reload while removing the device
While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.
Hence, disable the devlink reloading first when removing the device.
Aya Levin [Sun, 17 May 2020 09:45:52 +0000 (12:45 +0300)]
net/mlx5e: Fix ethtool hfunc configuration change
Changing RX hash function requires rearranging of RQT internal indexes,
the user isn't exposed to such changes and these changes do not affect
the user configured indirection table. Rebuild RQ table on hfunc change.
After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.
This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
Shay Drory [Thu, 7 May 2020 06:32:53 +0000 (09:32 +0300)]
net/mlx5: Fix fatal error handling during device load
Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.
Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Shay Drory [Wed, 6 May 2020 12:59:48 +0000 (15:59 +0300)]
net/mlx5: drain health workqueue in case of driver load error
In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev->iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().
Paul Greenwalt [Fri, 5 Jun 2020 17:09:46 +0000 (10:09 -0700)]
iavf: increase reset complete wait time
With an increased number of VFs, it's possible to encounter the following
issue during reset.
iavf b8d4:00:02.0: Hardware reset detected
iavf b8d4:00:02.0: Reset never finished (0)
iavf b8d4:00:02.0: Reset task did not complete, VF disabled
Increase the reset complete wait count to allow for 128 VFs to complete
reset.
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Fri, 5 Jun 2020 17:09:45 +0000 (10:09 -0700)]
iavf: Fix reporting 2.5 Gb and 5Gb speeds
Commit 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb
speeds") added the ability for the PF to report 2.5 and 5Gb speeds,
however, the iavf driver does not recognize those speeds as the values were
not added there. Add the proper enums and values so that iavf can properly
deal with those speeds.
Fixes: 4ae4916b5643 ("i40e: fix 'Unknown bps' in dmesg for 2.5Gb/5Gb speeds") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Witold Fijalkowski <witoldx.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
adapter->link_speed has type enum virtchnl_link_speed but our comparisons
are against enum iavf_aq_link_speed. Though they are, currently, the same
values, change the comparison to the matching enum virtchnl_link_speed
since that may not always be the case.
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Brett Creeley [Fri, 5 Jun 2020 17:09:43 +0000 (10:09 -0700)]
iavf: fix speed reporting over virtchnl
Link speeds are communicated over virtchnl using an enum
virtchnl_link_speed. Currently, the highest link speed is 40Gbps which
leaves us unable to reflect some speeds that an ice VF is capable of.
This causes link speed to be misreported on the iavf driver.
Allow for communicating link speeds using Mbps so that the proper speed can
be reported for an ice VF. Moving away from the enum allows us to
communicate future speed changes without requiring a new enum to be added.
In order to support communicating link speeds over virtchnl in Mbps the
following functionality was added:
- Added u32 link_speed_mbps in the iavf_adapter structure.
- Added the macro ADV_LINK_SUPPORT(_a) to determine if the VF
driver supports communicating link speeds in Mbps.
- Added the function iavf_get_vpe_link_status() to fill the
correct link_status in the event_data union based on the
ADV_LINK_SUPPORT(_a) macro.
- Added the function iavf_set_adapter_link_speed_from_vpe()
to determine whether or not to fill the u32 link_speed_mbps or
enum virtchnl_link_speed link_speed field in the iavf_adapter
structure based on the ADV_LINK_SUPPORT(_a) macro.
- Do not free vf_res in iavf_init_get_resources() as vf_res will be
accessed in iavf_get_link_ksettings(); memset to 0 instead. This
memory is subsequently freed in iavf_remove().
Fixes: 7c710869d64e ("ice: Add handlers for VF netdevice operations") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Sergey Nemov <sergey.nemov@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tobias Klauser [Thu, 11 Jun 2020 10:33:41 +0000 (12:33 +0200)]
tools, bpftool: Exit on error in function codegen
Currently, the codegen function might fail and return an error. But its
callers continue without checking its return value. Since codegen can
fail only in the unlikely case of the system running out of memory or
the static template being malformed, just exit(-1) directly from codegen
and make it void-returning.
Li RongQing [Thu, 11 Jun 2020 05:11:06 +0000 (13:11 +0800)]
xdp: Fix xsk_generic_xmit errno
Propagate sock_alloc_send_skb error code, not set it to
EAGAIN unconditionally, when fail to allocate skb, which
might cause that user space unnecessary loops.
Tuong Lien [Thu, 11 Jun 2020 10:08:08 +0000 (17:08 +0700)]
tipc: fix NULL pointer dereference in tipc_disc_rcv()
When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer related data along with a timer and a preformatted discovery
message buffer for later probing... However, this is only carried after
the bearer was set 'up', that left a race condition resulting in kernel
panic.
It occurs when a discovery message from a peer node is received and
processed in bottom half (since the bearer is 'up' already) just before
the discoverer object is created but is now accessed in order to update
the preformatted buffer (with a new trial address, ...) so leads to the
NULL pointer dereference.
We solve the problem by simply moving the bearer 'up' setting to later,
so make sure everything is ready prior to any message receiving.
Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Tuong Lien [Thu, 11 Jun 2020 10:07:35 +0000 (17:07 +0700)]
tipc: fix kernel WARNING in tipc_msg_append()
syzbot found the following issue:
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...
This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send <= maxnagle' check became true.
We resolve the issue by explicitly checking if Nagle is enabled for the
socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We
also reinforce the function to against such a negative values if any.
Reported-by: syzbot+75139a7d2605236b0b7f@syzkaller.appspotmail.com Fixes: c0bceb97db9e ("tipc: add smart nagle feature") Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Thu, 11 Jun 2020 04:07:39 +0000 (21:07 -0700)]
ionic: remove support for mgmt device
We no longer support the mgmt device in the ionic driver,
so remove the device id and related code.
Fixes: b3f064e9746d ("ionic: add support for device id 0x1004") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Xu Wang [Thu, 11 Jun 2020 02:45:20 +0000 (02:45 +0000)]
drivers: dpaa2: Use devm_kcalloc() in setup_dpni()
A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "devm_kcalloc".
Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
YiFei Zhu [Wed, 10 Jun 2020 18:41:40 +0000 (13:41 -0500)]
selftests/bpf: Add cgroup_skb/egress test for load_bytes_relative
When cgroup_skb/egress triggers the MAC header is not set. Added a
test that asserts reading MAC header is a -EFAULT but NET header
succeeds. The test result from within the eBPF program is stored in
an 1-element array map that the userspace then reads and asserts on.
Another assertion is added that reading from a large offset, past
the end of packet, returns -EFAULT.
YiFei Zhu [Wed, 10 Jun 2020 18:41:39 +0000 (13:41 -0500)]
net/filter: Permit reading NET in load_bytes_relative when MAC not set
Added a check in the switch case on start_header that checks for
the existence of the header, and in the case that MAC is not set
and the caller requests for MAC, -EFAULT. If the caller requests
for NET then MAC's existence is completely ignored.
There is no function to check NET header's existence and as far
as cgroup_skb/egress is concerned it should always be set.
Removed for ptr >= the start of header, considering offset is
bounded unsigned and should always be true. len <= end - mac is
redundant to ptr + len <= end.
Jakub Kicinski [Wed, 10 Jun 2020 23:59:11 +0000 (16:59 -0700)]
docs: networkng: convert sja1105's devlink info to RTS
A new file snuck into the tree after all existing documentation
was converted to RST. Convert sja1105's devlink info and move
it where the rest of the drivers are documented.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 10 Jun 2020 23:09:06 +0000 (16:09 -0700)]
docs: networkng: fix lists and table in sja1105
We need an empty line before list stats, otherwise first point
will be smooshed into the paragraph. Inside tables text must
start at the same offset in the cell, otherwise sphinx thinks
it's a new indented block.
Documentation/networking/dsa/sja1105.rst:108: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/networking/dsa/sja1105.rst:112: WARNING: Definition list ends without a blank line; unexpected unindent.
Documentation/networking/dsa/sja1105.rst:245: WARNING: Unexpected indentation.
Documentation/networking/dsa/sja1105.rst:246: WARNING: Block quote ends without a blank line; unexpected unindent.
Documentation/networking/dsa/sja1105.rst:253: WARNING: Unexpected indentation.
Documentation/networking/dsa/sja1105.rst:254: WARNING: Block quote ends without a blank line; unexpected unindent.
Fixes: a20bc43bfb2e ("docs: net: dsa: sja1105: document the best_effort_vlan_filtering option") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 10 Jun 2020 22:36:48 +0000 (15:36 -0700)]
docs: networking: fix extra spaces in ethtool-netlink
Sphinx appears to get upset at extra spaces at the end of a literal:
Documentation/networking/ethtool-netlink.rst:1032: WARNING: Inline literal start-string without end-string.
Documentation/networking/ethtool-netlink.rst:1034: WARNING: Inline literal start-string without end-string.
Documentation/networking/ethtool-netlink.rst:1036: WARNING: Inline literal start-string without end-string.
Documentation/networking/ethtool-netlink.rst:1089: WARNING: Inline literal start-string without end-string.
Documentation/networking/ethtool-netlink.rst:1091: WARNING: Inline literal start-string without end-string.
Documentation/networking/ethtool-netlink.rst:1093: WARNING: Inline literal start-string without end-string.
Fixes: f2bc8ad31a7f ("net: ethtool: Allow PHY cable test TDR data to configured") Fixes: a331172b156b ("net: ethtool: Add attributes for cable test TDR data") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
This is due to NAPI left enabled if macb_phylink_connect() fail.
Fixes: 7897b071ac3b ("net: macb: convert to phylink") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 10 Jun 2020 08:49:00 +0000 (10:49 +0200)]
mptcp: don't leak msk in token container
If a listening MPTCP socket has unaccepted sockets at close
time, the related msks are freed via mptcp_sock_destruct(),
which in turn does not invoke the proto->destroy() method
nor the mptcp_token_destroy() function.
Due to the above, the child msk socket is not removed from
the token container, leading to later UaF.
Address the issue explicitly removing the token even in the
above error path.
Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 10 Jun 2020 08:47:41 +0000 (10:47 +0200)]
mptcp: fix races between shutdown and recvmsg
The msk sk_shutdown flag is set by a workqueue, possibly
introducing some delay in user-space notification. If the last
subflow carries some data with the fin packet, the user space
can wake-up before RCV_SHUTDOWN is set. If it executes unblocking
recvmsg(), it may return with an error instead of eof.
Address the issue explicitly checking for eof in recvmsg(), when
no data is found.
Fixes: 59832e246515 ("mptcp: subflow: check parent mptcp socket on subflow state change") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Tue, 9 Jun 2020 23:27:28 +0000 (17:27 -0600)]
vxlan: Remove access to nexthop group struct
vxlan driver should be using helpers to access nexthop struct
internals. Remove open check if whether nexthop is multipath in
favor of the existing nexthop_is_multipath helper. Add a new
helper, nexthop_has_v4, to cover the need to check has_v4 in
a group.
Fixes: 1274e1cc4226 ("vxlan: ecmp support for mac fdb entries") Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Tue, 9 Jun 2020 02:54:43 +0000 (20:54 -0600)]
nexthop: Fix fdb labeling for groups
fdb nexthops are marked with a flag. For standalone nexthops, a flag was
added to the nh_info struct. For groups that flag was added to struct
nexthop when it should have been added to the group information. Fix
by removing the flag from the nexthop struct and adding a flag to nh_group
that mirrors nh_info and is really only a caching of the individual types.
Add a helper, nexthop_is_fdb, for use by the vxlan code and fixup the
internal code to use the flag from either nh_info or nh_group.
v2
- propagate fdb_nh in remove_nh_grp_entry
Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops") Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrii Nakryiko [Wed, 10 Jun 2020 05:23:35 +0000 (22:23 -0700)]
libbpf: Handle GCC noreturn-turned-volatile quirk
Handle a GCC quirk of emitting extra volatile modifier in DWARF (and
subsequently preserved in BTF by pahole) for function pointers marked as
__attribute__((noreturn)). This was the way to mark such functions before GCC
2.5 added noreturn attribute. Drop such func_proto modifiers, similarly to how
it's done for array (also to handle GCC quirk/bug).
Such volatile attribute is emitted by GCC only, so existing selftests can't
express such test. Simple repro is like this (compiled with GCC + BTF
generated by pahole):
Remove function declarations that are not available in the tree anymore.
Fixes: 709ffbe19b77 ("net: remove indirect block netdev event registration") Reported-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
tannerlove [Tue, 9 Jun 2020 21:21:32 +0000 (17:21 -0400)]
selftests/net: in rxtimestamp getopt_long needs terminating null entry
getopt_long requires the last element to be filled with zeros.
Otherwise, passing an unrecognized option can cause a segfault.
Fixes: 16e781224198 ("selftests/net: Add a test to validate behavior of rx timestamps") Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mvneta: do not redirect frames during reconfiguration
Disable frames injection in mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs
Fixes: b0a43db9087a ("net: mvneta: add XDP_TX support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Tue, 9 Jun 2020 14:18:16 +0000 (22:18 +0800)]
dccp: Fix possible memleak in dccp_init and dccp_fini
There are some memory leaks in dccp_init() and dccp_fini().
In dccp_fini() and the error handling path in dccp_init(), free lhash2
is missing. Add inet_hashinfo2_free_mod() to do it.
If inet_hashinfo2_init_mod() failed in dccp_init(),
percpu_counter_destroy() should be called to destroy dccp_orphan_count.
It need to goto out_free_percpu when inet_hashinfo2_init_mod() failed.
Fixes: c92c81df93df ("net: dccp: fix kernel crash on module load") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 9 Jun 2020 03:41:43 +0000 (20:41 -0700)]
ionic: wait on queue start until after IFF_UP
The netif_running() test looks at __LINK_STATE_START which
gets set before ndo_open() is called, there is a window of
time between that and when the queues are actually ready to
be run. If ionic_check_link_status() notices that the link is
up very soon after netif_running() becomes true, it might try
to run the queues before they are ready, causing all manner of
potential issues. Since the netdev->flags IFF_UP isn't set
until after ndo_open() returns, we can wait for that before
we allow ionic_check_link_status() to start the queues.
On the way back to close, __LINK_STATE_START is cleared before
calling ndo_stop(), and IFF_UP is cleared after. Both of
these need to be true in order to safely stop the queues
from ionic_check_link_status().
Fixes: 49d3b493673a ("ionic: disable the queues on link down") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Since the quiesce/activate rework, __netdev_watchdog_up() is directly
called in the ucc_geth driver.
Unfortunately, this function is not available for modules and thus
ucc_geth cannot be built as a module anymore. Fix it by exporting
__netdev_watchdog_up().
Since the commit introducing the regression was backported to stable
branches, this one should ideally be as well.
Fixes: 79dde73cf9bc ("net/ethernet/freescale: rework quiesce/activate for ucc_geth") Signed-off-by: Valentin Longchamp <valentin@longchamp.me> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 8 Jun 2020 21:53:01 +0000 (14:53 -0700)]
net: change addr_list_lock back to static key
The dynamic key update for addr_list_lock still causes troubles,
for example the following race condition still exists:
CPU 0: CPU 1:
(RCU read lock) (RTNL lock)
dev_mc_seq_show() netdev_update_lockdep_key()
-> lockdep_unregister_key()
-> netif_addr_lock_bh()
because lockdep doesn't provide an API to update it atomically.
Therefore, we have to move it back to static keys and use subclass
for nest locking like before.
In commit 1a33e10e4a95 ("net: partially revert dynamic lockdep key
changes"), I already reverted most parts of commit ab92d68fc22f
("net: core: add generic lockdep keys").
This patch reverts the rest and also part of commit f3b0a18bb6cb
("net: remove unnecessary variables and callback"). After this
patch, addr_list_lock changes back to using static keys and
subclasses to satisfy lockdep. Thanks to dev->lower_level, we do
not have to change back to ->ndo_get_lock_subclass().
And hopefully this reduces some syzbot lockdep noises too.
Reported-by: syzbot+f3a0e80c34b3fc28ac5e@syzkaller.appspotmail.com Cc: Taehee Yoo <ap420073@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
V2:
- Defer changing BPF-syscall to start at file-descriptor 1
- Use {} to zero initialise struct.
The recent commit fbee97feed9b ("bpf: Add support to attach bpf program to a
devmap entry"), introduced ability to attach (and run) a separate XDP
bpf_prog for each devmap entry. A bpf_prog is added via a file-descriptor.
As zero were a valid FD, not using the feature requires using value minus-1.
The UAPI is extended via tail-extending struct bpf_devmap_val and using
map->value_size to determine the feature set.
This will break older userspace applications not using the bpf_prog feature.
Consider an old userspace app that is compiled against newer kernel
uapi/bpf.h, it will not know that it need to initialise the member
bpf_prog.fd to minus-1. Thus, users will be forced to update source code to
get program running on newer kernels.
This patch remove the minus-1 checks, and have zero mean feature isn't used.
Followup patches either for kernel or libbpf should handle and avoid
returning file-descriptor zero in the first place.
Lorenz Bauer [Mon, 8 Jun 2020 16:22:01 +0000 (17:22 +0100)]
bpf: cgroup: Allow multi-attach program to replace itself
When using BPF_PROG_ATTACH to attach a program to a cgroup in
BPF_F_ALLOW_MULTI mode, it is not possible to replace a program
with itself. This is because the check for duplicate programs
doesn't take the replacement program into account.
Replacing a program with itself might seem weird, but it has
some uses: first, it allows resetting the associated cgroup storage.
Second, it makes the API consistent with the non-ALLOW_MULTI usage,
where it is possible to replace a program with itself. Third, it
aligns BPF_PROG_ATTACH with bpf_link, where replacing itself is
also supported.
Sice this code has been refactored a few times this change will
only apply to v5.7 and later. Adjustments could be made to
commit 1020c1f24a94 ("bpf: Simplify __cgroup_bpf_attach") and
commit d7bf2c10af05 ("bpf: allocate cgroup storage entries on attaching bpf programs")
as well as commit 324bda9e6c5a ("bpf: multi program support for cgroup+bpf")
tracing/probe: Fix bpf_task_fd_query() for kprobes and uprobes
Commit 60d53e2c3b75 ("tracing/probe: Split trace_event related data from
trace_probe") removed the trace_[ku]probe structure from the
trace_event_call->data pointer. As bpf_get_[ku]probe_info() were
forgotten in that change, fix them now. These functions are currently
only used by the bpf_task_fd_query() syscall handler to collect
information about a perf event.
Fixes: 60d53e2c3b75 ("tracing/probe: Split trace_event related data from trace_probe") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/20200608124531.819838-1-jean-philippe@linaro.org
Lorenz Bauer [Mon, 8 Jun 2020 09:42:57 +0000 (10:42 +0100)]
scripts: Require pahole v1.16 when generating BTF
bpf_iter requires the kernel BTF to be generated with
pahole >= 1.16, since otherwise the function definitions
that the iterator attaches to are not included.
This failure mode is indistiguishable from trying to attach
to an iterator that really doesn't exist.
Since it's really easy to miss this requirement, bump the
pahole version check used at build time to at least 1.16.
Fixes: 15d83c4d7cef ("bpf: Allow loading of a bpf_iter program") Suggested-by: Ivan Babrou <ivan@cloudflare.com> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200608094257.47366-1-lmb@cloudflare.com
Jakub Sitnicki [Sun, 7 Jun 2020 20:52:29 +0000 (22:52 +0200)]
bpf, sockhash: Synchronize delete from bucket list on map free
We can end up modifying the sockhash bucket list from two CPUs when a
sockhash is being destroyed (sock_hash_free) on one CPU, while a socket
that is in the sockhash is unlinking itself from it on another CPU
it (sock_hash_delete_from_link).
This results in accessing a list element that is in an undefined state as
reported by KASAN:
Fix it by reintroducing spin-lock protected critical section around the
code that removes the elements from the bucket on sockhash free.
To do that we also need to defer processing of removed elements, until out
of atomic context so that we can unlink the socket from the map when
holding the sock lock.
Fixes: 90db6d772f74 ("bpf, sockmap: Remove bucket->lock from sock_{hash|map}_free") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200607205229.2389672-3-jakub@cloudflare.com
Jakub Sitnicki [Sun, 7 Jun 2020 20:52:28 +0000 (22:52 +0200)]
bpf, sockhash: Fix memory leak when unlinking sockets in sock_hash_free
When sockhash gets destroyed while sockets are still linked to it, we will
walk the bucket lists and delete the links. However, we are not freeing the
list elements after processing them, leaking the memory.
The leak can be triggered by close()'ing a sockhash map when it still
contains sockets, and observed with kmemleak:
dihu [Fri, 5 Jun 2020 08:46:25 +0000 (16:46 +0800)]
bpf/sockmap: Fix kernel panic at __tcp_bpf_recvmsg
When user application calls read() with MSG_PEEK flag to read data
of bpf sockmap socket, kernel panic happens at
__tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
queue after read out under MSG_PEEK flag is set. Because it's not
judged whether sk_msg is the last msg of ingress_msg queue, the next
sk_msg may be the head of ingress_msg queue, whose memory address of
sg page is invalid. So it's necessary to add check codes to prevent
this problem.
tannerlove [Mon, 8 Jun 2020 19:37:15 +0000 (15:37 -0400)]
selftests/net: in timestamping, strncpy needs to preserve null byte
If user passed an interface option longer than 15 characters, then
device.ifr_name and hwtstamp.ifr_name became non-null-terminated
strings. The compiler warned about this:
Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Jun 2020 02:13:37 +0000 (19:13 -0700)]
Merge tag 'rxrpc-fixes-20200605' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fix hang due to missing notification
Here's a fix for AF_RXRPC. Occasionally calls hang because there are
circumstances in which rxrpc generate a notification when a call is
completed - primarily because initial packet transmission failed and the
call was killed off and an error returned. But the AFS filesystem driver
doesn't check this under all circumstances, expecting failure to be
delivered by asynchronous notification.
There are two patches: the first moves the problematic bits out-of-line and
the second contains the fix.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Roy [Mon, 8 Jun 2020 01:54:41 +0000 (18:54 -0700)]
net-zerocopy: use vm_insert_pages() for tcp rcv zerocopy
Use vm_insert_pages() for tcp receive zerocopy. Spin lock cycles (as
reported by perf) drop from a couple of percentage points to a fraction of
a percent. This results in a roughly 6% increase in efficiency, measured
roughly as zerocopy receive count divided by CPU utilization.
The intention of this patchset is to reduce atomic ops for tcp zerocopy
receives, which normally hits the same spinlock multiple times
consecutively.
[akpm@linux-foundation.org: suppress gcc-7.2.0 warning] Link: http://lkml.kernel.org/r/20200128025958.43490-3-arjunroy.kdev@gmail.com Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pooja Trivedi [Fri, 5 Jun 2020 16:01:18 +0000 (16:01 +0000)]
net/tls(TLS_SW): Add selftest for 'chunked' sendfile test
This selftest tests for cases where sendfile's 'count'
parameter is provided with a size greater than the intended
file size.
Motivation: When sendfile is provided with 'count' parameter
value that is greater than the size of the file, kTLS example
fails to send the file correctly. Last chunk of the file is
not sent, and the data integrity is compromised.
The reason is that the last chunk has MSG_MORE flag set
because of which it gets added to pending records, but is
not pushed.
Note that if user space were to send SSL_shutdown control
message, pending records would get flushed and the issue
would not happen. So a shutdown control message following
sendfile can mask the issue.
Signed-off-by: Pooja Trivedi <pooja.trivedi@stackpath.com> Signed-off-by: Mallesham Jatharkonda <mallesham.jatharkonda@oneconvergence.com> Signed-off-by: Josh Tway <josh.tway@stackpath.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 9 Jun 2020 00:14:19 +0000 (17:14 -0700)]
Merge tag 'mac80211-for-davem-2020-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Just a small update:
* fix the deadlock on rfkill/wireless removal that a few
people reported
* fix an uninitialized variable
* update wiki URLs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix test race, in which background poll can get either 5 or 6 samples,
depending on timing of notification. Prevent this by open-coding sample
triggering and forcing notification for the very last sample only.
Also switch to using atomic increments and exchanges for more obviously
reliable counting and checking. Additionally, check expected processed sample
counters for single-threaded use cases as well.
- Fix the build with certain Kconfig combinations for the Chelsio
inline TLS device, from Rohit Maheshwar and Vinay Kumar Yadavi.
- Fix leak in genetlink, from Cong Lang.
- Fix out of bounds packet header accesses in seg6, from Ahmed
Abdelsalam.
- Two XDP fixes in the ENA driver, from Sameeh Jubran
- Use rwsem in device rename instead of a seqcount because this code
can sleep, from Ahmed S. Darwish.
- Fix WoL regressions in r8169, from Heiner Kallweit.
- Fix qed crashes in kdump mode, from Alok Prasad.
- Fix the callbacks used for certain thermal zones in mlxsw, from Vadim
Pasternak.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
net: dsa: lantiq_gswip: fix and improve the unsupported interface error
mlxsw: core: Use different get_trend() callbacks for different thermal zones
net: dp83869: Reset return variable if PHY strap is read
rhashtable: Drop raw RCU deref in nested_table_free
cxgb4: Use kfree() instead kvfree() where appropriate
net: qed: fixes crash while running driver in kdump kernel
vsock/vmci: make vmci_vsock_transport_cb() static
net: ethtool: Fix comment mentioning typo in IS_ENABLED()
net: phy: mscc: fix Serdes configuration in vsc8584_config_init
net: mscc: Fix OF_MDIO config check
net: marvell: Fix OF_MDIO config check
net: dp83867: Fix OF_MDIO config check
net: dp83869: Fix OF_MDIO config check
net: ethernet: mvneta: fix MVNETA_SKB_HEADROOM alignment
ethtool: linkinfo: remove an unnecessary NULL check
net/xdp: use shift instead of 64 bit division
crypto/chtls:Fix compile error when CONFIG_IPV6 is disabled
inet_connection_sock: clear inet_num out of destroy helper
yam: fix possible memory leak in yam_init_driver
lan743x: Use correct MAC_CR configuration for 1 GBit speed
...
- Rework the sparc32 page tables so that READ_ONCE(*pmd), as done by
generic code, operates on a word sized element. From Will Deacon.
- Some scnprintf() conversions, from Chen Zhou.
- A pin_user_pages() conversion from John Hubbard.
- Several 32-bit ptrace register handling fixes and such from Al Viro.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
fix a braino in "sparc32: fix register window handling in genregs32_[gs]et()"
sparc32: mm: Only call ctor()/dtor() functions for first and last user
sparc32: mm: Disable SPLIT_PTLOCK_CPUS
sparc32: mm: Don't try to free page-table pages if ctor() fails
sparc32: register memory occupied by kernel as memblock.memory
sparc: remove unused header file nfs_fs.h
sparc32: fix register window handling in genregs32_[gs]et()
sparc64: fix misuses of access_process_vm() in genregs32_[sg]et()
oradax: convert get_user_pages() --> pin_user_pages()
sparc: use scnprintf() in show_pciobppath_attr() in vio.c
sparc: use scnprintf() in show_pciobppath_attr() in pci.c
tty: vcc: Fix error return code in vcc_probe()
sparc32: mm: Reduce allocation size for PMD and PTE tables
sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
sparc32: mm: Restructure sparc32 MMU page-table layout
sparc32: mm: Fix argument checking in __srmmu_get_nocache()
sparc64: Replace zero-length array with flexible-array
sparc: mm: return true,false in kern_addr_valid()
net: dsa: lantiq_gswip: fix and improve the unsupported interface error
While trying to use the lantiq_gswip driver on one of my boards I made
a mistake when specifying the phy-mode (because the out-of-tree driver
wants phy-mode "gmii" or "mii" for the internal PHYs). In this case the
following error is printed multiple times:
Unsupported interface: 3
While it gives at least a hint at what may be wrong it is not very user
friendly. Print the human readable phy-mode and also which port is
configured incorrectly (this hardware supports ports 0..6) to improve
the cases where someone made a mistake.
Fixes: 14fceff4771e51 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Vadim Pasternak [Sun, 7 Jun 2020 08:10:27 +0000 (11:10 +0300)]
mlxsw: core: Use different get_trend() callbacks for different thermal zones
The driver registers three different types of thermal zones: For the
ASIC itself, for port modules and for gearboxes.
Currently, all three types use the same get_trend() callback which does
not work correctly for the ASIC thermal zone. The callback assumes that
the device data is of type 'struct mlxsw_thermal_module', whereas for
the ASIC thermal zone 'struct mlxsw_thermal' is passed as device data.
Fix this by using one get_trend() callback for the ASIC thermal zone and
another for the other two types.
Fixes: 6f73862fabd9 ("mlxsw: core: Add the hottest thermal zone detection") Signed-off-by: Vadim Pasternak <vadimp@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 7 Jun 2020 23:13:43 +0000 (16:13 -0700)]
Merge tag 'pinctrl-v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v5.8 kernel cycle.
It's just really boring this time. Zero core changes. Just linear
development, cleanups and misc noncritical fixes. Some new drivers for
very new Qualcomm and Intel chips.
New drivers:
- Intel Jasper Lake support.
- NXP Freescale i.MX8DXL support.
- Qualcomm SM8250 support.
- Renesas R8A7742 SH-PFC support.
Driver improvements:
- Severe cleanup and modernization of the MCP23s08 driver.
- Mediatek driver modularized.
- Setting config supported in the Meson driver.
- Wakeup support for the Broadcom BCM7211"
* tag 'pinctrl-v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (72 commits)
pinctrl: sprd: Fix the incorrect pull-up definition
pinctrl: pxa: pxa2xx: Remove 'pxa2xx_pinctrl_exit()' which is unused and broken
pinctrl: freescale: imx: Use 'devm_of_iomap()' to avoid a resource leak in case of error in 'imx_pinctrl_probe()'
pinctrl: freescale: imx: Fix an error handling path in 'imx_pinctrl_probe()'
pinctrl: sirf: add missing put_device() call in sirfsoc_gpio_probe()
pinctrl: imxl: Fix an error handling path in 'imx1_pinctrl_core_probe()'
pinctrl: bcm2835: Add support for wake-up interrupts
pinctrl: bcm2835: Match BCM7211 compatible string
dt-bindings: pinctrl: Document optional BCM7211 wake-up interrupts
dt-bindings: pinctrl: Document 7211 compatible for brcm, bcm2835-gpio.txt
dt-bindings: pinctrl: stm32: Add missing interrupts property
pinctrl: at91-pio4: Add COMPILE_TEST support
pinctrl: Fix return value about devm_platform_ioremap_resource()
MAINTAINERS: Renesas Pin Controllers are supported
dt-bindings: pinctrl: ocelot: Add Sparx5 SoC support
pinctrl: ocelot: Fix GPIO interrupt decoding on Jaguar2
pinctrl: ocelot: Remove instance number from pin functions
pinctrl: ocelot: Always register GPIO driver
dt-bindings: pinctrl: rockchip: update example
pinctrl: amd: Add ACPI dependency
...
* tag 'rtc-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (29 commits)
rtc: pcf2127: watchdog: handle nowayout feature
rtc: fsl-ftm-alarm: fix freeze(s2idle) failed to wake
rtc: abx80x: Provide debug feedback for invalid dt properties
rtc: abx80x: Add Device Tree matching table
rtc: rv3028: Add missed check for devm_regmap_init_i2c()
rtc: mpc5121: Use correct return value for mpc5121_rtc_probe()
rtc: goldfish: Use correct return value for goldfish_rtc_probe()
rtc: snvs: Add necessary clock operations for RTC APIs
rtc: snvs: Make SNVS clock always prepared
rtc: ingenic: Reset regulator register in probe
rtc: ingenic: Fix masking of error code
rtc: ingenic: Remove unused fields from private structure
rtc: ingenic: Set wakeup params in probe
rtc: ingenic: Enable clock in probe
rtc: ingenic: Use local 'dev' variable in probe
rtc: ingenic: Only support probing from devicetree
rtc: mc13xxx: fix a double-unlock issue
rtc: stmp3xxx: update contact email
rtc: max77686: Use single-byte writes on MAX77620
rtc: pcf2127: report battery switch over
...
Linus Torvalds [Sun, 7 Jun 2020 23:08:41 +0000 (16:08 -0700)]
Merge tag 'ntb-5.8' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"Intel Icelake NTB support, Intel driver bug fixes, and lots of bug
fixes for ntb tests"
* tag 'ntb-5.8' of git://github.com/jonmason/ntb:
NTB: ntb_test: Fix bug when counting remote files
NTB: perf: Fix race condition when run with ntb_test
NTB: perf: Fix support for hardware that doesn't have port numbers
NTB: perf: Don't require one more memory window than number of peers
NTB: ntb_pingpong: Choose doorbells based on port number
NTB: Fix the default port and peer numbers for legacy drivers
NTB: Revert the change to use the NTB device dev for DMA allocations
NTB: ntb_tool: reading the link file should not end in a NULL byte
ntb_perf: avoid false dma unmap of destination address
ntb_perf: increase sleep time from one milli sec to one sec
ntb_tool: pass correct struct device to dma_alloc_coherent
ntb_perf: pass correct struct device to dma_alloc_coherent
ntb: hw: remove the code that sets the DMA mask
NTB: correct ntb_peer_spad_addr and ntb_peer_spad_read comment typos
ntb: intel: fix static declaration
ntb: intel: add hw workaround for NTB BAR alignment
ntb: intel: Add Icelake (gen4) support for Intel NTB
NTB: Fix static check warning in perf_clear_test
include/ntb: Fix typo in ntb_unregister_device description
Linus Torvalds [Sun, 7 Jun 2020 23:04:49 +0000 (16:04 -0700)]
Merge tag 'apparmor-pr-2020-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- Replace zero-length array with flexible-array
- add a valid state flags check
- add consistency check between state and dfa diff encode flags
- add apparmor subdir to proc attr interface
- fail unpack if profile mode is unknown
- add outofband transition and use it in xattr match
- ensure that dfa state tables have entries
Cleanups:
- Use true and false for bool variable
- Remove semicolon
- Clean code by removing redundant instructions
- Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
- remove duplicate check of xattrs on profile attachment
- remove useless aafs_create_symlink
Bug fixes:
- Fix memory leak of profile proxy
- fix introspection of of task mode for unconfined tasks
- fix nnp subset test for unconfined
- check/put label on apparmor_sk_clone_security()"
* tag 'apparmor-pr-2020-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: Fix memory leak of profile proxy
apparmor: fix introspection of of task mode for unconfined tasks
apparmor: check/put label on apparmor_sk_clone_security()
apparmor: Use true and false for bool variable
security/apparmor/label.c: Clean code by removing redundant instructions
apparmor: Replace zero-length array with flexible-array
apparmor: ensure that dfa state tables have entries
apparmor: remove duplicate check of xattrs on profile attachment.
apparmor: add outofband transition and use it in xattr match
apparmor: fail unpack if profile mode is unknown
apparmor: fix nnp subset test for unconfined
apparmor: remove useless aafs_create_symlink
apparmor: add proc subdir to attrs
apparmor: add consistency check between state and dfa diff encode flags
apparmor: add a valid state flags check
AppArmor: Remove semicolon
apparmor: Replace two seq_printf() calls by seq_puts() in aa_label_seq_xprint()
Roberto Sassu [Sun, 7 Jun 2020 21:00:29 +0000 (23:00 +0200)]
ima: Remove __init annotation from ima_pcrread()
Commit 6cc7c266e5b4 ("ima: Call ima_calc_boot_aggregate() in
ima_eventdigest_init()") added a call to ima_calc_boot_aggregate() so that
the digest can be recalculated for the boot_aggregate measurement entry if
the 'd' template field has been requested. For the 'd' field, only SHA1 and
MD5 digests are accepted.
Given that ima_eventdigest_init() does not have the __init annotation, all
functions called should not have it. This patch removes __init from
ima_pcrread().
John Johansen [Sat, 6 Jun 2020 01:12:21 +0000 (18:12 -0700)]
apparmor: fix introspection of of task mode for unconfined tasks
Fix two issues with introspecting the task mode.
1. If a task is attached to a unconfined profile that is not the
ns->unconfined profile then. Mode the mode is always reported
as -
$ ps -Z
LABEL PID TTY TIME CMD
unconfined 1287 pts/0 00:00:01 bash
test (-) 1892 pts/0 00:00:00 ps
instead of the correct value of (unconfined) as shown below
$ ps -Z
LABEL PID TTY TIME CMD
unconfined 2483 pts/0 00:00:01 bash
test (unconfined) 3591 pts/0 00:00:00 ps
2. if a task is confined by a stack of profiles that are unconfined
the output of label mode is again the incorrect value of (-) like
above, instead of (unconfined). This is because the visibile
profile count increment is skipped by the special casing of
unconfined.
Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels") Signed-off-by: John Johansen <john.johansen@canonical.com>
apparmor: check/put label on apparmor_sk_clone_security()
Currently apparmor_sk_clone_security() does not check for existing
label/peer in the 'new' struct sock; it just overwrites it, if any
(with another reference to the label of the source sock.)
The label reference count leak is observed if apparmor_sock_graft()
is called previously: this sets the 'ctx->label' field by getting
a reference to the current label (later overwritten, without put.)
Apparently both calls are done on their own right, especially for
other LSMs, being introduced in 2010/2014, before apparmor socket
mediation in 2017 (see commits [1,2,3,4]).
So, it looks OK there! Let's fix the reference leak in apparmor.
Test-case:
---------
Exercise that code path enough to overflow label reference count.
[1] commit 507cad355fc9 ("crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets")
[2] commit 4c63f83c2c2e ("crypto: af_alg - properly label AF_ALG socket")
[3] commit 2acce6aa9f65 ("Networking") a.k.a ("crypto: af_alg - Avoid sock_graft call warning)
[4] commit 56974a6fcfef ("apparmor: add base infastructure for socket mediation")
Fixes: 56974a6fcfef ("apparmor: add base infastructure for socket mediation") Reported-by: Brian Moyles <bmoyles@netflix.com> Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
Linus Torvalds [Sun, 7 Jun 2020 17:59:32 +0000 (10:59 -0700)]
Merge tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the large set of char/misc driver patches for 5.8-rc1
Included in here are:
- habanalabs driver updates, loads
- mhi bus driver updates
- extcon driver updates
- clk driver updates (approved by the clock maintainer)
- firmware driver updates
- fpga driver updates
- gnss driver updates
- coresight driver updates
- interconnect driver updates
- parport driver updates (it's still alive!)
- nvmem driver updates
- soundwire driver updates
- visorbus driver updates
- w1 driver updates
- various misc driver updates
In short, loads of different driver subsystem updates along with the
drivers as well.
All have been in linux-next for a while with no reported issues"
* tag 'char-misc-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (233 commits)
habanalabs: correctly cast u64 to void*
habanalabs: initialize variable to default value
extcon: arizona: Fix runtime PM imbalance on error
extcon: max14577: Add proper dt-compatible strings
extcon: adc-jack: Fix an error handling path in 'adc_jack_probe()'
extcon: remove redundant assignment to variable idx
w1: omap-hdq: print dev_err if irq flags are not cleared
w1: omap-hdq: fix interrupt handling which did show spurious timeouts
w1: omap-hdq: fix return value to be -1 if there is a timeout
w1: omap-hdq: cleanup to add missing newline for some dev_dbg
/dev/mem: Revoke mappings when a driver claims the region
misc: xilinx-sdfec: convert get_user_pages() --> pin_user_pages()
misc: xilinx-sdfec: cleanup return value in xsdfec_table_write()
misc: xilinx-sdfec: improve get_user_pages_fast() error handling
nvmem: qfprom: remove incorrect write support
habanalabs: handle MMU cache invalidation timeout
habanalabs: don't allow hard reset with open processes
habanalabs: GAUDI does not support soft-reset
habanalabs: add print for soft reset due to event
habanalabs: improve MMU cache invalidation code
...
Linus Torvalds [Sun, 7 Jun 2020 17:53:36 +0000 (10:53 -0700)]
Merge tag 'driver-core-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core patches for 5.8-rc1.
Not all that huge this release, just a number of small fixes and
updates:
- software node fixes
- kobject now sends KOBJ_REMOVE when it is removed from sysfs, not
when it is removed from memory (which could come much later)
- device link additions and fixes based on testing on more devices
- firmware core cleanups
- other minor changes, full details in the shortlog
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (23 commits)
driver core: Update device link status correctly for SYNC_STATE_ONLY links
firmware_loader: change enum fw_opt to u32
software node: implement software_node_unregister()
kobject: send KOBJ_REMOVE uevent when the object is removed from sysfs
driver core: Remove unnecessary is_fwnode_dev variable in device_add()
drivers property: When no children in primary, try secondary
driver core: platform: Fix spelling errors in platform.c
driver core: Remove check in driver_deferred_probe_force_trigger()
of: platform: Batch fwnode parsing when adding all top level devices
driver core: fw_devlink: Add support for batching fwnode parsing
driver core: Look for waiting consumers only for a fwnode's primary device
driver core: Move code to the right part of the file
Revert "Revert "driver core: Set fw_devlink to "permissive" behavior by default""
drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a driver developer is foolish
firmware_loader: move fw_fallback_config to a private kernel symbol namespace
driver core: Add missing '\n' in log messages
driver/base/soc: Use kobj_to_dev() API
Add documentation on meaning of -EPROBE_DEFER
driver core: platform: remove redundant assignment to variable ret
debugfs: Use the correct style for SPDX License Identifier
...