Currently, device V1 and V2 do not support segmentation
offload for UDP based tunnel packet who needs outer UDP
checksum offload, so there is a workaround in the driver
to set the checksum of the outer UDP checksum as zero. This
is not what the user wants, so remove this feature for
device V1 and V2, add support for it later(when the device
has the ability to do that).
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Sat, 28 Nov 2020 03:51:45 +0000 (11:51 +0800)]
net: hns3: add support for TX hardware checksum offload
For the device that supports TX hardware checksum, the hardware
can calculate the checksum from the start and fill the checksum
to the offset position, which reduces the operations of
calculating the type and header length of L3/L4. So add this
feature for the HNS3 ethernet driver.
The previous simple BD description is unsuitable, rename it as
HW TX CSUM.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Huazhong Tan [Sat, 28 Nov 2020 03:51:44 +0000 (11:51 +0800)]
net: hns3: add support for RX completion checksum
In some cases (for example ip fragment), hardware will
calculate the checksum of whole packet in RX, and setup
the HNS3_RXD_L2_CSUM_B flag in the descriptor, so add
support to utilize this checksum.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukas Bulwahn [Fri, 27 Nov 2020 09:34:21 +0000 (10:34 +0100)]
net/ipv6: propagate user pointer annotation
For IPV6_2292PKTOPTIONS, do_ipv6_getsockopt() stores the user pointer
optval in the msg_control field of the msghdr.
Hence, sparse rightfully warns at ./net/ipv6/ipv6_sockglue.c:1151:33:
warning: incorrect type in assignment (different address spaces)
expected void *msg_control
got char [noderef] __user *optval
Since commit 1f466e1f15cf ("net: cleanly handle kernel vs user buffers for
->msg_control"), user pointers shall be stored in the msg_control_user
field, and kernel pointers in the msg_control field. This allows to
propagate __user annotations nicely through this struct.
Store optval in msg_control_user to properly record and propagate the
memory space annotation of this pointer.
Note that msg_control_is_user is set to true, so the key invariant, i.e.,
use msg_control_user if and only if msg_control_is_user is true, holds.
The msghdr is further used in the six alternative put_cmsg() calls, with
msg_control_is_user being true, put_cmsg() picks msg_control_user
preserving the __user annotation and passes that properly to
copy_to_user().
Marco Elver [Wed, 25 Nov 2020 22:48:40 +0000 (23:48 +0100)]
net: switch to storing KCOV handle directly in sk_buff
It turns out that usage of skb extensions can cause memory leaks. Ido
Schimmel reported: "[...] there are instances that blindly overwrite
'skb->extensions' by invoking skb_copy_header() after __alloc_skb()."
Therefore, give up on using skb extensions for KCOV handle, and instead
directly store kcov_handle in sk_buff.
Vlad Buslov [Fri, 27 Nov 2020 15:12:05 +0000 (17:12 +0200)]
net: sched: remove redundant 'rtnl_held' argument
Functions tfilter_notify_chain() and tcf_get_next_proto() are always called
with rtnl lock held in current implementation. Moreover, attempting to call
them without rtnl lock would cause a warning down the call chain in
function __tcf_get_next_proto() that requires the lock to be held by
callers. Remove the 'rtnl_held' argument in order to simplify the code and
make rtnl lock requirement explicit.
Gustavo A. R. Silva's patch for the pcan_usb driver fixes fall-through warnings
for Clang.
The next 5 patches target the mcp251xfd driver and are by Ursula Maplehurst and
me. They optimizie the TEF and RX path by reducing number of SPI core requests
to set the UINC bit.
The remaining 8 patches target the m_can driver. The first 4 are various
cleanups for the SPI binding driver (tcan4x5x) by Sean Nyekjaer, Dan Murphy and
me. Followed by 4 cleanup patches by me for the m_can and m_can_platform
driver.
* tag 'linux-can-next-for-5.11-20201130' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: m_can: m_can_class_unregister(): move right after m_can_class_register()
can: m_can: m_can_plat_remove(): remove unneeded platform_set_drvdata()
can: m_can: remove not used variable struct m_can_classdev::freq
can: m_can: Kconfig: convert the into menu
can: tcan4x5x: tcan4x5x_can_probe(): remove probe failed error message
can: tcan4x5x: remove mram_start and reg_offset from struct tcan4x5x_priv
can: tcan4x5x: rename parse_config() function
can: tcan4x5x: tcan4x5x_clear_interrupts(): remove redundant return statement
can: mcp251xfd: tef-path: reduce number of SPI core requests to set UINC bit
can: mcp251xfd: move struct mcp251xfd_tef_ring definition
can: mcp251xfd: struct mcp251xfd_priv::tef to array of length 1
can: mcp25xxfd: rx-path: reduce number of SPI core requests to set UINC bit
can: mcp251xfd: mcp25xxfd_ring_alloc(): add define instead open coding the maximum number of RX objects
can: pcan_usb_core: fix fall-through warnings for Clang
====================
====================
mptcp: avoid workqueue usage for data
The current locking schema used to protect the MPTCP data-path
requires the usage of the MPTCP workqueue to process the incoming
data, depending on trylock result.
The above poses scalability limits and introduces random delays
in MPTCP-level acks.
With this series we use a single spinlock to protect the MPTCP
data-path, removing the need for workqueue and delayed ack usage.
This additionally reduces the number of atomic operations required
per packet and cleans-up considerably the poll/wake-up code.
====================
Paolo Abeni [Fri, 27 Nov 2020 10:10:27 +0000 (11:10 +0100)]
mptcp: use mptcp release_cb for delayed tasks
We have some tasks triggered by the subflow receive path
which require to access the msk socket status, specifically:
mptcp_clean_una() and mptcp_push_pending()
We have almost everything in place to defer to the msk
release_cb such tasks when the msk sock is owned.
Since the worker is no more used to clean the acked data,
for fallback sockets we need to explicitly flush them.
As an added bonus we can move the wake-up code in __mptcp_clean_una(),
simplify a lot mptcp_poll() and move the timer update under
the data lock.
The worker is now used only to process and send DATA_FIN
packets and do the mptcp-level retransmissions.
Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Fri, 27 Nov 2020 10:10:26 +0000 (11:10 +0100)]
mptcp: avoid a few atomic ops in the rx path
Extending the data_lock scope in mptcp_incoming_option
we can use that to protect both snd_una and wnd_end.
In the typical case, we will have a single atomic op instead of 2
Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Fri, 27 Nov 2020 10:10:24 +0000 (11:10 +0100)]
mptcp: protect the rx path with the msk socket spinlock
Such spinlock is currently used only to protect the 'owned'
flag inside the socket lock itself. With this patch, we extend
its scope to protect the whole msk receive path and
sk_forward_memory.
Given the above, we can always move data into the msk receive
queue (and OoO queue) from the subflow.
We leverage the previous commit, so that we need to acquire the
spinlock in the tx path only when moving fwd memory.
recvmsg() must now explicitly acquire the socket spinlock
when moving skbs out of sk_receive_queue. To reduce the number of
lock operations required we use a second rx queue and splice the
first into the latter in mptcp_lock_sock(). Additionally rmem
allocated memory is bulk-freed via release_cb()
Acked-by: Florian Westphal <fw@strlen.de> Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Fri, 27 Nov 2020 10:10:23 +0000 (11:10 +0100)]
mptcp: implement wmem reservation
This leverages the previous commit to reserve the wmem
required for the sendmsg() operation when the msk socket
lock is first acquired.
Some heuristics are used to get a reasonable [over] estimation of
the whole memory required. If we can't forward alloc such amount
fallback to a reasonable small chunk, otherwise enter the wait
for memory path.
When sendmsg() needs more memory it looks at wmem_reserved
first and if that is exhausted, move more space from
sk_forward_alloc.
The reserved memory is not persistent and is released at the
next socket unlock via the release_cb().
Overall this will simplify the next patch.
Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 1 Dec 2020 01:32:03 +0000 (17:32 -0800)]
Merge branch 'dpaa_eth-add-xdp-support'
Camelia Groza says:
====================
dpaa_eth: add XDP support
Enable XDP support for the QorIQ DPAA1 platforms.
Implement all the current actions (DROP, ABORTED, PASS, TX, REDIRECT). No
Tx batching is added at this time.
Additional XDP_PACKET_HEADROOM bytes are reserved in each frame's headroom.
After transmit, a reference to the xdp_frame is saved in the buffer for
clean-up on confirmation in a newly created structure for software
annotations. DPAA_TX_PRIV_DATA_SIZE bytes are reserved in the buffer for
storing this structure and the XDP program is restricted from accessing
them.
The driver shares the egress frame queues used for XDP with the network
stack. The DPAA driver is a LLTX driver so no explicit locking is required
on transmission.
====================
Camelia Groza [Wed, 25 Nov 2020 16:53:36 +0000 (18:53 +0200)]
dpaa_eth: implement the A050385 erratum workaround for XDP
For XDP TX, even tough we start out with correctly aligned buffers, the
XDP program might change the data's alignment. For REDIRECT, we have no
control over the alignment either.
Create a new workaround for xdp_frame structures to verify the erratum
conditions and move the data to a fresh buffer if necessary. Create a new
xdp_frame for managing the new buffer and free the old one using the XDP
API.
Due to alignment constraints, all frames have a 256 byte headroom that
is offered fully to XDP under the erratum. If the XDP program uses all
of it, the data needs to be move to make room for the xdpf backpointer.
Disable the metadata support since the information can be lost.
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Camelia Groza [Wed, 25 Nov 2020 16:53:34 +0000 (18:53 +0200)]
dpaa_eth: add XDP_REDIRECT support
After transmission, the frame is returned on confirmation queues for
cleanup. For this, store a backpointer to the xdp_frame in the private
reserved area at the start of the TX buffer.
No TX batching support is implemented at this time.
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Camelia Groza [Wed, 25 Nov 2020 16:53:33 +0000 (18:53 +0200)]
dpaa_eth: add XDP_TX support
Use an xdp_frame structure for managing the frame. Store a backpointer to
the structure at the start of the buffer before enqueueing for cleanup
on TX confirmation. Reserve DPAA_TX_PRIV_DATA_SIZE bytes from the frame
size shared with the XDP program for this purpose. Use the XDP
API for freeing the buffer when it returns to the driver on the TX
confirmation path.
The frame queues are shared with the netstack. The DPAA driver is a LLTX
driver so no explicit locking is required on transmission.
This approach will be reused for XDP REDIRECT.
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Camelia Groza [Wed, 25 Nov 2020 16:53:32 +0000 (18:53 +0200)]
dpaa_eth: limit the possible MTU range when XDP is enabled
Implement the ndo_change_mtu callback to prevent users from setting an
MTU that would permit processing of S/G frames. The maximum MTU size
is dependent on the buffer size.
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Camelia Groza [Wed, 25 Nov 2020 16:53:31 +0000 (18:53 +0200)]
dpaa_eth: add basic XDP support
Implement the XDP_DROP and XDP_PASS actions.
Avoid draining and reconfiguring the buffer pool at each XDP
setup/teardown by increasing the frame headroom and reserving
XDP_PACKET_HEADROOM bytes from the start. Since we always reserve an
entire page per buffer, this change only impacts Jumbo frame scenarios
where the maximum linear frame size is reduced by 256 bytes. Multi
buffer Scatter/Gather frames are now used instead in these scenarios.
Allow XDP programs to access the entire buffer.
The data in the received frame's headroom can be overwritten by the XDP
program. Extract the relevant fields from the headroom while they are
still available, before running the XDP program.
Since the headroom might be resized before the frame is passed up to the
stack, remove the check for a fixed headroom value when building an skb.
Allow the meta data to be updated and pass the information up the stack.
Scatter/Gather frames are dropped when XDP is enabled.
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: Camelia Groza <camelia.groza@nxp.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
can: tcan4x5x: remove mram_start and reg_offset from struct tcan4x5x_priv
Both struct tcan4x5x_priv::mram_start and struct tcan4x5x_priv::reg_offset are
only assigned once with a constant and then always used read-only. This patch
changes the driver to use the constant directly instead.
can: mcp251xfd: struct mcp251xfd_priv::tef to array of length 1
This patch converts the struct mcp251xfd_tef_ring member within the struct
mcp251xfd_priv into an array of length one. This way all rings (tef, tx and rx)
can be accessed in the same way.
can: pcan_usb_core: fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning by
moving the "default" to the end of the "switch" statement and explicitly adding
a break statement instead of letting the code fall through to the next case.
====================
net: ipa: start adding IPA v4.5 support
This series starts updating the IPA code to support IPA hardware
version 4.5.
The first patch fixes a problem found while preparing these updates.
Testing shows the code works with or without the change, and with
the fix the code matches "downstream" Qualcomm code.
The second patch updates the definitions for IPA register offsets
and field masks to reflect the changes that come with IPA v4.5. A
few register updates have been deferred until later, because making
use of them involves some nontrivial code updates.
One type of change that IPA v4.5 brings is expanding the range of
certain configuration values. High-order bits are added in a few
cases, and the third patch implements the code changes necessary to
use those newly available bits.
The fourth patch implements several fairly minor changes to the code
required for IPA v4.5 support.
The last two patches implement changes to the GSI registers used for
IPA. Almost none of the registers change, but the range of memory
in which most of the GSI registers is located is shifted by a fixed
amount. The fifth patch updates the GSI register definitions, and
the last patch implements the memory shift for IPA v4.5.
====================
Alex Elder [Wed, 25 Nov 2020 20:45:22 +0000 (14:45 -0600)]
net: ipa: adjust GSI register addresses
The offsets for almost all GSI registers we use have different
offsets starting at IPA version 4.5. Only two registers remain
in their original location.
In a way though, the new register locations are not *that*
different. The entire group of affected registers has simply
been shifted down in memory by a fixed amount (0xd000). So for
example, the channel context 0 register that has a base offset of
0x0001c000 for "older" hardware now has a base offset of 0x0000f000.
This patch aims to add support for IPA v4.5 registers at their new
offets in a way that minimizes the amount of code that needs to
change. It is not ideal, but it avoids the need to maintain
a nearly complete set of additional register offset definitions.
The approach takes advantage of the fact that when accessing GSI
registers we do not access any of memory at lower end of the "gsi"
memory range (with two exceptions already noted). In particular,
we do not access anything within the bottom 0xd000 bytes of the
GSI memory range.
For IPA version 4.5, after we map the GSI memory, we adjust the
virtual memory pointer downward by the fixed amount (0xd000).
That way, register accesses using the offsets defined by the
existing GSI_REG_*() macros will resolve to the proper locations
for IPA version 4.5.
The two registers *not* affected by this offset are accessed only
in gsi_irq_setup(). There, for IPA version 4.5, we undo the general
register adjustment by adding the fixed amount back to the virtual
address to access these registers.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 25 Nov 2020 20:45:21 +0000 (14:45 -0600)]
net: ipa: update gsi registers for IPA v4.5
Very few GSI register definitions change for IPA v4.5, however
as a group their position in memory shifts a constant amount
(handled by the next commit).
Add definitions and update comments to the set of GSI registers to
support changes that come with IPA v4.5.
Update the logic in gsi_channel_program() to accommodate the new
(expanded) PREFETCH_MODE field in the CH_C_QOS register.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 25 Nov 2020 20:45:20 +0000 (14:45 -0600)]
net: ipa: add support to code for IPA v4.5
Update the IPA code to make use of the updated IPA v4.5 register
definitions. Generally what this patch does is, if IPA v4.5
hardware is in use:
- Ensure new registers or fields in IPA v4.5 are updated where
required
- Ensure registers or fields not supported in IPA v4.5 are not
examined when read, or are set to 0 when written
It does this while preserving the existing functionality for IPA
versions lower than v4.5.
The values to program for QSB_MAX_READS and QSB_MAX_WRITES and the
source and destination resource counts are updated to be correct for
all versions through v4.5 as well.
Note that IPA_RESOURCE_GROUP_SRC_MAX and IPA_RESOURCE_GROUP_DST_MAX
already reflect that 5 is an acceptable number of resources (which
IPA v4.5 implements).
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 25 Nov 2020 20:45:19 +0000 (14:45 -0600)]
net: ipa: add new most-significant bits to registers
IPA v4.5 adds a few fields to the endpoint header and extended
header configuration registers that represent new high-order bits
for certain offsets and sizes. Add code to incorporate these upper
bits into the registers for IPA v4.5.
This includes creating ipa_header_size_encoded(), which handles
encoding the metadata offset field for use in the ENDP_INIT_HDR
register in a way appropriate for the hardware version. This and
ipa_metadata_offset_encoded() ensure the mask argument passed to
u32_encode_bits() is constant.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 25 Nov 2020 20:45:18 +0000 (14:45 -0600)]
net: ipa: update IPA registers for IPA v4.5
Update "ipa_reg.h" so that register definitions support IPA hardware
version 4.5, in addition to versions 3.5.1 through v4.2. Most of
the register definitions are the same, but in some cases fields are
added, changed, or eliminated.
Updates for a few IPA v4.5 registers are more complex, and adding
those definition will be deferred to separate patches. This patch
only updates the register offset and field definitions, and adds
informational comments.
The only code change avoids accessing the backward compatibility
register for IPA version 4.5 in ipa_hardware_config(). Other IPA
v4.5-specific code changes will come later.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alex Elder [Wed, 25 Nov 2020 20:45:17 +0000 (14:45 -0600)]
net: ipa: reverse logic on escape buffer use
Starting with IPA v4.2 there is a GSI channel option to use an
"escape buffer" instead of prefetch buffers. This should be used
for all channels *except* the AP command TX channel. The logic
that implements this has it backwards; fix this bug.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/sched: act_ct: enable stats for HW offloaded entries
By setting NF_FLOWTABLE_COUNTER. Otherwise, the updates added by
commit ef803b3cf96a ("netfilter: flowtable: add counter support in HW
offload") are not effective when using act_ct.
While at it, now that we have the flag set, protect the call to
nf_ct_acct_update() by commit beb97d3a3192 ("net/sched: act_ct: update
nf_conn_acct for act_ct SW offload in flowtable") with the check on
NF_FLOWTABLE_COUNTER, as also done on other places.
Note that this shouldn't impact performance as these stats are only
enabled when net.netfilter.nf_conntrack_acct is enabled.
Jon Maloy [Wed, 25 Nov 2020 18:29:14 +0000 (13:29 -0500)]
tipc: make node number calculation reproducible
The 32-bit node number, aka node hash or node address, is calculated
based on the 128-bit node identity when it is not set explicitly by
the user. In future commits we will need to perform this hash operation
on peer nodes while feeling safe that we obtain the same result.
We do this by interpreting the initial hash as a network byte order
number. Whenever we need to use the number locally on a node
we must therefore translate it to host byte order to obtain an
architecure independent result.
Furthermore, given the context where we use this number, we must not
allow it to be zero unless the node identity also is zero. Hence, in
the rare cases when the xor-ed hash value may end up as zero we replace
it with a fix number, knowing that the code anyway is capable of
handling hash collisions.
Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jon Maloy [Wed, 25 Nov 2020 18:29:13 +0000 (13:29 -0500)]
tipc: refactor tipc_sk_bind() function
We refactor the tipc_sk_bind() function, so that the lock handling
is handled separately from the logics. We also move some sanity
tests to earlier in the call chain, to the function tipc_bind().
Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Martin Schiller [Thu, 26 Nov 2020 06:35:53 +0000 (07:35 +0100)]
net/x25: handle additional netdev events
1. Add / remove x25_link_device by NETDEV_REGISTER/UNREGISTER and also
by NETDEV_POST_TYPE_CHANGE/NETDEV_PRE_TYPE_CHANGE.
This change is needed so that the x25_neigh struct for an interface
is already created when it shows up and is kept independently if the
interface goes UP or DOWN.
This is used in an upcomming commit, where x25 params of an neighbour
will get configurable through ioctls.
2. NETDEV_CHANGE event makes it possible to handle carrier loss and
detection. If carrier is lost, clean up everything related to this
neighbour by calling x25_link_terminated().
3. Also call x25_link_terminated() for NETDEV_DOWN events and remove the
call to x25_clear_forward_by_dev() in x25_route_device_down(), as
this is already called by x25_kill_by_neigh() which gets called by
x25_link_terminated().
4. Do nothing for NETDEV_UP and NETDEV_GOING_DOWN events, as these will
be handled in layer 2 (LAPB) and layer3 (X.25) will be informed by
layer2 when layer2 link is established and layer3 link should be
initiated.
Signed-off-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
mlxsw: Update adjacency index more efficiently
The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.
Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.
Instead, this patch set tracks the VRs in which the nexthop group is
used and issues one request for each VR.
Example:
8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').
Before:
Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':
Ido Schimmel [Wed, 25 Nov 2020 19:35:05 +0000 (21:35 +0200)]
mlxsw: spectrum_router: Update adjacency index more efficiently
The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.
Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.
Instead, use the VR tracking added in the previous patch to update the
adjacency index once for each VR currently using the nexthop group.
Example:
8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').
Before:
# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3
Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':
Ido Schimmel [Wed, 25 Nov 2020 19:35:04 +0000 (21:35 +0200)]
mlxsw: spectrum_router: Track nexthop group virtual router membership
For each nexthop group, track in which virtual routers (VRs) the group is
used. This is going to be used by the next patch to perform a more
efficient adjacency index update whenever the group's adjacency index
changes.
In the rare case where the adjacency pointer cannot be updated for a
given virtual router, rollback the operation so that virtual routers
that are already using the new index will use the old one again.
Ido Schimmel [Wed, 25 Nov 2020 19:35:02 +0000 (21:35 +0200)]
mlxsw: spectrum_router: Pass virtual router parameters directly instead of pointer
mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's
identifier and protocol, so pass them directly. In a subsequent patch
the caller will not have access to the pointer.
Linus Torvalds [Fri, 27 Nov 2020 23:00:35 +0000 (15:00 -0800)]
Merge tag 'asm-generic-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fix from Arnd Bergmann:
"Add correct MAX_POSSIBLE_PHYSMEM_BITS setting to asm-generic.
This is a single bugfix for a bug that Stefan Agner found on 32-bit
Arm, but that exists on several other architectures"
* tag 'asm-generic-fixes-5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
arch: pgtable: define MAX_POSSIBLE_PHYSMEM_BITS where needed
Linus Torvalds [Fri, 27 Nov 2020 22:48:03 +0000 (14:48 -0800)]
Merge tag 'arm-soc-fixes-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Another set of patches for devicetree files and Arm SoC specific
drivers:
- A fix for OP-TEE shared memory on non-SMP systems
- multiple code fixes for the OMAP platform, including one regression
for the CPSW network driver and a few runtime warning fixes
- Some DT patches for the Rockchip RK3399 platform, in particular
fixing the MMC device ordering that recently became
nondeterministic with async probe.
- Multiple DT fixes for the Tegra platform, including a regression
fix for suspend/resume on TX2
- A regression fix for a user-triggered fault in the NXP dpio driver
- A regression fix for a bug caused by an earlier bug fix in the
xilinx firmware driver
- Two more DTC warning fixes
- Sylvain Lemieux steps down as maintainer for the NXP LPC32xx
platform"
* tag 'arm-soc-fixes-v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
arm64: tegra: Fix Tegra234 VDK node names
arm64: tegra: Wrong AON HSP reg property size
arm64: tegra: Fix USB_VBUS_EN0 regulator on Jetson TX1
arm64: tegra: Correct the UART for Jetson Xavier NX
arm64: tegra: Disable the ACONNECT for Jetson TX2
optee: add writeback to valid memory type
firmware: xilinx: Use hash-table for api feature check
firmware: xilinx: Fix SD DLL node reset issue
soc: fsl: dpio: Get the cpumask through cpumask_of(cpu)
ARM: dts: dra76x: m_can: fix order of clocks
bus: ti-sysc: suppress err msg for timers used as clockevent/source
MAINTAINERS: Remove myself as LPC32xx maintainers
arm64: dts: qcom: clear the warnings caused by empty dma-ranges
arm64: dts: broadcom: clear the warnings caused by empty dma-ranges
ARM: dts: am437x-l4: fix compatible for cpsw switch dt node
arm64: dts: rockchip: Reorder LED triggers from mmc devices on rk3399-roc-pc.
arm64: dts: rockchip: Assign a fixed index to mmc devices on rk3399 boards.
arm64: dts: rockchip: Remove system-power-controller from pmic on Odroid Go Advance
arm64: dts: rockchip: fix NanoPi R2S GMAC clock name
ARM: OMAP2+: Manage MPU state properly for omap_enter_idle_coupled()
...
Linus Torvalds [Fri, 27 Nov 2020 22:38:02 +0000 (14:38 -0800)]
Merge tag 'net-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes for 5.10-rc6, including fixes from the WiFi driver,
and CAN subtrees.
Current release - regressions:
- gro_cells: reduce number of synchronize_net() calls
- ch_ktls: release a lock before jumping to an error path
Current release - always broken:
- tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header
Previous release - regressions:
- net/tls: fix missing received data after fast remote close
- vsock/virtio: discard packets only when socket is really closed
- sock: set sk_err to ee_errno on dequeue from errq
- cxgb4: fix the panic caused by non smac rewrite
Previous release - always broken:
- tcp: fix corner cases around setting ECN with BPF selection of
congestion control
- tcp: fix race condition when creating child sockets from syncookies
on loopback interface
- usbnet: ipheth: fix connectivity with iOS 14
- tun: honor IOCB_NOWAIT flag
- net/packet: fix packet receive on L3 devices without visible hard
header
- devlink: Make sure devlink instance and port are in same net
namespace
- net: openvswitch: fix TTL decrement action netlink message format
- bonding: wait for sysfs kobject destruction before freeing struct
slave
- net: stmmac: fix upstream patch applied to the wrong context
- bnxt_en: fix return value and unwind in probe error paths
Misc:
- devlink: add extra layer of categorization to the reload stats uAPI
before it's released"
* tag 'net-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
sock: set sk_err to ee_errno on dequeue from errq
mptcp: fix NULL ptr dereference on bad MPJ
net: openvswitch: fix TTL decrement action netlink message format
can: af_can: can_rx_unregister(): remove WARN() statement from list operation sanity check
can: m_can: m_can_dev_setup(): add support for bosch mcan version 3.3.0
can: m_can: fix nominal bitiming tseg2 min for version >= 3.1
can: m_can: m_can_open(): remove IRQF_TRIGGER_FALLING from request_threaded_irq()'s flags
can: mcp251xfd: mcp251xfd_probe(): bail out if no IRQ was given
can: gs_usb: fix endianess problem with candleLight firmware
ch_ktls: lock is not freed
net/tls: Protect from calling tls_dev_del for TLS RX twice
devlink: Make sure devlink instance and port are in same net namespace
devlink: Hold rtnl lock while reading netdev attributes
ptp: clockmatrix: bug fix for idtcm_strverscmp
enetc: Let the hardware auto-advance the taprio base-time of 0
gro_cells: reduce number of synchronize_net() calls
net: stmmac: fix incorrect merge of patch upstream
ipv6: addrlabel: fix possible memory leak in ip6addrlbl_net_init
Documentation: netdev-FAQ: suggest how to post co-dependent series
ibmvnic: enhance resetting status check during module exit
...
====================
net/sched: fix over mtu packet of defrag in
Currently kernel tc subsystem can do conntrack in act_ct. But when several
fragment packets go through the act_ct, function tcf_ct_handle_fragments
will defrag the packets to a big one. But the last action will redirect
mirred to a device which maybe lead the reassembly big packet over the mtu
of target device.
The first patch fix miss init the qdisc_skb_cb->mru
The send one refactor the hanle of xmit in act_mirred and prepare for the
third one
The last one add implict packet fragment support to fix the over mtu for
defrag in act_ct.
====================
wenxu [Wed, 25 Nov 2020 04:01:23 +0000 (12:01 +0800)]
net/sched: sch_frag: add generic packet fragment support.
Currently kernel tc subsystem can do conntrack in cat_ct. But when several
fragment packets go through the act_ct, function tcf_ct_handle_fragments
will defrag the packets to a big one. But the last action will redirect
mirred to a device which maybe lead the reassembly big packet over the mtu
of target device.
This patch add support for a xmit hook to mirred, that gets executed before
xmiting the packet. Then, when act_ct gets loaded, it configs that hook.
The frag xmit hook maybe reused by other modules.
Signed-off-by: wenxu <wenxu@ucloud.cn> Acked-by: Cong Wang <cong.wang@bytedance.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
wenxu [Wed, 25 Nov 2020 04:01:21 +0000 (12:01 +0800)]
net/sched: fix miss init the mru in qdisc_skb_cb
The mru in the qdisc_skb_cb should be init as 0. Only defrag packets in the
act_ct will set the value.
Fixes: 038ebb1a713d ("net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
Add CHACHA20-POLY1305 cipher to Kernel TLS
RFC 7905 defines usage of ChaCha20-Poly1305 in TLS connections. This
cipher is widely used nowadays and it's good to have a support for it
in TLS connections in kernel.
====================
Vadim Fedorenko [Tue, 24 Nov 2020 15:24:48 +0000 (18:24 +0300)]
net/tls: add CHACHA20-POLY1305 specific behavior
RFC 7905 defines special behavior for ChaCha-Poly TLS sessions.
The differences are in the calculation of nonce and the absence
of explicit IV. This behavior is like TLSv1.3 partly.
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Fedorenko [Tue, 24 Nov 2020 15:24:46 +0000 (18:24 +0300)]
net/tls: make inline helpers protocol-aware
Inline functions defined in tls.h have a lot of AES-specific
constants. Remove these constants and change argument to struct
tls_prot_info to have an access to cipher type in later patches
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Fri, 27 Nov 2020 22:06:23 +0000 (14:06 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small fixes in the UFS driver: two are for power management
issues and the third is to fix a slew of problem in the sysfs code"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: Fix race between shutdown and runtime resume flow
scsi: ufs: Make sure clk scaling happens only when HBA is runtime ACTIVE
scsi: ufs: Fix unexpected values from ufshcd_read_desc_param()
Linus Torvalds [Fri, 27 Nov 2020 20:42:13 +0000 (12:42 -0800)]
Merge tag 'for-5.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few fixes for various warnings that accumulated over past two weeks:
- tree-checker: add missing return values for some errors
- lockdep fixes
- when reading qgroup config and starting quota rescan
- reverse order of quota ioctl lock and VFS freeze lock
- avoid accessing potentially stale fs info during device scan,
reported by syzbot
- add scope NOFS protection around qgroup relation changes
- check for running transaction before flushing qgroups
- fix tracking of new delalloc ranges for some cases"
* tag 'for-5.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix lockdep splat when enabling and disabling qgroups
btrfs: do nofs allocations when adding and removing qgroup relations
btrfs: fix lockdep splat when reading qgroup config on mount
btrfs: tree-checker: add missing returns after data_ref alignment checks
btrfs: don't access possibly stale fs_info data for printing duplicate device
btrfs: tree-checker: add missing return after error in root_item
btrfs: qgroup: don't commit transaction when we already hold the handle
btrfs: fix missing delalloc new bit for new delalloc ranges
Linus Torvalds [Fri, 27 Nov 2020 20:31:04 +0000 (12:31 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Two security issues and several small bug fixes. Things seem to have
stabilized for this release here.
Summary:
- Significant out of bounds access security issue in i40iw
- Fix misuse of mmu notifiers in hfi1
- Several errors in the register map/usage in hns
- Missing error returns in mthca"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Bugfix for memory window mtpt configuration
RDMA/hns: Fix retry_cnt and rnr_cnt when querying QP
RDMA/hns: Fix wrong field of SRQ number the device supports
IB/hfi1: Ensure correct mm is used at all times
RDMA/i40iw: Address an mmap handler exploit in i40iw
IB/mthca: fix return value of error branch in mthca_init_cq()
Linus Torvalds [Fri, 27 Nov 2020 20:03:07 +0000 (12:03 -0800)]
Merge tag 'mtd/fixes-for-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"Because of a recent change in the core, NAND controller drivers
initializing the ECC engine too early in the probe path are broken.
Drivers should wait for the NAND device to be discovered and its
memory layout known before doing any ECC related initialization, so
instead of reverting the faulty change which is actually moving in the
right direction, let's fix the drivers directly: socrates, sharpsl,
r852, plat_nand, pasemi, tmio, txx9ndfmc, orion, mpc5121, lpc32xx_slc,
lpc32xx_mlc, fsmc, diskonchip, davinci, cs553x, au1550, ams-delta,
xway and gpio"
* tag 'mtd/fixes-for-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: socrates: Move the ECC initialization to ->attach_chip()
mtd: rawnand: sharpsl: Move the ECC initialization to ->attach_chip()
mtd: rawnand: r852: Move the ECC initialization to ->attach_chip()
mtd: rawnand: plat_nand: Move the ECC initialization to ->attach_chip()
mtd: rawnand: pasemi: Move the ECC initialization to ->attach_chip()
mtd: rawnand: tmio: Move the ECC initialization to ->attach_chip()
mtd: rawnand: txx9ndfmc: Move the ECC initialization to ->attach_chip()
mtd: rawnand: orion: Move the ECC initialization to ->attach_chip()
mtd: rawnand: mpc5121: Move the ECC initialization to ->attach_chip()
mtd: rawnand: lpc32xx_slc: Move the ECC initialization to ->attach_chip()
mtd: rawnand: lpc32xx_mlc: Move the ECC initialization to ->attach_chip()
mtd: rawnand: fsmc: Move the ECC initialization to ->attach_chip()
mtd: rawnand: diskonchip: Move the ECC initialization to ->attach_chip()
mtd: rawnand: davinci: Move the ECC initialization to ->attach_chip()
mtd: rawnand: cs553x: Move the ECC initialization to ->attach_chip()
mtd: rawnand: au1550: Move the ECC initialization to ->attach_chip()
mtd: rawnand: ams-delta: Move the ECC initialization to ->attach_chip()
mtd: rawnand: xway: Move the ECC initialization to ->attach_chip()
mtd: rawnand: gpio: Move the ECC initialization to ->attach_chip()
Linus Torvalds [Fri, 27 Nov 2020 19:29:53 +0000 (11:29 -0800)]
Merge tag 'spi-fix-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A few fixes for v5.10, one for the core which fixes some potential
races for controllers with multiple chip selects when configuration of
the chip select for one client device races with the addition and
initial setup of an additional client"
* tag 'spi-fix-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: dw: Fix spi registration for controllers overriding CS
spi: imx: fix the unbalanced spi runtime pm management
spi: spi-nxp-fspi: fix fspi panic by unexpected interrupts
spi: Take the SPI IO-mutex in the spi_setup() method
Linus Torvalds [Fri, 27 Nov 2020 19:25:23 +0000 (11:25 -0800)]
Merge tag 'media/v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull virtual digital TV driver fixes from Mauro Carvalho Chehab:
"A series of fixes for the new virtual digital TV driver (vidtv), which
is meant to help doing tests with the digital TV core and media
userspace apps and libraries.
They cover a series of issues I found on it, together with a few new
things in order to make it easier to detect problems at the DVB core"
* tag 'media/v5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (36 commits)
media: vidtv.rst: add kernel-doc markups
media: vidtv.rst: update vidtv documentation
media: vidtv: simplify EIT write function
media: vidtv: simplify NIT write function
media: vidtv: simplify SDT write function
media: vidtv: cleanup PMT write table function
media: vidtv: cleanup PAT write function
media: vidtv: cleanup PSI table header function
media: vidtv: cleanup PSI descriptor write function
media: vidtv: simplify the crc writing logic
media: vidtv: simplify PSI write function
media: vidtv: add date to the current event
media: vidtv: fix service_id at SDT table
media: vidtv: fix service type
media: vidtv: add a PID entry for the NIT table
media: vidtv: properly fill EIT service_id
media: vidtv: fix the network ID range
media: vidtv: improve EIT data
media: vidtv: cleanup null packet initialization logic
media: vidtv: pre-initialize mux arrays
...
Linus Torvalds [Fri, 27 Nov 2020 19:19:49 +0000 (11:19 -0800)]
Merge tag 'drm-fixes-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Unfortunately this has a bit of thanksgiving stuffing in it, as it a
bit larger (at least the vc4 patches) than I like at this point in
time.
The main thing is it has a bunch of regressions fixes for reports in
the last couple of weeks, ast, nouveau and the amdgpu ttm init fix,
along with the usual selection of amdgpu and i915 fixes.
The vc4 fixes are a few but they are fixes and the nastiest one is a
fix for when you have a 2.4Ghz Wifi and a HDMI signal with a clock in
that range and there isn't enough shielding and interference happen
between the two, the fix adjusts the mode clock to try and avoid the
wifi channels in that case.
Hopefully you can merge this between turkey slices, and next week
should be quieter.
ast:
- LUT loading regression fix
nouveau:
- relocations regression fix
amdgpu:
- ttm init oops fix
- Runtime pm fix
- SI UVD suspend/resume fix
- HDCP fix for headless cards
- Sienna Cichlid golden register update
i915:
- Fix Perf/OA workaround register corruption (Lionel)
- Correct a comment statement in GVT (Yan)
- Fix GT enable/disable iterrupts, including a race condition that
prevented GPU to go idle (Chris)
- Free stale request on destroying the virtual engine (Chris)
exynos:
- config dependency fix
mediatek:
- unused var removal
- horizonal front/back porch formula fix
vc4:
- wifi and hdmi interference fix
- mode rejection fixes
- use after free fix
- cleanup some code"
* tag 'drm-fixes-2020-11-27-1' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/nouveau: fix relocations applying logic and a double-free
drm/ast: Reload gamma LUT after changing primary plane's color format
drm/amdgpu: Fix size calculation when init onchip memory
drm/amdgpu: update golden setting for sienna_cichlid
drm/amd/display: Avoid HDCP initialization in devices without output
drm/i915/gt: Free stale request on destroying the virtual engine
drm/i915/gt: Don't cancel the interrupt shadow too early
drm/i915/gt: Track signaled breadcrumbs outside of the breadcrumb spinlock
drm/amdgpu: fix a page fault
drm/amdgpu: fix SI UVD firmware validate resume fail
drm/amd/amdgpu: fix null pointer in runtime pm
drm/i915/gt: Defer enabling the breadcrumb interrupt to after submission
drm/i915/gvt: correct a false comment of flag F_UNALIGN
drm/i915/perf: workaround register corruption in OATAILPTR
drm/vc4: kms: Don't disable the muxing of an active CRTC
drm/vc4: kms: Store the unassigned channel list in the state
drm/exynos: depend on COMMON_CLK to fix compile tests
drm/mediatek: dsi: Modify horizontal front/back porch byte formula
drm/vc4: hdmi: Disable Wifi Frequencies
dt-bindings: display: Add a property to deal with WiFi coexistence
...
Jakub Kicinski [Fri, 27 Nov 2020 19:13:39 +0000 (11:13 -0800)]
Merge tag 'linux-can-fixes-for-5.10-20201127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2020-11-27
The first patch is by me and target the gs_usb driver and fixes the endianess
problem with candleLight firmware.
Another patch by me for the mcp251xfd driver add sanity checking to bail out if
no IRQ is configured.
The next three patches target the m_can driver. A patch by me removes the
hardcoded IRQF_TRIGGER_FALLING from the request_threaded_irq() as this clashes
with the trigger level specified in the DT. Further a patch by me fixes the
nominal bitiming tseg2 min value for modern m_can cores. Pankaj Sharma's patch
add support for cores version 3.3.x.
The last patch by Oliver Hartkopp is for af_can and converts a WARN() into a
pr_warn(), which is triggered by the syzkaller. It was able to create a
situation where the closing of a socket runs simultaneously to the notifier
call chain for removing the CAN network device in use.
* tag 'linux-can-fixes-for-5.10-20201127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: af_can: can_rx_unregister(): remove WARN() statement from list operation sanity check
can: m_can: m_can_dev_setup(): add support for bosch mcan version 3.3.0
can: m_can: fix nominal bitiming tseg2 min for version >= 3.1
can: m_can: m_can_open(): remove IRQF_TRIGGER_FALLING from request_threaded_irq()'s flags
can: mcp251xfd: mcp251xfd_probe(): bail out if no IRQ was given
can: gs_usb: fix endianess problem with candleLight firmware
====================
Linus Torvalds [Fri, 27 Nov 2020 19:09:13 +0000 (11:09 -0800)]
Merge tag 'platform-drivers-x86-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
- thinkpad_acpi fixes: two bug-fixes and three model specific quirks
- fixes for misc other drivers: two bug-fixes and three model specific
quirks
* tag 'platform-drivers-x86-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: touchscreen_dmi: Add info for the Irbis TW118 tablet
platform/x86: touchscreen_dmi: Add info for the Predia Basic tablet
platform/x86: intel-vbtn: Support for tablet mode on HP Pavilion 13 x360 PC
platform/x86: toshiba_acpi: Fix the wrong variable assignment
platform/x86: acer-wmi: add automatic keyboard background light toggle key as KEY_LIGHTS_TOGGLE
platform/x86: thinkpad_acpi: Whitelist P15 firmware for dual fan control
platform/x86: thinkpad_acpi: Send tablet mode switch at wakeup time
platform/x86: thinkpad_acpi: Add BAT1 is primary battery quirk for Thinkpad Yoga 11e 4th gen
platform/x86: thinkpad_acpi: Do not report SW_TABLET_MODE on Yoga 11e
platform/x86: thinkpad_acpi: add P1 gen3 second fan support
Willem de Bruijn [Thu, 26 Nov 2020 15:12:20 +0000 (10:12 -0500)]
sock: set sk_err to ee_errno on dequeue from errq
When setting sk_err, set it to ee_errno, not ee_origin.
Commit f5f99309fa74 ("sock: do not set sk_err in
sock_dequeue_err_skb") disabled updating sk_err on errq dequeue,
which is correct for most error types (origins):
- sk->sk_err = err;
Commit 38b257938ac6 ("sock: reset sk_err when the error queue is
empty") reenabled the behavior for IMCP origins, which do require it:
+ if (icmp_next)
+ sk->sk_err = SKB_EXT_ERR(skb_next)->ee.ee_origin;
But read from ee_errno.
Fixes: 38b257938ac6 ("sock: reset sk_err when the error queue is empty") Reported-by: Ayush Ranjan <ayushranjan@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Link: https://lore.kernel.org/r/20201126151220.2819322-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Thu, 26 Nov 2020 14:17:53 +0000 (15:17 +0100)]
mptcp: fix NULL ptr dereference on bad MPJ
If an msk listener receives an MPJ carrying an invalid token, it
will zero the request socket msk entry. That should later
cause fallback and subflow reset - as per RFC - at
subflow_syn_recv_sock() time due to failing hmac validation.
Since commit 4cf8b7e48a09 ("subflow: introduce and use
mptcp_can_accept_new_subflow()"), we unconditionally dereference
- in mptcp_can_accept_new_subflow - the subflow request msk
before performing hmac validation. In the above scenario we
hit a NULL ptr dereference.
Address the issue doing the hmac validation earlier.
Linus Torvalds [Fri, 27 Nov 2020 19:04:13 +0000 (11:04 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Fix alignment of the new HYP sections
- Fix GICR_TYPER access from userspace
S390:
- do not reset the global diag318 data for per-cpu reset
- do not mark memory as protected too early
- fix for destroy page ultravisor call
x86:
- fix for SEV debugging
- fix incorrect return code
- fix for 'noapic' with PIC in userspace and LAPIC in kernel
- fix for 5-level paging"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86/mmu: Fix get_mmio_spte() on CPUs supporting 5-level PT
KVM: x86: Fix split-irqchip vs interrupt injection window request
KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint
MAINTAINERS: Update email address for Sean Christopherson
MAINTAINERS: add uv.c also to KVM/s390
s390/uv: handle destroy page legacy interface
KVM: arm64: vgic-v3: Drop the reporting of GICR_TYPER.Last for userspace
KVM: SVM: fix error return code in svm_create_vcpu()
KVM: SVM: Fix offset computation bug in __sev_dbg_decrypt().
KVM: arm64: Correctly align nVHE percpu data
KVM: s390: remove diag318 reset code
KVM: s390: pv: Mark mm as protected after the set secure parameters and improve cleanup
Eelco Chaudron [Tue, 24 Nov 2020 12:34:44 +0000 (07:34 -0500)]
net: openvswitch: fix TTL decrement action netlink message format
Currently, the openvswitch module is not accepting the correctly formated
netlink message for the TTL decrement action. For both setting and getting
the dec_ttl action, the actions should be nested in the
OVS_DEC_TTL_ATTR_ACTION attribute as mentioned in the openvswitch.h uapi.
When the original patch was sent, it was tested with a private OVS userspace
implementation. This implementation was unfortunately not upstreamed and
reviewed, hence an erroneous version of this patch was sent out.
Leaving the patch as-is would cause problems as the kernel module could
interpret additional attributes as actions and vice-versa, due to the
actions not being encapsulated/nested within the actual attribute, but
being concatinated after it.
Linus Torvalds [Fri, 27 Nov 2020 18:59:02 +0000 (10:59 -0800)]
Merge tag 'powerpc-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.10:
- regression fix for a boot failure on some 32-bit machines.
- fix for host crashes in the KVM system reset handling.
- fix for a possible oops in the KVM XIVE interrupt handling on
Power9.
- fix for host crashes triggerable via the KVM emulated MMIO handling
when running HPT guests.
- a couple of small build fixes.
Thanks to Andreas Schwab, Cédric Le Goater, Christophe Leroy, Erhard
Furtner, Greg Kurz, Greg Kurz, Németh Márton, Nicholas Piggin, Nick
Desaulniers, Serge Belyshev, and Stephen Rothwell"
* tag 'powerpc-5.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix allnoconfig build since uaccess flush
powerpc/64s/exception: KVM Fix for host DSI being taken in HPT guest MMU context
powerpc: Drop -me200 addition to build flags
KVM: PPC: Book3S HV: XIVE: Fix possible oops when accessing ESB page
powerpc/64s: Fix KVM system reset handling when CONFIG_PPC_PSERIES=y
powerpc/32s: Use relocation offset when setting early hash table
Linus Torvalds [Fri, 27 Nov 2020 18:44:59 +0000 (10:44 -0800)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The main changes are relating to our handling of access/dirty bits,
where our low-level page-table helpers could lead to stale young
mappings and loss of the dirty bit in some cases (the latter has not
been observed in practice, but could happen when clearing "soft-dirty"
if we enabled that). These were posted as part of a larger series, but
the rest of that is less urgent and needs a v2 which I'll get to
shortly.
In other news, we've now got a set of fixes to resolve the
lockdep/tracing problems that have been plaguing us for a while, but
they're still a bit "fresh" and I plan to send them to you next week
after we've got some more confidence in them (although initial CI
results look good).
Summary:
- Fix kerneldoc warnings generated by ACPI IORT code
- Fix pte_accessible() so that access flag is ignored
- Fix missing header #include
- Fix loss of software dirty bit across pte_wrprotect() when HW DBM
is enabled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()
arm64: pgtable: Fix pte_accessible()
ACPI/IORT: Fix doc warnings in iort.c
arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build
Linus Torvalds [Fri, 27 Nov 2020 18:41:19 +0000 (10:41 -0800)]
Merge tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull iommu fixes from Will Deacon:
"Here's another round of IOMMU fixes for -rc6 consisting mainly of a
bunch of independent driver fixes. Thomas agreed for me to take the
x86 'tboot' fix here, as it fixes a regression introduced by a vt-d
change.
- Fix intel iommu driver when running on devices without VCCAP_REG
- Fix swiotlb and "iommu=pt" interaction under TXT (tboot)
- Fix missing return value check during device probe()
- Fix probe ordering for Qualcomm SMMU implementation
- Ensure page-sized mappings are used for AMD IOMMU buffers with SNP
RMP"
* tag 'iommu-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
iommu/vt-d: Don't read VCCAP register unless it exists
x86/tboot: Don't disable swiotlb when iommu is forced on
iommu: Check return of __iommu_attach_device()
arm-smmu-qcom: Ensure the qcom_scm driver has finished probing
iommu/amd: Enforce 4k mapping for certain IOMMU data structures
Linus Torvalds [Fri, 27 Nov 2020 18:38:36 +0000 (10:38 -0800)]
Merge tag 'printk-for-5.10-rc6-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fixes from Petr Mladek:
- do not lose trailing newline in pr_cont() calls
- two trivial fixes for a dead store and a config description
* tag 'printk-for-5.10-rc6-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: finalize records with trailing newlines
printk: remove unneeded dead-store assignment
init/Kconfig: Fix CPU number in LOG_CPU_MAX_BUF_SHIFT description
The guest triggering this crashes. Note, this happens with the traditional
MMU and EPT enabled, not with the newly introduced TDP MMU. Turns out,
there was a subtle change in the above mentioned commit. Previously,
walk_shadow_page_get_mmio_spte() was setting 'root' to 'iterator.level'
which is returned by shadow_walk_init() and this equals to
'vcpu->arch.mmu->shadow_root_level'. Now, get_mmio_spte() sets it to
'int root = vcpu->arch.mmu->root_level'.
The difference between 'root_level' and 'shadow_root_level' on CPUs
supporting 5-level page tables is that in some case we don't want to
use 5-level, in particular when 'cpuid_maxphyaddr(vcpu) <= 48'
kvm_mmu_get_tdp_level() returns '4'. In case upper layer is not used,
the corresponding SPTE will fail '__is_rsvd_bits_set()' check.
Revert to using 'shadow_root_level'.
Fixes: 95fb5b0258b7 ("kvm: x86/mmu: Support MMIO in the TDP MMU") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20201126110206.2118959-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 27 Nov 2020 08:18:20 +0000 (09:18 +0100)]
KVM: x86: Fix split-irqchip vs interrupt injection window request
kvm_cpu_accept_dm_intr and kvm_vcpu_ready_for_interrupt_injection are
a hodge-podge of conditions, hacked together to get something that
more or less works. But what is actually needed is much simpler;
in both cases the fundamental question is, do we have a place to stash
an interrupt if userspace does KVM_INTERRUPT?
In userspace irqchip mode, that is !vcpu->arch.interrupt.injected.
Currently kvm_event_needs_reinjection(vcpu) covers it, but it is
unnecessarily restrictive.
In split irqchip mode it's a bit more complicated, we need to check
kvm_apic_accept_pic_intr(vcpu) (the IRQ window exit is basically an INTACK
cycle and thus requires ExtINTs not to be masked) as well as
!pending_userspace_extint(vcpu). However, there is no need to
check kvm_event_needs_reinjection(vcpu), since split irqchip keeps
pending ExtINT state separate from event injection state, and checking
kvm_cpu_has_interrupt(vcpu) is wrong too since ExtINT has higher
priority than APIC interrupts. In fact the latter fixes a bug:
when userspace requests an IRQ window vmexit, an interrupt in the
local APIC can cause kvm_cpu_has_interrupt() to be true and thus
kvm_vcpu_ready_for_interrupt_injection() to return false. When this
happens, vcpu_run does not exit to userspace but the interrupt window
vmexits keep occurring. The VM loops without any hope of making progress.
we realize two things. First, thanks to the previous patch the complex
conditional can reuse !kvm_cpu_has_extint(vcpu). Second, the interrupt
window request in vcpu_enter_guest()
should be kept in sync with kvm_vcpu_ready_for_interrupt_injection():
it is unnecessary to ask the processor for an interrupt window
if we would not be able to return to userspace. Therefore,
kvm_cpu_accept_dm_intr(vcpu) is basically !kvm_cpu_has_extint(vcpu)
ANDed with the existing check for masked ExtINT. It all makes sense:
- we can accept an interrupt from userspace if there is a place
to stash it (and, for irqchip split, ExtINTs are not masked).
Interrupts from userspace _can_ be accepted even if right now
EFLAGS.IF=0.
- in order to tell userspace we will inject its interrupt ("IRQ
window open" i.e. kvm_vcpu_ready_for_interrupt_injection), both
KVM and the vCPU need to be ready to accept the interrupt.
... and this is what the patch implements.
Reported-by: David Woodhouse <dwmw@amazon.co.uk> Analyzed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Nikos Tsironis <ntsironis@arrikto.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Paolo Bonzini [Fri, 27 Nov 2020 07:53:52 +0000 (08:53 +0100)]
KVM: x86: handle !lapic_in_kernel case in kvm_cpu_*_extint
Centralize handling of interrupts from the userspace APIC
in kvm_cpu_has_extint and kvm_cpu_get_extint, since
userspace APIC interrupts are handled more or less the
same as ExtINTs are with split irqchip. This removes
duplicated code from kvm_cpu_has_injectable_intr and
kvm_cpu_has_interrupt, and makes the code more similar
between kvm_cpu_has_{extint,interrupt} on one side
and kvm_cpu_get_{extint,interrupt} on the other.
Cc: stable@vger.kernel.org Reviewed-by: Filippo Sironi <sironi@amazon.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>