net: sched: Use struct_size() helper in kvmalloc()
Make use of the struct_size() helper instead of an open-coded version,
in order to avoid any potential type mistakes or integer overflows
that, in the worst scenario, could lead to heap overflows.
net/sched/sch_api.c b193e15ac69d ("net: prevent user from passing illegal stab size") 69508d43334e ("net_sched: Use struct_size() and flex_array_size() helpers")
Merge tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from mac80211, netfilter and bpf.
Current release - regressions:
- bpf, cgroup: assign cgroup in cgroup_sk_alloc when called from
interrupt
- mdio: revert mechanical patches which broke handling of optional
resources
- dev_addr_list: prevent address duplication
Previous releases - regressions:
- sctp: break out if skb_header_pointer returns NULL in sctp_rcv_ootb
(NULL deref)
- Revert "mac80211: do not use low data rates for data frames with no
ack flag", fixing broadcast transmissions
- mac80211: fix use-after-free in CCMP/GCMP RX
- netfilter: include zone id in tuple hash again, minimize collisions
- netfilter: nf_tables: unlink table before deleting it (race -> UAF)
- netfilter: log: work around missing softdep backend module
- mptcp: don't return sockets in foreign netns
- sched: flower: protect fl_walk() with rcu (race -> UAF)
- ixgbe: fix NULL pointer dereference in ixgbe_xdp_setup
- smsc95xx: fix stalled rx after link change
- enetc: fix the incorrect clearing of IF_MODE bits
- ipv4: fix rtnexthop len when RTA_FLOW is present
- dsa: mv88e6xxx: 6161: use correct MAX MTU config method for this
SKU
- e100: fix length calculation & buffer overrun in ethtool::get_regs
Previous releases - always broken:
- mac80211: fix using stale frag_tail skb pointer in A-MSDU tx
- mac80211: drop frames from invalid MAC address in ad-hoc mode
- af_unix: fix races in sk_peer_pid and sk_peer_cred accesses (race
-> UAF)
- bpf, x86: Fix bpf mapping of atomic fetch implementation
- bpf: handle return value of BPF_PROG_TYPE_STRUCT_OPS prog
- netfilter: ip6_tables: zero-initialize fragment offset
- mhi: fix error path in mhi_net_newlink
- af_unix: return errno instead of NULL in unix_create1() when over
the fs.file-max limit
Misc:
- bpf: exempt CAP_BPF from checks against bpf_jit_limit
- netfilter: conntrack: make max chain length random, prevent
guessing buckets by attackers
- netfilter: nf_nat_masquerade: make async masq_inet6_event handling
generic, defer conntrack walk to work queue (prevent hogging RTNL
lock)"
* tag 'net-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
net: stmmac: fix EEE init issue when paired with EEE capable PHYs
net: dev_addr_list: handle first address in __hw_addr_add_ex
net: sched: flower: protect fl_walk() with rcu
net: introduce and use lock_sock_fast_nested()
net: phy: bcm7xxx: Fixed indirect MMD operations
net: hns3: disable firmware compatible features when uninstall PF
net: hns3: fix always enable rx vlan filter problem after selftest
net: hns3: PF enable promisc for VF when mac table is overflow
net: hns3: fix show wrong state when add existing uc mac address
net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
net: hns3: don't rollback when destroy mqprio fail
net: hns3: remove tc enable checking
net: hns3: do not allow call hns3_nic_net_open repeatedly
ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
net: bridge: mcast: Associate the seqcount with its protecting lock.
net: mdio-ipq4019: Fix the error for an optional regs resource
net: hns3: fix hclge_dbg_dump_tm_pg() stack usage
net: mdio: mscc-miim: Fix the mdio controller
af_unix: Return errno instead of NULL in unix_create1().
...
Aya Levin [Mon, 13 Sep 2021 13:49:47 +0000 (16:49 +0300)]
net/mlx5e: Mutually exclude setting of TX-port-TS and MQPRIO in channel mode
TX-port-TS hijacks the PTP traffic to a specific HW TX-queue. This
conflicts with MQPRIO in channel mode, which specifies explicitly which
TC accepts the packet. This patch mutually excludes the above
configuration.
Fixes: ec60c4581bd9 ("net/mlx5e: Support MQPRIO channel mode") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Lama Kayal [Sun, 29 Aug 2021 08:26:03 +0000 (11:26 +0300)]
net/mlx5e: Fix the presented RQ index in PTP stats
PTP-RQ counters title format contains PTP-RQ identifier, which is
mistakenly not passed to sprinft().
This leads to unexpected garbage values instead.
This patch fixes it.
When setting number of completion EQs of the SF, consider number of
online CPUs.
Without this consideration, when number of online cpus are less than 8,
unnecessary 8 completion EQs are allocated.
Aya Levin [Thu, 23 Sep 2021 12:30:01 +0000 (15:30 +0300)]
net/mlx5: Avoid generating event after PPS out in Real time mode
When in Real-time mode, HW clock is synced with the PTP daemon. Hence
driver should not re-calibrate the next pulse (via MTPPSE repetitive
events mechanism).
This patch arms repetitive events only in free-running mode.
Fixes: 432119de33d9 ("net/mlx5: Add cyc2time HW translation mode support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Aya Levin [Thu, 23 Sep 2021 13:56:09 +0000 (16:56 +0300)]
net/mlx5: Force round second at 1PPS out start time
Allow configuration of 1PPS start time only with time-stamp representing
a round second. Prior to this patch driver allowed setting of a
non-round-second which is not supported by the device. Avoid unexpected
behavior by restricting start-time configuration to a round-second.
Fixes: 4272f9b88db9 ("net/mlx5e: Change 1PPS out scheme") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
net/mlx5: E-Switch, Fix double allocation of acl flow counter
Flow counter is allocated in eswitch legacy acl setting functions
without checking if already allocated by previous setting. Add a check
to avoid such double allocation.
* Add netdev->tc_to_txq rollback in case of failure in
mlx5e_update_netdev_queues().
* Fix broken transition between the two modes:
MQPRIO DCB mode with tc==8, and MQPRIO channel mode.
* Disable MQPRIO channel mode if re-attaching with a different number
of channels.
* Improve code sharing.
net/mlx5e: Keep the value for maximum number of channels in-sync
The value for maximum number of channels is first calculated based
on the netdev's profile and current function resources (specifically,
number of MSIX vectors, which depends among other things on the number
of online cores in the system).
This value is then used to calculate the netdev's number of rxqs/txqs.
Once created (by alloc_etherdev_mqs), the number of netdev's rxqs/txqs
is constant and we must not exceed it.
To achieve this, keep the maximum number of channels in sync upon any
netdevice re-attach.
Use mlx5e_get_max_num_channels() for calculating the number of netdev's
rxqs/txqs. After netdev is created, use mlx5e_calc_max_nch() (which
coinsiders core device resources, profile, and netdev) to init or
update priv->max_nch.
Before this patch, the value of priv->max_nch might get out of sync,
mistakenly allowing accesses to out-of-bounds objects, which would
crash the system.
Track the number of channels stats structures used in a separate
field, as they are persistent to suspend/resume operations. All the
collected stats of every channel index that ever existed should be
preserved. They are reset only when struct mlx5e_priv is,
in mlx5e_priv_cleanup(), which is part of the profile changing flow.
There is no point anymore in blocking a profile change due to max_nch
mismatch in mlx5e_netdev_change_profile(). Remove the limitation.
Fixes: a1f240f18017 ("net/mlx5e: Adjust to max number of channles when re-attaching") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Raed Salem [Thu, 26 Aug 2021 14:07:17 +0000 (17:07 +0300)]
net/mlx5e: IPSEC RX, enable checksum complete
Currently in Rx data path IPsec crypto offloaded packets uses
csum_none flag, so checksum is handled by the stack, this naturally
have some performance/cpu utilization impact on such flows. As Nvidia
NIC starting from ConnectX6DX provides checksum complete value out of
the box also for such flows there is no sense in taking csum_none path,
furthermore the stack (xfrm) have the method to handle checksum complete
corrections for such flows i.e. IPsec trailer removal and consequently
checksum value adjustment.
Because of the above and in addition the ConnectX6DX is the first HW
which supports IPsec crypto offload then it is safe to report csum
complete for IPsec offloaded traffic.
Fixes: b2ac7541e377 ("net/mlx5e: IPsec: Add Connect-X IPsec Rx data path offload") Signed-off-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Merge tag 'gpio-fixes-for-v5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"A single fix for the gpio-pca953x driver and two commits updating the
MAINTAINERS entries for Mun Yew Tham (GPIO specific) and myself
(treewide after a change in professional situation).
Summary:
- don't ignore I2C errors in gpio-pca953x
- update MAINTAINERS entries for Mun Yew Tham and myself"
* tag 'gpio-fixes-for-v5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
MAINTAINERS: Update Mun Yew Tham as Altera Pio Driver maintainer
MAINTAINERS: update my email address
gpio: pca953x: do not ignore i2c errors
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Not much too exciting here, although two syzkaller bugs that seem to
have 9 lives may have finally been squashed.
Several core bugs and a batch of driver bug fixes:
- Fix compilation problems in qib and hfi1
- Do not corrupt the joined multicast group state when using
SEND_ONLY
- Several CMA bugs, a reference leak for listening and two syzkaller
crashers
- Various bug fixes for irdma
- Fix a Sleeping while atomic bug in usnic
- Properly sanitize kernel pointers in dmesg
- Two bugs in the 64b CQE support for hns"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Add the check of the CQE size of the user space
RDMA/hns: Fix the size setting error when copying CQE in clean_cq()
RDMA/hfi1: Fix kernel pointer leak
RDMA/usnic: Lock VF with mutex instead of spinlock
RDMA/hns: Work around broken constant propagation in gcc 8
RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
RDMA/cma: Do not change route.addr.src_addr.ss_family
RDMA/irdma: Report correct WC error when there are MW bind errors
RDMA/irdma: Report correct WC error when transport retry counter is exceeded
RDMA/irdma: Validate number of CQ entries on create CQ
RDMA/irdma: Skip CQP ring during a reset
MAINTAINERS: Update Broadcom RDMA maintainers
RDMA/cma: Fix listener leak in rdma_cma_listen_on_all() failure
IB/cma: Do not send IGMP leaves for sendonly Multicast groups
IB/qib: Fix clang confusion of NULL pointer comparison
Eric Dumazet [Wed, 29 Sep 2021 22:57:50 +0000 (15:57 -0700)]
af_unix: fix races in sk_peer_pid and sk_peer_cred accesses
Jann Horn reported that SO_PEERCRED and SO_PEERGROUPS implementations
are racy, as af_unix can concurrently change sk_peer_pid and sk_peer_cred.
In order to fix this issue, this patch adds a new spinlock that needs
to be used whenever these fields are read or written.
Jann also pointed out that l2cap_sock_get_peer_pid_cb() is currently
reading sk->sk_peer_pid which makes no sense, as this field
is only possibly set by AF_UNIX sockets.
We will have to clean this in a separate patch.
This could be done by reverting b48596d1dc25 "Bluetooth: L2CAP: Add get_peer_pid callback"
or implementing what was truly expected.
Fixes: 109f6e39fa07 ("af_unix: Allow SO_PEERCRED to work across namespaces.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 30 Sep 2021 13:17:10 +0000 (14:17 +0100)]
Merge branch 'snmp-optimizations'
Eric Dumazet says:
====================
net: snmp: minor optimizations
Fetching many SNMP counters on hosts with large number of cpus
takes a lot of time. mptcp still uses the old non-batched
fashion which is not cache friendly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wong Vee Khee [Thu, 30 Sep 2021 06:44:36 +0000 (14:44 +0800)]
net: stmmac: fix EEE init issue when paired with EEE capable PHYs
When STMMAC is paired with Energy-Efficient Ethernet(EEE) capable PHY,
and the PHY is advertising EEE by default, we need to enable EEE on the
xPCS side too, instead of having user to manually trigger the enabling
config via ethtool.
Fixed this by adding xpcs_config_eee() call in stmmac_eee_init().
Fixes: 7617af3d1a5e ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet") Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Xing [Wed, 29 Sep 2021 17:56:05 +0000 (10:56 -0700)]
ixgbe: let the xdpdrv work with more than 64 cpus
Originally, ixgbe driver doesn't allow the mounting of xdpdrv if the
server is equipped with more than 64 cpus online. So it turns out that
the loading of xdpdrv causes the "NOMEM" failure.
Actually, we can adjust the algorithm and then make it work through
mapping the current cpu to some xdp ring with the protect of @tx_lock.
Here are some numbers before/after applying this patch with xdp-example
loaded on the eth0X:
As server (rx path):
Before After
TCP_STREAM send-64 1416.49 1383.12
TCP_STREAM send-128 3141.49 3055.50
TCP_STREAM send-512 9488.73 9487.44
TCP_STREAM send-1k 9491.17 9356.22 (not stable)
TCP_RR send-1 23617.74 23601.60
...
Notice: the TCP_RR mode is unstable as the official document explains.
I tested many times with different parameters combined through netperf.
Though the result is not that accurate, I cannot see much influence on
this patch. The static key is places on the hot path, but it actually
shouldn't cause a huge regression theoretically.
Co-developed-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Shujin Li <lishujin@kuaishou.com> Signed-off-by: Jason Xing <xingwanli@kuaishou.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 30 Sep 2021 12:36:46 +0000 (13:36 +0100)]
Merge branch 'SO_RESEVED_MEM'
Wei Wang says:
====================
net: add new socket option SO_RESERVE_MEM
This patch series introduces a new socket option SO_RESERVE_MEM.
This socket option provides a mechanism for users to reserve a certain
amount of memory for the socket to use. When this option is set, kernel
charges the user specified amount of memory to memcg, as well as
sk_forward_alloc. This amount of memory is not reclaimable and is
available in sk_forward_alloc for this socket.
With this socket option set, the networking stack spends less cycles
doing forward alloc and reclaim, which should lead to better system
performance, with the cost of an amount of pre-allocated and
unreclaimable memory, even under memory pressure.
With a tcp_stream test with 10 flows running on a simulated 100ms RTT
link, I can see the cycles spent in __sk_mem_raise_allocated() dropping
by ~0.02%. Not a whole lot, since we already have logic in
sk_mem_uncharge() to only reclaim 1MB when sk_forward_alloc has more
than 2MB free space. But on a system suffering memory pressure
constently, the savings should be more.
The first patch is the implementation of this socket option. The
following 2 patches change the tcp stack to make use of this reserved
memory when under memory pressure. This makes the tcp stack behavior
more flexible when under memory pressure, and provides a way for user to
control the distribution of the memory among its sockets.
With a TCP connection on a simulated 100ms RTT link, the default
throughput under memory pressure is ~500Kbps. With SO_RESERVE_MEM set to
100KB, the throughput under memory pressure goes up to ~3.5Mbps.
Change since v2:
- Added description for new field added in struct sock in patch 1
Change since v1:
- Added performance stats in cover letter and rebased
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Wed, 29 Sep 2021 17:25:13 +0000 (10:25 -0700)]
tcp: adjust rcv_ssthresh according to sk_reserved_mem
When user sets SO_RESERVE_MEM socket option, in order to utilize the
reserved memory when in memory pressure state, we adjust rcv_ssthresh
according to the available reserved memory for the socket, instead of
using 4 * advmss always.
Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Wed, 29 Sep 2021 17:25:12 +0000 (10:25 -0700)]
tcp: adjust sndbuf according to sk_reserved_mem
If user sets SO_RESERVE_MEM socket option, in order to fully utilize the
reserved memory in memory pressure state on the tx path, we modify the
logic in sk_stream_moderate_sndbuf() to set sk_sndbuf according to
available reserved memory, instead of MIN_SOCK_SNDBUF, and adjust it
when new data is acked.
Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Wed, 29 Sep 2021 17:25:11 +0000 (10:25 -0700)]
net: add new socket option SO_RESERVE_MEM
This socket option provides a mechanism for users to reserve a certain
amount of memory for the socket to use. When this option is set, kernel
charges the user specified amount of memory to memcg, as well as
sk_forward_alloc. This amount of memory is not reclaimable and is
available in sk_forward_alloc for this socket.
With this socket option set, the networking stack spends less cycles
doing forward alloc and reclaim, which should lead to better system
performance, with the cost of an amount of pre-allocated and
unreclaimable memory, even under memory pressure.
Note:
This socket option is only available when memory cgroup is enabled and we
require this reserved memory to be charged to the user's memcg. We hope
this could avoid mis-behaving users to abused this feature to reserve a
large amount on certain sockets and cause unfairness for others.
Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Wed, 29 Sep 2021 15:32:24 +0000 (08:32 -0700)]
net: dev_addr_list: handle first address in __hw_addr_add_ex
struct dev_addr_list is used for device addresses, unicast addresses
and multicast addresses. The first of those needs special handling
of the main address - netdev->dev_addr points directly the data
of the entry and drivers write to it freely, so we can't maintain
it in the rbtree (for now, at least, to be fixed in net-next).
Current work around sprinkles special handling of the first
address on the list throughout the code but it missed the case
where address is being added. First address will not be visible
during subsequent adds.
Syzbot found a warning where unicast addresses are modified
without holding the rtnl lock, tl;dr is that team generates
the same modification multiple times, not necessarily when
right locks are held.
In the repro we have:
macvlan -> team -> veth
macvlan adds a unicast address to the team. Team then pushes
that address down to its memebers (veths). Next something unrelated
makes team sync member addrs again, and because of the bug
the addr entries get duplicated in the veths. macvlan gets
removed, removes its addr from team which removes only one
of the duplicated addresses from veths. This removal is done
under rtnl. Next syzbot uses iptables to add a multicast addr
to team (which does not hold rtnl lock). Team syncs veth addrs,
but because veths' unicast list still has the duplicate it will
also get sync, even though this update is intended for mc addresses.
Again, uc address updates need rtnl lock, boom.
Reported-by: syzbot+7a2ab2cdc14d134de553@syzkaller.appspotmail.com Fixes: 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs with IPv6 addresses, performance of changing link state, attaching a VRF, changing an IPv6 address, etc. go down dramtically.") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Wed, 29 Sep 2021 15:28:53 +0000 (16:28 +0100)]
net: phy: marvell10g: add downshift tunable support
Add support for the downshift tunable for the Marvell 88x3310 PHY.
Downshift is only usable with firmware 0.3.5.0 and later.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Patch that refactored fl_walk() to use idr_for_each_entry_continue_ul()
also removed rcu protection of individual filters which causes following
use-after-free when filter is deleted concurrently. Fix fl_walk() to obtain
rcu read lock while iterating and taking the filter reference and temporary
release the lock while calling arg->fn() callback that can sleep.
KASAN trace:
[ 352.773640] ==================================================================
[ 352.775041] BUG: KASAN: use-after-free in fl_walk+0x159/0x240 [cls_flower]
[ 352.776304] Read of size 4 at addr ffff8881c8251480 by task tc/2987
Fixes: d39d714969cd ("idr: introduce idr_for_each_entry_continue_ul()") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 29 Sep 2021 13:27:53 +0000 (14:27 +0100)]
octeontx2-af: Remove redundant initialization of variable pin
The variable pin is being initialized with a value that is never
read, it is being updated later on in only one case of a switch
statement. The assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The macb PTP support currently implements the `gettime64` callback to allow
to retrieve the hardware clock time. Update the implementation to provide
the `gettimex64` callback instead.
The difference between the two is that with `gettime64` a snapshot of the
system clock is taken before and after invoking the callback. Whereas
`gettimex64` expects the callback itself to take the snapshots.
To get the time from the macb Ethernet core multiple register accesses have
to be done. Only one of which will happen at the time reported by the
function. This leads to a non-symmetric delay and adds a slight offset
between the hardware and system clock time when using the `gettime64`
method. This offset can be a few 100 nanoseconds. Switching to the
`gettimex64` method allows for a more precise correlation of the hardware
and system clocks and results in a lower offset between the two.
On a Xilinx ZynqMP system `phc2sys` reports a delay of 1120 ns before and
300 ns after the patch. With the latter being mostly symmetric.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We added a state variable to track whether a certain port
was VLAN filtering or not, but we can just inquire the DSA
core about this.
Cc: Vladimir Oltean <olteanv@gmail.com> Cc: Mauri Sandberg <sandberg@mailfence.com> Cc: DENG Qingfang <dqfext@gmail.com> Cc: Alvin Å ipraga <alsi@bang-olufsen.dk> Cc: Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
the problem originates from uncorrect lock annotation in the mptcp
code and is only visible since commit 2dcb96bacce3 ("net: core: Correct
the sock::sk_lock.owned lockdep annotations"), but is present since
the port-based endpoint support initial implementation.
This patch addresses the issue introducing a nested variant of
lock_sock_fast() and using it in the relevant code path.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Fixes: 2dcb96bacce3 ("net: core: Correct the sock::sk_lock.owned lockdep annotations") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Reported-and-tested-by: syzbot+1dd53f7a89b299d59eaf@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Adds XDP_PASS, XDP_TX, XDP_DROP and XDP_REDIRECT support
for netdev PF.
Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-af: Adjust LA pointer for cpt parse header
In case of ltype NPC_LT_LA_CPT_HDR, LA pointer is pointing to the
start of cpt parse header. Since cpt parse header has veriable
length padding, this will be a problem for DMAC extraction. Adding
KPU profile changes to adjust the LA pointer to start at ether header
in case of cpt parse header by
- Adding ptr advance in pkind 58 to a fixed value 40
- Adding variable length offset 7 and mask 7 (pad len in
CPT_PARSE_HDR).
Also added the missing static declaration for npc_set_var_len_offset_pkind
function.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net_sched: Use struct_size() and flex_array_size() helpers
Make use of the struct_size() and flex_array_size() helpers instead of
an open-coded version, in order to avoid any potential type mistakes
or integer overflows that, in the worse scenario, could lead to heap
overflows.
Per gpio_chip interface, error shall be proparated to the caller.
Attempt to silent diagnostics by returning zero (as written in the
comment) is plain wrong, because the zero return can be interpreted by
the caller as the gpio value.
Fixes: cf530217408e ("devlink: Notify users when objects are accessible") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/2ed1159291f2a589b013914f2b60d8172fc525c1.1632925030.git.leonro@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'sound-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became a slightly large collection of changes, partly because
I've been off in the last weeks. Most of changes are small and
scattered while a bit big change is found in HD-audio Realtek codec
driver; it's a very device-specific fix that has been long wanted, so
I decided to pick up although it's in the middle RC.
Some highlights:
- A new guard ioctl for ALSA rawmidi API to avoid the misuse of the
new timestamp framing mode; it's for a regression fix
- HD-audio: a revert of the 5.15 change that might work badly, new
quirks for Lenovo Legion & co, a follow-up fix for CS8409
- ASoC: lots of SOF-related fixes, fsl component fixes, corrections
of mediatek drivers
- USB-audio: fix for the PM resume
- FireWire: oxfw and motu fixes"
* tag 'sound-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (25 commits)
ALSA: pcsp: Make hrtimer forwarding more robust
ALSA: rawmidi: introduce SNDRV_RAWMIDI_IOCTL_USER_PVERSION
ALSA: firewire-motu: fix truncated bytes in message tracepoints
ASoC: SOF: trace: Omit error print when waking up trace sleepers
ASoC: mediatek: mt8195: remove wrong fixup assignment on HDMITX
ASoC: SOF: loader: Re-phrase the missing firmware error to avoid duplication
ASoC: SOF: loader: release_firmware() on load failure to avoid batching
ALSA: hda/cs8409: Setup Dolphin Headset Mic as Phantom Jack
ALSA: pcxhr: "fix" PCXHR_REG_TO_PORT definition
ASoC: SOF: imx: imx8m: Bar index is only valid for IRAM and SRAM types
ASoC: SOF: imx: imx8: Bar index is only valid for IRAM and SRAM types
ASoC: SOF: Fix DSP oops stack dump output contents
ALSA: hda/realtek: Quirks to enable speaker output for Lenovo Legion 7i 15IMHG05, Yoga 7i 14ITL5/15ITL5, and 13s Gen2 laptops.
ALSA: usb-audio: Unify mixer resume and reset_resume procedure
Revert "ALSA: hda: Drop workaround for a hang at shutdown again"
ALSA: oxfw: fix transmission method for Loud models based on OXFW971
ASoC: mediatek: common: handle NULL case in suspend/resume function
ASoC: fsl_xcvr: register platform component before registering cpu dai
ASoC: fsl_spdif: register platform component before registering cpu dai
ASoC: fsl_micfil: register platform component before registering cpu dai
...
Mianhan Liu [Wed, 29 Sep 2021 05:31:09 +0000 (13:31 +0800)]
net/ipv4/datagram.c: remove superfluous header files from datagram.c
datagram.c hasn't use any macro or function declared in linux/ip.h.
Thus, these files can be removed from datagram.c safely without
affecting the compilation of the net/ipv4 module
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
Mianhan Liu [Wed, 29 Sep 2021 06:41:06 +0000 (14:41 +0800)]
net/dsa/tag_ksz.c: remove superfluous headers
tag_ksz.c hasn't use any macro or function declared in linux/slab.h.
Thus, these files can be removed from tag_ksz.c safely without
affecting the compilation of the ./net/dsa module
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
Mianhan Liu [Wed, 29 Sep 2021 06:36:11 +0000 (14:36 +0800)]
net/dsa/tag_8021q.c: remove superfluous headers
tag_8021q.c hasn't use any macro or function declared in linux/if_bridge.h.
Thus, these files can be removed from tag_8021q.c safely without
affecting the compilation of the ./net/dsa module
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
When EEE support was added to the 28nm EPHY it was assumed that it would
be able to support the standard clause 45 over clause 22 register access
method. It turns out that the PHY does not support that, which is the
very reason for using the indirect shadow mode 2 bank 3 access method.
Implement {read,write}_mmd to allow the standard PHY library routines
pertaining to EEE querying and configuration to work correctly on these
PHYs. This forces us to implement a __phy_set_clr_bits() function that
does not grab the MDIO bus lock since the PHY driver's {read,write}_mmd
functions are always called with that lock held.
Fixes: 83ee102a6998 ("net: phy: bcm7xxx: add support for 28nm EPHY") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4: Use array_size() helper in copy_to_user()
Use array_size() helper instead of the open-coded version in
copy_to_user(). These sorts of multiplication factors need
to be wrapped in array_size().
Link: https://github.com/KSPP/linux/issues/160 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ethtool: ioctl: Use array_size() helper in copy_{from,to}_user()
Use array_size() helper instead of the open-coded version in
copy_{from,to}_user(). These sorts of multiplication factors
need to be wrapped in array_size().
net: hns3: disable firmware compatible features when uninstall PF
Currently, the firmware compatible features are enabled in PF driver
initialization process, but they are not disabled in PF driver
deinitialization process and firmware keeps these features in enabled
status.
In this case, if load an old PF driver (for example, in VM) which not
support the firmware compatible features, firmware will still send mailbox
message to PF when link status changed and PF will print
"un-supported mailbox message, code = 201".
To fix this problem, disable these firmware compatible features in PF
driver deinitialization process.
Fixes: ed8fb4b262ae ("net: hns3: add link change event report") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: fix always enable rx vlan filter problem after selftest
Currently, the rx vlan filter will always be disabled before selftest and
be enabled after selftest as the rx vlan filter feature is fixed on in
old device earlier than V3.
However, this feature is not fixed in some new devices and it can be
disabled by user. In this case, it is wrong if rx vlan filter is enabled
after selftest. So fix it.
Fixes: bcc26e8dc432 ("net: hns3: remove unused code in hns3_self_test()") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: PF enable promisc for VF when mac table is overflow
If unicast mac address table is full, and user add a new mac address, the
unicast promisc needs to be enabled for the new unicast mac address can be
used. So does the multicast promisc.
Now this feature has been implemented for PF, and VF should be implemented
too. When the mac table of VF is overflow, PF will enable promisc for this
VF.
Fixes: 1e6e76101fd9 ("net: hns3: configure promisc mode for VF asynchronously") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: fix show wrong state when add existing uc mac address
Currently, if function adds an existing unicast mac address, eventhough
driver will not add this address into hardware, but it will return 0 in
function hclge_add_uc_addr_common(). It will cause the state of this
unicast mac address is ACTIVE in driver, but it should be in TO-ADD state.
To fix this problem, function hclge_add_uc_addr_common() returns -EEXIST
if mac address is existing, and delete two error log to avoid printing
them all the time after this modification.
Fixes: 72110b567479 ("net: hns3: return 0 and print warning when hit duplicate MAC") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: fix mixed flag HCLGE_FLAG_MQPRIO_ENABLE and HCLGE_FLAG_DCB_ENABLE
HCLGE_FLAG_MQPRIO_ENABLE is supposed to set when enable
multiple TCs with tc mqprio, and HCLGE_FLAG_DCB_ENABLE is
supposed to set when enable multiple TCs with ets. But
the driver mixed the flags when updating the tm configuration.
Furtherly, PFC should be available when HCLGE_FLAG_MQPRIO_ENABLE
too, so remove the unnecessary limitation.
Fixes: 5a5c90917467 ("net: hns3: add support for tc mqprio offload") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: don't rollback when destroy mqprio fail
For destroy mqprio is irreversible in stack, so it's unnecessary
to rollback the tc configuration when destroy mqprio failed.
Otherwise, it may cause the configuration being inconsistent
between driver and netstack.
As the failure is usually caused by reset, and the driver will
restore the configuration after reset, so it can keep the
configuration being consistent between driver and hardware.
Fixes: 5a5c90917467 ("net: hns3: add support for tc mqprio offload") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, in function hns3_nic_set_real_num_queue(), the
driver doesn't report the queue count and offset for disabled
tc. If user enables multiple TCs, but only maps user
priorities to partial of them, it may cause the queue range
of the unmapped TC being displayed abnormally.
Fix it by removing the tc enable checking, ensure the queue
count is not zero.
With this change, the tc_en is useless now, so remove it.
Fixes: a75a8efa00c5 ("net: hns3: Fix tc setup when netdev is first up") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: do not allow call hns3_nic_net_open repeatedly
hns3_nic_net_open() is not allowed to called repeatly, but there
is no checking for this. When doing device reset and setup tc
concurrently, there is a small oppotunity to call hns3_nic_net_open
repeatedly, and cause kernel bug by calling napi_enable twice.
The calltrace information is like below:
[ 3078.222780] ------------[ cut here ]------------
[ 3078.230255] kernel BUG at net/core/dev.c:6991!
[ 3078.236224] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 3078.243431] Modules linked in: hns3 hclgevf hclge hnae3 vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O)
[ 3078.258880] CPU: 0 PID: 295 Comm: kworker/u8:5 Tainted: G O 5.14.0-rc4+ #1
[ 3078.269102] Hardware name: , BIOS KpxxxFPGA 1P B600 V181 08/12/2021
[ 3078.276801] Workqueue: hclge hclge_service_task [hclge]
[ 3078.288774] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[ 3078.296168] pc : napi_enable+0x80/0x84
tc qdisc sho[w 3d0e7v8 .e3t0h218 79] lr : hns3_nic_net_open+0x138/0x510 [hns3]
Once hns3_nic_net_open() is excute success, the flag
HNS3_NIC_STATE_DOWN will be cleared. So add checking for this
flag, directly return when HNS3_NIC_STATE_DOWN is no set.
Fixes: e888402789b9 ("net: hns3: call hns3_nic_net_open() while doing HNAE3_UP_CLIENT") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Wed, 29 Sep 2021 07:26:11 +0000 (15:26 +0800)]
mctp: Do inits as a subsys_initcall
In a future change, we'll want to provide a registration call for
mctp-specific devices. This requires us to have the networks established
before device driver inits, so run the core init as a subsys_initcall.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Wed, 29 Sep 2021 07:26:10 +0000 (15:26 +0800)]
mctp: Add tracepoints for tag/key handling
The tag allocation, release and bind events are somewhat opaque outside
the kernel; this change adds a few tracepoints to assist in
instrumentation and debugging.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Wed, 29 Sep 2021 07:26:09 +0000 (15:26 +0800)]
mctp: Implement a timeout for tags
Currently, a MCTP (local-eid,remote-eid,tag) tuple is allocated to a
socket on send, and only expires when the socket is closed.
This change introduces a tag timeout, freeing the tuple after a fixed
expiry - currently six seconds. This is greater than (but close to) the
max response timeout in upper-layer bindings.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Wed, 29 Sep 2021 07:26:08 +0000 (15:26 +0800)]
mctp: Add refcounts to mctp_dev
Currently, we tie the struct mctp_dev lifetime to the underlying struct
net_device, and hold/put that device as a proxy for a separate mctp_dev
refcount. This works because we're not holding any references to the
mctp_dev that are different from the netdev lifetime.
In a future change we'll break that assumption though, as we'll need to
hold mctp_dev references in a workqueue, which might live past the
netdev unregister notification.
In order to support that, this change introduces a refcount on the
mctp_dev, currently taken by the net_device->mctp_ptr reference, and
released on netdev unregister events. We can then use this for future
references that might outlast the net device.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr [Wed, 29 Sep 2021 07:26:07 +0000 (15:26 +0800)]
mctp: locking, lifetime and validity changes for sk_keys
We will want to invalidate sk_keys in a future change, which will
require a boolean flag to mark invalidated items in the socket & net
namespace lists. We'll also need to take a reference to keys, held over
non-atomic contexts, so we need a refcount on keys also.
This change adds a validity flag (currently always true) and refcount to
struct mctp_sk_key. With a refcount on the keys, using RCU no longer
makes much sense; we have exact indications on the lifetime of keys. So,
we also change the RCU list traversal to a locked implementation.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
The LAN8804 PHY has same features as that of LAN8814 PHY except that it
doesn't support 1588, SyncE or Q-USGMII.
This PHY is found inside the LAN966X switches.
Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe: Fix NULL pointer dereference in ixgbe_xdp_setup
The ixgbe driver currently generates a NULL pointer dereference with
some machine (online cpus < 63). This is due to the fact that the
maximum value of num_xdp_queues is nr_cpu_ids. Code is in
"ixgbe_set_rss_queues"".
Here's how the problem repeats itself:
Some machine (online cpus < 63), And user set num_queues to 63 through
ethtool. Code is in the "ixgbe_set_channels",
adapter->ring_feature[RING_F_FDIR].limit = count;
It becomes 63.
When user use xdp, "ixgbe_set_rss_queues" will set queues num.
adapter->num_rx_queues = rss_i;
adapter->num_tx_queues = rss_i;
adapter->num_xdp_queues = ixgbe_xdp_queues(adapter);
And rss_i's value is from
f = &adapter->ring_feature[RING_F_FDIR];
rss_i = f->indices = f->limit;
So "num_rx_queues" > "num_xdp_queues", when run to "ixgbe_xdp_setup",
for (i = 0; i < adapter->num_rx_queues; i++)
if (adapter->xdp_ring[i]->xsk_umem)
So I fix ixgbe_max_channels so that it will not allow a setting of queues
to be higher than the num_online_cpus(). And when run to ixgbe_xdp_setup,
take the smaller value of num_rx_queues and num_xdp_queues.
Fixes: 4a9b32f30f80 ("ixgbe: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Feng Zhou <zhoufeng.zf@bytedance.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 29 Sep 2021 09:30:56 +0000 (10:30 +0100)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
t-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-09-28
This series contains updates to ice driver only.
Dave adds support for QoS DSCP allowing for DSCP to TC mapping via APP
TLVs.
Ani adds enforcement of DSCP to only supported devices with the
introduction of a feature bitmap and corrects messaging of unsupported
modules based on link mode.
Jake refactors devlink info functions to be void as the functions no
longer return errors.
Jeff fixes a macro name to properly reflect the value.
Len Baker converts a kzalloc allocation to, the preferred, kcalloc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 29 Sep 2021 09:27:33 +0000 (10:27 +0100)]
Merge branch 'octeontx2-ptp-vf'
Subbaraya Sundeep <sbhatta@marvell.com>
====================
octeontx2: Add PTP support for VFs
PTP is a shared hardware block which can prepend
RX timestamps to packets before directing packets to
PFs or VFs and can notify the TX timestamps to PFs or VFs
via TX completion queue descriptors. Hence adding PTP
support for VFs is exactly similar to PFs with minimal changes.
This patchset adds that PTP support for VFs.
Patch 1 - When an interface is set in promisc/multicast
the same setting is not retained when changing mtu or channels.
This is due to toggling of the interface by driver but not
calling set_rx_mode in the down-up sequence. Since setting
an interface to multicast properly is required for ptp this is
addressed in this patch.
Patch 2 - Changes in VF driver for registering timestamping
ethtool ops and ndo_ioctl.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-nicvf: Add PTP hardware clock support to NIX VF
This patch adds PTP PHC support to NIX VF interfaces. This enables
a VF to run PTP master/slave instance. PTP block being a shared
hardware resource it is recommended to avoid running multiple
PTP instances in the system which will impact the PTP clock
accuracy.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
octeontx2-pf: Enable promisc/allmulti match MCAM entries.
Whenever the interface is brought up/down then set_rx_mode
function is called by the stack which enables promisc/allmulti
MCAM entries. But there are cases when driver brings
interface down and then up such as while changing number
of channels. In these cases promisc/allmulti MCAM entries
are left disabled as set_rx_mode callback is not called.
This patch enables these MCAM entries in all such cases.
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Gleixner [Tue, 28 Sep 2021 14:10:49 +0000 (16:10 +0200)]
net: bridge: mcast: Associate the seqcount with its protecting lock.
The sequence count bridge_mcast_querier::seq is protected by
net_bridge::multicast_lock but seqcount_init() does not associate the
seqcount with the lock. This leads to a warning on PREEMPT_RT because
preemption is still enabled.
Let seqcount_init() associate the seqcount with lock that protects the
write section. Remove lockdep_assert_held_once() because lockdep already checks
whether the associated lock is held.
Fixes: 67b746f94ff39 ("net: bridge: mcast: make sure querier port/address updates are consistent") Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: Mike Galbraith <efault@gmx.de> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20210928141049.593833-1-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cai Huoqing [Tue, 28 Sep 2021 13:48:49 +0000 (21:48 +0800)]
net: mdio-ipq4019: Fix the error for an optional regs resource
The second resource is optional which is only provided on the chipset
IPQ5018. But the blamed commit ignores that and if the resource is
not there it just fails.
the resource is used like this,
if (priv->eth_ldo_rdy) {
val = readl(priv->eth_ldo_rdy);
val |= BIT(0);
writel(val, priv->eth_ldo_rdy);
fsleep(IPQ_PHY_SET_DELAY_US);
}
This patch reverts that to still allow the second resource to be optional
because other SoC have the some MDIO controller and doesn't need to
second resource.
Fixes: fa14d03e014a ("net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()") Signed-off-by: Cai Huoqing <caihuoqing@baidu.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210928134849.2092-1-caihuoqing@baidu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'vfio-v5.15-rc4' of git://github.com/awilliam/linux-vfio
Pull VFIO fixes from Alex Williamson:
- Fix vfio-ap leak on uninit (Jason Gunthorpe)
- Add missing prototype arg name (Colin Ian King)
* tag 'vfio-v5.15-rc4' of git://github.com/awilliam/linux-vfio:
vfio/ap_ops: Add missed vfio_uninit_group_dev()
vfio/pci: add missing identifier name in argument of function prototype
Merge tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull more m68k updates from Geert Uytterhoeven:
- signal handling fixes
- removal of set_fs()
[ The set_fs removal isn't strictly a fix, but it's been pending for a
while and is very welcome. The signal handling fixes resolved an issue
that was incorrectly attributed to the set_fs changes - Linus ]
* tag 'm68k-for-v5.15-tag3' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Remove set_fs()
m68k: Provide __{get,put}_kernel_nofault
m68k: Factor the 8-byte lowlevel {get,put}_user code into helpers
m68k: Use BUILD_BUG for passing invalid sizes to get_user/put_user
m68k: Remove the 030 case in virt_to_phys_slow
m68k: Document that access_ok is broken for !CONFIG_CPU_HAS_ADDRESS_SPACES
m68k: Leave stack mangling to asm wrapper of sigreturn()
m68k: Update ->thread.esp0 before calling syscall_trace() in ret_from_signal
m68k: Handle arrivals of multiple signals correctly
Merge tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fixes from Dinh Nguyen:
- Fix build warning for unmet dependency for EARLY_PRINTK
- Remove unused dram_start() function
* tag 'nios2_fixes_for_v5.15_part1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
NIOS2: setup.c: drop unused variable 'dram_start'
NIOS2: fix kconfig unmet dependency warning for SERIAL_CORE_CONSOLE
Len Baker [Sun, 5 Sep 2021 06:50:20 +0000 (08:50 +0200)]
ice: Prefer kcalloc over open coded arithmetic
As noted in the "Deprecated Interfaces, Language Features, Attributes,
and Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead
to values wrapping around and a smaller allocation being made than the
caller was expecting. Using those allocations could lead to linear
overflows of heap memory and other misbehaviors.
In this case this is not actually dynamic sizes: both sides of the
multiplication are constant values. However it is best to refactor this
anyway, just to keep the open-coded math idiom out of code.
So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.
Signed-off-by: Len Baker <len.baker@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jeff Guo [Fri, 16 Jul 2021 22:16:44 +0000 (15:16 -0700)]
ice: Fix macro name for IPv4 fragment flag
In IPv4 header, fragment flags indicate whether the packet needs
to be fragmented or not. The value 0x20 represents MF (More Fragment); fix
the macro name to match this.
Signed-off-by: Ting Xu <ting.xu@intel.com> Signed-off-by: Jeff Guo <jia.guo@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Fri, 20 Aug 2021 15:09:50 +0000 (08:09 -0700)]
ice: refactor devlink getter/fallback functions to void
After commit a8f89fa27773 ("ice: do not abort devlink info if board
identifier can't be found"), the getter/fallback() functions no longer
report an error. Convert the interface to a void so that it is no
longer possible to add a version field that is fatal. This makes
sense, because we should not fail to report other versions just
because one of the version pieces could not be found.
Finally, clean up the getter functions line wrapping so that none of
them take more than 80 columns, as is the usual style for networking
files.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The messaging for unsupported module detection is different for
lenient mode and strict mode. Update the code to print the right
messaging for a given link mode.
Media topology conflict is not an error in lenient mode, so return
an error code only if not in lenient mode.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
ice: Add feature bitmap, helpers and a check for DSCP
DSCP a.k.a L3 QoS is only supported on certain devices. To enforce this,
this patch introduces a bitmap of features and helper functions.
The feature bitmap is set based on device IDs on driver init. Currently,
DSCP is the only feature in this bitmap, but there will be more in the
future. In the DCB netlink flow, check if the feature bit is set before
exercising DSCP.
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Dave Ertman [Fri, 6 Aug 2021 20:53:56 +0000 (13:53 -0700)]
ice: Add DSCP support
Implement code to handle submission of APP TLV's
containing DSCP to TC mapping.
The first such mapping received on an interface
will cause that PF to switch to L3 DSCP QoS mode,
apply the default config for that mode, and apply
the received mapping.
Only one such mapping will be allowed per DSCP value,
and when the last DSCP mapping is deleted, the PF
will switch back into L2 VLAN QoS mode, applying the
appropriate default QoS settings.
L3 DSCP QoS mode will only be allowed in SW DCBx
mode, in other words, when the FW LLDP engine is
disabled. Commands that break this mutual exclusivity
will be blocked.
Co-developed-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt
Pull fsverity fix from Eric Biggers:
"Fix an integer overflow when computing the Merkle tree layout of
extremely large files, exposed by btrfs adding support for fs-verity"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fs-verity: fix signed integer overflow with i_size near S64_MAX
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vdpa fixes from Michael Tsirkin:
"Fixes up some issues in rc1"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa: potential uninitialized return in vhost_vdpa_va_map()
vdpa/mlx5: Avoid executing set_vq_ready() if device is reset
vdpa/mlx5: Clear ready indication for control VQ
vduse: Cleanup the old kernel states after reset failure
vduse: missing error code in vduse_init()
virtio: don't fail on !of_device_is_compatible
Merge tag 'mmc-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
- renesas_sdhi: Fix regression with hard reset on old SDHIs
- dw_mmc: Only inject fault before done/error
* tag 'mmc-v5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: renesas_sdhi: fix regression with hard reset on old SDHIs
mmc: dw_mmc: Only inject fault before done/error
The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:
drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
494 | struct gve_tx_dma_buf *buf =
This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.
Remote the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.
Current driver uses software CQ head pointer to poll on CQE
header in memory to determine if CQE is valid. Software needs
to make sure, that the reads of the CQE do not get re-ordered
so much that it ends up with an inconsistent view of the CQE.
To ensure that DMB barrier after read to first CQE cacheline
and before reading of the rest of the CQE is needed.
But having barrier for every CQE read will impact the performance,
instead use hardware CQ head and tail pointers to find the
valid number of CQEs.
Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 28 Sep 2021 12:50:38 +0000 (13:50 +0100)]
Merge branch 'octeontx2-af-external-ptp-clock'
Hariprasad Kelam says:
====================
Externel ptp clock support
Externel ptp support is required in a scenario like connecting
a external timing device to the chip for time synchronization.
This series of patches adds support to ptp driver to use external
clock and enables PTP config in CN10K MAC block (RPM). Currently
PTP configuration is left unchanged in FLR handler these patches
addresses the same.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
,