]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 14 Jul 2022 21:19:42 +0000 (14:19 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

include/net/sock.h
  310731e2f161 ("net: Fix data-races around sysctl_mem.")
  e70f3c701276 ("Revert "net: set SK_MEM_QUANTUM to 4096"")
https://lore.kernel.org/all/20220711120211.7c8b7cba@canb.auug.org.au/

net/ipv4/fib_semantics.c
  747c14307214 ("ip: fix dflt addr selection for connected nexthop")
  d62607c3fe45 ("net: rename reference+tracking helpers")

net/tls/tls.h
include/net/tls.h
  3d8c51b25a23 ("net/tls: Check for errors in tls_device_init")
  587903142308 ("tls: create an internal header")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agox86/speculation: Use DECLARE_PER_CPU for x86_spec_ctrl_current
Nathan Chancellor [Wed, 13 Jul 2022 15:24:37 +0000 (08:24 -0700)]
x86/speculation: Use DECLARE_PER_CPU for x86_spec_ctrl_current

Clang warns:

  arch/x86/kernel/cpu/bugs.c:58:21: error: section attribute is specified on redeclared variable [-Werror,-Wsection]
  DEFINE_PER_CPU(u64, x86_spec_ctrl_current);
                      ^
  arch/x86/include/asm/nospec-branch.h:283:12: note: previous declaration is here
  extern u64 x86_spec_ctrl_current;
             ^
  1 error generated.

The declaration should be using DECLARE_PER_CPU instead so all
attributes stay in sync.

Cc: stable@vger.kernel.org
Fixes: fc02735b14ff ("KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 14 Jul 2022 19:48:07 +0000 (12:48 -0700)]
Merge tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bpf and wireless.

  Still no major regressions, the release continues to be calm. An
  uptick of fixes this time around due to trivial data race fixes and
  patches flowing down from subtrees.

  There has been a few driver fixes (particularly a few fixes for false
  positives due to 66e4c8d95008 which went into -next in May!) that make
  me worry the wide testing is not exactly fully through.

  So "calm" but not "let's just cut the final ASAP" vibes over here.

  Current release - regressions:

   - wifi: rtw88: fix write to const table of channel parameters

  Current release - new code bugs:

   - mac80211: add gfp_t arg to ieeee80211_obss_color_collision_notify

   - mlx5:
      - TC, allow offload from uplink to other PF's VF
      - Lag, decouple FDB selection and shared FDB
      - Lag, correct get the port select mode str

   - bnxt_en: fix and simplify XDP transmit path

   - r8152: fix accessing unset transport header

  Previous releases - regressions:

   - conntrack: fix crash due to confirmed bit load reordering (after
     atomic -> refcount conversion)

   - stmmac: dwc-qos: disable split header for Tegra194

  Previous releases - always broken:

   - mlx5e: ring the TX doorbell on DMA errors

   - bpf: make sure mac_header was set before using it

   - mac80211: do not wake queues on a vif that is being stopped

   - mac80211: fix queue selection for mesh/OCB interfaces

   - ip: fix dflt addr selection for connected nexthop

   - seg6: fix skb checksums for SRH encapsulation/insertion

   - xdp: fix spurious packet loss in generic XDP TX path

   - bunch of sysctl data race fixes

   - nf_log: incorrect offset to network header

  Misc:

   - bpf: add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs"

* tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  nfp: flower: configure tunnel neighbour on cmsg rx
  net/tls: Check for errors in tls_device_init
  MAINTAINERS: Add an additional maintainer to the AMD XGBE driver
  xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
  selftests/net: test nexthop without gw
  ip: fix dflt addr selection for connected nexthop
  net: atlantic: remove aq_nic_deinit() when resume
  net: atlantic: remove deep parameter on suspend/resume functions
  sfc: fix kernel panic when creating VF
  seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
  seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
  seg6: fix skb checksum evaluation in SRH encapsulation/insertion
  sfc: fix use after free when disabling sriov
  net: sunhme: output link status with a single print.
  r8152: fix accessing unset transport header
  net: stmmac: fix leaks in probe
  net: ftgmac100: Hold reference returned by of_get_child_by_name()
  nexthop: Fix data-races around nexthop_compat_mode.
  ipv4: Fix data-races around sysctl_ip_dynaddr.
  tcp: Fix a data-race around sysctl_tcp_ecn_fallback.
  ...

2 years agoMerge tag '5.19-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Thu, 14 Jul 2022 19:35:15 +0000 (12:35 -0700)]
Merge tag '5.19-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Three smb3 client fixes:

   - two multichannel fixes: fix a potential deadlock freeing a channel,
     and fix a race condition on failed creation of a new channel

   - mount failure fix: work around a server bug in some common older
     Samba servers by avoiding padding at the end of the negotiate
     protocol request"

* tag '5.19-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: workaround negprot bug in some Samba servers
  cifs: remove unnecessary locking of chan_lock while freeing session
  cifs: fix race condition with delayed threads

2 years agoMerge tag 'nfsd-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Thu, 14 Jul 2022 19:29:43 +0000 (12:29 -0700)]
Merge tag 'nfsd-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:
 "Notable regression fixes:

   - Enable SETATTR(time_create) to fix regression with Mac OS clients

   - Fix a lockd crasher and broken NLM UNLCK behavior"

* tag 'nfsd-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  lockd: fix nlm_close_files
  lockd: set fl_owner when unlocking files
  NFSD: Decode NFSv4 birth time attribute

2 years agoMerge tag 'integrity-v5.19-fix' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 14 Jul 2022 19:15:42 +0000 (12:15 -0700)]
Merge tag 'integrity-v5.19-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "Here are a number of fixes for recently found bugs.

  Only 'ima: fix violation measurement list record' was introduced in
  the current release. The rest address existing bugs"

* tag 'integrity-v5.19-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  ima: Fix potential memory leak in ima_init_crypto()
  ima: force signature verification when CONFIG_KEXEC_SIG is configured
  ima: Fix a potential integer overflow in ima_appraise_measurement
  ima: fix violation measurement list record
  Revert "evm: Fix memleak in init_desc"

2 years agoMerge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Thu, 14 Jul 2022 19:08:59 +0000 (12:08 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - quieten the spectre-bhb prints

 - mark flattened device tree sections as shareable

 - remove some obsolete CPU domain code and help text

 - fix thumb unaligned access abort emulation

 - fix amba_device_add() refcount underflow

 - fix literal placement

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9208/1: entry: add .ltorg directive to keep literals in range
  ARM: 9207/1: amba: fix refcount underflow if amba_device_add() fails
  ARM: 9214/1: alignment: advance IT state after emulating Thumb instruction
  ARM: 9213/1: Print message about disabled Spectre workarounds only once
  ARM: 9212/1: domain: Modify Kconfig help text
  ARM: 9211/1: domain: drop modify_domain()
  ARM: 9210/1: Mark the FDT_FIXED sections as shareable
  ARM: 9209/1: Spectre-BHB: avoid pr_info() every time a CPU comes out of idle

2 years agoum: Replace to_phys() and to_virt() with less generic function names
Guenter Roeck [Thu, 14 Jul 2022 18:46:00 +0000 (11:46 -0700)]
um: Replace to_phys() and to_virt() with less generic function names

The UML function names to_virt() and to_phys() are exposed by UML
headers, and are very generic and may be defined by drivers.  As it
turns out, commit 9409c9b6709e ("pmem: refactor pmem_clear_poison()")
did exactly that.

This results in build errors such as the following when trying to build
um:allmodconfig:

  drivers/nvdimm/pmem.c: In function ‘pmem_dax_zero_page_range’:
  ./arch/um/include/asm/page.h:105:20: error: too few arguments to function ‘to_phys’
    105 | #define __pa(virt) to_phys((void *) (unsigned long) (virt))
        |                    ^~~~~~~

Use less generic function names for the um specific to_phys() and
to_virt() functions to fix the problem and to avoid similar problems in
the future.

Fixes: 9409c9b6709e ("pmem: refactor pmem_clear_poison()")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'sound-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 14 Jul 2022 18:34:16 +0000 (11:34 -0700)]
Merge tag 'sound-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Hopefully the last one for 5.19. This became bigger than wished, but
  all changes are pretty device-specific small fixes, which look less
  worrisome.

  The majority of changes are about various ASoC fixes, while the usual
  HD-audio quirks are included as well"

* tag 'sound-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (28 commits)
  ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
  ALSA: hda/realtek: fix mute/micmute LEDs for HP machines
  ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc671
  ALSA: hda - Add fixup for Dell Latitidue E5430
  ALSA: hda/conexant: Apply quirk for another HP ProDesk 600 G3 model
  ALSA: hda/realtek: Fix headset mic for Acer SF313-51
  ASoC: Intel: Skylake: Correct the handling of fmt_config flexible array
  ASoC: Intel: Skylake: Correct the ssp rate discovery in skl_get_ssp_clks()
  ASoC: rt5640: Fix the wrong state of JD1 and JD2
  ASoC: Intel: sof_rt5682: fix out-of-bounds array access
  ASoC: qdsp6: fix potential memory leak in q6apm_get_audioreach_graph()
  ASoC: tas2764: Fix amp gain register offset & default
  ASoC: tas2764: Correct playback volume range
  ASoC: tas2764: Fix and extend FSYNC polarity handling
  ASoC: tas2764: Add post reset delays
  ASoC: dt-bindings: Fix description for msm8916
  ASoC: doc: Capitalize RESET line name
  ASoC: arizona: Update arizona_aif_cfg_changed to use RX_BCLK_RATE
  ASoC: cs47l92: Fix event generation for OUT1 demux
  ...

2 years agonfp: flower: configure tunnel neighbour on cmsg rx
Tianyu Yuan [Thu, 14 Jul 2022 08:19:15 +0000 (10:19 +0200)]
nfp: flower: configure tunnel neighbour on cmsg rx

nfp_tun_write_neigh() function will configure a tunnel neighbour when
calling nfp_tun_neigh_event_handler() or nfp_flower_cmsg_process_one_rx()
(with no tunnel neighbour type) from firmware.

When configuring IP on physical port as a tunnel endpoint, no operation
will be performed after receiving the cmsg mentioned above.

Therefore, add a progress to configure tunnel neighbour in this case.

v2: Correct format of fixes tag.

Fixes: f1df7956c11f ("nfp: flower: rework tunnel neighbour configuration")
Signed-off-by: Tianyu Yuan <tianyu.yuan@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220714081915.148378-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/tls: Check for errors in tls_device_init
Tariq Toukan [Thu, 14 Jul 2022 07:07:54 +0000 (10:07 +0300)]
net/tls: Check for errors in tls_device_init

Add missing error checks in tls_device_init.

Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220714070754.1428-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMAINTAINERS: Add an additional maintainer to the AMD XGBE driver
Tom Lendacky [Wed, 13 Jul 2022 22:31:41 +0000 (17:31 -0500)]
MAINTAINERS: Add an additional maintainer to the AMD XGBE driver

Add Shyam Sundar S K as an additional maintainer to support the AMD XGBE
network device driver.

Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/db367f24089c2bbbcd1cec8e21af49922017a110.1657751501.git.thomas.lendacky@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoxen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
Juergen Gross [Wed, 13 Jul 2022 13:53:22 +0000 (15:53 +0200)]
xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue

xenvif_rx_next_skb() is expecting the rx queue not being empty, but
in case the loop in xenvif_rx_action() is doing multiple iterations,
the availability of another skb in the rx queue is not being checked.

This can lead to crashes:

[40072.537261] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[40072.537407] IP: xenvif_rx_skb+0x23/0x590 [xen_netback]
[40072.537534] PGD 0 P4D 0
[40072.537644] Oops: 0000 [#1] SMP NOPTI
[40072.537749] CPU: 0 PID: 12505 Comm: v1-c40247-q2-gu Not tainted 4.12.14-122.121-default #1 SLE12-SP5
[40072.537867] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 11/23/2021
[40072.537999] task: ffff880433b38100 task.stack: ffffc90043d40000
[40072.538112] RIP: e030:xenvif_rx_skb+0x23/0x590 [xen_netback]
[40072.538217] RSP: e02b:ffffc90043d43de0 EFLAGS: 00010246
[40072.538319] RAX: 0000000000000000 RBX: ffffc90043cd7cd0 RCX: 00000000000000f7
[40072.538430] RDX: 0000000000000000 RSI: 0000000000000006 RDI: ffffc90043d43df8
[40072.538531] RBP: 000000000000003f R08: 000077ff80000000 R09: 0000000000000008
[40072.538644] R10: 0000000000007ff0 R11: 00000000000008f6 R12: ffffc90043ce2708
[40072.538745] R13: 0000000000000000 R14: ffffc90043d43ed0 R15: ffff88043ea748c0
[40072.538861] FS: 0000000000000000(0000) GS:ffff880484600000(0000) knlGS:0000000000000000
[40072.538988] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[40072.539088] CR2: 0000000000000080 CR3: 0000000407ac8000 CR4: 0000000000040660
[40072.539211] Call Trace:
[40072.539319] xenvif_rx_action+0x71/0x90 [xen_netback]
[40072.539429] xenvif_kthread_guest_rx+0x14a/0x29c [xen_netback]

Fix that by stopping the loop in case the rx queue becomes empty.

Cc: stable@vger.kernel.org
Fixes: 98f6d57ced73 ("xen-netback: process guest rx packets in batches")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Link: https://lore.kernel.org/r/20220713135322.19616-1-jgross@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoamdgpu: disable powerpc support for the newer display engine
Linus Torvalds [Wed, 13 Jul 2022 19:36:50 +0000 (12:36 -0700)]
amdgpu: disable powerpc support for the newer display engine

The DRM_AMD_DC_DCN display engine support (Raven, Navi, and newer) has
not been building cleanly on powerpc and causes link errors due to
mixing hard- and soft-float object files:

  powerpc64-linux-ld: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.o uses hard float, drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o uses soft float
  powerpc64-linux-ld: failed to merge target specific data of file drivers/gpu/drm/amd/amdgpu/../display/dc/dcn31/dcn31_resource.o
  [..]

and while patches are floating around, it's not exactly obvious what is
going on.

The problem bisects to commit 41b7a347bf14 ("powerpc: Book3S 64-bit
outline-only KASAN support") but that is probably more about changing
config variables than the fundamental cause.

Despite the bisection result, a more directly related commit seems to be
26f4712aedbd ("drm/amd/display: move FPU related code from dcn31 to
dml/dcn31 folder").  It's probably a combination of the two.

This has been going on since the merge window, without any final word.
So instead of blindly applying patches that may or may not be the right
thing, let's disable this for now.

As Michael Ellerman says:
 "IIUIC this code was never enabled on ppc before, so disabling it seems
  like a reasonable fix to get the build clean"

and once we have more actual feedback (and find any potential users) we
can always re-enable it with the patch that fixes the issues and
back-port as necessary.

Fixes: 41b7a347bf14 ("powerpc: Book3S 64-bit outline-only KASAN support")
Fixes: 26f4712aedbd ("drm/amd/display: move FPU related code from dcn31 to dml/dcn31 folder")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/20220606153910.GA1773067@roeck-us.net/
Link: https://lore.kernel.org/all/20220618232737.2036722-1-linux@roeck-us.net/
Link: https://lore.kernel.org/all/20220713050724.GA2471738@roeck-us.net/
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/net: test nexthop without gw
Nicolas Dichtel [Wed, 13 Jul 2022 11:48:53 +0000 (13:48 +0200)]
selftests/net: test nexthop without gw

This test implement the scenario described in the commit
"ip: fix dflt addr selection for connected nexthop".
The test configures a nexthop object with an output device only (no gateway
address) and a route that uses this nexthop. The goal is to check if the
kernel selects a valid source address.

Link: https://lore.kernel.org/netdev/20220712095545.10947-1-nicolas.dichtel@6wind.com/
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-2-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoip: fix dflt addr selection for connected nexthop
Nicolas Dichtel [Wed, 13 Jul 2022 11:48:52 +0000 (13:48 +0200)]
ip: fix dflt addr selection for connected nexthop

When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.

When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.

To explain more deeply: when a user creates a nexthop, it cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.

When a 'standard' route (ie without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 set the scope to 'link' by default.

Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1

ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0

ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 up
ip -n bar link set veth1 up

ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1

ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument

Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0

Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
>     cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms

CC: stable@vger.kernel.org
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoARM: 9208/1: entry: add .ltorg directive to keep literals in range
Ard Biesheuvel [Tue, 31 May 2022 08:49:24 +0000 (09:49 +0100)]
ARM: 9208/1: entry: add .ltorg directive to keep literals in range

LKP reports a build issue on Clang, related to a literal load of
__current issued through the ldr_va macro. This turns out to be due to
the fact that group relocations are disabled when CONFIG_COMPILE_TEST=y,
which means that the ldr_va macro resolves to a pair of LDR
instructions, the first one being a literal load issued too far from its
literal pool.

Due to the introduction of a couple of new uses of this macro in commit
508074607c7b95b2 ("ARM: 9195/1: entry: avoid explicit literal loads"),
the literal pools end up getting rearranged in a way that causes the
literal for __current to go out of range. Let's fix this up by putting a
.ltorg directive in a suitable place in the code.

Link: https://lore.kernel.org/all/202205290805.1vZLAr36-lkp@intel.com/
Fixes: 508074607c7b95b2 ("ARM: 9195/1: entry: avoid explicit literal loads")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 years agoARM: 9207/1: amba: fix refcount underflow if amba_device_add() fails
Wang Kefeng [Tue, 24 May 2022 08:03:46 +0000 (09:03 +0100)]
ARM: 9207/1: amba: fix refcount underflow if amba_device_add() fails

"ARM: 9192/1: amba: fix memory leak in amba_device_try_add()" leads
to a refcount underflow if amba_device_add() fails, which called by
of_amba_device_create(), the of_amba_device_create() already exists
the error handling, so amba_put_device() only need to be added into
amba_deferred_retry().

Fixes: 7719a68b2fa4 ("ARM: 9192/1: amba: fix memory leak in amba_device_try_add()")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2 years agonet: atlantic: remove aq_nic_deinit() when resume
Chia-Lin Kao (AceLan) [Wed, 13 Jul 2022 11:12:24 +0000 (19:12 +0800)]
net: atlantic: remove aq_nic_deinit() when resume

aq_nic_deinit() has been called while suspending, so we don't have to call
it again on resume.
Actually, call it again leads to another hang issue when resuming from
S3.

Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992345] Call Trace:
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992346] <TASK>
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992348] aq_nic_deinit+0xb4/0xd0 [atlantic]
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992356] aq_pm_thaw+0x7f/0x100 [atlantic]
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992362] pci_pm_resume+0x5c/0x90
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992366] ? pci_pm_thaw+0x80/0x80
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992368] dpm_run_callback+0x4e/0x120
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992371] device_resume+0xad/0x200
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992373] async_resume+0x1e/0x40
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992374] async_run_entry_fn+0x33/0x120
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992377] process_one_work+0x220/0x3c0
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992380] worker_thread+0x4d/0x3f0
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992382] ? process_one_work+0x3c0/0x3c0
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992384] kthread+0x12a/0x150
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992386] ? set_kthread_struct+0x40/0x40
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992387] ret_from_fork+0x22/0x30
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992391] </TASK>
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992392] ---[ end trace 1ec8c79604ed5e0d ]---
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992394] PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110
Jul 8 03:09:44 u-Precision-7865-Tower kernel: [ 5910.992397] atlantic 0000:02:00.0: PM: failed to resume async: error -110

Fixes: 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression")
Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Link: https://lore.kernel.org/r/20220713111224.1535938-2-acelan.kao@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: atlantic: remove deep parameter on suspend/resume functions
Chia-Lin Kao (AceLan) [Wed, 13 Jul 2022 11:12:23 +0000 (19:12 +0800)]
net: atlantic: remove deep parameter on suspend/resume functions

Below commit claims that atlantic NIC requires to reset the device on pm
op, and had set the deep to true for all suspend/resume functions.
commit 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression")
So, we could remove deep parameter on suspend/resume functions without
any functional change.

Fixes: 1809c30b6e5a ("net: atlantic: always deep reset on pm op, fixing up my null deref regression")
Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Link: https://lore.kernel.org/r/20220713111224.1535938-1-acelan.kao@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agosfc: fix kernel panic when creating VF
Íñigo Huguet [Wed, 13 Jul 2022 09:21:16 +0000 (11:21 +0200)]
sfc: fix kernel panic when creating VF

When creating VFs a kernel panic can happen when calling to
efx_ef10_try_update_nic_stats_vf.

When releasing a DMA coherent buffer, sometimes, I don't know in what
specific circumstances, it has to unmap memory with vunmap. It is
disallowed to do that in IRQ context or with BH disabled. Otherwise, we
hit this line in vunmap, causing the crash:
  BUG_ON(in_interrupt());

This patch reenables BH to release the buffer.

Log messages when the bug is hit:
 kernel BUG at mm/vmalloc.c:2727!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 6 PID: 1462 Comm: NetworkManager Kdump: loaded Tainted: G          I      --------- ---  5.14.0-119.el9.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
 RIP: 0010:vunmap+0x2e/0x30
 ...skip...
 Call Trace:
  __iommu_dma_free+0x96/0x100
  efx_nic_free_buffer+0x2b/0x40 [sfc]
  efx_ef10_try_update_nic_stats_vf+0x14a/0x1c0 [sfc]
  efx_ef10_update_stats_vf+0x18/0x40 [sfc]
  efx_start_all+0x15e/0x1d0 [sfc]
  efx_net_open+0x5a/0xe0 [sfc]
  __dev_open+0xe7/0x1a0
  __dev_change_flags+0x1d7/0x240
  dev_change_flags+0x21/0x60
  ...skip...

Fixes: d778819609a2 ("sfc: DMA the VF stats only when requested")
Reported-by: Ma Yuying <yuma@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220713092116.21238-1-ihuguet@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'xen-netfront-xsa-403-follow-on'
Paolo Abeni [Thu, 14 Jul 2022 10:20:21 +0000 (12:20 +0200)]
Merge branch 'xen-netfront-xsa-403-follow-on'

Jan Beulich says:

====================
xen-netfront: XSA-403 follow-on

While investigating the XSA, I did notice a few more things. The two
patches aren't really dependent on one another.

1: remove leftover call to xennet_tx_buf_gc()
2: re-order error checks in xennet_get_responses()
====================

Link: https://lore.kernel.org/r/7fca0e44-43b5-8448-3653-249d117dc084@suse.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoxen-netfront: re-order error checks in xennet_get_responses()
Jan Beulich [Wed, 13 Jul 2022 09:19:55 +0000 (11:19 +0200)]
xen-netfront: re-order error checks in xennet_get_responses()

Check the retrieved grant reference first; there's no point trying to
have xennet_move_rx_slot() move invalid data (and further defer
recognition of the issue, likely making diagnosis yet more difficult).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoxen-netfront: remove leftover call to xennet_tx_buf_gc()
Jan Beulich [Wed, 13 Jul 2022 09:19:33 +0000 (11:19 +0200)]
xen-netfront: remove leftover call to xennet_tx_buf_gc()

In talk_to_netback(), called earlier from xennet_connect(), queues and
shared rings were just re-initialized, so all this function call could
result in is setting ->broken (again) right away in case any unconsumed
responses were found.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoMerge branch 'seg6-fix-skb-checksum-for-srh-encapsulation-insertion'
Paolo Abeni [Thu, 14 Jul 2022 08:15:15 +0000 (10:15 +0200)]
Merge branch 'seg6-fix-skb-checksum-for-srh-encapsulation-insertion'

Andrea Mayer says:

====================
seg6: fix skb checksum for SRH encapsulation/insertion

The Linux kernel supports Segment Routing Header (SRH)
encapsulation/insertion operations by providing the capability to: i)
encapsulate a packet in an outer IPv6 header with a specified SRH; ii)
insert a specified SRH directly after the IPv6 header of the packet.
Note that the insertion operation is also referred to as 'injection'.

The two operations are respectively supported by seg6_do_srh_encap() and
seg6_do_srh_inline(), which operate on the skb associated to the packet as
needed (e.g. adding the necessary headers and initializing them, while
taking care to recalculate the skb checksum).

seg6_do_srh_encap() and seg6_do_srh_inline() do not initialize the payload
length of the IPv6 header, which is carried out by the caller functions.
However, this approach causes the corruption of the skb checksum which
needs to be updated only after initialization of headers is completed
(thanks to Paolo Abeni for detecting this issue).

The patchset fixes the skb checksum corruption by moving the IPv6 header
payload length initialization from the callers of seg6_do_srh_encap() and
seg6_do_srh_inline() directly into these functions.

This patchset is organized as follows:
 - patch 1/3, seg6: fix skb checksum evaluation in SRH
   encapsulation/insertion;
    (* SRH encapsulation/insertion available since v4.10)

 - patch 2/3, seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps
   behaviors;
    (* SRv6 End.B6 and End.B6.Encaps behaviors available since v4.14)

 - patch 3/3, seg6: bpf: fix skb checksum in bpf_push_seg6_encap();
    (* bpf IPv6 Segment Routing helpers available since v4.18)

====================

Link: https://lore.kernel.org/r/20220712175837.16267-1-andrea.mayer@uniroma2.it
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoseg6: bpf: fix skb checksum in bpf_push_seg6_encap()
Andrea Mayer [Tue, 12 Jul 2022 17:58:37 +0000 (19:58 +0200)]
seg6: bpf: fix skb checksum in bpf_push_seg6_encap()

Both helper functions bpf_lwt_seg6_action() and bpf_lwt_push_encap() use
the bpf_push_seg6_encap() to encapsulate the packet in an IPv6 with Segment
Routing Header (SRH) or insert an SRH between the IPv6 header and the
payload.
To achieve this result, such helper functions rely on bpf_push_seg6_encap()
which, in turn, leverages seg6_do_srh_{encap,inline}() to perform the
required operation (i.e. encap/inline).

This patch removes the initialization of the IPv6 header payload length
from bpf_push_seg6_encap(), as it is now handled properly by
seg6_do_srh_{encap,inline}() to prevent corruption of the skb checksum.

Fixes: fe94cc290f53 ("bpf: Add IPv6 Segment Routing helpers")
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoseg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
Andrea Mayer [Tue, 12 Jul 2022 17:58:36 +0000 (19:58 +0200)]
seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors

The SRv6 End.B6 and End.B6.Encaps behaviors rely on functions
seg6_do_srh_{encap,inline}() to, respectively: i) encapsulate the
packet within an outer IPv6 header with the specified Segment Routing
Header (SRH); ii) insert the specified SRH directly after the IPv6
header of the packet.

This patch removes the initialization of the IPv6 header payload length
from the input_action_end_b6{_encap}() functions, as it is now handled
properly by seg6_do_srh_{encap,inline}() to avoid corruption of the skb
checksum.

Fixes: 140f04c33bbc ("ipv6: sr: implement several seg6local actions")
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoseg6: fix skb checksum evaluation in SRH encapsulation/insertion
Andrea Mayer [Tue, 12 Jul 2022 17:58:35 +0000 (19:58 +0200)]
seg6: fix skb checksum evaluation in SRH encapsulation/insertion

Support for SRH encapsulation and insertion was introduced with
commit 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and
injection with lwtunnels"), through the seg6_do_srh_encap() and
seg6_do_srh_inline() functions, respectively.
The former encapsulates the packet in an outer IPv6 header along with
the SRH, while the latter inserts the SRH between the IPv6 header and
the payload. Then, the headers are initialized/updated according to the
operating mode (i.e., encap/inline).
Finally, the skb checksum is calculated to reflect the changes applied
to the headers.

The IPv6 payload length ('payload_len') is not initialized
within seg6_do_srh_{inline,encap}() but is deferred in seg6_do_srh(), i.e.
the caller of seg6_do_srh_{inline,encap}().
However, this operation invalidates the skb checksum, since the
'payload_len' is updated only after the checksum is evaluated.

To solve this issue, the initialization of the IPv6 payload length is
moved from seg6_do_srh() directly into the seg6_do_srh_{inline,encap}()
functions and before the skb checksum update takes place.

Fixes: 6c8702c60b88 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/all/20220705190727.69d532417be7438b15404ee1@uniroma2.it
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoocteontx2-af: Limit link bringup time at firmware
Hariprasad Kelam [Tue, 12 Jul 2022 16:18:15 +0000 (21:48 +0530)]
octeontx2-af: Limit link bringup time at firmware

Set the maximum time firmware should poll for a link.
If not set firmware could block CPU for a long time resulting
in mailbox failures. If link doesn't come up within 1second,
firmware will anyway notify the status as and when LINK comes up

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Geetha Sowjanya <gakula@marvell.com>
Link: https://lore.kernel.org/r/20220712161815.12621-1-gakula@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net...
Jakub Kicinski [Thu, 14 Jul 2022 03:16:03 +0000 (20:16 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-12

This series contains updates to ice driver only.

Paul fixes detection of E822 devices for firmware update and changes NVM
read for snapshot creation to be done in chunks as some systems cannot
read the entire NVM in the allotted time.
====================

Link: https://lore.kernel.org/r/20220712164829.7275-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosfc: fix use after free when disabling sriov
Íñigo Huguet [Tue, 12 Jul 2022 06:26:42 +0000 (08:26 +0200)]
sfc: fix use after free when disabling sriov

Use after free is detected by kfence when disabling sriov. What was read
after being freed was vf->pci_dev: it was freed from pci_disable_sriov
and later read in efx_ef10_sriov_free_vf_vports, called from
efx_ef10_sriov_free_vf_vswitching.

Set the pointer to NULL at release time to not trying to read it later.

Reproducer and dmesg log (note that kfence doesn't detect it every time):
$ echo 1 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs
$ echo 0 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs

 BUG: KFENCE: use-after-free read in efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]

 Use-after-free read at 0x00000000ff3c1ba5 (in kfence-#224):
  efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]
  efx_ef10_pci_sriov_disable+0x38/0x70 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xfe/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 kfence-#224: 0x00000000edb8ef95-0x00000000671f5ce1, size=2792, cache=kmalloc-4k

 allocated by task 6771 on cpu 10 at 3137.860196s:
  pci_alloc_dev+0x21/0x60
  pci_iov_add_virtfn+0x2a2/0x320
  sriov_enable+0x212/0x3e0
  efx_ef10_sriov_configure+0x67/0x80 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xba/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 freed by task 6771 on cpu 12 at 3170.991309s:
  device_release+0x34/0x90
  kobject_cleanup+0x3a/0x130
  pci_iov_remove_virtfn+0xd9/0x120
  sriov_disable+0x30/0xe0
  efx_ef10_pci_sriov_disable+0x57/0x70 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xfe/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 3c5eb87605e85 ("sfc: create vports for VFs and assign random MAC addresses")
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220712062642.6915-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoNFC: nxp-nci: add error reporting
Michael Walle [Tue, 12 Jul 2022 17:00:10 +0000 (19:00 +0200)]
NFC: nxp-nci: add error reporting

The PN7160 supports error notifications. Add the appropriate callbacks.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220712170011.2990629-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosmb3: workaround negprot bug in some Samba servers
Steve French [Tue, 12 Jul 2022 05:11:42 +0000 (00:11 -0500)]
smb3: workaround negprot bug in some Samba servers

Mount can now fail to older Samba servers due to a server
bug handling padding at the end of the last negotiate
context (negotiate contexts typically are rounded up to 8
bytes by adding padding if needed). This server bug can
be avoided by switching the order of negotiate contexts,
placing a negotiate context at the end that does not
require padding (prior to the recent netname context fix
this was the case on the client).

Fixes: 73130a7b1ac9 ("smb3: fix empty netname context on secondary channels")
Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol+github@gmail.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2 years agovf/remap: return the amount of bytes actually deduplicated
Ansgar Lößer [Wed, 13 Jul 2022 18:51:44 +0000 (20:51 +0200)]
vf/remap: return the amount of bytes actually deduplicated

When using the FIDEDUPRANGE ioctl, in case of success the requested size
is returned. In some cases this might not be the actual amount of bytes
deduplicated.

This change modifies vfs_dedupe_file_range() to report the actual amount
of bytes deduplicated, instead of the requested amount.

Link: https://lore.kernel.org/linux-fsdevel/5548ef63-62f9-4f46-5793-03165ceccacc@tu-darmstadt.de/
Reported-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Reported-by: Max Schlecht <max.schlecht@informatik.hu-berlin.de>
Reported-by: Björn Scheuermann <scheuermann@kom.tu-darmstadt.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Darrick J Wong <djwong@kernel.org>
Signed-off-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 13 Jul 2022 18:47:01 +0000 (11:47 -0700)]
Merge tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
 "Fix an old and subtle bug in the migration path.

  css_sets are used to track tasks and migrations are tasks moving from
  a group of css_sets to another group of css_sets. The migration path
  pins all source and destination css_sets in the prep stage.

  Unfortunately, it was overloading the same list_head entry to track
  sources and destinations, which got confused for migrations which are
  partially identity leading to use-after-frees.

  Fixed by using dedicated list_heads for tracking sources and
  destinations"

* tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Use separate src/dst nodes when preloading css_sets for migration

2 years agofs/remap: constrain dedupe of EOF blocks
Dave Chinner [Wed, 13 Jul 2022 07:49:15 +0000 (17:49 +1000)]
fs/remap: constrain dedupe of EOF blocks

If dedupe of an EOF block is not constrainted to match against only
other EOF blocks with the same EOF offset into the block, it can
match against any other block that has the same matching initial
bytes in it, even if the bytes beyond EOF in the source file do
not match.

Fix this by constraining the EOF block matching to only match
against other EOF blocks that have identical EOF offsets and data.
This allows "whole file dedupe" to continue to work without allowing
eof blocks to randomly match against partial full blocks with the
same data.

Reported-by: Ansgar Lößer <ansgar.loesser@tu-darmstadt.de>
Fixes: 1383a7ed6749 ("vfs: check file ranges before cloning files")
Link: https://lore.kernel.org/linux-fsdevel/a7c93559-4ba1-df2f-7a85-55a143696405@tu-darmstadt.de/
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoima: Fix potential memory leak in ima_init_crypto()
Jianglei Nie [Tue, 12 Jul 2022 01:10:37 +0000 (09:10 +0800)]
ima: Fix potential memory leak in ima_init_crypto()

On failure to allocate the SHA1 tfm, IMA fails to initialize and exits
without freeing the ima_algo_array. Add the missing kfree() for
ima_algo_array to avoid the potential memory leak.

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Fixes: 6d94809af6b0 ("ima: Allocate and initialize tfm for each PCR bank")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2 years agoima: force signature verification when CONFIG_KEXEC_SIG is configured
Coiby Xu [Wed, 13 Jul 2022 07:21:11 +0000 (15:21 +0800)]
ima: force signature verification when CONFIG_KEXEC_SIG is configured

Currently, an unsigned kernel could be kexec'ed when IMA arch specific
policy is configured unless lockdown is enabled. Enforce kernel
signature verification check in the kexec_file_load syscall when IMA
arch specific policy is configured.

Fixes: 99d5cadfde2b ("kexec_file: split KEXEC_VERIFY_SIG into KEXEC_SIG and KEXEC_SIG_FORCE")
Reported-and-suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2 years agonet: sunhme: output link status with a single print.
Nick Bowler [Wed, 13 Jul 2022 01:58:35 +0000 (21:58 -0400)]
net: sunhme: output link status with a single print.

This driver currently prints the link status using four separate
printk calls, which these days gets presented to the user as four
distinct messages, not exactly ideal:

  [   32.582778] eth0: Link is up using
  [   32.582828] internal
  [   32.582837] transceiver at
  [   32.582888] 100Mb/s, Full Duplex.

Restructure the display_link_mode function to use a single netdev_info
call to present all this information as a single message, which is much
nicer:

  [   33.640143] hme 0000:00:01.1 eth0: Link is up using internal transceiver at 100Mb/s, Full Duplex.

The display_forced_link_mode function has a similar structure, so adjust
it in a similar fashion.

Signed-off-by: Nick Bowler <nbowler@draconx.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agor8152: fix accessing unset transport header
Hayes Wang [Wed, 13 Jul 2022 03:31:11 +0000 (11:31 +0800)]
r8152: fix accessing unset transport header

A warning is triggered by commit 66e4c8d95008 ("net: warn if transport
header was not set"). The warning is harmless, because the value from
skb_transport_offset() is only used for skb_is_gso() is true or the
skb->ip_summed is equal to CHECKSUM_PARTIAL.

Fixes: 66e4c8d95008 ("net: warn if transport header was not set")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: returning uninitialized variable
Sebin Sebastian [Wed, 13 Jul 2022 07:38:58 +0000 (13:08 +0530)]
octeontx2-af: returning uninitialized variable

Fix coverity error 'use of uninitialized variable'. err is uninitialized
and is returned which can lead to unintended results. err has been replaced
with -einval.
Coverity issue: 1518921 (uninitialized scalar variable)

Signed-off-by: Sebin Sebastian <mailmesebin00@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: Remove duplicate include
Jiapeng Chong [Wed, 13 Jul 2022 02:07:59 +0000 (10:07 +0800)]
octeontx2-af: Remove duplicate include

The include is in line 14 and 23. Remove the duplicate.

Fix following checkincludes warning:

./drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c: linux/bitfield.h is included more than once.
./drivers/net/ethernet/marvell/octeontx2/af/rvu_npc_hash.c: rvu_npc_hash.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/sched: remove return value of unregister_tcf_proto_ops
Zhengchao Shao [Wed, 13 Jul 2022 01:54:38 +0000 (09:54 +0800)]
net/sched: remove return value of unregister_tcf_proto_ops

Return value of unregister_tcf_proto_ops is unused, remove it.

Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'wireless-next-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Wed, 13 Jul 2022 13:28:52 +0000 (14:28 +0100)]
Merge tag 'wireless-next-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Johannes Berg says:

====================
A fairly large set of updates for next, highlights:

ath10k
 * ethernet frame format support

rtw89
 * TDLS support

cfg80211/mac80211
 * airtime fairness fixes
 * EHT support continued, especially in AP mode
 * initial (and still major) rework for multi-link
   operation (MLO) from 802.11be/wifi 7

As usual, also many small updates/cleanups/fixes/etc.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'wireless-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Wed, 13 Jul 2022 13:27:38 +0000 (14:27 +0100)]
Merge tag 'wireless-2022-07-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless

Johannes Berg says:

====================
A small set of fixes for
 * queue selection in mesh/ocb
 * queue handling on interface stop
 * hwsim virtio device vs. some other virtio changes
 * dt-bindings email addresses
 * color collision memory allocation
 * a const variable in rtw88
 * shared SKB transmit in the ethernet format path
 * P2P client port authorization
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: stmmac: fix leaks in probe
Dan Carpenter [Tue, 12 Jul 2022 14:42:25 +0000 (17:42 +0300)]
net: stmmac: fix leaks in probe

These two error paths should clean up before returning.

Fixes: 2bb4b98b60d7 ("net: stmmac: Add Ingenic SoCs MAC support.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'phy-mxl-gpy-version-fix-and-improvements'
David S. Miller [Wed, 13 Jul 2022 13:18:39 +0000 (14:18 +0100)]
Merge branch 'phy-mxl-gpy-version-fix-and-improvements'

Michael Walle says:

====================
net: phy: mxl-gpy: version fix and improvements

Fix the version reporting which was introduced earlier. The version will
not change during runtime, so cache it and save a PHY read on every auto
negotiation. Also print the version in a human readable form.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: mxl-gpy: print firmware in human readable form
Michael Walle [Tue, 12 Jul 2022 13:15:54 +0000 (15:15 +0200)]
net: phy: mxl-gpy: print firmware in human readable form

Now having a major and a minor number, also print the firmware in
human readable form "major.minor". Still keep the 4-digit hexadecimal
representation because that form is used in the firmware changelog
documents. Also, drop the "release" string assuming that most common
case, but make it clearer that the user is running a test version.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: mxl-gpy: rename the FW type field name
Michael Walle [Tue, 12 Jul 2022 13:15:53 +0000 (15:15 +0200)]
net: phy: mxl-gpy: rename the FW type field name

Align the firmware field name with the reference manual where it is
called "major".

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: mxl-gpy: cache PHY firmware version
Michael Walle [Tue, 12 Jul 2022 13:15:52 +0000 (15:15 +0200)]
net: phy: mxl-gpy: cache PHY firmware version

Cache the firmware version during probe. There is no need to read the
firmware version on every autonegotiation.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: mxl-gpy: fix version reporting
Michael Walle [Tue, 12 Jul 2022 13:15:51 +0000 (15:15 +0200)]
net: phy: mxl-gpy: fix version reporting

The commit 09ce6b20103b ("net: phy: mxl-gpy: add temperature sensor")
will overwrite the return value and the reported version will be wrong.
Fix it.

Fixes: 09ce6b20103b ("net: phy: mxl-gpy: add temperature sensor")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ip6mr: add RTM_GETROUTE netlink op
David Lamparter [Tue, 12 Jul 2022 12:10:02 +0000 (14:10 +0200)]
net: ip6mr: add RTM_GETROUTE netlink op

The IPv6 multicast routing code previously implemented only the dump
variant of RTM_GETROUTE.  Implement single MFC item retrieval by copying
and adapting the respective IPv4 code.

Tested against FRRouting's IPv6 PIM stack.

Signed-off-by: David Lamparter <equinox@diac24.net>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'devlink-cosmetic-fixes'
David S. Miller [Wed, 13 Jul 2022 12:49:44 +0000 (13:49 +0100)]
Merge branch 'devlink-cosmetic-fixes'

Jiri Pirko says:

====================
net: devlink: devl_* cosmetic fixes

Hi. This patches just fixes some small cosmetic issues of devl_* related
functions which I found on the way.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: devlink: move unlocked function prototypes alongside the locked ones
Jiri Pirko [Tue, 12 Jul 2022 10:24:24 +0000 (12:24 +0200)]
net: devlink: move unlocked function prototypes alongside the locked ones

Maintain the same order as it is in devlink.c for function prototypes.
The most of the locked variants would very likely soon be removed
and the unlocked version would be the only one.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: devlink: use helpers to work with devlink->lock mutex
Jiri Pirko [Tue, 12 Jul 2022 10:24:23 +0000 (12:24 +0200)]
net: devlink: use helpers to work with devlink->lock mutex

As far as the lock helpers exist as the drivers need to work with the
devlink->lock mutex, use the helpers internally in devlink.c in order to
be consistent.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: devlink: fix unlocked vs locked functions descriptions
Jiri Pirko [Tue, 12 Jul 2022 10:24:22 +0000 (12:24 +0200)]
net: devlink: fix unlocked vs locked functions descriptions

To be unified with the rest of the code, the unlocked version (devl_*)
of function should have the same description in documentation as the
locked one. Add the missing documentation. Also, add "Context"
annotation for the locked versions where it is missing.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoocteontx2-af: Skip CGX/RPM probe incase of zero lmac count
Hariprasad Kelam [Tue, 12 Jul 2022 06:42:50 +0000 (12:12 +0530)]
octeontx2-af: Skip CGX/RPM probe incase of zero lmac count

In few error cases MAC(CGX/RPM) block is having 0 lmacs.
AF driver uses MAC block with lmac pair to get firmware
data etc. These commands will fail as there is no LMAC
associated with MAC block.

This patch skips the probe of these MAC blocks such that AF driver
uses correct MAC block and LMAC pair for firmware communication and
define new LMAC_AF_ERROR types for command timeout etc.

This patch also enables channel back pressure for all LMACs.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ftgmac100: Hold reference returned by of_get_child_by_name()
Liang He [Tue, 12 Jul 2022 06:14:17 +0000 (14:14 +0800)]
net: ftgmac100: Hold reference returned by of_get_child_by_name()

In ftgmac100_probe(), we should hold the refernece returned by
of_get_child_by_name() and use it to call of_node_put() for
reference balance.

Fixes: 39bfab8844a0 ("net: ftgmac100: Add support for DT phy-handle property")
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-sysctl-races'
David S. Miller [Wed, 13 Jul 2022 11:56:50 +0000 (12:56 +0100)]
Merge branch 'net-sysctl-races'

Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Roun).

This series fixes data-races around the first 13 knobs and
nexthop_compat_mode in ipv4_net_table.

I will post another patch for three early_demux knobs later,
so the next round will start from ip_default_ttl.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonexthop: Fix data-races around nexthop_compat_mode.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:33 +0000 (17:15 -0700)]
nexthop: Fix data-races around nexthop_compat_mode.

While reading nexthop_compat_mode, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 4f80116d3df3 ("net: ipv4: add sysctl for nexthop api compatibility mode")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv4: Fix data-races around sysctl_ip_dynaddr.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:32 +0000 (17:15 -0700)]
ipv4: Fix data-races around sysctl_ip_dynaddr.

While reading sysctl_ip_dynaddr, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_ecn_fallback.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:31 +0000 (17:15 -0700)]
tcp: Fix a data-race around sysctl_tcp_ecn_fallback.

While reading sysctl_tcp_ecn_fallback, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 492135557dc0 ("tcp: add rfc3168, section 6.1.1.1. fallback")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_ecn.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:30 +0000 (17:15 -0700)]
tcp: Fix data-races around sysctl_tcp_ecn.

While reading sysctl_tcp_ecn, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoraw: Fix a data-race around sysctl_raw_l3mdev_accept.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:29 +0000 (17:15 -0700)]
raw: Fix a data-race around sysctl_raw_l3mdev_accept.

While reading sysctl_raw_l3mdev_accept, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 6897445fb194 ("net: provide a sysctl raw_l3mdev_accept for raw socket lookup with VRFs")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_ratemask.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:28 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_ratemask.

While reading sysctl_icmp_ratemask, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_ratelimit.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:27 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_ratelimit.

While reading sysctl_icmp_ratelimit, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_errors_use_inbound_ifaddr.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:26 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_errors_use_inbound_ifaddr.

While reading sysctl_icmp_errors_use_inbound_ifaddr, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 1c2fb7f93cb2 ("[IPV4]: Sysctl configurable icmp error source address.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_ignore_bogus_error_responses.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:25 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_ignore_bogus_error_responses.

While reading sysctl_icmp_ignore_bogus_error_responses, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_echo_ignore_broadcasts.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:24 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_echo_ignore_broadcasts.

While reading sysctl_icmp_echo_ignore_broadcasts, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix data-races around sysctl_icmp_echo_enable_probe.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:23 +0000 (17:15 -0700)]
icmp: Fix data-races around sysctl_icmp_echo_enable_probe.

While reading sysctl_icmp_echo_enable_probe, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: d329ea5bd884 ("icmp: add response to RFC 8335 PROBE messages")
Fixes: 1fd07f33c3ea ("ipv6: ICMPV6: add response to ICMPV6 RFC 8335 PROBE messages")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoicmp: Fix a data-race around sysctl_icmp_echo_ignore_all.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:22 +0000 (17:15 -0700)]
icmp: Fix a data-race around sysctl_icmp_echo_ignore_all.

While reading sysctl_icmp_echo_ignore_all, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_max_tw_buckets.
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:21 +0000 (17:15 -0700)]
tcp: Fix a data-race around sysctl_max_tw_buckets.

While reading sysctl_max_tw_buckets, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data-races in proc_dointvec_ms_jiffies().
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:20 +0000 (17:15 -0700)]
sysctl: Fix data-races in proc_dointvec_ms_jiffies().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_dointvec_ms_jiffies() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_dointvec_ms_jiffies() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agosysctl: Fix data-races in proc_dou8vec_minmax().
Kuniyuki Iwashima [Tue, 12 Jul 2022 00:15:19 +0000 (17:15 -0700)]
sysctl: Fix data-races in proc_dou8vec_minmax().

A sysctl variable is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

This patch changes proc_dou8vec_minmax() to use READ_ONCE() and
WRITE_ONCE() internally to fix data-races on the sysctl side.  For now,
proc_dou8vec_minmax() itself is tolerant to a data-race, but we still
need to add annotations on the other subsystem's side.

Fixes: cb9444130662 ("sysctl: add proc_dou8vec_minmax()")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'prestera-port-range-filters'
David S. Miller [Wed, 13 Jul 2022 11:16:56 +0000 (12:16 +0100)]
Merge branch 'prestera-port-range-filters'

Maksym Glubokiy says:

====================
net: prestera: add support for port range filters

This adds support for port-range rules:

  $ tc qdisc add ... clsact
  $ tc filter add ... flower ... src_port <PMIN>-<PMAX> ...
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Co-developed-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
2 years agonet: prestera: add support for port range filters
Maksym Glubokiy [Mon, 11 Jul 2022 15:09:08 +0000 (18:09 +0300)]
net: prestera: add support for port range filters

This adds support for port-range rules:

  $ tc qdisc add ... clsact
  $ tc filter add ... flower ... src_port <PMIN>-<PMAX> ...

Co-developed-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: extract port range fields from fl_flow_key
Maksym Glubokiy [Mon, 11 Jul 2022 15:09:07 +0000 (18:09 +0300)]
net: extract port range fields from fl_flow_key

So it can be used for port range filter offloading.

Co-developed-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'prestera-mdb-offload'
David S. Miller [Wed, 13 Jul 2022 11:14:05 +0000 (12:14 +0100)]
Merge branch 'prestera-mdb-offload'

Oleksandr Mazur says:

====================
net: marvell: prestera: add MDB offloading support

This patch series adds support for the MDB handling for the marvell
Prestera Driver. It's used to propagate IGMP groups registered within
the Kernel to the underlying HW (offload registered groups).

Features:
 - define (and implement) main internal MDB-related objects;
 - define (and implement) main HW APIs for MDB handling;
 - add MDB handling support for both regular ports as well as Bond
   interfaces;
 - Mirrored behavior of Bridge driver upon multicast router appearance
   (all traffic flooded when there's no mcast router; mcast router
    receives all mcast traffic, and only group-specific registered mcast
    traffic is being received by ports who've explicitly joined any group
    when there's a registered mcast router);
 - Reworked prestera driver bridge flags (especially multicast)
   setting - thus making it possible to react over mcast disabled messages
   properly by offloading this state to the underlying HW
   (disabling multicast flooding);

Limitations:
 - Not full (partial) IGMPv3 support (due to limited switchdev
   notification capabilities:
     MDB events are being propagated only with a single MAC entry,
     while IGMPv3 has Group-Specific requests and responses);
 - It's possible that multiple IP groups would receive traffic from
   other groups, as MDB events are being propagated with a single MAC
   entry, which makes it possible to map a few IPs over same MAC;

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
PATCH V5:
 - fix checkpatch errors (4/4).
 - remove function forward declarations, and move
   function implementations to match the removed
   forward declarations (4/4).
 - rebased changes on top of latest master.
PATCH V4:
 - fix clang warning - var uninitialized when used.
PATCH V3:
 - add missing function implementations to 2/4 (HW API implementation),
   only definitions were added int V1, V2, and implementation waas missed
   by mistake.

Reported-by: kernel test robot <lkp@intel.com>
 - fix compiletime warning (unused variable)

PATCH V2:
 - include all the patches of patch series (V1's been sent to
   closed net-next, also had not all patches included);
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: marvell: prestera: implement software MDB entries allocation
Oleksandr Mazur [Mon, 11 Jul 2022 11:28:22 +0000 (14:28 +0300)]
net: marvell: prestera: implement software MDB entries allocation

Define bridge MDB entry (software entry):
  - entry that get's created upon receiving MDB management events
    (create/delete), that inherently defines a software entry,
    which can be enabled (offloaded to the HW) or disabled (removed
    from HW).
    This separation is done to achieve a better highlevel
    management of HW resources - software MDB entry could exist,
    while it's not necessarily should be configured on the HW.
    For example: by default, the Linux behavior would not replicate
    multicast traffic to multicast group members if there's no
    active multicast router and thus - no actual multicast traffic
    can be received/sent. So, until multicast router appears on the
    system no HW configuration should be applied, although SW MDB entries
    should be tracked.
    Another example would be altering state of 'multicast enabled' on
    the bridge: MC_DISABLED should invoke disabling / clearing multicast
    groups of specified bridge on the HW, yet upon receiving 'multicast
    enabled' event, driver should reconfigure any existing software MDB
    groups on the HW.
    Keeping track of software MDB entries in such way makes it possible
    to properly react on such events.
Define bridge MDB port entry (software entry):
  - entry that helps keeping track (on software - driver - level) of which
    bridge mebemer interface joined any give MDB group;

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: marvell: prestera: define and implement MDB / flood domain API for entries creat...
Oleksandr Mazur [Mon, 11 Jul 2022 11:28:21 +0000 (14:28 +0300)]
net: marvell: prestera: define and implement MDB / flood domain API for entries creation and deletion

Define and implement prestera API calls for managing MDB and
  flood domain (ports) entries (create / delete / find calls).

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: marvell: prestera: define MDB/flood domain entries and HW API to offload them...
Oleksandr Mazur [Mon, 11 Jul 2022 11:28:20 +0000 (14:28 +0300)]
net: marvell: prestera: define MDB/flood domain entries and HW API to offload them to the HW

Define MDB entry that can be offloaded:
  - FDB entry, that defines an multicast group to which traffic can be
    replicated to;
Define flood domain:
  - Arrangement of ports (list), that have joined multicast group, which
    would receive and replicate to multicast traffic of specified group;
Define flood domain port:
  - single flood domain list entry, that is associated with any given
    bridge port interface (could be LAG interface or physical port-member).
    Applicable to both Q and D bridges;

Co-developed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: marvell: prestera: rework bridge flags setting
Oleksandr Mazur [Mon, 11 Jul 2022 11:28:19 +0000 (14:28 +0300)]
net: marvell: prestera: rework bridge flags setting

Separate flags to make it possible to alter them separately;
Move bridge flags setting logic from HW API level to prestera_main
  where it belongs;
Move bridge flags parsing (and setting using prestera API) to
  prestera_switchdev.c - module responsible for bridge operations
  handling;

Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoip6_tunnel: allow to inherit from VLAN encapsulated IP
Matthias May [Mon, 11 Jul 2022 09:17:22 +0000 (11:17 +0200)]
ip6_tunnel: allow to inherit from VLAN encapsulated IP

The current code allows to inherit the TTL (hop_limit) from the
payload when skb->protocol is ETH_P_IP or ETH_P_IPV6.
However when the payload is VLAN encapsulated (e.g because the tunnel
is of type GRETAP), then this inheriting does not work, because the
visible skb->protocol is of type ETH_P_8021Q or ETH_P_8021AD.

Instead of skb->protocol, use skb_protocol().

Signed-off-by: Matthias May <matthias.may@westermo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoip6_gre: use actual protocol to select xmit
Matthias May [Mon, 11 Jul 2022 09:17:21 +0000 (11:17 +0200)]
ip6_gre: use actual protocol to select xmit

When the payload is a VLAN encapsulated IPv6/IPv6 frame, we can
skip the 802.1q/802.1ad ethertypes and jump to the actual protocol.
This way we treat IPv4/IPv6 frames as IP instead of as "other".

Signed-off-by: Matthias May <matthias.may@westermo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoip6_gre: set DSCP for non-IP
Matthias May [Mon, 11 Jul 2022 09:17:20 +0000 (11:17 +0200)]
ip6_gre: set DSCP for non-IP

The current code always forces a dscp of 0 for all non-IP frames.
However when setting a specific TOS with the command

ip link add name tep0 type ip6gretap local fdd1:ced0:5d88:3fce::1
remote fdd1:ced0:5d88:3fce::2 tos 0xa0

one would expect all GRE encapsulated frames to have a TOS of 0xA0.
and not only when the payload is IPv4/IPv6.

Signed-off-by: Matthias May <matthias.may@westermo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoip_tunnel: allow to inherit from VLAN encapsulated IP
Matthias May [Mon, 11 Jul 2022 09:17:19 +0000 (11:17 +0200)]
ip_tunnel: allow to inherit from VLAN encapsulated IP

The current code allows to inherit the TOS, TTL, DF from the payload
when skb->protocol is ETH_P_IP or ETH_P_IPV6.
However when the payload is VLAN encapsulated (e.g because the tunnel
is of type GRETAP), then this inheriting does not work, because the
visible skb->protocol is of type ETH_P_8021Q or ETH_P_8021AD.

Instead of skb->protocol, use skb_protocol().

Signed-off-by: Matthias May <matthias.may@westermo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
Meng Tang [Wed, 13 Jul 2022 09:41:33 +0000 (17:41 +0800)]
ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop

The headset on this machine is not defined, after applying the quirk
ALC256_FIXUP_ASUS_HEADSET_MIC, the headset-mic works well

Signed-off-by: Meng Tang <tangmeng@uniontech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220713094133.9894-1-tangmeng@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 years agoALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221
Meng Tang [Wed, 13 Jul 2022 06:33:32 +0000 (14:33 +0800)]
ALSA: hda/realtek - Fix headset mic problem for a HP machine with alc221

On a HP 288 Pro G2 MT (X9W02AV), the front mic could not be detected.
In order to get it working, the pin configuration needs to be set
correctly, and the ALC221_FIXUP_HP_288PRO_MIC_NO_PRESENCE fixup needs
to be applied.

Signed-off-by: Meng Tang <tangmeng@uniontech.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220713063332.30095-1-tangmeng@uniontech.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 years agoALSA: hda/realtek: fix mute/micmute LEDs for HP machines
Jeremy Szu [Wed, 13 Jul 2022 02:27:04 +0000 (10:27 +0800)]
ALSA: hda/realtek: fix mute/micmute LEDs for HP machines

The HP ProBook 440/450 G9 and EliteBook 640/650 G9 have multiple
motherboard design and they are using different subsystem ID of audio
codec. Add the same quirk for other MBs.

Signed-off-by: Jeremy Szu <jeremy.szu@canonical.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220713022706.22892-1-jeremy.szu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2 years agoqlogic: qed: fix clang -Wformat warnings
Justin Stitt [Mon, 11 Jul 2022 23:24:04 +0000 (16:24 -0700)]
qlogic: qed: fix clang -Wformat warnings

When building with Clang we encounter these warnings:
| drivers/net/ethernet/qlogic/qed/qed_dev.c:416:30: error: format
| specifies type 'char' but the argument has type 'u32' (aka 'unsigned
| int') [-Werror,-Wformat] i);
-
| drivers/net/ethernet/qlogic/qed/qed_dev.c:630:13: error: format
| specifies type 'char' but the argument has type 'int' [-Werror,-Wformat]
| p_llh_info->num_ppfid - 1);

For the first warning, `i` is a u32 which is much wider than the format
specifier `%hhd` describes. This results in a loss of bits after 2^7.

The second warning involves implicit integer promotion as the resulting
type of addition cannot be smaller than an int.

example:
``
uint8_t a = 4, b = 7;
int size = sizeof(a + b - 1);
printf("%d\n", size);
// output: 4
```

See more:
(https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules)
"Integer types smaller than int are promoted when an operation is
performed on them. If all values of the original type can be represented
as an int, the value of the smaller type is converted to an int;
otherwise, it is converted to an unsigned int."

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20220711232404.2189257-1-justinstitt@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'bnxt_en-5-bug-fixes'
Jakub Kicinski [Wed, 13 Jul 2022 03:36:00 +0000 (20:36 -0700)]
Merge branch 'bnxt_en-5-bug-fixes'

Michael Chan says:

====================
bnxt_en: 5 Bug fixes

This patchset fixes various issues, including SRIOV error unwinding,
one error recovery path, live patch reporting, XDP transmit path,
and PHC clock reading.
====================

Link: https://lore.kernel.org/r/1657592778-12730-1-git-send-email-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: Fix bnxt_refclk_read()
Pavan Chebbi [Tue, 12 Jul 2022 02:26:18 +0000 (22:26 -0400)]
bnxt_en: Fix bnxt_refclk_read()

The upper 32-bit PHC register is not latched when reading the lower
32-bit PHC register.  Current code leaves a small window where we may
not read correct higher order bits if the lower order bits are just about
to wrap around.

This patch fixes this by reading higher order bits twice and makes
sure that final value is correctly paired with its lower 32 bits.

Fixes: 30e96f487f64 ("bnxt_en: Do not read the PTP PHC during chip reset")
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: Fix and simplify XDP transmit path
Michael Chan [Tue, 12 Jul 2022 02:26:17 +0000 (22:26 -0400)]
bnxt_en: Fix and simplify XDP transmit path

Fix the missing length hint in the TX BD for the XDP transmit path.  The
length hint is required on legacy chips.

Also, simplify the code by eliminating the first_buf local variable.
tx_buf contains the same value.  The opaque value only needs to be set
on the first BD.  Fix this also for correctness.

Fixes: a7559bc8c17c ("bnxt: support transmit and free of aggregation buffers")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: fix livepatch query
Vikas Gupta [Tue, 12 Jul 2022 02:26:16 +0000 (22:26 -0400)]
bnxt_en: fix livepatch query

In the livepatch query fw_target BNXT_FW_SRT_PATCH is
applicable for P5 chips only.

Fixes: 3c4153394e2c ("bnxt_en: implement firmware live patching")
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: Fix bnxt_reinit_after_abort() code path
Michael Chan [Tue, 12 Jul 2022 02:26:15 +0000 (22:26 -0400)]
bnxt_en: Fix bnxt_reinit_after_abort() code path

bnxt_reinit_after_abort() is called during ifup when a previous
FW reset sequence has aborted or a previous ifup has failed after
detecting FW reset.  In all cases, it is safe to assume that a
previous FW reset has completed and the driver may not have fully
reinitialized.

Prior to this patch, it is assumed that the
FUNC_DRV_IF_CHANGE_RESP_FLAGS_HOT_FW_RESET_DONE flag will always be
set by the firmware in bnxt_hwrm_if_change().  This may not be true if
the driver has already attempted to register with the firmware.  The
firmware may not set the RESET_DONE flag again after the driver has
registered, assuming that the driver has seen the flag already.

Fix it to always go through the FW reset initialization path if
the BNXT_STATE_FW_RESET_DET flag is set.  This flag is always set
by the driver after successfully going through bnxt_reinit_after_abort().

Fixes: 6882c36cf82e ("bnxt_en: attempt to reinitialize after aborted reset")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobnxt_en: reclaim max resources if sriov enable fails
Kashyap Desai [Tue, 12 Jul 2022 02:26:14 +0000 (22:26 -0400)]
bnxt_en: reclaim max resources if sriov enable fails

If bnxt_sriov_enable() fails after some resources have been reserved
for the VFs, the current code is not unwinding properly and the
reserved resources become unavailable afterwards.  Fix it by
properly unwinding with a call to bnxt_hwrm_func_qcaps() to
reset all maximum resources.

Also, add the missing bnxt_ulp_sriov_cfg() call to let the RDMA
driver know to abort.

Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoigb: add xdp frags support to ndo_xdp_xmit
Lorenzo Bianconi [Mon, 11 Jul 2022 23:07:51 +0000 (16:07 -0700)]
igb: add xdp frags support to ndo_xdp_xmit

Add the capability to map non-linear xdp frames in XDP_TX and
ndo_xdp_xmit callback.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220711230751.3124415-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'mptcp-support-changes-to-initial-subflow-priority'
Jakub Kicinski [Wed, 13 Jul 2022 01:37:22 +0000 (18:37 -0700)]
Merge branch 'mptcp-support-changes-to-initial-subflow-priority'

Mat Martineau says:

====================
mptcp: Support changes to initial subflow priority

This series updates the in-kernel MPTCP path manager to allow changes to
subflow priority for the first subflow created for each MPTCP connection
(the one created with the MP_CAPABLE handshake).

Patches 1 and 2 do some refactoring to simplify the new functionality.

Patch 3 introduces the new feature to change the initial subflow
priority and send the MP_PRIO header on that subflow.

Patch 4 cleans up code related to tracking endpoint ids on the initial
subflow.

Patch 5 adds a selftest to confirm that subflow priorities are updated
as expected.
====================

Link: https://lore.kernel.org/r/20220711191633.80826-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoselftests: mptcp: add MPC backup tests
Paolo Abeni [Mon, 11 Jul 2022 19:16:33 +0000 (12:16 -0700)]
selftests: mptcp: add MPC backup tests

Add a couple of test-cases covering the newly introduced
features - priority update for the MPC subflow.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agomptcp: more accurate MPC endpoint tracking
Paolo Abeni [Mon, 11 Jul 2022 19:16:32 +0000 (12:16 -0700)]
mptcp: more accurate MPC endpoint tracking

Currently the id accounting for the ID 0 subflow is not correct:
at creation time we mark (correctly) as unavailable the endpoint
id corresponding the MPC subflow source address, while at subflow
removal time set as available the id 0.

With this change we track explicitly the endpoint id corresponding
to the MPC subflow so that we can mark it as available at removal time.
Additionally this allow deleting the initial subflow via the NL PM
specifying the corresponding endpoint id.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>