git.proxmox.com Git - mirror_ubuntu-jammy-kernel.git/log

tcp: switch orphan_count to bare per-cpu counters

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 19757cebf0c5016a1f36f7fe9810a9f0b33c0832 ]

Use of percpu_counter structure to track count of orphaned
sockets is causing problems on modern hosts with 256 cpus
or more.

Stefan Bach reported a serious spinlock contention in real workloads,
that I was able to reproduce with a netfilter rule dropping
incoming FIN packets.

    53.56%  server  [kernel.kallsyms]      [k] queued_spin_lock_slowpath
            |
            ---queued_spin_lock_slowpath
               |
                --53.51%--_raw_spin_lock_irqsave
                          |
                           --53.51%--__percpu_counter_sum
                                     tcp_check_oom
                                     |
                                     |--39.03%--__tcp_close
                                     |          tcp_close
                                     |          inet_release
                                     |          inet6_release
                                     |          sock_close
                                     |          __fput
                                     |          ____fput
                                     |          task_work_run
                                     |          exit_to_usermode_loop
                                     |          do_syscall_64
                                     |          entry_SYSCALL_64_after_hwframe
                                     |          __GI___libc_close
                                     |
                                      --14.48%--tcp_out_of_resources
                                                tcp_write_timeout
                                                tcp_retransmit_timer
                                                tcp_write_timer_handler
                                                tcp_write_timer
                                                call_timer_fn
                                                expire_timers
                                                __run_timers
                                                run_timer_softirq
                                                __softirqentry_text_start

As explained in commit cf86a086a180 ("net/dst: use a smaller percpu_counter
batch for dst entries accounting"), default batch size is too big
for the default value of tcp_max_orphans (262144).

But even if we reduce batch sizes, there would still be cases
where the estimated count of orphans is beyond the limit,
and where tcp_too_many_orphans() has to call the expensive
percpu_counter_sum_positive().

One solution is to use plain per-cpu counters, and have
a timer to periodically refresh this cache.

Updating this cache every 100ms seems about right, since tcp pressure
state does not change radically over shorter periods.

percpu_counter was nice 15 years ago while hosts had less
than 16 cpus, not anymore by current standards.
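
The approach described above can be sketched in userspace C. This is only
an illustration, not the kernel code: the names (orphan_count[],
orphan_cache, NR_CPUS_SKETCH) are assumptions, and the periodic timer is
replaced by an explicit call to the refresh function.

```c
#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

static int orphan_count[NR_CPUS_SKETCH];	/* one bare counter per CPU, no lock */
static int orphan_cache;			/* last refreshed sum */

/* Fast path: a CPU only touches its own slot. */
static void orphan_inc(int cpu) { orphan_count[cpu]++; }
static void orphan_dec(int cpu) { orphan_count[cpu]--; }

/* Slow path: meant to run from a periodic timer (every 100ms in the text). */
static void orphan_cache_refresh(void)
{
	int cpu, sum = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += orphan_count[cpu];
	/* A single slot may be negative, the total should not be. */
	orphan_cache = sum > 0 ? sum : 0;
}

/* The limit check reads only the cached value; it never walks all CPUs. */
static int too_many_orphans(int max_orphans)
{
	return orphan_cache > max_orphans;
}
```

The trade-off is that the check can be stale by up to one refresh period,
which the text above argues is acceptable.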

v2: Fix the build issue for CONFIG_CRYPTO_DEV_CHELSIO_TLS=m,
    reported by kernel test robot <lkp@intel.com>
    Remove unused socket argument from tcp_too_many_orphans()

Fixes: dd24c00191d5 ("net: Use a percpu_counter for orphan_count")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Stefan Bach <sfb@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: tulip: winbond-840: fix build for UML

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit a3d708925fcca1a2f7219bc9ce93e6341f85c1e0 ]

On i386, when builtin (not a loadable module), the winbond-840 driver
inspects boot_cpu_data to see what CPU family it is running on, and
then acts on that data. The "family" struct member (x86) does not exist
when running on UML, so prevent that test and do the default action.

Prevents this build error on UML + i386:

../drivers/net/ethernet/dec/tulip/winbond-840.c: In function ‘init_registers’:
../drivers/net/ethernet/dec/tulip/winbond-840.c:882:19: error: ‘struct cpuinfo_um’ has no member named ‘x86’
if (boot_cpu_data.x86 <= 4) {

Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20211014050606.7288-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: intel: igc_ptp: fix build for UML

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 523994ba3ad1b7b55abe4a72e156897b5e2db825 ]

On a UML build, the igc_ptp driver uses CONFIG_X86_TSC for timestamp
conversion. The function that is used is not available on UML builds,
so have the function use the default system_counterval_t timestamp
instead for UML builds.

Prevents this build error on UML:

../drivers/net/ethernet/intel/igc/igc_ptp.c: In function ‘igc_device_tstamp_to_system’:
../drivers/net/ethernet/intel/igc/igc_ptp.c:777:9: error: implicit declaration of function ‘convert_art_ns_to_tsc’ [-Werror=implicit-function-declaration]
return convert_art_ns_to_tsc(tstamp);
../drivers/net/ethernet/intel/igc/igc_ptp.c:777:9: error: incompatible types when returning type ‘int’ but ‘struct system_counterval_t’ was expected
return convert_art_ns_to_tsc(tstamp);

Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Link: https://lore.kernel.org/r/20211014050516.6846-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: fealnx: fix build for UML

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit cd2621d07d517473611b170c69beb6524c677740 ]

On i386, when builtin (not a loadable module), the fealnx driver
inspects boot_cpu_data to see what CPU family it is running on, and
then acts on that data. The "family" struct member (x86) does not exist
when running on UML, so prevent that test and do the default action.

Prevents this build error on UML + i386:

../drivers/net/ethernet/fealnx.c: In function ‘netdev_open’:
../drivers/net/ethernet/fealnx.c:861:19: error: ‘struct cpuinfo_um’ has no member named ‘x86’

Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-um@lists.infradead.org
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Link: https://lore.kernel.org/r/20211014050500.5620-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

kernel/sched: Fix sched_fork() access an invalid sched_task_group

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 4ef0c5c6b5ba1f38f0ea1cedad0cad722f00c14a ]

There is a small race between copy_process() and sched_fork()
where child->sched_task_group points to already freed memory.

  parent doing fork()            | someone moving the parent
                                 | to another cgroup
  -------------------------------+-------------------------------
  copy_process()
      + dup_task_struct()<1>
                                   parent move to another cgroup,
                                   and free the old cgroup. <2>
      + sched_fork()
        + __set_task_cpu()<3>
        + task_fork_fair()
          + sched_slice()<4>

In the worst case, this bug can lead to a use-after-free and
cause a panic, as shown below:

  (1) the parent copies its sched_task_group to the child at <1>;

  (2) someone moves the parent to another cgroup and frees the old
      cgroup at <2>;

  (3) the sched_task_group and cfs_rq that belong to the old cgroup
      are accessed at <3> and <4>, which causes a panic:

  [] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  [] PGD 8000001fa0a86067 P4D 8000001fa0a86067 PUD 2029955067 PMD 0
  [] Oops: 0000 [#1] SMP PTI
  [] CPU: 7 PID: 648398 Comm: ebizzy Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0.x86_64+ #1
  [] RIP: 0010:sched_slice+0x84/0xc0

  [] Call Trace:
  []  task_fork_fair+0x81/0x120
  []  sched_fork+0x132/0x240
  []  copy_process.part.5+0x675/0x20e0
  []  ? __handle_mm_fault+0x63f/0x690
  []  _do_fork+0xcd/0x3b0
  []  do_syscall_64+0x5d/0x1d0
  []  entry_SYSCALL_64_after_hwframe+0x65/0xca
  [] RIP: 0033:0x7f04418cd7e1

Between cgroup_can_fork() and cgroup_post_fork(), the cgroup
membership and thus sched_task_group can't change. So update the child's
sched_task_group in sched_post_fork() and move task_fork() and
__set_task_cpu() (which access the sched_task_group) from sched_fork()
to sched_post_fork().

Fixes: 8323f26ce342 ("sched: Fix race in task_group")
Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lkml.kernel.org/r/20210915064030.2231-1-zhangqiao22@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath10k: fix max antenna gain unit

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 0a491167fe0cf9f26062462de2a8688b96125d48 ]

Most of the txpower for the ath10k firmware is stored as twicepower (0.5 dB
steps). This isn't the case for max_antenna_gain - which is still expected
by the firmware as dB.

The firmware is converting it from dB to the internal (twicepower)
representation when it calculates the limits of a channel. This can be seen
in tpc_stats when configuring "12" as max_antenna_gain. Instead of the
expected 12 (6 dB), the tpc_stats shows 24 (12 dB).
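
The double conversion can be modelled with a small sketch. The function
names are hypothetical and the firmware is reduced to "multiply by two",
which is an assumption based only on the description above.

```c
/* Hypothetical model of the firmware: it takes dB and stores 0.5 dB steps. */
static int fw_gain_to_twicepower(int max_antenna_gain_db)
{
	return max_antenna_gain_db * 2;
}

/* Behaviour described as the bug: the host already sends 0.5 dB steps. */
static int tpc_gain_before_fix(int gain_db)
{
	return fw_gain_to_twicepower(gain_db * 2);
}

/* Behaviour after the fix: the host passes plain dB. */
static int tpc_gain_after_fix(int gain_db)
{
	return fw_gain_to_twicepower(gain_db);
}
```

For a 6 dB gain, the first variant ends up with 24 (12 dB) and the second
with the expected 12 (6 dB).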

Tested on QCA9888 and IPQ4019 with firmware 10.4-3.5.3-00057.

Fixes: 02256930d9b8 ("ath10k: use proper tx power unit")
Signed-off-by: Sven Eckelmann <seckelmann@datto.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20190611172131.6064-1-sven@narfation.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hwmon: (pmbus/lm25066) Let compiler determine outer dimension of lm25066_coeff

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit b7931a7b0e0df4d2a25fedd895ad32c746b77bc1 ]

Maintaining this manually is error prone (there are currently only
five chips supported, not six); gcc can do it for us automatically.
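
A minimal sketch of the idiom, with made-up chip names, coefficients and
inner dimension (none of them are taken from the driver): leaving the
outer dimension empty lets the compiler size the array from its
initializers.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical chip list and inner dimension, for illustration only. */
enum sketch_chips { chip_a, chip_b, chip_c, chip_d, chip_e };
#define SKETCH_NUM_PARAMS 2

struct sketch_coeff { short m, b, R; };

/* Outer dimension left empty: it follows the initializers automatically. */
static const struct sketch_coeff coeff_table[][SKETCH_NUM_PARAMS] = {
	[chip_a] = { { 1, 0, 0 }, { 2, 0, -1 } },
	[chip_b] = { { 3, 0, 0 }, { 4, 0, -1 } },
	[chip_c] = { { 5, 0, 0 }, { 6, 0, -1 } },
	[chip_d] = { { 7, 0, 0 }, { 8, 0, -1 } },
	[chip_e] = { { 9, 0, 0 }, { 10, 0, -1 } },
};
```

With five designated entries, ARRAY_SIZE(coeff_table) evaluates to 5, so
no hand-maintained count can drift out of date.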

Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Fixes: 666c14906b49 ("hwmon: (pmbus/lm25066) Drop support for LM25063")
Link: https://lore.kernel.org/r/20210928092242.30036-5-zev@bewilderbeest.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hwmon: Fix possible memleak in __hwmon_device_register()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit ada61aa0b1184a8fda1a89a340c7d6cc4e59aee5 ]

I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff888102740438 (size 8):
  comm "27", pid 859, jiffies 4295031351 (age 143.992s)
  hex dump (first 8 bytes):
    68 77 6d 6f 6e 30 00 00                          hwmon0..
  backtrace:
    [<00000000544b5996>] __kmalloc_track_caller+0x1a6/0x300
    [<00000000df0d62b9>] kvasprintf+0xad/0x140
    [<00000000d3d2a3da>] kvasprintf_const+0x62/0x190
    [<000000005f8f0f29>] kobject_set_name_vargs+0x56/0x140
    [<00000000b739e4b9>] dev_set_name+0xb0/0xe0
    [<0000000095b69c25>] __hwmon_device_register+0xf19/0x1e50 [hwmon]
    [<00000000a7e65b52>] hwmon_device_register_with_info+0xcb/0x110 [hwmon]
    [<000000006f181e86>] devm_hwmon_device_register_with_info+0x85/0x100 [hwmon]
    [<0000000081bdc567>] tmp421_probe+0x2d2/0x465 [tmp421]
    [<00000000502cc3f8>] i2c_device_probe+0x4e1/0xbb0
    [<00000000f90bda3b>] really_probe+0x285/0xc30
    [<000000007eac7b77>] __driver_probe_device+0x35f/0x4f0
    [<000000004953d43d>] driver_probe_device+0x4f/0x140
    [<000000002ada2d41>] __device_attach_driver+0x24c/0x330
    [<00000000b3977977>] bus_for_each_drv+0x15d/0x1e0
    [<000000005bf2a8e3>] __device_attach+0x267/0x410

When device_register() returns an error, the name allocated in
dev_set_name() is leaked. put_device() should be used instead of
calling hwmon_dev_release() to give up the device reference, so
that the name is freed in kobject_cleanup().

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: bab2243ce189 ("hwmon: Introduce hwmon_device_register_with_groups")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012112758.2681084-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net, neigh: Fix NTF_EXT_LEARNED in combination with NTF_USE

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e4400bbf5b15750e1b59bf4722d18d99be60c69f ]

The NTF_EXT_LEARNED neigh flag is usually propagated back to user space
upon dump of the neighbor table. However, when used in combination with
NTF_USE flag this is not the case despite exempting the entry from the
garbage collector. This results in inconsistent state since entries are
typically marked in neigh->flags with NTF_EXT_LEARNED, but here they are
not. Fix it by propagating the creation flag to ___neigh_create().

Before fix:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a REACHABLE
  [...]

After fix:

  # ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
  # ./ip/ip n
  192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
  [...]

Fixes: 9ce33e46531d ("neighbour: support for NTF_EXT_LEARNED flag")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

memstick: jmb38x_ms: use appropriate free function in jmb38x_ms_alloc_host()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit beae4a6258e64af609ad5995cc6b6056eb0d898e ]

The "msh" pointer is device managed, meaning that memstick_alloc_host()
calls device_initialize() on it. That means that it can't be freed
using kfree() but must instead be freed with memstick_free_host().
Otherwise it leads to a tiny memory leak of device resources.

Fixes: 60fdd931d577 ("memstick: add support for JMicron jmb38x MemoryStick host controller")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211011123912.GD15188@kili
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

memstick: avoid out-of-range warning

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 4853396f03c3019eccf5cd113e464231e9ddf0b3 ]

clang-14 complains about a sanity check that always passes when the
page size is 64KB or larger:

drivers/memstick/core/ms_block.c:1739:21: error: result of comparison of constant 65536 with expression of type 'unsigned short' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (msb->page_size > PAGE_SIZE) {
~~~~~~~~~~~~~~ ^ ~~~~~~~~~

This is fine, it will still work on all architectures, so just shut
up that warning with a cast.
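
A sketch of the idea, assuming the check compares a 16-bit device page
size with the system page size. The function name and the choice of
"unsigned long" as the wider type are illustrative, and the system page
size is passed as a parameter here rather than being a constant.

```c
/*
 * The device page size is a 16-bit field. Widening it before the
 * comparison keeps the result identical while making both operands
 * the same width.
 */
static int page_size_too_big(unsigned short dev_page_size,
			     unsigned long sys_page_size)
{
	return (unsigned long)dev_page_size > sys_page_size;
}
```

With a 64KB system page size the check can never fire, which is the
situation the compiler pointed out; with 4KB pages it still rejects
larger device pages.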

Fixes: 0ab30494bc4f ("memstick: add support for legacy memorysticks")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210927094520.696665-1-arnd@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

mmc: sdhci-omap: Fix context restore

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit d806e334d0390502cd2a820ad33d65d7f9bba618 ]

We need to restore context in a specified order with HCTL set in two
phases. This is similar to what omap_hsmmc_context_restore() is doing.
Otherwise SDIO can stop working on resume.

And for PM runtime and SDIO cards, we need to also save SYSCTL, IE and
ISE.

This should not be a problem currently, and these patches can be applied
whenever suitable.

Fixes: ee0f309263a6 ("mmc: sdhci-omap: Add Support for Suspend/Resume")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210921110029.21944-3-tony@atomide.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

mmc: sdhci-omap: Fix NULL pointer exception if regulator is not configured

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 8e0e7bd38b1ec7f9e5d18725ad41828be4e09859 ]

If sdhci-omap is configured for an unused device instance and the device
is not set as disabled, we can get a NULL pointer dereference:

Unable to handle kernel NULL pointer dereference at virtual address
00000045
...
(regulator_set_voltage) from [<c07d7008>] (mmc_regulator_set_ocr+0x44/0xd0)
(mmc_regulator_set_ocr) from [<c07e2d80>] (sdhci_set_ios+0xa4/0x490)
(sdhci_set_ios) from [<c07ea690>] (sdhci_omap_set_ios+0x124/0x160)
(sdhci_omap_set_ios) from [<c07c8e94>] (mmc_power_up.part.0+0x3c/0x154)
(mmc_power_up.part.0) from [<c07c9d20>] (mmc_start_host+0x88/0x9c)
(mmc_start_host) from [<c07cad34>] (mmc_add_host+0x58/0x7c)
(mmc_add_host) from [<c07e2574>] (__sdhci_add_host+0xf0/0x22c)
(__sdhci_add_host) from [<c07eaf68>] (sdhci_omap_probe+0x318/0x72c)
(sdhci_omap_probe) from [<c06a39d8>] (platform_probe+0x58/0xb8)

AFAIK we are not seeing this with the devices configured in the mainline
kernel but this can cause issues for folks bringing up their boards.

Fixes: 7d326930d352 ("mmc: sdhci-omap: Add OMAP SDHCI driver")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Link: https://lore.kernel.org/r/20210921110029.21944-2-tony@atomide.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

gve: Track RX buffer allocation failures

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 1b4d1c9bab091ac6e20a3ff80c30c5cefe192bf4 ]

The rx_buf_alloc_fail counter wasn't getting updated.

Fixes: 433e274b8f7b0 ("gve: Add stats for gve.")
Signed-off-by: Catherine Sullivan <csully@google.com>
Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

gve: Recover from queue stall due to missed IRQ

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 87a7f321bb6a45e54b7d6c90d032ee5636a6ad97 ]

Don't always reset the driver on a TX timeout. Attempt to
recover by kicking the queue in case an IRQ was missed.

Fixes: 9e5f7d26a4c08 ("gve: Add workqueue and reset support")
Signed-off-by: John Fraker <jfraker@google.com>
Signed-off-by: David Awogbemila <awogbemila@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

b43: fix a lower bounds test

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9b793db5fca44d01f72d3564a168171acf7c4076 ]

The problem is that "channel" is an unsigned int, so when it's less than
5 the value of "channel - 5" is not a negative number as one would expect
but is a very high positive value instead.

This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.
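
The wraparound can be reproduced in a few lines. Both helpers below are a
sketch of the pattern described here, not a copy of the driver code, and
the fallback value 1 is an assumption.

```c
/* Buggy pattern: "channel - 5" is unsigned, so it wraps instead of going negative. */
static unsigned int start_buggy(unsigned int channel)
{
	return (channel - 5 > 0) ? channel - 5 : 1;
}

/* One possible fix: compare first, subtract only when it is safe. */
static unsigned int start_fixed(unsigned int channel)
{
	return (channel > 5) ? channel - 5 : 1;
}
```

For channel 3, the buggy test sees a huge positive number and returns it,
so a loop of the form "for (i = start; i <= end; i++)" is never entered.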

Fixes: ef1a628d83fc ("b43: Implement dynamic PHY API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073621.GE8404@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

b43legacy: fix a lower bounds test

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit c1c8380b0320ab757e60ed90efc8b1992a943256 ]

The problem is that "channel" is an unsigned int, so when it's less than
5 the value of "channel - 5" is not a negative number as one would expect
but is a very high positive value instead.

This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.

Fixes: 75388acd0cd8 ("[B43LEGACY]: add mac80211-based driver for legacy BCM43xx devices")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073542.GD8404@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ima: fix deadlock when traversing "ima_default_rules".

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit eb0782bbdfd0d7c4786216659277c3fd585afc0e ]

The current IMA ruleset is identified by the variable "ima_rules"
that defaults to "&ima_default_rules". When loading a custom policy
for the first time, the variable is updated to "&ima_policy_rules"
instead. That update isn't RCU-safe, and deadlocks are possible.
Indeed, some functions like ima_match_policy() may loop indefinitely
when traversing "ima_default_rules" with list_for_each_entry_rcu().

When iterating over the default ruleset back to head, if the list
head is "ima_default_rules", and "ima_rules" has been updated to
"&ima_policy_rules", the loop condition (&entry->list != ima_rules)
always stays true, so the traversal never terminates, causing a soft
lockup and RCU stalls.

Introduce a temporary value for "ima_rules" when iterating over
the ruleset to avoid the deadlocks.
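
The effect of the temporary value can be shown with a plain circular list
in userspace. Everything here is a simplified stand-in: the node type, the
list heads and the "switch_after" parameter (which simulates a policy
update landing mid-traversal) are assumptions, and no RCU is involved.

```c
#include <stddef.h>

struct node { struct node *next; };

static struct node default_rules = { &default_rules };	/* list head */
static struct node policy_rules = { &policy_rules };	/* list head */
static struct node *rules = &default_rules;		/* current ruleset */

/*
 * Count the entries of the current ruleset. The head is read once into
 * a local, so the loop condition keeps comparing against the list that
 * is actually being walked. Comparing against "rules" directly would
 * never match again once it is switched during the walk.
 * switch_after < 0 means no update happens.
 */
static int count_rules(int switch_after)
{
	struct node *head = rules;	/* temporary value for the whole walk */
	struct node *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (n == switch_after)
			rules = &policy_rules;	/* concurrent update, simulated */
		n++;
	}
	return n;
}
```

Even when the update happens after the first entry, the walk still ends
at the head it started from.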

Signed-off-by: liqiong <liqiong@nfschina.com>
Reviewed-by: THOBY Simon <Simon.THOBY@viveris.fr>
Fixes: 38d859f991f3 ("IMA: policy can now be updated multiple times")
Reported-by: kernel test robot <lkp@intel.com> (Fix sparse: incompatible types in comparison expression.)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

hwrng: mtk - Force runtime pm ops for sleep ops

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit b6f5f0c8f72d348b2d07b20d7b680ef13a7ffe98 ]

Currently mtk_rng_runtime_suspend/resume is called for both runtime pm
and system sleep operations.

This is wrong as these should only be runtime ops as the name already
suggests. Currently freezing the system will lead to a call to
mtk_rng_runtime_suspend even if the device currently isn't active. This
leads to a clock warning because it is disabled/unprepared although it
isn't enabled/prepared currently.

This patch fixes this by setting only the runtime pm ops and by forcing
the system sleep ops to call the runtime pm ops if the device is active,
but not otherwise.

Fixes: 81d2b34508c6 ("hwrng: mtk - add runtime PM support")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: qat - disregard spurious PFVF interrupts

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 18fcba469ba5359c1de7e3fb16f7b9e8cd1b8e02 ]

Upon receiving a PFVF message, check if the interrupt bit is set in the
message. If it is not, that means that the interrupt was probably
triggered by a collision. In this case, disregard the message and
re-enable the interrupts.
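
A minimal sketch of that check. The bit position, the state structure and
the handler name are hypothetical; only the control flow follows the
description above.

```c
#include <stdint.h>

#define SKETCH_INT_BIT (1u << 0)	/* hypothetical position of the interrupt bit */

struct sketch_state {
	int handled;		/* number of messages actually processed */
	int irq_enabled;	/* 1 once interrupts are re-enabled */
};

static void pfvf_handle(struct sketch_state *s, uint32_t msg)
{
	if (!(msg & SKETCH_INT_BIT)) {
		/* Probably a collision: ignore the message, re-enable IRQs. */
		s->irq_enabled = 1;
		return;
	}
	s->handled++;		/* normal path: process the message */
	s->irq_enabled = 1;
}
```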

Fixes: ed8ccaef52fa ("crypto: qat - Add support for SRIOV")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: qat - detect PFVF collision after ACK

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9b768e8a3909ac1ab39ed44a3933716da7761a6f ]

Detect a PFVF collision between the local and the remote function by
checking if the message on the PFVF CSR has been overwritten.
This is done after the remote function confirms that the message has
been received, by clearing the interrupt bit, or the maximum number of
attempts (ADF_IOV_MSG_ACK_MAX_RETRY) to check the CSR has been exceeded.

Fixes: ed8ccaef52fa ("crypto: qat - Add support for SRIOV")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Co-developed-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: ccree - avoid out-of-range warnings from clang

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit cfd6fb45cfaf46fa9547421d8da387dc9c7997d4 ]

clang points out inconsistencies in the FIELD_PREP() invocation in
this driver that result from the 'mask' being a 32-bit value:

drivers/crypto/ccree/cc_driver.c:117:18: error: result of comparison of constant 18446744073709551615 with expression of type 'u32' (aka 'unsigned int') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
        cache_params |= FIELD_PREP(mask, val);
                        ^~~~~~~~~~~~~~~~~~~~~
include/linux/bitfield.h:94:3: note: expanded from macro 'FIELD_PREP'
                __BF_FIELD_CHECK(_mask, 0ULL, _val, "FIELD_PREP: ");    \
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/bitfield.h:52:28: note: expanded from macro '__BF_FIELD_CHECK'
                BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,         \
                ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This does not happen in other places that just pass a constant here.

Work around the warnings by widening the type of the temporary variable.

Fixes: 05c2a705917b ("crypto: ccree - rework cache parameters handling")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Gilad ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: dvb-frontends: mn88443x: Handle errors of clk_prepare_enable()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 69a10678e2fba3d182e78ea041f2d1b1a6058764 ]

mn88443x_cmn_power_on() did not handle possible errors of
clk_prepare_enable() and always finished successfully so that its caller
mn88443x_probe() did not care about failed preparing/enabling of clocks
as well.

Add the missing error handling in both mn88443x_cmn_power_on() and
mn88443x_probe(). This required changing the return type of the former
from "void" to "int".
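
The shape of the change, reduced to a userspace sketch. The clock call is
mocked and all function names are hypothetical; the point is only that the
power-on helper now returns an error that probe propagates.

```c
#include <errno.h>

static int mock_clk_result;	/* hypothetical: what the mocked clock call returns */

static int mock_clk_prepare_enable(void)
{
	return mock_clk_result;
}

/* Formerly "void": now reports a clock failure to the caller. */
static int sketch_power_on(void)
{
	int ret = mock_clk_prepare_enable();

	if (ret)
		return ret;
	/* ... remaining power-on steps would go here ... */
	return 0;
}

static int sketch_probe(void)
{
	int ret = sketch_power_on();

	if (ret)
		return ret;	/* probe fails instead of silently continuing */
	return 0;
}
```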

Found by Linux Driver Verification project (linuxtesting.org).

Fixes: 0f408ce8941f ("media: dvb-frontends: add Socionext MN88443x ISDB-S/T demodulator driver")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Co-developed-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Signed-off-by: Kirill Shilimanov <kirill.shilimanov@huawei.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: venus: fix vpp frequency calculation for decoder

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 1444232152ea33f5ae41fc14bade3e74d642b634 ]

In the existing video driver implementation, the vpp frequency calculated
in calculate_inst_freq() is always zero because the value of
vpp_freq_per_mb is always zero for the decoder.

Fix this by correcting the vpp frequency calculation for the decoder.

Fixes: 3cfe5815ce0e ("media: venus: Enable low power setting for encoder")
Signed-off-by: Mansur Alisha Shaik <mansur@codeaurora.org>
Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

netfilter: nft_dynset: relax superfluous check on set updates

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 7b1394892de8d95748d05e3ee41e85edb4abbfa1 ]

Relax this condition to make add and update commands idempotent for sets
with no timeout. The eval function already checks if the set element
timeout is available and updates it if the update command is used.

Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

rcu: Fix rcu_dynticks_curr_cpu_in_eqs() vs noinstr

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 74aece72f95f399dd29363669dc32a1344c8fab4 ]

vmlinux.o: warning: objtool: rcu_nmi_enter()+0x36: call to __kasan_check_read() leaves .noinstr.text section

noinstr code cannot call atomic_*() functions because they're explicitly
annotated; use arch_atomic_*() instead.

Fixes: 2be57f732889 ("rcu: Weaken ->dynticks accesses and updates")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

rcu: Always inline rcu_dynticks_task*_{enter,exit}()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 7663ad9a5dbcc27f3090e6bfd192c7e59222709f ]

RCU managed to grow a few noinstr violations:

vmlinux.o: warning: objtool: rcu_dynticks_eqs_enter()+0x0: call to rcu_dynticks_task_trace_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: rcu_dynticks_eqs_exit()+0xe: call to rcu_dynticks_task_trace_exit() leaves .noinstr.text section

Fix them by adding __always_inline to the relevant trivial functions.

Also replace the noinstr with __always_inline for the existing
rcu_dynticks_task_*() functions since noinstr would force them to be
noinline, even when empty, which seems silly.

Fixes: 7d0c9c50c5a1 ("rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so built")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

EDAC/amd64: Handle three rank interleaving mode

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9f4873fb6af7966de8fcbd95c36b61351c1c4b1f ]

AMD Rome systems and later support interleaving between three identical
ranks within a channel.

Check for this mode by counting the number of enabled chip selects and
comparing their masks. If there are exactly three enabled chip selects
and their masks are identical, then three rank interleaving is enabled.

The size of a rank is determined from its mask value. However, three
rank interleaving doesn't follow the method of swapping an interleave
bit with the most significant bit. Rather, the interleave bit is flipped
and the most significant bit remains the same. There is only a single
interleave bit in this case.

Account for this when determining the chip select size by keeping the
most significant bit at its original value and ignoring any zero bits.
This will return a full bitmask in [MSB:1].
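
As a rough illustration of the mask handling described above, here is a
minimal userspace sketch. The toy_* names and the example mask values
are assumptions made for this sketch, not the driver's code; it only
assumes, as stated above, that bit 0 of the mask is always zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index of the highest set bit, or -1 when no bit is set. */
static int toy_msb(uint32_t v)
{
	int msb = -1;

	while (v) {
		msb++;
		v >>= 1;
	}
	return msb;
}

/* Number of set bits. */
static int toy_weight(uint32_t v)
{
	int n = 0;

	for (; v; v >>= 1)
		n += v & 1;
	return n;
}

/*
 * Rebuild the contiguous mask [top:1] from an interleaved address mask.
 * In three-rank mode one bit is cleared without moving the MSB, so that
 * single zero bit is ignored when counting zero bits.
 */
uint32_t toy_deinterleave_mask(uint32_t mask, bool three_rank)
{
	int msb = toy_msb(mask);
	int zero_bits;

	assert(msb >= 1 && (mask & 1) == 0);
	zero_bits = msb - toy_weight(mask) - (three_rank ? 1 : 0);
	assert(zero_bits >= 0);

	/* Take the zero bits off the top and keep bits [top:1]. */
	return (uint32_t)(((UINT64_C(1) << (msb - zero_bits + 1)) - 1) &
			  ~UINT64_C(1));
}
```

With these assumptions, a plain mask (0xFE), a mask whose interleave bit
was swapped with the MSB (0x1F6) and a three-rank mask with one flipped
bit (0xF6) all reduce to the same full bitmask.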

Fixes: e53a3b267fb0 ("EDAC/amd64: Find Chip Select memory size using Address Mask")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211005154419.2060504-1-yazen.ghannam@amd.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

x86/insn: Use get_unaligned() instead of memcpy()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit f96b4675839b66168f5a07bf964dde6c2f1c4885 ]

Use get_unaligned() instead of memcpy() to access potentially unaligned
memory, which, when accessed through a pointer, leads to undefined
behavior. get_unaligned() describes much better what is happening there
anyway even if memcpy() does the job.
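
For illustration only, a userspace sketch of the two access styles. The
toy_* names are hypothetical, the packed attribute is a GCC/Clang
extension, and the kernel itself builds with -fno-strict-aliasing, so
treat this as a sketch of the idea rather than the kernel helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* GCC/Clang extension: a packed struct has an alignment of 1. */
struct toy_unaligned_u32 {
	uint32_t x;
} __attribute__((packed));

/* Read through a packed struct; the compiler emits an unaligned-safe load. */
uint32_t toy_get_unaligned_u32(const void *ptr)
{
	assert(ptr);
	return ((const struct toy_unaligned_u32 *)ptr)->x;
}

/* The memcpy() form: same result, but less descriptive of the intent. */
uint32_t toy_memcpy_u32(const void *ptr)
{
	uint32_t v;

	assert(ptr);
	memcpy(&v, ptr, sizeof(v));
	return v;
}
```

Both helpers return the same value for any byte offset into a buffer.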

In addition, since perf tool builds with -Werror, it would fire with:

  util/intel-pt-decoder/../../../arch/x86/lib/insn.c: In function '__insn_get_emulate_prefix':
  tools/include/../include/asm-generic/unaligned.h:10:15: error: packed attribute is unnecessary [-Werror=packed]
     10 |  const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr); \

because -Werror=packed would complain if the packed attribute would have
no effect on the layout of the structure.

In this case, that is intentional so disable the warning only for that
compilation unit.

That part is Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>

No functional changes.

Fixes: 5ba1071f7554 ("x86/insn, tools/x86: Fix undefined behavior due to potential unaligned accesses")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lkml.kernel.org/r/YVSsIkj9Z29TyUjE@zn.tnic
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

PM: EM: Fix inefficient states detection

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit aa1a43262ad5df010768f69530fa179ff81651d3 ]

Currently, a debug message is printed if an inefficient state is detected
in the Energy Model. Unfortunately, it won't detect if the first state is
inefficient or if two successive states are. Fix this behavior.
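
One way to catch both cases, sketched in userspace C: walk the table
from the highest frequency down and remember the lowest cost seen so
far. The toy_perf_state type, toy_mark_inefficient() and the cost
formula (fmax * power / frequency, ignoring overflow) are assumptions
for this sketch, not the Energy Model code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an Energy Model performance state. */
struct toy_perf_state {
	uint64_t frequency;	/* table is sorted by ascending frequency */
	uint64_t power;
	bool inefficient;
};

/*
 * Walk from the highest frequency down. A state is inefficient when a
 * faster state already reached an equal or lower cost. Returns how many
 * states were flagged.
 */
size_t toy_mark_inefficient(struct toy_perf_state *table, size_t nr_states)
{
	uint64_t prev_cost = UINT64_MAX;
	uint64_t fmax;
	size_t flagged = 0;

	assert(table && nr_states > 0);
	fmax = table[nr_states - 1].frequency;

	for (size_t i = nr_states; i-- > 0; ) {
		uint64_t cost;

		assert(table[i].frequency > 0);
		cost = fmax * table[i].power / table[i].frequency;

		table[i].inefficient = cost >= prev_cost;
		if (table[i].inefficient)
			flagged++;
		else
			prev_cost = cost;
	}
	return flagged;
}
```

Because the comparison is against the best cost seen so far rather than
the direct neighbour, an inefficient first state and two successive
inefficient states are both flagged.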

Fixes: 27871f7a8a34 ("PM: Introduce an Energy Model management framework")
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath9k: Fix potential interrupt storm on queue reset

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 4925642d541278575ad1948c5924d71ffd57ef14 ]

In tests with two Lima boards from 8devices (QCA4531 based) on OpenWrt
19.07 we could force a silent restart of a device with no serial
output when we were sending a high amount of UDP traffic (iperf3 at 80
MBit/s in both directions from external hosts, saturating the wifi and
causing a load of about 4.5 to 6) and were then triggering an
ath9k_queue_reset().

Further debugging showed that the restart was caused by the ath79
watchdog. With the watchdog disabled we could observe that the device
was constantly entering the ath_isr() interrupt handler and returning
early after the ATH_OP_HW_RESET flag test, without clearing any
interrupts, even though ath9k_queue_reset() calls
ath9k_hw_kill_interrupts().

With JTAG we could observe the following race condition:

1) ath9k_queue_reset()
   ...
   -> ath9k_hw_kill_interrupts()
   -> set_bit(ATH_OP_HW_RESET, &common->op_flags);
   ...
   <- returns

      2) ath9k_tasklet()
         ...
         -> ath9k_hw_resume_interrupts()
         ...
         <- returns

                 3) loops around:
                    ...
                    handle_int()
                    -> ath_isr()
                       ...
                       -> if (test_bit(ATH_OP_HW_RESET,
                                       &common->op_flags))
                            return IRQ_HANDLED;

                    x) ath_reset_internal():
                       => never reached <=

And in ath_isr() we would typically see the following interrupts /
interrupt causes:

* status: 0x00111030 or 0x00110030
* async_cause: 2 (AR_INTR_MAC_IPQ)
* sync_cause: 0

So the ath9k_tasklet() reenables the ath9k interrupts
through ath9k_hw_resume_interrupts() which ath9k_queue_reset() had just
disabled. And ath_isr() then keeps firing because it returns IRQ_HANDLED
without actually clearing the interrupt.

To fix this IRQ storm also clear/disable the interrupts again when we
are in reset state.
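
The effect can be modelled with a small userspace toy. Everything here
(struct toy_dev, toy_isr(), toy_count_irqs()) is hypothetical and only
mimics the behavior described above; it is not ath9k code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model, not ath9k data structures. */
struct toy_dev {
	bool irq_enabled;	/* interrupt unmasked in hardware */
	bool irq_pending;	/* cause bit still asserted */
	bool in_reset;		/* models the ATH_OP_HW_RESET flag */
};

/* Toy handler; mask_on_reset selects the behavior described by the fix. */
static void toy_isr(struct toy_dev *dev, bool mask_on_reset)
{
	if (dev->in_reset) {
		if (mask_on_reset)
			dev->irq_enabled = false;	/* stop the line firing */
		return;				/* cause is left pending */
	}
	dev->irq_pending = false;		/* normal path acks the cause */
}

/* Count handler invocations until the line goes quiet or limit is hit. */
int toy_count_irqs(struct toy_dev *dev, bool mask_on_reset, int limit)
{
	int fired = 0;

	assert(dev && limit > 0);
	while (dev->irq_enabled && dev->irq_pending && fired < limit) {
		toy_isr(dev, mask_on_reset);
		fired++;
	}
	return fired;
}
```

Without masking in the reset path the toy handler runs until the limit
is reached; with masking it runs exactly once.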

Cc: Sven Eckelmann <sven@narfation.org>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Linus Lüssing <linus.luessing@c0d3.blue>
Fixes: 872b5d814f99 ("ath9k: do not access hardware on IRQs during reset")
Signed-off-by: Linus Lüssing <ll@simonwunderlich.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210914192515.9273-3-linus.luessing@c0d3.blue
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath10k: Don't always treat modem stop events as crashes

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 747ff7d3d7424876111b7559b7f6704762f89796 ]

When rebooting on sc7180 Trogdor devices I see the following crash from
the wifi driver.

ath10k_snoc 18800000.wifi: firmware crashed! (guid 83493570-29a2-4e98-a83e-70048c47669c)

This is because a modem stop event looks just like a firmware crash to
the driver, the qmi connection is closed in both cases. Use the qcom ssr
notifier block to stop treating the qmi connection close event as a
firmware crash signal when the modem hasn't actually crashed. See
ath10k_qmi_event_server_exit() for more details.

This silences the crash message seen during every reboot.

Fixes: 3f14b73c3843 ("ath10k: Enable MSA region dump support for WCN3990")
Cc: Youghandhar Chintala <youghand@codeaurora.org>
Cc: Abhishek Kumar <kuabhs@chromium.org>
Cc: Steev Klimaszewski <steev@kali.org>
Cc: Matthias Kaehlcke <mka@chromium.org>
Cc: Rakesh Pillai <pillair@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Rakesh Pillai <pillair@codeaurora.org>
Tested-by: Youghandhar Chintala <youghand@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210922233341.182624-1-swboyd@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: em28xx: Don't use ops->suspend if it is NULL

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 51fa3b70d27342baf1ea8aaab3e96e5f4f26d5b2 ]

The call to ops->suspend for the dev->dev_next case can currently
trigger a call on a null function pointer if ops->suspend is null.
Skip over the use of function ops->suspend if it is null.
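
A minimal sketch of the guarded call, using hypothetical toy_* types
rather than the em28xx structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with an optional second device chained to it. */
struct toy_dev {
	struct toy_dev *dev_next;
	int suspended;
};

struct toy_ops {
	int (*suspend)(struct toy_dev *dev);	/* may be NULL */
};

/* Example callback used to exercise the helper below. */
int toy_suspend(struct toy_dev *dev)
{
	dev->suspended = 1;
	return 0;
}

/* Call ->suspend for dev and dev->dev_next only when the hook exists. */
void toy_suspend_all(const struct toy_ops *ops, struct toy_dev *dev)
{
	assert(ops && dev);
	if (!ops->suspend)
		return;
	ops->suspend(dev);
	if (dev->dev_next)
		ops->suspend(dev->dev_next);
}
```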

Addresses-Coverity: ("Dereference after null check")

Fixes: be7fd3c3a8c5 ("media: em28xx: Hauppauge DualHD second tuner functionality")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

cpuidle: Fix kobject memory leaks in error paths

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e5f5a66c9aa9c331da5527c2e3fd9394e7091e01 ]

Commit c343bf1ba5ef ("cpuidle: Fix three reference count leaks")
fixes the cleanup of kobjects; however, it removes kfree() calls
altogether, leading to memory leaks.

Fix those and also defer the initialization of dev->kobj_dev until
after the error check, so that we do not end up with a dangling
pointer.

Fixes: c343bf1ba5ef ("cpuidle: Fix three reference count leaks")
Signed-off-by: Anel Orazgaliyeva <anelkz@amazon.de>
Suggested-by: Aman Priyadarshi <apeureka@amazon.de>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm: fb_helper: fix CONFIG_FB dependency

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 606b102876e3741851dfb09d53f3ee57f650a52c ]

With CONFIG_FB=m and CONFIG_DRM=y, we get a link error in the fb helper:

aarch64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_alloc_fbi':
(.text+0x10cc): undefined reference to `framebuffer_alloc'

Tighten the dependency so it is only allowed in the case that DRM can
link against FB.

Fixes: f611b1e7624c ("drm: Avoid circular dependencies for CONFIG_FB")
Link: https://lore.kernel.org/all/20210721152211.2706171-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927142816.2069269-1-arnd@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: ecc - fix CRYPTO_DEFAULT_RNG dependency

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 38aa192a05f22f9778f9420e630f0322525ef12e ]

The ecc.c file started out as part of the ECDH algorithm but got
moved out into a standalone module later. It does not build without
CRYPTO_DEFAULT_RNG, so now that other modules are using it as well we
can run into this link error:

aarch64-linux-ld: ecc.c:(.text+0xfc8): undefined reference to `crypto_default_rng'
aarch64-linux-ld: ecc.c:(.text+0xff4): undefined reference to `crypto_put_default_rng'

Move the 'select CRYPTO_DEFAULT_RNG' statement into the correct symbol.

Fixes: 0d7a78643f69 ("crypto: ecrdsa - add EC-RDSA (GOST 34.10) algorithm")
Fixes: 4e6602916bc6 ("crypto: ecdsa - Add support for ECDSA signature verification")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

kprobes: Do not use local variable when creating debugfs file

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 8f7262cd66699a4b02eb7549b35c81b2116aad95 ]

debugfs_create_file() takes a pointer argument that can be used during
file operation callbacks (accessible via i_private in the inode
structure). An obvious requirement is for the pointer to refer to
valid memory when used.

When creating the debugfs file to dynamically enable / disable
kprobes, a pointer to a local variable is passed to
debugfs_create_file(); that variable goes out of scope when the init
function returns. The only reason this hasn't triggered random memory
corruption is that the pointer is not accessed during the debugfs
file callbacks.

Since the enabled state is managed by the kprobes_all_disabled global
variable, the local variable is not needed. Fix the incorrect (and
unnecessary) usage of the local variable in the debugfs_create_file()
call by passing NULL instead.

Link: https://lkml.kernel.org/r/163163031686.489837.4476867635937014973.stgit@devnote2
Fixes: bf8f6e5b3e51 ("Kprobes: The ON/OFF knob thru debugfs")
Signed-off-by: Punit Agrawal <punitagrawal@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

scs: Release kasan vmalloc poison in scs_free process

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 528a4ab45300fa6283556d9b48e26b45a8aa15c4 ]

Since scs allocation was moved to the vmalloc region, the
shadow stack is protected by kasan_poison_vmalloc().
However, the vfree_atomic() operation needs to access
the memory being freed during scs_free(), which causes the KASAN
error shown in the dump below.

This patch adds kasan_unpoison_vmalloc() before vfree_atomic(),
which matches the earlier flow that used kmem_cache.
The vmalloc region will be poisoned again by the following
vunmap() operations.

==================================================================
BUG: KASAN: vmalloc-out-of-bounds in llist_add_batch+0x60/0xd4
Write of size 8 at addr ffff8000100b9000 by task kthreadd/2

CPU: 0 PID: 2 Comm: kthreadd Not tainted 5.15.0-rc2-11681-g92477dd1faa6-dirty #1
Hardware name: linux,dummy-virt (DT)
Call trace:
  dump_backtrace+0x0/0x43c
  show_stack+0x1c/0x2c
  dump_stack_lvl+0x68/0x84
  print_address_description+0x80/0x394
  kasan_report+0x180/0x1dc
  __asan_report_store8_noabort+0x48/0x58
  llist_add_batch+0x60/0xd4
  vfree_atomic+0x60/0xe0
  scs_free+0x1dc/0x1fc
  scs_release+0xa4/0xd4
  free_task+0x30/0xe4
  __put_task_struct+0x1ec/0x2e0
  delayed_put_task_struct+0x5c/0xa0
  rcu_do_batch+0x62c/0x8a0
  rcu_core+0x60c/0xc14
  rcu_core_si+0x14/0x24
  __do_softirq+0x19c/0x68c
  irq_exit+0x118/0x2dc
  handle_domain_irq+0xcc/0x134
  gic_handle_irq+0x7c/0x1bc
  call_on_irq_stack+0x40/0x70
  do_interrupt_handler+0x78/0x9c
  el1_interrupt+0x34/0x60
  el1h_64_irq_handler+0x1c/0x2c
  el1h_64_irq+0x78/0x7c
  _raw_spin_unlock_irqrestore+0x40/0xcc
  sched_fork+0x4f0/0xb00
  copy_process+0xacc/0x3648
  kernel_clone+0x168/0x534
  kernel_thread+0x13c/0x1b0
  kthreadd+0x2bc/0x400
  ret_from_fork+0x10/0x20

Memory state around the buggy address:
  ffff8000100b8f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  ffff8000100b8f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>ffff8000100b9000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                    ^
  ffff8000100b9080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  ffff8000100b9100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================

Suggested-by: Kuan-Ying Lee <kuan-ying.lee@mediatek.com>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Will Deacon <will@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Yee Lee <yee.lee@mediatek.com>
Fixes: a2abe7cbd8fe ("scs: switch to vmapped shadow stacks")
Link: https://lore.kernel.org/r/20210930081619.30091-1-yee.lee@mediatek.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: atmel: fix the ispck initialization

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit d7f26849ed7cc875d0ff7480c2efebeeccea2bad ]

The runtime enabling of the ISPCK (internally clocks the pipeline inside
the ISC) has to be done after the pm_runtime for the ISC dev has been
started.

After the commit by Mauro:
the ISC failed to probe with the error:

atmel-sama5d2-isc f0008000.isc: failed to enable ispck: -13
atmel-sama5d2-isc: probe of f0008000.isc failed with error -13

This is because the ispck is enabled too early in the probe,
and the runtime PM request fails.
Thus, move this clock enabling to after pm_runtime_idle() is called.

The ISPCK is required only for sama5d2 type of ISC.
Thus, add a bool inside the isc struct that is platform dependent.
For the sama7g5-isc, the enabling of the ISPCK is wrong and does not make
sense. Removed it from the sama7g5 probe. In sama7g5-isc, there is only
one clock, the MCK, which also clocks the internal pipeline of the ISC.

Adapted the clk_prepare and clk_unprepare to request the runtime PM
for both clocks (MCK and ISPCK) in case of sama5d2-isc, and the single
clock (MCK) in case of sama7g5-isc.

Fixes: dd97908ee350 ("media: atmel: properly get pm_runtime")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: cx23885: Fix snd_card_free call on null card pointer

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 7266dda2f1dfe151b12ef0c14eb4d4e622fb211c ]

Currently, a call to snd_card_new() that fails leaves card set to a
NULL pointer. This causes a null pointer dereference on the error
cleanup path when card is passed to snd_card_free(). Fix this by adding
a new error exit path that does not call snd_card_free() and exiting
via this new path.
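
The shape of such an error path, as a userspace sketch. The toy_*
helpers and the label names are illustrative assumptions, not the
driver's code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical card object; not the ALSA snd_card. */
struct toy_card {
	int id;
};

int toy_free_calls;	/* counts frees so the paths can be checked */

/* Frees a card; like the real cleanup, it must not be given NULL. */
static void toy_card_free(struct toy_card *card)
{
	assert(card);
	toy_free_calls++;
	free(card);
}

/* On failure *card stays NULL and a negative value is returned. */
static int toy_card_new(struct toy_card **card, int fail)
{
	*card = fail ? NULL : calloc(1, sizeof(**card));
	return *card ? 0 : -1;
}

int toy_register(int fail_new, int fail_later)
{
	struct toy_card *card;
	int err;

	err = toy_card_new(&card, fail_new);
	if (err < 0)
		goto error_msg;		/* no card exists, skip the free */

	if (fail_later) {
		err = -2;
		goto error;
	}

	toy_card_free(card);		/* sketch only: release on success too */
	return 0;

error:
	toy_card_free(card);
error_msg:
	return err;
}
```

Only failures that happen after the card exists go through the label
that frees it.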

Addresses-Coverity: ("Explicit null dereference")

Fixes: 9e44d63246a9 ("[media] cx23885: Add ALSA support")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: tm6000: Avoid card name truncation

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 42bb98e420d454fef3614b70ea11cc59068395f6 ]

The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Trident TVMaster TM5600/6000/60".
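
A small userspace sketch of the limit. TOY_CARD_LEN and the toy_*
helpers are assumptions for illustration, with snprintf() standing in
for a bounded kernel string copy:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_CARD_LEN 32	/* 31 characters plus the terminating NUL */

/* True when name fits in the card field without truncation. */
bool toy_card_name_fits(const char *name)
{
	assert(name);
	return strlen(name) < TOY_CARD_LEN;
}

/* Copy name into card, silently truncating anything past 31 characters. */
void toy_set_card_name(char card[TOY_CARD_LEN], const char *name)
{
	assert(card && name);
	snprintf(card, TOY_CARD_LEN, "%s", name);
}
```

The truncated string quoted above is exactly 31 characters long, so it
fits; any longer description is cut at that point.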

Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: e28f49b0b2a8 ("V4L/DVB: tm6000: fix some info messages")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: si470x: Avoid card name truncation

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 2908249f3878a591f7918368fdf0b7b0a6c3158c ]

The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Silicon Labs Si470x FM Radio Re".

Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 78656acdcf48 ("V4L/DVB (7038): USB radio driver for Silicon Labs Si470x FM Radio Receivers")
Fixes: cc35bbddfe10 ("V4L/DVB (12416): radio-si470x: add i2c driver for si470x")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: radio-wl1273: Avoid card name truncation

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit dfadec236aa99f6086141949c9dc3ec50f3ff20d ]

The "card" string only holds 31 characters (and the terminating NUL).
In order to avoid truncation, use a shorter card description instead of
the current result, "Texas Instruments Wl1273 FM Rad".

Suggested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Fixes: 87d1a50ce451 ("[media] V4L2: WL1273 FM Radio: TI WL1273 FM radio driver")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: sun6i-csi: Allow the video device to be open multiple times

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 8ed852834683ebe064157e069af8dfb41cad6403 ]

Previously it was possible, but a recent fix for uninitialized
`ret` variable broke this behavior.

v4l2_fh_is_singular_file() check is there just to determine
whether the power needs to be enabled, and it's not a failure
if it returns false.

Fixes: ba9139116bc0 ("media: sun6i-csi: add a missing return code")
Signed-off-by: Ondrej Jirman <megous@megous.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: i2c: ths8200 needs V4L2_ASYNC

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e4625044d656f3c33ece0cc9da22577bc10ca5d3 ]

Fix the build errors reported by the kernel test robot by
selecting V4L2_ASYNC:

mips-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_remove':
ths8200.c:(.text+0x1ec): undefined reference to `v4l2_async_unregister_subdev'
mips-linux-ld: drivers/media/i2c/ths8200.o: in function `ths8200_probe':
ths8200.c:(.text+0x404): undefined reference to `v4l2_async_register_subdev'

Fixes: ed29f89497006 ("media: i2c: ths8200: support asynchronous probing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Lad Prabhakar <prabhakar.csengg@gmail.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: imx-jpeg: Fix the error handling path of 'mxc_jpeg_probe()'

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 5c47dc6657543b3c4dffcbe741fb693b9b96796d ]

A successful 'mxc_jpeg_attach_pm_domains()' call should be balanced by a
corresponding 'mxc_jpeg_detach_pm_domains()' call in the error handling
path of the probe, as already done in the remove function.

Update the error handling path accordingly.

Fixes: 2db16c6ed72c ("media: imx-jpeg: Add V4L2 driver for i.MX8 JPEG Encoder/Decoder")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: mtk-vpu: Fix a resource leak in the error handling path of 'mtk_vpu_probe()'

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 2143ad413c05c7be24c3a92760e367b7f6aaac92 ]

A successful 'clk_prepare()' call should be balanced by a corresponding
'clk_unprepare()' call in the error handling path of the probe, as already
done in the remove function.

Update the error handling path accordingly.

Fixes: 3003a180ef6b ("[media] VPU: mediatek: support Mediatek VPU")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Houlong Wei <houlong.wei@mediatek.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: TDA1997x: handle short reads of hdmi info frame.

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 48d219f9cc667bc6fbc3e3af0b1bfd75db94fce4 ]

Static analysis reports this representative problem

tda1997x.c:1939: warning: 7th function call argument is an uninitialized
value

The 7th argument is buffer[0], which is set in the earlier call to
io_readn(). When io_readn()'s call to io_read() fails on the first
read, buffer[0] is not set and 0 is returned and stored in len.

The size parameter in the later call to hdmi_infoframe_unpack() is the
static size of buffer, always 40, so a short read is not caught
by hdmi_infoframe_unpack()'s checking. The variable len should be
used instead.

Zero-initialize buffer so it is in a known start state.
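
To see why the size argument matters, consider this toy parser. The
toy_unpack() function and TOY_HEADER_SIZE are assumptions made for this
sketch and do not reproduce hdmi_infoframe_unpack():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HEADER_SIZE 4	/* assumed header length for this sketch */

/*
 * Toy parser: rejects input shorter than a header, otherwise returns
 * the first byte as the frame type.
 */
int toy_unpack(const uint8_t *buf, size_t size)
{
	assert(buf);
	if (size < TOY_HEADER_SIZE)
		return -1;
	return buf[0];
}
```

Passing the static buffer size lets a read of zero bytes through the
length check, while passing the number of bytes actually read makes the
parser reject it.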

Fixes: 9ac0038db9a7 ("media: i2c: Add TDA1997x HDMI receiver driver")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: mtk-vcodec: venc: fix return value when start_streaming fails

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 065a7c66bd8b21db212fa86187ff12f0cac6ea6d ]

In case vb2ops_venc_start_streaming fails, the error value
is overwritten by the ret value of pm_runtime_put which might
be 0. Fix it.

Fixes: 985c73693fe5a ("media: mtk-vcodec: Separating mtk encoder driver")
Signed-off-by: Dafna Hirschfeld <dafna.hirschfeld@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: v4l2-ioctl: S_CTRL output the right value

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit c87ed93574e3cd8346c05bd934c617596c12541b ]

If the driver does not implement s_ctrl, but it does implement
s_ext_ctrls, we convert the call.

When that happens we also have to convert the response from
s_ext_ctrls back.
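
A simplified sketch of the round trip. The toy_* types and the clamping
callback are hypothetical and much smaller than the real V4L2
structures:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified control types; not the V4L2 UAPI structs. */
struct toy_control {
	uint32_t id;
	int32_t value;
};

struct toy_ext_controls {
	uint32_t count;
	struct toy_control *controls;
};

typedef int (*toy_s_ext_ctrls_fn)(struct toy_ext_controls *ctrls);

/* Example driver callback that clamps requested values to [0, 100]. */
int toy_clamping_driver(struct toy_ext_controls *ctrls)
{
	for (uint32_t i = 0; i < ctrls->count; i++) {
		if (ctrls->controls[i].value > 100)
			ctrls->controls[i].value = 100;
		if (ctrls->controls[i].value < 0)
			ctrls->controls[i].value = 0;
	}
	return 0;
}

/* Emulate a single-control set on top of the extended-controls hook. */
int toy_s_ctrl(toy_s_ext_ctrls_fn s_ext_ctrls, struct toy_control *p)
{
	struct toy_control ctrl;
	struct toy_ext_controls ctrls = { 1, &ctrl };
	int ret;

	assert(s_ext_ctrls && p);
	ctrl = *p;
	ret = s_ext_ctrls(&ctrls);
	p->value = ctrl.value;	/* convert the response back to the caller */
	return ret;
}
```

Without the copy back, the caller would keep seeing the value it
requested instead of the one the driver applied.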

Fixes v4l2_compliance:
Control ioctls (Input 0):
fail: v4l2-test-controls.cpp(411): returned control value out of range
fail: v4l2-test-controls.cpp(507): invalid control 00980900
test VIDIOC_G/S_CTRL: FAIL

Fixes: 35ea11ff8471 ("V4L/DVB (8430): videodev: move some functions from v4l2-dev.h to v4l2-common.h or v4l2-ioctl.h")
Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: imx258: Fix getting clock frequency

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit d170b0ea1760989fe8ac053bef83e61f3bf87992 ]

Obtain the clock frequency by reading the clock-frequency property if
there's no clock.

Fixes: 9fda25332c4b ("media: i2c: imx258: get clock from device properties and enable it via runtime PM")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: dvb-usb: fix ununit-value in az6027_rc_query

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit afae4ef7d5ad913cab1316137854a36bea6268a5 ]

Syzbot reported an uninit-value bug in az6027_rc_query(). The problem
was a missing initialization of the state pointer. Since this function
does nothing, we can simply initialize state to REMOTE_NO_KEY_PRESSED.

Reported-and-tested-by: syzbot+2cd8c5db4a85f0a04142@syzkaller.appspotmail.com
Fixes: 76f9a820c867 ("V4L/DVB: AZ6027: Initial import of the driver")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: ttusb-dec: avoid release of non-acquired mutex

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 36b9d695aa6fb8e9a312db21af41f90824d16ab4 ]

ttusb_dec_send_command() invokes mutex_lock_interruptible(), which can
fail, but then it releases the non-acquired mutex. The patch fixes that.

Found by Linux Driver Verification project (linuxtesting.org).

Fixes: dba328bab4c6 ("media: ttusb-dec: cleanup an error handling logic")
Signed-off-by: Evgeny Novikov <novikov@ispras.ru>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: cxd2880-spi: Fix a null pointer dereference on error handling path

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 11b982e950d2138e90bd120501df10a439006ff8 ]

Currently the null pointer check on dvb_spi->vcc_supply is inverted and
this leads to only null values of the dvb_spi->vcc_supply being passed
to the call of regulator_disable causing null pointer dereferences.
Fix this by only calling regulator_disable if dvb_spi->vcc_supply is
not null.

Addresses-Coverity: ("Dereference after null check")

Fixes: dcb014582101 ("media: cxd2880-spi: Fix an error handling path")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: meson-ge2d: Fix rotation parameter changes detection in 'ge2d_s_ctrl()'

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 4b9e3e8af4b336eefca1f1ee535bc4b6734ed6aa ]

There is likely a typo here. To be consistent, we should compare
'fmt.height' with 'ctx->out.pix_fmt.height', not 'ctx->out.pix_fmt.width'.

Instead of fixing the test, just remove it and copy 'fmt' unconditionally.

Fixes: 59a635327ca7 ("media: meson: Add M2M driver for the Amlogic GE2D Accelerator Unit")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

media: em28xx: add missing em28xx_close_extension

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 2c98b8a3458df03abdc6945bbef67ef91d181938 ]

If an em28xx dev has a ->dev_next pointer, we need to delete the
->dev_next list node from em28xx_extension_devlist on disconnect to
avoid use-after-free and list corruption bugs, since the driver frees
this pointer on disconnect.

Reported-and-tested-by: syzbot+a6969ef522a36d3344c9@syzkaller.appspotmail.com
Fixes: 1a23f81b7dc3 ("V4L/DVB (9979): em28xx: move usb probe code to a proper place")
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

libbpf: Fix skel_internal.h to set errno on loader retval < 0

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e68ac0082787f4e8ee6ae5b19076ec7709ce715b ]

When the loader indicates an internal error (result of a checked bpf
system call), it returns the result in attr.test.retval. However, tests
that rely on ASSERT_OK_PTR on NULL (returned from light skeleton) may
miss that NULL denotes an error if errno is set to 0. This would result
in the skel pointer being NULL while ASSERT_OK_PTR returns 1, leading to
a SEGV on dereference of skel, because libbpf_get_error relies on the
assumption that errno is always set in case of error for ptr == NULL.

In particular, this was observed for the ksyms_module test. When
executed using `./test_progs -t ksyms`, prior tests manipulated errno
and the test didn't crash when it failed at ksyms_module load, while
using `./test_progs -t ksyms_module` crashed due to errno being
untouched.
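
The failure mode can be sketched in plain C, outside libbpf. Both function
names below are hypothetical stand-ins, not libbpf API: sketch_get_error()
mimics the assumption described above (NULL is only an error if errno is
nonzero), and sketch_skel_open() mimics a loader that returns NULL with or
without setting errno.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the pointer check described above: a NULL
 * pointer is reported as an error only through errno. */
int sketch_get_error(const void *ptr)
{
	if (ptr == NULL)
		return -errno;
	return 0;
}

/* Hypothetical loader. With set_errno == 0 it models the pre-fix
 * behaviour: NULL is returned but errno is left untouched. */
void *sketch_skel_open(int loader_retval, int set_errno)
{
	static int dummy_skel;	/* stands in for a loaded skeleton */

	if (loader_retval < 0) {
		if (set_errno)
			errno = -loader_retval;
		return NULL;
	}
	return &dummy_skel;
}
```

With errno previously cleared, the set_errno == 0 path yields a NULL skeleton
that the check reports as success, which is the situation the commit
describes for `./test_progs -t ksyms_module`.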

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210927145941.1383001-11-memxor@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/amdgpu: fix warning for overflow check

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 335aea75b0d95518951cad7c4c676e6f1c02c150 ]

The overflow check in amdgpu_bo_list_create() causes a warning with
clang-14 on 64-bit architectures, since the limit can never be
exceeded.

drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare]
if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list))
~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The check remains useful for 32-bit architectures, so just avoid the
warning by using size_t as the type for the count.
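
A minimal userspace sketch of the idea, assuming hypothetical header and
entry structs (their names and sizes are illustrative and are not the amdgpu
definitions). With a size_t count the comparison is meaningful on both
32-bit and 64-bit builds, so the compiler has no tautology to warn about.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the list header and one list entry. */
struct sketch_bo_list { uint64_t pad[8]; };		/* 64 bytes */
struct sketch_bo_list_entry { uint64_t pad[4]; };	/* 32 bytes */

/* Returns 1 if header + num_entries * entry fits in a size_t. */
int sketch_bo_list_size_ok(size_t num_entries)
{
	return num_entries <=
	       (SIZE_MAX - sizeof(struct sketch_bo_list)) /
	       sizeof(struct sketch_bo_list_entry);
}

/* Total allocation size; only call after the check above passed. */
size_t sketch_bo_list_size(size_t num_entries)
{
	return sizeof(struct sketch_bo_list) +
	       num_entries * sizeof(struct sketch_bo_list_entry);
}
```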

Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

arm64: mm: update max_pfn after memory hotplug

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 8fac67ca236b961b573355e203dbaf62a706a2e5 ]

After new memory blocks have been hotplugged, max_pfn and max_low_pfn
need updating to reflect the new PFNs being hot-added to the system.
Without this patch, debug-related functions that use max_pfn such as
get_max_dump_pfn() or read_page_owner() will not work with any page in
memory that is hot-added after boot.
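
As a rough userspace illustration (all names are hypothetical, a 4 KiB page
size is assumed, and the "only grow" guard is an assumption of this sketch
rather than something stated above), a bound that is not refreshed on
hot-add hides the new pages from any range check that relies on it:

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

uint64_t sketch_max_pfn;	/* stands in for max_pfn */

/* Round a physical address up to the next page frame number. */
uint64_t sketch_pfn_up(uint64_t addr)
{
	return (addr + (1ULL << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
}

/* Called after a block [start, start + size) has been hot-added. */
void sketch_hot_add(uint64_t start, uint64_t size)
{
	uint64_t end_pfn = sketch_pfn_up(start + size);

	if (end_pfn > sketch_max_pfn)
		sketch_max_pfn = end_pfn;
}

/* The kind of range check a debug helper might perform. */
int sketch_pfn_visible(uint64_t pfn)
{
	return pfn < sketch_max_pfn;
}
```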

Fixes: 4ab215061554 ("arm64: Add memory hotplug support")
Signed-off-by: Sudarshan Rajagopalan <quic_sudaraja@quicinc.com>
Signed-off-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Georgi Djakov <quic_c_gdjako@quicinc.com>
Tested-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/a51a27ee7be66024b5ce626310d673f24107bcb8.1632853776.git.quic_cgoldswo@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/ttm: stop calling tt_swapin in vm_access

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit f5d28856b89baab4232a9f841e565763fcebcdf9 ]

In commit:

commit 09ac4fcb3f255e9225967c75f5893325c116cdbe
Author: Felix Kuehling <Felix.Kuehling@amd.com>
Date: Thu Jul 13 17:01:16 2017 -0400

drm/ttm: Implement vm_operations_struct.access v2

we added the vm_access hook, where we also directly call tt_swapin for
some reason. If something is swapped-out then the ttm_tt must also be
unpopulated, and since access_kmap should also call tt_populate, if
needed, then swapping-in will already be handled there.

If anything, calling tt_swapin directly here would likely always fail
since the tt->pages won't yet be populated, or worse, since the
tt->pages array is never actually cleared in unpopulate, this might lead
to a nasty UAF.

Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath10k: sdio: Add missing BH locking around napi_schedule()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 019edd01d174ce4bb2e517dd332922514d176601 ]

On an i.MX-based board with a QCA9377 Wi-Fi chip, the following errors
are seen after launching the 'hostapd' application:

hostapd /etc/wifi.conf
Configuration file: /etc/wifi.conf
wlan0: interface state UNINITIALIZED->COUNTRY_UPDATE
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
Using interface wlan0 with hwaddr 00:1f:7b:31:04:a0 and ssid "thessid"
IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
wlan0: interface state COUNTRY_UPDATE->ENABLED
wlan0: AP-ENABLED
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
...

Fix this problem by adding the BH locking around napi_schedule(),
in the same way it was done in commit e63052a5dd3c ("mlx5e: add
add missing BH locking around napi_schdule()").

Its commit log provides the following explanation:

"It's not correct to call napi_schedule() in pure process
context. Because we use __raise_softirq_irqoff() we require
callers to be in a context which will eventually lead to
softirq handling (hardirq, bh disabled, etc.).

With code as is users will see:

NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
"

Fixes: cfee8793a74d ("ath10k: enable napi on RX path for sdio")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210824144339.2796122-1-festevam@denx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath10k: Fix missing frame timestamp for beacon/probe-resp

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e6dfbc3ba90cc2b619229be56b485f085a0a8e1c ]

When receiving a beacon or probe response, we should update the
boottime_ns field which is the timestamp the frame was received at.
(cf mac80211.h)

This fixes a scanning issue with Android since it relies on this
timestamp to determine when the AP has been seen for the last time
(via the nl80211 BSS_LAST_SEEN_BOOTTIME parameter).

Fixes: 5e3dd157d7e7 ("ath10k: mac80211 driver for Qualcomm Atheros 802.11ac CQA98xx devices")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1629811733-7927-1-git-send-email-loic.poulain@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

gve: DQO: avoid unused variable warnings

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 1e0083bd0777e4a418a6710d9ee04b979cdbe5cc ]

The use of dma_unmap_addr()/dma_unmap_len() in the driver causes
multiple warnings when these macros are defined as empty, e.g.
in an ARCH=i386 allmodconfig build:

drivers/net/ethernet/google/gve/gve_tx_dqo.c: In function 'gve_tx_add_skb_no_copy_dqo':
drivers/net/ethernet/google/gve/gve_tx_dqo.c:494:40: error: unused variable 'buf' [-Werror=unused-variable]
494 | struct gve_tx_dma_buf *buf =

This is not how the NEED_DMA_MAP_STATE macros are meant to work,
as they rely on never using local variables or a temporary structure
like gve_tx_dma_buf.

Remove the gve_tx_dma_buf definition and open-code the contents
in all places to avoid the warning. This causes some rather long
lines but otherwise ends up making the driver slightly smaller.

Fixes: a57e5de476be ("gve: DQO: Add TX path")
Link: https://lore.kernel.org/netdev/20210723231957.1113800-1-bcf@google.com/
Link: https://lore.kernel.org/netdev/20210721151100.2042139-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath11k: Fix memory leak in ath11k_qmi_driver_event_work

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 72de799aa9e3e064b35238ef053d2f0a49db055a ]

The buffer pointed to by event is not freed if the
ATH11K_FLAG_UNREGISTERING bit is set, resulting in a
memory leak. Fix it.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1

Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Baochen Qiang <bqiang@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210913180246.193388-4-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath11k: fix packet drops due to incorrect 6 GHz freq value in rx status

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9d6ae1f5cf733c0e8d7f904c501fd015c4b9f0f4 ]

The frequency in the rx status is filled incorrectly in the 6 GHz band
because the channel number received is invalid in this case, which
causes packet drops. Fix that.

Fixes: 5dcf42f8b79d ("ath11k: Use freq instead of channel number in rx path")
Signed-off-by: Pradeep Kumar Chitrapu <pradeepc@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210722102054.43419-2-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath11k: Avoid race during regd updates

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 1db2b0d0a39102238fcbf9092cefa65a710642e9 ]

Whenever ath11k boots up with a user country already set, cfg80211
notifies ath11k of this country info soon after registration, and the
notification is sent to the firmware to fetch the rules for this user
country input.

Multiple race conditions can be seen in this scenario: a new request is
either lost, as pointed out in [1], or a new regd overwrites the default
regd provided by the firmware during bootup. Note that the default regd
is used for intersection purposes and hence it should not be
overwritten.

The main reason, as pointed out by [1], is the usage of the
ATH11K_FLAG_REGISTERED flag, which is updated after completion of core
registration, whereas the reg notification from cfg80211 and the wmi
events for the corresponding request can happen much before that. Since
ATH11K_FLAG_REGISTERED is currently used to determine whether the event
containing reg rules belongs to the default regd or to a user request,
there is a possibility of the default regd getting overwritten.

Since the default reg rules will be received only once per pdev on
firmware load, the above flag based check can be replaced with a check
to see if default_regd is already set, so that we can now always update
the new_regd. Also if the new_regd is set, this will be always used to
update the reg rules for the registered phy.
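
The replacement check can be sketched as follows. The struct and function
names are hypothetical and regds are reduced to opaque string pointers; this
only illustrates "first event fills default_regd, later events fill
new_regd", not the driver's actual event handler.

```c
#include <stddef.h>

/* Hypothetical per-pdev state; regds are modelled as opaque pointers. */
struct sketch_pdev_regd {
	const char *default_regd;
	const char *new_regd;
};

/* Decide where an incoming set of reg rules goes without consulting a
 * "registered" flag: only the first event may become the default. */
void sketch_handle_reg_event(struct sketch_pdev_regd *p, const char *regd)
{
	if (p->default_regd == NULL)
		p->default_regd = regd;	/* first event after firmware load */
	else
		p->new_regd = regd;	/* any later event never overwrites it */
}
```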

[1] https://patchwork.kernel.org/project/linux-wireless/patch/1829665.1PRlr7bOQj@ripper/

Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-01460-QCAHKSWPL_SILICONZ-1
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Sriram R <srirrama@codeaurora.org>
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210721212029.142388-4-jouni@codeaurora.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

ath11k: fix some sleeping in atomic bugs

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit aadf7c81a0771b8f1c97dabca6a48bae1b387779 ]

The ath11k_dbring_bufs_replenish() and ath11k_dbring_fill_bufs()
functions take a "gfp" parameter, but since they take spinlocks, the
allocations they do have to be atomic. This causes a bug because
ath11k_dbring_buf_setup passes GFP_KERNEL for the gfp flags.

The fix is to use GFP_ATOMIC and remove the unused parameters.

Fixes: bd6478559e27 ("ath11k: Add direct buffer ring support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210812070434.GE31863@kili
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

bpf/tests: Fix error in tail call limit tests

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 18935a72eb25525b655262579e1652362a3b29bb ]

This patch fixes an error in the tail call limit test that caused the
test to fail on the x86-64 JIT. Previously, the register R0 was used to
report the total number of tail calls made. However, after a tail call
fall-through, the value of the R0 register is undefined. Now, all tail
call error path tests instead use context state to store the count.

Fixes: 874be05f525e ("bpf, tests: Add tail call test suite")
Reported-by: Paul Chaignon <paul@cilium.io>
Reported-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/bpf/20210914091842.4186267-14-johan.almbladh@anyfinetworks.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: dsa: rtl8366: Fix a bug in deleting VLANs

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit d8251b9db34a2cbc5619b610e7e8aad1d165c531 ]

We were checking that the MC (member config) was != 0
for some reason; all we need to check is that the config
has no ports, i.e. no members. Then it can be recycled.
The old check must have been some misunderstanding.

Fixes: 4ddcaf1ebb5e ("net: dsa: rtl8366: Properly clear member config")
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: dsa: rtl8366rb: Fix off-by-one bug

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 5f5f12f5d4b108399130bb5c11f07765851d9cdb ]

The max VLAN number with non-4K VLAN activated is 15, and the
range is 0..15. Not 16.

The impact should be low since we by default have 4K VLAN and
thus have 4095 VLANs to play with in this switch. There will
not be a problem unless the code is rewritten to only use
16 VLANs.
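
The boundary can be illustrated with a small range check. The constant
names and the function are hypothetical, and the 4K upper bound of 4095 is
taken from the wording above rather than from the driver source.

```c
#include <stdbool.h>

#define SKETCH_NUM_VLANS	16	/* non-4K mode: indices 0..15 */
#define SKETCH_MAX_VLAN_4K	4095	/* assumption based on the text above */

/* Returns true if the VLAN number is usable in the given mode. */
bool sketch_vlan_is_valid(unsigned int vlan, bool vlan4k_enabled)
{
	unsigned int max = SKETCH_NUM_VLANS - 1;	/* 15, not 16 */

	if (vlan4k_enabled)
		max = SKETCH_MAX_VLAN_4K;

	return vlan <= max;
}
```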

Fixes: d8652956cf37 ("net: dsa: realtek-smi: Add Realtek SMI driver")
Cc: Mauri Sandberg <sandberg@mailfence.com>
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net/mlx5: Accept devlink user input after driver initialization complete

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 64ea2d0e7263b67d8efc93fa1baace041ed36d1e ]

The change of devlink_alloc() to accept device makes sure that device
is fully initialized and device_register() does nothing except allowing
users to use that devlink instance.

Such change ensures that no user input will be usable till that point and
it eliminates the need to worry about internal locking as long as devlink_register
is called last since all accesses to the devlink are during initialization.

This change fixes the following lockdep warning.

======================================================
WARNING: possible circular locking dependency detected
5.14.0-rc2+ #27 Not tainted
------------------------------------------------------
devlink/265 is trying to acquire lock:
ffff8880133c2bc0 (&dev->intf_state_mutex){+.+.}-{3:3}, at: mlx5_unload_one+0x1e/0xa0 [mlx5_core]
but task is already holding lock:
ffffffff8362b468 (devlink_mutex){+.+.}-{3:3}, at: devlink_nl_pre_doit+0x2b/0x8d0
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:

-> #1 (devlink_mutex){+.+.}-{3:3}:
        __mutex_lock+0x149/0x1310
        devlink_register+0xe7/0x280
        mlx5_devlink_register+0x118/0x480 [mlx5_core]
        mlx5_init_one+0x34b/0x440 [mlx5_core]
        probe_one+0x480/0x6e0 [mlx5_core]
        pci_device_probe+0x2a0/0x4a0
        really_probe+0x1cb/0xba0
        __driver_probe_device+0x18f/0x470
        driver_probe_device+0x49/0x120
        __driver_attach+0x1ce/0x400
        bus_for_each_dev+0x11e/0x1a0
        bus_add_driver+0x309/0x570
        driver_register+0x20f/0x390
        0xffffffffa04a0062
        do_one_initcall+0xd5/0x400
        do_init_module+0x1c8/0x760
        load_module+0x7d9d/0xa4b0
        __do_sys_finit_module+0x118/0x1a0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (&dev->intf_state_mutex){+.+.}-{3:3}:
        __lock_acquire+0x2999/0x5a40
        lock_acquire+0x1a9/0x4a0
        __mutex_lock+0x149/0x1310
        mlx5_unload_one+0x1e/0xa0 [mlx5_core]
        mlx5_devlink_reload_down+0x185/0x2b0 [mlx5_core]
        devlink_reload+0x1f2/0x640
        devlink_nl_cmd_reload+0x6c3/0x10d0
        genl_family_rcv_msg_doit+0x1e9/0x2f0
        genl_rcv_msg+0x27f/0x4a0
        netlink_rcv_skb+0x11e/0x340
        genl_rcv+0x24/0x40
        netlink_unicast+0x433/0x700
        netlink_sendmsg+0x6fb/0xbe0
        sock_sendmsg+0xb0/0xe0
        __sys_sendto+0x192/0x240
        __x64_sys_sendto+0xdc/0x1b0
        do_syscall_64+0x3d/0x90
        entry_SYSCALL_64_after_hwframe+0x44/0xae

other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(devlink_mutex);
                                lock(&dev->intf_state_mutex);
                                lock(devlink_mutex);
   lock(&dev->intf_state_mutex);

  *** DEADLOCK ***

3 locks held by devlink/265:
  #0: ffffffff836371d0 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
  #1: ffffffff83637288 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x31a/0x4a0
  #2: ffffffff8362b468 (devlink_mutex){+.+.}-{3:3}, at: devlink_nl_pre_doit+0x2b/0x8d0

stack backtrace:
CPU: 0 PID: 265 Comm: devlink Not tainted 5.14.0-rc2+ #27
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
  dump_stack_lvl+0x45/0x59
  check_noncircular+0x268/0x310
  ? print_circular_bug+0x460/0x460
  ? __kernel_text_address+0xe/0x30
  ? alloc_chain_hlocks+0x1e6/0x5a0
  __lock_acquire+0x2999/0x5a40
  ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
  ? add_lock_to_list.constprop.0+0x6c/0x530
  lock_acquire+0x1a9/0x4a0
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? lock_release+0x6c0/0x6c0
  ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
  ? lock_is_held_type+0x98/0x110
  __mutex_lock+0x149/0x1310
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? lock_is_held_type+0x98/0x110
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  ? find_held_lock+0x2d/0x110
  ? mutex_lock_io_nested+0x1160/0x1160
  ? mlx5_lag_is_active+0x72/0x90 [mlx5_core]
  ? lock_downgrade+0x6d0/0x6d0
  ? do_raw_spin_lock+0x12e/0x270
  ? rwlock_bug.part.0+0x90/0x90
  ? mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  mlx5_unload_one+0x1e/0xa0 [mlx5_core]
  mlx5_devlink_reload_down+0x185/0x2b0 [mlx5_core]
  ? netlink_broadcast_filtered+0x308/0xac0
  ? mlx5_devlink_info_get+0x1f0/0x1f0 [mlx5_core]
  ? __build_skb_around+0x110/0x2b0
  ? __alloc_skb+0x113/0x2b0
  devlink_reload+0x1f2/0x640
  ? devlink_unregister+0x1e0/0x1e0
  ? security_capable+0x51/0x90
  devlink_nl_cmd_reload+0x6c3/0x10d0
  ? devlink_nl_cmd_get_doit+0x1e0/0x1e0
  ? devlink_nl_pre_doit+0x72/0x8d0
  genl_family_rcv_msg_doit+0x1e9/0x2f0
  ? __lock_acquire+0x15e2/0x5a40
  ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240
  ? mutex_lock_io_nested+0x1160/0x1160
  ? security_capable+0x51/0x90
  genl_rcv_msg+0x27f/0x4a0
  ? genl_get_cmd+0x3c0/0x3c0
  ? lock_acquire+0x1a9/0x4a0
  ? devlink_nl_cmd_get_doit+0x1e0/0x1e0
  ? lock_release+0x6c0/0x6c0
  netlink_rcv_skb+0x11e/0x340
  ? genl_get_cmd+0x3c0/0x3c0
  ? netlink_ack+0x930/0x930
  genl_rcv+0x24/0x40
  netlink_unicast+0x433/0x700
  ? netlink_attachskb+0x750/0x750
  ? __alloc_skb+0x113/0x2b0
  netlink_sendmsg+0x6fb/0xbe0
  ? netlink_unicast+0x700/0x700
  ? netlink_unicast+0x700/0x700
  sock_sendmsg+0xb0/0xe0
  __sys_sendto+0x192/0x240
  ? __x64_sys_getpeername+0xb0/0xb0
  ? do_sys_openat2+0x10a/0x370
  ? down_write_nested+0x150/0x150
  ? do_user_addr_fault+0x215/0xd50
  ? __x64_sys_openat+0x11f/0x1d0
  ? __x64_sys_open+0x1a0/0x1a0
  __x64_sys_sendto+0xdc/0x1b0
  ? syscall_enter_from_user_mode+0x1d/0x50
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f50b50b6b3a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
RSP: 002b:00007fff6c0d3f38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f50b50b6b3a
RDX: 0000000000000038 RSI: 000055763ac08440 RDI: 0000000000000003
RBP: 000055763ac08410 R08: 00007f50b5192200 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 000055763ac08410 R15: 000055763ac08440
mlx5_core 0000:00:09.0: firmware version: 4.8.9999
mlx5_core 0000:00:09.0: 0.000 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x255 link)
mlx5_core 0000:00:09.0 eth1: Link up

Fixes: a6f3b62386a0 ("net/mlx5: Move devlink registration before interfaces load")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

cfg80211: always free wiphy specific regdomain

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e53e9828a8d2c6545e01ff9711f1221f2fd199ce ]

In the (somewhat unlikely) event that we allocate a wiphy, then
add a regdomain to it, and then fail registration, we leak the
regdomain. Fix this by just always freeing it at the end; in the
normal cases we'll free (and NULL) it during wiphy_unregister().
This happened when the wiphy settings were bad, and since they
can be controlled by userspace with hwsim, syzbot was able to
find this issue.

Reported-by: syzbot+1638e7c770eef6b6c0d0@syzkaller.appspotmail.com
Fixes: 3e0c3ff36c4c ("cfg80211: allow multiple driver regulatory_hints()")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210927131105.68b70cef4674.I4b9f0aa08c2af28555963b9fe3d34395bb72e0cc@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

mac80211: twt: don't use potentially unaligned pointer

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 7ff379ba2d4b7b205240e666601fe302207d73f8 ]

Since we're pointing into a frame, the pointer to the
twt_agrt->req_type struct member is potentially not
aligned properly. Open-code le16p_replace_bits() to
avoid passing an unaligned pointer.
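
A userspace sketch of the open-coded form, assuming a hypothetical packed
struct and ignoring endianness conversion (the real field is little-endian).
The member is read and written by value, so no pointer to the possibly
unaligned field is ever formed. The packed attribute is a GCC/Clang
extension.

```c
#include <stdint.h>

/* Hypothetical frame layout: req_type sits at offset 1, so a pointer
 * to it would not be 2-byte aligned. */
struct sketch_twt_agrt {
	uint8_t control;
	uint16_t req_type;
} __attribute__((packed));

/* Replace the bits selected by mask with val, without taking the
 * address of the packed member. */
void sketch_replace_bits(struct sketch_twt_agrt *agrt, uint16_t mask,
			 unsigned int shift, uint16_t val)
{
	uint16_t req_type = agrt->req_type;	/* compiler handles alignment */

	req_type &= (uint16_t)~mask;
	req_type |= (uint16_t)((val << shift) & mask);
	agrt->req_type = req_type;
}
```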

Reported-by: kernel test robot <lkp@intel.com>
Fixes: f5a4c24e689f ("mac80211: introduce individual TWT support in AP mode")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210927115124.e1208694f37b.Ie3de9bcc5dde5a79e3ac81f3185beafe4d214e57@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

fortify: Fix dropped strcpy() compile-time write overflow check

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 072af0c638dc8a5c7db2edc4dddbd6d44bee3bdb ]

The implementation for intra-object overflow in str*-family functions
accidentally dropped compile-time write overflow checking in strcpy(),
leaving it entirely to run-time. Add back the intended check.

Fixes: 6a39e62abbaf ("lib: string.h: detect intra-object overflow in fortified string functions")
Cc: Daniel Axtens <dja@axtens.net>
Cc: Francis Laniel <laniel_francis@privacyrequired.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

mptcp: do not shrink snd_nxt when recovering

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 0d199e4363b482badcedba764e2aceab53a4a10a ]

When recovering after a link failure, snd_nxt should not be set to a
lower value. Otherwise, the update of snd_nxt is broken because:

msk->snd_nxt += ret; (where ret is number of bytes sent)

assumes that snd_nxt always moves forward.
After the reduction, it's possible that the snd_nxt update gets out of
sync: the dfrag we just sent might have had a data sequence number even
past recovery_snd_nxt.

This change factors the common msk state update to a helper
and updates snd_nxt based on the current dfrag data sequence number.

The conditional is required for the recovery phase where we may
re-transmit old dfrags that are before current snd_nxt.

After this change, snd_nxt only moves forward and covers all in-sequence
data that was transmitted.

recovery_snd_nxt is retained to detect when recovery has completed.
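
The update rule described above can be sketched in userspace as follows.
The struct and function names are hypothetical and only the fields needed
for the illustration are modelled; this is not the mptcp helper itself.

```c
#include <stdint.h>

/* Hypothetical, reduced versions of the data fragment and socket state. */
struct sketch_dfrag {
	uint64_t data_seq;	/* sequence number of the first byte */
	uint32_t already_sent;	/* bytes of this fragment sent so far */
};

struct sketch_msk {
	uint64_t snd_nxt;
};

/* Wrap-safe "a is after b" for 64-bit sequence numbers. */
int sketch_after64(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/* Derive snd_nxt from the fragment just pushed; never move it back. */
void sketch_update_post_push(struct sketch_msk *msk,
			     struct sketch_dfrag *dfrag, uint32_t sent)
{
	uint64_t snd_nxt_new;

	dfrag->already_sent += sent;
	snd_nxt_new = dfrag->data_seq + dfrag->already_sent;

	/* Retransmitted old fragments end before snd_nxt: skip them. */
	if (sketch_after64(snd_nxt_new, msk->snd_nxt))
		msk->snd_nxt = snd_nxt_new;
}
```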

Fixes: 1e1d9d6f119c5 ("mptcp: handle pending data on closed subflow")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

rxrpc: Fix _usecs_to_jiffies() by using usecs_to_jiffies()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit acde891c243c1ed85b19d4d5042bdf00914f5739 ]

Directly using _usecs_to_jiffies() might be unsafe: without a check
of the input, its result could be larger than MAX_JIFFY_OFFSET.
It is therefore better to use usecs_to_jiffies(), which performs
that check.
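
The difference between an unchecked and a checked conversion can be
sketched like this. Everything here is hypothetical: 32-bit values are used
and one jiffy is assumed to equal one microsecond purely so that the
overflow is easy to see. It is not the kernel's implementation.

```c
#include <stdint.h>

/* Hypothetical cap, modelled on "half the signed range minus one". */
#define SKETCH_MAX_JIFFY_OFFSET ((uint32_t)((INT32_MAX >> 1) - 1))

/* Unchecked conversion; assumes one jiffy per microsecond. */
uint32_t sketch_raw_usecs_to_jiffies(uint32_t usecs)
{
	return usecs;
}

/* Checked conversion: clamp oversized inputs to the cap first. */
uint32_t sketch_usecs_to_jiffies(uint32_t usecs)
{
	if (usecs > SKETCH_MAX_JIFFY_OFFSET)
		return SKETCH_MAX_JIFFY_OFFSET;
	return sketch_raw_usecs_to_jiffies(usecs);
}
```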

Fixes: c410bf01933e ("Fix the excessive initial retransmission timeout")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

qed: Don't ignore devlink allocation failures

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e6a54d6f221301347aaf9d83bb1f23129325c1c5 ]

devlink is a software interface that doesn't depend on any hardware
capabilities. A failure in SW means memory issues, wrong parameters,
programmer error, etc.

Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.
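
The check-and-propagate pattern can be sketched in userspace as follows.
All names are hypothetical stand-ins and the failure switches exist only to
exercise the error paths; this is not the devlink API.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a devlink instance. */
struct sketch_devlink {
	int registered;
};

struct sketch_devlink *sketch_devlink_alloc(int simulate_failure)
{
	if (simulate_failure)
		return NULL;
	return calloc(1, sizeof(struct sketch_devlink));
}

int sketch_devlink_register(struct sketch_devlink *dl, int simulate_failure)
{
	if (simulate_failure)
		return -EINVAL;
	dl->registered = 1;
	return 0;
}

/* Probe path: every status is checked and handed back to the caller. */
int sketch_probe(int alloc_fails, int register_fails)
{
	struct sketch_devlink *dl;
	int err;

	dl = sketch_devlink_alloc(alloc_fails);
	if (!dl)
		return -ENOMEM;

	err = sketch_devlink_register(dl, register_fails);
	if (err) {
		free(dl);
		return err;
	}

	free(dl);	/* sketch only: a real driver keeps it until remove */
	return 0;
}
```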

Fixes: 755f982bb1ff ("qed/qede: make devlink survive recovery")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

bnxt_en: Check devlink allocation and registration status

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e624c70e1131e145bd0510b8a700b5e2d112e377 ]

devlink is a software interface that doesn't depend on any hardware
capabilities. A failure in SW means memory issues, wrong parameters,
programmer error, etc.

Like any other such interface in the kernel, the returned status of
devlink APIs should be checked and propagated further and not ignored.

Fixes: 4ab0c6a8ffd7 ("bnxt_en: add support to enable VF-representors")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

Bluetooth: hci_h5: Fix (runtime)suspend issues on RTL8723BS HCIs

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9a9023f314873241a43b5a2b96e9c0caaa958433 ]

The recently added H5_WAKEUP_DISABLE h5->flags flag gets checked in
h5_btrtl_open(), but it gets set in h5_serdev_probe() *after*
calling hci_uart_register_device() and thus after h5_btrtl_open()
is called. Set this flag earlier.

Also, on devices where suspend/resume involves fully re-probing the HCI,
runtime-pm suspend should not be used. Make the runtime-pm setup
conditional on the H5_WAKEUP_DISABLE flag too.

This fixes the HCI being removed and then re-added every 10 seconds
because it was being reprobed as soon as it was runtime-suspended.

Fixes: 66f077dde749 ("Bluetooth: hci_h5: add WAKEUP_DISABLE flag")
Fixes: d9dd833cf6d2 ("Bluetooth: hci_h5: Add runtime suspend")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Archie Pusaka <apusaka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: qat - power up 4xxx device

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit ca605f97dae4bf070b7c584aec23c1c922e4d823 ]

After reset or boot, QAT 4xxx devices are inactive and need to be
explicitly activated.
This is done by writing the DRV_ACTIVE bit in the PM_INTERRUPT register
and polling the PM_INIT_STATE to make sure that the transaction has
completed properly.

If this is not done, the driver will fail the initialization sequence
reporting the following message:
    [   22.081193] 4xxx 0000:f7:00.0: enabling device (0140 -> 0142)
    [   22.720285] QAT: AE0 is inactive!!
    [   22.720287] QAT: failed to get device out of reset
    [   22.720288] 4xxx 0000:f7:00.0: qat_hal_clr_reset error
    [   22.720290] 4xxx 0000:f7:00.0: Failed to init the AEs
    [   22.720290] 4xxx 0000:f7:00.0: Failed to initialise Acceleration Engine
    [   22.720789] 4xxx 0000:f7:00.0: Resetting device qat_dev0
    [   22.825099] 4xxx: probe of 0000:f7:00.0 failed with error -14

The patch also temporarily disables the power management source of
interrupt, to avoid possible spurious interrupts as the power management
feature is not fully supported.

The device init function has been added to adf_dev_init(), and not in the
probe of 4xxx to make sure that the device is re-enabled in case of
reset.

Note that the error code reported by hw_data->init_device() in
adf_dev_init() has been shadowed for consistency with the other calls
in the same function.

Fixes: 8c8268166e83 ("crypto: qat - add qat_4xxx driver")
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: caam - disable pkc for non-E SoCs

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit f20311cc9c58052e0b215013046cbf390937910c ]

On newer CAAM versions, not all accelerators are disabled if the SoC is
a non-E variant. While the driver checks most of the modules for
availability, there is one - PKHA - which sticks out. On non-E variants
it is still reported as available, that is the number of instances is
non-zero, but it has limited functionality. In particular it doesn't
support encryption and decryption, but just signing and verifying. This
is indicated by a bit in the PKHA_MISC field. Take this bit into account
if we are checking for availability.
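
As a minimal sketch of the availability check described above (not the
driver's actual code): the field layout, mask values and function name
below are hypothetical, chosen only to show that a non-zero instance
count is ignored when the limiting bit is set.

```c
#include <stdint.h>

/* Hypothetical field layout; the real CAAM register definitions differ. */
#define PKHA_NUM_MASK      0x000000ffu  /* number of PKHA instances */
#define PKHA_MISC_NO_CRYPT 0x01000000u  /* set: signing/verifying only */

/* Return how many PKHA instances are usable for encryption/decryption. */
static unsigned int pkha_usable_instances(uint32_t pkha_reg)
{
    /* A non-zero count is not enough: honour the limiting bit first. */
    if (pkha_reg & PKHA_MISC_NO_CRYPT)
        return 0;
    return pkha_reg & PKHA_NUM_MASK;
}
```

With this shape, a non-E part that reports one instance plus the
limiting bit is treated the same as a part with no PKHA at all.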

This fixes the following error:
[ 8.167817] caam_jr 8020000.jr: 20000b0f: CCB: desc idx 11: : Invalid CHA selected.

Tested on an NXP LS1028A (non-E) SoC.

Fixes: d239b10d4ceb ("crypto: caam - add register map changes cf. Era 10")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stage

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 6effad8abe0ba4db3d9c58ed585127858a990f35 ]

adev->rmmio is set to NULL in amdgpu_device_unmap_mmio to prevent
access after pci_remove. However, in the SRIOV case,
amdgpu_virt_release_full_gpu still uses adev->rmmio after
amdgpu_device_unmap_mmio. This patch moves that SRIOV call earlier,
to the fini_early stage.

Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings")
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/amd/display: Pass display_pipe_params_st as const in DML

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 22667e6ec6b2ce9ca706e9061660b059725d009c ]

[Why]
This neither needs to be on the stack nor passed by value
to each function call. In fact, when building with clang
it seems to exceed Linux's default 1024 byte stack
frame limit.

[How]
We can simply pass this as a const pointer.
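
A minimal sketch of the difference, assuming a stand-in struct: the
type and function names below are hypothetical and are not the real DML
definitions. It only illustrates why a large struct passed by value
costs stack space while a const pointer does not.

```c
#include <stddef.h>

/* Hypothetical stand-in for a large parameter block (about 2 KiB). */
struct demo_pipe_params {
    double values[256];
    int pipe_count;
};

/* By value: the whole struct is copied for every call. */
static int pipe_count_by_value(struct demo_pipe_params p)
{
    return p.pipe_count;
}

/* By const pointer: only a pointer is passed, and the callee cannot
 * modify the caller's data. */
static int pipe_count_by_pointer(const struct demo_pipe_params *p)
{
    return p->pipe_count;
}
```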

This patch fixes these Coverity IDs
Addresses-Coverity-ID: 1424031: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423970: ("Big parameter passed by value")
Addresses-Coverity-ID: 1423941: ("Big parameter passed by value")
Addresses-Coverity-ID: 1451742: ("Big parameter passed by value")
Addresses-Coverity-ID: 1451887: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454146: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454152: ("Big parameter passed by value")
Addresses-Coverity-ID: 1454413: ("Big parameter passed by value")
Addresses-Coverity-ID: 1466144: ("Big parameter passed by value")
Addresses-Coverity-ID: 1487237: ("Big parameter passed by value")

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Fixes: 3fe617ccafd6 ("Enable '-Werror' by default for all kernel builds")
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Xinhui Pan <Xinhui.Pan@amd.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: llvm@lists.linux.dev
Acked-by: Christian König <christian.koenig@amd.com>
Build-tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/amdgpu: Fix crash on device remove/driver unload

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit d82e2c249c8ffaec20fa618611ea2ab4dcfd4d01 ]

Crash:
BUG: unable to handle page fault for address: 00000000000010e1
RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
Call Trace:
pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu]
amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu]
amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu]
vce_v4_0_hw_fini+0x95/0xa0 [amdgpu]
amdgpu_device_fini_hw+0x232/0x30d [amdgpu]
amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
amdgpu_pci_remove+0x27/0x40 [amdgpu]
pci_device_remove+0x3e/0xb0
device_release_driver_internal+0x103/0x1d0
device_release_driver+0x12/0x20
pci_stop_bus_device+0x79/0xa0
pci_stop_and_remove_bus_device_locked+0x1b/0x30
remove_store+0x7b/0x90
dev_attr_store+0x17/0x30
sysfs_kf_write+0x4b/0x60
kernfs_fop_write_iter+0x151/0x1e0

Why:
VCE/UVD had a dependency on the SMC block for their suspend, but the
SMC block is the first to do HW fini due to some constraints.

How:
Since the original patch was dealing with suspend issues,
move the SMC block dependency back into the suspend hooks, as
was done in V1 of the original patches.
Keep flushing idle work in both the suspend and HW fini sequences,
since it's essential in both cases.

Fixes: 859e4659273f1d ("drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend")
Fixes: bf756fb833cbe8 ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend")
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

Bluetooth: btmtkuart: fix a memleak in mtk_hci_wmt_sync

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 3e5f2d90c28f9454e421108554707620bc23269d ]

bdev->evt_skb will get freed in the normal path and one error path
of mtk_hci_wmt_sync, while the other error paths do not free it,
which may cause a memleak. This bug was flagged by a static analysis
tool; please advise.
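
The usual shape of such a fix is a single cleanup label that every
error path goes through. The following is a minimal userspace sketch
under that assumption; the helper names and the live-buffer counter are
hypothetical and do not come from the btmtkuart driver.

```c
#include <stdlib.h>

static int live_buffers;  /* buffers currently allocated and not freed */

static void *buf_alloc(void)
{
    void *p = malloc(16);

    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        live_buffers--;
        free(p);
    }
}

/* fail_step selects which (simulated) step fails; 0 means success. */
static int demo_sync(int fail_step)
{
    int err = 0;
    void *evt = buf_alloc();

    if (!evt)
        return -1;
    if (fail_step == 1) {
        err = -2;
        goto err_free;
    }
    if (fail_step == 2) {
        err = -3;
        goto err_free;
    }
err_free:
    /* Normal path and all error paths release the buffer here. */
    buf_free(evt);
    return err;
}
```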

Fixes: e0b67035a90b ("Bluetooth: mediatek: update the common setup between MT7622 and other devices")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

wilc1000: fix possible memory leak in cfg_scan_result()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 3c719fed0f3a5e95b1d164609ecc81c4191ade70 ]

When the BSS pointer holds a valid reference, it is not freed, because
the 'if' condition is wrong: 'if (!bss)' is used where 'if (bss)' was
intended.
The issue is solved by removing the unnecessary 'if' check because
cfg80211_put_bss() already performs the NULL validation.
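
A minimal sketch of the bug and the fix, assuming a toy reference
count: struct demo_bss and the helpers below are hypothetical stand-ins,
not the cfg80211 API. The only property borrowed from the text is that
the release helper is itself NULL-safe.

```c
#include <stddef.h>

/* Hypothetical reference-counted object standing in for a BSS entry. */
struct demo_bss {
    int refcount;
};

/* NULL-safe release helper. */
static void demo_put_bss(struct demo_bss *bss)
{
    if (!bss)
        return;
    bss->refcount--;
}

/* Buggy shape: the reference is dropped only when there is none. */
static void scan_result_buggy(struct demo_bss *bss)
{
    if (!bss)
        demo_put_bss(bss);
}

/* Fixed shape: call the NULL-safe helper unconditionally. */
static void scan_result_fixed(struct demo_bss *bss)
{
    demo_put_bss(bss);
}
```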

Fixes: 6cd4fa5ab691 ("staging: wilc1000: make use of cfg80211_inform_bss_frame()")
Signed-off-by: Ajay Singh <ajay.kathat@microchip.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210916164902.74629-3-ajay.kathat@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

wcn36xx: Fix Antenna Diversity Switching

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 701668d3bfa03dabc5095fc383d5315544ee5b31 ]

We have been tracking a strange bug with Antenna Diversity Switching (ADS)
on wcn3680b for a while.

ADS is configured like this:
   A. Via a firmware configuration table baked into the NV area.
       1. Defines if ADS is enabled.
       2. Defines which GPIOs are connected to which antenna enable pin.
       3. Defines which antenna/GPIO is primary and which is secondary.

   B. WCN36XX_CFG_VAL(ANTENNA_DIVERSITY, N)
      N is a bitmask of available antenna.

      Setting N to 3 indicates a bitmask of enabled antenna (1 | 2).

      Obviously then we can set N to 1 or N to 2 to fix to a particular
      antenna and disable antenna diversity.

   C. WCN36XX_CFG_VAL(ASD_PROBE_INTERVAL, XX)
      XX is the number of beacons between each antenna RSSI check.
      Setting this value to 50 means, every 50 received beacons, run the
      ADS algorithm.

   D. WCN36XX_CFG_VAL(ASD_TRIGGER_THRESHOLD, YY)
      YY is a two's complement integer which specifies the RSSI decibel
      threshold below which ADS will run.
      We default to -60db here, meaning a measured RSSI <= -60db will
      trigger an ADS probe.

   E. WCN36XX_CFG_VAL(ASD_RTT_RSSI_HYST_THRESHOLD, Z)
      Z is a hysteresis value, indicating a delta which the RSSI must
      exceed for the antenna switch to be valid.

      For example if HYST_THRESHOLD == 3 AntennaId1-RSSI == -60db and
      AntennaId-2-RSSI == -58db then firmware will not switch antenna.
      The threshold needs to be -57db or better to satisfy the criteria.

   F. A firmware feature bit also exists ANTENNA_DIVERSITY_SELECTION.
      This feature bit is used by the firmware to report if
      ANTENNA_DIVERSITY_SELECTION is supported. The host is not required to
      toggle this bit to enable or disable ADS.

ADS works like this:

    A. Every XX beacons the firmware switches to or remains on the primary
       antenna.

    B. The firmware then sends a Request-To-Send (RTS) packet to the AP.

    C. The firmware waits for a Clear-To-Send (CTS) response from the AP.

    D. The firmware then notes the received RSSI on the CTS packet.

    E. The firmware then repeats steps A-D on the secondary antenna.

    F. Subsequently if the RSSI on the measured antenna is better than
       ASD_TRIGGER_THRESHOLD + the active antenna's RSSI then the
       measured antenna becomes the active antenna.

    G. If RSSI rises past ASD_TRIGGER_THRESHOLD then ADS doesn't run at
       all even if there is a substantially better RSSI on the alternative
       antenna.
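
The switching decision can be sketched as follows. This is an
illustration only, not firmware code: the function and macro names are
hypothetical, the values are the examples quoted above (-60 trigger,
hysteresis of 3), and the comparison follows the worked example in
item E of the configuration list (the other antenna must be better by
at least the hysteresis value).

```c
#include <stdbool.h>

/* Example values from the description; names are hypothetical. */
#define DEMO_TRIGGER_THRESHOLD (-60)
#define DEMO_HYST_THRESHOLD    3

/* ADS only runs when the active antenna's RSSI is at or below the
 * trigger threshold. */
static bool ads_should_probe(int active_rssi)
{
    return active_rssi <= DEMO_TRIGGER_THRESHOLD;
}

/* Switch only if ADS runs and the measured antenna beats the active
 * one by at least the hysteresis value. */
static bool ads_should_switch(int active_rssi, int measured_rssi)
{
    if (!ads_should_probe(active_rssi))
        return false;
    return measured_rssi - active_rssi >= DEMO_HYST_THRESHOLD;
}
```

With these values, an active antenna at -60 and an alternative at -58
does not switch, while an alternative at -57 does.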

What we have been observing is that the RTS packet is being sent but the
MAC address is a byte-swapped version of the target MAC. The ADS/RTS MAC is
corrupted only when the link is encrypted, if the AP is open the RTS MAC is
correct. Similarly if we configure the firmware to an RTS/CTS sequence for
regular data - the transmitted RTS MAC is correctly formatted.

Internally the wcn36xx firmware uses the indexes in the SMD commands to
populate and extract data from specific entries in an STA lookup table. The
AP's MAC appears a number of times in different indexes within this lookup
table, so the MAC address extracted for the data-transmit RTS and the MAC
address extracted for the ADS/RTS packet are not the same STA table index.

Our analysis indicates the relevant firmware STA table index is
"bssSelfStaIdx".

There is an STA populate function responsible for formatting the MAC
address of the bssSelfStaIdx including byte-swapping the MAC address.

It's clear then that the required STA populate command did not run for
bssSelfStaIdx.

So taking a look at the sequence of SMD commands sent to the firmware we
see the following downstream when moving from an unencrypted to encrypted
BSS setup.

- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_CONFIG_STA_REQ
- WLAN_HAL_SET_STAKEY_REQ

Upstream in wcn36xx we have

- WLAN_HAL_CONFIG_BSS_REQ
- WLAN_HAL_SET_STAKEY_REQ

The solution then is to add the missing WLAN_HAL_CONFIG_STA_REQ between
WLAN_HAL_CONFIG_BSS_REQ and WLAN_HAL_SET_STAKEY_REQ.

Unsurprisingly, WLAN_HAL_CONFIG_STA_REQ is the routine responsible for
populating the STA lookup table in the firmware, and once it has run the
MAC sent by the ADS routine is in the correct byte-order.

This bug is apparent with ADS but it is also the case that any other
firmware routine that depends on the "bssSelfStaIdx" would retrieve
malformed data on an encrypted link.

Fixes: 3e977c5c523d ("wcn36xx: Define wcn3680 specific firmware parameters")
Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Benjamin Li <benl@squareup.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210909144428.2564650-2-bryan.odonoghue@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

cgroup: Make rebind_subsystems() disable v2 controllers all at once

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 7ee285395b211cad474b2b989db52666e0430daf ]

It was found that the following warning was displayed when remounting
controllers from cgroup v2 to v1:

[ 8042.997778] WARNING: CPU: 88 PID: 80682 at kernel/cgroup/cgroup.c:3130 cgroup_apply_control_disable+0x158/0x190
   :
[ 8043.091109] RIP: 0010:cgroup_apply_control_disable+0x158/0x190
[ 8043.096946] Code: ff f6 45 54 01 74 39 48 8d 7d 10 48 c7 c6 e0 46 5a a4 e8 7b 67 33 00 e9 41 ff ff ff 49 8b 84 24 e8 01 00 00 0f b7 40 08 eb 95 <0f> 0b e9 5f ff ff ff 48 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3
[ 8043.115692] RSP: 0018:ffffba8a47c23d28 EFLAGS: 00010202
[ 8043.120916] RAX: 0000000000000036 RBX: ffffffffa624ce40 RCX: 000000000000181a
[ 8043.128047] RDX: ffffffffa63c43e0 RSI: ffffffffa63c43e0 RDI: ffff9d7284ee1000
[ 8043.135180] RBP: ffff9d72874c5800 R08: ffffffffa624b090 R09: 0000000000000004
[ 8043.142314] R10: ffffffffa624b080 R11: 0000000000002000 R12: ffff9d7284ee1000
[ 8043.149447] R13: ffff9d7284ee1000 R14: ffffffffa624ce70 R15: ffffffffa6269e20
[ 8043.156576] FS:  00007f7747cff740(0000) GS:ffff9d7a5fc00000(0000) knlGS:0000000000000000
[ 8043.164663] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8043.170409] CR2: 00007f7747e96680 CR3: 0000000887d60001 CR4: 00000000007706e0
[ 8043.177539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8043.184673] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8043.191804] PKRU: 55555554
[ 8043.194517] Call Trace:
[ 8043.196970]  rebind_subsystems+0x18c/0x470
[ 8043.201070]  cgroup_setup_root+0x16c/0x2f0
[ 8043.205177]  cgroup1_root_to_use+0x204/0x2a0
[ 8043.209456]  cgroup1_get_tree+0x3e/0x120
[ 8043.213384]  vfs_get_tree+0x22/0xb0
[ 8043.216883]  do_new_mount+0x176/0x2d0
[ 8043.220550]  __x64_sys_mount+0x103/0x140
[ 8043.224474]  do_syscall_64+0x38/0x90
[ 8043.228063]  entry_SYSCALL_64_after_hwframe+0x44/0xae

It was caused by the fact that rebind_subsystems() disables
controllers to be rebound one by one. If more than one of the disabled
controllers is originally from the default hierarchy, it means that
cgroup_apply_control_disable() will be called multiple times for the
same default hierarchy. A controller may be killed by css_kill() in
the first round. In the second round, the killed controller may not be
completely dead yet, leading to the warning.

To avoid this problem, we collect all the ssid's of controllers that
need to be disabled from the default hierarchy and then disable them
in one go instead of one by one.
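
A minimal sketch of the "collect, then disable once" idea, assuming for
simplicity that every selected controller comes from the default
hierarchy. All names below are hypothetical; the counter only records
how many disable passes are made over the hierarchy.

```c
#include <stdint.h>

#define DEMO_NR_SUBSYS 16  /* hypothetical number of controllers */

static int disable_passes;      /* passes over the default hierarchy */
static uint32_t disabled_mask;  /* controllers disabled so far */

/* Stand-in for one disable pass over the default hierarchy. */
static void demo_apply_control_disable(uint32_t mask)
{
    disable_passes++;
    disabled_mask |= mask;
}

/* One by one: one pass per controller. */
static void demo_rebind_one_by_one(uint32_t ss_mask)
{
    for (int ssid = 0; ssid < DEMO_NR_SUBSYS; ssid++)
        if (ss_mask & (1u << ssid))
            demo_apply_control_disable(1u << ssid);
}

/* All at once: collect the ids first, then make a single pass. */
static void demo_rebind_all_at_once(uint32_t ss_mask)
{
    uint32_t dfl_disable_mask = 0;

    for (int ssid = 0; ssid < DEMO_NR_SUBSYS; ssid++)
        if (ss_mask & (1u << ssid))
            dfl_disable_mask |= 1u << ssid;

    if (dfl_disable_mask)
        demo_apply_control_disable(dfl_disable_mask);
}
```

Both variants end with the same set of disabled controllers; only the
number of passes over the hierarchy differs.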

Fixes: 334c3679ec4b ("cgroup: reimplement rebind_subsystems() using cgroup_apply_control() and friends")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

spi: Fixed division by zero warning

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 09134c5322df9f105d9ed324051872d5d0e162aa ]

The division by zero occurs because the dummy bus width is zero.
However, if the dummy nbytes is zero, there is no dummy data to
transfer, so the calculation can be skipped.
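
A minimal sketch of such a guard, assuming a simplified dummy-phase
description: the struct and function below are hypothetical and only
model the "return early when there are no dummy bytes" idea, not the
driver's full calculation.

```c
/* Hypothetical, simplified description of a dummy phase. */
struct demo_dummy {
    unsigned int nbytes;    /* dummy bytes; 0 means no dummy phase */
    unsigned int buswidth;  /* lines used; may be 0 when nbytes is 0 */
};

/* Number of dummy clock cycles; never divides when nbytes is 0. */
static unsigned int demo_calc_dummy(const struct demo_dummy *d)
{
    if (!d->nbytes)
        return 0;
    return d->nbytes * (8 / d->buswidth);
}
```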

Fixes: 7512eaf54190 ("spi: cadence-quadspi: Fix dummy cycle calculation when buswidth > 1")
Signed-off-by: Yoshitaka Ikeda <ikeda@nskint.co.jp>
Acked-by: Pratyush Yadav <p.yadav@ti.com>
Link: https://lore.kernel.org/r/OSZPR01MB70049C8F56ED8902852DF97B8BD49@OSZPR01MB7004.jpnprd01.prod.outlook.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm: bridge: it66121: Fix return value it66121_probe

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit f3bc07eba481942a246926c5b934199e7ccd567b ]

Currently it66121_probe returns -EPROBE_DEFER if there is no remote
endpoint found in the device tree, which doesn't seem helpful, since this
is not going to change later and it is never checked if the next bridge
has been initialized yet. It will fail in that case later while doing
drm_bridge_attach for the next bridge in it66121_bridge_attach.

Since the bindings documentation for the it66121 bridge driver states
there has to be a remote endpoint defined, it's safe to return -EINVAL
in that case.
This additionally adds a check that the remote endpoint is enabled and
returns -EPROBE_DEFER if the remote bridge hasn't been initialized
(yet).

Fixes: 988156dc2fc9 ("drm: bridge: add it66121 driver")
Signed-off-by: Alex Bee <knaerzche@gmail.com>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210918140420.231346-1-knaerzche@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: phylink: don't call netif_carrier_off() with NULL netdev

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit cbcca2e3961eac736566ac13ef0d0bf6f0b764ec ]

Dan Carpenter points out that we have a code path that permits a NULL
netdev pointer to be passed to netif_carrier_off(), which will cause
a kernel oops. In any case, we need to set pl->old_link_state to false
to have the desired effect when there is no netdev present.
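
A minimal sketch of the guarded path, using toy types: struct
demo_netdev, struct demo_link and the helpers are hypothetical
stand-ins rather than the phylink structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a net device and a link instance. */
struct demo_netdev {
    bool carrier_ok;
};

struct demo_link {
    struct demo_netdev *netdev;  /* may be NULL */
    bool old_link_state;
};

/* Must not be called with a NULL device. */
static void demo_carrier_off(struct demo_netdev *dev)
{
    dev->carrier_ok = false;
}

/* Take the link down, with or without a net device attached. */
static void demo_link_suspend(struct demo_link *pl)
{
    if (pl->netdev)
        demo_carrier_off(pl->netdev);
    else
        pl->old_link_state = false;
}
```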

Fixes: f97493657c63 ("net: phylink: add suspend/resume support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net: net_namespace: Fix undefined member in key_remove_domain()

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit aed0826b0cf2e488900ab92193893e803d65c070 ]

The key_domain member in struct net only exists if CONFIG_KEYS is
defined, so code that uses key_domain should be guarded by the same
define.
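
A minimal sketch of that kind of guard, assuming a toy struct: the
struct and function names are hypothetical. CONFIG_KEYS is left
undefined here to model a build without key support, in which case the
member does not exist and the guarded code compiles away.

```c
#include <stddef.h>

/* CONFIG_KEYS is intentionally not defined in this sketch. */

struct demo_net {
    int id;
#ifdef CONFIG_KEYS
    void *key_domain;  /* only present when CONFIG_KEYS is defined */
#endif
};

/* Returns 1 if a key domain was cleared, 0 if the member is absent. */
static int demo_cleanup_net(struct demo_net *net)
{
#ifdef CONFIG_KEYS
    net->key_domain = NULL;
    return 1;
#else
    (void)net;
    return 0;
#endif
}
```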

Fixes: 9b242610514f ("keys: Network namespace domain tag")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

lockdep: Let lock_is_held_type() detect recursive read as read

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 2507003a1d10917c9158077bf6030719d02c941e ]

lock_is_held_type(, 1) detects acquired read locks. It only recognized
locks acquired with lock_acquire_shared(). Read locks acquired with
lock_acquire_shared_recursive() are not recognized because a `2' is
stored as the read value.

Rework the check to recognise a lock's read value of either one or two
as a read-held lock.
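
A minimal sketch of the two comparisons, outside of lockdep. The
function names are hypothetical, and the encoding is an assumption for
illustration: a held lock stores 0 for write, 1 for read and 2 for
recursive read, while a query passes -1 for "any", 0 for write or 1
for read.

```c
#include <stdbool.h>

/* Exact comparison: a recursive read (2) never matches a read
 * query (1). */
static bool held_matches_exact(int held_read, int query)
{
    if (query == -1)
        return true;
    return held_read == query;
}

/* Normalised comparison: held values 1 and 2 both count as read. */
static bool held_matches_any_read(int held_read, int query)
{
    if (query == -1)
        return true;
    return !!held_read == query;
}
```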

Fixes: e918188611f07 ("locking: More accurate annotations for read_lock()")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Waiman Long <longman@redhat.com>
Link: https://lkml.kernel.org/r/20210903084001.lblecrvz4esl4mrr@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

virtio-gpu: fix possible memory allocation failure

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 5bd4f20de8acad37dbb3154feb34dbc36d506c02 ]

When kmem_cache_zalloc in virtio_gpu_get_vbuf fails, it will return
an error code. But none of its callers checks this error code, and
a core dump will take place.

Since many of its callers can't handle such an error, add the
__GFP_NOFAIL flag when calling kmem_cache_zalloc to make sure
it won't fail, and delete the now-unused error handling.

Fixes: dc5698e80cf724 ("Add virtio gpu driver.")
Signed-off-by: Yuntao Liu <liuyuntao10@huawei.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20210828104321.3410312-1-liuyuntao10@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

crypto: sm4 - Do not change section of ck and sbox

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 4a7e1e5fc294687a8941fa3eeb4a7e8539ca5e2f ]

When building with clang and GNU as, there is a warning about ignored
changed section attributes:

/tmp/sm4-c916c8.s: Assembler messages:
/tmp/sm4-c916c8.s:677: Warning: ignoring changed section attributes for
.data..cacheline_aligned

"static const" places the data in .rodata but __cacheline_aligned has
the section attribute to place it in .data..cacheline_aligned, in
addition to the aligned attribute.

To keep the alignment but avoid attempting to change sections, use the
____cacheline_aligned attribute, which is just the aligned attribute.
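
A minimal sketch of an alignment-only attribute, using GCC/Clang
attribute syntax. The macro name, the table and the 64-byte line size
are assumptions for illustration; they are not the kernel's
definitions.

```c
#include <stdint.h>

#define DEMO_CACHELINE_BYTES 64  /* assumed cache line size */

/* Alignment only: no section attribute, so a "static const" object
 * stays in the section its qualifiers imply. */
#define demo_cacheline_aligned \
    __attribute__((__aligned__(DEMO_CACHELINE_BYTES)))

static const uint32_t demo_table[4] demo_cacheline_aligned = {
    1, 2, 3, 4
};
```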

Fixes: 2b31277af577 ("crypto: sm4 - create SM4 library based on sm4 generic code")
Link: https://github.com/ClangBuiltLinux/linux/issues/1441
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

drm/v3d: fix wait for TMU write combiner flush

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e4f868191138975f2fdf2f37c11318b47db4acc9 ]

The hardware sets the TMUWCF bit back to 0 when the TMU write
combiner flush completes so we should be checking for that instead
of the L2TFLS bit.
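
A minimal sketch of "set the flush bit, then wait for the hardware to
clear that same bit", using a simulated register. Everything here is
hypothetical: the bit position, the fake register model and the poll
limit are not taken from the v3d driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLUSH_BIT (1u << 0)  /* hypothetical bit position */

/* Simulated register: the "hardware" clears the flush bit on the
 * read at which reads_until_clear reaches zero; 0 means never. */
struct demo_reg {
    uint32_t value;
    int reads_until_clear;
};

static uint32_t demo_reg_read(struct demo_reg *r)
{
    if (r->reads_until_clear > 0 && --r->reads_until_clear == 0)
        r->value &= ~DEMO_FLUSH_BIT;
    return r->value;
}

/* Request a flush, then poll the same bit until it reads back as 0.
 * Returns false on timeout. */
static bool demo_flush_and_wait(struct demo_reg *r, int max_polls)
{
    r->value |= DEMO_FLUSH_BIT;
    for (int i = 0; i < max_polls; i++)
        if (!(demo_reg_read(r) & DEMO_FLUSH_BIT))
            return true;
    return false;
}
```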

v2 (Melissa Wen):
- Add Signed-off-by and Fixes tags.
- Change the error message for the timeout to be more clear.

Fixes spurious Vulkan CTS failures in:
dEQP-VK.binding_model.descriptorset_random.*

Fixes: d223f98f02099 ("drm/v3d: Add support for compute shader dispatch.")
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210915100507.3945-1-itoral@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

net/mlx5: Publish and unpublish all devlink parameters at once

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit e9310aed8e6a5003abb2aa6b9229d2fb9ceb9e85 ]

The devlink parameters were published in two steps despite being static
and known in advance.

The first step was to use devlink_params_publish(), which iterated over
all parameters known up to that point and sent notification messages.
In the second step, devlink_param_publish() looped over the same
parameter list and sent notifications for new parameters.

In order to simplify the API, call devlink_params_publish() once all
parameters have been added, which avoids iterating over the parameter
list again.

As a side effect, this change fixes the error unwind flow in which
parameters were not marked as unpublished.

Fixes: 82e6c96f04e1 ("net/mlx5: Register to devlink ingress VLAN filter trap")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

objtool: Handle __sanitize_cov*() tail calls

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit f56dae88a81fded66adf2bea9922d1d98d1da14f ]

Turns out the compilers also generate tail calls to __sanitize_cov*(),
make sure to also patch those out in noinstr code.

Fixes: 0f1441b44e82 ("objtool: Fix noinstr vs KCOV")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Link: https://lore.kernel.org/r/20210624095147.818783799@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

x86/xen: Mark cpu_bringup_and_idle() as dead_end_function

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 9af9dcf11bda3e2c0e24c1acaacb8685ad974e93 ]

The asm_cpu_bringup_and_idle() function is required to push the return
value on the stack in order to make ORC happy, but the only reason
objtool doesn't complain is because of a happy accident.

The thing is that asm_cpu_bringup_and_idle() doesn't return, so
validate_branch() never terminates and falls through to the next
function, which in the normal case is the hypercall_page. And that, as
it happens, is 4095 NOPs and a RET.

Make asm_cpu_bringup_and_idle() terminate on its own, by marking the
function it calls as a dead end. This way we no longer rely on what
code happens to come after.

Fixes: c3881eb58d56 ("x86/xen: Make the secondary CPU idle tasks reliable")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lore.kernel.org/r/20210624095147.693801717@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>

MIPS: lantiq: dma: fix burst length for DEU

BugLink: https://bugs.launchpad.net/bugs/1951822
[ Upstream commit 5ad74d39c51dd41b3c819f4f5396655f0629b4fd ]

The current definition of 2W burst length is invalid.
This patch fixes it. Current downstream DEU driver doesn't
use DMA. An incorrect burst length value doesn't cause any
errors. This patch also adds other burst length values.

Fixes: dfec1a827d2b ("MIPS: Lantiq: Add DMA support")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>