git.proxmox.com Git - mirror_ubuntu-kernels.git/log

KVM: Add GDS_NO support to KVM

Gather Data Sampling (GDS) is a transient execution attack using
gather instructions from the AVX2 and AVX512 extensions. This attack
allows malicious code to infer data that was previously stored in
vector registers. Systems that are not vulnerable to GDS will set the
GDS_NO bit of the IA32_ARCH_CAPABILITIES MSR. This is useful for VM
guests that may think they are on vulnerable systems that are, in
fact, not affected. Guests that are running on affected hosts where
the mitigation is enabled are protected as if they were running
on an unaffected system.

On all hosts that are not affected or that are mitigated, set the
GDS_NO bit.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
(cherry picked from commit 81ac7e5d741742d650b4ed6186c4826c1a0631a7)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/speculation: Add Kconfig option for GDS

Gather Data Sampling (GDS) is mitigated in microcode. However, on
systems that haven't received the updated microcode, disabling AVX
can act as a mitigation. Add a Kconfig option that uses the microcode
mitigation if available and disables AVX otherwise. Setting this
option has no effect on systems not affected by GDS. This is the
equivalent of setting gather_data_sampling=force.

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
(cherry picked from commit 53cf5797f114ba2bd86d23a862302119848eff19)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/speculation: Add force option to GDS mitigation

The Gather Data Sampling (GDS) vulnerability allows malicious software
to infer stale data previously stored in vector registers. This may
include sensitive data such as cryptographic keys. GDS is mitigated in
microcode, and systems with up-to-date microcode are protected by
default. However, any affected system that is running with older
microcode will still be vulnerable to GDS attacks.

Since the gather instructions used by the attacker are part of the
AVX2 and AVX512 extensions, disabling these extensions prevents gather
instructions from being executed, thereby mitigating the system from
GDS. Disabling AVX2 is sufficient, but we don't have the granularity
to do this. The XCR0[2] disables AVX, with no option to just disable
AVX2.

Add a kernel parameter gather_data_sampling=force that will enable the
microcode mitigation if available, otherwise it will disable AVX on
affected systems.

This option will be ignored if cmdline mitigations=off.

This is a *big* hammer. It is known to break buggy userspace that
uses incomplete, buggy AVX enumeration. Unfortunately, such userspace
does exist in the wild:

https://www.mail-archive.com/bug-coreutils@gnu.org/msg33046.html

[ dhansen: add some more ominous warnings about disabling AVX ]

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
(cherry picked from commit 553a5c03e90a6087e88f8ff878335ef0621536fb)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/speculation: Add Gather Data Sampling mitigation

Gather Data Sampling (GDS) is a hardware vulnerability which allows
unprivileged speculative access to data which was previously stored in
vector registers.

Intel processors that support AVX2 and AVX512 have gather instructions
that fetch non-contiguous data elements from memory. On vulnerable
hardware, when a gather instruction is transiently executed and
encounters a fault, stale data from architectural or internal vector
registers may get transiently stored to the destination vector
register allowing an attacker to infer the stale data using typical
side channel techniques like cache timing attacks.

This mitigation is different from many earlier ones for two reasons.
First, it is enabled by default and a bit must be set to *DISABLE* it.
This is the opposite of normal mitigation polarity. This means GDS can
be mitigated simply by updating microcode and leaving the new control
bit alone.

Second, GDS has a "lock" bit. This lock bit is there because the
mitigation affects the hardware security features KeyLocker and SGX.
It needs to be enabled and *STAY* enabled for these features to be
mitigated against GDS.

The mitigation is enabled in the microcode by default. Disable it by
setting gather_data_sampling=off or by disabling all mitigations with
mitigations=off. The mitigation status can be checked by reading:

/sys/devices/system/cpu/vulnerabilities/gather_data_sampling

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
(cherry picked from commit 8974eb588283b7d44a7c91fa09fcbaf380339f3a)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/xen: Fix secondary processors' FPU initialization

Moving the call of fpu__init_cpu() from cpu_init() to start_secondary()
broke Xen PV guests, as those don't call start_secondary() for APs.

Call fpu__init_cpu() in Xen's cpu_bringup(), which is the Xen PV
replacement of start_secondary().

Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230703130032.22916-1-jgross@suse.com
(cherry picked from commit fe3e0a13e597c1c8617814bf9b42ab732db5c26e)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build

Moving mem_encrypt_init() broke the AMD_MEM_ENCRYPT=n because the
declaration of that function was under #ifdef CONFIG_AMD_MEM_ENCRYPT and
the obvious placement for the inline stub was the #else path.

This is a leftover of commit 20f07a044a76 ("x86/sev: Move common memory
encryption code to mem_encrypt.c") which made mem_encrypt_init() depend on
X86_MEM_ENCRYPT without moving the prototype. That did not fail back then
because there was no stub inline as the core init code had a weak function.

Move both the declaration and the stub out of the CONFIG_AMD_MEM_ENCRYPT
section and guard it with CONFIG_X86_MEM_ENCRYPT.

Fixes: 439e17576eb4 ("init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Closes: https://lore.kernel.org/oe-kbuild-all/202306170247.eQtCJPE8-lkp@intel.com/
(cherry picked from commit 0a9567ac5e6a40cdd9c8cd15b19a62a15250f450)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/fpu: Move FPU initialization into arch_cpu_finalize_init()

Initializing the FPU during the early boot process is a pointless
exercise. Early boot is convoluted and fragile enough.

Nothing requires that the FPU is set up early. It has to be initialized
before fork_init() because the task_struct size depends on the FPU register
buffer size.

Move the initialization to arch_cpu_finalize_init() which is the perfect
place to do so.

No functional change.

This allows to remove quite some of the custom early command line parsing,
but that's subject to the next installment.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.902376621@linutronix.de
(cherry picked from commit b81fac906a8f9e682e513ddd95697ec7a20878d4)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/fpu: Mark init functions __init

No point in keeping them around.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.841685728@linutronix.de
(cherry picked from commit 1703db2b90c91b2eb2d699519fc505fe431dde0e)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/fpu: Remove cpuinfo argument from init functions

Nothing in the call chain requires it

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.783704297@linutronix.de
(cherry picked from commit 1f34bb2a24643e0087652d81078e4f616562738d)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/init: Initialize signal frame size late

No point in doing this during really early boot. Move it to an early
initcall so that it is set up before possible user mode helpers are started
during device initialization.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.727330699@linutronix.de
(cherry picked from commit 54d9a91a3d6713d1332e93be13b4eaf0fa54349d)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()

Invoke the X86ism mem_encrypt_init() from X86 arch_cpu_finalize_init() and
remove the weak fallback from the core code.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.670360645@linutronix.de
(backported from commit 439e17576eb47f26b78c5bbc72e344d4206d2327)
[cascardo: really remove mem_encrypt_init from init/main.c]
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

init: Invoke arch_cpu_finalize_init() earlier

X86 is reworking the boot process so that initializations which are not
required during early boot can be moved into the late boot process and out
of the fragile and restricted initial boot phase.

arch_cpu_finalize_init() is the obvious place to do such initializations,
but arch_cpu_finalize_init() is invoked too late in start_kernel() e.g. for
initializing the FPU completely. fork_init() requires that the FPU is
initialized as the size of task_struct on X86 depends on the size of the
required FPU register buffer.

Fortunately none of the init calls between calibrate_delay() and
arch_cpu_finalize_init() is relevant for the functionality of
arch_cpu_finalize_init().

Invoke it right after calibrate_delay() where everything which is relevant
for arch_cpu_finalize_init() has been set up already.

No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://lore.kernel.org/r/20230613224545.612182854@linutronix.de
(backported from commit 9df9d2f0471b4c4702670380b8d8a45b40b23a7d)
[cascardo: fixed conflict due to call to mem_encrypt_init]
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

init: Remove check_bugs() leftovers

Everything is converted over to arch_cpu_finalize_init(). Remove the
check_bugs() leftovers including the empty stubs in asm-generic, alpha,
parisc, powerpc and xtensa.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Link: https://lore.kernel.org/r/20230613224545.553215951@linutronix.de
(cherry picked from commit 61235b24b9cb37c13fcad5b9596d59a1afdcec30)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

um/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Richard Weinberger <richard@nod.at>
Link: https://lore.kernel.org/r/20230613224545.493148694@linutronix.de
(cherry picked from commit 9349b5cd0908f8afe95529fc7a8cbb1417df9b0c)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

sparc/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://lore.kernel.org/r/20230613224545.431995857@linutronix.de
(cherry picked from commit 44ade508e3bfac45ae97864587de29eb1a881ec0)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

sh/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.371697797@linutronix.de
(cherry picked from commit 01eb454e9bfe593f320ecbc9aaec60bf87cd453d)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

mips/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.312438573@linutronix.de
(backported from commit 7f066a22fe353a827a402ee2835e81f045b1574d)
[cascardo: only removed check_bugs from arch/mips/include/asm/bugs.h]
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

m68k/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230613224545.254342916@linutronix.de
(cherry picked from commit 9ceecc2589b9d7cef6b321339ed8de484eac4b20)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ia64/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.137045745@linutronix.de
(cherry picked from commit 6c38e3005621800263f117fb00d6787a76e16de7)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ARM: cpu: Switch to arch_cpu_finalize_init()

check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.078124882@linutronix.de
(cherry picked from commit ee31bb0524a2e7c99b03f50249a411cc1eaa411f)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/cpu: Switch to arch_cpu_finalize_init()

check_bugs() is a dumping ground for finalizing the CPU bringup. Only parts of
it has to do with actual CPU bugs.

Split it apart into arch_cpu_finalize_init() and cpu_select_mitigations().

Fixup the bogus 32bit comments while at it.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230613224545.019583869@linutronix.de
(cherry picked from commit 7c7077a72674402654f3291354720cd73cdf649e)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

init: Provide arch_cpu_finalize_init()

check_bugs() has become a dumping ground for all sorts of activities to
finalize the CPU initialization before running the rest of the init code.

Most are empty, a few do actual bug checks, some do alternative patching
and some cobble a CPU advertisement string together....

Aside of that the current implementation requires duplicated function
declaration and mostly empty header files for them.

Provide a new function arch_cpu_finalize_init(). Provide a generic
declaration if CONFIG_ARCH_HAS_CPU_FINALIZE_INIT is selected and a stub
inline otherwise.

This requires a temporary #ifdef in start_kernel() which will be removed
along with check_bugs() once the architectures are converted over.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224544.957805717@linutronix.de
(cherry picked from commit 7725acaa4f0c04fbefb0e0d342635b967bb7d414)
CVE-2022-40982
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR

On error when building the rule, the immediate expression unbinds the
chain, hence objects can be deactivated by the transaction records.

Otherwise, it is possible to trigger the following warning:

WARNING: CPU: 3 PID: 915 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
CPU: 3 PID: 915 Comm: chain-bind-err- Not tainted 6.1.39 #1
RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]

Fixes: 4bedf9eee016 ("netfilter: nf_tables: fix chain binding transaction logic")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
CVE-2023-4015
(cherry picked from commit 0a771f7b266b02d262900c75f1e175c7fe76fec2)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: unbind non-anonymous set if rule construction fails

Otherwise a dangling reference to a rule object that is gone remains
in the set binding list.

Fixes: 26b5a5712eb8 ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2023-4015
(cherry picked from commit 3e70489721b6c870252c9082c496703677240f53)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain

Add a new state to deal with rule expressions deactivation from the
newrule error path, otherwise the anonymous set remains in the list in
inactive state for the next generation. Mark the set/chain transaction
as unbound so the abort path releases this object, set it as inactive in
the next generation so it is not reachable anymore from this transaction
and reference counter is dropped.

Fixes: 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2023-4015
(cherry picked from commit 26b5a5712eb85e253724e56a54c17f8519bd8e4e)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID

Bail out with EOPNOTSUPP when adding rule to bound chain via
NFTA_RULE_CHAIN_ID. The following warning splat is shown when
adding a rule to a deleted bound chain:

WARNING: CPU: 2 PID: 13692 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
CPU: 2 PID: 13692 Comm: chain-bound-rul Not tainted 6.1.39 #1
RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
CVE-2023-3995
(cherry picked from commit 0ebc1064e4874d5987722a2ddbc18f94aa53b211)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: skip bound chain on rule flush

Skip bound chain when flushing table rules, the rule that owns this
chain releases these objects.

Otherwise, the following warning is triggered:

  WARNING: CPU: 2 PID: 1217 at net/netfilter/nf_tables_api.c:2013 nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]
  CPU: 2 PID: 1217 Comm: chain-flush Not tainted 6.1.39 #1
  RIP: 0010:nf_tables_chain_destroy+0x1f7/0x210 [nf_tables]

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Reported-by: Kevin Rich <kevinrich1337@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
CVE-2023-3777
(cherry picked from commit 6eaf41e87a223ae6f8e7a28d6e78384ad7e407f8)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net/sched: cls_u32: Fix reference counter leak leading to overflow

In the event of a failure in tcf_change_indev(), u32_set_parms() will
immediately return without decrementing the recently incremented
reference counter. If this happens enough times, the counter will
rollover and the reference freed, leading to a double free which can be
used to do 'bad things'.

In order to prevent this, move the point of possible failure above the
point where the reference counter is incremented. Also save any
meaningful return values to be applied to the return data at the
appropriate point in time.

This issue was caught with KASAN.

Fixes: 705c7091262d ("net: sched: cls_u32: no need to call tcf_exts_change for newly allocated struct")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2023-3609
(cherry picked from commit 04c55383fa5689357bcdd2c8036725a55ed632bc)
Signed-off-by: Yuxuan Luo <yuxuan.luo@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ALSA: hda: cs35l41: Ensure amp is only unmuted during playback

BugLink: https://bugs.launchpad.net/bugs/2029199
Currently we only mute after playback has finished, and unmute
prior to setting global enable. To prevent any possible pops
and clicks, mute at probe, and then only unmute after global
enable is set.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-12-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 2d816d4f92086ca0a7b79198794012076b4cab0b linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Add device_link between HDA and cs35l41_hda

BugLink: https://bugs.launchpad.net/bugs/2029199
To ensure consistency between the HDA core and the CS35L41 HDA
driver, add a device_link between them. This ensures that the
HDA core will suspend first, and resume second, meaning the
amp driver will not miss any events from the playback hook from
the HDA core during system suspend and resume.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-11-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 7cf5ce66dfda2be444ea668c3d48f732ba4a7fd1 linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Rework System Suspend to ensure correct call separation

BugLink: https://bugs.launchpad.net/bugs/2029199
In order to correctly pause audio on suspend, amps using external boost
require parts of the pause sequence to be called for all amps before moving
on to the next steps.
For example, as part of pausing the audio, the VSPK GPIO must be disabled,
but since this GPIO is controlled by one amp, but controls the boost for
all amps, it is required to separate the calls.
During playback this is achieved by using the pre and post playback hooks,
however during system suspend, this is not possible, so to separate the
calls, we use both the .prepare and .suspend calls to pause the audio.

Currently, for this reason, we do not restart audio on system resume.
However, we can support this by relying on the playback hook to resume
playback after system suspend.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-10-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit c4d0510b81c430249ee69c0554bb7ce4fa3d539e linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Use pre and post playback hooks

BugLink: https://bugs.launchpad.net/bugs/2029199
Use new hooks to ensure separation between play/pause actions,
as required by external boost.

External Boost on CS35L41 requires the amp to go through a
particular sequence of steps. One of these steps involes
the setting of a GPIO. This GPIO is connected to one or
more of the amps, and it may control the boost for all of
the amps. To ensure that the GPIO is set when it is safe
to do so, and to ensure that boost is ready for the rest of
the sequence to be able to continue, we must ensure that
the each part of the sequence is executed for each amp
before moving on to the next part of the sequence.

Some of the Play and Pause actions have moved from Open to
Prepare. This is because Open is not guaranteed to be called
again on system resume, whereas Prepare should.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-9-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 01ecc562936439cf6dd08b5f8c6bbed2704d9f9e linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: hda_component: Add pre and post playback hooks to hda_component

BugLink: https://bugs.launchpad.net/bugs/2029199
These hooks can be used to add callbacks that would be run before and after
the main playback hooks. These hooks would be called for all amps, before
moving on to the next hook, i.e. pre_playback_hook would be called for
all amps, before the playback_hook is called for all amps, then finally
the post_playback_hook is called for all amps.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-8-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 4eae4892c5bdef990b4b80997b813e2443d6554c linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Move Play and Pause into separate functions

BugLink: https://bugs.launchpad.net/bugs/2029199
This allows play and pause to be called from multiple places,
which is necessary for system suspend and resume.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-7-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit a5adbfb60b0299e788189aa8f88d8fa0dafb9d65 linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Ensure we pass up any errors during system suspend.

BugLink: https://bugs.launchpad.net/bugs/2029199
There are several steps required to put the system into system suspend.
Some of these steps may fail, so the driver should pass up the errors
if they occur.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-6-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit f2a58481a5051704390c2d7653b07ab3d97782da linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Ensure we correctly re-sync regmap before system suspending.

BugLink: https://bugs.launchpad.net/bugs/2029199
In order to properly system suspend, it is necessary to unload the firmware
and ensure the chip is ready for shutdown (if necessary). If the system
is currently in runtime suspend, it is necessary to wake up the device,
and then make it ready. Currently, the wake does not correctly resync
the device, which may mean it cannot suspend correctly. Fix this by
performaing a resync.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-5-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit a3ff564658783e3cd9cc3098da4aebd89fe07d74 linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Check mailbox status of pause command after firmware load

BugLink: https://bugs.launchpad.net/bugs/2029199
Currently, we do not check the return status of the pause command,
immediately after we load firmware. If the pause has failed,
the firmware is not running.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-4-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 5299b79ca1a2a9b017b87da08563100b0da98e5b linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: cs35l41: Poll for Power Up/Down rather than waiting a fixed delay

BugLink: https://bugs.launchpad.net/bugs/2029199
To ensure the chip has correctly powered up or down before continuing,
the driver will now poll a register, rather than wait a fixed delay.

Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-3-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit f8264c7592088727629e14f396f95ad643847740 linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: cs35l41: Use mbox command to enable speaker output for external boost

BugLink: https://bugs.launchpad.net/bugs/2029199
To enable the speaker output in external boost mode, 2 registers must
be set, one after another. The longer the time between the writes of
the two registers, the more likely, and more loudly a pop may occur.
To minimize this, an mbox command can be used to allow the firmware
to perform this action, minimizing any delay between write, thus
minimizing any pop or click as a result. The old method will remain
when running without firmware.

Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230721151816.2080453-2-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit fa3efcc36aacdd8ac9372217970d0ed2eb293fcc linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Enable Amp High Pass Filter

BugLink: https://bugs.launchpad.net/bugs/2029199
This helps smooth out pops and clicks in the amps.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230213145008.1215849-4-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 5791c7699ff1b8be24e1e3b2c08b180598d3ba28)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: hda: cs35l41: Ensure firmware/tuning pairs are always loaded

BugLink: https://bugs.launchpad.net/bugs/2029199
To ensure firmware for cs35l41 is correctly running, it is necessary
that a corresponding tuning file is also loaded. Without both,
the firmware may not be performing correctly
Ensure that if we load the firmware, we have also loaded the correct
tuning file. Otherwise, fall back to default firmware and tuning.
If default tuning is also missing, then disable DSP firmware.

Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230213145008.1215849-3-sbinding@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit cd40dad2ca9196631dcd48e1b991244ab3940d83)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ASoC: dt-bindings: cirrus, cs35l41: Document CS35l41 shared boost

BugLink: https://bugs.launchpad.net/bugs/2029199
Describe the properties used for shared boost configuration.
Based on David Rhodes shared boost patches.

Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: David Rhodes <david.rhodes@cirrus.com>
Link: https://lore.kernel.org/r/20230223084324.9076-5-lucas.tanure@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(backported from commit 340307d7effd99303fe933cde3b7288f8f3c6677)
[khfeng: Skipped documentation cleanup]
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ALSA: cs35l41: Add shared boost feature

BugLink: https://bugs.launchpad.net/bugs/2029199
Shared boost allows two amplifiers to share a single boost circuit by
communicating on the MDSYNC bus.
The passive amplifier does not control the boost and receives data from
the active amplifier.

Shared Boost is not supported in HDA Systems.
Based on David Rhodes shared boost patches.

Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com>
Reviewed-by: David Rhodes <david.rhodes@cirrus.com>
Link: https://lore.kernel.org/r/20230223084324.9076-4-lucas.tanure@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit f5030564938be112183ba3df0cdd6dea3f694c2e)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

ASoC: cs35l41: Refactor error release code

BugLink: https://bugs.launchpad.net/bugs/2029199
Add cs35l41_error_release function to handle error release sequences.

Signed-off-by: Lucas Tanure <lucas.tanure@collabora.com>
Acked-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Reviewed-by: David Rhodes <david.rhodes@cirrus.com>
Link: https://lore.kernel.org/r/20230223084324.9076-3-lucas.tanure@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
(cherry picked from commit be9457f12e84437259707415364cc5fc96041ed6)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>

x86/smp: Dont access non-existing CPUID leaf

BugLink: https://bugs.launchpad.net/bugs/2029332
stop_this_cpu() tests CPUID leaf 0x8000001f::EAX unconditionally. Intel
CPUs return the content of the highest supported leaf when a non-existing
leaf is read, while AMD CPUs return all zeros for unsupported leafs.

So the result of the test on Intel CPUs is lottery.

While harmless it's incorrect and causes the conditional wbinvd() to be
issued where not required.

Check whether the leaf is supported before reading it.

[ tglx: Adjusted changelog ]

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
Link: https://lore.kernel.org/r/20230615193330.322186388@linutronix.de
(cherry picked from commit 9b040453d4440659f33dc6f0aa26af418ebfe70b)
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/smp: Make stop_other_cpus() more robust

BugLink: https://bugs.launchpad.net/bugs/2029332
Tony reported intermittent lockups on poweroff. His analysis identified the
wbinvd() in stop_this_cpu() as the culprit. This was added to ensure that
on SME enabled machines a kexec() does not leave any stale data in the
caches when switching from encrypted to non-encrypted mode or vice versa.

That wbinvd() is conditional on the SME feature bit which is read directly
from CPUID. But that readout does not check whether the CPUID leaf is
available or not. If it's not available the CPU will return the value of
the highest supported leaf instead. Depending on the content the "SME" bit
might be set or not.

That's incorrect but harmless. Making the CPUID readout conditional makes
the observed hangs go away, but it does not fix the underlying problem:

CPU0 CPU1

stop_other_cpus()
   send_IPIs(REBOOT); stop_this_cpu()
   while (num_online_cpus() > 1);         set_online(false);
   proceed... -> hang
          wbinvd()

WBINVD is an expensive operation and if multiple CPUs issue it at the same
time the resulting delays are even larger.

But CPU0 already observed num_online_cpus() going down to 1 and proceeds
which causes the system to hang.

This issue exists independent of WBINVD, but the delays caused by WBINVD
make it more prominent.

Make this more robust by adding a cpumask which is initialized to the
online CPU mask before sending the IPIs and CPUs clear their bit in
stop_this_cpu() after the WBINVD completed. Check for that cpumask to
become empty in stop_other_cpus() instead of watching num_online_cpus().

The cpumask cannot plug all holes either, but it's better than a raw
counter and allows to restrict the NMI fallback IPI to be sent only the
CPUs which have not reported within the timeout window.

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Reported-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
Link: https://lore.kernel.org/r/87h6r770bv.ffs@tglx
(cherry picked from commit 1f5e7eb7868e42227ac426c96d437117e6e06e8e)
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

loop: do not enforce max_loop hard limit by (new) default

BugLink: https://bugs.launchpad.net/bugs/2015400
Problem:

The max_loop parameter is used for 2 different purposes:

1) initial number of loop devices to pre-create on init
2) maximum number of loop devices to add on access/open()

Historically, its default value (zero) caused 1) to create non-zero
number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on
2) to add devices with autoloading.

However, the default value changed in commit 85c50197716c ("loop: Fix
the max_loop commandline argument treatment when it is set to 0") to
CONFIG_BLK_DEV_LOOP_MIN_COUNT, for max_loop=0 not to pre-create devices.

That does improve 1), but unfortunately it breaks 2), as the default
behavior changed from no-limit to hard-limit.

Example:

For example, this userspace code broke for N >= CONFIG, if the user
relied on the default value 0 for max_loop:

    mknod("/dev/loopN");
    open("/dev/loopN");  // now fails with ENXIO

Though affected users may "fix" it with (loop.)max_loop=0, this means to
require a kernel parameter change on stable kernel update (that commit
Fixes: an old commit in stable).
Solution:

The original semantics for the default value in 2) can be applied if the
parameter is not set (ie, default behavior).

This still keeps the intended function in 1) and 2) if set, and that
commit's intended improvement in 1) if max_loop=0.

Before 85c50197716c:
  - default:     1) CONFIG devices   2) no limit
  - max_loop=0:  1) CONFIG devices   2) no limit
  - max_loop=X:  1) X devices        2) X limit

After 85c50197716c:
  - default:     1) CONFIG devices   2) CONFIG limit (*)
  - max_loop=0:  1) 0 devices (*)    2) no limit
  - max_loop=X:  1) X devices        2) X limit

This commit:
  - default:     1) CONFIG devices   2) no limit (*)
  - max_loop=0:  1) 0 devices        2) no limit
  - max_loop=X:  1) X devices        2) X limit

Future:

The issue/regression from that commit only affects code under the
CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is
contained under it.

Once that deprecated functionality/code is removed, the purpose 2) of
max_loop (hard limit) is no longer in use, so the module parameter
description can be changed then.

Tests:

Linux 6.4-rc7
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_BLOCK_LEGACY_AUTOLOAD=y

- default (original)

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
open: /dev/loop8: No such device or address

- default (patched)

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
#

- max_loop=0 (original & patched):

# ls -1 /dev/loop*
/dev/loop-control

# ./test-loop
#

- max_loop=8 (original & patched):

# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7

# ./test-loop
open: /dev/loop8: No such device or address

- max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set)

# ls -1 /dev/loop*
/dev/loop-control

# ./test-loop
open: /dev/loop8: No such device or address

Fixes: 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0")
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit bb5faa99f0ce40756ab7bbbce4f16c01ca5ebd5a)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

loop: deprecate autoloading callback loop_probe()

BugLink: https://bugs.launchpad.net/bugs/2015400
The 'probe' callback in __register_blkdev() is only used under the
CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard.

The loop_probe() function is only used for that callback, so guard it
too, accordingly.

See commit fbdee71bb5d8 ("block: deprecate autoloading based on dev_t").

Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230720143033.841001-2-mfo@canonical.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry picked from commit 23881aec85f3219e8462e87c708815ee2cd82358)
Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/i10nm: Skip the absent memory controllers

BugLink: https://bugs.launchpad.net/bugs/2028746
Some Sapphire Rapids workstations' absent memory controllers
still appear as PCIe devices that fool the i10nm_edac driver
and result in "shift exponent -66 is negative" call traces
from skx_get_dimm_info().

Skip the absent memory controllers to avoid the call traces.

Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Closes: https://lore.kernel.org/linux-edac/CAAd53p41Ku1m1rapeqb1xtD+kKuk+BaUW=dumuoF0ZO3GhFjFA@mail.gmail.com/T/#m5de16dce60a8c836ec235868c7c16e3fefad0cc2
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Closes: https://lore.kernel.org/linux-edac/SA1PR11MB71305B71CCCC3D9305835202892AA@SA1PR11MB7130.namprd11.prod.outlook.com/T/#t
Tested-by: Koba Ko <koba.ko@canonical.com>
Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20230710013232.59712-1-qiuxu.zhuo@intel.com
(cherry picked from commit c545f5e412250555bd4e717d062b117f20bab418 linux-next)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/i10nm: Add Intel Granite Rapids server support

BugLink: https://bugs.launchpad.net/bugs/2028746
The Granite Rapids CPU model uses similar memory controller registers
as Sapphire Rapids server but with some different configurations:

- Various memory controller numbers for different Granite Rapids CPUs.
So detect the number of present memory controllers at run time.

- Different MMIO offsets of memory controllers.

- Different triples of bus/dev/fun of some PCI devices used in i10nm_edac.

Add above configurations and Granite Rapids CPU model ID for EDAC support.

[Tony: Fixed 2 typos s/strcture/structure/]

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
(cherry picked from commit ba987eaaabf99b462cdfed86274e3455d5126349)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/i10nm: Make more configurations CPU model specific

BugLink: https://bugs.launchpad.net/bugs/2028746
The numbers of memory controllers per socket, channels per memory
controller, DIMMs per channel and the triples of bus/device/function
of PCI devices used in i10nm_edac can be CPU model specific.
Add new fields to the structure res_config for above numbers and
triples to make them CPU model specific.

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
(cherry picked from commit dd7814b78539416c6e561eeaa0951b3e88ac799e)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/i10nm: Add Intel Emerald Rapids server support

BugLink: https://bugs.launchpad.net/bugs/2028746
The Emerald Rapids CPU model uses similar memory controller registers
as Sapphire Rapids server. Add Emerald Rapids CPU model number ID for
EDAC support.

Tested-by: Li Zhang <li4.zhang@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
(cherry picked from commit e4b2bc6616e21f4a7ce4e7452f716e3db8fe66b6)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/skx_common: Delete duplicated and unreachable code

BugLink: https://bugs.launchpad.net/bugs/2028746
skx_mce_check_error() returns early if the error isn't from memory.
So when skx_mce_output_error() is invoked from skx_mce_check_error(),
it doesn't need to re-check whether the error is from memory. Delete
the duplicated and unreachable code from skx_mce_output_error().

Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
(cherry picked from commit d2415e2e5330fc11f4c688fa518751bdc90259f5)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

EDAC/skx_common: Enable EDAC support for the "near" memory

BugLink: https://bugs.launchpad.net/bugs/2028746
The current {skx,i10nm}_edac miss the EDAC support to decode errors from
the 1st level memory (the fast "near" memory as cache) of the 2-level
memory system. Introduce a helper function skx_error_in_mem() to check
whether errors are from memory at the beginning of skx_mce_check_error().

As long as the errors are from memory (either the 1-level memory system
or the 2-level memory system), decode the errors.

Reported-and-tested-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/all/20230113032802.41752-1-qiuxu.zhuo@intel.com
(cherry picked from commit 6e8746cb735166eaf3ceb086b31bb0431f5e3532)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: Provide a VGA helper for gma500 and internal use

BugLink: https://bugs.launchpad.net/bugs/2028749
The hardware for gma500 is different from the rest, as it uses stolen
framebuffer memory that is not available via PCI BAR. The regular PCI
removal helper cannot detect the framebuffer, while the non-PCI helper
misses possible conflicting VGA devices (i.e., a framebuffer or text
console).

Gma500 therefore calls both helpers to catch all cases. It's confusing
as it implies that there's something about the PCI device that requires
ownership management. The relationship between the PCI device and the
VGA devices is non-obvious. At worst, readers might assume that calling
two functions for clearing aperture ownership is a bug in the driver.

Hence, move the PCI removal helper's code for VGA functionality into
a separate function and call this function from gma500. Documents the
purpose of each call to aperture helpers. The change contains comments
and example code form the discussion at [1].

v5:
* fix grammar in gma500 comment (Javier)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20230404201842.567344-1-daniel.vetter@ffwll.ch/
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-10-tzimmermann@suse.de
(cherry picked from commit 116b1c5a364bcbdc40be64d4f3ec9dbc32e264dd)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

fbdev: Simplify fb_is_primary_device for x86

BugLink: https://bugs.launchpad.net/bugs/2028749
vga_default_device really is supposed to cover all corners, at least
for x86. Additionally checking for rom shadowing should be redundant,
because the bios/fw only does that for the boot vga device.

If this turns out to be wrong, then most likely that's a special case
which should be added to the vgaarb code, not replicated all over.

Patch motived by changes to the aperture helpers, which also have this
open code in a bunch of places, and which are all removed in a
clean-up series. This function here is just for selecting the default
fbdev device for fbcon, but I figured for consistency it might be good
to throw this patch in on top.

Note that the shadow rom check predates vgaarb, which was only wired
up in commit 88674088d10c ("x86: Use vga_default_device() when
determining whether an fb is primary"). That patch doesn't explain why
we still fall back to the shadow rom check.

v4:
- fix commit message style (i.e., commit 1234 ("..."))
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Helge Deller <deller@gmx.de>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-9-tzimmermann@suse.de
(cherry picked from commit 5ca1479cd35d9003040e6ac829380debe89b802b)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: Only remove sysfb on the default vga pci device

BugLink: https://bugs.launchpad.net/bugs/2028749
Instead of calling aperture_remove_conflicting_devices() to remove the
conflicting devices, just call to aperture_detach_devices() to detach
the device that matches the same PCI BAR / aperture range. Since the
former is just a wrapper of the latter plus a sysfb_disable() call,
and now that's done in this function but only for the primary devices.

This fixes a regression introduced by commit ee7a69aa38d8 ("fbdev:
Disable sysfb device registration when removing conflicting FBs"),
where we remove the sysfb when loading a driver for an unrelated pci
device, resulting in the user losing their efifb console or similar.

Note that in practice this only is a problem with the nvidia blob,
because that's the only gpu driver people might install which does not
come with an fbdev driver of it's own. For everyone else the real gpu
driver will restore a working console.

Also note that in the referenced bug there's confusion that this same
bug also happens on amdgpu. But that was just another amdgpu specific
regression, which just happened to happen at roughly the same time and
with the same user-observable symptoms. That bug is fixed now, see
https://bugzilla.kernel.org/show_bug.cgi?id=216331#c15

Note that we should not have any such issues on non-pci multi-gpu
issues, because I could only find two such cases:
- SoC with some external panel over spi or similar. These panel
  drivers do not use drm_aperture_remove_conflicting_framebuffers(),
  so no problem.
- vga+mga, which is a direct console driver and entirely bypasses all
  this.

For the above reasons the cc: stable is just notionally, this patch
will need a backport and that's up to nvidia if they care enough.

v2:
- Explain a bit better why other multi-gpu that aren't pci shouldn't
  have any issues with making all this fully pci specific.

v3
- polish commit message (Javier)

v4:
- Fix commit message style (i.e., commit 1234 ("..."))
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Fixes: ee7a69aa38d8 ("fbdev: Disable sysfb device registration when removing conflicting FBs")
Tested-by: Aaron Plattner <aplattner@nvidia.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216303#c28
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Aaron Plattner <aplattner@nvidia.com>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: <stable@vger.kernel.org> # v5.19+ (if someone else does the backport)
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-8-tzimmermann@suse.de
(cherry picked from commit 5ae3716cfdcd286268133867f67d0803847acefc)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: Drop primary argument

BugLink: https://bugs.launchpad.net/bugs/2028749
With the preceding patches it's become defunct. Also I'm about to add
a different boolean argument, so it's better to keep the confusion
down to the absolute minimum.

v2: Since the hypervfb patch got droppped (it's only a pci device for
gen1 vm, not for gen2) there is one leftover user in an actual driver
left to touch.

v4:
- fixes to commit message
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-7-tzimmermann@suse.de
(backported from commit 5fbcc6708fe32ef80122cd2a59ddca9d18b24d6e)
[khfeng: Resolve textual dependency, include drivers]
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: Move vga handling to pci function

BugLink: https://bugs.launchpad.net/bugs/2028749
A few reasons for this:

- It's really the only one where this matters. I tried looking around,
  and I didn't find any non-pci vga-compatible controllers for x86
  (since that's the only platform where we had this until a few
  patches ago), where a driver participating in the aperture claim
  dance would interfere.

- I also don't expect that any future bus anytime soon will
  not just look like pci towards the OS, that's been the case for like
  25+ years by now for practically everything (even non non-x86).

- Also it's a bit funny if we have one part of the vga removal in the
  pci function, and the other in the generic one.

v2: Rebase.

v4:
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-6-tzimmermann@suse.de
(cherry picked from commit f1d599d315fb7b7343cddaf365e671aaa8453aca)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: Only kick vgacon when the pdev is decoding vga

BugLink: https://bugs.launchpad.net/bugs/2028749
Otherwise it's a bit silly, and we might throw out the driver for the
screen the user is actually looking at. I haven't found a bug report
for this case yet, but we did get bug reports for the analog case
where we're throwing out the efifb driver.

v2: Flip the check around to make it clear it's a special case for
kicking out the vgacon driver only (Thomas)

v4:
- fixes to commit message
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216303
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-5-tzimmermann@suse.de
(cherry picked from commit 7450cd235b45d43ee6f3c235f89e92623458175d)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

drm/aperture: Remove primary argument

BugLink: https://bugs.launchpad.net/bugs/2028749
Only really pci devices have a business setting this - it's for
figuring out whether the legacy vga stuff should be nuked too. And
with the preceding two patches those are all using the pci version of
this.

Which means for all other callers primary == false and we can remove
it now.

v2:
- Reorder to avoid compile fail (Thomas)
- Include gma500, which retained it's called to the non-pci version.

v4:
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Deepak Rawat <drawat.floss@gmail.com>
Cc: Neil Armstrong <neil.armstrong@linaro.org>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Jonathan Hunter <jonathanh@nvidia.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: Helge Deller <deller@gmx.de>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: linux-hyperv@vger.kernel.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-tegra@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-4-tzimmermann@suse.de
(backported from commit 62aeaeaa1b267c5149abee6b45967a5df3feed58)
[khfeng: msm_fbdev.c already uses different routine, include drivers]
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

video/aperture: use generic code to figure out the vga default device

BugLink: https://bugs.launchpad.net/bugs/2028749
Since vgaarb has been promoted to be a core piece of the pci subsystem
we don't have to open code random guesses anymore, we actually know
this in a platform agnostic way, and there's no need for an x86
specific hack. See also commit 1d38fe6ee6a8 ("PCI/VGA: Move vgaarb to
drivers/pci")

This should not result in any functional change, and the non-x86
multi-gpu pci systems are probably rare enough to not matter (I don't
know of any tbh). But it's a nice cleanup, so let's do it.

There's been a few questions on previous iterations on dri-devel and
irc:

- fb_is_primary_device() seems to be yet another implementation of
  this theme, and at least on x86 it checks for both
  vga_default_device OR rom shadowing. There shouldn't ever be a case
  where rom shadowing gives any additional hints about the boot vga
  device, but if there is then the default vga selection in vgaarb
  should probably be fixed. And not special-case checks replicated all
  over.

- Thomas also brought up that on most !x86 systems
  fb_is_primary_device() returns 0, except on sparc/parisc. But these
  2 special cases are about platform specific devices and not pci, so
  shouldn't have any interactions.

- Furthermore fb_is_primary_device() is a bit a red herring since it's
  only used to select the right fbdev driver for fbcon, and not for
  the fw handover dance which the aperture helpers handle. At least
  for x86 we might want to look into unifying them, but that's a
  separate thing.

v2: Extend commit message trying to summarize various discussions.

v4:
- make the test for the primary device easier to read (Javier)
- fix commit message style (i.e., commit 1234 ("..."))
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Helge Deller <deller@gmx.de>
Cc: linux-fbdev@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-3-tzimmermann@suse.de
(cherry picked from commit db082219569e3392cd38c9e322151450c855eed4)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

drm/gma500: Use drm_aperture_remove_conflicting_pci_framebuffers

BugLink: https://bugs.launchpad.net/bugs/2028749
This one nukes all framebuffers, which is a bit much. In reality
gma500 is igpu and never shipped with anything discrete, so there should
not be any difference.

v2: Unfortunately the framebuffer sits outside of the pci bars for
gma500, and so only using the pci helpers won't be enough. Otoh if we
only use non-pci helper, then we don't get the vga handling, and
subsequent refactoring to untangle these special cases won't work.

It's not pretty, but the simplest fix (since gma500 really is the only
quirky pci driver like this we have) is to just have both calls.

v4:
- fix Daniel's S-o-b address

v5:
- add back an S-o-b tag with Daniel's Intel address

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230406132109.32050-2-tzimmermann@suse.de
(cherry picked from commit 80e993988b97fe794f3ec2be6db05fe30f9353c3)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

drm/amd/display: Keep PHY active for dp config

BugLink: https://bugs.launchpad.net/bugs/2028740
[Why]
Current hotplug sequence causes temporary hang at the re-entry of the
optimized power state.

[How]
Keep a PHY active when detecting DP signal + DPMS active

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Agustin Gutierrez <agustin.gutierrez@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2b02d746c1818baf741f4eeeff9b97ab4b81e1cf)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ACPI: video: Stop trying to use vendor backlight control on laptops from after ~2012

BugLink: https://bugs.launchpad.net/bugs/2023638
There have been 2 separate reports now about a non working
"dell_backlight" device getting registered under /sys/class/backlight
1 report for a Raptor Lake based Dell and 1 report for a Meteor Lake
(development) platform.

On hw from the last 10 years dell-laptop will not register "dell_backlight"
because acpi_video_get_backlight_type() will return acpi_backlight_video
there if called before the GPU/kms driver loads. So it does not matter if
the GPU driver's native backlight is registered after dell-laptop loads.

But it seems that on the latest generation laptops the ACPI tables
no longer contain acpi_video backlight control support which causes
acpi_video_get_backlight_type() to return acpi_backlight_vendor causing
"dell_backlight" to get registered if the dell-laptop module is loaded
before the GPU/kms driver.

Vendor specific backlight control like the "dell_backlight" device is
only necessary on quite old hw (from before acpi_video backlight control
was introduced). Work around "dell_backlight" registering on very new
hw (where acpi_video backlight control seems to be no more) by making
acpi_video_get_backlight_type() return acpi_backlight_none instead
of acpi_backlight_vendor as final fallback when the ACPI tables have
support for Windows 8 or later (laptops from after ~2012).

Suggested-by: Matthew Garrett <mjg59@srcf.ucam.org>
Reported-by: AceLan Kao <acelan.kao@canonical.com>
Closes: https://lore.kernel.org/platform-driver-x86/20230607034331.576623-1-acelan.kao@canonical.com/
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit aa8a950a5d6b2094830aff834198777371ff91ff)
Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: wwan: t7xx: Ensure init is completed before system sleep

BugLink: https://launchpad.net/bugs/2020743
When the system attempts to sleep while mtk_t7xx is not ready, the driver
cannot put the device to sleep:
[   12.472918] mtk_t7xx 0000:57:00.0: [PM] Exiting suspend, modem in invalid state
[   12.472936] mtk_t7xx 0000:57:00.0: PM: pci_pm_suspend(): t7xx_pci_pm_suspend+0x0/0x20 [mtk_t7xx] returns -14
[   12.473678] mtk_t7xx 0000:57:00.0: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x1b0 returns -14
[   12.473711] mtk_t7xx 0000:57:00.0: PM: failed to suspend async: error -14
[   12.764776] PM: Some devices failed to suspend, or early wake event detected

Mediatek confirmed the device can take a rather long time to complete
its initialization, so wait for up to 20 seconds until init is done.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit ab87603b251134441a67385ecc9d3371be17b7a7)
Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

r8152: add USB device driver for config selection

BugLink: https://launchpad.net/bugs/2020295
Subclassing the generic USB device driver to override the
default configuration selection regardless of matching interface
drivers.

The r815x family devices expose a vendor specific function which
the r8152 interface driver wants to handle.  This is the preferred
device mode. Additionally one or more USB class functions are
usually supported for hosts lacking a vendor specific driver. The
choice is USB configuration based, with one alternate function per
configuration.

Example device with both NCM and ECM alternate cfgs:

T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  4 Spd=5000 MxCh= 0
D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  3
P:  Vendor=0bda ProdID=8156 Rev=31.00
S:  Manufacturer=Realtek
S:  Product=USB 10/100/1G/2.5G LAN
S:  SerialNumber=001000001
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=256mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=00 Driver=r8152
E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=83(I) Atr=03(Int.) MxPS=   2 Ivl=128ms
C:  #Ifs= 2 Cfg#= 2 Atr=a0 MxPwr=256mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0d Prot=00 Driver=
E:  Ad=83(I) Atr=03(Int.) MxPS=  16 Ivl=128ms
I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=01 Driver=
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=
E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
C:  #Ifs= 2 Cfg#= 3 Atr=a0 MxPwr=256mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=
E:  Ad=83(I) Atr=03(Int.) MxPS=  16 Ivl=128ms
I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=
E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms

A problem with this is that Linux will prefer class functions over
vendor specific functions. Using the above example, Linux defaults
to cfg #2, running the device in a sub-optimal NCM mode.

Previously we've attempted to work around the problem by
blacklisting the devices in the ECM class driver "cdc_ether", and
matching on the ECM class function in the vendor specific interface
driver. The latter has been used to switch back to the vendor
specific configuration when the driver is probed for a class
function.

This workaround has several issues;
- class driver blacklists is additional maintanence cruft in an
  unrelated driver
- class driver blacklists prevents users from optionally running
  the devices in class mode
- each device needs double match entries in the vendor driver
- the initial probing as a class function slows down device
  discovery

Now these issues have become even worse with the introduction of
firmware supporting both NCM and ECM, where NCM ends up as the
default mode in Linux. To use the same workaround, we now have
to blacklist the devices in to two different class drivers and
add yet another match entry to the vendor specific driver.

This patch implements an alternative workaround strategy -
independent of the interface drivers.  It avoids adding a
blacklist to the cdc_ncm driver and will let us remove the
existing blacklist from the cdc_ether driver.

As an additional bonus, removing the blacklists allow users to
select one of the other device modes if wanted.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
(backported from commit ec51fbd1b8a2bca2948dede99c14ec63dc57ff6b)
[AceLan: conflicts with below commit which introduce a ID to the same table
in Jammy kernel. Fixed by preserved the ID and changed to the same new format.
2edbc1459c1b ("r8152: add vendor/device ID pair for Microsoft Devkit")]
Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/cpu/amd: Add a Zenbleed fix

Add a fix for the Zen2 VZEROUPPER data corruption bug where under
certain circumstances executing VZEROUPPER can cause register
corruption or leak data.

The optimal fix is through microcode but in the case the proper
microcode revision has not been applied, enable a fallback fix using
a chicken bit.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
CVE-2023-20593
(backported from commit 522b1d69219d8f083173819fde04f994aa051a98)
[cascardo: small conflict due to missing commit 8cc68c9c9e92dbaae51a711454c66eb668045508]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Cengiz Can <cengiz.can@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

x86/cpu/amd: Move the errata checking functionality up

Avoid new and remove old forward declarations.

No functional changes.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
CVE-2023-20593
(backported from commit 8b6f687743dacce83dbb0c7cfacf88bab00f808a)
[cascardo: small context conflict at set_dr_addr_mask]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nft_set_pipapo: fix improper element removal

end key should be equal to start unless NFT_SET_EXT_KEY_END is present.

Its possible to add elements that only have a start key
("{ 1.0.0.0 . 2.0.0.0 }") without an internval end.

Insertion treats this via:

if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END))
end = (const u8 *)nft_set_ext_key_end(ext)->data;
else
end = start;

but removal side always uses nft_set_ext_key_end().
This is wrong and leads to garbage remaining in the set after removal
next lookup/insert attempt will give:

BUG: KASAN: slab-use-after-free in pipapo_get+0x8eb/0xb90
Read of size 1 at addr ffff888100d50586 by task nft-pipapo_uaf_/1399
Call Trace:
kasan_report+0x105/0x140
pipapo_get+0x8eb/0xb90
nft_pipapo_insert+0x1dc/0x1710
nf_tables_newsetelem+0x31f5/0x4e00
..

Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reported-by: lonial con <kongln9170@gmail.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
CVE-2023-4004
(cherry picked from commit 87b5a5c209405cb6b57424cdfa226a6dbd349232)
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net/sched: sch_qfq: account for stab overhead in qfq_enqueue

Lion says:
-------
In the QFQ scheduler a similar issue to CVE-2023-31436
persists.

Consider the following code in net/sched/sch_qfq.c:

static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
                struct sk_buff **to_free)
{
     unsigned int len = qdisc_pkt_len(skb), gso_segs;

    // ...

     if (unlikely(cl->agg->lmax < len)) {
         pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
              cl->agg->lmax, len, cl->common.classid);
         err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
         if (err) {
             cl->qstats.drops++;
             return qdisc_drop(skb, sch, to_free);
         }

    // ...

     }

Similarly to CVE-2023-31436, "lmax" is increased without any bounds
checks according to the packet length "len". Usually this would not
impose a problem because packet sizes are naturally limited.

This is however not the actual packet length, rather the
"qdisc_pkt_len(skb)" which might apply size transformations according to
"struct qdisc_size_table" as created by "qdisc_get_stab()" in
net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.

A user may choose virtually any size using such a table.

As a result the same issue as in CVE-2023-31436 can occur, allowing heap
out-of-bounds read / writes in the kmalloc-8192 cache.
-------

We can create the issue with the following commands:

tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
overhead 999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2

This is caused by incorrectly assuming that qdisc_pkt_len() returns a
length within the QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.

Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: Lion <nnamrec@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
CVE-2023-3611
(cherry picked from commit 3e337087c3b5805fe0b8a46ba622a962880b5d64)
Signed-off-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net/sched: sch_qfq: refactor parsing of netlink parameters

Two parameters can be transformed into netlink policies and
validated while parsing the netlink message.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2023-3611
(cherry picked from commit 25369891fcef373540f8b4e0b3bccf77a04490d5)
[cengizcan: prerequisite commit]
Signed-off-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: nf_tables: fix chain binding transaction logic

Add bound flag to rule and chain transactions as in 6a0a8d10a366
("netfilter: nf_tables: use-after-free in failing rule with bound set")
to skip them in case that the chain is already bound from the abort
path.

This patch fixes an imbalance in the chain use refcnt that triggers a
WARN_ON on the table and chain destroy path.

This patch also disallows nested chain bindings, which is not
supported from userspace.

The logic to deal with chain binding in nft_data_hold() and
nft_data_release() is not correct. The NFT_TRANS_PREPARE state needs a
special handling in case a chain is bound but next expressions in the
same rule fail to initialize as described by 1240eb93f061 ("netfilter:
nf_tables: incorrect error path handling with NFT_MSG_NEWRULE").

The chain is left bound if rule construction fails, so the objects
stored in this chain (and the chain itself) are released by the
transaction records from the abort path, follow up patch ("netfilter:
nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain")
completes this error handling.

When deleting an existing rule, chain bound flag is set off so the
rule expression .destroy path releases the objects.

Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2023-3610
(cherry picked from commit 4bedf9eee016286c835e3d8fa981ddece5338795)
Signed-off-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

f2fs: fix to avoid NULL pointer dereference f2fs_write_end_io()

butt3rflyh4ck reports a bug as below:

When a thread always calls F2FS_IOC_RESIZE_FS to resize fs, if resize fs is
failed, f2fs kernel thread would invoke callback function to update f2fs io
info, it would call  f2fs_write_end_io and may trigger null-ptr-deref in
NODE_MAPPING.

general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
RIP: 0010:NODE_MAPPING fs/f2fs/f2fs.h:1972 [inline]
RIP: 0010:f2fs_write_end_io+0x727/0x1050 fs/f2fs/data.c:370
<TASK>
bio_endio+0x5af/0x6c0 block/bio.c:1608
req_bio_endio block/blk-mq.c:761 [inline]
blk_update_request+0x5cc/0x1690 block/blk-mq.c:906
blk_mq_end_request+0x59/0x4c0 block/blk-mq.c:1023
lo_complete_rq+0x1c6/0x280 drivers/block/loop.c:370
blk_complete_reqs+0xad/0xe0 block/blk-mq.c:1101
__do_softirq+0x1d4/0x8ef kernel/softirq.c:571
run_ksoftirqd kernel/softirq.c:939 [inline]
run_ksoftirqd+0x31/0x60 kernel/softirq.c:931
smpboot_thread_fn+0x659/0x9e0 kernel/smpboot.c:164
kthread+0x33e/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

The root cause is below race case can cause leaving dirty metadata
in f2fs after filesystem is remount as ro:

Thread A Thread B
- f2fs_ioc_resize_fs
- f2fs_readonly   --- return false
- f2fs_resize_fs
- f2fs_remount
- write_checkpoint
- set f2fs as ro
  - free_segment_range
   - update meta_inode's data

Then, if f2fs_put_super()  fails to write_checkpoint due to readonly
status, and meta_inode's dirty data will be writebacked after node_inode
is put, finally, f2fs_write_end_io will access NULL pointer on
sbi->node_inode.

Thread A IRQ context
- f2fs_put_super
- write_checkpoint fails
- iput(node_inode)
- node_inode = NULL
- iput(meta_inode)
  - write_inode_now
   - f2fs_write_meta_page
- f2fs_write_end_io
- NODE_MAPPING(sbi)
: access NULL pointer on node_inode

Fixes: b4b10061ef98 ("f2fs: refactor resize_fs to avoid meta updates in progress")
Reported-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Closes: https://lore.kernel.org/r/1684480657-2375-1-git-send-email-yangtiezhu@loongson.cn
Tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
CVE-2023-2898
(cherry picked from commit d8189834d4348ae608083e1f1f53792cfcc2a9bc)
Signed-off-by: Yuxuan Luo <yuxuan.luo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

ALSA: hda/realtek: Enable 4 amplifiers instead of 2 on a HP platform

BugLink: https://bugs.launchpad.net/bugs/2023197
In the commit 7bb62340951a ("ALSA: hda/realtek: fix speaker, mute/micmute
LEDs not work on a HP platform"), speakers and LEDs are fixed but only 2
CS35L41 amplifiers on SPI bus connected to Realtek codec are enabled. Need
the ALC245_FIXUP_CS35L41_SPI_4_HP_GPIO_LED to get all amplifiers working.

Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Fixes: 7bb62340951a ("ALSA: hda/realtek: fix speaker, mute/micmute LEDs not work on a HP platform")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20230606145747.135966-1-chris.chiu@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit b752a385b584d385683c65cb76a1298f1379a88c)
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: SAUCE: overlayfs: fix reference count mismatch

BugLink: https://bugs.launchpad.net/bugs/2016398
Opened files reported in /proc/pid/map_files can be shows with the wrong
mount point using overlayfs with filesystem namspaces.

This incorrect behavior is fixed:

UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files

However, the fix introduced a new regression, the reference to the
original file stored in vma->vm_prfile is not properly released when
vma->vm_prfile is replaced with a new file.

This can cause a reference counter unbalance, leading errors such as
"target is busy" when trying to unmount overlayfs, even if the
filesystem has not active reference.

Fix by properly releasing the original file stored in vm_prfile.

Fixes: 508fdae3f62dd ("UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

drm/ast: Fix ARM compatibility

BugLink: https://bugs.launchpad.net/bugs/2026776
ARM architecture only has 'memory', so all devices are accessed by
MMIO if possible.

Signed-off-by: Jammy Huang <jammy_huang@aspeedtech.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230421003354.27767-1-jammy_huang@aspeedtech.com
Acked-by: Brad Figg <bfigg@nvidia.com>
Acked-by: Jamie Nguyen <jamien@nvidia.com>
(cherry picked from commit 4327a6137ed43a091d900b1ac833345d60f32228)
Signed-off-by: Ian May <ian.may@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

drm/i915: Allow arbitrary refresh rates with VRR eDP panels

BugLink: https://bugs.launchpad.net/bugs/2024273
If the panel supports VRR it must be capable of accepting
timings with arbitrary vblank length, within the valid VRR
range. Use that fact to allow the user to request any refresh
rate they like. We simply pick the next highest fixed mode
from our list, and adjust the vblank to get the desired refresh
rate in the end.

Of course currently everything to do with the vrefresh is
using 1Hz precision, so might not be exact. But we can improve
that in the future by just upping our vrefresh precision.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230404175431.23064-1-ville.syrjala@linux.intel.com
Reviewed-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
(cherry picked from commit a2da67028cd05516343533c1609fcaf037237fed)
Signed-off-by: Chris Chiu <chris.chiu@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13

BugLink: https://bugs.launchpad.net/bugs/2027957
SMU13 overrides dynamic PCIe lane width and dynamic speed by when on
certain hosts. commit 38e4ced80479 ("drm/amd/pm: conditionally disable
pcie lane switching for some sienna_cichlid SKUs") worked around this
issue by setting up certain SKUs to set up certain limits, but the same
fundamental problem with those hosts affects all SMU11 implmentations
as well, so align the SMU11 and SMU13 driver handling.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
(backported from commit e701156ccc6c7a5f104a968dda74cd6434178712
linux-next)
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

drm/amd: Move helper for dynamic speed switch check out of smu13

BugLink: https://bugs.launchpad.net/bugs/2027957
This helper is used for checking if the connected host supports
the feature, it can be moved into generic code to be used by other
smu implementations as well.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
(cherry picked from commit 188623076d0f1a500583d392b6187056bf7cc71a
linux-next)
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

drm/amd/pm: conditionally disable pcie lane/speed switching for SMU13

BugLink: https://bugs.launchpad.net/bugs/2027957
Intel platforms such as Sapphire Rapids and Raptor Lake don't support
dynamic pcie lane or speed switching.

This limitation seems to carry over from one generation to another.
To be safer, disable dynamic pcie lane width and speed switching when
running on an Intel platform.

Link: https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2663
Co-developed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
(cherry picked from commit 31c7a3b378a136adc63296a2ff17645896fcf303
linux-next)
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

drm/amd/pm: share the code around SMU13 pcie parameters update

BugLink: https://bugs.launchpad.net/bugs/2027957
So that SMU13.0.0 and SMU13.0.7 do not need to have one copy each.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
(cherry picked from commit dcb489bae65d92cfd26da22c7a0d6665b06ecc63
linux-next)
Signed-off-by: Koba Ko <koba.ko@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

HID: amd_sfh: Fix for shift-out-of-bounds

BugLink: https://bugs.launchpad.net/bugs/2027773
Shift operation of 'exp' and 'shift' variables exceeds the maximum number
of shift values in the u32 range leading to UBSAN shift-out-of-bounds.

...
[    6.120512] UBSAN: shift-out-of-bounds in drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c:149:50
[    6.120598] shift exponent 104 is too large for 64-bit type 'long unsigned int'
[    6.120659] CPU: 4 PID: 96 Comm: kworker/4:1 Not tainted 6.4.0amd_1-next-20230519-dirty #10
[    6.120665] Hardware name: AMD Birman-PHX/Birman-PHX, BIOS SFH_with_HPD_SEN.FD 04/05/2023
[    6.120667] Workqueue: events amd_sfh_work_buffer [amd_sfh]
[    6.120687] Call Trace:
[    6.120690]  <TASK>
[    6.120694]  dump_stack_lvl+0x48/0x70
[    6.120704]  dump_stack+0x10/0x20
[    6.120707]  ubsan_epilogue+0x9/0x40
[    6.120716]  __ubsan_handle_shift_out_of_bounds+0x10f/0x170
[    6.120720]  ? psi_group_change+0x25f/0x4b0
[    6.120729]  float_to_int.cold+0x18/0xba [amd_sfh]
[    6.120739]  get_input_rep+0x57/0x340 [amd_sfh]
[    6.120748]  ? __schedule+0xba7/0x1b60
[    6.120756]  ? __pfx_get_input_rep+0x10/0x10 [amd_sfh]
[    6.120764]  amd_sfh_work_buffer+0x91/0x180 [amd_sfh]
[    6.120772]  process_one_work+0x229/0x430
[    6.120780]  worker_thread+0x4a/0x3c0
[    6.120784]  ? __pfx_worker_thread+0x10/0x10
[    6.120788]  kthread+0xf7/0x130
[    6.120792]  ? __pfx_kthread+0x10/0x10
[    6.120795]  ret_from_fork+0x29/0x50
[    6.120804]  </TASK>
...

Fix this by adding the condition to validate shift ranges.

Fixes: 93ce5e0231d7 ("HID: amd_sfh: Implement SFH1.1 functionality")
Cc: stable@vger.kernel.org
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Akshata MukundShetty <akshata.mukundshetty@amd.com>
Link: https://lore.kernel.org/r/20230707065722.9036-3-Basavaraj.Natikar@amd.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
(cherry picked from commit 87854366176403438d01f368b09de3ec2234e0f5)
Signed-off-by: You-Sheng Yang <vicamo.yang@canonical.com>
Acked-by: John Cabaj <john.cabaj@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

HID: amd_sfh: Rename the float32 variable

BugLink: https://bugs.launchpad.net/bugs/2027773
As float32 is also used in other places as a data type, it is necessary
to rename the float32 variable in order to avoid confusion.

Cc: stable@vger.kernel.org
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Signed-off-by: Akshata MukundShetty <akshata.mukundshetty@amd.com>
Link: https://lore.kernel.org/r/20230707065722.9036-2-Basavaraj.Natikar@amd.com
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
(cherry picked from commit c1685a862a4bea863537f06abaa37a123aef493c)
Signed-off-by: You-Sheng Yang <vicamo.yang@canonical.com>
Acked-by: John Cabaj <john.cabaj@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

cifs: fix mid leak during reconnection after timeout threshold

BugLink: https://bugs.launchpad.net/bugs/2029138
When the number of responses with status of STATUS_IO_TIMEOUT
exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
the connection. But we do not return the mid, or the credits
returned for the mid, or reduce the number of in-flight requests.

This bug could result in the server->in_flight count to go bad,
and also cause a leak in the mids.

This change moves the check to a few lines below where the
response is decrypted, even of the response is read from the
transform header. This way, the code for returning the mids
can be reused.

Also, the cifs_reconnect was reconnecting just the transport
connection before. In case of multi-channel, this may not be
what we want to do after several timeouts. Changed that to
reconnect the session and the tree too.

Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
MAX_STATUS_IO_TIMEOUT.

Fixes: 8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
(cherry picked from commit 69cba9d3c1284e0838ae408830a02c4a063104bc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Cengiz Can <cengiz.can@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>

UBUNTU: Upstream stable to v6.1.32, v6.3.6

BugLink: https://bugs.launchpad.net/bugs/2028979
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

blk-wbt: fix that wbt can't be disabled by default

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit 8a2b20a997a3779ae9fcae268f2959eb82ec05a1 ]

commit b11d31ae01e6 ("blk-wbt: remove unnecessary check in
wbt_enable_default()") removes the checking of CONFIG_BLK_WBT_MQ by
mistake, which is used to control enable or disable wbt by default.

Fix the problem by adding back the checking. This patch also do a litter
cleanup to make related code more readable.

Fixes: b11d31ae01e6 ("blk-wbt: remove unnecessary check in wbt_enable_default()")
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/lkml/CAKXUXMzfKq_J9nKHGyr5P5rvUETY4B-fxoQD4sO+NYjFOfVtZA@mail.gmail.com/t/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230522121854.2928880-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

cxl/port: Fix NULL pointer access in devm_cxl_add_port()

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit a70fc4ed20a6118837b0aecbbf789074935f473b ]

In devm_cxl_add_port() the port creation may fail and its associated
pointer does not contain a valid address. During error message
generation this invalid port address is used. Fix that wrong address
access.

Fixes: f3cd264c4ec1 ("cxl: Unify debug messages when calling devm_cxl_add_port()")
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20230519215436.3394532-1-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: fec: add dma_wmb to ensure correct descriptor values

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit 9025944fddfed5966c8f102f1fe921ab3aee2c12 ]

Two dma_wmb() are added in the XDP TX path to ensure proper ordering of
descriptor and buffer updates:
1. A dma_wmb() is added after updating the last BD to make sure
   the updates to rest of the descriptor are visible before
   transferring ownership to FEC.
2. A dma_wmb() is also added after updating the bdp to ensure these
   updates are visible before updating txq->bd.cur.
3. Start the xmit of the frame immediately right after configuring the
   tx descriptor.

Fixes: 6d6b39f180b8 ("net: fec: add initial XDP support")
Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

gpiolib: fix allocation of mixed dynamic/static GPIOs

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit 7dd3d9bd873f138675cb727eaa51a498d99f0e89 ]

If static allocation and dynamic allocation GPIOs are present,
dynamic allocation pollutes the numberspace for static allocation,
causing static allocation to fail.
Enforce dynamic allocation above GPIO_DYNAMIC_BASE.

Seen on a GTA04 when omap-gpio (static) and twl-gpio (dynamic)
raced:
[some successful registrations of omap_gpio instances]
[    2.553833] twl4030_gpio twl4030-gpio: gpio (irq 145) chaining IRQs 161..178
[    2.561401] gpiochip_find_base: found new base at 160
[    2.564392] gpio gpiochip5: (twl4030): added GPIO chardev (254:5)
[    2.564544] gpio gpiochip5: registered GPIOs 160 to 177 on twl4030
[...]
[    2.692169] omap-gpmc 6e000000.gpmc: GPMC revision 5.0
[    2.697357] gpmc_mem_init: disabling cs 0 mapped at 0x0-0x1000000
[    2.703643] gpiochip_find_base: found new base at 178
[    2.704376] gpio gpiochip6: (omap-gpmc): added GPIO chardev (254:6)
[    2.704589] gpio gpiochip6: registered GPIOs 178 to 181 on omap-gpmc
[...]
[    2.840393] gpio gpiochip7: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    2.849365] gpio gpiochip7: (gpio-160-191): GPIO integer space overlap, cannot add chip
[    2.857513] gpiochip_add_data_with_key: GPIOs 160..191 (gpio-160-191) failed to register, -16
[    2.866149] omap_gpio 48310000.gpio: error -EBUSY: Could not register gpio chip

On that device it is fixed invasively by
commit 92bf78b33b0b4 ("gpio: omap: use dynamic allocation of base")
but let's also fix that for devices where there is still
a mixture of static and dynamic allocation.

Fixes: 7b61212f2a07 ("gpiolib: Get rid of ARCH_NR_GPIOS")
Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
Reviewed-by: <christophe.leroy@csgroup.eu>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

tools headers UAPI: Sync the linux/in.h with the kernel sources

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 5d1ac59ff7445e51a0e4958fa39ac8aa23691698 upstream.

Picking the changes from:

  91d0b78c5177f3e4 ("inet: Add IP_LOCAL_PORT_RANGE socket option")

Silencing these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: loongson-kernel@lists.loongnix.cn
Link: https://lore.kernel.org/r/23aabc69956ac94fbf388b05c8be08a64e8c7ccc.1683712945.git.siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

netfilter: ctnetlink: Support offloaded conntrack entry deletion

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 9b7c68b3911aef84afa4cbfc31bce20f10570d51 upstream.

Currently, offloaded conntrack entries (flows) can only be deleted
after they are removed from offload, which is either by timeout,
tcp state change or tc ct rule deletion. This can cause issues for
users wishing to manually delete or flush existing entries.

Support deletion of offloaded conntrack entries.

Example usage:
# Delete all offloaded (and non offloaded) conntrack entries
# whose source address is 1.2.3.4
$ conntrack -D -s 1.2.3.4
# Delete all entries
$ conntrack -F

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Cc: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

cpufreq: amd-pstate: Add ->fast_switch() callback

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 4badf2eb1e986bdbf34dd2f5d4c979553a86fe54 upstream.

Schedutil normally calls the adjust_perf callback for drivers with
adjust_perf callback available and fast_switch_possible flag set.
However, when frequency invariance is disabled and schedutil tries to
invoke fast_switch. So, there is a chance of kernel crash if this
function pointer is not set. To protect against this scenario add
fast_switch callback to amd_pstate driver.

Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 3bf8c6307bad5c0cc09cde982e146d847859b651 upstream.

Driver should update policy->cur after updating the frequency.
Currently amd_pstate doesn't update policy->cur when `adjust_perf`
is used. Which causes /proc/cpuinfo to show wrong cpu frequency.
Fix this by updating policy->cur with correct frequency value in
adjust_perf function callback.

- Before the fix: (setting min freq to 1.5 MHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1777.016
      1 cpu MHz         : 1797.160
      1 cpu MHz         : 1797.270
    189 cpu MHz         : 400.000

- After the fix: (setting min freq to 1.5 MHz)

[root@amd]# cat /proc/cpuinfo | grep "cpu MHz" | sort | uniq --count
      1 cpu MHz         : 1753.353
      1 cpu MHz         : 1756.838
      1 cpu MHz         : 1776.466
      1 cpu MHz         : 1776.873
      1 cpu MHz         : 1777.308
      1 cpu MHz         : 1779.900
    183 cpu MHz         : 1805.231
      1 cpu MHz         : 1956.815
      1 cpu MHz         : 2246.203
      1 cpu MHz         : 2259.984

Fixes: 1d215f0319c2 ("cpufreq: amd-pstate: Add fast switch function for AMD P-State")
Signed-off-by: Wyes Karny <wyes.karny@amd.com>
[ rjw: Subject edits ]
Cc: 5.17+ <stable@vger.kernel.org> # 5.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

block: fix bio-cache for passthru IO

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 46930b7cc7727271c9c27aac1fdc97a8645e2d00 upstream.

commit <8af870aa5b847> ("block: enable bio caching use for passthru IO")
introduced bio-cache for passthru IO. In case when nr_vecs are greater
than BIO_INLINE_VECS, bio and bvecs are allocated from mempool (instead
of percpu cache) and REQ_ALLOC_CACHE is cleared. This causes the side
effect of not freeing bio/bvecs into mempool on completion.

This patch lets the passthru IO fallback to allocation using bio_kmalloc
when nr_vecs are greater than BIO_INLINE_VECS. The corresponding bio
is freed during call to blk_mq_map_bio_put during completion.

Cc: stable@vger.kernel.org # 6.1
fixes <8af870aa5b847> ("block: enable bio caching use for passthru IO")

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Link: https://lore.kernel.org/r/20230523111709.145676-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Revert "thermal/drivers/mellanox: Use generic thermal_zone_get_trip() function"

BugLink: https://bugs.launchpad.net/bugs/2028979
This reverts commit a71f388045edbf6788a61f43e2cdc94b392a4ea3.

Commit a71f388045ed ("thermal/drivers/mellanox: Use generic
thermal_zone_get_trip() function") was backported as a dependency of the
fix in upstream commit 6d206b1ea9f4 ("mlxsw: core_thermal: Fix fan speed
in maximum cooling state"). However, it is dependent on changes in the
thermal core that were merged in v6.3. Without them, the mlxsw driver is
unable to register its thermal zone:

mlxsw_spectrum 0000:03:00.0: Failed to register thermal zone
mlxsw_spectrum 0000:03:00.0: cannot register bus device
mlxsw_spectrum: probe of 0000:03:00.0 failed with error -22

Fix this by reverting this commit and instead fix the small conflict
with the above mentioned fix. Tested using the test case mentioned in
the change log of the fix:

# cat /sys/class/thermal/thermal_zone2/cdev0/type
mlxsw_fan
# echo 10 > /sys/class/thermal/thermal_zone2/cdev0/cur_state
# cat /sys/class/hwmon/hwmon1/name
mlxsw
# cat /sys/class/hwmon/hwmon1/pwm1
255

After setting the fan to its maximum cooling state (10), it operates at
100% duty cycle instead of being stuck at 0 RPM.

Fixes: a71f388045ed ("thermal/drivers/mellanox: Use generic thermal_zone_get_trip() function")
Reported-by: Joe Botha <joe@atomic.ac>
Tested-by: Joe Botha <joe@atomic.ac>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

bluetooth: Add cmd validity checks at the start of hci_sock_ioctl()

BugLink: https://bugs.launchpad.net/bugs/2028979
commit 000c2fa2c144c499c881a101819cf1936a1f7cf2 upstream.

Previously, channel open messages were always sent to monitors on the first
ioctl() call for unbound HCI sockets, even if the command and arguments
were completely invalid. This can leave an exploitable hole with the abuse
of invalid ioctl calls.

This commit hardens the ioctl processing logic by first checking if the
command is valid, and immediately returning with an ENOIOCTLCMD error code
if it is not. This ensures that ioctl calls with invalid commands are free
of side effects, and increases the difficulty of further exploitation by
forcing exploitation to find a way to pass a valid command first.

Signed-off-by: Ruihan Li <lrh2000@pku.edu.cn>
Co-developed-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Dragos-Marian Panait <dragos.panait@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

drm/amd: Don't allow s0ix on APUs older than Raven

BugLink: https://bugs.launchpad.net/bugs/2028979
commit ca47518663973083c513cd6b2801dcda0bfaaa99 upstream.

APUs before Raven didn't support s0ix. As we just relieved some
of the safety checks for s0ix to improve power consumption on
APUs that support it but that are missing BIOS support a new
blind spot was introduced that a user could "try" to run s0ix.

Plug this hole so that if users try to run s0ix on anything older
than Raven it will just skip suspend of the GPU.

Fixes: cf488dcd0ab7 ("drm/amd: Allow s0ix without BIOS support")
Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

wifi: iwlwifi: mvm: support wowlan info notification version 2

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit 905d50ddbc83bef0d7f3386e7f3472b0324b405b ]

As part of version 2 we don't need to have wake_packet_bufsize
and wake_packet_length. The first one is already calculated by the driver,
the latter is sent as part of the wake packet notification.

Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Link: https://lore.kernel.org/r/20230413213309.3b53213b10d4.Ibf2f15aca614def2d262dd267d1aad65931b58f1@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Stable-dep-of: 457d7fb03e6c ("wifi: iwlwifi: mvm: fix potential memory leak")
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: phy: mscc: enable VSC8501/2 RGMII RX clock

BugLink: https://bugs.launchpad.net/bugs/2028979
[ Upstream commit 71460c9ec5c743e9ffffca3c874d66267c36345e ]

By default the VSC8501 and VSC8502 RGMII/GMII/MII RX_CLK output is
disabled. To allow packet forwarding towards the MAC it needs to be
enabled.

For other PHYs supported by this driver the clock output is enabled
by default.

Fixes: d3169863310d ("net: phy: mscc: add support for VSC8502")
Signed-off-by: David Epping <david.epping@missinglinkelectronics.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>