Robin Murphy [Fri, 7 Jul 2023 16:38:12 +0000 (17:38 +0100)]
perf/arm-cmn: Refactor HN-F event selector macros
Refactor the macros for defining HN-F events with additional selectors,
so they can be shared with another upcoming similar-but-distinct HN
type. No functional change intended.
Robin Murphy [Fri, 7 Jul 2023 16:38:11 +0000 (17:38 +0100)]
perf/arm-cmn: Remove spurious event aliases
As the name suggests, the "partial DAT flit" event is only counted for
the DAT channel, and furthermore is only applicable to device ports, not
mesh links (strictly it's only device ports with CHI-A requesters
connected, but detecting that degree of detail is more bother than it's
worth). Stop generating spurious event aliases for other combinations
which aren't meaningful.
Rob Herring [Fri, 14 Jul 2023 17:48:31 +0000 (11:48 -0600)]
drivers/perf: Explicitly include correct DT includes
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Add support for the Arm Cortex-A520, Cortex-A715, Cortex-A720,
Cortex-X3, and Cortex-X4 CPU PMUs. They are straight-forward additions
with just new compatible strings.
Merge tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Swapping the ring buffer for snapshotting (for things like irqsoff)
can crash if the ring buffer is being resized. Disable swapping when
this happens. The missed swap will be reported to the tracer
- Report error if the histogram fails to be created due to an error in
adding a histogram variable, in event_hist_trigger_parse()
- Remove unused declaration of tracing_map_set_field_descr()
* tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
ring-buffer: Do not swap cpu_buffer during resize process
tracing: Remove unused extern declaration tracing_map_set_field_descr()
Merge tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Fix stale help text in gconfig
- Support *.S files in compile_commands.json
- Flatten KBUILD_CFLAGS
- Fix external module builds with Rust so that temporary files are
created in the modules directories instead of the kernel tree
* tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: rust: avoid creating temporary files
kbuild: flatten KBUILD_CFLAGS
gen_compile_commands: add assembly files to compilation database
kconfig: gconfig: correct program name in help text
kconfig: gconfig: drop the Show Debug Info help text
Miguel Ojeda [Sun, 23 Jul 2023 14:21:28 +0000 (16:21 +0200)]
kbuild: rust: avoid creating temporary files
`rustc` outputs by default the temporary files (i.e. the ones saved
by `-Csave-temps`, such as `*.rcgu*` files) in the current working
directory when `-o` and `--out-dir` are not given (even if
`--emit=x=path` is given, i.e. it does not use those for temporaries).
Since out-of-tree modules are compiled from the `linux` tree,
`rustc` then tries to create them there, which may not be accessible.
Thus pass `--out-dir` explicitly, even if it is just for the temporary
files.
Similarly, do so for Rust host programs too.
Reported-by: Raphael Nestler <raphael.nestler@gmail.com> Closes: https://github.com/Rust-for-Linux/linux/issues/1015 Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc") Cc: stable@vger.kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Avoid pKVM finalization if KVM initialization fails
- Add missing BTI instructions in the hypervisor, fixing an early
boot failure on BTI systems
- Handle MMU notifiers correctly for non hugepage-aligned memslots
- Work around a bug in the architecture where hypervisor timer
controls have UNKNOWN behavior under nested virt
- Disable preemption in kvm_arch_hardware_enable(), fixing a kernel
BUG in cpu hotplug resulting from per-CPU accessor sanity checking
- Make WFI emulation on GICv4 systems robust w.r.t. preemption,
consistently requesting a doorbell interrupt on vcpu_put()
- Uphold RES0 sysreg behavior when emulating older PMU versions
- Avoid macro expansion when initializing PMU register names,
ensuring the tracepoints pretty-print the sysreg
s390:
- Two fixes for asynchronous destroy
x86 fixes will come early next week"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: s390: pv: fix index value of replaced ASCE
KVM: s390: pv: simplify shutdown and fix race
KVM: arm64: Fix the name of sys_reg_desc related to PMU
KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount
KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption
KVM: arm64: Add missing BTI instructions
KVM: arm64: Correctly handle page aging notifiers for unaligned memslot
KVM: arm64: Disable preemption in kvm_arch_hardware_enable()
KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm
KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
Merge tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o:
"Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's
checkpoint code"
* tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
ext4: correct inline offset when handling xattrs in inode body
jbd2: remove __journal_try_to_free_buffer()
jbd2: fix a race when checking checkpoint buffer busy
jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
jbd2: remove journal_clean_one_cp_list()
jbd2: remove t_checkpoint_io_list
jbd2: recheck chechpointing non-dirty buffer
Merge tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fix from Steve French:
"Add minor debugging improvement.
The change improves ability to read a network trace to debug problems
on encrypted connections which are very common (e.g. using wireshark
or tcpdump).
That works today with tools like 'smbinfo keys /mnt/file' but requires
passing in a filename on the mount (see e.g. [1]), but it often makes
more sense to just pass in the mount point path (ie a directory not a
filename).
So this fix was needed to debug some types of problems (an obvious
example is on an encrypted connection failing operations on an empty
share or with no files in the root of the directory) - so you can
simply pass in the 'smbinfo keys <mntpoint>' and get the information
that wireshark needs"
Link: https://wiki.samba.org/index.php/Wireshark_Decryption
* tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number for cifs.ko
cifs: allow dumping keys for directories too
tracing/histograms: Return an error if we fail to add histogram to hist_vars list
Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if
they have referenced variables") added a check to fail histogram creation
if save_hist_vars() failed to add histogram to hist_vars list. But the
commit failed to set ret to failed return code before jumping to
unregister histogram, fix it.
Chen Lin [Wed, 19 Jul 2023 07:58:47 +0000 (15:58 +0800)]
ring-buffer: Do not swap cpu_buffer during resize process
When ring_buffer_swap_cpu was called during resize process,
the cpu buffer was swapped in the middle, resulting in incorrect state.
Continuing to run in the wrong state will result in oops.
This issue can be easily reproduced using the following two scripts:
/tmp # cat test1.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb
sleep 0.5
done
/tmp # cat test2.sh
//#! /bin/sh
for i in `seq 0 100000`
do
echo irqsoff > /sys/kernel/debug/tracing/current_tracer
sleep 1
echo nop > /sys/kernel/debug/tracing/current_tracer
sleep 1
done
/tmp # ./test1.sh &
/tmp # ./test2.sh &
A typical oops log is as follows, sometimes with other different oops logs.
After analysis, the seq of the error is as follows [1-5]:
int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size,
int cpu_id)
{
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//1. get cpu_buffer, aka cpu_buffer(A)
...
...
schedule_work_on(cpu,
&cpu_buffer->update_pages_work);
//2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to
// update_pages_handler, do the update process, set 'update_done' in
// complete(&cpu_buffer->update_done) and to wakeup resize process.
//---->
//3. Just at this moment, ring_buffer_swap_cpu is triggered,
//cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer.
//ring_buffer_swap_cpu is called as the 'Call trace' below.
/* wait for all the updates to complete */
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
//4. get cpu_buffer, cpu_buffer(B) is used in the following process,
//the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong.
//for example, cpu_buffer(A)->update_done will leave be set 1, and will
//not 'wait_for_completion' at the next resize round.
if (!cpu_buffer->nr_pages_to_update)
continue;
if (cpu_online(cpu))
wait_for_completion(&cpu_buffer->update_done);
cpu_buffer->nr_pages_to_update = 0;
}
...
}
//5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong,
//Continuing to run in the wrong state, then oops occurs.
ext4: fix rbtree traversal bug in ext4_mb_use_preallocated
During allocations, while looking for preallocations(PA) in the per
inode rbtree, we can't do a direct traversal of the tree because
ext4_mb_discard_group_preallocation() can paralelly mark the pa deleted
and that can cause direct traversal to skip some entries. This was
leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy
our request and ultimately tried to create a new PA that would overlap
with the missed one.
To makes sure we handle that case while still keeping the performance of
the rbtree, we make use of the fact that the only pa that could possibly
overlap the original goal start is the one that satisfies the below
conditions:
1. It must have it's logical start immediately to the left of
(ie less than) original logical start.
2. It must not be deleted
To find this pa we use the following traversal method:
1. Descend into the rbtree normally to find the immediate neighboring
PA. Here we keep descending irrespective of if the PA is deleted or if
it overlaps with our request etc. The goal is to find an immediately
adjacent PA.
2. If the found PA is on right of original goal, use rb_prev() to find
the left adjacent PA.
3. Check if this PA is deleted and keep moving left with rb_prev() until
a non deleted PA is found.
4. This is the PA we are looking for. Now we can check if it can satisfy
the original request and proceed accordingly.
This approach also takes care of having deleted PAs in the tree.
(While we are at it, also fix a possible overflow bug in calculating the
end of a PA)
Ojaswin Mujoo [Fri, 9 Jun 2023 10:34:03 +0000 (16:04 +0530)]
ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()
In ext4_mb_choose_next_group_best_avail(), we want the start order to be
1 less than goal length and the min_order to be, at max, 1 more than the
original length. This commit fixes an off by one issue that arose due to
the fact that 1 << fls(n) > (n).
After all the processing:
order = 1 order below goal len
min_order = maximum of the three:-
- order - trim_order
- 1 order below B2C(s_stripe)
- 1 order above original len
Eric Whitney [Mon, 22 May 2023 18:15:20 +0000 (14:15 -0400)]
ext4: correct inline offset when handling xattrs in inode body
When run on a file system where the inline_data feature has been
enabled, xfstests generic/269, generic/270, and generic/476 cause ext4
to emit error messages indicating that inline directory entries are
corrupted. This occurs because the inline offset used to locate
inline directory entries in the inode body is not updated when an
xattr in that shared region is deleted and the region is shifted in
memory to recover the space it occupied. If the deleted xattr precedes
the system.data attribute, which points to the inline directory entries,
that attribute will be moved further up in the region. The inline
offset continues to point to whatever is located in system.data's former
location, with unfortunate effects when used to access directory entries
or (presumably) inline data in the inode body.
Merge tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Reinstate support for little endian ELFv1 binaries, which it turns
out still exist in the wild.
- Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it
lead to dead code generation and seemed to trigger compiler bugs in
some edge cases.
- Fix a deadlock in the pseries VAS code, between live migration and
the driver's mmap handler.
- Disable KCOV instrumentation in the powerpc KASAN code.
Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren
Myneni, Russell Currey, and Uwe Kleine-König.
* tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
Revert "powerpc/64s: Remove support for ELFv1 little endian userspace"
powerpc/kasan: Disable KCOV in KASAN code
powerpc/512x: lpbfifo: Convert to platform remove callback returning void
powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files
Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto"
powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
Shyam Prasad N [Fri, 16 Jun 2023 10:37:46 +0000 (10:37 +0000)]
cifs: allow dumping keys for directories too
Dumping the enc/dec keys is a session wide operation.
And it should not matter if the ioctl was run on
a regular file or a directory.
Currently, we obtain the tcon pointer from the
cifs file handle. But since there's no dir open call
in cifs, this is not populated for dirs.
This change allows dumping of session keys using ioctl
even for directories. To do this, we'll now get the
tcon pointer from the superblock, and not from the file
handle.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Merge tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR)
check to fault handler to prevent error handling for return values
that don't indicate an error
- Use kfree_sensitive() instead of kfree() in paes crypto code to clear
memory that may contain keys before freeing it
- Fix reply buffer size calculation for CCA replies in zcrypt device
driver
* tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: fix reply buffer calculations for CCA replies
s390/crypto: use kfree_sensitive() instead of kfree()
s390/mm: fix per vma lock fault handling
Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- Fix for loop regressions (Mauricio)
- Fix a potential stall with batched wakeups in sbitmap (David)
- Fix for stall with recursive plug flushes (Ross)
- Skip accounting of empty requests for blk-iocost (Chengming)
- Remove a dead field in struct blk_mq_hw_ctx (Chengming)
* tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux:
loop: do not enforce max_loop hard limit by (new) default
loop: deprecate autoloading callback loop_probe()
sbitmap: fix batching wakeup
blk-iocost: skip empty flush bio in iocost
blk-mq: delete dead struct blk_mq_hw_ctx->queued field
blk-mq: Fix stall due to recursive flush plug
Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- Fix for io-wq not always honoring REQ_F_NOWAIT, if it was set and
punted directly (eg via DRAIN) (me)
- Capability check fix (Ondrej)
- Regression fix for the mmap changes that went into 6.4, which
apparently broke IA64 (Helge)
* tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux:
ia64: mmap: Consider pgoff when searching for free mapping
io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
io_uring: don't audit the capability check in io_uring_create()
Merge tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fix moortec,mr75203 schema usage of 'multipleOf' keyword
- Fix regression in systems depending on "of-display" device name
- Build fix for s390 with CONFIG_PCI=n and OF_EARLY_FLATTREE=y
- Drop two obsolete serial .txt bindings
* tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt
dt-bindings: serial: Remove obsolete cavium-uart.txt
dt-bindings: hwmon: moortec,mr75203: fix multipleOf for coefficients
of: Preserve "of-display" device name for compatibility
of: make OF_EARLY_FLATTREE depend on HAS_IOMEM
Merge tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Three fixes here:
- The issues with accounting for register and padding length on raw
buses turn out to be quite widespread in custom buses.
In order to avoid disturbing anything drop the initial fixes and
fall back to a point fix in the SMBus code where the issue was
originally noticed, a more substantial refactoring of the API which
ensures that all buses make the same assumptions will follow.
- The generic regcache code had been forcing on async I/O which did
not work with the new maple tree sync code when used with SPI.
Since that was mainly for the rbtree cache and the assumptions
about hardware that drove the choice are probably not true any more
fix this by pushing the enablement of async down into the rbtree
code.
This probably also makes cache syncs for systems faster though it's
not the point.
- The test code was triggering use of the rbtree and maple tree
caches with dynamic allocation of nodes since all the testing is
with RAM backed caches with no I/O performance issues.
Just disable the locking in the tests to avoid triggering warnings
when allocation debugging is turned on, it's not really what's
being tested"
* tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Disable locking for RBTREE and MAPLE unit tests
regcache: Push async I/O request down into the rbtree cache
regmap: Account for register length in SMBus I/O limits
regmap: Drop initial version of maximum transfer length fixes
Merge tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix initial value handling for output-only pins in gpio-tps68470
- fix two resource leaks in gpio-mvebu
* tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: mvebu: fix irq domain leak
gpio: mvebu: Make use of devm_pwmchip_add
gpio: tps68470: Make tps68470_gpio_output() always set the initial value
loop: do not enforce max_loop hard limit by (new) default
Problem:
The max_loop parameter is used for 2 different purposes:
1) initial number of loop devices to pre-create on init
2) maximum number of loop devices to add on access/open()
Historically, its default value (zero) caused 1) to create non-zero
number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on
2) to add devices with autoloading.
However, the default value changed in commit 85c50197716c ("loop: Fix
the max_loop commandline argument treatment when it is set to 0") to
CONFIG_BLK_DEV_LOOP_MIN_COUNT, for max_loop=0 not to pre-create devices.
That does improve 1), but unfortunately it breaks 2), as the default
behavior changed from no-limit to hard-limit.
Example:
For example, this userspace code broke for N >= CONFIG, if the user
relied on the default value 0 for max_loop:
mknod("/dev/loopN");
open("/dev/loopN"); // now fails with ENXIO
Though affected users may "fix" it with (loop.)max_loop=0, this means to
require a kernel parameter change on stable kernel update (that commit Fixes: an old commit in stable).
Solution:
The original semantics for the default value in 2) can be applied if the
parameter is not set (ie, default behavior).
This still keeps the intended function in 1) and 2) if set, and that
commit's intended improvement in 1) if max_loop=0.
Before 85c50197716c:
- default: 1) CONFIG devices 2) no limit
- max_loop=0: 1) CONFIG devices 2) no limit
- max_loop=X: 1) X devices 2) X limit
After 85c50197716c:
- default: 1) CONFIG devices 2) CONFIG limit (*)
- max_loop=0: 1) 0 devices (*) 2) no limit
- max_loop=X: 1) X devices 2) X limit
This commit:
- default: 1) CONFIG devices 2) no limit (*)
- max_loop=0: 1) 0 devices 2) no limit
- max_loop=X: 1) X devices 2) X limit
Future:
The issue/regression from that commit only affects code under the
CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is
contained under it.
Once that deprecated functionality/code is removed, the purpose 2) of
max_loop (hard limit) is no longer in use, so the module parameter
description can be changed then.
Tests:
Linux 6.4-rc7
CONFIG_BLK_DEV_LOOP_MIN_COUNT=8
CONFIG_BLOCK_LEGACY_AUTOLOAD=y
- default (original)
# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7
# ./test-loop
open: /dev/loop8: No such device or address
- default (patched)
# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7
# ./test-loop
#
- max_loop=0 (original & patched):
# ls -1 /dev/loop*
/dev/loop-control
# ./test-loop
#
- max_loop=8 (original & patched):
# ls -1 /dev/loop*
/dev/loop-control
/dev/loop0
...
/dev/loop7
# ./test-loop
open: /dev/loop8: No such device or address
- max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set)
# ls -1 /dev/loop*
/dev/loop-control
# ./test-loop
open: /dev/loop8: No such device or address
Fixes: 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0") Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
David Jeffery [Fri, 21 Jul 2023 09:57:15 +0000 (17:57 +0800)]
sbitmap: fix batching wakeup
Current code supposes that it is enough to provide forward progress by
just waking up one wait queue after one completion batch is done.
Unfortunately this way isn't enough, cause waiter can be added to wait
queue just after it is woken up.
Follows one example(64 depth, wake_batch is 8)
1) all 64 tags are active
2) in each wait queue, there is only one single waiter
3) each time one completion batch(8 completions) wakes up just one
waiter in each wait queue, then immediately one new sleeper is added
to this wait queue
4) after 64 completions, 8 waiters are wakeup, and there are still 8
waiters in each wait queue
5) after another 8 active tags are completed, only one waiter can be
wakeup, and the other 7 can't be waken up anymore.
Turns out it isn't easy to fix this problem, so simply wakeup enough
waiters for single batch.
Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20230721095715.232728-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"I've picked up a handful of arm64 fixes while Catalin's been away, so
here they are. Below is the usual summary, but we have basically have
two cleanups, a fix for an SME crash and a fix for hibernation:
- Fix saving of SME state after SVE vector length is changed
- Fix sparse warnings for missing vDSO function prototypes
- Fix hibernation resume path when kfence is enabled
- Fix field names for the HFGxTR_EL2 register"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
arm64: vdso: Clear common make C=2 warnings
arm64: mm: Make hibernation aware of KFENCE
arm64: Fix HFGxTR_EL2 field naming
Merge tag 'pm-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Revert three recent intel_idle commits that introduced a functional
issue, included a coding mistake and have been questioned at the
design level"
* tag 'pm-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "intel_idle: Add support for using intel_idle in a VM guest using just hlt"
Revert "intel_idle: Add a "Long HLT" C1 state for the VM guest mode"
Revert "intel_idle: Add __init annotation to matchup_vm_state_with_baremetal()"
Merge tag 'sound-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A pile of fixes that have been gathered since the previous pull. Most
of changes are device-specific, and nothing looks too scary.
- A memory leak fix in ALSA sequencer code in 6.5-rc
- Many fixes for ASoC Qualcomm CODEC drivers, covering SoundWire
probe problems
- A series of ASoC AMD fixes
- A few fixes and cleanups of selftest stuff
- HD-audio codec fixes and quirks for Clevo, HP, Lenovo, Dell"
* tag 'sound-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (52 commits)
ALSA: hda/realtek: Add support for DELL Oasis 13/14/16 laptops
ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp
ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx
selftests: ALSA: Add test-pcmtest-driver to .gitignore
ALSA: hda/realtek: Add quirk for Clevo NS70AU
ASoC: fsl_sai: Disable bit clock with transmitter
ALSA: seq: Fix memory leak at error path in snd_seq_create_port()
ASoC: SOF: ipc3-dtrace: uninitialized data in dfsentry_trace_filter_write()
ASoC: cs42l51: fix driver to properly autoload with automatic module loading
MAINTAINERS: Redo addition of ssm3515 to APPLE SOUND
ASoC: rt5640: Fix the issue of speaker noise
ALSA: hda/realtek - remove 3k pull low procedure
selftests: ALSA: Fix fclose on an already fclosed file pointer
ALSA: pcmtest: Don't use static storage to track per device data
ALSA: pcmtest: Convert to platform remove callback returning void
ASoC: dt-bindings: audio-graph-card2: Drop incomplete example
ASoC: dt-bindings: Update maintainer email id
ASoC: amd: ps: Fix extraneous error messages
ASoC: fsl_sai: Revert "ASoC: fsl_sai: Enable MCTL_MCLK_EN bit for master mode"
ASoC: codecs: SND_SOC_WCD934X should select REGMAP_IRQ
...
* tag 'fbdev-for-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbdev: Explicitly include correct DT includes
fbdev: ep93xx-fb: fix return value check in ep93xxfb_probe
fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe
fbdev: kyro: make some const read-only arrays static and reduce type size
fbcon: remove unused display (p) from fbcon_redraw()
sticon: make sticon_set_def_font() void and remove op parameter
vgacon: cache vc_cell_height in vgacon_cursor()
vgacon: let vgacon_doresize() return void
vgacon: remove unused xpos from vgacon_set_cursor_size()
vgacon: remove unneeded forward declarations
vgacon: switch vgacon_scrolldelta() and vgacon_restore_screen()
fbdev: imxfb: remove unneeded labels
fbdev: imxfb: Convert to devm_platform_ioremap_resource()
fbdev: imxfb: Convert to devm_kmalloc_array()
fbdev: imxfb: Removed unneeded release_mem_region
fbdev: imxfb: switch to DEFINE_SIMPLE_DEV_PM_OPS
fbdev: imxfb: warn about invalid left/right margin
The trouble is that the drm_dev_unplugged() checks are by design racy,
they do not synchronize against all outstanding ioctl. This is because
those ioctl could block forever (both for modeset and for driver
specific ioctls), leading to deadlocks in hotunplug. Instead the code
sections that touch the hardware need to be annotated with
drm_dev_enter/exit, to avoid accessing hardware resources after the
unload/remove has finished.
To avoid use-after-free issues all the involved userspace visible
objects are supposed to hold a reference on the underlying drm_device,
like drm_file does.
The issue now is that we missed one, the atomic modeset ioctl can be run
in a nonblocking fashion, and in that case it cannot rely on the implied
drm_device reference provided by the ioctl calling context. This can
result in a use-after-free if an nonblocking atomic commit is carefully
raced against a driver unload.
Fix this by unconditionally grabbing a drm_device reference for any
drm_atomic_state structures. Strictly speaking this isn't required for
blocking commits and TEST_ONLY calls, but it's the simpler approach.
Thanks to shanzhulig for the initial idea of grabbing an unconditional
reference, I just added comments, a condensed commit message and fixed a
minor potential issue in where exactly we drop the final reference.
Reported-by: shanzhulig <shanzhulig@gmail.com> Suggested-by: shanzhulig <shanzhulig@gmail.com> Reviewed-by: Maxime Ripard <mripard@kernel.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ia64: mmap: Consider pgoff when searching for free mapping
IA64 is the only architecture which does not consider the pgoff value when
searching for a possible free memory region with vm_unmapped_area().
Adding this seems to have no negative side effect on IA64, so add it now
to make IA64 consistent with all other architectures.
io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area()
The io_uring testcase is broken on IA-64 since commit d808459b2e31
("io_uring: Adjust mapping wrt architecture aliasing requirements").
The reason is, that this commit introduced an own architecture
independend get_unmapped_area() search algorithm which finds on IA-64 a
memory region which is outside of the regular memory region used for
shared userspace mappings and which can't be used on that platform
due to aliasing.
To avoid similar problems on IA-64 and other platforms in the future,
it's better to switch back to the architecture-provided
get_unmapped_area() function and adjust the needed input parameters
before the call. Beside fixing the issue, the function now becomes
easier to understand and maintain.
This patch has been successfully tested with the io_uring testcase on
physical x86-64, ppc64le, IA-64 and PA-RISC machines. On PA-RISC the LTP
mmmap testcases did not report any regressions.
Mark Brown [Thu, 20 Jul 2023 18:38:58 +0000 (19:38 +0100)]
arm64/fpsimd: Ensure SME storage is allocated after SVE VL changes
When we reconfigure the SVE vector length we discard the backing storage
for the SVE vectors and then reallocate on next SVE use, leaving the SME
specific state alone. This means that we do not enable SME traps if they
were already disabled. That means that userspace code can enter streaming
mode without trapping, putting the task in a state where if we try to save
the state of the task we will fault.
Since the ABI does not specify that changing the SVE vector length disturbs
SME state, and since SVE code may not be aware of SME code in the process,
we shouldn't simply discard any ZA state. Instead immediately reallocate
the storage for SVE, and disable SME if we change the SVE vector length
while there is no SME state active.
Disabling SME traps on SVE vector length changes would make the overall
code more complex since we would have a state where we have valid SME state
stored but might get a SME trap.
* tag 'drm-fixes-2023-07-21' of git://anongit.freedesktop.org/drm/drm: (31 commits)
accel/habanalabs: add more debugfs stub helpers
drm/nouveau/kms/nv50-: init hpd_irq_lock for PIOR DP
drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
drm/nouveau/i2c: fix number of aux event slots
drm/amdgpu: use a macro to define no xcp partition case
drm/amdgpu/vm: use the same xcp_id from root PD
drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create
drm/amdgpu: Allocate root PD on correct partition
drm/amd/display: Keep PHY active for DP displays on DCN31
drm/amd/display: Prevent vtotal from being set to 0
drm/amd/display: Disable MPC split by default on special asic
drm/amd/display: check TG is non-null before checking if enabled
drm/amd/display: Add polling method to handle MST reply packet
drm/amd/display: Clean up errors & warnings in amdgpu_dm.c
drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta
drm/amdgpu/pm: make mclk consistent for smu 13.0.7
drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
drm/amd/display: only accept async flips for fast updates
drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
drm/amd/display: add DCN301 specific logic for OTG programming
...
Merge tag 'net-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from BPF, netfilter, bluetooth and CAN.
Current release - regressions:
- eth: r8169: multiple fixes for PCIe ASPM-related problems
- vrf: fix RCU lockdep splat in output path
Previous releases - regressions:
- gso: fall back to SW segmenting with GSO_UDP_L4 dodgy bit set
- dsa: mv88e6xxx: do a final check before timing out when polling
- nf_tables: fix sleep in atomic in nft_chain_validate
Previous releases - always broken:
- sched: fix undoing tcf_bind_filter() in multiple classifiers
- bpf, arm64: fix BTI type used for freplace attached functions
- can: gs_usb: fix time stamp counter initialization
- nft_set_pipapo: fix improper element removal (leading to UAF)
Misc:
- net: support STP on bridge in non-root netns, STP prevents packet
loops so not supporting it results in freezing systems of
unsuspecting users, and in turn very upset noises being made
- fix kdoc warnings
- annotate various bits of TCP state to prevent data races"
* tag 'net-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
net: phy: prevent stale pointer dereference in phy_init()
tcp: annotate data-races around fastopenq.max_qlen
tcp: annotate data-races around icsk->icsk_user_timeout
tcp: annotate data-races around tp->notsent_lowat
tcp: annotate data-races around rskq_defer_accept
tcp: annotate data-races around tp->linger2
tcp: annotate data-races around icsk->icsk_syn_retries
tcp: annotate data-races around tp->keepalive_probes
tcp: annotate data-races around tp->keepalive_intvl
tcp: annotate data-races around tp->keepalive_time
tcp: annotate data-races around tp->tsoffset
tcp: annotate data-races around tp->tcp_tx_delay
Bluetooth: MGMT: Use correct address for memcpy()
Bluetooth: btusb: Fix bluetooth on Intel Macbook 2014
Bluetooth: SCO: fix sco_conn related locking and validity issues
Bluetooth: hci_conn: return ERR_PTR instead of NULL when there is no link
Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_remove_adv_monitor()
Bluetooth: coredump: fix building with coredump disabled
Bluetooth: ISO: fix iso_conn related locking and validity issues
Bluetooth: hci_event: call disconnect callback before deleting conn
...
The flush bio may have data, may have no data (empty flush), we couldn't
calculate cost for empty flush bio. So we'd better just skip it for now.
Another side effect is that empty flush bio's bio_end_sector() is 0, cause
iocg->cursor reset to 0, may break the cost calculation of other bios.
This isn't good enough, since flush bio still consume the device bandwidth,
but flush request is special, can be merged randomly in the flush state
machine, we don't know how to calculate cost for it for now.
Its completion time also has flaws, which may include the pre-flush or
post-flush completion time, but I don't know if we need to fix that and
how to fix it.
Jakub Kicinski [Thu, 20 Jul 2023 19:57:55 +0000 (12:57 -0700)]
Merge tag 'for-net-2023-07-20' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix building with coredump disabled
- Fix use-after-free in hci_remove_adv_monitor
- Use RCU for hci_conn_params and iterate safely in hci_sync
- Fix locking issues on ISO and SCO
- Fix bluetooth on Intel Macbook 2014
* tag 'for-net-2023-07-20' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: MGMT: Use correct address for memcpy()
Bluetooth: btusb: Fix bluetooth on Intel Macbook 2014
Bluetooth: SCO: fix sco_conn related locking and validity issues
Bluetooth: hci_conn: return ERR_PTR instead of NULL when there is no link
Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_remove_adv_monitor()
Bluetooth: coredump: fix building with coredump disabled
Bluetooth: ISO: fix iso_conn related locking and validity issues
Bluetooth: hci_event: call disconnect callback before deleting conn
Bluetooth: use RCU for hci_conn_params and iterate safely in hci_sync
====================
Jakub Kicinski [Thu, 20 Jul 2023 19:54:21 +0000 (12:54 -0700)]
Merge tag 'nf-23-07-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
Netfilter fixes for net:
The following patchset contains Netfilter fixes for net:
1. Fix spurious -EEXIST error from userspace due to
padding holes, this was broken since 4.9 days
when 'ignore duplicate entries on insert' feature was
added.
2. Fix a sched-while-atomic bug, present since 5.19.
3. Properly remove elements if they lack an "end range".
nft userspace always sets an end range attribute, even
when its the same as the start, but the abi doesn't
have such a restriction. Always broken since it was
added in 5.6, all three from myself.
4 + 5: Bound chain needs to be skipped in netns release
and on rule flush paths, from Pablo Neira.
* tag 'nf-23-07-20' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: skip bound chain on rule flush
netfilter: nf_tables: skip bound chain in netns release path
netfilter: nft_set_pipapo: fix improper element removal
netfilter: nf_tables: can't schedule in nft_chain_validate
netfilter: nf_tables: fix spurious set element insertion failure
====================
Vladimir Oltean [Thu, 20 Jul 2023 00:02:31 +0000 (03:02 +0300)]
net: phy: prevent stale pointer dereference in phy_init()
mdio_bus_init() and phy_driver_register() both have error paths, and if
those are ever hit, ethtool will have a stale pointer to the
phy_ethtool_phy_ops stub structure, which references memory from a
module that failed to load (phylib).
It is probably hard to force an error in this code path even manually,
but the error teardown path of phy_init() should be the same as
phy_exit(), which is now simply not the case.
io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq
io-wq assumes that an issue is blocking, but it may not be if the
request type has asked for a non-blocking attempt. If we get
-EAGAIN for that case, then we need to treat it as a final result
and not retry or arm poll for it.
Andy Shevchenko [Mon, 17 Jul 2023 09:32:14 +0000 (12:32 +0300)]
Bluetooth: MGMT: Use correct address for memcpy()
In function ‘fortify_memcpy_chk’,
inlined from ‘get_conn_info_complete’ at net/bluetooth/mgmt.c:7281:2:
include/linux/fortify-string.h:592:25: error: call to
‘__read_overflow2_field’ declared with attribute warning: detected read
beyond size of field (2nd parameter); maybe use struct_group()?
[-Werror=attribute-warning]
592 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
This is due to the wrong member is used for memcpy(). Use correct one.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tomasz Moń [Thu, 13 Jul 2023 10:25:14 +0000 (12:25 +0200)]
Bluetooth: btusb: Fix bluetooth on Intel Macbook 2014
Commit c13380a55522 ("Bluetooth: btusb: Do not require hardcoded
interface numbers") inadvertedly broke bluetooth on Intel Macbook 2014.
The intention was to keep behavior intact when BTUSB_IFNUM_2 is set and
otherwise allow any interface numbers. The problem is that the new logic
condition omits the case where bInterfaceNumber is 0.
Fix BTUSB_IFNUM_2 handling by allowing both interface number 0 and 2
when the flag is set.
Fixes: c13380a55522 ("Bluetooth: btusb: Do not require hardcoded interface numbers") Reported-by: John Holland <johnbholland@icloud.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217651 Signed-off-by: Tomasz Moń <tomasz.mon@nordicsemi.no> Tested-by: John Holland<johnbholland@icloud.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pauli Virtanen [Mon, 10 Jul 2023 16:48:19 +0000 (19:48 +0300)]
Bluetooth: SCO: fix sco_conn related locking and validity issues
Operations that check/update sk_state and access conn should hold
lock_sock, otherwise they can race.
The order of taking locks is hci_dev_lock > lock_sock > sco_conn_lock,
which is how it is in connect/disconnect_cfm -> sco_conn_del ->
sco_chan_del.
Fix locking in sco_connect to take lock_sock around updating sk_state
and conn.
sco_conn_del must not occur during sco_connect, as it frees the
sco_conn. Hold hdev->lock longer to prevent that.
sco_conn_add shall return sco_conn with valid hcon. Make it so also when
reusing an old SCO connection waiting for disconnect timeout (see
__sco_sock_close where conn->hcon is set to NULL).
This should not reintroduce the issue fixed in the earlier
commit 9a8ec9e8ebb5 ("Bluetooth: SCO: Fix possible circular locking
dependency on sco_connect_cfm"), the relevant fix of releasing lock_sock
in sco_sock_connect before acquiring hdev->lock is retained.
These changes mirror similar fixes earlier in ISO sockets.
Fixes: 9a8ec9e8ebb5 ("Bluetooth: SCO: Fix possible circular locking dependency on sco_connect_cfm") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: hci_conn: return ERR_PTR instead of NULL when there is no link
hci_connect_sco currently returns NULL when there is no link (i.e. when
hci_conn_link() returns NULL).
sco_connect() expects an ERR_PTR in case of any error (see line 266 in
sco.c). Thus, hcon set as NULL passes through to sco_conn_add(), which
tries to get hcon->hdev, resulting in dereferencing a NULL pointer as
reported by syzkaller.
The same issue exists for iso_connect_cis() calling hci_connect_cis().
Thus, make hci_connect_sco() and hci_connect_cis() return ERR_PTR
instead of NULL.
Reported-and-tested-by: syzbot+37acd5d80d00d609d233@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=37acd5d80d00d609d233 Fixes: 06149746e720 ("Bluetooth: hci_conn: Add support for linking multiple hcon") Signed-off-by: Siddh Raman Pant <code@siddh.me> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Douglas Anderson [Fri, 30 Jun 2023 22:33:14 +0000 (15:33 -0700)]
Bluetooth: hci_sync: Avoid use-after-free in dbg for hci_remove_adv_monitor()
KASAN reports that there's a use-after-free in
hci_remove_adv_monitor(). Trawling through the disassembly, you can
see that the complaint is from the access in bt_dev_dbg() under the
HCI_ADV_MONITOR_EXT_MSFT case. The problem case happens because
msft_remove_monitor() can end up freeing the monitor
structure. Specifically:
hci_remove_adv_monitor() ->
msft_remove_monitor() ->
msft_remove_monitor_sync() ->
msft_le_cancel_monitor_advertisement_cb() ->
hci_free_adv_monitor()
Let's fix the problem by just stashing the relevant data when it's
still valid.
Fixes: 7cf5c2978f23 ("Bluetooth: hci_sync: Refactor remove Adv Monitor") Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Bluetooth: coredump: fix building with coredump disabled
The btmtk driver uses an IS_ENABLED() check to conditionally compile
the coredump support, but this fails to build because the hdev->dump
member is in an #ifdef:
drivers/bluetooth/btmtk.c: In function 'btmtk_process_coredump':
drivers/bluetooth/btmtk.c:386:30: error: 'struct hci_dev' has no member named 'dump'
386 | schedule_delayed_work(&hdev->dump.dump_timeout,
| ^~
The struct member doesn't really make a huge difference in the total size,
so just remove the #ifdef around it to avoid adding similar checks
around each user.
Fixes: 872f8c253cb9e ("Bluetooth: btusb: mediatek: add MediaTek devcoredump support") Fixes: 9695ef876fd12 ("Bluetooth: Add support for hci devcoredump") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pauli Virtanen [Sun, 18 Jun 2023 22:04:33 +0000 (01:04 +0300)]
Bluetooth: ISO: fix iso_conn related locking and validity issues
sk->sk_state indicates whether iso_pi(sk)->conn is valid. Operations
that check/update sk_state and access conn should hold lock_sock,
otherwise they can race.
The order of taking locks is hci_dev_lock > lock_sock > iso_conn_lock,
which is how it is in connect/disconnect_cfm -> iso_conn_del ->
iso_chan_del.
Fix locking in iso_connect_cis/bis and sendmsg/recvmsg to take lock_sock
around updating sk_state and conn.
iso_conn_del must not occur during iso_connect_cis/bis, as it frees the
iso_conn. Hold hdev->lock longer to prevent that.
This should not reintroduce the issue fixed in commit 241f51931c35
("Bluetooth: ISO: Avoid circular locking dependency"), since the we
acquire locks in order. We retain the fix in iso_sock_connect to release
lock_sock before iso_connect_* acquires hdev->lock.
Similarly for commit 6a5ad251b7cd ("Bluetooth: ISO: Fix possible
circular locking dependency"). We retain the fix in iso_conn_ready to
not acquire iso_conn_lock before lock_sock.
iso_conn_add shall return iso_conn with valid hcon. Make it so also when
reusing an old CIS connection waiting for disconnect timeout (see
__iso_sock_close where conn->hcon is set to NULL).
Pauli Virtanen [Sun, 18 Jun 2023 22:04:32 +0000 (01:04 +0300)]
Bluetooth: hci_event: call disconnect callback before deleting conn
In hci_cs_disconnect, we do hci_conn_del even if disconnection failed.
ISO, L2CAP and SCO connections refer to the hci_conn without
hci_conn_get, so disconn_cfm must be called so they can clean up their
conn, otherwise use-after-free occurs.
Fixes: b8d290525e39 ("Bluetooth: clean up connection in hci_cs_disconnect") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Pauli Virtanen [Sun, 18 Jun 2023 22:04:31 +0000 (01:04 +0300)]
Bluetooth: use RCU for hci_conn_params and iterate safely in hci_sync
hci_update_accept_list_sync iterates over hdev->pend_le_conns and
hdev->pend_le_reports, and waits for controller events in the loop body,
without holding hdev lock.
Meanwhile, these lists and the items may be modified e.g. by
le_scan_cleanup. This can invalidate the list cursor or any other item
in the list, resulting to invalid behavior (eg use-after-free).
Use RCU for the hci_conn_params action lists. Since the loop bodies in
hci_sync block and we cannot use RCU or hdev->lock for the whole loop,
copy list items first and then iterate on the copy. Only the flags field
is written from elsewhere, so READ_ONCE/WRITE_ONCE should guarantee we
read valid values.
Free params everywhere with hci_conn_params_free so the cleanup is
guaranteed to be done properly.
This fixes the following, which can be triggered e.g. by BlueZ new
mgmt-tester case "Add + Remove Device Nowait - Success", or by changing
hci_le_set_cig_params to always return false, and running iso-tester:
==================================================================
BUG: KASAN: slab-use-after-free in hci_update_passive_scan_sync (net/bluetooth/hci_sync.c:2536 net/bluetooth/hci_sync.c:2723 net/bluetooth/hci_sync.c:2841)
Read of size 8 at addr ffff888001265018 by task kworker/u3:0/32
Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Pauli Virtanen <pav@iki.fi> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Merge tag 'iomap-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull iomap fix from Darrick Wong:
"Fix partial write regression.
It turns out that fstests doesn't have any test coverage for short
writes, but LTP does. Fortunately, this was caught right after -rc1
was tagged.
Summary:
- Fix a bug wherein a failed write could clobber short write status"
* tag 'iomap-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: micro optimize the ki_pos assignment in iomap_file_buffered_write
iomap: fix a regression for partial write errors
Merge tag 'xfs-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Flexarray declaration conversions.
This probably should've been done with the merge window open, but I
was not aware that the UBSAN knob would be getting turned up for 6.5,
and the fstests failures due to the kernel warnings are getting in the
way of testing.
Summary:
- Convert all the array[1] declarations into the accepted flex
array[] declarations so that UBSAN and friends will not get
confused"
* tag 'xfs-6.5-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: convert flex-array declarations in xfs attr shortform objects
xfs: convert flex-array declarations in xfs attr leaf blocks
xfs: convert flex-array declarations in struct xfs_attrlist*
Merge tag 'for-6.5-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"Stable fixes:
- fix race between balance and cancel/pause
- various iput() fixes
- fix use-after-free of new block group that became unused
- fix warning when putting transaction with qgroups enabled after
abort
- fix crash in subpage mode when page could be released between map
and map read
- when scrubbing raid56 verify the P/Q stripes unconditionally
- fix minor memory leak in zoned mode when a block group with an
unexpected superblock is found
Regression fixes:
- fix ordered extent split error handling when submitting direct IO
- user irq-safe locking when adding delayed iputs"
* tag 'for-6.5-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix warning when putting transaction with qgroups enabled after abort
btrfs: fix ordered extent split error handling in btrfs_dio_submit_io
btrfs: set_page_extent_mapped after read_folio in btrfs_cont_expand
btrfs: raid56: always verify the P/Q contents for scrub
btrfs: use irq safe locking when running and adding delayed iputs
btrfs: fix iput() on error pointer after error during orphan cleanup
btrfs: fix double iput() on inode after an error during orphan cleanup
btrfs: zoned: fix memory leak after finding block group with super blocks
btrfs: fix use-after-free of new block group that became unused
btrfs: be a bit more careful when setting mirror_num_ret in btrfs_map_block
btrfs: fix race between balance and cancel/pause
Merge tag 'regulator-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"One fix for an issue with parsing partially specified DTs"
* tag 'regulator-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: da9063: fix null pointer deref with partial DT config
s390/zcrypt: fix reply buffer calculations for CCA replies
The length information for available buffer space for CCA
replies is covered with two fields in the T6 header prepended
on each CCA reply: fromcardlen1 and fromcardlen2. The sum of
these both values must not exceed the AP bus limit for this
card (24KB for CEX8, 12KB CEX7 and older) minus the always
present headers.
The current code adjusted the fromcardlen2 value in case
of exceeding the AP bus limit when there was a non-zero
value given from userspace. Some tests now showed that this
was the wrong assumption. Instead the userspace value given for
this field should always be trusted and if the sum of the
two fields exceeds the AP bus limit for this card the first
field fromcardlen1 should be adjusted instead.
So now the calculation is done with this new insight in mind.
Also some additional checks for overflow have been introduced
and some comments to provide some documentation for future
maintainers of this complicated calculation code.
Furthermore the 128 bytes of fix overhead which is used
in the current code is not correct. Investigations showed
that for a reply always the same two header structs are
prepended before a possible payload. So this is also fixed
with this patch.
regmap: Disable locking for RBTREE and MAPLE unit tests
REGCACHE_RBTREE and REGCACHE_MAPLE dynamically allocate memory
for regmap operations. This is incompatible with spinlock based locking
which is used for fast_io operations. Disable locking for the associated
unit tests to avoid lockdep splashes.
Fixes: f033c26de5a5 ("regmap: Add maple tree based register cache") Fixes: 2238959b6ad2 ("regmap: Add some basic kunit tests") Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230720032848.1306349-1-linux@roeck-us.net Signed-off-by: Mark Brown <broonie@kernel.org>
Uwe Kleine-König pointed out we still have one resource leak in the mvebu
driver triggered on driver detach. Let's address it with a custom devm
action.
Fixes: 812d47889a8e ("gpio/mvebu: Use irq_domain_add_linear") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Arnd Bergmann [Fri, 9 Jun 2023 12:06:32 +0000 (14:06 +0200)]
accel/habanalabs: add more debugfs stub helpers
Two functions got added with normal prototypes for debugfs, but not
alternative when building without it:
drivers/accel/habanalabs/common/device.c: In function 'hl_device_init':
drivers/accel/habanalabs/common/device.c:2177:14: error: implicit declaration of function 'hl_debugfs_device_init'; did you mean 'hl_debugfs_init'? [-Werror=implicit-function-declaration]
drivers/accel/habanalabs/common/device.c:2305:9: error: implicit declaration of function 'hl_debugfs_device_fini'; did you mean 'hl_debugfs_remove_file'? [-Werror=implicit-function-declaration]
Zhen Lei [Thu, 13 Jul 2023 11:58:31 +0000 (19:58 +0800)]
arm64: vdso: Clear common make C=2 warnings
make C=2 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- xxx.o
When I use the command above to do a 'make C=2' check on any object file,
the following warnings are always output:
CHECK arch/arm64/kernel/vdso/vgettimeofday.c
arch/arm64/kernel/vdso/vgettimeofday.c:9:5: warning:
symbol '__kernel_clock_gettime' was not declared. Should it be static?
arch/arm64/kernel/vdso/vgettimeofday.c:15:5: warning:
symbol '__kernel_gettimeofday' was not declared. Should it be static?
arch/arm64/kernel/vdso/vgettimeofday.c:21:5: warning:
symbol '__kernel_clock_getres' was not declared. Should it be static?
Therefore, the declaration of the three functions is added to eliminate
these common warnings to provide a clean output.
Nikhil V [Thu, 13 Jul 2023 07:07:57 +0000 (12:37 +0530)]
arm64: mm: Make hibernation aware of KFENCE
In the restore path, swsusp_arch_suspend_exit uses copy_page() to
over-write memory. However, with features like KFENCE enabled, there could
be situations where it may have marked some pages as not valid, due to
which it could be reported as invalid accesses.
Consider a situation where page 'P' was part of the hibernation image.
Now, when the resume kernel tries to restore the pages, the same page 'P'
is already in use in the resume kernel and is kfence protected, due to
which its mapping is removed from linear map. Since restoring pages happens
with the resume kernel page tables, we would end up accessing 'P' during
copy and results in kernel pagefault.
The proposed fix tries to solve this issue by marking PTE as valid for such
kfence protected pages.
Co-developed-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Signed-off-by: Pavankumar Kondeti <quic_pkondeti@quicinc.com> Signed-off-by: Nikhil V <quic_nprakash@quicinc.com> Link: https://lore.kernel.org/r/20230713070757.4093-1-quic_nprakash@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
netfilter: nft_set_pipapo: fix improper element removal
end key should be equal to start unless NFT_SET_EXT_KEY_END is present.
Its possible to add elements that only have a start key
("{ 1.0.0.0 . 2.0.0.0 }") without an internval end.
Insertion treats this via:
if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END))
end = (const u8 *)nft_set_ext_key_end(ext)->data;
else
end = start;
but removal side always uses nft_set_ext_key_end().
This is wrong and leads to garbage remaining in the set after removal
next lookup/insert attempt will give:
BUG: KASAN: slab-use-after-free in pipapo_get+0x8eb/0xb90
Read of size 1 at addr ffff888100d50586 by task nft-pipapo_uaf_/1399
Call Trace:
kasan_report+0x105/0x140
pipapo_get+0x8eb/0xb90
nft_pipapo_insert+0x1dc/0x1710
nf_tables_newsetelem+0x31f5/0x4e00
..
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Reported-by: lonial con <kongln9170@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
netfilter: nf_tables: fix spurious set element insertion failure
On some platforms there is a padding hole in the nft_verdict
structure, between the verdict code and the chain pointer.
On element insertion, if the new element clashes with an existing one and
NLM_F_EXCL flag isn't set, we want to ignore the -EEXIST error as long as
the data associated with duplicated element is the same as the existing
one. The data equality check uses memcmp.
For normal data (NFT_DATA_VALUE) this works fine, but for NFT_DATA_VERDICT
padding area leads to spurious failure even if the verdict data is the
same.
This then makes the insertion fail with 'already exists' error, even
though the new "key : data" matches an existing entry and userspace
told the kernel that it doesn't want to receive an error indication.
Fixes: c016c7e45ddf ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion") Signed-off-by: Florian Westphal <fw@strlen.de>
ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp
Generic fixup for CS35L41 amplifies should not have vendor specific
chained fixup. For ThinkPad laptops with led issue, we can just add
specific fixup.