Linus Torvalds [Tue, 21 May 2024 17:25:44 +0000 (10:25 -0700)]
Merge tag 'rpmsg-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
"This makes core rpmsg_class const and ensures that the automatic
module loading of the Qualcomm glink_ssr driver happens"
* tag 'rpmsg-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: qcom_glink_ssr: fix module autoloading
rpmsg: core: Make rpmsg_class constant
Linus Torvalds [Tue, 21 May 2024 17:09:28 +0000 (10:09 -0700)]
Merge tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Skip E820 checks for MCFG ECAM regions for new (2016+) machines,
since there's no requirement to describe them in E820 and some
platforms require ECAM to work (Bjorn Helgaas)
- Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien
Le Moal)
- Remove the last user of pci_enable_device_io() and the function itself (Heiner Kallweit)
- Wait for Link Training==0 to avoid possible race (Ilpo Järvinen)
- Skip waiting for devices that have been disconnected while
suspended (Ilpo Järvinen)
- Clear Secondary Status errors after enumeration since Master Aborts
and Unsupported Request errors are an expected part of enumeration
(Vidya Sagar)
MSI:
- Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas)
Error handling:
- Mask Genesys GL975x SD host controller Replay Timer Timeout
correctable errors caused by a hardware defect; the errors cause
interrupts that prevent system suspend (Kai-Heng Feng)
- Fix EDR-related _DSM support, which previously evaluated revision 5
but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan)
ASPM:
- Simplify link state definitions and mask calculation (Ilpo
Järvinen)
Power management:
- Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS
apparently doesn't know how to put them back in D0 (Mario
Limonciello)
CXL:
- Support resetting CXL devices; special handling required because
CXL Ports mask Secondary Bus Reset by default (Dave Jiang)
DOE:
- Support DOE Discovery Version 2 (Alexey Kardashevskiy)
Endpoint framework:
- Set endpoint BAR to be 64-bit if the driver says that's all the
device supports, in addition to doing so if the size is >2GB
(Niklas Cassel)
- Simplify endpoint BAR allocation and setting interfaces (Niklas
Cassel)
Cadence PCIe controller driver:
- Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof
Kozlowski)
Cadence PCIe endpoint driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
- Add DT binding R-Car V4H compatible for host and endpoint mode
(Yoshihiro Shimoda)
Rockchip PCIe controller driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
- Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski)
- Set the Subsystem Vendor ID, which was previously zero because it
was masked incorrectly (Rick Wertenbroek)
Synopsys DesignWare PCIe controller driver:
- Restructure DBI register access to accommodate devices where this
requires Refclk to be active (Manivannan Sadhasivam)
- Remove the deinit() callback, which was only needed by the
pcie-rcar-gen4, and do it directly in that driver (Manivannan
Sadhasivam)
- Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean
up things like eDMA (Manivannan Sadhasivam)
- Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel
to dw_pcie_ep_init() (Manivannan Sadhasivam)
- Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to
reflect the actual functionality (Manivannan Sadhasivam)
- Call dw_pcie_ep_init_registers() directly from all the glue
drivers, not just those that require active Refclk from the host
(Manivannan Sadhasivam)
- Remove the "core_init_notifier" flag, which was an obscure way for
glue drivers to indicate that they depend on Refclk from the host
(Manivannan Sadhasivam)
TI J721E PCIe driver:
- Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli)
- Add DT binding J722S SoC support (Siddharth Vadapalli)
- Constify and annotate with __ro_after_init (Heiner Kallweit)
- Convert DT bindings to YAML (Krzysztof Kozlowski)
- Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming
Zhou)"
* tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits)
PCI: Do not wait for disconnected devices when resuming
x86/pci: Skip early E820 check for ECAM region
PCI: Remove unused pci_enable_device_io()
ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
PCI: Update pci_find_capability() stub return types
PCI: Remove PCI_IRQ_LEGACY
scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios
Revert "genirq/msi: Provide constants for PCI/IMS support"
Revert "x86/apic/msi: Enable PCI/IMS"
Revert "iommu/vt-d: Enable PCI/IMS"
Revert "iommu/amd: Enable PCI/IMS"
Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support"
...
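The endpoint framework and "BAR type, not the BAR value" items in the pull message above describe when an endpoint BAR gets programmed as 64-bit. A minimal sketch of that decision rule in plain C follows. The `struct ep_bar` layout and the helper name are hypothetical and are not the kernel's interface; only the flag value mirrors the standard PCI memory-type bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 2:1 of a memory BAR encode its type; 0x04 means 64-bit. */
#define BAR_MEM_TYPE_64 0x04u

/* Hypothetical description of one endpoint BAR (illustration only). */
struct ep_bar {
	uint64_t size;    /* requested size in bytes */
	uint32_t flags;   /* BAR type flags chosen by the function driver */
	bool only_64bit;  /* controller reports this BAR cannot be 32-bit */
};

/*
 * Decide whether the BAR must be programmed as 64-bit:
 *  - the hardware only supports 64-bit for this BAR, or
 *  - the requested size is larger than 2GB, or
 *  - the BAR type flag asks for it.
 * The address value currently held in the BAR is deliberately ignored.
 */
static bool ep_bar_needs_64bit(const struct ep_bar *bar)
{
	const uint64_t two_gb = 2ULL * 1024 * 1024 * 1024;

	assert(bar != NULL);

	if (bar->only_64bit)
		return true;
	if (bar->size > two_gb)
		return true;
	return (bar->flags & BAR_MEM_TYPE_64) != 0;
}
```

Keying the decision on the type flag rather than on the stored address means a 64-bit BAR whose address happens to fit in 32 bits is still treated as 64-bit.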
Linus Torvalds [Tue, 21 May 2024 17:04:02 +0000 (10:04 -0700)]
Merge tag 'keys-trusted-next-6.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd
Pull trusted keys fixes from Jarkko Sakkinen:
"These are two bugs I found in trusted keys while working on a new
RSA key type for TPM2. Both originate from v5.13.
The memory leak is the more crucial of the two, but I also don't think
it is a good idea for the kernel to throw a WARN when the ASN.1 parser
fails, even if the failure is related to a programming error, as the
code is not that mature yet.
There are at least two WARNs in that code, but I picked just the one
more likely to trigger. Planning to fix the other one too over time"
* tag 'keys-trusted-next-6.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
KEYS: trusted: Do not use WARN when encode fails
KEYS: trusted: Fix memory leak in tpm2_key_encode()
Linus Torvalds [Tue, 21 May 2024 16:51:42 +0000 (09:51 -0700)]
Merge tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull bdev bd_inode updates from Al Viro:
"Replacement of bdev->bd_inode with sane(r) set of primitives by me and
Yu Kuai"
* tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RIP ->bd_inode
dasd_format(): killing the last remaining user of ->bd_inode
nilfs_attach_log_writer(): use ->bd_mapping->host instead of ->bd_inode
block/bdev.c: use the knowledge of inode/bdev coallocation
gfs2: more obvious initializations of mapping->host
fs/buffer.c: massage the remaining users of ->bd_inode to ->bd_mapping
blk_ioctl_{discard,zeroout}(): we only want ->bd_inode->i_mapping here...
grow_dev_folio(): we only want ->bd_inode->i_mapping there
use ->bd_mapping instead of ->bd_inode->i_mapping
block_device: add a pointer to struct address_space (page cache of bdev)
missing helpers: bdev_unhash(), bdev_drop()
block: move two helpers into bdev.c
block2mtd: prevent direct access of bd_inode
dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev->bd_inode)
blkdev_write_iter(): saner way to get inode and bdev
bcachefs: remove dead function bdev_sectors()
ext4: remove block_device_ejected()
erofs_buf: store address_space instead of inode
erofs: switch erofs_bread() to passing offset instead of block number
Linus Torvalds [Tue, 21 May 2024 15:34:51 +0000 (08:34 -0700)]
Merge tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs blocksize updates from Al Viro:
"This gets rid of bogus set_blocksize() uses, switches it over
to be based on a 'struct file *' and verifies that the caller
has the device opened exclusively"
* tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
make set_blocksize() fail unless block device is opened exclusive
set_blocksize(): switch to passing struct file *
btrfs_get_bdev_and_sb(): call set_blocksize() only for exclusive opens
swsusp: don't bother with setting block size
zram: don't bother with reopening - just use O_EXCL for open
swapon(2): open swap with O_EXCL
swapon(2)/swapoff(2): don't bother with block size
pktcdvd: sort set_blocksize() calls out
bcache_register(): don't bother with set_blocksize()
Linus Torvalds [Tue, 21 May 2024 12:34:43 +0000 (14:34 +0200)]
fs/pidfs: make 'lsof' happy with our inode changes
pidfs started using much saner inodes in commit b28ddcc32d8f ("pidfs:
convert to path_from_stashed() helper"), but that exposed the fact that
lsof had some knowledge of just how odd our old anon_inode usage was.
For example, legacy anon_inodes hadn't even initialized the inode type
in the inode mode, so everything had a type of zero.
So sane tools like 'stat' would report these files as "weird file", but
'lsof' instead used that (together with the name of the link in proc) to
notice that it's an anonymous inode, and used it to detect pidfd files.
Let's keep our new sane internal inode model, but mask the file type
bits at 'stat()' time in the getattr() function we already have, and
make the dentry name match what lsof expects too.
This keeps our internal models sane, but should make user space see the
same old odd behavior.
Jarkko Sakkinen [Mon, 13 May 2024 18:19:04 +0000 (21:19 +0300)]
KEYS: trusted: Do not use WARN when encode fails
When asn1_encode_sequence() fails, WARN is not the correct solution.
1. asn1_encode_sequence() is not an internal function (located
in lib/asn1_encode.c).
2. Location is known, which makes the stack trace useless.
3. Results in a crash if panic_on_warn is set.
It is also noteworthy that the use of WARN is undocumented, and it
should be avoided unless there is a carefully considered rationale to
use it.
Replace WARN with pr_err, and print the return value instead, which is
the only useful piece of information.
Cc: stable@vger.kernel.org # v5.13+
Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
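As a rough user-space illustration of the pattern this commit describes (log the return value, propagate the error, no stack trace), consider the sketch below. Both function names are hypothetical stand-ins, and `fprintf(stderr, ...)` plays the role that pr_err plays in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for an encoder that can fail (illustration only). */
static int encode_sequence(int fail)
{
	return fail ? -EINVAL : 0;
}

/*
 * Report the failure with a plain error message that includes the return
 * value, then hand the error back to the caller. Nothing here produces a
 * stack trace or can escalate into a crash.
 */
static int encode_key(int fail)
{
	int ret = encode_sequence(fail);

	if (ret < 0) {
		fprintf(stderr, "encode_key: encoding failed (%d)\n", ret);
		return ret;
	}
	return 0;
}
```

The caller still sees the failure through the return code, so nothing is lost by dropping the louder diagnostic.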
Jarkko Sakkinen [Sun, 19 May 2024 23:31:53 +0000 (02:31 +0300)]
KEYS: trusted: Fix memory leak in tpm2_key_encode()
'scratch' is never freed. Fix this by calling kfree() in both the success
and the error case.
Cc: stable@vger.kernel.org # +v5.13
Fixes: f2219745250f ("security: keys: trusted: use ASN.1 TPM2 key format for the blobs")
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
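One common way to make sure a scratch buffer is released on every path is a single cleanup label. The user-space sketch below assumes that shape; the commit message does not say the actual fix uses it. All names are hypothetical, and `malloc()`/`free()` stand in for the kernel allocator. A small counter makes a leak observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Counts buffers that are currently allocated, to make leaks visible. */
static int live_buffers;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_buffers++;
	return p;
}

static void tracked_free(void *p)
{
	if (p)
		live_buffers--;
	free(p);
}

/*
 * Hypothetical encoder: stages the data in a scratch buffer, copies it to
 * 'out', and releases the scratch buffer on both the success and the error
 * path through one label.
 */
static int encode_blob(const char *src, size_t len, char *out, size_t out_len)
{
	char *scratch;
	int ret = 0;

	scratch = tracked_alloc(len);
	if (!scratch)
		return -ENOMEM;

	memcpy(scratch, src, len);
	if (len > out_len) {
		ret = -EINVAL;	/* error case: still reaches the free below */
		goto out_free;
	}
	memcpy(out, scratch, len);

out_free:
	tracked_free(scratch);
	return ret;
}
```

After either outcome `live_buffers` is back to zero, which is the property the fix restores.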
Linus Torvalds [Mon, 20 May 2024 23:00:04 +0000 (16:00 -0700)]
Merge tag 'cocci-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux
Pull coccinelle updates from Julia Lawall:
"One patch slightly improves the text in a comment.
The other patch (on minmax.cocci) removes a report about ? being used
in return statements that has been generating not very useful
suggestions to change idiomatic code"
* tag 'cocci-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux:
Coccinelle: pm_runtime: Fix grammar in comment
coccinelle: misc: minmax: Suppress reports for err returns
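To show why such suggestions are unhelpful, here is a sketch of an error-return idiom. It assumes the suppressed reports concern returns of roughly this shape, which the patch title suggests but the pull message does not spell out. The function names and the `MIN` macro are hypothetical.

```c
#include <assert.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Idiomatic form: negative values are error codes, anything else is success. */
static int finish_transfer(int ret)
{
	return ret < 0 ? ret : 0;
}

/* Equivalent min()-style rewrite: same result, but the intent is hidden. */
static int finish_transfer_min(int ret)
{
	return MIN(ret, 0);
}
```

Both functions return the same values, but only the first reads as "pass the error through, otherwise report success".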
Linus Torvalds [Mon, 20 May 2024 22:18:34 +0000 (15:18 -0700)]
Merge tag 'asm-generic-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic cleanups from Arnd Bergmann:
"These are a few cross-architecture cleanup patches:
- separate out fbdev support from the asm/video.h contents that may
be used by either the old fbdev drivers or the newer drm display
code (Thomas Zimmermann)
- cleanups for the generic bitops code and asm-generic/bug.h
(Thorsten Blum)
- remove the orphaned include/asm-generic/page.h header that used to
be included by long-removed mmu-less architectures (me)"
* tag 'asm-generic-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
arch: Fix name collision with ACPI's video.o
bug: Improve comment
asm-generic: remove unused asm-generic/page.h
arch: Rename fbdev header and source files
arch: Remove struct fb_info from video helpers
arch: Select fbdev helpers with CONFIG_VIDEO
bitops: Change function return types from long to int
Linus Torvalds [Mon, 20 May 2024 22:11:53 +0000 (15:11 -0700)]
Merge tag 'soc-dt-late-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull more SoC devicetree updates from Arnd Bergmann:
"This is a follow-up to an earlier pull request for device tree
changes, as three platform maintainers sent their contents too late to
be included in the main set, but had not caused any further problems
since then:
- The Amlogic platform now contains support for two new SoC types,
the A4 and A5 chips for audio applications. Both come with a
reference board, and one more dts file gets added for the
combination of the MNT Reform Laptop with the BPI-CM4 CPU module
- The ASpeed platform adds support for six additional server
platforms that use ast2500 or ast2600 as their BMC, while another
one gets removed
- The RISC-V platforms from Microchip, Starfive and T-HEAD get
additional features for existing hardware, plus the addition of the
Milk-V Mars based on the StarFive VisionFive v2 board"
* tag 'soc-dt-late-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (76 commits)
riscv: dts: microchip: add pac1934 power-monitor to icicle
riscv: dts: thead: Fix node ordering in TH1520 device tree
ARM: dts: aspeed: Add ASRock E3C256D4I BMC
dt-bindings: arm: aspeed: document ASRock E3C256D4I
dt-bindings: trivial-devices: add isil,isl69269
ARM: dts: aspeed: x4tf: Add dts for asus x4tf project
dt-bindings: arm: aspeed: add ASUS X4TF board
ARM: dts: aspeed: Remove Facebook Cloudripper dts
ARM: dts: aspeed: drop unused ref_voltage ADC property
ARM: dts: aspeed: harma: correct Mellanox multi-host property
ARM: dts: aspeed: yosemitev2: correct Mellanox multi-host property
ARM: dts: aspeed: yosemite4: correct Mellanox multi-host property
ARM: dts: aspeed: greatlakes: correct Mellanox multi-host property
ARM: dts: aspeed: Modify I2C bus configuration
ARM: dts: aspeed: Disable unused ADC channels for Asrock X570D4U BMC
ARM: dts: aspeed: Modify GPIO table for Asrock X570D4U BMC
ARM: dts: aspeed: yosemite4: set bus13 frequency to 100k
ARM: dts: Aspeed: Bonnell: Fix NVMe LED labels
ARM: dts: aspeed: yosemite4: Enable ipmb device for OCP debug card
ARM: dts: aspeed: ahe50dc: Update lm25066 regulator name
...
Linus Torvalds [Mon, 20 May 2024 21:56:50 +0000 (14:56 -0700)]
Merge tag 'vfio-v6.10-rc1' of https://github.com/awilliam/linux-vfio
Pull vfio updates from Alex Williamson:
- The vfio fsl-mc bus driver has become orphaned. We'll consider
removing it in future releases if a new maintainer isn't found (Alex
Williamson)
- Improved usage of opaque data in vfio-pci INTx handling, avoiding
lookups of the eventfd through the interrupt and irqfd runtime paths
(Alex Williamson)
- Resolve an error path memory leak introduced in vfio-pci interrupt
code (Ye Bin)
- Addition of interrupt support for vfio devices exposed on the CDX
bus, including a new MSI allocation helper and export of existing
helpers for MSI alloc and free (Nipun Gupta)
- A new vfio-pci variant driver supporting migration of Intel QAT VF
devices for the GEN4 PFs (Xin Zeng & Yahui Cao)
- Resolve a possibly circular locking dependency in vfio-pci by
avoiding copy_to_user() from a PCI bus walk callback (Alex
Williamson)
- Trivial docs update to remove a duplicate semicolon (Foryun Ma)
* tag 'vfio-v6.10-rc1' of https://github.com/awilliam/linux-vfio:
vfio/pci: Restore zero affected bus reset devices warning
vfio: remove an extra semicolon
vfio/pci: Collect hot-reset devices to local buffer
vfio/qat: Add vfio_pci driver for Intel QAT SR-IOV VF devices
vfio/cdx: add interrupt support
genirq/msi: Add MSI allocation helper and export MSI functions
vfio/pci: fix potential memory leak in vfio_intx_enable()
vfio/pci: Pass eventfd context object through irqfd
vfio/pci: Pass eventfd context to IRQ handler
MAINTAINERS: Orphan vfio fsl-mc bus driver
Linus Torvalds [Mon, 20 May 2024 21:49:39 +0000 (14:49 -0700)]
Merge tag 'linux_kselftest-next-6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"Revert framework change to add -D_GNU_SOURCE to KHDR_INCLUDES to
Makefile, lib.mk, and kselftest_harness.h and follow-on changes to
cgroup and sgx test as they are causing build failures and warnings"
* tag 'linux_kselftest-next-6.10-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Revert "selftests/cgroup: Drop define _GNU_SOURCE"
Revert "selftests/sgx: Include KHDR_INCLUDES in Makefile"
Revert "selftests: Compile kselftest headers with -D_GNU_SOURCE"
Commit 2fd001cd3600 ("arch: Rename fbdev header and source files")
renames the video source files under arch/ such that they do not
refer to fbdev any longer. The new files named video.o conflict with
ACPI's video.ko module. Modprobing the ACPI module can then fail with
warnings about missing symbols, as shown below.
(i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_unregister (err -2)
(i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_register_backlight (err -2)
(i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol __acpi_video_get_backlight_type (err -2)
(i915_selftest:1107) igt_kmod-WARNING: i915: Unknown symbol acpi_video_register (err -2)
Fix the issue by renaming the architecture's video.o to video-common.o.
Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Closes: https://lore.kernel.org/intel-gfx/9dcac6e9-a3bf-4ace-bbdc-f697f767f9e0@suse.de/T/#t
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 2fd001cd3600 ("arch: Rename fbdev header and source files")
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Mon, 20 May 2024 20:23:43 +0000 (13:23 -0700)]
Merge tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've tried to address some performance issues on zoned
storage such as direct IO and write_hints. In addition, we've migrated
some IO paths using folio. Meanwhile, there are multiple bug fixes in
the compression paths, sanity check conditions, and error handlers.
Enhancements:
- allow direct io of pinned files for zoned storage
- assign the write hint per stream by default
- convert read paths and test_writeback to folio
- avoid allocating WARM_DATA segment for direct IO
Bug fixes:
- fix false alarm on invalid block address
- fix to add missing iput() in gc_data_segment()
- fix to release node block count in error path of
f2fs_new_node_page()
- compress:
- don't allow unaligned truncation on released compress inode
- cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
- fix error path of inc_valid_block_count()
- fix to update i_compr_blocks correctly
- fix block migration when section is not aligned to pow2
- don't trigger OPU on pinfile for direct IO
- fix to do sanity check on i_xattr_nid in sanity_check_inode()
- write missing last sum blk of file pinning section
- clear writeback when compression failed
- fix to adjust appropriate defragment pg_end
As usual, there are several minor code clean-ups, and fixes to manage
missing corner cases in the error paths"
* tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (50 commits)
f2fs: initialize last_block_in_bio variable
f2fs: Add inline to f2fs_build_fault_attr() stub
f2fs: fix some ambiguous comments
f2fs: fix to add missing iput() in gc_data_segment()
f2fs: allow dirty sections with zero valid block for checkpoint disabled
f2fs: compress: don't allow unaligned truncation on released compress inode
f2fs: fix to release node block count in error path of f2fs_new_node_page()
f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
f2fs: compress: fix error path of inc_valid_block_count()
f2fs: compress: fix typo in f2fs_reserve_compress_blocks()
f2fs: compress: fix to update i_compr_blocks correctly
f2fs: check validation of fault attrs in f2fs_build_fault_attr()
f2fs: fix to limit gc_pin_file_threshold
f2fs: remove unused GC_FAILURE_PIN
f2fs: use f2fs_{err,info}_ratelimited() for cleanup
f2fs: fix block migration when section is not aligned to pow2
f2fs: zone: fix to don't trigger OPU on pinfile for direct IO
f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode()
f2fs: fix to avoid allocating WARM_DATA segment for direct IO
f2fs: remove redundant parameter in is_next_segment_free()
...
Linus Torvalds [Mon, 20 May 2024 19:55:12 +0000 (12:55 -0700)]
Merge tag 'xfs-6.10-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Chandan Babu:
"Online repair feature continues to be expanded. Also, we now support
delayed allocation for realtime devices which have an extent size that
is equal to filesystem's block size.
New code:
- Introduce Parent Pointer extended attribute for inodes
- Bring back delalloc support for realtime devices which have an
extent size that is equal to filesystem's block size
- Improve performance of log incompat feature handling
Online Repair:
- Implement atomic file content exchanges i.e. exchange ranges of
bytes between two files atomically
- Create temporary files to repair file-based metadata. This uses
atomic file content exchange facility to swap file fork mappings
between the temporary file and the metadata inode
- Allow callers of directory/xattr code to set an explicit owner
number to be written into the header fields of any new blocks that
are created. This is required to avoid walking every block of the
new structure and modify their ownership during online repair
- Repair more data structures:
- Extended attributes
- Inode unlinked state
- Directories
- Symbolic links
- AGI's unlinked inode list
- Parent pointers
- Move Orphan files to lost and found directory
- Fixes for Inode repair functionality
- Introduce a new sub-AG FITRIM implementation to reduce the duration
for which the AGF lock is held
- Updates for the design documentation
- Use Parent Pointers to assist in checking directories, parent
pointers, extended attributes, and link counts
Fixes:
- Prevent userspace from reading invalid file data due to an incorrect
update of the file size when performing a non-atomic clone operation
- Minor fixes to online repair
- Fix confusing return values from xfs_bmapi_write()
- Fix an out of bounds access due to incorrect h_size during log
recovery
- Defer upgrading the extent counters in xfs_reflink_end_cow_extent()
until we know we are going to modify the extent mapping
- Remove racy access to if_bytes check in
xfs_reflink_end_cow_extent()
- Fix sparse warnings
Cleanups:
- Hold inode locks on all files involved in a rename until the
completion of the operation. This is in preparation for the parent
pointers patchset where parent pointers are applied in a separate
chained update from the actual directory update
- Compile out v4 support when disabled
- Cleanup xfs_extent_busy_clear()
- Remove unused flags and fields from struct xfs_da_args
- Remove definitions of unused functions
- Improve extended attribute validation
- Add higher level directory operations helpers to remove duplication
of code
- Cleanup quota (un)reservation interfaces"
* tag 'xfs-6.10-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (221 commits)
xfs: simplify iext overflow checking and upgrade
xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent
xfs: upgrade the extent counters in xfs_reflink_end_cow_extent later
xfs: xfs_quota_unreserve_blkres can't fail
xfs: consolidate the xfs_quota_reserve_blkres definitions
xfs: clean up buffer allocation in xlog_do_recovery_pass
xfs: fix log recovery buffer allocation for the legacy h_size fixup
xfs: widen flags argument to the xfs_iflags_* helpers
xfs: minor cleanups of xfs_attr3_rmt_blocks
xfs: create a helper to compute the blockcount of a max sized remote value
xfs: turn XFS_ATTR3_RMT_BUF_SPACE into a function
xfs: use unsigned ints for non-negative quantities in xfs_attr_remote.c
xfs: do not allocate the entire delalloc extent in xfs_bmapi_write
xfs: fix xfs_bmap_add_extent_delay_real for partial conversions
xfs: remove the xfs_iext_peek_prev_extent call in xfs_bmapi_allocate
xfs: pass the actual offset and len to allocate to xfs_bmapi_allocate
xfs: don't open code XFS_FILBLKS_MIN in xfs_bmapi_write
xfs: lift a xfs_valid_startblock into xfs_bmapi_allocate
xfs: remove the unusued tmp_logflags variable in xfs_bmapi_allocate
xfs: fix error returns from xfs_bmapi_write
...
Linus Torvalds [Mon, 20 May 2024 19:49:25 +0000 (12:49 -0700)]
Merge tag 'fs_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull isofs, udf, quota, ext2, and reiserfs updates from Jan Kara:
- convert isofs to the new mount API
- cleanup isofs Makefile
- udf conversion to folios
- some other small udf cleanups and fixes
- ext2 cleanups
- removal of reiserfs .writepage method
- update reiserfs README file
* tag 'fs_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
isofs: Use *-y instead of *-objs in Makefile
ext2: Remove LEGACY_DIRECT_IO dependency
isofs: Remove calls to set/clear the error flag
ext2: Remove call to folio_set_error()
udf: Use a folio in udf_write_end()
udf: Convert udf_page_mkwrite() to use a folio
udf: Convert udf_symlink_getattr() to use a folio
udf: Convert udf_adinicb_readpage() to udf_adinicb_read_folio()
udf: Convert udf_expand_file_adinicb() to use a folio
udf: Convert udf_write_begin() to use a folio
udf: Convert udf_symlink_filler() to use a folio
reiserfs: Trim some README bits
quota: fix to propagate error of mark_dquot_dirty() to caller
reiserfs: Convert to writepages
udf: udftime: prevent overflow in udf_disk_stamp_to_time()
ext2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
udf: replace deprecated strncpy/strcpy with strscpy
udf: Remove second semicolon
isofs: convert isofs to use the new mount API
fs: quota: use group allocation of per-cpu counters API
These kinds of patches are only making the code worse.
Compilers don't care about the unnecessary check, but removing it makes
the code less obvious to a human. The declaration of 'len' is more than
80 lines earlier, so a human won't easily see that 'len' is of an
unsigned type, so to a human the range check that checks against zero is
much more explicit and obvious.
Any tool that complains about a range check like this just because the
variable is unsigned is actively detrimental, and should be ignored.
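The argument above can be made concrete with a small sketch. The bound and the function names are hypothetical and are not taken from the code that was reverted. Both helpers accept the same values for an unsigned argument, but only the first shows both ends of the range at the point of use.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INFO_LEN 64u	/* hypothetical upper bound */

/* Explicit form: both ends of the range are visible where the check is made. */
static bool len_in_range_explicit(unsigned int len)
{
	return len >= 0 && len <= MAX_INFO_LEN;
}

/* Reduced form: correct only because the reader knows 'len' is unsigned. */
static bool len_in_range_reduced(unsigned int len)
{
	return len <= MAX_INFO_LEN;
}
```

Some compilers flag the `len >= 0` comparison as always true when extra warnings are enabled. That is the kind of tool complaint the message argues should be ignored.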
Linus Torvalds [Mon, 20 May 2024 19:31:43 +0000 (12:31 -0700)]
Merge tag 'fsnotify_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fsnotify updates from Jan Kara:
- reduce overhead of fsnotify infrastructure when no permission events
are in use
- a few small cleanups
* tag 'fsnotify_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fsnotify: fix UAF from FS_ERROR event on a shutting down filesystem
fsnotify: optimize the case of no permission event watchers
fsnotify: use an enum for group priority constants
fsnotify: move s_fsnotify_connectors into fsnotify_sb_info
fsnotify: lazy attach fsnotify_sb_info state to sb
fsnotify: create helper fsnotify_update_sb_watchers()
fsnotify: pass object pointer and type to fsnotify mark helpers
fanotify: merge two checks regarding add of ignore mark
fsnotify: create a wrapper fsnotify_find_inode_mark()
fsnotify: create helpers to get sb and connp from object
fsnotify: rename fsnotify_{get,put}_sb_connectors()
fsnotify: Avoid -Wflex-array-member-not-at-end warning
fanotify: remove unneeded sub-zero check for unsigned value
Linus Torvalds [Mon, 20 May 2024 17:23:39 +0000 (10:23 -0700)]
Merge tag 'dma-mapping-6.10-2024-05-20' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- optimize DMA sync calls when they are no-ops (Alexander Lobakin)
- fix swiotlb padding for untrusted devices (Michael Kelley)
- add documentation for swiotlb (Michael Kelley)
* tag 'dma-mapping-6.10-2024-05-20' of git://git.infradead.org/users/hch/dma-mapping:
dma: fix DMA sync for drivers not calling dma_set_mask*()
xsk: use generic DMA sync shortcut instead of a custom one
page_pool: check for DMA sync shortcut earlier
page_pool: don't use driver-set flags field directly
page_pool: make sure frag API fields don't span between cachelines
iommu/dma: avoid expensive indirect calls for sync operations
dma: avoid redundant calls for sync operations
dma: compile-out DMA sync op calls when not used
iommu/dma: fix zeroing of bounce buffer padding used by untrusted devices
swiotlb: remove alloc_size argument to swiotlb_tbl_map_single()
Documentation/core-api: add swiotlb documentation
Linus Torvalds [Mon, 20 May 2024 16:23:36 +0000 (09:23 -0700)]
Merge tag 'dmi-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi updates from Jean Delvare:
"Bug fixes:
- KCFI violation in dmi-id
- stop decoding on broken (short) DMI table entry
New features:
- print info about populated memory slots at boot"
* tag 'dmi-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
firmware: dmi: Add info message for number of populated and total memory slots
firmware: dmi: Stop decoding on broken entry
firmware: dmi-id: add a release callback function
Linus Torvalds [Mon, 20 May 2024 16:07:27 +0000 (09:07 -0700)]
Merge tag 'linux-watchdog-6.10-rc1' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
- Add Lenovo SE10 platform Watchdog Driver
- Other small fixes and improvements
* tag 'linux-watchdog-6.10-rc1' of git://www.linux-watchdog.org/linux-watchdog:
watchdog: LENOVO_SE10_WDT should depend on X86 && DMI
watchdog: sa1100: Fix PTR_ERR_OR_ZERO() vs NULL check in sa1100dog_probe()
watchdog: rti_wdt: Set min_hw_heartbeat_ms to accommodate a safety margin
watchdog: add HAS_IOPORT dependencies
watchdog/wdt-main: Use cpumask_of() to avoid cpumask var on stack
watchdog: bd9576: Drop "always-running" property
watchdog: mtx-1: drop driver owner assignment
watchdog: cpu5wdt.c: Fix use-after-free bug caused by cpu5wdt_trigger
watchdog: lenovo_se10_wdt: Watchdog driver for Lenovo SE10 platform
Linus Torvalds [Mon, 20 May 2024 15:55:18 +0000 (08:55 -0700)]
Merge tag 'i2c-for-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"i2c core removes an argument from the i2c_mux_add_adapter() call to
further deprecate class based I2C device instantiation. All users are
converted, too.
Other than that, Andi collected a number of I2C host driver patches.
Those merges have their own description"
* tag 'i2c-for-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (72 commits)
power: supply: sbs-manager: Remove class argument from i2c_mux_add_adapter()
i2c: mux: Remove class argument from i2c_mux_add_adapter()
i2c: synquacer: Fix an error handling path in synquacer_i2c_probe()
i2c: acpi: Unbind mux adapters before delete
i2c: designware: Replace MODULE_ALIAS() with MODULE_DEVICE_TABLE()
i2c: pxa: use 'time_left' variable with wait_event_timeout()
i2c: s3c2410: use 'time_left' variable with wait_event_timeout()
i2c: rk3x: use 'time_left' variable with wait_event_timeout()
i2c: qcom-geni: use 'time_left' variable with wait_for_completion_timeout()
i2c: jz4780: use 'time_left' variable with wait_for_completion_timeout()
i2c: synquacer: use 'time_left' variable with wait_for_completion_timeout()
i2c: stm32f7: use 'time_left' variable with wait_for_completion_timeout()
i2c: stm32f4: use 'time_left' variable with wait_for_completion_timeout()
i2c: st: use 'time_left' variable with wait_for_completion_timeout()
i2c: omap: use 'time_left' variable with wait_for_completion_timeout()
i2c: imx-lpi2c: use 'time_left' variable with wait_for_completion_timeout()
i2c: hix5hd2: use 'time_left' variable with wait_for_completion_timeout()
i2c: exynos5: use 'time_left' variable with wait_for_completion_timeout()
i2c: digicolor: use 'time_left' variable with wait_for_completion_timeout()
i2c: amd-mp2-plat: use 'time_left' variable with wait_for_completion_timeout()
...
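The 'time_left' renames above follow from the return convention of wait_for_completion_timeout() and wait_event_timeout(): 0 means the timeout expired, any other value is the time that was still remaining. A minimal userspace sketch of that convention follows. fake_wait_for_completion_timeout() and demo_xfer() are hypothetical stand-ins written for illustration, not kernel or driver code.

```c
#include <errno.h>

/*
 * Hypothetical userspace stand-in for wait_for_completion_timeout():
 * returns 0 if the timeout elapsed first, otherwise the number of
 * ticks that were still left (at least 1).
 */
unsigned long fake_wait_for_completion_timeout(unsigned long done_after,
					       unsigned long timeout)
{
	if (done_after >= timeout)
		return 0;		/* timed out */
	return timeout - done_after;	/* completed with time to spare */
}

/* Hypothetical transfer helper using the naming the commits adopt. */
int demo_xfer(unsigned long done_after, unsigned long timeout)
{
	unsigned long time_left;

	time_left = fake_wait_for_completion_timeout(done_after, timeout);
	if (!time_left)
		return -ETIMEDOUT;	/* 0 means "no time left", not success */

	return 0;
}
```

Calling the variable 'time_left' rather than 'timeout' or 'ret' makes the `if (!time_left)` check read as what it is: a timeout test.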
Linus Torvalds [Mon, 20 May 2024 15:47:54 +0000 (08:47 -0700)]
Merge tag 'v6.10-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"Fix a bug in the new ecc P521 code as well as a buggy fix in qat"
* tag 'v6.10-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ecc - Prevent ecc_digits_from_bytes from reading too many bytes
crypto: qat - Fix ADF_DEV_RESET_SYNC memory leak
The framework change to add D_GNU_SOURCE to KHDR_INCLUDES
to Makefile, lib.mk, and kselftest_harness.h is reverted
as it is causing build failures and warnings.
Revert this change as this change depends on the framework
change.
Reported-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Linus Torvalds [Sun, 19 May 2024 21:02:03 +0000 (14:02 -0700)]
Merge tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-mm updates from Andrew Morton:
"Mainly singleton patches, documented in their respective changelogs.
Notable series include:
- Some maintenance and performance work for ocfs2 in Heming Zhao's
series "improve write IO performance when fragmentation is high".
- Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
exposed by fstests".
- kfifo header rework from Andy Shevchenko in the series "kfifo:
Clean up kfifo.h".
- GDB script fixes from Florian Rommel in the series "scripts/gdb:
Fixes for $lx_current and $lx_per_cpu".
- After much discussion, a coding-style update from Barry Song
explaining one reason why inline functions are preferred over
macros. The series is "codingstyle: avoid unused parameters for a
function-like macro""
* tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (62 commits)
fs/proc: fix softlockup in __read_vmcore
nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()
scripts: checkpatch: check unused parameters for function-like macro
Documentation: coding-style: ask function-like macros to evaluate parameters
nilfs2: use __field_struct() for a bitwise field
selftests/kcmp: remove unused open mode
nilfs2: remove calls to folio_set_error() and folio_clear_error()
kernel/watchdog_perf.c: tidy up kerneldoc
watchdog: allow nmi watchdog to use raw perf event
watchdog: handle comma separated nmi_watchdog command line
nilfs2: make superblock data array index computation sparse friendly
squashfs: remove calls to set the folio error flag
squashfs: convert squashfs_symlink_read_folio to use folio APIs
scripts/gdb: fix detection of current CPU in KGDB
scripts/gdb: make get_thread_info accept pointers
scripts/gdb: fix parameter handling in $lx_per_cpu
scripts/gdb: fix failing KGDB detection during probe
kfifo: don't use "proxy" headers
media: stih-cec: add missing io.h
media: rc: add missing io.h
...
Linus Torvalds [Sun, 19 May 2024 20:45:48 +0000 (13:45 -0700)]
Merge tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs updates from Kent Overstreet:
- More safety fixes, primarily found by syzbot
- Run the upgrade/downgrade paths in nochanges mode. Nochanges mode is
primarily for testing fsck/recovery in dry run mode, so it shouldn't
change anything besides disabling writes and holding dirty metadata
in memory.
The idea here was to reduce the amount of activity if we can't write
anything out, so that bringing up a filesystem in "super ro" mode
would be more likely to work for data recovery - but norecovery is
the correct option for this.
- btree_trans->locked; we now track whether a btree_trans has any btree
nodes locked, and this is used for improved assertions related to
trans_unlock() and trans_relock(). We'll also be using it for
improving how we work with lockdep in the future: we don't want
lockdep to be tracking individual btree node locks because we take
too many for lockdep to track, and it's not necessary since we have a
cycle detector.
- Trigger improvements that are prep work for online fsck
- BTREE_TRIGGER_check_repair; this regularizes how we do some repair
work for extents that goes with running triggers in fsck, and fixes
some subtle issues with transaction restarts there.
- bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
equivalence classes are for when snapshot deletion leaves behind
redundant snapshot nodes, but snapshot deletion now cleans this up
right away, so the abstraction doesn't need to leak.
- Improvements to how we resume writing to the journal in recovery. The
code for picking the new place to write when reading the journal is
greatly simplified and we also store the position in the superblock
for when we don't read the journal; this means that we preserve more
of the journal for list_journal debugging.
- Improvements to sysfs btree_cache and btree_node_cache, for debugging
memory reclaim.
- We now detect when we've blocked for 10 seconds on the allocator in
the write path and dump some useful info.
- Safety fixes for device references: this is a big series that
changes almost all device lookups to properly check if the device
exists and take a reference to it.
Previously we assumed that if a bkey exists that references a device
then the device must exist, and this was enforced in .invalid
methods, but this was incorrect because it meant device removal
relied on accounting being correct to not leave keys pointing to
invalid devices, and that's not something we can assume.
Getting the "pointer to invalid device" checks out of our .invalid()
methods fixes some long standing device removal bugs; the only
outstanding bug with device removal now is a race between the discard
path and deleting alloc info, which should be easily fixed.
- The allocator now prefers not to expand the new
member_info.btree_allocated bitmap, meaning if repair ever requires
scanning for btree nodes (because of corrupt interior nodes) we
won't have to scan the whole device(s).
- New coding style document, which among other things talks about the
correct usage of assertions
* tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs: (155 commits)
bcachefs: add no_invalid_checks flag
bcachefs: add counters for failed shrinker reclaim
bcachefs: Fix sb_field_downgrade validation
bcachefs: Plumb bch_validate_flags to sb_field_ops.validate()
bcachefs: s/bkey_invalid_flags/bch_validate_flags
bcachefs: fsync() should not return -EROFS
bcachefs: Invalid devices are now checked for by fsck, not .invalid methods
bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs()
bcachefs: kill bch2_dev_bkey_exists() in bch2_read_endio()
bcachefs: bch2_dev_get_ioref() checks for device not present
bcachefs: bch2_dev_get_ioref2(); io_read.c
bcachefs: bch2_dev_get_ioref2(); debug.c
bcachefs: bch2_dev_get_ioref2(); journal_io.c
bcachefs: bch2_dev_get_ioref2(); io_write.c
bcachefs: bch2_dev_get_ioref2(); btree_io.c
bcachefs: bch2_dev_get_ioref2(); backpointers.c
bcachefs: bch2_dev_get_ioref2(); alloc_background.c
bcachefs: for_each_bset() declares loop iter
bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h
bcachefs: Improve sysfs internal/btree_cache
...
Linus Torvalds [Sun, 19 May 2024 19:33:28 +0000 (12:33 -0700)]
Merge tag 'turbostat-for-Linux-6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- Survive sparse die IDs seen in Linux-6.9
- Handle clustered-uncore topology in new/upcoming hardware
- For non-root use, add ability to see software C-state counters
- Enable reading core and package hardware cstate via perf, and prefer
perf over the MSR driver access for these counters
* tag 'turbostat-for-Linux-6.10-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2024.05.10
tools/power turbostat: Ignore pkg_cstate_limit when it is not available
tools/power turbostat: Fix order of strings in pkg_cstate_limit_strings
tools/power turbostat: Read Package-cstates via perf
tools/power turbostat: Read Core-cstates via perf
tools/power turbostat: Avoid possible memory corruption due to sparse topology IDs
tools/power turbostat: Add columns for clustered uncore frequency
tools/power turbostat: Enable non-privileged users to read sysfs counters
tools/power turbostat: Replace _Static_assert with BUILD_BUG_ON
tools/power turbostat: Add ARL-H support
tools/power turbostat: Enhance ARL/LNL support
tools/power turbostat: Survive sparse die_id
tools/power turbostat: Remember global max_die_id
tools/power turbostat: Harden probe_intel_uncore_frequency()
tools/power turbostat: Add "snapshot:" Makefile target
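The sparse die_id fixes above address a general hazard: if IDs are not contiguous, a table indexed by ID must be sized by the largest ID plus one, not by the number of objects. A minimal sketch of that idea follows. alloc_per_die_counters() is a hypothetical helper written for illustration (it assumes non-negative IDs) and is not taken from turbostat.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: allocate one counter slot per possible die ID.
 * IDs may be sparse (e.g. 0, 2, 5), so the table is sized by the largest
 * ID seen plus one, not by the number of dies. Returns NULL if no IDs
 * were given or the allocation fails.
 */
int *alloc_per_die_counters(const int *die_ids, size_t nr_dies, int *max_die_id)
{
	int max_id = -1;
	size_t i;

	for (i = 0; i < nr_dies; i++)
		if (die_ids[i] > max_id)
			max_id = die_ids[i];

	*max_die_id = max_id;
	if (max_id < 0)
		return NULL;

	/* calloc() zero-fills, so slots for unused IDs simply read as 0. */
	return calloc((size_t)max_id + 1, sizeof(int));
}
```

Sizing the table by nr_dies (3 in the example IDs 0, 2, 5) would make an access at index 5 write past the end of the allocation.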
Linus Torvalds [Sun, 19 May 2024 19:01:00 +0000 (12:01 -0700)]
Merge tag 'kgdb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux
Pull kgdb updates from Daniel Thompson:
"Nine patches this cycle and they split into just three topics:
- Follow Coccinelle's recommendation to adopt str_plural()
- A set of seven patches to refactor kdb_read() to improve both code
clarity and its discipline with respect to fixed size buffers.
This isn't just a refactor. Between them these also fix a cursor
movement redraw problem and two buffer overflows (one latent and
one real, albeit difficult to tickle).
- Fix an NMI-safety problem when enqueuing kdb's keyboard reset code
I wrote eight of the nine patches in this collection so many thanks to
Doug Anderson for the reviews. The change that affects
drivers/tty/serial is acked by Greg KH"
* tag 'kgdb-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
serial: kgdboc: Fix NMI-safety problems from keyboard reset code
kdb: Simplify management of tmpbuffer in kdb_read()
kdb: Replace double memcpy() with memmove() in kdb_read()
kdb: Use format-specifiers rather than memset() for padding in kdb_read()
kdb: Merge identical case statements in kdb_read()
kdb: Fix console handling when editing and tab-completing commands
kdb: Use format-strings rather than '\0' injection in kdb_read()
kdb: Fix buffer overflow during tab-complete
kdb: Use str_plural() to fix Coccinelle warning
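The kdb_read() commits above replace a double memcpy() with memmove() and tighten handling of fixed-size buffers. A minimal userspace sketch of the same technique for a line editor follows. insert_char() is a hypothetical function written for illustration and is not the kdb code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Insert character c at position pos of the NUL-terminated string in buf.
 * bufsize is the total size of buf. Returns 0 on success, -1 if pos is
 * past the end of the string or the result would not fit.
 */
int insert_char(char *buf, size_t bufsize, size_t pos, char c)
{
	size_t len = strlen(buf);

	/* Need room for the existing text, the new character and the NUL. */
	if (pos > len || len + 2 > bufsize)
		return -1;

	/*
	 * Shift the tail, including the NUL, one byte to the right. Source
	 * and destination overlap, which memmove() handles and memcpy()
	 * does not, so no temporary buffer is needed.
	 */
	memmove(buf + pos + 1, buf + pos, len - pos + 1);
	buf[pos] = c;
	return 0;
}
```

Checking the size before moving anything is what keeps an edit on a full buffer from overflowing it.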
Linus Torvalds [Sun, 19 May 2024 18:42:29 +0000 (11:42 -0700)]
Merge tag 'x86-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a NOP-patching bug that resulted in valid but suboptimal
NOP sequences in certain cases
- Fix build warnings related to fall-through control flow
* tag 'x86-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Use the correct length when optimizing NOPs
x86/boot: Address clang -Wimplicit-fallthrough in vsprintf()
x86/boot: Add a fallthrough annotation
Linus Torvalds [Sun, 19 May 2024 18:38:15 +0000 (11:38 -0700)]
Merge tag 'sched-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
- Fix a sched_balance_newidle setting bug
- Fix bug in the setting of /sys/fs/cgroup/test/cpu.max.burst
- Fix variable-shadowing build warning
- Extend sched-domains debug output
- Fix documentation
- Fix comments
* tag 'sched-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Fix incorrect initialization of the 'burst' parameter in cpu_max_write()
sched/fair: Remove stale FREQUENCY_UTIL comment
sched/fair: Fix initial util_avg calculation
docs: cgroup-v1: Clarify that domain levels are system-specific
sched/debug: Dump domains' level
sched/fair: Allow disabling sched_balance_newidle with sched_relax_domain_level
arch/topology: Fix variable naming to avoid shadowing
Linus Torvalds [Sun, 19 May 2024 18:32:42 +0000 (11:32 -0700)]
Merge tag 'perf-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event updates from Ingo Molnar:
- Extend the x86 instruction decoder with APX and
other new instructions
- Misc cleanups
* tag 'perf-urgent-2024-05-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/cstate: Remove unused 'struct perf_cstate_msr'
perf/x86/rapl: Rename 'maxdie' to nr_rapl_pmu and 'dieid' to rapl_pmu_idx
x86/insn: Add support for APX EVEX instructions to the opcode map
x86/insn: Add support for APX EVEX to the instruction decoder logic
x86/insn: x86/insn: Add support for REX2 prefix to the instruction decoder opcode map
x86/insn: Add support for REX2 prefix to the instruction decoder logic
x86/insn: Add misc new Intel instructions
x86/insn: Add VEX versions of VPDPBUSD, VPDPBUSDS, VPDPWSSD and VPDPWSSDS
x86/insn: Fix PUSH instruction in x86 instruction decoder opcode map
x86/insn: Add Key Locker instructions to the opcode map
Linus Torvalds [Sun, 19 May 2024 16:21:03 +0000 (09:21 -0700)]
Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initialization code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folios".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests in the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
Linus Torvalds [Sat, 18 May 2024 21:11:54 +0000 (14:11 -0700)]
Merge tag 'ext4_for_linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
- more folio conversion patches
- add support for FS_IOC_GETFSSYSFSPATH
- mballoc cleanups and add more kunit tests
- sysfs cleanups and bug fixes
- miscellaneous bug fixes and cleanups
* tag 'ext4_for_linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits)
ext4: fix error pointer dereference in ext4_mb_load_buddy_gfp()
jbd2: add prefix 'jbd2' for 'shrink_type'
jbd2: use shrink_type type instead of bool type for __jbd2_journal_clean_checkpoint_list()
ext4: fix uninitialized ratelimit_state->lock access in __ext4_fill_super()
ext4: remove calls to to set/clear the folio error flag
ext4: propagate errors from ext4_sb_bread() in ext4_xattr_block_cache_find()
ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find()
jbd2: remove redundant assignement to variable err
ext4: remove the redundant folio_wait_stable()
ext4: fix potential unnitialized variable
ext4: convert ac_buddy_page to ac_buddy_folio
ext4: convert ac_bitmap_page to ac_bitmap_folio
ext4: convert ext4_mb_init_cache() to take a folio
ext4: convert bd_buddy_page to bd_buddy_folio
ext4: convert bd_bitmap_page to bd_bitmap_folio
ext4: open coding repeated check in next_linear_group
ext4: use correct criteria name instead stale integer number in comment
ext4: call ext4_mb_mark_free_simple to free continuous bits in found chunk
ext4: add test_mb_mark_used_cost to estimate cost of mb_mark_used
ext4: keep "prefetch_grp" and "nr" consistent
...
Linus Torvalds [Sat, 18 May 2024 21:04:20 +0000 (14:04 -0700)]
Merge tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
"This is a light release containing mostly optimizations, code
clean-ups, and minor bug fixes. This development cycle has focused on
non-upstream kernel work:
1. Continuing to build upstream CI for NFSD, based on kdevops
2. Backporting NFSD filecache-related fixes to selected LTS kernels
One notable new feature in v6.10 NFSD is the addition of a new netlink
protocol dedicated to configuring NFSD. A new user space tool,
nfsdctl, is to be added to nfs-utils. Lots more to come here.
As always I am very grateful to NFSD contributors, reviewers, testers,
and bug reporters who participated during this cycle"
* tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (29 commits)
NFSD: Force all NFSv4.2 COPY requests to be synchronous
SUNRPC: Fix gss_free_in_token_pages()
NFS/knfsd: Remove the invalid NFS error 'NFSERR_OPNOTSUPP'
knfsd: LOOKUP can return an illegal error value
nfsd: set security label during create operations
NFSD: Add COPY status code to OFFLOAD_STATUS response
NFSD: Record status of async copy operation in struct nfsd4_copy
SUNRPC: Remove comment for sp_lock
NFSD: add listener-{set,get} netlink command
SUNRPC: add a new svc_find_listener helper
SUNRPC: introduce svc_xprt_create_from_sa utility routine
NFSD: add write_version to netlink command
NFSD: convert write_threads to netlink command
NFSD: allow callers to pass in scope string to nfsd_svc
NFSD: move nfsd_mutex handling into nfsd_svc callers
lockd: host: Remove unnecessary statements'host = NULL;'
nfsd: don't create nfsv4recoverydir in nfsdfs when not used.
nfsd: optimise recalculate_deny_mode() for a common case
nfsd: add tracepoint in mark_client_expired_locked
nfsd: new tracepoint for check_slot_seqid
...
Linus Torvalds [Sat, 18 May 2024 20:04:15 +0000 (13:04 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"Aside from the usual things this has an arch update for
__iowrite64_copy() used by the RDMA drivers.
This API was intended to generate large 64 byte MemWr TLPs on PCI.
These days most processors had done this by just repeating writel() in
a loop. S390 and some new ARM64 designs require a special helper to
get this to generate.
- Small improvements and fixes for erdma, efa, hfi1, bnxt_re
- Fix a UAF crash after module unload on leaking restrack entry
- Continue adding full RDMA support in mana with support for EQs,
GIDs and CQs
- Improvements to the mkey cache in mlx5
- DSCP traffic class support in hns and several bug fixes
- Cap the maximum number of MADs in the receive queue to avoid OOM
- Another batch of rxe bug fixes from large scale testing
- __iowrite64_copy() optimizations for write combining MMIO memory
- Remove NULL checks before dev_put/hold()
- EFA support for receive with immediate
- Fix a recent memleaking regression in a cma error path"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (70 commits)
RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siw
RDMA/IPoIB: Fix format truncation compilation errors
bnxt_re: avoid shift undefined behavior in bnxt_qplib_alloc_init_hwq
RDMA/efa: Support QP with unsolicited write w/ imm. receive
IB/hfi1: Remove generic .ndo_get_stats64
IB/hfi1: Do not use custom stat allocator
RDMA/hfi1: Use RMW accessors for changing LNKCTL2
RDMA/mana_ib: implement uapi for creation of rnic cq
RDMA/mana_ib: boundary check before installing cq callbacks
RDMA/mana_ib: introduce a helper to remove cq callbacks
RDMA/mana_ib: create and destroy RNIC cqs
RDMA/mana_ib: create EQs for RNIC CQs
RDMA/core: Remove NULL check before dev_{put, hold}
RDMA/ipoib: Remove NULL check before dev_{put, hold}
RDMA/mlx5: Remove NULL check before dev_{put, hold}
RDMA/mlx5: Track DCT, DCI and REG_UMR QPs as diver_detail resources.
RDMA/core: Add an option to display driver-specific QPs in the rdmatool
RDMA/efa: Add shutdown notifier
RDMA/mana_ib: Fix missing ret value
IB/mlx5: Use __iowrite64_copy() for write combining stores
...
- test-power: add POWER_SUPPLY_PROP_CHARGE_BEHAVIOUR support
- chrome EC drivers: add ID based probing
- bq27xxx: simplify update loop to reduce I2C traffic
- max8903 binding: fix GPIO polarity description
* tag 'for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
dt-bindings: power: supply: max8903: specify flt-gpios as input
power: supply: bq27xxx: Move health reading out of update loop
power: supply: bq27xxx: Move cycle count reading out of update loop
power: supply: bq27xxx: Move energy reading out of update loop
power: supply: bq27xxx: Move charge reading out of update loop
power: supply: bq27xxx: Move time reading out of update loop
power: supply: bq27xxx: Move temperature reading out of update loop
power: supply: cros_pchg: provide ID table for avoiding fallback match
power: supply: cros_usbpd: provide ID table for avoiding fallback match
power: supply: core: simplify charge_behaviour formatting
power: supply: test-power: implement charge_behaviour property
Linus Torvalds [Sat, 18 May 2024 19:48:37 +0000 (12:48 -0700)]
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"I'm actually surprised this time. There aren't any new Qualcomm SoC
clk drivers. And there's zero diff in the core clk framework.
Instead we have new clk drivers for STM and Sophgo, with
Samsung^WGoogle in third for the diffstat because they introduced HSI0
and HSI2 clk drivers for Google's GS101 SoC (high speed interface
things like PCIe, UFS, and MMC).
Beyond those big diffs there's the usual updates to various clk
drivers for incorrect parent descriptions or missing
MODULE_DEVICE_TABLE()s, etc. Nothing in particular stands out as super
interesting here.
New Drivers:
- STM32MP257 SoC clk driver
- Airoha EN7581 SoC clk driver
- Sophgo CV1800B, CV1812H and SG2000 SoC clk driver
- Loongson-2k0500 and Loongson-2k2000 SoC clk driver
- Add HSI0 and HSI2 clock controllers for Google GS101
- Add i.MX95 BLK CTL clock driver
Updates:
- Allocate clk_ops dynamically for SCMI clk driver
- Add support in qcom RCG and RCG2 for multiple configurations for
the same frequency
- Use above support for IPQ8074 NSS port 5 and 6 clocks to resolve
issues
- Fix the Qualcomm APSS IPQ5018 PLL to fix boot failures of some
boards
- Cleanups and fixes for Qualcomm Stromer PLLs
- Reduce max CPU frequency on Qualcomm APSS IPQ5018
- Fix Kconfig dependencies of Qualcomm SM8650 GPU and SC8280XP camera
clk drivers
- Make Qualcomm MSM8998 Venus clocks functional
- Cleanup downstream remnants related to DisplayPort across Qualcomm
SM8450, SM6350, SM8550, and SM8650
- Reuse the Huayra APSS register map on Qualcomm MSM8996 CBF PLL
- Use a specific Qualcomm QCS404 compatible for the otherwise generic
HFPLL
- Remove Qualcomm SM8150 CPUSS AHB clk as it is unused
- Remove an unused field in the Qualcomm RPM clk driver
- Add missing MODULE_DEVICE_TABLE to Qualcomm MSM8917 and MSM8953
global clock controller drivers
- Allow choice of manual or firmware-driven control over PLLs, needed
to fully implement CPU clock controllers on Exynos850
- Correct PLL clock IDs on ExynosAutov9
- Propagate certain clock rates to allow setting proper SPI clock
rates on Google GS101
- Mark certain Google GS101 clocks critical
- Convert old S3C64xx clock controller bindings to DT schema
- Add new PLL rate and missing mux on Rockchip rk3568
- Add missing reset line on Rockchip rk3588
- Removal of an unused field in struct rockchip_mmc_clock
- Amlogic s4/a1: add regmap maximum register for proper debugfs dump
- Amlogic s4: add MODULE_DEVICE_TABLE() on pll and periph controllers
- Amlogic pll driver: print clock name on lock error to help debug
- Amlogic vclk: finish dsi clock path support
- Amlogic license: fix occurrence of "GPL v2" as reported by checkpatch
- Add PM runtime support to i.MX8MP Audiomix
- Add DT schema for i.MX95 Display Master Block Control
- Convert to platform remove callback returning void for i.MX8MP
Audiomix
- Add SPI (MSIOF) and external interrupt (INTC-EX) clocks on Renesas
R-Car V4M
- Add interrupt controller (PLIC) clock and reset on Renesas RZ/Five
- Prepare power domain support for Renesas RZ/G2L family members, and
add actual support on Renesas RZ/G3S SoC
- Add thermal, serial (SCIF), and timer (CMT/TMU) clocks on Renesas
R-Car V4M
- Add additional constraints to Allwinner A64 PLL MIPI clock
- Fix autoloading sunxi-ng clocks when built as a module"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (118 commits)
clk: samsung: Don't register clkdev lookup for the fixed rate clocks
clk, reset: microchip: mpfs: fix incorrect preprocessor conditions
clk: qcom: clk-alpha-pll: fix rate setting for Stromer PLLs
clk: qcom: apss-ipq-pll: fix PLL rate for IPQ5018
clk: qcom: Fix SM_GPUCC_8650 dependencies
clk: qcom: Fix SC_CAMCC_8280XP dependencies
dt-bindings: clocks: stm32mp25: add access-controllers description
clock, reset: microchip: move all mpfs reset code to the reset subsystem
clk: samsung: gs101: drop unused HSI2 clock parent data
clk: rockchip: rk3568: Add PLL rate for 724 MHz
clk: rockchip: Remove an unused field in struct rockchip_mmc_clock
dt-bindings: clock: fixed: Define a preferred node name
clk: meson: s4: fix module autoloading
clk: samsung: gs101: mark some apm UASC and XIU clocks critical
clk: imx: imx8mp: Convert to platform remove callback returning void
clk: imx: imx8mp: Switch to RUNTIME_PM_OPS()
clk: bcm: rpi: Assign ->num before accessing ->hws
clk: bcm: dvp: Assign ->num before accessing ->hws
clk: samsung: gs101: add support for cmu_hsi2
clk: samsung: gs101: add support for cmu_hsi0
...
Linus Torvalds [Sat, 18 May 2024 19:39:20 +0000 (12:39 -0700)]
Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Avoid 'constexpr', which is a keyword in C23
- Allow 'dtbs_check' and 'dt_compatible_check' to run independently of
'dt_binding_check'
- Fix weak references to avoid GOT entries in position-independent code
generation
- Convert the last use of 'optional' property in arch/sh/Kconfig
- Remove support for the 'optional' property in Kconfig
- Remove support for Clang's ThinLTO caching, which does not work with
the .incbin directive
- Change the semantics of $(src) so it always points to the source
directory, which fixes Makefile inconsistencies between upstream and
downstream
- Fix 'make tar-pkg' for RISC-V to produce a consistent package
- Provide reasonable default coverage for objtool, sanitizers, and
profilers
- Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.
- Remove the last use of tristate choice in drivers/rapidio/Kconfig
- Various cleanups and fixes in Kconfig
* tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits)
kconfig: use sym_get_choice_menu() in sym_check_prop()
rapidio: remove choice for enumeration
kconfig: lxdialog: remove initialization with A_NORMAL
kconfig: m/nconf: merge two item_add_str() calls
kconfig: m/nconf: remove dead code to display value of bool choice
kconfig: m/nconf: remove dead code to display children of choice members
kconfig: gconf: show checkbox for choice correctly
kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal
Makefile: remove redundant tool coverage variables
kbuild: provide reasonable defaults for tool coverage
modules: Drop the .export_symbol section from the final modules
kconfig: use menu_list_for_each_sym() in sym_check_choice_deps()
kconfig: use sym_get_choice_menu() in conf_write_defconfig()
kconfig: add sym_get_choice_menu() helper
kconfig: turn defaults and additional prompt for choice members into error
kconfig: turn missing prompt for choice members into error
kconfig: turn conf_choice() into void function
kconfig: use linked list in sym_set_changed()
kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED
kconfig: gconf: remove debug code
...
Linus Torvalds [Sat, 18 May 2024 17:55:13 +0000 (10:55 -0700)]
Merge tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core:
- IOMMU memory usage observability - This will make the memory used
for IO page tables explicitly visible.
- Simplify arch_setup_dma_ops()
Intel VT-d:
- Consolidate domain cache invalidation
- Remove private data from page fault message
- Allocate DMAR fault interrupts locally
- Cleanup and refactoring
ARM-SMMUv2:
- Support for fault debugging hardware on Qualcomm implementations
- Re-land support for the ->domain_alloc_paging() callback
ARM-SMMUv3:
- Improve handling of MSI allocation failure
- Drop support for the "disable_bypass" cmdline option
- Major rework of the CD creation code, following on directly from
the STE rework merged last time around.
- Add unit tests for the new STE/CD manipulation logic
AMD-Vi:
- Final part of SVA changes with generic IO page fault handling
Renesas IPMMU:
- Add support for R8A779H0 hardware
... and a couple of smaller fixes and updates across the sub-tree"
* tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits)
iommu/arm-smmu-v3: Make the kunit into a module
arm64: Properly clean up iommu-dma remnants
iommu/amd: Enable Guest Translation after reading IOMMU feature register
iommu/vt-d: Decouple igfx_off from graphic identity mapping
iommu/amd: Fix compilation error
iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry
iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
iommu/arm-smmu-v3: Move the CD generation for SVA into a function
iommu/arm-smmu-v3: Allocate the CD table entry in advance
iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
iommu/arm-smmu-v3: Consolidate clearing a CD table entry
iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
iommu/arm-smmu-v3: Add an ops indirection to the STE code
iommu/arm-smmu-qcom: Don't build debug features as a kernel module
iommu/amd: Add SVA domain support
iommu: Add ops->domain_alloc_sva()
iommu/amd: Initial SVA support for AMD IOMMU
iommu/amd: Add support for enable/disable IOPF
iommu/amd: Add IO page fault notifier handler
...
Linus Torvalds [Sat, 18 May 2024 17:51:35 +0000 (10:51 -0700)]
Merge tag 'random-6.10-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
- The vmgenid driver can now be bound using device tree, rather than
just ACPI.
The improvement, from Sudan Landge, lets Amazon's Firecracker VMM
make use of the virtual device without having to expose an otherwise
unused ACPI stack in their "micro VM".
* tag 'random-6.10-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
virt: vmgenid: add support for devicetree bindings
dt-bindings: rng: Add vmgenid support
virt: vmgenid: change implementation to use a platform driver
Linus Torvalds [Sat, 18 May 2024 17:48:07 +0000 (10:48 -0700)]
Merge tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This brings ioctl control to Landlock, contributed by Günther Noack.
This also adds him as a Landlock reviewer, and fixes an issue in the
sample"
* tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
MAINTAINERS: Add Günther Noack as Landlock reviewer
fs/ioctl: Add a comment to keep the logic in sync with LSM policies
MAINTAINERS: Notify Landlock maintainers about changes to fs/ioctl.c
landlock: Document IOCTL support
samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL_DEV
selftests/landlock: Exhaustive test for the IOCTL allow-list
selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets
selftests/landlock: Test IOCTLs on named pipes
selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH)
selftests/landlock: Test IOCTL with memfds
selftests/landlock: Test IOCTL support
landlock: Add IOCTL access right for character and block devices
samples/landlock: Fix incorrect free in populate_ruleset_net
Linus Torvalds [Sat, 18 May 2024 17:32:39 +0000 (10:32 -0700)]
Merge tag 'net-accept-more-20240515' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
"This adds support for IORING_CQE_F_SOCK_NONEMPTY for io_uring accept
requests.
This is very similar to previous work that enabled the same hint for
doing receives on sockets. By far the majority of the work here is
refactoring to enable the networking side to pass back whether or not
the socket had more pending requests after accepting the current one,
the last patch just wires it up for io_uring.
Not only does this enable applications to know whether there are more
connections to accept right now, it also enables smarter logic for
io_uring multishot accept on whether to retry immediately or wait for
a poll trigger"
* tag 'net-accept-more-20240515' of git://git.kernel.dk/linux:
io_uring/net: wire up IORING_CQE_F_SOCK_NONEMPTY for accept
net: pass back whether socket was empty post accept
net: have do_accept() take a struct proto_accept_arg argument
net: change proto and proto_ops accept type
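A minimal sketch of how an application might act on the new hint. The flag value matches the definition in the io_uring UAPI header and is repeated locally so the sketch builds without it; struct demo_cqe and should_accept_again() are hypothetical names, not part of io_uring or liburing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as in <linux/io_uring.h>; defined here to stay self-contained. */
#ifndef IORING_CQE_F_SOCK_NONEMPTY
#define IORING_CQE_F_SOCK_NONEMPTY (1U << 2)
#endif

/* Hypothetical stand-in for the two completion fields used below. */
struct demo_cqe {
	int32_t res;	/* accepted fd, or a negative errno */
	uint32_t flags;	/* IORING_CQE_F_* bits */
};

/*
 * Return true when the accept completed successfully and the kernel
 * reported more pending connections, so issuing another accept right
 * away is likely to succeed without waiting for a poll trigger.
 */
bool should_accept_again(const struct demo_cqe *cqe)
{
	assert(cqe != NULL);

	if (cqe->res < 0)
		return false;

	return (cqe->flags & IORING_CQE_F_SOCK_NONEMPTY) != 0;
}
```

A caller would check this after each accept completion and fall back to waiting for readiness when it returns false.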
Linus Torvalds [Sat, 18 May 2024 02:17:55 +0000 (19:17 -0700)]
kprobe/ftrace: fix build error due to bad function definition
Commit 1a7d0890dd4a ("kprobe/ftrace: bail out if ftrace was killed")
introduced a bad K&R function definition, which we haven't accepted in a
long long time.
Gcc seems to let it slide, but clang notices with the appropriate error:
kernel/kprobes.c:1140:24: error: a function declaration without a prototype is deprecated in all >
1140 | void kprobe_ftrace_kill()
| ^
| void
but this commit was apparently never in linux-next before it was sent
upstream, so it didn't get the appropriate build test coverage.
Fixes: 1a7d0890dd4a ("kprobe/ftrace: bail out if ftrace was killed")
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
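For illustration, the difference between the two definition styles looks roughly like this. Only the function name comes from the commit; the flag and the accessor are hypothetical stand-ins for the kernel's internal state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the real function updates. */
static bool demo_ftrace_disabled;

/*
 * Before C23, "void kprobe_ftrace_kill()" defines a function without a
 * prototype, so the compiler does not check the arguments at call sites.
 * Spelling the empty parameter list as "(void)" gives it a prototype,
 * which is what clang asks for in the error quoted above.
 */
void kprobe_ftrace_kill(void)
{
	demo_ftrace_disabled = true;
}

/* Hypothetical accessor so callers can observe the effect. */
bool demo_kprobe_ftrace_disabled(void)
{
	return demo_ftrace_disabled;
}
```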
Linus Torvalds [Sat, 18 May 2024 01:57:14 +0000 (18:57 -0700)]
Merge tag 'net-6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Current release - regressions:
- virtio_net: fix missed error path rtnl_unlock after control queue
locking rework
Current release - new code bugs:
- bpf: fix KASAN slab-out-of-bounds in percpu_array_map_gen_lookup,
caused by missing nested map handling
- drv: dsa: correct initialization order for KSZ88x3 ports
Previous releases - regressions:
- af_packet: do not call packet_read_pending() from
tpacket_destruct_skb(), fixing a performance regression
- ipv6: fix route deleting failure when metric equals 0, don't assume
0 means not set / default in this case
Previous releases - always broken:
- bridge: couple of syzbot-driven fixes"
* tag 'net-6.10-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (30 commits)
selftests: net: local_termination: annotate the expected failures
net: dsa: microchip: Correct initialization order for KSZ88x3 ports
MAINTAINERS: net: Update reviewers for TI's Ethernet drivers
dt-bindings: net: ti: Update maintainers list
l2tp: fix ICMP error handling for UDP-encap sockets
net: txgbe: fix to control VLAN strip
net: wangxun: match VLAN CTAG and STAG features
net: wangxun: fix to change Rx features
af_packet: do not call packet_read_pending() from tpacket_destruct_skb()
virtio_net: Fix missed rtnl_unlock
netrom: fix possible dead-lock in nr_rt_ioctl()
idpf: don't skip over ethtool tcp-data-split setting
dt-bindings: net: qcom: ethernet: Allow dma-coherent
bonding: fix oops during rmmod
net/ipv6: Fix route deleting failure when metric equals 0
selftests/net: reduce xfrm_policy test time
selftests/bpf: Adjust btf_dump test to reflect recent change in file_operations
selftests/bpf: Adjust test_access_variable_array after a kernel function name change
selftests/net/lib: no need to record ns name if it already exist
net: qrtr: ns: Fix module refcnt
...
Linus Torvalds [Sat, 18 May 2024 01:49:18 +0000 (18:49 -0700)]
Merge tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tool updates from Steven Rostedt:
"Specific for timerlat:
- Improve the output of timerlat top by adding a missing \n, and by
avoiding printing color-formatting characters where they are
translated to regular characters.
- Improve timerlat auto-analysis output by replacing '\t' with spaces
to avoid copy-and-paste issues when reporting problems.
- Make the user-space (-u) option the default, as it is the most
complete test. Add a -k option to use the in-kernel workload.
- On timerlat top and hist, add a summary with the overall results.
For instance, the minimum value for all CPUs, the overall average
and the maximum value from all CPUs.
- timerlat hist was printing initial values (i.e., 0 as max, and ~0
as min) if the trace stopped before the first Ret-User event. This
problem was fixed by printing the " - " no value string to the
output if that was the case.
For all RTLA tools:
- Add a --warm-up <seconds> option, allowing the workload to run for
<seconds> before starting to collect results.
- Add a --trace-buffer-size option, allowing the user to set the
tracing buffer size for -t option. This option is mainly useful for
reducing the trace file. Now rtla depends on libtracefs >= 1.6.
- Fix the -t [trace_file] parsing, now it does not require the '='
before the option parameter, and better handles the multiple ways a
user can pass the trace_file.txt"
* tag 'trace-tools-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Documentation: Fix -t, --trace
rtla: Fix -t\--trace[=file]
rtla/timerlat: Fix histogram report when a cpu count is 0
rtla: Add --trace-buffer-size option
rtla/timerlat: Make user-space threads the default
rtla: Add the --warm-up option
rtla/timerlat: Add a summary for hist mode
rtla/timerlat: Add a summary for top mode
rtla/timerlat: Use pretty formatting only on interactive tty
rtla/auto-analysis: Replace \t with spaces
rtla/timerlat: Simplify "no value" printing on top
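The "no value" handling described for timerlat hist can be sketched as follows. This is not the rtla code: struct demo_lat_stats and demo_format_min() are hypothetical, and the placeholder is reduced to a bare "-".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-CPU summary; min starts at ~0 and max at 0. */
struct demo_lat_stats {
	unsigned long long count;
	unsigned long long min;
	unsigned long long max;
};

/*
 * Format the minimum latency. When no sample was recorded, min still
 * holds its initial ~0 value, so print a placeholder instead of it.
 */
void demo_format_min(const struct demo_lat_stats *s, char *buf, size_t size)
{
	assert(s && buf && size > 0);

	if (s->count == 0)
		snprintf(buf, size, "%s", "-");
	else
		snprintf(buf, size, "%llu", s->min);
}
```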
Linus Torvalds [Sat, 18 May 2024 01:46:30 +0000 (18:46 -0700)]
Merge tag 'trace-user-events-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing user-event updates from Steven Rostedt:
- Minor update to the user_events interface
The ABI of creating a user event states that the fields are separated
by semicolons, and spaces should be ignored.
But the parsing expected at least one space to be there (which was
incorrect). Fix the reading of the string to handle fields separated
by semicolons but no space between them.
This does extend the API slightly, as "field;field" will now be
parsed and not cause an error. But it should not cause any regressions
as no logic should expect it to fail.
Note that the logic that parses the event fields to create the
trace_event works with no spaces after the semicolon. It is
the logic that tests against existing events that is inconsistent.
As a result, registering an event without spaces succeeds if the
event doesn't exist yet, but a second, identical call that registers
the same event, again without spaces, fails.
* tag 'trace-user-events-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
selftests/user_events: Add non-spacing separator check
tracing/user_events: Fix non-spaced field matching
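The matching rule described above (semicolon-separated fields, with spaces after the separator ignored) can be sketched in user space like this. demo_fields_match() is a hypothetical helper written for this note, not the kernel's matching code.

```c
#include <assert.h>
#include <stdbool.h>

static const char *demo_skip_spaces(const char *s)
{
	while (*s == ' ')
		s++;
	return s;
}

/*
 * Compare two field declarations, treating "a;b" and "a; b" as equal:
 * after each ';' any run of spaces is skipped on both sides.
 */
bool demo_fields_match(const char *a, const char *b)
{
	assert(a && b);

	a = demo_skip_spaces(a);
	b = demo_skip_spaces(b);

	while (*a && *b) {
		if (*a != *b)
			return false;
		if (*a == ';') {
			a = demo_skip_spaces(a + 1);
			b = demo_skip_spaces(b + 1);
		} else {
			a++;
			b++;
		}
	}

	/* Both strings must end at the same point. */
	return *a == *b;
}
```

With this rule, registering "u32 a;u32 b" and then "u32 a; u32 b" refers to the same event in either order.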
Linus Torvalds [Sat, 18 May 2024 01:40:37 +0000 (18:40 -0700)]
Merge tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing ring buffer updates from Steven Rostedt:
"Add ring_buffer memory mappings.
The tracing ring buffer was created based on being mostly used with
the splice system call. It is broken up into page ordered sub-buffers
and the reader swaps a new sub-buffer with an existing sub-buffer
that's part of the write buffer. It then has total access to the
swapped out sub-buffer and can do copyless movements of the memory
into other mediums (file system, network, etc).
The buffer is great for passing around the ring buffer contents in the
kernel, but is not so good for when the consumer is the user space
task itself.
A new interface is added that allows user space to memory map the ring
buffer. It will get all the write sub-buffers as well as reader
sub-buffer (that is not written to). It can send an ioctl to change
which sub-buffer is the new reader sub-buffer.
The ring buffer is read only to user space. It only needs to call the
ioctl when it is finished with a sub-buffer and needs a new sub-buffer
that the writer will not write over.
A self test program was also created for testing and can be used as an
example for the interface to user space. The libtracefs (external to
the kernel) also has code that interacts with this, although it is
disabled until the interface is in an official release. It can be
enabled by compiling the library with a special flag. This was used
for testing applications that perform better with the buffer being
mapped.
Memory mapped buffers have limitations. The main one is that they cannot
be used with the snapshot logic. If the buffer is mapped,
snapshots will be disabled. If any logic is set to trigger snapshots
on a buffer, that buffer will not be allowed to be mapped"
* tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Add cast to unsigned long addr passed to virt_to_page()
ring-buffer: Have mmapped ring buffer keep track of missed events
ring-buffer/selftest: Add ring-buffer mapping test
Documentation: tracing: Add ring-buffer mapping
tracing: Allow user-space mapping of the ring-buffer
ring-buffer: Introducing ring-buffer mapping functions
ring-buffer: Allocate sub-buffers with __GFP_COMP
Linus Torvalds [Sat, 18 May 2024 01:34:27 +0000 (18:34 -0700)]
Merge tag 'trace-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt:
- Remove unused ftrace_direct_funcs variables
- Fix a possible NULL pointer dereference race in eventfs
- Update do_div() usage in trace event benchmark test
- Speed up direct function registration with asynchronous RCU callback.
The synchronization was done in the registration code and this caused
delays when registering direct callbacks. Move the freeing to a
call_rcu() callback so that registration is no longer delayed.
- Replace simple_strtoul() usage with kstrtoul()
* tag 'trace-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
eventfs: Fix a possible null pointer dereference in eventfs_find_events()
ftrace: Fix possible use-after-free issue in ftrace_location()
ftrace: Remove unused global 'ftrace_direct_func_count'
ftrace: Remove unused list 'ftrace_direct_funcs'
tracing: Improve benchmark test performance by using do_div()
ftrace: Use asynchronous grace period for register_ftrace_direct()
ftrace: Replaces simple_strtoul in ftrace
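The practical difference behind the simple_strtoul() to kstrtoul() conversion is strictness: the older helper stops at the first invalid character and reports no error, while kstrtoul() returns 0 or a negative errno. A rough user-space analogue built on strtoul() is sketched below; demo_parse_ulong() is a hypothetical name and only handles base 10.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Strict decimal parse: the whole string must be a number, optionally
 * followed by a single newline. Returns 0 on success, -EINVAL for
 * malformed input and -ERANGE on overflow.
 */
int demo_parse_ulong(const char *s, unsigned long *res)
{
	char *end;
	unsigned long val;

	assert(s && res);

	/* strtoul() would accept leading spaces or a sign; reject them. */
	if (*s < '0' || *s > '9')
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;

	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;

	*res = val;
	return 0;
}
```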
Linus Torvalds [Sat, 18 May 2024 01:29:30 +0000 (18:29 -0700)]
Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
- tracing/probes: Add new pseudo-types %pd and %pD support for dumping
dentry name from 'struct dentry *' and file name from 'struct file *'
- uprobes performance optimizations:
- Speed up the BPF uprobe event by delaying the fetching of the
uprobe event arguments that are not used in BPF
- Avoid locking by speculatively checking whether uprobe event is
valid
- Reduce lock contention by using read/write_lock instead of
spinlock for uprobe list operation. This improved BPF uprobe
benchmark results by 43% on average
- rethook: Remove non-fatal warning messages when tracing stack from
BPF and skip rcu_is_watching() validation in rethook if possible
- objpool: Optimize objpool (which is used by kretprobes and fprobe as
rethook backend storage) by inlining functions and avoid caching
nr_cpu_ids because it is a const value
- kprobes: Check ftrace was killed in kprobes if it uses ftrace
* tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
kprobe/ftrace: bail out if ftrace was killed
selftests/ftrace: Fix required features for VFS type test case
objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
objpool: enable inlining objpool_push() and objpool_pop() operations
rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get()
ftrace: make extra rcu_is_watching() validation check optional
uprobes: reduce contention on uprobes_tree access
rethook: Remove warning messages printed for finding return address of a frame.
fprobe: Add entry/exit callbacks types
selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
Documentation: tracing: add new type '%pd' and '%pD' for kprobe
tracing/probes: support '%pD' type for print struct file's name
tracing/probes: support '%pd' type for print struct dentry's name
uprobes: add speculative lockless system-wide uprobe filter check
uprobes: prepare uprobe args buffer lazily
uprobes: encapsulate preparation of uprobe args buffer
Linus Torvalds [Sat, 18 May 2024 01:23:55 +0000 (18:23 -0700)]
Merge tag 'bootconfig-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig updates from Masami Hiramatsu:
- Do not put unneeded quotes on the extra command line items which were
inserted from the bootconfig.
- Remove redundant spaces from the extra command line.
* tag 'bootconfig-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
init/main.c: Minor cleanup for the setup_command_line() function
init/main.c: Remove redundant space from saved_command_line
bootconfig: do not put quotes on cmdline items unless necessary
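A minimal sketch of quoting only when needed. The summary above does not say which characters make quotes necessary, so treating whitespace as the trigger is an assumption of this sketch, and demo_format_item() is a hypothetical helper rather than the init/main.c code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "key=value" into buf, quoting the value only if it contains
 * whitespace (assumed rule). Returns what snprintf() returns.
 */
int demo_format_item(char *buf, size_t size, const char *key, const char *val)
{
	assert(buf && key && val);

	if (strpbrk(val, " \t\n"))
		return snprintf(buf, size, "%s=\"%s\"", key, val);

	return snprintf(buf, size, "%s=%s", key, val);
}
```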
Linus Torvalds [Sat, 18 May 2024 00:31:24 +0000 (17:31 -0700)]
Merge tag 'sysctl-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl
Pull sysctl updates from Joel Granados:
- Remove sentinel elements from ctl_table structs in kernel/*
Removing sentinels in ctl_table arrays reduces the build time size
and runtime memory consumed by ~64 bytes per array. Removals for
net/, io_uring/, mm/, ipc/ and security/ are set to go into mainline
through their respective subsystems making the next release the most
likely place where the final series that removes the check for
proc_name == NULL will land.
This adds to removals already in arch/, drivers/ and fs/.
- Adjust ctl_table definitions and references to allow constification
- Remove unused ctl_table function arguments
- Move non-const elements from ctl_table to ctl_table_header
- Make ctl_table pointers const in ctl_table_root structure
Making the static ctl_table structs const will increase safety by
keeping the pointers to proc_handler functions in .rodata. Though no
ctl_tables were made const in this PR, the groundwork for making
that possible has started with these changes sent by Thomas
Weißschuh.
* tag 'sysctl-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
sysctl: drop now unnecessary out-of-bounds check
sysctl: move sysctl type to ctl_table_header
sysctl: drop sysctl_is_perm_empty_ctl_table
sysctl: treewide: constify argument ctl_table_root::permissions(table)
sysctl: treewide: drop unused argument ctl_table_root::set_ownership(table)
bpf: Remove the now superfluous sentinel elements from ctl_table array
delayacct: Remove the now superfluous sentinel elements from ctl_table array
kprobes: Remove the now superfluous sentinel elements from ctl_table array
printk: Remove the now superfluous sentinel elements from ctl_table array
scheduler: Remove the now superfluous sentinel elements from ctl_table array
seccomp: Remove the now superfluous sentinel elements from ctl_table array
timekeeping: Remove the now superfluous sentinel elements from ctl_table array
ftrace: Remove the now superfluous sentinel elements from ctl_table array
umh: Remove the now superfluous sentinel elements from ctl_table array
kernel misc: Remove the now superfluous sentinel elements from ctl_table array
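To illustrate what dropping the sentinel means, the sketch below walks one table by looking for an empty terminating entry and another by an explicit element count. struct demo_entry is a hypothetical two-field stand-in, not struct ctl_table.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table entry; the real ctl_table has more fields. */
struct demo_entry {
	const char *procname;
	int value;
};

/* Old style: an empty entry marks the end of the array. */
static const struct demo_entry old_style[] = {
	{ "alpha", 1 },
	{ "beta", 2 },
	{ NULL, 0 },	/* sentinel */
};

/* New style: no sentinel, the size is passed alongside the pointer. */
static const struct demo_entry new_style[] = {
	{ "alpha", 1 },
	{ "beta", 2 },
};

int demo_sum_until_sentinel(const struct demo_entry *t)
{
	int sum = 0;

	assert(t);
	for (; t->procname; t++)
		sum += t->value;
	return sum;
}

int demo_sum_counted(const struct demo_entry *t, size_t n)
{
	int sum = 0;
	size_t i;

	assert(t || n == 0);
	for (i = 0; i < n; i++)
		sum += t[i].value;
	return sum;
}
```

The saving per array is exactly one element, which is where the per-array figure quoted above comes from.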
Linus Torvalds [Sat, 18 May 2024 00:27:49 +0000 (17:27 -0700)]
Merge tag 'devicetree-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DT Bindings:
- Convert samsung,exynos5-dp, atmel,lcdc, aspeed,ast2400-wdt bindings
to schemas
- Add bindings for Allwinner H616 NMI controller, Renesas r8a779g0
irqc, Renesas R-Car V4M TMU and CMT timers, Freescale S32G3
linflexuart, and Mediatek MT7988 XHCI
- Add 'reg' constraints on DSI and SPI display panels
- More dropping of unnecessary quotes in schemas
- Use full paths rather than relative paths in schema $refs
- Drop redundant storing of phandle for reserved memory
DT Core:
- Use scope based cleanups for kfree() and of_node_put()
- Track interrupt-map and power-supplies for fw_devlink
- Add buffer overflow check in of_modalias()
- Add and use __of_prop_free() helper for freeing struct property"
* tag 'devicetree-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (25 commits)
of: property: Add fw_devlink support for interrupt-map property
dt-bindings: display: panel: constrain 'reg' in DSI panels
dt-bindings: display: panel: constrain 'reg' in SPI panels
dt-bindings: display: samsung,ams495qa01: add missing SPI properties ref
dt-bindings: Use full path to other schemas
dt-bindings: PCI: qcom,pcie-sm8350: Drop redundant 'oneOf' sub-schema
of: module: add buffer overflow check in of_modalias()
dt-bindings: PCI: microchip: increase number of items in ranges property
dt-bindings: Drop unnecessary quotes on keys
dt-bindings: interrupt-controller: mediatek,mt6577-sysirq: Drop unnecessary quotes
of: property: Use scope based cleanup on port_node
of: reserved_mem: Remove the use of phandle from the reserved_mem APIs
of: property: fw_devlink: Add support for "power-supplies" binding
dt-bindings: watchdog: aspeed,ast2400-wdt: Convert to DT schema
dt-bindings: irq: sun7i-nmi: Add binding for the H616 NMI controller
dt-bindings: interrupt-controller: renesas,irqc: Add r8a779g0 support
dt-bindings: timer: renesas,tmu: Add R-Car V4M support
dt-bindings: timer: renesas,cmt: Add R-Car V4M support
of: Use scope based of_node_put() cleanups
of: Use scope based kfree() cleanups
...
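The scope-based cleanups mentioned under "DT Core" build on the compiler's cleanup attribute. A user-space sketch of the same idea, assuming GCC or Clang, is shown below; demo_auto_free, demo_free_buf() and the allocation counter are hypothetical names for illustration and only loosely mirror the kernel's __free(kfree) annotation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter so the effect of the cleanup can be observed. */
static int demo_live_buffers;

/* Called automatically with a pointer to the annotated variable. */
static void demo_free_buf(char **p)
{
	if (*p) {
		free(*p);
		demo_live_buffers--;
	}
}

#define demo_auto_free __attribute__((cleanup(demo_free_buf)))

/* Copy src into a temporary buffer and return its length. */
int demo_copy_length(const char *src)
{
	demo_auto_free char *buf = NULL;

	assert(src);

	buf = malloc(strlen(src) + 1);
	if (!buf)
		return -1;
	demo_live_buffers++;
	strcpy(buf, src);

	if (buf[0] == '\0')
		return 0;		/* early return, buf is still freed */

	return (int)strlen(buf);	/* evaluated before buf is freed */
}
```

No error path needs an explicit free(), which is the property that lets the kernel conversions drop goto-based unwinding.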
Jakub Kicinski [Thu, 16 May 2024 15:25:13 +0000 (08:25 -0700)]
selftests: net: local_termination: annotate the expected failures
Vladimir said when adding this test:
The bridge driver fares particularly badly [...] mainly because
it does not implement IFF_UNICAST_FLT.
See commit 90b9566aa5cd ("selftests: forwarding: add a test for
local_termination.sh").
We don't want to hide the known gaps, but having a test which
always fails prevents us from catching regressions. Report
the cases we know may fail as XFAIL.
Oleksij Rempel [Fri, 17 May 2024 05:01:21 +0000 (07:01 +0200)]
net: dsa: microchip: Correct initialization order for KSZ88x3 ports
Adjust the initialization sequence of KSZ88x3 switches to enable
802.1p priority control on Port 2 before configuring Port 1. This
change ensures the apptrust functionality on Port 1 operates
correctly, as it depends on the priority settings of Port 2. The
prior initialization sequence incorrectly configured Port 1 first,
which could lead to functional discrepancies.
Fixes: a1ea57710c9d ("net: dsa: microchip: dcb: add special handling for KSZ88X3 family")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://lore.kernel.org/r/20240517050121.2174412-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tom Parkin [Mon, 13 May 2024 17:22:47 +0000 (18:22 +0100)]
l2tp: fix ICMP error handling for UDP-encap sockets
Since commit a36e185e8c85
("udp: Handle ICMP errors for tunnels with same destination port on both endpoints")
UDP's handling of ICMP errors has allowed for UDP-encap tunnels to
determine socket associations in scenarios where the UDP hash lookup
could not.
Subsequently, commit d26796ae58940
("udp: check udp sock encap_type in __udp_lib_err")
subtly tweaked the approach such that UDP ICMP error handling would be
skipped for any UDP socket which has encapsulation enabled.
In the case of L2TP tunnel sockets using UDP-encap, this latter
modification effectively broke ICMP error reporting for the L2TP
control plane.
To a degree this isn't catastrophic inasmuch as the L2TP control
protocol defines a reliable transport on top of the underlying packet
switching network which will eventually detect errors and time out.
However, paying attention to the ICMP error reporting allows for more
timely detection of errors in L2TP userspace, and aids in debugging
connectivity issues.
Reinstate ICMP error handling for UDP encap L2TP tunnels:
* implement struct udp_tunnel_sock_cfg .encap_err_rcv in order to allow
the L2TP code to handle ICMP errors;
* only implement error-handling for tunnels which have a managed
socket: unmanaged tunnels using a kernel socket have no userspace to
report errors back to;
* flag the error on the socket, which allows for userspace to get an
error such as -ECONNREFUSED back from sendmsg/recvmsg;
* pass the error into ip[v6]_icmp_error() which allows for userspace to
get extended error information via MSG_ERRQUEUE.
- Add support for passing additional parameters to the fadump kernel.
- Add support for updating the kdump image on CPU/memory add/remove
events.
- Other small features, cleanups and fixes.
Thanks to Andrew Donnellan, Andy Shevchenko, Aneesh Kumar K.V, Arnd
Bergmann, Benjamin Gray, Bjorn Helgaas, Christian Zigotzky, Christophe
Jaillet, Christophe Leroy, Colin Ian King, Cédric Le Goater, Dr. David
Alan Gilbert, Erhard Furtner, Frank Li, GUO Zihua, Ganesh Goudar, Geoff
Levand, Ghanshyam Agrawal, Greg Kurz, Hari Bathini, Joel Stanley, Justin
Stitt, Kunwu Chan, Li Yang, Lidong Zhong, Madhavan Srinivasan, Mahesh
Salgaonkar, Masahiro Yamada, Matthias Schiffer, Naresh Kamboju, Nathan
Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Miehlbradt, Ran Wang,
Randy Dunlap, Ritesh Harjani, Sachin Sant, Shirisha Ganta, Shrikanth
Hegde, Sourabh Jain, Stephen Rothwell, sundar, Thorsten Blum, Vaibhav
Jain, Xiaowei Bao, Yang Li, and Zhao Chenhui.
* tag 'powerpc-6.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (85 commits)
powerpc/fadump: Fix section mismatch warning
powerpc/85xx: fix compile error without CONFIG_CRASH_DUMP
powerpc/fadump: update documentation about bootargs_append
powerpc/fadump: pass additional parameters when fadump is active
powerpc/fadump: setup additional parameters for dump capture kernel
powerpc/pseries/fadump: add support for multiple boot memory regions
selftests/powerpc/dexcr: Fix spelling mistake "predicition" -> "prediction"
KVM: PPC: Book3S HV nestedv2: Fix an error handling path in gs_msg_ops_kvmhv_nestedv2_config_fill_info()
KVM: PPC: Fix documentation for ppc mmu caps
KVM: PPC: code cleanup for kvmppc_book3s_irqprio_deliver
KVM: PPC: Book3S HV nestedv2: Cancel pending DEC exception
powerpc/xmon: Check cpu id in commands "c#", "dp#" and "dx#"
powerpc/code-patching: Use dedicated memory routines for patching
powerpc/code-patching: Test patch_instructions() during boot
powerpc64/kasan: Pass virtual addresses to kasan_init_phys_region()
powerpc: rename SPRN_HID2 define to SPRN_HID2_750FX
powerpc: Fix typos
powerpc/eeh: Fix spelling of the word "auxillary" and update comment
macintosh/ams: Fix unused variable warning
powerpc/Makefile: Remove bits related to the previous use of -mcmodel=large
...
Dan Carpenter [Fri, 10 May 2024 15:22:53 +0000 (18:22 +0300)]
ext4: fix error pointer dereference in ext4_mb_load_buddy_gfp()
This code calls folio_put() on an error pointer which will lead to a
crash. Check for both error pointers and NULL pointers before calling
folio_put().
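The check can be pictured with a user-space imitation of the kernel's error-pointer convention, where small negative errno values are encoded in the top of the address range. Everything below (the demo_folio type, the helpers, and the local ERR_PTR and IS_ERR_OR_NULL definitions) is a stand-in written for this note, not the ext4 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space imitations of the kernel helpers of the same names. */
static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline bool IS_ERR_OR_NULL(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)(-(long)MAX_ERRNO);
}

/* Hypothetical refcounted object standing in for a folio. */
struct demo_folio {
	int refcount;
};

static void demo_folio_put(struct demo_folio *f)
{
	f->refcount--;
}

/*
 * Drop a reference only when f is a real object. Dereferencing an
 * error pointer or NULL here is the crash the fix avoids.
 */
bool demo_put_if_valid(struct demo_folio *f)
{
	if (IS_ERR_OR_NULL(f))
		return false;

	demo_folio_put(f);
	return true;
}
```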
Alex Williamson [Thu, 16 May 2024 17:48:30 +0000 (11:48 -0600)]
vfio/pci: Restore zero affected bus reset devices warning
Yi notes relative to commit f6944d4a0b87 ("vfio/pci: Collect hot-reset
devices to local buffer") that we previously tested the resulting
device count with a WARN_ON, which was removed when we switched to
the in-loop user copy in commit b56b7aabcf3c ("vfio/pci: Copy hot-reset
device info to userspace in the devices loop"). Finding no devices in
the bus/slot would be an unexpected condition, so let's restore the
warning and trigger a -ERANGE error here as success with no devices
would be an unexpected result to userspace as well.
Stefan Berger [Fri, 10 May 2024 01:59:21 +0000 (21:59 -0400)]
crypto: ecc - Prevent ecc_digits_from_bytes from reading too many bytes
Prevent ecc_digits_from_bytes from reading too many bytes from the input
byte array in case an insufficient number of bytes is provided to fill the
output digit array of ndigits. Therefore, initialize the most significant
digits with 0 to avoid trying to read too many bytes later on. Convert the
function into a regular function since it is getting too big for an inline
function.
If too many bytes are provided on the input byte array the extra bytes
are ignored since the input variable 'ndigits' limits the number of digits
that will be filled.
Fixes: d67c96fb97b5 ("crypto: ecdsa - Convert byte arrays with key coordinates to digits")
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
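The zero-fill behaviour described in this commit can be sketched in userspace as follows. This is a minimal sketch, not the kernel function: digits_from_bytes_sketch() is a hypothetical name, the input is assumed to be a big-endian byte array, out[0] is assumed to be the least significant digit, and dropping the trailing bytes of an oversized input is a choice made by this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a big-endian byte array into 64-bit digits,
 * least significant digit first. */
static void digits_from_bytes_sketch(const uint8_t *in, size_t nbytes,
				     uint64_t *out, size_t ndigits)
{
	size_t max_bytes = ndigits * sizeof(uint64_t);
	size_t k;

	assert(in && out);

	/* Zero every digit first so digits without input bytes stay 0. */
	for (k = 0; k < ndigits; k++)
		out[k] = 0;

	/* Never read more input than the output digits can hold. */
	if (nbytes > max_bytes)
		nbytes = max_bytes;

	/* Byte k counted from the end is the k-th least significant byte. */
	for (k = 0; k < nbytes; k++)
		out[k / 8] |= (uint64_t)in[nbytes - 1 - k] << (8 * (k % 8));
}
```

With three input bytes and ndigits = 2, only the low digit receives data and the high digit remains 0, so nothing beyond the three provided bytes is read.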
Herbert Xu [Wed, 8 May 2024 08:39:51 +0000 (16:39 +0800)]
crypto: qat - Fix ADF_DEV_RESET_SYNC memory leak
Using completion_done to determine whether the caller has gone
away only works after a complete call. Furthermore it's still
possible that the caller has not yet called wait_for_completion,
resulting in another potential UAF.
Fix this by making the caller use cancel_work_sync and then freeing
the memory safely.
Fixes: 7d42e097607c ("crypto: qat - resolve race condition during AER recovery")
Cc: <stable@vger.kernel.org> #6.8+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
David S. Miller [Fri, 17 May 2024 09:17:36 +0000 (10:17 +0100)]
Merge branch 'wangxun-fixes'
Jiawen Wu says:
====================
Wangxun fixes
Fixed some bugs when using ethtool to operate network devices.
v4 -> v5:
- Simplify if...else... to fix features.
v3 -> v4:
- Require both ctag and stag to be enabled or disabled.
v2 -> v3:
- Drop the first patch.
v1 -> v2:
- Factor out the same code.
- Remove statistics printing with more than 64 queues.
- Detail the commit logs to describe issues.
- Remove reset flag check in wx_update_stats().
- Change to set VLAN CTAG and STAG to be consistent.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Fri, 17 May 2024 06:51:40 +0000 (14:51 +0800)]
net: txgbe: fix to control VLAN strip
When VLAN tag strip is changed to enable or disable, the hardware requires
the Rx ring to be in a disabled state, otherwise the feature cannot be
changed.
Fixes: f3b03c655f67 ("net: wangxun: Implement vlan add and kill functions")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Fri, 17 May 2024 06:51:39 +0000 (14:51 +0800)]
net: wangxun: match VLAN CTAG and STAG features
The hardware requires the VLAN CTAG and STAG configuration to always
match. Whenever either the CTAG or the STAG setting changes, the other
one needs to be changed as well.
Fixes: 6670f1ece2c8 ("net: txgbe: Add netdev features support")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Sai Krishna <saikrishnag@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
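One way to picture the "CTAG and STAG must match" rule is a feature-fixup helper like the one below. This is only a sketch under assumptions: the F_* bit values and fix_vlan_features() are hypothetical, and keeping the current pair when a request is mismatched is one possible policy, not necessarily what the driver does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits; the real NETIF_F_* values differ. */
#define F_VLAN_CTAG_RX (1u << 0)
#define F_VLAN_STAG_RX (1u << 1)
#define F_VLAN_STRIP   (F_VLAN_CTAG_RX | F_VLAN_STAG_RX)

/* Return the feature set to apply. If the request enables only one of the
 * two stripping bits, keep the current pair so both always match. */
static uint32_t fix_vlan_features(uint32_t current, uint32_t requested)
{
	uint32_t strip = requested & F_VLAN_STRIP;

	if (strip != 0 && strip != F_VLAN_STRIP) {
		requested &= ~F_VLAN_STRIP;
		requested |= current & F_VLAN_STRIP;
	}
	return requested;
}
```

Requests that turn both bits on or both bits off pass through unchanged, and unrelated feature bits are never touched.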
Jiawen Wu [Fri, 17 May 2024 06:51:38 +0000 (14:51 +0800)]
net: wangxun: fix to change Rx features
Fix the issue where some Rx features cannot be changed.
When using ethtool -K to turn off rx offload, it returns an error and
displays "Could not change any device features". In addition,
netdev->features is not assigned the new value needed to actually
configure the hardware.
Fixes: 6dbedcffcf54 ("net: libwx: Implement xx_set_features ops")
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
... we find that the cpu.max.burst value changed unexpectedly.
In cpu_max_write(), the burst value returned by tg_get_cfs_burst() is
in microseconds, while the burst value used for the calculation in
cpu_max_write() should be in nanoseconds, which leads to the bug.
To fix it, get the burst value directly from tg->cfs_bandwidth.burst.
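The unit mismatch can be reproduced in a few lines of userspace C. Everything here is hypothetical scaffolding (struct bandwidth_sketch, get_burst_us(), set_bandwidth() and the two rewrite_* helpers); it only illustrates how feeding a microsecond value into a nanosecond parameter shrinks the stored burst by a factor of 1000.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Hypothetical stand-in for the bandwidth state; burst is stored in ns. */
struct bandwidth_sketch {
	uint64_t burst_ns;
};

/* Getter meant for user-visible output: reports microseconds. */
static uint64_t get_burst_us(const struct bandwidth_sketch *b)
{
	return b->burst_ns / NSEC_PER_USEC;
}

/* Setter that expects nanoseconds, like the internal calculation. */
static void set_bandwidth(struct bandwidth_sketch *b, uint64_t burst_ns)
{
	b->burst_ns = burst_ns;
}

/* Buggy pattern: a microsecond value is passed as nanoseconds. */
static void rewrite_buggy(struct bandwidth_sketch *b)
{
	set_bandwidth(b, get_burst_us(b));
}

/* Fixed pattern: the stored nanosecond value is read directly. */
static void rewrite_fixed(struct bandwidth_sketch *b)
{
	set_bandwidth(b, b->burst_ns);
}
```

Starting from a burst of 5000000 ns, rewrite_fixed() leaves the value alone, whereas rewrite_buggy() stores 5000.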
I'm fine with either and that was my first thought here, too, but it did seem like
the comment was mostly placed there to justify the 'unexpected' high utilization
when explicitly passing FREQUENCY_UTIL and the need to clamp it then.
So removing did feel slightly more natural to me anyway.
So alternatively:
From: Christian Loehle <christian.loehle@arm.com>
Date: Tue, 5 Mar 2024 09:34:41 +0000
Subject: [PATCH] sched/fair: Remove stale FREQUENCY_UTIL mention
The effective_cpu_util() flags were removed, so remove the mention of
the flag.
Dawei Li [Fri, 15 Mar 2024 01:59:16 +0000 (18:59 -0700)]
sched/fair: Fix initial util_avg calculation
Change se->load.weight to se_weight(se) in the calculation for the
initial util_avg to avoid unnecessarily inflating the util_avg by 1024
times.
The reason is that se->load.weight has the unit/scale as the scaled-up
load, while cfs_rq->avg.load_avg has the unit/scale as the true task
weight (as mapped directly from the task's nice/priority value). With
CONFIG_32BIT, the scaled-up load is equal to the true task weight. With
CONFIG_64BIT, the scaled-up load is 1024 times the true task weight.
Thus, the current code may inflate the util_avg by 1024 times. The
follow-up capping will not allow the util_avg value to go wild. But the
calculation should have the correct logic.
Signed-off-by: Dawei Li <daweilics@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Vishal Chourasia <vishalc@linux.ibm.com>
Link: https://lore.kernel.org/r/20240315015916.21545-1-daweilics@gmail.com
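The factor of 1024 can be checked with a small worked calculation. The sketch below assumes the initial value has the shape util = rq_util_avg * weight / (rq_load_avg + 1), capped afterwards; initial_util_avg() and SCALE_SHIFT are hypothetical names for illustration and the numbers are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed extra load resolution on 64-bit: scaled weight = weight << 10. */
#define SCALE_SHIFT 10

/* Assumed shape of the initial util_avg calculation, with the cap. */
static uint64_t initial_util_avg(uint64_t rq_util_avg, uint64_t rq_load_avg,
				 uint64_t weight, uint64_t cap)
{
	uint64_t util = rq_util_avg * weight / (rq_load_avg + 1);

	return util > cap ? cap : util;
}
```

For rq_util_avg = 200, rq_load_avg = 2047 and a true task weight of 1024, the result is 100. Passing the scaled-up weight (1024 << SCALE_SHIFT) instead gives 102400 before capping, which is exactly 1024 times larger; the cap then hides the error, as the commit message notes.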
where the replacement length is 0. And using repl_len is wrong because
apply_alternatives() expands the buffer size to the length of the source
insn that is being patched, by padding it with one-byte NOPs:
	for (; insn_buff_sz < a->instrlen; insn_buff_sz++)
		insn_buff[insn_buff_sz] = 0x90;
Long story short: pass the length of the original instruction(s) as the
length of the temporary buffer to optimize.
x86/boot: Address clang -Wimplicit-fallthrough in vsprintf()
After enabling -Wimplicit-fallthrough for the x86 boot code, clang
warns:
arch/x86/boot/printf.c:257:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
257 | case 'u':
| ^
Clang is a little more pedantic than GCC, which does not warn when
falling through to a case that is just break or return. Clang's version
is more in line with the kernel's own stance in deprecated.rst, which
states that all switch/case blocks must end in either break,
fallthrough, continue, goto, or return. Add the missing break to silence
the warning.
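A reduced example of the pattern being fixed is shown below. It is a sketch, not the code from arch/x86/boot/printf.c: struct fmt_sketch and parse_conv() are hypothetical. The point is the explicit break at the end of the 'd'/'i' block, which precedes a case that consists only of break.

```c
#include <assert.h>

/* Hypothetical result of parsing one conversion character. */
struct fmt_sketch {
	int base;
	int is_signed;
};

static struct fmt_sketch parse_conv(char c)
{
	struct fmt_sketch f = { .base = 10, .is_signed = 0 };

	switch (c) {
	case 'x':
		f.base = 16;
		break;
	case 'd':
	case 'i':
		f.is_signed = 1;
		break;	/* explicit break instead of falling into case 'u' */
	case 'u':
		break;
	default:
		f.base = 0;	/* unknown conversion in this sketch */
		break;
	}
	return f;
}
```

Removing the break after f.is_signed = 1 would not change behaviour here, since case 'u' only breaks, but it is the construct clang flags with -Wimplicit-fallthrough.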
Eric Dumazet [Wed, 15 May 2024 16:33:58 +0000 (16:33 +0000)]
af_packet: do not call packet_read_pending() from tpacket_destruct_skb()
trafgen performance considerably sank on hosts with many cores
after the blamed commit.
packet_read_pending() is very expensive, and calling it
in the af_packet fast path defeats Daniel's intent in commit b013840810c2 ("packet: use percpu mmap tx frame pending refcount").
tpacket_destruct_skb() makes room for one packet, so we can immediately
wake up a producer; there is no need to completely drain the tx ring.
Fixes: 89ed5b519004 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240515163358.4105915-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michal Schmidt [Wed, 15 May 2024 09:24:14 +0000 (11:24 +0200)]
idpf: don't skip over ethtool tcp-data-split setting
Disabling tcp-data-split on idpf silently fails:
# ethtool -G $NETDEV tcp-data-split off
# ethtool -g $NETDEV | grep 'TCP data split'
TCP data split: on
But it works if you also change 'tx' or 'rx':
# ethtool -G $NETDEV tcp-data-split off tx 256
# ethtool -g $NETDEV | grep 'TCP data split'
TCP data split: off
The bug is in idpf_set_ringparam, where it takes a shortcut out if the
TX and RX sizes are not changing. Fix it by also checking whether the
tcp-data-split setting remains unchanged. Only then can the soft reset
be skipped.
Fixes: 9b1aa3ef2328 ("idpf: add get/set for Ethtool's header split ringparam")
Reported-by: Xu Du <xudu@redhat.com>
Closes: https://issues.redhat.com/browse/RHEL-36182
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20240515092414.158079-1-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
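The corrected shortcut condition can be sketched as a single predicate. struct ring_cfg_sketch and can_skip_reset() are hypothetical names; the actual driver code is organised differently, and this only shows that all three settings have to be unchanged before the reset is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the ring parameters ethtool -G can change. */
struct ring_cfg_sketch {
	uint32_t tx_desc;
	uint32_t rx_desc;
	bool tcp_data_split;
};

/* The soft reset may be skipped only if nothing at all changes. */
static bool can_skip_reset(const struct ring_cfg_sketch *cur,
			   const struct ring_cfg_sketch *req)
{
	return cur->tx_desc == req->tx_desc &&
	       cur->rx_desc == req->rx_desc &&
	       cur->tcp_data_split == req->tcp_data_split;
}
```

A request that only flips tcp_data_split now fails the predicate, so the reset path runs and the setting takes effect.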
Tony Battersby [Tue, 14 May 2024 19:57:29 +0000 (15:57 -0400)]
bonding: fix oops during rmmod
"rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
remove redundant NULL check in debugfs function"). Here are the relevant
functions being called:
bonding_exit()
  bond_destroy_debugfs()
    debugfs_remove_recursive(bonding_debug_root);
    bonding_debug_root = NULL; <--------- SET TO NULL HERE
  bond_netlink_fini()
    rtnl_link_unregister()
      __rtnl_link_unregister()
        unregister_netdevice_many_notify()
          bond_uninit()
            bond_debug_unregister()
              (commit removed check for bonding_debug_root == NULL)
              debugfs_remove()
                simple_recursive_removal()
                  down_write() -> OOPS
However, reverting the bad commit does not solve the problem completely
because the original code contains a race that could cause the same
oops, although it was much less likely to be triggered unintentionally:
CPU2
  echo -bond0 > /sys/class/net/bonding_masters
    bond_uninit()
      bond_debug_unregister()
        if (!bonding_debug_root)
CPU1
  bonding_debug_root = NULL;
So do NOT revert the bad commit (since the removed checks were racy
anyway), and instead change the order of actions taken during module
removal. The same oops can also happen if there is an error during
module init, so apply the same fix there.
Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/641f914f-3216-4eeb-87dd-91b78aa97773@cybernetics.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
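The effect of the teardown order can be modelled in userspace. This is a toy model under assumptions: debug_root stands in for bonding_debug_root, bad_access counts the accesses that would oops in the kernel, and all function names are hypothetical. It is assumed here that the fix amounts to unregistering the devices before the debugfs root is destroyed.

```c
#include <assert.h>
#include <stddef.h>

static int root_storage;
static int *debug_root;	/* stands in for bonding_debug_root */
static int bad_access;	/* uses of the root after it was destroyed */

static void module_init_sketch(void)
{
	debug_root = &root_storage;
	bad_access = 0;
}

static void destroy_debug_root(void)
{
	debug_root = NULL;
}

/* Per-device cleanup that assumes the root still exists. */
static void unregister_devices(void)
{
	if (!debug_root)
		bad_access++;	/* the kernel would oops at this point */
}

/* Old order: the root is gone before the devices are unregistered. */
static int exit_old_order(void)
{
	module_init_sketch();
	destroy_debug_root();
	unregister_devices();
	return bad_access;
}

/* New order: devices are unregistered while the root is still valid. */
static int exit_new_order(void)
{
	module_init_sketch();
	unregister_devices();
	destroy_debug_root();
	return bad_access;
}
```

In this model exit_old_order() records one bad access and exit_new_order() records none, without needing any NULL check in the per-device path.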
xu xin [Tue, 14 May 2024 12:11:02 +0000 (20:11 +0800)]
net/ipv6: Fix route deleting failure when metric equals 0
Problem
=========
After commit 67f695134703 ("ipv6: Move setting default metric for routes"),
we noticed that the logic for assigning the default value of fc_metric
changed in the ioctl path. If a user adds a route with a non-zero metric
via ioctl(fd, SIOCADDRT, rt), deleting that route via ioctl(fd,
SIOCDELRT, rt) with a metric value of 0 may fail, although iproute can
still delete it.
For reference, when the iproute tools delete a route over netlink with a
metric of 0, as in the following command:
ip -6 route del fe80::/64 via fe81::5054:ff:fe11:3451 dev eth0 metric 0
the user still succeeds in deleting the route entry with the smallest
metric.
Root Reason
===========
After commit 67f695134703 ("ipv6: Move setting default metric for routes"),
when ioctl() passes in SIOCDELRT with a zero metric, rtmsg_to_fib6_config()
sets cfg->fc_metric to a default value (1024) in the kernel. Then
ip6_route_del(), at line 4074 of net/ipv6/route.c, checks
	if (cfg->fc_metric && cfg->fc_metric != rt->fib6_metric)
		continue;
The condition is true because cfg->fc_metric != rt->fib6_metric, so the
rest of the loop body (deleting the route) is skipped. Before that commit,
cfg->fc_metric was still zero at this point, so the condition was false
and the route was deleted.
Solution
========
To keep the behaviour consistent between netlink and ioctl(), deleting
a route with a metric value of 0 should be allowed. So only apply the
default fc_metric when adding a route.
CC: stable@vger.kernel.org # 5.4+
Fixes: 67f695134703 ("ipv6: Move setting default metric for routes")
Co-developed-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: Fan Yu <fan.yu9@zte.com.cn>
Signed-off-by: xu xin <xu.xin16@zte.com.cn>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240514201102055dD2Ba45qKbLlUMxu_DTHP@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
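The interaction between the default metric and the delete-side check can be modelled with two small functions. metric_for_add() and delete_matches() are hypothetical helpers; only the condition inside delete_matches() mirrors the check quoted in the commit message, and 1024 is the default value named there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_METRIC 1024u	/* default value named in the commit message */

/* Add path: a zero metric is replaced by the default. */
static uint32_t metric_for_add(uint32_t requested)
{
	return requested ? requested : DEFAULT_METRIC;
}

/* Delete path: a zero metric acts as a wildcard, anything else must match. */
static bool delete_matches(uint32_t cfg_metric, uint32_t route_metric)
{
	if (cfg_metric && cfg_metric != route_metric)
		return false;
	return true;
}
```

For a route added with metric 100, delete_matches(0, 100) succeeds. If the default were also applied on the delete path, the call would become delete_matches(1024, 100) and fail, which is the behaviour the commit removes.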
Hangbin Liu [Tue, 14 May 2024 09:52:27 +0000 (17:52 +0800)]
selftests/net: reduce xfrm_policy test time
The check_random_order test adds and gets plenty of xfrm rules, which
consumes a lot of time on debug kernels and always results in a TIMEOUT.
Let's reduce the test loop and see if it works.
Stephen Boyd [Fri, 17 May 2024 01:09:08 +0000 (18:09 -0700)]
Merge branches 'clk-counted', 'clk-imx', 'clk-amlogic', 'clk-binding' and 'clk-rockchip' into clk-next
* clk-counted:
clk: bcm: rpi: Assign ->num before accessing ->hws
clk: bcm: dvp: Assign ->num before accessing ->hws
* clk-imx:
clk: imx: imx8mp: Convert to platform remove callback returning void
clk: imx: imx8mp: Switch to RUNTIME_PM_OPS()
clk: imx: add i.MX95 BLK CTL clk driver
dt-bindings: clock: support i.MX95 Display Master CSR module
dt-bindings: clock: support i.MX95 BLK CTL module
dt-bindings: clock: add i.MX95 clock header
clk: imx: imx8mp: Add pm_runtime support for power saving
* clk-amlogic:
clk: meson: s4: fix module autoloading
clk: meson: fix module license to GPL only
clk: meson: g12a: make VCLK2 and ENCL clock path configurable by CCF
clk: meson: add vclk driver
clk: meson: pll: print out pll name when unable to lock it
clk: meson: s4: pll: determine maximum register in regmap config
clk: meson: s4: peripherals: determine maximum register in regmap config
clk: meson: a1: pll: determine maximum register in regmap config
clk: meson: a1: peripherals: determine maximum register in regmap config
* clk-binding:
dt-bindings: clock: fixed: Define a preferred node name
* clk-rockchip:
clk: rockchip: rk3568: Add PLL rate for 724 MHz
clk: rockchip: Remove an unused field in struct rockchip_mmc_clock
clk: rockchip: rk3588: Add reset line for HDMI Receiver
clk: rockchip: rk3568: Add missing USB480M_PHY mux
dt-bindings: reset: Define reset id used for HDMI Receiver
dt-bindings: clock: rockchip: add USB480M_PHY mux
* clk-stm:
dt-bindings: clocks: stm32mp25: add access-controllers description
clk: stm32: introduce clocks for STM32MP257 platform
dt-bindings: clocks: stm32mp25: add description of all parents
clk: stm32mp13: use platform device APIs
* clk-renesas:
clk: renesas: r9a08g045: Add support for power domains
clk: renesas: rzg2l: Extend power domain support
dt-bindings: clock: renesas,rzg2l-cpg: Update #power-domain-cells = <1> for RZ/G3S
dt-bindings: clock: r9a08g045-cpg: Add power domain IDs
dt-bindings: clock: r9a07g054-cpg: Add power domain IDs
dt-bindings: clock: r9a07g044-cpg: Add power domain IDs
dt-bindings: clock: r9a07g043-cpg: Add power domain IDs
clk: renesas: shmobile: Remove unused CLK_ENABLE_ON_INIT
clk: renesas: r8a7740: Remove unused div4_clk.flags field
clk: renesas: r9a07g043: Add clock and reset entry for PLIC
clk: renesas: r8a779h0: Add INTC-EX clock
clk: renesas: r8a779h0: Add MSIOF clocks
clk: renesas: r8a779a0: Fix CANFD parent clock
clk: rs9: fix wrong default value for clock amplitude
clk: renesas: r8a779h0: Add timer clocks
clk: renesas: r8a779h0: Add SCIF clocks
clk: renesas: r9a07g044: Mark resets array as const
clk: renesas: r9a07g043: Mark mod_clks and resets arrays as const
clk: renesas: r8a779h0: Add thermal clock
dt-bindings: clock: r9a07g043-cpg: Annotate RZ/G2UL-only core clocks
* clk-scmi:
clk: scmi: Add support for get/set duty_cycle operations
clk: scmi: Add support for re-parenting restricted clocks
clk: scmi: Add support for rate change restricted clocks
clk: scmi: Add support for state control restricted clocks
clk: scmi: Allocate CLK operations dynamically
* clk-allwinner:
clk: sunxi-ng: fix module autoloading
clk: sunxi-ng: a64: Add constraints on PLL-MIPI's n/m ratio and parent rate
clk: sunxi-ng: nkm: Support constraints on m/n ratio and parent rate
* clk-cleanup:
clk: gemini: Remove an unused field in struct clk_gemini_pci
clk: highbank: Remove an unused field in struct hb_clk
clk: ti: dpll: fix incorrect #ifdef checks
clk: nxp: Remove an unused field in struct lpc18xx_pll
* clk-airoha:
clk: en7523: Add EN7581 support
clk: en7523: Add en_clk_soc_data data structure
dt-bindings: clock: airoha: add EN7581 binding
Jakub Kicinski [Fri, 17 May 2024 00:48:04 +0000 (17:48 -0700)]
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-05-17
We've added 7 non-merge commits during the last 2 day(s) which contain
a total of 8 files changed, 20 insertions(+), 9 deletions(-).
The main changes are:
1) Fix KASAN slab-out-of-bounds in percpu_array_map_gen_lookup and add
BPF selftests to cover this case, from Andrii Nakryiko.
(Report https://lore.kernel.org/bpf/20240514231155.1004295-1-kuba@kernel.org/)
2) Fix two BPF selftests to adjust for kernel changes after fast-forwarding
Linus' tree to make BPF CI all green again, from Martin KaFai Lau.
3) Fix libbpf feature detectors when using token_fd by adjusting the
attribute size for memset to cover the former, also from Andrii Nakryiko.
4) Fix the description of 'src' in ALU instructions for the BPF ISA
standardization doc, from Puranjay Mohan.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Adjust btf_dump test to reflect recent change in file_operations
selftests/bpf: Adjust test_access_variable_array after a kernel function name change
selftests/bpf: add more variations of map-in-map situations
bpf: save extended inner map info for percpu array maps as well
MAINTAINERS: Update ARM64 BPF JIT maintainer
bpf, docs: Fix the description of 'src' in ALU instructions
libbpf: fix feature detectors when using token_fd
====================
Martin KaFai Lau [Thu, 16 May 2024 17:01:40 +0000 (10:01 -0700)]
selftests/bpf: Adjust test_access_variable_array after a kernel function name change
After commit 4c3e509ea9f2 ("sched/balancing: Rename load_balance() => sched_balance_rq()"),
the load_balance kernel function is renamed to sched_balance_rq.
This patch adjusts the fentry program in test_access_variable_array.c
to reflect this kernel function name change.