]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
3 years agoMerge tag 'io_uring-5.14-2021-07-24' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 24 Jul 2021 20:03:40 +0000 (13:03 -0700)]
Merge tag 'io_uring-5.14-2021-07-24' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:

 - Fix a memory leak due to a race condition in io_init_wq_offload
   (Yang)

 - Poll error handling fixes (Pavel)

 - Fix early fdput() regression (me)

 - Don't reissue iopoll requests off release path (me)

 - Add a safety check for io-wq queue off wrong path (me)

* tag 'io_uring-5.14-2021-07-24' of git://git.kernel.dk/linux-block:
  io_uring: explicitly catch any illegal async queue attempt
  io_uring: never attempt iopoll reissue from release path
  io_uring: fix early fdput() of file
  io_uring: fix memleak in io_init_wq_offload()
  io_uring: remove double poll entry on arm failure
  io_uring: explicitly count entries for poll reqs

3 years agoMerge tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 24 Jul 2021 19:57:06 +0000 (12:57 -0700)]
Merge tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request (Christoph):
    - tracing fix (Keith Busch)
    - fix multipath head refcounting (Hannes Reinecke)
    - Write Zeroes vs PI fix (me)
    - drop a bogus WARN_ON (Zhihao Cheng)

 - Increase max blk-cgroup policy size, now that mq-deadline
   uses it too (Oleksandr)

* tag 'block-5.14-2021-07-24' of git://git.kernel.dk/linux-block:
  nvme: set the PRACT bit when using Write Zeroes with T10 PI
  nvme: fix nvme_setup_command metadata trace event
  nvme: fix refcounting imbalance when all paths are down
  nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING
  block: increase BLKCG_MAX_POLS

3 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 24 Jul 2021 19:55:06 +0000 (12:55 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Two bugfixes for the I2C subsystem"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mpc: Poll for MCF
  misc: eeprom: at24: Always append device id even if label property is set.

3 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 24 Jul 2021 19:27:16 +0000 (12:27 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc mm fixes from Andrew Morton:
 "15 patches.

  VM subsystems affected by this patch series: userfaultfd, kfence,
  highmem, pagealloc, memblock, pagecache, secretmem, pagemap, and
  hugetlbfs"

* akpm:
  hugetlbfs: fix mount mode command line processing
  mm: fix the deadlock in finish_fault()
  mm: mmap_lock: fix disabling preemption directly
  mm/secretmem: wire up ->set_page_dirty
  writeback, cgroup: do not reparent dax inodes
  writeback, cgroup: remove wb from offline list before releasing refcnt
  memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions
  mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
  mm: use kmap_local_page in memzero_page
  mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
  kfence: skip all GFP_ZONEMASK allocations
  kfence: move the size check to the beginning of __kfence_alloc()
  kfence: defer kfence_test_init to ensure that kunit debugfs is created
  selftest: use mmap instead of posix_memalign to allocate memory
  userfaultfd: do not untag user pointers

3 years agohugetlbfs: fix mount mode command line processing
Mike Kravetz [Fri, 23 Jul 2021 22:50:44 +0000 (15:50 -0700)]
hugetlbfs: fix mount mode command line processing

In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing
of the mount mode string was changed from match_octal() to fsparam_u32.

This changed existing behavior as match_octal does not require octal
values to have a '0' prefix, but fsparam_u32 does.

Use fsparam_u32oct which provides the same behavior as match_octal.

Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Dennis Camera <bugs+kernel.org@dtnr.ch>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: fix the deadlock in finish_fault()
Qi Zheng [Fri, 23 Jul 2021 22:50:41 +0000 (15:50 -0700)]
mm: fix the deadlock in finish_fault()

Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback")
fix the following ABBA deadlock by pre-allocating the pte page table
without holding the page lock.

                                lock_page(A)
                                        SetPageWriteback(A)
                                        unlock_page(A)
  lock_page(B)
                                        lock_page(B)
  pte_alloc_one
    shrink_page_list
      wait_on_page_writeback(A)
                                        SetPageWriteback(B)
                                        unlock_page(B)

                                        # flush A, B to clear the writeback

Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault()
codepaths") reworked the relevant code but ignored this race.  This will
cause the deadlock above to appear again, so fix it.

Link: https://lkml.kernel.org/r/20210721074849.57004-1-zhengqi.arch@bytedance.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: mmap_lock: fix disabling preemption directly
Muchun Song [Fri, 23 Jul 2021 22:50:38 +0000 (15:50 -0700)]
mm: mmap_lock: fix disabling preemption directly

Commit 832b50725373 ("mm: mmap_lock: use local locks instead of
disabling preemption") fixed a bug by using local locks.

But commit d01079f3d0c0 ("mm/mmap_lock: remove dead code for
!CONFIG_TRACING configurations") changed those lines back to the
original version.

I guess it was introduced by fixing conflicts.

Link: https://lkml.kernel.org/r/20210720074228.76342-1-songmuchun@bytedance.com
Fixes: d01079f3d0c0 ("mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/secretmem: wire up ->set_page_dirty
Mike Rapoport [Fri, 23 Jul 2021 22:50:35 +0000 (15:50 -0700)]
mm/secretmem: wire up ->set_page_dirty

Make secretmem up to date with the changes done in commit 0af573780b0b
("mm: require ->set_page_dirty to be explicitly wired up") so that
unconditional call to this method won't cause crashes.

Link: https://lkml.kernel.org/r/20210716063933.31633-1-rppt@kernel.org
Fixes: 0af573780b0b ("mm: require ->set_page_dirty to be explicitly wired up")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agowriteback, cgroup: do not reparent dax inodes
Roman Gushchin [Fri, 23 Jul 2021 22:50:32 +0000 (15:50 -0700)]
writeback, cgroup: do not reparent dax inodes

The inode switching code is not suited for dax inodes.  An attempt to
switch a dax inode to a parent writeback structure (as a part of a
writeback cleanup procedure) results in a panic like this:

  run fstests generic/270 at 2021-07-15 05:54:02
  XFS (pmem0p2): EXPERIMENTAL big timestamp feature in use.  Use at your own risk!
  XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk
  XFS (pmem0p2): EXPERIMENTAL inode btree counters feature in use. Use at your own risk!
  XFS (pmem0p2): Mounting V5 Filesystem
  XFS (pmem0p2): Ending clean mount
  XFS (pmem0p2): Quotacheck needed: Please wait.
  XFS (pmem0p2): Quotacheck: Done.
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks)
  BUG: unable to handle page fault for address: 0000000005b0f669
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted 5.14.0-rc1-master-8096acd7442e+ #8
  Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016
  Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Call Trace:
   inode_switch_wbs_work_fn+0xb6/0x2a0
   process_one_work+0x1e6/0x380
   worker_thread+0x53/0x3d0
   kthread+0x10f/0x130
   ret_from_fork+0x22/0x30
  Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3 ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod
  CR2: 0000000005b0f669
  ---[ end trace ed2105faff8384f3 ]---
  RIP: 0010:inode_do_switch_wbs+0xaf/0x470
  Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85
  RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002
  RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0
  RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff
  RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228
  R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130
  R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0
  FS:  0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0
  Kernel panic - not syncing: Fatal exception
  Kernel Offset: 0x15200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
  ---[ end Kernel panic - not syncing: Fatal exception ]---

The crash happens on an attempt to iterate over attached pagecache pages
and check the dirty flag: a dax inode's xarray contains pfn's instead of
generic struct page pointers.

This happens for DAX and not for other kinds of non-page entries in the
inodes because it's a tagged iteration, and shadow/swap entries are
never tagged; only DAX entries get tagged.

Fix the problem by bailing out (with the false return value) of
inode_prepare_sbs_switch() if a dax inode is passed.

[willy@infradead.org: changelog addition]

Link: https://lkml.kernel.org/r/20210719171350.3876830-1-guro@fb.com
Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Murphy Zhou <jencce.kernel@gmail.com>
Reported-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Tested-by: Murphy Zhou <jencce.kernel@gmail.com>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agowriteback, cgroup: remove wb from offline list before releasing refcnt
Roman Gushchin [Fri, 23 Jul 2021 22:50:29 +0000 (15:50 -0700)]
writeback, cgroup: remove wb from offline list before releasing refcnt

Boyang reported that the commit c22d70a162d3 ("writeback, cgroup:
release dying cgwbs by switching attached inodes") causes the kernel to
crash while running xfstests generic/256 on ext4 on aarch64 and ppc64le.

  run fstests generic/256 at 2021-07-12 05:41:40
  EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: . Quota mode: none.
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  Mem abort info:
     ESR = 0x96000005
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
     FSC = 0x05: level 1 translation fault
  Data abort info:
     ISV = 0, ISS = 0x00000005
     CM = 0, WnR = 0
  user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000
  [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
  Internal error: Oops: 96000005 [#1] SMP
  Modules linked in: dm_flakey dm_snapshot dm_bufio dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_blk virtio_net net_failover virtio_console failover virtio_mmio aes_neon_bs [last unloaded: scsi_debug]
  CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G X --------- ---  5.14.0-0.rc1.15.bx.el9.aarch64 #1
  Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
  Workqueue: events_unbound cleanup_offline_cgwbs_workfn
  pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--)
  pc : cleanup_offline_cgwbs_workfn+0x320/0x394
  lr : cleanup_offline_cgwbs_workfn+0xe0/0x394
  sp : ffff80001554fd10
  x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001
  x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8
  x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730
  x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
  x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040
  x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60
  x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a
  x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000
  x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003
  Call trace:
     cleanup_offline_cgwbs_workfn+0x320/0x394
     process_one_work+0x1f4/0x4b0
     worker_thread+0x184/0x540
     kthread+0x114/0x120
     ret_from_fork+0x10/0x18
  Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061)
  ---[ end trace e250fe289272792a ]---
  Kernel panic - not syncing: Oops: Fatal exception
  SMP: stopping secondary CPUs
  SMP: failed to stop secondary CPUs 0-2
  Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000
  PHYS_OFFSET: 0xfff0defca0000000
  CPU features: 0x00200251,23200840
  Memory Limit: none
  ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

The problem happens when cgwb_release_workfn() races with
cleanup_offline_cgwbs_workfn(): wb_tryget() in
cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit() is
cgwb_release_workfn(), which is basically a use-after-free error.

Fix the problem by making removing the writeback structure from the
offline list before releasing the percpu reference counter.  It will
guarantee that cleanup_offline_cgwbs_workfn() will not see and not
access writeback structures which are about to be released.

Link: https://lkml.kernel.org/r/20210716201039.3762203-1-guro@fb.com
Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reported-by: Boyang Xue <bxue@redhat.com>
Suggested-by: Jan Kara <jack@suse.cz>
Tested-by: Darrick J. Wong <djwong@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Murphy Zhou <jencce.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomemblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions
Mike Rapoport [Fri, 23 Jul 2021 22:50:26 +0000 (15:50 -0700)]
memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions

Commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with
for_each_mem_range()") didn't take into account that when there is
movable_node parameter in the kernel command line, for_each_mem_range()
would skip ranges marked with MEMBLOCK_HOTPLUG.

The page table setup code in POWER uses for_each_mem_range() to create
the linear mapping of the physical memory and since the regions marked
as MEMORY_HOTPLUG are skipped, they never make it to the linear map.

A later access to the memory in those ranges will fail:

  BUG: Unable to handle kernel data access on write at 0xc000000400000000
  Faulting instruction address: 0xc00000000008a3c0
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in:
  CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7
  NIP:  c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040
  REGS: c000000008a57770 TRAP: 0300   Not tainted  (5.13.0)
  MSR:  8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 84222202  XER: 20040000
  CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0
  GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000
  GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200
  GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300
  GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000
  GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0
  GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680
  NIP clear_user_page+0x50/0x80
  LR __handle_mm_fault+0xc88/0x1910
  Call Trace:
    __handle_mm_fault+0xc44/0x1910 (unreliable)
    handle_mm_fault+0x130/0x2a0
    __get_user_pages+0x248/0x610
    __get_user_pages_remote+0x12c/0x3e0
    get_arg_page+0x54/0xf0
    copy_string_kernel+0x11c/0x210
    kernel_execve+0x16c/0x220
    call_usermodehelper_exec_async+0x1b0/0x2f0
    ret_from_kernel_thread+0x5c/0x70
  Instruction dump:
  79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050
  7d4903a6 60000000 60000000 60000000 <7c001fec7c091fec 7c081fec 7c051fec
  ---[ end trace 490b8c67e6075e09 ]---

Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the
traversal fixes this issue.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100
Link: https://lkml.kernel.org/r/20210712071132.20902-1-rppt@kernel.org
Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()")
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org> [5.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
Sergei Trofimovich [Fri, 23 Jul 2021 22:50:23 +0000 (15:50 -0700)]
mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction

To reproduce the failure we need the following system:

 - kernel command: page_poison=1 init_on_free=0 init_on_alloc=0

 - kernel config:
    * CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y
    * CONFIG_INIT_ON_FREE_DEFAULT_ON=y
    * CONFIG_PAGE_POISONING=y

Resulting in:

    0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G     U     O      5.13.1-gentoo-x86_64 #1
    Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021
    Call Trace:
     dump_stack+0x64/0x7c
     __kernel_unpoison_pages.cold+0x48/0x84
     post_alloc_hook+0x60/0xa0
     get_page_from_freelist+0xdb8/0x1000
     __alloc_pages+0x163/0x2b0
     __get_free_pages+0xc/0x30
     pgd_alloc+0x2e/0x1a0
     mm_init+0x185/0x270
     dup_mm+0x6b/0x4f0
     copy_process+0x190d/0x1b10
     kernel_clone+0xba/0x3b0
     __do_sys_clone+0x8f/0xb0
     do_syscall_64+0x68/0x80
     entry_SYSCALL_64_after_hwframe+0x44/0xae

Before commit 51cba1ebc60d ("init_on_alloc: Optimize static branches")
init_on_alloc never enabled static branch by default.  It could only be
enabed explicitly by init_mem_debugging_and_hardening().

But after commit 51cba1ebc60d, a static branch could already be enabled
by default.  There was no code to ever disable it.  That caused
page_poison=1 / init_on_free=1 conflict.

This change extends init_mem_debugging_and_hardening() to also disable
static branch disabling.

Link: https://lkml.kernel.org/r/20210714031935.4094114-1-keescook@chromium.org
Link: https://lore.kernel.org/r/20210712215816.1512739-1-slyfox@gentoo.org
Fixes: 51cba1ebc60d ("init_on_alloc: Optimize static branches")
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Kees Cook <keescook@chromium.org>
Reported-by: Mikhail Morfikov <mmorfikov@gmail.com>
Reported-by: <bowsingbetee@pm.me>
Tested-by: <bowsingbetee@protonmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: use kmap_local_page in memzero_page
Christoph Hellwig [Fri, 23 Jul 2021 22:50:20 +0000 (15:50 -0700)]
mm: use kmap_local_page in memzero_page

The commit message introducing the global memzero_page explicitly
mentions switching to kmap_local_page in the commit log but doesn't
actually do that.

Link: https://lkml.kernel.org/r/20210713055231.137602-3-hch@lst.de
Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
Christoph Hellwig [Fri, 23 Jul 2021 22:50:17 +0000 (15:50 -0700)]
mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()

memcpy_to_page and memzero_page can write to arbitrary pages, which
could be in the page cache or in high memory, so call
flush_kernel_dcache_pages to flush the dcache.

This is a problem when using these helpers on dcache challeneged
architectures.  Right now there are just a few users, chances are no one
used the PC floppy driver, the aha1542 driver for an ISA SCSI HBA, and a
few advanced and optional btrfs and ext4 features on those platforms yet
since the conversion.

Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de
Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core")
Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokfence: skip all GFP_ZONEMASK allocations
Alexander Potapenko [Fri, 23 Jul 2021 22:50:14 +0000 (15:50 -0700)]
kfence: skip all GFP_ZONEMASK allocations

Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot
be fulfilled by KFENCE, because KFENCE memory pool is located in a zone
different from the requested one.

Because callers of kmem_cache_alloc() may actually rely on the
allocation to reside in the requested zone (e.g.  memory allocations
done with __GFP_DMA must be DMAable), skip all allocations done with
GFP_ZONEMASK and/or respective SLAB flags (SLAB_CACHE_DMA and
SLAB_CACHE_DMA32).

Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com
Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: <stable@vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokfence: move the size check to the beginning of __kfence_alloc()
Alexander Potapenko [Fri, 23 Jul 2021 22:50:11 +0000 (15:50 -0700)]
kfence: move the size check to the beginning of __kfence_alloc()

Check the allocation size before toggling kfence_allocation_gate.

This way allocations that can't be served by KFENCE will not result in
waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating
anything.

Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> [5.12+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agokfence: defer kfence_test_init to ensure that kunit debugfs is created
Weizhao Ouyang [Fri, 23 Jul 2021 22:50:08 +0000 (15:50 -0700)]
kfence: defer kfence_test_init to ensure that kunit debugfs is created

kfence_test_init and kunit_init both use the same level late_initcall,
which means if kfence_test_init linked ahead of kunit_init,
kfence_test_init will get a NULL debugfs_rootdir as parent dentry, then
kfence_test_init and kfence_debugfs_init both create a debugfs node
named "kfence" under debugfs_mount->mnt_root, and it will throw out
"debugfs: Directory 'kfence' with parent '/' already present!" with
EEXIST.  So kfence_test_init should be deferred.

Link: https://lkml.kernel.org/r/20210714113140.2949995-1-o451686892@gmail.com
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoselftest: use mmap instead of posix_memalign to allocate memory
Peter Collingbourne [Fri, 23 Jul 2021 22:50:04 +0000 (15:50 -0700)]
selftest: use mmap instead of posix_memalign to allocate memory

This test passes pointers obtained from anon_allocate_area to the
userfaultfd and mremap APIs.  This causes a problem if the system
allocator returns tagged pointers because with the tagged address ABI
the kernel rejects tagged addresses passed to these APIs, which would
end up causing the test to fail.  To make this test compatible with such
system allocators, stop using the system allocator to allocate memory in
anon_allocate_area, and instead just use mmap.

Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com
Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b8fc241
Fixes: c47174fc362a ("userfaultfd: selftest")
Co-developed-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Alistair Delva <adelva@google.com>
Cc: William McVicker <willmcvicker@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: <stable@vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agouserfaultfd: do not untag user pointers
Peter Collingbourne [Fri, 23 Jul 2021 22:50:01 +0000 (15:50 -0700)]
userfaultfd: do not untag user pointers

Patch series "userfaultfd: do not untag user pointers", v5.

If a user program uses userfaultfd on ranges of heap memory, it may end
up passing a tagged pointer to the kernel in the range.start field of
the UFFDIO_REGISTER ioctl.  This can happen when using an MTE-capable
allocator, or on Android if using the Tagged Pointers feature for MTE
readiness [1].

When a fault subsequently occurs, the tag is stripped from the fault
address returned to the application in the fault.address field of struct
uffd_msg.  However, from the application's perspective, the tagged
address *is* the memory address, so if the application is unaware of
memory tags, it may get confused by receiving an address that is, from
its point of view, outside of the bounds of the allocation.  We observed
this behavior in the kselftest for userfaultfd [2] but other
applications could have the same problem.

Address this by not untagging pointers passed to the userfaultfd ioctls.
Instead, let the system call fail.  Also change the kselftest to use
mmap so that it doesn't encounter this problem.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

This patch (of 2):

Do not untag pointers passed to the userfaultfd ioctls.  Instead, let
the system call fail.  This will provide an early indication of problems
with tag-unaware userspace code instead of letting the code get confused
later, and is consistent with how we decided to handle brk/mmap/mremap
in commit dcde237319e6 ("mm: Avoid creating virtual address aliases in
brk()/mmap()/mremap()"), as well as being consistent with the existing
tagged address ABI documentation relating to how ioctl arguments are
handled.

The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag
user pointers") plus some fixups to some additional calls to
validate_range that have appeared since then.

[1] https://source.android.com/devices/tech/debug/tagged-pointers
[2] tools/testing/selftests/vm/userfaultfd.c

Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com
Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com
Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b
Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI")
Signed-off-by: Peter Collingbourne <pcc@google.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Delva <adelva@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Mitch Phillips <mitchp@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William McVicker <willmcvicker@google.com>
Cc: <stable@vger.kernel.org> [5.4]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoio_uring: explicitly catch any illegal async queue attempt
Jens Axboe [Fri, 23 Jul 2021 17:53:54 +0000 (11:53 -0600)]
io_uring: explicitly catch any illegal async queue attempt

Catch an illegal case to queue async from an unrelated task that got
the ring fd passed to it. This should not be possible to hit, but
better be proactive and catch it explicitly. io-wq is extended to
check for early IO_WQ_WORK_CANCEL being set on a work item as well,
so it can run the request through the normal cancelation path.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio_uring: never attempt iopoll reissue from release path
Jens Axboe [Fri, 23 Jul 2021 17:49:29 +0000 (11:49 -0600)]
io_uring: never attempt iopoll reissue from release path

There are two reasons why this shouldn't be done:

1) Ring is exiting, and we're canceling requests anyway. Any request
   should be canceled anyway. In theory, this could iterate for a
   number of times if someone else is also driving the target block
   queue into request starvation, however the likelihood of this
   happening is miniscule.

2) If the original task decided to pass the ring to another task, then
   we don't want to be reissuing from this context as it may be an
   unrelated task or context. No assumptions should be made about
   the context in which ->release() is run. This can only happen for pure
   read/write, and we'll get -EFAULT on them anyway.

Link: https://lore.kernel.org/io-uring/YPr4OaHv0iv0KTOc@zeniv-ca.linux.org.uk/
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoMerge tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 23 Jul 2021 19:49:07 +0000 (12:49 -0700)]
Merge tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few fixes and one patch to help some block layer API cleanups:

   - skip missing device when running fstrim

   - fix unpersisted i_size on fsync after expanding truncate

   - fix lock inversion problem when doing qgroup extent tracing

   - replace bdgrab/bdput usage, replace gendisk by block_device"

* tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: store a block_device in struct btrfs_ordered_extent
  btrfs: fix lock inversion problem when doing qgroup extent tracing
  btrfs: check for missing device in btrfs_trim_fs
  btrfs: fix unpersisted i_size on fsync after expanding truncate

3 years agoMerge tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 23 Jul 2021 18:30:12 +0000 (11:30 -0700)]
Merge tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A subtle deadlock on lock_rwsem (marked for stable) and rbd fixes for
  a -rc1 regression.

  Also included a rare WARN condition tweak"

* tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client:
  rbd: resurrect setting of disk->private_data in rbd_init_disk()
  ceph: don't WARN if we're still opening a session to an MDS
  rbd: don't hold lock_rwsem while running_list is being drained
  rbd: always kick acquire on "acquired" and "released" notifications

3 years agoMerge tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 23 Jul 2021 18:25:21 +0000 (11:25 -0700)]
Merge tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix deadloop in ring buffer because of using stale "read" variable

 - Fix synthetic event use of field_pos as boolean and not an index

 - Fixed histogram special var "cpu" overriding event fields called
   "cpu"

 - Cleaned up error prone logic in alloc_synth_event()

 - Removed call to synchronize_rcu_tasks_rude() when not needed

 - Removed redundant initialization of a local variable "ret"

 - Fixed kernel crash when updating tracepoint callbacks of different
   priorities.

* tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracepoints: Update static_call before tp_funcs when adding a tracepoint
  ftrace: Remove redundant initialization of variable ret
  ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary
  tracing: Clean up alloc_synth_event()
  tracing/histogram: Rename "cpu" to "common_cpu"
  tracing: Synthetic event field_pos is an index not a boolean
  tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.

3 years agoMerge tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 23 Jul 2021 18:19:57 +0000 (11:19 -0700)]
Merge tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:

 - Fix a Mac defconfig regression due to the IDE -> ATA switch

* tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: MAC should select HAVE_PATA_PLATFORM

3 years agoMerge tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 23 Jul 2021 18:08:06 +0000 (11:08 -0700)]
Merge tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a recently broken Kconfig dependency and ACPI device
  reference counting in an iterator macro.

  Specifics:

   - Fix recently broken Kconfig dependency for the ACPI table override
     via built-in initrd (Robert Richter)

   - Fix ACPI device reference counting in the for_each_acpi_dev_match()
     helper macro to avoid use-after-free (Andy Shevchenko)"

* tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
  ACPI: Kconfig: Fix table override from built-in initrd

3 years agoMerge tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 23 Jul 2021 17:20:15 +0000 (10:20 -0700)]
Merge tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
 "Here are two small driver core fixes to resolve some reported problems
  for 5.14-rc3. They include:

   - aux bus memory leak fix

   - unneeded warning message removed when removing a device link.

  Both have been in linux-next with no reported problems"

* tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: Prevent warning when removing a device link from unregistered consumer
  driver core: auxiliary bus: Fix memory leak when driver_register() fail

3 years agoMerge tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Fri, 23 Jul 2021 17:14:56 +0000 (10:14 -0700)]
Merge tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are some small char/misc driver fixes for 5.14-rc3.

  Included in here are:

   - MAINTAINERS file updates for two changes in different driver
     subsystems

   - mhi bus bugfixes

   - nds32 bugfix that resolves a reported problem

  All have been in linux-next with no reported problems"

* tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  nds32: fix up stack guard gap
  MAINTAINERS: Change ACRN HSM driver maintainer
  MAINTAINERS: Update for VMCI driver
  bus: mhi: pci_generic: Fix inbound IPCR channel
  bus: mhi: core: Validate channel ID when processing command completions
  bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean

3 years agoMerge tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Fri, 23 Jul 2021 17:09:27 +0000 (10:09 -0700)]
Merge tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some USB fixes for 5.14-rc3 to resolve a bunch of tiny
  problems reported. Included in here are:

   - dtsi revert to resolve a problem which broke android systems that
     relied on the dts name to find the USB controller device.

     People are still working out the "real" solution for this, but for
     now the revert is needed.

   - core USB fix for pipe calculation found by syzbot

   - typec fixes

   - gadget driver fixes

   - new usb-serial device ids

   - new USB quirks

   - xhci fixes

   - usb hub fixes for power management issues reported

   - other tiny fixes

  All have been in linux-next with no reported problems"

* tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (27 commits)
  USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
  Revert "USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem"
  usb: cdc-wdm: fix build error when CONFIG_WWAN_CORE is not set
  Revert "arm64: dts: qcom: Harmonize DWC USB3 DT nodes name"
  usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
  usb: dwc2: Skip clock gating on Samsung SoCs
  usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
  usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode.
  usb: phy: Fix page fault from usb_phy_uevent
  usb: xhci: avoid renesas_usb_fw.mem when it's unusable
  usb: gadget: u_serial: remove WARN_ON on null port
  usb: dwc3: avoid NULL access of usb_gadget_driver
  usb: max-3421: Prevent corruption of freed memory
  usb: gadget: Fix Unbalanced pm_runtime_enable in tegra_xudc_probe
  MAINTAINERS: repair reference in USB IP DRIVER FOR HISILICON KIRIN 970
  usb: typec: stusb160x: Don't block probing of consumer of "connector" nodes
  usb: typec: stusb160x: register role switch before interrupt registration
  USB: usb-storage: Add LaCie Rugged USB3-FW to IGNORE_UAS
  usb: ehci: Prevent missed ehci interrupts with edge-triggered MSI
  usb: hub: Disable USB 3 device initiated lpm if exit latency is too high
  ...

3 years agoMerge tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 23 Jul 2021 16:58:23 +0000 (09:58 -0700)]
Merge tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of small fixes, mostly covering device-specific
  regressions and bugs over ASoC, HD-audio and USB-audio, while
  the ALSA PCM core received a few additional fixes for the
  possible (new and old) regressions"

* tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits)
  ALSA: usb-audio: Add registration quirk for JBL Quantum headsets
  ALSA: hda/hdmi: Add quirk to force pin connectivity on NUC10
  ALSA: pcm: Fix mmap without buffer preallocation
  ALSA: pcm: Fix mmap capability check
  ALSA: hda: intel-dsp-cfg: add missing ElkhartLake PCI ID
  ASoC: ti: j721e-evm: Check for not initialized parent_clk_id
  ASoC: ti: j721e-evm: Fix unbalanced domain activity tracking during startup
  ALSA: hda/realtek: Fix pop noise and 2 Front Mic issues on a machine
  ALSA: hdmi: Expose all pins on MSI MS-7C94 board
  ALSA: sb: Fix potential ABBA deadlock in CSP driver
  ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspend
  ASoC: amd: reverse stop sequence for stoneyridge platform
  ASoC: soc-pcm: add a flag to reverse the stop sequence
  ASoC: codecs: wcd938x: setup irq during component bind
  ASoC: dt-bindings: renesas: rsnd: Fix incorrect 'port' regex schema
  ALSA: usb-audio: Add missing proc text entry for BESPOKEN type
  ASoC: codecs: wcd938x: make sdw dependency explicit in Kconfig
  ASoC: SOF: Intel: Update ADL descriptor to use ACPI power states
  ASoC: rt5631: Fix regcache sync errors on resume
  ALSA: pcm: Call substream ack() method upon compat mmap commit
  ...

3 years agoMerge branch 'acpi-utils'
Rafael J. Wysocki [Fri, 23 Jul 2021 15:06:15 +0000 (17:06 +0200)]
Merge branch 'acpi-utils'

* acpi-utils:
  ACPI: utils: Fix reference counting in for_each_acpi_dev_match()

3 years agotracepoints: Update static_call before tp_funcs when adding a tracepoint
Steven Rostedt (VMware) [Fri, 23 Jul 2021 01:52:18 +0000 (21:52 -0400)]
tracepoints: Update static_call before tp_funcs when adding a tracepoint

Because of the significant overhead that retpolines pose on indirect
calls, the tracepoint code was updated to use the new "static_calls" that
can modify the running code to directly call a function instead of using
an indirect caller, and this function can be changed at runtime.

In the tracepoint code that calls all the registered callbacks that are
attached to a tracepoint, the following is done:

it_func_ptr = rcu_dereference_raw((&__tracepoint_##name)->funcs);
if (it_func_ptr) {
__data = (it_func_ptr)->data;
static_call(tp_func_##name)(__data, args);
}

If there's just a single callback, the static_call is updated to just call
that callback directly. Once another handler is added, then the static
caller is updated to call the iterator, that simply loops over all the
funcs in the array and calls each of the callbacks like the old method
using indirect calling.

The issue was discovered with a race between updating the funcs array and
updating the static_call. The funcs array was updated first and then the
static_call was updated. This is not an issue as long as the first element
in the old array is the same as the first element in the new array. But
that assumption is incorrect, because callbacks also have a priority
field, and if there's a callback added that has a higher priority than the
callback on the old array, then it will become the first callback in the
new array. This means that it is possible to call the old callback with
the new callback data element, which can cause a kernel panic.

static_call = callback1()
funcs[] = {callback1,data1};
callback2 has higher priority than callback1

CPU 1 CPU 2
----- -----

   new_funcs = {callback2,data2},
               {callback1,data1}

   rcu_assign_pointer(tp->funcs, new_funcs);

  /*
   * Now tp->funcs has the new array
   * but the static_call still calls callback1
   */

it_func_ptr = tp->funcs [ new_funcs ]
data = it_func_ptr->data [ data2 ]
static_call(callback1, data);

/* Now callback1 is called with
 * callback2's data */

[ KERNEL PANIC ]

   update_static_call(iterator);

To prevent this from happening, always switch the static_call to the
iterator before assigning the tp->funcs to the new array. The iterator will
always properly match the callback with its data.

To trigger this bug:

  In one terminal:

    while :; do hackbench 50; done

  In another terminal

    echo 1 > /sys/kernel/tracing/events/sched/sched_waking/enable
    while :; do
        echo 1 > /sys/kernel/tracing/set_event_pid;
        sleep 0.5
        echo 0 > /sys/kernel/tracing/set_event_pid;
        sleep 0.5
   done

And it doesn't take long to crash. This is because the set_event_pid adds
a callback to the sched_waking tracepoint with a high priority, which will
be called before the sched_waking trace event callback is called.

Note, the removal to a single callback updates the array first, before
changing the static_call to single callback, which is the proper order as
the first element in the array is the same as what the static_call is
being changed to.

Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba.org/
Cc: stable@vger.kernel.org
Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()")
Reported-by: Stefan Metzmacher <metze@samba.org>
tested-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Remove redundant initialization of variable ret
Colin Ian King [Wed, 21 Jul 2021 12:09:15 +0000 (13:09 +0100)]
ftrace: Remove redundant initialization of variable ret

The variable ret is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Link: https://lkml.kernel.org/r/20210721120915.122278-1-colin.king@canonical.com
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary
Nicolas Saenz Julienne [Wed, 21 Jul 2021 11:47:26 +0000 (13:47 +0200)]
ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary

synchronize_rcu_tasks_rude() triggers IPIs and forces rescheduling on
all CPUs. It is a costly operation and, when targeting nohz_full CPUs,
very disrupting (hence the name). So avoid calling it when 'old_hash'
doesn't need to be freed.

Link: https://lkml.kernel.org/r/20210721114726.1545103-1-nsaenzju@redhat.com
Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Clean up alloc_synth_event()
Steven Rostedt (VMware) [Wed, 21 Jul 2021 23:53:41 +0000 (19:53 -0400)]
tracing: Clean up alloc_synth_event()

alloc_synth_event() currently has the following code to initialize the
event fields and dynamic_fields:

for (i = 0, j = 0; i < n_fields; i++) {
event->fields[i] = fields[i];

if (fields[i]->is_dynamic) {
event->dynamic_fields[j] = fields[i];
event->dynamic_fields[j]->field_pos = i;
event->dynamic_fields[j++] = fields[i];
event->n_dynamic_fields++;
}
}

1) It would make more sense to have all fields keep track of their
   field_pos.

2) event->dynmaic_fields[j] is assigned twice for no reason.

3) We can move updating event->n_dynamic_fields outside the loop, and just
   assign it to j.

This combination makes the code much cleaner.

Link: https://lkml.kernel.org/r/20210721195341.29bb0f77@oasis.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing/histogram: Rename "cpu" to "common_cpu"
Steven Rostedt (VMware) [Wed, 21 Jul 2021 15:00:53 +0000 (11:00 -0400)]
tracing/histogram: Rename "cpu" to "common_cpu"

Currently the histogram logic allows the user to write "cpu" in as an
event field, and it will record the CPU that the event happened on.

The problem with this is that there's a lot of events that have "cpu"
as a real field, and using "cpu" as the CPU it ran on, makes it
impossible to run histograms on the "cpu" field of events.

For example, if I want to have a histogram on the count of the
workqueue_queue_work event on its cpu field, running:

 ># echo 'hist:keys=cpu' > events/workqueue/workqueue_queue_work/trigger

Gives a misleading and wrong result.

Change the command to "common_cpu" as no event should have "common_*"
fields as that's a reserved name for fields used by all events. And
this makes sense here as common_cpu would be a field used by all events.

Now we can even do:

 ># echo 'hist:keys=common_cpu,cpu if cpu < 100' > events/workqueue/workqueue_queue_work/trigger
 ># cat events/workqueue/workqueue_queue_work/hist
 # event histogram
 #
 # trigger info: hist:keys=common_cpu,cpu:vals=hitcount:sort=hitcount:size=2048 if cpu < 100 [active]
 #

 { common_cpu:          0, cpu:          2 } hitcount:          1
 { common_cpu:          0, cpu:          4 } hitcount:          1
 { common_cpu:          7, cpu:          7 } hitcount:          1
 { common_cpu:          0, cpu:          7 } hitcount:          1
 { common_cpu:          0, cpu:          1 } hitcount:          1
 { common_cpu:          0, cpu:          6 } hitcount:          2
 { common_cpu:          0, cpu:          5 } hitcount:          2
 { common_cpu:          1, cpu:          1 } hitcount:          4
 { common_cpu:          6, cpu:          6 } hitcount:          4
 { common_cpu:          5, cpu:          5 } hitcount:         14
 { common_cpu:          4, cpu:          4 } hitcount:         26
 { common_cpu:          0, cpu:          0 } hitcount:         39
 { common_cpu:          2, cpu:          2 } hitcount:        184

Now for backward compatibility, I added a trick. If "cpu" is used, and
the field is not found, it will fall back to "common_cpu" and work as
it did before. This way, it will still work for old programs that use
"cpu" to get the actual CPU, but if the event has a "cpu" as a field, it
will get that event's "cpu" field, which is probably what it wants
anyway.

I updated the tracefs/README to include documentation about both the
common_timestamp and the common_cpu. This way, if that text is present in
the README, then an application can know that common_cpu is supported over
just plain "cpu".

Link: https://lkml.kernel.org/r/20210721110053.26b4f641@oasis.local.home
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Fixes: 8b7622bf94a44 ("tracing: Add cpu field for hist triggers")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agotracing: Synthetic event field_pos is an index not a boolean
Steven Rostedt (VMware) [Wed, 21 Jul 2021 23:10:08 +0000 (19:10 -0400)]
tracing: Synthetic event field_pos is an index not a boolean

Performing the following:

 ># echo 'wakeup_lat s32 pid; u64 delta; char wake_comm[]' > synthetic_events
 ># echo 'hist:keys=pid:__arg__1=common_timestamp.usecs' > events/sched/sched_waking/trigger
 ># echo 'hist:keys=next_pid:pid=next_pid,delta=common_timestamp.usecs-$__arg__1:onmatch(sched.sched_waking).trace(wakeup_lat,$pid,$delta,prev_comm)'\
      > events/sched/sched_switch/trigger
 ># echo 1 > events/synthetic/enable

Crashed the kernel:

 BUG: kernel NULL pointer dereference, address: 000000000000001b
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP
 CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.13.0-rc5-test+ #104
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 RIP: 0010:strlen+0x0/0x20
 Code: f6 82 80 2b 0b bc 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2b 0b bc
  20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10
  48 89 f8 48 83 c0 01 80 38 9 f8 c3 31
 RSP: 0018:ffffaa75000d79d0 EFLAGS: 00010046
 RAX: 0000000000000002 RBX: ffff9cdb55575270 RCX: 0000000000000000
 RDX: ffff9cdb58c7a320 RSI: ffffaa75000d7b40 RDI: 000000000000001b
 RBP: ffffaa75000d7b40 R08: ffff9cdb40a4f010 R09: ffffaa75000d7ab8
 R10: ffff9cdb4398c700 R11: 0000000000000008 R12: ffff9cdb58c7a320
 R13: ffff9cdb55575270 R14: ffff9cdb58c7a000 R15: 0000000000000018
 FS:  0000000000000000(0000) GS:ffff9cdb5aa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000000000000001b CR3: 00000000c0612006 CR4: 00000000001706e0
 Call Trace:
  trace_event_raw_event_synth+0x90/0x1d0
  action_trace+0x5b/0x70
  event_hist_trigger+0x4bd/0x4e0
  ? cpumask_next_and+0x20/0x30
  ? update_sd_lb_stats.constprop.0+0xf6/0x840
  ? __lock_acquire.constprop.0+0x125/0x550
  ? find_held_lock+0x32/0x90
  ? sched_clock_cpu+0xe/0xd0
  ? lock_release+0x155/0x440
  ? update_load_avg+0x8c/0x6f0
  ? enqueue_entity+0x18a/0x920
  ? __rb_reserve_next+0xe5/0x460
  ? ring_buffer_lock_reserve+0x12a/0x3f0
  event_triggers_call+0x52/0xe0
  trace_event_buffer_commit+0x1ae/0x240
  trace_event_raw_event_sched_switch+0x114/0x170
  __traceiter_sched_switch+0x39/0x50
  __schedule+0x431/0xb00
  schedule_idle+0x28/0x40
  do_idle+0x198/0x2e0
  cpu_startup_entry+0x19/0x20
  secondary_startup_64_no_verify+0xc2/0xcb

The reason is that the dynamic events array keeps track of the field
position of the fields array, via the field_pos variable in the
synth_field structure. Unfortunately, that field is a boolean for some
reason, which means any field_pos greater than 1 will be a bug (in this
case it was 2).

Link: https://lkml.kernel.org/r/20210721191008.638bce34@oasis.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events")
Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoMerge tag 'drm-fixes-2021-07-23' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 23 Jul 2021 03:32:13 +0000 (20:32 -0700)]
Merge tag 'drm-fixes-2021-07-23' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Regular fixes - a bunch of amdgpu fixes are the main thing mostly for
  the new gpus. There is also some i915 reverts for older changes that
  were having some unwanted side effects. One nouveau fix for a report
  regressions, and otherwise just some misc fixes.

  core:
   - fix for non-drm ioctls on drm fd

  panel:
   - avoid double free

  ttm:
   - refcounting fix
   - NULL checks

  amdgpu:
   - Yellow Carp updates
   - Add some Yellow Carp DIDs
   - Beige Goby updates
   - CIK 10bit 4K regression fix
   - GFX10 golden settings updates
   - eDP panel regression fix
   - Misc display fixes
   - Aldebaran fix
   - fix COW checks

  nouveau:
   - init BO GEM fields

  i915:
   - revert async command parsing
   - revert fence error propogation
   - GVT fix for shadow ppgtt

  vc4:
   - fix interrupt handling"

* tag 'drm-fixes-2021-07-23' of git://anongit.freedesktop.org/drm/drm: (34 commits)
  drm/panel: raspberrypi-touchscreen: Prevent double-free
  drm/amdgpu - Corrected the video codecs array name for yellow carp
  drm/amd/display: Fix ASSR regression on embedded panels
  drm/amdgpu: add yellow carp pci id (v2)
  drm/amdgpu: update yellow carp external rev_id handling
  drm/amd/pm: Support board calibration on aldebaran
  drm/amd/display: change zstate allow msg condition
  drm/amd/display: Populate dtbclk entries for dcn3.02/3.03
  drm/amd/display: Line Buffer changes
  drm/amd/display: Remove MALL function from DCN3.1
  drm/amd/display: Only set default brightness for OLED
  drm/amd/display: Update bounding box for DCN3.1
  drm/amd/display: Query VCO frequency from register for DCN3.1
  drm/amd/display: Populate socclk entries for dcn3.02/3.03
  drm/amd/display: Fix max vstartup calculation for modes with borders
  drm/amd/display: implement workaround for riommu related hang
  drm/amd/display: Fix comparison error in dcn21 DML
  drm/i915: Correct the docs for intel_engine_cmd_parser
  drm/ttm: add missing NULL checks
  drm/ttm: Force re-init if ttm_global_init() fails
  ...

3 years agoMerge tag 'fallthrough-fixes-clang-5.14-rc3' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Fri, 23 Jul 2021 02:02:25 +0000 (19:02 -0700)]
Merge tag 'fallthrough-fixes-clang-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull fallthrough fix from Gustavo Silva:
 "Fix a fall-through warning when building with -Wimplicit-fallthrough
  on PowerPC"

* tag 'fallthrough-fixes-clang-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  powerpc/pasemi: Fix fall-through warning for Clang

3 years agoMerge tag 'drm-misc-fixes-2021-07-22' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 23 Jul 2021 01:16:58 +0000 (11:16 +1000)]
Merge tag 'drm-misc-fixes-2021-07-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Short summary of fixes pull:

 * Return -ENOTTY for non-DRM ioctls
 * amdgpu: Fix COW checks
 * nouveau: init BO GME fields
 * panel: Avoid double free
 * ttm: Fix refcounting in ttm_global_init(); NULL checks
 * vc4: Fix interrupt handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YPlbkmH6S4VAHP9j@linux-uq9g.fritz.box
3 years agoMerge tag 'drm-intel-fixes-2021-07-22' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 23 Jul 2021 00:43:37 +0000 (10:43 +1000)]
Merge tag 'drm-intel-fixes-2021-07-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

Couple reverts from Jason getting rid of asynchronous command parsing
and fence error propagation and a GVT fix of shadow ppgtt invalidation
with proper D3 state tracking from Colin.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YPl1sIyruD0U5Orl@intel.com
3 years agoio_uring: fix early fdput() of file
Jens Axboe [Thu, 22 Jul 2021 23:08:07 +0000 (17:08 -0600)]
io_uring: fix early fdput() of file

A previous commit shuffled some code around, and inadvertently used
struct file after fdput() had been called on it. As we can't touch
the file post fdput() dropping our reference, move the fdput() to
after that has been done.

Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/io-uring/YPnqM0fY3nM5RdRI@zeniv-ca.linux.org.uk/
Fixes: f2a48dd09b8e ("io_uring: refactor io_sq_offload_create()")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoMerge tag 'array-bounds-fixes-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 22 Jul 2021 21:38:28 +0000 (14:38 -0700)]
Merge tag 'array-bounds-fixes-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull array bounds warning fix from Gustavo Silva:
 "Fix a couple of out-of-bounds warnings in the media subsystem.

  This is part of the ongoing efforts to globally enable -Warray-bounds"

* tag 'array-bounds-fixes-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  media: ngene: Fix out-of-bounds bug in ngene_command_config_free_buf()

3 years agoMerge tag 'nvme-5.14-2021-07-22' of git://git.infradead.org/nvme into block-5.14
Jens Axboe [Thu, 22 Jul 2021 20:23:55 +0000 (14:23 -0600)]
Merge tag 'nvme-5.14-2021-07-22' of git://git.infradead.org/nvme into block-5.14

Pull NVMe fixes from Christoph:

"nvme fixes for Linux 5.14:

 - tracing fix (Keith Busch)
 - fix multipath head refcounting (Hannes Reinecke)
 - Write Zeroes vs PI fix (me)
 - drop a bogus WARN_ON (Zhihao Cheng)"

* tag 'nvme-5.14-2021-07-22' of git://git.infradead.org/nvme:
  nvme: set the PRACT bit when using Write Zeroes with T10 PI
  nvme: fix nvme_setup_command metadata trace event
  nvme: fix refcounting imbalance when all paths are down
  nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING

3 years agoMerge tag 'usb-serial-5.14-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Thu, 22 Jul 2021 18:51:14 +0000 (20:51 +0200)]
Merge tag 'usb-serial-5.14-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for 5.14-rc3

Here are some new device ids and a device-id comment fix.

All have been in linux-next with no reported issues.

* tag 'usb-serial-5.14-rc3' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial:
  USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
  USB: serial: cp210x: fix comments for GE CS1000
  USB: serial: option: add support for u-blox LARA-R6 family

3 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Thu, 22 Jul 2021 17:38:19 +0000 (10:38 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "A pair of arm64 fixes for -rc3. The straightforward one is a fix to
  our firmware calling stub, which accidentally started corrupting the
  link register on machines with SVE. Since these machines don't really
  exist yet, it wasn't spotted in -next.

  The other fix is a revert-and-a-bit of a patch originally intended to
  allow PTE-level huge mappings for the VMAP area on 32-bit PPC 8xx. A
  side-effect of this change was that our pXd_set_huge() implementations
  could be replaced with generic dummy functions depending on the levels
  of page-table being used, which in turn broke the boot if we fail to
  create the linear mapping as a result of using these functions to
  operate on the pgd. Huge thanks to Michael Ellerman for modifying the
  revert so as not to regress PPC 8xx in terms of functionality.

  Anyway, that's the background and it's also available in the commit
  message along with Link tags pointing at all of the fun.

  Summary:

   - Fix hang when issuing SMC on SVE-capable system due to
     clobbered LR

   - Fix boot failure due to missing block mappings with folded
     page-table"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"
  arm64: smccc: Save lr before calling __arm_smccc_sve_check()

3 years agoMerge tag 'hyperv-fixes-signed-20210722' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 22 Jul 2021 17:22:52 +0000 (10:22 -0700)]
Merge tag 'hyperv-fixes-signed-20210722' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - bug fix from Haiyang for vmbus CPU assignment

 - revert of a bogus patch that went into 5.14-rc1

* tag 'hyperv-fixes-signed-20210722' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Revert "x86/hyperv: fix logical processor creation"
  Drivers: hv: vmbus: Fix duplicate CPU assignments within a device

3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 22 Jul 2021 17:11:27 +0000 (10:11 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

 1) Fix type of bind option flag in af_xdp, from Baruch Siach.

 2) Fix use after free in bpf_xdp_link_release(), from Xuan Zhao.

 3) PM refcnt imbakance in r8152, from Takashi Iwai.

 4) Sign extension ug in liquidio, from Colin Ian King.

 5) Mising range check in s390 bpf jit, from Colin Ian King.

 6) Uninit value in caif_seqpkt_sendmsg(), from Ziyong Xuan.

 7) Fix skb page recycling race, from Ilias Apalodimas.

 8) Fix memory leak in tcindex_partial_destroy_work, from Pave Skripkin.

 9) netrom timer sk refcnt issues, from Nguyen Dinh Phi.

10) Fix data races aroun tcp's tfo_active_disable_stamp, from Eric
    Dumazet.

11) act_skbmod should only operate on ethernet packets, from Peilin Ye.

12) Fix slab out-of-bpunds in fib6_nh_flush_exceptions(),, from Psolo
    Abeni.

13) Fix sparx5 dependencies, from Yajun Deng.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (74 commits)
  dpaa2-switch: seed the buffer pool after allocating the swp
  net: sched: cls_api: Fix the the wrong parameter
  net: sparx5: fix unmet dependencies warning
  net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum
  net: dsa: ensure linearized SKBs in case of tail taggers
  ravb: Remove extra TAB
  ravb: Fix a typo in comment
  net: dsa: sja1105: make VID 4095 a bridge VLAN too
  tcp: disable TFO blackhole logic by default
  sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set
  net: ixp46x: fix ptp build failure
  ibmvnic: Remove the proper scrq flush
  selftests: net: add ESP-in-UDP PMTU test
  udp: check encap socket in __udp_lib_err
  sctp: update active_key for asoc when old key is being replaced
  r8169: Avoid duplicate sysfs entry creation error
  ixgbe: Fix packet corruption due to missing DMA sync
  Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()"
  ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions
  fsl/fman: Add fibre support
  ...

3 years agoMerge tag 'mmc-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Thu, 22 Jul 2021 16:51:38 +0000 (09:51 -0700)]
Merge tag 'mmc-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

 - Use kref to fix KASAN splats triggered during card removal

 - Don't allocate IDA for OF aliases

* tag 'mmc-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Don't allocate IDA for OF aliases
  mmc: core: Use kref in place of struct mmc_blk_data::usage

3 years agotracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
Haoran Luo [Wed, 21 Jul 2021 14:12:07 +0000 (14:12 +0000)]
tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.

The "rb_per_cpu_empty()" misinterpret the condition (as not-empty) when
"head_page" and "commit_page" of "struct ring_buffer_per_cpu" points to
the same buffer page, whose "buffer_data_page" is empty and "read" field
is non-zero.

An error scenario could be constructed as followed (kernel perspective):

1. All pages in the buffer has been accessed by reader(s) so that all of
them will have non-zero "read" field.

2. Read and clear all buffer pages so that "rb_num_of_entries()" will
return 0 rendering there's no more data to read. It is also required
that the "read_page", "commit_page" and "tail_page" points to the same
page, while "head_page" is the next page of them.

3. Invoke "ring_buffer_lock_reserve()" with large enough "length"
so that it shot pass the end of current tail buffer page. Now the
"head_page", "commit_page" and "tail_page" points to the same page.

4. Discard current event with "ring_buffer_discard_commit()", so that
"head_page", "commit_page" and "tail_page" points to a page whose buffer
data page is now empty.

When the error scenario has been constructed, "tracing_read_pipe" will
be trapped inside a deadloop: "trace_empty()" returns 0 since
"rb_per_cpu_empty()" returns 0 when it hits the CPU containing such
constructed ring buffer. Then "trace_find_next_entry_inc()" always
return NULL since "rb_num_of_entries()" reports there's no more entry
to read. Finally "trace_seq_to_user()" returns "-EBUSY" spanking
"tracing_read_pipe" back to the start of the "waitagain" loop.

I've also written a proof-of-concept script to construct the scenario
and trigger the bug automatically, you can use it to trace and validate
my reasoning above:

  https://github.com/aegistudio/RingBufferDetonator.git

Tests has been carried out on linux kernel 5.14-rc2
(2734d6c1b1a089fb593ef6a23d4b70903526fe0c), my fixed version
of kernel (for testing whether my update fixes the bug) and
some older kernels (for range of affected kernels). Test result is
also attached to the proof-of-concept repository.

Link: https://lore.kernel.org/linux-trace-devel/YPaNxsIlb2yjSi5Y@aegistudio/
Link: https://lore.kernel.org/linux-trace-devel/YPgrN85WL9VyrZ55@aegistudio
Cc: stable@vger.kernel.org
Fixes: bf41a158cacba ("ring-buffer: make reentrant")
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Haoran Luo <www@aegistudio.net>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agobtrfs: store a block_device in struct btrfs_ordered_extent
Christoph Hellwig [Thu, 22 Jul 2021 07:53:59 +0000 (09:53 +0200)]
btrfs: store a block_device in struct btrfs_ordered_extent

Store the block device instead of the gendisk in the btrfs_ordered_extent
structure instead of acquiring a reference to it later.

Note: this is from series removing bdgrab/bdput, btrfs is one of the
last users.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agobtrfs: fix lock inversion problem when doing qgroup extent tracing
Filipe Manana [Wed, 21 Jul 2021 16:31:48 +0000 (17:31 +0100)]
btrfs: fix lock inversion problem when doing qgroup extent tracing

At btrfs_qgroup_trace_extent_post() we call btrfs_find_all_roots() with a
NULL value as the transaction handle argument, which makes that function
take the commit_root_sem semaphore, which is necessary when we don't hold
a transaction handle or any other mechanism to prevent a transaction
commit from wiping out commit roots.

However btrfs_qgroup_trace_extent_post() can be called in a context where
we are holding a write lock on an extent buffer from a subvolume tree,
namely from btrfs_truncate_inode_items(), called either during truncate
or unlink operations. In this case we end up with a lock inversion problem
because the commit_root_sem is a higher level lock, always supposed to be
acquired before locking any extent buffer.

Lockdep detects this lock inversion problem since we switched the extent
buffer locks from custom locks to semaphores, and when running btrfs/158
from fstests, it reported the following trace:

[ 9057.626435] ======================================================
[ 9057.627541] WARNING: possible circular locking dependency detected
[ 9057.628334] 5.14.0-rc2-btrfs-next-93 #1 Not tainted
[ 9057.628961] ------------------------------------------------------
[ 9057.629867] kworker/u16:4/30781 is trying to acquire lock:
[ 9057.630824] ffff8e2590f58760 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 9057.632542]
               but task is already holding lock:
[ 9057.633551] ffff8e25582d4b70 (&fs_info->commit_root_sem){++++}-{3:3}, at: iterate_extent_inodes+0x10b/0x280 [btrfs]
[ 9057.635255]
               which lock already depends on the new lock.

[ 9057.636292]
               the existing dependency chain (in reverse order) is:
[ 9057.637240]
               -> #1 (&fs_info->commit_root_sem){++++}-{3:3}:
[ 9057.638138]        down_read+0x46/0x140
[ 9057.638648]        btrfs_find_all_roots+0x41/0x80 [btrfs]
[ 9057.639398]        btrfs_qgroup_trace_extent_post+0x37/0x70 [btrfs]
[ 9057.640283]        btrfs_add_delayed_data_ref+0x418/0x490 [btrfs]
[ 9057.641114]        btrfs_free_extent+0x35/0xb0 [btrfs]
[ 9057.641819]        btrfs_truncate_inode_items+0x424/0xf70 [btrfs]
[ 9057.642643]        btrfs_evict_inode+0x454/0x4f0 [btrfs]
[ 9057.643418]        evict+0xcf/0x1d0
[ 9057.643895]        do_unlinkat+0x1e9/0x300
[ 9057.644525]        do_syscall_64+0x3b/0xc0
[ 9057.645110]        entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 9057.645835]
               -> #0 (btrfs-tree-00){++++}-{3:3}:
[ 9057.646600]        __lock_acquire+0x130e/0x2210
[ 9057.647248]        lock_acquire+0xd7/0x310
[ 9057.647773]        down_read_nested+0x4b/0x140
[ 9057.648350]        __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 9057.649175]        btrfs_read_lock_root_node+0x31/0x40 [btrfs]
[ 9057.650010]        btrfs_search_slot+0x537/0xc00 [btrfs]
[ 9057.650849]        scrub_print_warning_inode+0x89/0x370 [btrfs]
[ 9057.651733]        iterate_extent_inodes+0x1e3/0x280 [btrfs]
[ 9057.652501]        scrub_print_warning+0x15d/0x2f0 [btrfs]
[ 9057.653264]        scrub_handle_errored_block.isra.0+0x135f/0x1640 [btrfs]
[ 9057.654295]        scrub_bio_end_io_worker+0x101/0x2e0 [btrfs]
[ 9057.655111]        btrfs_work_helper+0xf8/0x400 [btrfs]
[ 9057.655831]        process_one_work+0x247/0x5a0
[ 9057.656425]        worker_thread+0x55/0x3c0
[ 9057.656993]        kthread+0x155/0x180
[ 9057.657494]        ret_from_fork+0x22/0x30
[ 9057.658030]
               other info that might help us debug this:

[ 9057.659064]  Possible unsafe locking scenario:

[ 9057.659824]        CPU0                    CPU1
[ 9057.660402]        ----                    ----
[ 9057.660988]   lock(&fs_info->commit_root_sem);
[ 9057.661581]                                lock(btrfs-tree-00);
[ 9057.662348]                                lock(&fs_info->commit_root_sem);
[ 9057.663254]   lock(btrfs-tree-00);
[ 9057.663690]
                *** DEADLOCK ***

[ 9057.664437] 4 locks held by kworker/u16:4/30781:
[ 9057.665023]  #0: ffff8e25922a1148 ((wq_completion)btrfs-scrub){+.+.}-{0:0}, at: process_one_work+0x1c7/0x5a0
[ 9057.666260]  #1: ffffabb3451ffe70 ((work_completion)(&work->normal_work)){+.+.}-{0:0}, at: process_one_work+0x1c7/0x5a0
[ 9057.667639]  #2: ffff8e25922da198 (&ret->mutex){+.+.}-{3:3}, at: scrub_handle_errored_block.isra.0+0x5d2/0x1640 [btrfs]
[ 9057.669017]  #3: ffff8e25582d4b70 (&fs_info->commit_root_sem){++++}-{3:3}, at: iterate_extent_inodes+0x10b/0x280 [btrfs]
[ 9057.670408]
               stack backtrace:
[ 9057.670976] CPU: 7 PID: 30781 Comm: kworker/u16:4 Not tainted 5.14.0-rc2-btrfs-next-93 #1
[ 9057.672030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 9057.673492] Workqueue: btrfs-scrub btrfs_work_helper [btrfs]
[ 9057.674258] Call Trace:
[ 9057.674588]  dump_stack_lvl+0x57/0x72
[ 9057.675083]  check_noncircular+0xf3/0x110
[ 9057.675611]  __lock_acquire+0x130e/0x2210
[ 9057.676132]  lock_acquire+0xd7/0x310
[ 9057.676605]  ? __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 9057.677313]  ? lock_is_held_type+0xe8/0x140
[ 9057.677849]  down_read_nested+0x4b/0x140
[ 9057.678349]  ? __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 9057.679068]  __btrfs_tree_read_lock+0x24/0x110 [btrfs]
[ 9057.679760]  btrfs_read_lock_root_node+0x31/0x40 [btrfs]
[ 9057.680458]  btrfs_search_slot+0x537/0xc00 [btrfs]
[ 9057.681083]  ? _raw_spin_unlock+0x29/0x40
[ 9057.681594]  ? btrfs_find_all_roots_safe+0x11f/0x140 [btrfs]
[ 9057.682336]  scrub_print_warning_inode+0x89/0x370 [btrfs]
[ 9057.683058]  ? btrfs_find_all_roots_safe+0x11f/0x140 [btrfs]
[ 9057.683834]  ? scrub_write_block_to_dev_replace+0xb0/0xb0 [btrfs]
[ 9057.684632]  iterate_extent_inodes+0x1e3/0x280 [btrfs]
[ 9057.685316]  scrub_print_warning+0x15d/0x2f0 [btrfs]
[ 9057.685977]  ? ___ratelimit+0xa4/0x110
[ 9057.686460]  scrub_handle_errored_block.isra.0+0x135f/0x1640 [btrfs]
[ 9057.687316]  scrub_bio_end_io_worker+0x101/0x2e0 [btrfs]
[ 9057.688021]  btrfs_work_helper+0xf8/0x400 [btrfs]
[ 9057.688649]  ? lock_is_held_type+0xe8/0x140
[ 9057.689180]  process_one_work+0x247/0x5a0
[ 9057.689696]  worker_thread+0x55/0x3c0
[ 9057.690175]  ? process_one_work+0x5a0/0x5a0
[ 9057.690731]  kthread+0x155/0x180
[ 9057.691158]  ? set_kthread_struct+0x40/0x40
[ 9057.691697]  ret_from_fork+0x22/0x30

Fix this by making btrfs_find_all_roots() never attempt to lock the
commit_root_sem when it is called from btrfs_qgroup_trace_extent_post().

We can't just pass a non-NULL transaction handle to btrfs_find_all_roots()
from btrfs_qgroup_trace_extent_post(), because that would make backref
lookup not use commit roots and acquire read locks on extent buffers, and
therefore could deadlock when btrfs_qgroup_trace_extent_post() is called
from the btrfs_truncate_inode_items() code path which has acquired a write
lock on an extent buffer of the subvolume btree.

CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agobtrfs: check for missing device in btrfs_trim_fs
Anand Jain [Sun, 4 Jul 2021 11:14:39 +0000 (19:14 +0800)]
btrfs: check for missing device in btrfs_trim_fs

A fstrim on a degraded raid1 can trigger the following null pointer
dereference:

  BTRFS info (device loop0): allowing degraded mounts
  BTRFS info (device loop0): disk space caching is enabled
  BTRFS info (device loop0): has skinny extents
  BTRFS warning (device loop0): devid 2 uuid 97ac16f7-e14d-4db1-95bc-3d489b424adb is missing
  BTRFS warning (device loop0): devid 2 uuid 97ac16f7-e14d-4db1-95bc-3d489b424adb is missing
  BTRFS info (device loop0): enabling ssd optimizations
  BUG: kernel NULL pointer dereference, address: 0000000000000620
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP NOPTI
  CPU: 0 PID: 4574 Comm: fstrim Not tainted 5.13.0-rc7+ #31
  Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
  RIP: 0010:btrfs_trim_fs+0x199/0x4a0 [btrfs]
  RSP: 0018:ffff959541797d28 EFLAGS: 00010293
  RAX: 0000000000000000 RBX: ffff946f84eca508 RCX: a7a67937adff8608
  RDX: ffff946e8122d000 RSI: 0000000000000000 RDI: ffffffffc02fdbf0
  RBP: ffff946ea4615000 R08: 0000000000000001 R09: 0000000000000000
  R10: 0000000000000000 R11: ffff946e8122d960 R12: 0000000000000000
  R13: ffff959541797db8 R14: ffff946e8122d000 R15: ffff959541797db8
  FS:  00007f55917a5080(0000) GS:ffff946f9bc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000620 CR3: 000000002d2c8001 CR4: 00000000000706f0
  Call Trace:
  btrfs_ioctl_fitrim+0x167/0x260 [btrfs]
  btrfs_ioctl+0x1c00/0x2fe0 [btrfs]
  ? selinux_file_ioctl+0x140/0x240
  ? syscall_trace_enter.constprop.0+0x188/0x240
  ? __x64_sys_ioctl+0x83/0xb0
  __x64_sys_ioctl+0x83/0xb0

Reproducer:

  $ mkfs.btrfs -fq -d raid1 -m raid1 /dev/loop0 /dev/loop1
  $ mount /dev/loop0 /btrfs
  $ umount /btrfs
  $ btrfs dev scan --forget
  $ mount -o degraded /dev/loop0 /btrfs

  $ fstrim /btrfs

The reason is we call btrfs_trim_free_extents() for the missing device,
which uses device->bdev (NULL for missing device) to find if the device
supports discard.

Fix is to check if the device is missing before calling
btrfs_trim_free_extents().

CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agobtrfs: fix unpersisted i_size on fsync after expanding truncate
Filipe Manana [Tue, 6 Jul 2021 14:41:15 +0000 (15:41 +0100)]
btrfs: fix unpersisted i_size on fsync after expanding truncate

If we have an inode that does not have the full sync flag set, was changed
in the current transaction, then it is logged while logging some other
inode (like its parent directory for example), its i_size is increased by
a truncate operation, the log is synced through an fsync of some other
inode and then finally we explicitly call fsync on our inode, the new
i_size is not persisted.

The following example shows how to trigger it, with comments explaining
how and why the issue happens:

  $ mkfs.btrfs -f /dev/sdc
  $ mount /dev/sdc /mnt

  $ touch /mnt/foo
  $ xfs_io -f -c "pwrite -S 0xab 0 1M" /mnt/bar

  $ sync

  # Fsync bar, this will be a noop since the file has not yet been
  # modified in the current transaction. The goal here is to clear
  # BTRFS_INODE_NEEDS_FULL_SYNC from the inode's runtime flags.
  $ xfs_io -c "fsync" /mnt/bar

  # Now rename both files, without changing their parent directory.
  $ mv /mnt/bar /mnt/bar2
  $ mv /mnt/foo /mnt/foo2

  # Increase the size of bar2 with a truncate operation.
  $ xfs_io -c "truncate 2M" /mnt/bar2

  # Now fsync foo2, this results in logging its parent inode (the root
  # directory), and logging the parent results in logging the inode of
  # file bar2 (its inode item and the new name). The inode of file bar2
  # is logged with an i_size of 0 bytes since it's logged in
  # LOG_INODE_EXISTS mode, meaning we are only logging its names (and
  # xattrs if it had any) and the i_size of the inode will not be changed
  # when the log is replayed.
  $ xfs_io -c "fsync" /mnt/foo2

  # Now explicitly fsync bar2. This resulted in doing nothing, not
  # logging the inode with the new i_size of 2M and the hole from file
  # offset 1M to 2M. Because the inode did not have the flag
  # BTRFS_INODE_NEEDS_FULL_SYNC set, when it was logged through the
  # fsync of file foo2, its last_log_commit field was updated,
  # resulting in this explicit of file bar2 not doing anything.
  $ xfs_io -c "fsync" /mnt/bar2

  # File bar2 content and size before a power failure.
  $ od -A d -t x1 /mnt/bar2
  0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
  *
  1048576 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  *
  2097152

  <power failure>

  # Mount the filesystem to replay the log.
  $ mount /dev/sdc /mnt

  # Read the file again, should have the same content and size as before
  # the power failure happened, but it doesn't, i_size is still at 1M.
  $ od -A d -t x1 /mnt/bar2
  0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab
  *
  1048576

This started to happen after commit 209ecbb8585bf6 ("btrfs: remove stale
comment and logic from btrfs_inode_in_log()"), since btrfs_inode_in_log()
no longer checks if the inode's list of modified extents is not empty.
However, checking that list is not the right way to address this case
and the check was added long time ago in commit 125c4cf9f37c98
("Btrfs: set inode's logged_trans/last_log_commit after ranged fsync")
for a different purpose, to address consecutive ranged fsyncs.

The reason that checking for the list emptiness makes this test pass is
because during an expanding truncate we create an extent map to represent
a hole from the old i_size to the new i_size, and add that extent map to
the list of modified extents in the inode. However if we are low on
available memory and we can not allocate a new extent map, then we don't
treat it as an error and just set the full sync flag on the inode, so that
the next fsync does not rely on the list of modified extents - so checking
for the emptiness of the list to decide if the inode needs to be logged is
not reliable, and results in not logging the inode if it was not possible
to allocate the extent map for the hole.

Fix this by ensuring that if we are only logging that an inode exists
(inode item, names/references and xattrs), we don't update the inode's
last_log_commit even if it does not have the full sync runtime flag set.

A test case for fstests follows soon.

CC: stable@vger.kernel.org # 5.13+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
3 years agodpaa2-switch: seed the buffer pool after allocating the swp
Ioana Ciornei [Thu, 22 Jul 2021 12:15:51 +0000 (15:15 +0300)]
dpaa2-switch: seed the buffer pool after allocating the swp

Any interraction with the buffer pool (seeding a buffer, acquire one) is
made through a software portal (SWP, a DPIO object).
There are circumstances where the dpaa2-switch driver probes on a DPSW
before any DPIO devices have been probed. In this case, seeding of the
buffer pool will lead to a panic since no SWPs are initialized.

To fix this, seed the buffer pool after making sure that the software
portals have been probed and are ready to be used.

Fixes: 0b1b71370458 ("staging: dpaa2-switch: handle Rx path on control interface")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrm/panel: raspberrypi-touchscreen: Prevent double-free
Maxime Ripard [Tue, 20 Jul 2021 13:45:23 +0000 (15:45 +0200)]
drm/panel: raspberrypi-touchscreen: Prevent double-free

The mipi_dsi_device allocated by mipi_dsi_device_register_full() is
already free'd on release.

Fixes: 2f733d6194bd ("drm/panel: Add support for the Raspberry Pi 7" Touchscreen.")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720134525.563936-9-maxime@cerno.tech
3 years agonet: sched: cls_api: Fix the the wrong parameter
Yajun Deng [Thu, 22 Jul 2021 03:23:43 +0000 (11:23 +0800)]
net: sched: cls_api: Fix the the wrong parameter

The 4th parameter in tc_chain_notify() should be flags rather than seq.
Let's change it back correctly.

Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi")
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: sparx5: fix unmet dependencies warning
Randy Dunlap [Wed, 21 Jul 2021 22:33:36 +0000 (15:33 -0700)]
net: sparx5: fix unmet dependencies warning

WARNING: unmet direct dependencies detected for PHY_SPARX5_SERDES
  Depends on [n]: (ARCH_SPARX5 || COMPILE_TEST [=n]) && OF [=y] && HAS_IOMEM [=y]
  Selected by [y]:
  - SPARX5_SWITCH [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICROCHIP [=y] && NET_SWITCHDEV [=y] && HAS_IOMEM [=y] && OF [=y]

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Lars Povlsen <lars.povlsen@microchip.com>
Cc: Steen Hegelund <Steen.Hegelund@microchip.com>
Cc: UNGLinuxDriver@microchip.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoALSA: usb-audio: Add registration quirk for JBL Quantum headsets
Alexander Tsoy [Wed, 21 Jul 2021 23:56:05 +0000 (02:56 +0300)]
ALSA: usb-audio: Add registration quirk for JBL Quantum headsets

These devices has two interfaces, but only the second interface
contains the capture endpoint, thus quirk is required to delay the
registration until the second interface appears.

Tested-by: Jakub Fišer <jakub@ufiseru.cz>
Signed-off-by: Alexander Tsoy <alexander@tsoy.me>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210721235605.53741-1-alexander@tsoy.me
Signed-off-by: Takashi Iwai <tiwai@suse.de>
3 years agoMerge branch 'ksz-dsa-fixes'
David S. Miller [Thu, 22 Jul 2021 06:15:06 +0000 (23:15 -0700)]
Merge branch 'ksz-dsa-fixes'

Lino Sanfilippo says:

====================
Fixes for KSZ DSA switch

These patches fix issues I encountered while using a KSZ9897 as a DSA
switch with a broadcom GENET network device as the DSA master device.

PATCH 1 fixes an invalid access to an SKB in case it is scattered.
PATCH 2 fixes incorrect hardware checksum calculation caused by the DSA
tag.

Changes in v2:
- instead of linearizing the SKBs only for KSZ switches ensure linearized
  SKBs for all tail taggers by clearing the feature flags NETIF_F_HW_SG and
  NETIF_F_FRAGLIST (suggested by Vladimir Oltean)

The patches have been tested with a KSZ9897 and apply against net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: tag_ksz: dont let the hardware process the layer 4 checksum
Lino Sanfilippo [Wed, 21 Jul 2021 21:56:42 +0000 (23:56 +0200)]
net: dsa: tag_ksz: dont let the hardware process the layer 4 checksum

If the checksum calculation is offloaded to the network device (e.g due to
NETIF_F_HW_CSUM inherited from the DSA master device), the calculated
layer 4 checksum is incorrect. This is since the DSA tag which is placed
after the layer 4 data is considered as being part of the daa and thus
errorneously included into the checksum calculation.
To avoid this, always calculate the layer 4 checksum in software.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: ensure linearized SKBs in case of tail taggers
Lino Sanfilippo [Wed, 21 Jul 2021 21:56:41 +0000 (23:56 +0200)]
net: dsa: ensure linearized SKBs in case of tail taggers

The function skb_put() that is used by tail taggers to make room for the
DSA tag must only be called for linearized SKBS. However in case that the
slave device inherited features like NETIF_F_HW_SG or NETIF_F_FRAGLIST the
SKB passed to the slaves transmit function may not be linearized.
Avoid those SKBs by clearing the NETIF_F_HW_SG and NETIF_F_FRAGLIST flags
for tail taggers.
Furthermore since the tagging protocol can be changed at runtime move the
code for setting up the slaves features into dsa_slave_setup_tagger().

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoravb: Remove extra TAB
Biju Das [Wed, 21 Jul 2021 18:21:26 +0000 (19:21 +0100)]
ravb: Remove extra TAB

Align the member description comments for struct ravb_desc by
removing the extra TAB.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoravb: Fix a typo in comment
Biju Das [Wed, 21 Jul 2021 18:17:21 +0000 (19:17 +0100)]
ravb: Fix a typo in comment

Fix the typo RX->TX in comment, as the code following the comment
process TX and not RX.

Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agonet: dsa: sja1105: make VID 4095 a bridge VLAN too
Vladimir Oltean [Wed, 21 Jul 2021 12:37:59 +0000 (15:37 +0300)]
net: dsa: sja1105: make VID 4095 a bridge VLAN too

This simple series of commands:

ip link add br0 type bridge vlan_filtering 1
ip link set swp0 master br0

fails on sja1105 with the following error:
[   33.439103] sja1105 spi0.1: vlan-lookup-table needs to have at least the default untagged VLAN
[   33.447710] sja1105 spi0.1: Invalid config, cannot upload
Warning: sja1105: Failed to change VLAN Ethertype.

For context, sja1105 has 3 operating modes:
- SJA1105_VLAN_UNAWARE: the dsa_8021q_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_FULL: the bridge_vlans are committed to hardware
- SJA1105_VLAN_FILTERING_BEST_EFFORT: both the dsa_8021q_vlans and the
  bridge_vlans are committed to hardware

Swapping out a VLAN list and another in happens in
sja1105_build_vlan_table(), which performs a delta update procedure.
That function is called from a few places, notably from
sja1105_vlan_filtering() which is called from the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler.

The above set of 2 commands fails when run on a kernel pre-commit
8841f6e63f2c ("net: dsa: sja1105: make devlink property
best_effort_vlan_filtering true by default"). So the priv->vlan_state
transition that takes place is between VLAN-unaware and full VLAN
filtering. So the dsa_8021q_vlans are swapped out and the bridge_vlans
are swapped in.

So why does it fail?

Well, the bridge driver, through nbp_vlan_init(), first sets up the
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute, and only then
proceeds to call nbp_vlan_add for the default_pvid.

So when we swap out the dsa_8021q_vlans and swap in the bridge_vlans in
the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler, there are no bridge
VLANs (yet). So we have wiped the VLAN table clean, and the low-level
static config checker complains of an invalid configuration. We _will_
add the bridge VLANs using the dynamic config interface, albeit later,
when nbp_vlan_add() calls us. So it is natural that it fails.

So why did it ever work?

Surprisingly, it looks like I only tested this configuration with 2
things set up in a particular way:
- a network manager that brings all ports up
- a kernel with CONFIG_VLAN_8021Q=y

It is widely known that commit ad1afb003939 ("vlan_dev: VLAN 0 should be
treated as "no vlan tag" (802.1p packet)") installs VID 0 to every net
device that comes up. DSA treats these VLANs as bridge VLANs, and
therefore, in my testing, the list of bridge_vlans was never empty.

However, if CONFIG_VLAN_8021Q is not enabled, or the port is not up when
it joins a VLAN-aware bridge, the bridge_vlans list will be temporarily
empty, and the sja1105_static_config_reload() call from
sja1105_vlan_filtering() will fail.

To fix this, the simplest thing is to keep VID 4095, the one used for
CPU-injected control packets since commit ed040abca4c1 ("net: dsa:
sja1105: use 4095 as the private VLAN for untagged traffic"), in the
list of bridge VLANs too, not just the list of tag_8021q VLANs. This
ensures that the list of bridge VLANs will never be empty.

Fixes: ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Reported-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agotcp: disable TFO blackhole logic by default
Wei Wang [Wed, 21 Jul 2021 17:27:38 +0000 (10:27 -0700)]
tcp: disable TFO blackhole logic by default

Multiple complaints have been raised from the TFO users on the internet
stating that the TFO blackhole logic is too aggressive and gets falsely
triggered too often.
(e.g. https://blog.apnic.net/2021/07/05/tcp-fast-open-not-so-fast/)
Considering that most middleboxes no longer drop TFO packets, we decide
to disable the blackhole logic by setting
/proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_set to 0 by default.

Fixes: cf1ef3f0719b4 ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 'amd-drm-fixes-5.14-2021-07-21' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Thu, 22 Jul 2021 00:52:54 +0000 (10:52 +1000)]
Merge tag 'amd-drm-fixes-5.14-2021-07-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.14-2021-07-21:

amdgpu:
- Yellow Carp updates
- Add some Yellow Carp DIDs
- Beige Goby updates
- CIK 10bit 4K regression fix
- GFX10 golden settings updates
- eDP panel regression fix
- Misc display fixes
- Aldebaran fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721215800.17590-1-alexander.deucher@amd.com
3 years agodrm/amdgpu - Corrected the video codecs array name for yellow carp
Veerabadhran Gopalakrishnan [Mon, 19 Jul 2021 13:36:23 +0000 (19:06 +0530)]
drm/amdgpu - Corrected the video codecs array name for yellow carp

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agosctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set
Xin Long [Wed, 21 Jul 2021 18:45:54 +0000 (14:45 -0400)]
sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set

Currently, in sctp_packet_config(), sctp_transport_pmtu_check() is
called to update transport pathmtu with dst's mtu when dst's mtu
has been changed by non sctp stack like xfrm.

However, this should only happen when SPP_PMTUD_ENABLE is set, no
matter where dst's mtu changed. This patch is to fix by checking
SPP_PMTUD_ENABLE flag before calling sctp_transport_pmtu_check().

Thanks Jacek for reporting and looking into this issue.

v1->v2:
  - add the missing "{" to fix the build error.

Fixes: 69fec325a643 ('Revert "sctp: remove sctp_transport_pmtu_check"')
Reported-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
Tested-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoMerge tag 's390-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Wed, 21 Jul 2021 20:51:26 +0000 (13:51 -0700)]
Merge tag 's390-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - fix / add expoline usage in "DMA" code

 - fix compat vdso Makefile to avoid permanent rebuild

 - fix ftrace_update_ftrace_func to avoid NULL pointer dereference

 - update defconfigs

 - trivial coding style fix

* tag 's390-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: update defconfigs
  s390/cpumf: fix semicolon.cocci warnings
  s390/boot: fix use of expolines in the DMA code
  s390/ftrace: fix ftrace_update_ftrace_func implementation
  s390/defconfig: allow early device mapper disks
  s390/vdso32: add vdso32.lds to targets

3 years agoMerge tag 'spi-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Wed, 21 Jul 2021 19:41:41 +0000 (12:41 -0700)]
Merge tag 'spi-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A collection of driver specific fixes, there was a bit of a kerfuffle
  with some last minute review on hte spi-cadence-quadspi division by
  zero change but otherwise nothing terribly remarkable here - important
  fixes if you have the hardware but nothing with too wide an impact"

* tag 'spi-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: spi-bcm2835: Fix deadlock
  spi: cadence: Correct initialisation of runtime PM again
  spi: cadence-quadspi: Disable Auto-HW polling
  spi: spi-cadence-quadspi: Fix division by zero warning
  spi: spi-cadence-quadspi: Revert "Fix division by zero warning"
  spi: spi-cadence-quadspi: Fix division by zero warning
  spi: mediatek: move devm_spi_register_master position
  spi: mediatek: fix fifo rx mode
  spi: atmel: Fix CS and initialization bug
  spi: stm32: fixes pm_runtime calls in probe/remove
  spi: imx: mx51-ecspi: Reinstate low-speed CONFIGREG delay
  spi: stm32h7: fix full duplex irq handler handling

3 years agoMerge tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 21 Jul 2021 19:37:49 +0000 (12:37 -0700)]
Merge tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "A few driver specific fixes that came in since the merge window, plus
  a change to mark the regulator-fixed-domain DT binding as deprecated
  in order to try to to discourage any new users while a better solution
  is put in place"

* tag 'regulator-fix-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: hi6421: Fix getting wrong drvdata
  regulator: mtk-dvfsrc: Fix wrong dev pointer for devm_regulator_register
  regulator: fixed: Mark regulator-fixed-domain as deprecated
  regulator: bd9576: Fix testing wrong flag in check_temp_flag_mismatch
  regulator: hi6421v600: Fix getting wrong drvdata that causes boot failure
  regulator: rt5033: Fix n_voltages settings for BUCK and LDO
  regulator: rtmv20: Fix wrong mask for strobe-polarity-high

3 years agoMerge tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowe...
Linus Torvalds [Wed, 21 Jul 2021 18:51:59 +0000 (11:51 -0700)]
Merge tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:

 - Fix a tracepoint that causes one of the tracing subsystem query files
   to crash if the module is loaded

 - Fix afs_writepages() to take account of whether the storage rpc
   actually succeeded when updating the cyclic writeback counter

 - Fix some error code propagation/handling

 - Fix place where afs_writepages() was setting writeback_index to a
   file position rather than a page index

* tag 'afs-fixes-20210721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Remove redundant assignment to ret
  afs: Fix setting of writeback_index
  afs: check function return
  afs: Fix tracepoint string placement with built-in AFS

3 years agodrm/amd/display: Fix ASSR regression on embedded panels
Stylon Wang [Wed, 21 Jul 2021 04:25:24 +0000 (12:25 +0800)]
drm/amd/display: Fix ASSR regression on embedded panels

[Why]
Regression found in some embedded panels traces back to the earliest
upstreamed ASSR patch. The changed code flow are causing problems
with some panels.

[How]
- Change ASSR enabling code while preserving original code flow
  as much as possible
- Simplify the code on guarding with internal display flag

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=213779
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1620
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu: add yellow carp pci id (v2)
Aaron Liu [Wed, 4 Nov 2020 05:04:06 +0000 (13:04 +0800)]
drm/amdgpu: add yellow carp pci id (v2)

Add Yellow Carp PCI id support.

v2: add another DID

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amdgpu: update yellow carp external rev_id handling
Aaron Liu [Wed, 2 Jun 2021 02:32:41 +0000 (10:32 +0800)]
drm/amdgpu: update yellow carp external rev_id handling

0x1681 has a different external revision id.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/pm: Support board calibration on aldebaran
Lijo Lazar [Thu, 15 Jul 2021 06:54:49 +0000 (14:54 +0800)]
drm/amd/pm: Support board calibration on aldebaran

Add support for board power calibration on Aldebaran.
Board calibration is done after DC offset calibration.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: change zstate allow msg condition
Eric Yang [Wed, 30 Jun 2021 22:22:51 +0000 (18:22 -0400)]
drm/amd/display: change zstate allow msg condition

[Why]
PMFW message which previously thought to only control Z9 controls both
Z9 and Z10. Also HW design team requested that Z9 must only be supported
on eDP due to content protection interop.

[How]
Change zstate support condition to match updated policy

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Populate dtbclk entries for dcn3.02/3.03
Bindu Ramamurthy [Fri, 9 Jul 2021 14:35:33 +0000 (10:35 -0400)]
drm/amd/display: Populate dtbclk entries for dcn3.02/3.03

[Why]
Populate dtbclk values from bwparams for dcn302, dcn303.

[How]
dtbclk values are fetched from bandwidthparams for all DPM levels and
for DPM levels where smu returns 0, previous level values are reported.

Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Line Buffer changes
Nevenko Stupar [Fri, 9 Jul 2021 17:05:11 +0000 (13:05 -0400)]
drm/amd/display: Line Buffer changes

DCN 3x increased Line buffer size for DCHUB latency hiding, from 4 lines
of 4K resolution lines to 5 lines of 4K resolution lines. All Line
Buffer can be used as extended memory for P State change latency hiding.
The maximum number of lines is increased to 32 lines. Finally,
LB_MEMORY_CONFIG_1 (LB memory piece 1) and LB_MEMORY _CONFIG_2 (LB
memory piece 2) are not affected, no change in size, only 3 pieces is
affected, i.e., when all 3 pieces are used in both LB_MEMORY_CONFIG_0
and LB_MEMORY_CONFIG_3 (for 4:2:0) modes.

Reviewed-by: Jun Lei <jun.lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nevenko Stupar <Nevenko.Stupar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Remove MALL function from DCN3.1
Mikita Lipski [Tue, 15 Jun 2021 00:21:42 +0000 (20:21 -0400)]
drm/amd/display: Remove MALL function from DCN3.1

[why]
DCN31 doesn't have MALL in DMUB so to avoid sending unknown commands to
DMUB just remove the function pointer.

[how]
Remove apply_idle_power_optimizations from function pointers structure
for DCN31

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Only set default brightness for OLED
Camille Cho [Thu, 8 Jul 2021 10:28:37 +0000 (18:28 +0800)]
drm/amd/display: Only set default brightness for OLED

[Why]
We used to unconditionally set backlight path as AUX for panels capable
of backlight adjustment via DPCD in set default brightness.

[How]
This should be limited to OLED panel only since we control backlight via
PWM path for SDR mode in LCD HDR panel.

Reviewed-by: Krunoslav Kovac <krunoslav.kovac@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Camille Cho <Camille.Cho@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Update bounding box for DCN3.1
Nicholas Kazlauskas [Thu, 8 Jul 2021 16:59:59 +0000 (12:59 -0400)]
drm/amd/display: Update bounding box for DCN3.1

[Why & How]
We're missing a default value for dram_channel_width_bytes in the
DCN3.1 SOC bounding box and we don't currently have the interface in
place to query the actual value from VBIOS.

Put in a hardcoded default until we have the interface in place.

Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Query VCO frequency from register for DCN3.1
Nicholas Kazlauskas [Wed, 7 Jul 2021 20:38:57 +0000 (16:38 -0400)]
drm/amd/display: Query VCO frequency from register for DCN3.1

[Why]
Hardcoding the VCO frequency isn't correct since we don't own or control
the value.

In the case where the hardcode is also missing we can't lightup display.

[How]
Query from the CLK register instead. Update the DFS frequency to be able
to compute the VCO frequency.

Reviewed-by: Eric Yang <eric.yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Populate socclk entries for dcn3.02/3.03
Bindu Ramamurthy [Thu, 27 May 2021 14:11:32 +0000 (10:11 -0400)]
drm/amd/display: Populate socclk entries for dcn3.02/3.03

[Why]
Initialize socclk entries in bandwidth params for dcn302, dcn303.

[How]
Fetch the sockclk values from smu for the DPM levels and for the DPM
levels where smu returns 0, previous level values are reported.

Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix max vstartup calculation for modes with borders
Nicholas Kazlauskas [Wed, 7 Jul 2021 17:19:14 +0000 (13:19 -0400)]
drm/amd/display: Fix max vstartup calculation for modes with borders

[Why]
Vertical and horizontal borders in timings are treated as increasing the
active area - vblank and hblank actually shrink.

Our input into DML does not include these borders so it incorrectly
assumes it has more time than available for vstartup and tmdl
calculations for some modes with borders.

An example of such a timing would be 640x480@72Hz:

h_total: 832
h_border_left: 8
h_addressable: 640
h_border_right: 8
h_front_porch: 16
h_sync_width: 40
v_total: 520
v_border_top: 8
v_addressable: 480
v_border_bottom: 8
v_front_porch: 1
v_sync_width: 3
pix_clk_100hz: 315000

[How]
Include borders as part of destination vactive/hactive.

This change DCN20+ so it has wide impact, but the destination vactive
and hactive are only really used for vstartup calculation anyway.

Most modes do not have vertical or horizontal borders.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: implement workaround for riommu related hang
Eric Yang [Wed, 23 Jun 2021 19:48:02 +0000 (15:48 -0400)]
drm/amd/display: implement workaround for riommu related hang

[Why]
During S4/S5/reboot, sometimes riommu invalidation request arrive too
early, DCN may be unable to respond to the invalidation request
resulting in pstate hang.

[How]
VBIOS will force allow pstate for riommu invalidation and driver will
clear it after powering down display pipes.

Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agodrm/amd/display: Fix comparison error in dcn21 DML
Victor Lu [Thu, 24 Jun 2021 15:05:42 +0000 (11:05 -0400)]
drm/amd/display: Fix comparison error in dcn21 DML

[why]
A comparison error made it possible to not iterate through all the
specified prefetch modes.

[how]
Correct "<" to "<="

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Yongqiang Sun <Yongqiang.Sun@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 years agoMerge tag 'asoc-fix-v5.14-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git...
Takashi Iwai [Wed, 21 Jul 2021 17:48:09 +0000 (19:48 +0200)]
Merge tag 'asoc-fix-v5.14-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.14

A collection of fixes for ASoC that have come in since the merge window,
all driver specific.  There is a new core feature added for reversing
the order of operations when shutting down, this is needed to fix a bug
with the AMD Stonyridge platform, and we also tweak the Kconfig to make
the SSM2518 driver user selectable so it can be used with generic cards
but that requires no actual code changes.

3 years agoUSB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
John Keeping [Wed, 21 Jul 2021 16:17:45 +0000 (17:17 +0100)]
USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick

Add the USB serial device ID for the CEL ZigBee EM3588 radio stick.

Signed-off-by: John Keeping <john@metanate.com>
Cc: stable@vger.kernel.org
Signed-off-by: Johan Hovold <johan@kernel.org>
3 years agonet: ixp46x: fix ptp build failure
Arnd Bergmann [Wed, 21 Jul 2021 15:19:32 +0000 (17:19 +0200)]
net: ixp46x: fix ptp build failure

The rework of the ixp46x cpu detection left the network driver in
a half broken state:

drivers/net/ethernet/xscale/ptp_ixp46x.c: In function 'ptp_ixp_init':
drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: error: 'IXP4XX_TIMESYNC_BASE_VIRT' undeclared (first use in this function)
  290 |                 (struct ixp46x_ts_regs __iomem *) IXP4XX_TIMESYNC_BASE_VIRT;
      |                                                   ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: note: each undeclared identifier is reported only once for each function it appears in
drivers/net/ethernet/xscale/ptp_ixp46x.c: At top level:
drivers/net/ethernet/xscale/ptp_ixp46x.c:323:1: error: data definition has no type or storage class [-Werror]
  323 | module_init(ptp_ixp_init);

I have patches to complete the transition for a future release, but
for the moment, add the missing include statements to get it to build
again.

Fixes: 09aa9aabdcc4 ("soc: ixp4xx: move cpu detection to linux/soc/ixp4xx/cpu.h")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoibmvnic: Remove the proper scrq flush
Sukadev Bhattiprolu [Wed, 21 Jul 2021 02:34:39 +0000 (19:34 -0700)]
ibmvnic: Remove the proper scrq flush

Commit 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset")
intended to remove the call to ibmvnic_tx_scrq_flush() when the
->resetting flag is true and was tested that way. But during the final
rebase to net-next, the hunk got applied to a block few lines below
(which happened to have the same diff context) and the wrong call to
ibmvnic_tx_scrq_flush() got removed.

Fix that by removing the correct ibmvnic_tx_scrq_flush() and restoring
the one that was incorrectly removed.

Fixes: 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset")
Reported-by: Dany Madden <drt@linux.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoRevert "x86/hyperv: fix logical processor creation"
Wei Liu [Wed, 21 Jul 2021 15:55:43 +0000 (15:55 +0000)]
Revert "x86/hyperv: fix logical processor creation"

This reverts commit 450605c28d571eddca39a65fdbc1338add44c6d9.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
3 years agoMerge branch 'pmtu-esp'
David S. Miller [Wed, 21 Jul 2021 15:50:52 +0000 (08:50 -0700)]
Merge branch 'pmtu-esp'

Vadim Fedorenko ays:

====================
Fix PMTU for ESP-in-UDP encapsulation

Bug 213669 uncovered regression in PMTU discovery for UDP-encapsulated
routes and some incorrect usage in udp tunnel fields. This series fixes
problems and also adds such case for selftests

v3:
 - update checking logic to account SCTP use case
v2:
 - remove refactor code that was in first patch
 - move checking logic to __udp{4,6}_lib_err_encap
 - add more tests, especially routed configuration
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agoselftests: net: add ESP-in-UDP PMTU test
Vadim Fedorenko [Tue, 20 Jul 2021 20:35:29 +0000 (23:35 +0300)]
selftests: net: add ESP-in-UDP PMTU test

The case of ESP in UDP encapsulation was not covered before. Add
cases of local changes of MTU and difference on routed path.

Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodrm/i915: Correct the docs for intel_engine_cmd_parser
Jason Ekstrand [Tue, 20 Jul 2021 18:21:08 +0000 (13:21 -0500)]
drm/i915: Correct the docs for intel_engine_cmd_parser

In 93b713304188 ("drm/i915: Revert "drm/i915/gem: Asynchronous
cmdparser""), the parameters to intel_engine_cmd_parser() were altered
without updating the docs, causing Fi.CI.DOCS to start failing.

Fixes: c9d9fdbc108a ("drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"")
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720182108.2761496-1-jason@jlekstrand.net
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Added 'Fixes:' tag and corrected the hash for the ancestor]
(cherry picked from commit 15eb083bdb561bb4862cd04cd0523e55483e877e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Updated Fixes tag to match fixes branch]

3 years agoudp: check encap socket in __udp_lib_err
Vadim Fedorenko [Tue, 20 Jul 2021 20:35:28 +0000 (23:35 +0300)]
udp: check encap socket in __udp_lib_err

Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
added checks for encapsulated sockets but it broke cases when there is
no implementation of encap_err_lookup for encapsulation, i.e. ESP in
UDP encapsulation. Fix it by calling encap_err_lookup only if socket
implements this method otherwise treat it as legal socket.

Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err")
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agosctp: update active_key for asoc when old key is being replaced
Xin Long [Tue, 20 Jul 2021 20:07:01 +0000 (16:07 -0400)]
sctp: update active_key for asoc when old key is being replaced

syzbot reported a call trace:

  BUG: KASAN: use-after-free in sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
  Call Trace:
   sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
   sctp_set_owner_w net/sctp/socket.c:131 [inline]
   sctp_sendmsg_to_asoc+0x152e/0x2180 net/sctp/socket.c:1865
   sctp_sendmsg+0x103b/0x1d30 net/sctp/socket.c:2027
   inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
   sock_sendmsg_nosec net/socket.c:703 [inline]
   sock_sendmsg+0xcf/0x120 net/socket.c:723

This is an use-after-free issue caused by not updating asoc->shkey after
it was replaced in the key list asoc->endpoint_shared_keys, and the old
key was freed.

This patch is to fix by also updating active_key for asoc when old key is
being replaced with a new one. Note that this issue doesn't exist in
sctp_auth_del_key_id(), as it's not allowed to delete the active_key
from the asoc.

Fixes: 1b1e0bc99474 ("sctp: add refcnt support for sh_key")
Reported-by: syzbot+b774577370208727d12b@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3 years agodriver core: Prevent warning when removing a device link from unregistered consumer
Adrian Hunter [Fri, 16 Jul 2021 11:44:07 +0000 (14:44 +0300)]
driver core: Prevent warning when removing a device link from unregistered consumer

sysfs_remove_link() causes a warning if the parent directory does not
exist. That can happen if the device link consumer has not been registered.
So do not attempt sysfs_remove_link() in that case.

Fixes: 287905e68dd29 ("driver core: Expose device link details in sysfs")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org # 5.9+
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210716114408.17320-2-adrian.hunter@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agonvme: set the PRACT bit when using Write Zeroes with T10 PI
Christoph Hellwig [Wed, 21 Jul 2021 08:00:11 +0000 (10:00 +0200)]
nvme: set the PRACT bit when using Write Zeroes with T10 PI

When using Write Zeroes on a namespace that has protection
information enabled they behavior without the PRACT bit
counter-intuitive and will generally lead to validation failures
when reading the written blocks.  Fix this by always setting the
PRACT bit that generates matching PI data on the fly.

Fixes: 6e02318eaea5 ("nvme: add support for the Write Zeroes command")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>