]> git.proxmox.com Git - mirror_ubuntu-kernels.git/log
mirror_ubuntu-kernels.git
4 years agomm: replace memmap_context by meminit_context
Laurent Dufour [Sat, 26 Sep 2020 04:19:28 +0000 (21:19 -0700)]
mm: replace memmap_context by meminit_context

Patch series "mm: fix memory to node bad links in sysfs", v3.

Sometimes, firmware may expose interleaved memory layout like this:

 Early memory node ranges
   node   1: [mem 0x0000000000000000-0x000000011fffffff]
   node   2: [mem 0x0000000120000000-0x000000014fffffff]
   node   1: [mem 0x0000000150000000-0x00000001ffffffff]
   node   0: [mem 0x0000000200000000-0x000000048fffffff]
   node   2: [mem 0x0000000490000000-0x00000007ffffffff]

In that case, we can see memory blocks assigned to multiple nodes in
sysfs:

  $ ls -l /sys/devices/system/memory/memory21
  total 0
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -> ../../node/node1
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -> ../../node/node2
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 online
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
  drwxr-xr-x 2 root root     0 Aug 24 05:27 power
  -r--r--r-- 1 root root 65536 Aug 24 05:27 removable
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 state
  lrwxrwxrwx 1 root root     0 Aug 24 05:25 subsystem -> ../../../../bus/memory
  -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
  -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones

The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.

This is wrong but doesn't prevent the system to run.  However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:

  kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
  CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
  Call Trace:
    add_memory_resource+0x23c/0x340 (unreliable)
    __add_memory+0x5c/0xf0
    dlpar_add_lmb+0x1b4/0x500
    dlpar_memory+0x1f8/0xb80
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    vfs_write+0xe8/0x290
    ksys_write+0xdc/0x130
    system_call_exception+0x160/0x270
    system_call_common+0xf0/0x27c

This has been seen on PowerPC LPAR.

The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.

There are two issues here:

 (a) The sysfs memory and node's layouts are broken due to these
     multiple links

 (b) The link errors in link_mem_sections() should not lead to a system
     panic.

To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not.  This is addressed by the patches 1 and 2 of this
series.

Issue (b) will be addressed separately.

This patch (of 2):

The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.

Make it general to the hotplug operation and rename it as
meminit_context.

There is no functional change introduced by this patch

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoarch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback
Mikulas Patocka [Sat, 26 Sep 2020 04:19:24 +0000 (21:19 -0700)]
arch/x86/lib/usercopy_64.c: fix __copy_user_flushcache() cache writeback

If we copy less than 8 bytes and if the destination crosses a cache
line, __copy_user_flushcache would invalidate only the first cache line.

This patch makes it invalidate the second cache line as well.

Fixes: 0aed55af88345b ("x86, uaccess: introduce copy_from_iter_flushcache for pmem / cache-bypass operations")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.wiilliams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/alpine.LRH.2.02.2009161451140.21915@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agolib/memregion.c: include memregion.h
Jason Yan [Sat, 26 Sep 2020 04:19:21 +0000 (21:19 -0700)]
lib/memregion.c: include memregion.h

This addresses the following sparse warning:

  lib/memregion.c:8:5: warning: symbol 'memregion_alloc' was not declared. Should it be static?
  lib/memregion.c:14:6: warning: symbol 'memregion_free' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20200921142852.875312-1-yanaijie@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agolib/string.c: implement stpcpy
Nick Desaulniers [Sat, 26 Sep 2020 04:19:18 +0000 (21:19 -0700)]
lib/string.c: implement stpcpy

LLVM implemented a recent "libcall optimization" that lowers calls to
`sprintf(dest, "%s", str)` where the return value is used to
`stpcpy(dest, str) - dest`.

This generally avoids the machinery involved in parsing format strings.
`stpcpy` is just like `strcpy` except it returns the pointer to the new
tail of `dest`.  This optimization was introduced into clang-12.

Implement this so that we don't observe linkage failures due to missing
symbol definitions for `stpcpy`.

Similar to last year's fire drill with: commit 5f074f3e192f
("lib/string.c: implement a basic bcmp")

The kernel is somewhere between a "freestanding" environment (no full
libc) and "hosted" environment (many symbols from libc exist with the
same type, function signature, and semantics).

As Peter Anvin notes, there's not really a great way to inform the
compiler that you're targeting a freestanding environment but would like
to opt-in to some libcall optimizations (see pr/47280 below), rather
than opt-out.

Arvind notes, -fno-builtin-* behaves slightly differently between GCC
and Clang, and Clang is missing many __builtin_* definitions, which I
consider a bug in Clang and am working on fixing.

Masahiro summarizes the subtle distinction between compilers justly:
  To prevent transformation from foo() into bar(), there are two ways in
  Clang to do that; -fno-builtin-foo, and -fno-builtin-bar.  There is
  only one in GCC; -fno-buitin-foo.

(Any difference in that behavior in Clang is likely a bug from a missing
__builtin_* definition.)

Masahiro also notes:
  We want to disable optimization from foo() to bar(),
  but we may still benefit from the optimization from
  foo() into something else. If GCC implements the same transform, we
  would run into a problem because it is not -fno-builtin-bar, but
  -fno-builtin-foo that disables that optimization.

  In this regard, -fno-builtin-foo would be more future-proof than
  -fno-built-bar, but -fno-builtin-foo is still potentially overkill. We
  may want to prevent calls from foo() being optimized into calls to
  bar(), but we still may want other optimization on calls to foo().

It seems that compilers today don't quite provide the fine grain control
over which libcall optimizations pseudo-freestanding environments would
prefer.

Finally, Kees notes that this interface is unsafe, so we should not
encourage its use.  As such, I've removed the declaration from any
header, but it still needs to be exported to avoid linkage errors in
modules.

Reported-by: Sami Tolvanen <samitolvanen@google.com>
Suggested-by: Andy Lavr <andy.lavr@gmail.com>
Suggested-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Joe Perches <joe@perches.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
Link: https://bugs.llvm.org/show_bug.cgi?id=47162
Link: https://bugs.llvm.org/show_bug.cgi?id=47280
Link: https://github.com/ClangBuiltLinux/linux/issues/1126
Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
Link: https://reviews.llvm.org/D85963
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/migrate: correct thp migration stats
Zi Yan [Sat, 26 Sep 2020 04:19:14 +0000 (21:19 -0700)]
mm/migrate: correct thp migration stats

PageTransHuge returns true for both thp and hugetlb, so thp stats was
counting both thp and hugetlb migrations.  Exclude hugetlb migration by
setting is_thp variable right.

Clean up thp handling code too when we are there.

Fixes: 1a5bae25e3cf ("mm/vmstat: add events for THP migration without split")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lkml.kernel.org/r/20200917210413.1462975-1-zi.yan@sent.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm/gup: fix gup_fast with dynamic page table folding
Vasily Gorbik [Sat, 26 Sep 2020 04:19:10 +0000 (21:19 -0700)]
mm/gup: fix gup_fast with dynamic page table folding

Currently to make sure that every page table entry is read just once
gup_fast walks perform READ_ONCE and pass pXd value down to the next
gup_pXd_range function by value e.g.:

  static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                           unsigned int flags, struct page **pages, int *nr)
  ...
          pudp = pud_offset(&p4d, addr);

This function passes a reference on that local value copy to pXd_offset,
and might get the very same pointer in return.  This happens when the
level is folded (on most arches), and that pointer should not be
iterated.

On s390 due to the fact that each task might have different 5,4 or
3-level address translation and hence different levels folded the logic
is more complex and non-iteratable pointer to a local copy leads to
severe problems.

Here is an example of what happens with gup_fast on s390, for a task
with 3-level paging, crossing a 2 GB pud boundary:

  // addr = 0x1007ffff000, end = 0x10080001000
  static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
                           unsigned int flags, struct page **pages, int *nr)
  {
        unsigned long next;
        pud_t *pudp;

        // pud_offset returns &p4d itself (a pointer to a value on stack)
        pudp = pud_offset(&p4d, addr);
        do {
                // on second iteratation reading "random" stack value
                pud_t pud = READ_ONCE(*pudp);

                // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390
                next = pud_addr_end(addr, end);
                ...
        } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack

        return 1;
  }

This happens since s390 moved to common gup code with commit
d1874a0c2805 ("s390/mm: make the pxd_offset functions more robust") and
commit 1a42010cdc26 ("s390/mm: convert to the generic
get_user_pages_fast code").

s390 tried to mimic static level folding by changing pXd_offset
primitives to always calculate top level page table offset in pgd_offset
and just return the value passed when pXd_offset has to act as folded.

What is crucial for gup_fast and what has been overlooked is that
PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly.
And the latter is not possible with dynamic folding.

To fix the issue in addition to pXd values pass original pXdp pointers
down to gup_pXd_range functions.  And introduce pXd_offset_lockless
helpers, which take an additional pXd entry value parameter.  This has
already been discussed in

  https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1

Fixes: 1a42010cdc26 ("s390/mm: convert to the generic get_user_pages_fast code")
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org> [5.2+]
Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: memcontrol: fix missing suffix of workingset_restore
Muchun Song [Sat, 26 Sep 2020 04:19:05 +0000 (21:19 -0700)]
mm: memcontrol: fix missing suffix of workingset_restore

We forget to add the suffix to the workingset_restore string, so fix it.

And also update the documentation of cgroup-v2.rst.

Fixes: 170b04b7ae49 ("mm/workingset: prepare the workingset detection infrastructure for anon LRU")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lkml.kernel.org/r/20200916100030.71698-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm, THP, swap: fix allocating cluster for swapfile by mistake
Gao Xiang [Sat, 26 Sep 2020 04:19:01 +0000 (21:19 -0700)]
mm, THP, swap: fix allocating cluster for swapfile by mistake

SWP_FS is used to make swap_{read,write}page() go through the
filesystem, and it's only used for swap files over NFS.  So, !SWP_FS
means non NFS for now, it could be either file backed or device backed.
Something similar goes with legacy SWP_FILE.

So in order to achieve the goal of the original patch, SWP_BLKDEV should
be used instead.

FS corruption can be observed with SSD device + XFS + fragmented
swapfile due to CONFIG_THP_SWAP=y.

I reproduced the issue with the following details:

Environment:

  QEMU + upstream kernel + buildroot + NVMe (2 GB)

Kernel config:

  CONFIG_BLK_DEV_NVME=y
  CONFIG_THP_SWAP=y

Some reproducible steps:

  mkfs.xfs -f /dev/nvme0n1
  mkdir /tmp/mnt
  mount /dev/nvme0n1 /tmp/mnt
  bs="32k"
  sz="1024m"    # doesn't matter too much, I also tried 16m
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -F -S 0 -b $bs 0 $sz" -c "fdatasync" /tmp/mnt/sw
  xfs_io -f -c "pwrite -R -b $bs 0 $sz" -c "fsync" /tmp/mnt/sw

  mkswap /tmp/mnt/sw
  swapon /tmp/mnt/sw

  stress --vm 2 --vm-bytes 600M   # doesn't matter too much as well

Symptoms:
 - FS corruption (e.g. checksum failure)
 - memory corruption at: 0xd2808010
 - segfault

Fixes: f0eea189e8e9 ("mm, THP, swap: Don't allocate huge cluster for file backed swap device")
Fixes: 38d8b4e6bdc8 ("mm, THP, swap: delay splitting THP during swap out")
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Acked-by: Rafael Aquini <aquini@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Carlos Maiolino <cmaiolino@redhat.com>
Cc: Eric Sandeen <esandeen@redhat.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20200820045323.7809-1-hsiangkao@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 26 Sep 2020 00:15:19 +0000 (17:15 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm fixes from Paolo Bonzini:
 "Five small fixes.

  The nested migration bug will be fixed with a better API in 5.10 or
  5.11, for now this is a fix that works with existing userspace but
  keeps the current ugly API"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: SVM: Add a dedicated INVD intercept routine
  KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
  KVM: x86: fix MSR_IA32_TSC read for nested migration
  selftests: kvm: Fix assert failure in single-step test
  KVM: x86: VMX: Make smaller physical guest address space support user-configurable

4 years agoMerge tag 'mips_fixes_5.9_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Fri, 25 Sep 2020 22:24:04 +0000 (15:24 -0700)]
Merge tag 'mips_fixes_5.9_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - fixed FP register access on Loongsoon-3

 - added missing 1074 cpu handling

 - fixed Loongson2ef build error

* tag 'mips_fixes_5.9_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: BCM47XX: Remove the needless check with the 1074K
  MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
  MIPS: Loongson2ef: Disable Loongson MMI instructions
  MIPS: Loongson-3: Fix fp register access if MSA enabled

4 years agoMerge tag 'spi-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Linus Torvalds [Fri, 25 Sep 2020 22:21:54 +0000 (15:21 -0700)]
Merge tag 'spi-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A small collection of driver specific fixes, the fsl-espi and bcm-qspi
  changes in particular have been causing breakage for users"

* tag 'spi-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: bcm-qspi: Fix probe regression on iProc platforms
  spi: fsl-dspi: fix use-after-free in remove path
  spi: fsl-espi: Only process interrupts for expected events
  spi: bcm2835: Make polling_limit_us static
  spi: spi-fsl-dspi: use XSPI mode instead of DMA for DPAA2 SoCs

4 years agoMerge tag 'regulator-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 25 Sep 2020 22:16:01 +0000 (15:16 -0700)]
Merge tag 'regulator-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
 "A single fix for incorrect specification of some of the register
  fields on axp20x devices which would break voltage setting on affected
  systems"

* tag 'regulator-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: axp20x: fix LDO2/4 description

4 years agoMerge tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 25 Sep 2020 22:11:24 +0000 (15:11 -0700)]
Merge tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fixes from Mark Brown:
 "Two issues here - one is a fix for use after free issues in the case
  where a regmap overrides its name using something dynamically
  generated, the other is that we weren't handling access checks
  non-incrementing I/O on registers within paged register regions
  correctly resulting in spurious errors.

  Both of these are quite rare but serious if they occur"

* tag 'regmap-fix-v5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: fix page selection for noinc writes
  regmap: fix page selection for noinc reads
  regmap: debugfs: Add back in erroneously removed initialisation of ret
  regmap: debugfs: Fix handling of name string for debugfs init delays

4 years agoMerge tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6
Linus Torvalds [Fri, 25 Sep 2020 17:46:11 +0000 (10:46 -0700)]
Merge tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6

Pull NFS server fix from Chuck Lever:
 "Fix incorrect calculation on platforms that implement
  flush_dcache_page()"

* tag 'nfsd-5.9-2' of git://git.linux-nfs.org/projects/cel/cel-2.6:
  SUNRPC: Fix svc_flush_dcache()

4 years agoMerge tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 25 Sep 2020 17:39:22 +0000 (10:39 -0700)]
Merge tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix more fallout of recent RCU-lockdep changes in CPU idle code
  and two devfreq issues.

  Specifics:

   - Export rcu_idle_{enter,exit} to modules to fix build issues
     introduced by recent RCU-lockdep fixes (Borislav Petkov)

   - Add missing return statement to a stub function in the ACPI
     processor driver to fix a build issue introduced by recent
     RCU-lockdep fixes (Rafael Wysocki)

   - Fix recently introduced suspicious RCU usage warnings in the PSCI
     cpuidle driver and drop stale comments regarding RCU_NONIDLE()
     usage from enter_s2idle_proper() (Ulf Hansson)

   - Fix error code path in the tegra30 devfreq driver (Dan Carpenter)

   - Add missing information to devfreq_summary debugfs (Chanwoo Choi)"

* tag 'pm-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
  PM / devfreq: tegra30: Disable clock on error in probe
  PM / devfreq: Add timer type to devfreq_summary debugfs
  cpuidle: Drop misleading comments about RCU usage
  cpuidle: psci: Fix suspicious RCU usage
  rcu/tree: Export rcu_idle_{enter,exit} to modules

4 years agoKVM: SVM: Add a dedicated INVD intercept routine
Tom Lendacky [Thu, 24 Sep 2020 18:41:57 +0000 (13:41 -0500)]
KVM: SVM: Add a dedicated INVD intercept routine

The INVD instruction intercept performs emulation. Emulation can't be done
on an SEV guest because the guest memory is encrypted.

Provide a dedicated intercept routine for the INVD intercept. And since
the instruction is emulated as a NOP, just skip it instead.

Fixes: 1654efcbc431 ("KVM: SVM: Add KVM_SEV_INIT command")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <a0b9a19ffa7fef86a3cc700c7ea01cb2731e04e5.1600972918.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Fri, 25 Sep 2020 16:49:19 +0000 (09:49 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fix from Jason Gunthorpe:
 "One fix for a bug that blktests hits when using rxe: tear down the CQ
  pool before waiting for all references to go away"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Fix ordering of CQ pool destruction

4 years agoMerge tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 25 Sep 2020 16:41:57 +0000 (09:41 -0700)]
Merge tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fairly quiet, a couple of i915 fixes, one dma-buf fix, one vc4 and two
  sun4i changes

  dma-buf:
   - Single null pointer deref fix

  i915:
   - Fix selftest reference to stack data out of scope
   - Fix GVT null pointer dereference

  vc4:
   - fill asoc card owner

  sun4i:
   - program secondary CSC correctly"

* tag 'drm-fixes-2020-09-25' of git://anongit.freedesktop.org/drm/drm:
  drm/i915/selftests: Push the fake iommu device from the stack to data
  dmabuf: fix NULL pointer dereference in dma_buf_release()
  drm/i915/gvt: Fix port number for BDW on EDID region setup
  drm/sun4i: mixer: Extend regmap max_register
  drm/sun4i: sun8i-csc: Secondary CSC register correction
  drm/vc4/vc4_hdmi: fill ASoC card owner

4 years agoMerge branch 'pm-cpuidle'
Rafael J. Wysocki [Fri, 25 Sep 2020 16:33:46 +0000 (18:33 +0200)]
Merge branch 'pm-cpuidle'

* pm-cpuidle:
  ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
  cpuidle: Drop misleading comments about RCU usage
  cpuidle: psci: Fix suspicious RCU usage
  rcu/tree: Export rcu_idle_{enter,exit} to modules

4 years agoMerge tag 'devfreq-fixes-for-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Rafael J. Wysocki [Fri, 25 Sep 2020 14:33:19 +0000 (16:33 +0200)]
Merge tag 'devfreq-fixes-for-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux

Pull devfreq updates for 5.9-rc7 from Chanwoo Choi:

"1. Update devfreq core
  - Add missing timer type to devfreq_summary debugfs node.

 2. Fix devfreq device driver
  - Fix the exception handling about clock on tegra30-devfreq.c"

* tag 'devfreq-fixes-for-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
  PM / devfreq: tegra30: Disable clock on error in probe
  PM / devfreq: Add timer type to devfreq_summary debugfs

4 years agoKVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
Sean Christopherson [Wed, 23 Sep 2020 21:53:52 +0000 (14:53 -0700)]
KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE

Reset the MMU context during kvm_set_cr4() if SMAP or PKE is toggled.
Recent commits to (correctly) not reload PDPTRs when SMAP/PKE are
toggled inadvertantly skipped the MMU context reset due to the mask
of bits that triggers PDPTR loads also being used to trigger MMU context
resets.

Fixes: 427890aff855 ("kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode")
Fixes: cb957adb4ea4 ("kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode")
Cc: Jim Mattson <jmattson@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Oliver Upton <oupton@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923215352.17756-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMerge tag 'drm-misc-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 25 Sep 2020 01:28:36 +0000 (11:28 +1000)]
Merge tag 'drm-misc-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v5.9:
- Single null pointer deref fix for dma-buf.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4106c21e-f52c-4c05-6cdb-daa743bb8617@linux.intel.com
4 years agoMerge tag 'drm-intel-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 25 Sep 2020 01:07:01 +0000 (11:07 +1000)]
Merge tag 'drm-intel-fixes-2020-09-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.9-rc7:
- Fix selftest reference to stack data out of scope
- Fix GVT null pointer dereference
- Backmerge from Linus' master to fix build

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zh5fpmha.fsf@intel.com
4 years agoBackMerge commit '98477740630f270aecf648f1d6a9dbc6027d4ff1' into drm-fixes
Dave Airlie [Fri, 25 Sep 2020 01:06:18 +0000 (11:06 +1000)]
BackMerge commit '98477740630f270aecf648f1d6a9dbc6027d4ff1' into drm-fixes

The dax mess had some fallout, and i915 used a later base to fix their CI.

Signed-off-by: Dave Airlie <airlied@redhat.com>
4 years agoKVM: x86: fix MSR_IA32_TSC read for nested migration
Maxim Levitsky [Mon, 21 Sep 2020 10:38:05 +0000 (13:38 +0300)]
KVM: x86: fix MSR_IA32_TSC read for nested migration

MSR reads/writes should always access the L1 state, since the (nested)
hypervisor should intercept all the msrs it wants to adjust, and these
that it doesn't should be read by the guest as if the host had read it.

However IA32_TSC is an exception. Even when not intercepted, guest still
reads the value + TSC offset.
The write however does not take any TSC offset into account.

This is documented in Intel's SDM and seems also to happen on AMD as well.

This creates a problem when userspace wants to read the IA32_TSC value and then
write it. (e.g for migration)

In this case it reads L2 value but write is interpreted as an L1 value.
To fix this make the userspace initiated reads of IA32_TSC return L1 value
as well.

Huge thanks to Dave Gilbert for helping me understand this very confusing
semantic of MSR writes.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20200921103805.9102-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMerge tag 'mmc-v5.9-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Thu, 24 Sep 2020 16:09:47 +0000 (09:09 -0700)]
Merge tag 'mmc-v5.9-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fix from Ulf Hansson:
 "Fix build warning in mmc_spi when CONFIG_HAS_DMA is unset"

* tag 'mmc-v5.9-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: mmc_spi: Fix mmc_spi_dma_alloc() return type for !HAS_DMA

4 years agoMerge tag 'media/v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 24 Sep 2020 16:05:04 +0000 (09:05 -0700)]
Merge tag 'media/v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - fix a regression at the CEC adapter core

 - two uAPI patches (one revert) for changes in this development cycle

* tag 'media/v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: dt-bindings: media: imx274: Convert to json-schema
  media: media/v4l2: remove V4L2_FLAG_MEMORY_NON_CONSISTENT flag
  media: cec-adap.c: don't use flush_scheduled_work()

4 years agoMerge tag 'sound-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Thu, 24 Sep 2020 16:00:05 +0000 (09:00 -0700)]
Merge tag 'sound-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Just a handful small device-specific fixes including a couple of
  reverts"

* tag 'sound-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  Revert "ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control"
  Revert "ALSA: hda - Fix silent audio output and corrupted input on MSI X570-A PRO"
  ALSA: usb-audio: Add delay quirk for H570e USB headsets
  ALSA: hda/realtek: Enable front panel headset LED on Lenovo ThinkStation P520
  ALSA: hda/realtek - Couldn't detect Mic if booting with headset plugged
  ALSA: asihpi: fix iounmap in error handler

4 years agomm: fix misplaced unlock_page in do_wp_page()
Linus Torvalds [Thu, 24 Sep 2020 15:41:32 +0000 (08:41 -0700)]
mm: fix misplaced unlock_page in do_wp_page()

Commit 09854ba94c6a ("mm: do_wp_page() simplification") reorganized all
the code around the page re-use vs copy, but in the process also moved
the final unlock_page() around to after the wp_page_reuse() call.

That normally doesn't matter - but it means that the unlock_page() is
now done after releasing the page table lock.  Again, not a big deal,
you'd think.

But it turns out that it's very wrong indeed, because once we've
released the page table lock, we've basically lost our only reference to
the page - the page tables - and it could now be free'd at any time.  We
do hold the mmap_sem, so no actual unmap() can happen, but madvise can
come in and a MADV_DONTNEED will zap the page range - and free the page.

So now the page may be free'd just as we're unlocking it, which in turn
will usually trigger a "Bad page state" error in the freeing path.  To
make matters more confusing, by the time the debug code prints out the
page state, the unlock has typically completed and everything looks fine
again.

This all doesn't happen in any normal situations, but it does trigger
with the dirtyc0w_child LTP test.  And it seems to trigger much more
easily (but not expclusively) on s390 than elsewhere, probably because
s390 doesn't do the "batch pages up for freeing after the TLB flush"
that gives the unlock_page() more time to complete and makes the race
harder to hit.

Fixes: 09854ba94c6a ("mm: do_wp_page() simplification")
Link: https://lore.kernel.org/lkml/a46e9bbef2ed4e17778f5615e818526ef848d791.camel@redhat.com/
Link: https://lore.kernel.org/linux-mm/c41149a8-211e-390b-af1d-d5eee690fecb@linux.alibaba.com/
Reported-by: Qian Cai <cai@redhat.com>
Reported-by: Alex Shi <alex.shi@linux.alibaba.com>
Bisected-and-analyzed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agospi: bcm-qspi: Fix probe regression on iProc platforms
Ray Jui [Thu, 10 Sep 2020 15:25:38 +0000 (08:25 -0700)]
spi: bcm-qspi: Fix probe regression on iProc platforms

iProc chips have QSPI controller that does not have the MSPI_REV
offset. Reading from that offset will cause a bus error. Fix it by
having MSPI_REV query disabled in the generic compatible string.

Fixes: 3a01f04d74ef ("spi: bcm-qspi: Handle lack of MSPI_REV offset")
Link: https://lore.kernel.org/linux-arm-kernel/20200909211857.4144718-1-f.fainelli@gmail.com/T/#u
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20200910152539.45584-3-ray.jui@broadcom.com
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoMerge tag 'trace-v5.9-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Wed, 23 Sep 2020 21:52:22 +0000 (14:52 -0700)]
Merge tag 'trace-v5.9-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull bootconfig fixes from Steven Rostedt:
 "A couple of fixes for bootconfig.

  Masami discovered two bugs which this fixes and he added tests to
  cover these issues.

   - Fix a bug that breaks bootconfig tree nodes

   - Fix a bug that does not truncate whitespace properly

   - Add tests to cover the above two cases"

* tag 'trace-v5.9-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tools/bootconfig: Add testcase for tailing space
  tools/bootconfig: Add testcases for repeated key with brace
  lib/bootconfig: Fix to remove tailing spaces after value
  lib/bootconfig: Fix a bug of breaking existing tree nodes

4 years agoMerge tag 'for-5.9/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/devic...
Linus Torvalds [Wed, 23 Sep 2020 21:38:21 +0000 (14:38 -0700)]
Merge tag 'for-5.9/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - DM core fix for incorrect double bio splitting. Keep "fixing" this
   because past attempts didn't fully appreciate the liability relative
   to recursive bio splitting. This fix limits DM's bio splitting to a
   single method and does _not_ use blk_queue_split() for normal IO.

 - DM crypt Documentation updates for features added during 5.9 merge.

* tag 'for-5.9/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: document encrypted keyring key option
  dm crypt: document new no_workqueue flags
  dm: fix comment in dm_process_bio()
  dm: fix bio splitting and its bio completion order for regular IO

4 years agoMerge tag 'for-5.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Wed, 23 Sep 2020 21:32:23 +0000 (14:32 -0700)]
Merge tag 'for-5.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "syzkaller started to hit us with reports, here's a fix for one type
  (stack overflow when printing checksums on read error).

  The other patch is a fix for sysfs object, we have a test for that and
  it leads to a crash."

* tag 'for-5.9-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix put of uninitialized kobject after seed device delete
  btrfs: fix overflow when copying corrupt csums for a message

4 years agomm: move the copy_one_pte() pte_present check into the caller
Linus Torvalds [Wed, 23 Sep 2020 17:04:16 +0000 (10:04 -0700)]
mm: move the copy_one_pte() pte_present check into the caller

This completes the split of the non-present and present pte cases by
moving the check for the source pte being present into the single
caller, which also means that we clearly separate out the very different
return value case for a non-present pte.

The present pte case currently always succeeds.

This is a pure code re-organization with no semantic change: the intent
is to make it much easier to add a new return case to the present pte
case for when we do early COW at page table copy time.

This was split out from the previous commit simply to make it easy to
visually see that there were no semantic changes from this code
re-organization.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agomm: split out the non-present case from copy_one_pte()
Linus Torvalds [Wed, 23 Sep 2020 16:56:59 +0000 (09:56 -0700)]
mm: split out the non-present case from copy_one_pte()

This is a purely mechanical split of the copy_one_pte() function.  It's
not immediately obvious when looking at the diff because of the
indentation change, but the way to see what is going on in this commit
is to use the "-w" flag to not show pure whitespace changes, and you see
how the first part of copy_one_pte() is simply lifted out into a
separate function.

And since the non-present case is marked unlikely, don't make the new
function be inlined.  Not that gcc really seems to care, since it looks
like it will inline it anyway due to the whole "single callsite for
static function" logic.  In fact, code generation with the function
split is almost identical to before.  But not marking it inline is the
right thing to do.

This is pure prep-work and cleanup for subsequent changes.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
4 years agospi: fsl-dspi: fix use-after-free in remove path
Sascha Hauer [Wed, 23 Sep 2020 13:10:26 +0000 (15:10 +0200)]
spi: fsl-dspi: fix use-after-free in remove path

spi_unregister_controller() not only unregisters the controller, but
also frees the controller. This will free the driver data with it, so
we must not access it later dspi_remove().

Solve this by allocating the driver data separately from the SPI
controller.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20200923131026.20707-1-s.hauer@pengutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoregulator: axp20x: fix LDO2/4 description
Icenowy Zheng [Wed, 23 Sep 2020 00:51:42 +0000 (08:51 +0800)]
regulator: axp20x: fix LDO2/4 description

Currently we wrongly set the mask of value of LDO2/4 both to the mask of
LDO2, and the LDO4 voltage configuration is left untouched. This leads
to conflict when LDO2/4 are both in use.

Fix this issue by setting different vsel_mask to both regulators.

Fixes: db4a555f7c4c ("regulator: axp20x: use defines for masks")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Link: https://lore.kernel.org/r/20200923005142.147135-1-icenowy@aosc.io
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoselftests: kvm: Fix assert failure in single-step test
Yang Weijiang [Wed, 26 Aug 2020 01:55:24 +0000 (09:55 +0800)]
selftests: kvm: Fix assert failure in single-step test

This is a follow-up patch to fix an issue left in commit:
98b0bf02738004829d7e26d6cb47b2e469aaba86
selftests: kvm: Use a shorter encoding to clear RAX

With the change in the commit, we also need to modify "xor" instruction
length from 3 to 2 in array ss_size accordingly to pass below check:

for (i = 0; i < (sizeof(ss_size) / sizeof(ss_size[0])); i++) {
        target_rip += ss_size[i];
        CLEAR_DEBUG();
        debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
        debug.arch.debugreg[7] = 0x00000400;
        APPLY_DEBUG();
        vcpu_run(vm, VCPU_ID);
        TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
                    run->debug.arch.exception == DB_VECTOR &&
                    run->debug.arch.pc == target_rip &&
                    run->debug.arch.dr6 == target_dr6,
                    "SINGLE_STEP[%d]: exit %d exception %d rip 0x%llx "
                    "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
                    i, run->exit_reason, run->debug.arch.exception,
                    run->debug.arch.pc, target_rip, run->debug.arch.dr6,
                    target_dr6);
}

Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Message-Id: <20200826015524.13251-1-weijiang.yang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoKVM: x86: VMX: Make smaller physical guest address space support user-configurable
Mohammed Gamal [Thu, 3 Sep 2020 14:11:22 +0000 (16:11 +0200)]
KVM: x86: VMX: Make smaller physical guest address space support user-configurable

This patch exposes allow_smaller_maxphyaddr to the user as a module parameter.
Since smaller physical address spaces are only supported on VMX, the
parameter is only exposed in the kvm_intel module.

For now disable support by default, and let the user decide if they want
to enable it.

Modifications to VMX page fault and EPT violation handling will depend
on whether that parameter is enabled.

Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Message-Id: <20200903141122.72908-1-mgamal@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMIPS: BCM47XX: Remove the needless check with the 1074K
Wei Li [Wed, 23 Sep 2020 06:53:26 +0000 (14:53 +0800)]
MIPS: BCM47XX: Remove the needless check with the 1074K

As there is no known soc powered by mips 1074K in bcm47xx series,
the check with 1074K is needless. So just remove it.

Link: https://wireless.wiki.kernel.org/en/users/Drivers/b43/soc
Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
4 years agoMIPS: Add the missing 'CPU_1074K' into __get_cpu_type()
Wei Li [Wed, 23 Sep 2020 06:53:12 +0000 (14:53 +0800)]
MIPS: Add the missing 'CPU_1074K' into __get_cpu_type()

Commit 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.") split
1074K from the 74K as an unique CPU type, while it missed to add the
'CPU_1074K' in __get_cpu_type(). So let's add it back.

Fixes: 442e14a2c55e ("MIPS: Add 1074K CPU support explicitly.")
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
4 years agoMIPS: Loongson2ef: Disable Loongson MMI instructions
Jiaxun Yang [Wed, 23 Sep 2020 10:33:12 +0000 (18:33 +0800)]
MIPS: Loongson2ef: Disable Loongson MMI instructions

It was missed when I was forking Loongson2ef from Loongson64 but
should be applied to Loongson2ef as march=loongson2f
will also enable Loongson MMI in GCC-9+.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Fixes: 71e2f4dd5a65 ("MIPS: Fork loongson2ef from loongson64")
Reported-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: stable@vger.kernel.org # v5.8+
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
4 years agoACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset
Rafael J. Wysocki [Wed, 23 Sep 2020 11:50:12 +0000 (13:50 +0200)]
ACPI: processor: Fix build for ARCH_APICTIMER_STOPS_ON_C3 unset

Fix the lapic_timer_needs_broadcast() stub for
ARCH_APICTIMER_STOPS_ON_C3 unset to actually return
a value.

Fixes: aa6b43d57f99 ("ACPI: processor: Use CPUIDLE_FLAG_TIMER_STOP")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
4 years agodrm/i915/selftests: Push the fake iommu device from the stack to data
Chris Wilson [Wed, 16 Sep 2020 10:50:22 +0000 (11:50 +0100)]
drm/i915/selftests: Push the fake iommu device from the stack to data

Since we store a pointer to the fake iommu device that is allocated on
the stack, as soon as we leave the function it goes out of scope and any
future dereference is undefined behaviour. Just in case we may need to
look at the fake iommu device after initialiation, move the allocation
from the stack into the data.

Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk
(cherry picked from commit 9f9f4101fc98db56714e71676d5a1e2d27e01f7e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
4 years agoPM / devfreq: tegra30: Disable clock on error in probe
Dan Carpenter [Tue, 8 Sep 2020 07:25:57 +0000 (10:25 +0300)]
PM / devfreq: tegra30: Disable clock on error in probe

This error path needs to call clk_disable_unprepare().

Fixes: 7296443b900e ("PM / devfreq: tegra30: Handle possible round-rate error")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
4 years agoPM / devfreq: Add timer type to devfreq_summary debugfs
Chanwoo Choi [Tue, 8 Sep 2020 10:57:07 +0000 (19:57 +0900)]
PM / devfreq: Add timer type to devfreq_summary debugfs

The commit 4dc3bab8687f ("PM / devfreq: Add support delayed timer for
polling mode") supports the delayed timer but this commit missed
the adding the timer type to devfreq_summary debugfs node.
Add the timer type to devfreq_summary debugfs.

Fixes: 4dc3bab8687f ("PM / devfreq: Add support delayed timer for polling mode")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
4 years agoMerge tag 'drm-misc-fixes-2020-09-18' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Tue, 22 Sep 2020 23:30:29 +0000 (09:30 +1000)]
Merge tag 'drm-misc-fixes-2020-09-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v5.9-rc6:
- Fill asoc card owner in vc4.
- Program secondary CSC correctly in sun4i, and extend
  register mapping to cover secondary CSC registers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e3ab56cf-3b8e-9b21-f1b6-9a4989a52996@linux.intel.com
4 years agoMerge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Linus Torvalds [Tue, 22 Sep 2020 22:08:41 +0000 (15:08 -0700)]
Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs fixes from Al Viro:
 "No common topic, just assorted fixes"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fuse: fix the ->direct_IO() treatment of iov_iter
  fs: fix cast in fsparam_u32hex() macro
  vboxsf: Fix the check for the old binary mount-arguments struct

4 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Tue, 22 Sep 2020 21:43:50 +0000 (14:43 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:

 - fix failure to add bond interfaces to a bridge, the offload-handling
   code was too defensive there and recent refactoring unearthed that.
   Users complained (Ido)

 - fix unnecessarily reflecting ECN bits within TOS values / QoS marking
   in TCP ACK and reset packets (Wei)

 - fix a deadlock with bpf iterator. Hopefully we're in the clear on
   this front now... (Yonghong)

 - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel)

 - fix AQL on mt76 devices with FW rate control and add a couple of AQL
   issues in mac80211 code (Felix)

 - fix authentication issue with mwifiex (Maximilian)

 - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro)

 - fix exception handling for multipath routes via same device (David
   Ahern)

 - revert back to a BH spin lock flavor for nsid_lock: there are paths
   which do require the BH context protection (Taehee)

 - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke)

 - fix ife module load deadlock (Cong)

 - make an adjustment to netlink reply message type for code added in
   this release (the sole change touching uAPI here) (Michal)

 - a number of fixes for small NXP and Microchip switches (Vladimir)

[ Pull request acked by David: "you can expect more of this in the
  future as I try to delegate more things to Jakub" ]

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits)
  net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
  inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
  net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
  net: Update MAINTAINERS for MediaTek switch driver
  net/mlx5e: mlx5e_fec_in_caps() returns a boolean
  net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
  net/mlx5e: kTLS, Fix leak on resync error flow
  net/mlx5e: kTLS, Add missing dma_unmap in RX resync
  net/mlx5e: kTLS, Fix napi sync and possible use-after-free
  net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
  net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
  net/mlx5e: Fix multicast counter not up-to-date in "ip -s"
  net/mlx5e: Fix endianness when calculating pedit mask first bit
  net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
  net/mlx5e: CT: Fix freeing ct_label mapping
  net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
  net/mlx5e: Use synchronize_rcu to sync with NAPI
  net/mlx5e: Use RCU to protect rq->xdp_prog
  ...

4 years agoMerge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 22 Sep 2020 21:36:50 +0000 (14:36 -0700)]
Merge tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A few fixes - most of them regression fixes from this cycle, but also
  a few stable heading fixes, and a build fix for the included demo tool
  since some systems now actually have gettid() available"

* tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
  io_uring: fix openat/openat2 unified prep handling
  io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL
  tools/io_uring: fix compile breakage
  io_uring: don't use retry based buffered reads for non-async bdev
  io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there
  io_uring: don't run task work on an exiting task
  io_uring: drop 'ctx' ref on task work cancelation
  io_uring: grab any needed state during defer prep

4 years agoMerge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block
Linus Torvalds [Tue, 22 Sep 2020 21:31:38 +0000 (14:31 -0700)]
Merge tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few NVMe fixes, and a dasd write zero fix"

* tag 'block-5.9-2020-09-22' of git://git.kernel.dk/linux-block:
  nvmet: get transport reference for passthru ctrl
  nvme-core: get/put ctrl and transport module in nvme_dev_open/release()
  nvme-tcp: fix kconfig dependency warning when !CRYPTO
  nvme-pci: disable the write zeros command for Intel 600P/P3100
  s390/dasd: Fix zero write for FBA devices

4 years agocpuidle: Drop misleading comments about RCU usage
Ulf Hansson [Tue, 22 Sep 2020 09:15:50 +0000 (11:15 +0200)]
cpuidle: Drop misleading comments about RCU usage

The commit 1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the
idle path"), moved the calls rcu_idle_enter|exit() into the cpuidle core.

However, it forgot to remove a couple of comments in enter_s2idle_proper()
about why RCU_NONIDLE earlier was needed. So, let's drop them as they have
become a bit misleading.

Fixes: 1098582a0f6c ("sched,idle,rcu: Push rcu_idle deeper into the idle path")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
4 years agodm crypt: document encrypted keyring key option
Milan Broz [Thu, 20 Aug 2020 19:20:26 +0000 (21:20 +0200)]
dm crypt: document encrypted keyring key option

Commit 27f5411a718c4 ("dm crypt: support using encrypted keys")
introduced support for encrypted keyring type.

Fix documentation in admin guide to mention this type.

Fixes: 27f5411a718c4 ("dm crypt: support using encrypted keys")
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoMerge tag 'gvt-fixes-2020-09-17' of https://github.com/intel/gvt-linux into drm-intel...
Jani Nikula [Tue, 22 Sep 2020 17:25:11 +0000 (20:25 +0300)]
Merge tag 'gvt-fixes-2020-09-17' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2020-09-17

- Fix kernel oops for VFIO edid on BDW (Zhenyu)

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200917064208.GF11592@zhen-hp.sh.intel.com
4 years agodm crypt: document new no_workqueue flags
Milan Broz [Thu, 20 Aug 2020 17:45:38 +0000 (19:45 +0200)]
dm crypt: document new no_workqueue flags

Commit 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd
workqueues") introduced new dm-crypt 'no_read_workqueue' and
'no_write_workqueue' flags.

Add documentation to admin guide for them.

Fixes: 39d42fa96ba1 ("dm crypt: add flags to optionally bypass kcryptd workqueues")
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoMerge tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 22 Sep 2020 16:08:33 +0000 (09:08 -0700)]
Merge tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Check kprobe is enabled before unregistering from ftrace as it isn't
   registered when disabled.

 - Remove kprobes enabled via command-line that is on init text when
   freed.

 - Add missing RCU synchronization for ftrace trampoline symbols removed
   from kallsyms.

 - Free trampoline on error path if ftrace_startup() fails.

 - Give more space for the longer PID numbers in trace output.

 - Fix a possible double free in the histogram code.

 - A couple of fixes that were discovered by sparse.

* tag 'trace-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  bootconfig: init: make xbc_namebuf static
  kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot
  tracing: fix double free
  ftrace: Let ftrace_enable_sysctl take a kernel pointer buffer
  tracing: Make the space reserved for the pid wider
  ftrace: Fix missing synchronize_rcu() removing trampoline from kallsyms
  ftrace: Free the trampoline when ftrace_startup() fails
  kprobes: Fix to check probe enabled before disarm_kprobe_ftrace()

4 years agoRevert "ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control"
Kai-Heng Feng [Tue, 15 Sep 2020 10:39:23 +0000 (18:39 +0800)]
Revert "ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control"

This reverts commit 34dedd2a83b241ba6aeb290260313c65dc58660e.

According to Realtek, volume FU works for line-in.

I can confirm volume control works after device firmware is updated.

Fixes: 34dedd2a83b2 ("ALSA: usb-audio: Disable Lenovo P620 Rear line-in volume control")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Link: https://lore.kernel.org/r/20200915103925.12777-1-kai.heng.feng@canonical.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
4 years agobtrfs: fix put of uninitialized kobject after seed device delete
Anand Jain [Fri, 4 Sep 2020 17:34:21 +0000 (01:34 +0800)]
btrfs: fix put of uninitialized kobject after seed device delete

The following test case leads to NULL kobject free error:

  mount seed /mnt
  add sprout to /mnt
  umount /mnt
  mount sprout to /mnt
  delete seed

  kobject: '(null)' (00000000dd2b87e4): is not initialized, yet kobject_put() is being called.
  WARNING: CPU: 1 PID: 15784 at lib/kobject.c:736 kobject_put+0x80/0x350
  RIP: 0010:kobject_put+0x80/0x350
  ::
  Call Trace:
  btrfs_sysfs_remove_devices_dir+0x6e/0x160 [btrfs]
  btrfs_rm_device.cold+0xa8/0x298 [btrfs]
  btrfs_ioctl+0x206c/0x22a0 [btrfs]
  ksys_ioctl+0xe2/0x140
  __x64_sys_ioctl+0x1e/0x29
  do_syscall_64+0x96/0x150
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
  RIP: 0033:0x7f4047c6288b
  ::

This is because, at the end of the seed device-delete, we try to remove
the seed's devid sysfs entry. But for the seed devices under the sprout
fs, we don't initialize the devid kobject yet. So add a kobject state
check, which takes care of the bug.

Fixes: 668e48af7a94 ("btrfs: sysfs, add devid/dev_state kobject and device attributes")
CC: stable@vger.kernel.org # 5.6+
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
4 years agoMIPS: Loongson-3: Fix fp register access if MSA enabled
Huacai Chen [Mon, 24 Aug 2020 07:44:03 +0000 (15:44 +0800)]
MIPS: Loongson-3: Fix fp register access if MSA enabled

If MSA is enabled, FPU_REG_WIDTH is 128 rather than 64, then get_fpr64()
/set_fpr64() in the original unaligned instruction emulation code access
the wrong fp registers. This is because the current code doesn't specify
the correct index field, so fix it.

Fixes: f83e4f9896eff614d0f2547a ("MIPS: Loongson-3: Add some unaligned instructions emulation")
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Pei Huang <huangpei@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
4 years agoMerge remote-tracking branch 'origin/master' into drm-intel-fixes
Jani Nikula [Tue, 22 Sep 2020 10:15:19 +0000 (13:15 +0300)]
Merge remote-tracking branch 'origin/master' into drm-intel-fixes

Direct backmerge to get commit 88b67edd7247 ("dax: Fix compilation for
CONFIG_DAX && !CONFIG_FS_DAX") to fix CI build issues.

Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
4 years agomedia: dt-bindings: media: imx274: Convert to json-schema
Jacopo Mondi [Sat, 12 Sep 2020 10:30:45 +0000 (12:30 +0200)]
media: dt-bindings: media: imx274: Convert to json-schema

Convert the imx274 bindings document to json-schema and update
the MAINTAINERS file accordingly.

Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
4 years agotools/bootconfig: Add testcase for tailing space
Masami Hiramatsu [Mon, 21 Sep 2020 09:45:11 +0000 (18:45 +0900)]
tools/bootconfig: Add testcase for tailing space

Add testcases for removing/keeping tailing space
in the value.

Link: https://lkml.kernel.org/r/160068151151.1088739.3469541807296024227.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agotools/bootconfig: Add testcases for repeated key with brace
Masami Hiramatsu [Mon, 21 Sep 2020 09:45:02 +0000 (18:45 +0900)]
tools/bootconfig: Add testcases for repeated key with brace

Add a testcase for repeated key with brace parsing issue.

Link: https://lkml.kernel.org/r/160068150176.1088739.409481347784771987.stgit@devnote2
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agolib/bootconfig: Fix to remove tailing spaces after value
Masami Hiramatsu [Mon, 21 Sep 2020 09:44:51 +0000 (18:44 +0900)]
lib/bootconfig: Fix to remove tailing spaces after value

Fix to remove tailing spaces after value. If there is a space
after value, the bootconfig failed to remove it because it
applies strim() before replacing the delimiter with null.

For example,

foo = var    # comment

was parsed as below.

foo="var    "

but user will expect

foo="var"

This fixes it by applying strim() after removing the delimiter.

Link: https://lkml.kernel.org/r/160068149134.1088739.8868306567670058853.stgit@devnote2
Fixes: 76db5a27a827 ("bootconfig: Add Extra Boot Config support")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agolib/bootconfig: Fix a bug of breaking existing tree nodes
Masami Hiramatsu [Mon, 21 Sep 2020 09:44:42 +0000 (18:44 +0900)]
lib/bootconfig: Fix a bug of breaking existing tree nodes

Fix a bug of breaking existing tree nodes by parsing the second
and subsequent braces. Since the bootconfig parser uses the
node.next field as a flag of current parent node, but this will
break the existing tree if the same key node is specified again
in the bootconfig.

For example, the following bootconfig should be foo.buz and bar.

foo
bar
foo { buz }

However, when parsing the brace "{", it breaks foo->bar link
by marking open-brace node. So the bootconfig unlinks bar
from the bootconfig internal tree.

This introduces a stack outside of the tree and record the
last open-brace on the stack instead of using node.next field.

Link: https://lkml.kernel.org/r/160068148267.1088739.8264704338030168660.stgit@devnote2
Fixes: 76db5a27a827 ("bootconfig: Add Extra Boot Config support")
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
4 years agoMerge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'
David S. Miller [Tue, 22 Sep 2020 00:40:53 +0000 (17:40 -0700)]
Merge branch 'Fix-broken-tc-flower-rules-for-mscc_ocelot-switches'

Vladimir Oltean says:

====================
Fix broken tc-flower rules for mscc_ocelot switches

All 3 switch drivers from the Ocelot family have the same bug in the
VCAP IS2 key offsets, which is that some keys are in the incorrect
order.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Vladimir Oltean [Mon, 21 Sep 2020 22:56:38 +0000 (01:56 +0300)]
net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

The IS2 IP4_TCP_UDP key offsets do not correspond to the VSC7514
datasheet. Whether they work or not is unknown to me. On VSC9959 and
VSC9953, with the same mistake and same discrepancy from the
documentation, tc-flower src_port and dst_port rules did not work, so I
am assuming the same is true here.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Vladimir Oltean [Mon, 21 Sep 2020 22:56:37 +0000 (01:56 +0300)]
net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

Since these were copied from the Felix VCAP IS2 code, and only the
offsets were adjusted, the order of the bit fields is still wrong.
Fix it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries
Xiaoliang Yang [Mon, 21 Sep 2020 22:56:36 +0000 (01:56 +0300)]
net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries

Some of the IS2 IP4_TCP_UDP keys are not correct, like L4_DPORT,
L4_SPORT and other L4 keys. This prevents offloaded tc-flower rules from
matching on src_port and dst_port for TCP and UDP packets.

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoinet_diag: validate INET_DIAG_REQ_PROTOCOL attribute
Eric Dumazet [Mon, 21 Sep 2020 14:27:20 +0000 (07:27 -0700)]
inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute

User space could send an invalid INET_DIAG_REQ_PROTOCOL attribute
as caught by syzbot.

BUG: KMSAN: uninit-value in inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
BUG: KMSAN: uninit-value in __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
CPU: 0 PID: 8505 Comm: syz-executor174 Not tainted 5.9.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x21c/0x280 lib/dump_stack.c:118
 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:122
 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:219
 inet_diag_lock_handler net/ipv4/inet_diag.c:55 [inline]
 __inet_diag_dump+0x58c/0x720 net/ipv4/inet_diag.c:1147
 inet_diag_dump_compat+0x2a5/0x380 net/ipv4/inet_diag.c:1254
 netlink_dump+0xb73/0x1cb0 net/netlink/af_netlink.c:2246
 __netlink_dump_start+0xcf2/0xea0 net/netlink/af_netlink.c:2354
 netlink_dump_start include/linux/netlink.h:246 [inline]
 inet_diag_rcv_msg_compat+0x5da/0x6c0 net/ipv4/inet_diag.c:1288
 sock_diag_rcv_msg+0x24f/0x620 net/core/sock_diag.c:256
 netlink_rcv_skb+0x6d7/0x7e0 net/netlink/af_netlink.c:2470
 sock_diag_rcv+0x63/0x80 net/core/sock_diag.c:275
 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline]
 netlink_unicast+0x11c8/0x1490 net/netlink/af_netlink.c:1330
 netlink_sendmsg+0x173a/0x1840 net/netlink/af_netlink.c:1919
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg net/socket.c:671 [inline]
 ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
 ___sys_sendmsg net/socket.c:2407 [inline]
 __sys_sendmsg+0x6d1/0x820 net/socket.c:2440
 __do_sys_sendmsg net/socket.c:2449 [inline]
 __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x441389
Code: e8 fc ab 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fff3b02ce98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000441389
RDX: 0000000000000000 RSI: 0000000020001500 RDI: 0000000000000003
RBP: 00000000006cb018 R08: 00000000004002c8 R09: 00000000004002c8
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000402130
R13: 00000000004021c0 R14: 0000000000000000 R15: 0000000000000000

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:143 [inline]
 kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:126
 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:80
 slab_alloc_node mm/slub.c:2907 [inline]
 __kmalloc_node_track_caller+0x9aa/0x12f0 mm/slub.c:4511
 __kmalloc_reserve net/core/skbuff.c:142 [inline]
 __alloc_skb+0x35f/0xb30 net/core/skbuff.c:210
 alloc_skb include/linux/skbuff.h:1094 [inline]
 netlink_alloc_large_skb net/netlink/af_netlink.c:1176 [inline]
 netlink_sendmsg+0xdb9/0x1840 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:651 [inline]
 sock_sendmsg net/socket.c:671 [inline]
 ____sys_sendmsg+0xc82/0x1240 net/socket.c:2353
 ___sys_sendmsg net/socket.c:2407 [inline]
 __sys_sendmsg+0x6d1/0x820 net/socket.c:2440
 __do_sys_sendmsg net/socket.c:2449 [inline]
 __se_sys_sendmsg+0x97/0xb0 net/socket.c:2447
 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2447
 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 3f935c75eb52 ("inet_diag: support for wider protocol numbers")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Mat Martineau <mathew.j.martineau@linux.intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU
Vladimir Oltean [Mon, 21 Sep 2020 22:07:09 +0000 (01:07 +0300)]
net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU

When calling the RCU brother of br_vlan_get_pvid(), lockdep warns:

=============================
WARNING: suspicious RCU usage
5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted
-----------------------------
net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage!

Call trace:
 lockdep_rcu_suspicious+0xd4/0xf8
 __br_vlan_get_pvid+0xc0/0x100
 br_vlan_get_pvid_rcu+0x78/0x108

The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group()
which calls rtnl_dereference() instead of rcu_dereference(). In turn,
rtnl_dereference() calls rcu_dereference_protected() which assumes
operation under an RCU write-side critical section, which obviously is
not the case here. So, when the incorrect primitive is used to access
the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may
cause various unexpected problems.

I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot
share the same implementation. So fix the bug by splitting the 2
functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups
under proper locking annotations.

Fixes: 7582f5b70f9a ("bridge: add br_vlan_get_pvid_rcu()")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge tag 'mlx5-fixes-2020-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Tue, 22 Sep 2020 00:32:42 +0000 (17:32 -0700)]
Merge tag 'mlx5-fixes-2020-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes-2020-09-18

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

v1->v2:
 Remove missing patch from -stable list.

For -stable v5.1
 ('net/mlx5: Fix FTE cleanup')

For -stable v5.3
 ('net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported')
 ('net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported')

For -stable v5.7
 ('net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready')

For -stable v5.8
 ('net/mlx5e: Use RCU to protect rq->xdp_prog')
 ('net/mlx5e: Fix endianness when calculating pedit mask first bit')
 ('net/mlx5e: Use synchronize_rcu to sync with NAPI')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: Update MAINTAINERS for MediaTek switch driver
Sean Wang [Mon, 21 Sep 2020 23:09:23 +0000 (07:09 +0800)]
net: Update MAINTAINERS for MediaTek switch driver

Update maintainers for MediaTek switch driver with Landen Chao who is
familiar with MediaTek MT753x switch devices and will help maintenance
from the vendor side.

Cc: Steven Liu <steven.liu@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Landen Chao <Landen.Chao@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet/mlx5e: mlx5e_fec_in_caps() returns a boolean
Saeed Mahameed [Fri, 11 Sep 2020 18:00:06 +0000 (11:00 -0700)]
net/mlx5e: mlx5e_fec_in_caps() returns a boolean

Returning errno is a bug, fix that.

Also fixes smatch warnings:
drivers/net/ethernet/mellanox/mlx5/core/en/port.c:453
mlx5e_fec_in_caps() warn: signedness bug returning '(-95)'

Fixes: 2132b71f78d2 ("net/mlx5e: Advertise globaly supported FEC modes")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
4 years agonet/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock
Saeed Mahameed [Tue, 8 Sep 2020 05:54:43 +0000 (22:54 -0700)]
net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock

The spinlock only needed when accessing the channel's icosq, grab the lock
after the buf allocation in resync_post_get_progress_params() to avoid
kzalloc(GFP_KERNEL) in atomic context.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Reported-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Fix leak on resync error flow
Saeed Mahameed [Fri, 11 Sep 2020 04:16:04 +0000 (21:16 -0700)]
net/mlx5e: kTLS, Fix leak on resync error flow

Resync progress params buffer and dma weren't released on error,
Add missing error unwinding for resync_post_get_progress_params().

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Add missing dma_unmap in RX resync
Saeed Mahameed [Tue, 8 Sep 2020 05:58:50 +0000 (22:58 -0700)]
net/mlx5e: kTLS, Add missing dma_unmap in RX resync

Progress params dma address is never unmapped, unmap it when completion
handling is over.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
4 years agonet/mlx5e: kTLS, Fix napi sync and possible use-after-free
Tariq Toukan [Mon, 10 Aug 2020 12:59:41 +0000 (15:59 +0300)]
net/mlx5e: kTLS, Fix napi sync and possible use-after-free

Using synchronize_rcu() is sufficient to wait until running NAPI quits.

See similar upstream fix with detailed explanation:
("net/mlx5e: Use synchronize_rcu to sync with NAPI")

This change also fixes a possible use-after-free as the NAPI
might be already released at this stage.

Fixes: 0419d8c9d8f8 ("net/mlx5e: kTLS, Add kTLS RX resync support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: TLS, Do not expose FPGA TLS counter if not supported
Tariq Toukan [Sun, 28 Jun 2020 10:06:06 +0000 (13:06 +0300)]
net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported

The set of TLS TX global SW counters in mlx5e_tls_sw_stats_desc
is updated from all rings by using atomic ops.
This set of stats is used only in the FPGA TLS use case, not in
the Connect-X TLS one, where regular per-ring counters are used.

Do not expose them in the Connect-X use case, as this would cause
counter duplication. For example, tx_tls_drop_no_sync_data would
appear twice in the ethtool stats.

Fixes: d2ead1f360e8 ("net/mlx5e: Add kTLS TX HW offload support")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()
Alaa Hleihel [Tue, 25 Aug 2020 07:41:50 +0000 (10:41 +0300)]
net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats()

The cited commit started to reuse function mlx5e_update_ndo_stats() for
the representors as well.
However, the function is hard-coded to work on mlx5e_nic_stats_grps only.
Due to this issue, the representors statistics were not updated in the
output of "ip -s".

Fix it to work with the correct group by extracting it from the caller's
profile.

Also, while at it and since this function became generic, move it to
en_stats.c and rename it accordingly.

Fixes: 8a236b15144b ("net/mlx5e: Convert rep stats to mlx5e_stats_grp-based infra")
Signed-off-by: Alaa Hleihel <alaa@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix multicast counter not up-to-date in "ip -s"
Ron Diskin [Sun, 10 May 2020 11:39:51 +0000 (14:39 +0300)]
net/mlx5e: Fix multicast counter not up-to-date in "ip -s"

Currently the FW does not generate events for counters other than error
counters. Unlike ".get_ethtool_stats", ".ndo_get_stats64" (which ip -s
uses) might run in atomic context, while the FW interface is non atomic.
Thus, 'ip' is not allowed to issue FW commands, so it will only display
cached counters in the driver.

Add a SW counter (mcast_packets) in the driver to count rx multicast
packets. The counter also counts broadcast packets, as we consider it a
special case of multicast.
Use the counter value when calling "ip -s"/"ifconfig".

Fixes: f62b8bb8f2d3 ("net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality")
Signed-off-by: Ron Diskin <rondi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix endianness when calculating pedit mask first bit
Maor Dickman [Wed, 2 Sep 2020 13:49:52 +0000 (16:49 +0300)]
net/mlx5e: Fix endianness when calculating pedit mask first bit

The field mask value is provided in network byte order and has to
be converted to host byte order before calculating pedit mask
first bit.

Fixes: 88f30bbcbaaa ("net/mlx5e: Bit sized fields rewrite support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agonet/mlx5e: Enable adding peer miss rules only if merged eswitch is supported
Maor Dickman [Wed, 5 Aug 2020 14:56:04 +0000 (17:56 +0300)]
net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported

The cited commit creates peer miss group during switchdev mode
initialization in order to handle miss packets correctly while in VF
LAG mode. This is done regardless of FW support of such groups which
could cause rules setups failure later on.

Fix by adding FW capability check before creating peer groups/rule.

Fixes: ac004b832128 ("net/mlx5e: E-Switch, Add peer miss rules")
Signed-off-by: Maor Dickman <maord@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Raed Salem <raeds@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: CT: Fix freeing ct_label mapping
Roi Dayan [Sun, 26 Jul 2020 13:37:47 +0000 (16:37 +0300)]
net/mlx5e: CT: Fix freeing ct_label mapping

Add missing mapping remove call when removing ct rule,
as the mapping was allocated when ct rule was adding with ct_label.
Also there is a missing mapping remove call in error flow.

Fixes: 54b154ecfb8c ("net/mlx5e: CT: Map 128 bits labels to 32 bit map ID")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready
Jianbo Liu [Tue, 7 Jul 2020 06:16:24 +0000 (06:16 +0000)]
net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready

When deleting vxlan flow rule under multipath, tun_info in parse_attr is
not freed when the rule is not ready.

Fixes: ef06c9ee8933 ("net/mlx5e: Allow one failure when offloading tc encap rules under multipath")
Signed-off-by: Jianbo Liu <jianbol@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Use synchronize_rcu to sync with NAPI
Maxim Mikityanskiy [Thu, 11 Jun 2020 11:25:19 +0000 (14:25 +0300)]
net/mlx5e: Use synchronize_rcu to sync with NAPI

As described in the previous commit, napi_synchronize doesn't quite fit
the purpose when we just need to wait until the currently running NAPI
quits. Its implementation waits until NAPI is not running by polling and
waiting for 1ms in between. In cases where we need to deactivate one
queue (e.g., recovery flows) or where we deactivate them one-by-one
(deactivate channel flow), we may get stuck in napi_synchronize forever
if other queues keep NAPI active, causing a soft lockup. Depending on
kernel configuration (CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC), it may result
in a kernel panic.

To fix the issue, use synchronize_rcu to wait for NAPI to quit, and wrap
the whole NAPI in rcu_read_lock.

Fixes: acc6c5953af1 ("net/mlx5e: Split open/close channels to stages")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5e: Use RCU to protect rq->xdp_prog
Maxim Mikityanskiy [Thu, 11 Jun 2020 10:55:19 +0000 (13:55 +0300)]
net/mlx5e: Use RCU to protect rq->xdp_prog

Currently, the RQs are temporarily deactivated while hot-replacing the
XDP program, and napi_synchronize is used to make sure rq->xdp_prog is
not in use. However, napi_synchronize is not ideal: instead of waiting
till the end of a NAPI cycle, it polls and waits until NAPI is not
running, sleeping for 1ms between the periodic checks. Under heavy
workloads, this loop will never end, which may even lead to a kernel
panic if the kernel detects the hangup. Such workloads include XSK TX
and possibly also heavy RX (XSK or normal).

The fix is inspired by commit 326fe02d1ed6 ("net/mlx4_en: protect
ring->xdp_prog with rcu_read_lock"). As mlx5e_xdp_handle is already
protected by rcu_read_lock, and bpf_prog_put uses call_rcu to free the
program, there is no need for additional synchronization if proper RCU
functions are used to access the pointer. This patch converts all
accesses to rq->xdp_prog to use RCU functions.

Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Fixes: db05815b36cb ("net/mlx5e: Add XSK zero-copy support")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
4 years agonet/mlx5: Fix FTE cleanup
Maor Gottlieb [Mon, 31 Aug 2020 17:50:42 +0000 (20:50 +0300)]
net/mlx5: Fix FTE cleanup

Currently, when an FTE is allocated, its refcount is decreased to 0
with the purpose it will not be a stand alone steering object and every
rule (destination) of the FTE would increase the refcount.
When mlx5_cleanup_fs is called while not all rules were deleted by the
steering users, it hit refcount underflow on the FTE once clean_tree
calls to tree_remove_node after the deleted rules already decreased
the refcount to 0.

FTE is no longer destroyed implicitly when the last rule (destination)
is deleted. mlx5_del_flow_rules avoids it by increasing the refcount on
the FTE and destroy it explicitly after all rules were deleted. So we
can avoid the refcount underflow by making FTE as stand alone object.
In addition need to set del_hw_func to FTE so the HW object will be
destroyed when the FTE is deleted from the cleanup_tree flow.

refcount_t: underflow; use-after-free.
WARNING: CPU: 2 PID: 15715 at lib/refcount.c:28 refcount_warn_saturate+0xd9/0xe0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 tree_put_node+0xf2/0x140 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x5f/0xf0 [mlx5_core]
 clean_tree+0x4e/0xf0 [mlx5_core]
 clean_tree+0x5f/0xf0 [mlx5_core]
 mlx5_cleanup_fs+0x26/0x270 [mlx5_core]
 mlx5_unload+0x2e/0xa0 [mlx5_core]
 mlx5_unload_one+0x51/0x120 [mlx5_core]
 mlx5_devlink_reload_down+0x51/0x90 [mlx5_core]
 devlink_reload+0x39/0x120
 ? devlink_nl_cmd_reload+0x43/0x220
 genl_rcv_msg+0x1e4/0x420
 ? genl_family_rcv_msg_attrs_parse+0x100/0x100
 netlink_rcv_skb+0x47/0x110
 genl_rcv+0x24/0x40
 netlink_unicast+0x217/0x2f0
 netlink_sendmsg+0x30f/0x430
 sock_sendmsg+0x30/0x40
 __sys_sendto+0x10e/0x140
 ? handle_mm_fault+0xc4/0x1f0
 ? do_page_fault+0x33f/0x630
 __x64_sys_sendto+0x24/0x30
 do_syscall_64+0x48/0x130
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 718ce4d601db ("net/mlx5: Consolidate update FTE for all removal changes")
Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years agodm: fix comment in dm_process_bio()
Mike Snitzer [Mon, 21 Sep 2020 23:08:30 +0000 (19:08 -0400)]
dm: fix comment in dm_process_bio()

Refer to the correct function (->submit_bio instead of ->queue_bio).
Also, add details about why using blk_queue_split() isn't needed for
dm_wq_work()'s call to dm_process_bio().

Fixes: c62b37d96b6eb ("block: move ->make_request_fn to struct block_device_operations")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agodm: fix bio splitting and its bio completion order for regular IO
Mike Snitzer [Mon, 14 Sep 2020 17:04:19 +0000 (13:04 -0400)]
dm: fix bio splitting and its bio completion order for regular IO

dm_queue_split() is removed because __split_and_process_bio() _must_
handle splitting bios to ensure proper bio submission and completion
ordering as a bio is split.

Otherwise, multiple recursive calls to ->submit_bio will cause multiple
split bios to be allocated from the same ->bio_split mempool at the same
time. This would result in deadlock in low memory conditions because no
progress could be made (only one bio is available in ->bio_split
mempool).

This fix has been verified to still fix the loss of performance, due
to excess splitting, that commit 120c9257f5f1 provided.

Fixes: 120c9257f5f1 ("Revert "dm: always call blk_queue_split() in dm_process_bio()"")
Cc: stable@vger.kernel.org # 5.0+, requires custom backport due to 5.9 changes
Reported-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
4 years agoMerge tag 'mac80211-for-net-2020-09-21' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Mon, 21 Sep 2020 21:54:35 +0000 (14:54 -0700)]
Merge tag 'mac80211-for-net-2020-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just a few fixes:
 * fix using HE on 2.4 GHz
 * AQL (airtime queue limit) estimation & VHT160 fix
 * do not oversize A-MPDUs if local capability is smaller than peer's
 * fix radiotap on 6 GHz to not put 2.4 GHz flag
 * fix Kconfig for lib80211
 * little fixlet for 6 GHz channel number / frequency conversion
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoipv6: route: convert comma to semicolon
Xu Wang [Mon, 21 Sep 2020 06:38:56 +0000 (06:38 +0000)]
ipv6: route: convert comma to semicolon

Replace a comma between expression statements by a semicolon.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agosfc: Fix error code in probe
Dan Carpenter [Fri, 18 Sep 2020 14:33:11 +0000 (17:33 +0300)]
sfc: Fix error code in probe

This failure path should return a negative error code but it currently
returns success.

Fixes: 51b35a454efd ("sfc: skeleton EF100 PF driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agospi: fsl-espi: Only process interrupts for expected events
Chris Packham [Fri, 4 Sep 2020 00:28:12 +0000 (12:28 +1200)]
spi: fsl-espi: Only process interrupts for expected events

The SPIE register contains counts for the TX FIFO so any time the irq
handler was invoked we would attempt to process the RX/TX fifos. Use the
SPIM value to mask the events so that we only process interrupts that
were expected.

This was a latent issue exposed by commit 3282a3da25bd ("powerpc/64:
Implement soft interrupt replay in C").

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Link: https://lore.kernel.org/r/20200904002812.7300-1-chris.packham@alliedtelesis.co.nz
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoregmap: fix page selection for noinc writes
Dmitry Baryshkov [Thu, 17 Sep 2020 15:34:05 +0000 (18:34 +0300)]
regmap: fix page selection for noinc writes

Non-incrementing writes can fail if register + length crosses page
border. However for non-incrementing writes we should not check for page
border crossing. Fix this by passing additional flag to _regmap_raw_write
and passing length to _regmap_select_page basing on the flag.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: cdf6b11daa77 ("regmap: Add regmap_noinc_write API")
Link: https://lore.kernel.org/r/20200917153405.3139200-2-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoregmap: fix page selection for noinc reads
Dmitry Baryshkov [Thu, 17 Sep 2020 15:34:04 +0000 (18:34 +0300)]
regmap: fix page selection for noinc reads

Non-incrementing reads can fail if register + length crosses page
border. However for non-incrementing reads we should not check for page
border crossing. Fix this by passing additional flag to _regmap_raw_read
and passing length to _regmap_select_page basing on the flag.

Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 74fe7b551f33 ("regmap: Add regmap_noinc_read API")
Link: https://lore.kernel.org/r/20200917153405.3139200-1-dmitry.baryshkov@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
4 years agoMerge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck...
Linus Torvalds [Mon, 21 Sep 2020 19:42:31 +0000 (12:42 -0700)]
Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU fix from Paul McKenney:
 "This contains a single commit that fixes a bug that was introduced in
  the last merge window. This bug causes a compiler warning complaining
  about show_rcu_tasks_classic_gp_kthread() being an unused static
  function in !SMP kernels.

  The fix is straightforward, just adding an 'inline' to make this a
  static inline function, thus avoiding the warning.

  This bug was reported by Laurent Pinchart, who would like it fixed
  sooner rather than later"

* 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  rcu-tasks: Prevent complaints of unused show_rcu_tasks_classic_gp_kthread()

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Mon, 21 Sep 2020 15:53:48 +0000 (08:53 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "ARM:
   - fix fault on page table writes during instruction fetch

  s390:
   - doc improvement

  x86:
   - The obvious patches are always the ones that turn out to be
     completely broken. /me hangs his head in shame"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  Revert "KVM: Check the allocation of pv cpu mask"
  KVM: arm64: Remove S1PTW check from kvm_vcpu_dabt_iswrite()
  KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetch
  docs: kvm: add documentation for KVM_CAP_S390_DIAG318

4 years agoMerge tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 21 Sep 2020 15:46:20 +0000 (08:46 -0700)]
Merge tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fix from Dan Williams:
 "Fix compilation for the new dax_supported() exported helper"

* tag 'libnvdimm-fixes-5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX

4 years agoSUNRPC: Fix svc_flush_dcache()
Chuck Lever [Sun, 20 Sep 2020 17:46:25 +0000 (13:46 -0400)]
SUNRPC: Fix svc_flush_dcache()

On platforms that implement flush_dcache_page(), a large NFS WRITE
triggers the WARN_ONCE in bvec_iter_advance():

Sep 20 14:01:05 klimt.1015granger.net kernel: Attempted to advance past end of bvec iter
Sep 20 14:01:05 klimt.1015granger.net kernel: WARNING: CPU: 0 PID: 1032 at include/linux/bvec.h:101 bvec_iter_advance.isra.0+0xa7/0x158 [sunrpc]

Sep 20 14:01:05 klimt.1015granger.net kernel: Call Trace:
Sep 20 14:01:05 klimt.1015granger.net kernel:  svc_tcp_recvfrom+0x60c/0x12c7 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? bvec_iter_advance.isra.0+0x158/0x158 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? del_timer_sync+0x4b/0x55
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? test_bit+0x1d/0x27 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  svc_recv+0x1193/0x15e4 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? try_to_freeze.isra.0+0x6f/0x6f [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? refcount_sub_and_test.constprop.0+0x13/0x40 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? svc_xprt_put+0x1e/0x29f [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? svc_send+0x39f/0x3c1 [sunrpc]
Sep 20 14:01:05 klimt.1015granger.net kernel:  nfsd+0x282/0x345 [nfsd]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? __kthread_parkme+0x74/0xba
Sep 20 14:01:05 klimt.1015granger.net kernel:  kthread+0x2ad/0x2bc
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? nfsd_destroy+0x124/0x124 [nfsd]
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? test_bit+0x1d/0x27
Sep 20 14:01:05 klimt.1015granger.net kernel:  ? kthread_mod_delayed_work+0x115/0x115
Sep 20 14:01:05 klimt.1015granger.net kernel:  ret_from_fork+0x22/0x30

Reported-by: He Zhe <zhe.he@windriver.com>
Fixes: ca07eda33e01 ("SUNRPC: Refactor svc_recvfrom()")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>