Linus Torvalds [Wed, 1 May 2019 19:19:20 +0000 (12:19 -0700)]
gcc-9: don't warn about uninitialized btrfs extent_type variable
The 'extent_type' variable does seem to be reliably initialized, but
it's _very_ non-obvious, since there's a "goto next" case that jumps
over the normal initialization. That will then always trigger the
"start >= extent_end" test, which will end up never falling through to
the use of that variable.
But the code is certainly not obvious, and the compiler warning looks
reasonable. Make 'extent_type' an int, and initialize it to an invalid
negative value, which seems to be the common pattern in other places.
Linus Torvalds [Wed, 1 May 2019 18:07:40 +0000 (11:07 -0700)]
gcc-9: don't warn about uninitialized variable
I'm not sure what made gcc warn about this code now. The 'ret' variable
does end up initialized in all cases, but it's definitely not obvious,
so the compiler is quite reasonable to warn about this.
So just add initialization to make it all much more obvious both to
compilers and to humans.
Merge tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for a bunch of warnings/errors that the
syzbot has been finding with it's new-found ability to stress-test the
USB layer.
All of these are tiny, but fix real issues, and are marked for stable
as well. All of these have had lots of testing in linux-next as well"
* tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: w1 ds2490: Fix bug caused by improper use of altsetting array
USB: yurex: Fix protection fault after device removal
usb: usbip: fix isoc packet num validation in get_pipe
USB: core: Fix bug caused by duplicate interface PM usage counter
USB: dummy-hcd: Fix failure to give back unlinked URBs
USB: core: Fix unterminated string returned by usb_string()
Merge tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small patch for the stable folks to fix a problem when building
against the latest glibc.
I'll be honest and say that I'm not really thrilled with the idea of
sending this up right now, but Greg is a little annoyed so here I
figured I would at least send this"
* tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: use kernel linux/socket.h for genheaders and mdp
Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
"Syzbot found a use-after-free bug in seccomp due to flags that should
not be allowed to be used together.
Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
been running for several days without triggering KASan (before this
fix, it would reproduce). These patches have also been in -next for
almost a week, just to be sure.
- Add logic for making some seccomp flags exclusive (Tycho)
- Update selftests for exclusivity testing (Kees)"
* tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
seccomp: Make NEW_LISTENER and TSYNC flags exclusive
selftests/seccomp: Prepare for exclusive seccomp flags
This doesn't really do anything, but at least we now parse teh
ZERO_PAGE() address argument so that we'll catch the most obvious errors
in usage next time they'll happen.
See commit 6a5c5d26c4c6 ("rdma: fix build errors on s390 and MIPS due to
bad ZERO_PAGE use") what happens when we don't have any use of the macro
argument at all.
rdma: fix build errors on s390 and MIPS due to bad ZERO_PAGE use
The parameter to ZERO_PAGE() was wrong, but since all architectures
except for MIPS and s390 ignore it, it wasn't noticed until 0-day
reported the build error.
Fixes: 67f269b37f9b ("RDMA/ucontext: Fix regression with disassociate") Cc: stable@vger.kernel.org Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paulo Alcantara [Mon, 25 Feb 2019 00:55:28 +0000 (21:55 -0300)]
selinux: use kernel linux/socket.h for genheaders and mdp
When compiling genheaders and mdp from a newer host kernel, the
following error happens:
In file included from scripts/selinux/genheaders/genheaders.c:18:
./security/selinux/include/classmap.h:238:2: error: #error New
address family defined, please update secclass_map. #error New
address family defined, please update secclass_map. ^~~~~
make[3]: *** [scripts/Makefile.host:107:
scripts/selinux/genheaders/genheaders] Error 1 make[2]: ***
[scripts/Makefile.build:599: scripts/selinux/genheaders] Error 2
make[1]: *** [scripts/Makefile.build:599: scripts/selinux] Error 2
make[1]: *** Waiting for unfinished jobs....
Instead of relying on the host definition, include linux/socket.h in
classmap.h to have PF_MAX.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara <paulo@paulo.ac> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: manually merge in mdp.c, subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
Jan Kara [Wed, 24 Apr 2019 16:39:57 +0000 (18:39 +0200)]
fsnotify: Fix NULL ptr deref in fanotify_get_fsid()
fanotify_get_fsid() is reading mark->connector->fsid under srcu. It can
happen that it sees mark not fully initialized or mark that is already
detached from the object list. In these cases mark->connector
can be NULL leading to NULL ptr dereference. Fix the problem by
being careful when reading mark->connector and check it for being NULL.
Also use WRITE_ONCE when writing the mark just to prevent compiler from
doing something stupid.
Reported-by: syzbot+15927486a4f1bfcbaf91@syzkaller.appspotmail.com Fixes: 77115225acc6 ("fanotify: cache fsid in fsnotify_mark_connector") Signed-off-by: Jan Kara <jack@suse.cz>
- Fix EFI decompressor entry (ensuring barrier instructions are
enabled prior to use)"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled
ARM: fix function graph tracer and unwinder dependencies
Merge tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A one-liner to make our Radix MMU support depend on HUGETLB_PAGE. We
use some of the hugetlb inlines (eg. pud_huge()) when operating on the
linear mapping and if they're compiled into empty wrappers we can
corrupt memory.
Then two fixes to our VFIO IOMMU code. The first is not a regression
but fixes the locking to avoid a user-triggerable deadlock.
The second does fix a regression since rc1, and depends on the first
fix. It makes it possible to run guests with large amounts of memory
again (~256GB).
Thanks to Alexey Kardashevskiy"
* tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm_iommu: Allow pinning large regions
powerpc/mm_iommu: Fix potential deadlock
powerpc/mm/radix: Make Radix require HUGETLB_PAGE
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"One core bug fix and a few driver ones
- FRWR memory registration for hfi1/qib didn't work with with some
iovas causing a NFSoRDMA failure regression due to a fix in the NFS
side
- A command flow error in mlx5 allowed user space to send a corrupt
command (and also smash the kernel stack we've since learned)
- Fix a regression and some bugs with device hot unplug that was
discovered while reviewing Andrea's patches
- hns has a failure if the user asks for certain QP configurations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Bugfix for mapping user db
RDMA/ucontext: Fix regression with disassociate
RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages
RDMA/mlx5: Do not allow the user to write to the clock page
IB/mlx5: Fix scatter to CQE in DCT QP creation
IB/rdmavt: Fix frwr memory registration
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Just a couple of fixups for Synaptics RMI4 driver and allowing
snvs_pwrkey to be selected on more boards"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - write config register values to the right offset
Input: synaptics-rmi4 - fix possible double free
Input: snvs_pwrkey - make it depend on ARCH_MXC
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix an early boot crash in the RSDP parsing code by effectively
turning off the parsing call - we ran out of time but want to fix the
regression. The more involved fix is being worked on.
- Fix a crash that can trigger in the kmemlek code.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix a crash with kmemleak_scan()
x86/boot: Disable RSDP parsing temporarily
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A cstate event enumeration fix for Kaby/Coffee Lake CPUs"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters
for the error handling path, and rather than complicate that code, just
make it ok to always free what was returned by the init function.
That's what the code used to do before commit 4ab42d78e37a ("ppp, slip:
Validate VJ compression slot parameters completely") when slhc_init()
just returned NULL for the error case, with no actual indication of the
details of the error.
Reported-by: syzbot+45474c076a4927533d2e@syzkaller.appspotmail.com Fixes: 4ab42d78e37a ("ppp, slip: Validate VJ compression slot parameters completely") Acked-by: Ben Hutchings <ben@decadent.org.uk> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/proc/proc_sysctl.c: Fix a NULL pointer dereference
mm/page_alloc.c: fix never set ALLOC_NOFRAGMENT flag
mm/page_alloc.c: avoid potential NULL pointer dereference
mm, page_alloc: always use a captured page regardless of compaction result
mm: do not boost watermarks to avoid fragmentation for the DISCONTIG memory model
lib/test_vmalloc.c: do not create cpumask_t variable on stack
lib/Kconfig.debug: fix build error without CONFIG_BLOCK
zram: pass down the bvec we need to read into in the work struct
mm/memory_hotplug.c: drop memory device reference after find_memory_block()
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- keep the tail of an unaligned initrd reserved
- adjust ftrace_make_call() to deal with the relative nature of PLTs
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/module: ftrace: deal with place relative nature of PLTs
arm64: mm: Ensure tail of unaligned initrd is reserved
Merge tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Three tracing fixes:
- Use "nosteal" for ring buffer splice pages
- Memory leak fix in error path of trace_pid_write()
- Fix preempt_enable_no_resched() (use preempt_enable()) in ring
buffer code"
* tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
trace: Fix preempt_enable_no_resched() abuse
tracing: Fix a memory leak by early error exit in trace_pid_write()
tracing: Fix buffer_ref pipe ops
Merge tag 'gpio-v5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Not much to say about them, regular fixes:
- Fix a bug on the errorpath of gpiochip_add_data_with_key()
- IRQ type setting on the spreadtrum GPIO driver"
* tag 'gpio-v5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Fix gpiochip_add_data_with_key() error path
gpio: eic: sprd: Fix incorrect irq type setting for the sync EIC
Merge tag 'drm-fixes-2019-04-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular drm fixes, nothing too outstanding, I'm guessing Easter was
slowing people down.
i915:
- FEC enable fix
- BXT display lanes fix
ttm:
- fix reinit for reloading drivers regression
imx:
- DP CSC fix
sun4i:
- module unload/load fix
vc4:
- memory leak fix
- compile fix
dw-hdmi:
- rockchip scdc overflow fix
sched:
- docs fix
vmwgfx:
- dma api layering fix"
* tag 'drm-fixes-2019-04-26' of git://anongit.freedesktop.org/drm/drm:
drm/bridge: dw-hdmi: fix SCDC configuration for ddc-i2c-bus
drm/vmwgfx: Fix dma API layer violation
drm/vc4: Fix compilation error reported by kbuild test bot
drm/sun4i: Unbind components before releasing DRM and memory
drm/vc4: Fix memory leak during gpu reset.
drm/sched: Fix description of drm_sched_stop
drm/imx: don't skip DP channel disable for background plane
gpu: ipu-v3: dp: fix CSC handling
drm/ttm: fix re-init of global structures
drm/sun4i: Fix component unbinding and component master deletion
drm/sun4i: Set device driver data at bind time for use in unbind
drm/sun4i: Add missing drm_atomic_helper_shutdown at driver unbind
drm/i915: Restore correct bxt_ddi_phy_calc_lane_lat_optim_mask() calculation
drm/i915: Do not enable FEC without DSC
drm: bridge: dw-hdmi: Fix overflow workaround for Rockchip SoCs
Merge tag 'for-5.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One patch to fix a crash in io submission path, due to memory
allocation errors.
In short, the multipage bio work that landed in 5.1 caused larger bios
that in turn require larger temporary memory for checksums. The patch
is a workaround, we're going to rework the allocation so it does not
require the vmalloc fallback.
It took a while to identify that it's caused by patches in 5.1 and not
a patchset that did some changes in error handling in the code. I've
tested it on various memory/cpu combinations, it could hit OOM but
does not crash.
The timestamp of the patch is less than a day due to updates in the
changelog, tests were running meanwhile"
* tag 'for-5.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: Switch memory allocations in async csum calculation path to kvmalloc
Merge tag '5.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Three small SMB3 fixes (all for stable as well): two leaks and a
rename bug"
* tag '5.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix page reference leak with readv/writev
cifs: do not attempt cifs operation on smb2+ rename error
cifs: fix memory leak in SMB2_read
commit 23da9588037e ("fs/proc/proc_sysctl.c: fix NULL pointer
dereference in put_links") forgot to handle start_unregistering() case,
while header->parent is NULL, it calls erase_header() and as seen in the
above syzkaller call trace, accessing &header->parent->root will trigger
a NULL pointer dereference.
As that commit explained, there is also no need to call
start_unregistering() if header->parent is NULL.
Link: http://lkml.kernel.org/r/20190409153622.28112-1-yuehaibing@huawei.com Fixes: 23da9588037e ("fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links") Fixes: 0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/page_alloc.c: fix never set ALLOC_NOFRAGMENT flag
Commit 0a79cdad5eb2 ("mm: use alloc_flags to record if kswapd can wake")
removed setting of the ALLOC_NOFRAGMENT flag. Bring it back.
The runtime effect is that ALLOC_NOFRAGMENT behaviour is restored so
that allocations are spread across local zones to avoid fragmentation
due to mixing pageblocks as long as possible.
Link: http://lkml.kernel.org/r/20190423120806.3503-2-aryabinin@virtuozzo.com Fixes: 0a79cdad5eb2 ("mm: use alloc_flags to record if kswapd can wake") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ac.preferred_zoneref->zone passed to alloc_flags_nofragment() can be NULL.
'zone' pointer unconditionally derefernced in alloc_flags_nofragment().
Bail out on NULL zone to avoid potential crash. Currently we don't see
any crashes only because alloc_flags_nofragment() has another bug which
allows compiler to optimize away all accesses to 'zone'.
Link: http://lkml.kernel.org/r/20190423120806.3503-1-aryabinin@virtuozzo.com Fixes: 6bb154504f8b ("mm, page_alloc: spread allocations across zones before introducing fragmentation") Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm, page_alloc: always use a captured page regardless of compaction result
During the development of commit 5e1f0f098b46 ("mm, compaction: capture
a page under direct compaction"), a paranoid check was added to ensure
that if a captured page was available after compaction that it was
consistent with the final state of compaction. The intent was to catch
serious programming bugs such as using a stale page pointer and causing
corruption problems.
However, it is possible to get a captured page even if compaction was
unsuccessful if an interrupt triggered and happened to free pages in
interrupt context that got merged into a suitable high-order page. It's
highly unlikely but Li Wang did report the following warning on s390
occuring when testing OOM handling. Note that the warning is slightly
edited for clarity.
This patch simply removes the check entirely instead of trying to be
clever about pages freed from interrupt context. If a serious
programming error was introduced, it is highly likely to be caught by
prep_new_page() instead.
Link: http://lkml.kernel.org/r/20190419085133.GH18914@techsingularity.net Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Li Wang <liwang@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: do not boost watermarks to avoid fragmentation for the DISCONTIG memory model
Mikulas Patocka reported that commit 1c30844d2dfe ("mm: reclaim small
amounts of memory when an external fragmentation event occurs") "broke"
memory management on parisc.
The machine is not NUMA but the DISCONTIG model creates three pgdats
even though it's a UMA machine for the following ranges
0) Start 0x0000000000000000 End 0x000000003fffffff Size 1024 MB
1) Start 0x0000000100000000 End 0x00000001bfdfffff Size 3070 MB
2) Start 0x0000004040000000 End 0x00000040ffffffff Size 3072 MB
Mikulas reported:
With the patch 1c30844d2, the kernel will incorrectly reclaim the
first zone when it fills up, ignoring the fact that there are two
completely free zones. Basiscally, it limits cache size to 1GiB.
For example, if I run:
# dd if=/dev/sda of=/dev/null bs=1M count=2048
- with the proper kernel, there should be "Buffers - 2GiB"
when this command finishes. With the patch 1c30844d2, buffers
will consume just 1GiB or slightly more, because the kernel was
incorrectly reclaiming them.
The page allocator and reclaim makes assumptions that pgdats really
represent NUMA nodes and zones represent ranges and makes decisions on
that basis. Watermark boosting for small pgdats leads to unexpected
results even though this would have behaved reasonably on SPARSEMEM.
DISCONTIG is essentially deprecated and even parisc plans to move to
SPARSEMEM so there is no need to be fancy, this patch simply disables
watermark boosting by default on DISCONTIGMEM.
Link: http://lkml.kernel.org/r/20190419094335.GJ18914@techsingularity.net Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/Kconfig.debug: fix build error without CONFIG_BLOCK
If CONFIG_TEST_KMOD is set to M, while CONFIG_BLOCK is not set, XFS and
BTRFS can not be compiled successly.
Link: http://lkml.kernel.org/r/20190410075434.35220-1-yuehaibing@huawei.com Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
zram: pass down the bvec we need to read into in the work struct
When scheduling work item to read page we need to pass down the proper
bvec struct which points to the page to read into. Before this patch it
uses a randomly initialized bvec (only if PAGE_SIZE != 4096) which is
wrong.
Note that without this patch on arch/kernel where PAGE_SIZE != 4096
userspace could read random memory through a zram block device (thought
userspace probably would have no control on the address being read).
mm/memory_hotplug.c: drop memory device reference after find_memory_block()
Right now we are using find_memory_block() to get the node id for the
pfn range to online. We are missing to drop a reference to the memory
block device. While the device still gets unregistered via
device_unregister(), resulting in no user visible problem, the device is
never released via device_release(), resulting in a memory leak. Fix
that by properly using a put_device().
Link: http://lkml.kernel.org/r/20190411110955.1430-1-david@redhat.com Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Pankaj Gupta <pagupta@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Qian Cai <cai@lca.pw> Cc: Arun KS <arunks@codeaurora.org> Cc: Mathieu Malaterre <malat@debian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Peter Zijlstra [Tue, 23 Apr 2019 20:03:18 +0000 (22:03 +0200)]
trace: Fix preempt_enable_no_resched() abuse
Unless the very next line is schedule(), or implies it, one must not use
preempt_enable_no_resched(). It can cause a preemption to go missing and
thereby cause arbitrary delays, breaking the PREEMPT=y invariant.
Link: http://lkml.kernel.org/r/20190423200318.GY14281@hirez.programming.kicks-ass.net Cc: Waiman Long <longman@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: the arch/x86 maintainers <x86@kernel.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: huang ying <huang.ying.caritas@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: stable@vger.kernel.org Fixes: 2c2d7329d8af ("tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Wenwen Wang [Sat, 20 Apr 2019 02:22:59 +0000 (21:22 -0500)]
tracing: Fix a memory leak by early error exit in trace_pid_write()
In trace_pid_write(), the buffer for trace parser is allocated through
kmalloc() in trace_parser_get_init(). Later on, after the buffer is used,
it is then freed through kfree() in trace_parser_put(). However, it is
possible that trace_pid_write() is terminated due to unexpected errors,
e.g., ENOMEM. In that case, the allocated buffer will not be freed, which
is a memory leak bug.
To fix this issue, free the allocated buffer when an error is encountered.
Link: http://lkml.kernel.org/r/1555726979-15633-1-git-send-email-wang6495@umn.edu Fixes: f4d34a87e9c10 ("tracing: Use pid bitmap instead of a pid array for set_event_pid") Cc: stable@vger.kernel.org Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This fixes multiple issues in buffer_pipe_buf_ops:
- The ->steal() handler must not return zero unless the pipe buffer has
the only reference to the page. But generic_pipe_buf_steal() assumes
that every reference to the pipe is tracked by the page's refcount,
which isn't true for these buffers - buffer_pipe_buf_get(), which
duplicates a buffer, doesn't touch the page's refcount.
Fix it by using generic_pipe_buf_nosteal(), which refuses every
attempted theft. It should be easy to actually support ->steal, but the
only current users of pipe_buf_steal() are the virtio console and FUSE,
and they also only use it as an optimization. So it's probably not worth
the effort.
- The ->get() and ->release() handlers can be invoked concurrently on pipe
buffers backed by the same struct buffer_ref. Make them safe against
concurrency by using refcount_t.
- The pointers stored in ->private were only zeroed out when the last
reference to the buffer_ref was dropped. As far as I know, this
shouldn't be necessary anyway, but if we do it, let's always do it.
Link: http://lkml.kernel.org/r/20190404215925.253531-1-jannh@google.com Cc: Ingo Molnar <mingo@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Fixes: 73a757e63114d ("ring-buffer: Return reader page back into existing ring buffer") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Dave Airlie [Fri, 26 Apr 2019 00:32:57 +0000 (10:32 +1000)]
Merge tag 'imx-drm-fixes-2019-04-25' of git://git.pengutronix.de/pza/linux into drm-fixes
drm/imx: fix DP CSC handling
- Fix the DP color space conversion matrix setup to avoid bugs where
disabling the overlay plane while both primary and overlay plane are
routed via the CSC unit would not reconfigure the CSC routing
properly, leaving the display in a nonworking state, or the CSC
setting from a previously set mode would be left behind, causing
wrong colors when reenabling the display in certain configurations.
Tycho Andersen [Wed, 6 Mar 2019 20:14:13 +0000 (13:14 -0700)]
seccomp: Make NEW_LISTENER and TSYNC flags exclusive
As the comment notes, the return codes for TSYNC and NEW_LISTENER
conflict, because they both return positive values, one in the case of
success and one in the case of error. So, let's disallow both of these
flags together.
While this is technically a userspace break, all the users I know
of are still waiting on me to land this feature in libseccomp, so I
think it'll be safe. Also, at present my use case doesn't require
TSYNC at all, so this isn't a big deal to disallow. If someone
wanted to support this, a path forward would be to add a new flag like
TSYNC_AND_LISTENER_YES_I_UNDERSTAND_THAT_TSYNC_WILL_JUST_RETURN_EAGAIN,
but the use cases are so different I don't see it really happening.
Finally, it's worth noting that this does actually fix a UAF issue: at the
end of seccomp_set_mode_filter(), we have:
if (flags & SECCOMP_FILTER_FLAG_NEW_LISTENER) {
if (ret < 0) {
listener_f->private_data = NULL;
fput(listener_f);
put_unused_fd(listener);
} else {
fd_install(listener, listener_f);
ret = listener;
}
}
out_free:
seccomp_filter_free(prepared);
But if ret > 0 because TSYNC raced, we'll install the listener fd and then
free the filter out from underneath it, causing a UAF when the task closes
it or dies. This patch also switches the condition to be simply if (ret),
so that if someone does add the flag mentioned above, they won't have to
remember to fix this too.
Reported-by: syzbot+b562969adb2e04af3442@syzkaller.appspotmail.com Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace") CC: stable@vger.kernel.org # v5.0+ Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: James Morris <jamorris@linux.microsoft.com>
sched_clock_cpu() may not be consistent between CPUs. If a task
migrates to another CPU, then se.exec_start is set to that CPU's
rq_clock_task() by update_stats_curr_start(). Specifically, the new
value might be before the old value due to clock skew.
So then if in numa_get_avg_runtime() the expression:
'now - p->last_task_numa_placement'
ends up as -1, then the divider '*period + 1' in task_numa_placement()
is 0 and things go bang. Similar to update_curr(), check if time goes
backwards to avoid this.
[ peterz: Wrote new changelog. ]
[ mingo: Tweaked the code comment. ]
Merge tag 'ceph-for-5.1-rc7' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"dentry name handling fixes from Jeff and a memory leak fix from Zheng.
Both are old issues, marked for stable"
* tag 'ceph-for-5.1-rc7' of git://github.com/ceph/ceph-client:
ceph: fix ci->i_head_snapc leak
ceph: handle the case where a dentry has been renamed on outstanding req
ceph: ensure d_name stability in ceph_dentry_hash()
ceph: only use d_name directly when parent is locked
Lijun Ou [Tue, 23 Apr 2019 09:30:26 +0000 (17:30 +0800)]
RDMA/hns: Bugfix for mapping user db
When the maximum send wr delivered by the user is zero, the qp does not
have a sq.
When allocating the sq db buffer to store the user sq pi pointer and map
it to the kernel mode, max_send_wr is used as the trigger condition, while
the kernel does not consider the max_send_wr trigger condition when
mapmping db. It will cause sq record doorbell map fail and create qp fail.
Fixes: 0425e3e6e0c7 ("RDMA/hns: Support flush cqe for hip08 in kernel space") Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Nikolay Borisov [Mon, 1 Apr 2019 08:29:58 +0000 (11:29 +0300)]
btrfs: Switch memory allocations in async csum calculation path to kvmalloc
Recent multi-page biovec rework allowed creation of bios that can span
large regions - up to 128 megabytes in the case of btrfs. OTOH btrfs'
submission path currently allocates a contiguous array to store the
checksums for every bio submitted. This means we can request up to
(128mb / BTRFS_SECTOR_SIZE) * 4 bytes + 32bytes of memory from kmalloc.
On busy systems with possibly fragmented memory said kmalloc can fail
which will trigger BUG_ON due to improper error handling IO submission
context in btrfs.
Until error handling is improved or bios in btrfs limited to a more
manageable size (e.g. 1m) let's use kvmalloc to fallback to vmalloc for
such large allocations. There is no hard requirement that the memory
allocated for checksums during IO submission has to be contiguous, but
this is a simple fix that does not require several non-contiguous
allocations.
For small writes this is unlikely to have any visible effect since
kmalloc will still satisfy allocation requests as usual. For larger
requests the code will just fallback to vmalloc.
We've performed evaluation on several workload types and there was no
significant difference kmalloc vs kvmalloc.
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Alan Stern [Mon, 22 Apr 2019 15:16:04 +0000 (11:16 -0400)]
USB: w1 ds2490: Fix bug caused by improper use of altsetting array
The syzkaller USB fuzzer spotted a slab-out-of-bounds bug in the
ds2490 driver. This bug is caused by improper use of the altsetting
array in the usb_interface structure (the array's entries are not
always stored in numerical order), combined with a naive assumption
that all interfaces probed by the driver will have the expected number
of altsettings.
The bug can be fixed by replacing references to the possibly
non-existent intf->altsetting[alt] entry with the guaranteed-to-exist
intf->cur_altsetting entry.
Alan Stern [Tue, 23 Apr 2019 18:48:29 +0000 (14:48 -0400)]
USB: yurex: Fix protection fault after device removal
The syzkaller USB fuzzer found a general-protection-fault bug in the
yurex driver. The fault occurs when a device has been unplugged; the
driver's interrupt-URB handler logs an error message referring to the
device by name, after the device has been unregistered and its name
deallocated.
This problem is caused by the fact that the interrupt URB isn't
cancelled until the driver's private data structure is released, which
can happen long after the device is gone. The cure is to make sure
that the interrupt URB is killed before yurex_disconnect() returns;
this is exactly the sort of thing that usb_poison_urb() was meant for.
usb: usbip: fix isoc packet num validation in get_pipe
Change the validation of number_of_packets in get_pipe to compare the
number of packets to a fixed maximum number of packets allowed, set to
be 1024. This number was chosen due to it being used by other drivers as
well, for example drivers/usb/host/uhci-q.c
Background/reason:
The get_pipe function in stub_rx.c validates the number of packets in
isochronous mode and aborts with an error if that number is too large,
in order to prevent malicious input from possibly triggering large
memory allocations. This was previously done by checking whether
pdu->u.cmd_submit.number_of_packets is bigger than the number of packets
that would be needed for pdu->u.cmd_submit.transfer_buffer_length bytes
if all except possibly the last packet had maximum length, given by
usb_endpoint_maxp(epd) * usb_endpoint_maxp_mult(epd). This leads to an
error if URBs with packets shorter than the maximum possible length are
submitted, which is allowed according to
Documentation/driver-api/usb/URB.rst and occurs for example with the
snd-usb-audio driver.
[ 31.041908] [<c073273c>] (i2c_transfer) from [<c05926c4>] (drm_scdc_read+0x54/0x84)
[ 31.041913] [<c05926c4>] (drm_scdc_read) from [<c0592858>] (drm_scdc_set_scrambling+0x30/0xbc)
[ 31.041919] [<c0592858>] (drm_scdc_set_scrambling) from [<c05cc0f4>] (dw_hdmi_update_power+0x1440/0x1610)
[ 31.041926] [<c05cc0f4>] (dw_hdmi_update_power) from [<c05cc574>] (dw_hdmi_bridge_enable+0x2c/0x70)
[ 31.041932] [<c05cc574>] (dw_hdmi_bridge_enable) from [<c05aed48>] (drm_bridge_enable+0x24/0x34)
[ 31.041938] [<c05aed48>] (drm_bridge_enable) from [<c0591060>] (drm_atomic_helper_commit_modeset_enables+0x114/0x220)
[ 31.041943] [<c0591060>] (drm_atomic_helper_commit_modeset_enables) from [<c05c3fe0>] (rockchip_atomic_helper_commit_tail_rpm+0x28/0x64)
hdmi->i2c may not be set when ddc-i2c-bus property is used in device tree.
Fix this by using hdmi->ddc as the i2c adapter when calling drm_scdc_*().
Also report that SCDC is not supported when there is no DDC bus.
The err_remove_chip block is too coarse, and may perform cleanup that
must not be done. E.g. if of_gpiochip_add() fails, of_gpiochip_remove()
is still called, causing:
OF: ERROR: Bad of_node_put() on /soc/gpio@e6050000
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 5.1.0-rc2-koelsch+ #407
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
Workqueue: events deferred_probe_work_func
[<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c)
[<c07c1224>] (dump_stack) from [<c07c5a80>] (kobject_put+0x94/0xbc)
[<c07c5a80>] (kobject_put) from [<c0470420>] (gpiochip_add_data_with_key+0x8d8/0xa3c)
[<c0470420>] (gpiochip_add_data_with_key) from [<c0473738>] (gpio_rcar_probe+0x1d4/0x314)
[<c0473738>] (gpio_rcar_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94)
and later, if a GPIO consumer tries to use a GPIO from a failed
controller:
WARNING: CPU: 0 PID: 1 at lib/refcount.c:156 kobject_get+0x38/0x4c
refcount_t: increment on 0; use-after-free.
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc2-koelsch+ #407
Hardware name: Generic R-Car Gen2 (Flattened Device Tree)
[<c020ec74>] (unwind_backtrace) from [<c020ae58>] (show_stack+0x10/0x14)
[<c020ae58>] (show_stack) from [<c07c1224>] (dump_stack+0x7c/0x9c)
[<c07c1224>] (dump_stack) from [<c0221580>] (__warn+0xd0/0xec)
[<c0221580>] (__warn) from [<c02215e0>] (warn_slowpath_fmt+0x44/0x6c)
[<c02215e0>] (warn_slowpath_fmt) from [<c07c58fc>] (kobject_get+0x38/0x4c)
[<c07c58fc>] (kobject_get) from [<c068b3ec>] (of_node_get+0x14/0x1c)
[<c068b3ec>] (of_node_get) from [<c0686f24>] (of_find_node_by_phandle+0xc0/0xf0)
[<c0686f24>] (of_find_node_by_phandle) from [<c0686fbc>] (of_phandle_iterator_next+0x68/0x154)
[<c0686fbc>] (of_phandle_iterator_next) from [<c0687fe4>] (__of_parse_phandle_with_args+0x40/0xd0)
[<c0687fe4>] (__of_parse_phandle_with_args) from [<c0688204>] (of_parse_phandle_with_args_map+0x100/0x3ac)
[<c0688204>] (of_parse_phandle_with_args_map) from [<c0471240>] (of_get_named_gpiod_flags+0x38/0x380)
[<c0471240>] (of_get_named_gpiod_flags) from [<c046f864>] (gpiod_get_from_of_node+0x24/0xd8)
[<c046f864>] (gpiod_get_from_of_node) from [<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child+0xa0/0x144)
[<c0470aa4>] (devm_fwnode_get_index_gpiod_from_child) from [<c05f425c>] (gpio_keys_probe+0x418/0x7bc)
[<c05f425c>] (gpio_keys_probe) from [<c052fca8>] (platform_drv_probe+0x48/0x94)
Fix this by splitting the cleanup block, and adding a missing call to
gpiochip_irqchip_remove().
Thomas Hellstrom [Tue, 23 Apr 2019 12:02:57 +0000 (14:02 +0200)]
drm/vmwgfx: Fix dma API layer violation
Remove the check for IOMMU presence since it was considered a
layer violation.
This means we have no reliable way to destinguish between coherent
hardware IOMMU DMA address translations and incoherent SWIOTLB DMA
address translations, which we can't handle. So always presume the
former. This means that if anybody forces SWIOTLB without also setting
the vmw_force_coherent=1 vmwgfx option, driver operation will fail,
like it will on most other graphics drivers.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
Harry Pan [Wed, 24 Apr 2019 14:50:33 +0000 (22:50 +0800)]
perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters
Kaby Lake (and Coffee Lake) has PC8/PC9/PC10 residency counters.
This patch updates the list of Kaby/Coffee Lake PMU event counters
from the snb_cstates[] list of events to the hswult_cstates[]
list of events, which keeps all previously supported events and
also adds the PKG_C8, PKG_C9 and PKG_C10 residency counters.
This allows user space tools to profile them through the perf interface.
Signed-off-by: Harry Pan <harry.pan@intel.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: gs0622@gmail.com Link: http://lkml.kernel.org/r/20190424145033.1924-1-harry.pan@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull networking fixes from David Miller:
"Just the usual assortment of small'ish fixes:
1) Conntrack timeout is sometimes not initialized properly, from
Alexander Potapenko.
2) Add a reasonable range limit to tcp_min_rtt_wlen to avoid
undefined behavior. From ZhangXiaoxu.
3) des1 field of descriptor in stmmac driver is initialized with the
wrong variable. From Yue Haibing.
4) Increase mlxsw pci sw reset timeout a little bit more, from Ido
Schimmel.
5) Match IOT2000 stmmac devices more accurately, from Su Bao Cheng.
6) Fallback refcount fix in TLS code, from Jakub Kicinski.
7) Fix max MTU check when using XDP in mlx5, from Maxim Mikityanskiy.
8) Fix recursive locking in team driver, from Hangbin Liu.
9) Fix tls_set_device_offload_Rx() deadlock, from Jakub Kicinski.
10) Don't use napi_alloc_frag() outside of softiq context of socionext
driver, from Ilias Apalodimas.
11) MAC address increment overflow in ncsi, from Tao Ren.
12) Fix a regression in 8K/1M pool switching of RDS, from Zhu Yanjun.
13) ipv4_link_failure has to validate the headers that are actually
there because RAW sockets can pass in arbitrary garbage, from Eric
Dumazet"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add sanity checks in ipv4_link_failure()
net/rose: fix unbound loop in rose_loopback_timer()
rxrpc: fix race condition in rxrpc_input_packet()
net: rds: exchange of 8K and 1M pool
net: vrf: Fix operation not supported when set vrf mac
net/ncsi: handle overflow when incrementing mac address
net: socionext: replace napi_alloc_frag with the netdev variant on init
net: atheros: fix spelling mistake "underun" -> "underrun"
spi: ST ST95HF NFC: declare missing of table
spi: Micrel eth switch: declare missing of table
net: stmmac: move stmmac_check_ether_addr() to driver probe
netfilter: fix nf_l4proto_log_invalid to log invalid packets
netfilter: never get/set skb->tstamp
netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
Documentation: decnet: remove reference to CONFIG_DECNET_ROUTE_FWMARK
dt-bindings: add an explanation for internal phy-mode
net/tls: don't leak IV and record seq when offload fails
net/tls: avoid potential deadlock in tls_set_device_offload_rx()
selftests/net: correct the return value for run_afpackettests
team: fix possible recursive locking when add slaves
...
Merge tag 'leds-for-5.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED update from Jacek Anaszewski:
"A single change to MAINTAINERS:
We announce a new LED reviewer - Dan Murphy"
* tag 'leds-for-5.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
MAINTAINERS: LEDs: Add designated reviewer for LED subsystem
Eric Dumazet [Wed, 24 Apr 2019 15:04:05 +0000 (08:04 -0700)]
ipv4: add sanity checks in ipv4_link_failure()
Before calling __ip_options_compile(), we need to ensure the network
header is a an IPv4 one, and that it is already pulled in skb->head.
RAW sockets going through a tunnel can end up calling ipv4_link_failure()
with total garbage in the skb, or arbitrary lengthes.
syzbot report :
BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:355 [inline]
BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x294/0x1120 net/ipv4/ip_options.c:123
Write of size 69 at addr ffff888096abf068 by task syz-executor.4/9204
Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Suryaputra <ssuryaextr@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 24 Apr 2019 16:44:11 +0000 (09:44 -0700)]
rxrpc: fix race condition in rxrpc_input_packet()
After commit 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook"),
rxrpc_input_packet() is directly called from lockless UDP receive
path, under rcu_read_lock() protection.
It must therefore use RCU rules :
- udp_sk->sk_user_data can be cleared at any point in this function.
rcu_dereference_sk_user_data() is what we need here.
- Also, since sk_user_data might have been set in rxrpc_open_socket()
we must observe a proper RCU grace period before kfree(local) in
rxrpc_lookup_local()
v4: @local can be NULL in xrpc_lookup_local() as reported by kbuild test robot <lkp@intel.com>
and Julia Lawall <julia.lawall@lip6.fr>, thanks !
v3,v2 : addressed David Howells feedback, thanks !
Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Before the commit 490ea5967b0d ("RDS: IB: move FMR code to its own file"),
when the dirty_count is greater than 9/10 of max_items of 8K pool,
1M pool is used, Vice versa. After the commit 490ea5967b0d ("RDS: IB: move
FMR code to its own file"), the above is removed. When we make the
following tests.
Server:
rds-stress -r 1.1.1.16 -D 1M
Client:
rds-stress -r 1.1.1.14 -s 1.1.1.16 -D 1M
The following will appear.
"
connecting to 1.1.1.16:4000
negotiated options, tasks will start in 2 seconds
Starting up..header from 1.1.1.166:4001 to id 4001 bogus
..
tsks tx/s rx/s tx+rx K/s mbi K/s mbo K/s tx us/c rtt us
cpu %
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
1 0 0 0.00 0.00 0.00 0.00 0.00 -1.00
...
"
So this exchange between 8K and 1M pool is added back.
Fixes: commit 490ea5967b0d ("RDS: IB: move FMR code to its own file") Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
CIFS can leak pages reference gotten through GUP (get_user_pages*()
through iov_iter_get_pages()). This happen if cifs_send_async_read()
or cifs_write_from_iter() calls fail from within __cifs_readv() and
__cifs_writev() respectively. This patch move page unreference to
cifs_aio_ctx_release() which will happens on all code paths this is
all simpler to follow for correctness.
Signed-off-by: Jérôme Glisse <jglisse@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Frank Sorenson [Tue, 16 Apr 2019 13:37:27 +0000 (08:37 -0500)]
cifs: do not attempt cifs operation on smb2+ rename error
A path-based rename returning EBUSY will incorrectly try opening
the file with a cifs (NT Create AndX) operation on an smb2+ mount,
which causes the server to force a session close.
If the mount is smb2+, skip the fallback.
Signed-off-by: Frank Sorenson <sorenson@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Stable <stable@vger.kernel.org> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
Pan Bian [Fri, 19 Apr 2019 07:39:00 +0000 (07:39 +0000)]
Input: synaptics-rmi4 - fix possible double free
The RMI4 function structure has been released in rmi_register_function
if error occurs. However, it will be released again in the function
rmi_create_function, which may result in a double-free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Jason Gunthorpe [Tue, 16 Apr 2019 11:07:28 +0000 (14:07 +0300)]
RDMA/ucontext: Fix regression with disassociate
When this code was consolidated the intention was that the VMA would
become backed by anonymous zero pages after the zap_vma_pte - however this
very subtly relied on setting the vm_ops = NULL and clearing the VM_SHARED
bits to transform the VMA into an anonymous VMA. Since the vm_ops was
removed this broke.
Now userspace gets a SIGBUS if it touches the vma after disassociation.
Instead of converting the VMA to anonymous provide a fault handler that
puts a zero'd page into the VMA when user-space touches it after
disassociation.
Cc: stable@vger.kernel.org Suggested-by: Andrea Arcangeli <aarcange@redhat.com> Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Jacky Bai [Fri, 5 Apr 2019 17:31:09 +0000 (10:31 -0700)]
Input: snvs_pwrkey - make it depend on ARCH_MXC
The SNVS power key is not only used on i.MX6SX and i.MX7D, it is also
used by i.MX6UL and NXP's latest ARMv8 based i.MX8M series SOC. So
update the config dependency to use ARCH_MXC, and add the COMPILE_TEST
too.
Jason Gunthorpe [Tue, 16 Apr 2019 11:07:26 +0000 (14:07 +0300)]
RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages
Since mlx5 supports device disassociate it must use this API for all
BAR page mmaps, otherwise the pages can remain mapped after the device
is unplugged causing a system crash.
Cc: stable@vger.kernel.org Fixes: 5f9794dc94f5 ("RDMA/ucontext: Add a core API for mmaping driver IO memory") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Jason Gunthorpe [Tue, 16 Apr 2019 11:07:25 +0000 (14:07 +0300)]
RDMA/mlx5: Do not allow the user to write to the clock page
The intent of this VMA was to be read-only from user space, but the
VM_MAYWRITE masking was missed, so mprotect could make it writable.
Cc: stable@vger.kernel.org Fixes: 5c99eaecb1fc ("IB/mlx5: Mmap the HCA's clock info to user-space") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
drm/vc4: Fix compilation error reported by kbuild test bot
A pointer to crtc was missing, resulting in the following build error:
drivers/gpu/drm/vc4/vc4_crtc.c:1045:44: sparse: sparse: incorrect type in argument 1 (different base types)
drivers/gpu/drm/vc4/vc4_crtc.c:1045:44: sparse: expected struct drm_crtc *crtc
drivers/gpu/drm/vc4/vc4_crtc.c:1045:44: sparse: got struct drm_crtc_state *state
drivers/gpu/drm/vc4/vc4_crtc.c:1045:39: sparse: sparse: not enough arguments for function vc4_crtc_destroy_state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: kbuild test robot <lkp@intel.com> Cc: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/2b6ed5e6-81b0-4276-8860-870b54ca3262@linux.intel.com Fixes: d08106796a78 ("drm/vc4: Fix memory leak during gpu reset.") Cc: <stable@vger.kernel.org> # v4.6+ Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm/sun4i: Unbind components before releasing DRM and memory
Our components may still be using the DRM device driver (if only to
access our driver's private data), so make sure to unbind them before
the final drm_dev_put.
Also release our reserved memory after component unbind instead of
before to match reverse creation order.
unmaps memory inside the .bss when DEBUG_PAGEALLOC=y.
kmemleak_init() will register the .data/.bss sections and then
kmemleak_scan() will scan those addresses and dereference them looking
for pointer references. If free_init_pages() frees and unmaps pages in
those sections, kmemleak_scan() will crash if referencing one of those
addresses:
BUG: unable to handle kernel paging request at ffffffffbd402000
CPU: 12 PID: 325 Comm: kmemleak Not tainted 5.1.0-rc4+ #4
RIP: 0010:scan_block
Call Trace:
scan_gray_list
kmemleak_scan
kmemleak_scan_thread
kthread
ret_from_fork
Since kmemleak_free_part() is tolerant to unknown objects (not tracked
by kmemleak), it is fine to call it from free_init_pages() even if not
all address ranges passed to this function are known to kmemleak.
[ bp: Massage. ]
Fixes: b3f0907c71e0 ("x86/mm: Add .bss..decrypted section to hold shared variables") Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190423165811.36699-1-cai@lca.pw
Tao Ren [Wed, 24 Apr 2019 01:43:32 +0000 (01:43 +0000)]
net/ncsi: handle overflow when incrementing mac address
Previously BMC's MAC address is calculated by simply adding 1 to the
last byte of network controller's MAC address, and it produces incorrect
result when network controller's MAC address ends with 0xFF.
The problem can be fixed by calling eth_addr_inc() function to increment
MAC address; besides, the MAC address is also validated before assigning
to BMC.
Fixes: cb10c7c0dfd9 ("net/ncsi: Add NCSI Broadcom OEM command") Signed-off-by: Tao Ren <taoren@fb.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'drm-fixes-2019-04-24' of git://anongit.freedesktop.org/drm/drm
Pull drm regression fixes from Dave Airlie:
"We interrupt your regularly scheduled drm fixes for a regression
special.
The first is for a fix in i915 that had unexpected side effects
fallout in the userspace X.org modesetting driver where X would no
longer start. I got tired of the nitpicking and issued a large hammer
on it. The X.org driver is buggy, but blackscreen regressions are
worse.
The second was an oversight that myself and Gerd should have noticed
better, Gerd is trying to fix this properly, but the regression is too
large to leave, even if the original behaviour is bad in some cases,
it's clearly bad to break a bunch of working use cases.
I'll likely have a regular fixes pull later, but I really wanted to
highlight these"
* tag 'drm-fixes-2019-04-24' of git://anongit.freedesktop.org/drm/drm:
Revert "drm/virtio: drop prime import/export callbacks"
Revert "drm/i915/fbdev: Actually configure untiled displays"
net: socionext: replace napi_alloc_frag with the netdev variant on init
The netdev variant is usable on any context since it disables interrupts.
The napi variant of the call should only be used within softirq context.
Replace napi_alloc_frag on driver init with the correct netdev_alloc_frag
call
Changes since v1:
- Adjusted commit message
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Jassi Brar <jaswinder.singh@linaro.org> Fixes: 4acb20b46214 ("net: socionext: different approach on DMA") Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Wed, 24 Apr 2019 00:52:20 +0000 (10:52 +1000)]
Revert "drm/virtio: drop prime import/export callbacks"
This patch does more harm than good, as it breaks both Xwayland and
gnome-shell with X11.
Xwayland requires DRI3 & DRI3 requires PRIME.
X11 crash for obscure double-free reason which are hard to debug
(starting X11 by hand doesn't trigger the crash).
I don't see an apparent problem implementing those stub prime
functions, they may return an error at run-time, and it seems to be
handled fine by GNOME at least.
This brings back the original problem, but it's better than regressions.]
Fixes: b318e3ff7ca065d6b107e424c85a63d7a6798a ("drm/virtio: drop prime import/export callbacks") Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
This commit is documented to break userspace X.org modesetting driver in certain configurations.
The X.org modesetting userspace driver is broken. No fixes are available yet. In order for this patch to be applied it either needs a config option or a workaround developed.
This has been reported a few times, saying it's a userspace problem is clearly against the regression rules.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109806 Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: <stable@vger.kernel.org> # v3.19+
Merge tag 'nfsd-5.1-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
"Fix miscellaneous nfsd bugs, in NFSv4.1 callbacks, NFSv4.1
lock-notification callbacks, NFSv3 readdir encoding, and the
cache/upcall code"
* tag 'nfsd-5.1-1' of git://linux-nfs.org/~bfields/linux:
nfsd: wake blocked file lock waiters before sending callback
nfsd: wake waiters blocked on file_lock before deleting it
nfsd: Don't release the callback slot unless it was actually held
nfsd/nfsd3_proc_readdir: fix buffer count and page pointers
sunrpc: don't mark uninitialised items as VALID.
Merge tag 'syscalls-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull syscall numbering updates from Arnd Bergmann:
"arch: add pidfd and io_uring syscalls everywhere
This comes a bit late, but should be in 5.1 anyway: we want the newly
added system calls to be synchronized across all architectures in the
release.
I hope that in the future, any newly added system calls can be added
to all architectures at the same time, and tested there while they are
in linux-next, avoiding dependencies between the architecture
maintainer trees and the tree that contains the new system call"
* tag 'syscalls-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
arch: add pidfd and io_uring syscalls everywhere
We missed two places that i_wrbuffer_ref_head, i_wr_ref, i_dirty_caps
and i_flushing_caps may change. When they are all zeros, we should free
i_head_snapc.
Jeff Layton [Mon, 15 Apr 2019 16:00:42 +0000 (12:00 -0400)]
ceph: handle the case where a dentry has been renamed on outstanding req
It's possible for us to issue a lookup to revalidate a dentry
concurrently with a rename. If done in the right order, then we could
end up processing dentry info in the reply that no longer reflects the
state of the dentry.
If req->r_dentry->d_name differs from the one in the trace, then just
ignore the trace in the reply. We only need to do this however if the
parent's i_rwsem is not held.
Jeff Layton [Mon, 15 Apr 2019 16:00:42 +0000 (12:00 -0400)]
ceph: only use d_name directly when parent is locked
Ben reported tripping the BUG_ON in create_request_message during some
performance testing. Analysis of the vmcore showed that the length of
the r_dentry->d_name string changed after we allocated the buffer, but
before we encoded it.
build_dentry_path returns pointers to d_name in the common case of
non-snapped dentries, but this optimization isn't safe unless the parent
directory is locked. When it isn't, have the code make a copy of the
d_name while holding the d_lock.
Cc: stable@vger.kernel.org Reported-by: Ben England <bengland@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Daniel Gomez [Mon, 22 Apr 2019 19:08:04 +0000 (21:08 +0200)]
spi: ST ST95HF NFC: declare missing of table
Add missing <of_device_id> table for SPI driver relying on SPI
device match since compatible is in a DT binding or in a DTS.
Before this patch:
modinfo drivers/nfc/st95hf/st95hf.ko | grep alias
alias: spi:st95hf
After this patch:
modinfo drivers/nfc/st95hf/st95hf.ko | grep alias
alias: spi:st95hf
alias: of:N*T*Cst,st95hfC*
alias: of:N*T*Cst,st95hf
Reported-by: Javier Martinez Canillas <javier@dowhile0.org> Signed-off-by: Daniel Gomez <dagmcr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Gomez [Mon, 22 Apr 2019 19:08:03 +0000 (21:08 +0200)]
spi: Micrel eth switch: declare missing of table
Add missing <of_device_id> table for SPI driver relying on SPI
device match since compatible is in a DT binding or in a DTS.
Before this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
After this patch:
modinfo drivers/net/phy/spi_ks8995.ko | grep alias
alias: spi:ksz8795
alias: spi:ksz8864
alias: spi:ks8995
alias: of:N*T*Cmicrel,ksz8795C*
alias: of:N*T*Cmicrel,ksz8795
alias: of:N*T*Cmicrel,ksz8864C*
alias: of:N*T*Cmicrel,ksz8864
alias: of:N*T*Cmicrel,ks8995C*
alias: of:N*T*Cmicrel,ks8995
Reported-by: Javier Martinez Canillas <javier@dowhile0.org> Signed-off-by: Daniel Gomez <dagmcr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
The EFI stub is entered with the caches and MMU enabled by the
firmware, and once the stub is ready to hand over to the decompressor,
we clean and disable the caches.
The cache clean routines use CP15 barrier instructions, which can be
disabled via SCTLR. Normally, when using the provided cache handling
routines to enable the caches and MMU, this bit is enabled as well.
However, but since we entered the stub with the caches already enabled,
this routine is not executed before we call the cache clean routines,
resulting in undefined instruction exceptions if the firmware never
enabled this bit.
So set the bit explicitly in the EFI entry code, but do so in a way that
guarantees that the resulting code can still run on v6 cores as well
(which are guaranteed to have CP15 barriers enabled)
Cc: <stable@vger.kernel.org> # v4.9+ Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled
When CONFIG_ARM_MPU is not defined, the base address of v7M SCB register
is not initialized with correct value. This prevents enabling I/D caches
when the L1 cache poilcy is applied in kernel.
Fixes: 3c24121039c9da14692eb48f6e39565b28c0f3cf ("ARM: 8756/1: NOMMU: Postpone MPU activation till __after_proc_init") Signed-off-by: Tigran Tadevosyan <tigran.tadevosyan@arm.com> Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Russell King [Tue, 23 Apr 2019 16:09:38 +0000 (17:09 +0100)]
ARM: fix function graph tracer and unwinder dependencies
Naresh Kamboju recently reported that the function-graph tracer crashes
on ARM. The function-graph tracer assumes that the kernel is built with
frame pointers.
We explicitly disabled the function-graph tracer when building Thumb2,
since the Thumb2 ABI doesn't have frame pointers.
We recently changed the way the unwinder method was selected, which
seems to have made it more likely that we can end up with the function-
graph tracer enabled but without the kernel built with frame pointers.
Fix up the function graph tracer dependencies so the option is not
available when we have no possibility of having frame pointers, and
adjust the dependencies on the unwinder option to hide the non-frame
pointer unwinder options if the function-graph tracer is enabled.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>