Stop printing a (ratelimited) kernel message for each instance of an
unimplemented syscall being called. Userland making an unimplemented
syscall is not necessarily misbehaviour and is to be expected when a
current userland runs on an older kernel. Also, the current message
looks scary to users but does not actually indicate a real problem, nor
does it help them narrow down the cause. Just rely on sys_ni_syscall() to return
-ENOSYS.
Cc: <stable@vger.kernel.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Weiser <michael.weiser@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
There is a race condition between the finish_unlinks->finish_urb() path
and usb_kill_urb() in the OHCI controller case. finish_urb() calls
spin_unlock(&ohci->lock) before the usb_hcd_giveback_urb() call; if
usb_kill_urb() is called for another endpoint during this window, a new
ed is added at the beginning of ed_rm_list for unlinking, and
ed_rm_list then points to the newly added ed.
When finish_urb() completes in finish_unlinks() and ed->td_list becomes
empty, the code below (in the finish_unlinks() function) runs:
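(reconstructed from finish_unlinks() in ohci-q.c of that era; treat it
as illustrative rather than verbatim)

    if (list_empty(&ed->td_list)) {
        *last = ed->ed_next;
        ed->ed_next = NULL;
    } else if (ohci->rh_state == OHCI_RH_RUNNING) {
        *last = ed->ed_next;
        ed->ed_next = NULL;
        ed_schedule(ohci, ed);
    }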
The *last = ed->ed_next assignment makes ed_rm_list point to
ed->ed_next, and the ed previously added by usb_kill_urb() is left
unreferenced by ed_rm_list. This causes usb_kill_urb() to hang forever,
waiting for finish_unlinks() to remove the added ed from ed_rm_list.
The main reason for the hang in this race condition is that
usb_kill_urb() adds and removes the ed at the beginning of ed_rm_list
while *last is modified later in finish_unlinks().
As suggested by Alan Stern, the solution for proper handling of
ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
any URBs. Then at the end, we can add ed back to the list if necessary.
This properly handles the updated ohci->ed_rm_list in usb_kill_urb().
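A minimal sketch of the reworked loop in finish_unlinks(), paraphrased
rather than quoted from the patch:

    for (ed = ohci->ed_rm_list; ed != NULL; ed = ohci->ed_rm_list) {
        /* take ed off the list up front, so a concurrent
         * usb_kill_urb() cannot be left pointing past it */
        ohci->ed_rm_list = ed->ed_next;
        ed->ed_next = NULL;

        /* ... give back this ed's URBs here; ohci->lock may be
         * dropped and reacquired in the process ... */

        /* re-add the ed at the end only if it still has work */
        if (!list_empty(&ed->td_list)) {
            ed->ed_next = ohci->ed_rm_list;
            ohci->ed_rm_list = ed;
        }
    }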
Fixes: 977dcfdc6031 ("USB: OHCI: don't lose track of EDs when a controller dies") Acked-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> Signed-off-by: Aman Deep <aman.deep@samsung.com> Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Running io_watchdog_func() while ohci_urb_enqueue() is running can
cause a race condition where ohci->prev_frame_no is corrupted and the
watchdog can mis-detect the following error:
ohci-platform 664a0800.usb: frame counter not updating; disabled
ohci-platform 664a0800.usb: HC died; cleaning up
Specifically, the following scenario causes the race condition:
1. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
and enters the critical section
2. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
returns false
3. ohci_urb_enqueue() sets ohci->prev_frame_no to a frame number
read by ohci_frame_no(ohci)
4. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
5. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
flags) and exits the critical section
6. Later, ohci_urb_enqueue() is called
7. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
and enters the critical section
8. The timer scheduled on step 4 expires and io_watchdog_func() runs
9. io_watchdog_func() calls spin_lock_irqsave(&ohci->lock, flags)
and waits on it because ohci_urb_enqueue() is already in the
critical section on step 7
10. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
returns false
11. ohci_urb_enqueue() sets ohci->prev_frame_no to a new frame number
read by ohci_frame_no(ohci), because the frame number advanced
between steps 3 and 6
12. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
13. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
flags) and exits the critical section, then wakes up
io_watchdog_func(), which has been waiting since step 9
14. io_watchdog_func() enters the critical section
15. io_watchdog_func() calls ohci_frame_no(ohci) and sets the frame_no
variable to the frame number
16. io_watchdog_func() compares frame_no and ohci->prev_frame_no
In step 16, because this invocation of io_watchdog_func() was scheduled
in step 4, the frame number in ohci->prev_frame_no is expected to be
the one set in step 3. However, ohci->prev_frame_no was overwritten in
step 11. Because step 16 executes soon after step 11, the frame number
may not have advanced, so ohci->prev_frame_no can equal frame_no and
the watchdog falsely concludes that the frame counter is not updating.
To address the above scenario, this patch introduces a special sentinel
value, IO_WATCHDOG_OFF, and sets ohci->prev_frame_no to this value when
the watchdog is neither pending nor running. When ohci_urb_enqueue()
schedules the watchdog (steps 4 and 12 above), it compares
ohci->prev_frame_no against IO_WATCHDOG_OFF so that ohci->prev_frame_no
is not overwritten while io_watchdog_func() is pending or running.
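A rough sketch of the resulting logic (the sentinel value and exact
conditions follow the idea of the patch, not its verbatim text):

    #define IO_WATCHDOG_OFF 0xffffff00  /* sentinel: watchdog idle */

    /* in ohci_urb_enqueue(), under ohci->lock: arm the watchdog only
     * when it is idle, so steps 11 and 12 above can no longer clobber
     * prev_frame_no behind io_watchdog_func()'s back */
    if (ohci->prev_frame_no == IO_WATCHDOG_OFF) {
        ohci->prev_frame_no = ohci_frame_no(ohci);
        mod_timer(&ohci->io_watchdog,
                jiffies + IO_WATCHDOG_DELAY);
    }

    /* io_watchdog_func() resets prev_frame_no to IO_WATCHDOG_OFF
     * when it decides not to re-arm itself */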
Signed-off-by: Shigeru Yoshida <Shigeru.Yoshida@windriver.com> Signed-off-by: Haiqing Bai <Haiqing.Bai@windriver.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
The control channel calls registered callbacks when control messages
such as XDomain protocol messages are received. The control channel
handling is done in a worker running on the system workqueue, which
means the networking driver can't run its teardown flow (sending a
disconnect request and waiting for the reply) from that same worker.
Otherwise the reply is never received, as the work is already running,
and the operation times out.
To fix this, run the ThunderboltIP disconnect flow asynchronously once
the ThunderboltIP logout message is received.
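A minimal sketch of the pattern (function and work-item names here are
hypothetical, not taken from the driver):

    /* running in the control channel worker: do not tear down
     * synchronously, since the reply would be processed by this very
     * worker; defer the disconnect to another context instead */
    static void tbnet_handle_logout(struct tbnet *net)
    {
        queue_work(system_long_wq, &net->disconnect_work);
    }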
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
When suspending to mem or disk the Thunderbolt controller typically goes
down as well, tearing down the connection automatically. However, when
suspend to idle is used this does not happen so we need to make sure the
connection is properly disconnected before it can be re-established
during resume.
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
We've run into a problem where our device is attached
to a Virtual Machine and the use of the new pci_set_vpd_size()
API doesn't help. The VM kernel has been informed that
the accesses are okay, but all of the actual VPD Capability
Accesses are trapped down into the KVM Hypervisor where it
goes ahead and imposes the silent denials.
The right idea is to follow kernel.org
commit 1c7de2b4ff88 ("PCI: Enable access to non-standard VPD for
Chelsio devices (cxgb3)"), authored by Alexey Kardashevskiy, which
establishes a PCI quirk for our T3-based adapters. This commit
extends that PCI quirk to cover Chelsio T4 devices and later.
The advantage of this approach is that the VPD Size gets set early
in the Base OS/Hypervisor Boot and doesn't require that the cxgb4
driver even be available in the Base OS/Hypervisor. Thus PF4 can
be exported to a Virtual Machine and everything should work.
Fixes: 67e658794ca1 ("cxgb4: Set VPD size so we can read both VPD structures") Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Commit 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading
GIC_SH_MASK*") removed the read of the hardware mask register when
handling shared interrupts, instead using the driver's shadow pcpu_masks
entry as the effective mask. Unfortunately this did not take account of
the write to pcpu_masks during gic_shared_irq_domain_map, which
effectively unmasks the interrupt early. If an interrupt is asserted,
gic_handle_shared_int decodes and processes the interrupt even though it
has not yet been unmasked via gic_unmask_irq, which also sets the
appropriate bit in pcpu_masks.
On the MIPS Boston board, when a console command line of
"console=ttyS0,115200n8r" is passed, the modem status IRQ is enabled in
the UART, which is immediately raised to the GIC. The interrupt has been
mapped, but no handler has yet been registered, nor is it expected to be
unmasked. However, the write to pcpu_masks in gic_shared_irq_domain_map
has effectively unmasked it, resulting in endless reports of:
To fix this, just remove the write to pcpu_masks in
gic_shared_irq_domain_map. The existing write in gic_unmask_irq is the
correct place for what is now the effective unmasking.
Cc: stable@vger.kernel.org Fixes: 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*") Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Reviewed-by: Paul Burton <paul.burton@mips.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
A DMB instruction can be used to ensure the relative order of only
memory accesses before and after the barrier. Since writes to system
registers are not memory operations, a DMB barrier is not sufficient
for observability of memory accesses that occur before ICC_SGI1R_EL1
writes.
A DSB instruction ensures that no instructions that appear in program
order after the DSB instruction, can execute until the DSB instruction
has completed.
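In the GIC v3 driver this amounts to strengthening the barrier before
the SGI register writes, along these lines (paraphrased; on arm64
wmb() expands to a DSB, while smp_wmb() is only a DMB):

    /*
     * Ensure that stores to Normal memory are visible to the other
     * CPUs before issuing the IPI.
     */
    wmb();      /* was: smp_wmb() */
    gic_send_sgi(cluster_id, tlist, irq);   /* writes ICC_SGI1R_EL1 */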
Cc: stable@vger.kernel.org Acked-by: Will Deacon <will.deacon@arm.com>, Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
This fixes a compile problem of some user space applications by not
including linux/libc-compat.h in uapi/if_ether.h.
linux/libc-compat.h checks which "features" the header files included
from the libc provide, so that the Linux kernel uapi header files
define no conflicting structures and enums. If a user application mixes
kernel headers and libc headers, it can happen that linux/libc-compat.h
gets included too early, before all the other libc headers are
included. Then linux/libc-compat.h cannot prevent all the
redefinitions, and we run into compile problems.
This patch removes the include of linux/libc-compat.h from
uapi/if_ether.h to fix the recently introduced case, but not all such
cases, as that is more or less impossible.
It is no problem to do the check directly in the if_ether.h file rather
than in libc-compat.h, as this does not need any fancy glibc header
detection: glibc never provided struct ethhdr and should define
__UAPI_DEF_ETHHDR by itself when it starts providing it.
The following test program did not compile correctly any more:
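(The program itself is not reproduced here; a hypothetical reproducer
of the same shape, assuming a libc such as musl that defines struct
ethhdr in its own netinet/if_ether.h, would be:)

    #include <netinet/if_ether.h>
    #include <linux/if_ether.h>

    int main(void)
    {
        /* fails at compile time before the fix:
         * struct ethhdr is defined by both headers */
        return 0;
    }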
Commit f7f99100d8d9 ("mm: stop zeroing memory during allocation in
vmemmap") broke Xen pv domains in some configurations, as the "Pinned"
information in struct page of early page tables could get lost.
This will lead to the kernel trying to write directly into the page
tables instead of asking the hypervisor to do so. The result is a crash
like the following:
Avoid this problem by not deferring struct page initialization when
running as Xen pv guest.
Pavel said:
: This is unique for Xen, so this particular issue won't affect other
: configurations. I am going to investigate if there is a way to
: re-enable deferred page initialization on xen guests.
[akpm@linux-foundation.org: explicitly include xen.h] Link: http://lkml.kernel.org/r/20180216154101.22865-1-jgross@suse.com Fixes: f7f99100d8d95d ("mm: stop zeroing memory during allocation in vmemmap") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Steven Sistare <steven.sistare@oracle.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: <stable@vger.kernel.org> [4.15.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
It was reported by Sergey Senozhatsky that if THP (Transparent Huge
Page) and frontswap (via zswap) are both enabled, when memory goes low
so that swap is triggered, segfault and memory corruption will occur in
random user space applications, as follows:
After bisection, it was found the first bad commit is bd4c82c22c36 ("mm,
THP, swap: delay splitting THP after swapped out").
The root cause is as follows:
When pages are written to the swap device during swap-out in
swap_writepage(), zswap (frontswap) tries to compress them to improve
performance. But zswap (frontswap) treats a THP as a normal page, so
only the head page is saved. After swap-in, the tail pages are not
restored to their original contents, causing memory corruption in the
applications.
This is fixed by refusing to save the page in the frontswap store
functions if the page is a THP, so that the THP is swapped out to the
swap device as a whole.
Another choice is to split the THP when frontswap is enabled. But it
was found that frontswap enablement isn't flexible enough. For example,
if CONFIG_ZSWAP=y (it cannot be a module), frontswap is enabled even if
zswap itself isn't enabled.
Frontswap has multiple backends. To make it easy for one backend to
enable THP support, the THP check is put in the backend frontswap store
functions instead of in the general interfaces.
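A minimal sketch of the check, assuming the zswap backend (the other
frontswap backends get an equivalent test):

    static int zswap_frontswap_store(unsigned type, pgoff_t offset,
                     struct page *page)
    {
        /* THP isn't supported: storing only the head page would
         * corrupt the tail pages at swap-in, so reject here and
         * let the whole THP go to the swap device instead */
        if (PageTransHuge(page))
            return -EINVAL;

        /* ... normal compressed-store path continues ... */
    }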
Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out") Signed-off-by: "Huang, Ying" <ying.huang@intel.com> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> [put THP checking in backend] Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Shaohua Li <shli@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Shakeel Butt <shakeelb@google.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: <stable@vger.kernel.org> [4.14] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
GCC-8 shows a warning for the x86 oprofile code that copies per-CPU
data from CPU 0 to all other CPUs, which when building a non-SMP
kernel turns into a memcpy() with identical source and destination
pointers:
arch/x86/oprofile/nmi_int.c: In function 'mux_clone':
arch/x86/oprofile/nmi_int.c:285:2: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
memcpy(per_cpu(cpu_msrs, cpu).multiplex,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
per_cpu(cpu_msrs, 0).multiplex,
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sizeof(struct op_msr) * model->num_virt_counters);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/oprofile/nmi_int.c: In function 'nmi_setup':
arch/x86/oprofile/nmi_int.c:466:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
arch/x86/oprofile/nmi_int.c:470:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
I have analyzed a number of such warnings now: some are valid and the
GCC warning is welcome. Others turned out to be false positives, and
GCC was changed to not warn about those any more. This is a corner case
that is a false positive, but the GCC developers feel it's better to
keep warning about it.
In this case, it seems best to work around it by telling GCC
a little more clearly that this code path is never hit with
an IS_ENABLED() configuration check.
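The workaround pattern, paraphrased rather than quoted from the patch:
on !SMP kernels per_cpu(cpu_msrs, cpu) and per_cpu(cpu_msrs, 0) are the
same object, so let GCC prove the copy is never reached there:

    if (!IS_ENABLED(CONFIG_SMP))
        return;

    memcpy(per_cpu(cpu_msrs, cpu).multiplex,
           per_cpu(cpu_msrs, 0).multiplex,
           sizeof(struct op_msr) * model->num_virt_counters);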
Cc: stable, as we also want old kernels to build cleanly with GCC-8.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Jessica Yu <jeyu@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Sebor <msebor@gcc.gnu.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rric@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: oprofile-list@lists.sf.net Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180220205826.2008875-1-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095 Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
When a irq vector is replaced, then the previous vector is normally
released when the first interrupt happens on the new vector. If the target
CPU of the previous vector is already offline when the new vector is
installed, then the previous vector is silently discarded, which leads to
accounting issues causing suspend failures and other problems.
Adjust the logic so that the previous vector is freed in the underlying
matrix allocator to ensure that the accounting stays correct.
Fixes: 69cde0004a4b ("x86/vector: Use matrix allocator for vector assignment") Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180222112316.930791749@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Build testing with LTO found a couple of files that get compiled
differently depending on whether asm/byteorder.h gets included early
enough or not. In particular, include/asm-generic/qrwlock_types.h is
affected by this, but there are probably others as well.
The symptom is a series of LTO link time warnings, including these:
net/netlabel/netlabel_unlabeled.h:223: error: type of 'netlbl_unlhsh_add' does not match original declaration [-Werror=lto-type-mismatch]
int netlbl_unlhsh_add(struct net *net,
net/netlabel/netlabel_unlabeled.c:377: note: 'netlbl_unlhsh_add' was previously declared here
include/net/ipv6.h:360: error: type of 'ipv6_renew_options_kern' does not match original declaration [-Werror=lto-type-mismatch]
ipv6_renew_options_kern(struct sock *sk,
net/ipv6/exthdrs.c:1162: note: 'ipv6_renew_options_kern' was previously declared here
net/core/dev.c:761: note: 'dev_get_by_name_rcu' was previously declared here
struct net_device *dev_get_by_name_rcu(struct net *net, const char *name)
net/core/dev.c:761: note: code may be misoptimized unless -fno-strict-aliasing is used
drivers/gpu/drm/i915/i915_drv.h:3377: error: type of 'i915_gem_object_set_to_wc_domain' does not match original declaration [-Werror=lto-type-mismatch]
i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write);
drivers/gpu/drm/i915/i915_gem.c:3639: note: 'i915_gem_object_set_to_wc_domain' was previously declared here
include/linux/debugfs.h:92:9: error: type of 'debugfs_attr_read' does not match original declaration [-Werror=lto-type-mismatch]
ssize_t debugfs_attr_read(struct file *file, char __user *buf,
fs/debugfs/file.c:318: note: 'debugfs_attr_read' was previously declared here
include/linux/rwlock_api_smp.h:30: error: type of '_raw_read_unlock' does not match original declaration [-Werror=lto-type-mismatch]
void __lockfunc _raw_read_unlock(rwlock_t *lock) __releases(lock);
kernel/locking/spinlock.c:246:26: note: '_raw_read_unlock' was previously declared here
include/linux/fs.h:3308:5: error: type of 'simple_attr_open' does not match original declaration [-Werror=lto-type-mismatch]
int simple_attr_open(struct inode *inode, struct file *file,
fs/libfs.c:795: note: 'simple_attr_open' was previously declared here
All of the above are caused by include/asm-generic/qrwlock_types.h
failing to include asm/byteorder.h after commit e0d02285f16e
("locking/qrwlock: Use 'struct qrwlock' instead of 'struct __qrwlock'")
in linux-4.15.
Similar bugs may or may not exist in older kernels as well, but there
is no easy way to test those with link-time optimizations, and kernels
before 4.14 are harder to fix because they don't have Babu's patch
series.
We had similar issues with CONFIG_ symbols in the past and ended up
always including the configuration headers through linux/kconfig.h. This
works around the issue through that same file, defining either
__BIG_ENDIAN or __LITTLE_ENDIAN depending on CONFIG_CPU_BIG_ENDIAN,
which is now always set on all architectures since commit 4c97a0c8fee3
("arch: define CPU_BIG_ENDIAN for all fixed big endian archs").
Link: http://lkml.kernel.org/r/20180202154104.1522809-2-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Babu Moger <babu.moger@amd.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
adis_probe_trigger() creates a new IIO trigger and requests an
interrupt associated with the trigger. The interrupt uses the generic
iio_trigger_generic_data_rdy_poll() function as its interrupt handler.
Currently the driver initializes some fields of the trigger structure after
the interrupt has been requested. But an interrupt can fire as soon as it
has been requested. This opens up a race condition.
iio_trigger_generic_data_rdy_poll() will access the trigger data structure
and dereference the ops field. If the ops field is not yet initialized this
will result in a NULL pointer deref.
It is not expected that the device generates an interrupt at this point, so
typically this issue did not surface unless e.g. due to a hardware
misconfiguration (wrong interrupt number, wrong polarity, etc.).
But some newer devices from the ADIS family start to generate periodic
interrupts in their power-on reset configuration and unfortunately the
interrupt cannot be masked in the device. This makes the race condition
much more visible and the following crash has been observed occasionally
when booting a system using the ADIS16460.
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = c0004000
[00000008] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-04126-gf9739f0-dirty #257
Hardware name: Xilinx Zynq Platform
task: ef04f640 task.stack: ef050000
PC is at iio_trigger_notify_done+0x30/0x68
LR is at iio_trigger_generic_data_rdy_poll+0x18/0x20
pc : [<c042d868>] lr : [<c042d924>] psr: 60000193
sp : ef051bb8 ip : 00000000 fp : ef106400
r10: c081d80a r9 : ef3bfa00 r8 : 00000087
r7 : ef051bec r6 : 00000000 r5 : ef3bfa00 r4 : ee92ab00
r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : ee97e400
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
Control: 18c5387d Table: 0000404a DAC: 00000051
Process swapper/0 (pid: 1, stack limit = 0xef050210)
[<c042d868>] (iio_trigger_notify_done) from [<c0065b10>] (__handle_irq_event_percpu+0x88/0x118)
[<c0065b10>] (__handle_irq_event_percpu) from [<c0065bbc>] (handle_irq_event_percpu+0x1c/0x58)
[<c0065bbc>] (handle_irq_event_percpu) from [<c0065c30>] (handle_irq_event+0x38/0x5c)
[<c0065c30>] (handle_irq_event) from [<c0068e28>] (handle_level_irq+0xa4/0x130)
[<c0068e28>] (handle_level_irq) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
[<c0064e74>] (generic_handle_irq) from [<c021ab7c>] (zynq_gpio_irqhandler+0xb8/0x13c)
[<c021ab7c>] (zynq_gpio_irqhandler) from [<c0064e74>] (generic_handle_irq+0x24/0x34)
[<c0064e74>] (generic_handle_irq) from [<c0065370>] (__handle_domain_irq+0x5c/0xb4)
[<c0065370>] (__handle_domain_irq) from [<c000940c>] (gic_handle_irq+0x48/0x8c)
[<c000940c>] (gic_handle_irq) from [<c0013e8c>] (__irq_svc+0x6c/0xa8)
To fix this make sure that the trigger is fully initialized before
requesting the interrupt.
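A sketch of the reordering in adis_probe_trigger(), paraphrased from
the fix's intent (exact code may differ):

    adis->trig->dev.parent = &adis->spi->dev;
    adis->trig->ops = &adis_trigger_ops;
    iio_trigger_set_drvdata(adis->trig, adis);

    /* only now is it safe for the handler to run */
    ret = request_irq(adis->spi->irq,
              &iio_trigger_generic_data_rdy_poll,
              IRQF_TRIGGER_RISING,
              indio_dev->name, adis->trig);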
Fixes: ccd2b52f4ac6 ("staging:iio: Add common ADIS library") Reported-by: Robin Getz <Robin.Getz@analog.com> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
If no iio buffer has been set up and poll is called, return 0.
Without this check there will be a NULL pointer dereference when
calling poll on an iio driver without an iio buffer.
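A sketch of the guard in iio_buffer_poll(), assuming the surrounding
code of that era:

    static unsigned int iio_buffer_poll(struct file *filp,
                        struct poll_table_struct *wait)
    {
        struct iio_dev *indio_dev = filp->private_data;
        struct iio_buffer *rb = indio_dev->buffer;

        /* no buffer set up: returning 0 here avoids
         * dereferencing rb below */
        if (!indio_dev->info || rb == NULL)
            return 0;

        poll_wait(filp, &rb->pollq, wait);
        /* ... data-available check follows ... */
    }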
Cc: stable@vger.kernel.org Signed-off-by: Stefan Windfeldt-Prytz <stefan.windfeldt@axis.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Functions for triggered buffer support are needed by this module.
If they do not happen to be selected by another driver, an error is
thrown out while linking.
Add a select of IIO_BUFFER and IIO_TRIGGERED_BUFFER in the Kconfig file.
Signed-off-by: Andreas Klinger <ak@it-klinger.de> Fixes: a83195937151 ("iio: srf08: add triggered buffer support") Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Error handling in stm32h7_adc_enable routine doesn't unwind enable
sequence correctly. ADEN can only be cleared by hardware (e.g. by
writing one to ADDIS).
It's also better to clear ADRDY just after it's been set by hardware.
Fixes: 95e339b6e85d ("iio: adc: stm32: add support for STM32H7") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
==================================================================
BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs+0x6f2/0x8c0
Read of size 4 at addr ffff88006476a198 by task syzkaller697701/265
There is no matching lock for this mutex. Git history suggests this is
just a missed remnant from an earlier version of the function before
this locking was moved into uverbs_free_xrcd.
Originally this lock was protecting the xrcd_table_delete()
=====================================
WARNING: bad unlock balance detected!
4.15.0+ #87 Not tainted
-------------------------------------
syzkaller223405/269 is trying to release lock (&uverbs_dev->xrcd_tree_mutex) at:
[<00000000b8703372>] ib_uverbs_close_xrcd+0x195/0x1f0
but there are no more locks to release!
other info that might help us debug this:
1 lock held by syzkaller223405/269:
#0: (&uverbs_dev->disassociate_srcu){....}, at: [<000000005af3b960>] ib_uverbs_write+0x265/0xef0
The command number is not bounds-checked against the command mask
before it is shifted, resulting in a UBSAN hit. This does not cause a
malfunction, since the command number is eventually bounds-checked, but
we can make this UBSAN-clean by moving the bounds check to before the
mask check.
The race is between lookup_get_idr_uobject() and
uverbs_idr_remove_uobj() -> uverbs_uobject_put().
We deliberately do not call synchronize_rcu() after the idr_remove() in
uverbs_idr_remove_uobj() for performance reasons; instead we call
kfree_rcu() during uverbs_uobject_put().
However, this means we can obtain pointers to uobjs that have
already been released and must protect against krefing them
using kref_get_unless_zero().
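A sketch of the resulting lookup, with field names assumed rather than
quoted:

    rcu_read_lock();
    /* the uobj memory stays valid under RCU (kfree_rcu()), but its
     * refcount may already be zero if removal raced with us */
    uobj = idr_find(&ufile->idr, id);
    if (uobj && !kref_get_unless_zero(&uobj->ref))
        uobj = NULL;    /* being destroyed; treat as not found */
    rcu_read_unlock();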
==================================================================
BUG: KASAN: use-after-free in copy_ah_attr_from_uverbs.isra.2+0x860/0xa00
Read of size 4 at addr ffff88005fda1ac8 by task syz-executor2/441
Memory state around the buggy address:
 ffff88005fda1980: fc fc fb fb fb fb fc fc fb fb fb fb fc fc fb fb
 ffff88005fda1a00: fb fb fc fc fb fb fb fb fc fc 00 00 00 00 fc fc
 ffff88005fda1a80: fb fb fb fb fc fc fb fb fb fb fc fc fb fb fb fb
 ffff88005fda1b00: fc fc 00 00 00 00 fc fc fb fb fb fb fc fc fb fb
 ffff88005fda1b80: fb fb fc fc fb fb fb fb fc fc fb fb fb fb fc fc
==================================================================
Cc: syzkaller <syzkaller@googlegroups.com> Cc: <stable@vger.kernel.org> # 4.11 Fixes: 3832125624b7 ("IB/core: Add support for idr types") Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
At CPU hotunplug the corresponding per cpu matrix allocator is shut down and
the allocated interrupt bits are discarded under the assumption that all
allocated bits have been either migrated away or shut down through the
managed interrupts mechanism.
This is not true because interrupts which are not started up might have
a vector allocated on the outgoing CPU. When the interrupt is started
up later, or is completely shut down and freed, the allocated vector is
handed back, triggering warnings or causing accounting issues which
result in suspend failures and other problems.
Change the CPU hotplug mechanism of the matrix allocator so that the
remaining allocations at unplug time are preserved and global accounting at
hotplug is correctly readjusted to take the dormant vectors into account.
Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator") Reported-by: Yuriy Vostrikov <delamonpansie@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Yuriy Vostrikov <delamonpansie@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180222112316.849980972@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Some other drivers may be waiting for our extcon to show up, exiting
their probe methods with -EPROBE_DEFER until it appears.
These drivers will typically get the cable state directly after getting
the extcon. This commit changes the int3496 code to wait for the
initial processing of the id-pin to complete before exiting probe()
with 0, which causes devices waiting on the deferred probe to get
reprobed.
This fixes a race where the initial work might still be running while other
drivers were already calling extcon_get_state().
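A sketch of the end of probe(), assuming the driver's existing delayed
work item for id-pin processing:

    /* process the id-pin now so that we start with the right status,
     * and so consumers probed after us see a settled state */
    queue_delayed_work(system_wq, &data->work, 0);
    flush_delayed_work(&data->work);

    return 0;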
Fixes: 2f556bdb9f2e ("extcon: int3496: Add Intel INT3496 ACPI ... driver") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
If there is a blacklisted certificate in a SignerInfo's certificate
chain, then pkcs7_verify_sig_chain() sets sinfo->blacklisted and returns
0. But, pkcs7_verify() fails to handle this case appropriately, as it
actually continues on to the line 'actual_ret = 0;', indicating that the
SignerInfo has passed verification. Consequently, PKCS#7 signature
verification ignores the certificate blacklist.
Fix this by not considering blacklisted SignerInfos to have passed
verification.
Also fix the function comment with regards to when 0 is returned.
Fixes: 03bb79315ddc ("PKCS#7: Handle blacklisted certificates") Cc: <stable@vger.kernel.org> # v4.12+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
When pkcs7_verify_sig_chain() is building the certificate chain for a
SignerInfo using the certificates in the PKCS#7 message, it is passing
the wrong arguments to public_key_verify_signature(). Consequently,
when the next certificate is supposed to be used to verify the previous
certificate, the next certificate is actually used to verify itself.
An attacker can use this bug to create a bogus certificate chain that
has no cryptographic relationship between the beginning and end.
Fortunately I couldn't quite find a way to use this to bypass the
overall signature verification, though it comes very close. Here's the
reasoning: due to the bug, every certificate in the chain beyond the
first actually has to be self-signed (where "self-signed" here refers to
the actual key and signature; an attacker might still manipulate the
certificate fields such that the self_signed flag doesn't actually get
set, and thus the chain doesn't end immediately). But to pass trust
validation (pkcs7_validate_trust()), either the SignerInfo or one of the
certificates has to actually be signed by a trusted key. Since only
self-signed certificates can be added to the chain, the only way for an
attacker to introduce a trusted signature is to include a self-signed
trusted certificate.
But, when pkcs7_validate_trust_one() reaches that certificate, instead
of trying to verify the signature on that certificate, it will actually
look up the corresponding trusted key, which will succeed, and then try
to verify the *previous* certificate, which will fail. Thus, disaster
is narrowly averted (as far as I could tell).
Fixes: 6c2dc5ae4ab7 ("X.509: Extract signature digest and make self-signed cert checks earlier") Cc: <stable@vger.kernel.org> # v4.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
The asymmetric key type allows an X.509 certificate to be added even if
its signature's hash algorithm is not available in the crypto API. In
that case 'payload.data[asym_auth]' will be NULL. But the key
restriction code failed to check for this case before trying to use the
signature, resulting in a NULL pointer dereference in
key_or_keyring_common() or in restrict_link_by_signature().
Fix this by returning -ENOPKG when the signature is unsupported.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled and
keyctl has support for the 'restrict_keyring' command:
Fixes: a511e1af8b12 ("KEYS: Move the point of trust determination to __key_link()") Cc: <stable@vger.kernel.org> # v4.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
The X.509 parser mishandles the case where the certificate's signature's
hash algorithm is not available in the crypto API. In this case,
x509_get_sig_params() doesn't allocate the cert->sig->digest buffer;
this part seems to be intentional. However,
public_key_verify_signature() is still called via
x509_check_for_self_signed(), which triggers the 'BUG_ON(!sig->digest)'.
Fix this by making public_key_verify_signature() return -ENOPKG if the
hash buffer has not been allocated.
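A sketch of the added check (placement paraphrased):

    int public_key_verify_signature(const struct public_key *pkey,
                    const struct public_key_signature *sig)
    {
        /* the hash algorithm was unavailable at parse time, so no
         * digest was computed; report that instead of hitting the
         * BUG_ON() */
        if (!sig->digest)
            return -ENOPKG;
        /* ... */
    }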
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled:
We were leaving them in the power-on state (or the state the firmware
had set up for some client, if we were taking over from it). The
boot state was 30 core clocks, when we actually want to sample some
time after that (to make sure that the new input bit has actually
arrived).
Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
One I2C bus on my Atom E3845 board has been broken since 4.9.
It has two devices, both declared by ACPI and with built-in drivers.
There are two back-to-back transactions originating from the kernel, one
targeting each device. The first transaction works, the second one locks
up the I2C controller. The controller never recovers.
These kernel logs show up whenever an I2C transaction is attempted after
this failure.
i2c-designware-pci 0000:00:18.3: timeout in disabling adapter
i2c-designware-pci 0000:00:18.3: timeout waiting for bus ready
Waiting for the I2C controller status to indicate that it is enabled
before programming it fixes the issue.
I have tested this patch on 4.14 and 4.15.
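The change itself is small; assuming the helper names in the designware
core of that era, it is essentially:

    /* in the transfer-init path */
    __i2c_dw_enable_and_wait(dev, true);  /* was: __i2c_dw_enable(dev, true) */

where the _and_wait variant polls the enable status register until the
controller reports the requested state.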
Fixes: commit 2702ea7dbec5 ("i2c: designware: wait for disable/enable only if necessary") Cc: linux-stable <stable@vger.kernel.org> #4.13+ Signed-off-by: Ben Gardner <gardner.ben@gmail.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
net/mac80211/cfg.c: In function 'cfg80211_beacon_dup':
net/mac80211/cfg.c:2896:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]
From the context, I conclude that we want to copy from beacon into
new_beacon, as we do in the rest of the function.
Cc: stable@vger.kernel.org Fixes: 73da7d5bab79 ("mac80211: add channel switch command and beacon callbacks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
MIPS' struct compat_flock doesn't match the 32-bit struct flock, as it
has an extra short __unused before pad[4], which combined with alignment
increases the size to 40 bytes compared with struct flock's 36 bytes.
Since commit 8c6657cb50cb ("Switch flock copyin/copyout primitives to
copy_{from,to}_user()"), put_compat_flock() writes the full compat_flock
struct to userland, which results in corruption of the userland word
after the struct flock when running 32-bit userlands on 64-bit kernels.
This was observed to cause a bus error exception when starting Firefox
on Debian 8 (Jessie).
Reported-by: Peter Mamonov <pmamonov@gmail.com> Signed-off-by: James Hogan <jhogan@kernel.org> Tested-by: Peter Mamonov <pmamonov@gmail.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 4.13+
Patchwork: https://patchwork.linux-mips.org/patch/18646/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
The fcp_rsp_info structure as defined in the FC spec has an initial
3-byte reserved field. The ibmvfc driver mistakenly defined this field
as 4 bytes, resulting in the rsp_code field being defined in what
should be
the start of the second reserved field and thus always being reported as
zero by the driver.
Ideally, we should wire ibmvfc up with libfc for the sake of code
deduplication, and ease of maintaining standardized structures in a
single place. However, for now simply fixup the definition in ibmvfc for
backporting to distros on older kernels. Wiring up with libfc will be
done in a followup patch.
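For reference, libfc lays the field out per the spec in
include/scsi/fc/fc_fcp.h, roughly:

    struct fcp_resp_rsp_info {
        __u8    _fr_resvd[3];   /* reserved: 3 bytes, not 4 */
        __u8    rsp_code;       /* response code */
    };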
Cc: <stable@vger.kernel.org> Reported-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Xtensa memory initialization code frees high memory pages without
checking whether they are in the reserved memory regions or not. That
results in an invalid value of totalram_pages and duplicate page usage by
CMA and highmem. It produces a bunch of BUGs at startup looking like
this:
The MIPS %.its.S compiler command did not define __ASSEMBLY__, which meant
when compiler_types.h was added to kconfig.h, unexpected things appeared
(e.g. struct declarations) which should not have been present. As done in
the general %.S compiler command, __ASSEMBLY__ is now included here too.
The failure was:
Error: arch/mips/boot/vmlinux.gz.its:201.1-2 syntax error
FATAL ERROR: Unable to parse input tree
/usr/bin/mkimage: Can't read arch/mips/boot/vmlinux.gz.itb.tmp: Invalid argument
/usr/bin/mkimage Can't add hashes to FIT blob
Reported-by: kbuild test robot <lkp@intel.com> Fixes: 28128c61e08e ("kconfig.h: Include compiler types to avoid missed struct attributes") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
The header files for some structures could get included in such a way
that struct attributes (specifically __randomize_layout from path.h) would
be parsed as variable names instead of attributes. This could lead to
some instances of a structure being unrandomized, causing nasty GPFs, etc.
This patch makes sure the compiler_types.h header is included in
kconfig.h so that we've always got types and struct attributes defined,
since kconfig.h is included from the compiler command line.
Reported-by: Patrick McLean <chutzpah@gentoo.org> Root-caused-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Ard Biesheuvel [Fri, 23 Feb 2018 18:29:02 +0000 (18:29 +0000)]
arm64: mm: don't write garbage into TTBR1_EL1 register
BugLink: https://bugs.launchpad.net/bugs/1752317
Stable backport commit 173358a49173 ("arm64: kpti: Add ->enable callback
to remap swapper using nG mappings") of upstream commit f992b4dfd58b did
not survive the backporting process unscathed, and ends up writing garbage
into the TTBR1_EL1 register, rather than pointing it to the zero page to
disable translations. Fix that.
Cc: <stable@vger.kernel.org> #v4.14 Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Syzbot reported a possible deadlock in the netfilter area caused by the
rtnl lock, the xt lock and the socket lock being acquired with a
different order on different code paths, leading to the following
backtrace:
======================================================
WARNING: possible circular locking dependency detected
4.15.0+ #301 Not tainted
------------------------------------------------------
syzkaller233489/4179 is trying to acquire lock:
(rtnl_mutex){+.+.}, at: [<0000000048e996fd>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
but task is already holding lock:
(&xt[i].mutex){+.+.}, at: [<00000000328553a2>]
xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041
which lock already depends on the new lock.
===
Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), we already acquire the socket lock in
the innermost scope, where needed. In that commit I forgot to remove
the outermost socket lock from the getsockopt() path; this commit
addresses the issue by dropping it now.
v1 -> v2: fix bad subj, added relevant 'fixes' tag
Reviewed-by: Xin Long <lucien.xin@gmail.com> Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers") Fixes: 202f59afd441 ("netfilter: ipt_CLUSTERIP: do not hold dev") Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope") Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:10 +0000 (17:42 +0800)]
scsi: libsas: notify event PORTE_BROADCAST_RCVD in sas_enable_revalidation()
BugLink: https://bugs.launchpad.net/bugs/1752146
There are two places queuing the disco event DISCE_REVALIDATE_DOMAIN.
One is in sas_porte_broadcast_rcvd(), which uses sas_chain_event() to
queue the event. The other is in sas_enable_revalidation(), which uses
sas_queue_event() to queue the event. We have different work queues for
event and discovery now, so the DISCE_REVALIDATE_DOMAIN event may be
processed in both the event queue and the discovery queue.
Now, since we do synchronous event handling, we cannot do it in the
discovery queue, so we have to trigger a fake broadcast event to
re-trigger the revalidation from the event queue.
Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1689c9367bfaf4b5ff3973f26f5acbff16b63bfb) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:09 +0000 (17:42 +0800)]
scsi: libsas: direct call probe and destruct
BugLink: https://bugs.launchpad.net/bugs/1752146
Commit 87c8331fcf72 ("[SCSI] libsas: prevent domain rediscovery
competing with ata error handling") introduced the disco mutex to
prevent rediscovery competing with ata error handling, and put the
whole revalidation under the mutex. But the rphy add/remove needs to
wait for the error handling, which also grabs the disco mutex. This may
lead to deadlock. So the probe and destruct events were introduced to
do the rphy add/remove asynchronously and outside the lock.
The asynchronously processed workers make the whole discovery process
non-atomic; other events may interrupt the process. For example,
if a loss-of-signal event is inserted before the probe event,
sas_deform_port() is called and the port will be deleted.
And sas_port_delete() may run before the destruct event, but the
port-x:x is the top parent of the end device or expander. This leads to
a kernel WARNING such as:
Make probe and destruct direct calls in the disco and revalidate
functions, but put them outside the lock. The whole discovery or
revalidation won't be interrupted by other events, and the DISCE_PROBE
and DISCE_DESTRUCT events are deleted as a result of the direct calls.
Introduce a new list to destruct the sas_port and put the port deletion
after the destruct. This ensures the right order of destroying the
sysfs kobject and fixes the warning above.
sas_ex_revalidate_domain() has a loop to find all broadcasted
devices, and sometimes we have a chance to find the same expander
twice. Because the sas_port will be deleted at the end of the whole
revalidation process, a sas_port with the same name cannot be added
before then; otherwise sysfs will complain about creating a duplicate
filename. Since the LLDD will send a broadcast for every device change,
we only process one expander's revalidation at a time.
[mkp: kbuild test robot warning]
Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 0558f33c06bb910e2879e355192227a8e8f0219d) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:08 +0000 (17:42 +0800)]
scsi: libsas: use flush_workqueue to process disco events synchronously
BugLink: https://bugs.launchpad.net/bugs/1752146
Now we are processing sas events and discovery events in different
workqueues. It's safe to wait for the discovery events to finish in the
sas event work. Use flush_workqueue() to ensure the disco and
revalidate events are processed synchronously, so that the whole
discovery and revalidation process will not be interrupted by other
events.
Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 517e5153d242cb2dd0a1150d2a7bd6788d501ca9) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:07 +0000 (17:42 +0800)]
scsi: libsas: Use new workqueue to run sas event and disco event
BugLink: https://bugs.launchpad.net/bugs/1752146
Now all libsas works are queued to the scsi host workqueue, including
sas event work posted by the LLDD and sas discovery work, and a sas
hotplug flow may be divided into several works. E.g. when libsas
receives a PORTE_BYTES_DMAED event, we currently process it in the
following steps:
sas_form_port --- run in work in shost workq
sas_discover_domain --- run in another work in shost workq
...
sas_probe_devices --- run in new work in shost workq
We found that during hot-add of a device, libsas may need to run
several works in the same workqueue to add the device to the system.
The process is not atomic, so it may be interrupted by other sas event
works, like PHYE_LOSS_OF_SIGNAL.
This patch is preparation for executing libsas sas events
synchronously. We need to use different workqueues to run sas event and
disco event work; otherwise a work will be blocked waiting for another
chained work in the same workqueue.
Signed-off-by: Yijing Wang <wangyijing@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 93bdbd06b1644ac15aa152e91faefed86cc04937) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:06 +0000 (17:42 +0800)]
scsi: libsas: make the event threshold configurable
BugLink: https://bugs.launchpad.net/bugs/1752146
Add a sysfs attr so that an LLDD can configure the event threshold for
every host. We made an example in hisi_sas; other LLDDs using libsas
can implement it if they want.
Suggested-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Acked-by: John Garry <john.garry@huawei.com> #for hisi_sas part Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 8eea9dd84e450e5262643823691108f2a208a2ac) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:05 +0000 (17:42 +0800)]
scsi: libsas: shut down the PHY if events reached the threshold
BugLink: https://bugs.launchpad.net/bugs/1752146
If the PHY bursts too many events, we will allocate a lot of events for
the worker. This may lead to memory exhaustion.
Dan Williams suggested to shut down the PHY if the events reached the
threshold, because in this case the PHY may have gone into some
erroneous state. Users can re-enable the PHY by sysfs if they want.
We cannot use a fixed memory pool, because if we ran out of events, the
shutdown event and loss-of-signal event would be lost too. The events
still need to be allocated and processed in this case.
Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f12486e06ae87453530f00a6cb49b60ae3fe4551) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Fri, 8 Dec 2017 09:42:04 +0000 (17:42 +0800)]
scsi: libsas: Use dynamic alloced work to avoid sas event lost
BugLink: https://bugs.launchpad.net/bugs/1752146
Now libsas hotplug work is static: every sas event type has its own
static work, and the LLDD driver queues the hotplug work onto
shost->work_q. If the LLDD driver posts a burst of hotplug events to
libsas, the hotplug events may be pending in the workqueue like
shost->work_q
new work[PORTE_BYTES_DMAED] --> |[PHYE_LOSS_OF_SIGNAL][PORTE_BYTES_DMAED] -> processing
|<-------wait worker to process-------->|
In this case, when a new PORTE_BYTES_DMAED event comes in, libsas tries
to queue it to shost->work_q, but since this work is already pending,
it is lost. Finally, libsas deletes the related sas port and sas
devices, while the LLDD driver expects libsas to add the sas port and
devices (per the last sas event).
This patch uses dynamically allocated work to avoid this issue.
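A sketch of the dynamic-allocation pattern (worker and field names here
are hypothetical):

    struct asd_sas_event *ev = kmalloc(sizeof(*ev), GFP_ATOMIC);

    if (!ev)
        return;
    INIT_WORK(&ev->work, sas_event_worker);  /* hypothetical worker */
    ev->phy = phy;
    ev->event = event;
    queue_work(ha->event_q, &ev->work);
    /* the worker kfree()s ev once the event is handled */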
Signed-off-by: Yijing Wang <wangyijing@huawei.com> CC: John Garry <john.garry@huawei.com> CC: Johannes Thumshirn <jthumshirn@suse.de> CC: Ewan Milne <emilne@redhat.com> CC: Christoph Hellwig <hch@lst.de> CC: Tomas Henzl <thenzl@redhat.com> CC: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 1c393b970e0f4070e4376d45f89a2d19a5c895d0) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
chenxiang [Thu, 4 Jan 2018 13:04:33 +0000 (21:04 +0800)]
scsi: libsas: initialize sas_phy status according to response of DISCOVER
BugLink: https://bugs.launchpad.net/bugs/1752146
The status of a SAS PHY is in sas_phy->enabled. There is an issue where
the status of a remote SAS PHY may be initialized incorrectly: if a
remote SAS PHY is disabled through the sysfs interface (such as echo 0 >
/sys/class/sas_phy/phy-1:0:0/enable) and the system is then rebooted,
we will find that the status of the remote SAS PHY which was disabled
before is 1 (cat /sys/class/sas_phy/phy-1:0:0/enable). But actually the
status of the remote SAS PHY is disabled and the attached device is not
found.
In SAS protocol, NEGOTIATED LOGICAL LINK RATE field of DISCOVER response
is 0x1 when remote SAS PHY is disabled. So initialize sas_phy->enabled
according to the value of NEGOTIATED LOGICAL LINK RATE field.
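A hedged sketch of the resulting initialization, assuming SAS_PHY_DISABLED is the libsas encoding of linkrate 0x1 (the surrounding discovery context is elided):
    /* DISCOVER reported NEGOTIATED LOGICAL LINK RATE == 0x1, i.e. the
     * remote phy is disabled; make the sysfs-visible state match. */
    phy->phy->enabled = (phy->linkrate != SAS_PHY_DISABLED);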
Signed-off-by: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit affc67788fe5dfffad5cda3d461db5cf2b2ff2b0) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Jason Yan [Thu, 4 Jan 2018 13:04:32 +0000 (21:04 +0800)]
scsi: libsas: fix error when getting phy events
BugLink: https://bugs.launchpad.net/bugs/1752146
The intended purpose here was to goto out if smp_execute_task() returned an
error. Obviously something got screwed up: with the inverted check we never
collect the link error statistics below. We should goto the error handler if
smp_execute_task() returns non-zero.
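A hedged sketch of the corrected check; the buffer names and sizes follow the libsas phy-events helper but are not guaranteed verbatim:
    res = smp_execute_task(dev, req, RPEL_REQ_SIZE, resp, RPEL_RESP_SIZE);
    if (res)        /* was: if (!res), i.e. bailing out on success */
        goto out;

    phy->invalid_dword_count = scsi_to_u32(&resp[12]);
    /* ... remaining link error statistics ... */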
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: chenqilin <chenqilin2@huawei.com> CC: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 2b23d9509fd7174b362482cf5f3b5f9a2265bc33) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The memory leak can be reproduced with:
while true;
do cat /sys/class/sas_phy/phy-1:0:12/invalid_dword_count >/dev/null;
done
The buffer req is allocated and not freed after we return. Fix it.
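A hedged sketch of the leak fix, assuming the helper allocates both a request and a response buffer (labels and sizes illustrative):
    res = smp_execute_task(dev, req, RPEL_REQ_SIZE, resp, RPEL_RESP_SIZE);
    if (res)
        goto out;
    /* ... copy out the link error statistics ... */
 out:
    kfree(req);    /* previously leaked on every call */
    kfree(resp);
    return res;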
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Signed-off-by: Jason Yan <yanaijie@huawei.com> CC: John Garry <john.garry@huawei.com> CC: chenqilin <chenqilin2@huawei.com> CC: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4a491b1ab11ca0556d2fda1ff1301e862a2d44c4) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1748232
Since we've added support for IFLA_IF_NETNSID for RTM_{DEL,GET,SET,NEW}LINK
it is possible for userspace to send us requests with three different
properties to identify a target network namespace. This affects at least
RTM_{NEW,SET}LINK. Each of them could potentially refer to a different
network namespace, which is confusing. For legacy reasons the kernel will
pick the IFLA_NET_NS_PID property first and then look for the
IFLA_NET_NS_FD property, but there is no reason to extend this type of
behavior to network namespace ids. The regression potential is quite
minimal since the rtnetlink requests in question either won't allow
IFLA_IF_NETNSID requests before 4.16 is out (RTM_{NEW,SET}LINK) or don't
support IFLA_NET_NS_{PID,FD} (RTM_{DEL,GET}LINK) in the first place.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4ff66cae7f10b65b028dc3bdaaad9cc2989ef6ae) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7973bfd8758d05c85ee32052a3d7d5d0549e91b4) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1748232
- Backwards Compatibility:
If userspace wants to determine whether RTM_NEWLINK supports the
IFLA_IF_NETNSID property, it should first send an RTM_GETLINK request
with IFLA_IF_NETNSID on lo. If either EACCES is returned or the reply
does not include IFLA_IF_NETNSID, userspace should assume that
IFLA_IF_NETNSID is not supported on this kernel.
If the reply does contain an IFLA_IF_NETNSID property, userspace
can send an RTM_NEWLINK with an IFLA_IF_NETNSID property. If it receives
EOPNOTSUPP, then the kernel does not support the IFLA_IF_NETNSID property
with RTM_NEWLINK. Userspace should then fall back to other means.
- Security:
Callers must have CAP_NET_ADMIN in the owning user namespace of the
target network namespace.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5bb8ed075428b71492734af66230aa0c07fcc515) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1748232
- Backwards Compatibility:
If userspace wants to determine whether RTM_DELLINK supports the
IFLA_IF_NETNSID property, it should first send an RTM_GETLINK request
with IFLA_IF_NETNSID on lo. If either EACCES is returned or the reply
does not include IFLA_IF_NETNSID, userspace should assume that
IFLA_IF_NETNSID is not supported on this kernel.
If the reply does contain an IFLA_IF_NETNSID property, userspace
can send an RTM_DELLINK with an IFLA_IF_NETNSID property. If it receives
EOPNOTSUPP, then the kernel does not support the IFLA_IF_NETNSID property
with RTM_DELLINK. Userspace should then fall back to other means.
- Security:
Callers must have CAP_NET_ADMIN in the owning user namespace of the
target network namespace.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b61ad68a9fe85d29d5363eb36860164a049723cf) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1748232
- Backwards Compatibility:
If userspace wants to determine whether RTM_SETLINK supports the
IFLA_IF_NETNSID property, it should first send an RTM_GETLINK request
with IFLA_IF_NETNSID on lo. If either EACCES is returned or the reply
does not include IFLA_IF_NETNSID, userspace should assume that
IFLA_IF_NETNSID is not supported on this kernel.
If the reply does contain an IFLA_IF_NETNSID property, userspace
can send an RTM_SETLINK with an IFLA_IF_NETNSID property. If it receives
EOPNOTSUPP, then the kernel does not support the IFLA_IF_NETNSID property
with RTM_SETLINK. Userspace should then fall back to other means.
To retain backwards compatibility the kernel will first check whether an
IFLA_NET_NS_PID or IFLA_NET_NS_FD property has been passed. If either
one is found it will be used to identify the target network namespace.
This implies that users who do not care whether their running kernel
supports IFLA_IF_NETNSID with RTM_SETLINK can pass both
IFLA_NET_NS_{FD,PID} and IFLA_IF_NETNSID referring to the same network
namespace.
- Security:
Callers must have CAP_NET_ADMIN in the owning user namespace of the
target network namespace.
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c310bfcb6e1be993629c5747accf8e1c65fbb255) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1748232
RTM_{NEW,SET}LINK already allow operations on other network namespaces
by identifying the target network namespace through IFLA_NET_NS_{FD,PID}
properties. This is done by looking for the corresponding properties in
do_setlink(). Extend do_setlink() to also look for the IFLA_IF_NETNSID
property. This introduces no functional changes since all callers of
do_setlink() currently block IFLA_IF_NETNSID by reporting an error before
they reach do_setlink().
This introduces the helpers:
static struct net *rtnl_link_get_net_by_nlattr(struct net *src_net,
                                               struct nlattr *tb[])
static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb,
                                             struct net *src_net,
                                             struct nlattr *tb[], int cap)
to simplify permission checks and target network namespace retrieval for
RTM_* requests that already support IFLA_NET_NS_{FD,PID} but get extended
to IFLA_IF_NETNSID. To preserve backwards compatibility the helpers look
for IFLA_NET_NS_{FD,PID} properties first before checking for
IFLA_IF_NETNSID.
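A hedged sketch of what the capability-checking helper plausibly looks like; the body is an illustrative guess built around the signatures quoted above:
    static struct net *rtnl_link_get_net_capable(const struct sk_buff *skb,
                                                 struct net *src_net,
                                                 struct nlattr *tb[], int cap)
    {
        struct net *net;

        /* legacy IFLA_NET_NS_{PID,FD} take precedence over IFLA_IF_NETNSID */
        net = rtnl_link_get_net_by_nlattr(src_net, tb);
        if (IS_ERR(net))
            return net;

        if (!netlink_ns_capable(skb, net->user_ns, cap)) {
            put_net(net);
            return ERR_PTR(-EPERM);
        }

        return net;
    }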
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7c4f63ba824302492985553018881455982241d6) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Kai Heng Feng has noticed that BUG_ON(PageHighMem(pg)) triggers in
drivers/media/common/saa7146/saa7146_core.c since 19809c2da28a ("mm,
vmalloc: use __GFP_HIGHMEM implicitly").
saa7146_vmalloc_build_pgtable uses vmalloc_32 and it is reasonable to
expect that the resulting page is not in highmem. The above commit
aimed to add __GFP_HIGHMEM only for those requests which do not specify
any zone modifier gfp flag. vmalloc_32 relies on GFP_VMALLOC32 which
should do the right thing. Except it was missed that GFP_VMALLOC32
is an alias for GFP_KERNEL on 32b architectures. Thanks to Matthew for
noticing this.
Fix the problem by unconditionally setting GFP_DMA32 in GFP_VMALLOC32
for !64b arches (as a bailout). This should do the right thing and use
ZONE_NORMAL which should be always below 4G on 32b systems.
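A hedged sketch of the resulting define; the pre-existing 64-bit branches are assumed, and only the fallback gains GFP_DMA32 per the description:
    #if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32)
    #define GFP_VMALLOC32 (GFP_DMA32 | GFP_KERNEL)
    #elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA)
    #define GFP_VMALLOC32 (GFP_DMA | GFP_KERNEL)
    #else
    /* 32b: GFP_DMA32 keeps allocations in ZONE_NORMAL, i.e. below 4G */
    #define GFP_VMALLOC32 (GFP_DMA32 | GFP_KERNEL)
    #endif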
Debugged by Matthew Wilcox.
[akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20180212095019.GX21609@dhcp22.suse.cz Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Kai Heng Feng <kai.heng.feng@canonical.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Laura Abbott <labbott@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
In the function xhci_stop, xhci_debugfs_exit is called before
xhci_mem_cleanup. xhci_debugfs_exit removes the xhci debugfs root nodes,
while xhci_mem_cleanup calls xhci_free_virt_devices_depth_first, which in
turn calls xhci_debugfs_remove_slot.
xhci_debugfs_remove_slot removes the nodes for the devices, and those node
folders are subfolders of the xhci debugfs root.
It is unreasonable to remove the xhci debugfs root folder before the xhci
debugfs subfolders. xhci_mem_cleanup should be called before
xhci_debugfs_exit.
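A minimal sketch of the corrected ordering in xhci_stop(), with the other teardown steps elided:
    static void xhci_stop(struct usb_hcd *hcd)
    {
        struct xhci_hcd *xhci = hcd_to_xhci(hcd);

        /* ... halt the controller ... */

        /* free per-device state, which removes the debugfs sub-folders */
        xhci_mem_cleanup(xhci);
        /* only then remove the xhci debugfs root folder */
        xhci_debugfs_exit(xhci);
    }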
There is also a bug after a USB device is unplugged: the device and its
ep00 nodes are still kept, so we need to remove those nodes in xhci_free_dev
when the USB device is unplugged.
During system resume from hibernation, the xhci host is reset and all the
nodes in the devices folder are removed in the xhci_mem_cleanup function.
Later, the nodes in /sys/kernel/debug/usb/xhci/* are created again in
xhci_run, but since nodes with those names already exist, the old ones are
kept; eventually the device nodes in the xhci debugfs folder
/sys/kernel/debug/usb/xhci/*/devices/* disappear.
This fix removes the xhci debugfs nodes before they are re-created, so all
the nodes in xhci debugfs can be re-created successfully.
Commit dde634057da7 ("xhci: Fix use-after-free in xhci debugfs") causes a
null pointer dereference while fixing xhci-debugfs usage of ring pointers
that were freed during hibernate.
That fix passed addresses of ring pointers instead, but forgot to make this
change for the xhci_ring_trb_show function.
The address passed to xhci-debugfs was that of a temporary ring pointer,
"new_ring", instead of the actual "ring" pointer. The temporary new_ring
pointer is set to NULL later, causing the NULL pointer dereference.
This issue was seen when reading xhci-related files in debugfs.
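A hedged sketch of the corrected accessor, with debugfs now handing over the address of the ring pointer instead of the (possibly stale) pointer itself:
    static int xhci_ring_trb_show(struct seq_file *s, void *unused)
    {
        /* s->private holds the address of the ring pointer; dereference
         * it at show time so a re-allocated ring is picked up correctly */
        struct xhci_ring *ring = *(struct xhci_ring **)s->private;

        /* ... walk ring->first_seg and print the TRBs ... */
        return 0;
    }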
Since commit 152a6a884ae1 ("staging:iio:accel:sca3000 move to hybrid hard /
soft buffer design.") the buffer mechanism has changed and the
INDIO_BUFFER_HARDWARE flag has gone unused. Since commit 2d6ca60f3284 ("iio:
Add a DMAengine framework based buffer") the INDIO_BUFFER_HARDWARE flag has
been re-purposed for DMA buffers.
This driver has lagged behind these changes; in order for buffers to work,
the INDIO_BUFFER_SOFTWARE flag needs to be used.
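A hedged sketch of the kind of one-line change this implies; the exact location in the sca3000 driver is not shown here:
    /* buffered capture now goes through the soft-buffer path */
    indio_dev->modes = INDIO_DIRECT_MODE | INDIO_BUFFER_SOFTWARE;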
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com> Fixes: 2d6ca60f3284 ("iio: Add a DMAengine framework based buffer") Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Selecting GENERIC_MSI_IRQ_DOMAIN on x86 causes a compile-time error in
some configurations:
drivers/base/platform-msi.c:37:19: error: field 'arg' has incomplete type
On the other architectures, we are fine, but here we should have an additional
dependency on X86_LOCAL_APIC so we can get the PCI_MSI_IRQ_DOMAIN symbol.
binder_send_failed_reply() is called when a synchronous
transaction fails. It reports an error to the thread that
is waiting for the completion. Given that the transaction
is synchronous, there should never be more than 1 error
response to that thread -- this was being asserted with
a WARN().
However, when exercising the driver with syzbot tests, cases
were observed where multiple "synchronous" requests were
sent without waiting for responses, so it is possible that
multiple errors would be reported to the thread. This testing
was conducted with panic_on_warn set which forced the crash.
This is easily reproduced by sending back-to-back "synchronous"
transactions without checking for any response (e.g., by setting read_size
to 0).
The first transaction is sent to the servicemanager and the reply
fails because no VMA is set up by this client. After
binder_send_failed_reply() is called, the BINDER_WORK_RETURN_ERROR
is sitting on the thread's todo list since the read_size was 0 and
the client is not waiting for a response.
The 2nd transaction is sent and the BINDER_WORK_RETURN_ERROR has not
been consumed, so the thread's reply_error.cmd is still set (normally
cleared when the BINDER_WORK_RETURN_ERROR is handled). Therefore
when the servicemanager attempts to reply to the 2nd failed
transaction, the error is already set and it triggers this warning.
This is a user error since it is not waiting for the synchronous
transaction to complete. If it ever does check, it will see an
error.
After commit 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock
only in the required scope"), the caller of nf_{get/set}sockopt() must
not hold any lock, but in that changeset I forgot to cope with DECnet.
This commit addresses the issue by moving the nf call outside the lock,
in dn_{get,set}sockopt(), with the same scheme currently used by ipv4 and
ipv6. It also moves the unhandled sockopts to the end of the main switch
statements, to improve code readability.
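A hedged sketch of the resulting pattern on the set side; the ENOPROTOOPT fall-through condition is simplified relative to the real DECnet code:
    static int dn_setsockopt(struct socket *sock, int level, int optname,
                             char __user *optval, unsigned int optlen)
    {
        struct sock *sk = sock->sk;
        int err;

        lock_sock(sk);
        err = __dn_setsockopt(sock, level, optname, optval, optlen, 0);
        release_sock(sk);

        /* the netfilter call now runs without the socket lock held */
        if (err == -ENOPROTOOPT)
            err = nf_setsockopt(sk, PF_DECnet, optname, optval, optlen);

        return err;
    }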
Reported-by: Petr Vandrovec <petr@vandrovec.name> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2 Fixes: 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock only in the required scope") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
dtc complains about the lack of #cooling-cells properties for the
CPU nodes that are referred to as "cooling-device":
arch/arm64/boot/dts/mediatek/mt8173-evb.dtb: Warning (cooling_device_property): Missing property '#cooling-cells' in node /cpus/cpu@0 or bad phandle (referred from /thermal-zones/cpu_thermal/cooling-maps/map@0:cooling-device[0])
arch/arm64/boot/dts/mediatek/mt8173-evb.dtb: Warning (cooling_device_property): Missing property '#cooling-cells' in node /cpus/cpu@100 or bad phandle (referred from /thermal-zones/cpu_thermal/cooling-maps/map@1:cooling-device[0])
Apparently this property must be '<2>' to match the binding.
syzbot reported a lockdep splat in gen_new_estimator() /
est_fetch_counters() when attempting to lock est->stats_lock.
Since est_fetch_counters() is called from BH context from timer
interrupt, we need to block BH as well when calling it from process
context.
Most qdiscs use per cpu counters and are immune to the problem,
but net/sched/act_api.c and net/netfilter/xt_RATEEST.c are using
a spinlock to protect their data. They both call gen_new_estimator()
while object is created and not yet alive, so this bug could
not trigger a deadlock, only a lockdep splat.
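A sketch of the fix shape in gen_new_estimator(), following the description above (surrounding context hedged):
    /* est_fetch_counters() also runs in BH context from the estimator
     * timer, so block BH around the process-context call as well */
    local_bh_disable();
    est_fetch_counters(est, &b);
    local_bh_enable();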
Fixes: 1c0d32fde5bd ("net_sched: gen_estimator: complete rewrite of rate estimators") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
skb_warn_bad_offload warns when packets enter the GSO stack that
require skb_checksum_help or vice versa. Do not warn on arbitrary
bad packets. Packet sockets can craft many. Syzkaller was able to
demonstrate another one with eth_type games.
In particular, suppress the warning when segmentation returns an
error, which is for reasons other than checksum offload.
See also commit 36c92474498a ("net: WARN if skb_checksum_help() is
called on skb requiring segmentation") for context on this warning.
Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
The rds_tcp_kill_sock() function parses the rds_tcp_conn_list
to find the rds_connection entries marked for deletion as part
of the netns deletion under the protection of the rds_tcp_conn_lock.
Since the rds_tcp_conn_list tracks rds_tcp_connections (which
have a 1:1 mapping with rds_conn_path), multiple tc entries in
the rds_tcp_conn_list will map to a single rds_connection, and will
be deleted as part of the rds_conn_destroy() operation that is
done outside the rds_tcp_conn_lock.
The rds_tcp_conn_list traversal done under the protection of
rds_tcp_conn_lock should not leave any doomed tc entries in
the list after the rds_tcp_conn_lock is released, else another
concurrently executing netns delete thread (for a different netns)
may trip on these entries.
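A hedged sketch of an atomic purge that satisfies this requirement; the field names are guesses consistent with the text:
    LIST_HEAD(tmp_list);

    spin_lock_irq(&rds_tcp_conn_lock);
    list_for_each_entry_safe(tc, _tc, &rds_tcp_conn_list, t_tcp_node) {
        struct net *c_net = read_pnet(&tc->t_cpath->cp_conn->c_net);

        if (net != c_net || !tc->t_sock)
            continue;
        /* detach under the lock so no concurrent netns delete sees it */
        list_move_tail(&tc->t_tcp_node, &tmp_list);
    }
    spin_unlock_irq(&rds_tcp_conn_lock);

    list_for_each_entry_safe(tc, _tc, &tmp_list, t_tcp_node)
        rds_conn_destroy(tc->t_cpath->cp_conn);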
Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Commit 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
introduces a regression in rds-tcp netns cleanup. The cleanup_net(),
(and thus rds_tcp_dev_event notification) is only called from put_net()
when all netns refcounts go to 0, but this cannot happen if the
rds_connection itself is holding a c_net ref that it expects to
release in rds_tcp_kill_sock.
Instead, the rds_tcp_kill_sock callback should make sure to
tear down state carefully, ensuring that the socket teardown
is only done after all data-structures and workqs that depend
on it are quiesced.
The original motivation for commit 8edc3affc077 ("rds: tcp: Take explicit
refcounts on struct net") was to resolve a race condition reported by
syzkaller where workqs for tx/rx/connect were triggered after the
namespace was deleted. Those worker threads should have been
cancelled/flushed before socket tear-down and indeed,
rds_conn_path_destroy() does try to sequence this by doing
/* cancel cp_send_w */
/* cancel cp_recv_w */
/* flush cp_down_w */
/* free data structures */
Here the "flush cp_down_w" will trigger rds_conn_shutdown and thus
invoke rds_tcp_conn_path_shutdown() to close the tcp socket, so that
we ought to have satisfied the requirement that "socket-close is
done after all other dependent state is quiesced". However,
rds_conn_shutdown has a bug in that it *always* triggers the reconnect
workq (and if connection is successful, we always restart tx/rx
workqs so with the right timing, we risk the race conditions reported
by syzkaller).
Netns deletion is like module teardown: there is no need to restart a
reconnect in this case. We can use the c_destroy_in_prog bit
to avoid restarting the reconnect.
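A hedged sketch of the guard this suggests in the shutdown/reconnect path (exact placement in rds_conn_shutdown may differ):
    /* netns/module teardown in progress: do not kick a reconnect */
    if (!conn->c_destroy_in_prog)
        rds_queue_reconnect(cp);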
Fixes: 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
rateest_hash is supposed to be protected by xt_rateest_mutex,
and, as suggested by Eric, lookup and insert should be atomic,
so we should acquire the xt_rateest_mutex once for both.
So introduce a non-locking helper for internal use and keep the
locking one for external use.
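A hedged sketch of the split, with the hash details elided; the point is that checkentry-style callers can now hold the mutex across both lookup and insert:
    /* caller must hold xt_rateest_mutex */
    static struct xt_rateest *__xt_rateest_lookup(const char *name)
    {
        struct xt_rateest *est;

        hlist_for_each_entry(est, &rateest_hash[xt_rateest_hash(name)], list) {
            if (strcmp(est->name, name) == 0) {
                est->refcnt++;
                return est;
            }
        }
        return NULL;
    }

    struct xt_rateest *xt_rateest_lookup(const char *name)
    {
        struct xt_rateest *est;

        mutex_lock(&xt_rateest_mutex);
        est = __xt_rateest_lookup(name);
        mutex_unlock(&xt_rateest_mutex);
        return est;
    }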
Syzbot reported several deadlocks in the netfilter area caused by
rtnl lock and socket lock being acquired with a different order on
different code paths, leading to backtraces like the following one:
======================================================
WARNING: possible circular locking dependency detected
4.15.0-rc9+ #212 Not tainted
------------------------------------------------------
syzkaller041579/3682 is trying to acquire lock:
(sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
include/net/sock.h:1463 [inline]
(sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
but task is already holding lock:
(rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
1 lock held by syzkaller041579/3682:
#0: (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
net/core/rtnetlink.c:74
The problem, as Florian noted, is that nf_setsockopt() is always
called with the socket lock held, even though the lock itself is required
only in very tight scopes and only for some operations.
This patch addresses the issue by moving the lock_sock() call only
where really needed, namely into ipv*_getorigdst(), so that nf_setsockopt()
no longer needs to acquire both locks.
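A hedged sketch of the narrowed scope in the IPv4 case, with the conntrack lookup abridged:
    static int getorigdst(struct sock *sk, int optval,
                          void __user *user, int *len)
    {
        const struct inet_sock *inet = inet_sk(sk);
        struct nf_conntrack_tuple tuple;

        memset(&tuple, 0, sizeof(tuple));

        /* the lock is only needed while sampling the socket addresses */
        lock_sock(sk);
        tuple.src.u3.ip = inet->inet_rcv_saddr;
        tuple.src.u.tcp.port = inet->inet_sport;
        tuple.dst.u3.ip = inet->inet_daddr;
        tuple.dst.u.tcp.port = inet->inet_dport;
        release_sock(sk);

        /* ... conntrack lookup and copy_to_user() run unlocked ... */
        return 0;
    }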
Commit 136e92bbec0a switched local_nodes from an array to a bitmask
but did not add proper bounds checks. As a result,
clusterip_config_init_nodelist() can both over-read
ipt_clusterip_tgt_info.local_nodes and over-write
clusterip_config.local_nodes.
Add bounds checks for both.
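A hedged sketch of the two checks, using the struct fields named in the text:
    if (cipinfo->num_local_nodes > ARRAY_SIZE(cipinfo->local_nodes))
        return -EINVAL;
    for (i = 0; i < cipinfo->num_local_nodes; i++) {
        /* node numbers are 1-based bit indices into the bitmap */
        if (cipinfo->local_nodes[i] - 1 >= sizeof(config->local_nodes) * 8)
            return -EINVAL;
    }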
Fixes: 136e92bbec0a ("[NETFILTER] CLUSTERIP: use a bitmap to store node responsibility data") Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
syzkaller triggered OOM kills by passing ipt_replace.size = -1
to IPT_SO_SET_REPLACE. The root cause is that SMP_ALIGN() in
xt_alloc_table_info() causes an int overflow, so the size check passes
when it should not. SMP_ALIGN() is a leftover that is no longer needed.
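A hedged sketch of the allocation path once SMP_ALIGN() is dropped (allocation itself elided):
    struct xt_table_info *xt_alloc_table_info(unsigned int size)
    {
        struct xt_table_info *info;
        size_t sz = sizeof(*info) + size;  /* no SMP_ALIGN(), no int overflow */

        if (sz < sizeof(*info))
            return NULL;
        /* ... allocate sz bytes and return info ... */
    }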
Currently KCOV_ENABLE does not check whether the current task is already
associated with another kcov descriptor. As a result it is possible
to associate a single task with more than one kcov descriptor, which
later leads to a memory leak of the old descriptor. This relation is
really meant to be one-to-one (a task has only one back link).
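A hedged sketch of the missing check in the KCOV_ENABLE handler:
    case KCOV_ENABLE:
        t = current;
        /* a task may be bound to at most one kcov descriptor, and a
         * descriptor to at most one task */
        if (kcov->t != NULL || t->kcov != NULL)
            return -EBUSY;
        /* ... proceed to enable coverage collection for t ... */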
The testcase sets a watchpoint (with perf_event_open) on a buffer that is
passed to timer_create() as the struct sigevent argument. In timer_create(),
copy_from_user()'s rep movsb triggers the BP. The testcase also sets
the debug registers for the guest.
However, KVM only restores host debug registers when the host has active
watchpoints, which triggers a race condition when running the testcase with
multiple threads. The guest's DR6.BS bit can escape to the host before
another thread invokes timer_create(), and do_debug() complains.
The fix is to respect do_debug()'s dr6 invariant when leaving KVM.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
During stress tests by syzkaller on the sg driver, the block layer
infrequently returns EINVAL. Closer inspection shows the block
layer was trying to return ENOMEM (which is much more
understandable) but for some reason overrode that useful error.
The patch below does not show this (unchanged) line:
ret = __blk_rq_map_user_iov(rq, map_data, &i, gfp_mask, copy);
That 'ret' was being overridden when that function failed.
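A hedged sketch of the error-path shape being repaired; the point is simply to return ret rather than a hard-coded -EINVAL:
    ret = __blk_rq_map_user_iov(rq, map_data, &i, gfp_mask, copy);
    if (ret)
        goto unmap_rq;
    /* ... */
 unmap_rq:
    blk_rq_unmap_user(bio);
    return ret;    /* was: return -EINVAL, hiding e.g. -ENOMEM */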
WARNING: CPU: 0 PID: 3502 at drivers/staging/android/ion/ion-ioctl.c:73 ion_ioctl+0x2db/0x380 drivers/staging/android/ion/ion-ioctl.c:73
Kernel panic - not syncing: panic_on_warn set ...
This is a warning that validation of the ioctl fields failed. This was
deliberately added as a warning to make it very obvious to developers that
something needed to be fixed. In reality, this is overkill and disturbs
fuzzing. Switch to pr_warn for a message instead.
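A hedged sketch of the change; the validation helper name follows the ion-ioctl.c context quoted above, and the message text is illustrative:
    ret = validate_ioctl_arg(cmd, &data);
    if (ret) {
        pr_warn("%s: ioctl validate failed\n", __func__);
        return ret;
    }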
This is a warning about attempting to allocate order > MAX_ORDER. This
is coming from a userspace Ion allocation request. Since userspace is
free to request however much memory it wants (and the kernel is free to
deny its allocation), silence the allocation attempt with __GFP_NOWARN
in case it fails.
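A hedged sketch of the described one-liner; gfp_flags and the allocation site are illustrative, not the exact ion heap code:
    /* userspace picks the size; let an oversized request fail quietly */
    page = alloc_pages(gfp_flags | __GFP_NOWARN, order);
    if (!page)
        return -ENOMEM;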
Using %rbp as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
In twofish-3way, we can't simply replace %rbp with another register
because there are none available. Instead, we use the stack to hold the
values that %rbp, %r11, and %r12 were holding previously. Each of these
values represents the half of the output from the previous Feistel round
that is being passed on unchanged to the following round. They are only
used once per round, when they are exchanged with %rax, %rbx, and %rcx.
As a result, we free up 3 registers (one per block) and can reassign
them so that %rbp is not used, and additionally %r14 and %r15 are not
used so they do not need to be saved/restored.
There may be a small overhead caused by replacing 'xchg REG, REG' with
the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
round. But, counterintuitively, when I tested "ctr-twofish-3way" on a
Haswell processor, the new version was actually about 2% faster.
(Perhaps 'xchg' is not as well optimized as plain moves.)