Commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when
hardware handshake is enabled") actually allowed to enable hardware
handshaking.
Before, the CRTSCTS flags was silently ignored.
As the DMA controller can't drive RTS (as explain in the commit message).
Ensure that hardware flow control stays disabled when DMA is used and FIFOs
are not available.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Fixes: 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When csw->con_startup() fails in do_register_con_driver, we return no
error (i.e. 0). This was changed back in 2006 by commit 3e795de763.
Before that we used to return -ENODEV.
So fix the return value to be -ENODEV in that case again.
Fixes: 3e795de763 ("VT binding: Add binding/unbinding support for the VT console") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
b4ff8389ed14 is incomplete: relies on nr_legacy_irqs() to get the number
of legacy interrupts when actually nr_legacy_irqs() returns 0 after
probe_8259A(). Use NR_IRQS_LEGACY instead.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
This ensures that the guest doesn't see XSAVE extensions
(e.g. xgetbv1 or xsavec) that the host lacks.
Cc: stable@vger.kernel.org Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
[4.5 does have CPUID_D_1_EAX, but earlier kernels don't, so use
the numeric value. This is consistent with other occurrences
of cpuid_mask in arch/x86/kvm/cpuid.c - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.
We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.
This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.
With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.
Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).
Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Commit d28bc9dd25ce reversed the order of two lines which initialize cr0,
allowing the current (old) cr0 value to mess up vcpu initialization.
This was observed in the checks for cr0 X86_CR0_WP bit in the context of
kvm_mmu_reset_context(). Besides, setting vcpu->arch.cr0 after vmx_set_cr0()
is completely redundant. Change the order back to ensure proper vcpu
initialization.
The combination of booting with ovmf firmware when guest vcpus > 1 and kvm's
ept=N option being set results in a VM-entry failure. This patch fixes that.
Fixes: d28bc9dd25ce ("KVM: x86: INIT and reset sequences are different") Signed-off-by: Bruce Rogers <brogers@suse.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
DMA is optional with this driver. If it was not enabled the devpriv->dma
pointer will be NULL.
Fix the possible NULL pointer dereference when trying to disable the DMA
channels in das1800_ai_cancel() and tidy up the comments to fix the
checkpatch.pl issues:
WARNING: line over 80 characters
It's probably harmless in das1800_ai_setup_dma() because the 'desc' pointer
will not be used if DMA is disabled but fix it there also.
Fixes: 99dfc3357e98 ("staging: comedi: das1800: remove depends on ISA_DMA_API limitation") Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The current implemenentation restart the sent pattern for each entry in
the sg list. The receiving end expects a continuous pattern, and test
will fail unless scatterilst entries happen to be aligned with the
pattern
Fix this by calculating the pattern byte based on total sent size
instead of just the current sg entry.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Fixes: 8b5249019352 ("[PATCH] USB: usbtest: scatterlist OUT data pattern testing") Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When binding the function to usb_configuration, check whether the thread
is running before starting another one. Without that, when function
instance is added to multiple configurations, fsg_bing starts multiple
threads with all but the latest one being forgotten by the driver. This
leads to obvious thread leaks, possible lockups when trying to halt the
machine and possible more issues.
This fixes issues with legacy/multi¹ gadget as well as configfs gadgets
when mass_storage function is added to multiple configurations.
This change also simplifies API since the legacy gadgets no longer need
to worry about starting the thread by themselves (which was where bug
in legacy/multi was in the first place).
N.B., this patch doesn’t address adding single mass_storage function
instance to a single configuration twice. Thankfully, there’s no
legitimate reason for such setup plus, if I’m not mistaken, configfs
gadget doesn’t even allow it to be expressed.
¹ I have no example failure though. Conclusion that legacy/multi has
a bug is based purely on me reading the code.
Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com> Cc: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In the current implementation functionfs generates a EFAULT for async read
operations if the read buffer size is larger than the URB data size. Since
a application does not necessarily know how much data the host side is
going to send it typically supplies a buffer larger than the actual data,
which will then result in a EFAULT error.
This behaviour was introduced while refactoring the code to use iov_iter
interface in commit c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter
into io_data"). The original code took the minimum over the URB size and
the user buffer size and then attempted to copy that many bytes using
copy_to_user(). If copy_to_user() could not copy all data a EFAULT error
was generated. Restore the original behaviour by only generating a EFAULT
error when the number of bytes copied is not the size of the URB and the
target buffer has not been fully filled.
Commit 342f39a6c8d3 ("usb: gadget: f_fs: fix check in read operation")
already fixed the same problem for the synchronous read path.
Fixes: c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter into io_data") Acked-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[johan: rebase and replace commit message ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[properly sort them - gregkh] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Added support for Gemalto's Cinterion PH8 and AHxx products
with 2 RmNet Interfaces and products with 1 RmNet + 1 USB Audio interface.
In addition some minor renaming and formatting.
Signed-off-by: Hans-Christoph Schemmel <hans-christoph.schemmel@gemalto.com>
[johan: sort current entries and trim trailing whitespace ] Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
URBs and buffers allocated in attach for Epic devices would never be
deallocated in case of a later probe error (e.g. failure to allocate
minor numbers) as disconnect is then never called.
Fix by moving deallocation to release and making sure that the
URBs are first unlinked.
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Private data, URBs and buffers allocated for Epic devices during
attach were never released on errors (e.g. missing endpoints).
Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The interface read URB is submitted in attach, but was only unlinked by
the driver at disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we would end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callback.
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The interface instat and indat URBs were submitted in attach, but never
unlinked in release before deallocating the corresponding transfer
buffers.
In the case of a late probe error (e.g. due to failed minor allocation),
disconnect would not have been called before release, causing the
buffers to be freed while the URBs are still in use. We'd also end up
with active URBs for an unbound interface.
Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The interface read and event URBs are submitted in attach, but were
never explicitly unlinked by the driver. Instead the URBs would have
been killed by usb-serial core on disconnect.
In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we could end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callbacks.
Fixes: ee467a1f2066 ("USB: serial: add Moxa UPORT 12XX/14XX/16XX
driver") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Ensure that mei_cl_read_start is called under the device lock
also in the bus layer. The function updates global ctrl_wr_list
which should be locked.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When a message is received and amthif client is not in reading state
the message is ignored and left dangling in the queue. This may happen
after one of the amthif host connections is closed w/o completing the
reading. Another client will pick up a wrong message on next read
attempt which will lead to link reset.
To prevent this the driver has to properly discard the message when
amthif client is not in reading state.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
We move the list flushing to the completion routine.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
hci_vhci driver creates a hci device object dynamically upon each
HCI_VENDOR_PKT write. Although it checks the already created object
and returns an error, it's still racy and may build multiple hci_dev
objects concurrently when parallel writes are performed, as the device
tracks only a single hci_dev object.
This patch introduces a mutex to protect against the concurrent device
creations.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The write handler allocates skbs and queues them into data->readq.
Read side should read them, if there is any. If there is none, skbs
should be dropped by hdev->flush. But this happens only if the device
is HCI_UP, i.e. hdev->power_on work was triggered already. When it was
not, skbs stay allocated in the queue when /dev/vhci is closed. So
purge the queue in ->release.
Program to reproduce:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
Both vhci_get_user and vhci_release race with open_timeout work. They
both contain cancel_delayed_work_sync, but do not test whether the
work actually created hdev or not. Since the work can be in progress
and _sync will wait for finishing it, we can have data->hdev allocated
when cancel_delayed_work_sync returns. But the call sites do 'if
(data->hdev)' *before* cancel_delayed_work_sync.
As a result:
* vhci_get_user allocates a second hdev and puts it into
data->hdev. The former is leaked.
* vhci_release does not release data->hdev properly as it thinks there
is none.
Fix both cases by moving the actual test *after* the call to
cancel_delayed_work_sync.
This can be hit by this program:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
int main(int argc, char **argv)
{
int fd;
srand(time(NULL));
while (1) {
const int delta = (rand() % 200 - 100) * 100;
fd = open("/dev/vhci", O_RDWR);
if (fd < 0)
err(1, "open");
And the result is:
BUG: KASAN: use-after-free in skb_queue_tail+0x13e/0x150 at addr ffff88006b0c1228
Read of size 8 by task kworker/u13:1/32068
=============================================================================
BUG kmalloc-192 (Tainted: G E ): kasan: bad access detected
-----------------------------------------------------------------------------
Disabling lock debugging due to kernel taint
INFO: Allocated in vhci_open+0x50/0x330 [hci_vhci] age=260 cpu=3 pid=32040
...
kmem_cache_alloc_trace+0x150/0x190
vhci_open+0x50/0x330 [hci_vhci]
misc_open+0x35b/0x4e0
chrdev_open+0x23b/0x510
...
INFO: Freed in vhci_release+0xa4/0xd0 [hci_vhci] age=9 cpu=2 pid=32040
...
__slab_free+0x204/0x310
vhci_release+0xa4/0xd0 [hci_vhci]
...
INFO: Slab 0xffffea0001ac3000 objects=16 used=13 fp=0xffff88006b0c1e00 flags=0x5fffff80004080
INFO: Object 0xffff88006b0c1200 @offset=4608 fp=0xffff88006b0c0600
Bytes b4 ffff88006b0c11f0: 09 df 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c1200: 00 06 0c 6b 00 88 ff ff 00 00 00 00 00 00 00 00 ...k............
Object ffff88006b0c1210: 10 12 0c 6b 00 88 ff ff 10 12 0c 6b 00 88 ff ff ...k.......k....
Object ffff88006b0c1220: c0 46 c2 6b 00 88 ff ff c0 46 c2 6b 00 88 ff ff .F.k.....F.k....
Object ffff88006b0c1230: 01 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00 ................
Object ffff88006b0c1240: 40 12 0c 6b 00 88 ff ff 40 12 0c 6b 00 88 ff ff @..k....@..k....
Object ffff88006b0c1250: 50 0d 6e a0 ff ff ff ff 00 02 00 00 00 00 ad de P.n.............
Object ffff88006b0c1260: 00 00 00 00 00 00 00 00 ab 62 02 00 01 00 00 00 .........b......
Object ffff88006b0c1270: 90 b9 19 81 ff ff ff ff 38 12 0c 6b 00 88 ff ff ........8..k....
Object ffff88006b0c1280: 03 00 20 00 ff ff ff ff ff ff ff ff 00 00 00 00 .. .............
Object ffff88006b0c1290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object ffff88006b0c12a0: 00 00 00 00 00 00 00 00 00 80 cd 3d 00 88 ff ff ...........=....
Object ffff88006b0c12b0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 . ..............
Redzone ffff88006b0c12c0: bb bb bb bb bb bb bb bb ........
Padding ffff88006b0c13f8: 00 00 00 00 00 00 00 00 ........
CPU: 3 PID: 32068 Comm: kworker/u13:1 Tainted: G B E 4.4.6-0-default #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
Workqueue: hci0 hci_cmd_work [bluetooth] 00000000ffffffffffffffff81926cfaffff88006be37c68ffff88006bc27180 ffff88006b0c1200ffff88006b0c1234ffffffff81577993ffffffff82489320 ffff88006bc242400000000000000046ffff88006a100000000000026e51eb80
Call Trace:
...
[<ffffffff81ec8ebe>] ? skb_queue_tail+0x13e/0x150
[<ffffffffa06e027c>] ? vhci_send_frame+0xac/0x100 [hci_vhci]
[<ffffffffa0c61268>] ? hci_send_frame+0x188/0x320 [bluetooth]
[<ffffffffa0c61515>] ? hci_cmd_work+0x115/0x310 [bluetooth]
[<ffffffff811a1375>] ? process_one_work+0x815/0x1340
[<ffffffff811a1f85>] ? worker_thread+0xe5/0x11f0
[<ffffffff811a1ea0>] ? process_one_work+0x1340/0x1340
[<ffffffff811b3c68>] ? kthread+0x1c8/0x230
...
Memory state around the buggy address: ffff88006b0c1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88006b0c1180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88006b0c1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88006b0c1280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff88006b0c1300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
008GE0 Toshiba mmc in some Intel Baytrail tablets responds to
MMC_SEND_EXT_CSD in 450-600ms.
This patch will...
() Increase the long read time quirk timeout from 300ms to 600ms. Original
author of that quirk says 300ms was only a guess and that the number
may need to be raised in the future.
() Add this specific MMC to the quirk
Signed-off-by: Matt Gumbel <matthew.k.gumbel@intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Some BIOSes unconditionally send an ACPI notification to RBTN when the
system is resuming from suspend. This makes dell-rbtn send an input
event to userspace as if a function key was pressed. Prevent this by
ignoring all the notifications received while the device is suspended.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=106031 Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Tested-by: Alex Hung <alex.hung@canonical.com> Reviewed-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The order of the _OSI related functionalities is as follows:
acpi_blacklisted()
acpi_dmi_osi_linux()
acpi_osi_setup()
acpi_osi_setup()
acpi_update_interfaces() if "!*"
<<<<<<<<<<<<<<<<<<<<<<<<
parse_args()
__setup("acpi_osi=")
acpi_osi_setup_linux()
acpi_update_interfaces() if "!*"
<<<<<<<<<<<<<<<<<<<<<<<<
acpi_early_init()
acpi_initialize_subsystem()
acpi_ut_initialize_interfaces()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
acpi_bus_init()
acpi_os_initialize1()
acpi_install_interface_handler(acpi_osi_handler)
acpi_osi_setup_late()
acpi_update_interfaces() for "!"
>>>>>>>>>>>>>>>>>>>>>>>>
acpi_osi_handler()
Since acpi_osi_setup_linux() can override acpi_dmi_osi_linux(), the command
line setting can override the DMI detection. That's why acpi_blacklisted()
is put before __setup("acpi_osi=").
Then we can notice the following wrong invocation order. There are
acpi_update_interfaces() (marked by <<<<) calls invoked before
acpi_ut_initialize_interfaces() (marked by ^^^^). This makes it impossible
to use acpi_osi=!* correctly from OSI DMI table or from the command line.
The use of acpi_osi=!* is meant to disable both ACPICA
(acpi_gbl_supported_interfaces) and Linux specific strings
(osi_setup_entries) while the ACPICA part should have stopped working
because of the order issue.
This patch fixes this issue by moving acpi_update_interfaces() to where
it is invoked for acpi_osi=! (marked by >>>>) as this is ensured to be
invoked after acpi_ut_initialize_interfaces() (marked by ^^^^). Linux
specific strings are still handled in the original place in order to make
the following command line working: acpi_osi=!* acpi_osi="Module Device".
Note that since acpi_osi=!* is meant to further disable linux specific
string comparing to the acpi_osi=!, there is no such use case in our bug
fixing work and hence there is no one using acpi_osi=!* either from the
command line or from the DMI quirks, this issue is just a theoretical
issue.
Fixes: 741d81280ad2 (ACPI: Add facility to remove all _OSI strings) Tested-by: Lukas Wunner <lukas@wunner.de> Tested-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Some eMMCs set the partition switch timeout too low.
Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable. Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.
Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.
The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
As described in 'can: m_can: tag current CAN FD controllers as non-ISO'
(6cfda7fbebe) it is possible to define fixed configuration options by
setting the according bit in 'ctrlmode' and clear it in 'ctrlmode_supported'.
This leads to the incovenience that the fixed configuration bits can not be
passed by netlink even when they have the correct values (e.g. non-ISO, FD).
This patch fixes that issue and not only allows fixed set bit values to be set
again but now requires(!) to provide these fixed values at configuration time.
A valid CAN FD configuration consists of a nominal/arbitration bittiming, a
data bittiming and a control mode with CAN_CTRLMODE_FD set - which is now
enforced by a new can_validate() function. This fix additionally removed the
inconsistency that was prohibiting the support of 'CANFD-only' controller
drivers, like the RCar CAN FD.
For this reason a new helper can_set_static_ctrlmode() has been introduced to
provide a proper interface to handle static enabled CAN controller options.
Reported-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Reviewed-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The GICv3 driver wrongly assumes that it runs on the non-secure
side of a secure-enabled system, while it could be on a system
with a single security state, or a GICv3 with GICD_CTLR.DS set.
Either way, it is important to configure this properly, or
interrupts will simply not be delivered on this HW.
Reported-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When an IPI is generated by a CPU, the pattern looks roughly like:
<write shared data>
smp_wmb();
<write to GIC to signal SGI>
On the receiving CPU we rely on the fact that, once we've taken the
interrupt, then the freshly written shared data must be visible to us.
Put another way, the CPU isn't going to speculate taking an interrupt.
Unfortunately, this assumption turns out to be broken.
Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
to read some shared_data. Before CPUx has done anything, a random
peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
CPUy then takes the IRQ and starts executing the entry code, heading
towards gic_handle_irq. Furthermore, let's assume that a bunch of the
previous interrupts handled by CPUy were SGIs, so the branch predictor
kicks in and speculates that irqnr will be <16 and we're likely to
head into handle_IPI. The prefetcher then grabs a speculative copy of
shared_data which contains a stale value.
Meanwhile, CPUx gets round to updating shared_data and asking the GIC
to send an SGI to CPUy. Internally, the GIC decides that the SGI is
more important than the peripheral interrupt (which hasn't yet been
ACKed) but doesn't need to do anything to CPUy, because the IRQ line
is already raised.
CPUy then reads the ACK register on the GIC, sees the SGI value which
confirms the branch prediction and we end up with a stale shared_data
value.
This patch fixes the problem by adding an smp_rmb() to the IPI entry
code in gic_handle_irq. As it turns out, the combination of a control
dependency and an ISB instruction from the EOI in the GICv3 driver is
enough to provide the ordering we need, so we add a comment there
justifying the absence of an explicit smp_rmb().
Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
On a Freescale i.MX53 based board we ran into "BUG: scheduling while
atomic" because input_inject_event locks interrupts, but
imx_pwm_config_v2 sleeps.
Tested on Freescale i.MX53 SoC with 4.6.0.
Signed-off-by: Manfred Schlaegl <manfred.schlaegl@gmx.at> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.
Uptime and /proc/loadavg on all systems with kernels released during the
last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
idle systems, but the way the kernel calculates this value prevents it
from getting lower than the mentioned values.
Likewise but not as obviously noticeable, a fully loaded system with no
processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by number of cores).
Once the (old) load becomes 93 or higher, it mathematically can never
get lower than 93, even when the active (load) remains 0 forever.
This results in the strange 0.00, 0.01, 0.05 uptime values on idle
systems. Note: 93/2048 = 0.0454..., which rounds up to 0.05.
It is not correct to add a 0.5 rounding (=1024/2048) here, since the
result from this function is fed back into the next iteration again,
so the result of that +0.5 rounding value then gets multiplied by
(2048-2037), and then rounded again, so there is a virtual "ghost"
load created, next to the old and active load terms.
By changing the way the internally kept value is rounded, that internal
value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
increasing load, the internally kept load value is rounded up, when the
load is decreasing, the load value is rounded down.
The modified code was tested on nohz=off and nohz kernels. It was tested
on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
tested on single, dual, and octal cores system. It was tested on virtual
hosts and bare hardware. No unwanted effects have been observed, and the
problems that the patch intended to fix were indeed gone.
Tested-by: Damien Wyart <damien.wyart@free.fr> Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Doug Smythies <dsmythies@telus.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes") Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
This patch adds the CLK_SET_RATE_PARENT flag for the crypto core and
ahb blocks. Without this flag, clk_set_rate can fail for certain
frequency requests.
Signed-off-by: Andy Gross <andy.gross@linaro.org> Fixes: 3966fab8b6ab ("clk: qcom: Add MSM8916 Global Clock Controller support") Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The current sun4i-ss driver could generate data corruption when ciphering/deciphering.
It occurs randomly on end of handled data.
No root cause have been found and the only way to remove it is to replace
all spin_lock_bh by their irq counterparts.
Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator") Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
caam_jr_alloc() used to return NULL if a JR device could not be
allocated for a session. In turn, every user of this function used
IS_ERR() function to verify if anything went wrong, which does NOT look
for NULL values. This made the kernel crash if the sanity check failed,
because the driver continued to think it had allocated a valid JR dev
instance to the session and at some point it tries to do a caam_jr_free()
on a NULL JR dev pointer.
This patch is a fix for this issue.
Signed-off-by: Catalin Vasile <cata.vasile@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Thus it overflows and the resulting number is less than 4080, which makes
3823 / 4080 = 0
an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.
There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.
For example, the following will cause the ring buffer to crash:
WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260
Which is:
RB_WARN_ON(cpu_buffer, nr_removed);
Note each ring buffer page holds 4080 bytes.
This is because:
1) 10 causes the ring buffer to have 3 pages.
(10kb requires 3 * 4080 pages to hold)
2) (2^31 / 2^10 + 1) * 4080 = 8556384240
The value written into buffer_size_kb is shifted by 10 and then passed
to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760
3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
which is 4080. 8761737461760 / 4080 = 2147484672
4) nr_pages is subtracted from the current nr_pages (3) and we get: 2147484669. This value is saved in a signed integer nr_pages_to_update
5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
turns into the value of -2147482627
6) As the value is a negative number, in update_pages_handler() it is
negated and passed to rb_remove_pages() and 2147482627 pages will
be removed, which is much larger than 3 and it causes the warning
because not all the pages asked to be removed were removed.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001 Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer") Reported-by: Hao Qin <QEver.cn@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In testing with HiKey, we found that since
commit 3f30b158eba5 ("asix: On RX avoid creating bad Ethernet
frames"),
we're seeing lots of noise during network transfers:
[ 239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[ 239.037310] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x54ebb5ec, offset 4
[ 239.045519] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xcdffe7a2, offset 4
[ 239.275044] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[ 239.284355] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x1d36f59d, offset 4
[ 239.292541] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xaef3c1e9, offset 4
[ 239.518996] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[ 239.528300] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x2881912, offset 4
[ 239.536413] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x5638f7e2, offset 4
And network throughput ends up being pretty bursty and slow with
a overall throughput of at best ~30kB/s (where as previously we
got 1.1MB/s with the slower USB1.1 "full speed" host).
We found the issue also was reproducible on a x86_64 system,
using a "high-speed" USB2.0 port but the throughput did not
measurably drop (possibly due to the scp transfer being cpu
bound on my slow test hardware).
After lots of debugging, I found the check added in the
problematic commit seems to be calculating the offset
incorrectly.
In the normal case, in the main loop of the function, we do:
(where offset is zero, or set to "offset += (copy_length + 1) &
0xfffe" in the previous loop)
rx->header = get_unaligned_le32(skb->data +
offset);
offset += sizeof(u32);
But the problematic patch calculates:
offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32);
rx->header = get_unaligned_le32(skb->data + offset);
Adding some debug logic to check those offset calculation used
to find rx->header, the one in problematic code is always too
large by sizeof(u32).
Thus, this patch removes the incorrect " + sizeof(u32)" addition
in the problematic calculation, and resolves the issue.
Cc: Dean Jenkins <Dean_Jenkins@mentor.com> Cc: "David B. Robins" <linux@davidrobins.net> Cc: Mark Craske <Mark_Craske@mentor.com> Cc: Emil Goode <emilgoode@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: YongQin Liu <yongqin.liu@linaro.org> Cc: Guodong Xu <guodong.xu@linaro.org> Cc: Ivan Vecera <ivecera@redhat.com> Cc: linux-usb@vger.kernel.org Cc: netdev@vger.kernel.org Reported-by: Yongqin Liu <yongqin.liu@linaro.org> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:
...
Set NullSession to FALSE
If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
(AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
OR
AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
-- Special case: client requested anonymous authentication
Set NullSession to TRUE
...
Only server which map unknown users to guest will allow
access using a non-null NTChallengeResponse.
For Samba it's the "map to guest = bad user" option.
Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Wrong return code was being returned on SMB3 rmdir of
non-empty directory.
For SMB3 (unlike for cifs), we attempt to delete a directory by
set of delete on close flag on the open. Windows clients set
this flag via a set info (SET_FILE_DISPOSITION to set this flag)
which properly checks if the directory is empty.
With this patch on smb3 mounts we correctly return
"DIRECTORY NOT EMPTY"
on attempts to remove a non-empty directory.
Signed-off-by: Steve French <steve.french@primarydata.com> Acked-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The EC field of the constructed ESR is conditionally modified by ORing in
ESR_ELx_EC_DABT_LOW for a data abort. However, ESR_ELx_EC_SHIFT is missing
from this condition.
Signed-off-by: Matt Evans <matt.evans@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The ARM architecture mandates that when changing a page table entry
from a valid entry to another valid entry, an invalid entry is first
written, TLB invalidated, and only then the new entry being written.
The current code doesn't respect this, directly writing the new
entry and only then invalidating TLBs. Let's fix it up.
Reported-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The loop that browses the array compat_hwcap_str will stop when a NULL
is encountered, however NULL is missing at the end of array. This will
lead to overrun until a NULL is found somewhere in the following memory.
In reality, this works out because the compat_hwcap2_str array tends to
follow immediately in memory, and that *is* terminated correctly.
Furthermore, the unsigned int compat_elf_hwcap is checked before
printing each capability, so we end up doing the right thing because
the size of the two arrays is less than 32. Still, this is an obvious
mistake and should be fixed.
Note for backporting: commit 12d11817eaafa414 ("arm64: Move
/proc/cpuinfo handling code") moved this code in v4.4. Prior to that
commit, the same change should be made in arch/arm64/kernel/setup.c.
Fixes: 44b82b7700d0 "arm64: Fix up /proc/cpuinfo" Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The update to the accessed or dirty states for block mappings must be
done atomically on hardware with support for automatic AF/DBM. The
ptep_set_access_flags() function has been fixed as part of commit 66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware
AF/DBM"). This patch brings pmdp_set_access_flags() in line with the pte
counterpart.
Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
When hardware updates of the access and dirty states are enabled, the
default ptep_set_access_flags() implementation based on calling
set_pte_at() directly is potentially racy. This triggers the "racy dirty
state clearing" warning in set_pte_at() because an existing writable PTE
is overridden with a clean entry.
There are two main scenarios for this situation:
1. The CPU getting an access fault does not support hardware updates of
the access/dirty flags. However, a different agent in the system
(e.g. SMMU) can do this, therefore overriding a writable entry with a
clean one could potentially lose the automatically updated dirty
status
2. A more complex situation is possible when all CPUs support hardware
AF/DBM:
a) Initial state: shareable + writable vma and pte_none(pte)
b) Read fault taken by two threads of the same process on different
CPUs
c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
eventually reaches do_set_pte() which sets a writable + clean pte.
CPU0 releases the mmap_sem
d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
pte entry it reads is present, writable and clean and it continues
to pte_mkyoung()
e) CPU1 calls ptep_set_access_flags()
If between (d) and (e) the hardware (another CPU) updates the dirty
state (clears PTE_RDONLY), CPU1 will override the PTR_RDONLY bit
marking the entry clean again.
This patch implements an arm64-specific ptep_set_access_flags() function
to perform an atomic update of the PTE flags.
Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reported-by: Ming Lei <tom.leiming@gmail.com> Tested-by: Julien Grall <julien.grall@arm.com> Cc: Will Deacon <will.deacon@arm.com>
[will: reworded comment] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Currently, pmd_present() only checks for a non-zero value, returning
true even after pmd_mknotpresent() (which only clears the type bits).
This patch converts pmd_present() to using pte_present(), similar to the
other pmd_*() checks. As a side effect, it will return true for
PROT_NONE mappings, though they are not yet used by the kernel with
transparent huge pages.
For consistency, also change pmd_mknotpresent() to only clear the
PMD_SECT_VALID bit, even though the PMD_TABLE_BIT is already 0 for block
mappings (no functional change). The unused PMD_SECT_PROT_NONE
definition is removed as transparent huge pages use the pte page prot
values.
Fixes: 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
With hardware AF/DBM support, pmd modifications (transparent huge pages)
should be performed atomically using load/store exclusive. The initial
patches defined the get-and-clear function and __HAVE_ARCH_* macro
without the "huge" word, leaving the pmdp_huge_get_and_clear() to the
default, non-atomic implementation.
Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
In commit bcff24887d00 ("ext4: don't read blocks from disk after extents
being swapped") bh is not updated correctly in the for loop and wrong
data has been written to disk. generic/324 catches this on sub-page
block size ext4.
Fixes: bcff24887d00 ("ext4: don't read blocks from disk after extentsbeing swapped") Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
<SNIP>
CC /tmp/build/perf/tests/llvm.o
cc1: warnings being treated as errors
tests/llvm.c: In function ‘test_llvm__fetch_bpf_obj’:
tests/llvm.c:53: error: declaration of ‘index’ shadows a global declaration
/usr/include/string.h:489: error: shadowed declaration is here
<SNIP>
CC /tmp/build/perf/tests/bpf.o
cc1: warnings being treated as errors
tests/bpf.c: In function ‘__test__bpf’:
tests/bpf.c:149: error: declaration of ‘index’ shadows a global declaration
/usr/include/string.h:489: error: shadowed declaration is here
<SNIP>
Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Fixes: b31de018a628 ("perf test: Enhance the LLVM test: update basic BPF test program") Fixes: ba1fae431e74 ("perf test: Add 'perf test BPF'") Link: http://lkml.kernel.org/n/tip-akpo4r750oya2phxoh9e3447@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Nikolay Borisov <kernel@kyup.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Jann reported that the ptrace_may_access() check in
find_lively_task_by_vpid() is racy against exec().
Specifically:
perf_event_open() execve()
ptrace_may_access()
commit_creds()
... if (get_dumpable() != SUID_DUMP_USER)
perf_event_exit_task();
perf_install_in_context()
would result in installing a counter across the creds boundary.
Fix this by wrapping lots of perf_event_open() in cred_guard_mutex.
This should be fine as perf_event_exit_task() is already called with
cred_guard_mutex held, so all perf locks already nest inside it.
Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: He Kuang <hekuang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Currently, the PT driver always sets the PMI bit one region (page) before
the STOP region so that we can wake up the consumer before we run out of
room in the buffer and have to disable the event. However, we also need
an interrupt in the last output region, so that we actually get to disable
the event (if no more room from new data is available at that point),
otherwise hardware just quietly refuses to start, but the event is
scheduled in and we end up losing trace data till the event gets removed.
For a cpu-wide event it is even worse since there may not be any
re-scheduling at all and no chance for the ring buffer code to notice
that its buffer is filled up and the event needs to be disabled (so that
the consumer can re-enable it when it finishes reading the data out). In
other words, all the trace data will be lost after the buffer gets filled
up.
This patch makes PT also generate a PMI when the last output region is
full.
Reported-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: vince@deater.net Link: http://lkml.kernel.org/r/1462886313-13660-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The fd we pass in may not be on a btrfs file system, so don't try to do
BTRFS_I() on it. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1588965
Hyper-V vmbus module registers TSC page clocksource when loaded. This is
the clocksource with the highest rating and thus it becomes the watchdog
making unloading of the vmbus module impossible.
Separate clocksource_select_watchdog() from clocksource_enqueue_watchdog()
and use it on clocksource register/rating change/unregister.
After all, lobotomized monkeys may need some love too.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1453483913-25672-1-git-send-email-vkuznets@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit bbf66d897adf2bb0c310db96c97e8db6369f39e1) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Tyler Hicks [Wed, 1 Jun 2016 02:43:44 +0000 (21:43 -0500)]
UBUNTU: SAUCE: net: Use ns_capable_noaudit() when determining net sysctl permissions
BugLink: https://bugs.launchpad.net/bugs/1465724
The capability check should not be audited since it is only being used
to determine the inode permissions. A failed check does not indicate a
violation of security policy but, when an LSM is enabled, a denial audit
message was being generated.
The denial audit message caused confusion for some application authors
because root-running Go applications always triggered the denial. To
prevent this confusion, the capability check in net_ctl_permissions() is
switched to the noaudit variant.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Tyler Hicks [Wed, 1 Jun 2016 02:43:43 +0000 (21:43 -0500)]
UBUNTU: SAUCE: kernel: Add noaudit variant of ns_capable()
BugLink: https://bugs.launchpad.net/bugs/1465724
When checking the current cred for a capability in a specific user
namespace, it isn't always desirable to have the LSMs audit the check.
This patch adds a noaudit variant of ns_capable() for when those
situations arise.
The common logic between ns_capable() and the new ns_capable_noaudit()
is moved into a single, shared function to keep duplicated code to a
minimum and ease maintainability.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
dd if=/dev/zero of=$MNT/test bs=1 count=2048 &
btrfs subvolume snapshot -r $MNT $MNT/test_snap &
wait
--
We can see dd failed on NO_SPACE.
Reason:
__btrfs_buffered_write should run cow write when no_cow impossible,
and current code is designed with above logic.
But check_can_nocow() have 2 type of return value(0 and <0) on
can_not_no_cow, and current code only continue write on first case,
the second case happened in doing subvolume.
Fix:
Continue write when check_can_nocow() return 0 and <0.
Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
(cherry picked from commit 4da2e26a2a32b174878744bd0f07db180c875f26) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Chris Bainbridge [Thu, 26 May 2016 19:11:56 +0000 (13:11 -0600)]
usb: core: hub: hub_port_init lock controller instead of bus
BugLink: http://bugs.launchpad.net/bugs/1437492
The XHCI controller presents two USB buses to the system - one for USB2
and one for USB3. The hub init code (hub_port_init) is reentrant but
only locks one bus per thread, leading to a race condition failure when
two threads attempt to simultaneously initialise a USB2 and USB3 device:
[ 8.034843] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command
[ 13.183701] usb 3-3: device descriptor read/all, error -110
On a test system this failure occurred on 6% of all boots.
Mathias Nyman explains the current behaviour violates the XHCI spec:
hub_port_reset() will end up moving the corresponding xhci device slot
to default state.
As hub_port_reset() is called several times in hub_port_init() it
sounds reasonable that we could end up with two threads having their
xhci device slots in default state at the same time, which according to
xhci 4.5.3 specs still is a big no no:
"Note: Software shall not transition more than one Device Slot to the
Default State at a time"
So both threads fail at their next task after this.
One fails to read the descriptor, and the other fails addressing the
device.
Fix this in hub_port_init by locking the USB controller (instead of an
individual bus) to prevent simultaneous initialisation of both buses.
Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in parallel") Link: https://lkml.org/lkml/2016/2/8/312 Link: https://lkml.org/lkml/2016/2/4/748 Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Cc: stable <stable@vger.kernel.org> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(back ported from commit feb26ac31a2a5cb88d86680d9a94916a6343e9e6) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/usb/core/hcd.c
Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Andi Kleen [Thu, 26 May 2016 13:09:46 +0000 (07:09 -0600)]
kernek/fork.c: allocate idle task for a CPU always on its local node
BugLink: http://bugs.launchpad.net/bugs/1585850
Linux preallocates the task structs of the idle tasks for all possible
CPUs. This currently means they all end up on node 0. This also implies
that the cache line of MWAIT, which is around the flags field in the task
struct, are all located in node 0.
We see a noticeable performance improvement on Knights Landing CPUs when
the cache lines used for MWAIT are located in the local nodes of the CPUs
using them. I would expect this to give a (likely slight) improvement on
other systems too.
The patch implements placing the idle task in the node of its CPUs, by
passing the right target node to copy_process()
[akpm@linux-foundation.org: use NUMA_NO_NODE, not a bare -1] Link: http://lkml.kernel.org/r/1463492694-15833-1-git-send-email-andi@firstfloor.org Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 725fc629ff2545b061407305ae51016c9f928fce) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
PCI: hv: Add explicit barriers to config space access
BugLink: http://bugs.launchpad.net/bugs/1581243
I'm trying to pass-through Broadcom BCM5720 NIC (Dell device 1f5b) on a
Dell R720 server. Everything works fine when the target VM has only one
CPU, but SMP guests reboot when the NIC driver accesses PCI config space
with hv_pcifront_read_config()/hv_pcifront_write_config(). The reboot
appears to be induced by the hypervisor and no crash is observed. Windows
event logs are not helpful at all ('Virtual machine ... has quit
unexpectedly'). The particular access point is always different and
putting debug between them (printk/mdelay/...) moves the issue further
away. The server model affects the issue as well: on Dell R420 I'm able to
pass-through BCM5720 NIC to SMP guests without issues.
While I'm obviously failing to reveal the essence of the issue I was able
to come up with a (possible) solution: if explicit barriers are added to
hv_pcifront_read_config()/hv_pcifront_write_config() the issue goes away.
The essential minimum is rmb() at the end on _hv_pcifront_read_config() and
wmb() at the end of _hv_pcifront_write_config() but I'm not confident it
will be sufficient for all hardware. I suggest the following barriers:
1) wmb()/mb() between choosing the function and writing to its space.
2) mb() before releasing the spinlock in both _hv_pcifront_read_config()/
_hv_pcifront_write_config() to ensure that consecutive reads/writes to
the space won't get re-ordered as drivers may count on that.
Config space access is not supposed to be performance-critical so these
explicit barriers should not cause any slowdown.
[bhelgaas: use Linux "barriers" terminology] Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jake Oshins <jakeo@microsoft.com>
(cherry picked from commit bdd74440d9e887b1fa648eefa17421def5f5243c) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
PCI: hv: Report resources release after stopping the bus
BugLink: http://bugs.launchpad.net/bugs/1581243
Kernel hang is observed when pci-hyperv module is release with device
drivers still attached. E.g., when I do 'rmmod pci_hyperv' with BCM5720
device pass-through-ed (tg3 module) I see the following:
The problem seems to be that we report local resources release before
stopping the bus and removing devices from it and device drivers may try to
perform some operations with these resources on shutdown. Move resources
release report after we do pci_stop_root_bus().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jake Oshins <jakeo@microsoft.com>
(cherry picked from commit deb22e5c84c884a129d801cf3bfde7411536998d) Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Alan Stern [Wed, 25 May 2016 17:58:49 +0000 (11:58 -0600)]
USB: leave LPM alone if possible when binding/unbinding interface drivers
BugLink: http://bugs.launchpad.net/bugs/1577024
When a USB driver is bound to an interface (either through probing or
by claiming it) or is unbound from an interface, the USB core always
disables Link Power Management during the transition and then
re-enables it afterward. The reason is because the driver might want
to prevent hub-initiated link power transitions, in which case the HCD
would have to recalculate the various LPM parameters. This
recalculation takes place when LPM is re-enabled and the new
parameters are sent to the device and its parent hub.
However, if the driver does not want to prevent hub-initiated link
power transitions then none of this work is necessary. The parameters
don't need to be recalculated, and LPM doesn't need to be disabled and
re-enabled.
It turns out that disabling and enabling LPM can be time-consuming,
enough so that it interferes with user programs that want to claim and
release interfaces rapidly via usbfs. Since the usbfs kernel driver
doesn't set the disable_hub_initiated_lpm flag, we can speed things up
and get the user programs to work by leaving LPM alone whenever the
flag isn't set.
And while we're improving the way disable_hub_initiated_lpm gets used,
let's also fix its kerneldoc.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Tested-by: Matthew Giassa <matthew@giassa.net> CC: Mathias Nyman <mathias.nyman@intel.com> CC: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6fb650d43da3e7054984dc548eaa88765a94d49f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Thu, 26 May 2016 16:25:51 +0000 (10:25 -0600)]
blk-mq: End unstarted requests on dying queue
BugLink: http://bugs.launchpad.net/bugs/1581034
Go directly to ending a request if it wasn't started. Previously, completing a
request may invoke a driver callback for a request it didn't initialize.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit a59e0f5795fe52dad42a99c00287e3766153b312) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Christopher Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Thu, 26 May 2016 16:25:50 +0000 (10:25 -0600)]
NVMe: Move error handling to failed reset handler
BugLink: http://bugs.launchpad.net/bugs/1581034
This moves failed queue handling out of the namespace removal path and
into the reset failure path, fixing a hanging condition if the controller
fails or link down during del_gendisk. Previously the driver had to see
the controller as degraded prior to calling del_gendisk to setup the
queues to fail. But, if the controller happened to fail after this,
there was no task to end outstanding requests.
On failure, all namespace states are set to dead. This has capacity
revalidate to 0, and ends all new requests with error status.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from commit 69d9a99c258eb1d6478fd9608a2070890797eed7) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/pci.c
Acked-by: Christopher Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Thu, 26 May 2016 16:25:49 +0000 (10:25 -0600)]
NVMe: Requeue requests on suspended queues
BugLink: http://bugs.launchpad.net/bugs/1581034
It's possible a request may get to the driver after the nvme queue was
disabled. This has the request requeue if that happens.
Note the request is still "started" by the driver, but requeuing will
clear the start state for timeout handling.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit ae1fba20015bca7401db2422fe18c9c049184163) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Christopher Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Keith Busch [Thu, 26 May 2016 16:25:48 +0000 (10:25 -0600)]
NVMe: Fix namespace removal deadlock
BugLink: http://bugs.launchpad.net/bugs/1581034
This patch makes nvme namespace removal lockless. It is up to the caller
to ensure no active namespace scanning is occuring. To ensure no scan
work occurs, the nvme pci driver adds a removing state to the controller
device to avoid queueing scan work during removal. The work is flushed
after setting the state, so no new scan work can be queued.
The lockless removal allows the driver to cleanup a namespace
request_queue if the controller fails during removal. Previously this
could deadlock trying to acquire the namespace mutex in order to handle
such events.
Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from commit 646017a612e72f19bd9f991fe25287a149c5f627) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/pci.c
Acked-by: Christopher Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Thu, 26 May 2016 12:58:23 +0000 (13:58 +0100)]
USB: usbfs: fix potential infoleak in devio
The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
are padding bytes which are not initialized and leaked to userland
via “copy_to_user”.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 681fef8380eb818c0b845fca5d2ab1dcbab114ee)
CVE-2016-4482 BugLink: https://bugs.launchpad.net/bugs/1578493 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Majd Dibbiny [Thu, 26 May 2016 10:31:11 +0000 (13:31 +0300)]
net/mlx5: Add pci shutdown callback
BugLink: http://bugs.launchpad.net/bugs/1585978
This patch introduces kexec support for mlx5.
When switching kernels, kexec() calls shutdown, which unloads
the driver and cleans its resources.
In addition, remove unregister netdev from shutdown flow. This will
allow a clean shutdown, even if some netdev clients did not release their
reference from this netdev. Releasing The HW resources only is enough as
the kernel is shutting down
Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
(back-ported from commit 5fc7197d3a256d9c5de3134870304b24892a4908) Signed-off-by: Talat Batheesh <talatb@mellanox.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Philip Whineray [Tue, 24 May 2016 14:42:19 +0000 (09:42 -0500)]
netfilter: Set /proc/net entries owner to root in namespace
BugLink: http://bugs.launchpad.net/bugs/1584953
Various files are owned by root with 0440 permission. Reading them is
impossible in an unprivileged user namespace, interfering with firewall
tools. For instance, iptables-save relies on /proc/net/ip_tables_names
contents to dump only loaded tables.
This patch assigned ownership of the following files to root in the
current namespace:
A mapping for root must be available, so this order should be followed:
unshare(CLONE_NEWUSER);
/* Setup the mapping */
unshare(CLONE_NEWNET);
Signed-off-by: Philip Whineray <phil@firehol.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit f13f2aeed154da8e48f90b85e720f8ba39b1e881) Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Richard Alpe [Wed, 25 May 2016 15:23:10 +0000 (16:23 +0100)]
tipc: check nl sock before parsing nested attributes
Make sure the socket for which the user is listing publication exists
before parsing the socket netlink attributes.
Prior to this patch a call without any socket caused a NULL pointer
dereference in tipc_nl_publ_dump().
Tested-and-reported-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Acked-by: Jon Maloy <jon.maloy@ericsson.cm> Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 45e093ae2830cd1264677d47ff9a95a71f5d9f9c)
CVE-2016-4951 BugLink: https://bugs.launchpad.net/bugs/1585365 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Wed, 25 May 2016 15:17:25 +0000 (16:17 +0100)]
ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt
The stack object “r1” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit e4ec8cc8039a7063e24204299b462bd1383184a5)
CVE-2016-4578 BugLink: https://bugs.launchpad.net/bugs/1581866 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Wed, 25 May 2016 15:17:24 +0000 (16:17 +0100)]
ALSA: timer: Fix leak in events via snd_timer_user_ccallback
The stack object “r1” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 9a47e9cff994f37f7f0dbd9ae23740d0f64f9fe6)
CVE-2016-4578 BugLink: https://bugs.launchpad.net/bugs/1581866 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Kangjie Lu [Wed, 25 May 2016 15:15:21 +0000 (16:15 +0100)]
ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS
The stack object “tread” has a total size of 32 bytes. Its field
“event” and “val” both contain 4 bytes padding. These 8 bytes
padding bytes are sent to user without being initialized.
Signed-off-by: Kangjie Lu <kjlu@gatech.edu> Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit cec8f96e49d9be372fdb0c3836dcf31ec71e457e)
CVE-2016-4569 BugLink: https://bugs.launchpad.net/bugs/1580379 Signed-off-by: Luis Henriques <luis.henriques@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Sebastian Ott [Mon, 23 May 2016 17:15:32 +0000 (11:15 -0600)]
s390/pci: fix use after free in dma_init
BugLink: http://bugs.launchpad.net/bugs/1584828
After a failure during registration of the dma_table (because of the
function being in error state) we free its memory but don't reset the
associated pointer to zero.
When we then receive a notification from firmware (about the function
being in error state) we'll try to walk and free the dma_table again.
Fix this by resetting the dma_table pointer. In addition to that make
sure that we free the iommu_bitmap when appropriate.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(cherry picked from commit dba599091c191d209b1499511a524ad9657c0e5a) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
There is a race with multi-threaded applications between context switch and
pagetable upgrade. In switch_mm() a new user_asce is built from mm->pgd and
mm->context.asce_bits, w/o holding any locks. A concurrent mmap with a
pagetable upgrade on another thread in crst_table_upgrade() could already
have set new asce_bits, but not yet the new mm->pgd. This would result in a
corrupt user_asce in switch_mm(), and eventually in a kernel panic from a
translation exception.
Fix this by storing the complete asce instead of just the asce_bits, which
can then be read atomically from switch_mm(), so that it either sees the
old value or the new value, but no mixture. Both cases are OK. Having the
old value would result in a page fault on access to the higher level memory,
but the fault handler would see the new mm->pgd, if it was a valid access
after the mmap on the other thread has completed. So as worst-case scenario
we would have a page fault loop for the racing thread until the next time
slice.
Also remove dead code and simplify the upgrade/downgrade path, there are no
upgrades from 2 levels, and only downgrades from 3 levels for compat tasks.
There are also no concurrent upgrades, because the mmap_sem is held with
down_write() in do_mmap, so the flush and table checks during upgrade can
be removed.
Reported-by: Michael Munday <munday@ca.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Frederic Barrat [Wed, 25 May 2016 15:03:53 +0000 (09:03 -0600)]
cxl: Increase timeout for detection of AFU mmio hang
BugLink: http://bugs.launchpad.net/bugs/1584066
PSL designers recommend a larger value for the mmio hang pulse, 256 us
instead of 1 us. The CAIA architecture states that it needs to be
smaller than 1/2 of the RTOS timeout set in the PHB for outbound
non-posted transactions, which is still (easily) the case here.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Frank Haverkamp <haver@linux.vnet.ibm.com> Tested-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 4aec6ec0da9c72c0fa1a5b0d1133707481347bb3) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
cxl: Configure the PSL for two CAPI ports on POWER8NVL
BugLink: http://bugs.launchpad.net/bugs/1584066
The POWER8NVL chip has two CAPI ports. Configure the PSL to route
data to the port corresponding to the CAPP unit.
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit aa14138a51ca42eada706d4b9635bd32d1e09ced) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1584066 Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 86c9ffcc1ed17497a5df473232a7e654fa8cfa1f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
This is caused by the vt visual_init() function calling into
fbcon_init() with a vc_cur_blink_ms value of zero. This is a
transient condition, as it is later set to a non-zero value. But, if
the timer happens to expire while the blink rate is zero, it goes into
an endless loop, and we get soft lockup.
The fix is to initialize vc_cur_blink_ms before calling the con_init()
function.
Tested-by: Ming Lei <ming.lei@canonical.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Scot Doyle <lkml14@scotdoyle.com> Signed-off-by: David Daney <david.daney@cavium.com> Cc: stable@vger.kernel.org
Reference: https://lkml.org/lkml/2016/5/17/455 Cc: Ming Lei <ming.lei@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1584890
Make the "WARNING: inconsistent compiler versions" check actually do
something useful by restricting it to check just the source package
version number (since the whole version string now contains e.g.
"Ubuntu/Linaro", "Ubuntu/IBM")
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com>
Kamal Mostafa [Mon, 23 May 2016 17:29:56 +0000 (10:29 -0700)]
UBUNTU: [debian] getabis: Only git add $abidir if running in local repo
BugLink: http://bugs.launchpad.net/bugs/1584890
Fixes bogus error when run on a remote host, as via maint-startnewrelease:
fatal: Not a git repository (or any of the parent directories): .git
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com>
Manoj N. Kumar [Mon, 23 May 2016 20:30:42 +0000 (14:30 -0600)]
cxlflash: Fix to resolve dead-lock during EEH recovery
BugLink: http://bugs.launchpad.net/bugs/1584935
When a cxlflash adapter goes into EEH recovery and multiple processes
(each having established its own context) are active, the EEH recovery
can hang if the processes attempt to recover in parallel. The symptom
logged after a couple of minutes is:
INFO: task eehd:48 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
eehd 0 48 2
Call Trace:
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_write_failed+0x294/0x410
down_write+0x88/0xb0
cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
cxl_vphb_error_detected+0x88/0x110 [cxl]
cxl_pci_error_detected+0xb0/0x1d0 [cxl]
eeh_report_error+0xbc/0x130
eeh_pe_dev_traverse+0x94/0x160
eeh_handle_normal_event+0x17c/0x450
eeh_handle_event+0x184/0x370
eeh_event_handler+0x1c8/0x1d0
kthread+0x110/0x130
ret_from_kernel_thread+0x5c/0xa4
INFO: task blockio:33215 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blockio 0 33215 33213
Call Trace:
0x1 (unreliable)
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_read_failed+0x124/0x1d0
down_read+0x68/0x80
cxlflash_ioctl+0x70/0x6f0 [cxlflash]
scsi_ioctl+0x3b0/0x4c0
sg_ioctl+0x960/0x1010
do_vfs_ioctl+0xd8/0x8c0
SyS_ioctl+0xd4/0xf0
system_call+0x38/0xb4
INFO: task eehd:48 blocked for more than 120 seconds.
The hang is because of a 3 way dead-lock:
Process A holds the recovery mutex, and waits for eehd to complete.
Process B holds the semaphore and waits for the recovery mutex.
eehd waits for semaphore.
The fix is to have Process B above release the semaphore before
attempting to acquire the recovery mutex. This will allow
eehd to proceed to completion.
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 635f6b0893cff193a1774881ebb1e4a4b9a7fead) Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Chris J Arges <chris.j.arges@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1584912 Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.
Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.
This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.
Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
There is an issue observed when we hotplug a second DP
4K monitor to the system. Sometimes, the link training
fails for the second monitor after HPD interrupt
generation.
The issue happens when some queued or deferred transactions
are already present on the AUX channel when we initiate
a new transcation to (say) get DPCD or during link training.
We set AUX_IGNORE_HPD_DISCON bit in the AUX_CONTROL
register so that we can ignore any such deferred
transactions when a new AUX transaction is initiated.
Signed-off-by: Arindam Nath <arindam.nath@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
BSpec requires us to wait ~100 clocks before re-enabling clock gating,
so make sure we do this.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462280061-1457-2-git-send-email-imre.deak@intel.com
(cherry picked from commit 48e5d68d28f00c0cadac5a830980ff3222781abb) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>