git.proxmox.com Git - mirror_ubuntu-artful-kernel.git/log

intel-hid: new hid event driver for hotkeys

BugLink: http://bugs.launchpad.net/bugs/1589886
This driver supports various HID events including hotkeys.
Dell XPS 13 9350 requires it for the wireless hotkey.

Signed-off-by: Alex Hung <alex.hung@canonical.com>
Reviewed-and-tested-by: Andy Lutomirski <luto@kernel.org>
[dvhart: Kconfig help typo fix and INPUT_SPARSEKMAP fix from Sedat Dilek]

Signed-off-by: Darren Hart <dvhart@linux.intel.com>
(cherry picked from commit ecc83e52b28c707da3e7fb8aa471417d9c0d1ec7)
Signed-off-by: Alex Hung <alex.hung@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) ipc/mqueue: The mqueue filesystem should never contain executables

BugLink: http://bugs.launchpad.net/bugs/1588056
Set SB_I_NOEXEC on mqueuefs to ensure small implementation mistakes
do not result in executables on mqueuefs by accident.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(backported from linux-next commit 3ee690143c3c99f6c0e83f08ff17556890bc6027)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) kernfs: The cgroup filesystem also benefits from SB_I_NOEXEC

BugLink: http://bugs.launchpad.net/bugs/1588056
The cgroup filesystem is in the same boat as sysfs. No one ever
permits executables of any kind on the cgroup filesystem, and there is
no reasonable case for supporting executables there in the future.

Therefore move the setting of SB_I_NOEXEC, which makes the code proof
against future mistakes of accidentally creating executables, from
sysfs to kernfs itself. This makes the code simpler and covers the
sysfs, cgroup, and cgroup2 filesystems.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(backported from linux-next commit 29a517c232d21a717aecea29838aeb07131f6196)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: SAUCE: (namespace) Sync with upstream s_user_ns patches

BugLink: http://bugs.launchpad.net/bugs/1588056
Sync up with changes from Eric Biederman when merging s_user_ns
support upstream. Partial backport of
6e4eab577a0cae15b3da9b888cff16fe57981b3e from linux-next.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) vfs: Pass data, ns, and ns->userns to mount_ns

BugLink: http://bugs.launchpad.net/bugs/1588056
Today what is normally called data (the mount options) is not passed
to fill_super through mount_ns.

Pass the mount options and the namespace separately to mount_ns so
that filesystems such as proc that have mount options, can use
mount_ns.

Pass the user namespace to mount_ns so that the standard permission
check that verifies the mounter has permissions over the namespace can
be performed in mount_ns instead of in each filesystem's .mount method,
thus removing the duplication between mqueuefs and proc in terms of
permission checks. The extra permission check does not currently
affect the rpc_pipefs filesystem and the nfsd filesystem as those
filesystems do not currently allow unprivileged mounts. Without
unprivileged mounts it is guaranteed that the caller has already passed
capable(CAP_SYS_ADMIN), which guarantees the extra permission check will
pass.

Update rpc_pipefs and the nfsd filesystem to ensure that the network
namespace reference is always taken in fill_super and always put in kill_sb
so that the logic is simpler and so that errors originating inside of
fill_super do not cause a network namespace leak.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from linux-next commit d91ee87d8d85a0808c01787e8b4a6b48f2ba487b)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) ipc: Initialize ipc_namespace->user_ns early.

BugLink: http://bugs.launchpad.net/bugs/1588056
Allow the ipc namespace initialization code to depend on ns->user_ns
being set during initialization.

In particular this allows mq_init_ns to use ns->user_ns for permission
checks and for initializing s_user_ns while the mq filesystem is
being mounted.

Acked-by: Seth Forshee <seth.forshee@canonical.com>
Suggested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from linux-next commit b236017acffa73d52eac9427f42d8993067d20fb)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) bpf, inode: disallow userns mounts

BugLink: http://bugs.launchpad.net/bugs/1588056
Follow-up to commit e27f4a942a0e ("bpf: Use mount_nodev not mount_ns
to mount the bpf filesystem"), which removes the FS_USERNS_MOUNT flag.

The original idea was to have a per mountns instance instead of a
single global fs instance, but that didn't work out and we had to
switch to the mount_nodev() model. The intent of that middle ground was
to keep users who don't play nice from creating endless instances
of bpf fs which are difficult to control and discover from an admin
point of view, but at the same time it would have allowed us to be
more flexible with regard to namespaces.

Therefore, since we now did the switch to mount_nodev() as a fix
where individual instances are created, we also need to remove the userns
mount flag along with it to avoid running into the mentioned situation.
I don't expect any breakage at this early point in time with removing
the flag and we can revisit this later should the requirement for
this come up with future users. This and commit e27f4a942a0e have
been split to facilitate tracking should any of them run into the
unlikely case of causing a regression.

Fixes: b2197755b263 ("bpf: add support for persistent maps/progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 612bacad78ba6d0a91166fc4487af114bac172a8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

(namespace) bpf: Use mount_nodev not mount_ns to mount the bpf filesystem

BugLink: http://bugs.launchpad.net/bugs/1588056
While reviewing the filesystems that set FS_USERNS_MOUNT I spotted the
bpf filesystem.  Looking at the code I saw a broken usage of mount_ns
with current->nsproxy->mnt_ns. As the code does not acquire a
reference to the mount namespace it can not possibly be correct to
store the mount namespace on the superblock as it does.

Replace mount_ns with mount_nodev so that each mount of the bpf
filesystem returns a distinct instance, and the code is not buggy.

In discussion with Hannes Frederic Sowa it was reported that the use
of mount_ns was an attempt to have one bpf instance per mount
namespace, in an attempt to keep resources that pin resources from
hiding.  That intent simply does not work; the vfs is not built to
allow that kind of behavior.  This means that the bpf filesystem
really is buggy both semantically and in its implementation, as it does
not, and cannot, implement the original intent.

This change is userspace visible, but my experience with similar
filesystems leads me to believe nothing will break with a model in which
each mount of the bpf filesystem is distinct from all others.

Fixes: b2197755b263 ("bpf: add support for persistent maps/progs")
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e27f4a942a0ee4b84567a3c6cfa84f273e55cbb7)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

Revert "UBUNTU: SAUCE: cgroup: Use a new super block when mounting in a cgroup namespace"

BugLink: http://bugs.launchpad.net/bugs/1588056
This reverts commit 794fbce4fb2e1f4b5ea7634d69ad05cbf65b11f5.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

Revert "UBUNTU: SAUCE: kernfs: Do not match superblock in another user namespace when mounting"

BugLink: http://bugs.launchpad.net/bugs/1588056
This reverts commit aadbec3a89fe98b072e506e9af782b4485c642d8.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

Revert "UBUNTU: SAUCE: (namespace) mqueue: Super blocks must be owned by the user ns which owns the ipc ns"

BugLink: http://bugs.launchpad.net/bugs/1588056
This reverts commit dec77184fe7e43a3a505125b627eb245f8e12ce0.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xhci: Cleanup only when releasing primary hcd

BugLink: http://bugs.launchpad.net/bugs/1596635
Under stress conditions some TI devices might not return early when
reading the status register during the quirk invocation of xhci_irq made
by usb_hcd_pci_remove.  This means that instead of returning, we end up
handling this interruption in the middle of a shutdown.  Since
xhci->event_ring has already been freed in xhci_mem_cleanup, we end up
accessing freed memory, causing the Oops below.

commit 8c24d6d7b09d ("usb: xhci: stop everything on the first call to
xhci_stop") is the one that changed the instant in which we clean up the
event queue when stopping a device.  Before, we didn't call
xhci_mem_cleanup at the first time xhci_stop is executed (for the shared
HCD), instead, we only did it after the invocation for the primary HCD,
much later at the removal path.  The code flow for this oops looks like
this:

xhci_pci_remove()
    usb_remove_hcd(xhci->shared)
        xhci_stop(xhci->shared)
            xhci_halt()
            xhci_mem_cleanup(xhci);  // Free the event_queue
    usb_hcd_pci_remove(primary)
        xhci_irq()  // Access the event_queue if STS_EINT is set. Crash.
        xhci_stop()
            xhci_halt()
            // return early

The fix modifies xhci_stop to only cleanup the xhci data when releasing
the primary HCD.  This way, we still have the event_queue configured
when invoking xhci_irq.  We still halt the device on the first call to
xhci_stop, though.
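
As a rough userspace model of the ordering described above (not the driver code; `model_xhci`, `model_hcd`, and `model_stop` are hypothetical names invented for this sketch), the device is halted on every stop call while the shared event ring is freed only for the primary HCD:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state shared by both HCDs. */
struct model_xhci {
	int *event_ring;	/* shared; must outlive the shared HCD's stop */
	bool halted;
};

struct model_hcd {
	struct model_xhci *xhci;
	bool is_primary;
};

/* Halt on every call, but free shared data only for the primary HCD. */
void model_stop(struct model_hcd *hcd)
{
	struct model_xhci *xhci;

	assert(hcd != NULL && hcd->xhci != NULL);
	xhci = hcd->xhci;

	xhci->halted = true;
	if (!hcd->is_primary)
		return;		/* event ring stays valid for a late interrupt */

	free(xhci->event_ring);
	xhci->event_ring = NULL;
}
```

In this model, stopping the shared HCD first leaves `event_ring` intact, so an interrupt handler that runs before the primary HCD is stopped still has valid memory to read.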

I could reproduce this issue several times on the mainline kernel by
doing a bind-unbind stress test with a specific storage gadget attached.
I also ran the same test over-night with my patch applied and didn't
observe the issue anymore.

[  113.334124] Unable to handle kernel paging request for data at address 0x00000028
[  113.335514] Faulting instruction address: 0xd00000000d4f767c
[  113.336839] Oops: Kernel access of bad area, sig: 11 [#1]
[  113.338214] SMP NR_CPUS=1024 NUMA PowerNV

[c000000efe47ba90] c000000000720850 usb_hcd_irq+0x50/0x80
[c000000efe47bac0] c00000000073d328 usb_hcd_pci_remove+0x68/0x1f0
[c000000efe47bb00] d00000000daf0128 xhci_pci_remove+0x78/0xb0 [xhci_pci]
[c000000efe47bb30] c00000000055cf70 pci_device_remove+0x70/0x110
[c000000efe47bb70] c00000000061c6bc __device_release_driver+0xbc/0x190
[c000000efe47bba0] c00000000061c7d0 device_release_driver+0x40/0x70
[c000000efe47bbd0] c000000000619510 unbind_store+0x120/0x150
[c000000efe47bc20] c0000000006183c4 drv_attr_store+0x64/0xa0
[c000000efe47bc60] c00000000039f1d0 sysfs_kf_write+0x80/0xb0
[c000000efe47bca0] c00000000039e14c kernfs_fop_write+0x18c/0x1f0
[c000000efe47bcf0] c0000000002e962c __vfs_write+0x6c/0x190
[c000000efe47bd90] c0000000002eab40 vfs_write+0xc0/0x200
[c000000efe47bde0] c0000000002ec85c SyS_write+0x6c/0x110
[c000000efe47be30] c000000000009260 system_call+0x38/0x108

Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Roger Quadros <rogerq@ti.com>
Cc: joel@jms.id.au
Cc: stable@vger.kernel.org
Reviewed-by: Roger Quadros <rogerq@ti.com>
Cc: <stable@vger.kernel.org> #v4.3+
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 27a41a83ec54d0edfcaf079310244e7f013a7701)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

crypto: vmx - IV size failing on skcipher API

BugLink: http://bugs.launchpad.net/bugs/1596557
The IV size was zero in CBC and CTR modes,
causing a bug triggered by skcipher.

Fix this by adding the correct size.

Signed-off-by: Leonidas Da Silva Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 0d3d054b43719ef33232677ba27ba6097afdafbc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

tpm_crb: fix mapping of the buffers

BugLink: http://bugs.launchpad.net/bugs/1596469
On my Lenovo x250 the following situation occurs:

[18697.813871] tpm_crb MSFT0101:00: can't request region for resource
[mem 0xacdff080-0xacdfffff]

The mapping of the control area overlaps the mapping of the command
buffer. The control area is mapped over a page, which is not right. It
should be mapped over sizeof(struct crb_control_area).

Fixing this issue unmasks another issue. Command and response buffers
can overlap and they do interleave on this machine. According to the PTP
specification the overlapping means that they are mapped to the same
buffer.

The commit has also been tested on a Haswell NUC, where things worked
before applying this fix, so that both code paths for response buffer
initialization are tested.

Cc: stable@vger.kernel.org
Fixes: 1bd047be37d9 ("tpm_crb: Use devm_ioremap_resource")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
(cherry picked from linux-next commit 0af6e0a2da2e4fedaa2743333da438d3b879192b)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

tpm_crb: drop struct resource res from struct crb_priv

BugLink: http://bugs.launchpad.net/bugs/1596469
The iomem resource is needed only temporarily so it is better to pass
it on instead of storing it permanently. Named the variable as io_res
so that the code better documents itself.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
(cherry picked from linux-next commit 3f944075e75e28c9cf1af8f82798398b0e3594b6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cxlflash: Shutdown notify support for CXL Flash cards

BugLink: http://bugs.launchpad.net/bugs/1592114
Some CXL Flash cards need notification of device shutdown in order to
flush pending I/Os.

A PCI notification hook for shutdown has been added where the driver
notifies the card and returns. When the device is removed in the PCI
remove path, notification code will wait for shutdown processing to
complete.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 61f7d211b07d34ea9bcb61a83d8adb3abfe75a5f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cxlflash: Add device dependent flags

BugLink: http://bugs.launchpad.net/bugs/1592114
Device dependent flags are needed to support functions that are specific
to a particular device.

One such case: some CXL Flash cards need to be notified of device
shutdown. For other CXL devices, this feature does not prove to be
useful yet. Such distinct features need to be identified in the driver
to bypass or invoke specific functionality.

In this patch, a member 'flags' has been added to device dependent
values. These flags will be used and expanded in the future to support
various device specific functions.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 4fecd2767dccfe9aafabc337e08acb7e585171ad)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cxlflash: Fix to drain operations from previous reset

BugLink: http://bugs.launchpad.net/bugs/1592114
While running 'sg_reset -H' in a loop with a user-space application active,
hit the following exception:

cpu 0x2: Vector: 300 (Data Access)
    pc: : afu_attach+0x50/0x240 [cxlflash]
    lr: : cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
    pid   = 20365, comm = run_block_fvt

Linux version 4.5.0-491-26f710d+

cxlflash_afu_recover+0x3dc/0x7d0 [cxlflash]
cxlflash_ioctl+0x5a8/0x6f0 [cxlflash]
scsi_ioctl+0x3b0/0x4c0
sd_ioctl+0x110/0x190
blkdev_ioctl+0x28c/0xc20
block_ioctl+0xa4/0xd0
do_vfs_ioctl+0xd8/0x8c0
SyS_ioctl+0xd4/0xf0
system_call+0x38/0xb4

The problem here is that the problem space area is unmapped while the
application issues the DK_CXLFLASH_RECOVER_AFU ioctl.

This is the order I observe:

proc1                           proc2
1) sg_reset
                                2) ioctl(DK_CXLFLASH_RECOVER_AFU)
3) sg_reset again
   causing a PSA unmap
                                4) continues RECOVER_AFU processing

The resolution to this problem is to have the reset handler drain all
outstanding user space initiated ioctls before proceeding.  It is safe
to drain after the state has been changed to STATE_RESET. Also since
drain_ioctls() was static, it had to be moved up a bit to be before
cxlflash_eh_host_reset_handler().

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 894ef44ea6d14153136fc5e5fba2c15a71be404d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cxlflash: Fix regression issue with re-ordering patch

BugLink: http://bugs.launchpad.net/bugs/1592114
While running 'sg_reset -H' back to back the following exception was seen:

[  735.115695] Faulting instruction address: 0xd0000000098c0864
cpu 0x0: Vector: 300 (Data Access) at [c000000ffffafa80]
    pc: d0000000098c0864: cxlflash_async_err_irq+0x84/0x5c0 [cxlflash]
    lr: c00000000013aed0: handle_irq_event_percpu+0xa0/0x310
    sp: c000000ffffafd00
   msr: 9000000000009033
   dar: 2010000
dsisr: 40000000
  current = 0xc000000001510880
  paca    = 0xc00000000fb80000   softe: 0        irq_happened: 0x01
    pid   = 0, comm = swapper/0

Linux version 4.5.0-491-26f710d+

enter ? for help
[c000000ffffafe10] c00000000013aed0 handle_irq_event_percpu+0xa0/0x310
[c000000ffffafed0] c00000000013b1a8 handle_irq_event+0x68/0xc0
[c000000ffffaff00] c0000000001404ec handle_fasteoi_irq+0xec/0x2a0
[c000000ffffaff30] c00000000013a084 generic_handle_irq+0x54/0x80
[c000000ffffaff60] c000000000011130 __do_irq+0x80/0x1d0
[c000000ffffaff90] c000000000024d40 call_do_irq+0x14/0x24
[c000000001573a20] c000000000011318 do_IRQ+0x98/0x140
[c000000001573a70] c000000000002594 hardware_interrupt_common+0x114/0x180

This exception is being hit because the async_err interrupt path performs
an MMIO to read the interrupt status register. The MMIO region in this
case is not available.

Commit 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching
master context") re-ordered the sequence in which term_mc() and stop_afu()
are called. This introduces a window for interrupts to come in with the
problem space area unmapped, that did not exist previously.

The fix is to separate the disabling of all AFU interrupts to a distinct
function, term_intr() so that it is the first thing that is done in the
tear down process.

To keep the initialization process symmetric, separate the AFU interrupt
setup also to a distinct function: init_intr().

Fixes: 6ded8b3cbd9a ("cxlflash: Unmap problem state area before detaching master context")
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 9526f36026f778e82b5175249443854c03b2e660)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: [Config] Add pm80xx scsi driver to d-i

This will add debian installer support for the Adaptec PMC-Sierra SAS
HBA controller.

BugLink: http://bugs.launchpad.net/bugs/1595628
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

[media] tda10071: Fix dependency to REGMAP_I2C

BugLink: http://bugs.launchpad.net/bugs/1592531
Without it I get this error for my dvb-card:
  tda10071: Unknown symbol devm_regmap_init_i2c (err 0)
  cx23885_dvb_register() dvb_register failed err = -22
  cx23885_dev_setup() Failed to register dvb adapters on VID_B

Signed-off-by: Matthias Schwarzott <zzam@gentoo.org>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
(cherry picked from commit b046d3ad38d90276379c862f15ddd99fa8739906)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

cxl: Make vPHB device node match adapter's

BugLink: http://bugs.launchpad.net/bugs/1594847
On bare-metal, when a device is attached to the cxl card, lsvpd shows
a location code such as (with cxlflash):
     # lsvpd -l sg22
     ...
     *YL U78CB.001.WZS0073-P1-C33-B0-T0-L0
which makes it hard to easily identify the cxl adapter owning the
flash device, since in this example C33 refers to a P8 processor.

lsvpd looks in the parent devices until it finds a location code, so the
device node for the vPHB ends up being used.

By reusing the device node of the adapter for the vPHB, lsvpd shows:
     # lsvpd -l sg16
     ...
     *YL U78C9.001.WZS09XA-P1-C7-B1-T0-L3
where C7 is the PCI slot of the cxl adapter.

On powerVM, the vPHB was already using the adapter device node, so
there's no change there.

Tested by cxlflash on bare-metal and powerVM.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit a430739009384ba2c4804f3a427334ff395433cd)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: [Config] Enable arm64 AES and CRC32 crypto

BugLink: http://bugs.launchpad.net/bugs/1594455
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

crypto: vmx - Increase priority of aes-cbc cipher

BugLink: http://bugs.launchpad.net/bugs/1592481
All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at
priority 1000. Unfortunately this means we never use AES-CBC and
AES-CTR, because the base AES-CBC cipher that is implemented on
top of AES inherits its priority.

To fix this, AES-CBC and AES-CTR have to be a higher priority. Set
them to 2000.
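
To illustrate why equal priorities keep the dedicated implementation from being chosen, here is a toy selection routine. It is only a sketch: `model_alg`, `model_pick`, and the driver strings used with it are hypothetical, and the tie-breaking rule (the earlier entry wins) is an assumption of this example, not a statement about the kernel crypto API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; not the kernel's struct crypto_alg. */
struct model_alg {
	const char *name;	/* algorithm name, e.g. "cbc(aes)" */
	const char *driver;	/* implementation name */
	int priority;
};

/*
 * Return the highest-priority entry for `name`, or NULL if none matches.
 * Assumption of this sketch: on a tie the earlier entry wins.
 */
const struct model_alg *model_pick(const struct model_alg *algs, size_t n,
				   const char *name)
{
	const struct model_alg *best = NULL;
	size_t i;

	assert(algs != NULL && name != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(algs[i].name, name) != 0)
			continue;
		if (best == NULL || algs[i].priority > best->priority)
			best = &algs[i];
	}
	return best;
}
```

With two "cbc(aes)" entries both at priority 1000, this routine keeps whichever was listed first; raising the dedicated entry to 2000 makes it win regardless of order.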

Testing on a POWER8 with:

cryptsetup benchmark --cipher aes --key-size 256

Shows decryption speed increase from 402.4 MB/s to 3069.2 MB/s,
over 7x faster. Thanks to Mike Strosaker for helping me debug
this issue.

Fixes: 8c755ace357c ("crypto: vmx - Adding CBC routines for VMX module")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 12d3f49e1ffbbf8cbbb60acae5a21103c5c841ac git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>

crypto: vmx - Fix ABI detection

BugLink: http://bugs.launchpad.net/bugs/1592481
When calling ppc-xlate.pl, we pass it either linux-ppc64 or
linux-ppc64le. The script however was expecting linux64le, a result
of its OpenSSL origins. This means we aren't obeying the ppc64le
ABIv2 rules.

Fix this by checking for linux-ppc64le.

Fixes: 5ca55738201c ("crypto: vmx - comply with ABIs that specify vrsave as reserved.")
Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 975f57fdff1d0eb9816806cabd27162a8a1a4038 git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>

crypto: vmx - comply with ABIs that specify vrsave as reserved.

BugLink: http://bugs.launchpad.net/bugs/1592481
It gives significant improvements ( ~+15%) on some modes.

This code has been adopted from OpenSSL project in collaboration
with the original author (Andy Polyakov <appro@openssl.org>).

Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 5ca55738201c7ae1b556ad87bbb22c139ecc01dd)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>

UBUNTU: [Config] CONFIG_SQUASHFS=y

BugLink: http://bugs.launchpad.net/bugs/1593134
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Paolo Pisati <paolo.pisati@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

KVM: PPC: Book3S HV: Re-enable XICS fast path for irqfd-generated interrupts

BugLink: http://bugs.launchpad.net/bugs/1592809
Commit c9a5eccac1ab ("kvm/eventfd: add arch-specific set_irq",
2015-10-16) added the possibility for architecture-specific code
to handle the generation of virtual interrupts in atomic context
where possible, without having to schedule a work function.

Since we can easily generate virtual interrupts on XICS without
having to do anything worse than take a spinlock, we define a
kvm_arch_set_irq_inatomic() for XICS.  We also remove kvm_set_msi()
since it is not used any more.

The one slightly tricky thing is that with the new interface, we
don't get told whether the interrupt is an MSI (or other edge
sensitive interrupt) vs. level-sensitive.  The difference as far
as interrupt generation is concerned is that for LSIs we have to
set the asserted flag so it will continue to fire until it is
explicitly cleared.

In fact the XICS code gets told which interrupts are LSIs by userspace
when it configures the interrupt via the KVM_DEV_XICS_GRP_SOURCES
attribute group on the XICS device.  To store this information, we add
a new "lsi" field to struct ics_irq_state.  With that we can also do a
better job of returning accurate values when reading the attribute
group.

Signed-off-by: Paul Mackerras <paulus@samba.org>
(cherry picked from commit b1a4286b8f3393857a205ec89607683161b75f90)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Christopher Arges <chris.j.arges@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

virtio_balloon: fix PFN format for virtio-1

BugLink: http://bugs.launchpad.net/bugs/1592042
Everything should be LE when using virtio-1, but
the linux balloon driver does not seem to care about that.
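
As a minimal illustration of what "LE" means for a 32-bit page frame number, the sketch below stores a value in little-endian byte order independent of host endianness. `model_put_le32` is a hypothetical helper written for this example; the kernel has its own conversion helpers, which this sketch does not use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit PFN in little-endian byte order on any host. */
void model_put_le32(uint8_t out[4], uint32_t pfn)
{
	assert(out != NULL);
	out[0] = (uint8_t)(pfn & 0xff);		/* least significant byte first */
	out[1] = (uint8_t)((pfn >> 8) & 0xff);
	out[2] = (uint8_t)((pfn >> 16) & 0xff);
	out[3] = (uint8_t)((pfn >> 24) & 0xff);
}
```

A big-endian guest that wrote the value in native order would place the most significant byte first, which is the mismatch the commit message describes.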

Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 87c9403b0d1de4676b0bd273eea68fcf6de68e68)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Christopher Arges <chris.j.arges@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

HID: core: prevent out-of-bound readings

BugLink: http://bugs.launchpad.net/bugs/1579190
Plugging a Logitech DJ receiver with KASAN activated raises a bunch of
out-of-bound readings.

The fields are allocated up to MAX_USAGE, meaning that potentially, we do
not have enough fields to fit the incoming values.
Add checks and silence KASAN.
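
The kind of check described above can be sketched as follows. This is not the HID core code: `model_usage_index` and its parameters are hypothetical, and the sketch only shows the general pattern of validating an incoming value against both its logical range and the number of slots actually allocated before using it as an index.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the slot for `value`, or -1 when the value falls outside
 * [min, max] or beyond the number of slots actually allocated.
 */
long long model_usage_index(int value, int min, int max, size_t allocated)
{
	long long offset;

	assert(min <= max);
	if (value < min || value > max)
		return -1;
	offset = (long long)value - (long long)min;	/* non-negative here */
	if ((unsigned long long)offset >= allocated)
		return -1;
	return offset;
}
```

A caller would skip the report value whenever the function returns -1 instead of reading past the end of the array.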

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 50220dead1650609206efe91f0cc116132d59b3f)
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>

Fix ztest truncated cache file

BugLink: http://bugs.launchpad.net/bugs/1587686
Commit efc412b updated spa_config_write() for Linux 4.2 kernels to
truncate and overwrite rather than rename the cache file. This is
the correct fix but it should have only been applied for the kernel
build. In user space rename(2) is needed because ztest depends on
the cache file.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #4129
(cherry picked from zfs commit 151f84e2c32f690b92c424d8c55d2dfccaa76e51)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

lpfc: Fix DMA faults observed upon plugging loopback connector

BugLink: http://bugs.launchpad.net/bugs/1587316
Driver didn't program the REG_VFI mailbox correctly, giving the adapter
bad addresses.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ae09c765109293b600ba9169aa3d632e1ac1a843)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Ubuntu-4.4.0-28.47

Signed-off-by: Luis Henriques <luis.henriques@canonical.com>

netfilter: x_tables: introduce and use xt_copy_counters_from_user

The three variants use the same copy&pasted code; condense this into a
helper and use that.

Make sure info.name is 0-terminated.
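
The termination step can be sketched as follows. This is a minimal
user-space illustration, not the kernel helper: struct counters_info_sketch,
NAME_LEN_SKETCH and copy_info_sketch() are hypothetical names invented here.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN_SKETCH 32	/* hypothetical size, for illustration only */

/* Hypothetical stand-in for a header copied from user space. */
struct counters_info_sketch {
	char name[NAME_LEN_SKETCH];
	unsigned int num_counters;
};

/*
 * Copy a header from an untrusted buffer and force the name to be
 * NUL-terminated, so later string operations cannot run past the field.
 * Returns 0 on success, -1 if the source buffer is too short.
 */
static int copy_info_sketch(struct counters_info_sketch *dst,
			    const void *src, size_t len)
{
	if (len < sizeof(*dst))
		return -1;
	memcpy(dst, src, sizeof(*dst));
	dst->name[sizeof(dst->name) - 1] = '\0';
	return 0;
}
```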

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit d7591f0c41ce3e67600a982bab6989ef0f07b3ce)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: do compat validation via translate_table

This looks like refactoring, but it's also a bug fix.

Problem is that the compat path (32bit iptables, 64bit kernel) lacks a few
sanity tests that are done in the normal path.

For example, we do not check for underflows and the base chain policies.

While it's possible to also add such checks to the compat path, it's more
copy&pastry; for instance, we cannot reuse the check_underflow() helper as
e->target_offset differs in the compat case.

Another problem is that it makes auditing for validation errors harder; two
places need to be checked and kept in sync.

At a high level 32 bit compat works like this:
1- initial pass over blob:
   validate match/entry offsets, bounds checking
   lookup all matches and targets
   do bookkeeping wrt. size delta of 32/64bit structures
   assign match/target.u.kernel pointer (points at kernel
   implementation, needed to access ->compatsize etc.)

2- allocate memory according to the total bookkeeping size to
   contain the translated ruleset

3- second pass over original blob:
   for each entry, copy the 32bit representation to the newly allocated
   memory.  This also does any special match translations (e.g.
   adjust 32bit to 64bit longs, etc).

4- check if ruleset is free of loops (chase all jumps)

5- first pass over translated blob:
   call the checkentry function of all matches and targets.

The alternative implemented by this patch is to drop steps 3&4 from the
compat process; the translation is changed into an intermediate step
rather than a full 1:1 translate_table replacement.

In the 2nd pass (step #3), change the 64bit ruleset back to a kernel
representation, i.e. put() the kernel pointer and restore ->u.user.name .

This gets us a 64bit ruleset that is in the format generated by a 64bit
iptables userspace -- we can then use translate_table() to get the
'native' sanity checks.

This has two drawbacks:

1. we re-validate all the match and target entry structure sizes even
though compat translation is supposed to never generate bogus offsets.
2. we put and then re-lookup each match and target.

The upside is that we get all sanity tests and ruleset validations
provided by the normal path and can remove some duplicated compat code.

iptables-restore time of autogenerated ruleset with 300k chains of form
-A CHAIN0001 -m limit --limit 1/s -j CHAIN0002
-A CHAIN0002 -m limit --limit 1/s -j CHAIN0003

shows no noticeable differences in restore times:
old:   0m30.796s
new:   0m31.521s
64bit: 0m25.674s

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 09d9686047dbbe1cf4faa558d3ecc4aae2046054)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: xt_compat_match_from_user doesn't need a retval

Always returned 0.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 0188346f21e6546498c2a0f84888797ad4063fc5)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: ip6_tables: simplify translate_compat_table args

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 329a0807124f12fe1c8032f95d8a8eb47047fb0e)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: ip_tables: simplify translate_compat_table args

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 7d3f843eed29222254c9feab481f55175a1afcc9)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: arp_tables: simplify translate_compat_table args

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 8dddd32756f6fe8e4e82a63361119b7e2384e02f)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: don't reject valid target size on some architectures

Quoting John Stultz:
  In updating a 32bit arm device from 4.6 to Linus' current HEAD, I
  noticed I was having some trouble with networking, and realized that
  /proc/net/ip_tables_names was suddenly empty.
  Digging through the registration process, it seems we're catching on the:

   if (strcmp(t->u.user.name, XT_STANDARD_TARGET) == 0 &&
       target_offset + sizeof(struct xt_standard_target) != next_offset)
         return -EINVAL;

  Where next_offset seems to be 4 bytes larger than the
  offset + standard_target struct size.

next_offset needs to be aligned via XT_ALIGN (so we can access all members
of ip(6)t_entry struct).

This problem didn't show up on i686 as it only needs 4-byte alignment for
u64, but iptables userspace on other 32bit arches does insert extra padding.
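
The effect of the alignment can be sketched with plain arithmetic. The
helper names and the numbers used below are hypothetical and are not taken
from the kernel structures; the point is only that rounding the end of the
target up before comparing accepts the padding that user space inserts.

```c
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up_sketch(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/*
 * Hypothetical form of the size check for a standard target: the end of
 * the target is rounded up before being compared with next_offset, so
 * trailing padding inserted by user space is accepted.
 * Returns 1 if the layout is acceptable, 0 otherwise.
 */
static int standard_target_size_ok_sketch(size_t target_offset,
					  size_t target_size,
					  size_t next_offset,
					  size_t align)
{
	return align_up_sketch(target_offset + target_size, align) == next_offset;
}
```

With an invented target_offset of 112 and target size of 36, the target ends
at 148. Under 8-byte alignment that rounds up to 152, so a next_offset of 152
passes; under 4-byte alignment nothing is added and only 148 passes.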

Reported-by: John Stultz <john.stultz@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Fixes: 7ed2abddd20cf ("netfilter: x_tables: check standard target size too")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 7b7eba0f3515fca3296b8881d583f7c1042f5226)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: validate all offsets and sizes in a rule

Validate that all matches (if any) add up to the beginning of
the target and that each match covers at least the base structure size.
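
A minimal sketch of such a walk, assuming a simplified match header that
only carries its own size. struct match_hdr_sketch and
check_matches_sketch() are hypothetical; the kernel structures carry more
fields and the real check may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical minimal match header: only the size field matters here. */
struct match_hdr_sketch {
	uint16_t match_size;	/* total size of this match, header included */
};

/*
 * Walk the matches stored in elems[0 .. target_offset) and check that
 * every match is at least as large as its header and that the matches
 * end exactly where the target begins.
 * Returns 0 if the layout is consistent, -1 otherwise.
 */
static int check_matches_sketch(const unsigned char *elems, size_t target_offset)
{
	size_t pos = 0;

	while (pos < target_offset) {
		struct match_hdr_sketch hdr;

		if (target_offset - pos < sizeof(hdr))
			return -1;	/* header would overlap the target */
		memcpy(&hdr, elems + pos, sizeof(hdr));
		if (hdr.match_size < sizeof(hdr))
			return -1;	/* smaller than the base structure */
		if (hdr.match_size > target_offset - pos)
			return -1;	/* match runs into the target */
		pos += hdr.match_size;
	}
	return 0;	/* pos == target_offset: the matches add up */
}
```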

The compat path should be able to safely re-use the function
as the structures only differ in alignment; added a
BUILD_BUG_ON just in case we have an arch that adds padding as well.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 13631bfc604161a9d69cd68991dff8603edd66f9)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: check for bogus target offset

We're currently asserting that targetoff + targetsize <= nextoff.

Extend it to also check that targetoff is >= sizeof(xt_entry).
Since this is generic code, add an argument pointing to the start of the
match/target, we can then derive the base structure size from the delta.

We also need the e->elems pointer in a followup change to validate matches.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit ce683e5f9d045e5d67d1312a42b359cb2ab2a13c)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: check standard target size too

We have targets and standard targets -- the latter carries a verdict.

The ip/ip6tables validation functions will access t->verdict for the
standard targets to fetch the jump offset or verdict for chainloop
detection, but this happens before the targets get checked/validated.

Thus we also need to check for verdict presence here, else t->verdict
can point right after a blob.

Spotted with UBSAN while testing malformed blobs.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 7ed2abddd20cf8f6bd27f65bd218f26fa5bf7f44)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: add compat version of xt_check_entry_offsets

32bit rulesets have different layout and alignment requirements, so once
more integrity checks get added to xt_check_entry_offsets it will reject
well-formed 32bit rulesets.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit fc1221b3a163d1386d1052184202d5dc50d302d1)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: assert minimum target size

The target size includes the size of the xt_entry_target struct.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit a08e4e190b866579896c09af59b3bdca821da2cd)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: kill check_entry helper

Once we add more sanity testing to xt_check_entry_offsets it
becomes relevant if we're expecting a 32bit 'config_compat' blob
or a normal one.

Since we already have a lot of similar-named functions (check_entry,
compat_check_entry, find_and_check_entry, etc.) and the current
incarnation is short, just fold its contents into the callers.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit aa412ba225dd3bc36d404c28cdc3d674850d80d0)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: add and use xt_check_entry_offsets

Currently arp/ip and ip6tables each implement a short helper to check that
the target offset is large enough to hold one xt_entry_target struct and
that t->u.target_size fits within the current rule.

Unfortunately these checks are not sufficient.

To avoid adding new tests to all of ip/ip6/arptables move the current
checks into a helper, then extend this helper in followup patches.
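
The two checks described above can be sketched as below. The layouts are
hypothetical and heavily simplified (entry_sketch and target_hdr_sketch are
invented for illustration), and the sketch assumes the caller has already
verified that the buffer holds next_offset bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified layouts; the kernel structures have more fields. */
struct entry_sketch {
	uint16_t target_offset;	/* offset of the target from the rule start */
	uint16_t next_offset;	/* total size of the rule */
};

struct target_hdr_sketch {
	uint16_t target_size;	/* size of the target, header included */
};

/*
 * Check that the target header lies inside the rule and that the size the
 * target claims does not extend past the end of the rule.
 * 'rule' points at the start of the rule, which begins with an entry_sketch.
 * Returns 0 if both checks pass, -1 otherwise.
 */
static int check_entry_offsets_sketch(const unsigned char *rule)
{
	struct entry_sketch e;
	struct target_hdr_sketch t;

	memcpy(&e, rule, sizeof(e));
	if ((size_t)e.target_offset + sizeof(t) > e.next_offset)
		return -1;	/* no room for a target header */

	memcpy(&t, rule + e.target_offset, sizeof(t));
	if ((size_t)e.target_offset + t.target_size > e.next_offset)
		return -1;	/* target claims more bytes than the rule has */

	return 0;
}
```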

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit 7d35812c3214afa5b37a675113555259cfd67b98)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: validate targets of jumps

When we see a jump, also check that the offset gets us to the beginning of
a rule (an ipt_entry).
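
One way to picture this check is a linear walk over the rule boundaries.
The sketch below is an illustration under assumptions: rule_hdr_sketch and
jump_target_ok_sketch() are hypothetical, and every next_offset is assumed
to have been validated by an earlier pass.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rule header; only next_offset is needed for the walk. */
struct rule_hdr_sketch {
	uint32_t next_offset;	/* size of this rule in bytes */
};

/*
 * Return 1 if 'jump' is the offset of the first byte of some rule in
 * blob[0 .. size), 0 otherwise. Assumes every next_offset was already
 * validated (non-zero and within the blob), as an earlier pass would do.
 */
static int jump_target_ok_sketch(const unsigned char *blob, size_t size,
				 size_t jump)
{
	size_t pos = 0;

	while (pos + sizeof(struct rule_hdr_sketch) <= size) {
		struct rule_hdr_sketch hdr;

		if (pos == jump)
			return 1;
		memcpy(&hdr, blob + pos, sizeof(hdr));
		if (hdr.next_offset == 0)
			return 0;	/* defensive: avoid an endless loop */
		pos += hdr.next_offset;
	}
	return 0;
}
```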

The extra overhead is negligible, even with absurd cases.

300k custom rules, 300k jumps to 'next' user chain:
[ plus one jump from INPUT to first userchain ]:

Before:
real    0m24.874s
user    0m7.532s
sys     0m16.076s

After:
real    0m27.464s
user    0m7.436s
sys     0m18.840s

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(backported from commit 36472341017529e2b12573093cc0f68719300997)
[ luis: adjusted context ]
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: don't move to non-existent next rule

Ben Hawkes says:

In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
is possible for a user-supplied ipt_entry structure to have a large
next_offset field. This field is not bounds checked prior to writing a
counter value at the supplied offset.

Base chains enforce absolute verdict.

User defined chains are supposed to end with an unconditional return,
xtables userspace adds them automatically.

But if such a return is missing, we will move to a non-existent next rule.

Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
(cherry picked from commit f24e230d257af1ad7476c6e81a8dc3127a74204e)
BugLink: https://bugs.launchpad.net/bugs/1595350
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: fix unconditional helper

Ben Hawkes says:

In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
is possible for a user-supplied ipt_entry structure to have a large
next_offset field. This field is not bounds checked prior to writing a
counter value at the supplied offset.

Problem is that mark_source_chains should not have been called --
the rule doesn't have a next entry, so it's supposed to return
an absolute verdict of either ACCEPT or DROP.

However, the function unconditional() doesn't work as the name implies.
It only checks that the rule is using wildcard address matching.

However, an unconditional rule must also not be using any matches
(no -m args).

The underflow validator only checked the addresses, therefore
passing the 'unconditional absolute verdict' test, while
mark_source_chains also tested for presence of matches, and thus
proceeded to the next (non-existent) rule.

Unify this so that all the callers have the same idea of 'unconditional rule'.
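
A unified 'unconditional rule' test along these lines could look like the
sketch below. The structures are hypothetical simplifications, not the
ipt_entry layout; the idea is only that both conditions (no matches and
wildcard addresses) are checked in one place.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified address-match part of a rule. */
struct addr_match_sketch {
	uint32_t src, src_mask;
	uint32_t dst, dst_mask;
};

/* Hypothetical, simplified rule header. */
struct rule_sketch {
	struct addr_match_sketch ip;
	uint16_t target_offset;	/* sizeof(struct rule_sketch) when no matches */
	uint16_t next_offset;
};

/*
 * A rule is treated as unconditional only if it carries no extra matches
 * (the target starts right after the header) and its address match is
 * entirely wildcard (all zero).
 */
static bool unconditional_sketch(const struct rule_sketch *e)
{
	static const struct addr_match_sketch wildcard;	/* zero-initialised */

	return e->target_offset == sizeof(struct rule_sketch) &&
	       memcmp(&e->ip, &wildcard, sizeof(wildcard)) == 0;
}
```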

Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2016-3134
(cherry picked from commit 54d83fc74aa9ec72794373cb47432c5f7fb1a309)
BugLink: https://bugs.launchpad.net/bugs/1555338
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: make sure e->next_offset covers remaining blob size

Otherwise this function may read data beyond the ruleset blob.
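
The bounds check can be sketched with offsets instead of pointers.
rule_within_blob_sketch() and its hdr_size parameter are hypothetical; the
subtraction form is used so that the comparison cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 0 if a rule that starts at byte 'pos' and claims 'next_offset'
 * bytes lies entirely inside a blob of 'size' bytes, -1 otherwise.
 * 'hdr_size' is the minimum size of a rule header (an assumption of this
 * sketch; the caller supplies it).
 */
static int rule_within_blob_sketch(size_t size, size_t pos,
				   uint32_t next_offset, size_t hdr_size)
{
	if (pos > size || size - pos < hdr_size)
		return -1;	/* not even a header fits */
	if (next_offset < hdr_size)
		return -1;	/* rule smaller than its own header */
	if (next_offset > size - pos)
		return -1;	/* rule would extend past the blob */
	return 0;
}
```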

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2016-3134
(cherry picked from commit 6e94e0cfb0887e4013b3b930fa6ab1fe6bb6ba91)
BugLink: https://bugs.launchpad.net/bugs/1555338
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

netfilter: x_tables: validate e->target_offset early

We should check that e->target_offset is sane before
mark_source_chains gets called since it will fetch the target entry
for loop detection.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
CVE-2016-3134
(cherry picked from commit bdf533de6968e9686df777dc178486f600c6e617)
BugLink: https://bugs.launchpad.net/bugs/1555338
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Brad Figg <brad.figg@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Ubuntu-4.4.0-27.46

Signed-off-by: Kamal Mostafa <kamal@canonical.com>

Revert "UBUNTU: SAUCE: Bluetooth: Support for LED on Marvell modules"

BugLink: https://launchpad.net/bugs/1512999
This commit reverts 8b6d64a7ef7967b9bcab363261ae48edb2986f79.

Although the BT LED works with the patch, it introduces a new regression
that causes HCI to stop responding after the LED-on command is issued.

Tested with the latest master-next head with this revert patch, HCI
works, and both Marvell wireless driver update and WiFi LED work without
problem.

Signed-off-by: Wen-chien Jesse Sung <jesse.sung@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Ubuntu-4.4.0-26.45

Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: SAUCE: UEFI: Add secure boot and MOK SB State disabled sysctl

BugLink: http://bugs.launchpad.net/bugs/1593075
This is a better method for detecting the state of secure boot and
the MOKSBState override, as opposed to grepping status from the kernel log.
Both variables return 0 or 1. If secure_boot==0 then signed module
enforcement is not enabled. Likewise, if moksbstate_disabled==1 then
signed module enforcement is not enabled. The only condition under which
signed module enforcement is enabled is when secure_boot==1 and
moksbstate_disabled==0.

/proc/sys/kernel/secure_boot
/proc/sys/kernel/moksbstate_disabled
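
A user-space program might combine the two values as sketched below. The
helper names are hypothetical; only the two paths and the truth table come
from the description above.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Read a single integer from a sysctl file. Returns 0 on success and
 * stores the value in *out, or -1 if the file is missing or unparsable.
 */
static int read_sysctl_int(const char *path, int *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Combine the two values as the commit message describes. */
static bool module_enforcement_enabled(int secure_boot, int moksbstate_disabled)
{
	return secure_boot == 1 && moksbstate_disabled == 0;
}
```

A caller would read /proc/sys/kernel/secure_boot and
/proc/sys/kernel/moksbstate_disabled with read_sysctl_int() and pass both
values to module_enforcement_enabled(); on kernels without this patch the
reads fail, and the sketch leaves handling of that case to the caller.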

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ethtool.h: define INT_MAX for userland

BugLink: http://bugs.launchpad.net/bugs/1592930
INT_MAX needs limits.h in userland.
When ethtool.h is included by a userland app, we got the following error:

.../usr/include/linux/ethtool.h: In function 'ethtool_validate_speed':
.../usr/include/linux/ethtool.h:1471:18: error: 'INT_MAX' undeclared (first use in this function)
return speed <= INT_MAX || speed == SPEED_UNKNOWN
^
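
For illustration, the quoted expression compiles in user space once
<limits.h> is included. This is a sketch, not the header itself:
SPEED_UNKNOWN_SKETCH is an assumed stand-in value and
validate_speed_sketch() is a hypothetical name.

```c
#include <limits.h>	/* provides INT_MAX in user space */

/* Assumed value for illustration; real code takes it from <linux/ethtool.h>. */
#define SPEED_UNKNOWN_SKETCH (-1)

/* Mirrors the expression quoted in the error message above. */
static int validate_speed_sketch(unsigned int speed)
{
	return speed <= (unsigned int)INT_MAX ||
	       speed == (unsigned int)SPEED_UNKNOWN_SKETCH;
}
```
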
Fixes: 72f843bdbdca ("ethtool: make validate_speed accept all speeds between 0 and INT_MAX")
CC: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 14e2037902d65213842b4e40305ff54a64abbcb6)
Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com>
Reported-by: Iain Lane <iain@orangesquash.org.uk>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Ubuntu-4.4.0-25.44

Signed-off-by: Kamal Mostafa <kamal@canonical.com>

Linux 4.4.13

BugLink: http://bugs.launchpad.net/bugs/1590455
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: handle dquot buffer readahead in log recovery correctly

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7d6a13f023567d573ac362502bb702eda716e654 upstream.

When we do dquot readahead in log recovery, we do not use a verifier
as the underlying buffer may not have dquots in it. e.g. the
allocation operation hasn't yet been replayed. Hence we do not want
to fail recovery because we detect an operation to be replayed has
not been run yet. This problem was addressed for inodes in commit
d891400 ("xfs: inode buffers may not be valid during recovery
readahead") but the problem was not recognised to exist for dquots
and their buffers as the dquot readahead did not have a verifier.

The result of not using a verifier is that when the buffer is then
next read to replay a dquot modification, the dquot buffer verifier
will only be attached to the buffer if *readahead is not complete*.
Hence we can read the buffer, replay the dquot changes and then add
it to the delwri submission list without it having a verifier
attached to it. This then generates warnings in xfs_buf_ioapply(),
which catches and warns about this case.

Fix this and make it handle the same readahead verifier error cases
as for inode buffers by adding a new readahead verifier that has a
write operation as well as a read operation that marks the buffer as
not done if any corruption is detected. Also make sure we don't run
readahead if the dquot buffer has been marked as cancelled by
recovery.

This will result in readahead either succeeding and the buffer
having a valid write verifier, or readahead failing and the buffer
state requiring the subsequent read to resubmit the IO with the new
verifier. In either case, this will result in the buffer always
ending up with a valid write verifier on it.

Note: we also need to fix the inode buffer readahead error handling
to mark the buffer with EIO. Brian noticed during review that the code I
copied from there was wrong, so fix it at the same time. Add comments
linking the two functions that handle readahead verifier errors
together so we don't forget this behavioural link in future.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: print name of verifier if it fails

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 233135b763db7c64d07b728a9c66745fb0376275 upstream.

This adds a name to each buf_ops structure, so that if
a verifier fails we can print the type of verifier that
failed it. Should be a slight debugging aid, I hope.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: skip stale inodes in xfs_iflush_cluster

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7d3aa7fe970791f1a674b14572a411accf2f4d4e upstream.

We don't write back stale inodes so we should skip them in
xfs_iflush_cluster, too.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: fix inode validity check in xfs_iflush_cluster

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 51b07f30a71c27405259a0248206ed4e22adbee2 upstream.

Some careless idiot(*) wrote crap code in commit 1a3e8f3 ("xfs:
convert inode cache lookups to use RCU locking") back in late 2010,
and so xfs_iflush_cluster checks the wrong inode for whether it is
still valid under RCU protection. Fix it to lock and check the
correct inode.

(*) Careless-idiot: Dave Chinner <dchinner@redhat.com>

Discovered-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: xfs_iflush_cluster fails to abort on error

BugLink: http://bugs.launchpad.net/bugs/1590455
commit b1438f477934f5a4d5a44df26f3079a7575d5946 upstream.

When a failure due to an inode buffer occurs, the error handling
fails to abort the inode writeback correctly. This can result in the
inode being reclaimed whilst still in the AIL, leading to
use-after-free situations as well as filesystems that cannot be
unmounted as the inode log items left in the AIL never get removed.

Fix this by ensuring fatal errors from xfs_imap_to_bp() result in
the inode flush being aborted correctly.

Reported-by: Shyam Kaushik <shyam@zadarastorage.com>
Diagnosed-by: Shyam Kaushik <shyam@zadarastorage.com>
Tested-by: Shyam Kaushik <shyam@zadarastorage.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: Don't wrap growfs AGFL indexes

BugLink: http://bugs.launchpad.net/bugs/1590455
commit ad747e3b299671e1a53db74963cc6c5f6cdb9f6d upstream.

Commit 96f859d ("libxfs: pack the agfl header structure so
XFS_AGFL_SIZE is correct") allowed the freelist to use the empty
slot at the end of the freelist on 64 bit systems that was not
being used due to sizeof() rounding up the structure size.

This has caused versions of xfs_repair prior to 4.5.0 (which also
has the fix) to report this as a corruption once the filesystem has
been grown. Older kernels can also have problems (seen from a whacky
container/vm management environment) mounting filesystems grown on a
system with a newer kernel than the vm/container it is deployed on.

To avoid this problem, change the initial free list indexes not to
wrap across the end of the AGFL, hence avoiding the initialisation
of agf_fllast to the last index in the AGFL.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xfs: disallow rw remount on fs with unknown ro-compat features

BugLink: http://bugs.launchpad.net/bugs/1590455
commit d0a58e833931234c44e515b5b8bede32bd4e6eed upstream.

Today, a kernel which refuses to mount a filesystem read-write
due to unknown ro-compat features can still transition to read-write
via the remount path.  The old kernel is most likely none the wiser,
because it's unaware of the new feature, and isn't using it.  However,
writing to the filesystem may well corrupt metadata related to that
new feature, and moving to a newer kernel which understands the feature
will have problems.

Right now the only ro-compat feature we have is the free inode btree,
which showed up in v3.16.  It would be good to push this back to
all the active stable kernels, I think, so that if anyone is using
newer mkfs (which enables the finobt feature) with older kernel
releases, they'll be protected.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

gcov: disable tree-loop-im to reduce stack usage

BugLink: http://bugs.launchpad.net/bugs/1590455
commit c87bf431448b404a6ef5fbabd74c0e3e42157a7f upstream.

Enabling CONFIG_GCOV_PROFILE_ALL produces a lot of warnings like

lib/lz4/lz4hc_compress.c: In function 'lz4_compresshcctx':
lib/lz4/lz4hc_compress.c:514:1: warning: the frame size of 1504 bytes is larger than 1024 bytes [-Wframe-larger-than=]

After some investigation, I found that this behavior started with gcc-4.9,
and opened https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702.
A suggested workaround for it is to use the -fno-tree-loop-im
flag that turns off one of the optimization stages in gcc, so the
code runs a little slower but does not use excessive amounts
of stack.

We could make this conditional on the gcc version, but I could not
find an easy way to do this in Kbuild and the benefit would be
fairly small, given that most of the gcc versions in production are
affected now.

I'm marking this for 'stable' backports because it addresses a bug
with code generation in gcc that exists in all kernel versions
with the affected gcc releases.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

scripts/package/Makefile: rpmbuild add support of RPMOPTS

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 65a9f31c5042e5bb50d30ed8ae374044be561054 upstream.

After commit 21a59991ce0c ("scripts/package/Makefile: rpmbuild is needed
for rpm targets"), it is no longer possible to specify RPMOPTS.
For example, we are no longer able to control _topdir using the following
make command.
make RPMOPTS="--define '_topdir /home/xyz/workspace/'" binrpm-pkg

Fixes: 21a59991ce0c ("scripts/package/Makefile: rpmbuild is needed for rpm targets")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

dma-debug: avoid spinlock recursion when disabling dma-debug

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 3017cd63f26fc655d56875aaf497153ba60e9edf upstream.

With netconsole (at least) the pr_err("... disabling\n") call can
recurse back into the dma-debug code, where it'll try to grab
free_entries_lock again. Avoid the problem by doing the printk after
dropping the lock.
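
The pattern can be sketched in user space with a toy lock. Everything here
is hypothetical (lock_held, log_sketch(), take_entry_sketch()); it only
illustrates deciding under the lock and reporting after it is released.

```c
#include <stdbool.h>
#include <stdio.h>

static bool lock_held;		/* toy lock state */
static int recursion_count;	/* times the log path ran with the lock held */

/* Stand-in for a logging path that would re-enter the locked code. */
static void log_sketch(const char *msg)
{
	if (lock_held)
		recursion_count++;	/* a real spinlock would deadlock here */
	fprintf(stderr, "%s\n", msg);
}

/* Decide under the lock, report only after unlocking. */
static void take_entry_sketch(int *free_entries)
{
	bool report = false;

	lock_held = true;
	if (*free_entries == 0)
		report = true;	/* remember the condition, do not log yet */
	else
		(*free_entries)--;
	lock_held = false;

	if (report)
		log_sketch("out of entries, disabling");
}
```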

Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

PM / sleep: Handle failures in device_suspend_late() consistently

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 3a17fb329da68cb00558721aff876a80bba2fdb9 upstream.

Grygorii Strashko reports:

The PM runtime will be left disabled for the device if its
.suspend_late() callback fails and async suspend is not allowed
for this device. In this case the device will not be added to
dpm_late_early_list and dpm_resume_early() will ignore this
device; as a result, PM runtime will be disabled for it forever
(side effect: after 8 subsequent failures for the same device
the PM runtime will be reenabled due to disable_depth overflow).

To fix this problem, add devices to dpm_late_early_list regardless
of whether or not device_suspend_late() returns errors for them.

That ensures that failures there are handled consistently for
all devices regardless of their async suspend/resume status.

Reported-by: Grygorii Strashko <grygorii.strashko@ti.com>
Tested-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ext4: silence UBSAN in ext4_mb_init()

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 935244cd54b86ca46e69bc6604d2adfb1aec2d42 upstream.

Currently, in ext4_mb_init(), there's a loop like the following:

  do {
    ...
    offset += 1 << (sb->s_blocksize_bits - i);
    i++;
  } while (i <= sb->s_blocksize_bits + 1);

Note that the updated offset is used in the loop's next iteration only.

However, at the last iteration, that is at i == sb->s_blocksize_bits + 1,
the shift count becomes equal to (unsigned)-1 > 31 (c.f. C99 6.5.7(3))
and UBSAN reports

  UBSAN: Undefined behaviour in fs/ext4/mballoc.c:2621:15
  shift exponent 4294967295 is too large for 32-bit type 'int'
  [...]
  Call Trace:
   [<ffffffff818c4d25>] dump_stack+0xbc/0x117
   [<ffffffff818c4c69>] ? _atomic_dec_and_lock+0x169/0x169
   [<ffffffff819411ab>] ubsan_epilogue+0xd/0x4e
   [<ffffffff81941cac>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
   [<ffffffff81941ab1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
   [<ffffffff814b6dc1>] ? kmem_cache_alloc+0x101/0x390
   [<ffffffff816fc13b>] ? ext4_mb_init+0x13b/0xfd0
   [<ffffffff814293c7>] ? create_cache+0x57/0x1f0
   [<ffffffff8142948a>] ? create_cache+0x11a/0x1f0
   [<ffffffff821c2168>] ? mutex_lock+0x38/0x60
   [<ffffffff821c23ab>] ? mutex_unlock+0x1b/0x50
   [<ffffffff814c26ab>] ? put_online_mems+0x5b/0xc0
   [<ffffffff81429677>] ? kmem_cache_create+0x117/0x2c0
   [<ffffffff816fcc49>] ext4_mb_init+0xc49/0xfd0
   [...]

Observe that the mentioned shift exponent, 4294967295, equals (unsigned)-1.

Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
value of offset calculated this way is never used again.

Silence UBSAN by introducing another variable, offset_incr, holding the
next increment to apply to offset and adjust that one by right shifting it
by one position per loop iteration.
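
A stand-alone sketch of the idea, assuming a simplified loop that only
records offsets: the function name, the plain array, and the parameter
bits (standing in for sb->s_blocksize_bits) are hypothetical, and only
offset_incr comes from the description above.

```c
/* Fill offsets[1 .. bits + 1]; the caller provides bits + 2 entries.
 * Requires bits >= 1. */
static void model_fill_offsets(unsigned int bits, unsigned int *offsets)
{
	unsigned int i = 1;
	unsigned int offset = 0;
	/* First increment, equal to 1 << (bits - i) for i == 1. */
	unsigned int offset_incr = 1u << (bits - 1);

	do {
		offsets[i] = offset;
		offset += offset_incr;
		/* Halve the increment instead of recomputing a shift whose
		 * count would go out of range on the last iteration. */
		offset_incr >>= 1;
		i++;
	} while (i <= bits + 1);
}
```

The increments are the same as before for every iteration whose result
is actually used; on the last iteration the increment is simply 0.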

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ext4: address UBSAN warning in mb_find_order_for_block()

BugLink: http://bugs.launchpad.net/bugs/1590455
commit b5cb316cdf3a3f5f6125412b0f6065185240cfdc upstream.

Currently, in mb_find_order_for_block(), there's a loop like the following:

  while (order <= e4b->bd_blkbits + 1) {
    ...
    bb += 1 << (e4b->bd_blkbits - order);
  }

Note that the updated bb is used in the loop's next iteration only.

However, at the last iteration, that is at order == e4b->bd_blkbits + 1,
the shift count becomes negative (c.f. C99 6.5.7(3)) and UBSAN reports

  UBSAN: Undefined behaviour in fs/ext4/mballoc.c:1281:11
  shift exponent -1 is negative
  [...]
  Call Trace:
   [<ffffffff818c4d35>] dump_stack+0xbc/0x117
   [<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
   [<ffffffff819411bb>] ubsan_epilogue+0xd/0x4e
   [<ffffffff81941cbc>] __ubsan_handle_shift_out_of_bounds+0x1fb/0x254
   [<ffffffff81941ac1>] ? __ubsan_handle_load_invalid_value+0x158/0x158
   [<ffffffff816e93a0>] ? ext4_mb_generate_from_pa+0x590/0x590
   [<ffffffff816502c8>] ? ext4_read_block_bitmap_nowait+0x598/0xe80
   [<ffffffff816e7b7e>] mb_find_order_for_block+0x1ce/0x240
   [...]

Unless compilers start to do some fancy transformations (which at least
GCC 6.0.0 doesn't currently do), the issue is of cosmetic nature only: the
value of bb calculated this way is never used again.

Silence UBSAN by introducing another variable, bb_incr, holding the next
increment to apply to bb and adjust that one by right shifting it by one
position per loop iteration.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=114701
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112161

Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ext4: fix oops on corrupted filesystem

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 74177f55b70e2f2be770dd28684dd6d17106a4ba upstream.

When a filesystem is corrupted in the right way, ext4_mark_iloc_dirty()
in ext4_orphan_add() can return an error, and we subsequently remove
the inode from the in-memory orphan list. However, this deletion is done
with list_del(&EXT4_I(inode)->i_orphan), which leaves the i_orphan
list_head with stale content. Later we can look at this content, causing
list corruption, an oops, or other issues. The reported trace looked
like:

WARNING: CPU: 0 PID: 46 at lib/list_debug.c:53 __list_del_entry+0x6b/0x100()
list_del corruption, 0000000061c1d6e0->next is LIST_POISON1 (0000000000100100)
CPU: 0 PID: 46 Comm: ext4.exe Not tainted 4.1.0-rc4+ #250
Stack:
60462947 62219960 602ede24 62219960
602ede24 603ca293 622198f0 602f02eb
62219950 6002c12c 62219900 601b4d6b
Call Trace:
[<6005769c>] ? vprintk_emit+0x2dc/0x5c0
[<602ede24>] ? printk+0x0/0x94
[<600190bc>] show_stack+0xdc/0x1a0
[<602ede24>] ? printk+0x0/0x94
[<602ede24>] ? printk+0x0/0x94
[<602f02eb>] dump_stack+0x2a/0x2c
[<6002c12c>] warn_slowpath_common+0x9c/0xf0
[<601b4d6b>] ? __list_del_entry+0x6b/0x100
[<6002c254>] warn_slowpath_fmt+0x94/0xa0
[<602f4d09>] ? __mutex_lock_slowpath+0x239/0x3a0
[<6002c1c0>] ? warn_slowpath_fmt+0x0/0xa0
[<60023ebf>] ? set_signals+0x3f/0x50
[<600a205a>] ? kmem_cache_free+0x10a/0x180
[<602f4e88>] ? mutex_lock+0x18/0x30
[<601b4d6b>] __list_del_entry+0x6b/0x100
[<601177ec>] ext4_orphan_del+0x22c/0x2f0
[<6012f27c>] ? __ext4_journal_start_sb+0x2c/0xa0
[<6010b973>] ? ext4_truncate+0x383/0x390
[<6010bc8b>] ext4_write_begin+0x30b/0x4b0
[<6001bb50>] ? copy_from_user+0x0/0xb0
[<601aa840>] ? iov_iter_fault_in_readable+0xa0/0xc0
[<60072c4f>] generic_perform_write+0xaf/0x1e0
[<600c4166>] ? file_update_time+0x46/0x110
[<60072f0f>] __generic_file_write_iter+0x18f/0x1b0
[<6010030f>] ext4_file_write_iter+0x15f/0x470
[<60094e10>] ? unlink_file_vma+0x0/0x70
[<6009b180>] ? unlink_anon_vmas+0x0/0x260
[<6008f169>] ? free_pgtables+0xb9/0x100
[<600a6030>] __vfs_write+0xb0/0x130
[<600a61d5>] vfs_write+0xa5/0x170
[<600a63d6>] SyS_write+0x56/0xe0
[<6029fcb0>] ? __libc_waitpid+0x0/0xa0
[<6001b698>] handle_syscall+0x68/0x90
[<6002633d>] userspace+0x4fd/0x600
[<6002274f>] ? save_registers+0x1f/0x40
[<60028bd7>] ? arch_prctl+0x177/0x1b0
[<60017bd5>] fork_handler+0x85/0x90

Fix the problem by using list_del_init() as we always should with the
i_orphan list.
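
Why list_del_init() is safe to repeat can be seen in a small user-space
model. The helpers below are simplified re-implementations written for
illustration (the kernel's own versions live in include/linux/list.h),
so treat them as a sketch rather than the kernel code.

```c
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list (or an unlinked node) points at itself. */
static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink the entry and leave it self-pointing, so a later
 * list_empty() check or a second removal stays well defined. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	INIT_LIST_HEAD(e);
}
```

After list_del_init() the node looks like an empty list, so code that
later checks or deletes it again does not touch stale pointers.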

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ext4: clean up error handling when orphan list is corrupted

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7827a7f6ebfcb7f388dc47fddd48567a314701ba upstream.

Instead of just printing warning messages when the orphan list is
corrupted, declare the file system corrupted. If there are any
reserved inodes in the orphaned inode list, declare the file system
corrupted and stop right away to avoid doing more potential damage to
the file system.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

ext4: fix hang when processing corrupted orphaned inode list

BugLink: http://bugs.launchpad.net/bugs/1590455
commit c9eb13a9105e2e418f72e46a2b6da3f49e696902 upstream.

If the orphaned inode list contains inode #5, ext4_iget() returns a
bad inode (since the bootloader inode should never be referenced
directly).  Because of the bad inode, we end up processing the inode
repeatedly and this hangs the machine.

This can be reproduced via:

   mke2fs -t ext4 /tmp/foo.img 100
   debugfs -w -R "ssv last_orphan 5" /tmp/foo.img
   mount -o loop /tmp/foo.img /mnt

(But don't do this if you are using an unpatched kernel if you care
about the system staying functional.  :-)

This bug was found by the port of American Fuzzy Lop into the kernel
to find file system problems[1].  (Since it *only* happens if inode #5
shows up on the orphan list --- 3, 7, 8, etc. won't do it, it's not
surprising that AFL needed two hours before it found it.)

[1] http://events.linuxfoundation.org/sites/events/files/slides/AFL%20filesystem%20fuzzing%2C%20Vault%202016_0.pdf

Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/imx: Match imx-ipuv3-crtc components using device node in platform data

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 310944d148e3600dcff8b346bee7fa01d34903b1 upstream.

The component master driver imx-drm-core matches component devices using
their of_node. Since commit 950b410dd1ab ("gpu: ipu-v3: Fix imx-ipuv3-crtc
module autoloading"), the imx-ipuv3-crtc dev->of_node is not set during
probing. Before that, of_node was set and caused an of: modalias to be
used instead of the platform: modalias, which broke module autoloading.

On the other hand, if dev->of_node is not set yet when the imx-ipuv3-crtc
probe function calls component_add, component matching in imx-drm-core
fails. While dev->of_node will be set once the next component tries to
bring up the component master, imx-drm-core component binding will never
succeed if one of the crtc devices is probed last.

Add of_node to the component platform data and match against the
pdata->of_node instead of dev->of_node in imx-drm-core to work around
this problem.

Fixes: 950b410dd1ab ("gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading")
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Lothar Waßmann <LW@KARO-electronics.de>
Tested-by: Heiko Schocher <hs@denx.de>
Tested-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/i915: Don't leave old junk in ilk active watermarks on readout

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7045c3689f148a0c95f42bae8ef3eb2829ac7de9 upstream.

When we read out the watermark state from the hardware we're supposed to
transfer that into the active watermarks, but currently we fail to clear
any part of the active watermarks that isn't explicitly written. Let's
clear it all upfront.

Looks like this has been like this since the beginning, when I added the
readout. No idea why I didn't clear it up.

Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 243e6a44b9ca ("drm/i915: Init HSW watermark tracking in intel_modeset_setup_hw_state()")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463151318-14719-2-git-send-email-ville.syrjala@linux.intel.com
(cherry picked from commit 15606534bf0a65d8a74a90fd57b8712d147dbca6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/atomic: Verify connector->funcs != NULL when clearing states

BugLink: http://bugs.launchpad.net/bugs/1590455
Unfortunately since we don't have Dave's connector refcounting patch
here yet, it's very possible that drm_atomic_state_default_clear() could
get called by intel_display_resume() when
intel_dp_mst_destroy_connector() isn't completely finished destroying an
mst connector, but has already finished setting connector->funcs to
NULL. As such, we need to treat the connector like it's already been
destroyed and just skip it, otherwise we'll end up dereferencing a NULL
pointer.

This fix is only required for 4.6 and below. David Airlie's patchseries
for 4.7 to add connector reference counting provides a more proper fix
for this.

Changes since v1:
- Fix leftover whitespace

Upstream fix: 0552f7651bc2 ("drm/i915/mst: use reference counted
connectors. (v3)")
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/fb_helper: Fix references to dev->mode_config.num_connector

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 255f0e7c418ad95a4baeda017ae6182ba9b3c423 upstream.

During boot, MST hotplugs are generally expected (even if no physical
hotplugging occurs) and result in DRM's connector topology changing.
This means that using num_connector from the current mode configuration
can lead to the number of connectors changing under us. This can lead to
some nasty scenarios in fbcon:

- We allocate an array to the size of dev->mode_config.num_connector.
- MST hotplug occurs, dev->mode_config.num_connector gets incremented.
- We try to loop through each element in the array using the new value
  of dev->mode_config.num_connector, and end up going out of bounds
  since dev->mode_config.num_connector is now larger than the array we
  allocated.

fb_helper->connector_count however, will always remain consistent while
we do a modeset in fb_helper.
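
A minimal sketch of the safe pattern, using hypothetical stand-in
structs rather than the real DRM types: the array is allocated and
walked with the same stable count, never with the live one.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the two counters discussed above. */
struct model_mode_config { int num_connector; };   /* live, may grow on hotplug */
struct model_fb_helper   { int connector_count; }; /* stable during a modeset */

/* Allocate one flag per connector known to the helper. */
static int *model_alloc_enabled(const struct model_fb_helper *helper)
{
	return calloc((size_t)helper->connector_count, sizeof(int));
}

/* Walk the array with the count it was allocated with.
 * Returns the number of entries written. */
static int model_enable_all(const struct model_fb_helper *helper, int *enabled)
{
	int i;

	for (i = 0; i < helper->connector_count; i++)
		enabled[i] = 1;
	return i;
}
```

Even if the live num_connector is incremented between the allocation
and the loop, the loop bound still matches the array size.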

Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.

Signed-off-by: Lyude <cpaul@redhat.com>
[danvet: Clarify why we need this. Also remove the now unused "dev"
local variable to appease gcc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-3-git-send-email-cpaul@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/i915/fbdev: Fix num_connector references in intel_fb_initial_config()

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 14a3842a1d5945067d1dd0788f314e14d5b18e5b upstream.

During boot time, MST devices usually send a ton of hotplug events
regardless of whether or not any physical hotplugs actually occurred.
Hotplugs mean connectors being created/destroyed, and the number of DRM
connectors changing under us. This isn't a problem if we use
fb_helper->connector_count since we only set it once in the code,
however if we use num_connector from struct drm_mode_config we risk its
value changing under us. On top of that, there's even a chance that
dev->mode_config.num_connector != fb_helper->connector_count. If the
number of connectors happens to increase under us, we'll end up using
the wrong array size for memcpy and start writing beyond the actual
length of the array, occasionally resulting in kernel panics.

Note: This is just polish for 4.7, Dave Airlie's drm_connector
refcounting fixed these bugs for real. But it's good enough duct-tape
for stable kernel backporting, since backporting the refcounting
changes is way too invasive.

Signed-off-by: Lyude <cpaul@redhat.com>
[danvet: Clarify why we need this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1463065021-18280-2-git-send-email-cpaul@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/amdgpu: Fix hdmi deep color support.

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 9d746ab68163d642dae13756b2b3145b2e38cb65 upstream.

When porting the hdmi deep color detection code from
radeon-kms to amdgpu-kms apparently some kind of
copy and paste error happened, attaching an else
branch to the wrong if statement.

The result is that hdmi deep color mode is always
disabled, regardless of gpu and display capabilities and
user wishes, as the code mistakenly thinks that the display
doesn't provide the required max_tmds_clock limit and falls
back to 8 bpc.

This patch fixes deep color support, as tested on a
R9 380 Tonga Pro + suitable display, and should be
backported to all kernels with amdgpu-kms support.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/amdgpu: use drm_mode_vrefresh() rather than mode->vrefresh

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 6b8812eb004ee2b24aac8b1a711a0e8e797df3ce upstream.

This is a port of radeon commit:
3d2d98ee1af0cf6eebfbd6bff4c17d3601ac1284
drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh
to amdgpu.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/vmwgfx: Fix order of operation

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7851496a32319237456919575e5f4ba62f74cc7d upstream.

mode->hdisplay * (var->bits_per_pixel + 7) gets evaluated before
the division, potentially making the pitch larger than it should
be.

Since the original intention is to do a div-round-up, just use
the macro instead.
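
The difference is easy to see with numbers. In this sketch the division
by 8 (bits to bytes) and both function names are assumptions made for
illustration; DIV_ROUND_UP is written out the way the kernel macro is
commonly defined.

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Multiplication happens first, so the "+ 7" is scaled by hdisplay. */
static unsigned int model_pitch_misordered(unsigned int hdisplay,
					   unsigned int bpp)
{
	return hdisplay * (bpp + 7) / 8;
}

/* Round bits per pixel up to whole bytes first, then multiply. */
static unsigned int model_pitch_round_up(unsigned int hdisplay,
					 unsigned int bpp)
{
	return hdisplay * DIV_ROUND_UP(bpp, 8);
}
```

For hdisplay = 1024 and 32 bits per pixel the first form yields 4992
bytes while the second yields the intended 4096.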

Signed-off-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/vmwgfx: use vmw_cmd_dx_cid_check for query commands.

BugLink: http://bugs.launchpad.net/bugs/1590455
commit e02e58843153ce80a9fe7588def89b2638d40e64 upstream.

Instead of calling vmw_cmd_ok, call vmw_cmd_dx_cid_check to
validate the context id for query commands.

Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/vmwgfx: Enable SVGA_3D_CMD_DX_SET_PREDICATION

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 1883598d4201361a6d2ce785095695f58071ee11 upstream.

Fixes piglit tests nv_conditional_render-* crashes.

Signed-off-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

drm/gma500: Fix possible out of bounds read

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 7ccca1d5bf69fdd1d3c5fcf84faf1659a6e0ad11 upstream.

Fix possible out of bounds read, by adding missing comma.
The code may read past the end of the dsi_errors array
when the most significant bit (bit #31) in the intr_stat register
is set.
This bug has been detected using CppCheck (static analysis tool).
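
The effect of a missing comma can be reproduced with two small,
hypothetical string tables (these are not the dsi_errors entries):
adjacent string literals are concatenated by the compiler, so the
array silently ends up one element short.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Comma missing after "error B": the two literals merge into one. */
static const char *const errors_missing_comma[] = {
	"error A",
	"error B"
	"error C",
	"error D",
};

/* With the comma in place the table has the intended four entries. */
static const char *const errors_fixed[] = {
	"error A",
	"error B",
	"error C",
	"error D",
};
```

Indexing the first table with the highest expected index (3 here)
reads past its end, which is the kind of out-of-bounds read described
above.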

Signed-off-by: Itai Handler <itai_handler@hotmail.com>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

sunrpc: fix stripping of padded MIC tokens

BugLink: http://bugs.launchpad.net/bugs/1590455
commit c0cb8bf3a8e4bd82e640862cdd8891400405cb89 upstream.

The length of the GSS MIC token need not be a multiple of four bytes.
It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data()
would previously only trim mic.len + 4 B. The remaining up to three
bytes would then trigger a check in nfs4svc_decode_compoundargs(),
leading to a "garbage args" error and mount failure:

nfs4svc_decode_compoundargs: compound not properly padded!
nfsd: failed to decode arguments!

This would prevent older clients using the pre-RFC 4121 MIC format
(37-byte MIC including a 9-byte OID) from mounting exports from v3.9+
servers using krb5i.
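
The arithmetic for the 37-byte case can be checked with a short sketch.
The helper names are hypothetical; the only assumptions are the ones
stated above, namely a 4-byte length word followed by the MIC padded to
a multiple of 4 bytes.

```c
#include <stddef.h>

/* Round a length up to the next multiple of 4, as XDR padding does. */
static size_t model_xdr_padded_len(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Bytes a MIC occupies on the wire: length word plus padded body. */
static size_t model_mic_wire_len(size_t mic_len)
{
	return 4 + model_xdr_padded_len(mic_len);
}
```

For mic_len = 37 the token occupies 44 bytes, while trimming only
mic.len + 4 = 41 bytes leaves the 3 stray bytes that trip the padding
check.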

The trimming was introduced by commit 4c190e2f913f ("sunrpc: trim off
trailing checksum before returning decrypted or integrity authenticated
buffer").

Fixes: 4c190e2f913f "sunrpc: trim off trailing checksum..."
Signed-off-by: Tomáš Trnka <ttrnka@mail.muni.cz>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xen: use same main loop for counting and remapping pages

BugLink: http://bugs.launchpad.net/bugs/1590455
commit dd14be92fbf5bc1ef7343f34968440e44e21b46a upstream.

Instead of having two functions for cycling through the E820 map in
order to count the pages to be remapped and remap them later, just use
one function with a caller-supplied sub-function called for each region
to be processed. This eliminates the possibility of a mismatch between
the two loops, which showed up in certain configurations.
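
The shape of such a shared walker can be sketched as follows. All
types and names are hypothetical and much simpler than the Xen setup
code; the point is only that the region selection lives in one loop and
the per-region work is passed in as a callback.

```c
#include <stddef.h>

struct model_region {
	unsigned long start_pfn;
	unsigned long n_pfns;
};

struct model_remap_state {
	unsigned long pages_remapped;
	unsigned long last_pfn;
};

typedef void (*model_region_fn)(const struct model_region *r, void *arg);

/* Single walker: the rule for which regions are processed lives here. */
static void model_foreach_region(const struct model_region *regions,
				 size_t n, model_region_fn fn, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (regions[i].n_pfns == 0)
			continue;       /* skipped for every caller alike */
		fn(&regions[i], arg);
	}
}

/* Pass 1: count the pages that will need remapping. */
static void model_count(const struct model_region *r, void *arg)
{
	*(unsigned long *)arg += r->n_pfns;
}

/* Pass 2: "remap" the same pages (only bookkeeping in this model). */
static void model_remap(const struct model_region *r, void *arg)
{
	struct model_remap_state *st = arg;

	st->pages_remapped += r->n_pfns;
	st->last_pfn = r->start_pfn + r->n_pfns - 1;
}
```

Since both passes go through model_foreach_region(), the count and the
remap cannot disagree about which regions were visited.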

Suggested-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

xen/events: Don't move disabled irqs

BugLink: http://bugs.launchpad.net/bugs/1590455
commit f0f393877c71ad227d36705d61d1e4062bc29cf5 upstream.

Commit ff1e22e7a638 ("xen/events: Mask a moving irq") open-coded
irq_move_irq() but left out checking if the IRQ is disabled. This broke
resuming from suspend since it tries to move a (disabled) irq without
holding the IRQ's desc->lock. Fix it by adding in a check for disabled
IRQs.

The resulting stacktrace was:
kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31!
invalid opcode: 0000 [#1] SMP
Modules linked in: xenfs xen_privcmd ...
CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016
task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000
RIP: 0010:[<ffffffff810e26e2>]  [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0
RSP: 0018:ffff88003d7bfc50  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8
RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000
R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0
R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff
FS:  0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0
ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009
0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66
Call Trace:
[<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0
[<ffffffff814c8f66>] __startup_pirq+0xe6/0x150
[<ffffffff814ca659>] xen_irq_resume+0x319/0x360
[<ffffffff814c7e75>] xen_suspend+0xb5/0x180
[<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0
[<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80
[<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140
[<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220
[<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
[<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160
[<ffffffff810a3830>] ? sort_range+0x30/0x30
[<ffffffff810a0588>] kthread+0xd8/0xf0
[<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0
[<ffffffff8182568f>] ret_from_fork+0x3f/0x70
[<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 5a0cdbfd17b90a89c64a71d8aec9773ecdb20d0d upstream.

The function eeh_pe_reset_and_recover() is used to recover from EEH
errors when passthrough devices are transferred to the guest and
back. The content of the device's config space is lost on the PE
reset issued in the middle of the recovery, so the function
saves/restores it before/after the reset. However, config access
to some adapters like Broadcom BCM5719 at this point will cause a
fenced PHB. The config space is always blocked and we save 0xFF's
that are restored at a later point. The memory BARs are totally
corrupted, causing another EEH error upon access to one of the
memory BARs.

This restores the config space on those adapters like BCM5719
from the content saved to the EEH device when it's populated,
to resolve the above issue.

Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()

BugLink: http://bugs.launchpad.net/bugs/1590455
commit affeb0f2d3a9af419ad7ef4ac782e1540b2f7b28 upstream.

The function eeh_pe_reset_and_recover() is used to recover from EEH
errors when passthrough devices are transferred to the guest and
back, meaning the device's driver is vfio-pci or none.
When the driver is vfio-pci, which provides only the error_detected()
error handler, the handler simply stops the guest, which is not the
expected behaviour. On the other hand, no error handlers will
be called if we don't have a bound driver.

This ignores the error handler in eeh_pe_reset_and_recover()
that reports the error to the device driver, to avoid the exceptional
behaviour.

Fixes: 5cfb20b9 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 8ed8ab40047a570fdd8043a40c104a57248dd3fd upstream.

Some of the interrupt vectors on 64-bit POWER server processors are only
32 bytes long (8 instructions), which is not enough for the full
first-level interrupt handler. For these we need to branch to an
out-of-line (OOL) handler. But when we are running a relocatable kernel,
interrupt vectors till __end_interrupts marker are copied down to real
address 0x100. So, branching to labels (ie. OOL handlers) outside this
section must be handled differently (see LOAD_HANDLER()), considering
relocatable kernel, which would need at least 4 instructions.

However, branching from interrupt vector means that we corrupt the
CFAR (come-from address register) on POWER7 and later processors as
mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
that contains the part up to the point where the CFAR is saved in the
PACA should be part of the short interrupt vectors before we branch out
to OOL handlers.

But as mentioned already, there are interrupt vectors on 64-bit POWER
server processors that are only 32 bytes long (like vectors 0x4f00,
0x4f20, etc.), which cannot accommodate the above two cases at the same
time owing to space constraints. Currently, in these interrupt vectors,
we simply branch out to OOL handlers, without using LOAD_HANDLER(),
which leaves us vulnerable when running a relocatable kernel (e.g. the
kdump case). While this has been the case for some time now and kdump is
used widely, we were fortunate not to see any problems so far, for three
reasons:

  1. In almost all cases, production kernel (relocatable) is used for
     kdump as well, which would mean that crashed kernel's OOL handler
     would be at the same place where we end up branching to, from short
     interrupt vector of kdump kernel.
  2. Also, OOL handler was unlikely the reason for crash in almost all
     the kdump scenarios, which meant we had a sane OOL handler from
     crashed kernel that we branched to.
  3. On most 64-bit POWER server processors, page size is large enough
     that marking interrupt vector code as executable (see commit
     429d2e83) leads to marking OOL handler code from crashed kernel,
     that sits right below interrupt vector code from kdump kernel, as
     executable as well.

Let us fix this by moving the __end_interrupts marker down past OOL
handlers to make sure that we also copy OOL handlers to real address
0x100 when running a relocatable kernel.

This fix has been tested successfully in kdump scenario, on an LPAR with
4K page size by using different default/production kernel and kdump
kernel.

Also tested by manually corrupting the OOL handlers in the first kernel
and then kdump'ing, and then causing the OOL handlers to fire - mpe.

Fixes: c1fb6816fb1b ("powerpc: Add relocation on exception vector handlers")
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

QE-UART: add "fsl,t1040-ucc-uart" to of_device_id

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 11ca2b7ab432eb90906168c327733575e68d388f upstream.

New bindings use "fsl,t1040-ucc-uart" as the compatible for qe-uart.
So add it.

Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

wait/ptrace: assume __WALL if the child is traced

BugLink: http://bugs.launchpad.net/bugs/1590455
commit bf959931ddb88c4e4366e96dd22e68fa0db9527c upstream.

The following program (a simplified version of one generated by syzkaller)

#include <pthread.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <stdio.h>
#include <signal.h>

void *thread_func(void *arg)
{
	ptrace(PTRACE_TRACEME, 0,0,0);
	return 0;
}

int main(void)
{
	pthread_t thread;

	if (fork())
		return 0;

	while (getppid() != 1)
		;

	pthread_create(&thread, NULL, thread_func, NULL);
	pthread_join(thread, NULL);
	return 0;
}

creates an unreapable zombie if /sbin/init doesn't use __WALL.

This is not a kernel bug, at least in a sense that everything works as
expected: debugger should reap a traced sub-thread before it can reap the
leader, but without __WALL/__WCLONE do_wait() ignores sub-threads.

Unfortunately, it seems that /sbin/init in most (all?) distributions
doesn't use it and we have to change the kernel to avoid the problem.
Note also that most inits use sys_waitid(), which doesn't allow __WALL, so
the necessary user-space fix is not that trivial.
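
For illustration only, here is a minimal sketch of what the user-space
side would look like for a reaper that calls waitpid() rather than
waitid(). It is not part of this patch. The function name
reap_all_children() is hypothetical; __WALL is the Linux-specific wait
flag discussed above, and WNOHANG is added so the loop never blocks.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: reap every child that has already exited and
 * return how many were reaped. __WALL asks the kernel to consider all
 * children regardless of their exit signal, which includes clone()d
 * sub-threads that a plain waitpid() would skip.
 */
int reap_all_children(void)
{
	int reaped = 0;
	int status;

	for (;;) {
		pid_t pid = waitpid(-1, &status, __WALL | WNOHANG);

		if (pid > 0) {
			reaped++;	/* one zombie collected, look for more */
			continue;
		}
		if (pid == 0)
			break;		/* children exist, none has exited yet */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		break;			/* ECHILD (no children) or another error */
	}
	return reaped;
}
```

A reaper would typically call such a helper from its SIGCHLD handling
path. With the kernel change in this patch, a traced child becomes
eligible even when the caller does not pass __WALL.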

This patch just adds the "ptrace" check into eligible_child().  To some
degree this matches the "tsk->ptrace" in exit_notify(), ->exit_signal is
mostly ignored when the tracee reports to debugger.  Or WSTOPPED, the
tracer doesn't need to set this flag to wait for the stopped tracee.

This obviously means the user-visible change: __WCLONE and __WALL no
longer have any meaning for debugger.  And I can only hope that this won't
break something, but at least strace/gdb won't suffer.

We could make a more conservative change.  Say, we can take __WCLONE into
account, or !thread_group_leader().  But it would be nice to not
complicate these historical/confusing checks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Pedro Alves <palves@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: <syzkaller@googlegroups.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

mm: use phys_addr_t for reserve_bootmem_region() arguments

BugLink: http://bugs.launchpad.net/bugs/1590455
commit 4b50bcc7eda4d3cc9e3f2a0aa60e590fedf728c5 upstream.

Since commit 92923ca3aace ("mm: meminit: only set page reserved in the
memblock region") the reserved bit is set on reserved memblock regions.
However start and end address are passed as unsigned long.  This is only
32bit on i386, so it can end up marking the wrong pages reserved for
ranges at 4GB and above.

This was observed on a 32bit Xen dom0 which was booted with initial
memory set to a value below 4G but allowing to balloon in memory
(dom0_mem=1024M for example).  This would define a reserved bootmem
region for the additional memory (for example on an 8GB system there was
a reserved region covering the 4GB-8GB range).  But since the addresses
were passed on as unsigned long, this was actually marking all pages
from 0 to 4GB as reserved.
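
The truncation can be reproduced in isolation. The sketch below is not
the kernel code: ulong32_t and phys64_t are hypothetical stand-ins for a
32-bit unsigned long and a 64-bit phys_addr_t, the helper names are made
up, and it assumes 4K pages and an inclusive end address for the
reserved region.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ulong32_t;	/* stand-in for unsigned long on i386 */
typedef uint64_t phys64_t;	/* stand-in for a 64-bit phys_addr_t */

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4K pages */

/* Page frame number after the address was narrowed to 32 bits. */
uint64_t pfn_truncated(phys64_t addr)
{
	ulong32_t narrowed = (ulong32_t)addr;	/* bits 32 and up are lost */

	return narrowed >> SKETCH_PAGE_SHIFT;
}

/* Page frame number when the full 64-bit address is kept. */
uint64_t pfn_full(phys64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Region 4GB..8GB-1: truncated it starts at page 0, not at page 0x100000. */
void check_truncation_example(void)
{
	assert(pfn_truncated(0x100000000ULL) == 0);
	assert(pfn_truncated(0x1FFFFFFFFULL) == 0xFFFFF);
	assert(pfn_full(0x100000000ULL) == 0x100000);
	assert(pfn_full(0x1FFFFFFFFULL) == 0x1FFFFF);
}
```

Under these assumptions the narrowed range runs from page 0 to the last
page below 4GB, which matches the symptom described above.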

Fixes: 92923ca3aacef63 ("mm: meminit: only set page reserved in the memblock region")
Link: http://lkml.kernel.org/r/1463491221-10573-1-git-send-email-stefan.bader@canonical.com
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>