git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log

UBUNTU: SAUCE: powerpc/tm: Fix illegal TM state in signal handler

Currently it's possible that on returning from the signal handler
through the restore_tm_sigcontexts() code path (e.g. from a signal
caught due to a `trap` instruction executed in the middle of an HTM
block, or a deliberately constructed sigframe) an illegal TM state
(like TS=10 TM=0, i.e. "T0") is set in SRR1 and when `rfid` sets
implicitly the MSR register from SRR1 register on return to userspace
it causes a TM Bad Thing exception.

That illegal state can be set (a) by a malicious user that disables
the TM bit by tweaking the bits in uc_mcontext before returning from
the signal handler or (b) by a sufficient number of context switches
occurring such that the load_tm counter overflows and TM is disabled
whilst in the signal handler.

This commit fixes the illegal TM state by ensuring that TM bit is
always enabled before we return from restore_tm_sigcontexts(). A small
comment correction is made as well.

Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CVE-2017-1000255
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: SAUCE: powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks

When using transactional memory (TM), the CPU can be in one of six
states as far as TM is concerned, encoded in the Machine State
Register (MSR). Certain state transitions are illegal and if attempted
trigger a "TM Bad Thing" type program check exception.

If we ever hit one of these exceptions it's treated as a bug, ie. we
oops, and kill the process and/or panic, depending on configuration.

One case where we can trigger a TM Bad Thing, is when returning to
userspace after a system call or interrupt, using RFID. When this
happens the CPU first restores the user register state, in particular
r1 (the stack pointer) and then attempts to update the MSR. However
the MSR update is not allowed and so we take the program check with
the user register state, but the kernel MSR.

This tricks the exception entry code into thinking we have a bad
kernel stack pointer, because the MSR says we're coming from the
kernel, but r1 is pointing to userspace.

To avoid this we instead always switch to the emergency stack if we
take a TM Bad Thing from the kernel. That way none of the user
register values are used, other than for printing in the oops message.

Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Rewrite change log & comments, tweak asm slightly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
CVE-2017-1000255
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Ubuntu-4.10.0-36.40

Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

net: pending_confirm is not used anymore

BugLink: https://bugs.launchpad.net/bugs/1715812
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.
As last step, we can remove the pending_confirm flag.

Reported-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 51ce8bd4d17a761e1a90a34a1b5c9b762cce7553)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP

BugLink: https://bugs.launchpad.net/bugs/1715812
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.

The datagram protocols can use MSG_CONFIRM to confirm the
neighbour. When used with MSG_PROBE we do not reach the
code where neighbour is confirmed, so we have to do the
same slow lookup by using the dst_confirm_neigh() helper.
When MSG_PROBE is not used, ip_append_data/ip6_append_data
will set the skb flag dst_pending_confirm.

Reported-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 0dec879f636f11b0ffda1cb5fd96a1754c59ead3)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: add confirm_neigh method to dst_ops

BugLink: https://bugs.launchpad.net/bugs/1715812
Add confirm_neigh method to dst_ops and use it from IPv4 and IPv6
to lookup and confirm the neighbour. Its usage via the new helper
dst_confirm_neigh() should be restricted to MSG_PROBE users for
performance reasons.

For XFRM prefer the last tunnel address, if present. With help
from Steffen Klassert.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 63fca65d08632fbec9d9b655f671cf08aa1aeeb8)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

tcp: replace dst_confirm with sk_dst_confirm

BugLink: https://bugs.launchpad.net/bugs/1715812
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.
Use the new sk_dst_confirm() helper to propagate the
indication from received packets to sock_confirm_neigh().

Reported-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
Tested-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c3a2e8370534f810cac6050169db0ed3e0f94f0b)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

sctp: add dst_pending_confirm flag

BugLink: https://bugs.launchpad.net/bugs/1715812
Add new transport flag to allow sockets to confirm neighbour.
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.
The flag is propagated from transport to every packet.
It is reset when cached dst is reset.

Reported-by: YueHaibing <yuehaibing@huawei.com>
Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit c86a773c78025f5b825bacd7b846f4fa60dc0317)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

net: add dst_pending_confirm flag to skbuff

BugLink: https://bugs.launchpad.net/bugs/1715812
Add new skbuff flag to allow protocols to confirm neighbour.
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.

Add sock_confirm_neigh() helper to confirm the neighbour and
use it for IPv4, IPv6 and VRF before dst_neigh_output.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 4ff0620354f2b39b9fe2a91c22c4de9d1fba0c8e)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

sock: add sk_dst_pending_confirm flag

BugLink: https://bugs.launchpad.net/bugs/1715812
Add new sock flag to allow sockets to confirm neighbour.
When same struct dst_entry can be used for many different
neighbours we can not use it for pending confirmations.
As not all call paths lock the socket use full word for
the flag.

Add sk_dst_confirm as replacement for dst_confirm when
called for received packets.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9b8805a325591cf5b6b9df71200de25a2bd721fd)
Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

PCI: Disable VF decoding before pcibios_sriov_disable() updates resources

BugLink: http://bugs.launchpad.net/bugs/1715073
A struct resource represents the address space consumed by a device.  We
should not modify that resource while the device is actively using the
address space.  For VFs, pci_iov_update_resource() enforces this by
printing a warning and doing nothing if the VFE (VF Enable) and MSE (VF
Memory Space Enable) bits are set.

Previously, both sriov_enable() and sriov_disable() called the
pcibios_sriov_disable() arch hook, which may update the struct resource,
while VFE and MSE were enabled.  This effectively dropped the resource
update pcibios_sriov_disable() intended to do.

Disable VF memory decoding before calling pcibios_sriov_disable().

Reported-by: Carol L Soto <clsoto@us.ibm.com>
Tested-by: Carol L Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: shan.gavin@gmail.com
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
(cherry picked from commit 0fc690a7c3f7053613dcbab6a7613bb6586d8ee2)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

PCI: Lock each enable/disable num_vfs operation in sysfs

BugLink: http://bugs.launchpad.net/bugs/1715073
Enabling/disabling SRIOV via sysfs by echo-ing multiple values
simultaneously:

  # echo 63 > /sys/class/net/ethX/device/sriov_numvfs&
  # echo 63 > /sys/class/net/ethX/device/sriov_numvfs

  # sleep 5

  # echo 0 > /sys/class/net/ethX/device/sriov_numvfs&
  # echo 0 > /sys/class/net/ethX/device/sriov_numvfs

results in the following bug:

  kernel BUG at drivers/pci/iov.c:495!
  invalid opcode: 0000 [#1] SMP
  CPU: 1 PID: 8050 Comm: bash Tainted: G   W   4.9.0-rc7-net-next #2092
  RIP: 0010:[<ffffffff813b1647>]
    [<ffffffff813b1647>] pci_iov_release+0x57/0x60

  Call Trace:
   [<ffffffff81391726>] pci_release_dev+0x26/0x70
   [<ffffffff8155be6e>] device_release+0x3e/0xb0
   [<ffffffff81365ee7>] kobject_cleanup+0x67/0x180
   [<ffffffff81365d9d>] kobject_put+0x2d/0x60
   [<ffffffff8155bc27>] put_device+0x17/0x20
   [<ffffffff8139c08a>] pci_dev_put+0x1a/0x20
   [<ffffffff8139cb6b>] pci_get_dev_by_id+0x5b/0x90
   [<ffffffff8139cca5>] pci_get_subsys+0x35/0x40
   [<ffffffff8139ccc8>] pci_get_device+0x18/0x20
   [<ffffffff8139ccfb>] pci_get_domain_bus_and_slot+0x2b/0x60
   [<ffffffff813b09e7>] pci_iov_remove_virtfn+0x57/0x180
   [<ffffffff813b0b95>] pci_disable_sriov+0x65/0x140
   [<ffffffffa00a1af7>] ixgbe_disable_sriov+0xc7/0x1d0 [ixgbe]
   [<ffffffffa00a1e9d>] ixgbe_pci_sriov_configure+0x3d/0x170 [ixgbe]
   [<ffffffff8139d28c>] sriov_numvfs_store+0xdc/0x130
  ...
  RIP  [<ffffffff813b1647>] pci_iov_release+0x57/0x60

Use the existing mutex lock to protect each enable/disable operation.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
(cherry picked from commit 5b0948dfe138f0837699f46f5877f4f81c252dac)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard

BugLink: https://bugs.launchpad.net/bugs/1678477
Corsair Strafe RGB keyboard has trouble to initialize:

[ 1.679455] usb 3-6: new full-speed USB device number 4 using xhci_hcd
[ 6.871136] usb 3-6: unable to read config index 0 descriptor/all
[ 6.871138] usb 3-6: can't read configurations, error -110
[ 6.991019] usb 3-6: new full-speed USB device number 5 using xhci_hcd
[ 12.246642] usb 3-6: unable to read config index 0 descriptor/all
[ 12.246644] usb 3-6: can't read configurations, error -110
[ 12.366555] usb 3-6: new full-speed USB device number 6 using xhci_hcd
[ 17.622145] usb 3-6: unable to read config index 0 descriptor/all
[ 17.622147] usb 3-6: can't read configurations, error -110
[ 17.742093] usb 3-6: new full-speed USB device number 7 using xhci_hcd
[ 22.997715] usb 3-6: unable to read config index 0 descriptor/all
[ 22.997716] usb 3-6: can't read configurations, error -110

Although it may work after several times unpluging/pluging:

[ 68.195240] usb 3-6: new full-speed USB device number 11 using xhci_hcd
[ 68.337459] usb 3-6: New USB device found, idVendor=1b1c, idProduct=1b20
[ 68.337463] usb 3-6: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 68.337466] usb 3-6: Product: Corsair STRAFE RGB Gaming Keyboard
[ 68.337468] usb 3-6: Manufacturer: Corsair
[ 68.337470] usb 3-6: SerialNumber: 0F013021AEB8046755A93ED3F5001941

Tried three quirks: USB_QUIRK_DELAY_INIT, USB_QUIRK_NO_LPM and
USB_QUIRK_DEVICE_QUALIFIER, user confirmed that USB_QUIRK_DELAY_INIT alone
can workaround this issue. Hence add the quirk for Corsair Strafe RGB.

BugLink: https://bugs.launchpad.net/bugs/1678477
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit de3af5bf259d7a0bfaac70441c8568ab5998d80c)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0

CVE-2017-14106

When tcp_disconnect() is called, inet_csk_delack_init() sets
icsk->icsk_ack.rcv_mss to 0.
This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() =>
__tcp_select_window() call path to have division by 0 issue.
So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 499350a5a6e7512d9ed369ed63a4244b6536f4f8)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

CIFS: Fix maximum SMB2 header size

BugLink: http://bugs.launchpad.net/bugs/1713884
Currently the maximum size of SMB2/3 header is set incorrectly which
leads to hanging of directory listing operations on encrypted SMB3
connections. Fix this by setting the maximum size to 170 bytes that
is calculated as RFC1002 length field size (4) + transform header
size (52) + SMB2 header size (64) + create response size (56).

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
(cherry picked from commit 9e37b1784f2be9397a903307574ee565bbadfd75)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Input: trackpoint - assume 3 buttons when buttons detection fails

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1715271
Trackpoint buttons detection fails on ThinkPad 570 and 470 series,
this makes the middle button of the trackpoint to not being recogized.
As I don't believe there is any trackpoint with less than 3 buttons this
patch just assumes three buttons when the extended button information
read fails.

Signed-off-by: Oscar Campos <oscar.campos@member.fsf.org>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(backported from commit 293b915fd9bebf33cdc906516fb28d54649a25ac)
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

powerpc: Drop GPL from of_node_to_nid() export to match other arches

BugLink: http://bugs.launchpad.net/bugs/1709179
The generic implementation of of_node_to_nid() is EXPORT_SYMBOL, added
in commit 298535c00a2c ("of, numa: Add NUMA of binding
implementation.").

The powerpc implementation added in commit 953039c8df7b ("[PATCH]
powerpc: Allow devices to register with numa topology") is
EXPORT_SYMBOL_GPL.

This creates an inconsistency for of_node_to_nid() callers across
architectures.

Update the powerpc implementation to be exported consistently with the
generic implementation.

Signed-off-by: Shailendra Singh <shailendras@nvidia.com>
Reviewed-by: Andy Ritger <aritger@nvidia.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit be9ba9ff93cc3e44dc46da9ed25655780069411a)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Revert "vhost: cache used event for better performance"

BugLink: http://bugs.launchpad.net/bugs/1711251
commit 8d65843c44269c21e95c98090d9bb4848d473853 upstream

This reverts commit 809ecb9bca6a9424ccd392d67e368160f8b76c92.

Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

clocksource/drivers/arm_arch_timer: Avoid infinite recursion when ftrace is enabled

BugLink: https://bugs.launchpad.net/bugs/1713821
On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.

For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:

$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer

The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.

This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.

This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").

Fixes: 6acc71ccac71 ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit adb4f11e0a8f4e29900adb2b7af28b6bbd5c1fa4)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

clocksource/drivers/arm_arch_timer: Fix mem frame loop initialization

BugLink: https://bugs.launchpad.net/bugs/1713821
The loop to find the best memory frame in arch_timer_mem_acpi_init()
initializes the loop counter with itself ('i = i'), which is suspicious
in the first place and pointed out by clang. The loop condition is
'i < timer_count' and a prior for loop exits when 'i' reaches
'timer_count', therefore the second loop is never executed.

Initialize the loop counter with 0 to iterate over all timers, which
supposedly was the intention before the typo monster attacked.

Fixes: c2743a36765d3 ("clocksource: arm_arch_timer: add GTDT support for memory-mapped timer")
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit d197f7988721221fac64f899efd7657c15281810)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable

BugLink: https://bugs.launchpad.net/bugs/1713821
Fix boot warning 'Trying to vfree() nonexistent vm area'
from arch_timer_mem_of_init().

Refactored code attempts to read and iounmap using address frame
instead of address ioremap(frame->cntbase).

Fixes: c389d701dfb70 ("clocksource: arm_arch_timer: split MMIO timer probing.")
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Reviewed-by: Fu Wei <fu.wei@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit 3db1200ca21f3c63c9044185dc5762ef996848cb)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

clocksource/arm_arch_timer: Fix arch_timer_mem_find_best_frame()

BugLink: https://bugs.launchpad.net/bugs/1713821
arch_timer_mem_find_best_frame() looks through ARCH_TIMER_MEM_MAX_FRAMES
frames even after finding matches to ensure the best frame is chosen,
which means the variable frame will point to the last valid frame but
not necessarily the best frame.

On Juno, we get the following error as the wrong frame is returned as the
best frame from arch_timer_mem_find_best_frame():

  arch_timer: Unable to map frame @ 0x0000000000000000
  arch_timer: Frame missing phys irq.
  Failed to initialize '/timer@2a810000': -22

Fix the issue by correctly returning the best frame from
arch_timer_mem_find_best_frame().

Fixes: c389d701dfb7 ("clocksource: arm_arch_timer: split MMIO timer probing.")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1494246747-17267-1-git-send-email-sudeep.holla@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit f63d947c1673930bfc5f2f9bd1073a02c179a890)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

arm64: arch_timer: Enable CNTVCT_EL0 trap if workaround is enabled

BugLink: https://bugs.launchpad.net/bugs/1713821
Userspace being allowed to use read CNTVCT_EL0 anytime (and not
only in the VDSO), we need to enable trapping whenever a cntvct
workaround is enabled on a given CPU.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit a86bd139f2ae02f960caeb0c1dd5f871d3e087cd)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Revert "UBUNTU: SAUCE: arm64: arch_timer: Enable CNTVCT_EL0 trap if workaround is enabled"

BugLink: https://bugs.launchpad.net/bugs/1713821
This reverts commit 83f3444f36eba82cca61715f793d1f6411b06e3b.

Before this patch landed upstream, a fix was added to prevent CNTVCT_EL0
accesses from going untrapped if it was reset the the user access bit set:

https://www.spinics.net/lists/arm-kernel/msg574175.html

Let's inherit this fix by replacing with the upstream version.

Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Input: elan_i2c - add ELAN0608 to the ACPI table

BugLink: https://bugs.launchpad.net/bugs/1708852
Similar to commit 722c5ac708b4f ("Input: elan_i2c - add ELAN0605 to the
ACPI table"), ELAN0608 should be handled by elan_i2c.

This touchpad can be found in Lenovo ideapad 320-14IKB.

BugLink: https://bugs.launchpad.net/bugs/1708852
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
(cherry picked from commit 1874064eed0502bd9bef7be8023757b0c4f26883)
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Po-Hsu Lin (Sam) <po-hsu.lin@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

UBUNTU: Ubuntu-4.10.0-35.39

Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>

UBUNTU: SAUCE: s390/mm: fix race on mm->context.flush_mm

BugLink: http://bugs.launchpad.net/bugs/1708399
The order in __tlb_flush_mm_lazy is to flush TLB first and then clear
the mm->context.flush_mm bit. This can lead to missed flushes as the
bit can be set anytime, the order needs to be the other way aronud.

But this leads to a different race, __tlb_flush_mm_lazy may be called
on two CPUs concurrently. If mm->context.flush_mm is cleared first then
another CPU can bypass __tlb_flush_mm_lazy although the first CPU has
not done the flush yet. In a virtualized environment the time until the
flush is finally completed can be arbitrarily long.

Add a spinlock to serialize __tlb_flush_mm_lazy and use the function
in finish_arch_post_lock_switch as well.

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(cherry-picked from commit 60f07c8ec5fae06c23e9fd7bab67dabce92b3414 linux-next)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>

UBUNTU: SAUCE: s390/mm: fix local TLB flushing vs. detach of an mm address space

BugLink: http://bugs.launchpad.net/bugs/1708399
The local TLB flushing code keeps an additional mask in the mm.context,
the cpu_attach_mask. At the time a global flush of an address space is
done the cpu_attach_mask is copied to the mm_cpumask in order to avoid
future global flushes in case the mm is used by a single CPU only after
the flush.

Trouble is that the reset of the mm_cpumask is racy against the detach
of an mm address space by switch_mm. The current order is first the
global TLB flush and then the copy of the cpu_attach_mask to the
mm_cpumask. The order needs to be the other way around.

Cc: <stable@vger.kernel.org>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(cherry-picked from commit b3e5dc45fd1ec2aa1de6b80008f9295eb17e0659 linux-next)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Brad Figg <brad.figg@canonical.com>
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>

Bluetooth: Properly check L2CAP config option output buffer length

Validate the output buffer length for L2CAP config requests and responses
to avoid overflowing the stack buffer used for building the option blocks.

Cc: stable@vger.kernel.org
Signed-off-by: Ben Seri <ben@armis.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CVE-2017-1000251

(cherry-picked from commit e860d2c904d1a9f38a24eb44c9f34b8f915a6ea3)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Juerg Haefliger <juerg.haefliger@canonical.com>

UBUNTU: Ubuntu-4.10.0-34.38

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

powerpc/perf: Fix Power9 test_adder fields

BugLink: http://bugs.launchpad.net/bugs/1709964
Commit 8d911904f3ce4 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1')
was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable
the use of PMC5 using raw event code. But instead of updating the
power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the
power9_pmu structure. Fix it.

Fixes: 8d911904f3ce ("powerpc/perf: Add restrictions to PMC5 in power9 DD1")
Reported-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: Shriya <shriyak@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 8c218578fcbbbdb10416c8614658bf32e3bf1655)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: SAUCE: HID: multitouch: Support ALPS PTP stick with pid 0x120A

BugLink: https://bugs.launchpad.net/bugs/1712481

This patch adds ALPS PTP sticks with pid/device id 0x120A to the list of
devices supported by hid-multitouch.

Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

HID: multitouch: Support PTP Stick and Touchpad device

BugLink: https://bugs.launchpad.net/bugs/1712481

Support PTP Stick and Touchpad device. This Touchpad is Precision Touchpad
(PTP), and Stick Pointer data is the same as Mouse; Stick Pointer works as
Mouse.

[jkosina@suse.cz: changelog deuglification]
Signed-off-by: Masaki Ota <masaki.ota@jp.alps.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
(cherry picked from commit 504c932c7c47a6b4c572302a13873f7d83af1ff3)
Signed-off-by: Shrirang Bagul <shrirang.bagul@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: SAUCE: igb: add support for using Broadcom 54616 as PHY

BugLink: https://launchpad.net/bugs/1712024
Ported from packages/base/any/kernels/3.18.25/patches/driver-support-intel-igb-bcm54616-phy.patch
in OpenNetworkLinux https://github.com/opencomputeproject/OpenNetworkLinux/

Signed-off-by: Wen-chien Jesse Sung <jesse.sung@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/mm/radix: Avoid flushing the PWC on every flush_tlb_range

BugLink: http://bugs.launchpad.net/bugs/1709220
We do that because it's used by THP pmd collapsing, so use
instead a dedicated flush function.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 424de9c6e3f89399fc11afc1f53f89c5329132da linux-next)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/mm/radix: Improve TLB/PWC flushes

BugLink: http://bugs.launchpad.net/bugs/1709220
At the moment we have to rather sub-optimal flushing behaviours:

- flush_tlb_mm() will flush the PWC which is unnecessary (for example
   when doing a fork)

- A large unmap will call flush_tlb_pwc() multiple times causing us
   to perform that fairly expensive operation repeatedly. This happens
   often in batches of 3 on every new process.

So we change flush_tlb_mm() to only flush the TLB, and we use the
existing "need_flush_all" flag in struct mmu_gather to indicate
that the PWC needs flushing.

Unfortunately, flush_tlb_range() still needs to do a full flush
for now as it's used by the THP collapsing. We will fix that later.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(backported from commit a46cc7a90fd8d95bfbb2b27080efe872a1a51db4 linux-next)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/mm/radix: Improve _tlbiel_pid to be usable for PWC flushes

BugLink: http://bugs.launchpad.net/bugs/1709220
The PWC flush only needs a single set call, just like the
full (RIC=2) flush.

This will allow us to get rid of the dedicated _tlbiel_pwc()

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 5ce5fe14ed0302315061cf97ce67accd1b25b938 linux-next)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/mm/radix: Optimise tlbiel flush all case

BugLink: http://bugs.launchpad.net/bugs/1709220
_tlbiel_pid() is called with a ric (Radix Invalidation Control) argument of
either RIC_FLUSH_TLB or RIC_FLUSH_ALL.

RIC_FLUSH_ALL says to invalidate the entire TLB and the Page Walk Cache (PWC).

To flush the whole TLB, we have to iterate over each set (congruence class) of
the TLB. Currently we do that and pass RIC_FLUSH_ALL each time. That is not
incorrect but it means we flush the PWC 128 times, when once would suffice.

Fix it by doing the first flush with the ric value we're passed, and then if it
was RIC_FLUSH_ALL, we downgrade it to RIC_FLUSH_TLB, because we know we have
just flushed the PWC and don't need to do it again.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Split out of combined patch, tweak logic, rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit a5998fcb92552a18713b6aa5c146aa400e4d75ee)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

usb: xhci: Issue stop EP command only when the EP state is running

BugLink: https://bugs.launchpad.net/bugs/1711098
on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
issued the controller does not respond, when the EP is not in running
state. HW completes the command execution and reports
"Context State Error" completion code. This is as per the spec. However
HW on receiving the second command additionally marks EP to Flow control
state in HW which is RTL bug. This bug causes the HW not to respond
to any further doorbells that are rung by the driver. This makes the EP
to not functional anymore and causes gross functional failures.

As a workaround, not to hit this problem, it's better to check the EP state
and issue a stop EP command only when the EP is in running state.

As a sidenote, even with this patch there is still a possibility of
triggering the RTL bug if the context state races with the stop endpoint
command as described in xHCI spec 4.6.9

[code simplification and reworded sidenote in commit message -Mathias]
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 28a2369f7d72ece55089f33e7d7b9c1223673cc3)
Signed-off-by: Alberto Milone <alberto.milone@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

dma-buf: avoid scheduling on fence status query v2

BugLink: http://bugs.launchpad.net/bugs/1711096
When a timeout of zero is specified, the caller is only interested in
the fence status.

In the current implementation, dma_fence_default_wait will always call
schedule_timeout() at least once for an unsignaled fence. This adds a
significant overhead to a fence status query.

Avoid this overhead by returning early if a zero timeout is specified.

v2: move early return after enable_signaling

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170426144620.3560-1-andresx7@gmail.com
(cherry picked from commit 03c0c5f6641533f5fc14bf4e76d2304197402552)
Signed-off-by: Alberto Milone <alberto.milone@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION

BugLink: http://bugs.launchpad.net/bugs/1682644
On a dual controller setup with multipath enabled, some MEDIUM ERRORs
caused both paths to be failed, thus I/O got queued/blocked since the
'queue_if_no_path' feature is enabled by default on IPR controllers.

This example disabled 'queue_if_no_path' so the I/O failure is seen at
the sg_dd program.  Notice that after the sg_dd test-case, both paths
are in 'failed' state, and both path/priority groups are in 'enabled'
state (not 'active') -- which would block I/O with 'queue_if_no_path'.

    # sg_dd if=/dev/dm-2 bs=4096 count=1 dio=1 verbose=4 blk_sgio=0
    <...>
    read(unix): count=4096, res=-1
    sg_dd: reading, skip=0 : Input/output error
    <...>

    # dmesg
    [...] sd 2:2:16:0: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
    [...] sd 2:2:16:0: [sds] Sense Key : Medium Error [current]
    [...] sd 2:2:16:0: [sds] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] sd 2:2:16:0: [sds] CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
    [...] blk_update_request: I/O error, dev sds, sector 0
    [...] device-mapper: multipath: Failing path 65:32.
    <...>
    [...] device-mapper: multipath: Failing path 65:224.

    # multipath -l
    1IBM_IPR-0_59C2AE0000001F80 dm-2 IBM     ,IPR-0   59C2AE00
    size=5.2T features='0' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=enabled
    | `- 2:2:16:0 sds  65:32  failed undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0  sdae 65:224 failed undef running

This is not the desired behavior. The dm-multipath explicitly checks
for the MEDIUM ERROR case (and a few others) so not to fail the path
(e.g., I/O to other sectors could potentially happen without problems).
See dm-mpath.c :: do_end_io_bio() -> noretry_error() !->! fail_path().

The problem trace is:

1) ipr_scsi_done()  // SENSE KEY/CHECK CONDITION detected, go to..
2) ipr_erp_start()  // ipr_is_gscsi() and masked_ioasc OK, go to..
3) ipr_gen_sense()  // masked_ioasc is IPR_IOASC_MED_DO_NOT_REALLOC,
                    // so set DID_PASSTHROUGH.

4) scsi_decide_disposition()  // check for DID_PASSTHROUGH and return
                              // early on, faking a DID_OK.. *instead*
                              // of reaching scsi_check_sense().

                              // Had it reached the latter, that would
                              // set host_byte to DID_MEDIUM_ERROR.

5) scsi_finish_command()
6) scsi_io_completion()
7) __scsi_error_from_host_byte()  // That would be converted to -ENODATA
<...>
8) dm_softirq_done()
9) multipath_end_io()
10) do_end_io()
11) noretry_error()  // And that is checked in dm-mpath :: noretry_error()
                     // which would cause fail_path() not to be called.

With this patch applied, the I/O is failed but the paths are not.  This
multipath device continues accepting more I/O requests without blocking.
(and notice the different host byte/driver byte handling per SCSI layer).

    # dmesg
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 40 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK
    [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 10 00
    [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current]
    [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data
    [...] blk_update_request: critical medium error, dev sdaf, sector 0
    [...] blk_update_request: critical medium error, dev dm-6, sector 0
    [...] Buffer I/O error on dev dm-6, logical block 0, async page read

    # multipath -l 1IBM_IPR-0_59C2AE0000001F80
    1IBM_IPR-0_59C2AE0000001F80 dm-6 IBM     ,IPR-0   59C2AE00
    size=5.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='service-time 0' prio=0 status=active
    | `- 2:2:7:0  sdaf 65:240 active undef running
    `-+- policy='service-time 0' prio=0 status=enabled
      `- 1:2:7:0  sdh  8:112  active undef running

Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 785a470496d8e0a32e3d39f376984eb2c98ca5b3)
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: [Config] CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n for ppc64el

BugLink: http://bugs.launchpad.net/bugs/1709171
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
(cherry picked from artful commit 8f189e08c9a0719aa1e615fcb5380ecb9bf12e7f)
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

selftests: fix memory-hotplug test

BugLink: http://bugs.launchpad.net/bugs/1710868
In the memory offline test, the $ration was used with RANDOM as the
possibility to get it offlined, correct it to become the portion of
available removable memory blocks.

Also ask the tool to try to offline the next available memory block
if the attempt is unsuccessful. It will only fail if all removable
memory blocks are busy.

A nice example:
$ sudo ./test.sh
Test scope: 10% hotplug memory
online all hot-pluggable memory in offline state:
SKIPPED - no hot-pluggable memory in offline state
offline 10% hot-pluggable memory in online state
trying to offline 3 out of 28 memory block(s):
online->offline memory1
online->offline memory10
./test.sh: line 74: echo: write error: Resource temporarily unavailable
offline_memory_expect_success 10: unexpected fail
online->offline memory100
online->offline memory101
online all hot-pluggable memory in offline state:
offline->online memory1
offline->online memory100
offline->online memory101
skip extra tests: debugfs is not mounted
$ echo $?
0

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
(cherry picked from commit 5ff0c60b0e5e8c2a2139c304d7620b0ea70d721a)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

selftests: add missing test name in memory-hotplug test

BugLink: http://bugs.launchpad.net/bugs/1710868
There is no prompt for testing memory notifier error injection,
added with the same echo format of other tests above.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
(cherry picked from commit 02d8f075ac44f6f0dc46881965f815576921f2c0)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

selftests: check percentage range for memory-hotplug test

BugLink: http://bugs.launchpad.net/bugs/1710868
Check the precentage range for -r flag in memory-hotplug test.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
(cherry picked from commit 72441ea5886780d04ac96913e02dba9e84ea80e5)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

selftests: check hot-pluggagble memory for memory-hotplug test

BugLink: http://bugs.launchpad.net/bugs/1710868
Check for hot-pluggable memory availability in prerequisite() of the
memory-hotplug test.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
(cherry picked from commit a34b28c92ec8c92938de03c18c5fab32efd2e29a)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

selftests: typo correction for memory-hotplug test

BugLink: http://bugs.launchpad.net/bugs/1710868
Typo fixed for hotpluggable_offline_memory() in memory-hotplug test.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
(cherry picked from commit 593f927851a58fdca559b173c807a0b78c9f6e49)
Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: PPC: Book3S HV: Add radix checks in real-mode hypercall handlers

BugLink: http://bugs.launchpad.net/bugs/1686019
POWER9 running a radix guest will take some hypervisor interrupts
without going to real mode (turning off the MMU).  This means that
early hypercall handlers may now be called in virtual mode.  Most of
the handlers work just fine in both modes, but there are some that
can crash the host if called in virtual mode, notably the TCE (IOMMU)
hypercalls H_PUT_TCE, H_STUFF_TCE and H_PUT_TCE_INDIRECT.  These
already have both a real-mode and a virtual-mode version, so we
arrange for the real-mode version to return H_TOO_HARD for radix
guests, which will result in the virtual-mode version being called.

The other hypercall which is sensitive to the MMU mode is H_RANDOM.
It doesn't have a virtual-mode version, so this adds code to enable
it to be called in either mode.

An alternative solution was considered which would refuse to call any
of the early hypercall handlers when doing a virtual-mode exit from a
radix guest.  However, the XICS-on-XIVE code depends on the XICS
hypercalls being handled early even for virtual-mode exits, because
the handlers need to be called before the XIVE vCPU state has been
pulled off the hardware.  Therefore that solution would have become
quite invasive and complicated, and was rejected in favour of the
simpler, though less elegant, solution presented here.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from commit acde25726bc6034b628febb8a4c6c0838736ccbf)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: PPC: Reserve KVM_CAP_SPAPR_TCE_VFIO capability number

BugLink: http://bugs.launchpad.net/bugs/1686019
This adds a capability number for in-kernel support for VFIO on
SPAPR platform.

The capability will tell the user space whether in-kernel handlers of
H_PUT_TCE can handle VFIO-targeted requests or not. If not, the user space
must not attempt allocating a TCE table in the host kernel via
the KVM_CREATE_SPAPR_TCE KVM ioctl because in that case TCE requests
will not be passed to the user space which is desired action in
the situation like that.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(backported from commit 4898d3f49b5b156c33f0ae0f49ede417ab86195e)
[joserz: small change on file include/uapi/linux/kvm.h due to conflics
with defines included by MIPS and S390 patches.]
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/mmu: Add real mode support for IOMMU preregistered memory

BugLink: http://bugs.launchpad.net/bugs/1686019
This makes mm_iommu_lookup() able to work in realmode by replacing
list_for_each_entry_rcu() (which can do debug stuff which can fail in
real mode) with list_for_each_entry_lockless().

This adds realmode version of mm_iommu_ua_to_hpa() which adds
explicit vmalloc'd-to-linear address conversion.
Unlike mm_iommu_ua_to_hpa(), mm_iommu_ua_to_hpa_rm() can fail.

This changes mm_iommu_preregistered() to receive @mm as in real mode
@current does not always have a correct pointer.

This adds realmode version of mm_iommu_lookup() which receives @mm
(for the same reason as for mm_iommu_preregistered()) and uses
lockless version of list_for_each_entry_rcu().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 6b5c19c55266f6efd10ffac0e9f9f2b7aa420a58)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/vfio_spapr_tce: Add reference counting to iommu_table

BugLink: http://bugs.launchpad.net/bugs/1686019
So far iommu_table obejcts were only used in virtual mode and had
a single owner. We are going to change this by implementing in-kernel
acceleration of DMA mapping requests. The proposed acceleration
will handle requests in real mode and KVM will keep references to tables.

This adds a kref to iommu_table and defines new helpers to update it.
This replaces iommu_free_table() with iommu_tce_table_put() and makes
iommu_free_table() static. iommu_tce_table_get() is not used in this patch
but it will be in the following patch.

Since this touches prototypes, this also removes @node_name parameter as
it has never been really useful on powernv and carrying it for
the pseries platform code to iommu_free_table() seems to be quite
useless as well.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit e5afdf9dd515a9446c009f44f99f9bc2f91b89a7)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/iommu/vfio_spapr_tce: Cleanup iommu_table disposal

BugLink: http://bugs.launchpad.net/bugs/1686019
At the moment iommu_table can be disposed by either calling
iommu_table_free() directly or it_ops::free(); the only implementation
of free() is in IODA2 - pnv_ioda2_table_free() - and it calls
iommu_table_free() anyway.

As we are going to have reference counting on tables, we need an unified
way of disposing tables.

This moves it_ops::free() call into iommu_free_table() and makes use
of the latter. The free() callback now handles only platform-specific
data.

As from now on the iommu_free_table() calls it_ops->free(), we need
to have it_ops initialized before calling iommu_free_table() so this
moves this initialization in pnv_pci_ioda2_create_table().

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 11edf116e3a6352cfee6b1437d41603c9aff79c9)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/powernv/ioda2: Update iommu table base on ownership change

BugLink: http://bugs.launchpad.net/bugs/1686019
On POWERNV platform, in order to do DMA via IOMMU (i.e. 32bit DMA in
our case), a device needs an iommu_table pointer set via
set_iommu_table_base().

The codeflow is:
- pnv_pci_ioda2_setup_dma_pe()
- pnv_pci_ioda2_setup_default_config()
- pnv_ioda_setup_bus_dma() [1]

pnv_pci_ioda2_setup_dma_pe() creates IOMMU groups,
pnv_pci_ioda2_setup_default_config() does default DMA setup,
pnv_ioda_setup_bus_dma() takes a bus PE (on IODA2, all physical function
PEs as bus PEs except NPU), walks through all underlying buses and
devices, adds all devices to an IOMMU group and sets iommu_table.

On IODA2, when VFIO is used, it takes ownership over a PE which means it
removes all tables and creates new ones (with a possibility of sharing
them among PEs). So when the ownership is returned from VFIO to
the kernel, the iommu_table pointer written to a device at [1] is
stale and needs an update.

This adds an "add_to_group" parameter to pnv_ioda_setup_bus_dma()
(in fact re-adds as it used to be there a while ago for different
reasons) to tell the helper if a device needs to be added to
an IOMMU group with an iommu_table update or just the latter.

This calls pnv_ioda_setup_bus_dma(..., false) from
pnv_ioda2_release_ownership() so when the ownership is restored,
32bit DMA can work again for a device. This does the same thing
on obtaining ownership as the iommu_table point is stale at this point
anyway and it is safer to have NULL there.

We did not hit this earlier as all tested devices in recent years were
only using 64bit DMA; the rare exception for this is MPT3 SAS adapter
which uses both 32bit and 64bit DMA access and it has not been tested
with VFIO much.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit db08e1d53034a54fe177ced70476fda73954b9e9)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/powernv/iommu: Add real mode version of iommu_table_ops::exchange()

BugLink: http://bugs.launchpad.net/bugs/1686019
In real mode, TCE tables are invalidated using special
cache-inhibited store instructions which are not available in
virtual mode

This defines and implements exchange_rm() callback. This does not
define set_rm/clear_rm/flush_rm callbacks as there is no user for those -
exchange/exchange_rm are only to be used by KVM for VFIO.

The exchange_rm callback is defined for IODA1/IODA2 powernv platforms.

This replaces list_for_each_entry_rcu with its lockless version as
from now on pnv_pci_ioda2_tce_invalidate() can be called in
the real mode too.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit a540aa56ba3d29084f28710c8b93cc9c3c422943)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: PPC: VFIO: Add in-kernel acceleration for VFIO

BugLink: http://bugs.launchpad.net/bugs/1686019
This allows the host kernel to handle H_PUT_TCE, H_PUT_TCE_INDIRECT
and H_STUFF_TCE requests targeted an IOMMU TCE table used for VFIO
without passing them to user space which saves time on switching
to user space and back.

This adds H_PUT_TCE/H_PUT_TCE_INDIRECT/H_STUFF_TCE handlers to KVM.
KVM tries to handle a TCE request in the real mode, if failed
it passes the request to the virtual mode to complete the operation.
If it a virtual mode handler fails, the request is passed to
the user space; this is not expected to happen though.

To avoid dealing with page use counters (which is tricky in real mode),
this only accelerates SPAPR TCE IOMMU v2 clients which are required
to pre-register the userspace memory. The very first TCE request will
be handled in the VFIO SPAPR TCE driver anyway as the userspace view
of the TCE table (iommu_table::it_userspace) is not allocated till
the very first mapping happens and we cannot call vmalloc in real mode.

If we fail to update a hardware IOMMU table unexpected reason, we just
clear it and move on as there is nothing really we can do about it -
for example, if we hot plug a VFIO device to a guest, existing TCE tables
will be mirrored automatically to the hardware and there is no interface
to report to the guest about possible failures.

This adds new attribute - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE - to
the VFIO KVM device. It takes a VFIO group fd and SPAPR TCE table fd
and associates a physical IOMMU table with the SPAPR TCE table (which
is a guest view of the hardware IOMMU table). The iommu_table object
is cached and referenced so we do not have to look up for it in real mode.

This does not implement the UNSET counterpart as there is no use for it -
once the acceleration is enabled, the existing userspace won't
disable it unless a VFIO container is destroyed; this adds necessary
cleanup to the KVM_DEV_VFIO_GROUP_DEL handler.

This advertises the new KVM_CAP_SPAPR_TCE_VFIO capability to the user
space.

This adds real mode version of WARN_ON_ONCE() as the generic version
causes problems with rcu_sched. Since we testing what vmalloc_to_phys()
returns in the code, this also adds a check for already existing
vmalloc_to_phys() call in kvmppc_rm_h_put_tce_indirect().

This finally makes use of vfio_external_user_iommu_id() which was
introduced quite some time ago and was considered for removal.

Tests show that this patch increases transmission speed from 220MB/s
to 750..1020MB/s on 10Gb network (Chelsea CXGB3 10Gb ethernet card).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(backported from commit 121f80ba68f1a5779a36d7b3247206e60e0a7418)
[joserz: removed WARN_ON(!kref_read(&stit->kref)); to avoid a dependency
to commit 2c935bc, that adds sensible changes to other important kernel
components.]
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: PPC: Use preregistered memory API to access TCE list

BugLink: http://bugs.launchpad.net/bugs/1686019
VFIO on sPAPR already implements guest memory pre-registration
when the entire guest RAM gets pinned. This can be used to translate
the physical address of a guest page containing the TCE list
from H_PUT_TCE_INDIRECT.

This makes use of the pre-registrered memory API to access TCE list
pages in order to avoid unnecessary locking on the KVM memory
reverse map as we know that all of guest memory is pinned and
we have a flat array mapping GPA to HPA which makes it simpler and
quicker to index into that array (even with looking up the
kernel page tables in vmalloc_to_phys) than it is to find the memslot,
lock the rmap entry, look up the user page tables, and unlock the rmap
entry. Note that the rmap pointer is initialized to NULL
where declared (not in this patch).

If a requested chunk of memory has not been preregistered, this will
fall back to non-preregistered case and lock rmap.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from commit da6f59e19233efdda58f196afbae8e05f6030d7f)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: PPC: Pass kvm* to kvmppc_find_table()

BugLink: http://bugs.launchpad.net/bugs/1686019
The guest view TCE tables are per KVM anyway (not per VCPU) so pass kvm*
there. This will be used in the following patches where we will be
attaching VFIO containers to LIOBNs via ioctl() to KVM (rather than
to VCPU).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from commit 503bfcbe18576a79be0bc5173b23b530845e704a)
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

tty: pl011: fix initialization order of QDF2400 E44

The work-around for Qualcomm Technologies QDF2400 Erratum 44 hinges on a
global variable defined in the pl011 driver.  The ACPI SPCR parsing code
determines whether the work-around is needed, and if so, it changes the
console name from "pl011" to "qdf2400_e44".  The expectation is that
the pl011 driver will implement the work-around when it sees the console
name.  The global variable qdf2400_e44_present is set when that happens.

The problem is that work-around needs to be enabled when the pl011
driver probes, not when the console name is queried.  However, sbsa_probe()
is called before pl011_console_match().  The work-around appeared to work
previously because the default console on QDF2400 platforms was always
ttyAMA1.  The first time sbsa_probe() is called (for ttyAMA0),
qdf2400_e44_present is still false.  Then pl011_console_match() is called,
and it sets qdf2400_e44_present to true.  All subsequent calls to
sbsa_probe() enable the work-around.

The solution is to move the global variable into spcr.c and let the
pl011 driver query it during probe time.  This works because all QDF2400
platforms require SPCR, so parse_spcr() will always be called.
pl011_console_match still checks for the "qdf2400_e44" console name,
but it doesn't do anything else special.

BugLink: https://launchpad.net/bugs/1709123
Fixes: 5a0722b898f8 ("tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44")
Tested-by: Jeffrey Hugo <jhugo@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(backported from commit 37ef38f3f83891a2f413fb872bae7d0f9bb95b27)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: SAUCE: aufs: bugfix, for v4.10, copy-up on XFS branch

BugLink: http://bugs.launchpad.net/bugs/1709749
The commit for linux-v4.10
6552321 2016-11-30 xfs: remove i_iolock and use i_rwsem in the
VFS inode instead
broke aufs' copy-up because XFS acquires inode_lock for a normal read().
Additionally XFS has its version and older one doesn't support
clone_file_range().
This commit fixes the error path after trying clone_file_range().

Reported-by: Vasily Tarasov <tarasov@vasily.name>
See-also: http://www.mail-archive.com/aufs-users@lists.sourceforge.net/msg05498.html
Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
(cherry picked from commit e34c81ff96415c64ca827ec30e7935454d26c1d3
https://github.com/sfjro/aufs4-standalone.git)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: SAUCE: aufs: for v4.5, use vfs_clone_file_range() in copy-up

BugLink: http://bugs.launchpad.net/bugs/1709749
In mainline, ioctl(FICLONE) is introduced by the commit
04b38d6 2015-12-07 vfs: pull btrfs clone API to vfs layer
so are vfs_clone_file_ranage() and f_op->clone_file_ranage().
Compared to copy_file_range(2), cloning doesn't return with the partial
success. Using this method in aufs copy-up, the speed will be improved.

But unfortunately this method is supported by nfs4.2, btrfs and cifs
only (currently). Additionally, linux nfs server 4.2 implementation
simply calls vfs_clone_file_ranage(), which means if the backend fs
doesn't support this operation, it returns EOPNOTSUPP.

So the benefit is rather limited, but it must be a good thing.

Signed-off-by: J. R. Okajima <hooanon05g@gmail.com>
(backported from commit b4d3dcc92a13d53952fe6e9a640201ef87475302
https://github.com/sfjro/aufs4-standalone.git)
[saf: Resolved conflicts based primarily on resolution found in
fd18affa818115edad7e1b7472f26ac4d73e73a1]
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

ACPI: APD: Fix HID for Hisilicon Hip07/08

BugLink: https://bugs.launchpad.net/bugs/1711182
ACPI HID for Hisilicon Hip07/08 should be HISI02A1/2,
not HISI0A21/2, HISI02A1/2 was tested ok but was modified
by the stupid typo when upstream the patches (by me),
correct them to the right IDs (matching the IDs in
drivers/i2c/busses/i2c-designware-platdrv.c).

Fixes: 6e14cf361a0c (ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller)
Reported-by: Tao Tian <tiantao6@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(backported from commit f7f3dd5b4cbb138ed4559b0d096bab76a8f476de)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

powerpc/perf: Avoid spurious PMU interrupts after idle

BugLink: http://bugs.launchpad.net/bugs/1709352
POWER9 DD2 can see spurious PMU interrupts after state-loss idle in
some conditions.

A solution is to save and reload MMCR0 over state-loss idle.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 101dd590a7fa37954540cf3149a1c502c0acc524)
Signed-off-by: Rodrigo R. Galvao <rosattig@linux.vnet.ibm.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

block: fix bio_will_gap() for first bvec with offset

BugLink: http://bugs.launchpad.net/bugs/1709073
Commit 729204ef49ec("block: relax check on sg gap") allows us to merge
bios, if both are physically contiguous. This change can merge a huge
number of small bios, through mkfs for example, mkfs.ntfs running time
can be decreased to ~1/10.

But if one rq starts with a non-aligned buffer (the 1st bvec's bv_offset
is non-zero) and if we allow the merge, it is quite difficult to respect
sg gap limit, especially the max segment size, or we risk having an
unaligned virtual boundary. This patch tries to avoid the issue by
disallowing a merge, if the req starts with an unaligned buffer.

Also add comments to explain why the merged segment can't end in
unaligned virt boundary.

Fixes: 729204ef49ec ("block: relax check on sg gap")
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Rewrote parts of the commit message and comments.

Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 5a8d75a1b8c99bdc926ba69b7b7dbe4fae81a5af)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx()

The lower level nl80211 code in cfg80211 ensures that "len" is between
25 and NL80211_ATTR_FRAME (2304).  We subtract DOT11_MGMT_HDR_LEN (24) from
"len" so thats's max of 2280.  However, the action_frame->data[] buffer is
only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can
overflow.

memcpy(action_frame->data, &buf[DOT11_MGMT_HDR_LEN],
       le16_to_cpu(action_frame->len));

Cc: stable@vger.kernel.org # 3.9.x
Fixes: 18e2f61db3b70 ("brcmfmac: P2P action frame tx.")
Reported-by: "freenerguo(郭大兴)" <freenerguo@tencent.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-7541

(cherry-picked from commit 8f44c9a41386729fea410e688959ddaa9d51be7c)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: [Packaging] sort ABI files with C.UTF-8 locale

BugLink: https://bugs.launchpad.net/bugs/1712345
Whenever we update the ABI files, the files may be sorted in a different
order, even though their contents are the same. That happens because the
system updating the ABI files may use a different locale than the one
that was used previously.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Marcelo Cerri <marcelo.cerri@canonical.com>
Acked-by: Colin King <colin.king@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: [Debian] Don't depend on initramfs-tools

BugLink: http://bugs.launchpad.net/bugs/1700972
Allow images to be created without the need of an initrd and also allow
users to run without an initrd if they want to.

Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>

UBUNTU: Ubuntu-4.10.0-33.37

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

udp: consistently apply ufo or fragmentation

When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.

Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.

Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.

A gso skb must have a partial checksum, do not follow sk_no_check_tx
in udp_send_skb.

Found by syzkaller.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-1000112

(cherry-picked from commit 85f1bd9a7b5a79d5baa8bf44af19658f7bf77bfa)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

packet: fix tp_reserve race in packet_set_ring

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CVE-2017-1000111

(cherry-picked from commit c27927e372f0785f3303e8fad94b85945e2c97b7)
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"

This reverts commit e824e06eda7dfac325d937c33cda0d45aaf4509c.

CVE-2017-1000111

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

Revert "udp: consistently apply ufo or fragmentation"

This reverts commit 5967f2e74e50c277759c900e415a5ef49ac1c635.

CVE-2017-1000112

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>

KVM: arm64: Log an error if trapping a write-to-read-only GICv3 access

BugLink: https://bugs.launchpad.net/bugs/1673564
A write-to-read-only GICv3 access should UNDEF at EL1. But since
we're in complete paranoia-land with broken CPUs, let's assume the
worse and gracefully handle the case.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported picked from commit 7b1dba1f7325629427c0e5bdf014159b229d16c8)
[ dannf: Recoded sys_reg_descs[] entries to use pre-SYS_DESC() syntax and
minor context changes. ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: Log an error if trapping a read-from-write-only GICv3 access

BugLink: https://bugs.launchpad.net/bugs/1673564
A read-from-write-only GICv3 access should UNDEF at EL1. But since
we're in complete paranoia-land with broken CPUs, let's assume the
worse and gracefully handle the case.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit e7f1d1eef482150a64a6e6ad8faf40f8f97eed67)
[ dannf: Dropped hunk in access_pmswinc since WO check wasn't added
until e0443230. Recoded sys_reg_descs[] entries to use pre-SYS_DESC()
syntax. Other context changes. ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

arm64: KVM: Make unexpected reads from WO registers inject an undef

BugLink: https://bugs.launchpad.net/bugs/1673564
Reads from write-only system registers are generally confined to
EL1 and not propagated to EL2 (that's what the architecture
mantates). In order to be sure that we have a sane behaviour
even in the unlikely event that we have a broken system, we still
handle it in KVM.

In that case, let's inject an undef into the guest.

Let's also remove write_to_read_only which isn't used anywhere.

Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit 7b5b4df1a79954e0b208630fc63b16ec0231a516)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Log which GICv3 system registers are trapped

BugLink: https://bugs.launchpad.net/bugs/1673564
In order to facilitate debug, let's log which class of GICv3 system
registers are trapped.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 2873b5082c5fbe2037f12e8d2abc8ad8f8d82a1b)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: Enable GICv3 common sysreg trapping via command-line

BugLink: https://bugs.launchpad.net/bugs/1673564
Now that we're able to safely handle common sysreg access, let's
give the user the opportunity to enable it by passing a specific
command-line option (vgic_v3.common_trap).

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit ff89511ef29b794d6a9c6b62f5ea76fc013cdae7)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_PMR_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading/writing the guest's view of the ICC_PMR_EL1
register, which is located in the ICH_VMCR_EL2.VPMR field.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 6293d6514d6b0945210e15edcecc71d99cfca97c)
[ dannf: Dropped SYS_ prefix on ICC_PMR_EL1 macro, which wasn't
added upstream until 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_CTLR_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading/writing the guest's view of the ICV_CTLR_EL1
register. only EOIMode and CBPR are of interest here, as all the other
bits directly come from ICH_VTR_EL2 and are Read-Only.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit d840b2d37d3ef2463db38274c6fd72619acbe216)
[ dannf: Dropped SYS_ prefix on ICC_DIR_EL1 macro, which wasn't
added upstream until 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_RPR_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading the guest's view of the ICV_RPR_EL1
register, returning the highest active priority.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 43515894c06f856b7743145e002591309f60b247)
[ dannf: Drop the SYS_ prefix and move the ICV_RPR_EL1 macro definition
from asm/sysreg.h to asm/arch_gicv3.h for consistency with code
prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_DIR_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for writing the guest's view of the ICC_DIR_EL1
register, performing the deactivation of an interrupt if EOImode
is set ot 1.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 40228ba57c8532d4977d684a363915402d85dbf9)
[ dannf: Dropped SYS_ prefix on ICV_DIR_EL1 macro, which wasn't
added upstream until 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

arm64: Add workaround for Cavium Thunder erratum 30115

BugLink: https://bugs.launchpad.net/bugs/1673564
Some Cavium Thunder CPUs suffer a problem where a KVM guest may
inadvertently cause the host kernel to quit receiving interrupts.

Use the Group-0/1 trapping in order to deal with it.

[maz]: Adapted patch to the Group-0/1 trapping, reworked commit log

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 690a341577f9adf2c275ababe0dcefe91898bbf0)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

UBUNTU: [Config] CONFIG_CAVIUM_ERRATUM_30115=y

BugLink: https://bugs.launchpad.net/bugs/1673564
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

arm64: Add MIDR values for Cavium cn83XX SoCs

BugLink: https://bugs.launchpad.net/bugs/1673564
Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit e982276d8f5c974b838fb22ba8d592feb039a544)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line

BugLink: https://bugs.launchpad.net/bugs/1673564
Now that we're able to safely handle Group-0 sysreg access, let's
give the user the opportunity to enable it by passing a specific
command-line option (vgic_v3.group0_trap).

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit e23f62f76acea232d4def5d4eb21709a2b575f14)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers

BugLink: https://bugs.launchpad.net/bugs/1673564
In order to be able to trap Group-0 GICv3 system registers, we need to
set ICH_HCR_EL2.TALL0 begore entering the guest. This is conditionnaly
done after having restored the guest's state, and cleared on exit.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit abf55766f7b062234083ff612446ff8d47e2417e)
[ dannf: context fixes ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add misc Group-0 handlers

BugLink: https://bugs.launchpad.net/bugs/1673564
A number of Group-0 registers can be handled by the same accessors
as that of Group-1, so let's add the required system register encodings
and catch them in the dispatching function.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit eab0b2dc4f6f34147e3d10da49ab8032e15dbea0)
[ dannf: Drop the SYS_ prefix and move the system register macro
definitions from asm/sysreg.h to asm/arch_gicv3.h for consistency
with code prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading/writing the guest's view of the ICC_IGRPEN0_EL1
register, which is located in the ICH_VMCR_EL2.VENG0 field.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit fbc48a0011deb3d51cb657ca9c0f9083f41c0665)
[ dannf: Drop the SYS_ prefix and move the ICV_IGNREN0_EL1 macro definition
from asm/sysreg.h to asm/arch_gicv3.h for consistency with code
prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading/writing the guest's view of the ICC_BPR0_EL1
register, which is located in the ICH_VMCR_EL2.BPR0 field.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 423de85a98c2b50715a0784a74f6124fbc0b1548)
[ dannf: Drop the SYS_ prefix and move the ICV_BPR0_EL1 macro definition
from asm/sysreg.h to asm/arch_gicv3.h for consistency with code
prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line

BugLink: https://bugs.launchpad.net/bugs/1673564
Now that we're able to safely handle Group-1 sysreg access, let's
give the user the opportunity to enable it by passing a specific
command-line option (vgic_v3.group1_trap).

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(cherry picked from commit 182936eee7e6b1956881b51bc534a541dc71a101)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers

BugLink: https://bugs.launchpad.net/bugs/1673564
In order to be able to trap Group-1 GICv3 system registers, we need to
set ICH_HCR_EL2.TALL1 before entering the guest. This is conditionally
done after having restored the guest's state, and cleared on exit.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 9c7bfc288c71068ab323b802dba2eb87fd08b127)
[ dannf: Context adjustments ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading the guest's view of the ICV_HPPIR1_EL1
register. This is a simple parsing of the available LRs, extracting the
highest available interrupt.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 2724c11a1df4b22ee966c04809ea0e808f66b04e)
[ dannf: Drop the SYS_ prefix and move the ICV_HPPIR1_EL1 macro definition
from asm/sysreg.h to asm/arch_gicv3.h for consistency with code
prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading/writing the guest's view of the ICV_AP1Rn_EL1
registers. We just map them to the corresponding ICH_AP1Rn_EL2 registers.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit f9e7449c780f688bf61a13dfa8c344afeb4ad6e0)
[ dannf: Drop the SYS_ prefix and move the ICV_AP1Rn_EL1 macro definition
from asm/sysreg.h to asm/arch_gicv3.h for consistency with code
prior to 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for writing the guest's view of the ICC_EOIR1_EL1
register. This involves dropping the priority of the interrupt,
and deactivating it if required (EOImode == 0).

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit b6f49035b4bf6e2709f2a5fed3107f5438c1fd02)
[ dannf: Dropped SYS_ prefix on ICV_EOIR1_EL1 macro, which wasn't
added upstream until 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler

BugLink: https://bugs.launchpad.net/bugs/1673564
Add a handler for reading the guest's view of the ICC_IAR1_EL1
register. This involves finding the highest priority Group-1
interrupt, checking against both PMR and the active group
priority, activating the interrupt and setting the group
priority as active.

Tested-by: Alexander Graf <agraf@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
(backported from commit 132a324ab62fe4fb8d6dcc2ab4eddb0e93b69afe)
[ dannf: Dropped SYS_ prefix on ICV_IAR1_EL1 macro, which wasn't
added upstream until 0e9884fe ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>