git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log

lpfc: Fix external loopback failure.

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix external loopback failure.

Rx sequence reassembly was incorrect.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4360ca9c24388e44cb0e14861a62fff43cf225c0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix mbox reuse in PLOGI completion

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix mbox reuse in PLOGI completion. Moved allocations so that buffer
properly init'd.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 01c73bbcd7cc4f31f45a1b0caeacdba46acd9c9c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Use new FDMI speed definitions for 10G, 25G and 40G FCoE.

BugLink: http://bugs.launchpad.net/bugs/1541592
Use new FDMI speed definitions for 10G, 25G and 40G FCoE.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a085e87c814567c94e5d375e7362f9f25030aac1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Make write check error processing more resilient

BugLink: http://bugs.launchpad.net/bugs/1541592
Make write check error processing more resilient.

Checks to catch writes that fw reports weren't fully complete yet SCSI
status indicated fine needed correction.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 5afab6bbf3f026b7d50451acbfdc12300c5f4353)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix RDP ACC being too long.

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix RDP ACC being too long.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit eb8d68c9930f7f9c8f3f4a6059b051b32077a735)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix RDP Speed reporting.

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix RDP Speed reporting.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 81e7517723fc17396ba91f59312b3177266ddbda)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Modularize and cleanup FDMI code in driver

BugLink: http://bugs.launchpad.net/bugs/1541592
Modularize, cleanup, add comments - for FDMI code in driver

Note: I don't like the comments with leading # - but as we have a lot if
present, I'm deferring to handle it in one big fix later.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4258e98ee3862ca7036654b43c839ab7668043e0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix crash in fcp command completion path.

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix crash in fcp command completion path.

Missed null check.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit c90261dcd86e4eb5c9c1627fde037e902db8aefa)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix driver crash when module parameter lpfc_fcp_io_channel set to 16

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix driver crash when module parameter lpfc_fcp_io_channel set to 16

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 6690e0d4fc5cccf74534abe0c9f9a69032bc02f0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix RegLogin failed error seen on Lancer FC during port bounce

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix RegLogin failed error seen on Lancer FC during port bounce

Fix the statemachine and ref counting.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 4b7789b71c916f79a3366da080101014473234c3)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix the FLOGI discovery logic to comply with T11 standards

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix the FLOGI discovery logic to comply with T11 standards

We weren't properly setting fabric parameters, such as R_A_TOV and E_D_TOV,
when we registered the vfi object in default configs and pt2pt configs.
Revise to now pass service params with the values to the firmware and
ensure they are reset on link bounce. Required reworking the call sequence
in the discovery threads.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d6de08cc46269899988b4f40acc7337279693d4b)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

lpfc: Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get.

BugLink: http://bugs.launchpad.net/bugs/1541592
Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get.

Signed-off-by: Dick Kennedy <dick.kennedy@avagotech.com>
Signed-off-by: James Smart <james.smart@avagotech.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f5cb5304eb26d307c9b30269fb0e007e0b262b7d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] update spl/zfs version

BugLink: http://bugs.launchpad.net/bugs/1542296
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] Fixed Vcs-Git

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] CONFIG_ARM64_VA_BITS=48

On NUMA implementations of Cavium ThunderX, node1 memory addresses start with
bit 40 set to 1, and therefore requires >= 41 bits of VA. A side effect of
this is an increase from 3 to 4 page table levels.

Note: The alternative way to increase VA bits w/o moving to 4 level page
tables would be to adopt a larger page size. That would have a far more
significant impact to memory usage, and is known to have issue with current
Ubuntu userspace:

https://lists.ubuntu.com/archives/ubuntu-devel/2014-December/038572.html
http://bugs.launchpad.net/bugs/1520162

Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: Enable PCI device ID for future IBM CXL adapter

Add support for future IBM Coherent Accelerator (CXL) device
with ID of 0x0601.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 68adb7bfd66504e97364651fb7dac3f9c8aa8561)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: use -Werror only with CONFIG_PPC_WERROR

Some developers really like to have -Werror enabled for their code, as
it helps to ensure warning free code. Others don't want -Werror, as it
(for example) can cause problems when newer (or older) compilers have
different sets of warnings, or new warnings can appear just when turning
up the warning level (e.g., make W=1 or W=2). Thus, it seems prudent to
have the use of -Werror be configurable.

It so happens that cxl is only built on PowerPC, and PowerPC already
has a nice set of Kconfig options for this, under CONFIG_PPC_WERROR. So
let's use that, and the world is a happy place again! (Note that
PPC_WERROR defaults to =y, so the common case compile should still be
enforcing -Werror.)

Fixes: d3d73f4b38a8 ("cxl: Compile with -Werror")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 57f7c3932516b9f7908d9b0a24396112d0f4ca55)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: fix build for GCC 4.6.x

GCC 4.6.3 does not support -Wno-unused-const-variable. Instead, use the
kbuild infrastructure that checks if this options exists.

Fixes: 2cd55c68c0a4 ("cxl: Fix build failure due to -Wunused-variable behaviour change")
Suggested-by: Michal Marek <mmarek@suse.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit aa09545589ceeff884421d8eb38d04963190afbe)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Enable device id for future IBM CXL adapter

This drop enables a future card with a device id of 0x0600 to be
recognized by the cxlflash driver.

As per the design, the Accelerator Function Unit (AFU) for this new IBM
CXL Flash Adapter retains the same host interface as the previous
generation. For the early prototypes of the new card, the driver with
this change behaves exactly as the driver prior to this behaved with the
earlier generation card. Therefore, no card specific programming has
been added. These card specific changes can be staged in later if
needed.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit a2746fb16e41b7c8f02aa4d2605ecce97abbebbd)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Resolve oops in wait_port_offline

If an async error interrupt is generated, and the error requires the FC
link to be reset, it cannot be performed in the interrupt context. So a
work element is scheduled to complete the link reset in a process
context. If either an EEH event or an escalation occurs in between when
the interrupt is generated and the scheduled work is started, the MMIO
space may no longer be available. This will cause an oops in the worker
thread.

[  606.806583] NIP kthread_data+0x28/0x40
[  606.806633] LR wq_worker_sleeping+0x30/0x100
[  606.806694] Call Trace:
[  606.806721] 0x50 (unreliable)
[  606.806796] wq_worker_sleeping+0x30/0x100
[  606.806884] __schedule+0x69c/0x8a0
[  606.806959] schedule+0x44/0xc0
[  606.807034] do_exit+0x770/0xb90
[  606.807109] die+0x300/0x460
[  606.807185] bad_page_fault+0xd8/0x150
[  606.807259] handle_page_fault+0x2c/0x30
[  606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash]

To prevent the problem space area from being unmapped, when there is
pending work, a mapcount (using the kref mechanism) is held.  The
mapcount is released only when the work is completed.  The last
reference release is tied to the unmapping service.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b45cdbaf9f7f0486847c52f60747fb108724652a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Fix to resolve cmd leak after host reset

After a few iterations of resetting the card, either during EEH
recovery, or a host_reset the following is seen in the logs. cxlflash
0008:00: cxlflash_queuecommand: could not get a free command

At every reset of the card, the commands that are outstanding are being
leaked. No effort is being made to reap these commands. A few more
resets later, the above error message floods the logs and the card is
rendered totally unusable as no free commands are available.

Iterated through the 'cmd' queue and printed out the 'free' counter and
found that on each reset certain commands were in-use and stayed in-use
through subsequent resets.

To resolve this issue, when the card is reset, reap all the commands
that are active/outstanding.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit ee91e332a6e6e9b939f60f6e1bd72fb2def5290d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Removed driver date print

Having a date for the driver requires it to be updated quite
often. Removing the date which is not necessary. Also made
use of the existing symbol to print the driver name.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 85599218914dadad3347eaa4337e71f09f39e78f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: Fix DSI misses when the context owning task exits

Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
store the pid of the current task_struct and use it to get pointer to
the mm_struct of the process, while processing page or segment faults
from the capi card. However this causes issues when the thread that had
originally issued the start-work ioctl exits in which case the stored
pid is no more valid and the cxl driver is unable to handle faults as
the mm_struct corresponding to process is no more accessible.

This patch fixes this issue by using the mm_struct of the next alive
task in the thread group. This is done by iterating over all the tasks
in the thread group starting from thread group leader and calling
get_task_mm on each one of them. When a valid mm_struct is obtained the
pid of the associated task is stored in the context replacing the
exiting one for handling future faults.

The patch introduces a new function named get_mem_context that checks if
the current task pointed to by ctx->pid is dead? If yes it performs the
steps described above. Also a new variable cxl_context.glpid is
introduced which stores the pid of the thread group leader associated
with the context owning task.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com>
Suggested-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 7b8ad495d59280b634a7b546f4cdf58cf4d65f61)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: drop unlikely before IS_ERR_OR_NULL

IS_ERR_OR_NULL already contain an unlikely compiler flag. Drop it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Acked-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 21891a452a42afc2313f1e3a69040e46c1d068c1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: use correct operator when writing pcie config space values

When writing a value to config space, cxl_pcie_write_config() calls
cxl_pcie_config_info() to obtain a mask and shift value, shifts the new
value accordingly, then uses the mask to combine the shifted value with the
existing value at the address as part of a read-modify-write pattern.

Currently, we use a logical OR operator rather than a bitwise OR operator,
which means any use of this function results in an incorrect value being
written. Replace the logical OR operator with a bitwise OR operator so the
value is written correctly.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@vger.kernel.org
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 48f0f6b717e314a30be121b67e1d044f6d311d66)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxl: Fix possible idr warning when contexts are released

An idr warning is reported when a context is release after the capi card
is unbound from the cxl driver via sysfs. Below are the steps to
reproduce:

1. Create multiple afu contexts in an user-space application using libcxl.
2. Unbind capi card from cxl using command of form
echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
3. Exit/kill the application owning afu contexts.

After above steps a warning message is usually seen in the kernel logs
of the form "idr_remove called for id=<context-id> which is not
allocated."

This is caused by the function cxl_release_afu which destroys the
contexts_idr table. So when a context is release no entry for context pe
is found in the contexts_idr table and idr code prints this warning.

This patch fixes this issue by increasing & decreasing the ref-count on
the afu device when a context is initialized or when its freed
respectively. This prevents the afu from being released until all the
afu contexts have been released. The patch introduces two new functions
namely cxl_afu_get/put that manage the ref-count on the afu device.

Also the patch removes code inside cxl_dev_context_init that increases ref
on the afu device as its guaranteed to be alive during this function.

Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 1b5df59e50874b9034c0fa389cd52b65f1f93292)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: Ubuntu-4.4.0-3.17

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: rebase to v4.4.1

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Treat Fibre Channel devices as performance critical

For performance critical devices, we distribute the incoming
channel interrupt load across available CPUs in the guest.
Include Fibre channel devices in the set of devices for which
we would distribute the interrupt load.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 879a650a273bc3efb9d472886b8ced12630ea8ed)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy

The return type of hvt_op_poll() is unsigned int and -EBADF is
inappropriate, poll functions return POLL* statuses.

Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 77b744a598d604de49df79cf161bbd1809a6948a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: fix the building warning with hyperv-keyboard

With the recent change af3ff643ea91ba64dd8d0b1cbed54d44512f96cd
(Drivers: hv: vmbus: Use uuid_le type consistently), we always get this
warning:

  CC [M]  drivers/input/serio/hyperv-keyboard.o
drivers/input/serio/hyperv-keyboard.c:427:2: warning: missing braces around
initializer [-Wmissing-braces]
  { HV_KBD_GUID, },
  ^
drivers/input/serio/hyperv-keyboard.c:427:2: warning: (near initialization
for .id_table[0].guid.b.) [-Wmissing-braces]

The patch fixes the warning.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2048157ad02e65f6327118dd4a7b9c9f1fd12f77)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: Move struct hv_timer_message_payload into UAPI Hyper-V x86 header

This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c71acc4c74dddebbbbeede69fdd4f0b1a124f9df)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: Move struct hv_message into UAPI Hyper-V x86 header

This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5b423efe11e822e092e8c911a6bad17eadf718eb)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: Move HV_SYNIC_STIMER_COUNT into Hyper-V UAPI x86 header

This constant is required for Hyper-V SynIC timers MSR's
support by userspace(QEMU).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4f39bcfd1c132522380138a323f9af7902766301)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: replace enum hv_message_type by u32

enum hv_message_type inside struct hv_message, hv_post_message
is not size portable. Replace enum by u32.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7797dcf63f11b6e1d34822daf2317223d0f4ad46)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: ring_buffer: eliminate hv_ringbuffer_peek()

Currently, there is only one user for hv_ringbuffer_read()/
hv_ringbuffer_peak() functions and the usage of these functions is:
- insecure as we drop ring_lock between them, someone else (in theory
  only) can acquire it in between;
- non-optimal as we do a number of things (acquire/release the above
  mentioned lock, calculate available space on the ring, ...) twice and
  this path is performance-critical.

Remove hv_ringbuffer_peek() moving the logic from __vmbus_recvpacket() to
hv_ringbuffer_read().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 940b68e2c3e4ebf032885203c3970e9649f814af)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: remove code duplication between vmbus_recvpacket()/vmbus_recvpacket_raw()

vmbus_recvpacket() and vmbus_recvpacket_raw() are almost identical but
there are two discrepancies:
1) vmbus_recvpacket() doesn't propagate errors from hv_ringbuffer_read()
   which looks like it is not desired.
2) There is an error message printed in packetlen > bufferlen case in
   vmbus_recvpacket(). I'm removing it as it is usless for users to see
   such messages and /vmbus_recvpacket_raw() doesn't have it.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 667d374064b0cc48b6122101b287908d1b392bdb)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: ring_buffer: remove code duplication from hv_ringbuffer_peek/read()

hv_ringbuffer_peek() does the same as hv_ringbuffer_read() without
advancing the read index. The only functional change this patch brings
is moving hv_need_to_signal_on_read() call under the ring_lock but this
function is just a couple of comparisons.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b5f53dde8d8e84a6ee200dbd0bd90a400a8fe1a1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: ring_buffer: remove stray smp_read_barrier_depends()

smp_read_barrier_depends() does nothing on almost all arcitectures
including x86 and having it in the beginning of
hv_get_ringbuffer_availbytes() does not provide any guarantees anyway.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 45870a441361d1c05a5f767c4ece2f6e30e0da9c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: ring_buffer.c: fix comment style

Convert 6+-string comments repeating function names to normal kernel-style
comments and fix a couple of other comment style issues. No textual or
functional changes intended.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 822f18d4d3e9d4efb4996bbe562d0f99ab82d7dd)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: fix crash when device is removed from host side

The crash is observed when a service is being disabled host side while
userspace daemon is connected to the device:

[   90.244859] general protection fault: 0000 [#1] SMP
...
[   90.800082] Call Trace:
[   90.800082]  [<ffffffff81187008>] __fput+0xc8/0x1f0
[   90.800082]  [<ffffffff8118716e>] ____fput+0xe/0x10
...
[   90.800082]  [<ffffffff81015278>] do_signal+0x28/0x580
[   90.800082]  [<ffffffff81086656>] ? finish_task_switch+0xa6/0x180
[   90.800082]  [<ffffffff81443ebf>] ? __schedule+0x28f/0x870
[   90.800082]  [<ffffffffa01ebbaa>] ? hvt_op_read+0x12a/0x140 [hv_utils]
...

The problem is that hvutil_transport_destroy() which does misc_deregister()
freeing the appropriate device is reachable by two paths: module unload
and from util_remove(). While module unload path is protected by .owner in
struct file_operations util_remove() path is not. Freeing the device while
someone holds an open fd for it is a show stopper.

In general, it is not possible to revoke an fd from all users so the only
way to solve the issue is to defer freeing the hvutil_transport structure.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9420098adc50a88d4a441e0f92d54bfa7af44448)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: introduce HVUTIL_TRANSPORT_DESTROY mode

When Hyper-V host asks us to remove some util driver by closing the
appropriate channel there is no easy way to force the current file
descriptor holder to hang up but we can start to respond -EBADF to all
operations asking it to exit gracefully.

As we're setting hvt->mode from two separate contexts now we need to use
a proper locking.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a15025660d4703a8b37290a14734cb4a84875770)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: rename outmsg_lock

As a preparation to reusing outmsg_lock to protect test-and-set openrations
on 'mode' rename it the more general 'lock'.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a72f3a4ccff22de879a1f599210ecdd9bd483a43)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: fix memory leak on on_msg() failure

inmsg should be freed in case of on_msg() failure to avoid memory leak.
Preserve the error code from on_msg().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1f75338b6fece2bbd42ac3623830c65e2df6e031)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tools: hv: vss: fix the write()'s argument: error -> vss_msg

Fix the write()'s argument in the daemon code.

Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a689d2510f188e75391dbebacbddfd74d42f2a7e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: Invoke the poll function after handshake

When the handshake with daemon is complete, we should poll the channel since
during the handshake, we will not be processing any messages. This is a
potential bug if the host is waiting for a response from the guest.
I would like to thank Dexuan for pointing this out.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2d0c3b5ad739697a68dc8a444f5b9f4817cf8f8f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Force all channel messages to be delivered on CPU 0

Force all channel messages to be delivered on CPU0. These messages are not
performance critical and are used during the setup and teardown of the
channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b282e4c06fe89fc750fb26791c0bd7b25315973a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: correct tsc page sequence invalid value

Hypervisor Top Level Functional Specification v3/4 says
that TSC page sequence value = -1(0xFFFFFFFF) is used to
indicate that TSC page no longer reliable source of reference
timer. Unfortunately, we found that Windows Hyper-V guest
side implementation uses sequence value = 0 to indicate
that Tsc page no longer valid. This is clearly visible
inside Windows 2012R2 ntoskrnl.exe HvlGetReferenceTime()
function dissassembly:

HvlGetReferenceTime proc near
                 xchg    ax, ax
loc_1401C3132:
                 mov     rax, cs:HvlpReferenceTscPage
                 mov     r9d, [rax]
                 test    r9d, r9d
                 jz      short loc_1401C3176
                 rdtsc
                 mov     rcx, cs:HvlpReferenceTscPage
                 shl     rdx, 20h
                 or      rdx, rax
                 mov     rax, [rcx+8]
                 mov     rcx, cs:HvlpReferenceTscPage
                 mov     r8, [rcx+10h]
                 mul     rdx
                 mov     rax, cs:HvlpReferenceTscPage
                 add     rdx, r8
                 mov     ecx, [rax]
                 cmp     ecx, r9d
                 jnz     short loc_1401C3132
                 jmp     short loc_1401C3184
loc_1401C3176:
                 mov     ecx, 40000020h
                 rdmsr
                 shl     rdx, 20h
                 or      rdx, rax
loc_1401C3184:
                 mov     rax, rdx
                 retn
HvlGetReferenceTime endp

This patch aligns Tsc page invalid sequence value with
Windows Hyper-V guest implementation which is more
compatible with both Hyper-V hypervisor and KVM hypervisor.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c35b82ef0294ae5052120615f5cfcef17c5a6bf7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Fix a Host signaling bug

Currently we have two policies for deciding when to signal the host:
One based on the ring buffer state and the other based on what the
VMBUS client driver wants to do. Consider the case when the client
wants to explicitly control when to signal the host. In this case,
if the client were to defer signaling, we will not be able to signal
the host subsequently when the client does want to signal since the
ring buffer state will prevent the signaling. Implement logic to
have only one signaling policy in force for a given channel.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Tested-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8599846d73997cdbccf63f23394d871cfad1e5e6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers:hv: Allow for MMIO claims that span ACPI _CRS records

This patch makes 16GB GPUs work in Hyper-V VMs, since, for
compatibility reasons, the Hyper-V BIOS lists MMIO ranges in 2GB
chunks in its root bus's _CRS object.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 40f26f3168bf7a4da490db308dc0bd9f9923f41f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: channge vmbus_connection.channel_lock to mutex

spinlock is unnecessary here.
mutex is enough.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d6f591e339d23f434efda11917da511870891472)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: release relid on error in vmbus_process_offer()

We want to simplify vmbus_onoffer_rescind() by not invoking
hv_process_channel_removal(NULL, ...).

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit f52078cf5711ce47c113a58702b35c8ff5f212f5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: fix rescind-offer handling for device without a driver

In the path vmbus_onoffer_rescind() -> vmbus_device_unregister() ->
device_unregister() -> ... -> __device_release_driver(), we can see for a
device without a driver loaded: dev->driver is NULL, so
dev->bus->remove(dev), namely vmbus_remove(), isn't invoked.

As a result, vmbus_remove() -> hv_process_channel_removal() isn't invoked
and some cleanups(like sending a CHANNELMSG_RELID_RELEASED message to the
host) aren't done.

We can demo the issue this way:
1. rmmod hv_utils;
2. disable the Heartbeat Integration Service in Hyper-V Manager and lsvmbus
shows the device disappears.
3. re-enable the Heartbeat in Hyper-V Manager and modprobe hv_utils, but
lsvmbus shows the device can't appear again.
This is because, the host thinks the VM hasn't released the relid, so can't
re-offer the device to the VM.

We can fix the issue by moving hv_process_channel_removal()
from vmbus_close_internal() to vmbus_device_release(), since the latter is
always invoked on device_unregister(), whether or not the dev has a driver
loaded.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 34c6801e3310ad286c7bb42bc88d42926b8f99bf)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: do sanity check of channel state in vmbus_close_internal()

This fixes an incorrect assumption of channel state in the function.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 64b7faf903dae2df94d89edf2c688b16751800e4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: serialize process_chn_event() and vmbus_close_internal()

process_chn_event(), running in the tasklet, can race with
vmbus_close_internal() in the case of SMP guest, e.g., when the former is
accessing channel->inbound.ring_buffer, the latter could be freeing the
ring_buffer pages.

To resolve the race, we can serialize them by disabling the tasklet when
the latter is running here.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 63d55b2aeb5e4faa170316fee73c3c47ea9268c7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Get rid of the unused irq variable

The irq we extract from ACPI is not used - we deliver hypervisor
interrupts on a special vector. Make the necessary adjustments.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit efc267226b827f347a329c395e16c08973b0e3d6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Get rid of the unused macro

The macro VMBUS_DEVICE() is unused; get rid of it.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 90e031fa06ad6b660a8e9bebbb80bd30e555a2a5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Use uuid_le_cmp() for comparing GUIDs

Use uuid_le_cmp() for comparing GUIDs.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 4ae9250893485f380275e7d5cb291df87c4d9710)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Use uuid_le type consistently

Consistently use uuid_le type in the Hyper-V driver code.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit af3ff643ea91ba64dd8d0b1cbed54d44512f96cd)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vss: run only on supported host versions

The Backup integration service on WS2012 has appearently trouble to
negotiate with a guest which does not support the provided util version.
Currently the VSS driver supports only version 5/0. A WS2012 offers only
version 1/x and 3/x, and vmbus_prep_negotiate_resp correctly returns an
empty icframe_vercnt/icmsg_vercnt. But the host ignores that and
continues to send ICMSGTYPE_NEGOTIATE messages. The result are weird
errors during boot and general misbehaviour.

Check the Windows version to work around the host bug, skip hv_vss_init
on WS2012 and older.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit ed9ba608e4851144af8c7061cbb19f751c73e998)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers:hv: Define the channel type for Hyper-V PCI Express pass-through

This defines the channel type for PCI front-ends in Hyper-V VMs.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3053c762444a83ec6a8777f9476668b23b8ab180)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers:hv: Export the API to invoke a hypercall on Hyper-V

This patch exposes the function that hv_vmbus.ko uses to make hypercalls. This
is necessary for retargeting an interrupt when it is given a new affinity.

Since we are exporting this API, rename the API as it will be visible outside
the hv.c file.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit a108393dbf764efb2405f21ca759806c65b8bc16)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers:hv: Export a function that maps Linux CPU num onto Hyper-V proc num

This patch exposes the mapping between Linux CPU number and Hyper-V virtual
processor number. This is necessary because the hypervisor needs to know which
virtual processors to target when making a mapping in the Interrupt Redirection
Table in the I/O MMU.

Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 619848bd074343ff2bdeeafca0be39748f6da372)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: cleanup synic msrs if vmbus connect failed

Before vmbus_connect() synic is setup per vcpu - this means
hypervisor receives writes at synic msr's and probably allocate
hypervisor resources per synic setup.

If vmbus_connect() failed for some reason it's neccessary to cleanup
synic setup by call hv_synic_cleanup() at each vcpu to get a chance
to free allocated resources by hypervisor per synic.

This patch does appropriate cleanup in case of vmbus_connect() failure.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 17efbee8ba02ef00d3b270998978f8a1a90f1d92)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: use memdup_user in hvt_op_write

Use memdup_user to handle OOM.

Fixes: 14b50f80c32d ('Drivers: hv: util: introduce hv_utils_transport abstraction')
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b00359642c2427da89dc8f77daa2c9e8a84e6d76)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: util: catch allocation errors

Catch allocation errors in hvutil_transport_send.

Fixes: 14b50f80c32d ('Drivers: hv: util: introduce hv_utils_transport abstraction')
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit cdc0c0c94e4e6dfa371d497a3130f83349b6ead6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tools: hv: remove repeated HV_FCOPY string

HV_FCOPY is already used as identifier in syslog.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 6dfb867cea9e93ae9220f0b2e702b0440e4c8b4b)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tools: hv: report ENOSPC errors in hv_fcopy_daemon

Currently some "Unspecified error 0x80004005" is reported on the Windows
side if something fails. Handle the ENOSPC case and return
ERROR_DISK_FULL, which allows at least Copy-VMFile to report a meaning
full error.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b4ed5d1682c6613988c2eb1de55df5ac9988afcc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: utils: run polling callback always in interrupt context

All channel interrupts are bound to specific VCPUs in the guest
at the point channel is created. While currently, we invoke the
polling function on the correct CPU (the CPU to which the channel
is bound to) in some cases we may run the polling function in
a non-interrupt context. This potentially can cause an issue as the
polling function can be interrupted by the channel callback function.
Fix the issue by running the polling function on the appropriate CPU
at interrupt level. Additional details of the issue being addressed by
this patch are given below:

Currently hv_fcopy_onchannelcallback is called from interrupts and also
via the ->write function of hv_utils. Since the used global variables to
maintain state are not thread safe the state can get out of sync.
This affects the variable state as well as the channel inbound buffer.

As suggested by KY adjust hv_poll_channel to always run the given
callback on the cpu which the channel is bound to. This avoids the need
for locking because all the util services are single threaded and only
one transaction is active at any given point in time.

Additionally, remove the context variable, they will always be the same as
recv_channel.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3cace4a616108539e2730f8dc21a636474395e0f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: util: Increase the timeout for util services

Util services such as KVP and FCOPY need assistance from daemon's running
in user space. Increase the timeout so we don't prematurely terminate
the transaction in the kernel. Host sets up a 60 second timeout for
all util driver transactions. The host will retry the transaction if it
times out. Set the guest timeout at 30 seconds.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c0b200cfb0403740171c7527b3ac71d03f82947a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: fix build warning

We were getting build warning about unused variable "tsc_msr" and
"va_tsc" while building for i386 allmodconfig.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 9220e39b5c900c67ddcb517d52fe52d90fb5e3c8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: i40e: shut up uninitialized variable warnings

intel/i40e/i40e_txrx.c: In function 'i40e_xmit_frame_ring':
intel/i40e/i40e_txrx.c:2367:20: error: 'oiph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
intel/i40e/i40e_txrx.c:2317:16: note: 'oiph' was declared here
intel/i40e/i40e_txrx.c:2367:17: error: 'oudph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
intel/i40e/i40e_txrx.c:2316:17: note: 'oudph' was declared here

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 79febbc19b81b5242339bffb90e4dbea15015dde)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

i40e: fix build warnings

Fixes following build warnings :

drivers/net/ethernet/intel/i40e/i40e_main.c:7057:13: warning:
'i40e_sync_udp_filters_subtask' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8524:13: warning:
'i40e_add_vxlan_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8569:13: warning:
'i40e_del_vxlan_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8604:13: warning:
'i40e_add_geneve_port' defined but not used [-Wunused-function]
drivers/net/ethernet/intel/i40e/i40e_main.c:8651:13: warning:
'i40e_del_geneve_port' defined but not used [-Wunused-function]

Fixes: 6a899024058d ("i40e: geneve tunnel offload support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
(cherry picked from commit 5cae7615b613381a04d3dd06b8237234cc3f7cc9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: nvme merge cleanup

BugLink: http://bugs.launchpad.net/bugs/1531539
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Export NVMe attributes to sysfs group

BugLink: http://bugs.launchpad.net/bugs/1531539
Adds all controller information to attribute list exposed to sysfs, and
appends the reset_controller attribute to it. The nvme device is created
with this attribute list, so driver no long manages its attributes.

Reported-by: Sujith Pandel <sujithpshankar@gmail.com>
Cc: Sujith Pandel <sujithpshankar@ gmail.com>
Cc: David Milburn <dmilburn@redhat.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 779ff75617099f4defe14e20443b95019a4c5ae8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Shutdown controller only for power-off

BugLink: http://bugs.launchpad.net/bugs/1531539
We don't need to shutdown a controller for a reset. A controller in a
shutdown state may take longer to become ready than one that was simply
disabled. This patch has the driver shut down a controller only if the
device is about to be powered off or being removed. When taking the
controller down for a reset reason, the controller will be disabled
instead.

Function names have been updated in this patch to reflect their changed
semantics.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit a5cdb68c2c10f0865122656833cd07636a4143ee)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: IO queue deletion re-write

BugLink: http://bugs.launchpad.net/bugs/1531539
The nvme driver deletes IO queues asynchronously since this operation
may potentially take an undesirable amount of time with a large number
of queues if done serially.

The driver used to manage coordinating asynchronous deletions. This
patch simplifies that by leveraging the block layer rather than using
kthread workers and chaining more complicated callbacks.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit db3cbfff5bcc0b9a82d8c71f00b9d60fad215871)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Remove queue freezing on resets

BugLink: http://bugs.launchpad.net/bugs/1531539
NVMe submits all commands through the block layer now. This means we
can let requests queue at the blk-mq hardware context since there is no
path that bypasses this anymore so we don't need to freeze the queues
anymore. The driver can simply stop the h/w queues from running during
a reset instead.

This also fixes a WARN in percpu_ref_reinit when the queue was unfrozen
with requeued requests.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 25646264e15af96c5c630fc742708b1eb3339222)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Use a retryable error code on reset

BugLink: http://bugs.launchpad.net/bugs/1531539
A negative status has the "do not retry" bit set, which makes it not
retryable. Use a fake status that can potentially be retried on reset.

An aborted command's status is overridden by the timeout handler so
that it won't be retried, which is necessary to keep initialization from
getting into a reset loop.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 1d49c38c4865c596b01b31a52540275c1bb383e7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Fix admin queue ring wrap

BugLink: http://bugs.launchpad.net/bugs/1531539
The tag set queue depth needs to be one less than the h/w queue depth
so we don't wrap the circular buffer. This conforms to the specification
defined "Full Queue" condition.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit e3e9d50cd6ed392bb716e35c134d1e82707c51b4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: make SG_IO support optional

BugLink: http://bugs.launchpad.net/bugs/1531539
Translation SCSI commands to NVMe commands is rather pointless in general
as applications must not expext to be able to use SCSI commands on a
generic block device.

Make the huge translation layer optional and hope no one will ever enable
it in the future.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(back ported from commit 4490733250b8b272a6d3e66352dd7b8025409549)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/nvme/host/Makefile

nvme: fixes for NVME_IOCTL_IO_CMD on the char device

BugLink: http://bugs.launchpad.net/bugs/1531539
Make sure we synchronize access to the namespaces list and grab a reference
to the namespace before doing I/O. Make sure to reject the ioctl if multiple
namespaces are present as it's entirely unsafe, and warn when using it even
with a single namespace.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit bfd8947194b2e2a53db82bbc7eb7c15d028c46db)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: synchronize access to ctrl->namespaces

BugLink: http://bugs.launchpad.net/bugs/1531539
Currently traversal and modification of ctrl->namespaces happens completely
unsynchronized, which can be fixed by the addition of a simple mutex.

Note: nvme_dev_ioctl will be handled in the next patch.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 69d3b8ac15a5eb938e6a01909f6cc8ae4b5d3a17)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: Move nvme_freeze/unfreeze_queues to nvme core

BugLink: http://bugs.launchpad.net/bugs/1531539
Nothing pci specific about them and We'll need them exported
in other transports too.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 363c9aacb6c59bb63148dd115632880a4aed4d88)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Export namespace attributes to sysfs

BugLink: http://bugs.launchpad.net/bugs/1531539
Exposes the NGUID, EUI-64, and NSID to sysfs entries under the disk's
kobject.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 2b9b6e86bca7209de02754fc84acf7ab3e78734e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Add pci error handlers

BugLink: http://bugs.launchpad.net/bugs/1531539
Requests enabling pcie aer support. Shuts down the controller on error
detected with io frozen state prior to requesting slot reset; resumes
controller after reset completes.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit a0a3408ee614848c27b0d36c2fe490da3b387b8d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: merge iod and cmd_info

BugLink: http://bugs.launchpad.net/bugs/1531539
Merge the two per-request structures in the nvme driver into a single
one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit f4800d6d1548e0d5ab94f2216d41d94282e2588c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: meta_sg doesn't have to be an array

BugLink: http://bugs.launchpad.net/bugs/1531539
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit bf68405705bd35c09ec1f7528718dce5af88daff)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: properly free resources for cancelled command

BugLink: http://bugs.launchpad.net/bugs/1531539
We need to move freeing of resources to the ->complete handler to ensure
they are also freed when we cancel the command.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit eee417b0697827a6e120199b126b447af3c81b47)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: simplify completion handling

BugLink: http://bugs.launchpad.net/bugs/1531539
Now that all commands are executed as block layer requests we can remove the
internal completion in the NVMe driver. Note that we can simply call
blk_mq_complete_request to abort commands as the block layer will protect
against double copletions internally.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit aae239e1910ebc27ec9f7e8b25904a69626cf28c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: special case AEN requests

BugLink: http://bugs.launchpad.net/bugs/1531539
AEN requests are different from other requests in that they don't time out
or can easily be cancelled. Because of that we should not use the blk-mq
infrastructure but just special case them in the completion path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit adf68f21c15572c68d9fadae618a09cf324b9814)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: switch abort to blk_execute_rq_nowait

BugLink: http://bugs.launchpad.net/bugs/1531539
And remove the now unused nvme_submit_cmd helper.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit e7a2a87d5938bbebe1637c82fbde94ea6be3ef78)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: switch delete SQ/CQ to blk_execute_rq_nowait

BugLink: http://bugs.launchpad.net/bugs/1531539
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit d8f32166a9c587e87a3a86f654c73d40b6b5df00)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: factor out a few helpers from req_completion

BugLink: http://bugs.launchpad.net/bugs/1531539
We'll need them in other places later.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 7688faa6dd2c99ce5d66571d9ad65535ec39e8cb)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

nvme: fix admin queue depth

BugLink: http://bugs.launchpad.net/bugs/1531539
The number in tag_set->queue depth includes the reserved tags.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 4680072003df14230e9eeeeefb617401012234a5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Simplify metadata setup

BugLink: http://bugs.launchpad.net/bugs/1531539
We no longer require the two-pass setup for block integrity.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 4b9d5b151046ff717819864f93cb8e012b347bce)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Remove device management handles on remove

BugLink: http://bugs.launchpad.net/bugs/1531539
We don't want to allow new references to open on a device that is
removed. This ties the lifetime of these handles to the physical device's
presence rather than to the open reference count.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 53029b0441bbd263dbb2ee6429572b1732dad4de)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Use unbounded work queue for all work

BugLink: http://bugs.launchpad.net/bugs/1531539
Removes all usage of the global work queue so work can't be
scheduled on two different work queues, and removes nvme's work queue
singlethreadedness so controllers can be driven in parallel.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: keep the dead controller removal on the system workqueue to avoid
deadlocks]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 92f7a1624bbc2361b96db81de89aee1baae40da9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

NVMe: Implement namespace list scanning

BugLink: http://bugs.launchpad.net/bugs/1531539
The NVMe 1.1 specification provides an identify mode to return a
list of active namespaces. This is more efficient to discover which
namespace identifiers are active on a controller, providing potentially
significant improvement in scan time for controllers with sparesly
populated namespaces.

Signed-off-by: Keith Busch <keith.busch@intel.com>
[hch: add quirk for the broken Qemu Identify implementation. To be relaxed
later]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit 540c801c65eb58e05e0ca38b6fd644a83d7e2b33)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>