git.proxmox.com Git - mirror_ubuntu-zesty-kernel.git/log

UBUNTU: [Config] CONFIG_EFI_SECURE_BOOT_SIG_ENFORCE=n

BugLink: http://bugs.launchpad.net/bugs/1566221
Defer enforcing signed module loading, but other secure boot
checks are still valid.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

x86: Restrict MSR access when module loading is restricted

BugLink: http://bugs.launchpad.net/bugs/1566221
Writing to MSRs should not be allowed if module loading is restricted,
since it could lead to execution of arbitrary code in kernel mode. Based
on a patch by Kees Cook.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

kexec: Disable at runtime if the kernel enforces module loading restrictions

BugLink: http://bugs.launchpad.net/bugs/1566221
kexec permits the loading and execution of arbitrary code in ring 0, which
is something that module signing enforcement is meant to prevent. It makes
sense to disable kexec in this situation.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

acpi: Ignore acpi_rsdp kernel parameter when module loading is restricted

BugLink: http://bugs.launchpad.net/bugs/1566221
This option allows userspace to pass the RSDP address to the kernel, which
makes it possible for a user to circumvent any restrictions imposed on
loading modules. Disable it in that case.

Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Restrict /dev/mem and /dev/kmem when module loading is restricted

BugLink: http://bugs.launchpad.net/bugs/1566221
Allowing users to write to address space makes it possible for the kernel
to be subverted, avoiding module loading restrictions. Prevent this when
any restrictions have been imposed on loading modules.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

asus-wmi: Restrict debugfs interface when module loading is restricted

BugLink: http://bugs.launchpad.net/bugs/1566221
We have no way of validating what all of the Asus WMI methods do on a
given machine, and there's a risk that some will allow hardware state to
be manipulated in such a way that arbitrary code can be executed in the
kernel, circumventing module loading restrictions. Prevent that if any of
these features are enabled.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

ACPI: Limit access to custom_method

BugLink: http://bugs.launchpad.net/bugs/1566221
custom_method effectively allows arbitrary access to system memory, making
it possible for an attacker to circumvent restrictions on module loading.
Disable it if any such restrictions have been enabled.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

x86: Lock down IO port access when module security is enabled

BugLink: http://bugs.launchpad.net/bugs/1566221
IO port access would permit users to gain access to PCI configuration
registers, which in turn (on a lot of hardware) give access to MMIO register
space. This would potentially permit root to trigger arbitrary DMA, so lock
it down by default.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

PCI: Lock down BAR access when module security is enabled

BugLink: http://bugs.launchpad.net/bugs/1566221
Any hardware that can potentially generate DMA has to be locked down from
userspace in order to avoid it being possible for an attacker to modify
kernel code, allowing them to circumvent disabled module loading or module
signing. Default to paranoid - in future we can potentially relax this for
sufficiently IOMMU-isolated devices.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Add secure_modules() call

BugLink: http://bugs.launchpad.net/bugs/1566221
Provide a single call to allow kernel code to determine whether the system
has been configured to either disable module loading entirely or to load
only modules signed with a trusted key.
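
As a rough illustration only (this is a Python model, not the kernel's C
code), the check described above can be pictured as a single predicate
that callers such as the MSR write restriction consult. The names
ModulePolicy, modules_disabled, sig_enforce and write_msr are
hypothetical stand-ins chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class ModulePolicy:
    # Hypothetical model of the two conditions named in the commit message.
    modules_disabled: bool = False  # module loading is disabled entirely
    sig_enforce: bool = False       # only modules signed with a trusted key load


def secure_modules(policy: ModulePolicy) -> bool:
    """Return True if either module loading restriction is in effect."""
    return policy.modules_disabled or policy.sig_enforce


def write_msr(policy: ModulePolicy, register: int, value: int):
    """Hypothetical caller: refuse the write when loading is restricted."""
    if secure_modules(policy):
        raise PermissionError("MSR writes are blocked while module loading is restricted")
    return (register, value)
```

Routing every caller through one predicate means the definition of
"restricted" only has to change in one place.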

Bugzilla: N/A
Upstream-status: Fedora mustard. Replaced by securelevels, but that was nak'd

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] do_zfs_powerpc64-smp use default value

Signed-off-by: Andy Whitcroft <apw@canonical.com>

PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs

BugLink: http://bugs.launchpad.net/bugs/1565967
Add a new driver which exposes a root PCI bus whenever a PCI Express device
is passed through to a guest VM under Hyper-V. The device can be single-
or multi-function. The interrupts for the devices are managed by an IRQ
domain, implemented within the driver.

[bhelgaas: fold in race condition fix (http://lkml.kernel.org/r/1456340196-13717-1-git-send-email-jakeo@microsoft.com)]
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 4daace0d8ce851f8f8f91563c835e3000c954d5e)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] CONFIG_PCI_HYPERV=m

BugLink: http://bugs.launchpad.net/bugs/1565967
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

PCI: Look up IRQ domain by fwnode_handle

BugLink: http://bugs.launchpad.net/bugs/1565967
If pci_host_bridge_msi_domain() can't find an IRQ domain through the OF
tree, try to look it up directly through the fwnode_handle.

[bhelgaas: changelog]
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 788858ebc49a07fe5f812778f245a51b0d800d82)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

PCI: Add fwnode_handle to x86 pci_sysdata

BugLink: http://bugs.launchpad.net/bugs/1565967
Add an fwnode_handle to the x86 struct pci_sysdata, which will be used to
locate an IRQ domain associated with a root PCI bus.

[bhelgaas: changelog]
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
(cherry picked from commit 92016ba5c1d71fbe4e9952df518b5386f2a0556b)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

ACPI / processor: Request native thermal interrupt handling via _OSC

BugLink: http://bugs.launchpad.net/bugs/1559923
There are several reports of freezes when the HWP (Hardware P-States)
feature is enabled on Skylake-based systems by the Intel P-states driver.
The root cause has been identified as the HWP interrupts causing BIOS
code to freeze.

HWP interrupts use the thermal LVT which can be handled by Linux
natively, but on the affected Skylake-based systems SMM will respond
to it by default.  This is a problem for several reasons:
- On the affected systems the SMM thermal LVT handler is broken (it
   will crash when invoked) and a BIOS update is necessary to fix it.
- With thermal interrupt handled in SMM we lose all of the reporting
   features of the arch/x86/kernel/cpu/mcheck/therm_throt driver.
- Some thermal drivers like x86-package-temp depend on the thermal
   threshold interrupts signaled via the thermal LVT.
- The HWP interrupts are useful for debugging and tuning
   performance (if the kernel can handle them).
The native handling of thermal interrupts needs to be enabled
because of that.

This requires some way to tell SMM that the OS can handle thermal
interrupts.  That can be done by using _OSC/_PDC in processor
scope very early during ACPI initialization.

The meaning of _OSC/_PDC bit 12 in processor scope is whether or
not the OS supports native handling of interrupts for Collaborative
Processor Performance Control (CPPC) notifications.  Since on
HWP-capable systems CPPC is a firmware interface to HWP, setting
this bit effectively tells the firmware that the OS will handle
thermal interrupts natively going forward.

For details on _OSC/_PDC refer to:
http://www.intel.com/content/www/us/en/standards/processor-vendor-specific-acpi-specification.html

To implement the _OSC/_PDC handshake as described, introduce a new
function, acpi_early_processor_osc(), that walks the ACPI
namespace looking for ACPI processor objects and invokes _OSC for
them with bit 12 in the capabilities buffer set and terminates the
namespace walk on the first success.
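
The walk described above could be modelled roughly as follows. This is a
Python sketch of the control flow only, not the kernel implementation:
the function early_processor_osc, the invoke_osc callback and the use of
plain strings for processor objects are assumptions made for
illustration.

```python
# Bit 12 of the _OSC/_PDC capabilities in processor scope, as described above.
NATIVE_CPPC_INTERRUPT_BIT = 12


def early_processor_osc(processor_objects, invoke_osc):
    """Invoke _OSC on each processor object until one call succeeds.

    invoke_osc is a hypothetical callback taking (object, capabilities)
    and returning True on success. Returns the object that accepted the
    request, or None if every call failed.
    """
    capabilities = 1 << NATIVE_CPPC_INTERRUPT_BIT
    for obj in processor_objects:
        if invoke_osc(obj, capabilities):
            return obj  # terminate the walk on the first success
    return None
```

For example, if only the second of three objects accepts the request,
the third is never visited.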

Also modify intel_thermal_interrupt() to clear HWP status bits in
the HWP_STATUS MSR to acknowledge HWP interrupts (which prevents
them from firing continuously).

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject & changelog, function rename ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit a21211672c9a1d730a39aa65d4a5b3414700adfb)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: (noup) Update spl to 0.6.5.6-0ubuntu1, zfs to 0.6.5.6-0ubuntu3

Fixes two issues:
* Add support for 32-bit FS_IOC32_{GET|SET}FLAGS compat ioctls
(for powerpc64 big endian mode)
* Fix aarch64 compilation, missing hrtime_t and timestruc_t types

BugLink: http://bugs.launchpad.net/bugs/1564591
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

powerpc: atomic: Implement acquire/release/relaxed variants for cmpxchg

BugLink: http://bugs.launchpad.net/bugs/1556096
Implement cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed, based on
which _release variants can be built.

To avoid superfluous barriers in _acquire variants, we implement these
operations with assembly code rather than use __atomic_op_acquire() to
build them automatically.

For the same reason, we keep the assembly implementation of fully
ordered cmpxchg operations.

However, we don't do the same for _release, because that would require
putting barriers in the middle of ll/sc loops, which is probably a bad
idea.

Note cmpxchg{,64}_relaxed and atomic{,64}_cmpxchg_relaxed are not
compiler barriers.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 56c08e6d226c860ad097fa6ba109133228c56722)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

powerpc: atomic: Implement acquire/release/relaxed variants for xchg

BugLink: http://bugs.launchpad.net/bugs/1556096
Implement xchg{,64}_relaxed and atomic{,64}_xchg_relaxed. Based on these
_relaxed variants, release/acquire variants and fully ordered versions
can be built.

Note that xchg{,64}_relaxed and atomic{,64}_xchg_relaxed are not
compiler barriers.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 26760fc19a7e663e4f49d586aca6740fb21d887d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

powerpc: atomic: Implement atomic{, 64}_*_return_* variants

BugLink: http://bugs.launchpad.net/bugs/1556096
On powerpc, acquire and release semantics can be achieved with
lightweight barriers ("lwsync" and "ctrl+isync"), which can be used to
implement __atomic_op_{acquire,release}.

For release semantics, we only need to ensure that all memory accesses
issued before the atomic take effect before the -store- part of the
atomic, so "lwsync" is all we need. On platforms without "lwsync",
"sync" should be used. Therefore in __atomic_op_release() we use
PPC_RELEASE_BARRIER.

For acquire semantics, "lwsync" is likewise all we need, for a similar
reason. However, on platforms without "lwsync", we can use "isync"
rather than "sync" as an acquire barrier. Therefore in
__atomic_op_acquire() we use PPC_ACQUIRE_BARRIER, which is barrier() on
UP, "lwsync" if available and "isync" otherwise.
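
The barrier choice described in the two paragraphs above can be
summarised as a small decision table. The following Python sketch only
models that selection; the function names and boolean parameters are
hypothetical, and the UP case for release is not modelled because the
text above does not describe it.

```python
def acquire_barrier(smp: bool, has_lwsync: bool) -> str:
    """Model of PPC_ACQUIRE_BARRIER as described in the commit message."""
    if not smp:
        return "barrier()"  # UP: a compiler barrier only
    return "lwsync" if has_lwsync else "isync"


def release_barrier(has_lwsync: bool) -> str:
    """Model of the release side: "lwsync" when available, else "sync"."""
    return "lwsync" if has_lwsync else "sync"
```

The asymmetry is the point: without "lwsync", acquire can fall back to
the cheaper "isync", while release has to fall back to "sync".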

Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other
variants with these helpers.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit dc53617c4a3f6ca35641dfd4279720365ce9f4da)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

atomics: Allow architectures to define their own __atomic_op_* helpers

BugLink: http://bugs.launchpad.net/bugs/1556096
Some architectures may have their own special barriers for acquire,
release and fence semantics, so the general memory barriers
(smp_mb__*_atomic()) in the default __atomic_op_*() may be too strong.
Allow architectures to define their own helpers, which override the
default helpers.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit e1ab7f39d7e0dbfbdefe148be3ae4ee121e47ecc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Debian] fix linux_tools when cross-compiling

Fixed invocation of dh_shlibdeps when cross-compiling with
do_linux_tools=true.
Without being told where to find the crossdev libs, dh_shlibdeps
will emit these warnings and fail the linux-tools package:

  Debug: binary-acm7xxx
  ...
  dh_shlibdeps -plinux-headers-4.4.0-15-generic
  arm-linux-gnueabihf-objdump: .../asn1_compiler: File format not recognized
  arm-linux-gnueabihf-objdump: .../extract-cert: File format not recognized
  ...

For example:

  archtriple=arm-linux-gnueabihf
  flavour=generic
  dpkg-architecture -t $archtriple -c fakeroot \
      debian/rules \
      binary-$flavour binary-perarch \
      AUTOBUILD=true \
      abi_suffix= \
      do_linux_tools=true \
      do_tools=true \
      do_tools_usbip=false \
      do_tools_cpupower=false \
      do_tools_perf=true \
      do_tools_x86=false

Signed-off-by: David Leonard <david.leonard@opengear.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Debian] cpupower uses non-standard CROSS

BugLink: http://bugs.launchpad.net/bugs/1564206
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: block: partition: initialize percpuref before sending out KOBJ_ADD

BugLink: http://bugs.launchpad.net/bugs/1546439
The partition's percpu_ref should be initialized before sending out
KOBJ_ADD, because that uevent may cause userspace to read the partition
table.

This patch fixes the issue.

Reported-by: Naveen Kaje <nkaje@codeaurora.org>
Fixes: 6c71013ecb7e2(block: partition: convert percpu ref)
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: cgroup mount: ignore nsroot=

BugLink: http://bugs.launchpad.net/bugs/1563921
Our mountinfo output now shows 'nsroot='. If userspace such as
criu copies and pastes mount options from there into a new mount
command, we should ignore it.

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Tested-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] do_zfs_powerpc64-smp = true

cking> so, all the ZFS tests on powerpc64 pass except one test, which is ioctl
FS_IOC_GETFLAGS where the cmd is being barfed up somewhere in the ioctl 32/64
thunking. I've filed a bug upstream. Since that ioctl is not frequently used,
I think we should enable powerpc64 and fix up the ioctl as a SRU later on.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

hv_netvsc: Move subchannel waiting to rndis_filter_device_remove()

BugLink: http://bugs.launchpad.net/bugs/1563688
During hot add, vmbus_device_register() is called from vmbus_onoffer(), on
the same workqueue as the subchannel offer message work-queue, so
subchannel offer won't be processed until the vmbus_device_register()/...
/netvsc_probe() is done.
Also, vmbus_device_register() is called with channel_mutex locked, which
prevents subchannel processing too. So the "waiting for sub-channel
processing" will not succeed in the hot-add case. But in the usual module
loading, netvsc_probe() is called from a different code path and doesn't
fail.

This patch resolves the deadlock during NIC hot-add, and speeds up NIC
loading time.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit d66ab51442211158b677c2f12310c314d9587f74)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

proc: revert /proc/<pid>/maps [stack:TID] annotation

BugLink: http://bugs.launchpad.net/bugs/1547231
Commit b76437579d13 ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.
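
To make the quadratic cost concrete, the counting argument above can be
written out as a toy model. This Python sketch counts thread-list
entries examined; it is not kernel code, and both function names are
hypothetical.

```python
def lookups_for_process_maps(thread_count: int) -> int:
    """Rendering /proc/<pid>/maps with the [stack:TID] annotation.

    Each thread contributes one stack VMA, and annotating each stack VMA
    walks the entire thread list to find its owner.
    """
    lookups = 0
    for _stack_vma in range(thread_count):
        for _thread in range(thread_count):
            lookups += 1
    return lookups


def lookups_for_task_maps(thread_count: int) -> int:
    """Reading each /proc/<pid>/task/<tid>/maps once.

    The owning task is already known, so each stack check is O(1).
    """
    return thread_count
```

With a thousand threads the first function counts a million
combinations and the second counts a thousand.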

Siddhesh said:
"The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 65376df582174ffcec9e6471bf5b0dd79ba05e4a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_tis: fix build warning with tpm_tis_resume

BugLink: http://bugs.launchpad.net/bugs/1398274
drivers/char/tpm/tpm_tis.c:838: warning: ‘tpm_tis_resume’ defined but
not used

Reported-by: James Morris <jmorris@namei.org>
Fixes: 00194826e6be ("tpm_tis: Clean up the force=1 module parameter")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: stable@vger.kernel.org
(cherry picked from commit 2cb6d6460f1a171c71c134e0efe3a94c2206d080)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_crb: tpm2_shutdown() must be called before tpm_chip_unregister()

BugLink: http://bugs.launchpad.net/bugs/1398274
Wrong call order.

Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Fixes: 74d6b3ceaa17
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: stable@vger.kernel.org
(cherry picked from commit 99cda8cb4639de81cde785b5bab9bc52e916e594)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_crb/tis: fix: use dev_name() for /proc/iomem

BugLink: http://bugs.launchpad.net/bugs/1398274
In all cases use dev_name() for the mapped resources. This is both
for the sake of consistency and because on some platforms the resource
name given by the ACPI object seems to be garbage.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Fixes: 1bd047be37d9 ("tpm_crb: Use devm_ioremap_resource")
(cherry picked from commit 30f9c8c9e2ea37473a51354e9e492580a40661ce)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_eventlog.c: fix binary_bios_measurements

BugLink: http://bugs.launchpad.net/bugs/1398274
The commit 0cc698af36ff ("vTPM: support little endian guests") copied
the event, but without the event data, did an endian conversion on the
size and tried to output the event data from the copied version, which
has only one byte of the data, resulting in garbage event data.

[jarkko.sakkinen@linux.intel.com: fixed minor coding style issues and
renamed the local variable tempPtr as temp_ptr now that there is an
excuse to do this.]

Signed-off-by: Harald Hoyer <harald@redhat.com>
Fixes: 0cc698af36ff ("vTPM: support little endian guests")
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: stable@vger.kernel.org
(cherry picked from commit 186d124f07da193a8f47e491af85cb695d415f2f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: fix: return rc when devm_add_action() fails

BugLink: http://bugs.launchpad.net/bugs/1398274
Call put_device() and return error code if devm_add_action() fails.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Fixes: 8e0ee3c9faed ("tpm: fix the cleanup of struct tpm_chip")
(cherry picked from commit 4f3b193dee4423d8c89c9a3e8e05f9197ea459a4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: fix: set continueSession attribute for the unseal operation

BugLink: http://bugs.launchpad.net/bugs/1398274
It's better to set the continueSession attribute for the unseal
operation so that the session object is not removed as a side-effect
when the operation is successful. Since a user process created the
session, it should also decide when the session is destroyed.

Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Fixes: 5beb0c435b ("keys, trusted: seal with a TPM2 authorization policy")
(cherry picked from commit c0b5eed110dcf520aadafefbcc40658cbdd18b95)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: fix the cleanup of struct tpm_chip

BugLink: http://bugs.launchpad.net/bugs/1398274
If the initialization fails before tpm_chip_register(), put_device()
will not be called, which causes the release callback not to be called.
This patch fixes the issue by adding put_device() to devres list of
the parent device.

Fixes: 313d21eeab ("tpm: device class for tpm")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: stable@vger.kernel.org
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
(cherry picked from commit 8e0ee3c9faed7ca68807ea45141775856c438ac0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: fix the rollback in tpm_chip_register()

BugLink: http://bugs.launchpad.net/bugs/1398274
Fixed the rollback and gave more self-documenting names for the
functions.

Fixes: d972b0523f ("tpm: fix call order in tpm-chip.c")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
cc: stable@vger.kernel.org
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
(cherry picked from commit 72c91ce8523ae5828fe5e4417ae0aaab53707a08)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_crb: Use devm_ioremap_resource

BugLink: http://bugs.launchpad.net/bugs/1398274
To support the force mode in tpm_tis we need to use resource locking
in tpm_crb as well, via devm_ioremap_resource.

The light restructuring better aligns crb and tis and makes it easier
to see that the new changes make sense.

The control area and its associated buffers do not always fall in the
range of the iomem resource given by the ACPI object. This patch fixes
the issue by mapping the buffers if this is the case.

[jarkko.sakkinen@linux.intel.com: squashed update described in the
last paragraph.]

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
(cherry picked from commit 1bd047be37d95bf65a219f4931215f71878ac060)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_crb: Drop le32_to_cpu(ioread32(..))

BugLink: http://bugs.launchpad.net/bugs/1398274
ioread32 and readl are defined to read from PCI-style memory, i.e. little
endian, and return the result in host order. On platforms where a
swap is required ioread32/readl do the swap internally (eg see ppc).
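
Why the extra le32_to_cpu() is wrong can be seen in a small model.
This is a Python sketch under stated assumptions: Python integers have
no byte order, so the host's endianness is passed in explicitly, and
both helpers are simplified stand-ins named after the kernel functions
rather than the real implementations.

```python
import struct


def ioread32(mmio: bytes, offset: int = 0) -> int:
    """Model: the register holds little-endian bytes; the result is a
    host-order value, whatever the host's endianness."""
    return struct.unpack_from("<I", mmio, offset)[0]


def le32_to_cpu(value: int, host_big_endian: bool) -> int:
    """Model: byte swap on a big-endian host, no-op on a little-endian one."""
    if host_big_endian:
        return int.from_bytes(value.to_bytes(4, "little"), "big")
    return value


# A register containing 0x12345678, stored little endian.
register = bytes([0x78, 0x56, 0x34, 0x12])
correct = ioread32(register)
double_swapped = le32_to_cpu(ioread32(register), host_big_endian=True)
```

On a little-endian host the wrapper is harmless, which is how such a
double conversion can go unnoticed; on a big-endian host it swaps an
already converted value.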

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit 1e3ed59d6200eb31b554dbdcfdde62d1e3d91f0c)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_tis: Clean up the force=1 module parameter

BugLink: http://bugs.launchpad.net/bugs/1398274
The TPM core has long assumed that every device has a driver attached,
however the force path was attaching the TPM core outside of a driver
context. This isn't generally reliable as the user could detach the
driver using sysfs or something, but commit b8b2c7d845d5 ("base/platform:
assert that dev_pm_domain callbacks are called unconditionally")
forced the issue by leaving the driver pointer NULL if there is
no probe.

Rework the TPM setup to create a platform device with resources and
then allow the driver core to naturally bind and probe it through the
normal mechanisms. All this structure is needed anyhow to enable TPM
for OF environments.

Finally, since the entire flow is changing, convert the init/exit to use
the modern ifdef-less coding style when possible.

Reported-by: "Wilck, Martin" <martin.wilck@ts.fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Wilck, Martin <martin.wilck@ts.fujitsu.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit 00194826e6be333083ba9ddbd6e83fb423206f8a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_tis: Use devm_ioremap_resource

BugLink: http://bugs.launchpad.net/bugs/1398274
This does a request_resource under the covers which means tis holds a
lock on the memory range it is using so other drivers cannot grab it.
When doing probing it is important to ensure that other drivers are
not using the same range before tis starts touching it.

To do this flow the actual struct resource from the device right
through to devm_ioremap_resource. This ensures all the proper resource
meta-data is carried down.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Wilck, Martin <martin.wilck@ts.fujitsu.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit 51dd43dff74b0547ad844638f6910ca29c956819)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_tis: Do not fall back to a hardcoded address for TPM2

BugLink: http://bugs.launchpad.net/bugs/1398274
If the ACPI tables do not declare a memory resource for the TPM2
then do not just fall back to the x86 default base address.

Also be stricter when checking the ancillary TPM2 ACPI data and error
out if any of this data is wrong rather than blindly assuming TPM1.

Fixes: 399235dc6e95 ("tpm, tpm_tis: fix tpm_tis ACPI detection issue with TPM 2.0")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Wilck, Martin <martin.wilck@ts.fujitsu.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit 4d627e672bd0e8af4e734fef93e806499d1e1277)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_tis: Disable interrupt auto probing on a per-device basis

BugLink: http://bugs.launchpad.net/bugs/1398274
Instead of clearing the global interrupts flag when any device
does not have an interrupt just pass -1 through tpm_info.irq.

The only thing that asks for autoprobing is the force=1 path.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Wilck, Martin <martin.wilck@ts.fujitsu.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit ef7b81dc78642e1a33c890acf3214d1e04c90a8f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm_crb: Use the common ACPI definition of struct acpi_tpm2

BugLink: http://bugs.launchpad.net/bugs/1398274
include/acpi/actbl2.h is the proper place for these definitions
and the needed TPM2 ones have been there since
commit 413d4a6defe0 ("ACPICA: Update TPM2 ACPI table")

This also drops a couple of le32_to_cpu's for members of this table,
the existing swapping was not done consistently, and the standard
used by other Linux callers of acpi_get_table is unswapped.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Tested-by: Wilck, Martin <martin.wilck@ts.fujitsu.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit 55a889c2cb138f8f10164539c6d290a1cefaa863)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: fix checks for policy digest existence in tpm2_seal_trusted()

BugLink: http://bugs.launchpad.net/bugs/1398274
In my original patch sealing with policy was done with dynamically
allocated buffer that I changed later into an array so the checks in
tpm2-cmd.c became invalid. This patch fixes the issue.

Fixes: 5beb0c435bdd ("keys, trusted: seal with a TPM2 authorization policy")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit f3c82ade7c59303167d56b0be3e0707751fc45e2)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

tpm: remove unneeded include of actbl2.h

BugLink: http://bugs.launchpad.net/bugs/1398274
tpm_tis.c already gets actbl2.h via linux/acpi.h -> acpi/acpi.h ->
acpi/actbl.h -> acpi/actbl2.h, so the direct include in tpm_tis.c
is not needed.

Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Acked-by: Peter Huewe <peterhuewe@gmx.de>
(cherry picked from commit e5be990c2fc3c2682ab7cfbc4f0e6c8cdad2b40d)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

intel_idle: Support for Intel Xeon Phi Processor x200 Product Family

BugLink: http://bugs.launchpad.net/bugs/1461365
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support,
formerly code-named KNL. It is based on modified Intel Atom Silvermont
microarchitecture.

Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
[micah.barany@intel.com: adjusted values of residency and latency]
Signed-off-by: Micah Barany <micah.barany@intel.com>
[hubert.chrzaniuk@intel.com: removed deprecated CPUIDLE_FLAG_TIME_VALID flag]
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Signed-off-by: Pawel Karczewski <pawel.karczewski@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
(cherry picked from commit 281baf7a702693deaa45c98ef0c5161006b48257)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: (noup) cxlflash: Move to exponential back-off when cmd_room is not available

BugLink: http://bugs.launchpad.net/bugs/1563485
While profiling the cxlflash_queuecommand() path under a heavy load it
was found that the number of retries to find cmd_room was fairly high.

There are two problems with the current back-off:
a) It starts with a udelay of 0
b) It backs-off linearly

Several approaches were tried (a higher multiple 10*n, 100*n, as well as
n^2 and 2^n), and the exponential back-off (2^n) approach had the least
overall cost, with cost defined as the overall time spent waiting.

The fix is to change the linear back-off to an exponential back-off.
This solution also takes care of the problem with the initial
delay (starts with 1 usec).
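
The difference between the two schedules can be illustrated with a toy
model. This Python sketch is not the driver code: the delay formulas
(n usec for linear, 2^n usec for exponential) and the function
attempts_until_room are assumptions chosen to match the description
above, and no real timing is involved.

```python
def linear_delay(retry: int) -> int:
    """Linear back-off: starts at 0 usec and grows by 1 usec per retry."""
    return retry


def exponential_delay(retry: int) -> int:
    """Exponential back-off: starts at 1 usec and doubles per retry."""
    return 2 ** retry


def attempts_until_room(room_free_at_usec: int, delay_for_retry):
    """Count the retries needed before the accumulated wait reaches the
    moment command room becomes free. Returns (retries, usec_waited)."""
    waited = 0
    retries = 0
    while waited < room_free_at_usec:
        waited += delay_for_retry(retries)
        retries += 1
    return retries, waited
```

In this model, if room frees up after 100 usec, the linear schedule
takes about twice as many retries as the exponential one. How that
translates into overall cost on real hardware is what the profiling
described above measured.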

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Increase cmd_per_lun for better throughput

BugLink: http://bugs.launchpad.net/bugs/1563485
With the current value of cmd_per_lun at 16, the throughput
over a single adapter is limited to around 150kIOPS.

Increase the value of cmd_per_lun to 256 to improve
throughput. With this change a single adapter is able to
attain close to the maximum throughput (380kIOPS).
Also change the number of RRQ entries that can be queued.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 83430833b4d4a9c9b23964babbeb1f36450f8136)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

cxlflash: Fix to avoid unnecessary scan with internal LUNs

BugLink: http://bugs.launchpad.net/bugs/1563485
When switching to the internal LUN defined on the
IBM CXL flash adapter, there is an unnecessary
scan occurring on the second port. This scan leads
to the following extra lines in the log:

Dec 17 10:09:00 tul83p1 kernel: [ 3708.561134] cxlflash 0008:00:00.0: cxlflash_queuecommand: (scp=c0000000fc1f0f00) 11/1/0/0 cdb=(A0000000-00000000-10000000-00000000)
Dec 17 10:09:00 tul83p1 kernel: [ 3708.561147] process_cmd_err: cmd failed afu_rc=32 scsi_rc=0 fc_rc=0 afu_extra=0xE, scsi_extra=0x0, fc_extra=0x0

By definition, both of the internal LUNs are on the first port/channel.

When the lun_mode is switched to internal LUN the
same value for host->max_channel is retained. This
causes an unnecessary scan over the second port/channel.

This fix alters the host->max_channel to 0 (1 port), if internal
LUNs are configured and switches it back to 1 (2 ports) while
going back to external LUNs.

Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 603ecce95f4817074a724a889cd88c3c8210f933)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: Start new release

Ignore: yes
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: Ubuntu-4.4.0-17.33

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: (noup) ppc64 boot: Wait for boot cpu to show up if nr_cpus limit is about to hit.

BugLink: http://bugs.launchpad.net/bugs/1560552
http://patchwork.ozlabs.org/patch/577193/

The kernel boot parameter 'nr_cpus=' allows one to specify the number of
possible cpus in the system. In the normal scenario the first cpu (cpu0)
that shows up is the boot cpu and hence it gets covered under nr_cpus
limit.

But this assumption will be broken in the kdump scenario, where the kdump kernel
after a crash can boot up on a non-zero boot cpu. The paca structure
allocation depends on the value of nr_cpus and is indexed using logical cpu
ids. This definitely will be an issue if boot cpu id > nr_cpus.

This patch modifies allocate_pacas() and smp_setup_cpu_maps() to
accommodate boot cpu for the case where boot_cpuid > nr_cpu_ids.

This change would help to reduce the memory reservation requirement for
kdump on ppc64.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

UBUNTU: SAUCE: (noup) powerpc/pci: Assign fixed PHB number based on device-tree properties

BugLink: http://bugs.launchpad.net/bugs/1560514
https://patchwork.ozlabs.org/patch/599720/

The domain/PHB field of PCI addresses has its value obtained from a
global variable, incremented each time a new domain (represented by
struct pci_controller) is added on the system. The domain addition
process happens during boot or due to PCI device hotplug.

As recent kernels are using predictable naming for network interfaces,
the network stack is more tied to PCI naming. This can be a problem in
hotplug scenarios, because PCI addresses will change if devices are
removed and then re-added. This situation seems unusual, but it can
happen if a user wants to replace a NIC without rebooting the machine,
for example.

This patch changes the way PCI domain values are generated: now, we use
device-tree properties to assign fixed PHB numbers to PCI addresses
when available (meaning pSeries and PowerNV cases). We also use a bitmap
to allow dynamic PHB numbering when device-tree properties are not
used. This bitmap keeps track of used PHB numbers and if a PHB is
released (by hotplug operations for example), it allows the reuse of
this PHB number, preventing the PCI address from changing on a device remove
and re-add soon after. No functional changes were introduced.
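
The dynamic numbering described above can be pictured with a small
bitmap allocator. This is only a sketch of the idea, not the powerpc
implementation: the class name, the method names and the 16-entry limit
are hypothetical.

```python
class PhbNumberAllocator:
    """Toy bitmap allocator: hands out the lowest free PHB number."""

    def __init__(self, max_phbs=16):  # limit is an assumption for the sketch
        self.max_phbs = max_phbs
        self.bitmap = 0  # bit n set means PHB number n is in use

    def alloc(self):
        for n in range(self.max_phbs):
            if not self.bitmap & (1 << n):
                self.bitmap |= 1 << n
                return n
        raise RuntimeError("no free PHB numbers")

    def release(self, n):
        # Clearing the bit makes the number available for the next alloc().
        self.bitmap &= ~(1 << n)


if __name__ == "__main__":
    phbs = PhbNumberAllocator()
    first = phbs.alloc()
    second = phbs.alloc()
    phbs.release(first)          # device removed
    print(phbs.alloc() == first) # device re-added, same number comes back
```

With a plain incrementing counter the re-added device would instead get
a new number, and therefore a new PCI address.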

Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [d-i] Add phy drivers for Cavium ThunderX to nic-modules udeb

BugLink: http://bugs.launchpad.net/bugs/1562968
Different implementations of Cavium ThunderX use different phys for
network support.

Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

phy: mdio-thunder: Fix some Kconfig typos

BugLink: http://bugs.launchpad.net/bugs/1562968
Drop two extra occurrences of "on" in option title and help text.

Fixes: 379d7ac7ca31 ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.")
Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit e2ad1f976b721df383ff12c12a6dcc805cbb80f3)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: cavium: For Kconfig THUNDER_NIC_BGX, select MDIO_THUNDER.

BugLink: http://bugs.launchpad.net/bugs/1562968
Previously we selected MDIO_OCTEON, which after creating the Thunder
specific MDIO bus driver is much less useful.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 9277a4f875adbeeb6209c0a3e3cf04c752522b2e)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

phy: mdio-cavium: Add missing MODULE_* annotations.

BugLink: http://bugs.launchpad.net/bugs/1562968
When the code was factored out of mdio-octeon.c, the
MODULE_DESCRIPTION, MODULE_AUTHOR and MODULE_LICENSE annotations were
inadvertently omitted. Restore them so that we don't get kernel taint
warnings upon module loading.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 7091f01e8cf6989e63c4eacb59b654fcff057901)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.

BugLink: http://bugs.launchpad.net/bugs/1562968
The Cavium Thunder SoCs have multiple MDIO buses that are part of a
single PCI device. To model this in the device tree we call the PCI
parent device a "cavium,thunder-8890-mdio-nexus"; it has several
children, one for each MDIO bus.

The MDIO bus hardware is identical to that found in the OCTEON SoCs,
so we use that code for things that are not part of the PCI driver
probe/remove.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 379d7ac7ca31722a1fb488ae3e98b274c9db568c)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] CONFIG_MDIO_THUNDER=m

BugLink: http://bugs.launchpad.net/bugs/1562968
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

phy: mdio-octeon: Refactor into two files/modules

BugLink: http://bugs.launchpad.net/bugs/1562968
A follow-on patch uses PCI probing to find the Thunder MDIO hardware.
In preparation for this, split out the common code into a new file
mdio-cavium.c, which will be used by both the existing OCTEON driver,
and the new Thunder PCI based driver.

As part of the refactoring simplify the struct cavium_mdiobus by
removing fields that are only ever used in the probe function and can
just as well be local variables.

Use readq/writeq in preference to readq_relaxed/writeq_relaxed as the
relaxed form was an optimization for an early chip revision, and the
MDIO drivers are not performance bottlenecks that need optimization in
the first place.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 1eefee901fca0208b8a56f20cdc134e2b8638ae7)
[ dannf: backported to v4.4 ]
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: [Config] CONFIG_MDIO_CAVIUM=m

BugLink: http://bugs.launchpad.net/bugs/1562968
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: thunderx: Cleanup PHY probing code.

BugLink: http://bugs.launchpad.net/bugs/1562968
Remove the call to force the octeon-mdio driver to be loaded. Allow
the standard driver loading mechanisms to load the PHY drivers, and
use -EPROBE_DEFER to cause the BGX driver to be probed only after the
PHY drivers are available.

Reorder the setting of MAC addresses and PHY probing to allow BGX
LMACs with no attached PHY to still be assigned a MAC address.

Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 5fc7cf179449502ad4ad67845ded2df94b680de2)
Signed-off-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

mm: exclude ZONE_DEVICE from GFP_ZONE_TABLE

BugLink: http://bugs.launchpad.net/bugs/1563293
ZONE_DEVICE (merged in 4.3) and ZONE_CMA (proposed) are examples of new
mm zones that are bumping up against the current maximum limit of 4
zones, i.e.  2 bits in page->flags for the GFP_ZONE_TABLE.

The GFP_ZONE_TABLE poses an interesting constraint since
include/linux/gfp.h gets included by the 32-bit portion of a 64-bit
build.  We need to be careful to only build the table for zones that
have a corresponding gfp_t flag.  GFP_ZONES_SHIFT is introduced for this
purpose.  This patch does not attempt to solve the problem of adding a
new zone that also has a corresponding GFP_ flag.

Vlastimil points out that ZONE_DEVICE, by depending on x86_64 and
SPARSEMEM_VMEMMAP implies that SECTIONS_WIDTH is zero.  In other words
even though ZONE_DEVICE does not fit in GFP_ZONE_TABLE it is free to
consume another bit in page->flags (expand ZONES_WIDTH) with room to
spare.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=110931
Fixes: 033fbae988fc ("mm: ZONE_DEVICE for "device memory"")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Mark <markk@clara.co.uk>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit b11a7b94100cba5ec926a181894c2897a22651b9)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Revert "mm: CONFIG_NR_ZONES_EXTENDED"

BugLink: http://bugs.launchpad.net/bugs/1563293
This reverts commit f671c3e6168e8b21aeb32201b3187c5d2b077349.

This patch never made it upstream. Instead, commit
b11a7b94100cba5ec926a181894c2897a22651b9 ('mm: exclude ZONE_DEVICE from GFP_ZONE_TABLE')
replaced it.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

printk: set may_schedule for some of console_trylock() callers

BugLink: http://bugs.launchpad.net/bugs/1534216
console_unlock() is allowed to cond_resched() if its caller has set
`console_may_schedule' to 1, since 8d91f8b15361 ("printk: do
cond_resched() between lines while outputting to consoles").

The rules are:
-- console_lock() always sets `console_may_schedule' to 1
-- console_trylock() always sets `console_may_schedule' to 0

However, console_trylock() callers (among them is printk()) do not
always call printk() from atomic contexts, and some of them can
cond_resched() in console_unlock(), so console_trylock() can set
`console_may_schedule' to 1 for such processes.

For !CONFIG_PREEMPT_COUNT kernels, however, console_trylock() always
sets `console_may_schedule' to 0.

It's possible to drop explicit preempt_disable()/preempt_enable() in
vprintk_emit(), because console_unlock() and console_trylock() are now
smart enough:
a) console_unlock() does not cond_resched() when it's unsafe
(console_trylock() takes care of that)
b) console_unlock() does can_use_console() check.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: Jan Kara <jack@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Calvin Owens <calvinowens@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 6b97a20d3a7909daa06625d4440c2c52d7bf08d7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drm/atomic-helper: Implement subsystem-level suspend/resume

BugLink: http://bugs.launchpad.net/bugs/1560395
Provide subsystem-level suspend and resume helpers that can be used to
implement suspend/resume on atomic mode-setting enabled drivers.

v2: simplify locking, enhance kerneldoc comments
v3: pass lock acquisition context by parameter, improve kerneldoc
v4: - remove redundant code (already provided by atomic helpers)
      (Maarten Lankhorst)
    - move backoff dance from drm_modeset_lock_all_ctx() into suspend
      helper (Daniel Vetter)
v5: handle potential EDEADLK from drm_atomic_helper_duplicate_state()
    and drm_atomic_helper_disable_all() (Daniel Vetter)

Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1449075005-13937-2-git-send-email-thierry.reding@gmail.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 1494276000db789c6d2acd85747be4707051c801)
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drm/core: Add drm_for_each_encoder_mask, v2.

BugLink: http://bugs.launchpad.net/bugs/1560395
This is similar to the other drm_for_each_*_mask functions.

Changes since v1:
- Use for_each_if

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452160762-30487-3-git-send-email-maarten.lankhorst@linux.intel.com
(cherry picked from commit ead8b665705a0926442fbd3f4dbccbec36e5b8f4)
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

PM / runtime: Add new helper for conditional usage count incrementation

BugLink: http://bugs.launchpad.net/bugs/1560395
Introduce a new runtime PM function, pm_runtime_get_if_in_use(),
that will increment the device's runtime PM usage counter and
return 1 if its status is RPM_ACTIVE and its usage counter
is greater than 0 at the same time (0 will be returned otherwise).

This is useful for things that should only be done if the device
is active (from the runtime PM perspective) and used by somebody
(as indicated by the usage counter) already and they are not worth
bothering otherwise.
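
The return-value rule stated above can be modelled in a few lines. The
sketch below is a plain Python model of that rule, not the kernel
function; the Device class and get_if_in_use() are hypothetical names,
and only the two outcomes described in this message are modelled.

```python
RPM_ACTIVE = "active"
RPM_SUSPENDED = "suspended"


class Device:
    """Minimal stand-in for a device's runtime PM state."""

    def __init__(self, status, usage_count):
        self.status = status
        self.usage_count = usage_count


def get_if_in_use(dev):
    """Take a usage reference only if the device is active and already used."""
    if dev.status == RPM_ACTIVE and dev.usage_count > 0:
        dev.usage_count += 1
        return 1
    return 0


if __name__ == "__main__":
    busy = Device(RPM_ACTIVE, 1)
    idle = Device(RPM_ACTIVE, 0)
    asleep = Device(RPM_SUSPENDED, 0)
    for dev in (busy, idle, asleep):
        print(dev.status, get_if_in_use(dev), dev.usage_count)
```

A caller that gets 1 back owns a reference and is expected to drop it
later; a caller that gets 0 simply skips the optional work.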

Requested-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit a436b6a19f57656a6557439523923d89eb4a880d)
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

UBUNTU: SAUCE: i915_bpo: Update to drm-intel-next-fixes-2016-03-16

BugLink: http://bugs.launchpad.net/bugs/1560395
Signed-off-by: Timo Aaltonen <timo.aaltonen@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Support handling messages on multiple CPUs

BugLink: http://bugs.launchpad.net/bugs/1541585
Starting with Windows 2012 R2, message interrupts can be delivered
on any VCPU in the guest. Support this functionality.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d81274aae61c0a045cd0f34191c51fa64ba58bc4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Revert "Drivers: hv: vmbus: Support handling messages on multiple CPUs"

BugLink: http://bugs.launchpad.net/bugs/1541585
This reverts commit 6e97c2369f18d020c1384bf0716aec6009e8128a.

Drivers: hv: utils: Remove util transport handler from list if registration fails

BugLink: http://bugs.launchpad.net/bugs/1541585
If util transport fails to initialize for any reason, the list of transport
handlers may become corrupted due to freeing the transport handler without
removing it from the list. Fix this by cleaning it up from the list.

Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e66853b09017a788dc384dadce9323396dae3293)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: util: Pass the channel information during the init call

BugLink: http://bugs.launchpad.net/bugs/1541585
Pass the channel information to the util drivers that need to defer
reading the channel while they are processing a request. This would address
the following issue reported by Vitaly:

Commit 3cace4a61610 ("Drivers: hv: utils: run polling callback always in
interrupt context") removed direct *_transaction.state = HVUTIL_READY
assignments from *_handle_handshake() functions introducing the following
race: if a userspace daemon connects before we get first non-negotiation
request from the server hv_poll_channel() won't set transaction state to
HVUTIL_READY as (!channel) condition will fail, we set it to non-NULL on
the first real request from the server.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b9830d120cbe155863399f25eaef6aa8353e767f)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()

BugLink: http://bugs.launchpad.net/bugs/1541585
The message header is modified by the hypervisor and we read it in a loop,
so we need to prevent compilers from optimizing accesses. There are no such
optimizations at this moment; this is just future-proofing.

Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit d452ab7b4c65dfcaee88a0d6866eeeb98a3d1884)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: remove code duplication in message handling

BugLink: http://bugs.launchpad.net/bugs/1541585
We have 3 functions dealing with messages and they all implement
the same logic to finalize reads; move it to vmbus_signal_eom().

Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 0f70b66975ce4331e9002b792d5aa6787a110181)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: avoid wait_for_completion() on crash

BugLink: http://bugs.launchpad.net/bugs/1541585
wait_for_completion() may sleep and it enables interrupts, which
is something we really want to avoid on crashes because interrupt
handlers can cause other crashes. Switch to the recently introduced
vmbus_wait_for_unload() doing busy wait instead.

Reported-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 75ff3a8a9168df750b5bd0589e897a6c0517a9f1)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: don't lose HVMSG_TIMER_EXPIRED messages

BugLink: http://bugs.launchpad.net/bugs/1541585
We must handle HVMSG_TIMER_EXPIRED messages in the interrupt context
and we offload all the rest to the vmbus_on_msg_dpc() tasklet. This function
loops to see if there are new messages pending. If we ever see an
HVMSG_TIMER_EXPIRED message there, we're going to lose it as we can't
handle it from there. Avoid looping in vmbus_on_msg_dpc(), we're OK
with handling one message per interrupt.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7be3e169444d2c625f15a0b6639252b98d1f226a)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

drivers/hv: Move VMBus hypercall codes into Hyper-V UAPI header

BugLink: http://bugs.launchpad.net/bugs/1541585
VMBus hypercall codes inside Hyper-V UAPI header will
be used by QEMU to implement VMBus host devices support.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
[Do not rename the constant at the same time as moving it, as that
would cause semantic conflicts with the Hyper-V tree. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(back ported from commit 18f098618aa031f4c8a907c550fcd6785280c977)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
arch/x86/include/uapi/asm/hyperv.h

Drivers: hv: vmbus: Give control over how the ring access is serialized

BugLink: http://bugs.launchpad.net/bugs/1541585
On the channel send side, many of the VMBUS
device drivers explicitly serialize access to the
outgoing ring buffer. Give more control to the
VMBUS device drivers in terms of how to serialize
access to the outgoing ring buffer.
The default behavior will be to acquire the
ring lock to preserve the current behavior.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit fe760e4d64fe5c17c39e86c410d41f6587ee88bc)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Eliminate the spin lock on the read path

BugLink: http://bugs.launchpad.net/bugs/1541585
The function hv_ringbuffer_read() is always called on a pre-assigned
CPU. Each channel is bound to a specific CPU and this function is
always called on the CPU the channel is bound to. There is no need to
acquire the spin lock; get rid of this overhead.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3eba9a77d5fc2cee486a16fff435686f024f61cf)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()

BugLink: http://bugs.launchpad.net/bugs/1541585
The hvsock driver needs this API to release all the resources related
to the channel.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 85d9aa705184a4504d0330017e3956fcdae8a9d6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: add a per-channel rescind callback

BugLink: http://bugs.launchpad.net/bugs/1541585
This will be used by the coming hv_sock driver.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 499e8401a515d04daa986b995da710d2b9737764)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: add a hvsock flag in struct hv_driver

BugLink: http://bugs.launchpad.net/bugs/1541585
Only the coming hv_sock driver has a "true" value for this flag.

We treat the hvsock offers/channels as special VMBus devices.
Since the hv_sock driver handles all the hvsock offers/channels, we need to
tweak vmbus_match() for hv_sock driver, so we introduce this flag.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 8981da320a11217589aa3c50f9e891bcdef07ece)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: define a new VMBus message type for hvsock

BugLink: http://bugs.launchpad.net/bugs/1541585
A function to send the type of message is also added.

The coming net/hvsock driver will use this function to proactively request
the host to offer a VMBus channel for a new hvsock connection.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5c23a1a5c60b0f472cfa61cd7d8279f8aaeb5b64)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: vmbus_sendpacket_ctl: hvsock: avoid unnecessary signaling

BugLink: http://bugs.launchpad.net/bugs/1541585
When the hvsock channel's outbound ringbuffer is full (i.e.,
hv_ringbuffer_write() returns -EAGAIN), we should avoid unnecessarily
signaling the host.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 5f363bc38f810d238d1e8b19998625ddec3b8138)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)

BugLink: http://bugs.launchpad.net/bugs/1541585
A helper function is also added.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e8d6ca023efce3bd80050dcd9e708ee3cf8babd4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: add a helper function to set a channel's pending send size

BugLink: http://bugs.launchpad.net/bugs/1541585
This will be used by the coming net/hvsock driver.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3c75354d043ad546148d6992e40033ecaefc5ea5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: don't manipulate with clocksources on crash

BugLink: http://bugs.launchpad.net/bugs/1541585
clocksource_change_rating() involves mutex usage and can't be called
in interrupt context. It also makes sense to avoid doing redundant work
on crash.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 3ccb4fd8f492f99aece21acc1bd6142275f26236)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()

BugLink: http://bugs.launchpad.net/bugs/1541585
We have to call vmbus_initiate_unload() on crash to make kdump work but
the crash can also be happening in interrupt context (e.g. Sysrq + c results in
such) where we can't schedule or the following will happen:

[ 314.905786] bad: scheduling from the idle thread!

Just skipping the wait (and even adding some random wait here) won't help:
to make the host-side magic work we're supposed to receive CHANNELMSG_UNLOAD
(and actually confirm the fact that we received it) but we can't use the
interrupt-based path (vmbus_isr() -> vmbus_on_msg_dpc()). Implement a simple
busy wait ignoring all the other messages and use it if we're in an
interrupt context.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 415719160de3fae3bb9cbc617664649919cd00d0)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: avoid infinite loop in init_vp_index()

BugLink: http://bugs.launchpad.net/bugs/1541585
When we pick a CPU to use for a new subchannel we try to find a non-used one
on the appropriate NUMA node; we keep track of them with the
primary->alloced_cpus_in_node mask. Under normal circumstances we don't run
out of available CPUs but it is possible when we don't initialize some
cpus in Linux, e.g. when we boot with the 'nr_cpus=' limitation.

Avoid the infinite loop in init_vp_index() by checking that we still have
non-used CPUs in the alloced_cpus_in_node mask and resetting it in case
we don't.
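
The check-and-reset idea can be sketched with plain sets standing in for
the CPU masks. This is an illustrative model only, not the
init_vp_index() code; pick_cpu() and its arguments are hypothetical.

```python
def pick_cpu(node_cpus, alloced_cpus):
    """Pick a not-yet-used CPU on the node, resetting the mask when exhausted.

    node_cpus:    set of CPUs available on the NUMA node
    alloced_cpus: set of CPUs already handed out (updated in place)
    """
    if not node_cpus:
        raise ValueError("node has no CPUs")
    # Without this check, a search for an unused CPU would never terminate
    # once every CPU on the node has been handed out.
    if node_cpus <= alloced_cpus:
        alloced_cpus.clear()
    cpu = min(node_cpus - alloced_cpus)
    alloced_cpus.add(cpu)
    return cpu


if __name__ == "__main__":
    node = {0, 1}      # e.g. only two CPUs brought up on this node
    used = set()
    print([pick_cpu(node, used) for _ in range(5)])
```

Once all CPUs on the node have been used, the mask is cleared and
assignment starts over, so subchannels keep being spread across the
available CPUs.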

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 79fd8e706637a5c7c41f9498fe0fbfb437abfdc8)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Add vendor and device attributes

BugLink: http://bugs.launchpad.net/bugs/1541585
Add vendor and device attributes to VMBUS devices. These will be used
by Hyper-V tools as well as user-level RDMA libraries that will use the
vendor/device tuple to discover the RDMA device.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7047f17d70fc0599563d30d0791692cb5fe42ae6)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Drivers: hv: vmbus: Cleanup vmbus_set_event()

BugLink: http://bugs.launchpad.net/bugs/1541585
Cleanup vmbus_set_event() by inlining the hypercall to post
the event and since the return value of vmbus_set_event() is not checked,
make it void. As part of this cleanup, get rid of the function
hv_signal_event() as it is only called from vmbus_set_event().

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 1b807e1011af46a595ba46c75ad5e20ad7177af7)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled

BugLink: http://bugs.launchpad.net/bugs/1559918
Some SKL-H configurations require "intel_idle.max_cstate=7" to boot.
While that is an effective workaround, it disables C10.

This patch detects the problematic configuration,
and disables C8 and C9, keeping C10 enabled.

Note that enabling SGX in BIOS SETUP can also prevent this issue,
if the system BIOS provides that option.

https://bugzilla.kernel.org/show_bug.cgi?id=109081
"Freezes with Intel i7 6700HQ (Skylake), unless intel_idle.max_cstate=7"

Signed-off-by: Len Brown <len.brown@intel.com>
Cc: stable@vger.kernel.org
(cherry picked from commit d70e28f57e14a481977436695b0c9ba165472431)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: ixgbe: abort with cls u32 divisor groups greater than 1

BugLink: http://bugs.launchpad.net/bugs/1562326
This patch ensures ixgbe will not try to offload hash tables from the
u32 module. The device class does not currently support this, so until
it is enabled, just abort on these tables.

Interestingly the more flexible your hardware is the less code you
need to implement to guard against these cases.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit db956ae882f4e7aa99c9c242a91ae942d08b6939)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: ixgbe: add support for tc_u32 offload

BugLink: http://bugs.launchpad.net/bugs/1562326
This adds initial support for offloading the u32 tc classifier. This
initial implementation only implements a few base matches and actions
to illustrate the use of the infrastructure patches.

However it is an interesting subset because it handles the u32 next
hdr logic to correctly map tcp packets from ip headers using the ihl
and protocol fields. After this is accepted we can extend the match
and action fields easily by updating the model header file.
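
As a worked example of the ihl step, the sketch below reproduces the
arithmetic that a u32 selector of the form "offset at 0 mask 0f00
shift 6" asks for: read the first 16 bits of the IPv4 header, keep the
IHL nibble, and shift so the result is the header length in bytes. The
function name is hypothetical and this is plain user-space C, not the
ixgbe or cls_u32 implementation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given a pointer to the start of an IPv4 header,
 * return the byte offset of the next header. The first 16 bits are
 * version (4 bits), IHL (4 bits) and TOS (8 bits); masking with 0x0f00
 * keeps the IHL, and shifting right by 6 instead of 8 multiplies the
 * IHL (counted in 32-bit words) by 4.
 */
unsigned int u32_next_hdr_offset(const uint8_t *ip_hdr)
{
    uint16_t first16 = (uint16_t)((ip_hdr[0] << 8) | ip_hdr[1]);

    return (first16 & 0x0f00u) >> 6;
}
```

A header starting with the bytes 0x45 0x00 (no IP options) gives an
offset of 20, so a following "match tcp src" is applied 20 bytes past
the start of the IP header.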

Also only the drop action is supported initially.

Here is a short test script:

#tc qdisc add dev eth4 ingress
#tc filter add dev eth4 parent ffff: protocol ip \
u32 ht 800: order 1 \
match ip dst 15.0.0.1/32 match ip src 15.0.0.2/32 action drop

<-- hardware has dst/src ip match rule installed -->

#tc filter del dev eth4 parent ffff: prio 49152
#tc filter add dev eth4 parent ffff: protocol ip prio 99 \
handle 1: u32 divisor 1
#tc filter add dev eth4 protocol ip parent ffff: prio 99 \
u32 ht 800: order 1 link 1: \
offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff
#tc filter add dev eth4 parent ffff: protocol ip \
u32 ht 1: order 3 match tcp src 23 ffff action drop

<-- hardware has tcp src port rule installed -->

#tc qdisc del dev eth4 parent ffff:

<-- hardware cleaned up -->

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b82b17d929a692df1122fedc0ff4ddcef9cb6ad4)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: sched: add cls_u32 offload hooks for netdevs

BugLink: http://bugs.launchpad.net/bugs/1562326
This patch allows netdev drivers to consume cls_u32 offloads via
the ndo_setup_tc ndo op.

This aligns with how network drivers have been doing qdisc
offloads for mqprio.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit a1b7c5fd7fe98f51fbbc393ee1fc4c1cdb2f0119)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: rework setup_tc ndo op to consume general tc operand

BugLink: http://bugs.launchpad.net/bugs/1562326
This patch updates setup_tc so we can pass additional parameters into
the ndo op in a generic way. To do this we provide a structured union
and a type flag.

This lets each classifier and qdisc provide its own set of attributes
without having to add new ndo ops or grow the signature of the
callback.
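
The shape of such an interface can be sketched as a tagged union. All
names below (demo_tc_offload, demo_setup_tc and so on) are hypothetical
and chosen for illustration; they are not the structure or callback
added by this patch. The sketch only shows why a type flag plus a union
lets new offload kinds be added without changing the callback
signature.

```c
#include <stdint.h>

/* Hypothetical type flag: which member of the union is valid. */
enum demo_tc_type {
    DEMO_TC_MQPRIO,
    DEMO_TC_CLS_U32,
};

/* Hypothetical classifier request, reduced to two fields. */
struct demo_cls_u32 {
    uint32_t handle;
    int is_drop;
};

/* One argument carries every kind of request. */
struct demo_tc_offload {
    enum demo_tc_type type;
    union {
        uint8_t num_tc;                     /* valid for DEMO_TC_MQPRIO  */
        const struct demo_cls_u32 *cls_u32; /* valid for DEMO_TC_CLS_U32 */
    } u;
};

/* Hypothetical driver callback: 0 on success, -1 if unsupported. */
int demo_setup_tc(const struct demo_tc_offload *tc)
{
    switch (tc->type) {
    case DEMO_TC_MQPRIO:
        return tc->u.num_tc <= 8 ? 0 : -1;
    case DEMO_TC_CLS_U32:
        /* This toy driver only accepts drop rules. */
        return tc->u.cls_u32->is_drop ? 0 : -1;
    default:
        return -1;
    }
}
```

A driver that does not understand a given type simply falls through to
the default case, so adding another enum value and union member leaves
existing drivers untouched.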

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit 16e5cc647173a97e33b3e3ba81f73eb455561794)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/net/ethernet/mellanox/mlx4/en_netdev.c

net: rework ndo tc op to consume additional qdisc handle parameter

BugLink: http://bugs.launchpad.net/bugs/1562326
The ndo_setup_tc() op was added to support drivers offloading tx
qdiscs; however, only support for mqprio was ever added. So we
only ever added support for passing the number of traffic classes
to the driver.

This patch generalizes the ndo_setup_tc op so that a handle can
be provided to indicate if the offload is for ingress or egress
or potentially even child qdiscs.
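
One way a driver might interpret such a handle is sketched below. The
constants carry the values of TC_H_MAJ_MASK, TC_H_ROOT and TC_H_INGRESS
from the uapi pkt_sched.h header, redefined under DEMO_ names so the
example builds in user space. The classification function itself is
hypothetical and is an assumption about usage, not code from this
patch.

```c
#include <stdint.h>

/* Same values as the uapi TC_H_* constants, renamed for this sketch. */
#define DEMO_TC_H_MAJ_MASK 0xFFFF0000u
#define DEMO_TC_H_ROOT     0xFFFFFFFFu
#define DEMO_TC_H_INGRESS  0xFFFFFFF1u

enum demo_direction {
    DEMO_EGRESS_ROOT, /* root qdisc, e.g. an mqprio offload */
    DEMO_INGRESS,     /* ingress qdisc, shown by tc as "ffff:" */
    DEMO_OTHER,       /* any other handle, e.g. a child qdisc */
};

/* Hypothetical helper: classify a qdisc handle passed to the driver. */
enum demo_direction demo_classify_handle(uint32_t handle)
{
    /* Check the root first: its major number is also 0xffff. */
    if (handle == DEMO_TC_H_ROOT)
        return DEMO_EGRESS_ROOT;

    if ((handle & DEMO_TC_H_MAJ_MASK) ==
        (DEMO_TC_H_INGRESS & DEMO_TC_H_MAJ_MASK))
        return DEMO_INGRESS;

    return DEMO_OTHER;
}
```

A handle of 0xffff0000, which tc prints as "ffff:", is classified as
ingress, while 0x00010002 ("1:2") falls into the last category.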

CC: Murali Karicheri <m-karicheri2@ti.com>
CC: Shradha Shah <sshah@solarflare.com>
CC: Or Gerlitz <ogerlitz@mellanox.com>
CC: Ariel Elior <ariel.elior@qlogic.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: Bruce Allan <bruce.w.allan@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit e4c6734eaab90695db0ea8456307790cb0c1ccb5)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/net/ethernet/mellanox/mlx4/en_netdev.c

sctp: Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC

BugLink: http://bugs.launchpad.net/bugs/1562326
The SCTP checksum is really a CRC and is very different from the
standard 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.
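
To make the difference concrete, here is a minimal user-space sketch of
both algorithms: a 16-bit 1's complement checksum in the style of
RFC 1071, and a bitwise CRC32c, the CRC that SCTP uses. The function
names are hypothetical, and neither routine is the kernel's
implementation; real code would use table-driven or hardware-assisted
variants.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit 1's complement checksum over big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);

    /* An odd trailing byte is padded with a zero byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }

    return ~crc;
}
```

The 1's complement sum is plain addition, so swapping two 16-bit words
of a buffer leaves inet_checksum() unchanged, whereas crc32c() returns
a different value for the reordered data. That difference in structure
is part of why the two offloads cannot share one feature flag.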

Signed-off-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 53692b1de419c1b59106909c7f6b4dd3dbc768ac)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

net: tc: helper functions to query action types

BugLink: http://bugs.launchpad.net/bugs/1562326
This is a helper function drivers can use to learn if the
action type is a drop action.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit 3b01cf56daf96acf9b155d6201d94bc8b4de218e)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>