git.proxmox.com Git - mirror_ubuntu-artful-kernel.git/log
6 years agoUBUNTU: Ubuntu-4.13.0-14.15 Ubuntu-4.13.0-14.15
Seth Forshee [Tue, 3 Oct 2017 19:52:44 +0000 (14:52 -0500)]
UBUNTU: Ubuntu-4.13.0-14.15

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoarm64: mm: Use READ_ONCE when dereferencing pointer to pte table
Will Deacon [Tue, 3 Oct 2017 19:35:04 +0000 (14:35 -0500)]
arm64: mm: Use READ_ONCE when dereferencing pointer to pte table

On kernels built with support for transparent huge pages, different CPUs
can access the PMD concurrently due to e.g. fast GUP or page_vma_mapped_walk
and they must take care to use READ_ONCE to avoid value tearing or caching
of stale values by the compiler. Unfortunately, these functions call into
our pgtable macros, which don't use READ_ONCE, and compiler caching has
been observed to cause the following crash during ext4 writeback:

PC is at check_pte+0x20/0x170
LR is at page_vma_mapped_walk+0x2e0/0x540
[...]
Process doio (pid: 2463, stack limit = 0xffff00000f2e8000)
Call trace:
[<ffff000008233328>] check_pte+0x20/0x170
[<ffff000008233758>] page_vma_mapped_walk+0x2e0/0x540
[<ffff000008234adc>] page_mkclean_one+0xac/0x278
[<ffff000008234d98>] rmap_walk_file+0xf0/0x238
[<ffff000008236e74>] rmap_walk+0x64/0xa0
[<ffff0000082370c8>] page_mkclean+0x90/0xa8
[<ffff0000081f3c64>] clear_page_dirty_for_io+0x84/0x2a8
[<ffff00000832f984>] mpage_submit_page+0x34/0x98
[<ffff00000832fb4c>] mpage_process_page_bufs+0x164/0x170
[<ffff00000832fc8c>] mpage_prepare_extent_to_map+0x134/0x2b8
[<ffff00000833530c>] ext4_writepages+0x484/0xe30
[<ffff0000081f6ab4>] do_writepages+0x44/0xe8
[<ffff0000081e5bd4>] __filemap_fdatawrite_range+0xbc/0x110
[<ffff0000081e5e68>] file_write_and_wait_range+0x48/0xd8
[<ffff000008324310>] ext4_sync_file+0x80/0x4b8
[<ffff0000082bd434>] vfs_fsync_range+0x64/0xc0
[<ffff0000082332b4>] SyS_msync+0x194/0x1e8

This is because page_vma_mapped_walk loads the PMD twice before calling
pte_offset_map: the first time without READ_ONCE (where it gets all zeroes
due to a concurrent pmdp_invalidate) and the second time with READ_ONCE
(where it sees a valid table pointer due to a concurrent pmd_populate).
However, the compiler inlines everything and caches the first value in
a register, which is subsequently used in pte_offset_phys which returns
a junk pointer that is later dereferenced when attempting to access the
relevant pte.

This patch fixes the issue by using READ_ONCE in pte_offset_phys to ensure
that a stale value is not used. Whilst this is a point fix for a known
failure (and simple to backport), a full fix moving all of our page table
accessors over to {READ,WRITE}_ONCE and consistently using READ_ONCE in
page_vma_mapped_walk is in the works for a future kernel release.
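The single-load pattern described above can be sketched in userspace C. This is an illustration only, not the arm64 patch: READ_ONCE here is a simplified volatile-cast stand-in for the kernel macro (it relies on the GCC/Clang __typeof__ extension), and pmd_t, table_addr_once and addr_mask are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's READ_ONCE(): the volatile access
 * makes the compiler emit one real load instead of reusing an older value. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

typedef uint64_t pmd_t;   /* hypothetical: one table entry */

/* Hypothetical helper: take one snapshot of *pmdp and derive the table
 * address from that snapshot only, so the check and the use agree.
 * Returns 0 when the entry is empty. */
static uint64_t table_addr_once(pmd_t *pmdp, uint64_t addr_mask)
{
    pmd_t pmd;

    assert(pmdp != 0);
    pmd = READ_ONCE(*pmdp);   /* single load of the shared entry */
    if (pmd == 0)
        return 0;
    return pmd & addr_mask;   /* computed from the same snapshot */
}
```

The point of the sketch is that every later decision is made on the local copy, so a concurrent writer cannot make the check and the dereference disagree.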

Cc: Jon Masters <jcm@redhat.com>
Cc: Timur Tabi <timur@codeaurora.org>
Cc: <stable@vger.kernel.org>
Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
BugLink: https://launchpad.net/bugs/1721067
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
(cherry picked from commit f069faba688701c4d56b6c3452a130f97bf02e95
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git)
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: check for invalid zero sized writes
Colin Ian King [Tue, 3 Oct 2017 12:12:54 +0000 (13:12 +0100)]
UBUNTU: SAUCE: LSM stacking: check for invalid zero sized writes

BugLink: http://bugs.launchpad.net/bugs/1720779
Writing zero bytes to /proc/$pid/task/$pid/attr/context via
security_setprocattr causes an oops in memcpy_erms. Fix this by
checking for zero size and returning -EINVAL for this invalid
write size.

Detected by running stress-ng --procfs 0
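A minimal userspace sketch of the check described above, assuming a hypothetical handler set_context() and buffer size CTX_MAX (neither is taken from the patch): the size is validated before any copy is attempted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define CTX_MAX 64   /* hypothetical buffer size */

/* Hypothetical stand-in for an attribute write handler: copies `size`
 * bytes of `value` into `dst` and returns the byte count, or a negative
 * errno value on an invalid size. */
static int set_context(char dst[CTX_MAX], const char *value, size_t size)
{
    if (size == 0)            /* reject empty writes before touching value */
        return -EINVAL;
    if (size >= CTX_MAX)      /* leave room for the terminator */
        return -ERANGE;
    assert(value != 0);
    memcpy(dst, value, size);
    dst[size] = '\0';
    return (int)size;
}
```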

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: d-i: Add bnxt_en to nic-modules.
Vinson Lee [Fri, 29 Sep 2017 22:42:52 +0000 (22:42 +0000)]
UBUNTU: d-i: Add bnxt_en to nic-modules.

BugLink: http://bugs.launchpad.net/bugs/1720466
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: [Packaging] Include arch/arm64/kernel/ftrace-mod.o in headers package
Seth Forshee [Fri, 29 Sep 2017 19:48:52 +0000 (15:48 -0400)]
UBUNTU: [Packaging] Include arch/arm64/kernel/ftrace-mod.o in headers package

dkms builds on arm64 fail because this file is required, so add
it to the arm64 headers package to get them working.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: Start new release
Seth Forshee [Tue, 3 Oct 2017 17:38:25 +0000 (12:38 -0500)]
UBUNTU: Start new release

Ignore: yes
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: Ubuntu-4.13.0-13.14 Ubuntu-4.13.0-13.14
Seth Forshee [Thu, 28 Sep 2017 21:38:05 +0000 (17:38 -0400)]
UBUNTU: Ubuntu-4.13.0-13.14

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: [Config] Run updateconfigs after merging LSM stacking
Seth Forshee [Thu, 28 Sep 2017 20:48:41 +0000 (16:48 -0400)]
UBUNTU: [Config] Run updateconfigs after merging LSM stacking

New options weren't in the correct locations and some removed
options were still in the configs. Run updateconfigs to sort it
all out.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: add configs for LSM stacking
John Johansen [Mon, 25 Sep 2017 16:21:31 +0000 (12:21 -0400)]
UBUNTU: SAUCE: LSM stacking: add configs for LSM stacking

Add the config options for LSM stacking. Default to only apparmor
being in the stack. Users can experiment with stacking by specifying
the
     security=

kernel boot parameter.
  eg.
     security=apparmor,selinux

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: add Kconfig to set default display LSM
John Johansen [Wed, 27 Sep 2017 11:50:19 +0000 (07:50 -0400)]
UBUNTU: SAUCE: LSM stacking: add Kconfig to set default display LSM

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: add /proc/<pid>/attr/display_lsm
John Johansen [Thu, 28 Sep 2017 15:09:43 +0000 (11:09 -0400)]
UBUNTU: SAUCE: LSM stacking: add /proc/<pid>/attr/display_lsm

Add /proc/<pid>/attr/display_lsm so that scripts can easily introspect
the display lsm of a given task.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: make sure LSM blob align on 64 bit boundaries
John Johansen [Thu, 28 Sep 2017 17:47:13 +0000 (13:47 -0400)]
UBUNTU: SAUCE: LSM stacking: make sure LSM blob align on 64 bit boundaries

The security->task blob reserving the first 12 bytes means that LSM
blobs don't align on 64 bit boundaries. This is not a problem
for x86, but if an LSM stores a long or ptr in its blob, then some
architectures require it to be aligned to the arch word size and
will throw a fault.
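The arithmetic behind the fix can be sketched as follows. align_up() is a hypothetical helper written for this note, not a function from the patch; it shows how a 12 byte reservation is rounded up so that the next blob starts on an 8 byte (64 bit) boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `align`.
 * `align` must be a power of two. */
static size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}
```

With this helper, align_up(12, 8) yields 16, so a blob placed after the 12 reserved bytes starts at offset 16 rather than 12.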

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: provide a way to specify the default display lsm
John Johansen [Wed, 27 Sep 2017 06:05:22 +0000 (02:05 -0400)]
UBUNTU: SAUCE: LSM stacking: provide a way to specify the default display lsm

When the system boots up the desired default display LSM may be different
than the first LSM initialized. Allow it to be set by specifying an
LSM with
  security.display=apparmor

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: verify display LSM
John Johansen [Wed, 27 Sep 2017 05:28:08 +0000 (01:28 -0400)]
UBUNTU: SAUCE: LSM stacking: verify display LSM

Make sure the display LSM is verified to be a registered LSM, to
avoid breakage when a bad name is passed.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: keep an index for each registered LSM
John Johansen [Wed, 27 Sep 2017 05:13:17 +0000 (01:13 -0400)]
UBUNTU: SAUCE: LSM stacking: keep an index for each registered LSM

Keep an index of the registered LSMs so that it can be used in table
lookups and ordering comparisons.
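A rough sketch of what such an index allows, using a hypothetical fixed-size table (registered, lsm_register and lsm_index are illustrative names, not the patch's data structures): registration hands out a small integer, and a name lookup either returns that integer or reports that the name is unknown.

```c
#include <assert.h>
#include <string.h>

#define MAX_LSMS 8   /* hypothetical table size */

static const char *registered[MAX_LSMS];
static int nregistered;

/* Register a name and return its index, or -1 if the table is full. */
static int lsm_register(const char *name)
{
    assert(name != 0);
    if (nregistered >= MAX_LSMS)
        return -1;
    registered[nregistered] = name;
    return nregistered++;
}

/* Return the index of a registered name, or -1 for an unknown name. */
static int lsm_index(const char *name)
{
    for (int i = 0; i < nregistered; i++)
        if (strcmp(registered[i], name) == 0)
            return i;
    return -1;
}
```

Because indices are assigned in registration order, comparing two indices gives the relative order of two modules, and a result of -1 identifies a name that was never registered.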

pulled from the full LSM stacking patch

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: inherit current display LSM
John Johansen [Wed, 27 Sep 2017 04:45:16 +0000 (00:45 -0400)]
UBUNTU: SAUCE: LSM stacking: inherit current display LSM

If a current display LSM is set it should be inherited. As per 2017
LSS discussion.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: provide prctl interface for setting context
John Johansen [Tue, 26 Sep 2017 18:58:26 +0000 (14:58 -0400)]
UBUNTU: SAUCE: LSM stacking: provide prctl interface for setting context

Separate out the prctl interface added in the full LSM stacking patches
to allow tasks to set the display "LSM".

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: allow selecting multiple LSMs using kernel boot params
John Johansen [Tue, 26 Sep 2017 15:06:30 +0000 (11:06 -0400)]
UBUNTU: SAUCE: LSM stacking: allow selecting multiple LSMs using kernel boot params

The base stacking code does not provide a way for users to specify the
desired stack using the kernel boot parameters. Enable specifying an
LSM stack on the command line by providing a comma separated list

e.g.
  security=apparmor,selinux
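Splitting such a comma separated list might look like the sketch below. It is a userspace illustration under assumed limits (MAX_STACK, NAME_LEN) with a hypothetical function name, and it does not reproduce the kernel's actual parameter handling.

```c
#include <assert.h>
#include <string.h>

#define MAX_STACK 4    /* hypothetical limit on stacked modules */
#define NAME_LEN  16   /* hypothetical limit on a module name */

/* Split "a,b,c" into names[]. Returns the number of names, or -1 if an
 * entry is empty, too long, or there are more than MAX_STACK entries. */
static int parse_lsm_list(const char *arg, char names[MAX_STACK][NAME_LEN])
{
    int n = 0;

    assert(arg != 0);
    for (;;) {
        size_t len = strcspn(arg, ",");   /* length up to next comma */

        if (len == 0 || len >= NAME_LEN || n == MAX_STACK)
            return -1;
        memcpy(names[n], arg, len);
        names[n][len] = '\0';
        n++;
        if (arg[len] == '\0')
            return n;
        arg += len + 1;                   /* skip the comma */
    }
}
```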

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: fixup stacking kconfig
John Johansen [Mon, 25 Sep 2017 16:03:19 +0000 (12:03 -0400)]
UBUNTU: SAUCE: LSM stacking: fixup stacking kconfig

The stack configs in the base stacking patches are confusing and
separate the selinux/smack stacking from the other LSMs with an
"extreme" stacking entry which is extremely confusing.

Switch the "extreme" stacking to a select for mutually exclusive
LSMs, which provides a better explanation of what is happening.

Fixes: 6c5100029055 ("LSM: general but not extreme module stacking")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: fixup apparmor stacking enablement
John Johansen [Wed, 27 Sep 2017 06:10:17 +0000 (02:10 -0400)]
UBUNTU: SAUCE: LSM stacking: fixup apparmor stacking enablement

AppArmor doesn't need to register twice.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: add stacking support to apparmor network hooks
John Johansen [Wed, 27 Sep 2017 07:08:55 +0000 (03:08 -0400)]
UBUNTU: SAUCE: LSM stacking: add stacking support to apparmor network hooks

The stacking patches weren't developed against apparmor networking hooks.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: add support for stacking getpeersec_stream
John Johansen [Tue, 26 Sep 2017 19:12:35 +0000 (15:12 -0400)]
UBUNTU: SAUCE: LSM stacking: add support for stacking getpeersec_stream

getpeersec_stream needs to use the "current" display LSM set by the
prctl.

Split out the getpeersec_stream implementation from the full stacking
patch.

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: fixup: alloc_task_ctx is dead code
John Johansen [Sun, 3 Sep 2017 19:10:20 +0000 (12:10 -0700)]
UBUNTU: SAUCE: LSM stacking: fixup: alloc_task_ctx is dead code

Fixes: 7b27ec622c90 ("LSM: manage credential security blobs")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: fixup initialize task->security
John Johansen [Tue, 26 Sep 2017 19:03:19 +0000 (15:03 -0400)]
UBUNTU: SAUCE: LSM stacking: fixup initialize task->security

Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: fixup procsfs: add smack subdir to attrs
John Johansen [Thu, 31 Aug 2017 23:34:12 +0000 (16:34 -0700)]
UBUNTU: SAUCE: LSM stacking: fixup procsfs: add smack subdir to attrs

The patch that adds the smack subdirs also changes how set_procattr
is handled. It makes the assumption that each LSM will attempt
to handle the context written in turn and return ENOENT, if
the LSM doesn't handle the context and another should try.

This is wrong in two ways: LSMs may return ENOENT as an error for
a context that is valid but is not present in currently loaded
policy, and the code is based on earlier versions where each
LSM is tried in turn.

Under the current patchset there is the concept of the current LSM
which will be used by applications using the old interfaces, to
ensure those interfaces which were not designed for multiple LSM
input/output only ever have to deal with a single LSM.

Fixes: 86400b03f812 ("procfs: add smack subdir to attrs")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: Complete task_alloc hook
Casey Schaufler [Thu, 3 Aug 2017 23:51:40 +0000 (16:51 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: Complete task_alloc hook

The Task alloc hook needs to allocate the data.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: general but not extreme module stacking
Casey Schaufler [Thu, 27 Jul 2017 23:00:12 +0000 (16:00 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: general but not extreme module stacking

Leverage the infrastructure management of the credential and
file security blobs to allow stacking of security modules in
all but the most extreme case. Security modules are informed
of the location of their data within the blobs at module
initialization.

Stacking is optional. If stacking is not configured the old
limit of one "major" security module applies. If stacking is
configured any combination that does not include both SELinux
and Smack is allowed.

A subdirectory has been added to /proc/.../attr for each of
SELinux and AppArmor (Smack introduced such a subdirectory earlier)
to disambiguate what data is provided in the proc/.../attr
interfaces. An entry "context" is added to /proc/.../attr and
to each of the subdirectories. The "context" entry provides
process attribute information in the form:

        lsm-name='lsm-data'[,lsm-name='lsm-data']...
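As an illustration of the format above, the following userspace sketch builds such a string from name/data pairs. struct lsm_ctx and format_context() are hypothetical, and the sketch makes no attempt to escape quotes inside the data.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical pair: one module name and its attribute data. */
struct lsm_ctx {
    const char *name;
    const char *data;
};

/* Write name='data'[,name='data']... into buf. Returns the string
 * length, or -1 if the result does not fit in `size` bytes. */
static int format_context(char *buf, size_t size,
                          const struct lsm_ctx *ctx, int n)
{
    size_t used = 0;

    assert(buf != 0 && size > 0);
    buf[0] = '\0';
    for (int i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used, "%s%s='%s'",
                         i ? "," : "", ctx[i].name, ctx[i].data);

        if (w < 0 || (size_t)w >= size - used)
            return -1;   /* output error or truncation */
        used += (size_t)w;
    }
    return (int)used;
}
```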

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: Infrastructure management of the remaining blobs
Casey Schaufler [Thu, 27 Jul 2017 23:00:12 +0000 (16:00 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: Infrastructure management of the remaining blobs

Move management of the inode, ipc, key, msg_msg, sock and superblock
security blobs from the security modules to the infrastructure.
Use of the blob pointers is abstracted in the security modules.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: manage task security blobs
Casey Schaufler [Mon, 10 Apr 2017 22:44:25 +0000 (15:44 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: manage task security blobs

Move management of task security blobs into the security
infrastructure. Modules are required to identify the space
they require. At this time there are no modules that use
task blobs.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: Manage file security blobs
Casey Schaufler [Thu, 27 Jul 2017 23:00:12 +0000 (16:00 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: Manage file security blobs

Move the management of file security blobs from the individual
security modules to the security infrastructure. The security modules
using file blobs have been updated accordingly. Modules are required
to identify the space they need at module initialization. In some
cases a module no longer needs to supply a blob management hook, in
which case the hook has been removed.
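One way to picture this arrangement is the sketch below, in which each module states how many bytes it needs and is told where its slice begins. The names (file_blob_size, reserve_file_blob, module_blob) are invented for illustration and are not the interfaces added by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: running size of the shared per-file blob. */
static size_t file_blob_size;

/* Called once per module at initialization. Takes the number of bytes
 * the module needs and returns the offset of its slice in the blob. */
static size_t reserve_file_blob(size_t need)
{
    size_t offset = file_blob_size;

    file_blob_size += need;
    return offset;
}

/* Locate a module's slice inside a blob allocated by the infrastructure. */
static void *module_blob(void *blob, size_t offset)
{
    assert(blob != 0 && offset < file_blob_size);
    return (char *)blob + offset;
}
```

Under this scheme the infrastructure allocates file_blob_size bytes per file once, and a module only ever touches its own slice, which is why a module-specific allocation hook may become unnecessary.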

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: LSM: manage credential security blobs
Casey Schaufler [Tue, 25 Jul 2017 19:19:28 +0000 (12:19 -0700)]
UBUNTU: SAUCE: LSM stacking: LSM: manage credential security blobs

Move the management of credential security blobs from the
individual security modules to the security infrastructure.
The security modules using credential blobs have been updated
accordingly. Modules are required to identify the space they
require at module initialization. In some cases a module no
longer needs to supply a blob management hook, in which case
the hook has been removed.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: SAUCE: LSM stacking: procfs: add smack subdir to attrs
Casey Schaufler [Tue, 25 Jul 2017 18:06:52 +0000 (11:06 -0700)]
UBUNTU: SAUCE: LSM stacking: procfs: add smack subdir to attrs

Back in 2007 I made what turned out to be a rather serious
mistake in the implementation of the Smack security module.
The SELinux module used an interface in /proc to manipulate
the security context on processes. Rather than use a similar
interface, I used the same interface. The AppArmor team did
likewise. Now /proc/.../attr/current will tell you the
security "context" of the process, but it will be different
depending on the security module you're using.

This patch provides a subdirectory in /proc/.../attr for
Smack. Smack user space can use the "current" file in
this subdirectory and never have to worry about getting
SELinux attributes by mistake. Programs that use the
old interface will continue to work (or fail, as the case
may be) as before.

This patch does not include subdirectories for SELinux
or AppArmor. I do have a patch that provides those, and
will happily make it available should anyone see value
in it.

The original implementation is by Kees Cook.

Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoxhci: set missing SuperSpeedPlus Link Protocol bit in roothub descriptor
Mathias Nyman [Mon, 18 Sep 2017 14:39:18 +0000 (17:39 +0300)]
xhci: set missing SuperSpeedPlus Link Protocol bit in roothub descriptor

BugLink: http://bugs.launchpad.net/bugs/1720045
A SuperSpeedPlus roothub needs to have the Link Protocol (LP) bit set in
the bmSublinkSpeedAttr[] entry of a SuperSpeedPlus descriptor.

If the xhci controller has an optional Protocol Speed ID (PSI) table then
that will be used as a base to create the roothub SuperSpeedPlus
descriptor.
The PSI table does not, however, necessarily contain the LP bit so we need
to set it manually.

Check the psi speed and set LP bit if speed is 10Gbps or higher.
We're not setting it for 5 to 10Gbps as the USB 3.1 specification always
mentions SuperSpeedPlus for 10Gbps or higher, and some SSIC USB 3.0 speeds
can be over 5Gbps, such as SSIC-G3B-L1 at 5830 Mbps
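The speed test can be sketched as below. The field positions (exponent in bits 5:4, mantissa in bits 31:16, LP flag at bit 14) and all macro and function names are assumptions made for this illustration, not copied from the xhci driver; the threshold of 10Gbps comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions are assumptions made for this sketch. */
#define PSI_EXP(p)   (((p) >> 4) & 0x3)      /* 0=b/s 1=Kb/s 2=Mb/s 3=Gb/s */
#define PSI_MANT(p)  (((p) >> 16) & 0xffff)  /* speed mantissa */
#define SSP_LP       (1u << 14)              /* Link Protocol: SuperSpeedPlus */

/* Decode a speed entry into bits per second. */
static uint64_t psi_bits_per_sec(uint32_t psi)
{
    static const uint64_t scale[] = {
        1ULL, 1000ULL, 1000000ULL, 1000000000ULL
    };

    return (uint64_t)PSI_MANT(psi) * scale[PSI_EXP(psi)];
}

/* Set the LP flag only for entries of 10 Gb/s or faster. */
static uint32_t psi_fixup_lp(uint32_t psi)
{
    if (psi_bits_per_sec(psi) >= 10000000000ULL)
        psi |= SSP_LP;
    return psi;
}
```

An entry encoding 5830 Mb/s is left untouched by this check, while a 10 Gb/s entry gains the flag.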

Cc: <stable@vger.kernel.org> # 4.6+
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 7bea22b124d77845c85a62eaa29a85ba6cc2f899 linux-next)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: [Config] CONFIG_INTEL_RDT=y
Seth Forshee [Thu, 28 Sep 2017 19:11:40 +0000 (15:11 -0400)]
UBUNTU: [Config] CONFIG_INTEL_RDT=y

BugLink: http://bugs.launchpad.net/bugs/1591609
CONFIG_INTEL_RDT_A changed name to CONFIG_INTEL_RDT, update our
configs accordingly.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Turn off most RDT features on Skylake
Tony Luck [Thu, 24 Aug 2017 16:26:52 +0000 (09:26 -0700)]
x86/intel_rdt: Turn off most RDT features on Skylake

BugLink: http://bugs.launchpad.net/bugs/1713619
Errata list is included in this document:
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/6th-gen-x-series-spec-update.pdf
with more details in:
https://www.intel.com/content/www/us/en/processors/xeon/scalable/xeon-scalable-spec-update.html

But the tl;dr summary (using tags from first of the documents) is:
SKZ4  MBM does not accurately track write bandwidth
SKZ17 CMT counters may not count accurately
SKZ18 CAT may not restrict cacheline allocation under certain conditions
SKZ19 MBM counters may undercount

Disable all these features on Skylake models. Users who understand the
errata may re-enable using boot command line options.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: "Fenghua Yu" <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Andi Kleen" <ak@linux.intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Link: http://lkml.kernel.org/r/3aea0a3bae219062c812668bd9b7b8f1a25003ba.1503512900.git.tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
(cherry picked from commit d56593eb5eda8f593db92927059697bbf89bc4b3)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Add command line options for resource director technology
Tony Luck [Thu, 24 Aug 2017 16:26:51 +0000 (09:26 -0700)]
x86/intel_rdt: Add command line options for resource director technology

BugLink: http://bugs.launchpad.net/bugs/1713619
Command line options allow us to ignore features that we don't want.
Also we can re-enable options that have been disabled on a platform
(so long as the underlying h/w actually supports the option).

[ tglx: Marked the option array __initdata and the helper function __init ]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Fenghua Yu" <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Andi Kleen" <ak@linux.intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Link: http://lkml.kernel.org/r/0c37b0d4dbc30977a3c1cee08b66420f83662694.1503512900.git.tony.luck@intel.com
(cherry picked from commit 1d9807fc64c131a83a96917f2b2da1c9b00cf127)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Move special case code for Haswell to a quirk function
Tony Luck [Thu, 24 Aug 2017 16:26:50 +0000 (09:26 -0700)]
x86/intel_rdt: Move special case code for Haswell to a quirk function

BugLink: http://bugs.launchpad.net/bugs/1713619
No functional change, but lay the groundwork for other per-model
quirks.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Fenghua Yu" <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Peter Zijlstra" <peterz@infradead.org>
Cc: "Stephane Eranian" <eranian@google.com>
Cc: "Andi Kleen" <ak@linux.intel.com>
Cc: "David Carrillo-Cisneros" <davidcc@google.com>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Link: http://lkml.kernel.org/r/f195a83751b5f8b1d8a78bd3c1914300c8fa3142.1503512900.git.tony.luck@intel.com
(cherry picked from commit 0576113a387e0c8a5d9e24b4cd62605d1c9c0db8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Remove redundant ternary operator on return
Colin Ian King [Tue, 8 Aug 2017 09:28:59 +0000 (10:28 +0100)]
x86/intel_rdt: Remove redundant ternary operator on return

BugLink: http://bugs.launchpad.net/bugs/1591609
The use of the ternary operator is redundant as ret can never be
non-zero at that point. Instead, just return nbytes.

Detected by CoverityScan, CID#1452658 ("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170808092859.13021-1-colin.king@canonical.com
(cherry picked from commit 5707b46a4206be2068444eb6b514a1ee070651c8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Improve limbo list processing
Vikas Shivappa [Wed, 16 Aug 2017 01:00:43 +0000 (18:00 -0700)]
x86/intel_rdt/cqm: Improve limbo list processing

BugLink: http://bugs.launchpad.net/bugs/1591609
During a mkdir, the entire limbo list is synchronously checked on each
package for free RMIDs by sending IPIs. With a large number of RMIDs (SKL
has 192) this creates an intolerable amount of work in IPIs.

Replace the IPI based checking of the limbo list with asynchronous worker
threads on each package which periodically scan the limbo list and move the
RMIDs that have:

llc_occupancy < threshold_occupancy

on all packages to the free list.

mkdir now returns -ENOSPC if the free list and the limbo list are empty or
returns -EBUSY if there are RMIDs on the limbo list and the free list is
empty.
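The allocation rule and the periodic scan can be modelled roughly as follows. This sketch tracks only list lengths and a single occupancy reading per RMID, whereas the real code checks every package; struct rmid_pool, alloc_rmid() and scan_limbo() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model: only the lengths of the two lists are tracked. */
struct rmid_pool {
    int nfree;    /* RMIDs ready for reuse */
    int nlimbo;   /* RMIDs released but possibly still occupying cache */
};

/* Take one RMID from the free list. Returns 0 on success, -EBUSY if
 * only limbo RMIDs remain, -ENOSPC if both lists are empty. */
static int alloc_rmid(struct rmid_pool *p)
{
    assert(p != 0);
    if (p->nfree == 0)
        return p->nlimbo ? -EBUSY : -ENOSPC;
    p->nfree--;
    return 0;
}

/* One pass of the periodic worker. occupancy[i] is the reading for the
 * i-th limbo RMID; entries below the threshold move to the free list. */
static void scan_limbo(struct rmid_pool *p, const unsigned *occupancy,
                       unsigned threshold)
{
    int n = p->nlimbo;

    for (int i = 0; i < n; i++) {
        if (occupancy[i] < threshold) {
            p->nlimbo--;
            p->nfree++;
        }
    }
}
```

A caller that sees -EBUSY can retry later, since a subsequent scan may have moved RMIDs to the free list; -ENOSPC means retrying cannot help until something is released.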

Getting rid of the IPIs also simplifies the data structures and the
serialization required for handling the lists.

[ tglx: Rewrote changelog ... ]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Link: http://lkml.kernel.org/r/1502845243-20454-3-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 24247aeeabe99eab13b798ccccc2dec066dd6f07)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug
Vikas Shivappa [Wed, 16 Aug 2017 01:00:42 +0000 (18:00 -0700)]
x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug

BugLink: http://bugs.launchpad.net/bugs/1591609
When a CPU is dying, the overflow worker is canceled and rescheduled on a
different CPU in the same domain. But if the timer is already about to
expire, this essentially doubles the interval, which might result in an
undetected overflow.

Cancel the overflow worker and reschedule it immediately on a different CPU
in the same domain. The work could be flushed as well, but that would
reschedule it on the same CPU.

[ tglx: Rewrote changelog once again ]

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Link: http://lkml.kernel.org/r/1502845243-20454-2-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit bbc4615e0b7df5e21d0991adb4b2798508354924)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Modify the intel_pqr_state for better performance
Vikas Shivappa [Wed, 9 Aug 2017 18:46:34 +0000 (11:46 -0700)]
x86/intel_rdt: Modify the intel_pqr_state for better performance

BugLink: http://bugs.launchpad.net/bugs/1591609
Currently we have pqr_state and rdt_default_state which store the cached
CLOSID/RMIDs and the user configured cpu default values respectively. We
touch both of these during context switch. Put all of them in one
structure so that we can spare a cache line.
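A sketch of the combined layout, with field names and the fallback rule assumed for illustration (the actual structure in the patch may differ): both the cached and the default values sit in one small structure, and the simulated context switch writes the register only when a value changes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: cached and default values side by side, 16 bytes. */
struct pqr_state {
    uint32_t cur_rmid;
    uint32_t cur_closid;
    uint32_t default_rmid;
    uint32_t default_closid;
};

static int msr_writes;   /* counts simulated register writes */

/* Simulated context switch. Assumption: a task id of 0 means "use the
 * per-CPU default". The register is written only on a change. */
static void switch_to(struct pqr_state *s, uint32_t task_closid,
                      uint32_t task_rmid)
{
    uint32_t closid, rmid;

    assert(s != 0);
    closid = task_closid ? task_closid : s->default_closid;
    rmid = task_rmid ? task_rmid : s->default_rmid;
    if (closid != s->cur_closid || rmid != s->cur_rmid) {
        s->cur_closid = closid;
        s->cur_rmid = rmid;
        msr_writes++;
    }
}
```

Everything the switch path reads or writes lives in the same 16 bytes, which is the property the commit is after.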

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: sai.praneeth.prakhya@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Link: http://lkml.kernel.org/r/1502304395-7166-3-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit a9110b552d44fedbd1221eb0e5bde81da32d9350)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Clear the default RMID during hotcpu
Vikas Shivappa [Wed, 9 Aug 2017 18:46:33 +0000 (11:46 -0700)]
x86/intel_rdt/cqm: Clear the default RMID during hotcpu

BugLink: http://bugs.launchpad.net/bugs/1591609
The user configured per cpu default RMID is not cleared during cpu
hotplug. This may lead to incorrect RMID values after a cpu goes offline
and again comes back online. Clear the per cpu default RMID during cpu
offline and online handling.

Reported-by: Prakhya Sai Praneeth <sai.praneeth.prakhya@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Link: http://lkml.kernel.org/r/1502304395-7166-2-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit eda61c265f3656be8345fdf0334b3a77829437fc)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Show bitmask of shareable resource with other executing units
Fenghua Yu [Tue, 25 Jul 2017 22:39:04 +0000 (15:39 -0700)]
x86/intel_rdt: Show bitmask of shareable resource with other executing units

BugLink: http://bugs.launchpad.net/bugs/1591609
CPUID.(EAX=0x10, ECX=res#):EBX[31:0] reports a bit mask for a resource.
Each set bit within the length of the CBM indicates the corresponding
unit of the resource allocation may be used by other entities in the
platform (e.g. an integrated graphics engine or hardware units outside
the processor core that have direct access to the resource). Each
cleared bit within the length of the CBM indicates the corresponding
allocation unit can be configured to implement a priority-based
allocation scheme without interference with other hardware agents in
the system. Bits outside the length of the CBM are reserved.

More details on the bit mask are described in x86 Software Developer's
Manual.

The bitmask is shown in "info" directory for each resource. It's
up to the user to decide how to use the bitmask within a CBM in a partition
to share or isolate a resource with other executing units.
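As a small worked example of using the bitmask, the hypothetical helper below computes which bits of a CBM are not shared with other agents, given the shareable mask and the CBM length. It illustrates the bit arithmetic only and does not read CPUID or the "info" directory.

```c
#include <assert.h>
#include <stdint.h>

/* Return the bits of a cbm_len-bit capacity bitmask that are NOT marked
 * shareable, i.e. allocation units free of other hardware agents.
 * Bits outside the CBM length are masked off as reserved. */
static uint32_t exclusive_bits(uint32_t shareable, unsigned cbm_len)
{
    uint32_t valid;

    assert(cbm_len >= 1 && cbm_len <= 32);
    valid = cbm_len == 32 ? UINT32_MAX : (1u << cbm_len) - 1;
    return ~shareable & valid;
}
```

For an 11 bit CBM with a shareable mask of 0x600, the helper returns 0x1ff: the low nine units can be used for isolation, the top two are shared.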

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: vikas.shivappa@linux.intel.com
Link: http://lkml.kernel.org/r/20170725223904.12996-1-tony.luck@intel.com
(cherry picked from commit 0dd2d7494cd818d06a2ae1cd840cd62124a2d25e)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/mbm: Handle counter overflow
Vikas Shivappa [Tue, 25 Jul 2017 21:14:47 +0000 (14:14 -0700)]
x86/intel_rdt/mbm: Handle counter overflow

BugLink: http://bugs.launchpad.net/bugs/1591609
Set up a delayed work queue for each domain that will read all
the MBM counters of active RMIDs once per second to make sure
that they don't wrap around between reads from users.
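
The reason periodic reads help can be sketched as follows: if a counter
of fixed width wraps at most once between two reads, the modular
difference still gives the correct increment. This is a simplified
model, not the kernel code; the default width of 24 bits is an
assumption made for the example.

```python
def accumulate(total, prev, cur, width=24):
    """Add the units counted since the previous read to a running total.

    Assumes the counter is `width` bits wide (24 is an assumption here)
    and wrapped at most once since `prev` was sampled.
    """
    mask = (1 << width) - 1
    delta = (cur - prev) & mask   # modular difference survives a single wrap
    return total + delta
```

If reads were spaced far enough apart for the counter to wrap twice, the
delta would silently be too small, which is what regular polling is
meant to prevent.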

[Tony: Added the initializations for the work structure and completed
the patch]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-29-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit e33026831bdb5f051499fec6a606f79fe1f94cc8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/mbm: Add mbm counter initialization
Vikas Shivappa [Tue, 25 Jul 2017 21:14:46 +0000 (14:14 -0700)]
x86/intel_rdt/mbm: Add mbm counter initialization

BugLink: http://bugs.launchpad.net/bugs/1591609
MBM counters are monotonically increasing counts representing the total
memory bytes at a particular time. In order to calculate total_bytes for
an rdtgroup, we store the value of the counter when we create an
rdtgroup or when a new domain comes online.

When the total_bytes (all memory controller bytes) or local_bytes (local
memory controller bytes) file in "mon_data" is read, it shows the
total bytes for that rdtgroup since its creation. The user can snapshot this
at different times to obtain bytes/second.
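
A minimal sketch of the bytes/second calculation from two snapshots.
The helper name and the (timestamp, bytes) tuple layout are
illustrative assumptions, not part of the interface.

```python
def bytes_per_second(first, second):
    """Each argument is a (timestamp_seconds, total_bytes) snapshot."""
    (t0, b0), (t1, b1) = first, second
    if t1 <= t0:
        raise ValueError("snapshots must be taken at increasing times")
    return (b1 - b0) / (t1 - t0)
```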

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-28-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit a4de1dfdd726537e2a78b55759fc646d9b0a0be8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/mbm: Basic counting of MBM events (total and local)
Tony Luck [Tue, 25 Jul 2017 21:14:45 +0000 (14:14 -0700)]
x86/intel_rdt/mbm: Basic counting of MBM events (total and local)

BugLink: http://bugs.launchpad.net/bugs/1591609
Check CPUID bits for whether each of the MBM events is supported.
Allocate space for each RMID for each counter in each domain to save the
previous MSR counter value and the running total of data.
Create files in each of the monitor directories.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-27-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 9f52425ba303d91c8370719e91d7e578bfdf309f)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add CPU hotplug support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:44 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add CPU hotplug support

BugLink: http://bugs.launchpad.net/bugs/1591609
Resource groups have a per domain directory under "mon_data". Add or
remove these directories as and when domains come online and go offline.
Also update the per-CPU RMIDs and cache when CPUs come online or go
offline.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-26-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 895c663ecef16c8138e20a7d5c052e0fcc400241)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add sched_in support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:43 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add sched_in support

BugLink: http://bugs.launchpad.net/bugs/1591609
The OS associates an RMID/CLOSid with a task by writing the per-CPU
IA32_PQR_ASSOC MSR when the task is scheduled in.

The sched_in code remains a no-op unless we are running on an Intel SKU
that supports either resource control or monitoring and we also enable
them by mounting the resctrl fs. The per-CPU CLOSid/RMID values are
cached and the write is performed only when a task with a different
CLOSid/RMID is scheduled in.
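
The caching behaviour can be modelled roughly as below. This is a
user-space simulation for illustration only: the class, its fields and
the `enabled` parameter are hypothetical stand-ins, and the counter
takes the place of the real MSR write.

```python
class CpuPqrState:
    """Per-CPU cache of the last CLOSid/RMID written to IA32_PQR_ASSOC."""

    def __init__(self):
        self.closid = 0
        self.rmid = 0
        self.msr_writes = 0   # counts simulated MSR writes

    def sched_in(self, closid, rmid, enabled=True):
        """Return True if a (simulated) MSR write was needed."""
        if not enabled:                   # resctrl not mounted: stay a no-op
            return False
        if (closid, rmid) == (self.closid, self.rmid):
            return False                  # same IDs as cached: skip the write
        self.closid, self.rmid = closid, rmid
        self.msr_writes += 1              # stands in for the real MSR write
        return True
```

Scheduling in a run of tasks that share the same CLOSid/RMID therefore
costs one write rather than one per context switch.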

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-25-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 748b6b881ccdda8f0663c68605f431279e06f49a)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Introduce rdt_enable_key for scheduling
Vikas Shivappa [Tue, 25 Jul 2017 21:14:42 +0000 (14:14 -0700)]
x86/intel_rdt: Introduce rdt_enable_key for scheduling

BugLink: http://bugs.launchpad.net/bugs/1591609
Introduce the usage of rdt_enable_key in sched_in code as a preparation
to add RDT monitoring support for sched_in.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-24-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 4be6c078428b08d1a948cc09faca8f1326231866)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add mount,umount support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:41 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add mount,umount support

BugLink: http://bugs.launchpad.net/bugs/1591609
Add monitoring support during mount and unmount. Since the root directory is
a "ctrl_mon" directory which can both control and monitor resources, create
a "mon_groups" directory which can hold monitor groups and a
"mon_data" directory which holds all monitoring data, as in the rest
of the resource groups.

The mount succeeds if either monitoring or control/allocation is
enabled. If only monitoring is enabled, the user can still create monitor
groups under "/sys/fs/resctrl/mon_groups/", but any mkdir directly under
root fails. If only control/allocation is enabled, none of the monitoring
related directories/files exist and resctrl works in
legacy mode.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-23-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 4af4a88e0c9246990f95c88eeba781265f27c58e)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add rmdir support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:40 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add rmdir support

BugLink: http://bugs.launchpad.net/bugs/1591609
Resource groups (ctrl_mon and monitor groups) are represented by
directories in resctrl fs. Add support to remove the directories.

When a ctrl_mon directory is removed all the cpus and tasks are assigned
back to the root rdtgroup. When a monitor group is removed the cpus and
tasks are returned to the parent control group.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-22-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit f3cbeacaa06e2b8c2f3ce8531e9aa3fe1f2762cd)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Separate the ctrl bits from rmdir
Vikas Shivappa [Tue, 25 Jul 2017 21:14:39 +0000 (14:14 -0700)]
x86/intel_rdt: Separate the ctrl bits from rmdir

BugLink: http://bugs.launchpad.net/bugs/1591609
Refactor the code to separate the ctrl group removal from the rmdir to
prepare for adding RDT monitoring group removal.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-21-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit f9049547f7e72377049d717354b2f56f36a5854a)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add mon_data
Vikas Shivappa [Tue, 25 Jul 2017 21:14:38 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add mon_data

BugLink: http://bugs.launchpad.net/bugs/1591609
Add a mon_data directory for the root rdtgroup and all other rdtgroups.
The directory holds all of the monitored data for all domains and events
of all resources being monitored.

The mon_data directory itself contains a list of directories named
mon_<domain_name>_<domain_id>. Each of these subdirectories contains one
file per event with mode "0444". Reading the file displays a snapshot
of the monitored data for the event the file represents.

For example, on a 2-socket Broadwell with llc_occupancy being
monitored, the mon_data contents look as follows:

$ ls /sys/fs/resctrl/p1/mon_data/
mon_L3_00
mon_L3_01

Each domain directory has one file per event:
$ ls /sys/fs/resctrl/p1/mon_data/mon_L3_00/
llc_occupancy

To read the current llc_occupancy of ctrl_mon group p1:
$ cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
33789096
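
The same layout can be walked from a script. The sketch below is
illustrative only: the function name is hypothetical and it assumes
every event file holds a single decimal integer, as in the example
above.

```python
from pathlib import Path


def read_mon_data(group_dir):
    """Return {domain_dir_name: {event_name: value}} for one resource group."""
    result = {}
    for domain in sorted(Path(group_dir, "mon_data").glob("mon_*")):
        if domain.is_dir():
            result[domain.name] = {
                event.name: int(event.read_text().strip())
                for event in sorted(domain.iterdir())
                if event.is_file()
            }
    return result
```

For the group above it would be called as
read_mon_data("/sys/fs/resctrl/p1").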

[This patch idea is based on Tony's sample patches to organise data in a
per domain directory and have one file per event (and use the fp->priv to
store mon data bits)]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-20-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit d89b7379015fc561060a4094676d143e6ed264e7)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Prepare for RDT monitor data support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:37 +0000 (14:14 -0700)]
x86/intel_rdt: Prepare for RDT monitor data support

BugLink: http://bugs.launchpad.net/bugs/1591609
Rename the intel_rdt_schemata file to intel_rdt_ctrlmondata as we now
want to add support for RDT monitoring data for the events that are
supported in later patches.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-19-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 90c403e83101c87ee9e6df8c8d30ea8628ff8bfc)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add cpus file support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:36 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add cpus file support

BugLink: http://bugs.launchpad.net/bugs/1591609
The cpus file is extended to support resource monitoring. This is used
to override the RMID of the default group when running on specific
CPUs. It works similarly to resource control. The "cpus" and
"cpus_list" files are present in the default group, ctrl_mon groups and
monitor groups.

Reading a "cpus" or "cpus_list" file returns a cpumask or list showing which
CPUs belong to the resource group. By default all online CPUs belong to
the default root group. A CPU can be present in one "ctrl_mon" and one
"monitor" group simultaneously. CPUs can be added to a resource group by
writing the CPU to the file. When a CPU is added to a ctrl_mon group it
is automatically removed from the previous ctrl_mon group. A CPU can be
added to a monitor group only if it is present in the parent ctrl_mon
group and when a CPU is added to a monitor group, it is automatically
removed from the previous monitor group. When CPUs go offline, they are
automatically removed from the ctrl_mon and monitor groups.
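
For scripts that consume the list form, a small parser may be handy.
The sketch assumes "cpus_list" uses the usual comma-separated range
notation (for example "0-3,8"); that format and the function name are
assumptions, not something this patch specifies.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "0-3,8,10-11" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue                      # an empty file means no CPUs
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return sorted(cpus)
```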

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-18-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit a9fcf8627dc01049c390023bbb0323db3c785b91)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Prepare to add RDT monitor cpus file support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:35 +0000 (14:14 -0700)]
x86/intel_rdt: Prepare to add RDT monitor cpus file support

BugLink: http://bugs.launchpad.net/bugs/1591609
Separate the ctrl cpus file handling from the generic cpus file handling
and convert the per cpu closid from u32 to a struct which will be used
later to add rmid to the same struct. Also clean up some of the namespace.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-17-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit b09d981b3f346690dafa3e4ebedfcf3e44b68e83)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add tasks file support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:34 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add tasks file support

BugLink: http://bugs.launchpad.net/bugs/1591609
The root directory, ctrl_mon and monitor groups are populated
with a read/write file named "tasks". When read, it shows all the task
IDs assigned to the resource group.

Tasks can be added to groups by writing the PID to the file. A task can
be present in one "ctrl_mon" group and one "monitor" group at the same
time, i.e. a PID_x can be seen in both a ctrl_mon group and a monitor
group. When a task is added to a ctrl_mon group, it is automatically
removed from the previous ctrl_mon group to which it belonged. Similarly, if
a task is moved to a monitor group it is removed from the previous
monitor group. Also, since a monitor group can only hold a subset of the
tasks of its parent ctrl_mon group, a task can be moved to a monitor group
only if it is already present in the parent ctrl_mon group.

Task membership is indicated by a new field in the task_struct, "u32
rmid", which holds the RMID for the task. RMID=0 is reserved for the
default root group, to which the tasks belong at mount.
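
The membership rules above can be summarised in a toy model. Everything
here is hypothetical (class, method names, the exception type), and
dropping the monitor membership when a task changes ctrl_mon group is
an assumption made for the sketch.

```python
class TaskGroups:
    """Toy model: each task has one ctrl_mon group and at most one monitor group."""

    def __init__(self):
        self.ctrl = {}   # pid -> ctrl_mon group name ("" is the root group)
        self.mon = {}    # pid -> (parent ctrl_mon group, monitor group name)

    def move_to_ctrl_mon(self, pid, group):
        self.ctrl[pid] = group        # implicitly leaves the previous ctrl_mon group
        self.mon.pop(pid, None)       # assumption: old monitor membership is dropped

    def move_to_monitor(self, pid, parent, mon_group):
        if self.ctrl.get(pid, "") != parent:
            raise PermissionError("task is not in the parent ctrl_mon group")
        self.mon[pid] = (parent, mon_group)   # replaces any previous monitor group
```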

[tony: zero the rmid if rdtgroup was deleted when task was being moved]

Signed-off-by: Tony Luck <tony.luck@linux.intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-16-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit d6aaba615a482ce7d3ec218cf7b8d02d0d5753b8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Change closid type from int to u32
Vikas Shivappa [Tue, 25 Jul 2017 21:14:33 +0000 (14:14 -0700)]
x86/intel_rdt: Change closid type from int to u32

BugLink: http://bugs.launchpad.net/bugs/1591609
The OS associates a CLOSid (Class of Service ID) with a task by writing the
high 32 bits of the per-CPU IA32_PQR_ASSOC MSR when the task is scheduled in.
CPUID.(EAX=10H, ECX=1):EDX[15:0] enumerates the max CLOSID supported and
it is zero indexed. Hence change the type to u32 from int.
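
The two bit layouts mentioned above can be written out as a short
sketch. The helper names are hypothetical, and treating the whole low
half of the MSR as the RMID field is a simplification.

```python
def pqr_assoc_value(closid, rmid):
    """Compose a 64-bit IA32_PQR_ASSOC value: CLOSid in bits 63:32, RMID in the low half."""
    return ((closid & 0xFFFFFFFF) << 32) | (rmid & 0xFFFFFFFF)


def num_closids(edx):
    """EDX[15:0] holds the highest CLOSid, counted from zero, so add one."""
    return (edx & 0xFFFF) + 1
```

Since both fields are non-negative by construction, an unsigned type is
the natural fit.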

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-15-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 0734ded1abee9439b0c5d7b62af1ead78aab895b)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add mkdir support for RDT monitoring
Vikas Shivappa [Tue, 25 Jul 2017 21:14:32 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add mkdir support for RDT monitoring

BugLink: http://bugs.launchpad.net/bugs/1591609
Resource control groups can be created using mkdir in the resctrl
fs (rdtgroup). In order to extend the resctrl interface to support
monitoring the control groups, extend the current mkdir to support
resource monitoring as well.

This allows an rdtgroup created under the root directory to
both control and monitor resources (ctrl_mon group). The ctrl_mon groups
are associated with one CLOSID like the legacy rdtgroups and one
RMID (Resource Monitoring ID) as well. Hardware uses the RMID to track
resource usage. Once either the CLOSIDs or the RMIDs are exhausted, the
mkdir fails with -ENOSPC. If no RMID is free but some are still on the limbo
list, -EBUSY is returned. The user can also monitor a subset of the ctrl_mon
rdtgroup's tasks/cpus using monitor groups. Monitor groups are
created using mkdir under the "mon_groups" directory in every ctrl_mon
group.
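
The choice between the two error codes can be sketched as a small
decision function. This is an illustration of the rule as described
here, with hypothetical names, not the kernel's allocator.

```python
import errno


def alloc_id_status(free_count, limbo_count):
    """Return 0 on success or a negative errno, following the rule above."""
    if free_count > 0:
        return 0
    if limbo_count > 0:
        return -errno.EBUSY     # IDs exist but are still waiting in limbo
    return -errno.ENOSPC        # nothing free and nothing pending
```

A caller seeing -EBUSY can reasonably retry later, whereas -ENOSPC
means a group has to be removed first.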

[Merged Tony's code: Removed a lot of common mkdir code, a fix to handling
of the list of the child rdtgroups and some cleanups in list
traversal. Also the changes to have similar alloc and free for CLOS/RMID
and return -EBUSY when RMIDs are in limbo and not free]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-14-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit c7d9aac6131148abe29ed1dc6bd73ad1213d1f56)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Prepare for RDT monitoring mkdir support
Vikas Shivappa [Tue, 25 Jul 2017 21:14:31 +0000 (14:14 -0700)]
x86/intel_rdt: Prepare for RDT monitoring mkdir support

BugLink: http://bugs.launchpad.net/bugs/1591609
Separate the ctrl mkdir code from the rest in order to prepare for
adding RDT monitoring mkdir support as well.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-13-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 65b4f403057e32da73c36e33d403890173c4c324)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add info files for RDT monitoring
Vikas Shivappa [Tue, 25 Jul 2017 21:14:30 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add info files for RDT monitoring

BugLink: http://bugs.launchpad.net/bugs/1591609
Add info directory files specific to RDT monitoring.

 num_rmids:
    The number of RMIDs which are valid for the resource.

 mon_features:
    Lists the monitoring events if monitoring is enabled for the
    resource.

 max_threshold_occupancy:
    This is specific to llc_occupancy monitoring and is used to
    determine if an RMID can be reused. Provides an upper bound on the
    threshold and is shown to the user in bytes though the internal
    value will be rounded to the scaling factor supported by the h/w.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-12-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit d4ab33201029913b594ae785a9665f45040396ab)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Simplify info and base file lists
Tony Luck [Tue, 25 Jul 2017 21:14:29 +0000 (14:14 -0700)]
x86/intel_rdt: Simplify info and base file lists

BugLink: http://bugs.launchpad.net/bugs/1591609
The info directory files and base files need to be different for each
resource, such as cache and memory bandwidth. Within each resource, the
files differ further between monitoring and ctrl. This leads to
a lot of different static array declarations given that we are adding
resctrl monitoring.

Simplify this to one common list of files and then declare a set of
flags to choose the files based on the resource, whether it is an info or
base file, and whether it is a control type file. This is in preparation
for including monitoring based info and base files.
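
The idea of one shared table filtered by flags can be sketched as
follows. The flag names and the tagging of each file are hypothetical
and chosen only to show the selection logic; they are not the
definitions used in the patch.

```python
from enum import IntFlag


class FileFlags(IntFlag):      # hypothetical flag names, for illustration only
    INFO = 1
    BASE = 2
    CTRL = 4
    MON = 8


# One shared table; each entry is tagged with every flag it requires.
FILES = [
    ("num_closids", FileFlags.INFO | FileFlags.CTRL),
    ("num_rmids", FileFlags.INFO | FileFlags.MON),
    ("tasks", FileFlags.BASE),
    ("schemata", FileFlags.BASE | FileFlags.CTRL),
]


def select_files(context):
    """Return the names whose required flags are all present in `context`."""
    return [name for name, need in FILES if (need & context) == need]
```

Adding a new category then means adding a flag and tagging entries,
rather than declaring another static array.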

No functional change.

[Vikas: Extended the flags to have a few bits per category like resource,
info/base etc]

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-11-git-send-email-vikas.shivappa@linux.intel.com
(backported from commit 5dc1d5c6bac2cfe3420cf353dfb0ef2e543f7c10)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
 Conflicts:
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c

6 years agox86/intel_rdt/cqm: Add RMID (Resource monitoring ID) management
Vikas Shivappa [Tue, 25 Jul 2017 21:14:28 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add RMID (Resource monitoring ID) management

BugLink: http://bugs.launchpad.net/bugs/1591609
Hardware uses the RMID (Resource Monitoring ID) to keep track of each of the
RDT events associated with tasks. The number of RMIDs is dependent on
the SKU and is enumerated via CPUID. We add support to manage the RMIDs,
which includes managing RMID allocation and reading LLC occupancy
for an RMID.

RMID allocation is managed by keeping a free list which is initialized
to all available RMIDs except for RMID 0, which is always reserved for
the root group. RMIDs go to a limbo list once they are freed, since the
RMIDs are still tagged to cache lines of the tasks which were using them
and therefore still have some occupancy. They remain on the limbo list
until the occupancy < threshold_occupancy. The threshold_occupancy is a
user configurable value.

The OS uses the IA32_QM_CTR MSR to read the occupancy associated with an RMID
after programming the IA32_QM_EVTSEL MSR with the RMID.
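
The free/limbo life cycle can be modelled in a few lines. This is a toy
user-space sketch with hypothetical names; `read_occupancy` stands in
for the MSR-based occupancy read and is supplied by the caller.

```python
class RmidPool:
    """Toy free/limbo bookkeeping; RMID 0 is never handed out."""

    def __init__(self, num_rmids, threshold):
        self.free = list(range(1, num_rmids))   # RMID 0 stays with the root group
        self.limbo = []
        self.threshold = threshold

    def alloc(self):
        """Return a free RMID, or None if the free list is empty."""
        return self.free.pop(0) if self.free else None

    def release(self, rmid):
        self.limbo.append(rmid)                 # freed RMIDs may still own cache lines

    def recycle(self, read_occupancy):
        """Move limbo RMIDs whose occupancy fell below the threshold back to free."""
        still_busy = []
        for rmid in self.limbo:
            if read_occupancy(rmid) < self.threshold:
                self.free.append(rmid)
            else:
                still_busy.append(rmid)
        self.limbo = still_busy
```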

[Tony: Improved limbo search]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-10-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit edf6fa1c4a951b3a03e94b63e6483c5d9da3ab11)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Add RDT monitoring initialization
Vikas Shivappa [Tue, 25 Jul 2017 21:14:27 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Add RDT monitoring initialization

BugLink: http://bugs.launchpad.net/bugs/1591609
Add common data structures for RDT resource monitoring and perform RDT
monitoring related data structure initializations, which include setting
up the RMID (Resource Monitoring ID) lists and the event list which the
resource supports.

[ tony: some cleanup to make adding MBM easier later, remove "cqm" from
   some names, make some data structure local to intel_rdt_monitor.c
   static. Add copyright header]

[ tglx: Made it readable ]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-9-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 6a445edce657810992594c7b9e679219aaf78ad9)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Make rdt_resources_all more readable
Vikas Shivappa [Tue, 25 Jul 2017 21:14:26 +0000 (14:14 -0700)]
x86/intel_rdt: Make rdt_resources_all more readable

BugLink: http://bugs.launchpad.net/bugs/1591609
Change the format of the global rdt_resources_all. This holds all the
RDT resource structure initialization values. Make this more readable by
using the format:

rdt_resources_all[] = {
    [<resource_index>] = {
        ...
    }
    ...
}

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-8-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit dd131853f3fbc1c3aa051c34a2967c2f76309024)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Cleanup namespace to support RDT monitoring
Vikas Shivappa [Tue, 25 Jul 2017 21:14:25 +0000 (14:14 -0700)]
x86/intel_rdt: Cleanup namespace to support RDT monitoring

BugLink: http://bugs.launchpad.net/bugs/1591609
A few of the data structures have generic names although they are RDT
allocation specific. Rename them to be allocation specific to
accommodate RDT monitoring. E.g. s/enabled/alloc_enabled/

No functional change.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-7-git-send-email-vikas.shivappa@linux.intel.com
(backported from commit 1b5c0b7583173b787b5c93ff89838a950d0e23ff)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
 Conflicts:
arch/x86/kernel/cpu/intel_rdt.c

6 years agox86/intel_rdt: Mark rdt_root and closid_alloc as static
Reinette Chatre [Tue, 25 Jul 2017 21:14:24 +0000 (14:14 -0700)]
x86/intel_rdt: Mark rdt_root and closid_alloc as static

BugLink: http://bugs.launchpad.net/bugs/1591609
Sparse reports that both of these can be static.

Make it so.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Link: http://lkml.kernel.org/r/1501017287-28083-6-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit cb2200e967c65519ca6c5426644a49dca65f6294)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Change file names to accommodate RDT monitor code
Vikas Shivappa [Tue, 25 Jul 2017 21:14:23 +0000 (14:14 -0700)]
x86/intel_rdt: Change file names to accommodate RDT monitor code

BugLink: http://bugs.launchpad.net/bugs/1591609
Because the "perf cqm" and resctrl code were added separately and are
individually configurable, there is separate context switch code
and also things in the global .h which are not really needed.

Move only the scheduling specific code and definitions to
<asm/intel_rdt_sched.h> and put all the other declarations in a
local intel_rdt.h.

h/t to Reinette Chatre for pointing out that we should separate the
public interfaces used by other parts of the kernel from private
objects shared between the various files comprising RDT.

No functional change.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-5-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 0583020456cea9fcf43b84bb13a41eab059ae0a8)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt: Introduce a common compile option for RDT
Vikas Shivappa [Tue, 25 Jul 2017 21:14:22 +0000 (14:14 -0700)]
x86/intel_rdt: Introduce a common compile option for RDT

BugLink: http://bugs.launchpad.net/bugs/1591609
We currently have a CONFIG_RDT_A which is for the RDT (Resource Director
Technology) allocation based resctrl filesystem interface. In
preparation for adding support for RDT monitoring to the same
resctrl filesystem, change the config option to CONFIG_RDT, which
covers both RDT allocation and monitoring code.

No functional change.

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit f01d7d51f577b5dc0fa5919ab8a9228e2bf49f3e)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/intel_rdt/cqm: Documentation for resctrl based RDT Monitoring
Vikas Shivappa [Tue, 25 Jul 2017 21:14:21 +0000 (14:14 -0700)]
x86/intel_rdt/cqm: Documentation for resctrl based RDT Monitoring

BugLink: http://bugs.launchpad.net/bugs/1591609
Add a description of the resctrl based RDT (Resource Director Technology)
monitoring extension and its usage.

[Tony: Added descriptions for how monitoring and allocation are measured
and some cleanups]

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-3-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit 1640ae9471ae41eb18d2b214f1f40af3c4ed3828)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agox86/perf/cqm: Wipe out perf based cqm
Vikas Shivappa [Tue, 25 Jul 2017 21:14:20 +0000 (14:14 -0700)]
x86/perf/cqm: Wipe out perf based cqm

BugLink: http://bugs.launchpad.net/bugs/1591609
'perf cqm' never worked due to the incompatibility between perf
infrastructure and cqm hardware support.  The hardware uses RMIDs to
track the llc occupancy of tasks and these RMIDs are per package. This
makes monitoring a hierarchy like cgroup along with monitoring of tasks
separately difficult and several patches sent to lkml to fix them were
NACKed. Furthermore, the following issues in the current perf cqm make
it almost unusable:

    1. No support to monitor the same group of tasks for which we do
    allocation using resctrl.

    2. It gives random and inaccurate data (mostly 0s) once we run out
    of RMIDs due to issues in Recycling.

    3. Recycling results in inaccuracy of data because we cannot
    guarantee that the RMID was stolen from a task when it was not
    pulling data into cache or even when it pulled the least data. Also
    for monitoring llc_occupancy, if we stop using an RMID_x and then
    start using an RMID_y after we reclaim an RMID from another event,
    we miss accounting all the occupancy that was tagged to RMID_x at a
    later perf_count.

    4. Recycling code makes the monitoring code complex including
    scheduling because the event can lose its RMID at any time. Since MBM
    counters count bandwidth for a period of time by taking snapshots of
    total bytes at two different times, recycling complicates the way we
    count MBM in a hierarchy. Also we need a spin lock while we do the
    processing to account for MBM counter overflow. We also currently
    use a spin lock in scheduling to prevent the RMID from being taken
    away.

    5. Lack of support when we run different kinds of events like task,
    system-wide and cgroup events together. Data mostly prints 0s. This
    is also because we can have only one RMID tied to a cpu as defined
    by the cqm hardware but perf can at the same time tie multiple
    events during one sched_in.

    6. No support for monitoring a group of tasks. There is partial support
    for cgroup but it does not work once there is a hierarchy of cgroups
    or if we want to monitor a task in a cgroup and the cgroup itself.

    7. No support for monitoring tasks for the lifetime without perf
    overhead.

    8. It reported the aggregate cache occupancy or memory bandwidth over
    all sockets. But most cloud and VMM based use cases want to know the
    individual per-socket usage.
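
The MBM point in the list above (bandwidth derived from two snapshots of a
byte counter, with overflow to account for) can be pictured with a minimal
sketch. This is not code from this series: the 24-bit counter width, the
chunk scaling and every name below are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical counter width; real hardware reports its own width. */
#define MBM_SKETCH_WIDTH 24
#define MBM_SKETCH_MASK  ((UINT64_C(1) << MBM_SKETCH_WIDTH) - 1)

/*
 * Chunks counted between two snapshots of a free-running counter.
 * Masking the difference handles a single wraparound between reads.
 */
static uint64_t mbm_sketch_delta(uint64_t prev, uint64_t cur)
{
	return (cur - prev) & MBM_SKETCH_MASK;
}

/* Average bytes per second over an interval given in milliseconds. */
static uint64_t mbm_sketch_bw(uint64_t prev, uint64_t cur,
			      uint64_t bytes_per_chunk, uint64_t interval_ms)
{
	if (interval_ms == 0)
		return 0;
	return mbm_sketch_delta(prev, cur) * bytes_per_chunk * 1000 / interval_ms;
}
```

If the RMID behind the counter is recycled between the two reads, the
difference no longer belongs to one group of tasks, which is one way to see
why recycling complicates the accounting.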

Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: ravi.v.shankar@intel.com
Cc: tony.luck@intel.com
Cc: fenghua.yu@intel.com
Cc: peterz@infradead.org
Cc: eranian@google.com
Cc: vikas.shivappa@intel.com
Cc: ak@linux.intel.com
Cc: davidcc@google.com
Cc: reinette.chatre@intel.com
Link: http://lkml.kernel.org/r/1501017287-28083-2-git-send-email-vikas.shivappa@linux.intel.com
(cherry picked from commit c39a0e2c8850f08249383f2425dbd8dbe4baad69)
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoKVM: VMX: Do not BUG() on out-of-bounds guest IRQ
Jan H. Schönherr [Thu, 7 Sep 2017 18:02:30 +0000 (19:02 +0100)]
KVM: VMX: Do not BUG() on out-of-bounds guest IRQ

The value of the guest_irq argument to vmx_update_pi_irte() is
ultimately coming from a KVM_IRQFD API call. Do not BUG() in
vmx_update_pi_irte() if the value is out-of-bounds. (Especially,
since KVM as a whole seems to hang after that.)

Instead, print a message only once if we find that we don't have a
route for a certain IRQ (which can be out-of-bounds or within the
array).
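
The pattern described above can be sketched in user-space C under stated
assumptions: the table type, the function name and the -EINVAL return value
are hypothetical, and fprintf() guarded by a static flag stands in for the
kernel's print-once helper. It is not the code from this patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical routing table: has_route[i] says whether IRQ i is routed. */
struct irq_table_sketch {
	size_t nr_entries;
	const bool *has_route;
};

/*
 * Validate an untrusted IRQ number instead of aborting on it.
 * Returns 0 when a route exists, -EINVAL otherwise, and reports the
 * first failure only, so a caller cannot flood the log.
 */
static int check_guest_irq_sketch(const struct irq_table_sketch *t,
				  unsigned int guest_irq)
{
	static bool warned;

	if (guest_irq >= t->nr_entries || !t->has_route[guest_irq]) {
		if (!warned) {
			warned = true;
			fprintf(stderr, "no route for guest_irq %u\n", guest_irq);
		}
		return -EINVAL;
	}
	return 0;
}
```

The bounds test comes first so that the table is never indexed with an
out-of-range value.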

This fixes CVE-2017-1000252.

Fixes: efc644048ecde54 ("KVM: x86: Update IRTE for posted-interrupts")
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3a8b0677fc6180a467e26cc32ce6b0c09a32f9bb)
CVE-2017-1000252
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Add P9 NX support for 842 compression engine
Haren Myneni [Thu, 28 Sep 2017 10:58:00 +0000 (07:58 -0300)]
crypto/nx: Add P9 NX support for 842 compression engine

BugLink: http://bugs.launchpad.net/bugs/1718292
This patch adds P9 NX support for 842 compression engine. Virtual
Accelerator Switchboard (VAS) is used to access 842 engine on P9.

For each NX engine per chip, set up a receive window using
vas_rx_win_open(), which configures the RxFIFO with the FIFO address, lpid,
pid and tid values. This unique (lpid, pid, tid) combination will
be used to identify the target engine.

For crypto open request, open send window on the NX engine for
the corresponding chip / cpu where the open request is executed.
This send window will be closed upon crypto close request.

NX provides high and normal priority FIFOs. For compression /
decompression requests, we use only high priority FIFOs in kernel.

Each NX request will be communicated to VAS using copy/paste
instructions with vas_copy_crb() / vas_paste_crb() functions.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Reviewed-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit b0d6c9bab5e41d07f2bece1ef8c81cd2175b5f88)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Add P9 NX specific error codes for 842 engine
Haren Myneni [Thu, 28 Sep 2017 10:57:59 +0000 (07:57 -0300)]
crypto/nx: Add P9 NX specific error codes for 842 engine

BugLink: http://bugs.launchpad.net/bugs/1718292
This patch adds changes for checking P9 specific 842 engine
error codes. These errors are reported in the coprocessor status
block (CSB) for failures.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 146e9f1b65478643f2729a97ccb8be60bb4492e5)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Use kzalloc for workmem allocation
Haren Myneni [Thu, 28 Sep 2017 10:57:58 +0000 (07:57 -0300)]
crypto/nx: Use kzalloc for workmem allocation

BugLink: http://bugs.launchpad.net/bugs/1718292
A send window is opened / closed for each crypto session,
so initialize txwin in workmem.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit f05368336b3ae399f66cf511c52c6d69c7bc6b39)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Add nx842_add_coprocs_list function
Haren Myneni [Thu, 28 Sep 2017 10:57:57 +0000 (07:57 -0300)]
crypto/nx: Add nx842_add_coprocs_list function

BugLink: http://bugs.launchpad.net/bugs/1718292
Updating coprocessor list is moved to nx842_add_coprocs_list().
This function will be used for both icswx and VAS functions.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit cd38a8a8a2ab95e43a031410db6a32f2d84e3fc0)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Create nx842_delete_coprocs function
Haren Myneni [Thu, 28 Sep 2017 10:57:56 +0000 (07:57 -0300)]
crypto/nx: Create nx842_delete_coprocs function

BugLink: http://bugs.launchpad.net/bugs/1718292
Move deleting coprocessors info upon exit or failure to
nx842_delete_coprocs().

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 1ee51b28ee6ad7919da4fbe9672263dd274dcbfe)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Create nx842_configure_crb function
Haren Myneni [Thu, 28 Sep 2017 10:57:55 +0000 (07:57 -0300)]
crypto/nx: Create nx842_configure_crb function

BugLink: http://bugs.launchpad.net/bugs/1718292
Configure CRB is moved to nx842_configure_crb() so that it can
be used for icswx and VAS exec functions. VAS function will be
added later with P9 support.

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 56c10d5ea68b9a1425df0dedbe5d2710cc7d8bfa)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agocrypto/nx: Rename nx842_powernv_function as icswx function
Haren Myneni [Thu, 28 Sep 2017 10:57:54 +0000 (07:57 -0300)]
crypto/nx: Rename nx842_powernv_function as icswx function

BugLink: http://bugs.launchpad.net/bugs/1718292
Rename nx842_powernv_function to nx842_powernv_exec.
nx842_powernv_exec points to nx842_exec_icswx and
will point to the VAS exec function, which will be added later
for P9 NX support.
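
The indirection can be pictured with a small sketch. The signature and all
names ending in _sketch are simplified, hypothetical stand-ins, not the
driver's real prototypes.

```c
#include <stddef.h>

/* Simplified, hypothetical signature for an engine "exec" backend. */
typedef int (*exec_fn_sketch)(const unsigned char *in, size_t inlen);

/* Stand-in for the existing icswx path. */
static int exec_icswx_sketch(const unsigned char *in, size_t inlen)
{
	(void)in;
	(void)inlen;
	return 1;
}

/* Stand-in for a VAS path that a later patch would provide. */
static int exec_vas_sketch(const unsigned char *in, size_t inlen)
{
	(void)in;
	(void)inlen;
	return 2;
}

/* Callers go through the pointer, so the backend can be swapped at init. */
static exec_fn_sketch powernv_exec_sketch = exec_icswx_sketch;

static void select_backend_sketch(int have_vas)
{
	powernv_exec_sketch = have_vas ? exec_vas_sketch : exec_icswx_sketch;
}
```

Because callers only ever see the pointer, adding a second backend does not
require touching them.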

Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit c97f8169fb227cae5adeac56cafa980f25978031)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: [Config] CONFIG_PPC_VAS=y
Seth Forshee [Thu, 28 Sep 2017 19:09:17 +0000 (15:09 -0400)]
UBUNTU: [Config] CONFIG_PPC_VAS=y

BugLink: http://bugs.launchpad.net/bugs/1718293
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define copy/paste interfaces
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:21 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define copy/paste interfaces

BugLink: http://bugs.launchpad.net/bugs/1718293
Define interfaces (wrappers) to the 'copy' and 'paste'
instructions (which are new in PowerISA 3.0). These are intended to be
used by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to
the NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 2392c8c8c0450293625dbef19ff5e206fb7b6749)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define vas_tx_win_open()
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:20 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define vas_tx_win_open()

BugLink: http://bugs.launchpad.net/bugs/1718293
Define an interface to open a VAS send window. This interface is
intended to be used by the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.

The receive window, identified by the [vasid, cop] parameters, must
already be open in VAS (i.e. connected to an NX engine).

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 5239af679a07427647b009ebb9c70b1a03ebca9b)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define vas_win_close() interface
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:19 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define vas_win_close() interface

BugLink: http://bugs.launchpad.net/bugs/1718293
Define the vas_win_close() interface which should be used to close a
send or receive window.

While the hardware configurations required to open send and receive
windows differ, the configuration to close a window is the same for
both. So we use a single interface to close the window.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 98271d4198699947d66d6f8a02c09bd27cb90022)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define vas_rx_win_open() interface
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:18 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define vas_rx_win_open() interface

BugLink: http://bugs.launchpad.net/bugs/1718293
Define the vas_rx_win_open() interface. This interface is intended to
be used by the Nest Accelerator (NX) driver(s) to set up receive
windows for one or more NX engines (which implement compression &
encryption algorithms in the hardware).

Follow-on patches will provide an interface to close the window and to
open a send window that kernel subsystems can use to access the NX
engines.

The interface to open a receive window is expected to be invoked for
each instance of VAS in the system.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 62c4eda4fabe89709ec43dcf1efe9fbea007a734)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define helpers to alloc/free windows
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:17 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define helpers to alloc/free windows

BugLink: http://bugs.launchpad.net/bugs/1718293
Define helpers to allocate/free VAS window objects. These will be used
in follow-on patches when opening/closing windows.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit bbfe59f8a7057f80f67a74e77fb4e941240e90b9)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define helpers to init window context
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:16 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define helpers to init window context

BugLink: http://bugs.launchpad.net/bugs/1718293
Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit b25b33ac18b35775949ab227bb3075bb6cb11bc3)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define helpers to access MMIO regions
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:15 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define helpers to access MMIO regions

BugLink: http://bugs.launchpad.net/bugs/1718293
Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 180fe15a8299c14f77347c5835c98c2446226ee6)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define vas_init() and vas_exit()
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:14 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define vas_init() and vas_exit()

BugLink: http://bugs.launchpad.net/bugs/1718293
Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.

VAS is currently only supported with 64K page size.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(backported from commit 4dea2d1a927c61114a168d4509b56329ea6effb7)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Move GET_FIELD/SET_FIELD to vas.h
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:13 +0000 (20:24 -0300)]
powerpc/powernv: Move GET_FIELD/SET_FIELD to vas.h

BugLink: http://bugs.launchpad.net/bugs/1718293
Move the GET_FIELD and SET_FIELD macros to vas.h so that VAS and other
users of VAS, including NX-842, can use those macros.

There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency, switch the order of parameters in
SET_FIELD to match the order in skiboot.
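
For readers unfamiliar with mask-based field accessors, here is a minimal
sketch of the idea. The macro bodies are written for this note and are not
copied from vas.h or skiboot; the (mask, value, new field) parameter order
and the use of the GCC/Clang builtin __builtin_ctzll() are assumptions.

```c
#include <stdint.h>

/* Shift implied by a contiguous, non-zero mask: index of its lowest set bit. */
#define MASK_SHIFT_SKETCH(m)	(__builtin_ctzll(m))

/* Extract the field selected by mask m from value v. */
#define GET_FIELD_SKETCH(m, v) \
	(((uint64_t)(v) & (uint64_t)(m)) >> MASK_SHIFT_SKETCH(m))

/* Return v with the field selected by mask m replaced by val. */
#define SET_FIELD_SKETCH(m, v, val) \
	(((uint64_t)(v) & ~(uint64_t)(m)) | \
	 (((uint64_t)(val) << MASK_SHIFT_SKETCH(m)) & (uint64_t)(m)))
```

Deriving the shift from the mask means each register field needs only one
definition, which is why a shared parameter order between the two code
bases matters.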

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Reviewed-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit b6622a339e8670a1025d4dd84be473c76dabed33)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv/vas: Define macros, register fields and structures
Sukadev Bhattiprolu [Wed, 27 Sep 2017 23:24:12 +0000 (20:24 -0300)]
powerpc/powernv/vas: Define macros, register fields and structures

BugLink: http://bugs.launchpad.net/bugs/1718293
Define macros for the VAS hardware registers and bit-fields as well
as a couple of data structures needed by the VAS driver.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[mpe: Fixup include guard to use _ASM_POWERPC_VAS_H]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 967689141eb37c4365eac0fac82d857773098475)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Enable PCI peer-to-peer
Frederic Barrat [Wed, 27 Sep 2017 23:24:11 +0000 (20:24 -0300)]
powerpc/powernv: Enable PCI peer-to-peer

BugLink: http://bugs.launchpad.net/bugs/1718293
P9 has support for PCI peer-to-peer, enabling a device to write in the
MMIO space of another device directly, without interrupting the CPU.

This patch adds support for it on powernv, by adding a new API to be
called by drivers. The pnv_pci_set_p2p(...) call configures an
'initiator', i.e. the device which will issue the MMIO operation, and a
'target', i.e. the device on the receiving side.

P9 really only supports MMIO stores for the time being but that's
expected to change in the future, so the API allows both load and
store operations to be defined.

  /* PCI p2p descriptor */
  #define OPAL_PCI_P2P_ENABLE           0x1
  #define OPAL_PCI_P2P_LOAD             0x2
  #define OPAL_PCI_P2P_STORE            0x4

  int pnv_pci_set_p2p(struct pci_dev *initiator, struct pci_dev *target,
                      u64 desc)

It uses a new OPAL call, as the configuration magic is done on the
PHBs by skiboot.
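
As a hedged illustration of how a caller might compose the descriptor, the
sketch below combines the flag bits quoted above. The helper
p2p_desc_sketch() is hypothetical and only builds the desc argument; it
does not call pnv_pci_set_p2p(), and the meaning of a descriptor without
the ENABLE bit is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as quoted in the commit message. */
#define OPAL_PCI_P2P_ENABLE	0x1
#define OPAL_PCI_P2P_LOAD	0x2
#define OPAL_PCI_P2P_STORE	0x4

/*
 * Hypothetical helper: build a descriptor naming the operation types.
 * Assumes a descriptor without the ENABLE bit asks for the listed
 * operations to be turned off.
 */
static uint64_t p2p_desc_sketch(bool enable, bool loads, bool stores)
{
	uint64_t desc = 0;

	if (enable)
		desc |= OPAL_PCI_P2P_ENABLE;
	if (loads)
		desc |= OPAL_PCI_P2P_LOAD;
	if (stores)
		desc |= OPAL_PCI_P2P_STORE;
	return desc;
}
```

Since P9 only supports MMIO stores for now, a caller would presumably
request p2p_desc_sketch(true, false, true).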

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Russell Currey <ruscur@russell.cc>
[mpe: Drop unrelated OPAL calls, s/uint64_t/u64/, minor formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(backported from commit 2552910084a5e12e280caf082ab01468e187a064)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Add support to set power-shifting-ratio
Shilpasri G Bhat [Wed, 27 Sep 2017 23:24:10 +0000 (20:24 -0300)]
powerpc/powernv: Add support to set power-shifting-ratio

BugLink: http://bugs.launchpad.net/bugs/1718293
This patch adds support to set power-shifting-ratio, which hints to the
firmware how to distribute/throttle power between different entities
in a system (e.g. CPU vs. GPU). This ratio is used by the OCC for its
power capping algorithm.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 8e84b2d1f0f6a00b6476790f7bce6dcbffe91980)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Add support for powercap framework
Shilpasri G Bhat [Wed, 27 Sep 2017 23:24:09 +0000 (20:24 -0300)]
powerpc/powernv: Add support for powercap framework

BugLink: http://bugs.launchpad.net/bugs/1718293
Add a generic powercap framework to change the system powercap
in-band through the OPAL-OCC command/response interface.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(backported from commit cb8b340de21e1c57e1c6d4f26ccc4af46a3ed559)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/perf: Add nest IMC PMU support
Anju T Sudhakar [Wed, 27 Sep 2017 23:24:08 +0000 (20:24 -0300)]
powerpc/perf: Add nest IMC PMU support

BugLink: http://bugs.launchpad.net/bugs/1718293
Add support to register Nest In-Memory Collection PMU counters.
Patch adds a new device file called "imc-pmu.c" under powerpc/perf
folder to contain all the device PMU functions.

Device tree parser code added to parse the PMU events information
and create sysfs event attributes for the PMU.

Cpumask attribute added along with Cpu hotplug online/offline functions
specific for nest PMU. A new state "CPUHP_AP_PERF_POWERPC_NEST_IMC_ONLINE"
added for the cpu hotplug callbacks. The error handling path frees the
memory and unregisters the CPU hotplug callbacks.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 885dcd709ba9120b9935415b8b0f9d1b94e5826b)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Detect and create IMC device
Madhavan Srinivasan [Wed, 27 Sep 2017 23:24:07 +0000 (20:24 -0300)]
powerpc/powernv: Detect and create IMC device

BugLink: http://bugs.launchpad.net/bugs/1718293
Code to create platform device for the In-Memory Collection (IMC)
counters. Platform devices are created based on the IMC compatibility.
New header file created to contain the data structures and macros
needed for In-Memory Collection (IMC) counter pmu devices.

The device tree for IMC counters starts at the node "imc-counters".
This node contains all the IMC PMU nodes and event nodes for these IMC
PMUs. Device probe() parses the device to locate three possible IMC
device types (Nest/Core/Thread). The function then branches to parse each
unit node to populate vital information such as device memory sizes,
event nodes information, base address for reserve memory access (if
any) and so on. Simple bare-minimum shutdown function added which only
"stops" the engines.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Fix build with CONFIG_PERF_EVENTS=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 8f95faaac56c18b32d0e23ace55417a440abdb7e)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agopowerpc/powernv: Add IMC OPAL APIs
Madhavan Srinivasan [Wed, 27 Sep 2017 23:24:06 +0000 (20:24 -0300)]
powerpc/powernv: Add IMC OPAL APIs

BugLink: http://bugs.launchpad.net/bugs/1718293
In-Memory Collection (IMC) counters are performance monitoring
infrastructure. These counters need a special sequence of SCOMs to
init/start/stop, which is handled by OPAL. OPAL provides three APIs
to init and control these IMC engines.

OPAL API documentation:
  https://github.com/open-power/skiboot/blob/master/doc/opal-api/opal-imc-counters.rst

Patch updates the kernel side powernv platform code to support the new
OPAL APIs

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from commit 28a5db0061014c8afbbb98560cf420c29bc4d8e1)
Signed-off-by: Gustavo Walbon <gwalbon@linux.vnet.ibm.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoUBUNTU: d/s/m/insert-ubuntu-changes: use full version numbers
Marcelo Henrique Cerri [Wed, 27 Sep 2017 20:23:49 +0000 (17:23 -0300)]
UBUNTU: d/s/m/insert-ubuntu-changes: use full version numbers

Ignore: yes

Make insert-ubuntu-changes consider full version numbers when looping
through debian.master/changelog entries and comparing the version number
of each entry with the arguments passed to the script to decide which
entries should be included in the output changelog file.

Previously, only the last number in the version was used in this
comparison. For example, when comparing 4.4.0-50.51 and 4.4.0-83.84 only
the numbers 51 and 84 were actually used in the comparison. That however
might not work properly when the major version is bumped.

For instance, using "end" as 4.4.0-50.51 and "start" as 4.4.0-83.84 used
to work fine because 84 is greater than 51. However when using "end"
as 4.11.0-10.11 and "start" as 4.13.0-2.3, no entry was being selected
since 3 is not greater than 11.
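
The difference between the two comparisons can be sketched as follows.
This is an illustration in C, not the script's actual code; the function
names and the assumption that every version has the form A.B.C-D.E are
made up for the example.

```c
#include <stdio.h>

/* Parse "A.B.C-D.E" into five integers; returns 0 on success. */
static int parse_version_sketch(const char *s, int v[5])
{
	return sscanf(s, "%d.%d.%d-%d.%d",
		      &v[0], &v[1], &v[2], &v[3], &v[4]) == 5 ? 0 : -1;
}

/*
 * Compare full versions field by field.
 * Returns <0, 0 or >0 like strcmp(); malformed input compares as equal.
 */
static int compare_full_sketch(const char *a, const char *b)
{
	int va[5], vb[5];

	if (parse_version_sketch(a, va) || parse_version_sketch(b, vb))
		return 0;
	for (int i = 0; i < 5; i++) {
		if (va[i] != vb[i])
			return va[i] < vb[i] ? -1 : 1;
	}
	return 0;
}

/* The old behaviour: compare only the last number of each version. */
static int compare_last_sketch(const char *a, const char *b)
{
	int va[5], vb[5];

	if (parse_version_sketch(a, va) || parse_version_sketch(b, vb))
		return 0;
	return (va[4] > vb[4]) - (va[4] < vb[4]);
}
```

With the versions from the example above, the last-number comparison ranks
4.13.0-2.3 below 4.11.0-10.11, while the field-by-field comparison ranks it
above.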

Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoLinux 4.13.4
Greg Kroah-Hartman [Wed, 27 Sep 2017 12:43:35 +0000 (14:43 +0200)]
Linux 4.13.4

BugLink: http://bugs.launchpad.net/bugs/1720154
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agoiwlwifi: add workaround to disable wide channels in 5GHz
Luca Coelho [Tue, 15 Aug 2017 17:48:41 +0000 (20:48 +0300)]
iwlwifi: add workaround to disable wide channels in 5GHz

BugLink: http://bugs.launchpad.net/bugs/1720154
commit 01a9c948a09348950515bf2abb6113ed83e696d8 upstream.

The OTP in some SKUs has erroneously allowed 40MHz and 80MHz channels
in the 5.2GHz band.  The firmware has been modified to not allow this
in those SKUs, so the driver needs to do the same otherwise the
firmware will assert when we try to use it.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
6 years agosched/cpuset/pm: Fix cpuset vs. suspend-resume bugs
Peter Zijlstra [Thu, 7 Sep 2017 09:13:38 +0000 (11:13 +0200)]
sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs

BugLink: http://bugs.launchpad.net/bugs/1720154
commit 50e76632339d4655859523a39249dd95ee5e93e7 upstream.

Cpusets vs. suspend-resume is _completely_ broken. And it got noticed
because it now resulted in non-cpuset usage breaking too.

On suspend cpuset_cpu_inactive() doesn't call into
cpuset_update_active_cpus() because it doesn't want to move tasks about,
there is no need, all tasks are frozen and won't run again until after
we've resumed everything.

But this means that when we finally do call into
cpuset_update_active_cpus() after resuming the last frozen cpu in
cpuset_cpu_active(), the top_cpuset will not have any difference with
the cpu_active_mask and thus it will not in fact do _anything_.

So the cpuset configuration will not be restored. This was largely
hidden because we would unconditionally create identity domains and
mobile users would not in fact use cpusets much. And servers that do use
cpusets tend to not suspend-resume much.

An additional problem is that we'd not in fact wait for the cpuset work to
finish before resuming the tasks, allowing spurious migrations outside
of the specified domains.

Fix the rebuild by introducing cpuset_force_rebuild() and fix the
ordering with cpuset_wait_for_hotplug().
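
A toy model of the rebuild problem, heavily simplified: CPU masks are plain
bit masks and every name ending in _sketch is hypothetical. It only
illustrates why an "only rebuild on change" check does nothing after
resume and how a force flag avoids that; it does not model the ordering
fix.

```c
#include <stdbool.h>
#include <stdint.h>

struct cpuset_sketch {
	uint64_t top_cpus;	/* CPUs the top-level set believes are active */
	bool force_rebuild;	/* set on resume to demand a rebuild */
	unsigned int rebuilds;	/* how many times domains were rebuilt */
};

/*
 * Rebuild scheduling domains if the active mask changed or a rebuild
 * was forced. Returns true when a rebuild happened.
 */
static bool update_active_cpus_sketch(struct cpuset_sketch *cs,
				      uint64_t active_mask)
{
	bool changed = cs->top_cpus != active_mask;

	if (!changed && !cs->force_rebuild)
		return false;

	cs->top_cpus = active_mask;
	cs->force_rebuild = false;
	cs->rebuilds++;
	return true;
}
```

After resume the mask matches what the top-level set already recorded, so
without the flag the function returns early and the configuration is never
restored.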

Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: deb7aa308ea2 ("cpuset: reorganize CPU / memory hotplug handling")
Link: http://lkml.kernel.org/r/20170907091338.orwxrqkbfkki3c24@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>