Eric Auger [Thu, 19 Jan 2017 20:57:48 +0000 (20:57 +0000)]
iommu: Add a new type field in iommu_resv_region
We introduce a new field to differentiate the reserved region
types and specialize the apply_resv_region implementation.
Legacy direct mapped regions have IOMMU_RESV_DIRECT type.
We introduce 2 new reserved memory types:
- IOMMU_RESV_MSI will characterize MSI regions that are mapped
- IOMMU_RESV_RESERVED will characterize regions that cannot be mapped.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit d30ddcaa7b028049cdfee3a40248002d07b2bbf3) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Eric Auger [Thu, 19 Jan 2017 20:57:47 +0000 (20:57 +0000)]
iommu: Rename iommu_dm_regions into iommu_resv_regions
We want to extend the callbacks used for dm regions and
use them for reserved regions. Reserved regions can be
- directly mapped regions
- regions that cannot be iommu mapped (PCI host bridge windows, ...)
- MSI regions (because they belong to another address space or because
they are not translated by the IOMMU and need special handling)
So let's rename the struct and also the callbacks.
Signed-off-by: Eric Auger <eric.auger@redhat.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit e5b5234a36ca283158721d3d2e0cddfa324abdf9) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Robin Murphy [Thu, 19 Jan 2017 20:57:46 +0000 (20:57 +0000)]
iommu/dma: Allow MSI-only cookies
IOMMU domain users such as VFIO face a similar problem to DMA API ops
with regard to mapping MSI messages in systems where the MSI write is
subject to IOMMU translation. With the relevant infrastructure now in
place for managed DMA domains, it's actually really simple for other
users to piggyback off that and reap the benefits without giving up
their own IOVA management, and without having to reinvent their own
wheel in the MSI layer.
Allow such users to opt into automatic MSI remapping by dedicating a
region of their IOVA space to a managed cookie, and extend the mapping
routine to implement a trivial linear allocator in such cases, to avoid
the needless overhead of a full-blown IOVA domain.
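The "trivial linear allocator" idea can be pictured with a small Python model. This is only a sketch under stated assumptions: the class name MsiCookie, its fields, and the 4 KiB granule are hypothetical and are not taken from the kernel source.

```python
class MsiCookie:
    """Toy model of an MSI-only cookie: a bump allocator over a dedicated IOVA window."""

    def __init__(self, base, size, granule=4096):
        self.next_iova = base      # next free IOVA inside the dedicated region
        self.end = base + size     # exclusive end of the region
        self.granule = granule     # assumed mapping granule (page size)
        self.pages = {}            # doorbell physical page -> IOVA already handed out

    def map_msi_page(self, phys):
        """Return the IOVA for an MSI doorbell page, allocating linearly on first use."""
        phys &= ~(self.granule - 1)        # operate on the page-aligned address
        if phys in self.pages:             # page already mapped: reuse the mapping
            return self.pages[phys]
        if self.next_iova + self.granule > self.end:
            raise MemoryError("MSI IOVA region exhausted")
        iova = self.next_iova
        self.next_iova += self.granule     # bump the pointer; this model never frees
        self.pages[phys] = iova
        return iova
```

Because MSI doorbell pages are few and long-lived, a bump pointer like this avoids the bookkeeping of a full IOVA allocator.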
Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Tested-by: Tomasz Nowicki <tomasz.nowicki@caviumnetworks.com> Tested-by: Bharat Bhushan <bharat.bhushan@nxp.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
(cherry picked from commit fdbe574eb69312a7fbe09674d69c01b80e4ed9dc) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Documentation:powerpc: Add device-tree bindings for power-mgt
BugLink: http://bugs.launchpad.net/bugs/1666197
Document the device-tree bindings defining the properties under
the @power-mgt node in the device tree that describe the idle states
for Linux running on baremetal POWER servers.
These bindings are documented separately instead of using the
common idle state bindings since the idle-states on POWER servers
are exposed as property arrays, whereas the common idle state bindings
expect idle-states to be described as nodes.
Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit b48ff52043f489d594b989b318c120ca340a2e41) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
powernv: Pass PSSCR value and mask to power9_idle_stop
BugLink: http://bugs.launchpad.net/bugs/1666197
The power9_idle_stop method currently takes only the requested stop
level as a parameter and picks up the rest of the PSSCR bits from a
hand-coded macro. This is not a very flexible design, especially when
the firmware has the capability to communicate the psscr value and the
mask associated with a particular stop state via device tree.
This patch modifies the power9_idle_stop API to take as parameters the
PSSCR value and the PSSCR mask corresponding to the stop state that
needs to be set. These PSSCR value and mask are respectively obtained
by parsing the "ibm,cpu-idle-state-psscr" and
"ibm,cpu-idle-state-psscr-mask" fields from the device tree.
In addition to this, the patch adds support for handling stop states
for which ESL and EC bits in the PSSCR are zero. As per the
architecture, a wakeup from these stop states resumes execution from
the subsequent instruction as opposed to waking up at the System
Vector.
The older firmware sets only the Requested Level (RL) field in the
psscr and psscr-mask exposed in the device tree. For older firmware
where psscr-mask=0xf, this patch will set sane default values for the
remaining PSSCR fields (i.e. PSLL, MTL, ESL, EC, and
TR). For the new firmware, the patch will validate that the invariants
required by the ISA for the psscr values are maintained by the
firmware.
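The old-firmware/new-firmware split described above can be modelled roughly as follows. This is a sketch, not the kernel implementation: the field masks reflect my reading of the kernel's PSSCR definitions and should be treated as assumptions, the function name normalize_psscr is hypothetical, and the only invariant checked here (ESL must equal EC) is one assumed example of what the validation covers.

```python
# Assumed PSSCR field layout (verify against the ISA before relying on it).
PSSCR_RL_MASK   = 0x0000000F  # Requested Level
PSSCR_MTL_MASK  = 0x000000F0  # Maximum Transition Level
PSSCR_TR_MASK   = 0x00000300  # Transition Rate
PSSCR_PSLL_MASK = 0x000F0000  # Power-Saving Level Limit
PSSCR_EC        = 0x00100000  # Exit Criterion
PSSCR_ESL       = 0x00200000  # Enable State Loss

# Assumed defaults applied when firmware only populated the RL field.
DEFAULT_VAL = PSSCR_ESL | PSSCR_EC | PSSCR_PSLL_MASK | PSSCR_TR_MASK | PSSCR_MTL_MASK
DEFAULT_MASK = DEFAULT_VAL | PSSCR_RL_MASK


def normalize_psscr(val, mask):
    """Return a (val, mask) pair that is safe to hand to the stop instruction."""
    if mask == PSSCR_RL_MASK:
        # Older firmware: keep the requested level, fill in the other fields.
        return (val & PSSCR_RL_MASK) | DEFAULT_VAL, DEFAULT_MASK
    # Newer firmware: trust the values, but check the assumed ESL == EC invariant.
    if bool(val & PSSCR_ESL) != bool(val & PSSCR_EC):
        raise ValueError("firmware PSSCR value violates ESL == EC")
    return val, mask
```

With these assumptions, a legacy pair such as (0x3, 0xF) is expanded to a fully populated value, while a fully populated pair from newer firmware passes through unchanged.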
The skiboot patch that exports fully populated PSSCR values and the
mask for all the stop states can be found here:
https://lists.ozlabs.org/pipermail/skiboot/2016-September/004869.html
[Optimize the number of instructions before entering STOP with
ESL=EC=0, and validate that the PSSCR values provided by the firmware
maintain the invariants required by the ISA, as suggested by Balbir
Singh]
Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 09206b600c76f20984e80d99f3b5343c79332a97) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
cpuidle:powernv: Add helper function to populate powernv idle states.
BugLink: http://bugs.launchpad.net/bugs/1666197
In the current code for powernv_add_idle_states, there is a lot of code
duplication while initializing an idle state in powernv_states table.
Add an inline helper function to populate the powernv_states[] table
for a given idle state. Invoke this for populating the "Nap",
"Fastsleep" and the stop states in powernv_add_idle_states.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 9e9fc6f00a54f7064dc681ac187be6498d566a4f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
powernv:stop: Rename pnv_arch300_idle_init to pnv_power9_idle_init
BugLink: http://bugs.launchpad.net/bugs/1666197
Balbir pointed out that the name of the function pnv_arch300_idle_init
was inconsistent with the names of the variables and functions
pertaining to POWER9 features in book3s_idle.S.
This patch renames pnv_arch300_idle_init to pnv_power9_idle_init.
This patch does not change any behaviour.
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit dd34c74c97b6c3ed1ac7caec0b46267142659aff) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1666197
Currently all the low-power idle states are expected to wake up
at reset vector 0x100, which is why the macro IDLE_STATE_ENTER_SEQ
puts the CPU into an idle state and never returns.
On ISA v3.0, when the ESL and EC bits in the PSSCR are zero, the CPU
is expected to wake up at the next instruction of the idle
instruction.
This patch adds a new macro named IDLE_STATE_ENTER_SEQ_NORET for the
no-return variant and reuses the name IDLE_STATE_ENTER_SEQ
for a variant that allows resuming operation at the instruction next
to the idle-instruction.
Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 823b7bd5156a93872d9561b3f033dfe5cb80204e) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Ding Tianhong [Mon, 6 Feb 2017 16:47:42 +0000 (16:47 +0000)]
clocksource/drivers/arm_arch_timer: Work around Hisilicon erratum 161010101
Erratum Hisilicon-161010101 says that the ARM generic timer counter "has
the potential to contain an erroneous value when the timer value
changes". Accesses to TVAL (both read and write) are also affected due
to the implicit counter read. Accesses to CVAL are not affected.
The workaround is to reread the system count registers until the value
of the second read is larger than the first one by less than 32. The
system counter is guaranteed not to return a wrong value twice on
back-to-back reads, and the erroneous value is always larger than the
correct one by 32. Writes to TVAL are replaced with an equivalent write to CVAL.
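The reread loop can be sketched in Python as below. The read_raw callable stands in for a system counter register read, and the retry bound and the exception are assumptions added for this illustration; they are not described in the commit message.

```python
MASK64 = (1 << 64) - 1  # the counter is modelled as an unsigned 64-bit value


def read_counter_stable(read_raw, max_tries=50):
    """Read the counter twice back to back until the second read exceeds
    the first by less than 32, then return the second value."""
    for _ in range(max_tries):
        first = read_raw()
        second = read_raw()
        # Unsigned difference: a bad first read makes this wrap to a huge value.
        if ((second - first) & MASK64) < 32:
            return second
    raise RuntimeError("counter did not stabilise")  # hypothetical failure path
```

A pair is rejected both when the second read is the erroneous one (difference of 32 or more) and when the first read is erroneous (the unsigned difference wraps around).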
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, fix Kconfig, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit bb42ca47401010fc02901b5e8f79e40a26f208cb) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Currently we have code inline in the arch timer probe path to cater for
Freescale erratum A-008585, complete with ifdeffery. This is a little
ugly, and will get worse as we try to add more errata handling.
This patch refactors the handling of Freescale erratum A-008585. Now the
erratum is described in a generic arch_timer_erratum_workaround
structure, and the probe path can iterate over these to detect errata
and enable workarounds.
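The table-driven detection can be pictured with the Python model below. It is a sketch only: the ErratumWorkaround fields and the detect_workaround helper are hypothetical, and the device tree property string is used here purely as an assumed match key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErratumWorkaround:
    dt_property: str  # assumed key that identifies the erratum on a timer node
    desc: str         # human-readable description for log messages


# One entry per known erratum; adding a new one means adding a row, not an #ifdef.
WORKAROUNDS = [
    ErratumWorkaround("fsl,erratum-a008585", "Freescale erratum a008585"),
]


def detect_workaround(node_properties):
    """Return the first workaround whose key is present on the node, else None."""
    for workaround in WORKAROUNDS:
        if workaround.dt_property in node_properties:
            return workaround
    return None
```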
This will simplify the addition and maintenance of code handling
Hisilicon erratum 161010101.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, correct Kconfig, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit 16d10ef29f25aba923779234bb93a451b14d20e6) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Having a command line option to flip the errata handling for a
particular erratum is a little bit unusual, and it's vastly superior to
pass this in the DT. By common consensus, it's best to kill off the
command line parameter.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
[Mark: split patch, reword commit message] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit 5444ea6a7f46276876e94ecf8d44615af1ef22f7) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Ding Tianhong [Mon, 6 Feb 2017 16:47:39 +0000 (16:47 +0000)]
clocksource/drivers/arm_arch_timer: Add dt binding for hisilicon-161010101 erratum
This erratum describes a bug in logic outside the core, so MIDR can't be
used to identify its presence, and reading an SoC-specific revision
register from common arch timer code would be awkward. So, describe it
in the device tree.
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
(cherry picked from commit 729e55225b1f6225ee7a2a358d5141a3264627c4) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Driver for interrupt combiners in the Top-level Control and Status
Registers (TCSR) hardware block in Qualcomm Technologies chips.
An interrupt combiner in this block combines a set of interrupts by
OR'ing the individual interrupt signals into a summary interrupt
signal routed to a parent interrupt controller, and provides read-
only, 32-bit registers to query the status of individual interrupts.
The status bit for IRQ n is bit (n % 32) within register (n / 32)
of the given combiner. Thus, each combiner can be described as a set
of register offsets and the number of IRQs managed.
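The register/bit arithmetic above is small enough to show directly. In this Python sketch the function names are hypothetical and status_regs is simply a list of 32-bit register values read from one combiner.

```python
def irq_status_location(n):
    """Return (register index, bit index) holding the status of combiner IRQ n."""
    return n // 32, n % 32


def irq_is_pending(status_regs, n):
    """Check IRQ n against a list of 32-bit status register values."""
    reg, bit = irq_status_location(n)
    return bool((status_regs[reg] >> bit) & 1)
```

For example, IRQ 37 lives in register 1 at bit 5.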
Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit f20cc9b00c7b71f9b5e970b6bd4ac93b0d9cfd5b) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
ACPI: Add support for ResourceSource/IRQ domain mapping
ACPI extended IRQ resources may contain a ResourceSource to specify
an alternate interrupt controller. Introduce acpi_irq_get and use it
to implement ResourceSource/IRQ domain mapping.
The new API is similar to of_irq_get and allows re-initialization
of a platform resource from the ACPI extended IRQ resource, and
provides proper behavior for probe deferral when the domain is not
yet present when called.
Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit d44fa3d46079dc095c1346fa6e5bc96dca1ead41) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
ACPI: Generic GSI: Do not attempt to map non-GSI IRQs during bus scan
ACPI extended IRQ resources may contain a Resource Source field to specify
an alternate interrupt controller, attempting to map them as GSIs is
incorrect, so just disable the platform resource.
Since this field is currently ignored, we make this change conditional
on CONFIG_ACPI_GENERIC_GSI to keep the current behavior on x86 platforms,
in case some existing ACPI tables are using this incorrectly.
Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org> Tested-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
(cherry picked from commit fa20b176f609c813d2c677f54c814cbb7ea5f1d1) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Anju T [Wed, 8 Feb 2017 09:50:52 +0000 (15:20 +0530)]
powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()
BugLink: http://bugs.launchpad.net/bugs/1585741
Kprobe placed on the kretprobe_trampoline() during boot time can be
optimized, since the instruction at probe point is a 'nop'.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 762df10bad6954b353ee649c387a8ffacf6dc347) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Anju T [Wed, 8 Feb 2017 09:50:51 +0000 (15:20 +0530)]
powerpc/kprobes: Implement Optprobes
BugLink: http://bugs.launchpad.net/bugs/1585741
Current infrastructure of kprobe uses the unconditional trap instruction
to probe a running kernel. Optprobe allows kprobe to replace the trap
with a branch instruction to a detour buffer. Detour buffer contains
instructions to create an in memory pt_regs. Detour buffer also has a
call to optimized_callback() which in turn calls the pre_handler(). After
the execution of the pre-handler, a call is made for instruction
emulation. The NIP is determined in advance through dummy instruction
emulation and a branch instruction is created to the NIP at the end of
the trampoline.
To address the limitation of branch instruction in POWER architecture,
detour buffer slot is allocated from a reserved area. For the time
being, 64KB is reserved in memory for this purpose.
Instructions which can be emulated using analyse_instr() are the
candidates for optimization. Before optimization ensure that the address
range between the detour buffer allocated and the instruction being
probed is within +/- 32MB.
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 51c9c0843993528bffc920c54c2121d9e6f8b090) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Naveen N. Rao [Wed, 8 Feb 2017 08:57:31 +0000 (14:27 +0530)]
powerpc/kprobes: Fixes for kprobe_lookup_name() on BE
BugLink: http://bugs.launchpad.net/bugs/1585741
Fix two issues with kprobes.h on BE which were exposed with the
optprobes work:
- one, having to do with a missing include for linux/module.h for
MODULE_NAME_LEN -- this didn't show up previously since the only
users of kprobe_lookup_name were in kprobes.c, which included
linux/module.h through other headers, and
- two, with a missing const qualifier for a local variable which ends
up referring to a string literal. Again, this is unique to how
kprobe_lookup_name is being invoked in optprobes.c
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit 30176466e36aadba01e1a630cf42397a3438efa4) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Anju T [Wed, 8 Feb 2017 08:57:30 +0000 (14:27 +0530)]
powerpc: Add helper to check if offset is within relative branch range
BugLink: http://bugs.launchpad.net/bugs/1585741
To permit the use of relative branch instruction in powerpc, the target
address has to be relatively nearby, since the address is specified in an
immediate field (24-bit field) in the instruction opcode itself. Here
nearby refers to 32MB on either side of the current instruction.
This patch verifies whether the target address is within +/- 32MB
range or not.
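A range check of this kind can be modelled in Python as follows. The function name mirrors the helper's purpose but the exact bounds are my assumption: a 24-bit immediate with two implied low zero bits gives a signed 26-bit, word-aligned byte offset.

```python
def is_offset_in_branch_range(offset):
    """True if a byte offset fits a relative branch: within -32MB..+32MB-4
    and word aligned (the two low bits are not encoded)."""
    return -0x2000000 <= offset <= 0x1FFFFFC and (offset & 0x3) == 0
```

The upper bound is 4 bytes short of 32MB because the largest positive 26-bit word-aligned value is 0x1FFFFFC.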
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit ebfa50df435eed18e1389a43e0596246228e7298) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Naveen N. Rao [Wed, 8 Feb 2017 08:57:29 +0000 (14:27 +0530)]
powerpc/bpf: Introduce __PPC_SH64()
BugLink: http://bugs.launchpad.net/bugs/1585741
Introduce __PPC_SH64() as a 64-bit variant to encode shift field in some
of the shift and rotate instructions operating on double-words. Convert
some of the BPF instruction macros to use the same.
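As a sketch of what such an encoding looks like, the Python model below splits a 6-bit shift amount into the two fields used by double-word shift and rotate instructions. The bit positions are my understanding of the MD-form layout and should be treated as assumptions; the function names are hypothetical.

```python
def ppc_sh(s):
    """Low five bits of the shift amount, placed in the 5-bit sh field."""
    return (s & 0x1F) << 11


def ppc_sh64(s):
    """64-bit variant: the sixth bit of the shift amount goes to a separate
    single-bit field near the bottom of the instruction word."""
    return ppc_sh(s) | ((s & 0x20) >> 4)
```

Shift amounts below 32 encode identically in both variants; only values from 32 to 63 set the extra bit.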
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
(cherry picked from linux-next commit c233f5979b3dbb39a5b2473b5fcaf58baec8f1bd) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 15 Feb 2017 23:13:50 +0000 (15:13 -0800)]
UBUNTU: SAUCE: apparmor: fix link auditing failure due to, uninitialized var
The lperms struct is uninitialized for use with auditing if there is
an early failure due to a path name error. This can result in incorrect
logging or in the extreme case apparmor killing the task with a signal
which results in the failure in the referenced bug.
BugLink: http://bugs.launchpad.net/bugs/1664912 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Hari Bathini [Wed, 15 Feb 2017 18:16:11 +0000 (11:16 -0700)]
UBUNTU: SAUCE: powerpc/fadump: set an upper limit for boot memory size
BugLink: http://bugs.launchpad.net/bugs/1655241
By default, 5% of system RAM is reserved for preserving boot memory.
Alternatively, a user can specify the amount of memory to reserve.
See Documentation/powerpc/firmware-assisted-dump.txt for details. In
addition to the memory reserved for preserving boot memory, some more
memory is reserved, to save HPTE region, CPU state data and ELF core
headers.
Memory Reservation during first kernel looks like below:

  Low memory                                         Top of memory
  0      boot memory size                                       |
  |           |                       |<--Reserved dump area -->|
  V           V                       |   Permanent Reservation V
  +-----------+----------/ /----------+---+----+-----------+----+
  |           |                       |CPU|HPTE|  DUMP     |ELF |
  +-----------+----------/ /----------+---+----+-----------+----+
        |                                           ^
        |                                           |
        \                                           /
         -------------------------------------------
          Boot memory content gets transferred to
          reserved area by firmware at the time of
          crash
The implicit rule here is that the sum of the sizes of boot memory,
CPU state data, HPTE region and ELF core headers can't be greater than
the total memory size. But currently, a user is allowed to specify any
value as boot memory size. So, the above rule is violated when a boot
memory size closer to 50% of the total available memory is specified.
As the kernel is not handling this currently, it may lead to undefined
behavior. Fix it by setting an upper limit for boot memory size to 25%
of the total available memory.
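The clamping rule amounts to a single comparison. In this Python sketch the function name and the MAX_BOOT_MEM_RATIO constant are hypothetical names chosen for illustration; only the 25% limit comes from the text above.

```python
MAX_BOOT_MEM_RATIO = 4  # assumed name; 1/4 of total memory is the upper limit


def clamp_boot_memory_size(requested, total_memory):
    """Limit a user-requested boot memory size to 25% of total memory."""
    limit = total_memory // MAX_BOOT_MEM_RATIO
    return min(requested, limit)
```

For a 16 GiB system, a request for 7 GiB of boot memory would be reduced to 4 GiB, while a request for 1 GiB would be left alone.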
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
BugLink: http://bugs.launchpad.net/bugs/1649292
To support unprivileged users mounting filesystems two permission
checks have to be performed: a test to see if the user is allowed to
create a mount in the mount namespace, and a test to see if
the user is allowed to access the specified filesystem.
The automount case is special in that mounting the original filesystem
grants permission to mount the sub-filesystems, to any user who
happens to stumble across their mountpoint and satisfies the
ordinary filesystem permission checks.
Attempting to handle the automount case by using override_creds
almost works. It preserves the idea that permission to mount
the original filesystem is permission to mount the sub-filesystem.
Unfortunately using override_creds messes up the filesystem's
ordinary permission checks.
Solve this by being explicit that a mount is a submount by introducing
vfs_submount, and using it where appropriate.
vfs_submount uses a new mount-internal mount flag, MS_SUBMOUNT, to let
sget and friends know that a mount is a submount so they can take appropriate
action.
sget and sget_userns are modified to not perform any permission checks
on submounts.
follow_automount is modified to stop using override_creds as that
has proven problematic.
do_mount is modified to always remove the new MS_SUBMOUNT flag so
that we know userspace will never be able to specify it.
autofs4 is modified to stop using current_real_cred that was put in
there to handle the previous version of submount permission checking.
cifs is modified to pass the mountpoint all of the way down to vfs_submount.
debugfs is modified to pass the mountpoint all of the way down to
trace_automount by adding a new parameter. To make this change easier
a new typedef debugfs_automount_t is introduced to capture the type of
the debugfs automount function.
Cc: stable@vger.kernel.org Fixes: 069d5ac9ae0d ("autofs: Fix automounts by using current_real_cred()->uid") Fixes: aeaa4a79ff6a ("fs: Call d_automount with the filesystems creds") Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com> Reviewed-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
(cherry picked from commit 93faccbbfa958a9668d3ab4e30f38dd205cee8d8 linux-next) Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Long Li [Thu, 15 Dec 2016 02:46:03 +0000 (18:46 -0800)]
scsi: storvsc: properly set residual data length on errors
BugLink: http://bugs.launchpad.net/bugs/1663687
On I/O errors, the Windows driver doesn't set data_transfer_length
on error conditions other than SRB_STATUS_DATA_OVERRUN.
In these cases we need to set data_transfer_length to 0,
indicating there is no data transferred. On SRB_STATUS_DATA_OVERRUN,
data_transfer_length is set by the Windows driver to the actual data transferred.
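The rule can be modelled in a few lines of Python. This is a sketch: the function name is hypothetical, and the numeric SRB status values are quoted from memory of the Windows SRB status codes, so treat them as assumptions.

```python
SRB_STATUS_SUCCESS = 0x01       # assumed value
SRB_STATUS_ERROR = 0x04         # assumed value
SRB_STATUS_DATA_OVERRUN = 0x12  # assumed value


def residual_length(requested_len, srb_status, data_transfer_length):
    """Compute the residual (untransferred) byte count for a completed request."""
    if srb_status not in (SRB_STATUS_SUCCESS, SRB_STATUS_DATA_OVERRUN):
        # The host did not fill in data_transfer_length: nothing was transferred.
        data_transfer_length = 0
    return requested_len - data_transfer_length
```

On a plain error the whole request is reported as residual, while a data overrun reports only the part that was not transferred.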
Reported-by: Shiva Krishna <Shiva.Krishna@nimblestorage.com> Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 40630f462824ee24bc00d692865c86c3828094e0) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Long Li [Thu, 15 Dec 2016 02:46:02 +0000 (18:46 -0800)]
scsi: storvsc: properly handle SRB_ERROR when sense message is present
BugLink: http://bugs.launchpad.net/bugs/1663687
When a sense message is present on error, we should pass it along to the upper
layer to decide how to deal with the error.
This patch fixes connectivity issues with Fibre Channel devices.
Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit bba5dc332ec2d3a685cb4dae668c793f6a3713a3) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Long Li [Thu, 15 Dec 2016 02:46:01 +0000 (18:46 -0800)]
scsi: storvsc: use tagged SRB requests if supported by the device
BugLink: http://bugs.launchpad.net/bugs/1663687
Properly set SRB flags when the hosting device supports tagged queuing.
This patch improves the performance on Fibre Channel disks.
Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 3cd6d3d9b1abab8dcdf0800224ce26daac24eea2) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
1. We will make every effort to pick a channel that is in the
same NUMA node that is initiating the I/O
2. The mapping between the guest CPU and the outgoing channel
is persistent.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(back ported from linux-next commit d86adf482b843b3a58a9ec3b7c1ccdbf7c705db1) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Conflicts:
drivers/scsi/storvsc_drv.c
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit 977965283526dd2e887331365da19b05c909a966) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from linux-next commit f64dad2628bdf62eac7ac145a6e31430376b65e4) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Li Zhong [Fri, 11 Nov 2016 04:57:36 +0000 (12:57 +0800)]
KVM: PPC: Book 3S: XICS: Don't lock twice when checking for resend
BugLink: http://bugs.launchpad.net/bugs/1651248
This patch improves the code that takes the lock twice to check the resend flag
and do the actual resending, by checking the resend flag locklessly and
adding a boolean parameter check_resend to icp_[rm_]deliver_irq(), so the
resend flag can be checked under the lock when doing the delivery.
We need to make sure that when we clear the ics's bit in the icp's resend_map, we
don't miss the resend flag of the irqs that set the bit. It could be
ordered through the barrier in test_and_clear_bit(), and a newly added
wmb between setting irq's resend flag, and icp's resend_map.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from linux-next commit 21acd0e4df04f02176e773468658c3cebff096bb) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
When the interrupt is presented, set P. Present if P was not set.
If P is already set, don't present again, set Q.
When the interrupt is EOI'ed, move Q into P (and clear Q). If it is
set, re-present.
The asserted flag used by LSI is also incorporated into the P bit.
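The P/Q rules above form a small state machine, sketched here in Python. The class and method names are hypothetical, and the presented counter exists only so the behaviour can be observed; it is not part of the described design.

```python
class IrqState:
    """Toy model of the P (presented) and Q (queued) bits of one interrupt."""

    def __init__(self):
        self.p = False
        self.q = False
        self.presented = 0  # how many times the interrupt was presented

    def raise_irq(self):
        """Interrupt arrives: present it if P was clear, otherwise latch Q."""
        if not self.p:
            self.p = True
            self.presented += 1
        else:
            self.q = True

    def eoi(self):
        """EOI: move Q into P, clear Q, and re-present if P ended up set."""
        self.p = self.q
        self.q = False
        if self.p:
            self.presented += 1
```

An interrupt that fires again while the first instance is still being handled is therefore delivered exactly once more, after the EOI.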
When the irq state is saved, P/Q bits are also saved, they need some
qemu modifications to be recognized and passed around to be restored.
KVM_XICS_PENDING bit set and saved should also indicate
KVM_XICS_PRESENTED bit set and saved. But it is possible some old
code doesn't have/recognize the P bit, so when we restore, we set P
for PENDING bit, too.
The idea and much of the code come from Ben.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from linux-next commit 17d48610ae0fa218aa386b16a538c792991a3652) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
ics_check_resend()
lock ics_lock
see resend set
unlock ics_lock
/* change affinity of the irq */
kvmppc_xics_set_xive()
write_xive()
lock ics_lock
see resend set
unlock ics_lock
icp_deliver_irq() /* resend */
icp_deliver_irq() /* resend again */
It doesn't have any user-visible effect at present, but needs to be avoided
when the following patch implementing the P/Q stuff is applied.
This patch clears the resend flag before releasing the ics lock, when we
know we will do a re-delivery after checking the flag, or setting the flag.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from linux-next commit bf5a71d53835110d46d33eb5335713ffdbff9ab6) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Li Zhong [Fri, 11 Nov 2016 04:57:33 +0000 (12:57 +0800)]
KVM: PPC: Book 3S: XICS: correct the real mode ICP rejecting counter
BugLink: http://bugs.launchpad.net/bugs/1651248
Some counters are added in Commit 6e0365b78273 ("KVM: PPC: Book3S HV:
Add ICP real mode counters"), to provide some performance statistics to
determine whether further optimizing is needed for real mode functions.
The n_reject counter counts how many times ICP rejects an irq because of
priority in real mode. The redelivery of an lsi that is still asserted
after eoi doesn't fall into this category, so the increment there is
removed.
Also, the counter needs to be incremented in icp_rm_deliver_irq() if it
rejects another one.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from linux-next commit 37451bc95dee0e666927d6ffdda302dbbaaae6fa) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Li Zhong [Fri, 11 Nov 2016 04:57:32 +0000 (12:57 +0800)]
KVM: PPC: Book 3S: XICS cleanup: remove XICS_RM_REJECT
BugLink: http://bugs.launchpad.net/bugs/1651248
Commit b0221556dbd3 ("KVM: PPC: Book3S HV: Move virtual mode ICP functions
to real-mode") removed the setting of the XICS_RM_REJECT flag. And
since that commit, nothing else sets the flag any more, so we can remove
the flag and the remaining code that handles it, including the counter
that counts how many times it gets set.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
(cherry picked from linux-next commit 5efa6605151b84029edeb2e07f2d2d74b52c106f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Tue, 19 Jan 2016 19:12:02 +0000 (13:12 -0600)]
UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs
The original mounter had CAP_SYS_ADMIN in the user namespace
where the mount happened, and the vfs has validated that the user
has permission to do the requested operation. This is sufficient
for allowing the kernel to write these specific xattrs, so we can
bypass the permission checks for these xattrs.
To support this, export __vfs_setxattr_noperm and add a similar
__vfs_removexattr_noperm which is also exported. Use these when
setting or removing trusted.overlayfs.* xattrs.
Colin Ian King [Mon, 6 Feb 2017 15:21:31 +0000 (15:21 +0000)]
UBUNTU: SAUCE: md/raid6 algorithms: scale test duration for speedier boots
The original code runs for a set run time based on 2^RAID6_TIME_JIFFIES_LG2.
The default kernel value for RAID6_TIME_JIFFIES_LG2 is 4; however, empirical
testing shows that a value of 3.5 is the sweet spot for getting consistent
benchmarking results and speeding up the run time of the benchmarking.
To achieve 2^3.5 we use the following:
2^3.5 = 2^4 / 2^0.5
= 2^4 / sqrt(2)
= 2^4 * 0.707106781
To keep this as integer math that is as accurate as required while avoiding
overflow, this becomes:
= 2^4 * 181 / 256
= (2^4 * 181) >> 8
We also need to scale down perf by the same factor, however, to
get a good approximate integer result without an overflow we scale
by 2^4.0 * sqrt(2) =
= 2 ^ 4 * 1.41421356237
= 2 ^ 4 * 1448 / 1024
= (2 ^ 4 * 1448) >> 10
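The two fixed-point approximations above can be checked with a short script. This is only an illustrative sketch in Python, not the kernel code: the helper names are hypothetical, and how the kernel consumes the resulting values is not shown in this message.

```python
import math

RAID6_TIME_JIFFIES_LG2 = 4  # default kernel value quoted in the message


def scaled_test_jiffies(lg2: int = RAID6_TIME_JIFFIES_LG2) -> int:
    """Approximate 2^(lg2 - 0.5) as (2^lg2 * 181) >> 8 (hypothetical helper)."""
    return ((1 << lg2) * 181) >> 8


def scaled_perf_factor(lg2: int = RAID6_TIME_JIFFIES_LG2) -> int:
    """Approximate 2^lg2 * sqrt(2) as (2^lg2 * 1448) >> 10 (hypothetical helper)."""
    return ((1 << lg2) * 1448) >> 10


def approximation_errors() -> tuple:
    """Absolute error of each fixed-point constant against the exact factor."""
    return (
        abs(181 / 256 - 1 / math.sqrt(2)),
        abs(1448 / 1024 - math.sqrt(2)),
    )
```

With the default of 4, the first helper yields 11 (the exact 2^3.5 is about 11.31) and the second yields 22 (the exact value is about 22.63); the difference is the truncation from the integer shift. Both constants are within 0.001 of the exact factors.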
This has been tested on 2 AWS instances, a small t2 and a medium m3
with 30 boot tests each and compared to the same instances booted 30
times on an unmodified kernel. In all results, we get the same
algorithms being selected and a 100% consistent result over the 30
boots, showing that this optimised jiffy timing scaling does not break
the original functionality.
On the t2.small we see a saving of ~0.126 seconds and on the m3.medium a saving of
~0.177 seconds.
Tested on a 4 CPU VM on an 8 thread Xeon server; seeing a saving of ~0.33
seconds (average over 10 boots).
Tested on a 8 thread Xeon server, seeing a saving of ~1.24 seconds (average
of 10 boots).
The testing included double checking the algorithm chosen by the optimized
selection and seeing the same as pre-optimised version.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 2 Feb 2017 09:09:02 +0000 (01:09 -0800)]
fix regression with domain change in complain mode
The patch
Fix no_new_privs blocking change_onexec when using stacked namespaces
changed when the no_new_privs check is processed so the test could
be correctly applied in a stacked profile situation.
However it changed the behavior of the error returned in complain mode,
which will have both @error and @new set.
Fix this by introducing a new var to indicate the no_new_privs condition
instead of relying on error. While doing this allow the new label under
no new privs to be audited, by having its reference put in the error path,
instead of in the no_new_privs condition check.
John Johansen [Mon, 30 Jan 2017 10:38:14 +0000 (02:38 -0800)]
UBUNTU: SAUCE: apparmor: flock mediation is not being enforced on cache check
When an open file with cached permissions is checked for the flock
permission, the cache check fails and falls through to no error instead
of auditing and returning an error.
Force the fall through to do a permission check, so that it audits the
failed flock permission check.
BugLink: http://bugs.launchpad.net/bugs/1658219 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 12 Jan 2017 23:12:25 +0000 (15:12 -0800)]
UBUNTU: SAUCE: apparmor: null profiles should inherit parent control flags
null profiles that don't have the same control flags as the parent
behave in unexpected ways and can cause failures.
BugLink: http://bugs.launchpad.net/bugs/1656121 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 18 Jan 2017 09:23:11 +0000 (01:23 -0800)]
UBUNTU: SAUCE: apparmor: fix ns ref count link when removing profiles from policy
BugLink: http://bugs.launchpad.net/bugs/1660849 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 31 Dec 2016 11:55:30 +0000 (03:55 -0800)]
UBUNTU: SAUCE: apparmor: Fix no_new_privs blocking change_onexec when using stacked namespaces
Push the no_new_privs logic into the per profile transition fns, so
that the no_new_privs check can be done at the ns level instead of the
aggregate stack level.
BugLink: http://bugs.launchpad.net/bugs/1648143 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 8 Dec 2016 02:59:07 +0000 (18:59 -0800)]
UBUNTU: SAUCE: apparmor: fix lock ordering for mkdir
There is a lock inversion that can result in a deadlock when profile
replacements are racing with dir creation for a namespace in apparmorfs.
BugLink: http://bugs.launchpad.net/bugs/1645037 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 8 Dec 2016 02:56:31 +0000 (18:56 -0800)]
UBUNTU: SAUCE: apparmor: fix leak on securityfs pin count
apparmor is leaking the pinfs refcount when inode setup fails.
BugLink: http://bugs.launchpad.net/bugs/1660846 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 8 Dec 2016 02:52:14 +0000 (18:52 -0800)]
UBUNTU: SAUCE: apparmor: fix reference count leak when securityfs_setup_d_inode() fails
BugLink: http://bugs.launchpad.net/bugs/1660845 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 8 Dec 2016 02:50:14 +0000 (18:50 -0800)]
UBUNTU: SAUCE: apparmor: fix not handling error case when securityfs_pin_fs() fails
BugLink: http://bugs.launchpad.net/bugs/1660842 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 3 Dec 2016 10:36:39 +0000 (02:36 -0800)]
UBUNTU: SAUCE: apparmor: fix oops in bind_mnt when dev_path lookup fails
Bind mounts can oops when devname lookup fails because the devname is
uninitialized and used in auditing the denial.
BugLink: http://bugs.launchpad.net/bugs/1660840 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 12 Nov 2016 00:06:25 +0000 (16:06 -0800)]
UBUNTU: SAUCE: apparmor: Don't audit denied access of special apparmor .null file
When an fd is disallowed from being inherited during exec, instead of
being closed it is duped to a special apparmor/.null file. This prevents the
fd from being reused by another file in case the application expects
the original file on a given fd (e.g. stdin/stdout etc). This results in
a denial message like
[32375.561535] audit: type=1400 audit(1478825963.441:358): apparmor="DENIED" operation="file_inherit" namespace="root//lxd-t_<var-lib-lxd>" profile="/sbin/dhclient" name="/dev/pts/1" pid=16795 comm="dhclient" requested_mask="wr" denied_mask="wr" fsuid=165536 ouid=165536
Further access to the fd then results in the rather useless denial message
of
[32375.566820] audit: type=1400 audit(1478825963.445:359): apparmor="DENIED" operation="file_perm" namespace="root//lxd-t_<var-lib-lxd>" profile="/sbin/dhclient" name="/apparmor/.null" pid=16795 comm="dhclient" requested_mask="w" denied_mask="w" fsuid=165536 ouid=0
Since we have the original denial, the noisy and useless .null based
denials can be skipped.
BugLink: http://bugs.launchpad.net/bugs/1660836 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 12 Nov 2016 19:33:54 +0000 (11:33 -0800)]
UBUNTU: SAUCE: apparmor: fix label leak when new label is unused
When a new label is created, it is created with a proxy in a circular
ref count that is broken by replacement. However if the label is not
used it will never be replaced and the circular ref count will never
be broken resulting in a leak.
BugLink: http://bugs.launchpad.net/bugs/1660834 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 12 Nov 2016 09:39:51 +0000 (01:39 -0800)]
UBUNTU: SAUCE: apparmor: fix reference count bug in label_merge_insert()
@new does not have a reference taken locally and should not have its
reference put locally either.
BugLink: http://bugs.launchpad.net/bugs/1660833 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sat, 12 Nov 2016 05:44:20 +0000 (21:44 -0800)]
UBUNTU: SAUCE: apparmor: fix replacement race in reading rawdata
The reading of rawdata is subject to a replacement race when the
rawdata is read in chunks smaller than the data size.
For each read the profile proxy is rechecked for the newest profile,
which means that if a profile is replaced between reads, later chunks will
contain data from the new version of the profile while the earlier
reads will contain data from the previous version. This can result in
data that is inconsistent and corrupt.
Instead of rechecking for the current profile at each read, get the
current profile at the time of the open and use the rawdata of that
profile for the lifetime that the file handle is open.
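The race and the fix can be illustrated with a small userspace model. This is a sketch in Python, not the apparmorfs code: the `Profile`, `Proxy`, `RacyReader` and `SnapshotReader` classes are hypothetical stand-ins, and reference counting and locking are deliberately left out.

```python
class Profile:
    """One loaded version of a profile and its raw policy blob."""

    def __init__(self, rawdata: bytes):
        self.rawdata = rawdata


class Proxy:
    """Indirection that always points at the newest version of a profile."""

    def __init__(self, profile: Profile):
        self.profile = profile

    def replace(self, profile: Profile) -> None:
        self.profile = profile


class RacyReader:
    """Old behaviour: look up the newest profile again on every read."""

    def __init__(self, proxy: Proxy):
        self.proxy = proxy
        self.pos = 0

    def read(self, size: int) -> bytes:
        data = self.proxy.profile.rawdata  # may belong to a newer version
        chunk = data[self.pos:self.pos + size]
        self.pos += len(chunk)
        return chunk


class SnapshotReader:
    """Fixed behaviour: pin the rawdata once, at open time."""

    def __init__(self, proxy: Proxy):
        self.rawdata = proxy.profile.rawdata  # snapshot taken at open
        self.pos = 0

    def read(self, size: int) -> bytes:
        chunk = self.rawdata[self.pos:self.pos + size]
        self.pos += len(chunk)
        return chunk
```

If a replacement happens between two chunked reads, the racy reader returns a mix of old and new bytes, while the snapshot reader keeps returning the version that was current when the handle was opened.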
BugLink: http://bugs.launchpad.net/bugs/1638996 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
When using nested namespaces, policy within the nested namespace tries
to cross validate with policy outside of the namespace that is not
visible to it. This results in the access being denied, with no way to
add a rule to policy that would allow it.
The check should only be done against policy that is visible.
BugLink: http://bugs.launchpad.net/bugs/1660832 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Tue, 4 Oct 2016 00:27:09 +0000 (17:27 -0700)]
UBUNTU: SAUCE: apparmor: add flag to detect semantic change, to binfmt_elf mmap
commit 9f834ec18defc369d73ccf9e87a2790bfa05bf46 changed when the creds
are installed by the binfmt_elf handler. This affects which creds
are used to mmap the executable into the address space, which can have
an effect on apparmor policy.
Add a flag to apparmor at
/sys/kernel/security/apparmor/features/domain/fix_binfmt_elf_mmap
to make it possible to detect this semantic change so that the userspace
tools and the regression test suite can correctly deal with the change.
Note: since 9f834ec1 is a potential information leak fix for perf
events and tracing, it is expected that it could be picked up by
kernels earlier than 4.8, so detecting the kernel version
is not sufficient.
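A userspace tool or test could probe for the flag rather than parse the kernel version. The following is a minimal sketch in Python; the function name is hypothetical, and it only tests whether the file exists, since this message does not say what the file contains.

```python
import os

# Path given in the commit message above.
FIX_BINFMT_ELF_MMAP = (
    "/sys/kernel/security/apparmor/features/domain/fix_binfmt_elf_mmap"
)


def has_binfmt_elf_mmap_fix(path: str = FIX_BINFMT_ELF_MMAP) -> bool:
    """Return True if the feature flag file is present (hypothetical helper).

    Assumption: presence of the file is enough to signal the semantic
    change; the contents of the file are not inspected.
    """
    return os.path.exists(path)
```

A regression test could call `has_binfmt_elf_mmap_fix()` once at start-up and pick the expected mmap behaviour accordingly, which keeps working when the fix is backported to a kernel older than 4.8.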
BugLink: http://bugs.launchpad.net/bugs/1630069 Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 28 Sep 2016 03:11:29 +0000 (20:11 -0700)]
apparmor: bump domain stacking version to 1.2
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Tue, 27 Sep 2016 00:05:45 +0000 (17:05 -0700)]
apparmor: add per ns policy management interface
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Tue, 27 Sep 2016 02:06:51 +0000 (19:06 -0700)]
apparmor: update policy permissions to consider ns being viewed/managed
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Tue, 27 Sep 2016 22:14:48 +0000 (15:14 -0700)]
apparmor: add interface to advertise status of current task stacking
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 28 Sep 2016 09:23:56 +0000 (02:23 -0700)]
apparmor: fix warning that fn build_pivotroot discards const
fix mount.c warnings:
warning: passing argument 2 of ‘build_pivotroot’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
warning: passing argument 4 of ‘build_pivotroot’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 28 Sep 2016 05:14:12 +0000 (22:14 -0700)]
apparmor: fix oops in pivot_root mediation
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 22 Sep 2016 17:50:42 +0000 (10:50 -0700)]
apparmor: add mkdir/rmdir interface to manage policy namespaces
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 22 Sep 2016 21:53:40 +0000 (14:53 -0700)]
apparmor: add __aa_find_ns fn
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Thu, 22 Sep 2016 19:51:11 +0000 (12:51 -0700)]
apparmor: refactor aa_prepare_ns into prepare_ns and create_ns routines
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Wed, 14 Sep 2016 22:23:55 +0000 (15:23 -0700)]
apparmor: add interface to be able to grab loaded policy
Check point/restore needs to be able to grab policy currently loaded
into the kernel.
BugLink: http://bugs.launchpad.net/bugs/1611078 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Sun, 14 Aug 2016 22:01:12 +0000 (15:01 -0700)]
apparmor: fix: permissions test to view and manage policy
Drop may_open_profiles and unify with policy_view_capable()
Adjust policy_view_capable() so that it is slightly less restricted.
user_namespaces can now manage policy iff
- the task has cap_mac_admin in the namespace
- the user_namespace->level == apparmor policy_namespace->level.
This ensures a user namespace can never be used to manage the
system namespace, and can only be used to manage the namespace at its
view level.
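The two conditions can be written as a single predicate. This is a simplified model in Python, not the kernel's policy_view_capable(): the `TaskView` type and its field names are hypothetical, and any further checks the kernel performs are not represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskView:
    """Hypothetical summary of the inputs to the check."""

    has_cap_mac_admin: bool  # cap_mac_admin held in the user namespace
    user_ns_level: int       # nesting level of the task's user namespace
    policy_ns_level: int     # nesting level of the apparmor policy namespace


def may_manage_policy(view: TaskView) -> bool:
    """Both conditions listed above must hold."""
    return view.has_cap_mac_admin and view.user_ns_level == view.policy_ns_level
```

Under this model a task in a nested user namespace (level 1) fails the check against the system policy namespace (level 0) even with cap_mac_admin, which matches the guarantee stated above.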
If for some reason a user namespace is set up without an apparmor
policy namespace it will not be able to manage or view policy.
However this also means an extra level of apparmor policy namespaces
cannot be set up and used with user namespaces at this time,
i.e. this blocks user confinement stacking and user defined policy
use cases from being used with user namespaces atm.
Add the ability to output a debug message in relation to
capable(cap_mac_admin) &&
policy_locked
as it is possible for these to cause failures that are not audited and
thus hard to trace down.
Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
John Johansen [Tue, 2 Aug 2016 10:49:35 +0000 (03:49 -0700)]
apparmor: convert delegating deleted files to mediate deleted files
This is a semantic change that may need to be reverted but we can not
properly do delegation atm and doing blind delegation is a security
hole.
Files that have the necessary labeling can still be delegated however
mediation will be required for deleted files that need to be revalidated.
Note: the code is set up to specify DELEGATE_DELETED but aliases it on
the backend to MEDIATE_DELETED. This will have to be partially reverted/
changed for profile replacement causing a revalidation.
Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>