UBUNTU: SAUCE: (no-up) net: Zeroing the structure ethtool_wolinfo in ethtool_get_wol()
CVE-2014-9900
memset() the structure ethtool_wolinfo that has padded bytes
but the padded bytes have not been zeroed out.
Change-Id: If3fd2d872a1b1ab9521d937b86a29fc468a8bbfe Signed-off-by: Avijit Kanti Das <avijitnsec@codeaurora.org> Signed-off-by: Brad Figg <brad.figg@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Colin Ian King [Wed, 7 Jun 2017 12:28:24 +0000 (13:28 +0100)]
UBUNTU: SAUCE: (noup) Update spl to 0.6.5.9-1ubuntu2, zfs to 0.6.5.9-5ubuntu7
Sync with upstream 4.12 compat fixes to build with 4.12. Tested against
upstream 4.12-rc4 and ubuntu Artful 4.11 kernels.
SPL:
* Add 4.12 compat patch from upstream to build with 4.12 kernel:
- 8f87971e1fd11e Linux 4.12 compat: PF_FSTRANS was removed
ZFS:
* Add 4.12 compat patches from upstream to build with 4.12 kernel:
- 608d6942b70436 Linux 4.12 compat: super_setup_bdi_name()
- e624cd19599047 Linux 4.12 compat: PF_FSTRANS was removed
- 2946a1a15aab87 Linux 4.12 compat: CURRENT_TIME removed
- 3e6c9433474f0b Linux 4.12 compat: fix super_setup_bdi_name() call
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
dann frazier [Tue, 28 Mar 2017 20:42:36 +0000 (14:42 -0600)]
UBUNTU: SAUCE: PCI: Restore codepath for !CONFIG_LIBIO
BugLink: http://bugs.launchpad.net/bugs/1677319
While LIBIO is needed on arm64, it is still new infrastructure that hasn't
had a lot of testing on other architectures. This specifically impacts
Ubuntu architecures that define PCI_IOBASE: armhf, arm64 and s390x.
Restore the pre-LIBIO infrastructure when CONFIG_LIBIO=n, which we'll use
for those builds.
= Verification of correctness =
The files referred to in this test are:
- pci.c: drivers/pci/pci.c with this series applied
- pci.c.lpc: Same as pci.c, but without this patch.
- pci.c.orig: Ubuntu's pci.c, prior to this patchset.
Test #1: Architectures that will use LIBIO and define PCI_IOBASE
(i.e. arm64) use the new LIBIO code:
$ unifdef -DCONFIG_LIBIO -DPCI_IOBASE pci.c > a
$ unifdef -DPCI_IOBASE pci.c.lpc > b
$ diff -u a b
--- a 2017-03-29 14:36:07.444552427 -0600
+++ b 2017-03-29 14:36:16.652547367 -0600
@@ -3241 +3240,0 @@
-
(i.e., whitespace only)
Test #2: Architectures that will *not* use LIBIO and define PCI_IOBASE
(i.e. armhf & s390x) should use pre-LIBIO code.
$ unifdef -UCONFIG_LIBIO -DPCI_IOBASE pci.c > a
$ unifdef -DPCI_IOBASE pci.c.orig > b
$ diff -U0 a b
--- a 2017-03-29 14:42:20.640348198 -0600
+++ b 2017-03-29 14:43:02.204325557 -0600
@@ -3254,2 +3254 @@
-int pci_register_io_range(struct fwnode_handle *fwnode, phys_addr_t addr,
- resource_size_t size)
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
(i.e., just the expected changes in prototype - new *fwnode param and
removal of unnecessary "__weak" annotation).
Test #3: Architectures that will neither use LIBIO nor define PCI_IOBASE
(i.e. ppc64el & x86) should use pre-LIBIO code.
$ unifdef -UPCI_IOBASE -UCONFIG_LIBIO pci.c > a
$ unifdef -UPCI_IOBASE pci.c.orig > b
$ diff -U0 a b
--- a 2017-03-29 14:45:58.064229981 -0600
+++ b 2017-03-29 14:46:11.392222753 -0600
@@ -3246,2 +3246 @@
-int pci_register_io_range(struct fwnode_handle *fwnode, phys_addr_t addr,
- resource_size_t size)
+int __weak pci_register_io_range(phys_addr_t addr, resource_size_t size)
@@ -3266,0 +3266 @@
+
(Again, just the expected changes in prototype - new *fwnode param and
removal of unnecessary "__weak" annotation).
Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
zhichang.yuan [Sat, 11 Mar 2017 13:36:08 +0000 (21:36 +0800)]
UBUNTU: SAUCE: PCI: Apply the new generic I/O management on PCI IO hosts
BugLink: http://bugs.launchpad.net/bugs/1677319
After introducing the new generic I/O space management(LIBIO), the original PCI
MMIO relevant helpers need to be updated based on the new interfaces defined in
LIBIO.
This patch adapts the corresponding code to match the changes introduced by
LIBIO.
[Note that the removal of __weak on pci_register_io_range is intentional, as
there are no other users. See: https://lkml.org/lkml/2017/1/30/848 -dannf]
Signed-off-by: zhichang.yuan <yuanzhichang@hisilicon.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> #earlier draft Acked-by: Bjorn Helgaas <bhelgaas@google.com> #drivers/pci parts
(v7 submission)
Reference: http://www.spinics.net/lists/linux-pci/msg59176.html
[dannf: included a few changes from zhichang based on list feedback:
tighter arch-restriction, build fix for non-LIBIO builds & a return code
optimization] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
zhichang.yuan [Mon, 13 Mar 2017 02:42:43 +0000 (10:42 +0800)]
UBUNTU: SAUCE: LPC: Add the ACPI LPC support
BugLink: http://bugs.launchpad.net/bugs/1677319
The patch update the _CRS of LPC children based on the relevant LIBIO
interfaces. Then the ACPI platform device enumeration for LPC can apply the
right I/O resource to request the system I/O space from ioport_resource and
ensure the LPC peripherals work well.
Signed-off-by: zhichang.yuan <yuanzhichang@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com>
(v7 submission)
Reference: http://www.spinics.net/lists/linux-pci/msg59174.html
[dannf: Include fix from zhichang to support early LPC bus probing] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
zhichang.yuan [Mon, 13 Mar 2017 02:42:42 +0000 (10:42 +0800)]
UBUNTU: SAUCE: LIBIO: Support the dynamically logical PIO registration of ACPI host I/O
BugLink: http://bugs.launchpad.net/bugs/1677319
For those hosts which access I/O based on the host/bus local I/O addresses,
their I/O range must be registered and translated as unique logical PIO before
the ACPI enumeration on the devices under the hosts. Otherwise, there is no
available I/O resources allocated for those devices.
This patch implements the interfaces in LIBIO to perform the host local I/O
translation and set the logical IO mapped as ACPI I/O resources.
Signed-off-by: zhichang.yuan <yuanzhichang@hisilicon.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
(v7 submission)
Reference: https://www.spinics.net/lists/arm-kernel/msg568096.html
[dannf: Include fix from zhichang to support early LPC bus probing] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
zhichang.yuan [Mon, 13 Mar 2017 02:42:40 +0000 (10:42 +0800)]
UBUNTU: SAUCE: LPC: Support the device-tree LPC host on Hip06/Hip07
BugLink: http://bugs.launchpad.net/bugs/1677319
The low-pin-count(LPC) interface of Hip06/Hip07 accesses the peripherals in
I/O port addresses. This patch implements the LPC host controller driver which
perform the I/O operations on the underlying hardware.
We don't want to touch those existing peripherals' driver, such as ipmi-bt. So
this driver applies the indirect-IO introduced in the previous patch after
registering an indirect-IO node to the indirect-IO devices list which will be
searched in the I/O accessors.
As the I/O translations for LPC children depend on the host I/O registration,
we should ensure the host I/O registration is finished before all the LPC
children scanning. That is why an arch_init() hook was added in this patch.
Signed-off-by: zhichang.yuan <yuanzhichang@hisilicon.com> Signed-off-by: Gabriele Paoloni <gabriele.paoloni@huawei.com> Acked-by: Rob Herring <robh@kernel.org> #dts part
(v7 submission)
Reference: https://www.spinics.net/lists/arm-kernel/msg568096.html
[dannf: Applied from zhichang to fix probing issue w/o relying on ACPI _DEP] Signed-off-by: dann frazier <dann.frazier@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
zhichang.yuan [Mon, 13 Mar 2017 02:42:39 +0000 (10:42 +0800)]
UBUNTU: SAUCE: OF: Add missing I/O range exception for indirect-IO devices
BugLink: http://bugs.launchpad.net/bugs/1677319
There are some special ISA/LPC devices that work on a specific I/O range where
it is not correct to specify a 'ranges' property in DTS parent node as cpu
addresses translated from DTS node are only for memory space on some
architectures, such as Arm64. Without the parent 'ranges' property, current
of_translate_address() return an error.
Here we add special handlings for this case.
During the OF address translation, some checkings will be perfromed to
identify whether the device node is registered as indirect-IO. If yes, the I/O
translation will be done in a different way from that one of PCI MMIO.
In this way, the I/O 'reg' property of the special ISA/LPC devices will be
parsed correctly.
zhichang.yuan [Mon, 13 Mar 2017 02:42:37 +0000 (10:42 +0800)]
UBUNTU: SAUCE: LIBIO: Introduce a generic PIO mapping method
BugLink: http://bugs.launchpad.net/bugs/1677319
In commit 41f8bba7f55(of/pci: Add pci_register_io_range() and
pci_pio_to_address()), a new I/O space management was supported. With that
driver, the I/O ranges configured for PCI/PCIE hosts on some architectures can
be mapped to logical PIO, converted easily between CPU address and the
corresponding logicial PIO. Based on this, PCI I/O devices can be accessed in a
memory read/write way through the unified in/out accessors.
But on some archs/platforms, there are bus hosts which access I/O peripherals
with host-local I/O port addresses rather than memory addresses after
memory-mapped.
To support those devices, a more generic I/O mapping method is introduced here.
Through this patch, both the CPU addresses and the host-local port can be
mapped into logical PIO, then all the I/O accesses to either PCI MMIO devices or
host-local I/O peripherals can be unified into the existing I/O accessors
defined asm-generic/io.h and be redirected to the right device-specific hooks
based on the input logical PIO.
Peter Hurley [Thu, 16 Mar 2017 03:08:26 +0000 (14:08 +1100)]
UBUNTU: SAUCE: tty: Fix ldisc crash on reopened tty
BugLink: http://bugs.launchpad.net/bugs/1674325
If the tty has been hungup, the ldisc instance may have been destroyed.
Continued input to the tty will be ignored as long as the ldisc instance
is not visible to the flush_to_ldisc kworker. However, when the tty
is reopened and a new ldisc instance is created, the flush_to_ldisc
kworker can obtain an ldisc reference before the new ldisc is
completely initialized. This will likely crash:
Colin Ian King [Fri, 12 May 2017 14:49:53 +0000 (15:49 +0100)]
UBUNTU: SAUCE: exec: ensure file system accounting in check_unsafe_exec is correct
BugLink: http://bugs.launchpad.net/bugs/1672819
There are two very race windows that need to be taken into consideration
when check_unsafe_exec performs the file system accounting comparing the
number of fs->users against the number threads that share the same fs.
The first race can occur when a pthread creates a new pthread and the
the fs->users count is incremented before the new pthread is associated with
the pthread performing the exec. When this occurs, the pthread performing
the exec has flags with bit PF_FORKNOEXEC set.
The second race can occur when a pthread is terminating and the fs->users
count has been decremented by the pthread is still associated with the
pthread that is performing the exec. When this occurs, the pthread
peforming the exec has flags with bit PF_EXITING set.
This fix keeps track of any pthreads that may be in the race window
(when PF_FORKNOEXEC or PF_EXITING) are set and if the fs count does
not match the expected count we retry the count as we may have hit
this small race windows. Tests on an 8 thread server with the
reproducer (see below) show that this retry occurs rarely, so the
overhead of the retry is very small.
Below is a reproducer of the race condition.
The bug manifests itself because the current check_unsafe_exec
hits this race and indicates it is not a safe exec, and the
exec'd suid program fails to setuid.
$ cat Makefile
ALL=a b
all: $(ALL)
a: LDFLAGS=-pthread
b: b.c
$(CC) b.c -o b
sudo chown root:root b
sudo chmod u+s b
test:
for I in $$(seq 1000); do echo $I; ./a ; done
Seth Forshee [Fri, 12 May 2017 20:29:18 +0000 (15:29 -0500)]
UBUNTU: SAUCE: Fix module signing exclusion in package builds
BugLink: http://bugs.launchpad.net/bugs/1690908
The current module signing exclusion implementation suffers from
two problems. First, it looks for the signed-inclusion file
relative to the path where make is executed and thus doesn't work
if the source and build directories are different. Second, the
signed-inclusion file lists only the module name, but the strings
searched for in the file include the path (and the path to the
module install location at that).
Fix these problems by updating scripts/Makefile.modinst to look
for signed-inclusion relative to the path of the source tree and
to use only the module name when matching against the contents of
that file.
Jay Vosburgh [Wed, 11 Nov 2015 13:04:50 +0000 (13:04 +0000)]
UBUNTU: SAUCE: fan: add VXLAN implementation
Generify the fan mapping support and utilise that to implement fan
mappings over vxlan transport.
Expose the existance of this functionality (when the module is loaded)
via an additional sysctl marker.
Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
[apw@canonical.com: added feature marker for fan over vxlan.] Signed-off-by: Andy Whitcroft <apw@canonical.com>
Conflicts:
drivers/net/vxlan.c
include/uapi/linux/if_link.h
net/ipv4/ipip.c
Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Switch to a single tunnel for all mappings, this removes the limitations
on how many mappings each tunnel can handle, and therefore how many Fan
slices each local address may hold.
NOTE: This introduces a new kernel netlink interface which needs updated
iproute2 support.
BugLink: http://bugs.launchpad.net/bugs/1470091 Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[saf: Fix conflicts during rebase to 4.12] Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Conflicts:
include/uapi/linux/if_tunnel.h
net/ipv4/ipip.c
Colin Ian King [Tue, 2 May 2017 14:32:47 +0000 (15:32 +0100)]
UBUNTU: SAUCE: (noup) Update spl to 0.6.5.9-1ubuntu1, zfs to 0.6.5.9-5ubuntu5
Add upstream SPL compat patches from upstream to build with 4.11 kernel:
- 8d5feecacfdcca Linux 4.11 compat: set_task_state() removed
- 94b1ab2ae01e9e Linux 4.11 compat: vfs_getattr() takes 4 args
- 9a054d54fb6772 Linux 4.11 compat: add linux/sched/signal.h
- bf8abea4dade11 Linux 4.11 compat: remove stub for __put_task_struct
Add upstream ZFS compat patches from upstream to build with 4.11 kernel:
- a3478c07475261 Linux 4.11 compat: iops.getattr and friends
- 4859fe796c5b03 Linux 4.11 compat: avoid refcount_t name conflict
Tested and verified against the Ubuntu ZFS autotest regression tests
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
UBUNTU: SAUCE: (namespace) block_dev: Forbid unprivileged mounting when device is opened for writing
For unprivileged mounts to be safe the user must not be able to
make changes to the backing store while it is mounted. This patch
takes a step towards preventing this by refusing to mount in a
user namepspace if the block device is open for writing and
refusing attempts to open the block device for writing by non-
root while it is mounted in a user namespace.
To prevent this from happening we use i_writecount in the inodes
of the bdev filesystem similarly to how it is used for regular
files. Whenever the device is opened for writing i_writecount
is checked; if it is negative the open returns -EBUSY, otherwise
i_writecount is incremented. On mount, a positive i_writecount
results in mount_bdev returning -EBUSY, otherwise i_writecount
is decremented. Opens by root and mounts from init_user_ns do not
check nor modify i_writecount.
Seth Forshee [Tue, 9 Feb 2016 19:26:34 +0000 (13:26 -0600)]
UBUNTU: SAUCE: (namespace) ext4: Add module parameter to enable user namespace mounts
This is still an experimental feature, so disable it by default
and allow it only when the system administrator supplies the
userns_mounts=true module parameter.
Seth Forshee [Thu, 15 Dec 2016 17:03:08 +0000 (11:03 -0600)]
UBUNTU: SAUCE: (namespace) evm: Don't update hmacs in user ns mounts
The kernel should not calculate new hmacs for mounts done by
non-root users. Update evm_calc_hmac_or_hash() to refuse to
calculate new hmacs for mounts for non-init user namespaces.
Seth Forshee [Sat, 18 Oct 2014 11:02:09 +0000 (13:02 +0200)]
UBUNTU: SAUCE: (namespace) ext4: Add support for unprivileged mounts from user namespaces
Support unprivileged mounting of ext4 volumes from user
namespaces. This requires the following changes:
- Perform all uid, gid, and projid conversions to/from disk
relative to s_user_ns. In many cases this will already be
handled by the vfs helper functions. This also requires
updates to handle cases where ids may not map into s_user_ns.
A new helper, projid_valid_eq(), is added to help with this.
- Update most capability checks to check for capabilities in
s_user_ns rather than init_user_ns. These mostly reflect
changes to the filesystem that a user in s_user_ns could
already make externally by virtue of having write access to
the backing device.
- Restrict unsafe options in either the mount options or the
ext4 superblock. Currently the only concerning option is
errors=panic, and this is made to require CAP_SYS_ADMIN in
init_user_ns.
- Verify that unprivileged users have the required access to the
journal device at the path passed via the journal_path mount
option.
Note that for the journal_path and the journal_dev mount
options, and for external journal devices specified in the
ext4 superblock, devcgroup restrictions will be enforced by
__blkdev_get(), (via blkdev_get_by_dev()), ensuring that the
user has been granted appropriate access to the block device.
- Set the FS_USERNS_MOUNT flag on the filesystem types supported
by ext4.
sysfs attributes for ext4 mounts remain writable only by real
root.
Seth Forshee [Thu, 2 Oct 2014 20:34:45 +0000 (15:34 -0500)]
UBUNTU: SAUCE: (namespace) fuse: Restrict allow_other to the superblock's namespace or a descendant
Unprivileged users are normally restricted from mounting with the
allow_other option by system policy, but this could be bypassed
for a mount done with user namespace root permissions. In such
cases allow_other should not allow users outside the userns
to access the mount as doing so would give the unprivileged user
the ability to manipulate processes it would otherwise be unable
to manipulate. Restrict allow_other to apply to users in the same
userns used at mount or a descendant of that namespace. Also
export current_in_userns() for use by fuse when built as a
module.
Seth Forshee [Thu, 26 Jun 2014 16:58:11 +0000 (11:58 -0500)]
UBUNTU: SAUCE: (namespace) fuse: Support fuse filesystems outside of init_user_ns
In order to support mounts from namespaces other than
init_user_ns, fuse must translate uids and gids to/from the
userns of the process servicing requests on /dev/fuse. This
patch does that, with a couple of restrictions on the namespace:
- The userns for the fuse connection is fixed to the namespace
from which /dev/fuse is opened.
- The namespace must be the same as s_user_ns.
These restrictions simplify the implementation by avoiding the
need to pass around userns references and by allowing fuse to
rely on the checks in inode_change_ok for ownership changes.
Either restriction could be relaxed in the future if needed.
For cuse the namespace used for the connection is also simply
current_user_ns() at the time /dev/cuse is opened.
Seth Forshee [Sun, 15 Feb 2015 20:35:35 +0000 (14:35 -0600)]
UBUNTU: SAUCE: (namespace) fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
The user in control of a super block should be allowed to freeze
and thaw it. Relax the restrictions on the FIFREEZE and FITHAW
ioctls to require CAP_SYS_ADMIN in s_user_ns.
UBUNTU: SAUCE: (namespace) capabilities: Allow privileged user in s_user_ns to set security.* xattrs
A privileged user in s_user_ns will generally have the ability to
manipulate the backing store and insert security.* xattrs into
the filesystem directly. Therefore the kernel must be prepared to
handle these xattrs from unprivileged mounts, and it makes little
sense for commoncap to prevent writing these xattrs to the
filesystem. The capability and LSM code have already been updated
to appropriately handle xattrs from unprivileged mounts, so it
is safe to loosen this restriction on setting xattrs.
The exception to this logic is that writing xattrs to a mounted
filesystem may also cause the LSM inode_post_setxattr or
inode_setsecurity callbacks to be invoked. SELinux will deny the
xattr update by virtue of applying mountpoint labeling to
unprivileged userns mounts, and Smack will deny the writes for
any user without global CAP_MAC_ADMIN, so loosening the
capability check in commoncap is safe in this respect as well.
UBUNTU: SAUCE: (namespace) fs: Allow superblock owner to access do_remount_sb()
Superblock level remounts are currently restricted to global
CAP_SYS_ADMIN, as is the path for changing the root mount to
read only on umount. Loosen both of these permission checks to
also allow CAP_SYS_ADMIN in any namespace which is privileged
towards the userns which originally mounted the filesystem.
UBUNTU: SAUCE: (namespace) fs: Allow superblock owner to change ownership of inodes
Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to
chown files. Ordinarily the capable_wrt_inode_uidgid check is
sufficient to allow access to files but when the underlying filesystem
has uids or gids that don't map to the current user namespace it is
not enough, so the chown permission checks need to be extended to
allow this case.
Calling chown on filesystem nodes whose uid or gid don't map is
necessary if those nodes are going to be modified as writing back
inodes which contain uids or gids that don't map is likely to cause
filesystem corruption of the uid or gid fields.
Once chown has been called the existing capable_wrt_inode_uidgid
checks are sufficient, to allow the owner of a superblock to do anything
the global root user can do with an appropriate set of capabilities.
For the proc filesystem this relaxation of permissions is not safe, as
some files are owned by users (particularly GLOBAL_ROOT_UID) outside
of the control of the mounter of the proc and that would be unsafe to
grant chown access to. So update setattr on proc to disallow changing
files whose uids or gids are outside of proc's s_user_ns.
The original version of this patch was written by: Seth Forshee. I
have rewritten and rethought this patch enough so it's really not the
same thing (certainly it needs a different description), but he
deserves credit for getting out there and getting the conversation
started, and finding the potential gotcha's and putting up with my
semi-paranoid feedback.
Inspired-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
[saf: Resolve conflicts caused by s/inode_change_ok/setattr_prepare/]
Seth Forshee [Wed, 7 Oct 2015 19:53:33 +0000 (14:53 -0500)]
UBUNTU: SAUCE: (namespace) mtd: Check permissions towards mtd block device inode when mounting
Unprivileged users should not be able to mount mtd block devices
when they lack sufficient privileges towards the block device
inode. Update mount_mtd() to validate that the user has the
required access to the inode at the specified path. The check
will be skipped for CAP_SYS_ADMIN, so privileged mounts will
continue working as before.
Seth Forshee [Wed, 7 Oct 2015 19:49:47 +0000 (14:49 -0500)]
UBUNTU: SAUCE: (namespace) block_dev: Check permissions towards block device inode when mounting
Unprivileged users should not be able to mount block devices when
they lack sufficient privileges towards the block device inode.
Update blkdev_get_by_path() to validate that the user has the
required access to the inode at the specified path. The check
will be skipped for CAP_SYS_ADMIN, so privileged mounts will
continue working as before.
UBUNTU: SAUCE: (namespace) block_dev: Support checking inode permissions in lookup_bdev()
When looking up a block device by path no permission check is
done to verify that the user has access to the block device inode
at the specified path. In some cases it may be necessary to
check permissions towards the inode, such as allowing
unprivileged users to mount block devices in user namespaces.
Add an argument to lookup_bdev() to optionally perform this
permission check. A value of 0 skips the permission check and
behaves the same as before. A non-zero value specifies the mask
of access rights required towards the inode at the specified
path. The check is always skipped if the user has CAP_SYS_ADMIN.
All callers of lookup_bdev() currently pass a mask of 0, so this
patch results in no functional change. Subsequent patches will
add permission checks where appropriate.
Stefan Bader [Wed, 1 Mar 2017 10:28:07 +0000 (11:28 +0100)]
UBUNTU: SAUCE: bcache: Fix bcache device names
When adding partition support to bcache, the name assignment was not
updated, resulting in numbers jumping (bcache0, bcache16, bcache32...).
Fix this by taking BCACHE_MINORS into account when assigning the disk
name.
BugLink: https://bugs.launchpad.net/bugs/1667078 Fixes: b8c0d91 (bcache: partition support: add 16 minors per bcacheN device) Cc: <stable@vger.kernel.org> # v4.10 Signed-off-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Tue, 19 Jan 2016 19:12:02 +0000 (13:12 -0600)]
UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs
The original mounter had CAP_SYS_ADMIN in the user namespace
where the mount happened, and the vfs has validated that the user
has permission to do the requested operation. This is sufficient
for allowing the kernel to write these specific xattrs, so we can
bypass the permission checks for these xattrs.
To support this, export __vfs_setxattr_noperm and add an similar
__vfs_removexattr_noperm which is also exported. Use these when
setting or removing trusted.overlayfs.* xattrs.
Colin Ian King [Mon, 6 Feb 2017 15:21:31 +0000 (15:21 +0000)]
UBUNTU: SAUCE: md/raid6 algorithms: scale test duration for speedier boots
The original code runs for a set run time based on 2^RAID6_TIME_JIFFIES_LG2.
The default kernel value for RAID6_TIME_JIFFIES_LG2 is 4, however, emperical
testing shows that a value of 3.5 is the sweet spot for getting consistent
benchmarking results and speeding up the run time of the benchmarking.
To achieve 2^3.5 we use the following:
2^3.5 = 2^4 / 2^0.5
= 2^4 / sqrt(2)
= 2^4 * 0.707106781
Too keep this as integer math that is as accurate as required and avoiding
overflow, this becomes:
= 2^4 * 181 / 256
= (2^4 * 181) >> 8
We also need to scale down perf by the same factor, however, to
get a good approximate integer result without an overflow we scale
by 2^4.0 * sqrt(2) =
= 2 ^ 4 * 1.41421356237
= 2 ^ 4 * 1448 / 1024
= (2 ^ 4 * 1448) >> 10
This has been tested on 2 AWS instances, a small t2 and a medium m3
with 30 boot tests each and compared to the same instances booted 30
times on an umodified kernel. In all results, we get the same
algorithms being selected and a 100% consistent result over the 30
boots, showing that this optimised jiffy timing scaling does not break
the original functionality.
On the t2.small we see a saving of ~0.126 seconds and t3.medium a saving of
~0.177 seconds.
Tested on a 4 CPU VM on an 8 thread Xeon server; seeing a saving of ~0.33
seconds (average over 10 boots).
Tested on a 8 thread Xeon server, seeing a saving of ~1.24 seconds (average
of 10 boots).
The testing included double checking the algorithm chosen by the optimized
selection and seeing the same as pre-optimised version.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1628889
Add support for automatic message tags to the printk macro
families dev_xyz and pr_xyz. The message tag consists of a
component name and a 24 bit hash of the message text. For
each message that is documented in the included kernel message
catalog a man page can be created with a script (which is
included in the patch). The generated man pages contain
explanatory text that is intended to help understand the
messages.
Note that only s390 specific messages are prepared
appropriately and included in the generated message catalog.
This patch is optional as it is very unlikely to be accepted
in upstream kernel, but is recommended for all distributions
which are built based on the 'Development stream'
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[saf: Adjust context for v4.13-rc1] Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Ming Lei [Thu, 3 Nov 2016 01:20:01 +0000 (09:20 +0800)]
UBUNTU: SAUCE: hio: splitting bio in the entry of .make_request_fn
BugLink: http://bugs.launchpad.net/bugs/1638700
From v4.3, the incoming bio can be very big[1], and it is
required to split it first in .make_request_fn(), so
we need to do that for hio.c too.
Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
This program is free software; you can redistribute it and/or modify it
under the terms and conditions of the GNU General Public License,
version 2, as published by the Free Software Foundation.
This program is distributed in the hope it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> BugLink: http://bugs.launchpad.net/bugs/1635594 Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Upstream commit 96368701e1c89057bbf39222e965161c68a85b4b changed the
auditing behavior of seccomp so that actions are only logged when the
audit subsystem is enabled. A default install of Ubuntu does not include
the audit userspace and simply enabling the audit subsystem, without
filtering some audit events, would result in more audit records hitting
the system log than usual.
This patch undoes the functional change in upstream commit 96368701e1c89057bbf39222e965161c68a85b4b and goes back to the old
behavior of logging seccomp actions even when audit is not enabled.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Thu, 21 Jan 2016 21:37:53 +0000 (15:37 -0600)]
UBUNTU: SAUCE: overlayfs: Propogate nosuid from lower and upper mounts
An overlayfs mount using an upper or lower directory from a
nosuid filesystem bypasses this restriction. Change this so
that if any lower or upper directory is nosuid at mount time the
overlayfs superblock is marked nosuid. This requires some
additions at the vfs level since nosuid currently only applies to
mounts, so a SB_I_NOSUID flag is added along with a helper
function to check a path for nosuid in both the mount and the
superblock.
Seth Forshee [Thu, 21 Jan 2016 17:52:04 +0000 (11:52 -0600)]
UBUNTU: SAUCE: overlayfs: Be more careful about copying up sxid files
When an overlayfs filesystem's lowerdir is on a nosuid filesystem
but the upperdir is not, it's possible to copy up an sxid file or
stick directory into upperdir without changing the mode by
opening the file rw in the overlayfs mount without writing to it.
This makes it possible to bypass the nosuid restriction on the
lowerdir mount.
It's a bad idea in general to let the mounter copy up a sxid file
if the mounter wouldn't have had permission to create the sxid
file in the first place. Therefore change ovl_set_xattr to
exclude these bits when initially setting the mode, then set the
full mode after setting the user for the inode. This allows copy
up for non-sxid files to work as before but causes copy up to
fail for the cases where the user could not have created the sxid
inode in upperdir.
In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
is possible for a user-supplied ipt_entry structure to have a large
next_offset field. This field is not bounds checked prior to writing a
counter value at the supplied offset.
Problem is that xt_entry_foreach() macro stops iterating once e->next_offset
is out of bounds, assuming this is the last entry.
With malformed data thats not necessarily the case so we can
write outside of allocated area later as we might not have walked the
entire blob.
Fix this by simplifying mark_source_chains -- it already has to check
if nextoff is in range to catch invalid jumps, so just do the check
when we move to a next entry as well.
Also, check that the offset meets the xtables_entry alignment.
Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Chris J. Arges <chris.j.arges@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Ben Hutchings [Tue, 16 Aug 2016 16:27:00 +0000 (10:27 -0600)]
UBUNTU: SAUCE: security,perf: Allow further restriction of perf_event_open
https://lkml.org/lkml/2016/1/11/587
The GRKERNSEC_PERF_HARDEN feature extracted from grsecurity. Adds the
option to disable perf_event_open() entirely for unprivileged users.
This standalone version doesn't include making the variable read-only
(or renaming it).
When kernel.perf_event_open is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.
This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making
the variable read-only. It also allows enabling further restriction
at run-time regardless of whether the default is changed.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
UBUNTU: SAUCE: Clear Linux: reduce e1000e boot time by tightening sleep ranges
The e1000e driver is a great user of the usleep_range() API,
and has any nice ranges that in principle help power management.
However the ranges that are used only during system startup are
very long (and can add easily 100 msec to the boot time) while
the power savings of such long ranges is irrelevant due to the
one-off, boot only, nature of these functions.
This patch shrinks some of the longest ranges to be shorter
(while still using a power friendly 1 msec range); this saves
100msec+ of boot time on my BDW NUCs
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Arjan van de Ven [Tue, 23 Jun 2015 06:26:52 +0000 (01:26 -0500)]
UBUNTU: SAUCE: Clear Linux: i8042: decrease debug message level to info
Author: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Signed-off-by: Jose Carlos Venegas Munoz <jos.c.venegas.munoz@intel.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Tue, 19 Jan 2016 16:20:43 +0000 (10:20 -0600)]
UBUNTU: SAUCE: cred: Add clone_cred() interface
This interface returns a new set of credentials which is an exact
copy of another set. Also update prepare_kernel_cred() to use
this function instead of duplicating code.
Tetsuo Handa [Sat, 29 Mar 2014 06:39:24 +0000 (15:39 +0900)]
UBUNTU: SAUCE: kthread: Do not leave kthread_create() immediately upon SIGKILL.
Commit 786235ee "kthread: make kthread_create() killable" changed to
leave kthread_create() as soon as receiving SIGKILL. But this change
caused boot failures if systemd-udevd worker process received SIGKILL
due to systemd's hardcoded 30 seconds timeout while loading fusion
driver using finit_module() [1].
Linux kernel people think that the systemd's hardcoded timeout is a
systemd bug. But systemd people think that loading of kernel module
needs more than 30 seconds is a kernel module's bug.
Although Linux kernel people are expecting fusion driver module not
to take more than 30 seconds, it will definitely not in time for
trusty kernel. Also, nobody can prove that fusion driver module is
the only case which is affected by commit 786235ee.
Therefore, this patch changes kthread_create() to wait for up to 10
seconds after receiving SIGKILL, unless chosen by the OOM killer,
in order to give the kthreadd a chance to complete the request.
The side effect of this patch is that current thread's response to
SIGKILL is delayed for a bit (likely less than a second, unlikely
10 seconds).
Andy Whitcroft [Wed, 16 Apr 2014 18:40:57 +0000 (19:40 +0100)]
UBUNTU: SAUCE: vt -- maintain bootloader screen mode and content until vt switch
Introduce a new VT mode KD_TRANSPARENT which endevours to leave the current
content of the framebuffer untouched. This allows the bootloader to insert
a graphical splash and have the kernel maintain it until the OS splash
can take over. When we finally switch away (either through programs like
plymouth or manually) the content is lost and the VT reverts to text mode.
Seth Forshee [Tue, 17 Jan 2017 21:19:39 +0000 (15:19 -0600)]
UBUNTU: SAUCE: (no-up) i915: Remove MODULE_FIRMWARE statements for unreleased firmware
BugLink: http://bugs.launchpad.net/bugs/1626740
Intel has added MODULE_FIRMWARE statements to i915 which refer to
firmware files that they have not yet pushed out to upstream
linux-firmware. This causes the following warnings when
generating the initrd:
W: Possible missing firmware /lib/firmware/i915/kbl_guc_ver9_14.bin for module i915
W: Possible missing firmware /lib/firmware/i915/bxt_guc_ver8_7.bin for module i915
This firmware is clearly optional, and the warnings have been
generating a lot of confusion for users. Remove the offending
MODULE_FIRMWARE statements until Intel makes these files
available.