UBUNTU: SAUCE: (namespace) fs: Allow superblock owner to change ownership of inodes
Allow users with CAP_SYS_CHOWN over the superblock of a filesystem to
chown files. Ordinarily the capable_wrt_inode_uidgid check is
sufficient to allow access to files but when the underlying filesystem
has uids or gids that don't map to the current user namespace it is
not enough, so the chown permission checks need to be extended to
allow this case.
Calling chown on filesystem nodes whose uid or gid don't map is
necessary if those nodes are going to be modified as writing back
inodes which contain uids or gids that don't map is likely to cause
filesystem corruption of the uid or gid fields.
Once chown has been called the existing capable_wrt_inode_uidgid
checks are sufficient, to allow the owner of a superblock to do anything
the global root user can do with an appropriate set of capabilities.
For the proc filesystem this relaxation of permissions is not safe, as
some files are owned by users (particularly GLOBAL_ROOT_UID) outside
of the control of the mounter of the proc and that would be unsafe to
grant chown access to. So update setattr on proc to disallow changing
files whose uids or gids are outside of proc's s_user_ns.
The original version of this patch was written by: Seth Forshee. I
have rewritten and rethought this patch enough so it's really not the
same thing (certainly it needs a different description), but he
deserves credit for getting out there and getting the conversation
started, and finding the potential gotcha's and putting up with my
semi-paranoid feedback.
Inspired-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
[saf: Resolve conflicts caused by s/inode_change_ok/setattr_prepare/]
Seth Forshee [Wed, 7 Oct 2015 19:53:33 +0000 (14:53 -0500)]
UBUNTU: SAUCE: (namespace) mtd: Check permissions towards mtd block device inode when mounting
Unprivileged users should not be able to mount mtd block devices
when they lack sufficient privileges towards the block device
inode. Update mount_mtd() to validate that the user has the
required access to the inode at the specified path. The check
will be skipped for CAP_SYS_ADMIN, so privileged mounts will
continue working as before.
Seth Forshee [Wed, 7 Oct 2015 19:49:47 +0000 (14:49 -0500)]
UBUNTU: SAUCE: (namespace) block_dev: Check permissions towards block device inode when mounting
Unprivileged users should not be able to mount block devices when
they lack sufficient privileges towards the block device inode.
Update blkdev_get_by_path() to validate that the user has the
required access to the inode at the specified path. The check
will be skipped for CAP_SYS_ADMIN, so privileged mounts will
continue working as before.
UBUNTU: SAUCE: (namespace) block_dev: Support checking inode permissions in lookup_bdev()
When looking up a block device by path no permission check is
done to verify that the user has access to the block device inode
at the specified path. In some cases it may be necessary to
check permissions towards the inode, such as allowing
unprivileged users to mount block devices in user namespaces.
Add an argument to lookup_bdev() to optionally perform this
permission check. A value of 0 skips the permission check and
behaves the same as before. A non-zero value specifies the mask
of access rights required towards the inode at the specified
path. The check is always skipped if the user has CAP_SYS_ADMIN.
All callers of lookup_bdev() currently pass a mask of 0, so this
patch results in no functional change. Subsequent patches will
add permission checks where appropriate.
Seth Forshee [Tue, 19 Jan 2016 19:12:02 +0000 (13:12 -0600)]
UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs
The original mounter had CAP_SYS_ADMIN in the user namespace
where the mount happened, and the vfs has validated that the user
has permission to do the requested operation. This is sufficient
for allowing the kernel to write these specific xattrs, so we can
bypass the permission checks for these xattrs.
To support this, export __vfs_setxattr_noperm and add an similar
__vfs_removexattr_noperm which is also exported. Use these when
setting or removing trusted.overlayfs.* xattrs.
Colin Ian King [Mon, 6 Feb 2017 15:21:31 +0000 (15:21 +0000)]
UBUNTU: SAUCE: md/raid6 algorithms: scale test duration for speedier boots
The original code runs for a set run time based on 2^RAID6_TIME_JIFFIES_LG2.
The default kernel value for RAID6_TIME_JIFFIES_LG2 is 4, however, emperical
testing shows that a value of 3.5 is the sweet spot for getting consistent
benchmarking results and speeding up the run time of the benchmarking.
To achieve 2^3.5 we use the following:
2^3.5 = 2^4 / 2^0.5
= 2^4 / sqrt(2)
= 2^4 * 0.707106781
Too keep this as integer math that is as accurate as required and avoiding
overflow, this becomes:
= 2^4 * 181 / 256
= (2^4 * 181) >> 8
We also need to scale down perf by the same factor, however, to
get a good approximate integer result without an overflow we scale
by 2^4.0 * sqrt(2) =
= 2 ^ 4 * 1.41421356237
= 2 ^ 4 * 1448 / 1024
= (2 ^ 4 * 1448) >> 10
This has been tested on 2 AWS instances, a small t2 and a medium m3
with 30 boot tests each and compared to the same instances booted 30
times on an umodified kernel. In all results, we get the same
algorithms being selected and a 100% consistent result over the 30
boots, showing that this optimised jiffy timing scaling does not break
the original functionality.
On the t2.small we see a saving of ~0.126 seconds and t3.medium a saving of
~0.177 seconds.
Tested on a 4 CPU VM on an 8 thread Xeon server; seeing a saving of ~0.33
seconds (average over 10 boots).
Tested on a 8 thread Xeon server, seeing a saving of ~1.24 seconds (average
of 10 boots).
The testing included double checking the algorithm chosen by the optimized
selection and seeing the same as pre-optimised version.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1628889
Add support for automatic message tags to the printk macro
families dev_xyz and pr_xyz. The message tag consists of a
component name and a 24 bit hash of the message text. For
each message that is documented in the included kernel message
catalog a man page can be created with a script (which is
included in the patch). The generated man pages contain
explanatory text that is intended to help understand the
messages.
Note that only s390 specific messages are prepared
appropriately and included in the generated message catalog.
This patch is optional as it is very unlikely to be accepted
in upstream kernel, but is recommended for all distributions
which are built based on the 'Development stream'
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
[saf: Adjust context for v4.15-rc2] Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Ming Lei [Thu, 3 Nov 2016 01:20:01 +0000 (09:20 +0800)]
UBUNTU: SAUCE: hio: splitting bio in the entry of .make_request_fn
BugLink: http://bugs.launchpad.net/bugs/1638700
From v4.3, the incoming bio can be very big[1], and it is
required to split it first in .make_request_fn(), so
we need to do that for hio.c too.
Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
This program is free software; you can redistribute it and/or modify it
under the terms and conditions of the GNU General Public License,
version 2, as published by the Free Software Foundation.
This program is distributed in the hope it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com> BugLink: http://bugs.launchpad.net/bugs/1635594 Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Thu, 21 Jan 2016 21:37:53 +0000 (15:37 -0600)]
UBUNTU: SAUCE: overlayfs: Propogate nosuid from lower and upper mounts
An overlayfs mount using an upper or lower directory from a
nosuid filesystem bypasses this restriction. Change this so
that if any lower or upper directory is nosuid at mount time the
overlayfs superblock is marked nosuid. This requires some
additions at the vfs level since nosuid currently only applies to
mounts, so a SB_I_NOSUID flag is added along with a helper
function to check a path for nosuid in both the mount and the
superblock.
Seth Forshee [Thu, 21 Jan 2016 17:52:04 +0000 (11:52 -0600)]
UBUNTU: SAUCE: overlayfs: Be more careful about copying up sxid files
When an overlayfs filesystem's lowerdir is on a nosuid filesystem
but the upperdir is not, it's possible to copy up an sxid file or
stick directory into upperdir without changing the mode by
opening the file rw in the overlayfs mount without writing to it.
This makes it possible to bypass the nosuid restriction on the
lowerdir mount.
It's a bad idea in general to let the mounter copy up a sxid file
if the mounter wouldn't have had permission to create the sxid
file in the first place. Therefore change ovl_set_xattr to
exclude these bits when initially setting the mode, then set the
full mode after setting the user for the inode. This allows copy
up for non-sxid files to work as before but causes copy up to
fail for the cases where the user could not have created the sxid
inode in upperdir.
In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
is possible for a user-supplied ipt_entry structure to have a large
next_offset field. This field is not bounds checked prior to writing a
counter value at the supplied offset.
Problem is that xt_entry_foreach() macro stops iterating once e->next_offset
is out of bounds, assuming this is the last entry.
With malformed data thats not necessarily the case so we can
write outside of allocated area later as we might not have walked the
entire blob.
Fix this by simplifying mark_source_chains -- it already has to check
if nextoff is in range to catch invalid jumps, so just do the check
when we move to a next entry as well.
Also, check that the offset meets the xtables_entry alignment.
Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Chris J. Arges <chris.j.arges@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Ben Hutchings [Tue, 16 Aug 2016 16:27:00 +0000 (10:27 -0600)]
UBUNTU: SAUCE: security,perf: Allow further restriction of perf_event_open
https://lkml.org/lkml/2016/1/11/587
The GRKERNSEC_PERF_HARDEN feature extracted from grsecurity. Adds the
option to disable perf_event_open() entirely for unprivileged users.
This standalone version doesn't include making the variable read-only
(or renaming it).
When kernel.perf_event_open is set to 3 (or greater), disallow all
access to performance events by users without CAP_SYS_ADMIN.
Add a Kconfig symbol CONFIG_SECURITY_PERF_EVENTS_RESTRICT that
makes this value the default.
This is based on a similar feature in grsecurity
(CONFIG_GRKERNSEC_PERF_HARDEN). This version doesn't include making
the variable read-only. It also allows enabling further restriction
at run-time regardless of whether the default is changed.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
UBUNTU: SAUCE: Clear Linux: reduce e1000e boot time by tightening sleep ranges
The e1000e driver is a great user of the usleep_range() API,
and has any nice ranges that in principle help power management.
However the ranges that are used only during system startup are
very long (and can add easily 100 msec to the boot time) while
the power savings of such long ranges is irrelevant due to the
one-off, boot only, nature of these functions.
This patch shrinks some of the longest ranges to be shorter
(while still using a power friendly 1 msec range); this saves
100msec+ of boot time on my BDW NUCs
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Arjan van de Ven [Tue, 23 Jun 2015 06:26:52 +0000 (01:26 -0500)]
UBUNTU: SAUCE: Clear Linux: i8042: decrease debug message level to info
Author: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Miguel Bernal Marin <miguel.bernal.marin@linux.intel.com> Signed-off-by: Jose Carlos Venegas Munoz <jos.c.venegas.munoz@intel.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Seth Forshee [Tue, 19 Jan 2016 16:20:43 +0000 (10:20 -0600)]
UBUNTU: SAUCE: cred: Add clone_cred() interface
This interface returns a new set of credentials which is an exact
copy of another set. Also update prepare_kernel_cred() to use
this function instead of duplicating code.
Tetsuo Handa [Sat, 29 Mar 2014 06:39:24 +0000 (15:39 +0900)]
UBUNTU: SAUCE: kthread: Do not leave kthread_create() immediately upon SIGKILL.
Commit 786235ee "kthread: make kthread_create() killable" changed to
leave kthread_create() as soon as receiving SIGKILL. But this change
caused boot failures if systemd-udevd worker process received SIGKILL
due to systemd's hardcoded 30 seconds timeout while loading fusion
driver using finit_module() [1].
Linux kernel people think that the systemd's hardcoded timeout is a
systemd bug. But systemd people think that loading of kernel module
needs more than 30 seconds is a kernel module's bug.
Although Linux kernel people are expecting fusion driver module not
to take more than 30 seconds, it will definitely not in time for
trusty kernel. Also, nobody can prove that fusion driver module is
the only case which is affected by commit 786235ee.
Therefore, this patch changes kthread_create() to wait for up to 10
seconds after receiving SIGKILL, unless chosen by the OOM killer,
in order to give the kthreadd a chance to complete the request.
The side effect of this patch is that current thread's response to
SIGKILL is delayed for a bit (likely less than a second, unlikely
10 seconds).
Andy Whitcroft [Wed, 16 Apr 2014 18:40:57 +0000 (19:40 +0100)]
UBUNTU: SAUCE: vt -- maintain bootloader screen mode and content until vt switch
Introduce a new VT mode KD_TRANSPARENT which endevours to leave the current
content of the framebuffer untouched. This allows the bootloader to insert
a graphical splash and have the kernel maintain it until the OS splash
can take over. When we finally switch away (either through programs like
plymouth or manually) the content is lost and the VT reverts to text mode.
Seth Forshee [Tue, 17 Jan 2017 21:19:39 +0000 (15:19 -0600)]
UBUNTU: SAUCE: (no-up) i915: Remove MODULE_FIRMWARE statements for unreleased firmware
BugLink: http://bugs.launchpad.net/bugs/1626740
Intel has added MODULE_FIRMWARE statements to i915 which refer to
firmware files that they have not yet pushed out to upstream
linux-firmware. This causes the following warnings when
generating the initrd:
W: Possible missing firmware /lib/firmware/i915/kbl_guc_ver9_14.bin for module i915
W: Possible missing firmware /lib/firmware/i915/bxt_guc_ver8_7.bin for module i915
This firmware is clearly optional, and the warnings have been
generating a lot of confusion for users. Remove the offending
MODULE_FIRMWARE statements until Intel makes these files
available.
UBUNTU: SAUCE: (no-up) ACPI: Disable Windows 8 compatibility for some Lenovo ThinkPads
The AML implementation for brightness control on several ThinkPads
contains a workaround to meet a Windows 8 requirement of 101 brightness
levels [1]. The implementation is flawed, as only 16 of the brighness
values reported by _BCL affect a change in brightness. _BCM silently
discards the rest of the values. Disabling Windows 8 compatibility on
these machines reverts them to the old behavior, making _BCL only report
the 16 brightness levels which actually work. Add a quirk to do this
along with a dmi callback to disable Win8 compatibility.
The kernel boot parameter 'nr_cpus=' allows one to specify number of
possible cpus in the system. In the normal scenario the first cpu (cpu0)
that shows up is the boot cpu and hence it gets covered under nr_cpus
limit.
But this assumption will be broken in kdump scenario where kdump kenrel
after a crash can boot up on an non-zero boot cpu. The paca structure
allocation depends on value of nr_cpus and is indexed using logical cpu
ids. This definetly will be an issue if boot cpu id > nr_cpus
This patch modifies allocate_pacas() and smp_setup_cpu_maps() to
accommodate boot cpu for the case where boot_cpuid > nr_cpu_ids.
This change would help to reduce the memory reservation requirement for
kdump on ppc64.
BugLink: http://bugs.launchpad.net/bugs/1558828
In case of ARCH_THUNDER, there is a need to allocate the GICv3 ITS table
which is bigger than the allowed max order. So we are forcing it only in
case of 4KB page size.
Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Signed-off-by: Robert Richter <rrichter@cavium.com>
[ dannf: Depend on ARM64_4K_PAGES instead of !ARM64_64K_PAGES now that
16K pages are available ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Mehmet Kayaalp [Thu, 10 Mar 2016 21:22:13 +0000 (16:22 -0500)]
UBUNTU: SAUCE: (noup) KEYS: Support for inserting a certificate into x86 bzImage
BugLink: http://bugs.launchpad.net/bugs/1558553
The config option SYSTEM_EXTRA_CERTIFICATE reserves space in vmlinux file,
which is compressed to create the self-extracting bzImage. This patch adds the
capability of extracting the vmlinux, inserting the certificate, and
repackaging the result into a bzImage.
It only works if the resulting compressed vmlinux is smaller than the original.
Otherwise re-linking would be required. To make the reserved space allocate
actual space in bzImage, a null key is inserted into vmlinux before creating
the bzImage:
make vmlinux
scripts/insert-sys-cert -b vmlinux -c /dev/null
make bzImage
After null key insertion, the script populates the rest of the reserved space
with random bytes, which have poor compression. After receiving a bzImage that
is created this way, actual certificate can be inserted into the bzImage:
Andy Whitcroft [Fri, 27 Nov 2015 17:38:30 +0000 (17:38 +0000)]
UBUNTU: SAUCE: (no-up) add compat_uts_machine= kernel command line override
We wish to use the arm64 buildds to build armhf binaries in 32bit chroots.
To make this work we need uname to return armv7l machine type. To achieve
this add a kernel command line override for the 32bit machine type.
Add compat_uts_machine=<type> to allow the LINUX32 personality to return
that type for uname.
BugLink: http://bugs.launchpad.net/bugs/1210848
On an ASUSTek G60JX laptop, the intel_ips driver spams the log with a warning message: "ME failed to update for more than 1s, likely hung". This ME doesn't support the feature, so requesting it be blacklisted for now.
Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Nick Jenkins <tech.crew.jenkins@gmail.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
UBUNTU: SAUCE: (no-up) trace: add trace events for open(), exec() and uselib() (for v3.7+)
BugLink: http://bugs.launchpad.net/bugs/462111
This patch uses TRACE_EVENT to add tracepoints for the open(),
exec() and uselib() syscalls so that ureadahead can cheaply trace
the boot sequence to determine what to read to speed up the next.
It's not upstream because it will need to be rebased onto the syscall
trace events whenever that gets merged, and is a stop-gap.
[apw@canonical.com: updated for v3.7 and later.]
[apw@canonical.com: updated for v3.19 and later.] BugLink: http://bugs.launchpad.net/bugs/1085766 Signed-off-by: Scott James Remnant <scott@ubuntu.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com> Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Conflicts:
fs/open.c
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Xiangliang Yu [Thu, 7 Mar 2013 14:29:16 +0000 (14:29 +0000)]
UBUNTU: SAUCE: (no-up) PCI: fix system hang issue of Marvell SATA host controller
BugLink: http://bugs.launchpad.net/bugs/1159863
Hassle someone if this patch hasn't been removed by 13.10.
See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1159863/comments/2
Fix system hang issue: if first accessed resource file of BAR0 ~
BAR4, system will hang after executing lspci command
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Instead of SEMI_MT, present a full mt interface with simulated contact
positions for >=3 fingers. Enables e.g. multi-finger tap and drag for
old userspace applications which only count the contact positions.
Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
BugLink: http://bugs.launchpad.net/bugs/1084192
Reverting this in the kernel as opposed to adding a sysctl
to the procps package guarentees that this regression will be
propagated to the Raring LTS kernel.
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Andy Whitcroft [Tue, 3 Apr 2012 10:42:41 +0000 (11:42 +0100)]
UBUNTU: SAUCE: (no-up) elide some ioctl warnings which are known benign
BugLink: http://bugs.launchpad.net/bugs/972355
We have been seeing increasing reports of scarey ioctl messages in
dmesg, such as the below often in bulk:
mdadm: sending ioctl 1261 to a partition!
mdadm: sending ioctl 800c0910 to a partition!
Looking at the upstream discussions these are all benign and can be safely
suppressed. This patch is based on some discussions at the link below,
on some work SUSE did in this area. This is not suitable for upstreaming
as we need some refactoring to fix the 32bit compat ioctl mess.
Andy Whitcroft [Fri, 3 Dec 2010 09:51:33 +0000 (09:51 +0000)]
UBUNTU: SAUCE: (no-up) add support for installed header files to ubuntu directory
BugLink: http://bugs.launchpad.net/bugs/684666
We need the aufs headers in the linux-libc-headers, add support for
including files from the ubuntu include directory.
Andy Whitcroft [Thu, 11 Mar 2010 10:33:36 +0000 (10:33 +0000)]
UBUNTU: SAUCE: (no-up) cdrom -- default to not locking the tray when in use
BugLink: http://bugs.launchpad.net/bugs/397734
It seems that users are have a high expectation that the eject button
on their CDROM drive will eject the disk regardless of whether it is in
use or not. To this end we are now changing the default LOCK mode for
mounted CDROMS to 0 to allow ejects. This however does not handle the
direct open cases like music and video players. From the launchpad bug
commentary:
So, according to the upstream discussion David Zeuthen recommended
to just not lock CD-ROM trays by default. Kernel/userspace already
handles prematurely removed USB storage devices reasonably, and with
read-only devices like CD-ROMs it is even less of an issue. So we
should just set /proc/sys/dev/cdrom/lock to 0 by default.
Note that we still will have the drive mounted after the eject. There is a
media change uevent generated and this will be used to trigger the unmount
of the drive in udisks. The burner software will also have to be looked
at to ensure they are explicitly locking the drive closed during the burn.
This will all be handled under the bug above.
Signed-off-by: Andy Whitcroft <apw@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Acked-by: Colin King <colin.king@canonical.com>
Linus Torvalds [Sun, 28 Jan 2018 20:19:23 +0000 (12:19 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of small fixes for 4.15:
- Fix vmapped stack synchronization on systems with 4-level paging
and a large amount of memory caused by a missing 5-level folding
which made the pgd synchronization logic to fail and causing double
faults.
- Add a missing sanity check in the vmalloc_fault() logic on 5-level
paging systems.
- Bring back protection against accessing a freed initrd in the
microcode loader which was lost by a wrong merge conflict
resolution.
- Extend the Broadwell micro code loading sanity check.
- Add a missing ENDPROC annotation in ftrace assembly code which
makes ORC unhappy.
- Prevent loading the AMD power module on !AMD platforms. The load
itself is uncritical, but an unload attempt results in a kernel
crash.
- Update Peter Anvins role in the MAINTAINERS file"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ftrace: Add one more ENDPROC annotation
x86: Mark hpa as a "Designated Reviewer" for the time being
x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels
x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems
x86/microcode: Fix again accessing initrd after having been freed
x86/microcode/intel: Extend BDW late-loading further with LLC size check
perf/x86/amd/power: Do not load AMD power module on !AMD platforms
Linus Torvalds [Sun, 28 Jan 2018 20:17:35 +0000 (12:17 -0800)]
Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
"A single fix for a ~10 years old problem which causes high resolution
timers to stop after a CPU unplug/plug cycle due to a stale flag in
the per CPU hrtimer base struct.
Paul McKenney was hunting this for about a year, but the heisenbug
nature made it resistant against debug attempts for quite some time"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
Thomas Gleixner [Fri, 26 Jan 2018 13:54:32 +0000 (14:54 +0100)]
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.
If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.
Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.
Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.
Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic") Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Sewior <bigeasy@linutronix.de> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
H. Peter Anvin [Thu, 25 Jan 2018 19:59:34 +0000 (11:59 -0800)]
x86: Mark hpa as a "Designated Reviewer" for the time being
Due to some unfortunate events, I have not been directly involved in
the x86 kernel patch flow for a while now. I have also not been able
to ramp back up by now like I had hoped to, and after reviewing what I
will need to work on both internally at Intel and elsewhere in the near
term, it is clear that I am not going to be able to ramp back up until
late 2018 at the very earliest.
It is not acceptable to not recognize that this load is currently
taken by Ingo and Thomas without my direct participation, so I mark
myself as R: (designated reviewer) rather than M: (maintainer) until
further notice. This is in fact recognizing the de facto situation
for the past few years.
I have obviously no intention of going away, and I will do everything
within my power to improve Linux on x86 and x86 for Linux. This,
however, puts credit where it is due and reflects a change of focus.
This patch also removes stale entries for portions of the x86
architecture which have not been maintained separately from arch/x86
for a long time. If there is a reason to re-introduce them then that
can happen later.
Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Fri, 26 Jan 2018 23:10:50 +0000 (15:10 -0800)]
Merge tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
Pull RISC-V update from Palmer Dabbelt:
"RISC-V: We have a new mailing list and git repo!
Sorry to send something essentially as late as possible (Friday after
an rc9), but we managed to get a mailing list for the RISC-V Linux
port. We've been using patches@groups.riscv.org for a while, but that
list has some problems (it's Google Groups and it's shared over all
RISC-V software projects). The new infaread.org list is much better.
We just got it on Wednesday but I used it a bit on Thursday to shake
out all the configuration problems and it appears to be in working
order.
When I updated the mailing list I noticed that the MAINTAINERS file
was pointing to our github repo, but now that we have a kernel.org
repo I'd like to point to that instead so I changed that as well.
We'll be centralizing all RISC-V Linux related development here as
that seems to be the saner way to go about it.
I can understand if it's too late to get this into 4.15, but given
that it's not a code change I was hoping it'd still be OK. It would be
nice to have the new mailing list and git repo in the release tarballs
so when people start to find bugs they'll get to the right place"
* tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
Update the RISC-V MAINTAINERS file