mirror_ubuntu-artful-kernel.git
8 years agoUBUNTU: [Debian] Improve tools version message
Andy Whitcroft [Thu, 5 Dec 2013 18:14:04 +0000 (18:14 +0000)]
UBUNTU: [Debian] Improve tools version message

BugLink: http://bugs.launchpad.net/bugs/1257715
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] Re-sign modules after debug objcopy
Tim Gardner [Tue, 26 Nov 2013 17:35:47 +0000 (10:35 -0700)]
UBUNTU: [Debian] Re-sign modules after debug objcopy

BugLink: http://bugs.launchpad.net/bugs/1253155
Adding a GNU debug link to a module ELF destroys the
module signature, so re-sign the module file after the objcopy.

objcopy --add-gnu-debuglink=$(dbgpkgdir)/usr/lib/debug/$$module $(pkgdir)/$$module;
scripts/sign-file $(CONFIG_MODULE_SIG_HASH) $(MODSECKEY) $(MODPUBKEY) $(pkgdir)/$$module;

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] sort out linux-tools naming
Andy Whitcroft [Fri, 26 Jul 2013 10:48:03 +0000 (11:48 +0100)]
UBUNTU: [Debian] sort out linux-tools naming

BugLink: http://bugs.launchpad.net/bugs/1205284
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] linux-tools: switch to common generic version helper
Andy Whitcroft [Tue, 13 Aug 2013 13:19:05 +0000 (14:19 +0100)]
UBUNTU: [Debian] linux-tools: switch to common generic version helper

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] postinst -- fix unchanged link detection
Andy Whitcroft [Tue, 5 Nov 2013 12:21:19 +0000 (12:21 +0000)]
UBUNTU: [Debian] postinst -- fix unchanged link detection

http://bugs.launchpad.net/bugs/1248053
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] postinst -- improve relative symlink detection with missing files
Andy Whitcroft [Tue, 5 Nov 2013 11:12:07 +0000 (11:12 +0000)]
UBUNTU: [Debian] postinst -- improve relative symlink detection with missing files

When the symlinks are made we attempt to use relative links if that
would work.  However, this relies on the target of the link actually
existing.  When it does not, we fall back to absolute links.  This impacts
the initrd links, which are made before we make the initrd itself.

When the caller has asked us to use a specific handle file and that file
does not yet exist, see if there are any other files we can use in that
directory.  In the common case this will be a version specific file and
highly unique.

BugLink: http://bugs.launchpad.net/bugs/1248053
Signed-off-by: Andy Whitcroft <apw@canonical.com>
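
The fallback described in the commit above can be sketched roughly as follows. This is an illustration in Python, not the postinst code itself; the function names `pick_handle` and `link_target` and the way the relative path is probed are assumptions made for the example.

```python
import os

def pick_handle(path):
    """Return `path` if it exists, otherwise any other entry in the same
    directory that can stand in for it, otherwise None (hypothetical helper)."""
    if os.path.exists(path):
        return path
    directory = os.path.dirname(path) or "."
    try:
        names = sorted(os.listdir(directory))
    except OSError:
        return None
    for name in names:
        return os.path.join(directory, name)
    return None

def link_target(link_dir, target):
    """Prefer a relative link target when a handle file shows that the
    relative path resolves from link_dir; otherwise use an absolute one."""
    handle = pick_handle(target)
    if handle is None:
        return os.path.abspath(target)
    target_dir = os.path.dirname(os.path.abspath(target))
    rel_dir = os.path.relpath(target_dir, link_dir)
    # Probe the relative route using the handle, which does exist.
    probe = os.path.join(link_dir, rel_dir, os.path.basename(handle))
    if os.path.exists(probe):
        return os.path.join(rel_dir, os.path.basename(target))
    return os.path.abspath(target)
```

With a not-yet-built initrd as the target, the kernel image sitting in the same directory would serve as the handle, so the link can still be made relative.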
8 years agoUBUNTU: [Debian] getabis: Commit new ABI directory, remove the old
Tim Gardner [Tue, 10 Sep 2013 14:30:04 +0000 (08:30 -0600)]
UBUNTU: [Debian] getabis: Commit new ABI directory, remove the old

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] Add hv_vss_daemon to tools package
Tim Gardner [Mon, 19 Aug 2013 18:29:47 +0000 (12:29 -0600)]
UBUNTU: [Debian] Add hv_vss_daemon to tools package

BugLink: http://bugs.launchpad.net/bugs/1213282
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] tools: ship 'cpupower' in linux-tools
Kamal Mostafa [Fri, 9 Aug 2013 23:03:46 +0000 (16:03 -0700)]
UBUNTU: [debian] tools: ship 'cpupower' in linux-tools

BugLink: http://bugs.launchpad.net/bugs/1158668
New Build-depends: libpci-dev

Adds to binary package "linux-tools-$(abi_version)"
  /usr/bin/cpupower_$(abi_version)
  /usr/lib/libcpupower.so.$(abi_version)

Adds to binary package "linux-tools-common"
  /usr/bin/cpupower
  /usr/share/man/man1/cpupower-set.1.gz
  /usr/share/man/man1/cpupower-frequency-set.1.gz
  /usr/share/man/man1/cpupower-frequency-info.1.gz
  /usr/share/man/man1/cpupower-monitor.1.gz
  /usr/share/man/man1/cpupower-info.1.gz
  /usr/share/man/man1/cpupower-idle-info.1.gz
  /usr/share/man/man1/cpupower.1.gz

Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] tools: enable x86 and hyper-v
Andy Whitcroft [Tue, 13 Aug 2013 12:18:47 +0000 (13:18 +0100)]
UBUNTU: [Debian] tools: enable x86 and hyper-v

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] autopkgtest: switch Depends: to build-essential
Andy Whitcroft [Sat, 22 Jun 2013 11:52:29 +0000 (12:52 +0100)]
UBUNTU: [Debian] autopkgtest: switch Depends: to build-essential

An empty Depends: in the autopkgtest control file now seems to be an
error.  We want to say 'install nothing', so switch it to depend on
packages which by definition are always installed.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] supply perf with appropriate prefix to ensure use of local config
Andy Whitcroft [Wed, 31 Jul 2013 12:41:32 +0000 (13:41 +0100)]
UBUNTU: [Debian] supply perf with appropriate prefix to ensure use of local config

If we do not supply an installation prefix when we are building perf
it will assume it is designed to run relative to the builder's HOME.
This means that as built on a buildd we will check for the system
configuration relative to the buildd user's home rather than in /etc.
This implies a local user could use this to compromise other users _if_
there is a buildd user installed on the system and they have access to it.

CVE-2013-1060
BugLink: http://bugs.launchpad.net/bugs/1206200
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] Explicitly reference gawk in build rules awk in all and arch
Andy Whitcroft [Fri, 26 Jul 2013 16:55:44 +0000 (17:55 +0100)]
UBUNTU: [Debian] Explicitly reference gawk in build rules awk in all and arch

Explicitly reference gawk in all rules files. Fixes FTBS on the buildds.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] fix SRCPKGNAME-udebs-FLAVOUR handling for complex flavours
Andy Whitcroft [Thu, 25 Jul 2013 14:24:34 +0000 (15:24 +0100)]
UBUNTU: [Debian] fix SRCPKGNAME-udebs-FLAVOUR handling for complex flavours

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [Debian] Supply PKG_ABI in kmake
Tim Gardner [Wed, 24 Jul 2013 18:46:43 +0000 (12:46 -0600)]
UBUNTU: [Debian] Supply PKG_ABI in kmake

BugLink: http://bugs.launchpad.net/bugs/1193172
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] reduce udeb rules spew
Tim Gardner [Wed, 24 Jul 2013 15:50:34 +0000 (09:50 -0600)]
UBUNTU: [Debian] reduce udeb rules spew

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] Prepare to build using arch specific compiler
Tim Gardner [Wed, 10 Jul 2013 19:51:09 +0000 (13:51 -0600)]
UBUNTU: [Debian] Prepare to build using arch specific compiler

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] do_tools=false when cross compiling
Tim Gardner [Thu, 20 Jun 2013 16:13:57 +0000 (16:13 +0000)]
UBUNTU: [Debian] do_tools=false when cross compiling

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [Debian] generate a SRCPKGNAME-udebs-FLAVOUR-di depending on all built udebs
Andy Whitcroft [Thu, 18 Jul 2013 13:11:26 +0000 (14:11 +0100)]
UBUNTU: [Debian] generate a SRCPKGNAME-udebs-FLAVOUR-di depending on all built udebs

Build a nice little meta package for use in the seeds.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] Use dh_strip
Tim Gardner [Thu, 20 Jun 2013 14:06:04 +0000 (08:06 -0600)]
UBUNTU: [debian] Use dh_strip

BugLink: http://bugs.launchpad.net/bugs/1192759
Rely on dh_strip to strip any binaries for the
host arch instead of using install -s

Signed-off-by: Steve Langasek <steve.langasek@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: (debian) get-firmware: Be more selective about copies
Tim Gardner [Wed, 10 Apr 2013 19:19:02 +0000 (13:19 -0600)]
UBUNTU: (debian) get-firmware: Be more selective about copies

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: (debian) fix internal linkage for separated header packages
Andy Whitcroft [Wed, 10 Apr 2013 11:53:26 +0000 (12:53 +0100)]
UBUNTU: (debian) fix internal linkage for separated header packages

BugLink: http://bugs.launchpad.net/bugs/1165259
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: (debian) Abort build on unresolved symbols
Stefan Bader [Tue, 9 Apr 2013 17:18:46 +0000 (19:18 +0200)]
UBUNTU: (debian) Abort build on unresolved symbols

When splitting the flavours of a module into the extras and base
package, we already run depmod. Unfortunately this only produces
warnings when modules in the base package have unresolved
dependencies.
This change will abort the build in that case, so we can fix things.

BugLink: http://bugs.launchpad.net/bugs/1166197
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
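
A minimal sketch of turning depmod warnings into a hard failure, as the commit above describes. It is written in Python for illustration only; the real check lives in the packaging rules. The warning wording matched here ("needs unknown symbol") and the function names are assumptions.

```python
import re

# Assumed shape of the depmod warning line; adjust if the tool's wording differs.
UNRESOLVED = re.compile(r"needs unknown symbol (\S+)")

def unresolved_symbols(depmod_output):
    """Collect the symbol names depmod complained about, without duplicates."""
    return sorted(set(UNRESOLVED.findall(depmod_output)))

def check_depmod(depmod_output):
    """Raise instead of merely warning when the base package has unresolved symbols."""
    missing = unresolved_symbols(depmod_output)
    if missing:
        raise RuntimeError("unresolved symbols in base package: " + ", ".join(missing))
```

Feeding the captured stderr of the depmod run to `check_depmod` would stop the build at the point where the warning was previously only logged.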
8 years agoUBUNTU: [debian] Specify python2.7 for perf tools build
Tim Gardner [Fri, 5 Apr 2013 15:06:41 +0000 (09:06 -0600)]
UBUNTU: [debian] Specify python2.7 for perf tools build

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] do not use ../.$(series)-env file
Kamal Mostafa [Wed, 13 Mar 2013 22:20:04 +0000 (15:20 -0700)]
UBUNTU: [debian] do not use ../.$(series)-env file

Trying to use a file from ../ outside the tree seems like a bad idea, and
the series="oneiric" value here is stale by three releases now.  Kill this
apparently unused "feature".

Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] perf: NO_LIBPERL=1
Tim Gardner [Mon, 11 Feb 2013 17:54:27 +0000 (10:54 -0700)]
UBUNTU: [debian] perf: NO_LIBPERL=1

Disable building perl libraries until such time as they are
actually packaged.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] perf: NO_LIBPYTHON=1
Tim Gardner [Mon, 11 Feb 2013 15:07:31 +0000 (08:07 -0700)]
UBUNTU: [debian] perf: NO_LIBPYTHON=1

Avoid a build dependency on python. Don't build python
support libraries since they aren't actually packaged.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Build extras package only for specific arches
Tim Gardner [Fri, 22 Mar 2013 12:53:48 +0000 (06:53 -0600)]
UBUNTU: [debian] Build extras package only for specific arches

An unwanted side effect of renaming arm omap4 to generic is that
the default rule is to create an extras package for flavours named 'generic'.
Furthermore, we stupidly tied the extras package logic to the flavour name 'generic'.

Defeat this side effect by specifying which architectures get an extras package
split, e.g., x86_64 and i386 in the arch specific make file.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Remove dangling symlink from headers package
Tim Gardner [Tue, 5 Feb 2013 15:42:23 +0000 (08:42 -0700)]
UBUNTU: [debian] Remove dangling symlink from headers package

BugLink: http://bugs.launchpad.net/bugs/1112442
Signed-off-by: Herton Krzesinski <herton.krzesinski@canonical.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] DTB: add support for multiple DTBs
Paolo Pisati [Wed, 9 Jan 2013 10:27:47 +0000 (10:27 +0000)]
UBUNTU: [debian] DTB: add support for multiple DTBs

Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com>
Acked-by: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com>
8 years agoUBUNTU: [debian] Add macro to selectively disable building perf
Tim Gardner [Thu, 10 Jan 2013 19:41:32 +0000 (12:41 -0700)]
UBUNTU: [debian] Add macro to selectively disable building perf

Fixes FTBS until libaudit-dev is promoted to main.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] insertchanges -- fix to work across major version changes
Andy Whitcroft [Thu, 3 Jan 2013 12:14:33 +0000 (12:14 +0000)]
UBUNTU: [debian] insertchanges -- fix to work across major version changes

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] Moved scripts/fw-to-ihex.sh to debian/scripts/misc
Tim Gardner [Wed, 28 Nov 2012 15:17:41 +0000 (08:17 -0700)]
UBUNTU: [debian] Moved scripts/fw-to-ihex.sh to debian/scripts/misc

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] find-obsolete-firmware: Use correct path
Tim Gardner [Mon, 3 Dec 2012 16:36:33 +0000 (09:36 -0700)]
UBUNTU: [debian] find-obsolete-firmware: Use correct path

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] get-firmware: Filter new files through fwinfo
Tim Gardner [Wed, 28 Nov 2012 16:41:56 +0000 (09:41 -0700)]
UBUNTU: [debian] get-firmware: Filter new files through fwinfo

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] hmake -j1
Tim Gardner [Tue, 27 Nov 2012 19:56:28 +0000 (12:56 -0700)]
UBUNTU: [debian] hmake -j1

The kernel makefile appears to have parallel dependency
problems for the install_headers target.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] add an autopkgtest rebuild test
Andy Whitcroft [Wed, 21 Nov 2012 11:13:50 +0000 (11:13 +0000)]
UBUNTU: [debian] add an autopkgtest rebuild test

The plan here is for linux, gcc, binutils, and eglibc to all depend on
each other and to all have a rebuild test.  That way the entire set is
rebuild tested for any one in the set being uploaded.

BugLink: http://bugs.launchpad.net/bugs/1081500
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] move build tests out of the way
Andy Whitcroft [Wed, 21 Nov 2012 11:01:09 +0000 (11:01 +0000)]
UBUNTU: [debian] move build tests out of the way

The new Debian autopkgtest system takes ownership of the debian/tests
directory, in such a way that is incompatible with our usage.  Move our
tests to debian/tests-build as they are build tests.

BugLink: http://bugs.launchpad.net/bugs/1081500
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] add rebuild-test support for autopkgtest
Andy Whitcroft [Wed, 21 Nov 2012 10:00:25 +0000 (10:00 +0000)]
UBUNTU: [debian] add rebuild-test support for autopkgtest

Add support for the DEB_BUILD_OPTIONS rebuild-test which indicates this is
not a full build but a quick smoke test.  For us, short-circuit the build
and only make the first flavour on the assumption it is representative
of the others.

BugLink: http://bugs.launchpad.net/bugs/1081500
Signed-off-by: Andy Whitcroft <apw@canonical.com>
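
The short circuit described above amounts to something like the following sketch. It is in Python for illustration; the real logic is in the debian rules files, and `select_flavours` and the flavour names in the usage note are hypothetical.

```python
def select_flavours(flavours, deb_build_options):
    """Return the flavours to build: only the first one when the
    space-separated DEB_BUILD_OPTIONS contains rebuild-test."""
    options = deb_build_options.split()
    if "rebuild-test" in options:
        return list(flavours[:1])
    return list(flavours)
```

For example, `select_flavours(["generic", "lowlatency"], "parallel=4 rebuild-test")` keeps only the first flavour, on the assumption stated in the commit that it is representative of the others.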
8 years agoUBUNTU: [debian] script to detect obsolete firmware
Tim Gardner [Fri, 16 Nov 2012 16:24:10 +0000 (09:24 -0700)]
UBUNTU: [debian] script to detect obsolete firmware

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Use SRCPKGNAME as prefix for indep linux headers package
Ben Collins [Fri, 9 Nov 2012 19:17:12 +0000 (14:17 -0500)]
UBUNTU: [debian] Use SRCPKGNAME as prefix for indep linux headers package

[apw@canonical.com: forward ported to new cleaned up indep stack.]
Signed-off-by: Ben Collins <ben.c@servergy.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] Document binary-indep dependency chain
Tim Gardner [Fri, 9 Nov 2012 15:41:35 +0000 (08:41 -0700)]
UBUNTU: [debian] Document binary-indep dependency chain

Move some code around to directly reflect the dependency chain ordering.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Use dh_prep instead of 'dh_clean -k'
Tim Gardner [Fri, 9 Nov 2012 15:09:39 +0000 (08:09 -0700)]
UBUNTU: [debian] Use dh_prep instead of 'dh_clean -k'

dh_prep needs to be run only once at the root of the
binary-indep dependency chain, i.e., install-headers.

Similarly, dh_testdir and dh_testroot only need to be run once at
the root of the binary-indep dependency chain.

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] enforcer -- fix debugging output
Andy Whitcroft [Fri, 9 Nov 2012 15:55:10 +0000 (15:55 +0000)]
UBUNTU: [debian] enforcer -- fix debugging output

Fix up some confusingly wrong debugging output in the config checker.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] Add custom_override rule to allow for alternate kernel file/install
Ben Collins [Thu, 8 Nov 2012 20:01:31 +0000 (15:01 -0500)]
UBUNTU: [debian] Add custom_override rule to allow for alternate kernel file/install

On PowerPC, the flavours have different make targets and installable
images. For example, e500/e500mc use a target of uImage, since that is
the native format for U-Boot systems.

Signed-off-by: Ben Collins <ben.c@servergy.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Update armhf comments.
Tim Gardner [Fri, 9 Nov 2012 00:14:08 +0000 (19:14 -0500)]
UBUNTU: [debian] Update armhf comments.

https://lists.ubuntu.com/archives/ubuntu-devel/2012-November/036106.html

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] drop manual headers copy
Andy Whitcroft [Thu, 1 Nov 2012 14:00:39 +0000 (14:00 +0000)]
UBUNTU: [debian] drop manual headers copy

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] bootstrap: switch to the new DEB_BUILD_PROFILE
Andy Whitcroft [Thu, 1 Nov 2012 13:42:28 +0000 (13:42 +0000)]
UBUNTU: [debian] bootstrap: switch to the new DEB_BUILD_PROFILE

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] do not fail secure copy on older kernels
Andy Whitcroft [Wed, 10 Oct 2012 16:58:24 +0000 (17:58 +0100)]
UBUNTU: [debian] do not fail secure copy on older kernels

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] allow us to select which builds have uefi signed versions
Andy Whitcroft [Fri, 5 Oct 2012 08:43:00 +0000 (09:43 +0100)]
UBUNTU: [debian] allow us to select which builds have uefi signed versions

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] we already have a valid src_pkg_name
Andy Whitcroft [Fri, 5 Oct 2012 08:51:51 +0000 (09:51 +0100)]
UBUNTU: [debian] we already have a valid src_pkg_name

Signed-off-by: Andy Whitcroft <apw@canonical.com>
8 years agoUBUNTU: [debian] add custom upload for the kernel binary package
Andy Whitcroft [Thu, 20 Sep 2012 18:49:03 +0000 (19:49 +0100)]
UBUNTU: [debian] add custom upload for the kernel binary package

Pick out the kernel binaries and add them to a custom upload.  This upload
will trigger signing of the contained files which will later be pulled
into linux-*-signed packages.

Only include amd64 kernels as we only support EFI signed packages there.
Also ensure the kernel has a high enough interface version >= 0x020b
otherwise we may end up with an unsafe kernel loaded.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
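
A sketch of the version gate mentioned above, assuming that "interface version" refers to the x86 boot protocol version stored in the kernel setup header (the "HdrS" magic at offset 0x202, followed by a 16-bit little-endian version at 0x206). The commit does not say how the check is implemented; the function names here are hypothetical and the code is illustrative Python, not the packaging script.

```python
import struct

def boot_protocol_version(image):
    """Return the x86 boot protocol version from a kernel image, or None
    if the setup header magic is missing."""
    if len(image) < 0x208 or image[0x202:0x206] != b"HdrS":
        return None
    (version,) = struct.unpack_from("<H", image, 0x206)
    return version

def safe_to_sign(image, minimum=0x020B):
    """Only accept images whose boot protocol is at least `minimum`."""
    version = boot_protocol_version(image)
    return version is not None and version >= minimum
```

Images that fail the check would simply be left out of the custom upload rather than being signed.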
8 years agoUBUNTU: [debian] note directory name changes
Tim Gardner [Thu, 18 Oct 2012 19:52:27 +0000 (20:52 +0100)]
UBUNTU: [debian] note directory name changes

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoUBUNTU: [debian] Initial debian and ubuntu directories
Leann Ogasawara [Sat, 13 Mar 2010 01:13:25 +0000 (17:13 -0800)]
UBUNTU: [debian] Initial debian and ubuntu directories

Ignore: yes
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
8 years agoLinux 4.4.3
Greg Kroah-Hartman [Thu, 25 Feb 2016 20:01:36 +0000 (12:01 -0800)]
Linux 4.4.3

8 years agomodules: fix modparam async_probe request
Luis R. Rodriguez [Wed, 3 Feb 2016 06:25:26 +0000 (16:55 +1030)]
modules: fix modparam async_probe request

commit 4355efbd80482a961cae849281a8ef866e53d55c upstream.

Commit f2411da746985 ("driver-core: add driver module
asynchronous probe support") added async probe support,
in two forms:

  * in-kernel driver specification annotation
  * generic async_probe module parameter (modprobe foo async_probe)

To support the generic kernel parameter, parse_args() was
extended via commit ecc8617053e0 ("module: add extra
argument for parse_params() callback"); however, commit
f2411da746985 failed to add the required argument.

This causes a crash whenever the async_probe generic
module parameter is used. This was overlooked when the
form in which in-kernel async probe support was reworked
a bit... Fix this as originally intended.

Cc: Hannes Reinecke <hare@suse.de>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> [minimized]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomodule: wrapper for symbol name.
Rusty Russell [Wed, 3 Feb 2016 06:25:26 +0000 (16:55 +1030)]
module: wrapper for symbol name.

commit 2e7bac536106236104e9e339531ff0fcdb7b8147 upstream.

This trivial wrapper adds clarity and makes the following patch
smaller.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoitimers: Handle relative timers with CONFIG_TIME_LOW_RES proper
Thomas Gleixner [Thu, 14 Jan 2016 16:54:48 +0000 (16:54 +0000)]
itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper

commit 51cbb5242a41700a3f250ecfb48dcfb7e4375ea4 upstream.

As Helge reported for timerfd we have the same issue in itimers. We return
remaining time larger than the programmed relative time to user space in case
of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra time
added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Helge Deller <deller@gmx.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.528222587@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoposix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper
Thomas Gleixner [Thu, 14 Jan 2016 16:54:47 +0000 (16:54 +0000)]
posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper

commit 572c39172684c3711e4a03c9a7380067e2b0661c upstream.

As Helge reported for timerfd we have the same issue in posix timers. We
return remaining time larger than the programmed relative time to user space
in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra
time added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Helge Deller <deller@gmx.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.450510905@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agotimerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper
Thomas Gleixner [Thu, 14 Jan 2016 16:54:46 +0000 (16:54 +0000)]
timerfd: Handle relative timers with CONFIG_TIME_LOW_RES proper

commit b62526ed11a1fe3861ab98d40b7fdab8981d788a upstream.

Helge reported that a relative timer can return a remaining time larger than
the programmed relative time on parisc and other architectures which have
CONFIG_TIME_LOW_RES set. This happens because we add a jiffie to the resulting
expiry time to prevent short timeouts.

Use the new function hrtimer_expires_remaining_adjusted() to calculate the
remaining time. It takes that extra added time into account for relative
timers.

Reported-and-tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.354500742@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
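
The arithmetic behind the three CONFIG_TIME_LOW_RES fixes above can be modelled as follows. This is a toy model in Python, not kernel code: the jiffy length (HZ=100 is assumed) and the function names are illustrative, and only the idea of subtracting the extra tick for relative timers is taken from the commit text.

```python
JIFFY_NS = 10_000_000  # assumed tick length (HZ=100), for illustration only

def start_relative(now_ns, rel_ns, low_res=True):
    """Expiry of a relative timer; low-res configs add one jiffy so the
    timer never fires early."""
    extra = JIFFY_NS if low_res else 0
    return now_ns + rel_ns + extra

def remaining(expires_ns, now_ns):
    """Unadjusted remaining time: can exceed the programmed interval."""
    return expires_ns - now_ns

def remaining_adjusted(expires_ns, now_ns, is_rel, low_res=True):
    """Remaining time with the extra jiffy taken back out for relative timers."""
    rem = expires_ns - now_ns
    if low_res and is_rel:
        rem -= JIFFY_NS
    return rem
```

Querying a one-second relative timer immediately after arming it shows the problem: the unadjusted value is one second plus a jiffy, while the adjusted value is exactly the programmed second.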
8 years agoprctl: take mmap sem for writing to protect against others
Mateusz Guzik [Wed, 20 Jan 2016 23:01:02 +0000 (15:01 -0800)]
prctl: take mmap sem for writing to protect against others

commit ddf1d398e517e660207e2c807f76a90df543a217 upstream.

An unprivileged user can trigger an oops on a kernel with
CONFIG_CHECKPOINT_RESTORE.

proc_pid_cmdline_read takes mmap_sem for reading and obtains args + env
start/end values. These get sanity checked as follows:
        BUG_ON(arg_start > arg_end);
        BUG_ON(env_start > env_end);

These can be changed by prctl_set_mm. Turns out it also takes the semaphore for
reading, effectively rendering it useless. This results in:

  kernel BUG at fs/proc/base.c:240!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: virtio_net
  CPU: 0 PID: 925 Comm: a.out Not tainted 4.4.0-rc8-next-20160105dupa+ #71
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff880077a68000 ti: ffff8800784d0000 task.ti: ffff8800784d0000
  RIP: proc_pid_cmdline_read+0x520/0x530
  RSP: 0018:ffff8800784d3db8  EFLAGS: 00010206
  RAX: ffff880077c5b6b0 RBX: ffff8800784d3f18 RCX: 0000000000000000
  RDX: 0000000000000002 RSI: 00007f78e8857000 RDI: 0000000000000246
  RBP: ffff8800784d3e40 R08: 0000000000000008 R09: 0000000000000001
  R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000050
  R13: 00007f78e8857800 R14: ffff88006fcef000 R15: ffff880077c5b600
  FS:  00007f78e884a740(0000) GS:ffff88007b200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007f78e8361770 CR3: 00000000790a5000 CR4: 00000000000006f0
  Call Trace:
    __vfs_read+0x37/0x100
    vfs_read+0x82/0x130
    SyS_read+0x58/0xd0
    entry_SYSCALL_64_fastpath+0x12/0x76
  Code: 4c 8b 7d a8 eb e9 48 8b 9d 78 ff ff ff 4c 8b 7d 90 48 8b 03 48 39 45 a8 0f 87 f0 fe ff ff e9 d1 fe ff ff 4c 8b 7d 90 eb c6 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
  RIP   proc_pid_cmdline_read+0x520/0x530
  ---[ end trace 97882617ae9c6818 ]---

Turns out there are instances where the code just reads aforementioned
values without locking whatsoever - namely environ_read and get_cmdline.

Interestingly these functions look quite resilient against bogus values,
but I don't believe this should be relied upon.

The first patch gets rid of the oops bug by grabbing mmap_sem for
writing.

The second patch is optional and puts locking around aforementioned
consumers for safety.  Consumers of other fields don't seem to benefit
from similar treatment and are left untouched.

This patch (of 2):

The code was taking the semaphore for reading, which does not protect
against readers nor concurrent modifications.

The problem could cause sanity checks to fail in procfs's cmdline
reader, resulting in an OOPS.

Note that some functions perform an unlocked read of various mm fields,
but they seem to be fine despite possible modification.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Jarod Wilson <jarod@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Anshuman Khandual <anshuman.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxfs: log mount failures don't wait for buffers to be released
Dave Chinner [Mon, 18 Jan 2016 21:28:10 +0000 (08:28 +1100)]
xfs: log mount failures don't wait for buffers to be released

commit 85bec5460ad8e05e0a8d70fb0f6750eb719ad092 upstream.

Recently I've been seeing xfs/051 fail on 1k block size filesystems.
Trying to trace the events during the test led to the problem going
away, indicating that it was a race condition that led to this
ASSERT failure:

XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 156
.....
[<ffffffff814e1257>] xfs_free_perag+0x87/0xb0
[<ffffffff814e21b9>] xfs_mountfs+0x4d9/0x900
[<ffffffff814e5dff>] xfs_fs_fill_super+0x3bf/0x4d0
[<ffffffff811d8800>] mount_bdev+0x180/0x1b0
[<ffffffff814e3ff5>] xfs_fs_mount+0x15/0x20
[<ffffffff811d90a8>] mount_fs+0x38/0x170
[<ffffffff811f4347>] vfs_kern_mount+0x67/0x120
[<ffffffff811f7018>] do_mount+0x218/0xd60
[<ffffffff811f7e5b>] SyS_mount+0x8b/0xd0

When I finally caught it with tracing enabled, I saw that AG 2 had
an elevated reference count and a buffer was responsible for it. I
tracked down the specific buffer, and found that it was missing the
final reference count release that would put it back on the LRU and
hence be found by xfs_wait_buftarg() calls in the log mount failure
handling.

The last four traces for the buffer before the assert were (trimmed
for relevance)

kworker/0:1-5259   xfs_buf_iodone:        hold 2  lock 0 flags ASYNC
kworker/0:1-5259   xfs_buf_ioerror:       hold 2  lock 0 error -5
mount-7163    xfs_buf_lock_done:     hold 2  lock 0 flags ASYNC
mount-7163    xfs_buf_unlock:        hold 2  lock 1 flags ASYNC

This is an async write that is completing, so there's nobody waiting
for it directly.  Hence we call xfs_buf_relse() once all the
processing is complete. That does:

static inline void xfs_buf_relse(xfs_buf_t *bp)
{
	xfs_buf_unlock(bp);
	xfs_buf_rele(bp);
}

Now, it's clear that mount is waiting on the buffer lock, and that
it has been released by xfs_buf_relse() and gained by mount. This is
expected, because at this point the mount process is in
xfs_buf_delwri_submit() waiting for all the IO it submitted to
complete.

The mount process, however, is waiting on the lock for the buffer
because it is in xfs_buf_delwri_submit(). This waits for IO
completion, but it doesn't wait for the buffer reference owned by
the IO to go away. The mount process collects all the completions,
fails the log recovery, and the higher level code then calls
xfs_wait_buftarg() to free all the remaining buffers in the
filesystem.

The issue is that on unlocking the buffer, the scheduler has decided
that the mount process has higher priority than the kworker
thread that is running the IO completion, and so immediately
switched contexts to the mount process from the semaphore unlock
code, hence preventing the kworker thread from finishing the IO
completion and releasing the IO reference to the buffer.

Hence by the time that xfs_wait_buftarg() is run, the buffer still
has an active reference and so isn't on the LRU list that the
function walks to free the remaining buffers. Hence we miss that
buffer and continue onwards to tear down the mount structures,
at which time we find a stray reference count on the perag
structure. On a non-debug kernel, this will be ignored and the
structure torn down and freed. Hence when the kworker thread is then
rescheduled and the buffer released and freed, it will access a
freed perag structure.

The problem here is that when the log mount fails, we still need to
quiesce the log to ensure that the IO workqueues have returned to
idle before we run xfs_wait_buftarg(). By synchronising the
workqueues, we ensure that all IO completions are fully processed,
not just to the point where buffers have been unlocked. This ensures
we don't end up in the situation above.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
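
The ordering problem described in the commit above can be illustrated with a small deterministic model. This is a deliberately simplified sketch in Python, not XFS code: the `Buffer` class, the hold-count convention (a buffer goes on the LRU when only one reference is left) and the generator used to stand in for preemption are all assumptions made for the example.

```python
class Buffer:
    def __init__(self):
        self.hold = 2         # one cache reference plus one owned by the IO
        self.locked = True
        self.on_lru = False

    def unlock(self):
        self.locked = False

    def rele(self):
        self.hold -= 1
        if self.hold == 1:
            self.on_lru = True    # only the cache reference remains

def io_completion(buf):
    """Completion path: unlock, then release. The yield marks the window
    in which the scheduler can switch to the mount process."""
    buf.unlock()
    yield
    buf.rele()

def wait_buftarg(buffers):
    """Walk the LRU; return the buffers that were missed because they
    still carry an active reference."""
    return [b for b in buffers if not b.on_lru]

def mount_failure(buf, drain_first):
    worker = io_completion(buf)
    next(worker)                  # worker unlocks and is preempted by mount
    if drain_first:
        for _ in worker:          # drain the workqueue: completion finishes
            pass
    return wait_buftarg([buf])
```

Calling `mount_failure(Buffer(), drain_first=False)` leaves one buffer behind, which corresponds to the stray perag reference; with `drain_first=True` the completion runs to its end before the LRU is walked and nothing is missed.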
8 years agoRevert "xfs: clear PF_NOFREEZE for xfsaild kthread"
Dave Chinner [Mon, 18 Jan 2016 21:21:46 +0000 (08:21 +1100)]
Revert "xfs: clear PF_NOFREEZE for xfsaild kthread"

commit 3e85286e75224fa3f08bdad20e78c8327742634e upstream.

This reverts commit 24ba16bb3d499c49974669cd8429c3e4138ab102 as it
prevents machines from suspending. This regression occurs when the
xfsaild is idle on entry to suspend, and so there is no activity to
wake it from its idle sleep and hence see that it is supposed to
freeze. Hence the freezer times out waiting for it and suspend is
cancelled.

There is no obvious fix for this short of freezing the filesystem
properly, so revert this change for now.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxfs: inode recovery readahead can race with inode buffer creation
Dave Chinner [Mon, 11 Jan 2016 20:03:44 +0000 (07:03 +1100)]
xfs: inode recovery readahead can race with inode buffer creation

commit b79f4a1c68bb99152d0785ee4ea3ab4396cdacc6 upstream.

When we do inode readahead in log recovery, we can do the
readahead before we've replayed the icreate transaction that stamps
the buffer with inode cores. The inode readahead verifier catches
this and marks the buffer as !done to indicate that it doesn't yet
contain valid inodes.

In adding buffer error notification (i.e. setting b_error = -EIO at
the same time as we clear the done flag) to such a readahead
verifier failure, we can then get subsequent inode recovery failing
with this error:

XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32

This occurs when readahead completion races with icreate item replay
such as:

inode readahead
        find buffer
        lock buffer
        submit RA io
....
icreate recovery
    xfs_trans_get_buffer
        find buffer
        lock buffer
        <blocks on RA completion>
.....
<ra completion>
        fails verifier
        clear XBF_DONE
        set bp->b_error = -EIO
        release and unlock buffer
<icreate gains lock>
        icreate initialises buffer
        marks buffer as done
        adds buffer to delayed write queue
        releases buffer

At this point, we have an initialised inode buffer that is up to
date but has an -EIO state registered against it. When we finally
get to recovering an inode in that buffer:

inode item recovery
    xfs_trans_read_buffer
        find buffer
        lock buffer
        sees XBF_DONE is set, returns buffer
    sees bp->b_error is set
        fail log recovery!

Essentially, we need xfs_trans_get_buf_map() to clear the error status of
the buffer when doing a lookup. This function returns uninitialised
buffers, so the buffer returned can not be in an error state and
none of the code that uses this function expects b_error to be set
on return. Indeed, there is an ASSERT(!bp->b_error); in the
transaction case in xfs_trans_get_buf_map() that would have caught
this if log recovery used transactions....

This patch firstly changes the inode readahead failure to set -EIO
on the buffer, and secondly changes xfs_buf_get_map() to never
return a buffer with an error state set so this first change doesn't
cause unexpected log recovery failures.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agolibxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct
Darrick J. Wong [Mon, 4 Jan 2016 05:13:21 +0000 (16:13 +1100)]
libxfs: pack the agfl header structure so XFS_AGFL_SIZE is correct

commit 96f859d52bcb1c6ea6f3388d39862bf7143e2f30 upstream.

Because struct xfs_agfl is 36 bytes long and has a 64-bit integer
inside it, gcc will quietly round the structure size up to the nearest
64 bits -- in this case, 40 bytes.  This results in the XFS_AGFL_SIZE
macro returning incorrect results for v5 filesystems on 64-bit
machines (118 items instead of 119).  As a result, a 32-bit xfs_repair
will see garbage in AGFL item 119 and complain.

Therefore, tell gcc not to pad the structure so that the AGFL size
calculation is correct.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: setattr: check permissions before copy-up
Miklos Szeredi [Fri, 11 Dec 2015 15:30:49 +0000 (16:30 +0100)]
ovl: setattr: check permissions before copy-up

commit cf9a6784f7c1b5ee2b9159a1246e327c331c5697 upstream.

Without this, copy-up of a file can be forced, even without actually being
allowed to do anything on the file.

[Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by
MAX_LFS_FILESIZE definition).

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: root: copy attr
Miklos Szeredi [Wed, 9 Dec 2015 15:11:59 +0000 (16:11 +0100)]
ovl: root: copy attr

commit ed06e069775ad9236087594a1c1667367e983fb5 upstream.

We copy i_uid and i_gid of underlying inode into overlayfs inode.  Except
for the root inode.

Fix this omission.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: check dentry positiveness in ovl_cleanup_whiteouts()
Konstantin Khlebnikov [Mon, 16 Nov 2015 15:44:11 +0000 (18:44 +0300)]
ovl: check dentry positiveness in ovl_cleanup_whiteouts()

commit 84889d49335627bc770b32787c1ef9ebad1da232 upstream.

This patch fixes a kernel crash when removing a directory which contains
whiteouts from lower layers.

Cache of directory content passed as "list" contains entries from all
layers, including whiteouts from lower layers. So, lookup in upper dir
(moved into work at this stage) will return negative entry. Plus this
cache is filled long before and we can race with external removal.

Example:
 mkdir -p lower0/dir lower1/dir upper work overlay
 touch lower0/dir/a lower0/dir/b
 mknod lower1/dir/a c 0 0
 mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work
 rm -fr overlay/dir

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: use a minimal buffer in ovl_copy_xattr
Vito Caputo [Sat, 24 Oct 2015 12:19:46 +0000 (07:19 -0500)]
ovl: use a minimal buffer in ovl_copy_xattr

commit e4ad29fa0d224d05e08b2858e65f112fd8edd4fe upstream.

Rather than always allocating the high-order XATTR_SIZE_MAX buffer
which is costly and prone to failure, only allocate what is needed and
realloc if necessary.

Fixes https://github.com/coreos/bugs/issues/489

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoovl: allow zero size xattr
Miklos Szeredi [Tue, 10 Nov 2015 16:08:41 +0000 (17:08 +0100)]
ovl: allow zero size xattr

commit 97daf8b97ad6f913a34c82515be64dc9ac08d63e upstream.

When ovl_copy_xattr() encountered a zero size xattr no more xattrs were
copied and the function returned success.  This is clearly not the desired
behavior.

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agofutex: Drop refcount if requeue_pi() acquired the rtmutex
Thomas Gleixner [Sat, 19 Dec 2015 20:07:38 +0000 (20:07 +0000)]
futex: Drop refcount if requeue_pi() acquired the rtmutex

commit fb75a4282d0d9a3c7c44d940582c2d226cf3acfb upstream.

If the proxy lock in the requeue loop acquires the rtmutex for a
waiter then it also acquires a refcount on the pi_state related to the
futex, but the waiter side does not drop the reference count.

Add the missing free_pi_state() call.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Darren Hart <darren@dvhart.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Bhuvanesh_Surachari@mentor.com
Cc: Andy Lowe <Andy_Lowe@mentor.com>
Link: http://lkml.kernel.org/r/20151219200607.178132067@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodevm_memremap_release(): fix memremap'd addr handling
Toshi Kani [Wed, 17 Feb 2016 21:11:29 +0000 (13:11 -0800)]
devm_memremap_release(): fix memremap'd addr handling

commit 9273a8bbf58a15051e53a777389a502420ddc60e upstream.

The pmem driver calls devm_memremap() to map a persistent memory range.
When the pmem driver is unloaded, this memremap'd range is not released
so the kernel will leak a vma.

Fix devm_memremap_release() to handle a given memremap'd address
properly.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoipc/shm: handle removed segments gracefully in shm_mmap()
Kirill A. Shutemov [Wed, 17 Feb 2016 21:11:35 +0000 (13:11 -0800)]
ipc/shm: handle removed segments gracefully in shm_mmap()

commit 1ac0b6dec656f3f78d1c3dd216fad84cb4d0a01e upstream.

remap_file_pages(2) emulation can reach a file which represents a removed
IPC ID as long as a memory segment is mapped.  This breaks the expectations
of the IPC subsystem.

Test case (rewritten to be more human readable, originally autogenerated
by syzkaller[1]):

#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>

#define PAGE_SIZE 4096

int main()
{
        int id;
        void *p;

        id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0);
        p = shmat(id, NULL, 0);
        shmctl(id, IPC_RMID, NULL);
        remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0);

        return 0;
}

The patch changes shm_mmap() and code around shm_lock() to propagate
locking error back to caller of shm_mmap().

[1] http://github.com/google/syzkaller

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agointel_scu_ipcutil: underflow in scu_reg_access()
Dan Carpenter [Tue, 26 Jan 2016 09:24:25 +0000 (12:24 +0300)]
intel_scu_ipcutil: underflow in scu_reg_access()

commit b1d353ad3d5835b16724653b33c05124e1b5acf1 upstream.

"count" is controlled by the user and it can be negative.  Let's prevent
that by making it unsigned.  You have to have CAP_SYS_RAWIO to call this
function so the bug is not as serious as it could be.

Fixes: 5369c02d951a ('intel_scu_ipc: Utility driver for intel scu ipc')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomm,thp: khugepaged: call pte flush at the time of collapse
Vineet Gupta [Fri, 12 Feb 2016 00:13:09 +0000 (16:13 -0800)]
mm,thp: khugepaged: call pte flush at the time of collapse

commit 6a6ac72fd6ea32594b316513e1826c3f6db4cc93 upstream.

This showed up on ARC when running LMBench bw_mem tests as Overlapping
TLB Machine Check Exception triggered due to STLB entry (2M pages)
overlapping some NTLB entry (regular 8K page).

bw_mem 2m touches a large chunk of vaddr creating NTLB entries.  In the
interim khugepaged kicks in, collapsing the contiguous ptes into a
single pmd.  pmdp_collapse_flush()->flush_pmd_tlb_range() is called to
flush out NTLB entries for the ptes.  This for ARC (by design) can only
shootdown STLB entries (for pmd).  The stray NTLB entries cause the
overlap with the subsequent STLB entry for collapsed page.  So make
pmdp_collapse_flush() call pte flush interface not pmd flush.

Note that originally all thp flush call sites in generic code called
flush_tlb_range() leaving it to architecture to implement the flush for
pte and/or pmd.  Commit 12ebc1581ad11454 changed this by calling a new
opt-in API flush_pmd_tlb_range() which made the semantics more explicit
but failed to distinguish the pte vs pmd flush in generic code, which is
what this patch fixes.

Note that ARC could be fixed w/o touching the generic pmdp_collapse_flush()
by defining an ARC version, but that defeats the purpose of the generic
version, plus semantically this is the right thing to do.

Fixes STAR 9000961194: LMBench on AXS103 triggering duplicate TLB
exceptions with super pages

Fixes: 12ebc1581ad11454 ("mm,thp: introduce flush_pmd_tlb_range")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodump_stack: avoid potential deadlocks
Eric Dumazet [Fri, 5 Feb 2016 23:36:16 +0000 (15:36 -0800)]
dump_stack: avoid potential deadlocks

commit d7ce36924344ace0dbdc855b1206cacc46b36d45 upstream.

Some servers experienced fatal deadlocks because of a combination of
bugs, leading to multiple cpus calling dump_stack().

The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
skb->csum when CE mark is propagated").

The second problem is faulty locking in dump_stack()

CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

   CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
   calls dump_stack() from netdev_rx_csum_fault().

   dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
   dump_lock is owned by CPU1

While dumping its stack, CPU1 is interrupted by a softirq, and happens
to process a packet for the TCP socket locked by CPU2.

CPU1 spins forever in spin_lock() : deadlock

Stack trace on CPU1 looked like :

    NMI backtrace for cpu 1
    RIP: _raw_spin_lock+0x25/0x30
    ...
    Call Trace:
      <IRQ>
      tcp_v6_rcv+0x243/0x620
      ip6_input_finish+0x11f/0x330
      ip6_input+0x38/0x40
      ip6_rcv_finish+0x3c/0x90
      ipv6_rcv+0x2a9/0x500
      process_backlog+0x461/0xaa0
      net_rx_action+0x147/0x430
      __do_softirq+0x167/0x2d0
      call_softirq+0x1c/0x30
      do_softirq+0x3f/0x80
      irq_exit+0x6e/0xc0
      smp_call_function_single_interrupt+0x35/0x40
      call_function_single_interrupt+0x6a/0x70
      <EOI>
      printk+0x4d/0x4f
      printk_address+0x31/0x33
      print_trace_address+0x33/0x3c
      print_context_stack+0x7f/0x119
      dump_trace+0x26b/0x28e
      show_trace_log_lvl+0x4f/0x5c
      show_stack_log_lvl+0x104/0x113
      show_stack+0x42/0x44
      dump_stack+0x46/0x58
      netdev_rx_csum_fault+0x38/0x3c
      __skb_checksum_complete_head+0x6e/0x80
      __skb_checksum_complete+0x11/0x20
      tcp_rcv_established+0x2bd5/0x2fd0
      tcp_v6_do_rcv+0x13c/0x620
      sk_backlog_rcv+0x15/0x30
      release_sock+0xd2/0x150
      tcp_recvmsg+0x1c1/0xfc0
      inet_recvmsg+0x7d/0x90
      sock_recvmsg+0xaf/0xe0
      ___sys_recvmsg+0x111/0x3b0
      SyS_recvmsg+0x5c/0xb0
      system_call_fastpath+0x16/0x1b

Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoradix-tree: fix oops after radix_tree_iter_retry
Konstantin Khlebnikov [Fri, 5 Feb 2016 23:37:01 +0000 (15:37 -0800)]
radix-tree: fix oops after radix_tree_iter_retry

commit 732042821cfa106b3c20b9780e4c60fee9d68900 upstream.

Helper radix_tree_iter_retry() resets next_index to the current index.
In the following radix_tree_next_slot() the current chunk size becomes zero.
This isn't checked and it tries to dereference a null pointer in slot.

Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.

Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodrivers/hwspinlock: fix race between radix tree insertion and lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:55 +0000 (16:57 -0800)]
drivers/hwspinlock: fix race between radix tree insertion and lookup

commit c6400ba7e13a41539342f1b6e1f9e78419cb0148 upstream.

of_hwspin_lock_get_id() is protected by the RCU lock, which means that
insertions can occur simultaneously with the lookup.  If the radix tree
transitions from a height of 0, we can see a slot with the indirect_ptr
bit set, which will cause us to at least read random memory, and could
cause other havoc.

Fix this by using the newly introduced radix_tree_iter_retry().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoradix-tree: fix race in gang lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:52 +0000 (16:57 -0800)]
radix-tree: fix race in gang lookup

commit 46437f9a554fbe3e110580ca08ab703b59f2f95a upstream.

If the indirect_ptr bit is set on a slot, that indicates we need to redo
the lookup.  Introduce a new function radix_tree_iter_retry() which
forces the loop to retry the lookup by setting 'slot' to NULL and
turning the iterator back to point at the problematic entry.

This is a pretty rare problem to hit at the moment; the lookup has to
race with a grow of the radix tree from a height of 0.  The consequences
of hitting this race are that gang lookup could return a pointer to a
radix_tree_node instead of a pointer to whatever the user had inserted
in the tree.

Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoMAINTAINERS: return arch/sh to maintained state, with new maintainers
Rich Felker [Fri, 22 Jan 2016 23:11:05 +0000 (15:11 -0800)]
MAINTAINERS: return arch/sh to maintained state, with new maintainers

commit 114bf37e04d839b555b3dc460b5e6ce156f49cf0 upstream.

Add Yoshinori Sato and Rich Felker as maintainers for arch/sh
(SUPERH).

Signed-off-by: Rich Felker <dalias@libc.org>
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Acked-by: D. Jeff Dionne <jeff@uClinux.org>
Acked-by: Rob Landley <rob@landley.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agomemcg: only free spare array when readers are done
Martijn Coenen [Sat, 16 Jan 2016 00:57:49 +0000 (16:57 -0800)]
memcg: only free spare array when readers are done

commit 6611d8d76132f86faa501de9451a89bf23fb2371 upstream.

A spare array holding mem cgroup threshold events is kept around to make
sure we can always safely deregister an event and have an array to store
the new set of events in.

In the scenario where we're going from 1 to 0 registered events, the
pointer to the primary array containing 1 event is copied to the spare
slot, and then the spare slot is freed because no events are left.
However, it is freed before calling synchronize_rcu(), which means
readers may still be accessing threshold->primary after it is freed.

Fixed by only freeing after synchronize_rcu().

Signed-off-by: Martijn Coenen <maco@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agonuma: fix /proc/<pid>/numa_maps for hugetlbfs on s390
Michael Holzheu [Wed, 3 Feb 2016 00:57:26 +0000 (16:57 -0800)]
numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390

commit 5c2ff95e41c9290d16556cd02e35b25d81be8fe0 upstream.

When working with hugetlbfs ptes (which are actually pmds) it is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different.  This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get().  On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
   PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
   completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
   Therefore these pages are reported as "mapped=xxx" instead of
   "dirty=xxx".

Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agofs/hugetlbfs/inode.c: fix bugs in hugetlb_vmtruncate_list()
Mike Kravetz [Sat, 16 Jan 2016 00:57:37 +0000 (16:57 -0800)]
fs/hugetlbfs/inode.c: fix bugs in hugetlb_vmtruncate_list()

commit 9aacdd354d197ad64685941b36d28ea20ab88757 upstream.

Hillf Danton noticed bugs in the hugetlb_vmtruncate_list routine.  The
argument end is of type pgoff_t.  It was being converted to a vaddr
offset and passed to unmap_hugepage_range.  However, end was also being
used as an argument to the vma_interval_tree_foreach controlling loop.
In addition, the conversion of end to vaddr offset was incorrect.

hugetlb_vmtruncate_list is called as part of a file truncate or
fallocate hole punch operation.

When truncating a hugetlbfs file, this bug could prevent some pages from
being unmapped.  This is possible if there are multiple vmas mapping the
file, and there is a sufficiently sized hole between the mappings.  The
size of the hole between two vmas (A,B) must be such that the starting
virtual address of B is greater than (ending virtual address of A <<
PAGE_SHIFT).  In this case, the pages in B would not be unmapped.  If
pages are not properly unmapped during truncate, the following BUG is
hit:

kernel BUG at fs/hugetlbfs/inode.c:428!

In the fallocate hole punch case, this bug could prevent pages from
being unmapped as in the truncate case.  However, for hole punch the
result is that unmapped pages will not be removed during the operation.
For hole punch, it is also possible that more pages than desired will be
unmapped.  This unnecessary unmapping will cause page faults to
reestablish the mappings on subsequent page access.

Fixes: 1bfad99ab ("hugetlbfs: hugetlb_vmtruncate_list() needs to take a range")
Reported-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoscripts/bloat-o-meter: fix python3 syntax error
Sergey Senozhatsky [Thu, 14 Jan 2016 23:16:53 +0000 (15:16 -0800)]
scripts/bloat-o-meter: fix python3 syntax error

commit 72214a24a7677d4c7501eecc9517ed681b5f2db2 upstream.

In Python3+ print is a function so the old syntax is not correct
anymore:

  $ ./scripts/bloat-o-meter vmlinux.o vmlinux.o.old
    File "./scripts/bloat-o-meter", line 61
      print "add/remove: %s/%s grow/shrink: %s/%s up/down: %s/%s (%s)" % \
                                                                     ^
  SyntaxError: invalid syntax

Fix by calling print as a function.

Tested on python 2.7.11, 3.5.1

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agodma-debug: switch check from _text to _stext
Laura Abbott [Thu, 14 Jan 2016 23:16:50 +0000 (15:16 -0800)]
dma-debug: switch check from _text to _stext

commit ea535e418c01837d07b6c94e817540f50bfdadb0 upstream.

In include/asm-generic/sections.h:

  /*
   * Usage guidelines:
   * _text, _data: architecture specific, don't use them in
   * arch-independent code
   * [_stext, _etext]: contains .text.* sections, may also contain
   * .rodata.*
   *                   and/or .init.* sections

_text is not guaranteed across architectures.  Architectures such as ARM
may reuse parts which are not actually text and erroneously trigger a bug.
Switch to using _stext which is guaranteed to contain text sections.

Came out of https://lkml.kernel.org/g/<567B1176.4000106@redhat.com>

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agom32r: fix m32104ut_defconfig build fail
Sudip Mukherjee [Thu, 14 Jan 2016 23:16:47 +0000 (15:16 -0800)]
m32r: fix m32104ut_defconfig build fail

commit 601f1db653217f205ffa5fb33514b4e1711e56d1 upstream.

The build of m32104ut_defconfig for the m32r arch has been failing for a
long time with the error:

  ERROR: "memory_start" [fs/udf/udf.ko] undefined!
  ERROR: "memory_end" [fs/udf/udf.ko] undefined!
  ERROR: "memory_end" [drivers/scsi/sg.ko] undefined!
  ERROR: "memory_start" [drivers/scsi/sg.ko] undefined!
  ERROR: "memory_end" [drivers/i2c/i2c-dev.ko] undefined!
  ERROR: "memory_start" [drivers/i2c/i2c-dev.ko] undefined!

As done in other architectures export the symbols to fix the error.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoxhci: Fix list corruption in urb dequeue at host removal
Mathias Nyman [Tue, 26 Jan 2016 15:50:12 +0000 (17:50 +0200)]
xhci: Fix list corruption in urb dequeue at host removal

commit 5c82171167adb8e4ac77b91a42cd49fb211a81a0 upstream.

xhci driver frees data for all devices, both usb2 and usb3, the
first time usb_remove_hcd() is called, including td_list and xhci_ring
structures.

When usb_remove_hcd() is called a second time for the second xhci bus it
will try to dequeue all pending urbs, and touches td_list which is already
freed for that endpoint.

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoRevert "xhci: don't finish a TD if we get a short-transfer event mid TD"
Mathias Nyman [Tue, 26 Jan 2016 15:50:04 +0000 (17:50 +0200)]
Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"

commit a6835090716a85f2297668ba593bd00e1051e662 upstream.

This reverts commit e210c422b6fd ("xhci: don't finish a TD if we get a
short transfer event mid TD")

Turns out that most host controllers do not follow the xHCI specs and never
send the second event for the last TRB in the TD if there was a short event
mid-TD.

Returning the URB directly after the first short-transfer event is far
better than never returning the URB. (class drivers usually timeout
after 30sec). For the hosts that do send the second event we will go
back to treating it as misplaced event and print an error message for it.

The original patch was sent to stable kernels and needs to be reverted from
there as well.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
David Woodhouse [Mon, 15 Feb 2016 12:42:38 +0000 (12:42 +0000)]
iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts

commit 46924008273ed03bd11dbb32136e3da4cfe056e1 upstream.

According to the VT-d specification we need to clear the PPR bit in
the Page Request Status register when handling page requests, or the
hardware won't generate any more interrupts.

This wasn't actually necessary on SKL/KBL (which may well be the
subject of a hardware erratum, although it's harmless enough). But
other implementations do appear to get it right, and we only ever get
one interrupt unless we clear the PPR bit.

Reported-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
CQ Tang [Wed, 13 Jan 2016 21:15:03 +0000 (21:15 +0000)]
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG

commit fda3bec12d0979aae3f02ee645913d66fbc8a26e upstream.

This is a 32-bit register. Apparently harmless on real hardware, but
causing justified warnings in simulation.

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years agoiommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
David Woodhouse [Tue, 12 Jan 2016 19:18:06 +0000 (19:18 +0000)]
iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users

commit e57e58bd390a6843db58560bf7b8341665d2e058 upstream.

Holding mm_users works OK for graphics, which was the first user of SVM
with VT-d. However, it works less well for other devices, where we actually
do a mmap() from the file descriptor to which the SVM PASID state is tied.

In this case on process exit we end up with a recursive reference count:
 - The MM remains alive until the file is closed and the driver's release()
   call ends up unbinding the PASID.
 - The VMA corresponding to the mmap() remains intact until the MM is
   destroyed.
 - Thus the file isn't closed, even when exit_files() runs, because the
   VMA is still holding a reference to it. And the MM remains alive…

To address this issue, we *stop* holding mm_users while the PASID is bound.
We already hold mm_count by virtue of the MMU notifier, and that can be
made to be sufficient.

It means that for a period during process exit, the fun part of mmput()
has happened and exit_mmap() has been called so the MM is basically
defunct. But the PGD still exists and the PASID is still bound to it.

During this period, we have to be very careful — exit_mmap() doesn't use
mm->mmap_sem because it doesn't expect anyone else to be touching the MM
(quite reasonably, since mm_users is zero). So we also need to fix the
fault handler to just report failure if mm_users is already zero, and to
temporarily bump mm_users while handling any faults.
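
The "fail if mm_users is already zero, otherwise bump it temporarily" rule in the paragraph above can be sketched in userspace with C11 atomics. This is only an illustration: struct mm_model, get_users_not_zero(), put_users() and handle_fault() are hypothetical names, not the driver's code. The kernel has its own helper for this pattern, atomic_inc_not_zero().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the part of mm_struct that matters here. */
struct mm_model {
    atomic_int users;   /* plays the role of mm_users */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool get_users_not_zero(struct mm_model *mm)
{
    int old = atomic_load(&mm->users);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
            return true;
    }
    return false;
}

static void put_users(struct mm_model *mm)
{
    atomic_fetch_sub(&mm->users, 1);
}

/* Returns 0 if the fault could be handled, -1 if the MM is defunct. */
static int handle_fault(struct mm_model *mm)
{
    if (!get_users_not_zero(mm))
        return -1;      /* users already zero: just report failure */

    /* ... the address space would be touched here, under the reference ... */

    put_users(mm);
    return 0;
}
```

The compare-and-swap loop matters: a plain increment could resurrect a count that had already reached zero, which is exactly the window in which exit_mmap() runs without mmap_sem.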

Additionally, exit_mmap() calls mmu_notifier_release() *before* it tears
down the page tables, which is too early for us to flush the IOTLB for
this PASID. And __mmu_notifier_release() removes every notifier from the
list, so when exit_mmap() finally *does* tear down the mappings and
clear the page tables, we don't get notified. So we work around this by
clearing the PASID table entry in our MMU notifier release() callback.
That way, the hardware *can't* get any pages back from the page tables
before they get cleared.

Hardware designers have confirmed that the resulting 'PASID not present'
faults should be handled just as gracefully as 'page not present' faults,
the important criterion being that they don't perturb the operation for
any *other* PASID in the system.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago iommu/amd: Correct the wrong setting of alias DTE in do_attach
Baoquan He [Wed, 20 Jan 2016 14:01:19 +0000 (22:01 +0800)]
iommu/amd: Correct the wrong setting of alias DTE in do_attach

commit 9b1a12d29109234d2b9718d04d4d404b7da4e794 upstream.

The commit below sets the alias DTE whenever the DTE of its peripheral
is set. However, a bug in that code sets the alias DTE incorrectly.
Correct it in this patch.

commit e25bfb56ea7f046b71414e02f80f620deb5c6362
Author: Joerg Roedel <jroedel@suse.de>
Date:   Tue Oct 20 17:33:38 2015 +0200

    iommu/amd: Set alias DTE in do_attach/do_detach

Signed-off-by: Baoquan He <bhe@redhat.com>
Tested-by: Mark Hounschell <markh@compro.net>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago iommu/vt-d: Don't skip PCI devices when disabling IOTLB
Jeremy McNicoll [Fri, 15 Jan 2016 05:33:06 +0000 (21:33 -0800)]
iommu/vt-d: Don't skip PCI devices when disabling IOTLB

commit da972fb13bc5a1baad450c11f9182e4cd0a091f6 upstream.

Fix a simple typo when disabling IOTLB on PCI(e) devices.

Fixes: b16d0cb9e2fc ("iommu/vt-d: Always enable PASID/PRI PCI capabilities before ATS")
Signed-off-by: Jeremy McNicoll <jmcnicol@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago Input: vmmouse - fix absolute device registration
Dmitry Torokhov [Sat, 16 Jan 2016 18:04:49 +0000 (10:04 -0800)]
Input: vmmouse - fix absolute device registration

commit d4f1b06d685d11ebdaccf11c0db1cb3c78736862 upstream.

We should set the device's capabilities first, and then register it;
otherwise various handlers already present in the kernel will not be
able to connect to the device.

Reported-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago string_helpers: fix precision loss for some inputs
James Bottomley [Wed, 20 Jan 2016 22:58:29 +0000 (14:58 -0800)]
string_helpers: fix precision loss for some inputs

commit 564b026fbd0d28e9f70fb3831293d2922bb7855b upstream.

It was noticed that we lose precision in the final calculation for some
inputs.  The most egregious example is size=3000, blk_size=1900 in units
of 10, which should yield 5.70 MB but in fact yields 3.00 MB (oops).

This is because the current algorithm doesn't correctly account for
all the remainders in the logarithms.  Fix this by doing a correct
calculation in the remainders based on Napier's algorithm.

Additionally, now that we have the correct result, we have to account for
arithmetic rounding because we're printing 3 digits of precision.  This
means that if the fourth digit is five or greater, we have to round up,
so add a section to ensure correct rounding.  Finally, account for all
possible inputs correctly, including zero for block size.
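
As a check on the example above: 3000 blocks of 1900 bytes are 5,700,000 bytes, which is 5.70 MB in decimal units. The sketch below does that arithmetic directly and rounds half up. It is not the kernel's string_get_size(): format_size_decimal() is a hypothetical helper, it assumes size * blk_size fits in 64 bits, and it always prints two decimals rather than three significant digits.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static const char *const unit_names[] = {
    "B", "kB", "MB", "GB", "TB", "PB", "EB"
};

/* Hypothetical helper; assumes size * blk_size does not overflow 64 bits. */
static int format_size_decimal(uint64_t size, uint32_t blk_size,
                               char *buf, size_t len)
{
    uint64_t total = size * blk_size;
    uint64_t divisor = 1;
    unsigned int unit = 0;

    /* Pick the largest power of 1000 that leaves a value below 1000. */
    while (total / divisor >= 1000 && unit < 6) {
        divisor *= 1000;
        unit++;
    }
    if (unit == 0)      /* plain bytes, including a zero block size */
        return snprintf(buf, len, "%" PRIu64 " B", total);

    uint64_t step = divisor / 100;      /* one hundredth of the unit */
    uint64_t scaled = total / step;     /* value in hundredths, truncated */

    if (2 * (total % step) >= step)     /* round half up on the remainder */
        scaled++;
    if (scaled == 100000 && unit < 6) { /* 999.995 and up carries over */
        scaled = 100;
        unit++;
    }
    return snprintf(buf, len, "%" PRIu64 ".%02" PRIu64 " %s",
                    scaled / 100, scaled % 100, unit_names[unit]);
}
```

Calling format_size_decimal(3000, 1900, buf, sizeof(buf)) leaves "5.70 MB" in buf. The rounding step uses the full remainder, so a value such as 1999 bytes prints as "2.00 kB" rather than being truncated to "1.99 kB".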

Fixes: b9f28d863594c429e1df35a0474d2663ca28b307
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago Input: i8042 - add Fujitsu Lifebook U745 to the nomux list
Aurélien Francillon [Sun, 3 Jan 2016 04:39:54 +0000 (20:39 -0800)]
Input: i8042 - add Fujitsu Lifebook U745 to the nomux list

commit dd0d0d4de582a6a61c032332c91f4f4cb2bab569 upstream.

Without i8042.nomux=1 the Elantech touch pad is not working at all on
a Fujitsu Lifebook U745. This patch does not seem necessary for all
U745 (maybe because of different BIOS versions?). However, it was
verified that the patch does not break those (see opensuse bug 883192:
https://bugzilla.opensuse.org/show_bug.cgi?id=883192).

Signed-off-by: Aurélien Francillon <aurelien@francillon.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago Input: elantech - mark protocols v2 and v3 as semi-mt
Benjamin Tissoires [Tue, 12 Jan 2016 01:35:38 +0000 (17:35 -0800)]
Input: elantech - mark protocols v2 and v3 as semi-mt

commit 6544a1df11c48c8413071aac3316792e4678fbfb upstream.

When using protocol v2 or v3 hardware, elantech uses the function
elantech_report_semi_mt_data() to report data. These devices are rather
creepy because if num_finger is 3, (x2,y2) is (0,0). Yes, only one valid
touch is reported.

Anyway, userspace (libinput) is now confused by these (0,0) touches:
it detects them as palms and rejects them.

Commit 3c0213d17a09 ("Input: elantech - fix semi-mt protocol for v3 HW")
was sufficient for xf86-input-synaptics, and for libinput before it had
palm rejection. Now we need to actually tell libinput that this device is
a semi-mt one and that it should not rely on the actual values of the 2
touches.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago mm: fix regression in remap_file_pages() emulation
Kirill A. Shutemov [Wed, 17 Feb 2016 21:11:15 +0000 (13:11 -0800)]
mm: fix regression in remap_file_pages() emulation

commit 48f7df329474b49d83d0dffec1b6186647f11976 upstream.

Grazvydas Ignotas has reported a regression in remap_file_pages()
emulation.

Testcase:

    #define _GNU_SOURCE
    #include <assert.h>
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define SIZE    (4096 * 3)

    int main(int argc, char **argv)
    {
        unsigned long *p;
        long i;

        p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return -1;
        }

        for (i = 0; i < SIZE / 4096; i++)
            p[i * 4096 / sizeof(*p)] = i;

        if (remap_file_pages(p, 4096, 0, 1, 0)) {
            perror("remap_file_pages");
            return -1;
        }

        if (remap_file_pages(p, 4096 * 2, 0, 1, 0)) {
            perror("remap_file_pages");
            return -1;
        }

        assert(p[0] == 1);

        munmap(p, SIZE);

        return 0;
    }

The second remap_file_pages() fails with -EINVAL.

The reason is that the remap_file_pages() emulation assumes that the
target vma covers the whole area we want to map over.  That assumption is
broken by the first remap_file_pages() call: it splits the area into two
vmas.

The solution is to check whether the next adjacent vmas map the same file
with the same flags.
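
The check described above can be sketched as a walk over adjacent regions. This is a simplified model, not the kernel code: struct region and range_covered() are hypothetical, a file is reduced to an integer id, and only the file and flags are compared.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma: covers [start, end). */
struct region {
    unsigned long start;
    unsigned long end;
    int file_id;                /* stands in for the mapped file */
    unsigned long flags;
    struct region *next;        /* next region in address order */
};

/*
 * Return true if [start, end) is fully covered by `first` plus any
 * directly adjacent regions that map the same file with the same flags.
 */
static bool range_covered(const struct region *first,
                          unsigned long start, unsigned long end)
{
    const struct region *cur = first;

    if (!first || start < first->start || start >= first->end)
        return false;

    while (cur->end < end) {
        const struct region *next = cur->next;

        if (!next || next->start != cur->end)
            return false;       /* hole in the address range */
        if (next->file_id != first->file_id ||
            next->flags != first->flags)
            return false;       /* different mapping, cannot merge over it */
        cur = next;
    }
    return true;
}
```

In the test case, the first remap_file_pages() call leaves one region for the first page and another for the remaining two. A check against only the first region rejects the 2-page request, while the walk above accepts it because the neighbour is the same mapping.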

Fixes: c8d78c1823f4 ("mm: replace remap_file_pages() syscall with emulation")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
Konstantin Khlebnikov [Fri, 5 Feb 2016 23:36:50 +0000 (15:36 -0800)]
mm: replace vma_lock_anon_vma with anon_vma_lock_read/write

commit 12352d3cae2cebe18805a91fab34b534d7444231 upstream.

The sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
the anon_vma appeared between lock and unlock.  We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's there.  There are
only a few users of these legacy helpers.  Let's get rid of them.

This patch fixes the anon_vma lock imbalance in validate_mm().  A write
lock isn't required here; a read lock is enough.

It also reorders expand_downwards/expand_upwards: security_mmap_addr() and
the wrap-around check don't have to be under the anon_vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8 years ago mm: fix mlock accouting
Kirill A. Shutemov [Fri, 22 Jan 2016 00:40:27 +0000 (16:40 -0800)]
mm: fix mlock accouting

commit 7162a1e87b3e380133dadc7909081bb70d0a7041 upstream.

Tetsuo Handa reported underflow of NR_MLOCK on munlock.

Testcase:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define BASE ((void *)0x400000000000)
    #define SIZE (1UL << 21)

    int main(int argc, char *argv[])
    {
        void *addr;

        system("grep Mlocked /proc/meminfo");
        addr = mmap(BASE, SIZE, PROT_READ | PROT_WRITE,
                MAP_ANONYMOUS | MAP_PRIVATE | MAP_LOCKED | MAP_FIXED,
                -1, 0);
        if (addr == MAP_FAILED)
            printf("mmap() failed\n"), exit(1);
        munmap(addr, SIZE);
        system("grep Mlocked /proc/meminfo");
        return 0;
    }

It happens in munlock_vma_page() due to an unfortunate choice of data
type for nr_pages:

    __mod_zone_page_state(zone, NR_MLOCK, -nr_pages);

For unsigned int nr_pages, the negated value is implicitly converted to
long in __mod_zone_page_state() and becomes something around UINT_MAX.
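
The effect can be reproduced in isolation. In the sketch below, delta_unsigned() and delta_signed() are hypothetical helpers that only mimic passing -nr_pages to a parameter of type long. The large positive result assumes long is wider than unsigned int, as on 64-bit Linux.

```c
#include <limits.h>

/* Negation happens in unsigned int, then the result is widened to long. */
static long delta_unsigned(unsigned int nr_pages)
{
    return -nr_pages;   /* wraps to UINT_MAX - nr_pages + 1 before widening */
}

/* With a signed count the negation yields the intended negative delta. */
static long delta_signed(int nr_pages)
{
    return -nr_pages;
}
```

For nr_pages = 512 (the number of 4 KiB sub-pages in a 2 MiB huge page), delta_signed() returns -512, while delta_unsigned() returns UINT_MAX - 511 when long is 64 bits wide. The counter is then increased by about 4 billion instead of decreased by 512.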

munlock_vma_page() is usually called for THP, as small pages go through
pagevec.

Let's make nr_pages signed int.

A similar fix in 6cdb18ad98a4 ("mm/vmstat: fix overflow in
mod_zone_page_state()") used the `long' type, but `int' here is OK for a
count of the number of sub-pages in a huge page.

Fixes: ff6a6da60b89 ("mm: accelerate munlock() treatment of THP pages")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Michel Lespinasse <walken@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>