git.proxmox.com Git - mirror_ubuntu-disco-kernel.git/log

coredump: move dump_write() and dump_seek() into a header file

My next patch will replace ELF_CORE_EXTRA_* macros by functions, putting
them into other newly created *.c files. Then, each files will contain
dump_write(), where each pair of binfmt_*.c and elfcore.c should be the
same. So, this patch moves them into a header file with dump_seek().
Also, the patch deletes confusing DUMP_WRITE macros in each files.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coredump: unify dump_seek() implementations for each binfmt_*.c

The current ELF dumper can produce broken corefiles if program headers
exceed 65535.  In particular, the program in 64-bit environment often
demands more than 65535 mmaps.  If you google max_map_count, then you can
find many users facing this problem.

Solaris has already dealt with this issue, and other OSes have also
adopted the same method as in Solaris.  Currently, Sun's document and AMD
64 ABI include the description for the extension, where they call the
extension Extended Numbering.  See Reference for further information.

I believe that linux kernel should adopt the same way as they did, so I've
written this patch.

I am also preparing for patches of GDB and binutils.

How to fix
==========

In new dumping process, there are two cases according to weather or
not the number of program headers is equal to or more than 65535.

- if less than 65535, the produced corefile format is exactly the same
   as the ordinary one.

- if equal to or more than 65535, then e_phnum field is set to newly
   introduced constant PN_XNUM(0xffff) and the actual number of program
   headers is set to sh_info field of the section header at index 0.

Compatibility Concern
=====================

* As already mentioned in Summary, Sun and AMD64 has already adopted
   this.  See Reference.

* There are four combinations according to whether kernel and userland
   tools are respectively modified or not.  The next table summarizes
   shortly for each combination.

                  ---------------------------------------------
                     Original Kernel    |   Modified Kernel
                  ---------------------------------------------
                 < 65535  | >= 65535 | < 65535  | >= 65535
  -------------------------------------------------------------
   Original Tools |    OK    |  broken  |   OK     | broken (#)
  -------------------------------------------------------------
   Modified Tools |    OK    |  broken  |   OK     |    OK
  -------------------------------------------------------------

  Note that there is no case that `OK' changes to `broken'.

  (#) Although this case remains broken, O-M behaves better than
  O-O. That is, while in O-O case e_phnum field would be extremely
  small due to integer overflow, in O-M case it is guaranteed to be at
  least 65535 by being set to PN_XNUM(0xFFFF), much closer to the
  actual correct value than the O-O case.

Test Program
============

Here is a test program mkmmaps.c that is useful to produce the
corefile with many mmaps. To use this, please take the following
steps:

$ ulimit -c unlimited
$ sysctl vm.max_map_count=70000 # default 65530 is too small
$ sysctl fs.file-max=70000
$ mkmmaps 65535

Then, the program will abort and a corefile will be generated.

If failed, there are two cases according to the error message
displayed.

* ``out of memory'' means vm.max_map_count is still smaller

* ``too many open files'' means fs.file-max is still smaller

So, please change it to a larger value, and then retry it.

mkmmaps.c
==
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char **argv)
{
int maps_num;
if (argc < 2) {
fprintf(stderr, "mkmmaps [number of maps to be created]\n");
exit(1);
}
if (sscanf(argv[1], "%d", &maps_num) == EOF) {
perror("sscanf");
exit(2);
}
if (maps_num < 0) {
fprintf(stderr, "%d is invalid\n", maps_num);
exit(3);
}
for (; maps_num > 0; --maps_num) {
if (MAP_FAILED == mmap((void *)NULL, (size_t) 1, PROT_READ,
MAP_SHARED | MAP_ANONYMOUS, (int) -1,
(off_t) NULL)) {
perror("mmap");
exit(4);
}
}
abort();
{
char buffer[128];
sprintf(buffer, "wc -l /proc/%u/maps", getpid());
system(buffer);
}
return 0;
}

Tested on i386, ia64 and um/sys-i386.
Built on sh4 (which covers fs/binfmt_elf_fdpic.c)

References
==========

- Sun microsystems: Linker and Libraries.
   Part No: 817-1984-17, September 2008.
   URL: http://docs.sun.com/app/docs/doc/817-1984

- System V ABI AMD64 Architecture Processor Supplement
   Draft Version 0.99., May 11, 2009.
   URL: http://www.x86-64.org/

This patch:

There are three different definitions for dump_seek() functions in
binfmt_aout.c, binfmt_elf.c and binfmt_elf_fdpic.c, respectively.  The
only for binfmt_elf.c.

My next patch will move dump_seek() into a header file in order to share
the same implementations for dump_write() and dump_seek().  As the first
step, this patch unify these three definitions for dump_seek() by applying
the past commits that have been applied only for binfmt_elf.c.

Specifically, the modification made here is part of the following commits:

  * d025c9db7f31fc0554ce7fb2dfc78d35a77f3487
  * 7f14daa19ea36b200d237ad3ac5826ae25360461

This patch does not change a shape of corefiles.

Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Greg Ungerer <gerg@snapgear.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: warn on non-existing proc entries

* warn if creation goes on to non-existent directory
* warn if removal goes on from non-existing directory
* warn if non-existing proc entry is removed

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

proc: do translation + unlink atomically at remove_proc_entry()

remove_proc_entry() does

lock
lookup parent
unlock
lock
unlink proc entry from lists
unlock

which can be made bit more correct by doing parent translation + unlink
without dropping lock.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/hwmon/adcxx.c: fix for single-channel ADCs

While testing an ADC121S021 in an embedded board with a S3C2142 SoC (ARM
core), I have found that the 'adcxx' driver does not handle correctly
single channel ADCs from this chip family. For single channel chips you
must only issue one read transfer for correct measurement.

Signed-off-by: Jose Miguel Goncalves <jose.goncalves@inov.pt>
Cc: Marc Pignat <marc.pignat@hevs.ch>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/hwmon/vt8231.c: fix continuation line formats

String constants that are continued on subsequent lines with \ will cause
spurious whitespace in the resulting output.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Roger Lucas <vt8231@hiddenengine.co.uk>
Cc: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

crc32: some minor cleanups

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: warn on unnecessary spaces before quoted newlines

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch.pl: warn if an adding line introduce spaces before tabs.

Signed-off-by: Alberto Panizzo <maramaopercheseimorto@gmail.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch.pl: extend list of expected-to-be-const structures

Based on Arjan's suggestion, extend the list of ops structures that should
be const.

Signed-off-by: Emese Revfy <re.emese@gmail.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch.pl: add union and struct to the exceptions list

Here is a small code snippet, which will be complained about by
checkpatch.pl:

#define __STRUCT_KFIFO_COMMON(recsize, ptrtype) \
union { \
struct { \
unsigned int in; \
unsigned int out; \
}; \
char rectype[recsize]; \
ptrtype *ptr; \
const ptrtype *ptr_const; \
};

This construct is legal and safe, so checkpatch.pl should accept this. It
should be also true for struct defined in a macro.

Add the `struct' and `union' keywords to the exceptions list of the
checkpatch.pl script, to prevent error message "Macros with multiple
statements should be enclosed in a do - while loop". Otherwise it is not
possible to build a struct or union with a macro.

Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: fix false positive on __initconst

checkpatch falsely complained about '__initconst' because it thought the
'const' needed a space before. Fix this by changing the list of
attributes:

- add '__initconst'
- force plain 'init' to contain a word-boundary at the end

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch.pl: allow > 80 char lines for logging functions not just printk

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

checkpatch: trivial fix for trailing statements check

In case if the statement and the conditional are in one line, the line
appears in the report doubly.

And items of this check have no blank line before the next item.

This patch fixes these trivial problems, to improve readability of the
report.

[sample.c]
  > if (cond1
  >        && cond2
  >        && cond3) func_foo();
  >
  > if (cond4) func_bar();

Before:
  > ERROR: trailing statements should be on next line
  > #1: FILE: sample.c:1:
  > +if (cond1
  > [...]
  > +       && cond3) func_foo();
  > ERROR: trailing statements should be on next line
  > #5: FILE: sample.c:5:
  > +if (cond4) func_bar();
  > +if (cond4) func_bar();
  > total: 2 errors, 0 warnings, 5 lines checked

After:
  > ERROR: trailing statements should be on next line
  > #1: FILE: sample.c:1:
  > +if (cond1
  > [...]
  > +       && cond3) func_foo();
  >
  > ERROR: trailing statements should be on next line
  > #5: FILE: sample.c:5:
  > +if (cond4) func_bar();
  >
  > total: 2 errors, 0 warnings, 5 lines checked

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/checkpatch.pl: add WARN on sizeof(&foo)

sizeof(&foo) is frequently an error. Warn on its use.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: enable DMA on Ricoh sdhci reader by default

This card reader doesn't advertise, however DMA works well. Probably
windows SDHCI driver assumes that all readers support DMA and thus we see
that bug.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Maxim Levitsky <maximlevitsky@gmail.com>
Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Harald Welte <HaraldWelte@viatech.com>
Cc: Norbert Preining <preining@logic.at>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: correct kunmap_atomic()

kunmap_atomic() accepts a pointer to any location in the page so we do not
need the subtraction and cast.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Wolfgang Muees <wolfgang.mues@auerswald.de>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: introduce per-mci-revision conditional code

We used to manage features and differences on a per-cpu basis.  As several
cpus share the same mci revision, this patch aggregates cpus that have the
same IP revision in one defined constant.  We use the
at91mci_is_mci1rev2xx() funtion name not to mess with newer Atmel sd/mmc
IP called "MCI2".  _rev2 naming could have been confusing...

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Wolfgang Muees <wolfgang.mues@auerswald.de>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: Enable MMC_CAP_SDIO_IRQ only when it actually works.

According to the datasheets AT91SAM9261 does not support SDIO interrupts,
and AT91SAM9260/9263 have an erratum requiring 4bit mode while using slot
B for the interrupt to work.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Wolfgang Muees <wolfgang.mues@auerswald.de>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: enable large data blocks

This patch is setting some max_ variables for the IO elevator, so the
elevator will put requests for large data blocks to the driver.  This is
critical for

a) speed

and

b) wear leveling of the flash chip controller: Otherwise the controller
   will treat the SD card badly with millions of single 4 KByte write
   commands.  This will lead to a shorter life time for the SD cards.

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: use DMA buffer for read

Convert the read to use the DMA buffer as well.  The old code was doing
double-buffering DMA with the PDC; no way to make it work.  Replace it
with a single-PDC approach.  It also simplify things removing the need for
a pre_dma_read() function.

[nicolas.ferre@atmel.com coding style modifications]
Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: use one coherent DMA buffer

The TX DMA buffer is allocated only once, because the
allocation/deallocation of the buffer for EACH chunk of data is
time-consuming and prone to memory fragmentation.

Using a coherent DMA buffer avoids extra data cache calls.

[nicolas.ferre@atmel.com: coding style modifications]
Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: fix timeout errors

Fix two timeout errors, one for slow SDHC cards and one for slow users
while inserting SD cards.

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: at91_mci: fix pointer errors

Fixes two pointer errors, one which leads to memory overwrites if used
with large chunks of data.

Signed-off-by: Wolfgang Muees <wolfgang.mues@auerswald.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Andrew Victor <avictor.za@gmail.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

s3cmci: s3cmci_card_present: Use no_detect to decide whether there is a card detect pin

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

s3cmci: initialize default platform data no_wprotect and no_detect with 1

If no platform_data was givin to the device it's going to use it's default
platform data struct which has all fields initialized to zero.  As a
result the driver is going to try to request gpio0 both as write protect
and card detect pin.  Which of course will fail and makes the driver
unusable

Previously to the introduction of no_wprotect and no_detect the behavior
was to assume that if no platform data was given there is no write protect
or card detect pin.  This patch restores that behavior.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: <linux-mmc@vger.kernel.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: put active devices into 1-bit mode during suspend

And bring them back to 4-bit mode during resume.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: kick the interrupt thread upon a resume

Some SDIO cards may suspend while keeping function interrupts active
especially in the powered suspend case. Upon resume we need to kick the
SDIO interrupt thread to check for pending interrupts and to restart card
IRQ detection at the host controller level.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: don't use CMD[357] as part of a powered SDIO resume

Seen on a Marvell 8686 SDIO card and Via VX855 controller: we must avoid
sending CMD3/5/7 on a resume where power has been maintained, because the
8686 will refuse to respond to them and the MMC stack will give up on the
card.

Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: sdhci support for suspend mode PM features

Tested with an XO v1.5 from OLPC.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: introduce API for special power management features

This patch series provides the core changes needed to allow SDIO cards to
remain powered and active while the host system is suspended, and let them
wake up the host system when needed.  This is used to implement
wake-on-lan with SDIO wireless cards at the moment.  Patches to add that
support to the libertas driver will be posted separately.

This patch:

Some SDIO cards have the ability to keep on running autonomously when the
host system is suspended, and wake it up when needed.  This however
requires that the host controller preserve power to the card, and
configure itself appropriately for wake-up.

There is however 4 layers of abstractions involved: the host controller
driver, the MMC core code, the SDIO card management code, and the actual
SDIO function driver.  To make things simple and manageable, host drivers
must advertise their PM capabilities with a feature bitmask, then function
drivers can query and set those features from their suspend method.  Then
each layer in the suspend call chain is expected to act upon those bits
accordingly.

[akpm@linux-foundation.org: fix typo in comment]
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci: improve sdhci sdhci_set_adma_desc() code

sdhci_set_adma_desc() is using byte-writes to write data in a specified
order into memory.  Change to using __le16 for the two byte and __le32 for
the four byte cases and use the cpu_to_{le16,le32} to do the conversion
before writing.

This will reduce the size of the code and the number of writes as we no
longer need to chop the data up before writing.

As an example on ARM S3C64XX SoC, in little-endian configuration:

000000d4 <sdhci_set_adma_desc>:
-      d8: e1a0c423 lsr ip, r3, #8
-      dc: e1a0ec21 lsr lr, r1, #24
-      e0: e1a04821 lsr r4, r1, #16
-      e4: e1a05421 lsr r5, r1, #8
-      e8: e1a06442 asr r6, r2, #8
-      ec: e5c0c001 strb ip, [r0, #1]
-      f0: e5c0e007 strb lr, [r0, #7]
-      f4: e5c04006 strb r4, [r0, #6]
-      f8: e5c05005 strb r5, [r0, #5]
-      fc: e5c01004 strb r1, [r0, #4]
-     100: e5c06003 strb r6, [r0, #3]
-     104: e5c02002 strb r2, [r0, #2]
-     108: e5c03000 strb r3, [r0]
+      d4: e5801004 str r1, [r0, #4]
+      d8: e1c030b0 strh r3, [r0]
+      dc: e1c020b2 strh r2, [r0, #2]

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdhci: add adma descriptor set call

The code to write the ADMA descriptor into memory is repeated several
times throughout sdhci_adma_table_pre, and thus should be moved into a
common function. This will also be useful if the patch to make the write
more efficient is accepted.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sdio: add quirk to clamp byte mode transfer

Some SDIO cards expect byte transfers not to exceed the configured block
transfer size. Add a quirk to that effect.

Patches to make use of this quirk will be sent separately.

Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Nicolas Pitre <nico@marvell.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: bfin_sdh: set timeout based on actual card data

The hardcoded value doesn't really work for all cards.

Signed-off-by: Cliff Cai <cliff.cai@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: bfin_sdh: drop redundant MMC depend string

The host/Kconfig file is only included when MMC is selected.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: bfin_sdh: fix unused sg warning on BF51x/BF52x systems

The local sg variable is only used with BF54x code.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mmc: Atmel host kconfig cleanup for everyone else

This prevents those without an Atmel chip having a line in kernel
configuration which says "Atmel SD/MMC Driver" without any option.

Signed-off-by: Jonathan Cameron <jic23@cam.ac.uk>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

davinci: MMC: add support for 8bit MMC cards

Add support for 8bit MMC cards. The controller data width is configurable
depending on the wires setting in the platform data structure.

MMC 8bit is tested on OMAPL137 and MMC 4bit is tested on OMAPL138 EVM.

Signed-off-by: Vipin Bhandari <vipin.bhandari@ti.com>
Cc: David Brownell <david-b@pacbell.net>
Cc: Chaithrika U S <chaithrika@ti.com>
Cc: Sudhakar Rajashekhara <sudhakar.raj@ti.com>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ricoh_mmc: port from driver to pci quirk

This patch solves nasty problem original driver has.

Original goal of the ricoh_mmc was to disable this device because then,
mmc cards can be read using standard SDHCI controller, thus avoiding
writing of yet another driver.

However, the act of disablement, makes other pci functions that belong to
this controller (xD and memstick) shift up one level, thus pci core has
now wrong idea about these devices.

To fix this issue, this patch moves the driver into the pci quirk section,
thus it is executes before the pci is enumerated, and therefore solving
that issue, also same sequence of commands is performed on resume for same
reasons.

Also regardless of the above, this way is cleaner. You still need to set
CONFIG_MMC_RICOH_MMC to enable this quirk

Signed-off-by: Maxim Levitsky <maximlevitsky@gmail.com>
Acked-by: Philip Langdale <philipl@overt.org>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: <linux-mmc@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs/compat_ioctl.c: suppress two warnings

fs/compat_ioctl.c: In function 'do_ioctl_trans':
fs/compat_ioctl.c:534: warning: 'karg' may be used uninitialized in this function
fs/compat_ioctl.c:533: warning: 'kcmd' may be used uninitialized in this function
fs/compat_ioctl.c:656: warning: 'ret' may be used uninitialized in this function

Reduces text size by 44 bytes.

If someone calls one of these functions with an unexpected argument, the
code's buggy as-is.

Amerigo Wang <amwang@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

bitmap: use for_each_set_bit()

Replace open-coded loop with for_each_set_bit().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib: fix first line of kernel-doc for a few functions

The function name must be followed by a space, hypen, space, and a short
description.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib: build list_sort() only if needed

Build list_sort() only for configs that need it -- those that don't save
~581 bytes (i386).

Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib: revise list_sort() header comment

Clarify and correct header comment of list_sort().

Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib: more scalable list_sort()

XFS and UBIFS can pass long lists to list_sort(); this alternative
implementation scales better, reaching ~3x performance gain when list
length exceeds the L2 cache size.

Stand-alone program timings were run on a Core 2 duo L1=32KB L2=4MB,
gcc-4.4, with flags extracted from an Ubuntu kernel build.  Object size is
581 bytes compared to 455 for Mark J.  Roberts' code.

Worst case for either implementation is a list length just over a power of
two, and to roughly the same degree, so here are timing results for a
range of 2^N+1 lengths.  List elements were 16 bytes each including malloc
overhead; initial order was random.

                      time (msec)
                      Tatham-Roberts
                      |       generic-Mullis-v2
loop_count  length    |       |    ratio
4000000       2     206     294    1.427
2000000       3     176     227    1.289
1000000       5     199     172    0.864
500000       9     235     178    0.757
250000      17     243     182    0.748
125000      33     261     196    0.750
  62500      65     277     209    0.754
  31250     129     292     219    0.75
  15625     257     317     235    0.741
   7812     513     340     252    0.741
   3906    1025     362     267    0.737
   1953    2049     388     283    0.729  ~ L1 size
    976    4097     556     323    0.580
    488    8193     678     361    0.532
    244   16385     773     395    0.510
    122   32769     844     418    0.495
     61   65537     917     454    0.495
     30  131073    1128     543    0.481
     15  262145    2355     869    0.369  ~ L2 size
      7  524289    5597    1714    0.306
      3 1048577    6218    2022    0.325

Mark's code does not actually implement the usual or generic mergesort,
but rather a variant from Simon Tatham described here:

    http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

Simon's algorithm performs O(log N) passes over the entire input list,
doing merges of sublists that double in size on each pass.  The generic
algorithm instead merges pairs of equal length lists as early as possible,
in recursive order.  For either algorithm, the elements that extend the
list beyond power-of-two length are a special case, handled as nearly as
possible as a "rounding-up" to a full POT.

Some intuition for the locality of reference implications of merge order
may be gotten by watching this animation:

    http://www.sorting-algorithms.com/merge-sort

Simon's algorithm requires only O(1) extra space rather than the generic
algorithm's O(log N), but in my non-recursive implementation the actual
O(log N) data is merely a vector of ~20 pointers, which I've put on the
stack.

Long-running list_sort() calls: If the list passed in may be long, or the
client's cmp() callback function is slow, the client's cmp() may
periodically invoke cond_resched() to voluntarily yield the CPU.  All
inner loops of list_sort() call back to cmp().

Stability of the sort: distinct elements that compare equal emerge from
the sort in the same order as with Mark's code, for simple test cases.  A
boot-time test is provided to verify this and other correctness
requirements.

A kernel that uses drm.ko appears to run normally with this change; I have
no suitable hardware to similarly test the use by UBIFS.

[akpm@linux-foundation.org: style tweaks, fix comment, make list_sort_test __init]
Signed-off-by: Don Mullis <don.mullis@gmail.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/string.c: simplify strnstr()

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Joe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/string.c: simplify stricmp()

Removes 32 bytes on core2 with gcc 4.4.1:
   text    data     bss     dec     hex filename
   3196       0       0    3196     c7c lib/string-BEFORE.o
   3164       0       0    3164     c5c lib/string-AFTER.o

Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com>
Cc: Joe Perches <joe@perches.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: document and add "Q" patchwork queue entries

Patchwork queues show the acceptance/rejection state of submitted patches
for various MAINTAINER trees. Document their existence.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: WAVELAN moved to staging

by commit 0234f84ebb00d36c48062befa5436eef36b71ccd
Update patterns

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: STARMODE RADIO IP (STRIP) moved to staging

by commit 955015bb0b42167d14f776ff5947ae2463a974dc

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: update PERFORMANCE EVENTS F: patterns

To match arch/*/kernel perf_event location changes

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: remove HAYES ESP SERIAL DRIVER

Commit f53a2ade0bb9f2a81f473e6469155172a96b7c38 ("tty: esp: remove
broken driver") removed it

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MAINTAINERS: remove AMD GEODE F: arch/x86/kernel/geode_32.c

Commit c95d1e53ed89b75a4d7b68d1cbae4607b1479243 ("cs5535: drop the
Geode-specific MFGPT/GPIO code") removed it.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/get_maintainer.pl: fix possible infinite loop

If MAINTAINERS section entries are misformatted, it was possible to have
an infinite loop.

Correct the defect by always moving the index to the end of section + 1

Also, exit check for exclude as soon as possible.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

get_maintainer: quote email address with period

Picky mail systems won't accept email addresses where recipient has period
in name; ie. David S. Miller <davemloft.net> will not work.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

get_maintainer: fix perlcritic warnings

perlcritic is a standard checker for Perl Best Practices.  This patch
fixes most of the warnings in the get_maintainer script.  If kernel
programmers are going to have checkpatch they should write clean scripts
as well...

Bareword file handle opened at line 176, column 1.  See pages 202,204 of PBP.  (Severity: 5)
Two-argument "open" used at line 176, column 1.  See page 207 of PBP.  (Severity: 5)
Bareword file handle opened at line 207, column 5.  See pages 202,204 of PBP.  (Severity: 5)
Two-argument "open" used at line 207, column 5.  See page 207 of PBP.  (Severity: 5)
Bareword file handle opened at line 246, column 6.  See pages 202,204 of PBP.  (Severity: 5)
Two-argument "open" used at line 246, column 6.  See page 207 of PBP.  (Severity: 5)
Bareword file handle opened at line 258, column 2.  See pages 202,204 of PBP.  (Severity: 5)
Two-argument "open" used at line 258, column 2.  See page 207 of PBP.  (Severity: 5)
Expression form of "eval" at line 983, column 17.  See page 161 of PBP.  (Severity: 5)
Expression form of "eval" at line 985, column 17.  See page 161 of PBP.  (Severity: 5)
Subroutine prototypes used at line 1186, column 1.  See page 194 of PBP.  (Severity: 5)
Subroutine prototypes used at line 1206, column 1.  See page 194 of PBP.  (Severity: 5)

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/get_maintainer.pl: add ability to read from STDIN

Doesn't need or accept '-' as a trailing option to read stdin. Doesn't
print usage() after bad options. Adds --usage as command line equivalent
of --help

Suggested-by: Borislav Petkov <petkovbb@googlemail.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/get_maintainer.pl: change --sections to print in the same style as MAINTAINERS

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/get_maintainer.pl: add --sections, print entire matched subsystem

Print the complete contents of the matched subsystems
in pattern match depth order.

Sample output:

$ ./scripts/get_maintainer.pl --sections -f drivers/net/usb/smsc95xx.c
USB SMSC95XX ETHERNET DRIVER
M:Steve Glendinning <steve.glendinning@smsc.com>
L:netdev@vger.kernel.org
S:Supported
F:drivers/net/usb/smsc95xx.*
USB SUBSYSTEM
M:Greg Kroah-Hartman <gregkh@suse.de>
L:linux-usb@vger.kernel.org
W:http://www.linux-usb.org
T:quilt kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/
S:Supported
F:Documentation/usb/
F:drivers/net/usb/
F:drivers/usb/
F:include/linux/usb.h
F:include/linux/usb/
NETWORKING DRIVERS
L:netdev@vger.kernel.org
W:http://www.linuxfoundation.org/en/Net
T:git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git
S:Odd Fixes
F:drivers/net/
F:include/linux/if_*
F:include/linux/*device.h
THE REST
M:Linus Torvalds <torvalds@linux-foundation.org>
L:linux-kernel@vger.kernel.org
Q:http://patchwork.kernel.org/project/LKML/list/
T:git git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
S:Buried alive in reporters
F:*
F:*/

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

scripts/get_maintainer.pl: add --file-emails, find embedded email addresses

Add an imperfect option to search a source file for email addresses.

New option:  --file-emails or --fe

email addresses in files are freeform text and are nearly impossible to
parse.  Still, might as well try to do a somewhat acceptable job of
finding them.  This code should find all addresses that are in the form
addr@domain.tld

The code assumes that up to 3 alphabetic words along with dashes, commas,
and periods that preceed the email address are a name.

If 3 words are found for the name, and one of the first two words are a
single letter and period, or just a single letter then the 3 words are use
as name otherwise the last 2 words are used.

Some variants that are shown correctly:
    John Smith <jksmith@domain.org>
    Random J. Developer <rjd@tld.com>
    Random J. Developer (rjd@tld.com)
    J. Random Developer rjd@tld.com

Variants that are shown nominally correctly:
    Written by First Last (funny-addr@somecompany.com)
is shown as:
    First Last <funny-addr@somecompany.com>

Variants that are shown incorrectly:
    Some Really Long Name <srln@foo.bar>
    MontaVista Software, Inc. <source@mvista.com>
are returned as:
    Long Name <srln@foo.bar>
    "Software, Inc" <source@mvista.com>

--roles and --rolestats show "(in file)" for matches.

For instance:

Without -file-emails:

$ ./scripts/get_maintainer.pl -f -nogit -roles net/core/netpoll.c
David S. Miller <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
linux-kernel@vger.kernel.org (open list)

With -fe:

$ ./scripts/get_maintainer.pl -f -fe -nogit -roles net/core/netpoll.c
David S. Miller <davem@davemloft.net> (maintainer:NETWORKING [GENERAL])
Matt Mackall <mpm@selenic.com> (in file)
Ingo Molnar <mingo@redhat.com> (in file)
linux-kernel@vger.kernel.org (open list)
netdev@vger.kernel.org (open list:NETWORKING [GENERAL])

The number of email addresses in the file in not limited.  Neither is the
number of returned email addresses.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

printk: avoid warning when CONFIG_PRINTK is disabled

kernel/printk.c:72: warning: `saved_console_loglevel' defined but not used

Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

exec: create initial stack independent of PAGE_SIZE

Currently we create the initial stack based on the PAGE_SIZE. This is
unnecessary.

This creates this initial stack independent of the PAGE_SIZE.

It also bumps up the number of 4k pages allocated from 20 to 32, to
align with 64K page systems.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: Helge Deller <deller@gmx.de>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/pid.c: update comment on find_task_by_pid_ns

tasklist_lock does protect the task and its pid, it can't go away. The
problem is that find_pid_ns() itself is unsafe without rcu lock, it can
race with copy_process()->free_pid(any_pid).

Protecting copy_process()->free_pid(any_pid) with tasklist_lock would make
it possible to call find_task_by_pid_ns() under tasklist safely, but we
don't do so because we are trying to get rid of the read_lock sites of
tasklist_lock.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

panic: fix panic_timeout accuracy when running on a hypervisor

I've had some complaints about panic_timeout being wildly innacurate on
shared processor PowerPC partitions (a 3 minute panic_timeout taking 30
minutes).

The problem is we loop on mdelay(1) and with a 1ms in 10ms hypervisor
timeslice each of these will take 10ms (ie 10x) longer. I expect other
platforms with shared processor hypervisors will see the same issue.

This patch keeps the old behaviour if we have a panic_blink (only keyboard
LEDs right now) and does 1 second mdelays if we don't.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel core: use helpers for rlimits

Make sure compiler won't do weird things with limits. E.g. fetching them
twice may return 2 different values after writable limits are implemented.

I.e. either use rlimit helpers added in commit 3e10e716abf3 ("resource:
add helpers for fetching rlimits") or ACCESS_ONCE if not applicable.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

posix-cpu-timers: cleanup rlimits usage

Fetch rlimit (both hard and soft) values only once and work on them. It
removes many accesses through sig structure and makes the code cleaner.

Mostly a preparation for writable resource limits support.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/exit.c: fix shadows sparse warning

kernel/exit.c:1183:26: warning: symbol 'status' shadows an earlier one
kernel/exit.c:1173:21: originally declared here

Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

includecheck fix for kernel/params.c

Fix the following 'make includecheck' warning:
kernel/params.c: linux/string.h is included more than once.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: André Goddard Rosa <andre.goddard@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

splice: comparing unsigned int < 0

"ret" needs to be signed or the error handling for splice_to_pipe() won't
work correctly.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Cc: Tom Zanussi <zanussi@comcast.net>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lkdtm: add debugfs access and loosen KPROBE ties

Add adds a debugfs interface and additional failure modes to LKDTM to
provide similar functionality to the provoke-crash driver submitted here:

http://lwn.net/Articles/371208/

Crashes can now be induced either through module parameters (as before)
or through the debugfs interface as in provoke-crash.

The patch also provides a new "direct" interface, where KPROBES are not
used, i.e., the crash is invoked directly upon write to the debugfs
file. When built without KPROBES configured, only this mode is available.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Cc: M. Mohan Kumar <mohan@in.ibm.com>
Cc: Americo Wang <xiyou.wangcong@gmail.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

eisa: fix coding style for eisa bus code

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/misc/iwmc3200top/main.c: eliminate useless code

The variable priv is initialized twice to the same (side effect-free)
expression. Drop one initialization.

A simplified version of the semantic match that finds this problem is:
(http://coccinelle.lip6.fr/)

// <smpl>
@forall@
idexpression *x;
identifier f!=ERR_PTR;
@@

x = f(...)
... when != x
(
x = f(...,<+...x...+>,...)
|
* x = f(...)
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

init/main.c: make setup_max_cpus static for !SMP

The only in tree external users of the symbol setup_max_cpus are in
arch/x86/. The files ./kernel/alternative.c, ./kernel/visws_quirks.c, and
./mm/kmemcheck/kmemcheck.c are all guarded by CONFIG_SMP being defined.
For this case the symbol is an unsigned int and declared as an extern in
include/linux/smp.h.

When CONFIG_SMP is not defined the symbol setup_max_cpus is
a constant value that is only used in init/main.c. Make the symbol
static for this case.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

smp: fix documentation in include/linux/smp.h

smp: Fix documentation.

Fix documentation in include/linux/smp.h: smp_processor_id()

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nodemask.h: remove macro any_online_node

The macro any_online_node() is prone to producing sparse warnings due to
the local symbol 'node'. Since all the in-tree users are really
requesting the first online node (the mask argument is either
NODE_MASK_ALL or node_online_map) just use the first_online_node macro and
remove the any_online_node macro since there are no users.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Milton Miller <miltonm@bga.com>
Cc: Nathan Fontenot <nfont@austin.ibm.com>
Cc: Geoff Levand <geoffrey.levand@am.sony.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Ricardo Labiaga <Ricardo.Labiaga@netapp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fs: use rlimit helpers

Make sure compiler won't do weird things with limits. E.g. fetching them
twice may return 2 different values after writable limits are implemented.

I.e. either use rlimit helpers added in commit 3e10e716abf3 ("resource:
add helpers for fetching rlimits") or ACCESS_ONCE if not applicable.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpumask: let num_*_cpus() function always return unsigned values

Dependent on CONFIG_SMP the num_*_cpus() functions return unsigned or
signed values. Let them always return unsigned values to avoid strange
casts.

Fixes at least one warning:

kernel/kprobes.c: In function 'register_kretprobe':
kernel/kprobes.c:1038: warning: comparison of distinct pointer types lacks a cast

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

init/initramfs.c: fix "symbol shadows an earlier one" noise

The symbol 'count' is a local global variable in this file. The function
clean_rootfs() should use a different symbol name to prevent "symbol
shadows an earlier one" noise.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

init/main.c: improve usability in case of init binary failure

- new Documentation/init.txt file describing various forms of failure
trying to load the init binary after kernel bootup

- extend the init/main.c init failure message to direct to
Documentation/init.txt

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/cpu.c: delete deprecated definition in cpu_up()

Additional_cpus is only supported for IA64 now. X86_64 should not be
included.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MFGPT: move clocksource menu

Move the CS5535 MFGPT hrtimer kconfig option to be with the other MFGPT
options. This makes it easier to find and also removes it from the main
"Device Drivers" menu, where it should not have been.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Andres Salomon <dilinger@collabora.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

um: tell git to ignore generated files

Tell git to ignore the generated files under um, except:

include/shared/kern_constants.h
include/shared/user_constants.h

which will be moved to include/generated.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

uml: line.c: avoid NULL pointer dereference

Assign tty only if line is not NULL.

[akpm@linux-foundation.org: simplification]
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cris v32: typo in crisv32_arbiter_unwatch()?

With id 1 the wrong bp was unwatched.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cryptocop: fix assertion in create_output_descriptors()

size_t desc_len cannot be less than 0, test before the subtraction.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cris: convert to use arch_gettimeoffset()

Convert cris to use GENERIC_TIME via the arch_getoffset() infrastructure,
reducing the amount of arch specific code we need to maintain.

Signed-off-by: John Stultz <johnstul@us.ibm.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cpuidle menu: remove 8 bytes of padding on 64 bit builds

Reorder struct menu_device to remove 8 bytes of padding on 64 bit builds.
Size drops from 136 to 128 bytes, so possibly needing one fewer cache
lines.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

alpha: PTR_ERR overwrites -EINVAL in syscall osf_mount

The initial -EINVAL value is overwritten by `retval = PTR_ERR(name)'. If
this isn't an error pointer and typenr is not 1, 6 or 9, then this retval,
a pointer cast to a long, is returned.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

frv: remove pci_dma_sync_single() and pci_dma_sync_sg()

No architecture except for frv has pci_dma_sync_single() and
pci_dma_sync_sg(). The APIs are deprecated.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add comment on swap_duplicate's error code

swap_duplicate()'s loop appears to miss out on returning the error code
from __swap_duplicate(), except when that's -ENOMEM. In fact this is
intentional: prior to -ENOMEM for swap_count_continuation,
swap_duplicate() was void (and the case only occurs when copy_one_pte()
hits a corrupt pte). But that's surprising behaviour, which certainly
deserves a comment.

Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Huang Shijie <shijie8@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

nommu: get_user_pages(): pin last page on non-page-aligned start

The noMMU version of get_user_pages() fails to pin the last page when the
start address isn't page-aligned. The patch fixes this in a way that
makes find_extend_vma() congruent to its MMU cousin.

Signed-off-by: Steven J. Magnani <steve@digidescorp.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: use the same log level for show_mem()

Use the same log level for printk's in show_mem(), so that those messages
can be shown completely when using log level 6.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add comment about deprecation of __GFP_NOFAIL

__GFP_NOFAIL was deprecated in dab48dab, so add a comment that no new
users should be added.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vmscan: detect mapped file pages used only once

The VM currently assumes that an inactive, mapped and referenced file page
is in use and promotes it to the active list.

However, every mapped file page starts out like this and thus a problem
arises when workloads create a stream of such pages that are used only for
a short time.  By flooding the active list with those pages, the VM
quickly gets into trouble finding eligible reclaim canditates.  The result
is long allocation latencies and eviction of the wrong pages.

This patch reuses the PG_referenced page flag (used for unmapped file
pages) to implement a usage detection that scales with the speed of LRU
list cycling (i.e.  memory pressure).

If the scanner encounters those pages, the flag is set and the page cycled
again on the inactive list.  Only if it returns with another page table
reference it is activated.  Otherwise it is reclaimed as 'not recently
used cache'.

This effectively changes the minimum lifetime of a used-once mapped file
page from a full memory cycle to an inactive list cycle, which allows it
to occur in linear streams without affecting the stable working set of the
system.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: OSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vmscan: drop page_mapping_inuse()

page_mapping_inuse() is a historic predicate function for pages that are
about to be reclaimed or deactivated.

According to it, a page is in use when it is mapped into page tables OR
part of swap cache OR backing an mmapped file.

This function is used in combination with page_referenced(), which checks
for young bits in ptes and the page descriptor itself for the
PG_referenced bit.  Thus, checking for unmapped swap cache pages is
meaningless as PG_referenced is not set for anonymous pages and unmapped
pages do not have young ptes.  The test makes no difference.

Protecting file pages that are not by themselves mapped but are part of a
mapped file is also a historic leftover for short-lived things like the
exec() code in libc.  However, the VM now does reference accounting and
activation of pages at unmap time and thus the special treatment on
reclaim is obsolete.

This patch drops page_mapping_inuse() and switches the two callsites to
use page_mapped() directly.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: OSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

vmscan: factor out page reference checks

The used-once mapped file page detection patchset.

It is meant to help workloads with large amounts of shortly used file
mappings, like rtorrent hashing a file or git when dealing with loose
objects (git gc on a bigger site?).

Right now, the VM activates referenced mapped file pages on first
encounter on the inactive list and it takes a full memory cycle to
reclaim them again.  When those pages dominate memory, the system
no longer has a meaningful notion of 'working set' and is required
to give up the active list to make reclaim progress.  Obviously,
this results in rather bad scanning latencies and the wrong pages
being reclaimed.

This patch makes the VM be more careful about activating mapped file
pages in the first place.  The minimum granted lifetime without
another memory access becomes an inactive list cycle instead of the
full memory cycle, which is more natural given the mentioned loads.

This test resembles a hashing rtorrent process.  Sequentially, 32MB
chunks of a file are mapped into memory, hashed (sha1) and unmapped
again.  While this happens, every 5 seconds a process is launched and
its execution time taken:

python2.4 -c 'import pydoc'
old: max=2.31s mean=1.26s (0.34)
new: max=1.25s mean=0.32s (0.32)

find /etc -type f
old: max=2.52s mean=1.44s (0.43)
new: max=1.92s mean=0.12s (0.17)

vim -c ':quit'
old: max=6.14s mean=4.03s (0.49)
new: max=3.48s mean=2.41s (0.25)

mplayer --help
old: max=8.08s mean=5.74s (1.02)
new: max=3.79s mean=1.32s (0.81)

overall hash time (stdev):
old: time=1192.30 (12.85) thruput=25.78mb/s (0.27)
new: time=1060.27 (32.58) thruput=29.02mb/s (0.88) (-11%)

I also tested kernbench with regular IO streaming in the background to
see whether the delayed activation of frequently used mapped file
pages had a negative impact on performance in the presence of pressure
on the inactive list.  The patch made no significant difference in
timing, neither for kernbench nor for the streaming IO throughput.

The first patch submission raised concerns about the cost of the extra
faults for actually activated pages on machines that have no hardware
support for young page table entries.

I created an artificial worst case scenario on an ARM machine with
around 300MHz and 64MB of memory to figure out the dimensions
involved.  The test would mmap a file of 20MB, then

  1. touch all its pages to fault them in
  2. force one full scan cycle on the inactive file LRU
  -- old: mapping pages activated
  -- new: mapping pages inactive
  3. touch the mapping pages again
  -- old and new: fault exceptions to set the young bits
  4. force another full scan cycle on the inactive file LRU
  5. touch the mapping pages one last time
  -- new: fault exceptions to set the young bits

The test showed an overall increase of 6% in time over 100 iterations
of the above (old: ~212sec, new: ~225sec).  13 secs total overhead /
(100 * 5k pages), ignoring the execution time of the test itself,
makes for about 25us overhead for every page that gets actually
activated.  Note:

  1. File mapping the size of one third of main memory, _completely_
  in active use across memory pressure - i.e., most pages referenced
  within one LRU cycle.  This should be rare to non-existant,
  especially on such embedded setups.

  2. Many huge activation batches.  Those batches only occur when the
  working set fluctuates.  If it changes completely between every full
  LRU cycle, you have problematic reclaim overhead anyway.

  3. Access of activated pages at maximum speed: sequential loads from
  every single page without doing anything in between.  In reality,
  the extra faults will get distributed between actual operations on
  the data.

So even if a workload manages to get the VM into the situation of
activating a third of memory in one go on such a setup, it will take
2.2 seconds instead 2.1 without the patch.

Comparing the numbers (and my user-experience over several months),
I think this change is an overall improvement to the VM.

Patch 1 is only refactoring to break up that ugly compound conditional
in shrink_page_list() and make it easy to document and add new checks
in a readable fashion.

Patch 2 gets rid of the obsolete page_mapping_inuse().  It's not
strictly related to #3, but it was in the original submission and is a
net simplification, so I kept it.

Patch 3 implements used-once detection of mapped file pages.

This patch:

Moving the big conditional into its own predicate function makes the code
a bit easier to read and allows for better commenting on the checks
one-by-one.

This is just cleaning up, no semantics should have been changed.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: OSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: document /sys/devices/system/node/nodeX

Add a bare description of what /sys/devices/system/node/nodeX is. Others
will follow in time but right now, none of that tree is documented. The
existence of this file might at least encourage people to document new
entries.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: document /proc/pagetypeinfo

Add documentation for /proc/pagetypeinfo.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: suppress pfn range output for zones without pages

free_area_init_nodes() emits pfn ranges for all zones on the system.
There may be no pages on a higher zone, however, due to memory limitations
or the use of the mem= kernel parameter.  For example:

Zone PFN ranges:
  DMA      0x00000001 -> 0x00001000
  DMA32    0x00001000 -> 0x00100000
  Normal   0x00100000 -> 0x00100000

The implementation copies the previous zone's highest pfn, if any, as the
next zone's lowest pfn.  If its highest pfn is then greater than the
amount of addressable memory, the upper memory limit is used instead.
Thus, both the lowest and highest possible pfn for higher zones without
memory may be the same.

The pfn range for zones without memory is now shown as "empty" instead.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>