Ben Hutchings [Tue, 15 Dec 2009 02:01:29 +0000 (18:01 -0800)]
mmc: add module parameter to set whether cards are assumed removable
Some people run general-purpose distribution kernels on netbooks with
a card that is physically non-removable or logically non-removable
(e.g. used for /home) and cannot be cleanly unmounted during suspend.
Add a module parameter to set whether cards are assumed removable or
non-removable, with the default set by CONFIG_MMC_UNSAFE_RESUME.
In general, it is not possible to tell whether a card present in an MMC
slot after resume is the same one that was there before suspend. So there are
two possible behaviours, each of which will cause data loss in some cases:
CONFIG_MMC_UNSAFE_RESUME=n (default): Cards are assumed to be removed
during suspend. Any filesystem on them must be unmounted before suspend;
otherwise, buffered writes will be lost.
CONFIG_MMC_UNSAFE_RESUME=y: Cards are assumed to remain present during
suspend. They must not be swapped during suspend; otherwise, buffered
writes will be flushed to the wrong card.
Currently the choice is made at compile time and this allows that to be
overridden at module load time.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Wouter van Heyst <larstiq@larstiq.dyndns.org> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ben Dooks [Tue, 15 Dec 2009 02:01:27 +0000 (18:01 -0800)]
s3cmci: convert missed s3c2410_gpio calls to gpiolib calls
Convert two missed s3c2410 specific gpio calls to gpiolib calls.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: Simtec Linux Team <linux@simtec.co.uk> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolas Pitre [Tue, 15 Dec 2009 02:01:26 +0000 (18:01 -0800)]
sdhci: add support for the SysKonnect CardBus2SDIO adapter
This is still in use especially to develop SDIO device drivers on laptop
machines which are lacking SDIO slots. This adapter supports SDIO cards
only due to lack of 136-bit response capability.
Signed-off-by: Nicolas Pitre <nico@marvell.com> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mmc: davinci: modify data types of EDMA related variables
Currently DaVinci EDMA driver supports multiple EDMA channel controller
instances. edma_alloc_channel() api returns a 32 bit value which has the
channel controller number in MSB and the EDMA channel number in LSB. The
variables which store the value returned by edma_alloc_channel() have to
be 32 bit wide now.
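As a rough user-space illustration of why the width matters (not part of the patch): the 16/16 bit split and all helper names here are assumptions made for the sketch, not taken from the DaVinci EDMA headers.

```c
#include <stdint.h>

/* Hypothetical packing: controller number in the upper 16 bits,
 * channel number in the lower 16 bits (assumed split). */
static inline uint32_t pack_ctlr_chan(uint32_t ctlr, uint32_t chan)
{
	return (ctlr << 16) | (chan & 0xffffu);
}

static inline uint32_t unpack_ctlr(uint32_t v) { return v >> 16; }
static inline uint32_t unpack_chan(uint32_t v) { return v & 0xffffu; }

/* Storing the packed value in a narrower variable silently drops the
 * controller number, which is the problem the wider types avoid. */
static inline uint32_t store_in_u8(uint32_t v) { return (uint8_t)v; }
```

With controller 1 and channel 26, the packed value round-trips through a 32-bit variable, while the 8-bit store yields controller 0.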
Albert Herranz [Tue, 15 Dec 2009 02:01:19 +0000 (18:01 -0800)]
sdio: rework cis tuple parsing
Rework the current CIS tuple parsing code, reusing the existing
infrastructure and providing an easy way to add new CISTPL_FUNCE parsers
by TPLFE_TYPE.
Valid known CIS tuples are now silently queued for the SDIO function
driver when not parsed/processed (-EILSEQ) by the SDIO core. Unknown CIS
tuples (-ENOENT) are queued too for the SDIO function driver without
aborting the initialization, but emit a warning in the kernel log.
CISTPL_FUNCE tuples can be "whitelisted" now by adding a matching entry to
the cis_tpl_funce_list table.
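The return-code convention described here can be sketched in user space roughly as follows. This is an illustration only: the table contents, tuple codes and function names are hypothetical and do not reproduce the SDIO core.

```c
#include <errno.h>
#include <stddef.h>

typedef int (*tpl_parse_fn)(const unsigned char *buf, unsigned int size);

struct tpl_entry {
	unsigned char code;	/* tuple code */
	tpl_parse_fn parse;	/* NULL: known, but left to the function driver */
};

/* Hypothetical parser that accepts any payload. */
static int parse_accept(const unsigned char *buf, unsigned int size)
{
	(void)buf;
	(void)size;
	return 0;
}

/* Illustrative table; the codes are placeholders for this sketch. */
static const struct tpl_entry tpl_list[] = {
	{ 0x20, parse_accept },
	{ 0x91, NULL },
};

/* 0: parsed by the core; -EILSEQ: known but not parsed;
 * -ENOENT: unknown tuple code. */
static int tpl_parse(unsigned char code, const unsigned char *buf,
		     unsigned int size)
{
	size_t i;

	for (i = 0; i < sizeof(tpl_list) / sizeof(tpl_list[0]); i++) {
		if (tpl_list[i].code != code)
			continue;
		return tpl_list[i].parse ? tpl_list[i].parse(buf, size)
					 : -EILSEQ;
	}
	return -ENOENT;
}

/* Nonzero if the tuple should be queued for the function driver;
 * *warn is set when a warning should also be logged. */
static int tpl_should_queue(int ret, int *warn)
{
	*warn = (ret == -ENOENT);
	return ret == -EILSEQ || ret == -ENOENT;
}
```

Adding a parser then amounts to adding one table entry, which is the extensibility the rework is aiming for.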
Signed-off-by: Albert Herranz <albert_herranz@yahoo.es> Acked-by: Pierre Ossman <pierre@ossman.eu> Cc: <linux-mmc@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Frysinger [Tue, 15 Dec 2009 02:01:16 +0000 (18:01 -0800)]
firmware: only allow EDD on x86
Rather than have the EDD depend on !ia64 (and assuming that only ia64,
x86, x86_64 will be including this Kconfig), have EDD depend on the only
arches which can support this code. This should allow all other arches to
cleanly include the firmware Kconfig.
Also simplify the x86 string used by FIRMWARE_MEMMAP to match EDD.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Matt Domsch <Matt_Domsch@dell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We recently marked strstrip() as must_check because it was frequently
misused and its return value should be checked. However, we found one
exception: scsi/ipr.c intentionally ignores the return value of strstrip()
because it wishes to keep the whitespace at the beginning.
Thus we need both a checked and an unchecked whitespace-trimming function.
This patch adds a new strim() and changes ipr.c to use it.
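The difference between using and ignoring the return value can be seen in a small user-space sketch. The function name trim_sketch is hypothetical and this is not the kernel implementation; it only mirrors the behaviour described here: trailing whitespace is cut in place, and the returned pointer skips the leading whitespace.

```c
#include <ctype.h>
#include <string.h>

/* Trim in place: cut trailing whitespace, return a pointer past the
 * leading whitespace. Hypothetical user-space stand-in for strim(). */
static char *trim_sketch(char *s)
{
	size_t len;
	char *end;

	while (isspace((unsigned char)*s))
		s++;

	len = strlen(s);
	if (len == 0)
		return s;

	end = s + len - 1;
	while (end >= s && isspace((unsigned char)*end))
		end--;
	end[1] = '\0';

	return s;
}
```

A caller that uses the return value sees the string trimmed on both sides; a caller that ignores it, as ipr.c wants to, keeps the leading whitespace in its buffer and loses only the trailing part.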
[akpm@linux-foundation.org: coding-style fixes] Suggested-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:01:12 +0000 (18:01 -0800)]
drivers/md/md.c: use %pU to print UUIDs
Signed-off-by: Joe Perches <joe@perches.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:01:10 +0000 (18:01 -0800)]
fs/xfs/xfs_log_recover.c: use %pU to print UUIDs
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Alex Elder <aelder@sgi.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional change. Cache the strlen() result to avoid recalculating it
up to 3 times in the worst case.
Reduces code size by 32 bytes:
   text    data     bss     dec     hex filename
   1385       0       0    1385     569 lib/parser.o-BEFORE
   1353       0       0    1353     549 lib/parser.o-AFTER
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
tree-wide: convert open calls to remove spaces to skip_spaces() lib function
Makes use of skip_spaces() defined in lib/string.c for removing leading
spaces from strings all over the tree.
It decreases lib.a code size by 47 bytes and reuses the function tree-wide:
   text    data     bss     dec     hex filename
  64688     584     592   65864   10148 (TOTALS-BEFORE)
  64641     584     592   65817   10119 (TOTALS-AFTER)
Also, while at it, if we see (*str && isspace(*str)), we can be sure to
remove the first condition (*str) as the second one (isspace(*str)) also
evaluates to 0 whenever *str == 0, making it redundant. In other words,
"a char equal to zero is never a space".
Julia Lawall tried the semantic patch (http://coccinelle.lip6.fr) below,
and found occurrences of this pattern in 3 more files:
drivers/leds/led-class.c
drivers/leds/ledtrig-timer.c
drivers/video/output.c
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: Julia Lawall <julia@diku.dk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Neil Brown <neilb@suse.de> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Cc: David Howells <dhowells@redhat.com> Cc: <linux-ext4@vger.kernel.org> Cc: Samuel Ortiz <samuel@sortiz.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
string: on strstrip(), first remove leading spaces before running over str
... so that strlen() iterates over a smaller string comprising only the
remaining characters.
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
string: factorize skip_spaces and export it to be generally available
In the following statement:
	while (*s && isspace(*s))
		s++;
If *s == 0, isspace() evaluates to ((_ctype[*s] & 0x20) != 0), which
evaluates to ((0x08 & 0x20) != 0), which equals 0 as well.
If *s != 0, we depend on the isspace() result anyway. In other words,
"a char equal to zero is never a space", so remove this check.
Also, *s != 0 is the most common case (non-empty string).
Fixed const return as noticed by Jan Engelhardt and James Bottomley.
Fixed unnecessary extra cast on strstrip() as noticed by Jan Engelhardt.
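A user-space sketch of the simplified loop, for illustration. The name skip_spaces_sketch and the const return type are choices made for this sketch, not the exported kernel function's signature.

```c
#include <ctype.h>

/* Hypothetical user-space stand-in for skip_spaces(). */
static const char *skip_spaces_sketch(const char *s)
{
	/* isspace('\0') is 0, so the loop stops at the terminator
	 * without a separate *s check. */
	while (isspace((unsigned char)*s))
		s++;
	return s;
}
```

On an empty or all-whitespace string the function returns a pointer to the terminating NUL, exactly as the two-condition loop would.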
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 15 Dec 2009 02:01:02 +0000 (18:01 -0800)]
drivers/scsi/sym53c8xx_2/sym_glue.c: rename skip_spaces() to sym_skip_spaces()
To avoid a collision with the newly-added kernel-wide skip_spaces().
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vsprintf: reuse almost identical simple_strtoulX() functions
The difference between simple_strtoul() and simple_strtoull() is just
the size of the variable used to keep track of the sum of characters
converted to numbers:
unsigned long simple_strtoul() {...}
unsigned long long simple_strtoull(){...}
Both are the same size on my Core 2/gcc 4.4.1.
Neither function checks for overflow, so an extremely long numeric
string can overflow them without being noticed.
As we do not care about overflow in these functions, always keep the sum
in the larger variable (unsigned long long) in simple_strtoull()
and cast it to (unsigned long) in simple_strtoul(), which then becomes
just a wrapper around simple_strtoull().
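The wrapper arrangement can be sketched in user space like this. The function names are hypothetical, and the sketch is deliberately simplified: it takes an explicit base between 2 and 36 and does not handle any prefix or base auto-detection.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for simple_strtoull(): accumulate in the wide
 * type. Overflow is not detected, as described in the text. */
static unsigned long long strtoull_sketch(const char *cp, char **endp,
					  unsigned int base)
{
	unsigned long long result = 0;

	for (;; cp++) {
		unsigned int c = (unsigned char)*cp;
		unsigned int value;

		if (isdigit(c))
			value = c - '0';
		else if (isalpha(c))
			value = tolower(c) - 'a' + 10;
		else
			break;
		if (value >= base)
			break;
		result = result * base + value;
	}
	if (endp)
		*endp = (char *)cp;
	return result;
}

/* The narrow variant is just a cast around the wide one. */
static unsigned long strtoul_sketch(const char *cp, char **endp,
				    unsigned int base)
{
	return (unsigned long)strtoull_sketch(cp, endp, base);
}
```

Only one copy of the conversion loop remains, which is where the code size saving comes from.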
Code size decreases by 304 bytes:
   text    data     bss     dec     hex filename
  15534       0       8   15542    3cb6 vsprintf.o (ex lib/lib.a-BEFORE)
  15230       0       8   15238    3b86 vsprintf.o (ex lib/lib.a-AFTER)
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vsprintf: factor out skip_space code in a separate function
When converting more caller sites, the inline decision will be left up to gcc.
It decreases code size:
   text    data     bss     dec     hex filename
  15710       0       8   15718    3d66 vsprintf.o (ex lib/lib.a-BEFORE)
  15534       0       8   15542    3cb6 vsprintf.o (ex lib/lib.a-AFTER)
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vsprintf: reduce code size by avoiding extra check
No functional change, just refactor the code so that it avoids checking
"if (hi)" twice in a row, taking advantage of the previous check.
It also reduces code size:
   text    data     bss     dec     hex filename
  15726       0       8   15734    3d76 vsprintf.o (ex lib/lib.a-BEFORE)
  15710       0       8   15718    3d66 vsprintf.o (ex lib/lib.a-AFTER)
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It decreases code size as well:
   text    data     bss     dec     hex filename
  15758       0       8   15766    3d96 vsprintf.o (ex lib/lib.a-BEFORE)
  15726       0       8   15734    3d76 vsprintf.o (ex lib/lib.a-TOLOWER)
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset reduces lib/lib.a code size by 482 bytes on my Core 2 with
gcc 4.4.1 even considering that it exports a newly defined function
skip_spaces() to drivers:
   text    data     bss     dec     hex filename
  64867     840     592   66299   102fb (TOTALS-lib.a-BEFORE)
  64641     584     592   65817   10119 (TOTALS-lib.a-AFTER)
and implements some code tidy up.
Besides reducing lib.a size, it converts many in-tree drivers to use the
newly defined function, which makes another small reduction in kernel size
overall when those drivers are used.
This patch:
Change "<NULL>" to "(null)", unifying 3 equal strings.
glibc also uses "(null)" for the same purpose.
It decreases code size by 7 bytes:
   text    data     bss     dec     hex filename
  15765       0       8   15773    3d9d vsprintf.o (ex lib/lib.a-BEFORE)
  15758       0       8   15766    3d96 vsprintf.o (ex lib/lib.a-AFTER)
Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:00:54 +0000 (18:00 -0800)]
MAINTAINERS: Update file patterns for WOLFSON MICROELECTRONICS PMIC DRIVERS
One of the includes pointed to a non-existent directory
Add Documentation/hwmon/wm83??
Add sound/soc/codecs/wm(8350|8400).h files
Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:00:52 +0000 (18:00 -0800)]
MAINTAINERS: rename PALM TREO section and file patterns
Signed-off-by: Joe Perches <joe@perches.com> Cc: Tomas Cech <sleep_walker@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
KOSAKI Motohiro [Tue, 15 Dec 2009 02:00:52 +0000 (18:00 -0800)]
MAINTAINERS: mark cifs mailing list as "moderated for non-subscribers"
If non-subscribers post a bug report to the CIFS mailing list, they will
get the following message:
Your mail to 'linux-cifs-client' with the subject
[PATCH x/x] cifs: xxxxxxxxxxxxx
Is being held until the list moderator can review it for approval.
The reason it is being held:
Post by non-member to a members-only list
Either the message will get posted to the list, or you will receive
notification of the moderator's decision. If you would like to cancel
this posting, please visit the following URL:
A members-only list should be marked as such in the MAINTAINERS file.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Steve French <sfrench@samba.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:00:49 +0000 (18:00 -0800)]
scripts/get_maintainer.pl: fix --non with --git-blame and cleanups
Fix email matching without name --n and --git-blame
Using --non and --git-blame caused maintainer signature
matching to fail. Fixed that by adding 3rd argument to
sub format_email to control show/hide name portion of address
Slurp -f file instead of reading line-by-line for K: pattern matching.
Suggested by Wolfram Sang as more efficient
Refactor git command execution
Break into 2 functions, execute/analyze
Share code between --git and --git-blame
Don't warn multiple times when git isn't installed
Improve stats reporting
--git-min-percent and --rolestats now count the total number of commits
for either the period of --git-since or if using --git-blame the commits
used by the current file and calculate commit % as
# of commits signed / total commits * 100
Code style cleaning
Use consistent sub foo { my (args...) = @_;
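The commit percentage described under "Improve stats reporting" is simple arithmetic; a small sketch follows. The function name is hypothetical, and rounding to the nearest integer is an assumption, chosen because it agrees with sample figures such as 29/79=37% that appear elsewhere in this log.

```c
/* Percentage of commits signed, rounded to nearest (assumed rounding).
 * Returns 0 when there are no commits to avoid dividing by zero. */
static unsigned int commit_percent(unsigned int nsigned, unsigned int total)
{
	if (total == 0)
		return 0;
	return (nsigned * 100u + total / 2u) / total;
}
```

For example, 16 signed commits out of 79 gives 20, and 6 out of 79 gives 8.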
Signed-off-by: Joe Perches <joe@perches.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Greg KH <greg@kroah.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:00:46 +0000 (18:00 -0800)]
scripts/get_maintainer.pl: add --roles and --rolestats
--roles shows the role of each email address, i.e. why it was selected.
--rolestats selects --roles and adds git log/blame signers #'s and %
Multiple roles are possible (supporter, maintainer, git-signer...)
--roles or --rolestats is meant to help identify appropriate maintainers
to notify and should not be used with "git send-email --cc-cmd"
Example output:
Existing:
$ ./scripts/get_maintainer.pl -f arch/x86/kernel/acpi/boot.c
Corentin Chary <corentincj@iksaif.net>
Karol Kozimor <sziwan@users.sourceforge.net>
Len Brown <len.brown@intel.com>
Pavel Machek <pavel@ucw.cz>
Rafael J. Wysocki <rjw@sisk.pl>
Thomas Gleixner <tglx@linutronix.de>
Ingo Molnar <mingo@redhat.com>
H. Peter Anvin <hpa@zytor.com>
x86@kernel.org
Yinghai Lu <yhlu.kernel@gmail.com>
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
acpi4asus-user@lists.sourceforge.net
linux-pm@lists.linux-foundation.org
linux-kernel@vger.kernel.org
With --roles
$ ./scripts/get_maintainer.pl --roles -f arch/x86/kernel/acpi/boot.c
Corentin Chary <corentincj@iksaif.net> (maintainer:ASUS ACPI EXTRAS...)
Karol Kozimor <sziwan@users.sourceforge.net> (maintainer:ASUS ACPI EXTRAS...)
Len Brown <len.brown@intel.com> (supporter:SUSPEND TO RAM,git-signer)
Pavel Machek <pavel@ucw.cz> (supporter:SUSPEND TO RAM)
Rafael J. Wysocki <rjw@sisk.pl> (supporter:SUSPEND TO RAM)
Thomas Gleixner <tglx@linutronix.de> (maintainer:X86 ARCHITECTURE...)
Ingo Molnar <mingo@redhat.com> (maintainer:X86 ARCHITECTURE...,git-signer)
H. Peter Anvin <hpa@zytor.com> (maintainer:X86 ARCHITECTURE...)
x86@kernel.org (maintainer:X86 ARCHITECTURE...)
Yinghai Lu <yhlu.kernel@gmail.com> (git-signer)
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> (git-signer)
acpi4asus-user@lists.sourceforge.net (open list:ASUS ACPI EXTRAS...)
linux-pm@lists.linux-foundation.org (open list:SUSPEND TO RAM)
linux-kernel@vger.kernel.org (open list)
With --rolestats
$ ./scripts/get_maintainer.pl --rolestats -f arch/x86/kernel/acpi/boot.c
Corentin Chary <corentincj@iksaif.net> (maintainer:ASUS ACPI EXTRAS...)
Karol Kozimor <sziwan@users.sourceforge.net> (maintainer:ASUS ACPI EXTRAS...)
Len Brown <len.brown@intel.com> (supporter:SUSPEND TO RAM,git-signer:16/79=20%)
Pavel Machek <pavel@ucw.cz> (supporter:SUSPEND TO RAM)
Rafael J. Wysocki <rjw@sisk.pl> (supporter:SUSPEND TO RAM)
Thomas Gleixner <tglx@linutronix.de> (maintainer:X86 ARCHITECTURE...)
Ingo Molnar <mingo@redhat.com> (maintainer:X86 ARCHITECTURE...,git-signer:29/79=37%)
H. Peter Anvin <hpa@zytor.com> (maintainer:X86 ARCHITECTURE...)
x86@kernel.org (maintainer:X86 ARCHITECTURE...)
Yinghai Lu <yhlu.kernel@gmail.com> (git-signer:12/79=15%)
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> (git-signer:6/79=8%)
acpi4asus-user@lists.sourceforge.net (open list:ASUS ACPI EXTRAS...)
linux-pm@lists.linux-foundation.org (open list:SUSPEND TO RAM)
linux-kernel@vger.kernel.org (open list)
With --rolestats and --git-blame
$ ./scripts/get_maintainer.pl --rolestats --git-blame -f arch/x86/kernel/acpi/boot.c
Corentin Chary <corentincj@iksaif.net> (maintainer:ASUS ACPI EXTRAS...)
Karol Kozimor <sziwan@users.sourceforge.net> (maintainer:ASUS ACPI EXTRAS...)
Len Brown <len.brown@intel.com> (supporter:SUSPEND TO RAM,git-signer:16/79=20%,commits:22/154=14%)
Pavel Machek <pavel@ucw.cz> (supporter:SUSPEND TO RAM)
Rafael J. Wysocki <rjw@sisk.pl> (supporter:SUSPEND TO RAM)
Thomas Gleixner <tglx@linutronix.de> (maintainer:X86 ARCHITECTURE...)
Ingo Molnar <mingo@redhat.com> (maintainer:X86 ARCHITECTURE...,git-signer:29/79=37%,commits:36/154=23%)
H. Peter Anvin <hpa@zytor.com> (maintainer:X86 ARCHITECTURE...)
x86@kernel.org (maintainer:X86 ARCHITECTURE...)
Yinghai Lu <yhlu.kernel@gmail.com> (git-signer:12/79=15%,commits:9/154=6%)
Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> (git-signer:6/79=8%)
Andi Kleen <ak@suse.de> (commits:11/154=7%)
Andrew Morton <akpm@osdl.org> (commits:10/154=6%)
acpi4asus-user@lists.sourceforge.net (open list:ASUS ACPI EXTRAS...)
linux-pm@lists.linux-foundation.org (open list:SUSPEND TO RAM)
linux-kernel@vger.kernel.org (open list)
Other changes:
Format git-signers email addresses a bit to reduce bad signatures
Command line bad arguments emitted a verbose usage(), just show --help
Version number bumped to .22
Ben Hutchings had the idea and created a good deal of this implementation.
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bernhard Walle [Tue, 15 Dec 2009 02:00:43 +0000 (18:00 -0800)]
vt: introduce and use vt_kmsg_redirect() function
The kernel offers with TIOCL_GETKMSGREDIRECT ioctl() the possibility to
redirect the kernel messages to a specific console.
However, since it's not possible to switch to the kernel message console
after a panic(), it would be nice if the kernel would print the panic
message on the current console.
This patch series adds a new interface to access the global kmsg_redirect
variable by a function to be able to use it in code where
CONFIG_VT_CONSOLE is not set (kernel/panic.c).
This patch:
Instead of using and exporting a global value kmsg_redirect, introduce a
function vt_kmsg_redirect() that both can set and return the console where
messages are printed.
Change all users of kmsg_redirect (the VT code itself and kernel/power.c)
to the new interface.
The main advantage is that vt_kmsg_redirect() can also be used when
CONFIG_VT_CONSOLE is not set.
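A combined set-and-query accessor of this kind might look like the following user-space sketch. The name kmsg_redirect_sketch, the use of -1 as a "query only" argument, and returning the previous value on a set are assumptions made for illustration; the sketch is also not atomic.

```c
/* Hypothetical stand-in: pass -1 to query, anything else to set.
 * Returns the console that was selected before the call. */
static int kmsg_redirect_sketch(int new_console)
{
	static int kmsg_con;	/* 0 means "no redirection" in this sketch */
	int old = kmsg_con;

	if (new_console != -1)
		kmsg_con = new_console;
	return old;
}
```

Because the state is private to the function, callers outside the VT code need only a prototype (or a trivial stub when the console code is not built) rather than access to a global variable.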
Signed-off-by: Bernhard Walle <bernhard@bwalle.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Tue, 15 Dec 2009 02:00:41 +0000 (18:00 -0800)]
cs5535: drop the Geode-specific MFGPT/GPIO code
With generic modular drivers handling all of this stuff, the
geode-specific code can go away. The cs5535-gpio, cs5535-mfgpt, and
cs5535-clockevt drivers now handle this.
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Chris Ball <cjb@laptop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Tue, 15 Dec 2009 02:00:38 +0000 (18:00 -0800)]
cs5535: add a generic clock event MFGPT driver
This is based on the old code in arch/x86/kernel/mfgpt_32.c, but is
modular and not Geode-specific. There's no reason why the clock event
device needs to be registered so early at boot; the clockevent code is
perfectly capable of dynamic switching.
[akpm@linux-foundation.org: add linux/irq.h include] Signed-off-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Chris Ball <cjb@laptop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Tue, 15 Dec 2009 02:00:37 +0000 (18:00 -0800)]
cs5535: add a generic MFGPT driver
This is based on the old code in arch/x86/kernel/mfgpt_32.c, except it's
not x86 specific, it's modular, and it makes use of a PCI BAR rather than
a random MSR. Currently module unloading is not supported; it's uncertain
whether or not it can be made to work with the hardware.
[akpm@linux-foundation.org: add X86 dependency] Signed-off-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: john stultz <johnstul@us.ibm.com> Cc: Chris Ball <cjb@laptop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Tue, 15 Dec 2009 02:00:36 +0000 (18:00 -0800)]
ALSA: cs5535audio: free OLPC quirks from reliance on MGEODE_LX cpu optimization
Previously, OLPC support for the mic extensions was only enabled in the
ALSA driver if CONFIG_OLPC and CONFIG_MGEODE_LX were both set. This was
because the old geode GPIO code was written in a manner that assumed
CONFIG_MGEODE_LX. With the new cs553x-gpio driver, this is no longer the
case; as such, we can drop the requirement on CONFIG_MGEODE_LX and instead
include a requirement on GPIOLIB.
We use the generic GPIO API rather than the cs553x-specific API.
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andres Salomon [Tue, 15 Dec 2009 02:00:32 +0000 (18:00 -0800)]
cs5535-gpio: add AMD CS5535/CS5536 GPIO driver support
This creates a CS5535/CS5536 GPIO driver which uses a gpio_chip backend
(allowing GPIO users to use the generic GPIO API if desired) while also
allowing architecture-specific users directly (via the cs5535_gpio_*
functions).
Tested on an OLPC machine. Some Lemotes also use CS5536 (with a MIPS
CPU), which is why this is in drivers/gpio rather than arch/x86.
Currently, it conflicts with older geode GPIO support; once MFGPT support
is reworked to also be more generic, the older geode code will be removed.
Signed-off-by: Andres Salomon <dilinger@collabora.co.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Jordan Crouse <jordan@cosmicpenguin.net> Cc: David Brownell <david-b@pacbell.net> Reviewed-by: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/char/misc.c: use bitmap/bitops functions for dynamic minor number allocation
Use DECLARE_BITMAP(), find_first_zero_bit(), set_bit() and clear_bit()
instead of open-coding the handling of the bitmap used for dynamic minor
number allocation.
We need to invert the bit position to keep the code behaviour of using the
last minor numbers first, since we don't have a find_last_zero_bit.
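The bit inversion can be illustrated with a user-space sketch. The pool size of 64, the single-word bitmap and all names are assumptions for the sketch; the kernel helpers named in the text are replaced by a plain loop.

```c
#include <stdint.h>

#define DYN_MINORS 64	/* assumed pool size for this sketch */

static uint64_t minor_map;	/* bit i set => slot i in use */

/* Index of the lowest clear bit, or DYN_MINORS if none is clear. */
static int first_zero_bit(uint64_t map)
{
	int i;

	for (i = 0; i < DYN_MINORS; i++)
		if (!(map & (UINT64_C(1) << i)))
			return i;
	return DYN_MINORS;
}

/* Allocate: the lowest free slot i maps to minor DYN_MINORS - i - 1,
 * so the highest minor numbers are handed out first. */
static int alloc_minor(void)
{
	int i = first_zero_bit(minor_map);

	if (i >= DYN_MINORS)
		return -1;
	minor_map |= UINT64_C(1) << i;
	return DYN_MINORS - i - 1;
}

/* Free: apply the same inversion to find the slot to clear. */
static void free_minor(int minor)
{
	int i = DYN_MINORS - minor - 1;

	if (i >= 0 && i < DYN_MINORS)
		minor_map &= ~(UINT64_C(1) << i);
}
```

Searching for the first zero bit and inverting the index gives the same "last minors first" order that a search from the top of the bitmap would.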
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
drivers/char/misc.c: clear allocation bit in minor bitmap when device register fails
If there's a failure creating the device (because there's already one with
the same name, for example), the current implementation does not clear the
bit for the allocated minor and that number is lost for future
allocations.
Second, the test currently in misc_deregister is broken, since it does not
test for the 0 minor.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Phil Carmody [Tue, 15 Dec 2009 02:00:29 +0000 (18:00 -0800)]
err.h: add helper function to simplify pointer error checking
There are quite a few instances in the kernel of checks of pointers both
against NULL and against the errno range, handling both cases identically.
This additional helper function would simplify such code.
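A user-space sketch of such a combined check. The names err_ptr, is_err and is_err_or_null, and the errno range of 4095, are assumptions for illustration rather than the contents of err.h.

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_ERRNO_SKETCH 4095	/* size of the errno range assumed here */

/* Encode a negative errno value as a pointer. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if ptr lies in the top MAX_ERRNO_SKETCH addresses. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* The combined check: NULL or an encoded errno. */
static int is_err_or_null(const void *ptr)
{
	return ptr == NULL || is_err(ptr);
}
```

Call sites that previously tested for NULL and for the error range separately can then use a single predicate.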
[akpm@linux-foundation.org: build fix] Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jean Delvare [Tue, 15 Dec 2009 02:00:27 +0000 (18:00 -0800)]
ioc3/ioc4: various section fixes
Several IOC3 and IOC4 drivers misuse the __devinit and __devexit section
markers. Use __init and __exit instead as appropriate, then add __devinit
and __devexit where they really belong for PCI drivers.
Also make ioc4_serial_init static.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Pat Gefre <pfg@sgi.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 15 Dec 2009 02:00:25 +0000 (18:00 -0800)]
kernel.h: add printk_ratelimited and pr_<level>_rl
Add a printk_ratelimited statement expression macro that uses a per-call
ratelimit_state so that multiple subsystems' output messages are not
suppressed by a global __ratelimit state.
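The idea of per-call-site state can be sketched in user space as follows. This is a simplification under stated assumptions: the limiter counts messages only and has no time window, it writes to stderr, and every name in it is hypothetical.

```c
#include <stdio.h>

struct rl_state {
	int burst;	/* messages allowed per call site (no time window) */
	int printed;	/* messages emitted so far */
	int missed;	/* messages suppressed so far */
};

/* Returns nonzero if the caller may print. */
static int rl_allow(struct rl_state *rs)
{
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	rs->missed++;
	return 0;
}

static int rl_emitted;	/* total lines actually printed, for inspection */

/* Each expansion gets its own static state, so one noisy call site
 * cannot exhaust the budget of another. burst_ must be a constant. */
#define print_ratelimited(burst_, ...)					\
	do {								\
		static struct rl_state rs_ = { (burst_), 0, 0 };	\
		if (rl_allow(&rs_)) {					\
			fprintf(stderr, __VA_ARGS__);			\
			rl_emitted++;					\
		}							\
	} while (0)
```

Two loops that each call the macro five times with a burst of 2 print two lines each, which a single shared state would not allow.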
misc: remove MAC pmu function declaration from misc device class
Commit 8c8709334cec803368a432a33e0f2e116d48fe07 has removed the
pmu_device_init call from misc_init, but unlike other similar commits,
has not removed its declaration.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Amerigo Wang [Tue, 15 Dec 2009 02:00:21 +0000 (18:00 -0800)]
rwsem: fix rwsem_is_locked() bugs
rwsem_is_locked() tests ->activity without locks, so we should always keep
->activity consistent. However, the code in __rwsem_do_wake() breaks this
rule: it updates ->activity only after _all_ readers have been woken up.
This may give some reader a wrong ->activity value and thus cause
rwsem_is_locked() to behave incorrectly.
Quote from Andrew:
"
- we have one or more processes sleeping in down_read(), waiting for access.
- we wake one or more processes up without altering ->activity
- they start to run and they do rwsem_is_locked(). This incorrectly
returns "false", because the waker process is still crunching away in
__rwsem_do_wake().
- the waker now alters ->activity, but it was too late.
"
So we need to take a spinlock to protect this. And rwsem_is_locked() should
not block, so we use spin_trylock_irqsave().
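A user-space sketch of a non-blocking query under a try-lock. A C11 atomic_flag stands in for the spinlock, the structure and function names are hypothetical, and reporting "locked" when the try-lock fails is an assumption of this sketch.

```c
#include <stdatomic.h>

struct rwsem_sketch {
	atomic_flag wait_lock;	/* stands in for the spinlock */
	int activity;		/* nonzero while held, 0 when unlocked */
};

/* Non-blocking query: if the lock is busy, someone is updating the
 * semaphore, so report "locked" rather than read ->activity mid-update. */
static int rwsem_is_locked_sketch(struct rwsem_sketch *sem)
{
	int ret = 1;

	if (!atomic_flag_test_and_set(&sem->wait_lock)) {
		ret = (sem->activity != 0);
		atomic_flag_clear(&sem->wait_lock);
	}
	return ret;
}
```

The query never spins: it either reads ->activity under the lock or gives a conservative answer immediately.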
[akpm@linux-foundation.org: simplify code] Reported-by: Brian Behlendorf <behlendorf1@llnl.gov> Cc: Ben Woodard <bwoodard@llnl.gov> Cc: David Howells <dhowells@redhat.com> Signed-off-by: WANG Cong <amwang@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The symbol 'call' is a static symbol used for initcall_debug. This same
symbol name is used locally by a couple functions and produces the
following sparse warnings:
warning: symbol 'call' shadows an earlier one
Fix this noise by renaming the local symbols.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xiao Guangrong [Tue, 15 Dec 2009 02:00:16 +0000 (18:00 -0800)]
generic-ipi: cleanup for generic_smp_call_function_interrupt()
Use smp_processor_id() instead of get_cpu() and put_cpu() in
generic_smp_call_function_interrupt(). There is no need to disable
preemption, because generic_smp_call_function_interrupt() must be called
with interrupts disabled.
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ad525x_dpot: new driver for AD525x digital potentiometers
This driver supports the non-volatile digital potentiometers via I2C:
AD5258, AD5259, AD5251, AD5252, AD5253, AD5254, and AD5255
It provides a sysfs interface to each device for reading/writing which
is documented in Documentation/misc-devices/ad525x_dpot.txt.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Chris Verges <chrisv@cyberswitching.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 70867453092297be9afb2249e712a1f960ec0a09 ("printk_once(): use bool
for boolean flag") changed printk_once() to use bool instead of int for
its guard variable. Do the same change to WARN_ONCE() and WARN_ON_ONCE(),
for the same reasons.
This resulted in a reduction of 1462 bytes on an x86-64 defconfig:
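The guard-variable pattern itself can be illustrated in userspace. The following is a minimal sketch, not the kernel's macro: WARN_ONCE_SKETCH, warn_count and trigger() are hypothetical names, and the point is only that a per-call-site static bool is enough to record "already warned".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how often the warning body actually ran (demonstration only). */
static int warn_count;

/*
 * Userspace imitation of a warn-once macro. Each expansion site gets its
 * own static guard; a bool suffices because the guard has two states.
 */
#define WARN_ONCE_SKETCH(cond, msg)				\
	do {							\
		static bool warned_;				\
		if ((cond) && !warned_) {			\
			warned_ = true;				\
			warn_count++;				\
			fprintf(stderr, "WARNING: %s\n", msg);	\
		}						\
	} while (0)

/* Hit the same call site `times` times and report the total warnings. */
int trigger(int times)
{
	for (int i = 0; i < times; i++)
		WARN_ONCE_SKETCH(i >= 0, "condition hit");
	return warn_count;
}
```

However often trigger() is called, the single call site inside it warns at most once, because the guard keeps its value across calls.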
Alexey Dobriyan [Tue, 15 Dec 2009 02:00:11 +0000 (18:00 -0800)]
uml: convert to seq_file/proc_fops
Convert code away from ->read_proc/->write_proc interfaces. Switch to
proc_create()/proc_create_data() which make addition of proc entries
reliable wrt NULL ->proc_fops, NULL ->data and so on.
Arjan van de Ven [Tue, 15 Dec 2009 02:00:11 +0000 (18:00 -0800)]
floppy: Add an extra bound check on ioctl arguments
gcc is not convinced that the floppy.c ioctl has sufficient bound checks:
In function `copy_from_user',
inlined from `fd_copyin' at drivers/block/floppy.c:3080,
inlined from `fd_ioctl' at drivers/block/floppy.c:3503:
arch/x86/include/asm/uaccess_32.h:211:
warning: call to `copy_from_user_overflow' declared with attribute
warning: copy_from_user buffer size is not provably correct
And frankly, as a human I have a hard time proving the same more or less
(the size comes from the ioctl argument. humpf. maybe. the code isn't
very nice)
This patch adds an explicit check to make 100% sure it's safe, better than
finding out later that there indeed was a gap.
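The kind of explicit check described here can be sketched in plain C. This is a minimal userspace sketch, not the floppy driver's code: union in_param and copy_in_checked() are hypothetical stand-ins for the fixed-size argument buffer and the copy helper.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the fixed-size buffer ioctl arguments land in. */
union in_param {
	char raw[64];
	long value;
};

/*
 * Copy `size` bytes of caller-supplied data into `dst`, refusing any size
 * that does not fit. The explicit comparison lets both the compiler and a
 * human reader see that the copy cannot overflow the destination.
 */
int copy_in_checked(union in_param *dst, const void *src, long size)
{
	if (size < 0 || (size_t)size > sizeof(*dst))
		return -EINVAL;
	memcpy(dst, src, (size_t)size);
	return 0;
}
```

A size equal to sizeof(union in_param) is accepted, while one byte more or a negative size is rejected before any data is copied.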
[akpm@linux-foundation.org: add WARN_ON()] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Julia Lawall [Tue, 15 Dec 2009 02:00:09 +0000 (18:00 -0800)]
drivers/cpuidle: Move dereference after NULL test
It does not seem possible that ldev can be NULL, so drop the unnecessary
test. If ldev can somehow be NULL, then the initialization of last_idx
should be moved below the test.
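The two ways of resolving a dereference that precedes a NULL test can be sketched as follows. This is a userspace illustration under assumed names (struct dev_state, last_index_trusted(), last_index_checked()), not the cpuidle code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure standing in for per-device state. */
struct dev_state {
	int last_idx;
};

/*
 * Problematic ordering: `d` is dereferenced in the initializer, so the
 * NULL test that follows can never protect that access.
 *
 *	int last_idx = d->last_idx;
 *	if (!d)
 *		return -1;
 */

/* Option 1: the pointer is known to be non-NULL, so the test is dropped. */
int last_index_trusted(const struct dev_state *d)
{
	return d->last_idx;
}

/* Option 2: the pointer may be NULL, so test first and dereference after. */
int last_index_checked(const struct dev_state *d)
{
	int last_idx;

	if (!d)
		return -1;
	last_idx = d->last_idx;
	return last_idx;
}
```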
A simplified version of the semantic match that detects this problem is as
follows (http://coccinelle.lip6.fr/):
Alexey Dobriyan [Tue, 15 Dec 2009 02:00:06 +0000 (18:00 -0800)]
alpha: convert srm code to seq_file
Convert code away from ->read_proc/->write_proc interfaces. Switch to
proc_create()/proc_create_data() which make addition of proc entries
reliable wrt NULL ->proc_fops, NULL ->data and so on.
john stultz [Tue, 15 Dec 2009 02:00:05 +0000 (18:00 -0800)]
procfs: allow threads to rename siblings via /proc/pid/tasks/tid/comm
Setting a thread's comm to be something unique is a very useful ability
and is helpful for debugging complicated threaded applications. However,
currently the only way to set a thread name is for the thread to name
itself via the PR_SET_NAME prctl.
There may be situations where it would be advantageous for a
thread dispatcher to name the threads it's managing, rather than
having the threads describe themselves. This sort of behavior is
available on other systems via the pthread_setname_np() interface.
This patch exports a task's comm via the /proc/pid/comm and
/proc/pid/task/tid/comm interfaces, and allows thread siblings to write to
these values.
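From userspace, the interface can be exercised with ordinary file I/O. The following is a minimal sketch assuming a Linux system with /proc mounted; the helper names (set_thread_comm(), get_thread_comm(), current_tid()) are hypothetical, and names longer than 15 characters are expected to be truncated by the kernel.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Build the path of the comm file for thread `tid` of the calling process. */
static void comm_path(char *buf, size_t len, pid_t tid)
{
	snprintf(buf, len, "/proc/self/task/%d/comm", (int)tid);
}

/* Write a new name for thread `tid`; returns 0 on success, -1 on failure. */
int set_thread_comm(pid_t tid, const char *name)
{
	char path[64];
	ssize_t n;
	int fd;

	comm_path(path, sizeof(path), tid);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, name, strlen(name));
	close(fd);
	return n < 0 ? -1 : 0;
}

/* Read the name of thread `tid` into `out`, without the trailing newline. */
int get_thread_comm(pid_t tid, char *out, size_t len)
{
	char path[64];
	ssize_t n;
	int fd;

	if (len == 0)
		return -1;
	comm_path(path, sizeof(path), tid);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, out, len - 1);
	close(fd);
	if (n < 0)
		return -1;
	out[n] = '\0';
	out[strcspn(out, "\n")] = '\0';
	return 0;
}

/* Thread id of the caller, as used in /proc/<pid>/task/<tid>. */
pid_t current_tid(void)
{
	return (pid_t)syscall(SYS_gettid);
}
```

A dispatcher would call set_thread_comm() with the tid of each worker it creates; here the tid is obtained with the gettid system call.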
[akpm@linux-foundation.org: cleanups] Signed-off-by: John Stultz <johnstul@us.ibm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Mike Fulton <fultonm@ca.ibm.com> Cc: Sean Foley <Sean_Foley@ca.ibm.com> Cc: Darren Hart <dvhltc@us.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jie Zhang [Tue, 15 Dec 2009 02:00:02 +0000 (18:00 -0800)]
nommu: fix malloc performance by adding uninitialized flag
The NOMMU code currently clears all anonymous mmapped memory. While this
is what we want in the default case, all memory allocation from userspace
under NOMMU has to go through this interface, including malloc() which is
allowed to return uninitialized memory. This can easily be a significant
performance penalty. So for constrained embedded systems where security is
irrelevant, allow people to avoid clearing memory unnecessarily.
This also alters the ELF-FDPIC binfmt such that it obtains uninitialised
memory for the brk and stack region.
Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Naoya Horiguchi [Tue, 15 Dec 2009 02:00:01 +0000 (18:00 -0800)]
mm hugetlb: add hugepage support to pagemap
This patch enables extraction of the pfn of a hugepage from
/proc/pid/pagemap in an architecture independent manner.
Details
-------
My test program (leak_pagemap) works as follows:
- creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages),
- read()/write() something on it,
- call page-types with option -p,
- munmap() and unlink() the file on hugetlbfs
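Extracting a pfn from /proc/pid/pagemap can be sketched in userspace. This is a minimal sketch assuming the entry layout from the kernel's pagemap documentation (bit 63 is the present bit, bits 0-54 hold the pfn when present); the helper names are hypothetical, and recent kernels report a zero pfn to unprivileged readers.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Layout of one 64-bit pagemap entry. */
#define PM_PRESENT	(UINT64_C(1) << 63)
#define PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54 */

/* Non-zero if the entry describes a page present in RAM. */
int pm_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

/* Page frame number of a present page, or 0 if the page is not present. */
uint64_t pm_pfn(uint64_t entry)
{
	return pm_present(entry) ? (entry & PM_PFN_MASK) : 0;
}

/* Read the pagemap entry covering `addr` in the calling process. */
int pagemap_read(const void *addr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;
	/* One 8-byte entry per base page of virtual address space. */
	offset = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
		 (off_t)sizeof(*entry);
	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;
	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);
	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

The file is indexed by base page size even when the address lies inside a hugepage mapping, so the same offset calculation applies to both.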
Naoya Horiguchi [Tue, 15 Dec 2009 01:59:59 +0000 (17:59 -0800)]
mm: hugetlb: fix hugepage memory leak in walk_page_range()
Most callers of pmd_none_or_clear_bad() check whether the target page is
in a hugepage or not, but walk_page_range() does not. So if we
read /proc/pid/pagemap for a hugepage on an x86 machine, the hugepage
memory is leaked as shown below. This patch fixes it.
Details
=======
My test program (leak_pagemap) works as follows:
- creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages),
- read()/write() something on it,
- call page-types with option -p (walk around the page tables),
- munmap() and unlink() the file on hugetlbfs
Naoya Horiguchi [Tue, 15 Dec 2009 01:59:58 +0000 (17:59 -0800)]
mm: hugetlb: fix hugepage memory leak in mincore()
Most callers of pmd_none_or_clear_bad() check whether the target page is
in a hugepage or not, but mincore() and walk_page_range() do not check it.
So if we use mincore() on a hugepage on an x86 machine, the hugepage memory
is leaked as shown below. This patch fixes it by extending the mincore()
system call to support hugepages.
Details
=======
My test program (leak_mincore) works as follows:
- creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages),
- read()/write() something on it,
- call mincore() for first ten pages and printf() the values of *vec
- munmap() and unlink() the file on hugetlbfs
Return values in *vec from mincore() are set to 0, while the hugepage
should be in memory, and 1 hugepage is still accounted as used while
there is no file on hugetlbfs.
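The mincore() step of such a test can be sketched as follows. This is not the leak_mincore program: it is a minimal sketch that uses ordinary anonymous pages instead of a hugetlbfs file so that it runs without a hugepage pool, and count_resident_after_touch() is a hypothetical helper.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `npages` anonymous pages, write to each one, and return how many
 * mincore() reports as resident, or -1 on error.
 */
long count_resident_after_touch(size_t npages)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *vec;
	long resident = 0;
	size_t len, i;
	char *map;

	if (page_size <= 0 || npages == 0)
		return -1;
	len = npages * (size_t)page_size;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	vec = calloc(npages, 1);
	if (!vec) {
		munmap(map, len);
		return -1;
	}
	memset(map, 0xa5, len);		/* fault every page in */
	if (mincore(map, len, vec) == 0) {
		for (i = 0; i < npages; i++)
			resident += vec[i] & 1;	/* bit 0: page is resident */
	} else {
		resident = -1;
	}
	free(vec);
	munmap(map, len);
	return resident;
}
```

Pages that were just written are normally all reported resident; the bug described above showed up as zeros in the vector for pages that should have been in memory.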
Mel Gorman [Tue, 15 Dec 2009 01:59:56 +0000 (17:59 -0800)]
hugetlb: abort a hugepage pool resize if a signal is pending
If a user asks for a hugepage pool resize but specifies a large number,
the machine can begin thrashing. In response, they might hit ctrl-c but
signals are ignored and the pool resize continues until it fails an
allocation. This can take a considerable amount of time so this patch
aborts a pool resize if a signal is pending.
Suggested by Dave Hansen.
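The shape of the loop can be modelled in userspace. This is a minimal sketch, not the hugetlb code: the `interrupted` flag plays the role of a pending signal, and resize_pool(), install_abort_handler() and grow_then_interrupt() are hypothetical names.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set from the signal handler; stands in for "a signal is pending". */
static volatile sig_atomic_t interrupted;

static void on_sigint(int sig)
{
	(void)sig;
	interrupted = 1;
}

/* Install the SIGINT handler; returns 0 on success. */
int install_abort_handler(void)
{
	return signal(SIGINT, on_sigint) == SIG_ERR ? -1 : 0;
}

/*
 * Grow a pool one unit at a time towards `target`, giving up early if a
 * signal arrived. `grow_one` returns 0 on success. Returns the size
 * actually reached.
 */
size_t resize_pool(size_t current, size_t target, int (*grow_one)(void))
{
	while (current < target) {
		if (interrupted)
			break;	/* user asked to stop: do not keep allocating */
		if (grow_one() != 0)
			break;	/* allocation failed */
		current++;
	}
	return current;
}

/* Demo callback: always succeeds, and raises SIGINT on its third call. */
static size_t grown;

int grow_then_interrupt(void)
{
	if (++grown == 3)
		raise(SIGINT);
	return 0;
}
```

Checking the flag once per iteration bounds the delay after ctrl-c to the cost of a single allocation attempt, instead of the time it takes for the whole resize to fail.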
Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Schermerhorn [Tue, 15 Dec 2009 01:59:54 +0000 (17:59 -0800)]
mm: remove unevictable_migrate_page function
unevictable_migrate_page() in mm/internal.h is a relic of the since-
removed UNEVICTABLE_LRU Kconfig option. This patch removes the function
and open codes the test in migrate_page_copy().
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mel Gorman [Tue, 15 Dec 2009 01:59:53 +0000 (17:59 -0800)]
hugetlb: acquire the i_mmap_lock before walking the prio_tree to unmap a page
When the owner of a mapping fails COW because a child process is holding a
reference, the children VMAs are walked and the page is unmapped. The
i_mmap_lock is taken for the unmapping of the page but not the walking of
the prio_tree. In theory, that tree could be changing if the lock is not
held. This patch takes the i_mmap_lock properly for the duration of the
prio_tree walk.
[hugh.dickins@tiscali.co.uk: Spotted the problem in the first place] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Magnus Damm [Tue, 15 Dec 2009 01:59:49 +0000 (17:59 -0800)]
mm: uncached vma support with writenotify
Modify the generic mmap() code to keep the cache attribute in
vma->vm_page_prot regardless of whether writenotify is enabled. Without
this patch the cache configuration selected by f_op->mmap() is overwritten
if writenotify is enabled, making it impossible to keep the vma uncached.
Needed by drivers such as drivers/video/sh_mobile_lcdcfb.c which uses
deferred io together with uncached memory.
Signed-off-by: Magnus Damm <damm@opensource.se> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Jaya Kumar <jayakumar.lkml@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rik van Riel [Tue, 15 Dec 2009 01:59:48 +0000 (17:59 -0800)]
vmscan: do not evict inactive pages when skipping an active list scan
In AIM7 runs, recent kernels start swapping out anonymous pages well
before they should. This is due to shrink_list falling through to
shrink_inactive_list if !inactive_anon_is_low(zone, sc), when all we
really wanted to do is pre-age some anonymous pages to give them extra
time to be referenced while on the inactive list.
The obvious fix is to make sure that shrink_list does not fall through to
scanning/reclaiming inactive pages when we called it to scan one of the
active lists.
This change should be safe because the loop in shrink_zone ensures that we
will still shrink the anon and file inactive lists whenever we should.
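The intended control flow can be modelled in a few lines. This is a simplified sketch under assumed names (enum lru_kind, struct scan_stats, shrink_list_model()), not the vmscan code; it treats every scanned inactive page as reclaimed to keep the model small.

```c
#include <assert.h>
#include <stdbool.h>

enum lru_kind { INACTIVE_ANON, ACTIVE_ANON, INACTIVE_FILE, ACTIVE_FILE };

/* Hypothetical counters standing in for the real scanning work. */
struct scan_stats {
	unsigned long active_scanned;
	unsigned long inactive_reclaimed;
};

static bool is_active(enum lru_kind lru)
{
	return lru == ACTIVE_ANON || lru == ACTIVE_FILE;
}

/*
 * A call for an active list either ages pages (when the matching inactive
 * list is low) or does nothing. It never falls through to reclaiming from
 * the inactive list.
 */
unsigned long shrink_list_model(enum lru_kind lru, unsigned long nr,
				bool inactive_is_low, struct scan_stats *st)
{
	if (is_active(lru)) {
		if (inactive_is_low)
			st->active_scanned += nr;
		return 0;	/* nothing reclaimed on an active-list call */
	}
	st->inactive_reclaimed += nr;
	return nr;
}
```

With the early return in place, inactive pages are only reclaimed when the caller explicitly asks for an inactive list, which is what the outer loop does whenever it should.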
[kosaki.motohiro@jp.fujitsu.com: inactive_file_is_low() should be inactive_anon_is_low()] Reported-by: Larry Woodman <lwoodman@redhat.com> Signed-off-by: Rik van Riel <riel@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tomasz Chmielewski <mangoo@wpkg.org> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Rientjes [Tue, 15 Dec 2009 01:59:46 +0000 (17:59 -0800)]
mm: slab-allocate memory section nodemask for large systems
Nodemasks should not be allocated on the stack for large systems (when the
nodemask is larger than 256 bytes) since there is a risk of stack overflow.
This patch causes the unregister_mem_sect_under_nodes() nodemask to be
allocated on the stack for smaller systems and be allocated by slab for
larger systems.
GFP_KERNEL is used since remove_memory_block() can block.
Cc: Gary Hade <garyhade@us.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Alex Chiang <achiang@hp.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rakib Mullick [Tue, 15 Dec 2009 01:59:44 +0000 (17:59 -0800)]
mm: fix section mismatch in memory_hotplug.c
__free_pages_bootmem() is a __meminit function, but it is called
from put_page_bootmem(), which causes a section mismatch warning.
The build reports the following warning:
LD mm/built-in.o
WARNING: mm/built-in.o(.text+0x26b22): Section mismatch in reference
from the function put_page_bootmem() to the function
.meminit.text:__free_pages_bootmem()
The function put_page_bootmem() references
the function __meminit __free_pages_bootmem().
This is often because put_page_bootmem lacks a __meminit
annotation or the annotation of __free_pages_bootmem is wrong.
Larry Woodman [Tue, 15 Dec 2009 01:59:37 +0000 (17:59 -0800)]
hugetlb: prevent deadlock in __unmap_hugepage_range() when alloc_huge_page() fails
hugetlb_fault() takes the mm->page_table_lock spinlock then calls
hugetlb_cow(). If the alloc_huge_page() in hugetlb_cow() fails due to an
insufficient huge page pool it calls unmap_ref_private() with the
mm->page_table_lock held. unmap_ref_private() then calls
unmap_hugepage_range() which tries to acquire the mm->page_table_lock.
This can be fixed by dropping the mm->page_table_lock around the call to
unmap_ref_private() if alloc_huge_page() fails; it's dropped right below in
the normal path anyway. However, earlier in that function, it's also
possible to call into the page allocator with the same spinlock held.
What this patch does is drop the spinlock before the page allocator is
potentially entered. The check for page allocation failure can be made
without the page_table_lock as well as the copy of the huge page. Even if
the PTE changed while the spinlock was held, the consequence is that a
huge page is copied unnecessarily. This resolves both the double taking
of the lock and sleeping with the spinlock held.
[mel@csn.ul.ie: Cover also the case where process can sleep with spinlock] Signed-off-by: Larry Woodman <lwooman@redhat.com> Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Adam Litke <agl@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Tue, 15 Dec 2009 01:59:34 +0000 (17:59 -0800)]
ksm: remove unswappable max_kernel_pages
Now that ksm pages are swappable, and the known holes plugged, remove
mention of unswappable kernel pages from KSM documentation and comments.
Remove the totalram_pages/4 initialization of max_kernel_pages. In fact,
remove max_kernel_pages altogether - we can reinstate it if removal turns
out to break someone's script; but if we later want to limit KSM's memory
usage, limiting the stable nodes would not be an effective approach.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Izik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wright <chrisw@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Tue, 15 Dec 2009 01:59:33 +0000 (17:59 -0800)]
ksm: memory hotremove migration only
The previous patch enables page migration of ksm pages, but that soon gets
into trouble: not surprising, since we're using the ksm page lock to lock
operations on its stable_node, but page migration switches the page whose
lock is to be used for that. Another layer of locking would fix it, but
do we need that yet?
Do we actually need page migration of ksm pages? Yes, memory hotremove
needs to offline sections of memory: and since we stopped allocating ksm
pages with GFP_HIGHUSER, they will tend to be GFP_HIGHUSER_MOVABLE
candidates for migration.
But KSM is currently unconscious of NUMA issues, happily merging pages
from different NUMA nodes: at present the rule must be, not to use
MADV_MERGEABLE where you care about NUMA. So no, NUMA page migration of
ksm pages does not make sense yet.
So, to complete support for ksm swapping we need to make hotremove safe.
ksm_memory_callback() takes ksm_thread_mutex when MEM_GOING_OFFLINE and
releases it when MEM_OFFLINE or MEM_CANCEL_OFFLINE. But if mapped pages
are freed before migration reaches them, stable_nodes may be left still
pointing to struct pages which have been removed from the system: the
stable_node needs to identify a page by pfn rather than page pointer, then
it can safely prune them when MEM_OFFLINE.
And make NUMA migration skip PageKsm pages where it skips PageReserved.
But it's only when we reach unmap_and_move() that the page lock is taken
and we can be sure that raised pagecount has prevented a PageAnon from
being upgraded: so add offlining arg to migrate_pages(), to migrate ksm
page when offlining (has sufficient locking) but reject it otherwise.
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Izik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wright <chrisw@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hugh Dickins [Tue, 15 Dec 2009 01:59:31 +0000 (17:59 -0800)]
ksm: rmap_walk to remove_migration_ptes
A side-effect of making ksm pages swappable is that they have to be placed
on the LRUs: which then exposes them to isolate_lru_page() and hence to
page migration.
Add rmap_walk() for remove_migration_ptes() to use: rmap_walk_anon() and
rmap_walk_file() in rmap.c, but rmap_walk_ksm() in ksm.c. Perhaps some
consolidation with existing code is possible, but don't attempt that yet
(try_to_unmap needs to handle nonlinears, but migration pte removal does
not).
rmap_walk() is sadly less general than it appears: rmap_walk_anon(), like
remove_anon_migration_ptes() which it replaces, avoids calling
page_lock_anon_vma(), because that includes a page_mapped() test which
fails when all migration ptes are in place. That was valid when NUMA page
migration was introduced (holding mmap_sem provided the missing guarantee
that anon_vma's slab had not already been destroyed), but I believe not
valid in the memory hotremove case added since.
For now do the same as before, and consider the best way to fix that
unlikely race later on. When fixed, we can probably use rmap_walk() on
hwpoisoned ksm pages too: for now, they remain among hwpoison's various
exceptions (its PageKsm test comes before the page is locked, but its
page_lock_anon_vma fails safely if an anon gets upgraded).
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Izik Eidus <ieidus@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wright <chrisw@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>