* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: check total number of devices when removing missing
Btrfs: check return value of open_bdev_exclusive properly
Btrfs: do not mark the chunk as readonly if in degraded mode
Btrfs: run orphan cleanup on default fs root
Btrfs: fix a memory leak in btrfs_init_acl
Btrfs: Use correct values when updating inode i_size on fallocate
Btrfs: remove tree_search() in extent_map.c
Btrfs: Add mount -o compress-force
H. Peter Anvin [Fri, 29 Jan 2010 06:14:43 +0000 (22:14 -0800)]
x86: get rid of the insane TIF_ABI_PENDING bit
Now that the previous commit made it possible to do the personality
setting at the point of no return, we do just that for ELF binaries.
And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
away, and we can just make SET_PERSONALITY() just do the obvious thing
for a 32-bit compat process.
Everything becomes much more straightforward this way.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 29 Jan 2010 06:14:42 +0000 (22:14 -0800)]
Split 'flush_old_exec' into two functions
'flush_old_exec()' is the point of no return when doing an execve(), and
it is pretty badly misnamed. It doesn't just flush the old executable
environment, it also starts up the new one.
Which is very inconvenient for things like setting up the new
personality, because we want the new personality to affect the starting
of the new environment, but at the same time we do _not_ want the new
personality to take effect if flushing the old one fails.
As a result, the x86-64 '32-bit' personality is actually done using this
insane "I'm going to change the ABI, but I haven't done it yet" bit
(TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
personality, but just the "pending" bit, so that "flush_thread()" can do
the actual personality magic.
This patch in no way changes any of that insanity, but it does split the
'flush_old_exec()' function up into a preparatory part that can fail
(still called flush_old_exec()), and a new part that will actually set
up the new exec environment (setup_new_exec()). All callers are changed
to trivially comply with the new world order.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 29 Jan 2010 02:48:53 +0000 (18:48 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
Fix failure exit in ipathfs
fix oops in fs/9p late mount failure
fix leak in romfs_fill_super()
get rid of pointless checks after simple_pin_fs()
Fix failure exits in bfs_fill_super()
fix affs parse_options()
Fix remount races with symlink handling in affs
Fix a leak in affs_fill_super()
Linus Torvalds [Fri, 29 Jan 2010 00:33:12 +0000 (16:33 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
x86/PCI: remove IOH range fetching
PCI: fix nested spinlock hang in aer_inject
Linus Torvalds [Thu, 28 Jan 2010 22:34:11 +0000 (14:34 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Update mach-types
[ARM] orion5x: D-link DNS-323 rev. B1 power-off
[ARM] Orion5x: add GPIO LED and buttons for wrt350n v2
[ARM] pxa: fix irq suspend/resume for pxa25x
[ARM] pxa: fix the incorrect naming of AC97 reset pin config for pxa26x
[ARM] pxa/corgi: fix incorrect default GPIO for UDC Vbus
[ARM] Kirkwood: drive USB VBUS pin on rd88f6192-nas high on boot
[ARM] Orion: fix PCIe inbound window programming when RAM size is not a power of two
Josef Bacik [Wed, 27 Jan 2010 02:09:38 +0000 (02:09 +0000)]
Btrfs: check total number of devices when removing missing
If you have a disk failure in RAID1 and then add a new disk to the
array, and then try to remove the missing volume, it will fail. The
reason is the sanity check only looks at the total number of rw devices,
which is just 2 because we have 2 good disks and 1 bad one. Instead
check the total number of devices in the array to make sure we can
actually remove the device. Tested this with a failed disk setup and
with this test we can now run
btrfs-vol -r missing /mount/point
and it works fine.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Wed, 27 Jan 2010 02:09:00 +0000 (02:09 +0000)]
Btrfs: check return value of open_bdev_exclusive properly
Hit this problem while testing RAID1 failure stuff. open_bdev_exclusive
returns ERR_PTR(), not NULL. So change the return value properly. This
is important if you accidently specify a device that doesn't exist when
trying to add a new device to an array, you will panic the box
dereferencing bdev.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Josef Bacik [Wed, 27 Jan 2010 02:07:59 +0000 (02:07 +0000)]
Btrfs: do not mark the chunk as readonly if in degraded mode
If a RAID setup has chunks that span multiple disks, and one of those
disks has failed, btrfs_chunk_readonly will return 1 since one of the
disks in that chunk's stripes is dead and therefore not writeable. So
instead if we are in degraded mode, return 0 so we can go ahead and
allocate stuff. Without this patch all of the block groups in a RAID1
setup will end up read-only, which will mean we can't add new disks to
the array since we won't be able to make allocations.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Since it introduces this problem where we can run orphan cleanup on a
volume that can have orphan entries re-added. Instead of my original
fix, Yan Zheng pointed out that we can just revert my original fix and
then run the orphan cleanup in open_ctree after we look up the fs_root.
I have tested this with all the tests that gave me problems and this
patch fixes both problems. Thanks,
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Chris Mason <chris.mason@oracle.com>
Chris Mason [Thu, 28 Jan 2010 21:18:15 +0000 (16:18 -0500)]
Btrfs: Add mount -o compress-force
The default btrfs mount -o compress mode will quickly back off
compressing a file if it notices that compression does not reduce the
size of the data being written. This can save considerable CPU because
all future writes to the file go through uncompressed.
But some files are both very large and have mixed data stored in
them. In that case, we want to add the ability to always try
compressing data before writing it.
This commit adds mount -o compress-force. A later commit will add
a new inode flag that does the same thing.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
Linus Torvalds [Thu, 28 Jan 2010 20:59:43 +0000 (12:59 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: PowerTV: Fix support for timer interrupts with > 64 external IRQs
MIPS: PowerTV: Streamline access to platform device registers
MIPS: Fix vmlinuz build for 32bit-only math shells
MIPS: Add support of LZO-compressed kernels
David VomLehn [Tue, 22 Dec 2009 01:49:22 +0000 (17:49 -0800)]
MIPS: PowerTV: Fix support for timer interrupts with > 64 external IRQs
The MIPS processor is limited to 64 external interrupt sources. Using a
greater number without IRQ sharing requires reading platform-specific
registers. On such platforms, reading the IntCtl register to determine
which interrupt corresponds to a timer interrupt will not work.
On MIPSR2 systems there is a solution - the TI bit in the Cause register,
specifically indicates that a timer interrupt has occured. This patch uses
that bit to detect interrupts for MIPSR2 processors, which may be expected
to work regardless of how the timer interrupt may be routed in the hardware.
Signed-off-by: David VomLehn (dvomlehn@cisco.com)
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/804/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
David VomLehn [Thu, 24 Dec 2009 01:34:46 +0000 (17:34 -0800)]
MIPS: PowerTV: Streamline access to platform device registers
Pre-compute addresses for the basic ASIC registers. This speeds up access
and allows memory for unused configurations to be freed. In addition,
uninitialized register addresses will be returned as NULL to catch bad
usage quickly.
Signed-off-by: David VomLehn <dvomlehn@cisco.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/806/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: Fix vmlinuz build for 32bit-only math shells
POSIX requires $((<expression>)) arithmetic in sh only to have long
arithmetic so on 32-bit sh binaries might do only 32-bit arithmetic but
the arithmetic done in arch/mips/boot/compressed/Makefile needs 64-bit.
I play with the AR7 platform, so VMLINUX_LOAD_ADDRESS is
0xffffffff94100000, and for an example 4MiB kernel
VMLINUZ_LOAD_ADDRESS is made out to be:
----
alex@berk:~$ bash -c 'printf "%x\n" $((0xffffffff94100000 + 0x400000))' ffffffff94500000
alex@berk:~$ dash -c 'printf "%x\n" $((0xffffffff94100000 + 0x400000))' 80000000003fffff
----
The former is obviously correct whilst the later breaks things royally.
Fortunately working with only the lower 32bit's works for both bash and
dash:
----
$ bash -c 'printf "%x\n" $((0x94100000 + 0x400000))' 94500000
$ dash -c 'printf "%x\n" $((0x94100000 + 0x400000))' 94500000
----
So, we can split the original 64bit string to two parts, and only
calculate the low 32bit part, which is big enough (1GiB kernel sizes
anyone?) for a normal Linux kernel image file, now, we calculate the
VMLINUZ_LOAD_ADDRESS like this:
1. if present, append top 32bit of VMLINUX_LOAD_ADDRESS" as a prefix
2. get the sum of the low 32bit of VMLINUX_LOAD_ADDRESS + VMLINUX_SIZE
This patch fixes vmlinuz kernel builds on systems where only a
32bit-only math shell is available.
Patch Changelog:
Version 2
- simplified method by using 'expr' for 'substr' and making it work
with dash once again
Version 1
- Revert the removals of '-n "$(VMLINUX_SIZE)"' to avoid the error
of "make clean"
- Consider more cases of the VMLINUX_LOAD_ADDRESS
Version 0
- initial release
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] aic79xx: check for non-NULL scb in ahd_handle_nonpkt_busfree
[SCSI] zfcp: Set hardware timeout as requested by BSG request.
[SCSI] zfcp: Introduce bsg_timeout callback.
[SCSI] scsi_transport_fc: Allow LLD to reset FC BSG timeout
[SCSI] zfcp: add missing compat ptr conversion
[SCSI] zfcp: Fix linebreak in hba trace
[SCSI] zfcp: Issue zfcp_fc_wka_port_put after FC CT BSG request
[SCSI] qla2xxx: Update version number to 8.03.01-k10.
[SCSI] fc-transport: Use packed modifier for fc_bsg_request structure.
[SCSI] qla2xxx: Perform fast mailbox read of flash regardless of size nor address alignment.
[SCSI] qla2xxx: Correct FCP2 recovery handling.
[SCSI] scsi_lib: Fix bug in completion of bidi commands
[SCSI] mptsas: Fix issue with chain pools allocation on katmai
[SCSI] aacraid: fix File System going into read-only mode
[SCSI] lpfc: fix file permissions
Linus Torvalds [Wed, 27 Jan 2010 17:27:44 +0000 (09:27 -0800)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] fix single stepped svcs with TRACE_IRQFLAGS=y
[S390] zcrypt: Do not remove coprocessor for error 8/72
[S390] sclp_vt220: set initial terminal window size
[S390] use set_current_state in sigsuspend
[S390] irqflags: add missing types.h include
[S390] dasd: fix possible NULL pointer errors
Chris Wilson [Wed, 27 Jan 2010 13:36:32 +0000 (13:36 +0000)]
drm/i915: Selectively enable self-reclaim
Having missed the ENOMEM return via i915_gem_fault(), there are probably
other paths that I also missed. By not enabling NORETRY by default these
paths can run the shrinker and take memory from the system (but not from
our own inactive lists because our shrinker can not run whilst we hold
the struct mutex) and this may allow the system to survive a little longer
whilst our drivers consume all available memory.
References:
OOM killer unexpectedly called with kernel 2.6.32
http://bugzilla.kernel.org/show_bug.cgi?id=14933
v2: Pass gfp into page mapping.
v3: Use new read_cache_page_gfp() instead of open-coding.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefan Richter [Tue, 26 Jan 2010 20:39:07 +0000 (21:39 +0100)]
firewire: ohci: fix crashes with TSB43AB23 on 64bit systems
Unsurprisingly, Texas Instruments TSB43AB23 exhibits the same behaviour
as TSB43AB22/A in dual buffer IR DMA mode: If descriptors are located
at physical addresses above the 31 bit address range (2 GB), the
controller will overwrite random memory. With luck, this merely
prevents video reception. With only a little less luck, the machine
crashes.
We use the same workaround here as with TSB43AB22/A: Switch off the
dual buffer capability flag and use packet-per-buffer IR DMA instead.
Another possible workaround would be to limit the coherent DMA mask to
31 bits.
In Linux 2.6.33, this change serves effectively only as documentation
since dual buffer mode is not used for any controller anymore. But
somebody might want to re-enable it in the future to make use of
features of dual buffer DMA that are not available in packet-per-buffer
mode.
In Linux 2.6.32 and older, this update is vital for anyone with this
controller, more than 2 GB RAM, a 64 bit kernel, and FireWire video or
audio applications.
We have at least four reports:
http://bugzilla.kernel.org/show_bug.cgi?id=13808
http://marc.info/?l=linux1394-user&m=126154279004083
https://bugzilla.redhat.com/show_bug.cgi?id=552142
http://marc.info/?l=linux1394-user&m=126432246128386
Reported-by: Paul Johnson Reported-by: Ronneil Camara Reported-by: G Zornetzer Reported-by: Mark Thompson Cc: stable@kernel.org Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Linus Torvalds [Wed, 27 Jan 2010 17:20:03 +0000 (09:20 -0800)]
mm: add new 'read_cache_page_gfp()' helper function
It's a simplified 'read_cache_page()' which takes a page allocation
flag, so that different paths can control how aggressive the memory
allocations are that populate a address space.
In particular, the intel GPU object mapping code wants to be able to do
a certain amount of own internal memory management by automatically
shrinking the address space when memory starts getting tight. This
allows it to dynamically use different memory allocation policies on a
per-allocation basis, rather than depend on the (static) address space
gfp policy.
The actual new function is a one-liner, but re-organizing the helper
functions to the point where you can do this with a single line of code
is what most of the patch is all about.
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 27 Jan 2010 10:49:10 +0000 (02:49 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, msr/cpuid: Pass the number of minors when unregistering MSR and CPUID drivers.
x86: Remove "x86 CPU features in debugfs" (CONFIG_X86_CPU_DEBUG)
Revert "x86: ucode-amd: Load ucode-patches once ..."
x86: Disable HPET MSI on ATI SB700/SB800
x86: Set hotpluggable nodes in nodes_possible_map
[S390] fix single stepped svcs with TRACE_IRQFLAGS=y
If irq flags tracing is enabled the TRACE_IRQS_ON macros expands to
a function call which clobbers registers %r0-%r5. The macro is used
in the code path for single stepped system calls. The argument
registers %r2-%r6 need to be restored from the stack before the system
call function is called.
Cc: stable@kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Felix Beck [Wed, 27 Jan 2010 09:12:39 +0000 (10:12 +0100)]
[S390] zcrypt: Do not remove coprocessor for error 8/72
In a case where the number of the input data is bigger than the
modulus of the key, the coprocessor adapters will report an 8/72
error. This case is not caught yet, thus the adapter will be taken
offline. To prevent this, we return an -EINVAL instead.
Signed-off-by: Felix Beck <felix.beck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[S390] sclp_vt220: set initial terminal window size
When opening a SCLP VT220 terminal, the terminal window size is not
initialized (defaults to zero).
Since the SCLP VT220 terminal supports only 80x24, explicitly set
the window size to prevent (n)curses applications from guessing
the default setting.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Wed, 27 Jan 2010 09:12:36 +0000 (10:12 +0100)]
[S390] irqflags: add missing types.h include
Add missing types.h include. Otherwise would cause build breakages on
hw breakpoint support, because of undefined BITS_PER_LONG.
Also fix up the copyright line and remove the superfluous __KERNEL__
ifdef.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Russ Anderson [Wed, 27 Jan 2010 02:37:22 +0000 (20:37 -0600)]
x86, msr/cpuid: Pass the number of minors when unregistering MSR and CPUID drivers.
Pass the number of minors when unregistering MSR and CPUID drivers.
Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Dean Nelson <dnelson@redhat.com>
LKML-Reference: <20100127023722.GA22305@sgi.com> Signed-off-by: Russ Anderson <rja@sgi.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Al Viro [Sun, 24 Jan 2010 05:06:22 +0000 (00:06 -0500)]
fix affs parse_options()
Error handling in that sucker got broken back in 2003. If function
returns 0 on failure, it's not nice to add return -EINVAL into it.
Adding return 1 on other failure exits is also not a good thing (and
yes, original success exits with 1 and some of failure exits with 0
are still there; so's the original logics in callers).
Al Viro [Sun, 24 Jan 2010 05:04:07 +0000 (00:04 -0500)]
Fix remount races with symlink handling in affs
A couple of fields in affs_sb_info is used in follow_link() and
symlink() for handling AFFS "absolute" symlinks. Need locking
against affs_remount() updates.
fnctl: f_modown should call write_lock_irqsave/restore
Commit 703625118069f9f8960d356676662d3db5a9d116 exposed that f_modown()
should call write_lock_irqsave instead of just write_lock_irq so that
because a caller could have a spinlock held and it would not be good to
renable interrupts.
Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tavis Ormandy <taviso@google.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefan Richter [Sun, 24 Jan 2010 15:45:03 +0000 (16:45 +0100)]
firewire: core: fix use-after-free regression in FCP handler
Commit db5d247a "firewire: fix use of multiple AV/C devices, allow
multiple FCP listeners" introduced a regression into 2.6.33-rc3:
The core freed payloads of incoming requests to FCP_Request or
FCP_Response before a userspace driver accessed them.
We need to copy such payloads for each registered userspace client
and free the copies according to the lifetime rules of non-FCP client
request resources.
(This could possibly be optimized by reference counts instead of
copies.)
The presently only kernelspace driver which listens for FCP requests,
firedtv, was not affected because it already copies FCP frames into an
own buffer before returning to firewire-core's FCP handler dispatcher.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Stefan Richter [Sun, 24 Jan 2010 13:47:02 +0000 (14:47 +0100)]
firewire: core: add_descriptor size check
Presently, firewire-core only checks whether descriptors that are to be
added by userspace drivers to the local node's config ROM do not exceed
a size of 256 quadlets. However, the sum of the bare minimum ROM plus
all descriptors (from firewire-core, from firewire-net, from userspace)
must not exceed 256 quadlets.
Otherwise, the bounds of a statically allocated buffer will be
overwritten. If the kernel survives that, firewire-core will
subsequently be unable to parse the local node's config ROM.
(Note, userspace drivers can add descriptors only through device files
of local nodes. These are usually only accessible by root, unlike
device files of remote nodes which may be accessible to lesser
privileged users.)
Therefore add a test which takes the actual present and required ROM
size into account for all descriptors of kernelspace and userspace
drivers.
Cc: stable@kernel.org Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Linus Torvalds [Tue, 26 Jan 2010 03:05:06 +0000 (19:05 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag
ext4: Fix quota accounting error with fallocate
ext4: Handle -EDQUOT error on write
Linus Torvalds [Tue, 26 Jan 2010 03:03:45 +0000 (19:03 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: fix a memory-leak in wm8903
ALSA: hda - add possibility to choose speakers configuration for 4930g
ALSA: hda - Fix HP T5735 automute
ALSA: hda - Turn on EAPD only if available for Realtek codecs
ALSA: hda - Fix parsing pin node 0x21 on ALC259
Linus Torvalds [Tue, 26 Jan 2010 03:02:31 +0000 (19:02 -0800)]
Merge branch 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.33' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()
KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
KVM: S390: fix potential array overrun in intercept handling
KVM: fix spurious interrupt with irqfd
eventfd - allow atomic read and waitqueue remove
KVM: MMU: bail out pagewalk on kvm_read_guest error
KVM: properly check max PIC pin in irq route setup
KVM: only allow one gsi per fd
KVM: x86: Fix host_mapping_level()
KVM: powerpc: Show timing option only on embedded
KVM: Fix race between APIC TMR and IRR
Linus Torvalds [Tue, 26 Jan 2010 03:02:06 +0000 (19:02 -0800)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: fix memory leak in update path
UBI: add more checks to chdev open
UBI: initialise update marker
Linus Torvalds [Tue, 26 Jan 2010 03:00:56 +0000 (19:00 -0800)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
hwmon: (fschmd) Fix a memleak on multiple opens of /dev/watchdog
hwmon: (asus_atk0110) Do not fail if MBIF is missing
hwmon: (amc6821) Double unlock bug
hwmon: (smsc47m1) Fix section mismatch
Linus Torvalds [Tue, 26 Jan 2010 02:59:47 +0000 (18:59 -0800)]
Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (95 commits)
drm/radeon/kms: preface warning printk with driver name
drm/radeon/kms: drop unnecessary printks.
drm: fix regression in fb blank handling
drm/radeon/kms: make hibernate work on IGPs
drm/vmwgfx: Optimize memory footprint for DMA buffers.
drm/ttm: Allow system memory as a busy placement.
drm/ttm: Fix race condition in ttm_bo_delayed_delete (v3, final)
drm/nv50: prevent switching off SOR when in use for DVI-over-DP
drm/nv50: fail auxch transaction if reply count not what we expect
drm/nouveau: fix failure path if userspace specifies no valid memtypes
drm/nouveau: report LVDS as disconnected if lid closed
drm/radeon/kms: fix legacy get_engine/memory clock
drm/radeon/kms/atom: atom parser fixes
drm/radeon/kms: clean up atombios pll code
drm/radeon/kms: clean up pll struct
drm/radeon/kms/atom: fix crtc lock ordering
drm/radeon: r6xx/r7xx possible security issue, system ram access
drm/radeon/kms: r600/r700 don't test ib if ib initialization fails
drm/radeon/kms: Forbid creation of framebuffer with no valid GEM object
drm/radeon/kms: r600 handle irq vector ring overflow
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
virtio_net: Make delayed refill more reliable
sfc: Use fixed-size buffers for MCDI NVRAM requests
sfc: Add workspace for GMAC bug workaround to MCDI MAC_STATS buffer
tcp_probe: avoid modulus operation and wrap fix
qlge: Only free resources if they were allocated
netns xfrm: deal with dst entries in netns
sky2: revert config space change
vlan: fix vlan_skb_recv()
netns xfrm: fix "ip xfrm state|policy count" misreport
sky2: Enable/disable WOL per hardware device
net: Fix IPv6 GSO type checks in Intel ethernet drivers
igb/igbvf: cleanup exception handling in tx_map_adv
MAINTAINERS: Add Intel igbvf maintainer
e1000/e1000e: don't use small hardware rx buffers
fmvj18x_cs: add new id (Panasonic lan & modem card)
be2net: swap only first 2 fields of mcc_wrb
Please add support for Microsoft MN-120 PCMCIA network card
be2net: fix bug in rx page posting
wimax/i2400m: Add support for more i6x50 SKUs
e1000e: enhance frame fragment detection
...
Linus Torvalds [Tue, 26 Jan 2010 02:56:12 +0000 (18:56 -0800)]
Merge branch 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (25 commits)
OMAP2/3: DMTIMER: Clear pending interrupts when stopping a timer
PM debug: Fix warning when no CONFIG_DEBUG_FS
OMAP3: PM: DSS PM_WKEN to refill DMA
OMAP: timekeeping: time should not stop during suspend
OMAP3: PM: Force write last pad config register into save area
OMAP: omap3_pm_get_suspend_state() error ignored in pwrdm_suspend_get()
OMAP3: PM: Enable wake-up from McBSP2, 3 and 4 modules
OMAP3: PM debug: fix build error when !CONFIG_DEBUG_FS
OMAP3: PM: Removing redundant and potentially dangerous PRCM configration
OMAP3: Fixed ARM aux ctrl register save/restore
OMAP3: CPUidle: Fixed timer resolution
OMAP3: PM: Remove duplicate code blocks
OMAP3: PM: Disable interrupt controller AUTOIDLE before WFI
OMAP3: PM: Enable system control module autoidle
OMAP3: PM: Ack pending interrupts before entering suspend
omap: Enable GPMC clock in gpmc_init
OMAP1 clock: fix for "BUG: spinlock lockup on CPU#0"
OMAP4: clocks: Fix the clksel_rate struct DPLL divs
OMAP4: PRCM: Fix the base address for CHIRONSS reg defines
OMAP: dma_chan[lch_head].flag & OMAP_DMA_ACTIVE tested twice in omap_dma_unlink_lch()
...
Herbert Xu [Mon, 25 Jan 2010 23:51:01 +0000 (15:51 -0800)]
virtio_net: Make delayed refill more reliable
I have seen RX stalls on a machine that experienced a suspected
OOM. After the stall, the RX buffer is empty on the guest side
and there are exactly 16 entries available on the host side. As
the number of entries is less than that required by a maximal
skb, the host cannot proceed.
The guest did not have a refill job scheduled.
My diagnosis is that an OOM had occured, with the delayed refill
job scheduled. The job was able to allocate at least one skb, but
not enough to overcome the minimum required by the host to proceed.
As the refill job would only reschedule itself if it failed completely
to allocate any skbs, this would lead to an RX stall.
The following patch removes this stall possibility by always
rescheduling the refill job until the ring is totally refilled.
Testing has shown that the RX stall no longer occurs whereas
previously it would occur within a day.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Mon, 25 Jan 2010 23:49:59 +0000 (15:49 -0800)]
sfc: Use fixed-size buffers for MCDI NVRAM requests
The low-level MCDI code always uses 32-bit MMIO operations, and
callers must pad input and output buffers to multiples of 4 bytes.
The MCDI NVRAM functions are not doing this. Also, their buffers are
declared as variable-length arrays with no explicit maximum length.
Switch to a fixed buffer size based on the chunk size used by the
MTD driver (which is a multiple of 4).
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guido Barzini [Mon, 25 Jan 2010 23:49:19 +0000 (15:49 -0800)]
sfc: Add workspace for GMAC bug workaround to MCDI MAC_STATS buffer
Due to a hardware bug in the SFC9000 family, the firmware must
transfer raw GMAC statistics to host memory before aggregating them
into the cooked (speed-independent) MAC statistics. Extend the stats
buffer to support this.
The length of the buffer is explicit in the MAC_STATS command, so this
change is backward-compatible on both sides.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
By rounding up the buffer size to power of 2, several expensive
modulus operations can be avoided. This patch also solves a bug where
the gap need when ring gets full was not being accounted for.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Patterson [Fri, 22 Jan 2010 21:06:53 +0000 (14:06 -0700)]
PCI: fix nested spinlock hang in aer_inject
The aer_inject module hangs in aer_inject() when checking the device's
error masks. The hang is due to a recursive use of the aer_inject lock.
The aer_inject() routine grabs the lock while processing the error and then
calls pci_read_config_dword to read the masks. The pci_read_config_dword
routine is earlier overridden by pci_read_aer, which among other things,
grabs the aer_inject lock.
Fixed by moving the pci_read_config_dword calls to read the masks to before
the lock is taken.
Wei Yongjun [Fri, 22 Jan 2010 06:21:29 +0000 (14:21 +0800)]
KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()
In function kvm_arch_vcpu_init(), if the memory malloc for
vcpu->arch.mce_banks is fail, it does not free the memory
of lapic date. This patch fixed it.
Wei Yongjun [Fri, 22 Jan 2010 06:18:47 +0000 (14:18 +0800)]
KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
vcpu->arch.mce_banks is malloc in kvm_arch_vcpu_init(), but
never free in any place, this may cause memory leak. So this
patch fixed to free it in kvm_arch_vcpu_uninit().
KVM: S390: fix potential array overrun in intercept handling
kvm_handle_sie_intercept uses a jump table to get the intercept handler
for a SIE intercept. Static code analysis revealed a potential problem:
the intercept_funcs jump table was defined to contain (0x48 >> 2) entries,
but we only checked for code > 0x48 which would cause an off-by-one
array overflow if code == 0x48.
Use the compiler and ARRAY_SIZE to automatically set the limits.
Cc: stable@kernel.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
kvm didn't clear irqfd counter on deassign, as a result we could get a
spurious interrupt when irqfd is assigned back. this leads to poor
performance and, in theory, guest crash.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Davide Libenzi [Wed, 13 Jan 2010 17:34:36 +0000 (09:34 -0800)]
eventfd - allow atomic read and waitqueue remove
KVM needs a wait to atomically remove themselves from the eventfd ->poll()
wait queue head, in order to handle correctly their IRQfd deassign
operation.
This patch introduces such API, plus a way to read an eventfd from its
context.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Avi Kivity <avi@redhat.com>
Looks like repeatedly binding same fd to multiple gsi's with irqfd can
use up a ton of kernel memory for irqfd structures.
A simple fix is to allow each fd to only trigger one gsi: triggering a
storm of interrupts in guest is likely useless anyway, and we can do it
by binding a single gsi to many interrupts if we really want to.
Cc: stable@kernel.org Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Acked-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 29 Dec 2009 10:42:16 +0000 (12:42 +0200)]
KVM: Fix race between APIC TMR and IRR
When we queue an interrupt to the local apic, we set the IRR before the TMR.
The vcpu can pick up the IRR and inject the interrupt before setting the TMR,
and perhaps even EOI it, causing incorrect behaviour.
The race is really insignificant since it can only occur on the first
interrupt (usually following interrupts will not change TMR), but it's better
closed than open.
Fixed by reordering setting the TMR vs IRR.
Cc: stable@kernel.org Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Luca Tettamanti [Mon, 25 Jan 2010 14:00:49 +0000 (15:00 +0100)]
hwmon: (asus_atk0110) Do not fail if MBIF is missing
MBIF (motherboard identification) is only used to print the name of
the board, it's not essential for the driver; do not fail if it's
missing. Based on Juan's patch.
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com> Acked-by: Juan RP <xtraeme@gmail.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Alexey Dobriyan [Mon, 25 Jan 2010 06:47:53 +0000 (22:47 -0800)]
netns xfrm: deal with dst entries in netns
GC is non-existent in netns, so after you hit GC threshold, no new
dst entries will be created until someone triggers cleanup in init_net.
Make xfrm4_dst_ops and xfrm6_dst_ops per-netns.
This is not done in a generic way, because it woule waste
(AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns.
Reorder GC threshold initialization so it'd be done before registering
XFRM policies.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Obviously, this register had some other impact that is causing
the regression. Either it is masking some other access or needs
to be reset in some path.
Either, way it is best to just revert the change for 2.6.33
uses DRM_MODE_DPMS_ON for FB_BLANK_NORMAL, but DRM_MODE_DPMS_ON
is actually for turning output on instead of blank.
This makes fb blank broken on my T61, it put LVDS on but leave
pipe disabled which made screen totally white or caused some
'burning' effect.
[airlied: James objects to this but at this point in 2.6.33,
I can't see a patch that will fix this properly like he wants coming
in time and otherwise this is a regression - proper fix for 2.6.34
hopefully.]
Cc: James Simmons <jsimmons@infradead.org> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 25 Jan 2010 03:08:08 +0000 (13:08 +1000)]
drm/radeon/kms: make hibernate work on IGPs
This is the least invasive fix without migrating the radeon driver
to pm_ops from what I can see. We just always migrate VRAM objects
on IGPs for now and we can fix it up later to migrate depending
on STR vs STD.
Dave Airlie [Mon, 25 Jan 2010 06:04:11 +0000 (16:04 +1000)]
Merge remote branch 'nouveau/for-airlied' of ../drm-nouveau-next into drm-linus
* 'nouveau/for-airlied' of ../drm-nouveau-next:
drm/nv50: prevent switching off SOR when in use for DVI-over-DP
drm/nv50: fail auxch transaction if reply count not what we expect
drm/nouveau: fix failure path if userspace specifies no valid memtypes
drm/nouveau: report LVDS as disconnected if lid closed
drm/nv50: prevent accidently turning off encoders we're actually using
drm/nv50: fix alignment of per-channel fifo cache
drm/nouveau: Evict buffers in VRAM before freeing sgdma
drm/nouveau: Acknowledge DMA_VTX_PROTECTION PGRAPH interrupts
drm/nouveau: fix thinko in nv04_instmem.c
drm/nouveau: fix a race condition in nouveau_dma_wait()
Eric Dumazet [Mon, 25 Jan 2010 03:52:24 +0000 (19:52 -0800)]
vlan: fix vlan_skb_recv()
Bruno Prémont found commit 9793241fe92f7d930
(vlan: Precise RX stats accounting) added a regression for non
hw accelerated vlans.
[ 26.390576] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 26.396369] IP: [<df856b89>] vlan_skb_recv+0x89/0x280 [8021q]
vlan_dev_info() was used with original device, instead of
skb->dev. Also spotted by Américo Wang.
Reported-By: Bruno Prémont <bonbons@linux-vserver.org> Tested-By: Bruno Prémont <bonbons@linux-vserver.org> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>