Linus Torvalds [Fri, 18 Mar 2011 17:51:11 +0000 (10:51 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs: call security_d_instantiate in d_obtain_alias V2
lose 'mounting_here' argument in ->d_manage()
don't pass 'mounting_here' flag to follow_down()
change the locking order for namespace_sem
fix deadlock in pivot_root()
vfs: split off vfsmount-related parts of vfs_kern_mount()
Some fixes for pstore
kill simple_set_mnt()
Linus Torvalds [Fri, 18 Mar 2011 17:50:52 +0000 (10:50 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bcopeland/omfs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bcopeland/omfs:
omfs: make readdir stop when filldir says so
omfs: merge unlink() and rmdir(), close leak in rename()
omfs: stop playing silly buggers with omfs_unlink() in ->rename()
omfs: rename() needs to mark old_inode dirty after ctime update
Linus Torvalds [Fri, 18 Mar 2011 17:50:27 +0000 (10:50 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubifs-2.6
* 'linux-next' of git://git.infradead.org/ubifs-2.6: (25 commits)
UBIFS: clean-up commentaries
UBIFS: save 128KiB or more RAM
UBIFS: allocate orphans scan buffer on demand
UBIFS: allocate lpt dump buffer on demand
UBIFS: allocate ltab checking buffer on demand
UBIFS: allocate scanning buffer on demand
UBIFS: allocate dump buffer on demand
UBIFS: do not check data crc by default
UBIFS: simplify UBIFS Kconfig menu
UBIFS: print max. index node size
UBIFS: handle allocation failures in UBIFS write path
UBIFS: use max_write_size during recovery
UBIFS: use max_write_size for write-buffers
UBIFS: introduce write-buffer size field
UBI: incorporate LEB offset information
UBIFS: incorporate maximum write size
UBI: provide LEB offset information
UBI: incorporate maximum write size
UBIFS: fix LEB number in printk
UBIFS: restrict world-writable debugfs files
...
Linus Torvalds [Fri, 18 Mar 2011 17:50:02 +0000 (10:50 -0700)]
Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
UBI: make tests modes dynamic
UBI: make self-checks dynamic
UBI: make debugging messages dynamic
UBI: remove UBI_IO_DEBUG macro
UBI: kill debugging buffer
UBI: allocate erase checking buffer on demand
UBI: allocate write checking buffer on demand
UBI: always re-read in case of read failures
UBI: cleanup comments about corrupted PEBs
UBI: add slab cache for ubi_scan_leb objects
UBI: use raw mtd read function in debugging code
UBI: try to reveal buggy MTD drivers
UBI: add a commentary about allocating VID header buffer on stack
UBI: cleanup LEB start calculations
UBI: fix NOR erase preparation quirk
Linus Torvalds [Fri, 18 Mar 2011 17:45:21 +0000 (10:45 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Flush TLB if PGD entry is changed in i386 PAE mode
x86, dumpstack: Correct stack dump info when frame pointer is available
x86: Clean up csum-copy_64.S a bit
x86: Fix common misspellings
x86: Fix misspelling and align params
x86: Use PentiumPro-optimized partial_csum() on VIA C7
Linus Torvalds [Fri, 18 Mar 2011 17:44:05 +0000 (10:44 -0700)]
Merge branches 'irq-fixes-for-linus' and 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Fix incorrect unlock in __setup_irq()
cris: Use generic show_interrupts()
genirq: show_interrupts: Check desc->name before printing it blindly
cris: Use accessor functions to set IRQ_PER_CPU flag
cris: Fix irq conversion fallout
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched, kernel-doc: Fix runqueue_is_locked() description
Linus Torvalds [Fri, 18 Mar 2011 17:38:34 +0000 (10:38 -0700)]
Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits)
trace, filters: Initialize the match variable in process_ops() properly
trace, documentation: Fix branch profiling location in debugfs
oprofile, s390: Cleanups
oprofile, s390: Remove hwsampler_files.c and merge it into init.c
perf: Fix tear-down of inherited group events
perf: Reorder & optimize perf_event_context to remove alignment padding on 64 bit builds
perf: Handle stopped state with tracepoints
perf: Fix the software events state check
perf, powerpc: Handle events that raise an exception without overflowing
perf, x86: Use INTEL_*_CONSTRAINT() for all PEBS event constraints
perf, x86: Clean up SandyBridge PEBS events
perf lock: Fix sorting by wait_min
perf tools: Version incorrect with some versions of grep
perf evlist: New command to list the names of events present in a perf.data file
perf script: Add support for H/W and S/W events
perf script: Add support for dumping symbols
perf script: Support custom field selection for output
perf script: Move printing of 'common' data from print_event and rename
perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
perf script: Change process_event prototype
...
Linus Torvalds [Fri, 18 Mar 2011 17:37:40 +0000 (10:37 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits)
doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore
Update cpuset info & webiste for cgroups
dcdbas: force SMI to happen when expected
arch/arm/Kconfig: remove one to many l's in the word.
asm-generic/user.h: Fix spelling in comment
drm: fix printk typo 'sracth'
Remove one to many n's in a word
Documentation/filesystems/romfs.txt: fixing link to genromfs
drivers:scsi Change printk typo initate -> initiate
serial, pch uart: Remove duplicate inclusion of linux/pci.h header
fs/eventpoll.c: fix spelling
mm: Fix out-of-date comments which refers non-existent functions
drm: Fix printk typo 'failled'
coh901318.c: Change initate to initiate.
mbox-db5500.c Change initate to initiate.
edac: correct i82975x error-info reported
edac: correct i82975x mci initialisation
edac: correct commented info
fs: update comments to point correct document
target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c
...
Trivial conflict in fs/eventpoll.c (spelling vs addition)
Linus Torvalds [Fri, 18 Mar 2011 17:35:30 +0000 (10:35 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (48 commits)
HID: add support for Logitech Driving Force Pro wheel
HID: hid-ortek: remove spurious reference
HID: add support for Ortek PKB-1700
HID: roccat-koneplus: vorrect mode of sysfs attr 'sensor'
HID: hid-ntrig: init settle and mode check
HID: merge hid-egalax into hid-multitouch
HID: hid-multitouch: Send events per slot if CONTACTCOUNT is missing
HID: ntrig remove if and drop an indent
HID: ACRUX - activate the device immediately after binding
HID: ntrig: apply NO_INIT_REPORTS quirk
HID: hid-magicmouse: Correct touch orientation direction
HID: ntrig don't dereference unclaimed hidinput
HID: Do not create input devices for feature reports
HID: bt hidp: send Output reports using SET_REPORT on the Control channel
HID: hid-sony.c: Fix sending Output reports to the Sixaxis
HID: add support for Keytouch IEC 60945
HID: Add HID Report Descriptor to sysfs
HID: add IRTOUCH infrared USB to hid_have_special_driver
HID: kernel oops in out_cleanup in function hidinput_connect
HID: Add teletext/color keys - gyration remote - EU version (GYAR3101CKDE)
...
Josef Bacik [Fri, 19 Nov 2010 01:52:55 +0000 (20:52 -0500)]
fs: call security_d_instantiate in d_obtain_alias V2
While trying to track down some NFS problems with BTRFS, I kept noticing I was
getting -EACCESS for no apparent reason. Eric Paris and printk() helped me
figure out that it was SELinux that was giving me grief, with the following
denial
Turns out this is because in d_obtain_alias if we can't find an alias we create
one and do all the normal instantiation stuff, but we don't do the
security_d_instantiate.
Usually we are protected from getting a hashed dentry that hasn't yet run
security_d_instantiate() by the parent's i_mutex, but obviously this isn't an
option there, so in order to deal with the case that a second thread comes in
and finds our new dentry before we get to run security_d_instantiate(), we go
ahead and call it if we find a dentry already. Eric assures me that this is ok
as the code checks to see if the dentry has been initialized already so calling
security_d_instantiate() against the same dentry multiple times is ok. With
this patch I'm no longer getting errant -EACCESS values.
Signed-off-by: Josef Bacik <josef@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Fri, 18 Mar 2011 12:55:38 +0000 (08:55 -0400)]
change the locking order for namespace_sem
Have it nested inside ->i_mutex. Instead of using follow_down()
under namespace_sem, followed by grabbing i_mutex and checking that
mountpoint to be is not dead, do the following:
grab i_mutex
check that it's not dead
grab namespace_sem
see if anything is mounted there
if not, we've won
otherwise
drop locks
put_path on what we had
replace with what's mounted
retry everything with new mountpoint to be
New helper (lock_mount()) does that. do_add_mount(), do_move_mount(),
do_loopback() and pivot_root() switched to it; in case of the last
two that eliminates a race we used to have - original code didn't
do follow_down().
Al Viro [Fri, 18 Mar 2011 12:29:36 +0000 (08:29 -0400)]
fix deadlock in pivot_root()
Don't hold vfsmount_lock over the loop traversing ->mnt_parent;
do check_mnt(new.mnt) under namespace_sem instead; combined with
namespace_sem held over all that code it'll guarantee the stability
of ->mnt_parent chain all the way to the root.
Doing check_mnt() outside of namespace_sem in case of pivot_root()
is wrong anyway.
Shaohua Li [Wed, 16 Mar 2011 03:37:29 +0000 (11:37 +0800)]
x86: Flush TLB if PGD entry is changed in i386 PAE mode
According to intel CPU manual, every time PGD entry is changed in i386 PAE
mode, we need do a full TLB flush. Current code follows this and there is
comment for this too in the code.
But current code misses the multi-threaded case. A changed page table
might be used by several CPUs, every such CPU should flush TLB. Usually
this isn't a problem, because we prepopulate all PGD entries at process
fork. But when the process does munmap and follows new mmap, this issue
will be triggered.
When it happens, some CPUs keep doing page faults:
Namhyung Kim [Fri, 18 Mar 2011 02:40:06 +0000 (11:40 +0900)]
x86, dumpstack: Correct stack dump info when frame pointer is available
Current stack dump code scans entire stack and check each entry
contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
could mark whether the pointer is valid or not based on value of
the frame pointer. Invalid entries could be preceded by '?' sign.
However this was not going to happen because scan start point
was always higher than the frame pointer so that they could not
meet.
Commit 9c0729dc8062 ("x86: Eliminate bp argument from the stack
tracing routines") delayed bp acquisition point, so the bp was
read in lower frame, thus all of the entries were marked
invalid.
This patch fixes this by reverting above commit while retaining
stack_frame() helper as suggested by Frederic Weisbecker.
Dan Rosenberg [Thu, 17 Mar 2011 22:32:24 +0000 (18:32 -0400)]
ALSA: sound/pci/asihpi: check adapter index in hpi_ioctl
The user-supplied index into the adapters array needs to be checked, or
an out-of-bounds kernel pointer could be accessed and used, leading to
potentially exploitable memory corruption.
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Fri, 18 Mar 2011 06:31:53 +0000 (07:31 +0100)]
ALSA: aloop - Fix possible IRQ lock inversion
loopback_pos_update() can be called in the timer callback, thus the lock
held should be irq-safe. Otherwise you'll get AB/BA deadlock together
with substream->self_group.lock.
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (27 commits)
arch/tile: support newer binutils assembler shift semantics
arch/tile: fix deadlock bugs in rwlock implementation
drivers/edac: provide support for tile architecture
tile on-chip network driver: sync up with latest fixes
arch/tile: support 4KB page size as well as 64KB
arch/tile: add some more VMSPLIT options and use consistent naming
arch/tile: fix some comments and whitespace
arch/tile: export some additional module symbols
arch/tile: enhance existing finv_buffer_remote() routine
arch/tile: fix two bugs in the backtracer code
arch/tile: use extended assembly to inline __mb_incoherent()
arch/tile: use a cleaner technique to enable interrupt for cpu_idle()
arch/tile: sync up with <arch/sim.h> and <arch/sim_def.h> changes
arch/tile: fix reversed test of strict_strtol() return value
arch/tile: avoid a simulator warning during bootup
arch/tile: export <asm/hardwall.h> to userspace
arch/tile: warn and retry if an IPI is not accepted by the target cpu
arch/tile: stop disabling INTCTRL_1 interrupts during hypervisor downcalls
arch/tile: fix __ndelay etc to work better
arch/tile: bug fix: exec'ed task thought it was still single-stepping
...
Fix up trivial conflict in arch/tile/kernel/vmlinux.lds.S (percpu
alignment vs section naming convention fix)
Linus Torvalds [Fri, 18 Mar 2011 02:28:15 +0000 (19:28 -0700)]
Merge branch 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6
* 'omap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6: (258 commits)
omap: zoom: host should not pull up wl1271's irq line
arm: plat-omap: iommu: fix request_mem_region() error path
OMAP2+: Common CPU DIE ID reading code reads wrong registers for OMAP4430
omap4: mux: Remove duplicate mux modes
omap: iovmm: don't check 'da' to set IOVMF_DA_FIXED flag
omap: iovmm: disallow mapping NULL address when IOVMF_DA_ANON is set
omap2+: mux: Fix compile when CONFIG_OMAP_MUX is not selected
omap4: board-omap4panda: Initialise the serial pads
omap3: board-3430sdp: Initialise the serial pads
omap4: board-4430sdp: Initialise the serial pads
omap2+: mux: Add macro for configuring static with omap_hwmod_mux_init
omap2+: mux: Remove the use of IDLE flag
omap2+: Add separate list for dynamic pads to mux
perf: add OMAP support for the new power events
OMAP4: Add IVA OPP enteries.
OMAP4: Update Voltage Rail Values for MPU, IVA and CORE
OMAP4: Enable 800 MHz and 1 GHz MPU-OPP
OMAP3+: OPP: Replace voltage values with Macros
OMAP3: wdtimer: Fix CORE idle transition
Watchdog: omap_wdt: add fine grain runtime-pm
...
Fix up various conflicts in
- arch/arm/mach-omap2/board-omap3evm.c
- arch/arm/mach-omap2/clock3xxx_data.c
- arch/arm/mach-omap2/usb-musb.c
- arch/arm/plat-omap/include/plat/usb.h
- drivers/usb/musb/musb_core.h
Linus Torvalds [Fri, 18 Mar 2011 02:13:18 +0000 (19:13 -0700)]
Merge branch 'for-linus' of git://codeaurora.org/quic/kernel/davidb/linux-msm
* 'for-linus' of git://codeaurora.org/quic/kernel/davidb/linux-msm: (46 commits)
msm: scm: Check for interruption immediately
msm: scm: Fix improper register assignment
msm: scm: Mark inline asm as volatile
msm: iommu: Enable HTW L2 redirection on MSM8960
msm: iommu: Don't read from write-only registers
msm: iommu: Remove dependency on IDR
msm: iommu: Use ASID tagging instead of VMID tagging
msm: iommu: Rework clock logic and add IOMMU bus clock control
msm: iommu: Clock control for the IOMMU driver
msm: mdp: Set the correct pack pattern for XRGB/ARGB
msm_fb: Fix framebuffer console
msm: mdp: Add support for RGBX 8888 image format.
video: msmfb: Put the partial update magic value into the fix_screen struct.
msm: clock: Migrate to clkdev
msm: clock: Remove references to clk_ops_pcom
msm: headsmp.S: Fix section mismatch
msm: Use explicit GPLv2 licenses
msm: iommu: Enable IOMMU support for MSM8960
msm: iommu: Generalize platform data for multiple targets
msm: iommu: Create a Kconfig item for the IOMMU driver
...
Tony Luck [Thu, 17 Mar 2011 23:29:15 +0000 (16:29 -0700)]
Some fixes for pstore
1) Change from ->get_sb() to ->mount()
2) Use mount_single() instead of mount_nodev()
3) Pulled in ramfs_get_inode() & trimmed to what I need for pstore
4) Drop the ugly pstore_writefile() Just save data using kmalloc() and
provide a pstore_file_read() that uses simple_read_from_buffer().
Linus Torvalds [Fri, 18 Mar 2011 02:08:06 +0000 (19:08 -0700)]
Merge branch 'devel-stable' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'devel-stable' of master.kernel.org:/home/rmk/linux-2.6-arm: (289 commits)
davinci: DM644x EVM: register MUSB device earlier
davinci: add spi devices on tnetv107x evm
davinci: add ssp config for tnetv107x evm board
davinci: add tnetv107x ssp platform device
spi: add ti-ssp spi master driver
mfd: add driver for sequencer serial port
ARM: EXYNOS4: Implement Clock gating for System MMU
ARM: EXYNOS4: Enhancement of System MMU driver
ARM: EXYNOS4: Add support for gpio interrupts
ARM: S5P: Add function to register gpio interrupt bank data
ARM: S5P: Cleanup S5P gpio interrupt code
ARM: EXYNOS4: Add missing GPYx banks
ARM: S3C64XX: Fix section mismatch from cpufreq init
ARM: EXYNOS4: Add keypad device to the SMDKV310
ARM: EXYNOS4: Update clocks for keypad
ARM: EXYNOS4: Update keypad base address
ARM: EXYNOS4: Add keypad device helpers
ARM: EXYNOS4: Add support for SATA on ARMLEX4210
plat-nomadik: make GPIO interrupts work with cpuidle ApSleep
mach-u300: define a dummy filter function for coh901318
...
Fix up various conflicts in
- arch/arm/mach-exynos4/cpufreq.c
- arch/arm/mach-mxs/gpio.c
- drivers/net/Kconfig
- drivers/tty/serial/Kconfig
- drivers/tty/serial/Makefile
- drivers/usb/gadget/fsl_mxc_udc.c
- drivers/video/Kconfig
Linus Torvalds [Fri, 18 Mar 2011 01:48:35 +0000 (18:48 -0700)]
Merge branches 'defcfg', 'drivers' and 'cyberpro-next' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'defcfg' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6647/1: add Versatile Express defconfig
ARM: 6644/1: mach-ux500: update the U8500 defconfig
* 'drivers' of master.kernel.org:/home/rmk/linux-2.6-arm:
ARM: 6764/1: pl011: factor out FIFO to TTY code
ARM: 6763/1: pl011: add optional RX DMA to PL011 v2
ARM: 6758/1: amba: support pm ops
ARM: amba: make amba_driver id_table const
ARM: amba: make internal ID table handling const
ARM: amba: make probe() functions take const id tables
ARM: 6662/1: amba: make amba_bustype non-static
ARM: mmci: add dmaengine-based DMA support
ARM: mmci: no need for separate host->data_xfered
ARM: mmci: avoid unnecessary switch to data available PIO interrupts
ARM: mmci: no need to call flush_dcache_page() with sg_miter API
ARM: mmci: avoid reporting too many completed bytes on fifo overrun
ALSA: AACI: make fifo variables more explanitory
ALSA: AACI: no need to call snd_pcm_period_elapsed() for each period
ALSA: AACI: use snd_pcm_lib_period_bytes()
ALSA: AACI: clean up AACI announcement printk
ALSA: AACI: fix channel mask selection
ALSA: AACI: fix number of channels for record
ALSA: AACI: fix multiple IRQ claiming
* 'cyberpro-next' of master.kernel.org:/home/rmk/linux-2.6-arm:
VIDEO: cyberpro: remove unused cyber2000fb_get_fb_var()
VIDEO: cyberpro: remove useless function extreg pointers
VIDEO: cyberpro: update handling of device structures
VIDEO: cyberpro: add support for video capture I2C
VIDEO: cyberpro: make 'reg_b0_lock' always present
VIDEO: cyberpro: add I2C support
VIDEO: cyberpro: select lowest multipler/divisor for PLL
Linus Torvalds [Fri, 18 Mar 2011 01:40:35 +0000 (18:40 -0700)]
Merge branch 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.39' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (55 commits)
KVM: unbreak userspace that does not sets tss address
KVM: MMU: cleanup pte write path
KVM: MMU: introduce a common function to get no-dirty-logged slot
KVM: fix rcu usage in init_rmode_* functions
KVM: fix kvmclock regression due to missing clock update
KVM: emulator: Fix permission checking in io permission bitmap
KVM: emulator: Fix io permission checking for 64bit guest
KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n
KVM: x86: Remove useless regs_page pointer from kvm_lapic
KVM: improve comment on rcu use in irqfd_deassign
KVM: MMU: remove unused macros
KVM: MMU: cleanup page alloc and free
KVM: MMU: do not record gfn in kvm_mmu_pte_write
KVM: MMU: move mmu pages calculated out of mmu lock
KVM: MMU: set spte accessed bit properly
KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits
KVM: Start lock documentation
KVM: better readability of efer_reserved_bits
KVM: Clear async page fault hash after switching to real mode
KVM: VMX: Initialize vm86 TSS only once.
...
Linus Torvalds [Fri, 18 Mar 2011 01:37:42 +0000 (18:37 -0700)]
Merge branch 'stable/xen.pm.bug-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/xen.pm.bug-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: use freeze/restore/thaw PM events for suspend/resume/chkpt
xen: xenbus PM events support
Linus Torvalds [Fri, 18 Mar 2011 01:27:49 +0000 (18:27 -0700)]
Merge branches 'stable/irq.fairness' and 'stable/irq.ween_of_nr_irqs' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/irq.fairness' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: events: Remove redundant clear of l2i at end of round-robin loop
xen: events: Make round-robin scan fairer by snapshotting each l2 word once only
xen: events: Clean up round-robin evtchn scan.
xen: events: Make last processed event channel a per-cpu variable.
xen: events: Process event channels notifications in round-robin order.
* 'stable/irq.ween_of_nr_irqs' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: events: Fix compile error if CONFIG_SMP is not defined.
xen: events: correct locking in xen_irq_from_pirq
xen: events: propagate irq allocation failure instead of panicking
xen: events: do not workaround too-small nr_irqs
xen: events: remove use of nr_irqs as upper bound on number of pirqs
xen: events: dynamically allocate irq info structures
xen: events: maintain a list of Xen interrupts
xen: events: push setup of irq<->{evtchn,ipi,virq,pirq} maps into irq_info init functions
xen: events: turn irq_info constructors into initialiser functions
xen: events: use per-cpu variable for cpu_evtchn_mask
xen: events: refactor GSI pirq bindings functions
xen: events: rename restore_cpu_pirqs -> restore_pirqs
xen: events: remove unused public functions
xen: events: fix xen_map_pirq_gsi error return
xen: events: simplify comment
xen: events: separate two unrelated halves of if condition
Linus Torvalds [Fri, 18 Mar 2011 01:16:36 +0000 (18:16 -0700)]
Merge branches 'stable/hvc-console', 'stable/gntalloc.v6' and 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
* 'stable/hvc-console' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/hvc: Disable probe_irq_on/off from poking the hvc-console IRQ line.
* 'stable/gntalloc.v6' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: gntdev: fix build warning
xen/p2m/m2p/gnttab: do not add failed grant maps to m2p override
xen-gntdev: Add cast to pointer
xen-gntdev: Fix incorrect use of zero handle
xen: change xen/[gntdev/gntalloc] to default m
xen-gntdev: prevent using UNMAP_NOTIFY_CLEAR_BYTE on read-only mappings
xen-gntdev: Avoid double-mapping memory
xen-gntdev: Avoid unmapping ranges twice
xen-gntdev: Use map->vma for checking map validity
xen-gntdev: Fix unmap notify on PV domains
xen-gntdev: Fix memory leak when mmap fails
xen/gntalloc,gntdev: Add unmap notify ioctl
xen-gntalloc: Userspace grant allocation driver
xen-gntdev: Support mapping in HVM domains
xen-gntdev: Add reference counting to maps
xen-gntdev: Use find_vma rather than iterating our vma list manually
xen-gntdev: Change page limit to be global instead of per-open
* 'stable/balloon' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: (24 commits)
xen-gntdev: Use ballooned pages for grant mappings
xen-balloon: Add interface to retrieve ballooned pages
xen-balloon: Move core balloon functionality out of module
xen/balloon: Remove pr_info's and don't alter retry_count
xen/balloon: Protect against CPU exhaust by event/x process
xen/balloon: Migration from mod_timer() to schedule_delayed_work()
xen/balloon: Removal of driver_pages
Linus Torvalds [Fri, 18 Mar 2011 00:41:19 +0000 (17:41 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
ext3: Always set dx_node's fake_dirent explicitly.
ext3: Fix an overflow in ext3_trim_fs.
jbd: Remove one to many n's in a word.
ext3: skip orphan cleanup on rocompat fs
ext2: Fix link count corruption under heavy link+rename load
ext3: speed up group trim with the right free block count.
ext3: Adjust trim start with first_data_block.
quota: return -ENOMEM when memory allocation fails
Linus Torvalds [Fri, 18 Mar 2011 00:40:00 +0000 (17:40 -0700)]
Merge branch 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.39' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (54 commits)
RPC: killing RPC tasks races fixed
xprt: remove redundant check
SUNRPC: Convert struct rpc_xprt to use atomic_t counters
SUNRPC: Ensure we always run the tk_callback before tk_action
sunrpc: fix printk format warning
xprt: remove redundant null check
nfs: BKL is no longer needed, so remove the include
NFS: Fix a warning in fs/nfs/idmap.c
Cleanup: Factor out some cut-and-paste code.
cleanup: save 60 lines/100 bytes by combining two mostly duplicate functions.
NFS: account direct-io into task io accounting
gss:krb5 only include enctype numbers in gm_upcall_enctypes
RPCRDMA: Fix FRMR registration/invalidate handling.
RPCRDMA: Fix to XDR page base interpretation in marshalling logic.
NFSv4: Send unmapped uid/gids to the server when using auth_sys
NFSv4: Propagate the error NFS4ERR_BADOWNER to nfs4_do_setattr
NFSv4: cleanup idmapper functions to take an nfs_server argument
NFSv4: Send unmapped uid/gids to the server if the idmapper fails
NFSv4: If the server sends us a numeric uid/gid then accept it
NFSv4.1: reject zero layout with zeroed stripe unit
...
Linus Torvalds [Fri, 18 Mar 2011 00:10:19 +0000 (17:10 -0700)]
Merge branch 'for-linus/2639/i2c-1' of git://git.fluff.org/bjdooks/linux
* 'for-linus/2639/i2c-1' of git://git.fluff.org/bjdooks/linux:
i2c-mpc: Add support for 64bit system
i2c: add driver for Freescale i.MX28
i2c: tegra: Add i2c support
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
watchdog: booke_wdt: clean up status messages
watchdog: cleanup spaces before tabs
watchdog: convert to DEFINE_PCI_DEVICE_TABLE
watchdog: Xen watchdog driver
watchdog: Intel SCU Watchdog Timer Driver for Moorestown and Medfield platforms.
watchdog: jz4740_wdt - fix magic character checking
watchdog: add JZ4740 watchdog driver
watchdog: it87_wdt: Add support for IT8721F watchdog
watchdog: hpwdt: build hpwdt as module by default with NMI_DECODING enabled
watchdog: hpwdt: Fix a couple of typos
Linus Torvalds [Thu, 17 Mar 2011 23:59:38 +0000 (16:59 -0700)]
Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging: (44 commits)
hwmon: (lineage-pem): Fix in1 voltage alarm sysfs attributes
hwmon/f71882fg: Add support for f71808e
hwmon/f71882fg: Add support for f71869f and f71869e
hwmon/f71882fg: Add support for f71889ed
hwmon/f71882fg: Break out test for auto pwm's controlled by digital readings
hwmon/f71882fg: Separate temp beep sysfs attr from the other temp sysfs attr
hwmon/f71882fg: Remove bogus temp2_type for certain models
hwmon/f71882fg: Make number of temps configurable
hwmon/f71882fg: Make creation of in sysfs attributes more generic
hwmon/f71882fg: Only allow negative auto point temps if fan_neg_temp is enabled
hwmon/f71882fg: Fix temp1 sensor type reporting
hwmon: (w83627ehf) Display correct temperature sensor labels for systems with NCT6775F
hwmon: (w83627ehf) Add fan debounce support for NCT6775F and NCT6776F
hwmon: (w83627ehf) Update Kconfig for W83677HG-B, NCT6775F and NCT6776F
hwmon: (w83627ehf) Store rpm instead of raw fan speed data
hwmon: (w83627ehf) Use 16 bit fan count registers if supported
hwmon: (w83627ehf) Add support for Nuvoton NCT6775F and NCT6776F
hwmon: (w83627ehf) Permit enabling SmartFan IV mode if configured at startup
hwmon: (w83627ehf) Convert register arrays to 16 bit, and convert access to pointers
hwmon: (w83627ehf) Remove references to datasheets which no longer exist
...
Milton Miller [Tue, 15 Mar 2011 19:27:17 +0000 (13:27 -0600)]
smp_call_function_interrupt: use typedef and %pf
Use the newly added smp_call_func_t in smp_call_function_interrupt for
the func variable, and make the comment above the WARN more assertive
and explicit. Also, func is a function pointer and does not need an
offset, so use %pf not %pS.
Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:17 +0000 (13:27 -0600)]
smp_call_function_many: handle concurrent clearing of mask
Mike Galbraith reported finding a lockup ("perma-spin bug") where the
cpumask passed to smp_call_function_many was cleared by other cpu(s)
while a cpu was preparing its call_data block, resulting in no cpu to
clear the last ref and unlock the block.
Having cpus clear their bit asynchronously could be useful on a mask of
cpus that might have a translation context, or cpus that need a push to
complete an rcu window.
Instead of adding a BUG_ON and requiring yet another cpumask copy, just
detect the race and handle it.
Note: arch_send_call_function_ipi_mask must still handle an empty
cpumask because the data block is globally visible before the that arch
callback is made. And (obviously) there are no guarantees to which cpus
are notified if the mask is changed during the call; only cpus that were
online and had their mask bit set during the whole call are guaranteed
to be called.
Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Jan Beulich <JBeulich@novell.com> Acked-by: Jan Beulich <jbeulich@novell.com> Cc: stable@kernel.org Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:16 +0000 (13:27 -0600)]
call_function_many: add missing ordering
Paul McKenney's review pointed out two problems with the barriers in the
2.6.38 update to the smp call function many code.
First, a barrier that would force the func and info members of data to
be visible before their consumption in the interrupt handler was
missing. This can be solved by adding a smp_wmb between setting the
func and info members and setting setting the cpumask; this will pair
with the existing and required smp_rmb ordering the cpumask read before
the read of refs. This placement avoids the need a second smp_rmb in
the interrupt handler which would be executed on each of the N cpus
executing the call request. (I was thinking this barrier was present
but was not).
Second, the previous write to refs (establishing the zero that we the
interrupt handler was testing from all cpus) was performed by a third
party cpu. This would invoke transitivity which, as a recient or
concurrent addition to memory-barriers.txt now explicitly states, would
require a full smp_mb().
However, we know the cpumask will only be set by one cpu (the data
owner) and any preivous iteration of the mask would have cleared by the
reading cpu. By redundantly writing refs to 0 on the owning cpu before
the smp_wmb, the write to refs will follow the same path as the writes
that set the cpumask, which in turn allows us to keep the barrier in the
interrupt handler a smp_rmb instead of promoting it to a smp_mb (which
will be be executed by N cpus for each of the possible M elements on the
list).
I moved and expanded the comment about our (ab)use of the rcu list
primitives for the concurrent walk earlier into this function. I
considered moving the first two paragraphs to the queue list head and
lock, but felt it would have been too disconected from the code.
Cc: Paul McKinney <paulmck@linux.vnet.ibm.com> Cc: stable@kernel.org (2.6.32 and later) Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Milton Miller [Tue, 15 Mar 2011 19:27:16 +0000 (13:27 -0600)]
call_function_many: fix list delete vs add race
Peter pointed out there was nothing preventing the list_del_rcu in
smp_call_function_interrupt from running before the list_add_rcu in
smp_call_function_many.
Fix this by not setting refs until we have gotten the lock for the list.
Take advantage of the wmb in list_add_rcu to save an explicit additional
one.
I tried to force this race with a udelay before the lock & list_add and
by mixing all 64 online cpus with just 3 random cpus in the mask, but
was unsuccessful. Still, inspection shows a valid race, and the fix is
a extension of the existing protection window in the current code.
Cc: stable@kernel.org (v2.6.32 and later) Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Milton Miller <miltonm@bga.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Chris Metcalf [Thu, 17 Mar 2011 18:32:06 +0000 (14:32 -0400)]
arch/tile: support newer binutils assembler shift semantics
This change supports building the kernel with newer binutils where
a shift of greater than the word size is no longer interpreted
silently as modulo the word size, but instead generates a warning.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Linus Torvalds [Thu, 17 Mar 2011 17:11:25 +0000 (10:11 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/epip/linux-2.6-unicore32
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/epip/linux-2.6-unicore32: (40 commits)
unicore32: rewrite arch-specific tlb.h to use asm-generic version
unicore32: modify io_p2v and io_v2p macros, and adjust PKUNITY_mmio_BASEs
unicore32: replace unicore32-specific iomap functions with generic lib implementation
unicore32 machine related: add frame buffer driver for pkunity-v3 soc
unicore32 machine related files: add i2c bus drivers for pkunity-v3 soc
unicore32 io: redefine __REG(x) and re-use readl/writel funcs
unicore32 i8042 upgrade and bugfix: adjust resource request region type
unicore32 upgrade to v2.6.38-rc5: add one more paramter for pte_alloc_map call
unicore32 i8042: adjust io funcs of i8042-unicore32io.h
unicore32: rename PKUNITY_IOSPACE_BASE to PKUNITY_MMIO_BASE
unicore32: modify function names and parameters for irq_chips
unicore32: remove unused lines in arch/unicore32/include/asm/irq.h
unicore32 time.c: change calculate method for clock_event_device
unicore32: ADD MAINTAINER for unicore32 architecture
unicore32 machine related files: ps2 driver
unicore32 machine related files: pci bus handling
unicore32 machine related files: hardware registers
unicore32 machine related files: core files
unicore32 additional architecture files: boot process
unicore32 additional architecture files: low-level lib: misc
...
Linus Torvalds [Thu, 17 Mar 2011 16:57:10 +0000 (09:57 -0700)]
Merge branch 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (34 commits)
sh: Convert to generic show_interrupts.
sh: Wire up new fhandle and clock_adjtime syscalls.
sh: modify platform_device for sh_eth driver
sh: add GETHER's platform_device in board-sh7757lcr
sh: update sh7757lcr_defconfig
sh: add platform_device of tmio_mmc and sh_mmcif to sh7757lcr
sh: dmaengine support for SH7757
sh: add mmc clock in clock-sh7757
sh: add spi_board_info in sh7757lcr
sh: add platform_device for SPI
sh: add USB_ARCH_HAS_EHCI and OHCI for SH7757
sh: Rename cpuidle states to fit general conventions
serial: sh-sci: fix deadlock when resuming from S3 sleep
sh: Enable CONFIG_GCOV_PROFILE_ALL for sh
sh: Fix up async PCIe probing on SMP.
serial: sh-sci: Kill off the special earlyprintk device.
serial: sh-sci: Use dev_name() for region reservations.
serial: sh-sci: Fix up earlyprintk port mapping.
serial: sh-sci: Limit early console to one device.
serial: sh-sci: Fix up break timer scheduling race.
...
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6:
fbdev: sh_mobile_lcdc: Add YUV framebuffer support
viafb: split pll configs up
viafb: remove duplicated clock storage
viafb: always return the best possible clock
viafb: remove duplicated clock information
fbdev: sh_mobile_lcdcfb: add backlight support
viafb: factor lcd scaling parameters out
viafb: strip some structures
viafb: remove unused data_mode and device_type
viafb: kill lcd_panel_id
video via: make local variables static
video via: fix iomem access
video/via: drop deprecated (and unused) i2c_adapter.id
RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up
task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to
NULL).
Also, as Trond Myklebust mentioned, such approach (instead of checking
tk_waitqueue to NULL) allows us to "optimise away the call to
rpc_wake_up_queued_task() altogether for those
tasks that aren't queued".
Here is an example of dereferencing of tk_waitqueue equal to NULL:
Trond Myklebust [Tue, 15 Mar 2011 23:56:30 +0000 (19:56 -0400)]
SUNRPC: Ensure we always run the tk_callback before tk_action
This fixes a race in which the task->tk_callback() puts the rpc_task
to sleep, setting a new callback. Under certain circumstances, the current
code may end up executing the task->tk_action before it gets round to the
callback.
Linus Torvalds [Thu, 17 Mar 2011 16:11:39 +0000 (09:11 -0700)]
Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (177 commits)
drm/radeon: fixup refcounts in radeon dumb create ioctl.
drm: radeon: *_cs_packet_parse_vline() cleanup
radeon: merge list_del()/list_add_tail() to list_move_tail()
drm: Retry i2c transfer of EDID block after failure
drm/radeon/kms: fix typo in atom overscan setup
drm: Hold the mode mutex whilst probing for sysfs status
drm/nouveau: fix __nouveau_fence_wait performance
drm/nv40: attempt to reserve just enough vram for all 32 channels
drm/nv50: check for vm traps on every gr irq
drm/nv50: decode vm faults some more
drm/nouveau: add nouveau_enum_find() util function
drm/nouveau: properly handle pushbuffer check failures
drm/nvc0: remove vm hack forcing large/small pages to not share a PDE
drm/i915: disable opregion lid detection for now.
drm/i915: Only wait on a pending flip if we intend to write to the buffer
drm/i915/dp: Sanity check eDP existence
drm: add cap bit to denote if dumb ioctl is available or not.
drm/core: add ioctl to query device/driver capabilities
drm/radeon/kms: allow max clock of 340 Mhz on hdmi 1.3+
drm/radeon/kms: add cayman pci ids
...
Gleb Natapov [Sun, 13 Mar 2011 10:34:27 +0000 (12:34 +0200)]
KVM: unbreak userspace that does not sets tss address
Commit 6440e5967bc broke old userspaces that do not set tss address
before entering vcpu. Unbreak it by setting tss address to a safe
value on the first vcpu entry. New userspaces should set tss address,
so print warning in case it doesn't.
Nikola Ciprich [Wed, 9 Mar 2011 22:36:51 +0000 (23:36 +0100)]
KVM: fix kvmclock regression due to missing clock update
commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved kvm_request_guest_time_update(vcpu),
breaking 32bit SMP guests using kvm-clock. Fix this by moving (new) clock update function
to proper place.
Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 7 Mar 2011 12:55:07 +0000 (14:55 +0200)]
KVM: emulator: Fix permission checking in io permission bitmap
Currently if io port + len crosses 8bit boundary in io permission bitmap the
check may allow IO that otherwise should not be allowed. The patch fixes that.
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
KVM: x86: Remove useless regs_page pointer from kvm_lapic
Access to this page is mostly done through the regs member which holds
the address to this page. The exceptions are in vmx_vcpu_reset() and
kvm_free_lapic() and these both can easily be converted to using regs.
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
The RCU use in kvm_irqfd_deassign is tricky: we have rcu_assign_pointer
but no synchronize_rcu: synchronize_rcu is done by kvm_irq_routing_update
which we share a spinlock with.
Fix up a comment in an attempt to make this clearer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Xiao Guangrong [Fri, 4 Mar 2011 11:00:00 +0000 (19:00 +0800)]
KVM: MMU: do not record gfn in kvm_mmu_pte_write
No need to record the gfn to verifier the pte has the same mode as
current vcpu, it's because we only speculatively update the pte only
if the pte and vcpu have the same mode
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Jan Kiszka [Wed, 9 Feb 2011 14:11:28 +0000 (15:11 +0100)]
KVM: Start lock documentation
The goal of this document shall be
- overview of all locks used in KVM core
- provide details on the scope of each lock
- explain the lock type, specifically of a raw spin locks
- provide a lock ordering guide
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 21 Feb 2011 10:07:59 +0000 (12:07 +0200)]
KVM: VMX: Initialize vm86 TSS only once.
Currently vm86 task is initialized on each real mode entry and vcpu
reset. Initialization is done by zeroing TSS and updating relevant
fields. But since all vcpus are using the same TSS there is a race where
one vcpu may use TSS while other vcpu is initializing it, so the vcpu
that uses TSS will see wrong TSS content and will behave incorrectly.
Fix that by initializing TSS only once.
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Mon, 21 Feb 2011 10:07:58 +0000 (12:07 +0200)]
KVM: VMX: update live TR selector if it changes in real mode
When rmode.vm86 is active TR descriptor is updated with vm86 task values,
but selector is left intact. vmx_set_segment() makes sure that if TR
register is written into while vm86 is active the new values are saved
for use after vm86 is deactivated, but since selector is not updated on
vm86 activation/deactivation new value is lost. Fix this by writing new
selector into vmcs immediately.
Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Gleb Natapov [Wed, 9 Feb 2011 10:09:46 +0000 (12:09 +0200)]
KVM: remove isr_ack logic from PIC
isr_ack logic was added by e48258009d to avoid unnecessary IPIs. Back
then it made sense, but now the code checks that vcpu is ready to accept
interrupt before sending IPI, so this logic is no longer needed. The
patch removes it.
Fixes a regression with Debian/Hurd.
Signed-off-by: Gleb Natapov <gleb@redhat.com> Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Thu, 3 Feb 2011 13:29:52 +0000 (15:29 +0200)]
KVM: SVM: check for progress after IRET interception
When we enable an NMI window, we ask for an IRET intercept, since
the IRET re-enables NMIs. However, the IRET intercept happens before
the instruction executes, while the NMI window architecturally opens
afterwards.
To compensate for this mismatch, we only open the NMI window in the
following exit, assuming that the IRET has by then executed; however,
this assumption is not always correct; we may exit due to a host interrupt
or page fault, without having executed the instruction.
Fix by checking for forward progress by recording and comparing the IRET's
rip. This is somewhat of a hack, since an unchaging rip does not mean that
no forward progress has been made, but is the simplest fix for now.
Avi Kivity [Thu, 3 Feb 2011 13:07:07 +0000 (15:07 +0200)]
KVM: Fix race between nmi injection and enabling nmi window
The interrupt injection logic looks something like
if an nmi is pending, and nmi injection allowed
inject nmi
if an nmi is pending
request exit on nmi window
the problem is that "nmi is pending" can be set asynchronously by
the PIT; if it happens to fire between the two if statements, we
will request an nmi window even though nmi injection is allowed. On
SVM, this has disasterous results, since it causes eflags.TF to be
set in random guest code.
The fix is simple; make nmi_pending synchronous using the standard
vcpu->requests mechanism; this ensures the code above is completely
synchronous wrt nmi_pending.
Rik van Riel [Tue, 1 Feb 2011 14:53:28 +0000 (09:53 -0500)]
KVM: use yield_to instead of sleep in kvm_vcpu_on_spin
Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic
slowdowns of certain workloads, we instead use yield_to to get
another VCPU in the same KVM guest to run sooner.
This seems to give a 10-15% speedup in certain workloads.
Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:03 +0000 (16:32 +0200)]
KVM: x86 emulator: vendor specific instructions
Mark some instructions as vendor specific, and allow the caller to request
emulation only of vendor specific instructions. This is useful in some
circumstances (responding to a #UD fault).
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Tue, 1 Feb 2011 14:32:02 +0000 (16:32 +0200)]
KVM: Drop bogus x86_decode_insn() error check
x86_decode_insn() doesn't return X86EMUL_* values, so the check
for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper
check later on, so there is no need for a replacement for this
code.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Tue, 1 Feb 2011 12:27:15 +0000 (13:27 +0100)]
KVM: x86: Drop obsolete warning about INIT on runnable VCPU
This warning was once used for debugging QEMU user space. Though
uncommon, it is actually possible to send an INIT request to a running
VCPU. So better drop this warning before someone misuses it to flood
kernel logs this way.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>