David Dillow [Sun, 24 Oct 2010 10:20:20 +0000 (10:20 +0000)]
typhoon: wait for RX mode commands to finish
When adding VLAN devices, we can get several calls to
typhoon_set_rx_mode() in quick succession. Because we didn't wait for
the commands to complete, we could run out of command descriptors and
fail to set the RX mode.
Signed-off-by: David Dillow <dave@thedillows.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 24 Oct 2010 04:27:10 +0000 (04:27 +0000)]
netlink: fix netlink_change_ngroups()
commit 6c04bb18ddd633 (netlink: use call_rcu for netlink_change_ngroups)
used a somewhat convoluted and racy way to perform call_rcu().
The old block of memory is freed after a grace period, but the rcu_head
used to track it is located in new block.
This can clash if we call two times or more netlink_change_ngroups(),
and a block is freed before another. call_rcu() called on different cpus
makes no guarantee in order of callbacks.
Fix this using a more standard way of handling this : Each block of
memory contains its own rcu_head, so that no 'use after free' can
happens.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Johannes Berg <johannes@sipsolutions.net> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Guo-Fu Tseng [Sun, 24 Oct 2010 23:18:25 +0000 (16:18 -0700)]
jme: Support WoL after shutdown
Adding shutdown function that setup wake if wol is enabled.
Add jme_phy_on in jme_set_100m_half in case it is shutting down
or suspending when interface is down(phy_off by default).
v2 updates:
Removed CONFIG_PM ifdef for jme_set_100m_half and jme_wait_link.
It would be nice if it can be applied to net-2.6 along with other patches
sent few days ago. If it is not appropriate, please ignore the net-2.6
request, and apply it to net-next-2.6 as previous patch. :)
Reported-and-helped-by: Сtac <Taoga@yandex.ru> Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Thu, 21 Oct 2010 18:05:34 +0000 (18:05 +0000)]
vmxnet3: fix typo setting confPA
It's a le64, not a le32, typo in one place only.
Noticed by sparse:
drivers/net/vmxnet3/vmxnet3_drv.c:2668:52: warning: incorrect type in assignment (different base types)
drivers/net/vmxnet3/vmxnet3_drv.c:2668:52: expected restricted __le64 [usertype] confPA
drivers/net/vmxnet3/vmxnet3_drv.c:2668:52: got restricted __le32 [usertype] <noident>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Thu, 21 Oct 2010 18:05:33 +0000 (18:05 +0000)]
vmxnet3: annotate hwaddr members as __iomem pointers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Thu, 21 Oct 2010 18:05:32 +0000 (18:05 +0000)]
vmxnet3: remove set_flag_le{16,64} helpers
It's easier to just annotate the constants as little endian types and set/clear
the flags directly.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Shreyas Bhatewara <sbhatewara@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
cxgb4: fix crash due to manipulating queues before registration
Before commit "net: allocate tx queues in register_netdevice"
netif_tx_stop_all_queues and related functions could be used between
device allocation and registration but now only after registration.
cxgb4 has such a call before registration and crashes now. Move it
after register_netdev.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add driver for Technologic Systems TS-CAN1 PC104 peripheral boards.
Signed-off-by: Andre B. Oliveira <anbadeol@gmail.com> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Reviewed-by: Wolfram Sang <w.sang@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Anders Franzen [Tue, 19 Oct 2010 03:50:47 +0000 (03:50 +0000)]
ip6_tunnel dont update the mtu on the route.
The ip6_tunnel device did not unset the flag,
IFF_XMIT_DST_RELEASE. This will make the dev layer
to release the dst before calling the tunnel.
The tunnel will not update any mtu/pmtu info, since
it does not have a dst on the skb. Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Mon, 18 Oct 2010 12:14:44 +0000 (12:14 +0000)]
pktgen: clean up handling of local/transient counter vars
The temporary variable "i" is needlessly initialized to zero
in two distinct cases in this file:
1) where it is set to zero and then used as an argument in an addition
before being assigned a non-zero value.
2) where it is only used in a standard/typical loop counter
For (1), simply delete assignment to zero and usages while still
zero; for (2) simply make the loop start at zero as per standard
practice as seen everywhere else in the same file.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following functions are not used directly by any drivers:
phy_attach_direct
phy_device_create
phy_prepare_link
genphy_config_advert
genphy_setup_forced
phy_config_interrupt
phy_clear_interrypt
phy_sanitize_settings
phy_enable_interrupts
phy_disable_interrupts
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Also moved the refcound inlines from l2tp_core.h to l2tp_core.c
since only used in that one file.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
...by adding the correct annotation "__devinit" to the driver's probe
function.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
can: at91_can: fix compiler warning in at91_irq_err_state
This patch fixes the following compiler warning:
drivers/net/can/at91_can.c: In function 'at91_irq_err_state':
drivers/net/can/at91_can.c:779: warning: 'reg_ier' may be used uninitialized in this function
drivers/net/can/at91_can.c:779: warning: 'reg_idr' may be used uninitialized in this function
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The priv is part of the memory allocated by alloc_candev().
This patch moved the free it after last usage of priv.
While there convert all free_netdev() to free_candev() (which is currently
just a wrapper around free_netdev()) to be symetrically with the allocation
via alloc_candev().
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The AT91_MID_MIDE bit must be set in order to receive extended frames.
The reception of an extended frame sets this bit, while reception of
standard frames resets it. This results in some lost extended frames in
an extended ID only environment. But leads to unpredictable lost
extended ID frames in a mixed environment.
The problem is fixed by setting the AT91_MID_MIDE after reception of a
CAN frame.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Masayuki Ohtake [Fri, 15 Oct 2010 03:00:28 +0000 (03:00 +0000)]
can: Topcliff: Add PCH_CAN driver.
CAN driver of Topcliff PCH
Topcliff PCH is the platform controller hub that is going to be used in
Intel's upcoming general embedded platform. All IO peripherals in
Topcliff PCH are actually devices sitting on AMBA bus.
Topcliff PCH has CAN I/F. This driver enables CAN function.
Signed-off-by: Masayuki Ohtake <masa-korg@dsn.okisemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tejun Heo [Thu, 14 Oct 2010 23:55:22 +0000 (23:55 +0000)]
connector: remove lazy workqueue creation
Commit 1a5645bc (connector: create connector workqueue only while
needed once) implements lazy workqueue creation for connector
workqueue. With cmwq now in place, lazy workqueue creation doesn't
make much sense while adding a lot of complexity. Remove it and
allocate an ordered workqueue during initialization.
This also removes a call to flush_scheduled_work() which is deprecated
and scheduled to be removed.
Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 24 Oct 2010 20:06:57 +0000 (13:06 -0700)]
Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac
* 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits)
i7300_edac: Properly initialize per-csrow memory size
V4L/DVB: i7300_edac: better initialize page counts
MAINTAINERS: Add maintainer for i7300-edac driver
i7300-edac: CodingStyle cleanup
i7300_edac: Improve comments
i7300_edac: Cleanup: reorganize the file contents
i7300_edac: Properly detect channel on CE errors
i7300_edac: enrich FBD error info for corrected errors
i7300_edac: enrich FBD error info for fatal errors
i7300_edac: pre-allocate a buffer used to prepare err messages
i7300_edac: Fix MTR x4/x8 detection logic
i7300_edac: Make the debug messages coherent with the others
i7300_edac: Cleanup: remove get_error_info logic
i7300_edac: Add a code to cleanup error registers
i7300_edac: Add support for reporting FBD errors
i7300_edac: Properly detect the type of error correction
i7300_edac: Detect if the device is on single mode
i7300_edac: Adds detection for enhanced scrub mode on x8
i7300_edac: Clear the error bit after reading
i7300_edac: Add error detection code for global errors
...
Linus Torvalds [Sun, 24 Oct 2010 19:47:55 +0000 (12:47 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (27 commits)
SLUB: Fix memory hotplug with !NUMA
slub: Move functions to reduce #ifdefs
slub: Enable sysfs support for !CONFIG_SLUB_DEBUG
SLUB: Optimize slab_free() debug check
slub: Move NUMA-related functions under CONFIG_NUMA
slub: Add lock release annotation
slub: Fix signedness warnings
slub: extract common code to remove objects from partial list without locking
SLUB: Pass active and inactive redzone flags instead of boolean to debug functions
slub: reduce differences between SMP and NUMA
Revert "Slub: UP bandaid"
percpu: clear memory allocated with the km allocator
percpu: use percpu allocator on UP too
percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
vmalloc: pcpu_get/free_vm_areas() aren't needed on UP
SLUB: Fix merged slab cache names
Slub: UP bandaid
slub: fix SLUB_RESILIENCY_TEST for dynamic kmalloc caches
slub: Fix up missing kmalloc_cache -> kmem_cache_node case for memoryhotplug
slub: Add dummy functions for the !SLUB_DEBUG case
...
Linus Torvalds [Sun, 24 Oct 2010 19:47:25 +0000 (12:47 -0700)]
Merge branch 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
KVM: Fix signature of kvm_iommu_map_pages stub
KVM: MCE: Send SRAR SIGBUS directly
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
KVM: fix typo in copyright notice
KVM: Disable interrupts around get_kernel_ns()
KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
KVM: MMU: move access code parsing to FNAME(walk_addr) function
KVM: MMU: audit: check whether have unsync sps after root sync
KVM: MMU: audit: introduce audit_printk to cleanup audit code
KVM: MMU: audit: unregister audit tracepoints before module unloaded
KVM: MMU: audit: fix vcpu's spte walking
KVM: MMU: set access bit for direct mapping
KVM: MMU: cleanup for error mask set while walk guest page table
KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
KVM: x86: Fix constant type in kvm_get_time_scale
KVM: VMX: Add AX to list of registers clobbered by guest switch
KVM guest: Move a printk that's using the clock before it's ready
KVM: x86: TSC catchup mode
...
Linus Torvalds [Sun, 24 Oct 2010 19:46:24 +0000 (12:46 -0700)]
Merge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-viapro: Don't log nacks
i2c/pca954x: Remove __devinit and __devexit from probe and remove functions
MAINTAINERS: Add maintainer for PCA9541 I2C bus master selector driver
i2c/mux: Driver for PCA9541 I2C Master Selector
i2c: Optimize function i2c_detect()
i2c: Discard warning message on device instantiation from user-space
i2c-amd8111: Add proper error handling
i2c: Change to new flag variable
i2c: Remove unneeded inclusions of <linux/i2c-id.h>
i2c: Let i2c_parent_is_i2c_adapter return the parent adapter
i2c: Simplify i2c_parent_is_i2c_adapter
i2c-pca-platform: Change device name of request_irq
i2c: Fix Kconfig dependencies
Linus Torvalds [Sun, 24 Oct 2010 19:44:59 +0000 (12:44 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (47 commits)
HID: fix mismerge in hid-lg
HID: hidraw: fix window in hidraw_release
HID: hid-sony: override usbhid_output_raw_report for Sixaxis
HID: add absolute axis resolution calculation
HID: force feedback support for Logitech RumblePad gamepad
HID: support STmicroelectronics and Sitronix with hid-stantuml driver
HID: magicmouse: Adjust major / minor axes to scale
HID: Fix for problems with eGalax/DWAV multi-touch-screen
HID: waltop: add support for Waltop Slim Tablet 12.1 inch
HID: add NOGET quirk for AXIS 295 Video Surveillance Joystick
HID: usbhid: remove unused hiddev_driver
HID: magicmouse: Use hid-input parsing rather than bypassing it
HID: trivial formatting fix
HID: Add support for Logitech Speed Force Wireless gaming wheel
HID: don't Send Feature Reports on Interrupt Endpoint
HID: 3m: Adjust major / minor axes to scale
HID: 3m: Correct touchscreen emulation
HID: 3m: Convert to MT slots
HID: 3m: Output proper orientation range
HID: 3m: Adjust to sequential MT HID protocol
...
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Makefile - replace the use of <module>-objs with <module>-y
crypto: hifn_795x - use cancel_delayed_work_sync()
crypto: talitos - sparse check endian fixes
crypto: talitos - fix checkpatch warning
crypto: talitos - fix warning: 'alg' may be used uninitialized in this function
crypto: cryptd - Adding the AEAD interface type support to cryptd
crypto: n2_crypto - Niagara2 driver needs to depend upon CRYPTO_DES
crypto: Kconfig - update broken web addresses
crypto: omap-sham - Adjust DMA parameters
crypto: fips - FIPS requires algorithm self-tests
crypto: omap-aes - OMAP2/3 AES hw accelerator driver
crypto: updates to enable omap aes
padata: add missing __percpu markup in include/linux/padata.h
MAINTAINERS: Add maintainer entries for padata/pcrypt
Jean Delvare [Sun, 24 Oct 2010 16:16:59 +0000 (18:16 +0200)]
i2c-viapro: Don't log nacks
Transactions not acked can happen every now and then, in particular
during device detection, and various transaction types can be used for
this purpose. So stop logging this event, except when debugging is
enabled. This is what other similar drivers (e.g. i2c-i801 or
i2c-piix4) do.
Guenter Roeck [Sun, 24 Oct 2010 16:16:59 +0000 (18:16 +0200)]
i2c/pca954x: Remove __devinit and __devexit from probe and remove functions
The underlying I2C adapter may or may not be present when this driver
gets initialized, and may disappear later, so there is no safe time at
which the probe and remove functions can be discarded.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Guenter Roeck [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c/mux: Driver for PCA9541 I2C Master Selector
This patch adds support for PCA9541, an I2C Bus Master Selector.
The driver is modeled as single channel I2C Multiplexer to be able to utilize
the I2C multiplexer framework.
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com> Reviewed-by: Tom Grennan <tom.grennan@ericsson.com> Acked-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Discard warning message on device instantiation from user-space
The "new_device" sysfs interface has been there for quite some time
now, nobody complained about it so it must be good enough. Time to
remove the warning and call it stable.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Michael Lawnick <ml.lawnick@gmx.de>
Julia Lawall [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c-amd8111: Add proper error handling
The functions the functions amd_ec_wait_write and amd_ec_wait_read have an
unsigned return type, but return a negative constant to indicate an error
condition.
A sematic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
Fixing amd_ec_wait_write and amd_ec_wait_read leads to the need to adjust
the return type of the functions amd_ec_write and amd_ec_read, which are
the only functions that call amd_ec_wait_write and amd_ec_wait_read.
amd_ec_write and amd_ec_read, in turn, are only called from within the
function amd8111_access, which already returns a signed typed value. Each
of the calls to amd_ec_write and amd_ec_read are updated using the
following semantic patch:
// <smpl>
@@
@@
+ status = amd_ec_write
- amd_ec_write
(...);
+ if (status) return status;
@@
@@
+ status = amd_ec_read
- amd_ec_read
(...);
+ if (status) return status;
// </smpl>
The patch also adds the declaration of the status variable.
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 24 Oct 2010 16:16:58 +0000 (18:16 +0200)]
i2c: Remove unneeded inclusions of <linux/i2c-id.h>
These drivers don't use anything which is defined in <linux/i2c-id.h>.
This header file was never meant to be included directly anyway, and
will be deleted soon.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Ben Dooks <ben-linux@fluff.org> Acked-by: Dave Airlie <airlied@linux.ie> Cc: Hans Verkuil <hverkuil@xs4all.nl>
Jean Delvare [Sun, 24 Oct 2010 16:16:57 +0000 (18:16 +0200)]
i2c: Fix Kconfig dependencies
drivers/i2c/algos/Kconfig makes all the algorithms dependent on
!I2C_HELPER_AUTO, which triggers a Kconfig warning about broken
dependencies when some driver selects one of the algorithms. Ideally
we would make only the prompts dependent on !I2C_HELPER_AUTO, however
Kconfig doesn't currently support that. So we have to redefine the
symbols separately for the I2C_HELPER_AUTO=y case.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Michal Marek <mmarek@suse.cz>
Jan Kiszka [Mon, 18 Oct 2010 13:38:40 +0000 (15:38 +0200)]
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
We also have to call kvm_iommu_map_pages for CONFIG_AMD_IOMMU. So drop
the dependency on Intel IOMMU, kvm_iommu_map_pages will be a nop anyway
if CONFIG_IOMMU_API is not defined.
KVM-Stable-Tag. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Fri, 8 Oct 2010 08:24:15 +0000 (16:24 +0800)]
KVM: MCE: Send SRAR SIGBUS directly
Originally, SRAR SIGBUS is sent to QEMU-KVM via touching the poisoned
page. But commit 96054569190bdec375fe824e48ca1f4e3b53dd36 prevents the
signal from being sent. So now the signal is sent via
force_sig_info_fault directly.
[marcelo: use send_sig_info instead]
Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Huang Ying [Fri, 8 Oct 2010 08:24:14 +0000 (16:24 +0800)]
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
Now we have MCG_SER_P (and corresponding SRAO/SRAR MCE) support in
kernel and QEMU-KVM, the MCG_SER_P should be added into
KVM_MCE_CAP_SUPPORTED to make all these code really works.
Reported-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Avi Kivity [Mon, 4 Oct 2010 10:55:49 +0000 (12:55 +0200)]
KVM: Disable interrupts around get_kernel_ns()
get_kernel_ns() wants preemption disabled. It doesn't make a lot of sense
during the get/set ioctls (no way to make them non-racy) but the callee wants
it.
Jan Kiszka [Sun, 26 Sep 2010 11:00:53 +0000 (13:00 +0200)]
KVM: x86: Fix constant type in kvm_get_time_scale
Older gcc versions complain about the improper type (for x86-32), 4.5
seems to fix this silently. However, we should better use the right type
initially.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Negate the effects of AN TYM spell while kvm thread is preempted by tracking
conversion factor to the highest TSC rate and catching the TSC up when it has
fallen behind the kernel view of time. Note that once triggered, we don't
turn off catchup mode.
A slightly more clever version of this is possible, which only does catchup
when TSC rate drops, and which specifically targets only CPUs with broken
TSC, but since these all are considered unstable_tsc(), this patch covers
all necessary cases.
The math in kvm_get_time_scale relies on the fact that
NSEC_PER_SEC < 2^32. To use the same function to compute
arbitrary time scales, we must extend the first reduction
step to shrink the base rate to a 32-bit value, and
possibly reduce the scaled rate into a 32-bit as well.
Note we must take care to avoid an arithmetic overflow
when scaling up the tps32 value (this could not happen
with the fixed scaled value of NSEC_PER_SEC, but can
happen with scaled rates above 2^31.
Avi Kivity [Tue, 21 Sep 2010 17:59:44 +0000 (19:59 +0200)]
KVM: cpu_relax() during spin waiting for reboot
It doesn't really matter, but if we spin, we should spin in a more relaxed
manner. This way, if something goes wrong at least it won't contribute to
global warming.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Mohammed Gamal [Sun, 19 Sep 2010 12:34:06 +0000 (14:34 +0200)]
KVM: Add kvm_inject_realmode_interrupt() wrapper
This adds a wrapper function kvm_inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.
[avi: initialize operand and address sizes before emulating interrupts]
[avi: initialize rip for real mode interrupt injection]
[avi: clear interrupt pending flag after emulating interrupt injection]
Signed-off-by: Mohammed Gamal <m.gamal005@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
KVM: SVM: do not generate "external interrupt exit" if other exit is pending
Nested SVM checks for external interrupt after injecting nested exception.
In case there is external interrupt pending the code generates "external
interrupt exit" and overwrites previous exit info. If previously injected
exception already generated exit it will be lost.
Avi Kivity [Sun, 19 Sep 2010 16:44:07 +0000 (18:44 +0200)]
KVM: Convert PIC lock from raw spinlock to ordinary spinlock
The PIC code used to be called from preempt_disable() context, which
wasn't very good for PREEMPT_RT. That is no longer the case, so move
back from raw_spinlock_t to spinlock_t.
Signed-off-by: Avi Kivity <avi@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If preempted after kvmclock values are updated, but before hardware
virtualization is entered, the last tsc time as read by the guest is
never set. It underflows the next time kvmclock is updated if there
has not yet been a successful entry / exit into hardware virt.
Fix this by simply setting last_tsc to the newly read tsc value so
that any computed nsec advance of kvmclock is nulled.
Avi Kivity [Tue, 20 Jul 2010 12:06:17 +0000 (15:06 +0300)]
KVM: Non-atomic interrupt injection
Change the interrupt injection code to work from preemptible, interrupts
enabled context. This works by adding a ->cancel_injection() operation
that undoes an injection in case we were not able to actually enter the guest
(this condition could never happen with atomic injection).
Avi Kivity [Tue, 20 Jul 2010 11:43:23 +0000 (14:43 +0300)]
KVM: VMX: Parameterize vmx_complete_interrupts() for both exit and entry
Currently vmx_complete_interrupts() can decode event information from vmx
exit fields into the generic kvm event queues. Make it able to decode
the information from the entry fields as well by parametrizing it.
Avi Kivity [Tue, 20 Jul 2010 11:31:20 +0000 (14:31 +0300)]
KVM: VMX: Split up vmx_complete_interrupts()
vmx_complete_interrupts() does too much, split it up:
- vmx_vcpu_run() gets the "cache important vmcs fields" part
- a new vmx_complete_atomic_exit() gets the parts that must be done atomically
- a new vmx_recover_nmi_blocking() does what its name says
- vmx_complete_interrupts() retains the event injection recovery code
This helps in reducing the work done in atomic context.
Avi Kivity [Tue, 27 Jul 2010 09:30:24 +0000 (12:30 +0300)]
KVM: Check for pending events before attempting injection
Instead of blindly attempting to inject an event before each guest entry,
check for a possible event first in vcpu->requests. Sites that can trigger
event injection are modified to set KVM_REQ_EVENT:
- interrupt, nmi window opening
- ppr updates
- i8259 output changes
- local apic irr changes
- rflags updates
- gif flag set
- event set on exit
This improves non-injecting entry performance, and sets the stage for
non-atomic injection.
Avi Kivity [Mon, 13 Sep 2010 14:45:28 +0000 (16:45 +0200)]
KVM: MMU: Fix regression with ept memory types merged into non-ept page tables
Commit "KVM: MMU: Make tdp_enabled a mmu-context parameter" made real-mode
set ->direct_map, and changed the code that merges in the memory type depend
on direct_map instead of tdp_enabled. However, in this case what really
matters is tdp, not direct_map, since tdp changes the pte format regardless
of whether the mapping is direct or not.
As a result, real-mode shadow mappings got corrupted with ept memory types.
The result was a huge slowdown, likely due to the cache being disabled.
Change it back as the simplest fix for the regression (real fix is to move
all that to vmx code, and not use tdp_enabled as a synonym for ept).