replace bdrv_{get, put}_buffer with bdrv_{load, save}_vmstate
The VM state offset is a concept internal to the image format. Replace
the old bdrv_{get,put}_buffer method that require an index into the
image file that is constructed from the VM state offset and an offset
into the vmstate with the bdrv_{load,save}_vmstate that just take an
offset into the VM state.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Anthony Liguori [Fri, 10 Jul 2009 19:52:56 +0000 (14:52 -0500)]
bios: Fix multiple calls into smbios_load_ex
We're marking the used entry bitmap in smbios_load_external() for each
type we check, regardless of whether we loaded anything. This makes
subsequent calls behave as if we've already loaded the tables from qemu
and can result in missing tables (ex. multiple type4 entries on an SMP
guest). Only mark the bitmap if we actually load something.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Sat, 27 Jun 2009 07:53:51 +0000 (09:53 +0200)]
gdbstub: x86: Support for setting segment registers
This allows to set segment registers via gdb also in system emulation
mode. Basic sanity checks are applied and nothing is changed if they
fail. But screwing up the target via this interface will never be
complicated, so I avoided being too paranoid here.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Sat, 27 Jun 2009 07:53:51 +0000 (09:53 +0200)]
gdbstub: Add vCont support
This patch adds support for the vCont remote gdb command. It is used by
gdb 6.8 or better to switch the debugging focus for single-stepping
multi-threaded targets, ie. multi-threaded application in user mode
emulation or VCPUs in system emulation.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Wed, 1 Jul 2009 22:19:02 +0000 (00:19 +0200)]
Add boot-once support
This allows to specify an exceptional boot order only for the first
startup of the guest. After reboot, qemu will switch back to the default
order (or what was specified via 'order='). Makes installing from CD
images and then booting the freshly set up harddisk more handy.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Jan Kiszka [Wed, 1 Jul 2009 22:19:02 +0000 (00:19 +0200)]
Rework -boot option
This patch changes the boot command line option to the canonical format
-boot [order=drives][,...]
where 'drives' is using the same format as the old -boot. The format
switch allows to add the 'menu' and 'once' options in later patches. The
old format is still understood and will be processed at least for a
transition time.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Mark McLoughlin [Tue, 7 Jul 2009 11:09:58 +0000 (12:09 +0100)]
Change default PCI class of virtio-console to PCI_CLASS_SERIAL_OTHER
We're using PCI_CLASS_DISPLAY_OTHER now, but qemu-kvm.git is using
PCI_CLASS_OTHERS because:
"As a PCI_CLASS_DISPLAY_OTHER, it reduces primary display somehow on
Windows XP (possibly Windows disables acceleration since it fails
to find a driver)."
While this is valid, many versions of X will get confused by it.
Class major number of 0 gets treated as a possibly prehistoric VGA
device, and then the autoconfig logic gets confused trying to figure
out whether the virtio console or the pv vga device are the real VGA.
We should really set a proper class ID. 0x0780 (serial / other) seems
most appropriate. This shouldn't require any kernel changes, the
modalias for virtio looks like:
alias: pci:v00001AF4d*sv*sd*bc*sc*i*
so won't care what the base class or subclass are.
It shows up in the guest as:
00:05.0 Communication controller: Qumranet, Inc. Virtio console
A new qdev type is introduced to allow devices using the old class
to be created for compatibility with qemu-0.10.x.
Reported-by: Adam Jackson <ajax@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Nathan Froyd [Fri, 5 Jun 2009 02:02:28 +0000 (19:02 -0700)]
gdb-xml: fix hacks in powerpc register numbering
The powerpc xml files contained a hack--an empty, non-existent
register--for getting the register numbers to line up for
newer (XML-aware) and older (non-XML-aware) GDB. While this hack worked
in some cases, it didn't work in all cases, notably when the user used
`finish' or `continue': GDB would attempt to read the non-existent
register and QEMU would complain.
This patch fixes things up properly. Instead of inserting a fake
register, we explicitly declare the floating-point and SPE registers to
start at 71. This action accomplishes the same thing as the nasty hack,
except that now GDB never tries to fetch the non-existant register 70.
Use parameter 'next' to fix the hdecr case.
Also pass 'next' by value instead of pointer (more easy to read and no
performance issue for an always_inline function).
Igor Kovalenko [Sun, 12 Jul 2009 08:35:31 +0000 (12:35 +0400)]
sparc64: trap handling corrections
On Sun, Jul 12, 2009 at 12:09 PM, Blue Swirl<blauwirbel@gmail.com> wrote:
> On 7/12/09, Igor Kovalenko <igor.v.kovalenko@gmail.com> wrote:
>> Good trap handling is required to process interrupts.
>> This patch fixes the following:
>>
>> - sparc64 has no wim register
>> - sparc64 has no psret register, use IE bit of pstate
>> extract IE checking code to cpu_interrupts_enabled
>> - alternate globals are not available if cpu has GL feature
>> in this case bit AG of pstate is constant zero
>> - write to pstate must actually write pstate
>> even if cpu has GL feature
>>
>> Also timer interrupt is handled using do_interrupt.
>
> A bit too much for one patch. Please also remove the code instead of
> commenting out.
I now excluded timer interrupt related part.
To my mind other changes are essentially tied together.
> PUT_PSR for Sparc64 needs CC_OP = CC_OP_FLAGS; like Sparc32.
Igor Kovalenko [Sat, 11 Jul 2009 20:57:03 +0000 (00:57 +0400)]
sparc64: fix helper_st_asi little endian case typo
On Sun, Jul 12, 2009 at 12:43 AM, Stuart Brady<sdbrady@ntlworld.com> wrote:
> On Sat, Jul 11, 2009 at 10:22:18PM +0400, Igor Kovalenko wrote:
>> It is clear that intention is to byte-swap value to be written, not
>> the target address.
>
> @@ -1949,13 +1949,13 @@ void helper_st_asi(target_ulong addr, ta
> case 0x89: // Secondary LE
> switch(size) {
> case 2:
> - addr = bswap16(addr);
> + addr = bswap16(val);
> ^^^^
> Shouldn't that be 'val = bswap16(val)' (and likewise for the 32-bit and
> 64-bit cases)? Also needs a 'signed-off-by:'...
>
> Cheers,
> --
> Stuart Brady
>
Thanks, that part I did not runtime-tested.
Not sure if those asi stores are of any use for user-mode emulator.
Please find attached the corrected version.
Signed-off-by: igor.v.kovalenko@gmail.com
--
Kind regards,
Igor V. Kovalenko
Avi Kivity [Tue, 23 Jun 2009 13:20:36 +0000 (16:20 +0300)]
block: Clean up after deleting BHs
Commit 6a7ad299 ("Call qemu_bh_delete at bdrv_aio_bh_cb") deletes emulated
aio bottom halves to prevent endless accumulation. However, it leaves a
stale ->bh pointer, which is then waited on when the aio is reused.
Zeroing the pointer fixes the issue, allowing vmdk format images to be used.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Glauber Costa [Mon, 6 Jul 2009 13:32:09 +0000 (09:32 -0400)]
flush pending aio requests
When we finish migration, there may be pending async io requests
in flight. If we don't flush it before stage3 starting, it might be
the case that the guest loses it.
Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Clean up msix vector usage state on load. Since guest might have control
over it through the device, the device will have to load this state from
file.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Kevin Wolf [Tue, 7 Jul 2009 16:09:42 +0000 (18:09 +0200)]
qcow2: Fix L1 table memory allocation
Contrary to what one could expect, the size of L1 tables is not cluster
aligned. So as we're writing whole sectors now instead of single entries,
we need to ensure that the L1 table in memory is large enough; otherwise
write would access memory after the end of the L1 table.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Mark McLoughlin [Fri, 3 Jul 2009 08:28:02 +0000 (09:28 +0100)]
Prefer sysfs for USB host devices
Scanning for devices via /sys/bus/usb/devices/ and using them via the
/dev/bus/usb/<bus>/<device> character devices is the prefered method
on modern kernels, so try that first.
When using SELinux and libvirt, qemu will have access to /sys/bus/usb
but not /proc/bus/usb, so although the current code will work just
fine, it will generate SELinux AVC warnings.
See also:
https://bugzilla.redhat.com/508326
Reported-by: Daniel Berrange <berrange@redhat.com> Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a -g flag to the open command and the main qemu-io command line to
allow opening a file growable. This is only allowed for protocols,
mirroring the limitation exposed through bdrv_file_open.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
qemu-io: better input validation for vector-based commands
Fix up a couple of issues with validating the input of the various
length arguments for the vectored I/O commands:
- do the alignment check on each length instead the always 0 count argument
- use a long long varibale for the cvtnum return value so that we can check
wether it wasn't a number
- check for a too large argument instead of truncating it
Also refactor it into a common helper for all four calers and avoid parsing
the numbers twice.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Mark McLoughlin [Wed, 1 Jul 2009 15:46:38 +0000 (16:46 +0100)]
Don't leak VLANClientState on PCI hot remove
destroy_nic() requires that NICInfo::private by a PCIDevice pointer,
but then goes on to require that the same pointer matches
VLANClientState::opaque.
That is no longer the case for virtio-net since qdev and wasn't
previously the case for rtl8139, ne2k_pci or eepro100.
Make the situation a lot more clear by maintaining a VLANClientState
pointer in NICInfo.
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Alexander Graf [Wed, 1 Jul 2009 20:08:21 +0000 (22:08 +0200)]
Replace signrom with shell script v3
In order to not execute code we just compiled, let's replace signrom
with a shell script that does the same thing while staying compatible
to pretty much every system available.
This should make cross-compilation for windows easier.
aliguori: fix build when objdir != srcdir
Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Kevin Wolf [Tue, 30 Jun 2009 11:06:04 +0000 (13:06 +0200)]
qcow2: Make cache=writethrough default
The performance of qcow2 has improved meanwhile, so we don't need to
special-case it any more. Switch the default to write-through caching
like all other block drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Mark McLoughlin [Tue, 30 Jun 2009 09:02:57 +0000 (10:02 +0100)]
net: set a default value for sndbuf=
On reflection, perhaps it does make sense to set a default value for
the sndbuf= tap parameter.
For best effect, sndbuf= should be set to just below the capacity of
the physical NIC.
Setting it higher will cause packets to be dropped before the limit
is hit. Setting it much lower will not cause any problems unless
you set it low enough such that the guest cannot queue up new packets
before the NIC has emptied its queue.
In Linux, txqueuelen=1000 by default for ethernet NICs. Given a 1500
byte MTU, 1Mb is a good choice for sndbuf.
If it turns out that txqueuelen is actually much lower than this, then
sndbuf is essentially disabled. In the event that txqueuelen is much
higher, it's unlikely that the NIC will be able to empty a 1Mb queue.
Thanks to Herbert Xu for this logic.
Signed-off-by: Mark McLoughlin <markmc@redhat.com> Cc: Herbert Xu <herbert.xu@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Huang Ying [Tue, 23 Jun 2009 02:05:14 +0000 (10:05 +0800)]
QEMU: MCE: Add MCE simulation to qemu/tcg
- MCE features are initialized when VCPU is intialized according to CPUID.
- A monitor command "mce" is added to inject a MCE.
- A new interrupt mask: CPU_INTERRUPT_MCE is added to inject the MCE.
aliguori: fix build for linux-user
Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Gerd Hoffmann [Tue, 30 Jun 2009 12:12:11 +0000 (14:12 +0200)]
qdev/pci: misc fixes.
* fix secondary bus setup.
* use base->name instead of "FIXME" for device name.
Yes, the device name is redundant. Only for drivers converted
to qdev already though. Once all drivers are converted we can
and should kill it.
Gerd Hoffmann [Tue, 30 Jun 2009 12:12:09 +0000 (14:12 +0200)]
qdev: remove DeviceType
The only purpose DeviceType serves is creating a linked list of
DeviceInfo structs. This removes DeviceType and add a next field to
DeviceInfo instead, so the DeviceInfo structs can be changed that way.
Elimitates a pointless extra level of indirection.
Gerd Hoffmann [Tue, 30 Jun 2009 12:12:08 +0000 (14:12 +0200)]
qdev: replace bus_type enum with bus_info struct.
BusInfo is filled with name and size (pretty much like I did for
DeviceInfo as well). There is also a function pointer to print
bus-specific device information to the monitor. sysbus is hooked
up there, I've also added a print function for PCI.
Device creation is slightly modified as well: The device type search
loop now also checks the bus type while scanning the list instead of
complaining thereafter in case of a mismatch. This effectively gives
each bus a private namespace for device names.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Paul Brook <paul@codesourcery.com>