Benoît Canet [Wed, 2 Oct 2013 12:33:48 +0000 (14:33 +0200)]
block: Add BlockDriver.bdrv_check_ext_snapshot.
This field is used by blkverify to disable external snapshots creation.
It will also be used by block filters like quorum to disable external
snapshot creation.
Signed-off-by: Benoit Canet <benoit@irqsave.net> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:51:06 +0000 (10:51 +0200)]
qcow2: Assert against snapshot name/ID overflow
qcow2_write_snapshots relies on the length of every snapshot ID and name
fitting into an unsigned 16 bit integer. This is currently ensured by
QEMU through generally only allowing 128 byte IDs and 256 byte names.
However, if this should change in the future, the length written to the
image file should not be silently truncated (though the name itself
would be written completely).
Since this is currently not an issue but might require attention due to
internal QEMU changes in the future, an assert ensuring sanity is enough
for now.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:42:56 +0000 (10:42 +0200)]
qcow2: Use pread for inactive L1 in overlap check
Currently, qcow2_check_metadata_overlap uses bdrv_read to read inactive
L1 tables from disk. The number of sectors to read is calculated through
a truncating integer division, therefore, if the L1 table size is not a
multiple of the sector size, the final entries will not be read and
their entries in memory remain undefined (from the g_malloc).
Using bdrv_pread fixes this.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:34:10 +0000 (10:34 +0200)]
qcow2: Alignment of snapshot table entries
The qcow2 specification does not explicitly state so far that every
snapshot table entry is aligned to 8 bytes. QEMU, in contrast, does this
alignment, thus it should be properly documented (which this patch
does).
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:46:20 +0000 (10:46 +0200)]
qemu-iotests: Additional info from qemu-img info
Add a test for the additional information now provided by qemu-img info
when used on qcow2 images. It also tests the qemu QMP output from the
query-block command when running qemu with different runtime options
than specified in the image (ImageInfoSpecific should always refer to
the image).
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:46:19 +0000 (10:46 +0200)]
qemu-iotests: Discard specific info in _img_info
In _img_info, filter out additional information specific to the image
format provided by qemu-img info, since tests designed for multiple
image formats would produce different outputs for every image format
otherwise.
In a human-readable dump, that new information will always be last for
each "image information block" (multiple blocks are emitted when
inspecting the backing file chain). Every block is separated by an empty
line. Therefore, in this case, everything starting with the line "Format
specific information:" up to that empty line (or EOF, if it is the last
block) has to be stripped.
The JSON dump will always emit pretty JSON data. Therefore, the opening
and closing braces of every object will be on lines which are indented
by exactly the same amount, and all lines in between will have more
indentation. Thus, in this case, everything starting with a line
matching the regular expression /^ *"format-specific": {/ until /^ *},?/
has to be stripped, where the number of spaces at the beginning of the
respective lines is equal.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:46:18 +0000 (10:46 +0200)]
qcow2: Add support for ImageInfoSpecific
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).
Also, add qcow2_get_specific_info, which returns this information.
Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Max Reitz [Wed, 9 Oct 2013 08:46:17 +0000 (10:46 +0200)]
block/qapi: Human-readable ImageInfoSpecific dump
Add a function for generically dumping the ImageInfoSpecific information
in a human-readable format to block/qapi.c.
Use this function in bdrv_image_info_dump and qemu-io-cmds.c:info_f to
allow qemu-img info resp. qemu-io -c info to print that format specific
information.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fam Zheng [Tue, 8 Oct 2013 09:29:38 +0000 (17:29 +0800)]
blockjob: rename BlockJobType to BlockJobDriver
We will use BlockJobType as the enum type name of block jobs in QAPI,
rename current BlockJobType to BlockJobDriver, which will eventually
become a set of operations, similar to block drivers.
Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
exec: Add both big- and little-endian memory helpers
Step three in the transition: helpers not tied to the target
"default" endianness. To be used when the guest uses a memory
operation with non-default endianness.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Anthony Liguori [Thu, 10 Oct 2013 20:16:25 +0000 (13:16 -0700)]
Merge remote-tracking branch 'afaerber/tags/qom-cpu-for-anthony' into staging
QOM CPUState refactorings / X86CPU
* Fix for X86CPU model field of qemu32/qemu64 CPU models
* Bug fix for longjmp on FreeBSD
* Removal of unused function
* Confinement of clone syscall infrastructure to linux-user
# gpg: Signature made Wed 09 Oct 2013 03:40:51 AM PDT using RSA key ID 3E7E013F
# gpg: Can't check signature: public key not found
# By Andreas Färber (2) and others
# Via Andreas Färber
* afaerber/tags/qom-cpu-for-anthony:
cpu: Drop cpu_model_str from CPU_COMMON
cpu: Move cpu_copy() into linux-user
cputlb: Remove dead function tlb_update_dirty()
cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
target-i386: Set model=6 on qemu64 & qemu32 CPU models
Anthony Liguori [Thu, 10 Oct 2013 20:16:02 +0000 (13:16 -0700)]
Merge remote-tracking branch 'amit/char-remove-watch-on-unplug' into staging
# By Amit Shah
# Via Amit Shah
* amit/char-remove-watch-on-unplug:
char: remove watch callback on chardev detach from frontend
char: use common function to disable callbacks on chardev close
char: move backends' io watch tag to CharDriverState
Message-id: 20131004154802.GA25646@grmbl.mre Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Mark Wu [Wed, 9 Oct 2013 02:37:26 +0000 (10:37 +0800)]
qemu-ga: Extend 'guest-info' command to expose flag 'success-response'
Now we have several qemu-ga commands not returning response on success.
It has been documented in qga/qapi-schema.json already. This patch exposes
the 'success-response' flag by extending 'guest-info' command. With this
change, the clients can handle the command response more flexibly.
Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
*fixed up commit subject Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Mark Wu [Wed, 9 Oct 2013 03:25:07 +0000 (11:25 +0800)]
qemu-ga: Add interface to traverse the qmp command list by QmpCommand
In the original code, qmp_get_command_list is used to construct
a list of all commands' name. To get the information of all qga
commands, it traverses the name list and search the command info
with its name. So it can cause O(n^2) in the number of commands.
This patch adds an interface to traverse the qmp command list by
QmpCommand to replace qmp_get_command_list. It can decrease the
complexity from O(n^2) to O(n).
Signed-off-by: Mark Wu <wudxw@linux.vnet.ibm.com> Reviewed-by: Eric Blake <eblake@redhat.com>
*fix up commit subject Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
One call inside of a loop to tcg_register_helper instead of hundreds
of sequential calls.
Presumably more icache and branch prediction friendly; resulting binary
size mostly unchanged on x86_64, as we're trading 32-bit rip-relative
references in .text for full 64-bit pointers in .rodata.
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg: Remove stray semi-colons from target-*/helper.h
During GEN_HELPER=1, these are actually stray top-level semi-colons
which are technically invalid ISO C, but GCC accepts as an extension.
If we added enough __extension__ markers that we could dare use
-Wpedantic, we'd see
warning: ISO C does not allow extra ‘;’ outside of a function
This will become a hard error in the next patch, wherein those ; will
appear in the middle of a data structure.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
Anthony Liguori [Thu, 10 Oct 2013 17:03:38 +0000 (10:03 -0700)]
Merge remote-tracking branch 'sstabellini/xen-2013-10-10' into staging
# By Matthew Daley (1) and Roger Pau Monné (1)
# Via Stefano Stabellini
* sstabellini/xen-2013-10-10:
qemu/xen: make use of xenstore relative paths
xen_disk: mark ioreq as mapped before unmapping in error case
Anthony Liguori [Thu, 10 Oct 2013 17:03:00 +0000 (10:03 -0700)]
Merge remote-tracking branch 'bonzini/scsi-next' into staging
# By Asias He (1) and Peter Lieven (1)
# Via Paolo Bonzini
* bonzini/scsi-next:
scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
block/iscsi: reenable iscsi_co_get_block_status
Message-id: 1381332391-8781-1-git-send-email-pbonzini@redhat.com Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Roger Pau Monné [Thu, 10 Oct 2013 14:25:52 +0000 (14:25 +0000)]
qemu/xen: make use of xenstore relative paths
Qemu has several hardcoded xenstore paths that are only valid on Dom0.
Attempts to launch a Qemu instance (to act as a userspace backend for
PV disks) will fail because Qemu is not able to access those paths
when running on a domain different than Dom0.
Instead make the xenstore paths relative to the domain where Qemu is
actually running.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: Anthony PERARD <anthony.perard@citrix.com>
Matthew Daley [Thu, 10 Oct 2013 14:10:48 +0000 (14:10 +0000)]
xen_disk: mark ioreq as mapped before unmapping in error case
Commit 4472beae modified the semantics of ioreq_{un,}map so that they are
idempotent if called when they're not needed (ie., twice in a row). However,
it neglected to handle the case where batch mapping is not being used (the
default), and one of the grants fails to map. In this case, ioreq_unmap will
be called to unwind and unmap any mappings already performed, but ioreq_unmap
simply returns due to the aforementioned change (the ioreq has not already
been marked as mapped).
The frontend user can therefore force xen_disk to leak grant mappings, a
per-domain limited resource.
Fix by marking the ioreq as mapped before calling ioreq_unmap in this
situation.
Signed-off-by: Matthew Daley <mattjd@gmail.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
r->buf is hardcoded to 2056 which is (256 + 1) * 8, allowing 256 luns at
most. If more than 256 luns are specified by user, we have buffer
overflow in scsi_target_emulate_report_luns.
To fix, we allocate the buffer dynamically.
Signed-off-by: Asias He <asias@redhat.com> Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Anthony Liguori [Wed, 9 Oct 2013 14:54:42 +0000 (07:54 -0700)]
Merge remote-tracking branch 'stefanha/block' into staging
# By Max Reitz (5) and others
# Via Stefan Hajnoczi
* stefanha/block:
block: use correct filename
qemu-iotests: Correct 026 output
qcow2: Free allocated L2 cluster on error
qcow2: Switch L1 table in a single sequence
block: vhdx - add migration blocker
block: use correct filename for error report
qcow2: CHECK_OFLAG_COPIED is obsolete
qcow2: Correct endianness in overlap check
Message-id: 1381145289-6591-1-git-send-email-stefanha@redhat.com Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Anthony Liguori [Wed, 9 Oct 2013 14:54:20 +0000 (07:54 -0700)]
Merge remote-tracking branch 'mjt/trivial-patches' into staging
# By Stefan Weil (5) and others
# Via Michael Tokarev
* mjt/trivial-patches:
migration: Fix compiler warning ('caps' may be used uninitialized)
util/path: Fix type which is longer than 8 bit for MinGW
hw/9pfs: Fix errno value for xattr functions
vl: Clean up unnecessary boot_order complications
qemu-char: Fix potential out of bounds access to local arrays
pci-ohci: Add missing 'break' in ohci_service_td
sh4: Fix serial line access for Linux kernels later than 3.2
hw/alpha: Fix compiler warning (integer constant is too large)
target-i386: Fix compiler warning (integer constant is too large)
block: Remove unused assignment (fixes warning from clang)
exec: cleanup DEBUG_SUBPAGE
tests: Fix schema parser test for in-tree build
tests: Update .gitignore for test-int128 and test-bitops
.gitignore: ignore tests/qemu-iotests/socket_scm_helper
Message-id: 1381051979-25742-1-git-send-email-mjt@msgid.tls.msk.ru Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Anthony Liguori [Wed, 9 Oct 2013 14:52:57 +0000 (07:52 -0700)]
Merge remote-tracking branch 'rth/tcg-arm-pull' into staging
# By Richard Henderson
# Via Richard Henderson
* rth/tcg-arm-pull:
tcg-arm: Move the tlb addend load earlier
tcg-arm: Remove restriction on qemu_ld output register
tcg-arm: Return register containing tlb addend
tcg-arm: Move load of tlb addend into tcg_out_tlb_read
tcg-arm: Use QEMU_BUILD_BUG_ON to verify constraints on tlb
tcg-arm: Use strd for tcg_out_arg_reg64
tcg-arm: Rearrange slow-path qemu_ld/st
tcg-arm: Use ldrd/strd for appropriate qemu_ld/st64
Message-id: 1380663109-14434-1-git-send-email-rth@twiddle.net Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Anthony Liguori [Wed, 9 Oct 2013 14:52:21 +0000 (07:52 -0700)]
Merge remote-tracking branch 'sweil/mingw' into staging
# By Sebastian Ottlik
# Via Stefan Weil
* sweil/mingw:
util: call socket_set_fast_reuse instead of setting SO_REUSEADDR
slirp: call socket_set_fast_reuse instead of setting SO_REUSEADDR
net: call socket_set_fast_reuse instead of setting SO_REUSEADDR
gdbstub: call socket_set_fast_reuse instead of setting SO_REUSEADDR
util: add socket_set_fast_reuse function which will replace setting SO_REUSEADDR
Message-id: 1380735690-24009-1-git-send-email-sw@weilnetz.de Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Anthony Liguori [Wed, 9 Oct 2013 14:50:37 +0000 (07:50 -0700)]
Merge remote-tracking branch 'sweil/tci' into staging
# By Stefan Weil
# Via Stefan Weil
* sweil/tci:
misc: Use new rotate functions
bitops: Add rotate functions (rol8, ror8, ...)
tci: Add implementation of rotl_i64, rotr_i64
Message-id: 1380137693-3729-1-git-send-email-sw@weilnetz.de Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
Peter Lieven [Wed, 2 Oct 2013 11:52:08 +0000 (13:52 +0200)]
block/iscsi: reenable iscsi_co_get_block_status
Commit f35c934a accidently disabled iscsi_co_get_block_status for all
libiscsi versions. Its not possible to check for enumeration constants
in the C preprocessor. This patch changes the check to the preprocessor
constant LIBISCSI_FEATURE_IOVECTOR which was introduced shortly after
get_lba_status support was added to libiscsi.
Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Andreas Färber [Tue, 2 Jul 2013 15:43:21 +0000 (17:43 +0200)]
cpu: Move cpu_copy() into linux-user
It is only used there and is deemed very fragile if not incorrect in its
current memcpy() form. Moving it into linux-user will allow to move
parts into target_cpu.h headers and only copy what the ABI mandates.
Juergen Lock [Thu, 3 Oct 2013 14:09:37 +0000 (16:09 +0200)]
cpu-exec: Also reload CPUClass *cc after longjmp return in cpu_exec()
Local variable CPUClass *cc needs to be reloaded after return from longjmp,
too. (This fixes a mips-softmmu crash observed on FreeBSD when QEMU is
built with clang.)
Stefan Weil [Wed, 2 Oct 2013 20:40:29 +0000 (22:40 +0200)]
util/path: Fix type which is longer than 8 bit for MinGW
While dirent->d_type is 8 bit for most systems, it is 32 bit for MinGW.
Reducing it to 8 bit results in a compiler warning because the macro
is_dir_maybe compares that 8 bit value with 32 bit constants.
Using 'unsigned' instead of 'unsigned char' matches the declaration for
MinGW and does not harm the other systems.
MinGW-w64 is not affected: it does not declare d_type.
Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If there is no operation driver for the xattr type the
functions return '-1' and set errno to '-EOPNOTSUPP'.
When the calling code sets 'ret = -errno' this turns
into a large positive number.
In Linux 3.11, the kernel has switched to using 9p
version 9p2000.L, instead of 9p2000.u, which enables
support for xattr operations. This on its own is harmless,
but for another change which makes it request the xattr
with a name 'security.capability'.
The result is that the guest sees a succesful return
of 95 bytes of data, instead of a failure with errno
set to 95. Since the kernel expects a maximum of 20
bytes for an xattr return this gets translated to the
unexpected errno ERANGE.
This all means that when running a binary off a 9p fs
in 3.11 kernels you get a fun result of:
# ./date
sh: ./date: Numerical result out of range
The only workaround is to pass 'version=9p2000.u' when
mounting the 9p fs in the guest, to disable all use of
xattrs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Weil [Mon, 30 Sep 2013 21:04:49 +0000 (23:04 +0200)]
qemu-char: Fix potential out of bounds access to local arrays
Latest gcc-4.8 supports a new option -fsanitize=address which activates
an AddressSanitizer. This AddressSanitizer stops the QEMU system emulation
very early because two character arrays of size 8 are potentially written
with 9 bytes.
There is no obvious reason why width or height could need 8 characters,
so reduce it to 7 characters which together with the terminating '\0'
fit into the arrays.
Cc: qemu-stable <qemu-stable@nongnu.org> Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Alex Bennée <alex@bennee.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Sebastian Macke [Thu, 3 Oct 2013 08:16:14 +0000 (16:16 +0800)]
target-openrisc: Removes a non-conforming behavior for the first page of the memory
Where *software* leaves 0x0000 - 0x2000 unmapped, the hardware should
still allow for this area to be mapped.
Signed-off-by: Sebastian Macke <sebastian@macke.de> Signed-off-by: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Reviewed-by: Jia Liu <proljc@gmail.com>
Sebastian Macke [Thu, 3 Oct 2013 08:04:46 +0000 (16:04 +0800)]
target-openrisc: Correct handling of page faults.
The result of (rw & 0) is always zero and therefore a logic false.
The whole comparison will therefore never be executed, it is a obvious bug,
we should use !(rw & 1) here.
Signed-off-by: Sebastian Macke <sebastian@macke.de> Reviewed-by: Jia Liu <proljc@gmail.com>
Alex Williamson [Wed, 2 Oct 2013 19:51:00 +0000 (13:51 -0600)]
vfio-pci: Implement PCI hot reset
Now that VFIO has a PCI hot reset interface, take advantage of it.
There are two modes that we need to consider. The first is when only
one device within the set of devices affected is actually assigned to
the guest. In this case the other devices are are just held by VFIO
for isolation and we can pretend they're not there, doing an entire
bus reset whenever the device reset callback is triggered. Supporting
this case separately allows us to do the best reset we can do of the
device even if the device is hotplugged.
The second mode is when multiple affected devices are all exposed to
the guest. In this case we can only do a hot reset when the entire
system is being reset. However, this also allows us to track which
individual devices are affected by a reset and only do them once.
We split our reset function into pre- and post-reset helper functions
prioritize the types of device resets available to us, and create
separate _one vs _multi reset interfaces to handle the distinct cases
above.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Stefan Weil [Sun, 29 Sep 2013 15:51:20 +0000 (17:51 +0200)]
hw/alpha: Fix compiler warning (integer constant is too large)
From buildbot default_i386_rhel61:
CC alpha-softmmu/hw/alpha/typhoon.o
hw/alpha/typhoon.c: In function 'typhoon_translate_iommu':
hw/alpha/typhoon.c:703: warning: integer constant is too large for 'long' type
hw/alpha/typhoon.c:703: warning: integer constant is too large for 'long' type
Signed-off-by: Stefan Weil <sw@weilnetz.de> Acked-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Weil [Sun, 29 Sep 2013 15:55:56 +0000 (17:55 +0200)]
target-i386: Fix compiler warning (integer constant is too large)
From buildbot default_i386_rhel61:
CC i386-softmmu/target-i386/arch_memory_mapping.o
target-i386/arch_memory_mapping.c: In function 'walk_pde':
target-i386/arch_memory_mapping.c:110: warning:
integer constant is too large for 'long' type
Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit 4f193e3 added the test, but screwed up in-tree builds
(SRCDIR=.): the tests's output overwrites the expected output, and is
thus compared to itself.
Cc: qemu-stable@nongnu.org Reported-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Alex Williamson [Wed, 2 Oct 2013 18:52:38 +0000 (12:52 -0600)]
vfio-pci: Lazy PCI option ROM loading
During vfio-pci initfn, the device is not always in a state where the
option ROM can be read. In the case of graphics cards, there's often
no per function reset, which means we have host driver state affecting
whether the option ROM is usable. Ideally we want to move reading the
option ROM past any co-assigned device resets to the point where the
guest first tries to read the ROM itself.
To accomplish this, we switch the memory region for the option rom to
an I/O region rather than a memory mapped region. This has the side
benefit that we don't waste KVM memory slots for a BAR where we don't
care about performance. This also allows us to delay loading the ROM
from the device until the first read by the guest. We then use the
PCI config space size of the ROM BAR when setting up the BAR through
QEMU PCI.
Another benefit of this approach is that previously when a user set
the ROM to a file using the romfile= option, we still probed VFIO for
the parameters of the ROM, which can result in dmesg errors about an
invalid ROM. We now only probe VFIO to get the ROM contents if the
guest actually tries to read the ROM.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Alex Williamson [Wed, 2 Oct 2013 18:52:38 +0000 (12:52 -0600)]
vfio-pci: Test device reset capabilities
Not all resets are created equal. PM reset is not very reliable,
especially for GPUs, so we might want to opt for a bus reset if a
standard reset will only do a D3hot->D0 transition. We can also
use this to tell if the standard reset will do a bus reset (if
neither has_pm_reset or has_flr is probed, but the device still
supports reset).
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Alex Williamson [Wed, 2 Oct 2013 18:52:38 +0000 (12:52 -0600)]
vfio-pci: Add support for MSI affinity
When MSI is accelerated through KVM the vectors are only programmed
when the guest first enables MSI support. Subsequent writes to the
vector address or data fields are ignored. Unfortunately that means
we're ignore updates done to adjust SMP affinity of the vectors.
MSI SMP affinity already works in non-KVM mode because the address
and data fields are read from their backing store on each interrupt.
This patch stores the MSIMessage programmed into KVM so that we can
determine when changes are made and update the routes.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
util: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
slirp: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
net: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.
An exception to this rule are multicast sockets where it is sensible to have
multiple sockets listen on the same ip and port and we should set SO_REUSEADDR
on windows.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
gdbstub: call socket_set_fast_reuse instead of setting SO_REUSEADDR
SO_REUSEADDR should be avoided on Windows but is desired on other operating
systems. So instead of setting it we call socket_set_fast_reuse that will result
in the appropriate behaviour on all operating systems.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
util: add socket_set_fast_reuse function which will replace setting SO_REUSEADDR
If a socket is closed it remains in TIME_WAIT state for some time. On operating
systems using BSD sockets the endpoint of the socket may not be reused while in
this state unless SO_REUSEADDR was set on the socket. On windows on the other
hand the default behaviour is to allow reuse (i.e. identical to SO_REUSEADDR on
other operating systems) and setting SO_REUSEADDR on a socket allows it to be
bound to a endpoint even if the endpoint is already used by another socket
independently of the other sockets state. This can even result in undefined
behaviour.
Many sockets used by QEMU should not block the use of their endpoint after being
closed while they are still in TIME_WAIT state. Currently QEMU sets SO_REUSEADDR
for such sockets, which can lead to problems on Windows. This patch introduces
the function socket_set_fast_reuse that should be used instead of setting
SO_REUSEADDR when fast socket reuse is desired and behaves correctly on all
operating systems.
As a failure of this function can only be caused by bad QEMU internal errors, an
assertion handles these situations. The return value is still passed on, to
minimize changes in client code and prevent unused variable warnings if NDEBUG
is defined.
Signed-off-by: Sebastian Ottlik <ottlik@fzi.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Weil <sw@weilnetz.de>
target-i386: Set model=6 on qemu64 & qemu32 CPU models
There's no Intel CPU with family=6,model=2, and Linux and Windows guests
disable SEP when seeing that combination due to Pentium Pro erratum #82.
In addition to just having SEP ignored by guests, Skype (and maybe other
applications) runs sysenter directly without passing through ntdll on
Windows, and crashes because Windows ignored the SEP CPUID bit.
So, having model > 2 is a better default on qemu64 and qemu32 for two
reasons: making SEP really available for guests, and avoiding crashing
applications that work on bare metal.
model=3 would fix the problem, but it causes CPU enumeration problems
for Windows guests[1]. So let's set model=6, that matches "Athlon
(PM core)" on AMD and "P2 with on-die L2 cache" on Intel and it allows
Windows to use all CPUs as well as fixing sysenter.
Max Reitz [Mon, 30 Sep 2013 15:57:21 +0000 (17:57 +0200)]
qcow2: Switch L1 table in a single sequence
Switching the L1 table in memory should be an atomic operation, as far
as possible. Calling qcow2_free_clusters on the old L1 table on disk is
not a good idea when the old L1 table is no longer valid and the address
to the new one hasn't yet been written into the corresponding
BDRVQcowState field. To be more specific, this can lead to segfaults due
to qcow2_check_metadata_overlap trying to access the L1 table during the
free operation.
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Max Reitz [Mon, 30 Sep 2013 07:21:07 +0000 (09:21 +0200)]
qcow2: CHECK_OFLAG_COPIED is obsolete
CHECK_OFLAG_COPIED as a parameter to check_refcounts_l1 and
check_refcounts_l2 is obselete now, since the OFLAG_COPIED consistency
check is actually no longer performed by these functions (but by
check_oflag_copied).
Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There are free scheduling slots between the sequence of
comparison instructions. This requires changing the
register in use to avoid conflict with those compares.
Signed-off-by: Richard Henderson <rth@twiddle.net>
tcg-arm: Remove restriction on qemu_ld output register
The main intent of the patch is to allow the tlb addend register
to be changed, without tying that change to the constraint. But
the most common side-effect seems to be to enable usage of ldrd
with the r0,r1 pair.
Signed-off-by: Richard Henderson <rth@twiddle.net>