git.proxmox.com Git - mirror

blockjob: Allow nested pause

This patch changes block_job_pause to increase the pause counter and
block_job_resume to decrease it.

The counter will allow calling block_job_pause/block_job_resume
unconditionally on a job when we need to suspend the IO temporarily.

From now on, each block_job_resume must be paired with a block_job_pause
to keep the counter balanced.

The user pause from QMP or HMP will only trigger block_job_pause once
until it's resumed, this is achieved by adding a user_paused flag in
BlockJob.

One occurrence of block_job_resume in mirror_complete is replaced with
block_job_enter which does what is necessary.

In block_job_cancel, the cancel flag is good enough to instruct
coroutines to quit loop, so use block_job_enter to replace the unpaired
block_job_resume.

Upon block job IO error, user is notified about the entering to the
pause state, so this pause belongs to user pause, set the flag
accordingly and expect a matching QMP resume.

[Extended doc comments as suggested by Paolo Bonzini
<pbonzini@redhat.com>.
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

MAINTAINERS: Add Fam Zheng as Null block driver maintainer

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/null: Support reopen

Reopen is used in block-commit. With this always-succeed operation, it
is now possible to test committing to a null drive, by specifying
"null-aio://" or "null-co://" as the backing image when creating the
qcow2 image.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/null: Latency simulation by adding new option "latency-ns"

Aio context switch should just work because the requests will be
drained, so the scheduled timer(s) on the old context will be freed.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

scripts: add 'qemu coroutine' command to qemu-gdb.py

The 'qemu coroutine <coroutine-address>' GDB command prints the
backtrace for a CoroutineUContext.  This is useful for peeking inside
yielded coroutines that are waiting for file descriptor events, timers,
etc.

For example:

  $ gdb tests/test-coroutine
  (gdb) b test_yield
  (gdb) r
  (gdb) b qemu_coroutine_enter
  (gdb) c
  (gdb) c
  Continuing.

  Breakpoint 2, qemu_coroutine_enter (co=0x555555c66520, opaque=0x0) at qemu-coroutine.c:103
  103 {
  (gdb) source scripts/qemu-gdb.py
  (gdb) qemu coroutine 0x555555c66520
  #0  0x000055555557a740 in qemu_coroutine_switch (from_=<optimized out>, to_=0x7ffff7f90a70, action=COROUTINE_YIELD) at coroutine-ucontext.c:177
  #1  0x0000555555566af9 in yield_5_times (opaque=0x7fffffffdbb7) at tests/test-coroutine.c:107
  #2  0x000055555557a7aa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80
  #3  0x00007ffff08de000 in __start_context () at /lib64/libc.so.6

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427409754-8556-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

thread-pool: clean up thread_pool_completion_bh()

This patch simplifies thread_pool_completion_bh().

The function first checks elem->state:

  if (elem->state != THREAD_DONE) {
      continue;
  }

It then goes on to check elem->state == THREAD_DONE although we already
know this must be the case.

The QLIST_REMOVE() is duplicated down both branches of an if-else
statement so that can be lifted out as well.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1427992762-10126-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

vhdx: Fix zero-fill iov length

Fix the length of the zero-fill for the back, which was accidentally
using the same value as for the front. This is caught by qemu-iotests
033.

For consistency, change the code for the front as well to use the length
stored in the iov (it is the same value, copied four lines above).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Jeff Cody <jcody@redhat.com>

blkdebug: Add bdrv_truncate()

This is, amongst others, required for qemu-iotests 033 to run as
intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file
when allocating new blocks.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>

qemu-iotests: Some qemu-img convert tests

This adds a regression test for some problems that the qemu-img convert
rewrite just fixed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

qemu-img convert: Rewrite copying logic

The implementation of qemu-img convert is (a) messy, (b) buggy, and
(c) less efficient than possible. The changes required to beat some
sense into it are massive enough that incremental changes would only
make my and the reviewers' life harder. So throw it away and reimplement
it from scratch.

Let me give some examples what I mean by messy, buggy and inefficient:

(a) The copying logic of qemu-img convert has two separate branches for
    compressed and normal target images, which roughly do the same -
    except for a little code that handles actual differences between
    compressed and uncompressed images, and much more code that
    implements just a different set of optimisations and bugs. This is
    unnecessary code duplication, and makes the code for compressed
    output (unsurprisingly) suffer from bitrot.

    The code for uncompressed ouput is run twice to count the the total
    length for the progress bar. In the first run it just takes a
    shortcut and runs only half the loop, and when it's done, it toggles
    a boolean, jumps out of the loop with a backwards goto and starts
    over. Works, but pretty is something different.

(b) Converting while keeping a backing file (-B option) is broken in
    several ways. This includes not writing to the image file if the
    input has zero clusters or data filled with zeros (ignoring that the
    backing file will be visible instead).

    It also doesn't correctly limit every iteration of the copy loop to
    sectors of the same status so that too many sectors may be copied to
    in the target image. For -B this gives an unexpected result, for
    other images it just does more work than necessary.

    Conversion with a compressed target completely ignores any target
    backing file.

(c) qemu-img convert skips reading and writing an area if it knows from
    metadata that copying isn't needed (except for the bug mentioned
    above that ignores a status change in some cases). It does, however,
    read from the source even if it knows that it will read zeros, and
    then search for non-zero bytes in the read buffer, if it's possible
    that a write might be needed.

This reimplementation of the copying core reorganises the code to remove
the duplication and have a much more obvious code flow, by essentially
splitting the copy iteration loop into three parts:

1. Find the number of contiguous sectors of the same status at the
   current offset (This can also be called in a separate loop before the
   copying loop in order to determine the total sectors for the progress
   bar.)

2. Read sectors. If the status implies that there is no data there to
   read (zero or unallocated cluster), don't do anything.

3. Write sectors depending on the status. If it's data, write it. If
   we want the backing file to be visible (with -B), don't write it. If
   it's zeroed, skip it if you can, otherwise use bdrv_write_zeroes() to
   optimise the write at least where possible.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

block-backend: Expose bdrv_write_zeroes()

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>

iothread: release iothread around aio_poll

This is the first step towards having fine-grained critical sections in
dataplane threads, which resolves lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

AioContext: acquire/release AioContext during aio_poll

This is the first step in pushing down acquire/release, and will let
rfifolock drop the contention callback feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

aio-posix: move pollfds to thread-local storage

By using thread-local storage, aio_poll can stop using global data during
g_poll_ns. This will make it possible to drop callbacks from rfifolock.

[Moved npfd = 0 assignment to end of walking_handlers region as
suggested by Paolo. This resolves the assert(npfd == 0) assertion
failure in pollfds_cleanup().
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: Switch to host monotonic clock for IO throttling

Currently, throttle timers won't make any progress when VCPU is not
running, which would stall the request queue in utils, qtest, vm
suspending, and live migration, without special handling.

Block jobs are confusingly inconsistent between with and without
throttling: if user sets a bps limit, stops the vm, then start a block
job, the block job will not make any progress; in contrary, if user
unsets the bps limit, or if it's not set, the block job will run
normally.

After this patch, with the host clock, even if the VCPUs are stopped,
the throttle queues will be processed.

This patch also enables potential to add throttle to bdrv_drain_all.
Currently all requests are drained immediately. In other words whenever
it is called, IO throttling goes ineffective (examples: system reset,
migration and many block job operations.). This is a loophole that guest
could exploit. If we use the host clock, we can later just trust the
nested poll. This could be done on top.

Note that for qemu-iotests case 093, which uses qtest, we still keep vm
clock so the script can control the clock stepping in order to be
deterministic.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1427268446-6426-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

checkpatch: complain about ffs(3) calls

The ffs(3) family of functions is not portable. MinGW doesn't always
provide the function.

Use ctz32() or ctz64() instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-10-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

os-win32: drop ffs(3) prototype

The lack of ffs(3) in the MinGW headers is a hint that we shouldn't rely
on it. MinGW 4.9.2 does not make it available for linking when QEMU's
./configure --enable-debug is used (release builds are fine though).

Now that all QEMU code has been switched to ctz32() there is no need for
ffs(3).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-9-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

omap_intc: convert ffs(3) to ctz32() in omap_inth_sir_update()

Rewrite the loop using level &= level - 1 to clear the least significant
bit after each iteration. This simplifies the loop and makes it easy to
replace ffs(3) with ctz32().

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-8-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

sd: convert sd_normal_command() ffs(3) call to ctz32()

ffs() cannot be replaced with ctz32() when the argument might be zero,
because ffs(0) returns 0 while ctz32(0) returns 32.

The ffs(3) call in sd_normal_command() is a special case though.  It can
be converted to ctz32() + 1 because the argument is never zero:

  if (!(req.arg >> 8) || (req.arg >> (ctz32(req.arg & ~0xff) + 1))) {
      ~~~~~~~~~~~~~~~
            ^--------------- req.arg cannot be zero

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-7-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Convert ffs() != 0 callers to ctz32()

There are a number of ffs(3) callers that do roughly:

  bit = ffs(val);
  if (bit) {
      do_something(bit - 1);
  }

This pattern can be converted to ctz32() like this:

  zeroes = ctz32(val);
  if (zeroes != 32) {
      do_something(zeroes);
  }

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-6-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Convert (ffs(val) - 1) to ctz32(val)

This commit was generated mechanically by coccinelle from the following
semantic patch:

@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)

The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc). Therefore we
don't need to worry about the fact that ctz32(0) == 32.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

uninorth: convert ffs(3) to ctz32()

It is not clear from the code how a 0 parameter should be handled by the
hardware. Keep the same behavior as ffs(0) - 1 == -1.

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-4-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

hw/arm/nseries: convert ffs(3) to ctz32()

It is not clear from the code how a 0 parameter should be handled by the
hardware. Keep the same behavior as ffs(0) - 1 == -1.

Cc: Andrzej Zaborowski <balrog@zabor.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-3-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

bt-sdp: fix broken uuids power-of-2 calculation

The binary search in sdp_uuid_match() only works when the number of
elements to search is a power of two.

  lo = record->uuid;
  hi = record->uuids;
  while (hi >>= 1)
      if (lo[hi] <= val)
          lo += hi;

  return *lo == val;

I noticed that the record->uuids calculation in
sdp_service_record_build() was suspect:

  record->uuids = 1 << ffs(record->uuids - 1);

Unlike most ffs(val) - 1 users, the expression is ffs(val - 1)!

Actually ffs() is the wrong function to use for power-of-2.  Use
pow2ceil() to achieve the correct effect.  Now the record->uuid[] array
is sized correctly and the binary search in sdp_uuid_match() should
work.

I'm not sure how to run/test this code.

Cc: Andrzej Zaborowski <balrog@zabor.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-2-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

MAINTAINERS: Add myself as the maintainer of the Quorum driver

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1426522925-14444-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

savevm: create snapshot failed when id_str already exists

The command "virsh create" will fail in such condition: vm has two
disks: vda and vdb. vda has snapshot s1 with id "1", vdb doesn't have
s1 but has snapshot s2 with id "1". When we want to run command "virsh
create s1", del_existing_snapshots() only deletes s1 in vda, and
bdrv_snapshot_create() tries to create vdb's snapshot s1 with id "1",
but id "1" alreay exists in vdb with name "s2"!

The simplest way is call find_new_snapshot_id() unconditionally.

Signed-off-by: Yi Wang <up2wing@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

X86 queue, 2015-04-27 (v2)

# gpg: Signature made Mon Apr 27 19:42:39 2015 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  target-i386: Remove AMD feature flag aliases from CPU model table
  target-i386: X86CPU::xlevel2 QOM property
  target-i386: Make "level" and "xlevel" properties static
  qemu-config: Accept empty option values
  MAINTAINERS: Change status of X86 to Maintained
  MAINTAINERS: Add myself to X86

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging

NUMA queue, 2015-04-27

# gpg: Signature made Mon Apr 27 19:02:19 2015 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/numa-pull-request:
  MAINTAINERS: Add myself as NUMA code maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20150427' into staging

target-arm queue:
* memory system updates to support transaction attributes
* set user-mode and secure attributes for accesses made by ARM CPUs
* rename c1_coproc to cpacr_el1
* adjust id_aa64pfr0 when has_el3 CPU property disabled
* allow ARMv8 SCR.SMD updates

# gpg: Signature made Mon Apr 27 16:14:30 2015 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"

* remotes/pmaydell/tags/pull-target-arm-20150427:
  Allow ARMv8 SCR.SMD updates
  target-arm: Adjust id_aa64pfr0 when has_el3 CPU property disabled
  target-arm: rename c1_coproc to cpacr_el1
  target-arm: Check watchpoints against CPU security state
  target-arm: Use attribute info to handle user-only watchpoints
  target-arm: Add user-mode transaction attribute
  target-arm: Use correct memory attributes for page table walks
  target-arm: Honour NS bits in page tables
  Switch non-CPU callers from ld/st*_phys to address_space_ld/st*
  exec.c: Capture the memory attributes for a watchpoint hit
  exec.c: Add new address_space_ld*/st* functions
  exec.c: Make address_space_rw take transaction attributes
  exec.c: Convert subpage memory ops to _with_attrs
  Add MemTxAttrs to the IOTLB
  Make CPU iotlb a structure rather than a plain hwaddr
  memory: Replace io_mem_read/write with memory_region_dispatch_read/write
  memory: Define API for MemoryRegionOps to take attrs and return status

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20150427-1' into staging

spice: misc fixes.

# gpg: Signature made Mon Apr 27 12:03:16 2015 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20150427-1:
  spice: learn to hide cursor
  spice: set pointer position on hotspot
  spice: fix mouse cursor position
  spice: fix simple display on bigendian hosts
  monitor: Make client_migrate_info synchronous

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-i386: Remove AMD feature flag aliases from CPU model table

When CPU vendor is AMD, the AMD feature alias bits on
CPUID[0x80000001].EDX are already automatically copied from CPUID[1].EDX
on x86_cpu_realizefn(). When CPU vendor is Intel, those bits are
reserved and should be zero. On either case, those bits shouldn't be set
in the CPU model table.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

target-i386: X86CPU::xlevel2 QOM property

We already have "level" and "xlevel", only "xlevel2" is missing.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

target-i386: Make "level" and "xlevel" properties static

Static properties require only 1 line of code, much simpler than the
existing code that requires writing new getters/setters.

As a nice side-effect, this fixes an existing bug where the setters were
incorrectly allowing the properties to be changed after the CPU was
already realized.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

qemu-config: Accept empty option values

Currently it is impossible to set an option in a config file to an empty
string, because the parser matches only lines containing non-empty
strings between double-quotes.

As sscanf() "[" conversion specifier only matches non-empty strings, add
a special case for empty strings.

Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

MAINTAINERS: Change status of X86 to Maintained

"Odd Fixes" doesn't reflect the current status of target-i386. We have
people looking after it, now.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

MAINTAINERS: Add myself to X86

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-gtk-20150427-1' into staging

gtk: support text consoles without vte, bugfixes.

# gpg: Signature made Mon Apr 27 14:34:15 2015 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-gtk-20150427-1:
  gtk: Avoid accel key leakage into guest on console switch
  gtk: Fix VTE focus grabbing
  console/gtk: add qemu_console_get_label
  gtk: bind to text terminal consoles too
  gtk: handle switch_surface(NULL) properly

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

MAINTAINERS: Add myself as NUMA code maintainer

The "srat" and "numa" keywords will help get_maintainer.pl catch
NUMA-related code in other files too.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>

Merge remote-tracking branch 'remotes/qmp-unstable/tags/for-upstream' into staging

Four little fixes

# gpg: Signature made Fri Apr 24 19:56:51 2015 BST using RSA key ID E24ED5A7
# gpg: Good signature from "Luiz Capitulino <lcapitulino@gmail.com>"

* remotes/qmp-unstable/tags/for-upstream:
  qmp: Give saner messages related to qmp_capabilities misuse
  qmp-commands: fix incorrect uses of ":O" specifier
  qapi: Drop dead genlist parameter
  balloon: improve error msg when adding second device

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

spice: learn to hide cursor

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

spice: set pointer position on hotspot

The Spice protocol uses cursor position on hotspot: the client is
applying hotspot offset when drawing the cursor.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

spice: fix mouse cursor position

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

spice: fix simple display on bigendian hosts

Denis Kirjanov is busy getting spice run on ppc64 and trapped into this
one. Spice wire format is little endian, so we have to explicitly say
we want little endian when letting pixman convert the data for us.

Reported-by: Denis Kirjanov <kirjanov@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

monitor: Make client_migrate_info synchronous

Live migration with spice works like this today:

  (1) client_migrate_info monitor cmd
  (2) spice server notifies client, client connects to target host.
  (3) qemu waits until spice client connect is finished.
  (4) send over vmstate (i.e. main part of live migration).
  (5) spice handover to target host.

(3) is implemented by making client_migrate_info a async monitor
command.  This is the only async monitor command we have.

The original reason to implement this dance was that qemu did not accept
new tcp connections while the incoming migration was running, so (2) and
(4) could not be done in parallel.  That issue was fixed long ago though.
Qemu version 1.3.0 (released Dec 2012) and newer happily accept tcp
connects while the incoming migration runs.

Time to drop step (3).  This patch does exactly that, by making the
monitor command synchronous and removing the code needed to handle the
async monitor command in ui/spice-core.c

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

gtk: Avoid accel key leakage into guest on console switch

GTK2 sends the accel key to the guest when switching to the graphic
console via that shortcut. Resolve this by ignoring any keys until the
next key-release event. However, do not ignore keys when switching via
the menu or when on GTK3.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

gtk: Fix VTE focus grabbing

At least on GTK2, the VTE terminal has to be specified as target of
gtk_widget_grab_focus. Otherwise, switching from one VTE terminal to
another causes the focus to get lost.

CC: John Snow <jsnow@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
[ kraxel: fixed build with CONFIG_VTE=n ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Allow ARMv8 SCR.SMD updates

Updated scr_write to always allow updates to the SCR.SMD bit on ARMv8
regardless of whether virtualization (EL2) is enabled or not.

Signed-off-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1429888797-4378-1-git-send-email-greg.bellows@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: Adjust id_aa64pfr0 when has_el3 CPU property disabled

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1429669112-29835-1-git-send-email-serge.fdrv@gmail.com
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: rename c1_coproc to cpacr_el1

Rename the field holding CPACR_EL1 system register state in AArch64
naming style.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
[PMM: also fixed a couple of missed occurrences in cpu.c]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

target-arm: Check watchpoints against CPU security state

Fix a TODO in bp_wp_matches() now that we have a function for
testing whether the CPU is currently in Secure mode or not.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

target-arm: Use attribute info to handle user-only watchpoints

Now that we have memory access attribute information in the watchpoint
checking code, we can correctly implement handling of watchpoints
which should match only on userspace accesses, where LDRT/STRT/LDT/STT
from EL1 are treated as userspace accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

target-arm: Add user-mode transaction attribute

Add a transaction attribute indicating that a memory access is being
done from user-mode (unprivileged). This corresponds to an equivalent
signal in ARM AMBA buses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

target-arm: Use correct memory attributes for page table walks

Factor out the page table walk memory accesses into their own function,
so that we can specify the correct S/NS memory attributes for them.
This will also provide a place to use the correct endianness and
handle the need for a stage-2 translation when virtualization is
supported.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

target-arm: Honour NS bits in page tables

Honour the NS bit in ARM page tables:
* when adding entries to the TLB, include the Secure/NonSecure
transaction attribute
* set the NS bit in the PAR when doing ATS operations

Note that we don't yet correctly use the NSTable bit to
cause the page table walk itself to use the right attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Switch non-CPU callers from ld/st*_phys to address_space_ld/st*

Switch all the uses of ld/st*_phys to address_space_ld/st*,
except for those cases where the address space is the CPU's
(ie cs->as). This was done with the following script which
generates a Coccinelle patch.

A few over-80-columns lines in the result were rewrapped by
hand where Coccinelle failed to do the wrapping automatically,
as well as one location where it didn't put a line-continuation
'\' when wrapping lines on a change made to a match inside
a macro definition.

===begin===
#!/bin/sh -e
# Usage:
# ./ldst-phys.spatch.sh > ldst-phys.spatch
# spatch -sp_file ldst-phys.spatch -dir . | sed -e '/^+/s/\t/ /g' > out.patch
# patch -p1 < out.patch

for FN in ub uw_le uw_be l_le l_be q_le q_be uw l q; do
cat <<EOF
@ cpu_matches_ld_${FN} @
expression E1,E2;
identifier as;
@@

ld${FN}_phys(E1->as,E2)

@ other_matches_ld_${FN} depends on !cpu_matches_ld_${FN} @
expression E1,E2;
@@

-ld${FN}_phys(E1,E2)
+address_space_ld${FN}(E1,E2, MEMTXATTRS_UNSPECIFIED, NULL)

EOF

done

for FN in b w_le w_be l_le l_be q_le q_be w l q; do
cat <<EOF
@ cpu_matches_st_${FN} @
expression E1,E2,E3;
identifier as;
@@

st${FN}_phys(E1->as,E2,E3)

@ other_matches_st_${FN} depends on !cpu_matches_st_${FN} @
expression E1,E2,E3;
@@

-st${FN}_phys(E1,E2,E3)
+address_space_st${FN}(E1,E2,E3, MEMTXATTRS_UNSPECIFIED, NULL)

EOF

done
===endit===

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

exec.c: Capture the memory attributes for a watchpoint hit

Capture the memory attributes for the transaction which triggered
a watchpoint; this allows CPU specific code to implement features
like ARM's "user-mode only WPs also hit for LDRT/STRT accesses
made from privileged code". This change also correctly passes
through the memory attributes to the underlying device when
a watchpoint access doesn't hit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

exec.c: Add new address_space_ld*/st* functions

Add new address_space_ld*/st* functions which allow transaction
attributes and error reporting for basic load and stores. These
are named to be in line with the address_space_read/write/rw
buffer operations.

The existing ld/st*_phys functions are now wrappers around
the new functions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

exec.c: Make address_space_rw take transaction attributes

Make address_space_rw take transaction attributes, rather
than always using the 'unspecified' attributes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

exec.c: Convert subpage memory ops to _with_attrs

Convert the subpage memory ops to _with_attrs; this will allow
us to pass the attributes through to the underlying access
functions. (Nothing uses the attributes yet.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

Add MemTxAttrs to the IOTLB

Add a MemTxAttrs field to the IOTLB, and allow target-specific
code to set it via a new tlb_set_page_with_attrs() function;
pass the attributes through to the device when making IO accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Make CPU iotlb a structure rather than a plain hwaddr

Make the CPU iotlb a structure rather than a plain hwaddr;
this will allow us to add transaction attributes to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

memory: Replace io_mem_read/write with memory_region_dispatch_read/write

Rather than retaining io_mem_read/write as simple wrappers around
the memory_region_dispatch_read/write functions, make the latter
public and change all the callers to use them, since we need to
touch all the callsites anyway to add MemTxAttrs and MemTxResult
support. Delete io_mem_read and io_mem_write entirely.

(All the callers currently pass MEMTXATTRS_UNSPECIFIED
and convert the return value back to bool or ignore it.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

memory: Define API for MemoryRegionOps to take attrs and return status

Define an API so that devices can register MemoryRegionOps whose read
and write callback functions are passed an arbitrary pointer to some
transaction attributes and can return a success-or-failure status code.
This will allow us to model devices which:
* behave differently for ARM Secure/NonSecure memory accesses
* behave differently for privileged/unprivileged accesses
* may return a transaction failure (causing a guest exception)
for erroneous accesses

This patch defines the new API and plumbs the attributes parameter through
to the memory.c public level functions io_mem_read() and io_mem_write(),
where it is currently dummied out.

The success/failure response indication is also propagated out to
io_mem_read() and io_mem_write(), which retain the old-style
boolean true-for-error return.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Open 2.4 development tree

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

qmp: Give saner messages related to qmp_capabilities misuse

Pretending that QMP doesn't understand a command merely because
we are not in the right mode doesn't help first-time users figure
out what to do to correct things. Although the documentation for
QMP calls out capabilities negotiation, we should also make it
clear in our error messages what we were expecting. With this
patch, I now get the following transcript:

$ ./x86_64-softmmu/qemu-system-x86_64 -qmp stdio -nodefaults
{"QMP": {"version": {"qemu": {"micro": 93, "minor": 2, "major": 2}, "package": ""}, "capabilities": []}}
{"execute":"huh"}
{"error": {"class": "CommandNotFound", "desc": "The command huh has not been found"}}
{"execute":"quit"}
{"error": {"class": "CommandNotFound", "desc": "Expecting capabilities negotiation with 'qmp_capabilities' before command 'quit'"}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"qmp_capabilities"}
{"error": {"class": "CommandNotFound", "desc": "Capabilities negotiation is already complete, command 'qmp_capabilities' ignored"}}
{"execute":"quit"}
{"return": {}}
{"timestamp": {"seconds": 1429110729, "microseconds": 181935}, "event": "SHUTDOWN"}

Signed-off-by: Eric Blake <eblake@redhat.com>
Tested-By: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Paulo Vital <paulo.vital@profitbricks.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

qmp-commands: fix incorrect uses of ":O" specifier

As far as the QMP parser is concerned, neither the 'O' nor the 'q' format specifiers
put any constraint on the command.  However, there are two differences:

1) from a documentation point of view 'O' says that this command takes
a dictionary.  The dictionary will be converted to QemuOpts in the
handler to match the corresponding HMP command.

2) 'O' sets QMP_ACCEPT_UNKNOWNS, resulting in the command accepting invalid
extra arguments.  For example the following is accepted:

   { "execute": "send-key",
        "arguments": { "keys": [ { "type": "qcode", "data": "ctrl" },
                                 { "type": "qcode", "data": "alt" },
                                 { "type": "qcode", "data": "delete" } ], "foo": "bar" } }

Neither send-key nor migrate-set-capabilities take a QemuOpts-like
dictionary; they take an array of dictionaries.  And neither command
really wants to have extra unknown arguments.  Thus, the right
specifier to use in this case is 'q'; with this patch the above
command fails with

   {"error": {"class": "GenericError", "desc": "Invalid parameter 'foo'"}}

as intended.

Reported-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

qapi: Drop dead genlist parameter

Defaulting a parameter to True, then having all callers omit or
pass an explicit True for that parameter, is pointless. Looks
like it has been dead since introduction in commit 06d64c6, more
than 4 years ago.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>

balloon: improve error msg when adding second device

A VM supports only one balloon device, but due to several changes
in infrastructure the error message got messed up when trying
to add a second device. Fix it.

Before this fix

Command-line:

qemu-qmp: -device virtio-balloon-pci,id=balloon0: Another balloon device already registered
qemu-qmp: -device virtio-balloon-pci,id=balloon0: Adding balloon handler failed
qemu-qmp: -device virtio-balloon-pci,id=balloon0: Device 'virtio-balloon-pci' could not be initialized

HMP:

Another balloon device already registered
Adding balloon handler failed
Device 'virtio-balloon-pci' could not be initialized

QMP:

{ "execute": "device_add", "arguments": { "driver": "virtio-balloon-pci", "id": "balloon0" } }
{
"error": {
"class": "GenericError",
"desc": "Adding balloon handler failed"
}
}

After this fix

Command-line:

qemu-qmp: -device virtio-balloon-pci,id=balloon0: Only one balloon device is supported
qemu-qmp: -device virtio-balloon-pci,id=balloon0: Device 'virtio-balloon-pci' could not be initialized

HMP:

(qemu) device_add virtio-balloon-pci,id=balloon0
Only one balloon device is supported
Device 'virtio-balloon-pci' could not be initialized
(qemu)

QMP:

{ "execute": "device_add",
          "arguments": { "driver": "virtio-balloon-pci", "id": "balloon0" } }
{
    "error": {
        "class": "GenericError",
        "desc": "Only one balloon device is supported"
    }
}

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

Update version for v2.3.0 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

console/gtk: add qemu_console_get_label

Add a new function to get a nice label for a given QemuConsole.
Drop the labeling code in gtk.c and use the new function instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

gtk: bind to text terminal consoles too

This way gtk has text terminal consoles even when building without vte.
Most notably you'll get a monitor tab on windows now.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

gtk: handle switch_surface(NULL) properly

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Update version for v2.3.0-rc4 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

vhost: fix log base address

VHOST_SET_LOG_BASE got an incorrect address, causing
migration errors and potentially even memory corruption.

Reported-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1429283565-32265-1-git-send-email-mst@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

hmp: fix crash in 'info block -n -v'

The image field in BlockDeviceInfo should never be null, however
bdrv_block_device_info() is not filling it in.

This makes the 'info block -n -v' command crash QEMU.

The proper solution is probably to move the relevant code from
bdrv_query_info() to bdrv_block_device_info(), but since we're too
close to the release for that this simpler workaround solves the
crash.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1429274688-8115-1-git-send-email-berto@igalia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/lalrae/tags/mips-20150417-2' into staging

MIPS patches 2015-04-17

Changes:
* fix broken fulong2e

# gpg: Signature made Fri Apr 17 12:14:37 2015 BST using RSA key ID 0B29DA6B
# gpg: Can't check signature: public key not found

* remotes/lalrae/tags/mips-20150417-2:
mips: fix broken fulong2e machine

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/pull-fwcfg-20150414-1' into staging

fw_cfg: add documentation file (docs/specs/fw_cfg.txt)

# gpg: Signature made Tue Apr 14 12:22:20 2015 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-fwcfg-20150414-1:
  fw_cfg: add documentation file (docs/specs/fw_cfg.txt)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

mips: fix broken fulong2e machine

After commit 5312bd8 the bonito_readl() and bonito_writel() have been
accessing incorrect addresses. Consequently QEMU is crashing when trying
to boot Linux kernel on fulong2e machine.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>

target-ppc: don't invalidate msr MSR_HVB bit in cpu_post_load

The invalidation code introduced in commit 2360b works by inverting most bits
of env->msr to ensure that hreg_store_msr() will forcibly update the CPU env
state to reflect the new msr value post-migration. Unfortunately
hreg_store_msr() is called with alter_hv set to 0 which preserves the MSR_HVB
state from the CPU env which is now the opposite value to what it should be.

Ensure that we don't invalidate the msr MSR_HVB bit during cpu_post_load so
that the correct value is restored. This fixes suspend/resume for PPC64.

Reported-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Alexander Graf <agraf@suse.de>
Message-id: 1429255009-12751-1-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

fw_cfg: add documentation file (docs/specs/fw_cfg.txt)

This document covers the guest-side hardware interface, as
well as the host-side programming API of QEMU's firmware
configuration (fw_cfg) device.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>

Update version for v2.3.0-rc3 release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Revert seccomp tests that allow it to be used on non-x86 architectures

Unfortunately it turns out that libseccomp 2.2 still does not work
correctly on non-x86 architectures; return to the previous configure
setup of insisting on libseccomp 2.1 or better and i386/x86_64 and
disabling seccomp support in all other situations.

This reverts the two commits:
* "seccomp: libseccomp version varying according to arch"
(commit 896848f0d3e2393905845ef2b244bb2601f9df0c)
* "seccomp: update libseccomp version and remove arch restriction"
(commit 8e27fc200457e3f2473d0069263774d4ba17bd85)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1428670681-23032-1-git-send-email-peter.maydell@linaro.org

pci: Fix crash with illegal "-net nic, model=xxx" option

Current QEMU crashes when specifying an illegal model with the
"-net nic,model=xxx" option, e.g.:

$ qemu-system-x86_64 -net nic,model=n/a
qemu-system-x86_64: Unsupported NIC model: n/a

Program received signal SIGSEGV, Segmentation fault.

The gdb backtrace looks like this:

0x0000555555965fe0 in error_get_pretty (err=0x0) at util/error.c:152
152     return err->msg;
(gdb) bt
0  0x0000555555965fe0 in error_get_pretty (err=0x0) at util/error.c:152
1  0x0000555555965ffd in error_report_err (err=0x0) at util/error.c:157
2  0x0000555555809c90 in pci_nic_init_nofail (nd=0x555555e49860 <nd_table>, rootbus=0x5555564409b0,
    default_model=0x55555598c37b "e1000", default_devaddr=0x0) at hw/pci/pci.c:1663
3  0x0000555555691e42 in pc_nic_init (isa_bus=0x555556f71900, pci_bus=0x5555564409b0)
    at hw/i386/pc.c:1506
4  0x000055555569396b in pc_init1 (machine=0x5555562abbf0, pci_enabled=1, kvmclock_enabled=1)
    at hw/i386/pc_piix.c:248
5  0x0000555555693d27 in pc_init_pci (machine=0x5555562abbf0) at hw/i386/pc_piix.c:310
6  0x000055555572ddf5 in main (argc=3, argv=0x7fffffffe018, envp=0x7fffffffe038) at vl.c:4226

The problem is that pci_nic_init_nofail() does not check whether the err
parameter from pci_nic_init has been set up and thus passes a NULL pointer
to error_report_err(). Fix it by correctly checking the err parameter.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

stm32f205: Fix SoC type name

The type name for the SoC device, unlike those of its sub-devices,
did not follow the QOM naming conventions. While the usage is internal
only, this is exposed through QMP and HMP, so fix it before release.

Cc: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Alistair Francis <alistair@alistair23.me>
Message-id: 1428676676-23056-1-git-send-email-afaerber@suse.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

cris: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory

Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>

alpha: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory

Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-id: CAL5wTH64_ykF17cw2T1Axq8P3vCWm=6WbUJ3qJrLF-u+-MmzUw@mail.gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

lm32: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory

Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

xen: limit guest control of PCI command register

Otherwise the guest can abuse that control to cause e.g. PCIe
Unsupported Request responses (by disabling memory and/or I/O decoding
and subsequently causing [CPU side] accesses to the respective address
ranges), which (depending on system configuration) may be fatal to the
host.

This is CVE-2015-2756 / XSA-126.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Message-id: alpine.DEB.2.02.1503311510300.7690@kaball.uk.xensource.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

configure: disable Archipelago by default and warn about libxseg GPLv3 license

libxseg has changed license to GPLv3.  QEMU includes GPL "v2 only" code
which is not compatible with GPLv3.  This means the resulting binaries
may not be redistributable!

Disable Archipelago (libxseg) by default to prevent accidental license
violations.  Also warn if linking against libxseg is enabled to remind
the user.

Note that this commit does not constitute any advice about software
licensing.  If you have doubts you should consult a lawyer.

Cc: Chrysostomos Nanakos <cnanakos@grnet.gr>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1428587538-8765-1-git-send-email-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

# gpg: Signature made Thu Apr  9 10:55:11 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  block/iscsi: handle zero events from iscsi_which_events
  aio: strengthen memory barriers for bottom half scheduling
  virtio-blk: correctly dirty guest memory
  qcow2: Fix header update with overridden backing file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tcg/tcg-op.c: Fix ld/st of 64 bit values on 32-bit bigendian hosts

Commit 951c6300f7 out-of-lined the 32-bit-host versions of
tcg_gen_{ld,st}_i64, but in the process it inadvertently changed
an #ifdef HOST_WORDS_BIGENDIAN to #ifdef TCG_TARGET_WORDS_BIGENDIAN.
Since the latter doesn't get defined anywhere this meant we always
took the "LE host" codepath, and stored the two halves of the value
in the wrong order on BE hosts. This typically breaks any 64-bit
guest on a 32-bit BE host completely, and will have possibly more
subtle effects even for 32-bit guests.

Switch the ifdef back to HOST_WORDS_BIGENDIAN.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Andreas Färber <afaerber@suse.de>
Message-id: 1428523029-13620-1-git-send-email-peter.maydell@linaro.org

block/iscsi: handle zero events from iscsi_which_events

newer libiscsi versions may return zero events from iscsi_which_events.

In this case iscsi_service will return immediately without any progress.
To avoid busy waiting for iscsi_which_events to change we deregister all
read and write handlers in this case and schedule a timer to periodically
check iscsi_which_events for changed events.

Next libiscsi version will introduce async reconnects and zero events
are returned while libiscsi is waiting for a reconnect retry.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1428437295-29577-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

aio: strengthen memory barriers for bottom half scheduling

There are two problems with memory barriers in async.c.  The fix is
to use atomic_xchg in order to achieve sequential consistency between
the scheduling of a bottom half and the corresponding execution.

First, if bh->scheduled is already 1 in qemu_bh_schedule, QEMU does
not execute a memory barrier to order any writes needed by the callback
before the read of bh->scheduled.  If the other side sees req->state as
THREAD_ACTIVE, the callback is not invoked and you get deadlock.

Second, the memory barrier in aio_bh_poll is too weak.  Without this
patch, it is possible that bh->scheduled = 0 is not "published" until
after the callback has returned.  Another thread wants to schedule the
bottom half, but it sees bh->scheduled = 1 and does nothing.  This causes
a lost wakeup.  The memory barrier should have been changed to smp_mb()
in commit 924fe12 (aio: fix qemu_bh_schedule() bh->ctx race condition,
2014-06-03) together with qemu_bh_schedule()'s.  Guess who reviewed
that patch?

Both of these involve a store and a load, so they are reproducible on
x86_64 as well.  It is however much easier on aarch64, where the
libguestfs test suite triggers the bug fairly easily.  Even there the
failure can go away or appear depending on compiler optimization level,
tracing options, or even kernel debugging options.

Paul Leveille however reported how to trigger the problem within 15
minutes on x86_64 as well.  His (untested) recipe, reproduced here
for reference, is the following:

   1) Qcow2 (or 3) is critical – raw files alone seem to avoid the problem.

   2) Use “cache=directsync” rather than the default of
   “cache=none” to make it happen easier.

   3) Use a server with a write-back RAID controller to allow for rapid
   IO rates.

   4) Run a random-access load that (mostly) writes chunks to various
   files on the virtual block device.

      a. I use ‘diskload.exe c:25’, a Microsoft HCT load
         generator, on Windows VMs.

      b. Iometer can probably be configured to generate a similar load.

   5) Run multiple VMs in parallel, against the same storage device,
   to shake the failure out sooner.

   6) IvyBridge and Haswell processors for certain; not sure about others.

A similar patch survived over 12 hours of testing, where an unpatched
QEMU would fail within 15 minutes.

This bug is, most likely, also the cause of failures in the libguestfs
testsuite on AArch64.

Thanks to Laszlo Ersek for initially reporting this bug, to Stefan
Hajnoczi for suggesting closer examination of qemu_bh_schedule, and to
Paul for providing test input and a prototype patch.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Paul Leveille <Paul.Leveille@stratus.com>
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1428419779-26062-1-git-send-email-pbonzini@redhat.com
Suggested-by: Paul Leveille <Paul.Leveille@stratus.com>
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

arm: memory: Replace memory_region_init_ram with memory_region_allocate_system_memory

Commit 0b183fc871:"memory: move mem_path handling to
memory_region_allocate_system_memory" split memory_region_init_ram and
memory_region_init_ram_from_file. Also it moved mem-path handling a step
up from memory_region_init_ram to memory_region_allocate_system_memory.

Therefore for any board that uses memory_region_init_ram directly,
-mem-path is not supported.

Fix this by replacing memory_region_init_ram with
memory_region_allocate_system_memory.

Signed-off-by: Dirk Mueller <dmueller@suse.com>
Message-id: CAL5wTH4UHYKpJF=dLJfFzxpufjY189chnCow47-ySuLf8GLbug@mail.gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

virtio-blk: correctly dirty guest memory

After qemu_iovec_destroy, the QEMUIOVector's size is zeroed and
the zero size ultimately is used to compute virtqueue_push's len
argument. Therefore, reads from virtio-blk devices did not
migrate their results correctly. (Writes were okay).

Save the size in virtio_blk_handle_request, and use it when the request
is completed.

Based on a patch by Wen Congyang.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Message-id: 1427997044-392-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

qcow2: Fix header update with overridden backing file

In recent qemu versions, it is possible to override the backing file
name and format that is stored in the image file with values given at
runtime. In such cases, the temporary override could end up in the
image header if the qcow2 header was updated, while obviously correct
behaviour would be to leave the on-disk backing file path/format
unchanged.

Fix this and add a test case for it.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1428411796-2852-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2015-04-04' into staging

trivial patches for 2015-04-04

# gpg: Signature made Sat Apr  4 08:07:49 2015 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2015-04-04:
  vhost: fix typo in vq_index description
  gitignore: Ignore more .pod files.
  target-tricore: Fix check which was always false
  target-i386: remove superfluous TARGET_HAS_SMC macro
  pcspk: Fix I/O port name

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

vhost: fix typo in vq_index description

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

gitignore: Ignore more .pod files.

kvm_stat.{1,pod} started showing up as untracked files in my
directory, and I nearly accidentally merged them into a commit
with my usual habit of 'git add .'. Rather than spelling out
each such file, just ignore the entire pattern.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

target-tricore: Fix check which was always false

With a mask value of 0x00400000, the result will never be 1.
This fixes a Coverity warning.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>