]> git.proxmox.com Git - mirror_qemu.git/log
mirror_qemu.git
8 years agoblock: reopen: Extract QemuOpts for generic block layer options
Kevin Wolf [Fri, 8 May 2015 15:24:56 +0000 (17:24 +0200)]
block: reopen: Extract QemuOpts for generic block layer options

This patch adds a QemuOpts for generic block layer options to
bdrv_reopen_prepare(). The only two options that currently exist
(node-name and driver) cannot be changed, so the only thing we do is
putting them right back into the QDict so that we check at the end that
they are indeed unchanged.

We will add new options soon that can actually be changed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoqemu-iotests: Remove cache mode test without medium
Kevin Wolf [Thu, 7 May 2015 12:41:30 +0000 (14:41 +0200)]
qemu-iotests: Remove cache mode test without medium

Specifying the cache mode for a driver without a medium is not a useful
thing to do: As long as there is no medium, the cache mode doesn't make
a difference, and once the 'change' command is used to insert a medium,
it ignores the old cache mode and makes the new medium use
cache=writethrough.

Later patches will make it an error to specify the cache mode for an
empty drive. Remove the corresponding test case.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblockdev: Set 'format' indicates non-empty drive
Kevin Wolf [Fri, 13 Nov 2015 13:45:42 +0000 (14:45 +0100)]
blockdev: Set 'format' indicates non-empty drive

Creating an empty drive while specifying 'format' doesn't make sense.
The specified format driver would simply be ignored.

Make a set 'format' option an indication that a non-empty drive should
be created. This makes 'format' consistent with 'driver' and allows
using it with a block driver that doesn't need any other options (like
null-co/null-aio).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Introduce bs->explicit_options
Kevin Wolf [Fri, 8 May 2015 14:15:03 +0000 (16:15 +0200)]
block: Introduce bs->explicit_options

bs->options doesn't only contain options that the user explicitly
requested, but also option that were derived from flags, the filename or
inherited from the parent node.

For reopen, it is important to know the difference because reopening the
parent can change inherited values in child nodes, but it shouldn't
change any options that were explicitly specified for the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Split out parse_json_protocol()
Kevin Wolf [Thu, 29 Oct 2015 14:24:41 +0000 (15:24 +0100)]
block: Split out parse_json_protocol()

The next patch distinguishes options that were explicitly set and
options that were derived. bdrv_fill_option() added options of both
types: Options given by json: syntax should be counted as explicit, but
the rest is derived.

In preparation for the distinction, move json: parse to a separate
function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Add infrastructure for option inheritance
Kevin Wolf [Wed, 29 Apr 2015 15:29:39 +0000 (17:29 +0200)]
block: Add infrastructure for option inheritance

Options are not actually inherited from the parent node yet, but this
commit lays the grounds for doing so.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: reopen: Document option precedence and refactor accordingly
Kevin Wolf [Fri, 8 May 2015 15:07:31 +0000 (17:07 +0200)]
block: reopen: Document option precedence and refactor accordingly

The interesting part of reopening an image is from which sources the
effective options should be taken, i.e. which options take precedence
over which other options. This patch documents the precedence that will
be implemented in the following patches.

It also refactors bdrv_reopen_queue(), so that the top-level reopened
node is handled the same way as children are. Option/flag inheritance
from the parent becomes just one item in the list and is done at the
beginning of the function, similar to how the other items are/will be
handled.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Allow specifying child options in reopen
Kevin Wolf [Fri, 8 May 2015 13:14:15 +0000 (15:14 +0200)]
block: Allow specifying child options in reopen

If the child was defined in the same context (-drive argument or
blockdev-add QMP command) as its parent, a reopen of the parent should
work the same and allow changing options of the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoblock: Keep "driver" in bs->options
Kevin Wolf [Fri, 24 Apr 2015 14:38:02 +0000 (16:38 +0200)]
block: Keep "driver" in bs->options

Instead of passing a separate drv argument to bdrv_open_common(), just
make sure that a "driver" option is set in the QDict. This also means
that a "driver" entry is consistently present in bs->options now.

This is another step towards keeping all options in the QDict (which is
the represenation of the blockdev-add QMP command).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Pass driver-specific options to .bdrv_refresh_filename()
Kevin Wolf [Mon, 27 Apr 2015 11:50:54 +0000 (13:50 +0200)]
block: Pass driver-specific options to .bdrv_refresh_filename()

In order to decide whether a blkdebug: filename can be produced or a
json: one is necessary, blkdebug checked whether bs->options had more
options than just "config", "x-image" or "image" (the latter including
nested options). That doesn't work well when generic block layer options
are present.

This patch passes an option QDict to the driver that contains only
driver-specific options, i.e. the options for the general block layer as
well as child nodes are already filtered out. Works much better this
way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoblock: Exclude nested options only for children in append_open_options()
Kevin Wolf [Mon, 27 Apr 2015 11:46:22 +0000 (13:46 +0200)]
block: Exclude nested options only for children in append_open_options()

Some drivers have nested options (e.g. blkdebug rule arrays), which
don't belong to a child node and shouldn't be removed. Don't remove all
options with "." in their name, but check for the complete prefixes of
actually existing child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Consider all block layer options in append_open_options
Kevin Wolf [Fri, 24 Apr 2015 13:20:28 +0000 (15:20 +0200)]
block: Consider all block layer options in append_open_options

The code already special-cased "node-name", which is currently the only
option passed in the QDict that isn't driver-specific. Generalise the
code to take all general block layer options into consideration.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoblock: Allow references for backing files
Kevin Wolf [Fri, 16 Jan 2015 17:23:41 +0000 (18:23 +0100)]
block: Allow references for backing files

For bs->file, using references to existing BDSes has been possible for a
while already. This patch enables the same for bs->backing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agomirror: Error out when a BDS would get two BBs
Kevin Wolf [Wed, 28 Oct 2015 12:24:26 +0000 (13:24 +0100)]
mirror: Error out when a BDS would get two BBs

bdrv_replace_in_backing_chain() asserts that not both old and new
BlockDdriverState have a BlockBackend attached to them because both
would have to end up pointing to the new BDS and we don't support more
than one BB per BDS yet.

Before we can safely allow references to existing nodes as backing
files, we need to make sure that even if a backing file has a BB on it,
this doesn't crash qemu.

There are probably also some cases with the 'replaces' option set where
drive-mirror could fail this assertion today. They are fixed with this
error check as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
8 years agoblock: Fix reopen with semantically overlapping options
Kevin Wolf [Mon, 16 Nov 2015 15:43:27 +0000 (16:43 +0100)]
block: Fix reopen with semantically overlapping options

This fixes bdrv_reopen() calls like the following one:

    qemu-io -c 'open -o overlap-check.template=all /tmp/test.qcow2' \
    -c 'reopen -o overlap-check=none'

The approach taken so far would result in an options QDict that has both
"overlap-check.template=all" and "overlap-check=none", which obviously
conflicts. In this case, the old option should be overridden by the
newly specified option.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoqcow2: Add .bdrv_join_options callback
Kevin Wolf [Mon, 16 Nov 2015 14:34:59 +0000 (15:34 +0100)]
qcow2: Add .bdrv_join_options callback

qcow2 accepts a few driver-specific options that overlap semantically
(e.g. "overlap-check" is an alias of "overlap-check.template", and any
missing cache size option is derived from the given ones).

When bdrv_reopen() merges the set of updated options with left out
options that should be kept at their old value, we need to consider this
and filter out any duplicates (which would generally cause errors
because new and old value would contradict each other).

This patch adds a .bdrv_join_options callback to BlockDriver and
implements it for qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
8 years agoiotests: 124: don't reopen qcow2
John Snow [Tue, 1 Dec 2015 23:16:39 +0000 (18:16 -0500)]
iotests: 124: don't reopen qcow2

Don't create two interfaces to the same drive in the recently moved
failure test.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoiotests: 124: move incremental failure test
John Snow [Tue, 1 Dec 2015 23:16:38 +0000 (18:16 -0500)]
iotests: 124: move incremental failure test

Code motion only, in preparation for adjusting
the setUp procedure for this test.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoiotests: 124: Split into two test classes
John Snow [Tue, 1 Dec 2015 23:16:37 +0000 (18:16 -0500)]
iotests: 124: Split into two test classes

Split it into an abstract test class and an implementation class.

The split is primarily to facilitate more flexible setUp variations
for other kinds of tests without having to rewrite or shuffle around
all of these helpers.

See the following two patches for more of the "why."

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/berrange/tags/pull-io-channel-base-2015-12...
Peter Maydell [Fri, 18 Dec 2015 12:42:10 +0000 (12:42 +0000)]
Merge remote-tracking branch 'remotes/berrange/tags/pull-io-channel-base-2015-12-18-1' into staging

Merge I/O channels base classes

# gpg: Signature made Fri 18 Dec 2015 12:18:38 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-channel-base-2015-12-18-1:
  io: add QIOChannelBuffer class
  io: add QIOChannelCommand class
  io: add QIOChannelWebsock class
  io: add QIOChannelTLS class
  io: add QIOChannelFile class
  io: add QIOChannelSocket class
  io: add QIOTask class for async operations
  io: add helper module for creating watches on FDs
  io: add abstract QIOChannel classes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoio: add QIOChannelBuffer class
Daniel P. Berrange [Tue, 15 Sep 2015 16:27:33 +0000 (17:27 +0100)]
io: add QIOChannelBuffer class

Add a QIOChannel subclass that is capable of performing I/O
to/from a memory buffer. This implementation does not attempt
to support concurrent readers & writers. It is designed for
serialized access where by a single thread at a time may write
data, seek and then read data back out.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOChannelCommand class
Daniel P. Berrange [Thu, 27 Aug 2015 15:25:30 +0000 (16:25 +0100)]
io: add QIOChannelCommand class

Add a QIOChannel subclass that is capable of performing I/O
to/from a separate process, via a pair of pipes. The command
can be used for unidirectional or bi-directional I/O.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOChannelWebsock class
Daniel P. Berrange [Wed, 4 Mar 2015 15:57:41 +0000 (15:57 +0000)]
io: add QIOChannelWebsock class

Add a QIOChannel subclass that can run the websocket protocol over
the top of another QIOChannel instance. This initial implementation
is only capable of acting as a websockets server. There is no support
for acting as a websockets client yet.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOChannelTLS class
Daniel P. Berrange [Mon, 2 Mar 2015 18:13:13 +0000 (18:13 +0000)]
io: add QIOChannelTLS class

Add a QIOChannel subclass that can run the TLS protocol over
the top of another QIOChannel instance. The object provides a
simplified API to perform the handshake when starting the TLS
session. The layering of TLS over the underlying channel does
not have to be setup immediately. It is possible to take an
existing QIOChannel that has done some handshake and then swap
in the QIOChannelTLS layer. This allows for use with protocols
which start TLS right away, and those which start plain text
and then negotiate TLS.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOChannelFile class
Daniel P. Berrange [Fri, 27 Feb 2015 18:25:25 +0000 (18:25 +0000)]
io: add QIOChannelFile class

Add a QIOChannel subclass that is capable of operating on things
that are files, such as plain files, pipes, character/block
devices, but notably not sockets.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOChannelSocket class
Daniel P. Berrange [Fri, 27 Feb 2015 16:19:33 +0000 (16:19 +0000)]
io: add QIOChannelSocket class

Implement a QIOChannel subclass that supports sockets I/O.
The implementation is able to manage a single socket file
descriptor, whether a TCP/UNIX listener, TCP/UNIX connection,
or a UDP datagram. It provides APIs which can listen and
connect either asynchronously or synchronously. Since there
is no asynchronous DNS lookup API available, it uses the
QIOTask helper for spawning a background thread to ensure
non-blocking operation.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add QIOTask class for async operations
Daniel P. Berrange [Wed, 18 Mar 2015 17:25:45 +0000 (17:25 +0000)]
io: add QIOTask class for async operations

A number of I/O operations need to be performed asynchronously
to avoid blocking the main loop. The caller of such APIs need
to provide a callback to be invoked on completion/error and
need access to the error, if any. The small QIOTask provides
a simple framework for dealing with such probes. The API
docs inline provide an outline of how this is to be used.

Some functions don't have the ability to run asynchronously
(eg getaddrinfo always blocks), so to facilitate their use,
the task class provides a mechanism to run a blocking
function in a thread, while triggering the completion
callback in the main event loop thread. This easily allows
any synchronous function to be made asynchronous, albeit
at the cost of spawning a thread.

In this series, the QIOTask class will be used for things like
the TLS handshake, the websockets handshake and TCP connect()
progress.

The concept of QIOTask is inspired by the GAsyncResult
interface / GTask class in the GIO libraries. The min
version requirements on glib don't allow those to be
used from QEMU, so QIOTask provides a facsimilie which
can be easily switched to GTask in the future if the
min version is increased.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add helper module for creating watches on FDs
Daniel P. Berrange [Tue, 3 Mar 2015 12:59:16 +0000 (12:59 +0000)]
io: add helper module for creating watches on FDs

A number of the channel implementations will require the
ability to create watches on file descriptors / sockets.
To avoid duplicating this code in each channel, provide a
helper API for dealing with file descriptor watches.

There are two watch implementations provided. The first
is useful for bi-directional file descriptors such as
sockets, regular files, character devices, etc. The
second works with a pair of unidirectional file descriptors
such as pipes.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoio: add abstract QIOChannel classes
Daniel P. Berrange [Fri, 27 Feb 2015 16:19:33 +0000 (16:19 +0000)]
io: add abstract QIOChannel classes

Start the new generic I/O channel framework by defining a
QIOChannel abstract base class. This is designed to feel
similar to GLib's GIOChannel, but with the addition of
support for using iovecs, qemu error reporting, file
descriptor passing, coroutine integration and use of
the QOM framework for easier sub-classing.

The intention is that anywhere in QEMU that almost
anywhere that deals with sockets will use this new I/O
infrastructure, so that it becomes trivial to then layer
in support for TLS encryption. This will at least include
the VNC server, char device backend and migration code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Thu, 17 Dec 2015 18:07:09 +0000 (18:07 +0000)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* KVM: synic support, split irqchip support
* memory: cleanups, optimizations, ioeventfd emulation
* SCSI: small fixes, vmw_pvscsi compatibility improvements
* qemu_log cleanups
* Coverity model improvements

# gpg: Signature made Thu 17 Dec 2015 16:35:21 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (45 commits)
  coverity: Model g_memdup()
  coverity: Model g_poll()
  scsi: always call notifier on async cancellation
  scsi: use scsi_req_cancel_async when purging requests
  target-i386: kvm: clear unusable segments' flags in migration
  rcu: optimize rcu_read_lock
  memory: try to inline constant-length reads
  memory: inline a few small accessors
  memory: extract first iteration of address_space_read and address_space_write
  memory: split address_space_read and address_space_write
  memory: avoid unnecessary object_ref/unref
  memory: reorder MemoryRegion fields
  exec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr
  exec: always call qemu_get_ram_ptr within rcu_read_lock
  linux-user: convert DEBUG_SIGNAL logging to tracepoints
  linux-user: avoid "naked" qemu_log
  user: introduce "-d page"
  xtensa: avoid "naked" qemu_log
  tricore: avoid "naked" qemu_log
  ppc: cleanup logging
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agocoverity: Model g_memdup()
Markus Armbruster [Mon, 30 Nov 2015 16:32:32 +0000 (17:32 +0100)]
coverity: Model g_memdup()

We model all the non-deprecated memory allocation functions from
https://developer.gnome.org/glib/stable/glib-Memory-Allocation.html
except for g_memdup(), g_clear_pointer(), g_steal_pointer().  We don't
use the latter two.  Model the former.

Coverity now reports an OVERRUN
vl.c:2317: alloc_strlen: Allocating insufficient memory for the terminating null of the string.
Correct, but we omit the terminating null intentionally there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1448901152-11716-1-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agocoverity: Model g_poll()
Markus Armbruster [Thu, 17 Dec 2015 07:20:33 +0000 (08:20 +0100)]
coverity: Model g_poll()

In my testing, Coverity reported two more CHECKED_RETURN:

* qemu-char.c:1248: fixed in commit c1f2448: "qemu-char: retry g_poll
  on EINTR".

* migration/qemu-file-unix.c:75: harmless, cleaned up in commit
  4e39f57 "migration: Clean up use of g_poll() in
  socket_writev_buffer()

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1450336833-27710-1-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoscsi: always call notifier on async cancellation
Paolo Bonzini [Wed, 16 Dec 2015 18:33:44 +0000 (19:33 +0100)]
scsi: always call notifier on async cancellation

This was found by code inspection.  If the request is cancelled twice,
the notifier is never called on the second cancellation request,
and hence for example a TMF might never finish.

All the calls in scsi_req_cancel_async are idempotent, so the change
is safe.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1450290827-30508-2-git-send-email-pbonzini@redhat.com>

8 years agoscsi: use scsi_req_cancel_async when purging requests
Paolo Bonzini [Wed, 16 Dec 2015 18:33:43 +0000 (19:33 +0100)]
scsi: use scsi_req_cancel_async when purging requests

This avoids calls to aio_poll without having acquired the context first.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1450290827-30508-1-git-send-email-pbonzini@redhat.com>

8 years agotarget-i386: kvm: clear unusable segments' flags in migration
Michael Chapman [Mon, 7 Dec 2015 04:54:07 +0000 (15:54 +1100)]
target-i386: kvm: clear unusable segments' flags in migration

This commit fixes migration of a QEMU/KVM guest from kernel >= v3.9 to
kernel <= v3.7 (e.g. from RHEL 7 to RHEL 6). Without this commit a guest
migrated across these kernel versions fails to resume on the target host
as its segment descriptors are invalid.

Two separate kernel commits combined together to result in this bug:

  commit f0495f9b9992f80f82b14306946444b287193390
  Author: Avi Kivity <avi@redhat.com>
  Date:   Thu Jun 7 17:06:10 2012 +0300

      KVM: VMX: Relax check on unusable segment

      Some userspace (e.g. QEMU 1.1) munge the d and g bits of segment
      descriptors, causing us not to recognize them as unusable segments
      with emulate_invalid_guest_state=1.  Relax the check by testing for
      segment not present (a non-present segment cannot be usable).

Signed-off-by: Avi Kivity <avi@redhat.com>
  commit 25391454e73e3156202264eb3c473825afe4bc94
  Author: Gleb Natapov <gleb@redhat.com>
  Date:   Mon Jan 21 15:36:46 2013 +0200

      KVM: VMX: don't clobber segment AR of unusable segments.

      Usability is returned in unusable field, so not need to clobber entire
      AR. Callers have to know how to deal with unusable segments already
      since if emulate_invalid_guest_state=true AR is not zeroed.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The first commit changed the KVM_SET_SREGS ioctl so that it did no treat
segment flags == 0 as an unusable segment, instead only looking at the
"present" flag.

The second commit changed KVM_GET_SREGS so that it did not clear the
flags of an unusable segment.

Since QEMU does not itself maintain the "unusable" flag across a
migration, the end result is that unusable segments read from a kernel
with these commits and loaded into a kernel without these commits are
not properly recognised as being unusable.

This commit updates both get_seg and set_seg so that the problem is
avoided even when migrating to or migrating from a QEMU without this
commit. In get_seg, we clear the segment flags if the segment is marked
unusable. In set_seg, we mark the segment unusable if the segment's
"present" flag is not set.

Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Message-Id: <1449464047-17467-1-git-send-email-mike@very.puzzling.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agorcu: optimize rcu_read_lock
Paolo Bonzini [Wed, 16 Dec 2015 11:32:22 +0000 (12:32 +0100)]
rcu: optimize rcu_read_lock

rcu_read_lock cannot change rcu_gp_ongoing from true to false
(the previous value of p_rcu_reader->ctr is zero), hence
there is no need to check p_rcu_reader->waiting and wake up
a concurrent synchronize_rcu.

While at it mark the wakeup as unlikely in rcu_read_unlock.

Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1450265542-4323-1-git-send-email-pbonzini@redhat.com>

8 years agomemory: try to inline constant-length reads
Paolo Bonzini [Wed, 9 Dec 2015 09:34:13 +0000 (10:34 +0100)]
memory: try to inline constant-length reads

memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: inline a few small accessors
Paolo Bonzini [Wed, 9 Dec 2015 16:47:39 +0000 (17:47 +0100)]
memory: inline a few small accessors

These are used in the address_space_* fast paths.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: extract first iteration of address_space_read and address_space_write
Paolo Bonzini [Wed, 9 Dec 2015 09:18:57 +0000 (10:18 +0100)]
memory: extract first iteration of address_space_read and address_space_write

We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy.  As a start, extract
everything after the first address_space_translate call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: split address_space_read and address_space_write
Paolo Bonzini [Wed, 9 Dec 2015 09:06:31 +0000 (10:06 +0100)]
memory: split address_space_read and address_space_write

Rather than dispatching on is_write for every iteration, make
address_space_rw call one of the two functions.  The amount of
duplicate logic is pretty small, and memory_access_is_direct can
be tweaked so that it inlines better in the callers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: avoid unnecessary object_ref/unref
Paolo Bonzini [Wed, 9 Dec 2015 10:44:25 +0000 (11:44 +0100)]
memory: avoid unnecessary object_ref/unref

For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref.  Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: reorder MemoryRegion fields
Paolo Bonzini [Wed, 9 Dec 2015 10:40:14 +0000 (11:40 +0100)]
memory: reorder MemoryRegion fields

Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr
Paolo Bonzini [Wed, 16 Dec 2015 09:31:26 +0000 (10:31 +0100)]
exec: make qemu_ram_ptr_length more similar to qemu_get_ram_ptr

Notably, use qemu_get_ram_block to enjoy the MRU optimization.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: always call qemu_get_ram_ptr within rcu_read_lock
Paolo Bonzini [Wed, 16 Dec 2015 09:30:47 +0000 (10:30 +0100)]
exec: always call qemu_get_ram_ptr within rcu_read_lock

Simplify the code and document the assumption.  The only caller
that is not within rcu_read_lock is memory_region_get_ram_ptr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agolinux-user: convert DEBUG_SIGNAL logging to tracepoints
Paolo Bonzini [Fri, 13 Nov 2015 12:52:21 +0000 (13:52 +0100)]
linux-user: convert DEBUG_SIGNAL logging to tracepoints

"Unimplemented" messages go to stderr, everything else goes to tracepoints

Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agolinux-user: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:20:35 +0000 (13:20 +0100)]
linux-user: avoid "naked" qemu_log

Ensure that all log writes are protected by qemu_loglevel_mask or,
in serious cases, go to both the log and stderr.

Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agouser: introduce "-d page"
Paolo Bonzini [Fri, 13 Nov 2015 11:32:19 +0000 (12:32 +0100)]
user: introduce "-d page"

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoxtensa: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:43:35 +0000 (13:43 +0100)]
xtensa: avoid "naked" qemu_log

Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agotricore: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:35:27 +0000 (13:35 +0100)]
tricore: avoid "naked" qemu_log

Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoppc: cleanup logging
Paolo Bonzini [Fri, 13 Nov 2015 12:34:23 +0000 (13:34 +0100)]
ppc: cleanup logging

Avoid "naked" qemu_log, bring documentation for DEBUG #defines
up to date.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agos390x: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:25:21 +0000 (13:25 +0100)]
s390x: avoid "naked" qemu_log

Convert to debug-only qemu_log.

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomicroblaze: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:24:57 +0000 (13:24 +0100)]
microblaze: avoid "naked" qemu_log

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agocris: avoid "naked" qemu_log
Paolo Bonzini [Fri, 13 Nov 2015 12:24:26 +0000 (13:24 +0100)]
cris: avoid "naked" qemu_log

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoalpha: convert "naked" qemu_log to tracepoint
Paolo Bonzini [Fri, 13 Nov 2015 12:23:45 +0000 (13:23 +0100)]
alpha: convert "naked" qemu_log to tracepoint

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoqemu-log: introduce qemu_log_separate
Paolo Bonzini [Fri, 13 Nov 2015 12:16:27 +0000 (13:16 +0100)]
qemu-log: introduce qemu_log_separate

In some cases, the same message is printed both on stderr and in the log.
Avoid duplicate output in the default case where stderr _is_ the log,
and standardize this to stderr+log where it used to use stdio+log.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoqemu-char: append opt to stop truncation of serial file
Olga Krishtal [Fri, 4 Dec 2015 06:42:04 +0000 (09:42 +0300)]
qemu-char: append opt to stop truncation of serial file

Our QA team wants to preserve serial output of the guest in between QEMU
runs to perform post-analysis.

By default this behavior is off (file is truncated each time QEMU is
started or device is plugged).

Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1449211324-17856-1-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agokvm: x86: add support for KVM_CAP_SPLIT_IRQCHIP
Paolo Bonzini [Thu, 17 Dec 2015 16:16:08 +0000 (17:16 +0100)]
kvm: x86: add support for KVM_CAP_SPLIT_IRQCHIP

This patch adds support for split IRQ chip mode. When
KVM_CAP_SPLIT_IRQCHIP is enabled:

    1.) The PIC, PIT, and IOAPIC are implemented in userspace while
    the LAPIC is implemented by KVM.

    2.) The software IOAPIC delivers interrupts to the KVM LAPIC via
    kvm_set_irq. Interrupt delivery is configured via the MSI routing
    table, for which routes are reserved in target-i386/kvm.c then
    configured in hw/intc/ioapic.c

    3.) KVM delivers IOAPIC EOIs via a new exit KVM_EXIT_IOAPIC_EOI,
    which is handled in target-i386/kvm.c and relayed to the software
    IOAPIC via ioapic_eoi_broadcast.

Signed-off-by: Matt Gingell <gingell@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agokvm: add support for -machine kernel_irqchip=split
Matt Gingell [Mon, 16 Nov 2015 18:03:06 +0000 (10:03 -0800)]
kvm: add support for -machine kernel_irqchip=split

This patch adds the initial plumbing for split IRQ chip mode via
KVM_CAP_SPLIT_IRQCHIP. In addition to option processing, a number of
kvm_*_in_kernel macros are defined to help clarify which component is
where.

Signed-off-by: Matt Gingell <gingell@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agotarget-i386/kvm: Hyper-V SynIC timers MSR's support
Andrey Smetanin [Wed, 25 Nov 2015 15:21:25 +0000 (18:21 +0300)]
target-i386/kvm: Hyper-V SynIC timers MSR's support

Hyper-V SynIC timers are host timers that are configurable
by guest through corresponding MSR's (HV_X64_MSR_STIMER*).
Guest setup and use fired by host events(SynIC interrupt
and appropriate timer expiration message) as guest clock
events.

The state of Hyper-V SynIC timers are stored in corresponding
MSR's. This patch seria implements such MSR's support and migration.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Denis V. Lunev <den@openvz.org>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: kvm@vger.kernel.org
Message-Id: <1448464885-8300-3-git-send-email-asmetanin@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agohw/misc: Hyper-V test device 'hyperv-testdev'
Andrey Smetanin [Tue, 10 Nov 2015 12:52:44 +0000 (15:52 +0300)]
hw/misc: Hyper-V test device 'hyperv-testdev'

'hyperv-testdev' will be used by kvm-unit-tests
to setup Hyper-V SynIC SINT's routing and to inject
Hyper-V SynIC SINT's.

Hyper-V test device is ISA type device that creates 0x3000
IO memory region and catches write access into it. Every
write operation data decoded into ctl code and parameters
for Hyper-V test device.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agotarget-i386/hyperv: Hyper-V SynIC SINT routing and vcpu exit
Andrey Smetanin [Tue, 10 Nov 2015 12:52:43 +0000 (15:52 +0300)]
target-i386/hyperv: Hyper-V SynIC SINT routing and vcpu exit

Hyper-V SynIC(synthetic interrupt controller) helpers for
Hyper-V SynIC irq routing setup, irq injection, irq ack
notifications event/message pages changes tracking for future use.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agokvm: Hyper-V SynIC irq routing support
Andrey Smetanin [Tue, 10 Nov 2015 12:52:42 +0000 (15:52 +0300)]
kvm: Hyper-V SynIC irq routing support

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agotarget-i386/kvm: Hyper-V SynIC MSR's support
Andrey Smetanin [Wed, 11 Nov 2015 10:18:38 +0000 (13:18 +0300)]
target-i386/kvm: Hyper-V SynIC MSR's support

This patch does Hyper-V Synthetic interrupt
controller(Hyper-V SynIC) MSR's support and
migration. Hyper-V SynIC is enabled by cpu's
'hv-synic' option.

This patch does not allow cpu creation if
'hv-synic' option specified but kernel
doesn't support Hyper-V SynIC.

Changes v3:
* removed 'msr_hv_synic_version' migration because
it's value always the same
* moved SynIC msr's initialization into kvm_arch_init_vcpu

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
CC: Marcelo Tosatti <mtosatti@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agolinux-headers: update from kvm/next
Paolo Bonzini [Tue, 15 Dec 2015 14:00:27 +0000 (15:00 +0100)]
linux-headers: update from kvm/next

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: Introduce 'x-disable-pcie' backword compatability property
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:32 +0000 (10:08 +0200)]
vmw_pvscsi: Introduce 'x-disable-pcie' backword compatability property

Following the previous patch which changed pvscsi to be a pci express
device, this patch introduces a boolean property 'x-disable-pcie'.

Its default value is false, exposing pvscsi as a pcie device.

Setting 'x-disable-pcie' to 'on' preserves the old 'pci device' (non
express) behavior. This allows migration to older versions.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-7-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: The pvscsi device is a PCIE endpoint
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:31 +0000 (10:08 +0200)]
vmw_pvscsi: The pvscsi device is a PCIE endpoint

Report the 'express endpoint' capability if on a PCIE bus.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-6-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: coding: Introduce PVSCSIClass
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:30 +0000 (10:08 +0200)]
vmw_pvscsi: coding: Introduce PVSCSIClass

Introduce a class type for pvscsi, and the usual
DEVICE_CLASS/DEVICE_GET_CLASS macros.

No semantic change.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-5-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: Introduce 'x-old-pci-configuration' backword compatability property
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:29 +0000 (10:08 +0200)]
vmw_pvscsi: Introduce 'x-old-pci-configuration' backword compatability property

Following the previous patches, which introduced various changes in
pvscsi's pci configuration space (device subsystem id and revision, msi
offset), this patch introduces a boolean property
'x-old-pci-configuration' to pvscsi.

Its default value is false, exposing the above changes in the pci config
space.

Setting 'x-old-pci-configuration' to 'on' preserves the old behavior,
which allows migration to older versions.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-4-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: Change offset of msi pci capability
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:28 +0000 (10:08 +0200)]
vmw_pvscsi: Change offset of msi pci capability

Place device reported MSI capability at the same offset as placed by
the VMware virtual hardware - at offset 0x7c.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-3-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agovmw_pvscsi: Set device subsystem and revision
Shmulik Ladkani [Sun, 13 Dec 2015 08:08:27 +0000 (10:08 +0200)]
vmw_pvscsi: Set device subsystem and revision

To be VMware PVSCSI SCSI Controller, rev 02.
As reported by the VMware virtual hardware.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-Id: <1449994112-7054-2-git-send-email-shmulik.ladkani@ravellosystems.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: emulate ioeventfd
Pavel Fedin [Fri, 20 Nov 2015 09:37:16 +0000 (12:37 +0300)]
memory: emulate ioeventfd

The ioeventfd mechanism is used by vhost, dataplane, and virtio-pci to
turn guest MMIO/PIO writes into eventfd file descriptor events.  This
allows arbitrary threads to be notified when the guest writes to a
specific MMIO/PIO address.

qtest and TCG do not support ioeventfd because memory writes are not
checked against registered ioeventfds in QEMU.  This patch implements
this in memory_region_dispatch_write() so qtest can use ioeventfd.

Also this patch fixes vhost aborting on some misconfigured old kernels
like 3.18.0 on ARM. It is possible to explicitly enable CONFIG_EVENTFD
in expert settings, while MMIO binding support in KVM will still be
missing.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Message-Id: <006e01d12377$0b9c2d40$22d487c0$@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agokvm-all: PAGE_SIZE should be real host page size
Andrew Jones [Tue, 10 Nov 2015 00:23:42 +0000 (19:23 -0500)]
kvm-all: PAGE_SIZE should be real host page size

Just noticed this while grepping TARGET_PAGE_SIZE for an unrelated
reason. I didn't use qemu_real_host_page_size as kvm_set_phys_mem()
does, because we'd need to make sure page_size_init() has run first.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1447115022-4142-1-git-send-email-drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: Remove unnecessary RAM_FILE flag
Eduardo Habkost [Fri, 6 Nov 2015 22:11:21 +0000 (20:11 -0200)]
exec: Remove unnecessary RAM_FILE flag

The only code that sets RAMBlock.fd is file_ram_alloc(), and the only
code that calls file_ram_alloc() sets the RAM_FILE flag. That means the
flag is always set when RAMBlock.fd >= 0, and the munmap() call at
reclaim_ramblock() is dead code that never runs.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446847881-9385-1-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agomemory: Eliminate memory_region_destructor_ram_from_ptr()
Eduardo Habkost [Fri, 6 Nov 2015 21:20:05 +0000 (19:20 -0200)]
memory: Eliminate memory_region_destructor_ram_from_ptr()

The function is equivalent to memory_region_destructor_ram(), so
it's not needed anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446844805-14492-3-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoexec: Eliminate qemu_ram_free_from_ptr()
Eduardo Habkost [Fri, 6 Nov 2015 21:20:04 +0000 (19:20 -0200)]
exec: Eliminate qemu_ram_free_from_ptr()

Replace qemu_ram_free_from_ptr() with qemu_ram_free().

The only difference between qemu_ram_free_from_ptr() and
qemu_ram_free() is that g_free_rcu() is used instead of
call_rcu(reclaim_ramblock). We can safely replace it because:

* RAM blocks allocated by qemu_ram_alloc_from_ptr() always have
  RAM_PREALLOC set;
* reclaim_ramblock(block) will do nothing except g_free(block)
  if RAM_PREALLOC is set at block->flags.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446844805-14492-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8 years agoMerge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20151217-1' into...
Peter Maydell [Thu, 17 Dec 2015 13:38:34 +0000 (13:38 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20151217-1' into staging

target-arm queue:
 * i.MX CCM patches
 * support guest debug for AArch64 KVM
 * support power button on virt board via GPIO
 * clean up AArch32 singlestep code
 * raise exception on misaligned LDREX operands
 * soc-dma: use hwaddr instead of target_ulong in printf
 * explicitly mark some ARM device loads as little-endian
 * i.MX: add support for lower and upper interrupt in GPIO

# gpg: Signature made Thu 17 Dec 2015 13:38:09 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20151217-1: (25 commits)
  i.MX: Add an i.MX25 specific CCM class/instance
  i.MX: Split the CCM class into an abstract base class and a concrete class
  i.MX: rename i.MX CCM get_clock() function and CLK ID enum names
  i.MX: Fix i.MX31 default/reset configuration
  tests/guest-debug: introduce basic gdbstub tests
  target-arm: kvm - re-inject guest debug exceptions
  target-arm: kvm - add support for HW assisted debug
  target-arm: kvm - support for single step
  target-arm: kvm - implement software breakpoints
  target-arm: kvm64 - introduce kvm_arm_init_debug()
  ARM: Virt: Add gpio-keys node for Poweroff using DT
  ARM: Virt: Add QEMU powerdown notifier and hook it to GPIO Pin 3
  ARM: ACPI: Add _E03 for Power Button
  ACPI: Add aml_gpio_int() wrapper for GPIO Interrupt Connection
  ACPI: Add GPIO Connection Descriptor
  ARM: ACPI: Add power button device in ACPI DSDT table
  ARM: ACPI: Add GPIO controller in ACPI DSDT table
  ARM: Virt: Add a GPIO controller
  acpi: extend aml_interrupt() to support multiple irqs
  acpi: support serialized method
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoi.MX: Add an i.MX25 specific CCM class/instance
Jean-Christophe Dubois [Thu, 17 Dec 2015 13:37:16 +0000 (13:37 +0000)]
i.MX: Add an i.MX25 specific CCM class/instance

With this CCM, i.MX25 timer is accurate with "real world time".

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 2c0cf90be767bfc8520661eca891ab22c61f18fe.1449528242.git.jcd@tribudubois.net
Reviewed-by Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoi.MX: Split the CCM class into an abstract base class and a concrete class
Jean-Christophe Dubois [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
i.MX: Split the CCM class into an abstract base class and a concrete class

The IMX_CCM class is now the base abstract class that is used by EPIT
and GPT timer implementation.

IMX31_CCM class is the concrete class implementing CCM for i.MX31 SOC.

For now the i.MX25 continues to use the i.MX31 CCM implementation.

An i.MX25 specific CCM will be introduced in a later patch.

We also rework initialization to stop using deprecated sysbus device init.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: fd3c7f87b50f5ebc99ec91f01413db35017f116d.1449528242.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoi.MX: rename i.MX CCM get_clock() function and CLK ID enum names
Jean-Christophe Dubois [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
i.MX: rename i.MX CCM get_clock() function and CLK ID enum names

This is to prepare for CCM code refactoring.

This is just a bit of function and enum values renaming.

We also remove some useless intermediate variables.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 53c4d9b9611988a5f56f178f285e04490747925e.1449528242.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoi.MX: Fix i.MX31 default/reset configuration
Jean-Christophe Dubois [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
i.MX: Fix i.MX31 default/reset configuration

Linux on i.MX31/KZM is expecting the CCM to use the CKIH ref clock
instead of the CKIL plus the FPM multiplier.

We change the CCMR reg reset value to match linux expected config.

This allows the CCM to provide a 39MHz clk (as expected by linux)
instead of the actual 50MHz.

With this change the "sleep 60" command on linux is time accurate
with "real world time".

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 6dc5bc4e0a450b20cecdb2991112e7281b653345.1449528242.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotests/guest-debug: introduce basic gdbstub tests
Alex Bennée [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
tests/guest-debug: introduce basic gdbstub tests

The aim of these tests is to combine with an appropriate kernel
image (with symbol-file vmlinux) and check it behaves as it should.
Given a kernel it checks:

  - single step
  - software breakpoint
  - hardware breakpoint
  - access, read and write watchpoints

On success it returns 0 to the calling process.

I've not plumbed this into the "make check" logic though as we need a
solution for providing non-host binaries to the tests. However the test
is structured to work with pretty much any Linux kernel image as it
uses the basic kernel_init code which is common across architectures.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-7-git-send-email-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: kvm - re-inject guest debug exceptions
Alex Bennée [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
target-arm: kvm - re-inject guest debug exceptions

If we can't find details for the debug exception in our debug state
then we can assume the exception is due to debugging inside the guest.
To inject the exception into the guest state we re-use the TCG exception
code (do_interrupt).

However while guest debugging is in effect we currently can't handle the
guest using single step as we will keep trapping to back to userspace.
GDB makes heavy use of single-step behind the scenes which effectively
means the guest's ability to debug itself is disabled while it is being
debugged.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-6-git-send-email-alex.bennee@linaro.org
[PMM: Fixed a few typos in comments and commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: kvm - add support for HW assisted debug
Alex Bennée [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
target-arm: kvm - add support for HW assisted debug

This adds basic support for HW assisted debug. The ioctl interface to
KVM allows us to pass an implementation defined number of break and
watch point registers. When KVM_GUESTDBG_USE_HW is specified these
debug registers will be installed in place on the world switch into the
guest.

The hardware is actually capable of more advanced matching but it is
unclear if this expressiveness is available via the gdbstub protocol.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-5-git-send-email-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: kvm - support for single step
Alex Bennée [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
target-arm: kvm - support for single step

This adds support for single-step. There isn't much to do on the QEMU
side as after we set-up the request for single step via the debug ioctl
it is all handled within the kernel.

The actual setting of the KVM_GUESTDBG_SINGLESTEP flag is already in the
common code. If the kernel doesn't support guest debug the ioctl will
simply error.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-4-git-send-email-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: kvm - implement software breakpoints
Alex Bennée [Thu, 17 Dec 2015 13:37:15 +0000 (13:37 +0000)]
target-arm: kvm - implement software breakpoints

These don't involve messing around with debug registers, just setting
the breakpoint instruction in memory. GDB will not use this mechanism if
it can't access the memory to write the breakpoint.

All the kernel has to do is ensure the hypervisor traps the breakpoint
exceptions and returns to userspace.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-3-git-send-email-alex.bennee@linaro.org
[PMM: Fixed typo in comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: kvm64 - introduce kvm_arm_init_debug()
Alex Bennée [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
target-arm: kvm64 - introduce kvm_arm_init_debug()

As we haven't always had guest debug support we need to probe for it.
Additionally we don't do this in the start-up capability code so we
don't fall over on old kernels.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1449599553-24713-2-git-send-email-alex.bennee@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: Virt: Add gpio-keys node for Poweroff using DT
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ARM: Virt: Add gpio-keys node for Poweroff using DT

Add a gpio-keys node. This is used for Poweroff for the systems which
use DT not ACPI.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1449804086-3464-11-git-send-email-zhaoshenglong@huawei.com
[PMM: use "standard-headers/linux/input.h" rather than <linux/input.h>]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: Virt: Add QEMU powerdown notifier and hook it to GPIO Pin 3
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ARM: Virt: Add QEMU powerdown notifier and hook it to GPIO Pin 3

Currently mach-virt model doesn't support powerdown request. Guest VM
doesn't react to system_powerdown from monitor console (or QMP) because
there is no communication mechanism for such requests. This patch registers
GPIO Pin 3 with powerdown notification. So guest VM can receive notification
when such powerdown request is triggered.

Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1449804086-3464-10-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: ACPI: Add _E03 for Power Button
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ARM: ACPI: Add _E03 for Power Button

Here GPIO pin 3 is used for Power Button, add _E03 in ACPI DSDT table.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1449804086-3464-9-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoACPI: Add aml_gpio_int() wrapper for GPIO Interrupt Connection
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ACPI: Add aml_gpio_int() wrapper for GPIO Interrupt Connection

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1449804086-3464-8-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoACPI: Add GPIO Connection Descriptor
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ACPI: Add GPIO Connection Descriptor

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1449804086-3464-7-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: ACPI: Add power button device in ACPI DSDT table
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ARM: ACPI: Add power button device in ACPI DSDT table

Add power button device in ACPI DSDT table.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: Wei Huang <wei@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 1449804086-3464-6-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: ACPI: Add GPIO controller in ACPI DSDT table
Shannon Zhao [Thu, 17 Dec 2015 13:37:14 +0000 (13:37 +0000)]
ARM: ACPI: Add GPIO controller in ACPI DSDT table

Add GPIO controller in ACPI DSDT table. It can be used for gpio event.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1449804086-3464-5-git-send-email-zhaoshenglong@huawei.com
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoARM: Virt: Add a GPIO controller
Shannon Zhao [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
ARM: Virt: Add a GPIO controller

ACPI 5.0 supports GPIO-signaled ACPI Events. This can be used for
powerdown, hotplug evnets. Add a GPIO controller in machine virt,
to support powerdown, maybe can be used for cpu hotplug. And
here we use pl061.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Wei Huang <wei@redhat.com>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1449804086-3464-4-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoacpi: extend aml_interrupt() to support multiple irqs
Igor Mammedov [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
acpi: extend aml_interrupt() to support multiple irqs

ASL Interrupt() macro translates to Extended Interrupt Descriptor
which supports variable number of IRQs. It will be used for
conversion of ASL code for pc/q35 machines that use it for
returning several IRQs in _PSR object.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1449804086-3464-3-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoacpi: support serialized method
Xiao Guangrong [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
acpi: support serialized method

Add serialized method support so that explicit Mutex can be
avoided

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1449804086-3464-2-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: Fix and improve AA32 singlestep translation completion code
Sergey Fedorov [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
target-arm: Fix and improve AA32 singlestep translation completion code

The AArch32 translation completion code for singlestep enabled/active
case was a way more confusing and too repetitive then it needs to be.
Probably that was the cause for a bug to be introduced into it at some
point. The bug was that SWI/HVC/SMC exception would be generated in
condition-failed instruction code path whereas it shouldn't.

This patch rewrites the code in a way similar to the non-singlestep
case.

In the condition-passed/unconditional instruction code path we need to:
 - Write the condexec bits back to the CPU state
 - Advance the singlestep state machine and generate a corresponding
   exception in case of SWI/HVC/SMC
 - Write the PC back to the CPU state if it hasn't already been written
   and generate an appropriate singlestep exception otherwise

In the condition-failed instruction code path we need to:
 - Set a TCG label to jump to it if the condition is failed
 - Write the condexec bits back to the CPU state
 - Write the PC back to the CPU state since it hasn't been written in
   this case
 - Generate an appropriate singlestep exception

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1448474560-22475-1-git-send-email-serge.fdrv@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agotarget-arm: raise exception on misaligned LDREX operands
Andrew Baumann [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
target-arm: raise exception on misaligned LDREX operands

Qemu does not generally perform alignment checks. However, the ARM ARM
requires implementation of alignment exceptions for a number of cases
including LDREX, and Windows-on-ARM relies on this.

This change adds plumbing to enable alignment checks on loads using
MO_ALIGN, a do_unaligned_access hook to raise the exception (data
abort), and uses the new aligned loads in LDREX (for all but
single-byte loads).

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1449167808-5656-1-git-send-email-Andrew.Baumann@microsoft.com
[PMM: set WnR bits in syndrome and FSR as appropriate]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoarm: soc-dma: use hwaddr instead of target_ulong in printf
Paolo Bonzini [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
arm: soc-dma: use hwaddr instead of target_ulong in printf

This is a first baby step towards removing widespread inclusion of
cpu.h and compiling more devices once (so that arm, aarch64 and
in the future target-multi can share the object files).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: changed __FUNCTION__ to __func__ since we're touching
 these lines of code anyway]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
8 years agoarm: explicitly mark device loads as little-endian
Paolo Bonzini [Thu, 17 Dec 2015 13:37:13 +0000 (13:37 +0000)]
arm: explicitly mark device loads as little-endian

Behaviour of emulated devices should not depend on the endianness
of the CPU, so avoid using the endian-dependent load and store
functions in the PXA2xx and OMAP display devices. These devices
are little endian when they do DMA access.

(Since ARM softmmu is always compiled as little endian, this means
that the endian-dependent load and store functions are always little
endian, so this commit makes no functionally visible change.)

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: rewrote commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>