mirror_qemu.git
4 years ago Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Peter Maydell [Tue, 4 Jun 2019 16:22:42 +0000 (17:22 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- block: AioContext management, part 2
- Avoid recursive block_status call (i.e. lseek() calls) if possible
- linux-aio: Drop unused BlockAIOCB submission method
- nvme: add Get/Set Feature Timestamp support
- Fix crash on commit job start with active I/O on base node
- Fix crash in bdrv_drained_end
- Fix integer overflow in qcow2 discard

# gpg: Signature made Tue 04 Jun 2019 16:20:02 BST
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (29 commits)
  iotests: Fix duplicated diff output on failure
  iotests: test big qcow2 shrink
  block/io: bdrv_pdiscard: support int64_t bytes parameter
  block/qcow2-refcount: add trace-point to qcow2_process_discards
  block: Remove bdrv_set_aio_context()
  test-bdrv-drain: Use bdrv_try_set_aio_context()
  iotests: Attach new devices to node in non-default iothread
  virtio-scsi-test: Test attaching new overlay with iothreads
  block: Remove wrong bdrv_set_aio_context() calls
  blockdev: Use bdrv_try_set_aio_context() for monitor commands
  block: Move node without parents to main AioContext
  test-block-iothread: BlockBackend AioContext across root node change
  test-block-iothread: Test adding parent to iothread node
  block: Adjust AioContexts when attaching nodes
  scsi-disk: Use qdev_prop_drive_iothread
  block: Add qdev_prop_drive_iothread property type
  block: Add BlockBackend.ctx
  block: Add Error to blk_set_aio_context()
  nbd-server: Call blk_set_allow_aio_context_change()
  test-block-iothread: Check filter node in test_propagate_mirror
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years ago iotests: Fix duplicated diff output on failure
Kevin Wolf [Mon, 3 Jun 2019 13:43:20 +0000 (15:43 +0200)]
iotests: Fix duplicated diff output on failure

Commit 70ff5b07 wanted to move the diff between actual and reference
output to the end after printing the test result line. It really only
copied it, though, so the diff is now displayed twice. Remove the old
one.

Fixes: 70ff5b07fcdd378180ad2d5cc0b0d5e67e7ef325
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago iotests: test big qcow2 shrink
Vladimir Sementsov-Ogievskiy [Tue, 4 Jun 2019 12:39:48 +0000 (15:39 +0300)]
iotests: test big qcow2 shrink

This test checks for the bug in qcow2_process_discards that was fixed
by the previous commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block/io: bdrv_pdiscard: support int64_t bytes parameter
Vladimir Sementsov-Ogievskiy [Tue, 23 Apr 2019 12:57:05 +0000 (15:57 +0300)]
block/io: bdrv_pdiscard: support int64_t bytes parameter

This fixes at least one overflow in qcow2_process_discards, which
passes a 64-bit region length to bdrv_pdiscard, whose bytes (formerly
sectors) parameter has been an int since its introduction in 0b919fae.
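
To illustrate the kind of overflow involved, here is a minimal Python
sketch that emulates passing a 64-bit length through a 32-bit signed
parameter. The helper to_c_int is hypothetical and only models the
two's-complement truncation seen on typical platforms; it is not QEMU
code.

```python
def to_c_int(value: int) -> int:
    """Emulate storing a 64-bit length in a 32-bit signed C int (hypothetical helper)."""
    value &= 0xFFFFFFFF                      # keep only the low 32 bits
    # Reinterpret the top bit as the sign bit, as two's complement does.
    return value - (1 << 32) if value & 0x80000000 else value


GiB = 1 << 30
region_length = 5 * GiB                      # a discard region longer than INT_MAX bytes

print(to_c_int(region_length))               # length an int parameter would receive
print(region_length)                         # length an int64_t parameter preserves
```

A 5 GiB region silently shrinks to 1 GiB in this model, and a 3 GiB
region would even turn negative, which is why widening the parameter
matters for large discards.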

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block/qcow2-refcount: add trace-point to qcow2_process_discards
Vladimir Sementsov-Ogievskiy [Tue, 23 Apr 2019 12:57:04 +0000 (15:57 +0300)]
block/qcow2-refcount: add trace-point to qcow2_process_discards

Let's at least trace the ignored failure.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Remove bdrv_set_aio_context()
Kevin Wolf [Tue, 7 May 2019 16:31:38 +0000 (18:31 +0200)]
block: Remove bdrv_set_aio_context()

All callers of bdrv_set_aio_context() are eliminated now; they have
moved to bdrv_try_set_aio_context() and related safe functions. Remove
bdrv_set_aio_context().

With this, we can now know that the .set_aio_ctx callback must be
present in bdrv_set_aio_context_ignore() because
bdrv_can_set_aio_context() would have returned false previously, so
instead of checking the condition, we can assert it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago test-bdrv-drain: Use bdrv_try_set_aio_context()
Kevin Wolf [Tue, 7 May 2019 16:22:10 +0000 (18:22 +0200)]
test-bdrv-drain: Use bdrv_try_set_aio_context()

No reason to use the unchecked version in tests, even more so when these
are the last callers of bdrv_set_aio_context() outside of block.c.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago iotests: Attach new devices to node in non-default iothread
Kevin Wolf [Thu, 23 May 2019 14:09:19 +0000 (16:09 +0200)]
iotests: Attach new devices to node in non-default iothread

This tests that devices refuse to be attached to a node that has already
been moved to a different iothread if they can't be or aren't configured
to work in the same iothread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago virtio-scsi-test: Test attaching new overlay with iothreads
Kevin Wolf [Wed, 8 May 2019 09:58:45 +0000 (11:58 +0200)]
virtio-scsi-test: Test attaching new overlay with iothreads

This tests that blockdev-add can correctly add a qcow2 overlay to an
image used by a virtio-scsi disk in an iothread. The interesting point
here is whether the newly added node gets correctly moved into the
iothread AioContext.

If it isn't, we get an assertion failure in virtio-scsi while processing
the next request:

    virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' failed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Remove wrong bdrv_set_aio_context() calls
Kevin Wolf [Tue, 7 May 2019 16:19:16 +0000 (18:19 +0200)]
block: Remove wrong bdrv_set_aio_context() calls

The mirror and commit block jobs use bdrv_set_aio_context() to move
their filter node into the right AioContext before hooking it up in the
graph. Similarly, bdrv_open_backing_file() explicitly moves the backing
file node into the right AioContext first.

This isn't necessary any more; they now get moved into the right
context automatically when they are attached.

However, in the case of bdrv_open_backing_file() with a node reference,
it's actually not only unnecessary, but even wrong: The unchecked
bdrv_set_aio_context() changes the AioContext of the child node even if
other parents require it to retain the old context. So this is not only
a simplification, but a bug fix, too.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1684342
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago blockdev: Use bdrv_try_set_aio_context() for monitor commands
Kevin Wolf [Fri, 26 Apr 2019 14:12:27 +0000 (16:12 +0200)]
blockdev: Use bdrv_try_set_aio_context() for monitor commands

Monitor commands can handle errors, so they can easily be converted to
using the safer bdrv_try_set_aio_context() function.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Move node without parents to main AioContext
Kevin Wolf [Wed, 24 Apr 2019 16:04:42 +0000 (18:04 +0200)]
block: Move node without parents to main AioContext

A node should only be in a non-default AioContext if a user is attached
to it that requires this. When the last parent of a node is gone, it can
move back to the main AioContext.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago test-block-iothread: BlockBackend AioContext across root node change
Kevin Wolf [Wed, 24 Apr 2019 15:47:34 +0000 (17:47 +0200)]
test-block-iothread: BlockBackend AioContext across root node change

Test that BlockBackends preserve their assigned AioContext even when the
root node goes away. Inserting a new root node will move it to the right
AioContext.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago test-block-iothread: Test adding parent to iothread node
Kevin Wolf [Wed, 24 Apr 2019 15:49:33 +0000 (17:49 +0200)]
test-block-iothread: Test adding parent to iothread node

Opening a new parent node for a node that has already been moved into a
different AioContext must cause the new parent to move into the same
context.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Adjust AioContexts when attaching nodes
Kevin Wolf [Wed, 24 Apr 2019 15:41:46 +0000 (17:41 +0200)]
block: Adjust AioContexts when attaching nodes

So far, we only made sure that updating the AioContext of a node
affected the whole subtree. However, if a node is newly attached to a
new parent, we also need to make sure that both the subtree of the node
and the parent are in the same AioContext. This tries to move the new
child node to the parent AioContext and returns an error if this isn't
possible.

BlockBackends now actually apply their AioContext to their root node.
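
The attach rule described here can be sketched with a toy graph model.
This is an illustrative Python sketch, not QEMU code: the Node class,
the pinned flag and the attach() helper are hypothetical names, and
AioContexts are represented by plain strings.

```python
class Node:
    """Toy stand-in for a block node or one of its users (device, BlockBackend)."""

    def __init__(self, name, ctx="main", pinned=False):
        self.name = name
        self.ctx = ctx          # name of the AioContext the node runs in
        self.pinned = pinned    # True: this user cannot follow a context change
        self.parents = []
        self.children = []


def connected(node):
    """Return every node reachable from `node` through parent or child links."""
    seen, stack = [], [node]
    while stack:
        n = stack.pop()
        if any(n is s for s in seen):
            continue
        seen.append(n)
        stack.extend(n.parents + n.children)
    return seen


def attach(parent, child):
    """Link child under parent, first moving the child's graph if that is allowed."""
    if child.ctx != parent.ctx:
        group = connected(child)
        blockers = [n.name for n in group if n.pinned]
        if blockers:
            raise RuntimeError(
                f"cannot move {child.name} to {parent.ctx}: "
                f"{', '.join(blockers)} must stay in {child.ctx}")
        for n in group:
            n.ctx = parent.ctx
    parent.children.append(child)
    child.parents.append(parent)


base = Node("base")
disk = Node("virtio-disk", ctx="iothread0", pinned=True)
attach(disk, base)                       # base follows its new parent
print(base.ctx)

try:
    attach(Node("second-dev"), base)     # would need base back in "main"
except RuntimeError as err:
    print(err)
```

In this model the first attach succeeds because nothing holds the node
in its old context, while the second one fails because an existing
user requires the node to stay where it is.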

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago scsi-disk: Use qdev_prop_drive_iothread
Kevin Wolf [Fri, 26 Apr 2019 17:29:47 +0000 (19:29 +0200)]
scsi-disk: Use qdev_prop_drive_iothread

This makes use of qdev_prop_drive_iothread for scsi-disk so that the
disk can be attached to a node that is already in the target AioContext.
We need to check that the HBA actually supports iothreads, otherwise
scsi-disk must make sure that the node is already in the main
AioContext.

This changes the error message for conflicting iothread settings.
Previously, virtio-scsi produced the error message, now it comes from
blk_set_aio_context(). Update a test case accordingly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Add qdev_prop_drive_iothread property type
Kevin Wolf [Mon, 29 Apr 2019 15:40:14 +0000 (17:40 +0200)]
block: Add qdev_prop_drive_iothread property type

Some qdev block devices have support for iothreads and take care of the
AioContext they are running in, but most devices don't know about any of
this. For the latter category, the qdev drive property must make sure
that their BlockBackend is in the main AioContext.

Unfortunately, while the current code just does the same thing for
devices that do support iothreads, this is not correct and it would show
as soon as we actually try to keep a consistent AioContext assignment
across all nodes and users of a block graph subtree: If a node is
already in a non-default AioContext because of one of its users,
attaching a new device should still be possible if that device can work
in the same AioContext. Switching the node back to the main context
first and only then into the device AioContext causes failure (because
the existing user wouldn't allow the switch to the main context).

So devices that support iothreads need a different kind of drive
property that leaves the node in its current AioContext, but by using
this type, the device promises to check later that it can work with this
context.

This patch adds the qdev infrastructure that allows devices to signal
that they handle iothreads and qdev should leave the AioContext alone.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Add BlockBackend.ctx
Kevin Wolf [Thu, 25 Apr 2019 12:25:10 +0000 (14:25 +0200)]
block: Add BlockBackend.ctx

This adds a new parameter to blk_new() which requires its callers to
declare from which AioContext this BlockBackend is going to be used (or
the locks of which AioContext need to be taken anyway).

The given context is only stored and kept up to date when changing
AioContexts. Actually applying the stored AioContext to the root node
is saved for another commit.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: Add Error to blk_set_aio_context()
Kevin Wolf [Thu, 2 May 2019 09:10:59 +0000 (11:10 +0200)]
block: Add Error to blk_set_aio_context()

Add an Error parameter to blk_set_aio_context() and use
bdrv_child_try_set_aio_context() internally to check whether all
involved nodes can actually support the AioContext switch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago nbd-server: Call blk_set_allow_aio_context_change()
Kevin Wolf [Fri, 24 May 2019 08:41:34 +0000 (10:41 +0200)]
nbd-server: Call blk_set_allow_aio_context_change()

The NBD server uses an AioContext notifier, so it can tolerate that its
BlockBackend is switched to a different AioContext. Before we start
actually calling bdrv_try_set_aio_context(), which checks for
consistency, outside of test cases, we need to make sure that the NBD
server actually allows this.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
4 years ago test-block-iothread: Check filter node in test_propagate_mirror
Kevin Wolf [Fri, 3 May 2019 11:20:54 +0000 (13:20 +0200)]
test-block-iothread: Check filter node in test_propagate_mirror

Just make the test cover the AioContext of the filter node as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
4 years ago nvme: add Get/Set Feature Timestamp support
Kenneth Heitke [Mon, 20 May 2019 17:40:30 +0000 (11:40 -0600)]
nvme: add Get/Set Feature Timestamp support

Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com>
Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block/linux-aio: Drop unused BlockAIOCB submission method
Julia Suvorova [Sun, 2 Jun 2019 20:17:09 +0000 (23:17 +0300)]
block/linux-aio: Drop unused BlockAIOCB submission method

Callback-based laio_submit() and laio_cancel() were left in place after
the Linux AIO backend was rewritten to use coroutines, in the hope that
they would be used in other code that could bypass coroutines. They can
be safely removed because they have not been used since that time.

Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago iotests: Test cancelling a job and closing the VM
Max Reitz [Wed, 22 May 2019 14:40:37 +0000 (16:40 +0200)]
iotests: Test cancelling a job and closing the VM

This patch adds a test where we cancel a throttled mirror job and
immediately close the VM before it can be cancelled.  Doing so will
invoke bdrv_drain_all() while the mirror job tries to drain the
throttled node.  When bdrv_drain_all_end() tries to lift its drain on
the throttle node, the job will exit and replace the current root node
of the BB drive0 (which is the job's filter node) by the throttle node.
Before the previous patch, this replacement did not increase drive0's
quiesce_counter by a sufficient amount, so when
bdrv_parent_drained_end() (invoked by bdrv_do_drained_end(), invoked by
bdrv_drain_all_end()) tried to end the drain on all of the throttle
node's parents, it decreased drive0's quiesce_counter below 0 -- which
fails an assertion.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block/io: Delay decrementing the quiesce_counter
Max Reitz [Wed, 22 May 2019 14:40:36 +0000 (16:40 +0200)]
block/io: Delay decrementing the quiesce_counter

When ending a drained section, bdrv_do_drained_end() currently first
decrements the quiesce_counter, and only then actually ends the drain.

The bdrv_drain_invoke(bs, false) call may cause graph changes.  Say the
graph change involves replacing an existing BB's ("blk") BDS
(blk_bs(blk)) by @bs.  Let us introduce the following values:
- bs_oqc = old_quiesce_counter
  (so bs->quiesce_counter == bs_oqc - 1)
- obs_qc = blk_bs(blk)->quiesce_counter (before bdrv_drain_invoke())

Let us assume there is no blk_pread_unthrottled() involved, so
blk->quiesce_counter == obs_qc (before bdrv_drain_invoke()).

Now replacing blk_bs(blk) by @bs will reduce blk->quiesce_counter by
obs_qc (making it 0) and increase it by bs_oqc-1 (making it bs_oqc-1).

bdrv_drain_invoke() returns and we invoke bdrv_parent_drained_end().
This will decrement blk->quiesce_counter by one, so it would be -1 --
were there not an assertion against that in blk_root_drained_end().

We therefore have to keep the quiesce_counter up at least until
bdrv_drain_invoke() returns, so that bdrv_parent_drained_end() does the
right thing for the parents @bs got during bdrv_drain_invoke().

But let us delay it even further, namely until bdrv_parent_drained_end()
returns, because then it mirrors bdrv_do_drained_begin(): There, we
first increment the quiesce_counter, then begin draining the parents,
and then call bdrv_drain_invoke().  It makes sense to let
bdrv_do_drained_end() unravel this exactly in reverse.
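
The counter arithmetic above can be replayed in a few lines. This is a
toy Python model, not QEMU code: end_drain and its parameters are
hypothetical names, the graph change is reduced to the two counter
adjustments described in this message, and it assumes that no
blk_pread_unthrottled() is involved.

```python
def end_drain(bs_oqc, obs_qc, delay_decrement):
    """Replay the counter updates of one drained_end on @bs (toy model).

    bs_oqc: quiesce_counter of @bs before the drain ends
    obs_qc: quiesce_counter of blk's old root node (equal to blk's counter)
    delay_decrement: False = old order, True = order after this patch
    """
    bs_qc = bs_oqc
    blk_qc = obs_qc

    if not delay_decrement:
        bs_qc -= 1              # old order: decrement before ending the drain

    # Graph change during bdrv_drain_invoke(): blk's root is replaced by @bs.
    blk_qc -= obs_qc            # drop the drains inherited from the old root
    blk_qc += bs_qc             # inherit the drains currently counted on @bs

    blk_qc -= 1                 # bdrv_parent_drained_end() on the parents of @bs

    if delay_decrement:
        bs_qc -= 1              # new order: decrement last

    return bs_qc, blk_qc


print(end_drain(1, 1, delay_decrement=False))   # old order
print(end_drain(1, 1, delay_decrement=True))    # new order
```

With bs_oqc = obs_qc = 1, the old order leaves blk's counter at -1,
the value that the assertion in blk_root_drained_end() rejects, while
the delayed decrement ends with both counters at 0.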

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago block: avoid recursive block_status call if possible
Vladimir Sementsov-Ogievskiy [Mon, 8 Apr 2019 16:26:17 +0000 (19:26 +0300)]
block: avoid recursive block_status call if possible

Since 5daa74a6ebc, bdrv_co_block_status digs into bs->file for an
additional, more accurate search for holes inside a region reported as
DATA by bs.

This accuracy is not free: assume we have a qcow2 disk. qcow2 itself
knows where the holes are and where the data is. But every block_status
request additionally calls lseek. Assume a big disk, full of
data: in any iterative copying block job (or img convert) we'll call
lseek(HOLE) on every iteration, and each of these lseeks will have to
iterate through all metadata up to the end of file. This is obviously
inefficient behavior. And for many scenarios we don't need this lseek
at all.

However, lseek is needed when we have a metadata-preallocated image.

So, let's detect the metadata-preallocation case and not dig into
qcow2's protocol file in other cases.

The idea is to compare the allocation size from the point of view of
the filesystem with the allocation size from the point of view of qcow2
(by refcounts). If the allocation in the fs is significantly lower,
consider it the metadata-preallocation case.
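
The comparison can be sketched as follows. This is a hedged Python
illustration of the idea only: the function name, its parameters and
the 0.9 threshold are assumptions made for the example, not the values
used by qcow2_detect_metadata_preallocation().

```python
def looks_metadata_preallocated(fs_allocated: int, qcow2_allocated: int,
                                threshold: float = 0.9) -> bool:
    """Guess whether an image is metadata-preallocated (illustrative heuristic).

    fs_allocated:    bytes the filesystem reports as allocated for the file
    qcow2_allocated: bytes qcow2 considers allocated according to its refcounts
    threshold:       hypothetical cut-off for "significantly lower"
    """
    if qcow2_allocated == 0:
        return False                      # nothing allocated, nothing to compare
    return fs_allocated < qcow2_allocated * threshold


GiB = 1 << 30
MiB = 1 << 20

# Refcounts claim 10 GiB, but the file only occupies 2 MiB on disk:
# the clusters exist only as metadata, so the lseek is still useful.
print(looks_metadata_preallocated(2 * MiB, 10 * GiB))

# A fully written image: both views agree, so the lseek can be skipped.
print(looks_metadata_preallocated(10 * GiB, 10 * GiB))
```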

iotest 102 changed, as our detector can't detect a shrunk file as
metadata-preallocation, which doesn't seem to be wrong, as with metadata
preallocation we always have a valid file length.

Two other iotests have a slight change in their QMP output sequence:
Active 'block-commit' returns earlier because the job coroutine yields
earlier on a blocking operation. This operation is loading the refcount
blocks in qcow2_detect_metadata_preallocation().

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago tests/perf: Test lseek influence on qcow2 block-status
Vladimir Sementsov-Ogievskiy [Mon, 8 Apr 2019 16:26:16 +0000 (19:26 +0300)]
tests/perf: Test lseek influence on qcow2 block-status

The block layer may recursively check block_status in the file child of
qcow2 if the qcow2 driver returned DATA. There are several test cases to
check the influence of lseek on block_status performance. To see a real
difference, run on tmpfs.

The tests were originally created by Kevin; I just refactored them and
put them together into one executable file with simple output.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago blockdev: fix missed target unref for drive-backup
John Snow [Mon, 13 May 2019 15:06:38 +0000 (11:06 -0400)]
blockdev: fix missed target unref for drive-backup

If the bitmap can't be used for whatever reason, we skip putting down
the reference. Fix that.

In practice, this means that if you attempt to gracefully exit QEMU
after a backup command has been rejected, bdrv_close_all will fail and
tell you some unpleasant things via assert().

Reported-by: aihua liang <aliang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1703916
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
4 years ago iotests: Test commit job start with concurrent I/O
Kevin Wolf [Tue, 21 May 2019 18:35:52 +0000 (20:35 +0200)]
iotests: Test commit job start with concurrent I/O

This tests that concurrent requests are correctly drained before making
graph modifications instead of running into assertions in
bdrv_replace_node().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
4 years ago block: Drain source node in bdrv_replace_node()
Kevin Wolf [Tue, 21 May 2019 17:00:25 +0000 (19:00 +0200)]
block: Drain source node in bdrv_replace_node()

Instead of just asserting that no requests are in flight in
bdrv_replace_node(), which is a requirement that most callers ignore, we
can just drain the source node right there. This fixes at least starting
a commit job while I/O is active on the backing chain, but probably
other callers, too.

Having requests in flight on the target node isn't a problem because the
target just gets new parents, but the call path of running requests
isn't modified. So we can just drop this assertion without a replacement.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1711643
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
4 years ago Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Peter Maydell [Mon, 3 Jun 2019 17:26:21 +0000 (18:26 +0100)]
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Revert q35 to kernel irqchip (Alex)
* edu device fixes (Li Qiang)
* cleanups (Marc-André, Peter)
* Improvements to -accel help
* Better support for IA32_MISC_ENABLE MSR (Wanpeng)
* I2C test conversion to qgraph (Paolo)

# gpg: Signature made Mon 03 Jun 2019 14:20:12 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (24 commits)
  q35: Revert to kernel irqchip
  configure: remove tpm_passthrough & tpm_emulator
  ci: store Patchew configuration in the tree
  libqos: i2c: move address into QI2CDevice
  tests: convert ds1338-test to qtest
  tests: convert OMAP i2c tests to qgraph
  libqos: add ARM imx25-pdk machine object
  libqos: add ARM n800 machine object
  libqos: convert I2C to qgraph
  libqos: split I2CAdapter initialization and allocation
  imx25-pdk: create ds1338 for qtest inside the test
  pca9552-test: do not rely on state across tests
  libqos: fix omap-i2c receiving more than 4 bytes
  libqos: move common i2c code to libqos
  qgraph: fix qos_node_contains with options
  qgraph: allow extra_device_opts on contains nodes
  edu: uses uint64_t in dma operation
  edu: mmio: allow 64-bit access in read dispatch
  edu: mmio: allow 64-bit access
  i386: Enable IA32_MISC_ENABLE MWAIT bit when exposing mwait/monitor
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years ago q35: Revert to kernel irqchip
Alex Williamson [Tue, 14 May 2019 20:14:41 +0000 (14:14 -0600)]
q35: Revert to kernel irqchip

Commit b2fc91db8447 ("q35: set split kernel irqchip as default") changed
the default for the pc-q35-4.0 machine type to use split irqchip, which
turned out to have disastrous effects on vfio-pci INTx support.  KVM
resampling irqfds are registered for handling these interrupts, but
these are non-functional in split irqchip mode.  We can't simply test
for split irqchip in QEMU as userspace handling of this interrupt is a
significant performance regression versus KVM handling (GeForce GPUs
assigned to Windows VMs are non-functional without forcing MSI mode or
re-enabling kernel irqchip).

The resolution is to revert the change in default irqchip mode in the
pc-q35-4.1 machine and create a pc-q35-4.0.1 machine for the 4.0-stable
branch.  The pc-q35-4.0 machine type should not be used in vfio-pci
configurations for devices requiring legacy INTx support without
explicitly modifying the VM configuration to use kernel irqchip.

Link: https://bugs.launchpad.net/qemu/+bug/1826422
Fixes: b2fc91db8447 ("q35: set split kernel irqchip as default")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <155786484688.13873.6037015630912983760.stgit@gimli.home>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago configure: remove tpm_passthrough & tpm_emulator
Marc-André Lureau [Fri, 24 May 2019 18:14:11 +0000 (20:14 +0200)]
configure: remove tpm_passthrough & tpm_emulator

This is a left-over from commit 7aaa6a16373 "tpm: express dependencies
with Kconfig".

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190524181411.8599-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago ci: store Patchew configuration in the tree
Paolo Bonzini [Fri, 15 Mar 2019 09:16:20 +0000 (10:16 +0100)]
ci: store Patchew configuration in the tree

Patchew cannot yet retrieve the configuration from the QEMU Git tree, but
this is planned.  In the meantime, let's start storing it as YAML
so that the Patchew configuration (currently accessible only to administrators)
is public and documented.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: i2c: move address into QI2CDevice
Paolo Bonzini [Mon, 18 Mar 2019 14:06:50 +0000 (15:06 +0100)]
libqos: i2c: move address into QI2CDevice

This removes the hardcoded I2C address from the tests.  The address
is passed via QOSGraphEdgeOptions to i2c_device_create and stored
in the QI2CDevice.

The i2c_send and i2c_recv functions, along with their wrappers,
therefore, can be changed to take a QI2CDevice rather than an
adapter/address pair.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago tests: convert ds1338-test to qtest
Paolo Bonzini [Mon, 18 Mar 2019 14:59:42 +0000 (15:59 +0100)]
tests: convert ds1338-test to qtest

This way, ds1338-test will run for every machine that exposes
an i2c-bus.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago tests: convert OMAP i2c tests to qgraph
Paolo Bonzini [Mon, 18 Mar 2019 13:07:13 +0000 (14:07 +0100)]
tests: convert OMAP i2c tests to qgraph

This way, pca9552-test and tmp105-test will run for every machine
that exposes an i2c-bus.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: add ARM imx25-pdk machine object
Paolo Bonzini [Mon, 18 Mar 2019 14:36:49 +0000 (15:36 +0100)]
libqos: add ARM imx25-pdk machine object

This is used to test imx_i2c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: add ARM n800 machine object
Paolo Bonzini [Mon, 18 Mar 2019 12:52:23 +0000 (13:52 +0100)]
libqos: add ARM n800 machine object

This is used to test omap_i2c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: convert I2C to qgraph
Paolo Bonzini [Mon, 18 Mar 2019 12:48:23 +0000 (13:48 +0100)]
libqos: convert I2C to qgraph

Create an i2c-bus interface, corresponding to the I2CAdapter struct.
Wrap IMXI2C and OMAPI2C with a QOSGraphObject, and add the get_driver
function to retrieve the I2CAdapter.

The conversion is still not complete; for simplicity, i2c_recv and
i2c_send (along with their wrappers) still take an adapter/address
pair.  Fixing that would be complicated until the tests are converted
to qgraph, so it is left for after the conversion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: split I2CAdapter initialization and allocation
Paolo Bonzini [Mon, 18 Mar 2019 16:12:25 +0000 (17:12 +0100)]
libqos: split I2CAdapter initialization and allocation

Provide *_init functions that populate an I2CAdapter struct without
allocating one, and make the existing *_create functions wrap them.

Because in the new setup *_create might return a pointer inside the
IMXI2C or OMAPI2C struct, create companion *_free functions to go
back to the outer pointer.

All this is temporary until allocation is handled entirely by
qgraph.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago imx25-pdk: create ds1338 for qtest inside the test
Paolo Bonzini [Mon, 18 Mar 2019 14:56:23 +0000 (15:56 +0100)]
imx25-pdk: create ds1338 for qtest inside the test

There is no need to have a test device created by the board.
Instead, create it in the qtest so that we will be able to run
it on other boards too.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago pca9552-test: do not rely on state across tests
Paolo Bonzini [Mon, 18 Mar 2019 13:56:21 +0000 (14:56 +0100)]
pca9552-test: do not rely on state across tests

receive_autoinc is relying on the LED state that is set by
send_and_receive.  Stop doing that, because qgraph resets the
machine between tests.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: fix omap-i2c receiving more than 4 bytes
Paolo Bonzini [Mon, 18 Mar 2019 15:49:59 +0000 (16:49 +0100)]
libqos: fix omap-i2c receiving more than 4 bytes

If more than 4 bytes are received, the FIFO cannot host the entire
contents of the transfer and STP will be nonzero before entering
the transfer loop.  Also, CNT will contain the number of bytes
left to be transferred instead of the total number of bytes in
the transfer.

(Reverse engineered from the omap_i2c.c source code; no available
datasheet).

This will fix ds1338-test for omap-i2c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago libqos: move common i2c code to libqos
Paolo Bonzini [Mon, 18 Mar 2019 14:09:51 +0000 (15:09 +0100)]
libqos: move common i2c code to libqos

The functions to read/write 8-bit or 16-bit registers are the same
in tmp105 and pca9552 tests, and in fact they are a special case of
"read block"/"write block" functionality; read block in turn is used
in ds1338-test.

Move everything inside libqos-test, removing the duplication.  Account
for the small differences by adding to tmp105-test.c the "read register
after writing" behavior that is specific to it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago qgraph: fix qos_node_contains with options
Paolo Bonzini [Mon, 18 Mar 2019 16:14:12 +0000 (17:14 +0100)]
qgraph: fix qos_node_contains with options

Currently, if qos_node_contains was passed options, it would still
create an edge without any options.  Instead, in that case
NULL acts as a terminator.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago qgraph: allow extra_device_opts on contains nodes
Paolo Bonzini [Mon, 18 Mar 2019 13:48:47 +0000 (14:48 +0100)]
qgraph: allow extra_device_opts on contains nodes

Allow choosing the bus that the device will be placed on, in case
the machine has more than one.  Otherwise, the bus may not match
the base address of the controller we attach it to.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago edu: uses uint64_t in dma operation
Li Qiang [Fri, 10 May 2019 16:43:49 +0000 (09:43 -0700)]
edu: uses uint64_t in dma operation

The DMA-related variables dma.dst/src/cnt are dma_addr_t, which is
uint64_t on x64 platforms. Change these usages from uint32_t to
uint64_t to avoid truncation in edu_dma_timer.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190510164349.81507-4-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago edu: mmio: allow 64-bit access in read dispatch
Li Qiang [Fri, 10 May 2019 16:43:48 +0000 (09:43 -0700)]
edu: mmio: allow 64-bit access in read dispatch

The edu spec says that when address >= 0x80, the MMIO area can
be accessed with 64-bit accesses.

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>
Message-Id: <20190510164349.81507-3-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years ago edu: mmio: allow 64-bit access
Li Qiang [Fri, 10 May 2019 16:43:47 +0000 (09:43 -0700)]
edu: mmio: allow 64-bit access

The edu spec says the MMIO area can be accessed with 64-bit accesses.
However, 'max_access_size' currently does not allow this, so the MMIO
access dispatch can only access 32 bits at a time. This patch fixes
this to respect the spec.

Signed-off-by: Li Qiang <liq3ea@163.com>
Reviewed-by: Philippe Mathieu-Daude <philmd@redhat.com>
Message-Id: <20190510164349.81507-2-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoi386: Enable IA32_MISC_ENABLE MWAIT bit when exposing mwait/monitor
Wanpeng Li [Tue, 14 May 2019 06:06:39 +0000 (14:06 +0800)]
i386: Enable IA32_MISC_ENABLE MWAIT bit when exposing mwait/monitor

The CPUID.01H:ECX[bit 3] ought to mirror the value of the MSR
IA32_MISC_ENABLE MWAIT bit and as userspace has control of them
both, it is userspace's job to configure both bits to match on
the initial setup.
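
A minimal sketch of keeping the two bits consistent, assuming the MWAIT
enable bit is bit 18 of IA32_MISC_ENABLE. The function names are
hypothetical and this is not the code from target/i386:

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_01_ECX_MONITOR  (1u << 3)     /* CPUID.01H:ECX[bit 3] */
#define MISC_ENABLE_MWAIT     (1ull << 18)  /* IA32_MISC_ENABLE MWAIT bit */

/* Set the MSR bit in the initial MSR value when mwait/monitor is exposed. */
static uint64_t initial_misc_enable(uint32_t cpuid_01_ecx, uint64_t misc_enable)
{
    if (cpuid_01_ecx & CPUID_01_ECX_MONITOR) {
        misc_enable |= MISC_ENABLE_MWAIT;
    }
    return misc_enable;
}

/* True when the CPUID bit and the MSR bit agree. */
static bool mwait_bits_match(uint32_t cpuid_01_ecx, uint64_t misc_enable)
{
    bool cpuid_bit = (cpuid_01_ecx & CPUID_01_ECX_MONITOR) != 0;
    bool msr_bit = (misc_enable & MISC_ENABLE_MWAIT) != 0;
    return cpuid_bit == msr_bit;
}
```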

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1557813999-9175-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agomemory: Remove memory_region_get_dirty()
Peter Xu [Mon, 20 May 2019 03:08:28 +0000 (11:08 +0800)]
memory: Remove memory_region_get_dirty()

It's never used anywhere.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190520030839.6795-5-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agocheckpatch: allow SPDX-License-Identifier
Peter Xu [Fri, 26 Apr 2019 06:27:05 +0000 (14:27 +0800)]
checkpatch: allow SPDX-License-Identifier

According to: https://spdx.org/ids-how, let's still allow QEMU to use
the SPDX license identifier:

// SPDX-License-Identifier: ***

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190426062705.4651-1-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agovl: make -accel help to list enabled accelerators only
Wainer dos Santos Moschetta [Thu, 30 May 2019 21:57:55 +0000 (17:57 -0400)]
vl: make -accel help to list enabled accelerators only

Currently, -accel help shows all possible accelerators regardless
of whether they are enabled in the binary. That differs from the
semantics of -cpu help and -machine help, for example. So this change
makes it list only the accelerators whose support is compiled
into the target binary.

Note that it does not check if the accelerator is enabled in the
host, so the help message's header was rewritten to emphasize
that. Also qtest is not displayed given that it is used for
internal testing purposes only.

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
Message-Id: <20190530215755.328-2-wainersm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agotest-thread-pool: be more reliable
Paolo Bonzini [Thu, 14 Mar 2019 18:20:07 +0000 (19:20 +0100)]
test-thread-pool: be more reliable

There is a rare race between the atomic_cmpxchg and
bdrv_aio_cancel/bdrv_aio_cancel_async invocations.  Detect it; the
only sensible thing we can do about it is to exit long_cb immediately.
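
A generic sketch of detecting a lost compare-and-exchange race and
bailing out, written with C11 atomics. The state names and the function
are hypothetical; this is not the code in test-thread-pool.c:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states for a cancellable work item. */
enum { ITEM_IDLE = 0, ITEM_RUNNING = 1, ITEM_DONE = 2, ITEM_CANCELLED = 3 };

/* Returns true if the callback ran to completion. */
static bool long_cb_sketch(atomic_int *state)
{
    int expected = ITEM_IDLE;

    /* If the canceller changed the state first, exit immediately. */
    if (!atomic_compare_exchange_strong(state, &expected, ITEM_RUNNING)) {
        return false;
    }

    /* ... the long-running work would happen here ... */

    expected = ITEM_RUNNING;
    return atomic_compare_exchange_strong(state, &expected, ITEM_DONE);
}
```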

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
4 years agoMerge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-1-2019' into...
Peter Maydell [Mon, 3 Jun 2019 09:25:12 +0000 (10:25 +0100)]
Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-jun-1-2019' into staging

MIPS queue for June 1st, 2019

# gpg: Signature made Sat 01 Jun 2019 19:20:47 BST
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-jun-1-2019:
  target/mips: Improve performance of certain MSA instructions
  target/mips: Clean up lmi_helper.c
  target/mips: Clean up dsp_helper.c
  tests/tcg: target/mips: Add tests for MSA bit set instructions
  target/mips: Amend and cleanup MSA TCG tests
  target/mips: Add emulation of MMI instruction PCPYUD
  target/mips: Add emulation of MMI instruction PCPYLD
  target/mips: Add emulation of MMI instruction PCPYH

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agotarget/mips: Improve performance of certain MSA instructions
Mateja Marjanovic [Mon, 4 Mar 2019 16:51:22 +0000 (17:51 +0100)]
target/mips: Improve performance of certain MSA instructions

Eliminate loops for better performance.
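
As an illustration of the technique only, here is a sketch of a looped and
an unrolled element-wise add on a 128-bit vector viewed as four words. The
type and function names are hypothetical and do not match msa_helper.c:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit MSA register viewed as 4 words. */
typedef struct {
    uint32_t w[4];
} vec128_w_sketch;

/* Loop form: one iteration per element. */
static void addv_w_loop(vec128_w_sketch *wd, const vec128_w_sketch *ws,
                        const vec128_w_sketch *wt)
{
    for (int i = 0; i < 4; i++) {
        wd->w[i] = ws->w[i] + wt->w[i];
    }
}

/* Unrolled form: same result, no loop counter or branch. */
static void addv_w_unrolled(vec128_w_sketch *wd, const vec128_w_sketch *ws,
                            const vec128_w_sketch *wt)
{
    wd->w[0] = ws->w[0] + wt->w[0];
    wd->w[1] = ws->w[1] + wt->w[1];
    wd->w[2] = ws->w[2] + wt->w[2];
    wd->w[3] = ws->w[3] + wt->w[3];
}
```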

Following MSA instructions from "UNOP" group are affected:

 - NLZC.<B|H|W|D>
 - NLOC.<B|H|W|D>
 - PCNT.<B|H|W|D>

Following MSA instructions from "BINOP" group are affected:

 - ADD_A.<B|H|W|D>
 - ADDS_A.<B|H|W|D>
 - ADDS_S.<B|H|W|D>
 - ADDS_U.<B|H|W|D>
 - ADDV.<B|H|W|D>
 - ASUB_S.<B|H|W|D>
 - ASUB_U.<B|H|W|D>
 - AVE_S.<B|H|W|D>
 - AVE_U.<B|H|W|D>
 - AVER_S.<B|H|W|D>
 - AVER_U.<B|H|W|D>
 - BCLR.<B|H|W|D>
 - BNEG.<B|H|W|D>
 - BSET.<B|H|W|D>
 - CEQ.<B|H|W|D>
 - CLE_S.<B|H|W|D>
 - CLE_U.<B|H|W|D>
 - CLT_S.<B|H|W|D>
 - CLT_U.<B|H|W|D>
 - DIV_S.<B|H|W|D>
 - DIV_U.<B|H|W|D>
 - DOTP_S.<B|H|W|D>
 - DOTP_U.<B|H|W|D>
 - HADD_S.<B|H|W|D>
 - HADD_U.<B|H|W|D>
 - HSUB_S.<B|H|W|D>
 - HSUB_U.<B|H|W|D>
 - MAX_A.<B|H|W|D>
 - MAX_S.<B|H|W|D>
 - MAX_U.<B|H|W|D>
 - MIN_A.<B|H|W|D>
 - MIN_S.<B|H|W|D>
 - MIN_U.<B|H|W|D>
 - MOD_S.<B|H|W|D>
 - MOD_U.<B|H|W|D>
 - MUL_Q.<B|H|W|D>
 - MULR_Q.<B|H|W|D>
 - MULV.<B|H|W|D>
 - SLL.<B|H|W|D>
 - SRA.<B|H|W|D>
 - SRAR.<B|H|W|D>
 - SRL.<B|H|W|D>
 - SRLR.<B|H|W|D>
 - SUBS_S.<B|H|W|D>
 - SUBS_U.<B|H|W|D>
 - SUBSUS_U.<B|H|W|D>
 - SUBSUU_S.<B|H|W|D>
 - SUBV.<B|H|W|D>

Following MSA instructions from "TEROP" group are affected:

 - BINSL.<B|H|W|D>
 - BINSR.<B|H|W|D>
 - DPADD_S.<B|H|W|D>
 - DPADD_U.<B|H|W|D>
 - DPSUB_S.<B|H|W|D>
 - DPSUB_U.<B|H|W|D>
 - MADD_Q.<B|H|W|D>
 - MADDR_Q.<B|H|W|D>
 - MADDV.<B|H|W|D>
 - MSUB_Q.<B|H|W|D>
 - MSUBR_Q.<B|H|W|D>
 - MSUBV.<B|H|W|D>

Additionally, following MSA instructions are also affected:

 - ILVL.<B|H|W|D>
 - ILVR.<B|H|W|D>
 - ILVEV.<B|H|W|D>
 - ILVOD.<B|H|W|D>
 - PCKEV.<B|H|W|D>
 - PCKOD.<B|H|W|D>

Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Message-Id: <1551718283-4487-2-git-send-email-mateja.marjanovic@rt-rk.com>

4 years agotarget/mips: Clean up lmi_helper.c
Aleksandar Markovic [Tue, 23 Apr 2019 11:29:40 +0000 (13:29 +0200)]
target/mips: Clean up lmi_helper.c

Remove several minor checkpatch warnings and errors.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1556018982-3715-7-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Clean up dsp_helper.c
Aleksandar Markovic [Tue, 23 Apr 2019 11:29:39 +0000 (13:29 +0200)]
target/mips: Clean up dsp_helper.c

Remove several minor checkpatch warnings and errors.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <1556018982-3715-6-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotests/tcg: target/mips: Add tests for MSA bit set instructions
Aleksandar Markovic [Fri, 19 Apr 2019 18:38:00 +0000 (20:38 +0200)]
tests/tcg: target/mips: Add tests for MSA bit set instructions

Add tests for MSA bit set instructions. This includes following
instructions:

  * BCLR.B - clear bit (bytes)
  * BCLR.H - clear bit (halfwords)
  * BCLR.W - clear bit (words)
  * BCLR.D - clear bit (doublewords)
  * BNEG.B - negate bit (bytes)
  * BNEG.H - negate bit (halfwords)
  * BNEG.W - negate bit (words)
  * BNEG.D - negate bit (doublewords)
  * BSET.B - set bit (bytes)
  * BSET.H - set bit (halfwords)
  * BSET.W - set bit (words)
  * BSET.D - set bit (doublewords)
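
For reference, a sketch of the per-element operation for the byte
variants, assuming the bit position is taken from the second operand
modulo the element width. The function names are hypothetical and this is
not the test code itself:

```c
#include <stdint.h>

/* BCLR.B on one element: clear the bit selected by wt (mod 8). */
static uint8_t bclr_b_sketch(uint8_t ws, uint8_t wt)
{
    return (uint8_t)(ws & ~(1u << (wt % 8)));
}

/* BNEG.B on one element: invert the bit selected by wt (mod 8). */
static uint8_t bneg_b_sketch(uint8_t ws, uint8_t wt)
{
    return (uint8_t)(ws ^ (1u << (wt % 8)));
}

/* BSET.B on one element: set the bit selected by wt (mod 8). */
static uint8_t bset_b_sketch(uint8_t ws, uint8_t wt)
{
    return (uint8_t)(ws | (1u << (wt % 8)));
}
```

The halfword, word and doubleword variants would follow the same pattern
with element widths of 16, 32 and 64 bits.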

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1555699081-24577-5-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Amend and cleanup MSA TCG tests
Aleksandar Markovic [Fri, 19 Apr 2019 18:37:59 +0000 (20:37 +0200)]
target/mips: Amend and cleanup MSA TCG tests

Add missing bits and pieces of the tests of the emulation of certain
MSA instructions (non-immediate variants): some tests were missing the
last two cases; some instructions were missing wrappers; some tests
included wrong headers; some tests were missing altogether; update some
copyright preambles; do several other minor cleanups.

Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1555699081-24577-4-git-send-email-aleksandar.markovic@rt-rk.com>

4 years agotarget/mips: Add emulation of MMI instruction PCPYUD
Mateja Marjanovic [Mon, 4 Mar 2019 15:13:15 +0000 (16:13 +0100)]
target/mips: Add emulation of MMI instruction PCPYUD

Add emulation of MMI instruction PCPYUD. The emulation is implemented
using TCG front end operations directly to achieve better performance.

Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1551712405-2530-4-git-send-email-mateja.marjanovic@rt-rk.com>

4 years agotarget/mips: Add emulation of MMI instruction PCPYLD
Mateja Marjanovic [Mon, 4 Mar 2019 15:13:14 +0000 (16:13 +0100)]
target/mips: Add emulation of MMI instruction PCPYLD

Add emulation of MMI instruction PCPYLD. The emulation is implemented
using TCG front end operations directly to achieve better performance.

Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1551712405-2530-3-git-send-email-mateja.marjanovic@rt-rk.com>

4 years agotarget/mips: Add emulation of MMI instruction PCPYH
Mateja Marjanovic [Mon, 4 Mar 2019 15:13:13 +0000 (16:13 +0100)]
target/mips: Add emulation of MMI instruction PCPYH

Add emulation of MMI instruction PCPYH. The emulation is implemented
using TCG front end operations directly to achieve better performance.
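
A plain C sketch of what PCPYH and the PCPYLD and PCPYUD instructions from
the two commits above compute, with a 128-bit register modelled as two
64-bit halves. The semantics shown are my reading of the instructions and
should be treated as an assumption; the names are hypothetical and this is
not the TCG code from the patches:

```c
#include <stdint.h>

/* Hypothetical model of a 128-bit MMI register. */
typedef struct {
    uint64_t lo;   /* bits 63..0 */
    uint64_t hi;   /* bits 127..64 */
} reg128_sketch;

/* PCPYH: replicate the lowest halfword of each 64-bit half of rt. */
static reg128_sketch pcpyh_sketch(reg128_sketch rt)
{
    const uint64_t spread = 0x0001000100010001ULL;
    reg128_sketch rd;
    rd.lo = (rt.lo & 0xFFFF) * spread;
    rd.hi = (rt.hi & 0xFFFF) * spread;
    return rd;
}

/* PCPYLD: lower doubleword of rt below, lower doubleword of rs above. */
static reg128_sketch pcpyld_sketch(reg128_sketch rs, reg128_sketch rt)
{
    reg128_sketch rd = { rt.lo, rs.lo };
    return rd;
}

/* PCPYUD: upper doubleword of rs below, upper doubleword of rt above. */
static reg128_sketch pcpyud_sketch(reg128_sketch rs, reg128_sketch rt)
{
    reg128_sketch rd = { rs.hi, rt.hi };
    return rd;
}
```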

Signed-off-by: Mateja Marjanovic <mateja.marjanovic@rt-rk.com>
Signed-off-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Aleksandar Rikalo <arikalo@wavecomp.com>
Message-Id: <1551712405-2530-2-git-send-email-mateja.marjanovic@rt-rk.com>

4 years agoMerge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190529' into staging
Peter Maydell [Thu, 30 May 2019 14:08:00 +0000 (15:08 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190529' into staging

ppc patch queue 2019-05-29

Next pull request against qemu-4.1.  Highlights:
  * KVM accelerated support for the XIVE interrupt controller in PAPR
    guests
  * A number of TCG vector fixes
  * Fixes for the PReP / 40p machine
  * Improvements to make check-tcg test coverage

Other than that it's just a bunch of assorted fixes, cleanups and
minor improvements.

This supersedes both the pull request dated 2019-05-21 and the one
dated 2019-05-22.  I've dropped one hunk which I think may have caused
the check-tcg failure that Peter saw (by enabling the ppc64abi32
build, which I think has been broken for ages).  I'm not entirely
certain, since I haven't reproduced exactly the same failure.

# gpg: Signature made Wed 29 May 2019 07:49:04 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.1-20190529: (44 commits)
  ppc/pnv: add dummy XSCOM registers for PRD initialization
  ppc/pnv: introduce new skiboot platform properties
  spapr: Don't migrate the hpt_maxpagesize cap to older machine types
  spapr: change default interrupt mode to 'dual'
  spapr/xive: fix multiple resets when using the 'dual' interrupt mode
  docs: provide documentation on the POWER9 XIVE interrupt controller
  spapr/irq: add KVM support to the 'dual' machine
  ppc/xics: fix irq priority in ics_set_irq_type()
  spapr/irq: initialize the IRQ device only once
  spapr/irq: introduce a spapr_irq_init_device() helper
  spapr: check for the activation of the KVM IRQ device
  spapr: introduce routines to delete the KVM IRQ device
  sysbus: add a sysbus_mmio_unmap() helper
  spapr/xive: activate KVM support
  spapr/xive: add migration support for KVM
  spapr/xive: introduce a VM state change handler
  spapr/xive: add state synchronization with KVM
  spapr/xive: add hcall support when under KVM
  spapr/xive: add KVM support
  spapr: Print out extra hints when CAS negotiation of interrupt mode fails
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/kraxel/tags/usb-20190529-pull-request' into...
Peter Maydell [Thu, 30 May 2019 12:55:27 +0000 (13:55 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/usb-20190529-pull-request' into staging

usb-hub: port count config option, emulate power switching, cleanups.
usb-tablet, usb-host: bugfixes.

# gpg: Signature made Wed 29 May 2019 07:28:18 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/usb-20190529-pull-request:
  usb-tablet: fix serial compat property
  usb-hub: emulate per port power switching
  usb-hub: add usb_hub_port_update()
  usb-hub: add helpers to update port state
  usb-hub: make number of ports runtime-configurable
  usb-hub: tweak feature names
  usb-host: avoid libusb_set_configuration calls
  usb-host: skip reset for untouched devices
  usb: call reset handler before updating state

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/kraxel/tags/vga-20190529-pull-request' into...
Peter Maydell [Thu, 30 May 2019 12:10:00 +0000 (13:10 +0100)]
Merge remote-tracking branch 'remotes/kraxel/tags/vga-20190529-pull-request' into staging

vga: add vhost-user-gpu.

# gpg: Signature made Wed 29 May 2019 05:40:02 BST
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/vga-20190529-pull-request:
  hw/display: add vhost-user-vga & gpu-pci
  virtio-gpu: split virtio-gpu-pci & virtio-vga
  virtio-gpu: split virtio-gpu, introduce virtio-gpu-base
  spice-app: fix running when !CONFIG_OPENGL
  contrib: add vhost-user-gpu
  util: compile drm.o on posix
  virtio-gpu: add a pixman helper header
  virtio-gpu: add bswap helpers header
  vhost-user: add vhost_user_gpu_set_socket()
  virtio-gpu: add sanity check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging
Peter Maydell [Thu, 30 May 2019 11:10:27 +0000 (12:10 +0100)]
Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging

Pull request

# gpg: Signature made Wed 29 May 2019 00:58:33 BST
# gpg:                using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" [full]
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/bitmaps-pull-request:
  iotests: test external snapshot with bitmap copying
  qapi: support external bitmaps in block-dirty-bitmap-merge
  migration/dirty-bitmaps: change bitmap enumeration method

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agoMerge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-05-28' into staging
Peter Maydell [Thu, 30 May 2019 10:17:56 +0000 (11:17 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-05-28' into staging

Block patches:
- qcow2: Use threads for encrypted I/O
- qemu-img rebase: Optimizations
- backup job: Allow any source node, and some refactoring
- Some general simplifications in the block layer

# gpg: Signature made Tue 28 May 2019 20:26:56 BST
# gpg:                using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40
# gpg:                issuer "mreitz@redhat.com"
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full]
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2019-05-28: (21 commits)
  blockdev: loosen restrictions on drive-backup source node
  qcow2-bitmap: initialize bitmap directory alignment
  qcow2: skip writing zero buffers to empty COW areas
  qemu-img: rebase: Reuse in-chain BlockDriverState
  qemu-img: rebase: Reduce reads on in-chain rebase
  qemu-img: rebase: Reuse parent BlockDriverState
  block: Make bdrv_root_attach_child() unref child_bs on failure
  block: Use bdrv_unref_child() for all children in bdrv_close()
  block/backup: refactor: split out backup_calculate_cluster_size
  block/backup: unify different modes code path
  block/backup: refactor and tolerate unallocated cluster skipping
  block/backup: move to copy_bitmap with granularity
  block/backup: simplify backup_incremental_init_copy_bitmap
  qcow2: do encryption in threads
  qcow2: bdrv_co_pwritev: move encryption code out of the lock
  qcow2: qcow2_co_preadv: improve locking
  qcow2-threads: split out generic path
  qcow2-threads: qcow2_co_do_compress: protect queuing by mutex
  qcow2-threads: use thread_pool_submit_co
  qcow2: add separate file for threaded data processing functions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
4 years agousb-tablet: fix serial compat property
Gerd Hoffmann [Mon, 20 May 2019 08:18:05 +0000 (10:18 +0200)]
usb-tablet: fix serial compat property

s/kbd/tablet/, fixes cut+paste bug.

Cc: qemu-stable@nongnu.org
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190520081805.15019-1-kraxel@redhat.com

4 years agousb-hub: emulate per port power switching
Gerd Hoffmann [Fri, 24 May 2019 07:03:10 +0000 (09:03 +0200)]
usb-hub: emulate per port power switching

Add support for per port power switching.
Virtual power of course ;)

Use port-power=on property to enable this.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190524070310.4952-6-kraxel@redhat.com

4 years agousb-hub: add usb_hub_port_update()
Gerd Hoffmann [Fri, 24 May 2019 07:03:09 +0000 (09:03 +0200)]
usb-hub: add usb_hub_port_update()

Helper function to update port status bits which depend on the
connected device.  We need the same logic for device attach and
port reset, so factor it out.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190524070310.4952-5-kraxel@redhat.com

4 years agousb-hub: add helpers to update port state
Gerd Hoffmann [Fri, 24 May 2019 07:03:08 +0000 (09:03 +0200)]
usb-hub: add helpers to update port state

Add usb_hub_port_set() and usb_hub_port_clear() helpers which care about
updating the change bits (port->wPortChange) properly, so we don't need
to have that logic sprinkled all over the place ;)
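
A sketch of what such helpers might look like, assuming that a change bit
is raised only when the status bit actually flips. The struct, the mask
and the function names are hypothetical stand-ins, not the code in
dev-hub.c:

```c
#include <stdbool.h>
#include <stdint.h>

#define PORT_STAT_CONNECTION_SKETCH 0x0001
#define PORT_STAT_ENABLE_SKETCH     0x0002

/* Hypothetical stand-in for a hub port. */
typedef struct {
    uint16_t wPortStatus;
    uint16_t wPortChange;
} HubPortSketch;

/* Status bits that have a matching change bit (assumption). */
static const uint16_t port_change_mask_sketch =
    PORT_STAT_CONNECTION_SKETCH | PORT_STAT_ENABLE_SKETCH;

/* Set a status bit; returns true if the port state changed. */
static bool port_set_sketch(HubPortSketch *port, uint16_t status)
{
    if (port->wPortStatus & status) {
        return false;
    }
    port->wPortStatus |= status;
    port->wPortChange |= status & port_change_mask_sketch;
    return true;
}

/* Clear a status bit; returns true if the port state changed. */
static bool port_clear_sketch(HubPortSketch *port, uint16_t status)
{
    if (!(port->wPortStatus & status)) {
        return false;
    }
    port->wPortStatus &= (uint16_t)~status;
    port->wPortChange |= status & port_change_mask_sketch;
    return true;
}
```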

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190524070310.4952-4-kraxel@redhat.com

4 years agousb-hub: make number of ports runtime-configurable
Gerd Hoffmann [Fri, 24 May 2019 07:03:07 +0000 (09:03 +0200)]
usb-hub: make number of ports runtime-configurable

Add num_ports property which allows configuring the number of downstream
ports.  Valid range is 1-8, default is 8.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190524070310.4952-3-kraxel@redhat.com

4 years agousb-hub: tweak feature names
Gerd Hoffmann [Fri, 24 May 2019 07:03:06 +0000 (09:03 +0200)]
usb-hub: tweak feature names

Add dashes, so they don't look like two separate things when printed.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190524070310.4952-2-kraxel@redhat.com

4 years agousb-host: avoid libusb_set_configuration calls
Gerd Hoffmann [Wed, 22 May 2019 09:47:02 +0000 (11:47 +0200)]
usb-host: avoid libusb_set_configuration calls

Seems some devices become confused when we call
libusb_set_configuration().  So before calling the function check
whether the device has multiple configurations in the first place, and
in case it hasn't (which is the case for the majority of devices) simply
skip the call as it will have no effect anyway.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190522094702.17619-4-kraxel@redhat.com

4 years agousb-host: skip reset for untouched devices
Gerd Hoffmann [Wed, 22 May 2019 09:47:01 +0000 (11:47 +0200)]
usb-host: skip reset for untouched devices

If the guest didn't talk to the device yet, skip the reset.
Without this, usb-host devices get reset a number of times
at boot time for no good reason.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190522094702.17619-3-kraxel@redhat.com

4 years agousb: call reset handler before updating state
Gerd Hoffmann [Wed, 22 May 2019 09:47:00 +0000 (11:47 +0200)]
usb: call reset handler before updating state

That way the device reset handler can see what
the before-reset state of the device is.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190522094702.17619-2-kraxel@redhat.com

4 years agohw/display: add vhost-user-vga & gpu-pci
Marc-André Lureau [Fri, 24 May 2019 13:09:46 +0000 (15:09 +0200)]
hw/display: add vhost-user-vga & gpu-pci

Add new virtio-gpu devices with a "vhost-user" property. The
associated vhost-user backend is used to handle the virtio rings and
provide rendering results thanks to the vhost-user-gpu protocol.

Example usage:
-object vhost-user-backend,id=vug,cmd="./vhost-user-gpu"
-device vhost-user-vga,vhost-user=vug

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-10-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agovirtio-gpu: split virtio-gpu-pci & virtio-vga
Marc-André Lureau [Fri, 24 May 2019 13:09:45 +0000 (15:09 +0200)]
virtio-gpu: split virtio-gpu-pci & virtio-vga

Add base classes that are common to vhost-user-gpu-pci and
vhost-user-vga.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-9-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agovirtio-gpu: split virtio-gpu, introduce virtio-gpu-base
Marc-André Lureau [Fri, 24 May 2019 13:09:44 +0000 (15:09 +0200)]
virtio-gpu: split virtio-gpu, introduce virtio-gpu-base

Add a base class that is common to virtio-gpu and vhost-user-gpu
devices.

The VirtIOGPUBase base class provides common functionalities necessary
for both virtio-gpu and vhost-user-gpu:
- common configuration (max-outputs, initial resolution, flags)
- virtio device initialization, including queue setup
- device pre-conditions checks (iommu)
- migration blocker
- virtio device callbacks
- hooking up to qemu display subsystem
- a few common helper functions to reset the device, retrieve display
information
- a class callback to unblock the rendering (for GL updates)

What is left to the virtio-gpu subdevice to take care of, in short,
are all the virtio queues handling, command processing and migration.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-8-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agospice-app: fix running when !CONFIG_OPENGL
Marc-André Lureau [Fri, 24 May 2019 13:09:43 +0000 (15:09 +0200)]
spice-app: fix running when !CONFIG_OPENGL

Do not set 'gl' parameter, fixes:
qemu-system-x86_64: Invalid parameter 'gl'

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-7-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agocontrib: add vhost-user-gpu
Marc-André Lureau [Fri, 24 May 2019 13:09:42 +0000 (15:09 +0200)]
contrib: add vhost-user-gpu

Add a vhost-user gpu backend, based on virtio-gpu/3d device. It is
associated with a vhost-user-gpu device.

Various TODO and nice to have items:
- multi-head support
- crash & resume handling
- accelerated rendering/display that avoids the waiting round trips
- edid support

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-6-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agoutil: compile drm.o on posix
Marc-André Lureau [Fri, 24 May 2019 13:09:41 +0000 (15:09 +0200)]
util: compile drm.o on posix

OpenGL isn't required to use DRM rendernodes. The following patches
use it for 2D resources, for example.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-5-marcandre.lureau@redhat.com

[ kraxel s/LINUX/POSIX/ (fixes openbsd build failure) ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agovirtio-gpu: add a pixman helper header
Marc-André Lureau [Fri, 24 May 2019 13:09:40 +0000 (15:09 +0200)]
virtio-gpu: add a pixman helper header

This will allow sharing the format conversion function with
vhost-user-gpu.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agovirtio-gpu: add bswap helpers header
Marc-André Lureau [Fri, 24 May 2019 13:09:39 +0000 (15:09 +0200)]
virtio-gpu: add bswap helpers header

The helper functions are useful to build the vhost-user-gpu backend.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agovhost-user: add vhost_user_gpu_set_socket()
Marc-André Lureau [Fri, 24 May 2019 13:09:38 +0000 (15:09 +0200)]
vhost-user: add vhost_user_gpu_set_socket()

Add a new vhost-user message to give a unix socket to a vhost-user
backend for GPU display updates.

Back when I started that work, I added a new GPU channel because the
vhost-user protocol wasn't bidirectional. Since then, there is a
vhost-user-slave channel for the slave to send requests to the master.
We could extend it with GPU messages. However, the GPU protocol is
quite orthogonal to vhost-user, thus I chose to have a new dedicated
channel.

See vhost-user-gpu.rst for the protocol details.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20190524130946.31736-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
4 years agoppc/pnv: add dummy XSCOM registers for PRD initialization
Cédric Le Goater [Mon, 27 May 2019 07:17:22 +0000 (09:17 +0200)]
ppc/pnv: add dummy XSCOM registers for PRD initialization

PRD (Processor recovery diagnostics) is a service available on
OpenPower systems. The opal-prd daemon initializes the PowerPC
Processor through the XSCOM bus and then waits for hardware diagnostic
events.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190527071722.31424-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years agoppc/pnv: introduce new skiboot platform properties
Cédric Le Goater [Mon, 27 May 2019 07:17:49 +0000 (09:17 +0200)]
ppc/pnv: introduce new skiboot platform properties

Newer skiboots (after 6.3) support QEMU platforms that have
characteristics closer to real OpenPOWER systems. The CPU type is used
to define the BMC drivers: Aspeed AST2400 for POWER8 processors and
AST2500 for POWER9s.

Advertise the new platform property names, "qemu,powernv8" and
"qemu,powernv9", using the CPU type chosen for the QEMU PowerNV
machine. Also, advertise the original platform name "qemu,powernv" in
case of POWER8 processors for compatibility with older skiboots.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190527071749.31499-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years agospapr: Don't migrate the hpt_maxpagesize cap to older machine types
Greg Kurz [Wed, 22 May 2019 13:43:46 +0000 (15:43 +0200)]
spapr: Don't migrate the hpt_maxpagesize cap to older machine types

Commit 0b8c89be7f7b added the hpt_maxpagesize capability to the migration
stream. This is okay for new machine types but it breaks backward migration
to older QEMUs, which don't expect the extra subsection.

Add a compatibility boolean flag to the sPAPR machine class and use it to
skip migration of the capability for machine types 4.0 and older. This
fixes migration to an older QEMU. Note that the destination will emit a
warning:

qemu-system-ppc64: warning: cap-hpt-max-page-size lower level (16) in incoming stream than on destination (24)

This is expected and harmless though. It is okay to migrate from a lower
HPT maximum page size (64k) to a greater one (16M).
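
The two numbers in the warning read as log2 of the page size, which is
consistent with the sizes quoted above. A small sketch of that arithmetic
follows. The function names are hypothetical, and treating only the
"lower or equal" direction as harmless is an assumption made for
illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a page-size shift (log2) into a size in bytes. */
static uint64_t page_size_from_shift(unsigned shift)
{
    return 1ULL << shift;
}

/* Hypothetical check: incoming cap no larger than the destination's. */
static bool incoming_cap_is_harmless(unsigned src_shift, unsigned dst_shift)
{
    return src_shift <= dst_shift;
}
```

Here page_size_from_shift(16) is 65536 bytes (64k) and
page_size_from_shift(24) is 16777216 bytes (16M).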

Fixes: 0b8c89be7f7b "spapr: Add forgotten capability to migration stream"
Based-on: <20190522074016.10521-3-clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <155853262675.1158324.17301777846476373459.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years agospapr: change default interrupt mode to 'dual'
Cédric Le Goater [Wed, 22 May 2019 07:40:16 +0000 (09:40 +0200)]
spapr: change default interrupt mode to 'dual'

Now that XIVE support is complete (QEMU emulated and KVM devices),
change the pseries machine to advertise both interrupt modes: XICS
(P7/P8) and XIVE (P9).

The machine's default interrupt mode depends on the version. Current
settings are:

    pseries   default interrupt mode

    4.1       dual
    4.0       xics
    3.1       xics
    3.0       legacy xics (different IRQ number space layout)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190522074016.10521-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years agospapr/xive: fix multiple resets when using the 'dual' interrupt mode
Cédric Le Goater [Wed, 22 May 2019 07:40:15 +0000 (09:40 +0200)]
spapr/xive: fix multiple resets when using the 'dual' interrupt mode

Today, when a reset occurs on a pseries machine using the 'dual'
interrupt mode, the KVM devices are released and recreated depending
on the interrupt mode selected by CAS. If XIVE is selected, the SysBus
memory regions of the SpaprXive model are initialized by the KVM
backend initialization routine each time a reset occurs. This leads to
a crash after a couple of resets because the machine reaches the
QDEV_MAX_MMIO limit of SysBusDevice :

qemu-system-ppc64: hw/core/sysbus.c:193: sysbus_init_mmio: Assertion `dev->num_mmio < QDEV_MAX_MMIO' failed.

To fix, initialize the SysBus memory regions in spapr_xive_realize()
called only once and remove the same inits from the QEMU and KVM
backend initialization routines which are called at each reset.
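
A toy model of the failure mode and of the fix. The limit of 32 and the
count of three regions per initialization are assumptions made for the
sketch, and none of these names exist in QEMU:

```c
#include <stdbool.h>

#define MAX_MMIO_SKETCH   32  /* stand-in for QDEV_MAX_MMIO (assumed value) */
#define REGIONS_PER_INIT   3  /* hypothetical number of MMIO regions */

typedef struct {
    int num_mmio;
} SysBusDevSketch;

/* Returns false where the real code would hit the assertion. */
static bool init_mmio_sketch(SysBusDevSketch *dev)
{
    if (dev->num_mmio >= MAX_MMIO_SKETCH) {
        return false;
    }
    dev->num_mmio++;
    return true;
}

static bool register_regions_sketch(SysBusDevSketch *dev)
{
    for (int i = 0; i < REGIONS_PER_INIT; i++) {
        if (!init_mmio_sketch(dev)) {
            return false;
        }
    }
    return true;
}

/* Before: regions are registered again on every reset. */
static bool backend_init_buggy(SysBusDevSketch *dev)
{
    return register_regions_sketch(dev);
}

/* After: regions are registered once, at realize time. */
static bool realize_once_sketch(SysBusDevSketch *dev)
{
    return register_regions_sketch(dev);
}

/* After: the per-reset backend init no longer touches the regions. */
static bool backend_init_fixed(SysBusDevSketch *dev)
{
    (void)dev;
    return true;
}
```

In this model the buggy path runs out of slots after a bounded number of
resets, while the fixed path can be reset any number of times.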

Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190522074016.10521-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago docs: provide documentation on the POWER9 XIVE interrupt controller
Cédric Le Goater [Tue, 21 May 2019 08:24:11 +0000 (10:24 +0200)]
docs: provide documentation on the POWER9 XIVE interrupt controller

This documents the overall XIVE architecture and the XIVE support for
sPAPR guest machines (pseries).

It also provides documentation on the 'info pic' command.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190521082411.24719-1-clg@kaod.org>
Reviewed-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago spapr/irq: add KVM support to the 'dual' machine
Cédric Le Goater [Mon, 13 May 2019 08:42:45 +0000 (10:42 +0200)]
spapr/irq: add KVM support to the 'dual' machine

The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. This brings new constraints on how the associated KVM IRQ
device is initialized.

Currently, each model takes care of the initialization of the KVM
device in its realize method, but this is no longer possible because the
initialization needs to be done globally when the interrupt mode is
known, i.e. when the machine is reset. It also means that we need a way
to delete a KVM device when another mode is chosen.

Also, to support migration, the QEMU objects holding the state to
transfer should always be available but not necessarily activated.

The overall approach of this proposal is to initialize both interrupt
modes at the QEMU level to keep the IRQ number space in sync and to
allow switching from one mode to another. For the KVM side of things,
the whole initialization of the KVM device, sources and presenters, is
grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
handlers are modified accordingly to handle the init and the delete
sequences of the KVM device.
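
A simplified sketch of the reset-time switch follows. It is not the QEMU implementation: ToyMachine and its helpers are hypothetical, the "KVM devices" are plain flags, and the counters exist only to make the create/delete sequence observable.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyIrqMode { TOY_MODE_XICS, TOY_MODE_XIVE } ToyIrqMode;

typedef struct ToyMachine {
    /* The QEMU-level models are assumed to always exist; only the KVM
     * device backing one of them is active at a time. */
    bool kvm_xics_active;
    bool kvm_xive_active;
    ToyIrqMode negotiated;  /* mode picked by CAS, applied at the next reset */
    int kvm_creates;        /* illustration only */
    int kvm_deletes;        /* illustration only */
} ToyMachine;

/* Delete a toy KVM device if it is currently active. */
static void toy_kvm_delete(ToyMachine *m, bool *active)
{
    if (*active) {
        *active = false;
        m->kvm_deletes++;
    }
}

/* Create a toy KVM device; the caller guarantees it is not active. */
static void toy_kvm_create(ToyMachine *m, bool *active)
{
    assert(!*active);
    *active = true;
    m->kvm_creates++;
}

/* Reset: release whatever KVM device is active, then create the one
 * matching the negotiated interrupt mode. */
static void toy_machine_reset(ToyMachine *m)
{
    assert(m);
    toy_kvm_delete(m, &m->kvm_xics_active);
    toy_kvm_delete(m, &m->kvm_xive_active);

    if (m->negotiated == TOY_MODE_XIVE) {
        toy_kvm_create(m, &m->kvm_xive_active);
    } else {
        toy_kvm_create(m, &m->kvm_xics_active);
    }
}
```

Grouping the whole create step in one place is what lets the reset path pick the device after the mode is known, rather than at realize time.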

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago ppc/xics: fix irq priority in ics_set_irq_type()
Cédric Le Goater [Mon, 13 May 2019 08:42:44 +0000 (10:42 +0200)]
ppc/xics: fix irq priority in ics_set_irq_type()

Recent commits changed the behavior of ics_set_irq_type() to
correctly initialize LSIs at the KVM level. ics_set_irq_type() is also
called by the realize routine of the different devices of the machine
when initial interrupts are claimed, before the ICSState device is
reset.

In that case, the ICSIRQState priority is 0x0 and the call to
ics_set_irq_type() results in configuring the target of the
interrupt. On P9, when using the KVM XICS-on-XIVE device, the target
is configured to be server 0, priority 0 and the event queue 0 is
created automatically by KVM.

With the dual interrupt mode creating the KVM device at reset, this
leads to unexpected effects on the guest, mostly blocking IPIs. This
is wrong; fix it by resetting the ICSIRQState structure when
ics_set_irq_type() is called.

Fixes: commit 6cead90c5c9c ("xics: Write source state to KVM at claim time")
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago spapr/irq: initialize the IRQ device only once
Cédric Le Goater [Mon, 13 May 2019 08:42:43 +0000 (10:42 +0200)]
spapr/irq: initialize the IRQ device only once

Add a check to make sure that the routine initializing the emulated
IRQ device is called only once. We don't have much to test on the XICS
side, so we introduce an 'init' boolean under ICSState.
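
The guard amounts to a run-once flag. A minimal sketch, assuming a hypothetical ToyIcs structure in place of ICSState (the setup_count field is for illustration only):

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyIcs {
    bool init;        /* set once the emulated device has been initialized */
    int setup_count;  /* counts real initializations, illustration only */
} ToyIcs;

/* Returns true if the initialization ran, false if it was skipped. */
static bool toy_ics_init_emu(ToyIcs *ics)
{
    assert(ics);
    if (ics->init) {
        return false;  /* already initialized: nothing to do */
    }
    ics->setup_count++;
    ics->init = true;
    return true;
}
```

Calling toy_ics_init_emu() repeatedly, as would happen at each reset, performs the setup only on the first call.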

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago spapr/irq: introduce a spapr_irq_init_device() helper
Cédric Le Goater [Mon, 13 May 2019 08:42:42 +0000 (10:42 +0200)]
spapr/irq: introduce a spapr_irq_init_device() helper

The way the XICS and the XIVE devices are initialized follows the same
pattern. First, try to connect to the KVM device and, if that is not
possible, fall back on the emulated device, unless a kernel_irqchip is
required. The spapr_irq_init_device() routine implements this sequence
in a generic way using new sPAPR IRQ handlers ->init_emu() and ->init_kvm().
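
The sequence can be sketched as follows. This is an illustration under assumptions, not the actual helper: ToyIrqOps, its handler signatures and the return type are hypothetical, and the stub handlers exist only so the logic can be exercised without KVM.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyIrqBackend {
    TOY_BACKEND_NONE,      /* initialization failed */
    TOY_BACKEND_KVM,
    TOY_BACKEND_EMULATED
} ToyIrqBackend;

/* Hypothetical handler table, loosely modelled on ->init_kvm()/->init_emu(). */
typedef struct ToyIrqOps {
    bool (*init_kvm)(void);  /* true if the in-kernel device was connected */
    void (*init_emu)(void);  /* sets up the emulated device */
} ToyIrqOps;

/* Try KVM first; fall back on emulation unless the in-kernel chip is required. */
static ToyIrqBackend toy_irq_init_device(const ToyIrqOps *ops,
                                         bool kvm_available,
                                         bool kernel_irqchip_required)
{
    assert(ops && ops->init_kvm && ops->init_emu);

    if (kvm_available && ops->init_kvm()) {
        return TOY_BACKEND_KVM;
    }
    if (kernel_irqchip_required) {
        return TOY_BACKEND_NONE;  /* no fallback allowed */
    }
    ops->init_emu();
    return TOY_BACKEND_EMULATED;
}

/* Stub handlers for experimentation. */
static int toy_emu_inits;
static bool toy_kvm_ok(void)   { return true; }
static bool toy_kvm_fail(void) { return false; }
static void toy_emu(void)      { toy_emu_inits++; }
```

Because the fallback policy lives in one routine, each interrupt mode only has to supply its two handlers.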

The XIVE init sequence is moved under the associated sPAPR IRQ
->init() handler. This will change again when KVM support is added for
the dual interrupt mode.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago spapr: check for the activation of the KVM IRQ device
Cédric Le Goater [Mon, 13 May 2019 08:42:41 +0000 (10:42 +0200)]
spapr: check for the activation of the KVM IRQ device

The activation of the KVM IRQ device depends on the interrupt mode
chosen at CAS time by the machine, so some methods used at reset or by
migration need to be protected.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago spapr: introduce routines to delete the KVM IRQ device
Cédric Le Goater [Mon, 13 May 2019 08:42:40 +0000 (10:42 +0200)]
spapr: introduce routines to delete the KVM IRQ device

If a new interrupt mode is chosen by CAS, the machine generates a
reset to reconfigure. At this point, the connection with the previous
KVM device needs to be closed and a new connection needs to be opened
with the KVM device operating the chosen interrupt mode.

New routines are introduced to destroy the XICS and the XIVE KVM
devices. They make use of a new KVM device ioctl which destroys the
device and also disconnects the IRQ presenters from the vCPUs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
4 years ago sysbus: add a sysbus_mmio_unmap() helper
Cédric Le Goater [Mon, 13 May 2019 08:42:39 +0000 (10:42 +0200)]
sysbus: add a sysbus_mmio_unmap() helper

This will be used to remove the MMIO regions of the POWER9 XIVE
interrupt controller when the sPAPR machine is reset.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>