git.proxmox.com Git - mirror_qemu.git/log
mirror_qemu.git
6 months ago tests/qemu-iotests: Restrict test 066 to the 'file' protocol
Thomas Huth [Fri, 15 Mar 2024 11:11:01 +0000 (12:11 +0100)]
tests/qemu-iotests: Restrict test 066 to the 'file' protocol

The hand-crafted json statement in this test only works if the test
is run with the "file" protocol, so mark this test accordingly.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-3-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago tests/qemu-iotests: Fix test 033 for running with non-file protocols
Thomas Huth [Fri, 15 Mar 2024 11:11:00 +0000 (12:11 +0100)]
tests/qemu-iotests: Fix test 033 for running with non-file protocols

When running iotest 033 with the ssh protocol, it fails with:

 033   fail       [14:48:31] [14:48:41]   10.2s                output mismatch
 --- /.../tests/qemu-iotests/033.out
 +++ /.../tests/qemu-iotests/scratch/qcow2-ssh-033/033.out.bad
 @@ -174,6 +174,7 @@
  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
  wrote 512/512 bytes at offset 2097152
  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 +qemu-io: warning: Failed to truncate the tail of the image: ssh driver does not support shrinking files
  read 512/512 bytes at offset 0
  512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)

We already check for the qcow2 format here, so let's simply also
add a check for the protocol here, too, to only test the truncation
with the file protocol.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240315111108.153201-2-thuth@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago qemu-img: Fix Column Width and Improve Formatting in snapshot list
Abhiram Tilak [Tue, 23 Jan 2024 05:03:55 +0000 (10:33 +0530)]
qemu-img: Fix Column Width and Improve Formatting in snapshot list

When running the command `qemu-img snapshot -l SNAPSHOT`, the VM_CLOCK
column (which measures the offset between host and VM clock) cannot
accommodate values in the order of thousands (4 digits).

This line [1] hints at the problem. Additionally, the column width for
the VM_CLOCK field was reduced from 15 to 13 spaces in commit b39847a5
in line [2], resulting in a shortage of space.

[1]:
https://gitlab.com/qemu-project/qemu/-/blob/master/block/qapi.c?ref_type=heads#L753
[2]:
https://gitlab.com/qemu-project/qemu/-/blob/master/block/qapi.c?ref_type=heads#L763

This patch restores the column width to 15 spaces and makes adjustments
to the affected iotests accordingly. Furthermore, it addresses a potential
source of confusion by removing whitespace in column headers. For example,
VM CLOCK is changed to VM_CLOCK. Additionally, for clarity, a '--' symbol
is shown when ICOUNT returns no output.
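
The width problem described above can be reproduced with plain printf field widths. The following is a minimal sketch, assuming the clock is rendered as hours:minutes:seconds.milliseconds; the helper name format_vm_clock and the sample values are illustrative and not taken from block/qapi.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render a VM clock into buf, right-aligned in a
 * field of `width` characters. Returns the number of characters that
 * make up the field (excluding the terminating NUL). */
static int format_vm_clock(char *buf, size_t len, int width,
                           int hours, int mins, int secs, int msecs)
{
    char clock[64];

    snprintf(clock, sizeof(clock), "%02d:%02d:%02d.%03d",
             hours, mins, secs, msecs);
    /* '*' takes the minimum field width from the argument list; a value
     * longer than the width is not truncated, it pushes the column out. */
    return snprintf(buf, len, "%*s", width, clock);
}
```

With a 4-digit hour count the rendered clock is 14 characters long, so a 13-character field overflows and shifts the following columns, while a 15-character field still leaves one space of padding.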

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2062
Fixes: b39847a50553 ("migration: introduce icount field for snapshots")
Signed-off-by: Abhiram Tilak <atp.exp@gmail.com>
Message-ID: <20240123050354.22152-2-atp.exp@gmail.com>
[kwolf: Fixed up qemu-iotests 261 and 286]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago blockdev: Fix blockdev-snapshot-sync error reporting for no medium
Markus Armbruster [Wed, 6 Mar 2024 14:28:31 +0000 (15:28 +0100)]
blockdev: Fix blockdev-snapshot-sync error reporting for no medium

When external_snapshot_abort() rejects a BlockDriverState without a
medium, it creates an error like this:

        error_setg(errp, "Device '%s' has no medium", device);

Trouble is @device can be null.  My system formats null as "(null)",
but other systems might crash.  Reproducer:

1. Create a block device without a medium

    -> {"execute": "blockdev-add", "arguments": {"driver": "host_cdrom", "node-name": "blk0", "filename": "/dev/sr0"}}
    <- {"return": {}}

2. Attempt to snapshot it

    -> {"execute":"blockdev-snapshot-sync", "arguments": { "node-name": "blk0", "snapshot-file":"/tmp/foo.qcow2","format":"qcow2"}}
    <- {"error": {"class": "GenericError", "desc": "Device '(null)' has no medium"}}

Broken when commit 0901f67ecdb made @device optional.

Use bdrv_get_device_or_node_name() instead.  Now it fails as it
should:

    <- {"error": {"class": "GenericError", "desc": "Device 'blk0' has no medium"}}
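
The underlying hazard is that passing a null pointer to a "%s" conversion is undefined behaviour in C. The sketch below illustrates the fallback idea only; struct node_names, device_or_node_name() and format_no_medium() are hypothetical names for this example and are not the QEMU implementation of bdrv_get_device_or_node_name().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a block node: either name may be missing. */
struct node_names {
    const char *device;     /* may be NULL when addressed by node-name */
    const char *node_name;  /* may be NULL for anonymous nodes */
};

/* Return a printable name, never NULL: prefer the device name, fall
 * back to the node name, then to an empty string. */
static const char *device_or_node_name(const struct node_names *n)
{
    if (n->device && n->device[0] != '\0') {
        return n->device;
    }
    return n->node_name ? n->node_name : "";
}

/* Format the error text without ever passing NULL to "%s". */
static int format_no_medium(char *buf, size_t len, const struct node_names *n)
{
    return snprintf(buf, len, "Device '%s' has no medium",
                    device_or_node_name(n));
}
```

Resolving the name in one helper keeps every error path safe without each caller having to remember the null check.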

Fixes: 0901f67ecdb7 ("qmp: Allow to take external snapshots on bs graphs node.")
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240306142831.2514431-1-armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago iotests: Add test for reset/AioContext switches with NBD exports
Kevin Wolf [Thu, 14 Mar 2024 16:58:25 +0000 (17:58 +0100)]
iotests: Add test for reset/AioContext switches with NBD exports

This replicates the scenario in which the bug was reported.
Unfortunately this relies on actually executing a guest (so that the
firmware initialises the virtio-blk device and moves it to its
configured iothread), so this can't make use of the qtest accelerator
like most other test cases. I tried to find a different easy way to
trigger the bug, but couldn't find one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago nbd/server: Fix race in draining the export
Kevin Wolf [Thu, 14 Mar 2024 16:58:24 +0000 (17:58 +0100)]
nbd/server: Fix race in draining the export

When draining an NBD export, nbd_drained_begin() first sets
client->quiescing so that nbd_client_receive_next_request() won't start
any new request coroutines. Then nbd_drained_poll() tries to make sure
that we wait for any existing request coroutines by checking that
client->nb_requests has become 0.

However, there is a small window between creating a new request
coroutine and increasing client->nb_requests. If a coroutine is in this
state, it won't be waited for and drain returns too early.

In the context of switching to a different AioContext, this means that
blk_aio_attached() will see client->recv_coroutine != NULL and fail its
assertion.

Fix this by increasing client->nb_requests immediately when starting the
coroutine. Doing this after checking whether we should create a new
coroutine is okay because client->lock is held.
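
The ordering argument can be sketched with a deliberately simplified, single-threaded model. Everything here (struct client_model and the helper functions) is hypothetical and only mirrors the field names mentioned in the commit message; the real server allows several requests in flight and protects this state with client->lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-client state. */
struct client_model {
    bool quiescing;     /* set when a drain begins: no new requests */
    bool recv_pending;  /* a request coroutine has been created */
    int nb_requests;    /* requests that a drain must wait for */
};

/* Start a request "coroutine". The counter is bumped here, at creation
 * time, so there is no window in which the coroutine exists uncounted. */
static bool client_start_request(struct client_model *c)
{
    if (c->quiescing || c->recv_pending) {
        return false;
    }
    c->recv_pending = true;
    c->nb_requests++;
    return true;
}

/* Called when the request coroutine has finished. */
static void client_finish_request(struct client_model *c)
{
    assert(c->nb_requests > 0);
    c->recv_pending = false;
    c->nb_requests--;
}

static void drained_begin(struct client_model *c)
{
    c->quiescing = true;
}

/* Drain poll: returns true while there is still work to wait for. */
static bool drained_poll(const struct client_model *c)
{
    return c->nb_requests > 0;
}
```

Because the increment happens in the same step that creates the request, a drain that starts immediately afterwards still sees a non-zero count and keeps polling.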

Cc: qemu-stable@nongnu.org
Fixes: fd6afc501a01 ("nbd/server: Use drained block ops to quiesce the server")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240314165825.40261-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago mirror: Don't call job_pause_point() under graph lock
Kevin Wolf [Wed, 13 Mar 2024 15:30:00 +0000 (16:30 +0100)]
mirror: Don't call job_pause_point() under graph lock

Calling job_pause_point() while holding the graph reader lock
potentially results in a deadlock: bdrv_graph_wrlock() first drains
everything, including the mirror job, which pauses it. The job is only
unpaused at the end of the drain section, which is when the graph writer
lock has been successfully taken. However, if the job happens to be
paused at a pause point where it still holds the reader lock, the writer
lock can't be taken as long as the job is still paused.

Mark job_pause_point() as GRAPH_UNLOCKED and fix mirror accordingly.
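
The rule "no pause point while holding the reader lock" can be illustrated with a toy model in which a counter stands in for the graph reader lock. All names below (struct graph_model, pause_point(), job_iteration() and so on) are hypothetical and do not correspond to QEMU's locking API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a reader count stands in for the reader lock. */
struct graph_model {
    int readers;           /* current holders of the reader lock */
    bool pause_requested;  /* set by a drain that wants the job paused */
    bool paused;
};

static void graph_rdlock(struct graph_model *g)
{
    g->readers++;
}

static void graph_rdunlock(struct graph_model *g)
{
    assert(g->readers > 0);
    g->readers--;
}

/* A pause point must be reached with the reader lock released; with a
 * single modeled job this means the reader count is zero. */
static void pause_point(struct graph_model *g)
{
    assert(g->readers == 0);
    g->paused = g->pause_requested;
}

/* One loop iteration of a job: inspect the graph under the reader lock,
 * release it, and only then offer to pause. */
static void job_iteration(struct graph_model *g)
{
    graph_rdlock(g);
    /* ... read graph state here ... */
    graph_rdunlock(g);
    pause_point(g);
}

/* The writer can only take the lock once no reader is left. */
static bool writer_can_lock(const struct graph_model *g)
{
    return g->readers == 0;
}
```

In this model a paused job never blocks the writer, which is the property the commit restores for the mirror job.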

Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-28125
Fixes: 004915a96a7a ("block: Protect bs->backing with graph_lock")
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20240313153000.33121-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
6 months ago Merge tag 'pull-maintainer-final-130324-1' of https://gitlab.com/stsquad/qemu into...
Peter Maydell [Wed, 13 Mar 2024 15:12:14 +0000 (15:12 +0000)]
Merge tag 'pull-maintainer-final-130324-1' of https://gitlab.com/stsquad/qemu into staging

final updates for 9.0 (testing, gdbstub):

  - fix the over rebuilding of test VMs
  - support Xfer:siginfo:read in gdbstub
  - fix double close() in gdbstub

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAmXxkb0ACgkQ+9DbCVqe
# KkSw9wf+K+3kJYaZ2unEFku3Y6f4Z9XkrZCsFQFVNIJQgpYVc6peQyLUB1pZwzZc
# yoQhmTIgej16iRZc7gEcJhFl2zlX2vulE/m+wiaR0Chv3E2r510AGn4aWl+GLB9+
# /WduHaz1NobPW4JWaarxespa84Re8QZQgqkHX4nwYd++FW63E4uxydL4F1nmSNca
# eTA6RwS48h4wqPzHBX72hYTRUnYrDUSSGCGUDzK3NHumuPi+AQ77GLRMO0MTYFfy
# hWriapogCmghY+Xtn++eUIwDyh1CCnUT6Ntf5Qj06bZ+f6eaTwINM8QWhj9mxYX+
# 5/F5Q4JJDqRPYw/hF4wYXRsiZxTYFw==
# =BOWW
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 13 Mar 2024 11:45:01 GMT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* tag 'pull-maintainer-final-130324-1' of https://gitlab.com/stsquad/qemu:
  gdbstub: Fix double close() of the follow-fork-mode socket
  tests/tcg: Add multiarch test for Xfer:siginfo:read stub
  gdbstub: Add Xfer:siginfo:read stub
  gdbstub: Save target's siginfo
  linux-user: Move tswap_siginfo out of target code
  gdbstub: Rename back gdb_handlesig
  tests/vm: ensure we build everything by default

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months ago Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into...
Peter Maydell [Wed, 13 Mar 2024 15:11:53 +0000 (15:11 +0000)]
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging

virtio,pc,pci: features, cleanups, fixes

more memslots support in libvhost-user
support PCIe Gen5/Gen6 link speeds in pcie
more traces in vdpa
network simulation devices support in vdpa
SMBIOS type 9 descriptor implementation
Bump max_cpus to 4096 vcpus in q35
aw-bits and granule options in VIRTIO-IOMMU
Support reporting NUMA nodes for device memory using GI in acpi
Beginning of shutdown event support in pvpanic

fixes, cleanups all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmXw0TMPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp8x4H+gLMoGwaGAX7gDGPgn2Ix4j/3kO77ZJ9X9k/
# 1KqZu/9eMS1j2Ei+vZqf05w7qRjxxhwDq3ilEXF/+UFqgAehLqpRRB8j5inqvzYt
# +jv0DbL11PBp/oFjWcytm5CbiVsvq8KlqCF29VNzc162XdtcduUOWagL96y8lJfZ
# uPrOoyeR7SMH9lp3LLLHWgu+9W4nOS03RroZ6Umj40y5B7yR0Rrppz8lMw5AoQtr
# 0gMRnFhYXeiW6CXdz+Tzcr7XfvkkYDi/j7ibiNSURLBfOpZa6Y8+kJGKxz5H1K1G
# 6ZY4PBcOpQzl+NMrktPHogczgJgOK10t+1i/R3bGZYw2Qn/93Eg=
# =C0UU
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 22:03:31 GMT
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (68 commits)
  docs/specs/pvpanic: document shutdown event
  hw/cxl: Fix missing reserved data in CXL Device DVSEC
  hmat acpi: Fix out of bounds access due to missing use of indirection
  hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.
  qemu-options.hx: Document the virtio-iommu-pci aw-bits option
  hw/arm/virt: Set virtio-iommu aw-bits default value to 48
  hw/i386/q35: Set virtio-iommu aw-bits default value to 39
  virtio-iommu: Add an option to define the input range width
  virtio-iommu: Trace domain range limits as unsigned int
  qemu-options.hx: Document the virtio-iommu-pci granule option
  virtio-iommu: Change the default granule to the host page size
  virtio-iommu: Add a granule property
  hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
  hw/acpi: Implement the SRAT GI affinity structure
  qom: new object to associate device to NUMA node
  hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
  hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
  hw/i386/pc: Avoid one use of the current_machine global
  hw/i386/pc: Remove "rtc_state" link again
  Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# hw/core/machine.c

6 months ago Merge tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu into staging
Peter Maydell [Wed, 13 Mar 2024 12:37:27 +0000 (12:37 +0000)]
Merge tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu into staging

* PAPR nested hypervisor host implementation for spapr TCG
* excp_helper.c code cleanups and improvements
* Move more ops to decodetree
* Deprecate pseries-2.12 machines and P9 and P10 DD1.0 CPUs
* Document running Linux on AmigaNG
* Update dt feature advertising POWER CPUs.
* Add P10 PMU SPRs
* Improve pnv topology calculation for SMT8 CPUs.
* Various bug fixes.

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmXwiT8ACgkQZ7MCdqhi
# HK7C/w//XxEO2bQTFPLFDTrP/voq7pcX8XeQNVyXCkXYjvsbu05oQow50k+Y5UAE
# US4MFjt8jFz0vuIKuKyoA3kG41zDSOzoX4TQXMM+tyTWbuFF3KAyfizb1xE6SYAN
# xJEGvmiXv/EgoSBD7BTKQp1tMPdIGZLwSdYiA0lmOo7YaMCgYAXaujW5hnNjQecT
# 873sN+10pHtQY++mINtD9Nfb6AcDGMWw0b+bykqIXhNRkI8IGOS4WF4vAuMBrwfe
# UM00wDnNRb86Dk14bv2XVNDr6/i0VRtUMwM4yiptrQ1TQx18LZaPSQFYjQfPaan7
# LwN4QkMFnBX54yJ7Npvjvu8BCBF47kwOVu4CIAFJ4sIm0WfTmozDpPttwcZ5w7Ve
# iXDOB9ECAB4pQ2rCgbSNG8MYUZgoHHOuThqolOP0Vh9NHRRJxpdw6CyAbmCGftc0
# lvRDPFiKp8xmCNJ/j3XzoUdHoG7NMwpUmHv9ruGU18SdQ8hyJN9AcQGWYrB4v0RV
# /hs2RAbwntG7ahkcwd8uy5aFw88Wph/uGXPXc49EWj7i49vHeIV2y5+gtthMywje
# qqjFXkistXuF+JHVnyoYmqqCyXaHX5CEwtawMv4EQeaJs76bLhMeMTKKl9rRp8qB
# DtbIZphO8iMsocrBnje48sA5HR0PM+H4HTjw10i8R0fLlWitaIY=
# =XnY5
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 16:56:31 GMT
# gpg:                using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0  A795 67B3 0276 A862 1CAE

* tag 'pull-ppc-for-9.0-2-20240313' of https://gitlab.com/npiggin/qemu: (38 commits)
  spapr: nested: Introduce cap-nested-papr for Nested PAPR API
  spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
  spapr: nested: Use correct source for parttbl info for nested PAPR API.
  spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
  spapr: nested: Initialize the GSB elements lookup table.
  spapr: nested: Extend nested_ppc_state for nested PAPR API
  spapr: nested: Introduce H_GUEST_CREATE_VCPU hcall.
  spapr: nested: Introduce H_GUEST_[CREATE|DELETE] hcalls.
  spapr: nested: Introduce H_GUEST_[GET|SET]_CAPABILITIES hcalls.
  spapr: nested: Document Nested PAPR API
  spapr: nested: keep nested-hv related code restricted to its API.
  spapr: nested: Introduce SpaprMachineStateNested to store related info.
  spapr: nested: move nested part of spapr_get_pate into spapr_nested.c
  spapr: nested: register nested-hv api hcalls only for cap-nested-hv
  target/ppc: Remove interrupt handler wrapper functions
  target/ppc: Clean up ifdefs in excp_helper.c, part 3
  target/ppc: Clean up ifdefs in excp_helper.c, part 2
  target/ppc: Clean up ifdefs in excp_helper.c, part 1
  target/ppc: Add gen_exception_err_nip() function
  target/ppc: Readability improvements in exception handlers
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months ago Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging
Peter Maydell [Wed, 13 Mar 2024 12:37:15 +0000 (12:37 +0000)]
Merge tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu into staging

Pull request

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmXwpoYACgkQnKSrs4Gr
# c8gE0wf/c0hNDKoV01N8IwfJdmIBySNeCYRQiwcR84iiPoGGAwYdKuLa7wHaQKiO
# iM0EV/ltJiiOGCHxlffVqLBzJurJHsHG6m429KBLRBXWc6gVzhCN9TjD8DwHxiTU
# qzczoev8NJ2y5mrxzPPPjMxSSJEe3Ynas6ngeHeYBUtu0PRNp79zceWdtS0sPzia
# sCI8EH/oCZQgVcwI/UkIOXjzbKK1lZWa2805//KIqvG27i9zHzLJ0l5eeLtbpZpy
# LnFGRyQGGf+jEKAJuT6598q6T+jCkLCMN6zpyKWGvcYleNvBnlw6+N8Il8zV7KSc
# TE5BNk+C7I9aimrRyaz3WrFCZW5DbQ==
# =q9Im
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 19:01:26 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* tag 'tracing-pull-request' of https://gitlab.com/stefanha/qemu:
  meson: generate .stp files for tools too
  tracetool: remove redundant --target-type / --target-name args

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months ago gdbstub: Fix double close() of the follow-fork-mode socket
Ilya Leoshkevich [Tue, 12 Mar 2024 00:07:01 +0000 (01:07 +0100)]
gdbstub: Fix double close() of the follow-fork-mode socket

When the terminal GDB_FORK_ENABLED state is reached, the coordination
socket is not needed anymore and is therefore closed. However, if there
is a communication error between QEMU gdbstub and GDB, the generic
error handling code attempts to close it again.

Fix by closing it later - before returning - instead.
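
The commit fixes the problem by moving the close() to a single place before returning. A related defensive idiom, shown here only as a sketch and not as what the patch does, is to invalidate the descriptor after closing it so that a shared error path cannot close it a second time. The helper name close_once() is hypothetical.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: close *fd at most once. The descriptor is marked
 * as closed by storing -1, so a later call from a generic error path
 * becomes a no-op. Returns 0 on success or no-op, -1 if close() failed. */
static int close_once(int *fd)
{
    int ret = 0;

    if (*fd >= 0) {
        ret = close(*fd);
        *fd = -1;
    }
    return ret;
}
```

A double close() is more than cosmetic: between the two calls the descriptor number may have been reused for an unrelated file, which the second close() would then shut down.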

Fixes: Coverity CID 1539966
Fixes: d547e711a8a5 ("gdbstub: Implement follow-fork-mode child")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240312001813.13720-1-iii@linux.ibm.com>

6 months ago tests/tcg: Add multiarch test for Xfer:siginfo:read stub
Gustavo Romero [Sat, 9 Mar 2024 03:09:01 +0000 (03:09 +0000)]
tests/tcg: Add multiarch test for Xfer:siginfo:read stub

Add multiarch test for testing if Xfer:siginfo:read query is properly
handled by gdbstub.

Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240309030901.1726211-6-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
6 months ago gdbstub: Add Xfer:siginfo:read stub
Gustavo Romero [Sat, 9 Mar 2024 03:09:00 +0000 (03:09 +0000)]
gdbstub: Add Xfer:siginfo:read stub

Add stub to handle Xfer:siginfo:read packet query that requests the
machine's siginfo data.

This is used when a GDB user executes 'print $_siginfo' and when the
machine stops due to a signal, for instance, on SIGSEGV. The information
in siginfo allows GDB to determine further details on the signal, like
the fault address/insn when the SIGSEGV is caught.

Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Message-Id: <20240309030901.1726211-5-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6 months ago gdbstub: Save target's siginfo
Gustavo Romero [Sat, 9 Mar 2024 03:08:59 +0000 (03:08 +0000)]
gdbstub: Save target's siginfo

Save target's siginfo into gdbserver_state so it can be used later, for
example, in any stub that requires the target's si_signo and si_code.

This change affects only linux-user mode.

Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240309030901.1726211-4-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
6 months ago linux-user: Move tswap_siginfo out of target code
Gustavo Romero [Sat, 9 Mar 2024 03:08:58 +0000 (03:08 +0000)]
linux-user: Move tswap_siginfo out of target code

Move tswap_siginfo from target code to handle_pending_signal. This will
allow some cleanups and having the siginfo ready to be used in gdbstub.

Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240309030901.1726211-3-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
6 months ago gdbstub: Rename back gdb_handlesig
Gustavo Romero [Sat, 9 Mar 2024 03:08:57 +0000 (03:08 +0000)]
gdbstub: Rename back gdb_handlesig

Rename gdb_handlesig_reason back to gdb_handlesig. There is no need to
add a wrapper for gdb_handlesig and rename it when a new parameter is
added.

Signed-off-by: Gustavo Romero <gustavo.romero@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240309030901.1726211-2-gustavo.romero@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
6 months ago tests/vm: ensure we build everything by default
Alex Bennée [Sat, 9 Mar 2024 11:56:02 +0000 (11:56 +0000)]
tests/vm: ensure we build everything by default

The "check" target by itself is not enough to ensure we build the user
mode binaries. While we can't test them with check-tcg we can at least
include them in the build.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Gustavo Romero <gustavo.romero@linaro.org>
6 months ago docs/specs/pvpanic: document shutdown event
Thomas Weißschuh [Sun, 10 Mar 2024 15:04:51 +0000 (16:04 +0100)]
docs/specs/pvpanic: document shutdown event

Shutdown requests are normally hardware dependent.
By extending pvpanic to also handle shutdown requests, guests can
submit such requests with an easily implementable and cross-platform
mechanism.

Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de>
Message-Id: <20240310-pvpanic-shutdown-spec-v1-1-b258e182ce55@t-8ch.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago hw/cxl: Fix missing reserved data in CXL Device DVSEC
Jonathan Cameron [Fri, 8 Mar 2024 14:38:31 +0000 (14:38 +0000)]
hw/cxl: Fix missing reserved data in CXL Device DVSEC

The r3.1 specification introduced a new 2 byte field, but
to maintain DWORD alignment, an additional 2 reserved bytes
were added. Forgot those in updating the structure definition
but did include them in the size define leading to a buffer
overrun.

Also use the define so that we don't duplicate the value.
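
A mismatch between a structure definition and its size define can be caught at compile time. The sketch below uses an illustrative layout; struct example_dvsec, its field names and EXAMPLE_DVSEC_SIZE are hypothetical and do not reproduce the CXL Device DVSEC definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size define for an illustrative structure. */
#define EXAMPLE_DVSEC_SIZE 8

/* Illustrative layout only: a new 2-byte field followed by 2 reserved
 * bytes so that the total stays a multiple of 4 (DWORD aligned). */
struct example_dvsec {
    uint16_t existing_field;
    uint16_t another_field;
    uint16_t new_field;  /* the newly introduced 2-byte field */
    uint16_t reserved;   /* padding bytes that are easy to forget */
};

/* Compile-time checks: the structure and its size define must agree. */
_Static_assert(sizeof(struct example_dvsec) == EXAMPLE_DVSEC_SIZE,
               "structure definition out of sync with size define");
_Static_assert(EXAMPLE_DVSEC_SIZE % 4 == 0, "must stay DWORD aligned");
```

If the reserved member were dropped, the first assertion would fail at build time instead of producing an overrun when code copies EXAMPLE_DVSEC_SIZE bytes.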

Fixes: Coverity ID 1534095 buffer overrun
Fixes: 8700ee15de ("hw/cxl: Standardize all references on CXL r3.1 and minor updates")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240308143831.6256-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago hmat acpi: Fix out of bounds access due to missing use of indirection
Jonathan Cameron [Thu, 7 Mar 2024 16:03:26 +0000 (16:03 +0000)]
hmat acpi: Fix out of bounds access due to missing use of indirection

With a numa set up such as

-numa nodeid=0,cpus=0 \
-numa nodeid=1,memdev=mem \
-numa nodeid=2,cpus=1

and appropriate hmat_lb entries the initiator list is correctly
computed and written to HMAT as 0,2 but then the LB data is accessed
using the node id (here 2), landing outside the entry_list array.

Stash the reverse lookup when writing the initiator list and use
it to get the correct array index.
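
The indirection can be sketched as follows. The function build_initiator_list() and the array names are hypothetical; the example only reuses the node layout from the commit message, where nodes 0 and 2 have CPUs and node 1 has memory only.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8
#define NO_INDEX  (-1)

/* Build the compact initiator list from a per-node "has CPUs" flag and
 * record, for each node id, its position in that list (the reverse
 * lookup). Returns the number of initiators. */
static int build_initiator_list(const int has_cpu[MAX_NODES], int num_nodes,
                                uint32_t initiator_list[MAX_NODES],
                                int node_to_index[MAX_NODES])
{
    int count = 0;

    for (int node = 0; node < num_nodes; node++) {
        if (has_cpu[node]) {
            node_to_index[node] = count;
            initiator_list[count++] = (uint32_t)node;
        } else {
            node_to_index[node] = NO_INDEX;
        }
    }
    return count;
}
```

With this layout the list is {0, 2}, so node id 2 maps to index 1; indexing a two-element array directly with the node id 2 is the out-of-bounds access described above.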

Fixes: 4586a2cb83 ("hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s)")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240307160326.31570-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non...
Jonathan Cameron [Thu, 7 Mar 2024 16:03:25 +0000 (16:03 +0000)]
hmat acpi: Do not add Memory Proximity Domain Attributes Structure targetting non existent memory.

If qemu is started with a proximity node containing CPUs alone,
it will provide one of these structures to say memory in this
node is directly connected to itself.

This description is arguably pointless even if there is memory
in the node.  If there is no memory present, and hence no SRAT
entry, it breaks Linux HMAT parsing and the table is rejected.

https://elixir.bootlin.com/linux/v6.7/source/drivers/acpi/numa/hmat.c#L444

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240307160326.31570-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago qemu-options.hx: Document the virtio-iommu-pci aw-bits option
Eric Auger [Thu, 7 Mar 2024 13:43:10 +0000 (14:43 +0100)]
qemu-options.hx: Document the virtio-iommu-pci aw-bits option

Document the new aw-bits option.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-10-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months ago hw/arm/virt: Set virtio-iommu aw-bits default value to 48
Eric Auger [Thu, 7 Mar 2024 13:43:09 +0000 (14:43 +0100)]
hw/arm/virt: Set virtio-iommu aw-bits default value to 48

On ARM we set 48b as a default (matching SMMUv3 SMMU_IDR5.VAX == 0).

hw_compat_8_2 is used to handle the compatibility for machine types
before 9.0 (default was 64 bits).

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <Zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-9-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago hw/i386/q35: Set virtio-iommu aw-bits default value to 39
Eric Auger [Thu, 7 Mar 2024 13:43:08 +0000 (14:43 +0100)]
hw/i386/q35: Set virtio-iommu aw-bits default value to 39

Currently the default input range can extend to 64 bits. On x86,
when the virtio-iommu protects vfio devices, the physical iommu
may support only 39 bits. Let's set the default to 39, as done
for the intel-iommu.

We use hw_compat_8_2 to handle the compatibility for machines
before 9.0 which used to have a virtio-iommu default input range
of 64 bits.

Of course if aw-bits is set from the command line, the default
is overridden.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-8-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
6 months ago virtio-iommu: Add an option to define the input range width
Eric Auger [Thu, 7 Mar 2024 13:43:07 +0000 (14:43 +0100)]
virtio-iommu: Add an option to define the input range width

aw-bits is a new option that allows setting the bit width of
the input address range. This value will be used as a default for
the device config input_range.end. By default it is set to 64 bits
which is the current value.
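
As a worked example of how a bit width translates into an address limit, the sketch below assumes that input_range.end is the inclusive upper bound 2^aw_bits - 1. The helper name input_range_end() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: highest address of an input range that is
 * aw_bits wide. Shifting UINT64_MAX right avoids the undefined
 * behaviour of (1ULL << 64) when aw_bits is 64. */
static uint64_t input_range_end(unsigned aw_bits)
{
    assert(aw_bits >= 1 && aw_bits <= 64);
    return UINT64_MAX >> (64 - aw_bits);
}
```

Under that assumption, 39 bits gives 0x7fffffffff, 48 bits gives 0xffffffffffff, and 64 bits gives the full 64-bit range.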

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240307134445.92296-7-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago virtio-iommu: Trace domain range limits as unsigned int
Eric Auger [Thu, 7 Mar 2024 13:43:06 +0000 (14:43 +0100)]
virtio-iommu: Trace domain range limits as unsigned int

Use %u format to trace domain_range limits.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Message-Id: <20240307134445.92296-6-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago qemu-options.hx: Document the virtio-iommu-pci granule option
Eric Auger [Thu, 7 Mar 2024 13:43:05 +0000 (14:43 +0100)]
qemu-options.hx: Document the virtio-iommu-pci granule option

We are missing an entry for the virtio-iommu-pci device. Add
information on which machines currently support it and document
the new granule option.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-Id: <20240307134445.92296-5-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months ago virtio-iommu: Change the default granule to the host page size
Eric Auger [Thu, 7 Mar 2024 13:43:04 +0000 (14:43 +0100)]
virtio-iommu: Change the default granule to the host page size

We used to set the default granule to 4KB but with VFIO assignment
it makes more sense to use the actual host page size.

Indeed when hotplugging a VFIO device protected by a virtio-iommu
on a 64kB/64kB host/guest config, we currently get a qemu crash:

"vfio: DMA mapping failed, unable to continue"

This is due to the hot-attached VFIO device calling
memory_region_iommu_set_page_size_mask() with 64kB granule
whereas the virtio-iommu granule was already frozen to 4KB on
machine init done.

Set the granule property to "host" and introduce a new compat.
The page size mask used before 9.0 was qemu_target_page_mask().
Since the virtio-iommu currently only supports x86_64 and aarch64,
this matched a 4KB granule.

Note that the new default will prevent running a 4kB guest on a 64kB host
because the granule will be set to 64kB which would be larger
than the guest page size. In that situation, the virtio-iommu
driver fails on viommu_domain_finalise() with
"granule 0x10000 larger than system page size 0x1000".

In that case the workaround is to request a 4K granule.

The current limitation of global granule in the virtio-iommu
should be removed and turned into per domain granule. But
until we get this upgraded, this new default is probably
better because I don't think anyone is currently interested in
running a 4KB page size guest with virtio-iommu on a 64KB host.
However, supporting a 64kB guest on a 64kB host with virtio-iommu and
VFIO looks like a more important feature.
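
The relationship between a granule, its page size mask and the guest-side check can be sketched as below. The helper names are hypothetical, and granule_fits_guest() only models the condition behind the driver error message quoted above; it is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page size mask for a power-of-two granule: every bit at or above the
 * granule is set (4 KiB gives 0xfffffffffffff000). */
static uint64_t page_size_mask(uint64_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return ~(granule - 1);
}

/* Smallest page size advertised by a mask: its lowest set bit. */
static uint64_t mask_to_granule(uint64_t mask)
{
    return mask & (~mask + 1);
}

/* Model of the guest-side check: the granule must not be larger than
 * the guest's own page size. */
static bool granule_fits_guest(uint64_t granule, uint64_t guest_page_size)
{
    return granule <= guest_page_size;
}
```

In this model a 64 KiB granule fails the check for a guest with 4 KiB pages and passes for a guest with 64 KiB pages, matching the two configurations discussed in the commit message.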

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-4-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago virtio-iommu: Add a granule property
Eric Auger [Thu, 7 Mar 2024 13:43:03 +0000 (14:43 +0100)]
virtio-iommu: Add a granule property

This allows choosing which granule will be used by
default by the virtio-iommu. Current page size mask
default is qemu_target_page_mask so this translates
into a 4k granule on ARM and x86_64 where virtio-iommu
is supported.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20240307134445.92296-3-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months ago hw/i386/acpi-build: Add support for SRAT Generic Initiator structures
Ankit Agrawal [Fri, 8 Mar 2024 14:55:25 +0000 (14:55 +0000)]
hw/i386/acpi-build: Add support for SRAT Generic Initiator structures

The acpi-generic-initiator object is added to allow a host device
to be linked with a NUMA node. QEMU uses it to build the SRAT
Generic Initiator Affinity structure [1]. Add support for i386.

[1] ACPI Spec 6.3, Section 5.2.16.6

Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-4-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
6 months agohw/acpi: Implement the SRAT GI affinity structure
Ankit Agrawal [Fri, 8 Mar 2024 14:55:24 +0000 (14:55 +0000)]
hw/acpi: Implement the SRAT GI affinity structure

The ACPI spec provides a scheme to associate "Generic Initiators" [1]
(e.g. heterogeneous processors and accelerators, GPUs, and I/O devices with
integrated compute or DMA engines) with Proximity Domains. This is
achieved using the Generic Initiator Affinity Structure in SRAT. During bootup,
the Linux kernel parses the ACPI SRAT to determine the PXM IDs and creates a
NUMA node for each unique PXM ID encountered. QEMU currently does not implement
these structures while building SRAT.

Add GI structures while building the VM ACPI SRAT. The association between
device and node is stored using the acpi-generic-initiator object. Look up
all such objects and use them to build these structures.

The structure needs a PCI device handle [2] that consists of the device BDF.
The vfio-pci device corresponding to the acpi-generic-initiator object is
located to determine the BDF.

[1] ACPI Spec 6.3, Section 5.2.16.6
[2] ACPI Spec 6.3, Table 5.80

Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Cedric Le Goater <clg@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-3-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agoqom: new object to associate device to NUMA node
Ankit Agrawal [Fri, 8 Mar 2024 14:55:23 +0000 (14:55 +0000)]
qom: new object to associate device to NUMA node

NVIDIA GPUs support the MIG (Multi-Instance GPU) feature [1], which allows
partitioning of the GPU device resources (including device memory) into
several (up to 8) isolated instances. Each partition's memory needs
a dedicated NUMA node to operate. The partitions are not fixed and they
can be created/deleted at runtime.

Unfortunately, Linux does not provide a means to dynamically create/destroy
NUMA nodes, and implementing such a feature is not expected to be trivial. The
nodes that the OS discovers at boot time while parsing SRAT remain fixed. So
we utilize the Generic Initiator (GI) Affinity structures, which allow
association between nodes and devices. Multiple GI structures per BDF are
possible, allowing creation of multiple nodes by exposing a unique PXM in each
of these structures.

Implement the mechanism to build the GI affinity structures, as QEMU currently
does not. Introduce a new acpi-generic-initiator object to allow the host admin
to link a device with an associated NUMA node. QEMU maintains this association
and uses this object to build the requisite GI Affinity Structure.

When multiple NUMA nodes are associated with a device, as many
acpi-generic-initiator objects must be created, each representing
a unique device:node association.

The following is one of the decoded GI affinity structures in the VM ACPI SRAT:
[0C8h 0200   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0C9h 0201   1]                       Length : 20

[0CAh 0202   1]                    Reserved1 : 00
[0CBh 0203   1]           Device Handle Type : 01
[0CCh 0204   4]             Proximity Domain : 00000007
[0D0h 0208  16]                Device Handle : 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00
[0E0h 0224   4]        Flags (decoded below) : 00000001
                                     Enabled : 1
[0E4h 0228   4]                    Reserved2 : 00000000

[0E8h 0232   1]                Subtable Type : 05 [Generic Initiator Affinity]
[0E9h 0233   1]                       Length : 20
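
The byte layout in the dump above can be sketched in C as follows. The
field offsets are read off the dump (type, length, reserved, handle
type, proximity domain, 16-byte device handle, flags, reserved). The
function names are hypothetical, and the BDF packing
(bus << 8 | dev << 3 | fn, stored little-endian after a 2-byte segment)
is an assumption that is consistent with the "00 00 20 00" handle
shown for a device at 04.0 on bus 0; this is not the QEMU
implementation.

```c
#include <stdint.h>
#include <string.h>

enum { GI_AFFINITY_LEN = 32 };  /* "Length : 20" (hex) in the dump */

/* Store the low nbytes of v at p in little-endian order. */
static void put_le(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = 0; i < nbytes; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Hypothetical builder for one GI affinity structure with a PCI handle. */
static void build_gi_affinity(uint8_t out[GI_AFFINITY_LEN], uint32_t pxm,
                              uint16_t segment, uint8_t bus,
                              uint8_t dev, uint8_t fn)
{
    /* Assumed packing: bus in bits 15:8, device in 7:3, function in 2:0. */
    uint16_t bdf = (uint16_t)((bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7));

    memset(out, 0, GI_AFFINITY_LEN);  /* reserved fields stay zero */
    out[0] = 5;                       /* Subtable Type: GI Affinity */
    out[1] = GI_AFFINITY_LEN;         /* Length */
    out[3] = 1;                       /* Device Handle Type: PCI */
    put_le(out + 4, pxm, 4);          /* Proximity Domain */
    put_le(out + 8, segment, 2);      /* Device Handle: PCI segment */
    put_le(out + 10, bdf, 2);         /* Device Handle: BDF */
    put_le(out + 24, 1, 4);           /* Flags: Enabled */
}
```

Calling build_gi_affinity(buf, 7, 0, 0, 4, 0) fills buf with the same
field values as the first decoded structure above.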

An admin can provide a range of acpi-generic-initiator objects, each
associating a device (by providing the id through pci-dev argument)
to the desired NUMA node (using the node argument). Currently, only PCI
devices are supported.

For the Grace Hopper system, create a range of 8 nodes and associate them
with the device using acpi-generic-initiator objects. While a configuration
of fewer than 8 nodes per device is allowed, such a configuration will prevent
utilization of the feature to the fullest. The following sample creates 8
nodes per PCI device for a VM with 2 PCI devices and links them to the
respective PCI device using acpi-generic-initiator objects:

-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 \
-numa node,nodeid=5 -numa node,nodeid=6 -numa node,nodeid=7 \
-numa node,nodeid=8 -numa node,nodeid=9 \
-device vfio-pci-nohotplug,host=0009:01:00.0,bus=pcie.0,addr=04.0,rombar=0,id=dev0 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
-object acpi-generic-initiator,id=gi1,pci-dev=dev0,node=3 \
-object acpi-generic-initiator,id=gi2,pci-dev=dev0,node=4 \
-object acpi-generic-initiator,id=gi3,pci-dev=dev0,node=5 \
-object acpi-generic-initiator,id=gi4,pci-dev=dev0,node=6 \
-object acpi-generic-initiator,id=gi5,pci-dev=dev0,node=7 \
-object acpi-generic-initiator,id=gi6,pci-dev=dev0,node=8 \
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \

-numa node,nodeid=10 -numa node,nodeid=11 -numa node,nodeid=12 \
-numa node,nodeid=13 -numa node,nodeid=14 -numa node,nodeid=15 \
-numa node,nodeid=16 -numa node,nodeid=17 \
-device vfio-pci-nohotplug,host=0009:01:01.0,bus=pcie.0,addr=05.0,rombar=0,id=dev1 \
-object acpi-generic-initiator,id=gi8,pci-dev=dev1,node=10 \
-object acpi-generic-initiator,id=gi9,pci-dev=dev1,node=11 \
-object acpi-generic-initiator,id=gi10,pci-dev=dev1,node=12 \
-object acpi-generic-initiator,id=gi11,pci-dev=dev1,node=13 \
-object acpi-generic-initiator,id=gi12,pci-dev=dev1,node=14 \
-object acpi-generic-initiator,id=gi13,pci-dev=dev1,node=15 \
-object acpi-generic-initiator,id=gi14,pci-dev=dev1,node=16 \
-object acpi-generic-initiator,id=gi15,pci-dev=dev1,node=17 \

Link: https://www.nvidia.com/en-in/technologies/multi-instance-gpu
Cc: Jonathan Cameron <qemu-devel@nongnu.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Ankit Agrawal <ankita@nvidia.com>
Message-Id: <20240308145525.10886-2-ankita@nvidia.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it
Bernhard Beschow [Sun, 3 Mar 2024 18:53:32 +0000 (19:53 +0100)]
hw/i386/pc: Inline pc_cmos_init() into pc_cmos_init_late() and remove it

Now that pc_cmos_init() doesn't populate the X86MachineState::rtc attribute any
longer, its duties can be merged into pc_cmos_init_late() which is called within
machine_done notifier. This frees pc_piix and pc_q35 from explicit CMOS
initialization.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-5-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/i386/pc: Set "normal" boot device order in pc_basic_device_init()
Bernhard Beschow [Sun, 3 Mar 2024 18:53:31 +0000 (19:53 +0100)]
hw/i386/pc: Set "normal" boot device order in pc_basic_device_init()

The boot device order may change during the lifetime of a VM. Usually, the
"normal" order is set once during machine init(). However, if a user specifies
`-boot once=...`, the "normal" order is overwritten by the "once" order just
before machine_done, and a reset handler is registered which restores the
"normal" order during the next reset.

In the next patch, pc_cmos_init() will be inlined into pc_cmos_init_late() which
runs during machine_done. This means that the "once" boot order would be
overwritten again with the "normal" boot order -- which renders the user's
choice ineffective. Fix this by setting the "normal" boot order in
pc_basic_device_init() which already registers the boot_set() handler.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-4-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/i386/pc: Avoid one use of the current_machine global
Bernhard Beschow [Sun, 3 Mar 2024 18:53:30 +0000 (19:53 +0100)]
hw/i386/pc: Avoid one use of the current_machine global

The RTC can be accessed through the X86 machine instance, so rather than passing
the RTC it's possible to pass the machine state instead. This saves
pc_boot_set() from having to access the current_machine global.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-3-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months agohw/i386/pc: Remove "rtc_state" link again
Bernhard Beschow [Sun, 3 Mar 2024 18:53:29 +0000 (19:53 +0100)]
hw/i386/pc: Remove "rtc_state" link again

Commit 99e1c1137b6f "hw/i386/pc: Populate RTC attribute directly" made linking
the "rtc_state" property unnecessary and removed it. Commit 84e945aad2d0 "vl,
pc: turn -no-fd-bootchk into a machine property" accidentally reintroduced the
link. Remove it again since it is not needed.

Fixes: 84e945aad2d0 "vl, pc: turn -no-fd-bootchk into a machine property"
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240303185332.1408-2-shentey@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months agoRevert "hw/i386/pc: Confine system flash handling to pc_sysfw"
Bernhard Beschow [Mon, 26 Feb 2024 21:59:09 +0000 (22:59 +0100)]
Revert "hw/i386/pc: Confine system flash handling to pc_sysfw"

Specifying the property `-M pflash0` results in a regression:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
Revert the change for now until a solution is found.

This reverts commit 6f6ad2b24582593d8feb00434ce2396840666227.

Reported-by: Volker Rümelin <vr_qemu@t-online.de>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-3-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agoRevert "hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it"
Bernhard Beschow [Mon, 26 Feb 2024 21:59:08 +0000 (22:59 +0100)]
Revert "hw/i386/pc_sysfw: Inline pc_system_flash_create() and remove it"

Commit 6f6ad2b24582 "hw/i386/pc: Confine system flash handling to pc_sysfw"
causes a regression when specifying the property `-M pflash0` in the PCI PC
machines:
  qemu-system-x86_64: Property 'pc-q35-9.0-machine.pflash0' not found
In order to revert the commit, the commit below must be reverted first.

This reverts commit cb05cc16029bb0a61ac5279ab7b3b90dcf2aa69f.

Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20240226215909.30884-2-shentey@gmail.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agopc: q35: Bump max_cpus to 4096 vcpus
Ani Sinha [Wed, 28 Feb 2024 14:33:51 +0000 (20:03 +0530)]
pc: q35: Bump max_cpus to 4096 vcpus

Since commit f10a570b093e6 ("KVM: x86: Add CONFIG_KVM_MAX_NR_VCPUS to allow up to 4096 vCPUs")
the Linux kernel can support up to a maximum of 4096 vcpus when MAXSMP is
enabled in the kernel. At present, QEMU has been tested to correctly boot a
linux guest with 4096 vcpus using the current edk2 upstream master branch that
has the fixes corresponding to the following two PRs:

https://github.com/tianocore/edk2/pull/5410
https://github.com/tianocore/edk2/pull/5418

The changes merged into edk2 with the above PRs will be in the upcoming 2024-05
release. With current seabios firmware, it boots fine with 4096 vcpus already.
So bump up the value max_cpus to 4096 for q35 machines versions 9 and newer.
Q35 machines versions 8.2 and older continue to support 1024 maximum vcpus
as before for compatibility reasons.

If KVM is not able to support the specified number of vcpus, QEMU would
return the following error messages:

$ ./qemu-system-x86_64 -cpu host -accel kvm -machine q35 -smp 1728
qemu-system-x86_64: -accel kvm: warning: Number of SMP cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
qemu-system-x86_64: -accel kvm: warning: Number of hotpluggable cpus requested (1728) exceeds the recommended cpus supported by KVM (12)
Number of SMP cpus requested (1728) exceeds the maximum cpus supported by KVM (1024)

Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Julia Suvorova <jusual@redhat.com>
Cc: kraxel@redhat.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20240228143351.3967-1-anisinha@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/pci: Always call pcie_sriov_pf_reset()
Akihiko Odaki [Wed, 28 Feb 2024 11:33:16 +0000 (20:33 +0900)]
hw/pci: Always call pcie_sriov_pf_reset()

Call pcie_sriov_pf_reset() from pci_do_device_reset() just as we do
for msi_reset() and msix_reset() to prevent duplicating code for each
SR-IOV PF.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-5-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
6 months agopcie_sriov: Do not reset NumVFs after disabling VFs
Akihiko Odaki [Wed, 28 Feb 2024 11:33:15 +0000 (20:33 +0900)]
pcie_sriov: Do not reset NumVFs after disabling VFs

The spec does not say that NumVFs is reset after disabling VFs except when
resetting the PF. Clearing it is guest visible and out of spec, even
though Linux doesn't rely on this value being preserved, so we never
noticed.

Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-4-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agopcie_sriov: Reset SR-IOV extended capability
Akihiko Odaki [Wed, 28 Feb 2024 11:33:14 +0000 (20:33 +0900)]
pcie_sriov: Reset SR-IOV extended capability

pcie_sriov_pf_disable_vfs() is called when resetting the PF, but it only
disables VFs and does not reset SR-IOV extended capability, leaking the
state and making the VF Enable register inconsistent with the actual
state.

Replace pcie_sriov_pf_disable_vfs() with pcie_sriov_pf_reset(), which
does not only disable VFs but also resets the capability.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-3-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
6 months agopcie_sriov: Validate NumVFs
Akihiko Odaki [Wed, 28 Feb 2024 11:33:13 +0000 (20:33 +0900)]
pcie_sriov: Validate NumVFs

The guest may write NumVFs greater than TotalVFs and that can lead
to buffer overflow in VF implementations.
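
The kind of bounds check this calls for can be sketched as below. The
struct and function names are hypothetical and only model the two
fields involved. Whether the actual fix rejects or clamps an
out-of-range value is not stated here, so rejecting it is an
assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical model of the two SR-IOV capability fields involved. */
struct sriov_model {
    uint16_t total_vfs;  /* TotalVFs: upper bound fixed by the device */
    uint16_t num_vfs;    /* NumVFs: value written by the guest */
};

/*
 * Return how many VFs may be instantiated. A guest-written NumVFs above
 * TotalVFs is treated as invalid and enables nothing, so per-VF arrays
 * sized by TotalVFs are never indexed out of bounds.
 */
static uint16_t vfs_to_enable(const struct sriov_model *m)
{
    if (m->num_vfs > m->total_vfs) {
        return 0;
    }
    return m->num_vfs;
}
```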

Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26327
Fixes: 7c0fa8dff811 ("pcie: Add support for Single Root I/O Virtualization (SR/IOV)")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-2-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@ericsson.com>
6 months agohw/nvme: Use pcie_sriov_num_vfs()
Akihiko Odaki [Wed, 28 Feb 2024 11:33:12 +0000 (20:33 +0900)]
hw/nvme: Use pcie_sriov_num_vfs()

nvme_sriov_pre_write_ctrl() used to directly inspect SR-IOV
configurations to know the number of VFs being disabled due to SR-IOV
configuration writes, but the logic was flawed and resulted in
out-of-bound memory access.

It assumed PCI_SRIOV_NUM_VF always has the number of currently enabled
VFs, but it actually doesn't in the following cases:
- PCI_SRIOV_NUM_VF has been set but PCI_SRIOV_CTRL_VFE has never been.
- PCI_SRIOV_NUM_VF was written after PCI_SRIOV_CTRL_VFE was set.
- VFs were only partially enabled because of realization failure.

It is a responsibility of pcie_sriov to interpret SR-IOV configurations
and pcie_sriov does it correctly, so use pcie_sriov_num_vfs(), which it
provides, to get the number of enabled VFs before and after SR-IOV
configuration writes.

Cc: qemu-stable@nongnu.org
Fixes: CVE-2024-26328
Fixes: 11871f53ef8e ("hw/nvme: Add support for the Virtualization Management command")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240228-reuse-v8-1-282660281e60@daynix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agoImplement SMBIOS type 9 v2.6
Felix Wu [Wed, 21 Feb 2024 17:00:27 +0000 (17:00 +0000)]
Implement SMBIOS type 9 v2.6

Signed-off-by: Felix Wu <flwu@google.com>
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Message-Id: <20240221170027.1027325-3-nabihestefan@google.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agoImplement base of SMBIOS type 9 descriptor.
Felix Wu [Wed, 21 Feb 2024 17:00:26 +0000 (17:00 +0000)]
Implement base of SMBIOS type 9 descriptor.

Version 2.1+.

Signed-off-by: Felix Wu <flwu@google.com>
Signed-off-by: Nabih Estefan <nabihestefan@google.com>
Message-Id: <20240221170027.1027325-2-nabihestefan@google.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/intc: Check @errp to handle the error of IOAPICCommonClass.realize()
Zhao Liu [Fri, 23 Feb 2024 08:56:53 +0000 (16:56 +0800)]
hw/intc: Check @errp to handle the error of IOAPICCommonClass.realize()

IOAPICCommonClass implements its own private realize(), and this private
realize() can report an error.

Since IOAPICCommonClass.realize() returns void, to check the error,
dereference @errp with ERRP_GUARD().

Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-8-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months agohw/vfio/iommufd: Fix missing ERRP_GUARD() in iommufd_cdev_getfd()
Zhao Liu [Fri, 23 Feb 2024 08:56:52 +0000 (16:56 +0800)]
hw/vfio/iommufd: Fix missing ERRP_GUARD() in iommufd_cdev_getfd()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in iommufd_cdev_getfd(), @errp is dereferenced without ERRP_GUARD():

if (*errp) {
    error_prepend(errp, VFIO_MSG_PREFIX, path);
}

Currently, since vfio_attach_device() - the caller of
iommufd_cdev_getfd() - is always called in DeviceClass.realize() context
and doesn't get a NULL @errp parameter, iommufd_cdev_getfd()
hasn't triggered the bug of dereferencing a NULL @errp.

To follow the requirement of @errp, add missing ERRP_GUARD() in
iommufd_cdev_getfd().
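
The hazard can be sketched with simplified stand-in types (this is not
QEMU's real Error API, and all demo_* names are hypothetical). A caller
may legitimately pass NULL as @errp, so a check such as "if (*errp)"
would dereference NULL. The sketch routes the error through a local
variable instead, which is roughly the manual pattern that
ERRP_GUARD() is meant to spare.

```c
#include <stddef.h>

/* Simplified stand-in; QEMU's real Error type is richer. */
typedef struct Error {
    const char *msg;
} Error;

static Error demo_failure = { "open failed" };

/* Callee that returns void and reports failure only through errp. */
static void demo_open(int should_fail, Error **errp)
{
    if (should_fail && errp) {
        *errp = &demo_failure;
    }
}

/*
 * Caller that needs to know whether the callee failed. It never
 * evaluates *errp, so passing errp == NULL is safe.
 */
static int demo_getfd(int should_fail, Error **errp)
{
    Error *local_err = NULL;

    demo_open(should_fail, &local_err);
    if (local_err) {
        if (errp) {
            *errp = local_err;  /* hand the error to the caller */
        }
        return -1;
    }
    return 0;
}
```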

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-7-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/pci-bridge/cxl_upstream: Fix missing ERRP_GUARD() in cxl_usp_realize()
Zhao Liu [Fri, 23 Feb 2024 08:56:51 +0000 (16:56 +0800)]
hw/pci-bridge/cxl_upstream: Fix missing ERRP_GUARD() in cxl_usp_realize()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in cxl_usp_realize(), @errp is dereferenced without ERRP_GUARD():

cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
    goto err_cap;
}

Here we check *errp, because cxl_doe_cdat_init() returns void. And since
cxl_usp_realize() - as a PCIDeviceClass.realize() method - doesn't get
a NULL @errp parameter, it hasn't triggered the bug of dereferencing
a NULL @errp.

To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_usp_realize().

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-6-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
6 months agohw/misc/xlnx-versal-trng: Check returned bool in trng_prop_fault_event_set()
Zhao Liu [Fri, 23 Feb 2024 08:56:50 +0000 (16:56 +0800)]
hw/misc/xlnx-versal-trng: Check returned bool in trng_prop_fault_event_set()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in trng_prop_fault_event_set, @errp is dereferenced without
ERRP_GUARD():

visit_type_uint32(v, name, events, errp);
if (*errp) {
    return;
}

Currently, since trng_prop_fault_event_set() doesn't get a NULL @errp
parameter as a "set" method of an object property, it hasn't triggered the
bug of dereferencing a NULL @errp.

And since visit_type_uint32() returns bool, check the returned bool
directly instead of dereferencing @errp, so that we needn't add the missing
ERRP_GUARD().
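
A minimal sketch of this alternative, using stand-in types rather than
QEMU's real Error and visitor APIs (all demo_* names are hypothetical):
when the helper already returns a bool, the caller can test that value
and never needs to look at *errp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in; QEMU's real Error type is richer. */
typedef struct Error {
    const char *msg;
} Error;

static Error demo_bad_value = { "invalid value" };

/*
 * Stand-in for a visitor-style helper: returns true on success, or
 * false on failure after setting *errp when errp is not NULL.
 */
static bool demo_visit_uint32(int64_t input, uint32_t *out, Error **errp)
{
    if (input < 0 || input > UINT32_MAX) {
        if (errp) {
            *errp = &demo_bad_value;
        }
        return false;
    }
    *out = (uint32_t)input;
    return true;
}

/* Property setter that relies on the return value, not on *errp. */
static void demo_prop_set(int64_t input, uint32_t *events, Error **errp)
{
    uint32_t value;

    if (!demo_visit_uint32(input, &value, errp)) {
        return;  /* error already reported through errp, if any */
    }
    *events = value;
}
```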

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20240223085653.1255438-5-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
6 months agohw/mem/cxl_type3: Fix missing ERRP_GUARD() in ct3_realize()
Zhao Liu [Fri, 23 Feb 2024 08:56:49 +0000 (16:56 +0800)]
hw/mem/cxl_type3: Fix missing ERRP_GUARD() in ct3_realize()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in ct3_realize(), @errp is dereferenced without ERRP_GUARD():

cxl_doe_cdat_init(cxl_cstate, errp);
if (*errp) {
    goto err_free_special_ops;
}

Here we check *errp, because cxl_doe_cdat_init() returns void. And since
ct3_realize() - as a PCIDeviceClass.realize() method - doesn't get a
NULL @errp parameter, it hasn't triggered the bug of dereferencing
a NULL @errp.

To follow the requirement of @errp, add missing ERRP_GUARD() in
ct3_realize().

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-4-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
6 months agohw/display/macfb: Fix missing ERRP_GUARD() in macfb_nubus_realize()
Zhao Liu [Fri, 23 Feb 2024 08:56:48 +0000 (16:56 +0800)]
hw/display/macfb: Fix missing ERRP_GUARD() in macfb_nubus_realize()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in macfb_nubus_realize(), @errp is dereferenced without
ERRP_GUARD():

ndc->parent_realize(dev, errp);
if (*errp) {
    return;
}

Here we check *errp, because ndc->parent_realize(), as a
DeviceClass.realize() callback, returns void. And since
macfb_nubus_realize(), also as a DeviceClass.realize(), doesn't get a
NULL @errp parameter, it hasn't triggered the bug of dereferencing a
NULL @errp.

To follow the requirement of @errp, add missing ERRP_GUARD() in
macfb_nubus_realize().

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-3-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/cxl/cxl-host: Fix missing ERRP_GUARD() in cxl_fixed_memory_window_config()
Zhao Liu [Fri, 23 Feb 2024 08:56:47 +0000 (16:56 +0800)]
hw/cxl/cxl-host: Fix missing ERRP_GUARD() in cxl_fixed_memory_window_config()

As the comment in qapi/error says, dereferencing @errp requires
ERRP_GUARD():

* = Why, when and how to use ERRP_GUARD() =
*
* Without ERRP_GUARD(), use of the @errp parameter is restricted:
* - It must not be dereferenced, because it may be null.
...
* ERRP_GUARD() lifts these restrictions.
*
* To use ERRP_GUARD(), add it right at the beginning of the function.
* @errp can then be used without worrying about the argument being
* NULL or &error_fatal.
*
* Using it when it's not needed is safe, but please avoid cluttering
* the source with useless code.

But in cxl_fixed_memory_window_config(), @errp is dereferenced in 2
places without ERRP_GUARD():

fw->enc_int_ways = cxl_interleave_ways_enc(fw->num_targets, errp);
if (*errp) {
    return;
}

and

fw->enc_int_gran =
    cxl_interleave_granularity_enc(object->interleave_granularity,
                                   errp);
if (*errp) {
    return;
}

For the above 2 places, we check "*errp", because neither function
returns a suitable error code. And since machine_set_cfmw() - the caller
of cxl_fixed_memory_window_config() - doesn't get a NULL @errp
parameter as the "set" method of an object property,
cxl_fixed_memory_window_config() hasn't triggered the bug of
dereferencing a NULL @errp.

To follow the requirement of @errp, add missing ERRP_GUARD() in
cxl_fixed_memory_window_config().

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20240223085653.1255438-2-zhao1.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
6 months agohw/virtio: Add support for VDPA network simulation devices
Hao Chen [Wed, 21 Feb 2024 07:38:02 +0000 (15:38 +0800)]
hw/virtio: Add support for VDPA network simulation devices

This patch adds support for VDPA network simulation devices.
The device is developed based on virtio-net and tap backend,
and supports hardware live migration function.

For more details, please refer to "docs/system/devices/vdpa-net.rst"

Signed-off-by: Hao Chen <chenh@yusur.tech>
Message-Id: <20240221073802.2888022-1-chenh@yusur.tech>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/virtio: check owner for removing objects
Albert Esteve [Mon, 19 Feb 2024 14:34:19 +0000 (15:34 +0100)]
hw/virtio: check owner for removing objects

Shared objects lack spoofing protection.
For VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE messages
received by the vhost-user interface, any backend was
allowed to remove entries from the shared table just
by knowing the UUID. Only the owner of the entry
shall be allowed to remove its resources
from the table.

To fix that, add a check for all
*SHARED_OBJECT_REMOVE messages received.
A vhost device can only remove TYPE_VHOST_DEV
entries that it owns; otherwise skip
the removal, and inform the device in the answer
that the entry has not been removed.
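
The ownership rule can be sketched with a small table model. The
struct, the function and the fixed-size UUID are hypothetical and do
not mirror the vhost-user data structures; the point is only that a
matching UUID is not enough, the requester must also be the owner.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_UUID_LEN 16

/* Hypothetical shared-table entry. */
struct demo_entry {
    unsigned char uuid[DEMO_UUID_LEN];
    const void *owner;  /* device that registered the entry */
    bool in_use;
};

/*
 * Remove the entry with the given UUID, but only when the requester
 * owns it. Returns true if an entry was removed, false otherwise, so
 * the caller can report the outcome back to the requesting device.
 */
static bool demo_remove(struct demo_entry *table, size_t n,
                        const unsigned char *uuid, const void *requester)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].in_use &&
            memcmp(table[i].uuid, uuid, DEMO_UUID_LEN) == 0) {
            if (table[i].owner != requester) {
                return false;  /* knowing the UUID is not sufficient */
            }
            table[i].in_use = false;
            return true;
        }
    }
    return false;  /* no such entry */
}
```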

Signed-off-by: Albert Esteve <aesteve@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20240219143423.272012-2-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/audio/virtio-sound: return correct command response size
Volker Rümelin [Sun, 18 Feb 2024 08:33:41 +0000 (09:33 +0100)]
hw/audio/virtio-sound: return correct command response size

The payload size returned by command VIRTIO_SND_R_PCM_INFO is
wrong. The code in process_cmd() assumes that all commands
return only a virtio_snd_hdr payload, but some commands like
VIRTIO_SND_R_PCM_INFO may return an additional payload.

Add a zero initialized payload_size variable to struct
virtio_snd_ctrl_command to allow for additional payloads.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240218083351.8524-1-vr_qemu@t-online.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agohw/pci-bridge/pxb-cxl: Drop RAS capability from host bridge.
Jonathan Cameron [Thu, 15 Feb 2024 15:52:06 +0000 (15:52 +0000)]
hw/pci-bridge/pxb-cxl: Drop RAS capability from host bridge.

This CXL component isn't allowed to have a RAS capability.
Whilst this should be harmless as software is not expected to look
here, good to clean it up.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240215155206.2736-1-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: trace skipped memory sections
Eugenio Pérez [Thu, 15 Feb 2024 10:36:16 +0000 (11:36 +0100)]
vdpa: trace skipped memory sections

Sometimes, certain parts are not being skipped in
vhost_vdpa_listener_region_del, but they are skipped in
vhost_vdpa_listener_region_add, or vice versa.  The vhost-vdpa code
expects all parts to maintain their properties, so we're adding a trace
to help with debugging when any part is skipped.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240215103616.330518-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: stash memory region properties in vars
Eugenio Pérez [Thu, 15 Feb 2024 10:36:15 +0000 (11:36 +0100)]
vdpa: stash memory region properties in vars

The next changes use these variables, so avoid calling the memory
region functions repeatedly. No functional change intended.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20240215103616.330518-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agopcie: Support PCIe Gen5/Gen6 link speeds
Lukas Stockner [Thu, 15 Feb 2024 01:23:26 +0000 (02:23 +0100)]
pcie: Support PCIe Gen5/Gen6 link speeds

This patch extends the PCIe link speed option so that slots can be
configured as supporting 32GT/s (Gen5) or 64GT/s (Gen6) speeds.
This is as simple as setting the appropriate bit in LnkCap2 and
the appropriate value in LnkCap and LnkCtl2.
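
As a rough illustration of the encodings involved (a sketch, not the
QEMU code): the 4-bit speed fields in LnkCap and LnkCtl2 use the value
N for Gen N, and the LnkCap2 Supported Link Speeds Vector uses register
bit N for Gen N. The helper names are hypothetical, and the sketch
assumes a slot advertises every speed up to its maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Value for the 4-bit speed field (LnkCap max speed, LnkCtl2 target). */
uint32_t lnk_speed_field(unsigned gen)
{
    assert(gen >= 1 && gen <= 6);
    return gen & 0xf;
}

/* LnkCap2 Supported Link Speeds Vector: register bit N set for Gen N. */
uint32_t lnkcap2_speed_vector(unsigned max_gen)
{
    uint32_t vec = 0;

    assert(max_gen >= 1 && max_gen <= 6);
    for (unsigned gen = 1; gen <= max_gen; gen++) {
        vec |= 1u << gen;
    }
    return vec;
}
```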

Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com>
Message-Id: <20240215012326.3272366-1-lstockner@genesiscloud.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Mark mmap'ed region memory as MADV_DONTDUMP
David Hildenbrand [Wed, 14 Feb 2024 15:17:01 +0000 (16:17 +0100)]
libvhost-user: Mark mmap'ed region memory as MADV_DONTDUMP

We already use MAP_NORESERVE to deal with sparse memory regions. Let's
also set madvise(MADV_DONTDUMP), otherwise a crash of the process can
result in us allocating all memory in the mmap'ed region for dumping
purposes.

This change implies that the mmap'ed rings won't be included in a
coredump. If ever required for debugging purposes, we could mark only
the mapped rings MADV_DODUMP.

Ignore errors during madvise() for now.
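
A minimal sketch of the pattern, assuming an anonymous private mapping
for simplicity (libvhost-user maps a file descriptor instead, and
map_region_nodump() is a hypothetical helper name):

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `size` bytes without reserving swap and keep them out of dumps. */
void *map_region_nodump(size_t size)
{
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

    if (addr == MAP_FAILED) {
        return NULL;
    }
#ifdef MADV_DONTDUMP
    /* Best effort only: a failure here is deliberately ignored. */
    (void)madvise(addr, size, MADV_DONTDUMP);
#endif
    return addr;
}
```

The mapping stays fully usable; only core dumps skip it.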

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-15-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Dynamically remap rings after (temporarily?) removing memory regions
David Hildenbrand [Wed, 14 Feb 2024 15:17:00 +0000 (16:17 +0100)]
libvhost-user: Dynamically remap rings after (temporarily?) removing memory regions

Currently, we try to remap all rings whenever we add a single new memory
region. That doesn't quite make sense, because we already map rings when
setting the ring address, and panic if that goes wrong. Likely, that
handling was simply copied from set_mem_table code, where we actually
have to remap all rings.

Remapping all rings might require us to walk quite a lot of memory
regions to perform the address translations. Ideally, we'd simply remove
that remapping.

However, let's be a bit careful. There might be some weird corner cases
where we might temporarily remove a single memory region (e.g., resize
it), that would have worked for now. Further, a ring might be located on
hotplugged memory, and as the VM reboots, we might unplug that memory, to
hotplug memory before resetting the ring addresses.

So let's unmap affected rings as we remove a memory region, and try
dynamically mapping the ring again when required.

Acked-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-14-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Factor out vq usability check
David Hildenbrand [Wed, 14 Feb 2024 15:16:59 +0000 (16:16 +0100)]
libvhost-user: Factor out vq usability check

Let's factor it out to prepare for further changes.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-13-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Use most of mmap_offset as fd_offset
David Hildenbrand [Wed, 14 Feb 2024 15:16:58 +0000 (16:16 +0100)]
libvhost-user: Use most of mmap_offset as fd_offset

In the past, QEMU would create memory regions that could partially cover
hugetlb pages, making mmap() fail if we would use the mmap_offset as an
fd_offset. For that reason, we never used the mmap_offset as an offset into
the fd and instead always mapped the fd from the very start.

However, that can easily result in us mmap'ing a lot of unnecessary
parts of an fd, possibly repeatedly.

QEMU nowadays does not create memory regions that partially cover huge
pages -- it never really worked with postcopy. QEMU handles merging of
regions that partially cover huge pages (due to holes in boot memory) since
2018 in c1ece84e7c93 ("vhost: Huge page align and merge").

Let's be a bit careful and not unconditionally convert the
mmap_offset into an fd_offset. Instead, let's simply detect the hugetlb
size and pass as much as we can as fd_offset, making sure that we call
mmap() with a properly aligned offset.

With QEMU and a virtio-mem device that is fully plugged (50GiB using 50
memslots), the qemu-storage-daemon process consumes 1281GiB of VA space
before this change and 58GiB after this change.

================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags:   0x9
Size:    40
Fds: 59
Adding region 4
    guest_phys_addr: 0x0000000200000000
    memory_size:     0x0000000040000000
    userspace_addr:  0x00007fb73bffe000
    old mmap_offset: 0x0000000080000000
    fd_offset:       0x0000000080000000
    new mmap_offset: 0x0000000000000000
    mmap_addr:       0x00007f02f1bdc000
Successfully added new region
================ Vhost user message ================
Request: VHOST_USER_ADD_MEM_REG (37)
Flags:   0x9
Size:    40
Fds: 59
Adding region 5
    guest_phys_addr: 0x0000000240000000
    memory_size:     0x0000000040000000
    userspace_addr:  0x00007fb77bffe000
    old mmap_offset: 0x00000000c0000000
    fd_offset:       0x00000000c0000000
    new mmap_offset: 0x0000000000000000
    mmap_addr:       0x00007f0284000000
Successfully added new region
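
The split shown in the log above can be sketched like this. It is an
illustration under assumptions: split_mmap_offset() and SplitOffset are
hypothetical names, and `align` stands for the detected hugetlb size
(or the page size), which must be a power of two.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t fd_offset;     /* aligned part, passed to mmap() as offset */
    uint64_t mmap_offset;   /* remainder, still applied inside the mapping */
} SplitOffset;

/* Move as much of mmap_offset as alignment allows into fd_offset. */
SplitOffset split_mmap_offset(uint64_t mmap_offset, uint64_t align)
{
    SplitOffset s;

    assert(align != 0 && (align & (align - 1)) == 0);
    s.fd_offset = mmap_offset & ~(align - 1);
    s.mmap_offset = mmap_offset - s.fd_offset;
    return s;
}
```

With an offset of 0x80000000 and a 2MiB alignment, everything moves
into fd_offset and the new mmap_offset is 0, matching region 4 above.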

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-12-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Speedup gpa_to_mem_region() and vu_gpa_to_va()
David Hildenbrand [Wed, 14 Feb 2024 15:16:57 +0000 (16:16 +0100)]
libvhost-user: Speedup gpa_to_mem_region() and vu_gpa_to_va()

Let's speed up GPA to memory region / virtual address lookup. Store the
memory regions ordered by guest physical addresses, and use binary
search for address translation, as well as when adding/removing memory
regions.

Most importantly, this will speed up GPA->VA address translation when we
have many memslots.
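
A sketch of such a lookup, assuming the regions are kept sorted by
guest physical address and never overlap. GpaRegion and find_region()
are hypothetical names, and the real structure carries more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t gpa;    /* guest physical start address */
    uint64_t size;   /* length in bytes */
} GpaRegion;

/* Binary search over regions sorted by gpa; NULL if no region matches. */
const GpaRegion *find_region(const GpaRegion *regions, size_t n,
                             uint64_t gpa)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const GpaRegion *r = &regions[mid];

        if (gpa < r->gpa) {
            hi = mid;                       /* look in the lower half */
        } else if (gpa - r->gpa >= r->size) {
            lo = mid + 1;                   /* look in the upper half */
        } else {
            return r;                       /* gpa lies inside r */
        }
    }
    return NULL;
}
```

Each lookup takes O(log n) steps instead of a linear walk over all
memslots.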

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-11-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Factor out search for memory region by GPA and simplify
David Hildenbrand [Wed, 14 Feb 2024 15:16:56 +0000 (16:16 +0100)]
libvhost-user: Factor out search for memory region by GPA and simplify

Memory regions cannot overlap, and if we ever hit that case something
would be really flawed.

For example, when vhost code in QEMU decides to increase the size of memory
regions to cover full huge pages, it makes sure to never create overlaps,
and if there would be overlaps, it would bail out.

QEMU commits 48d7c9757749 ("vhost: Merge sections added to temporary
list"), c1ece84e7c93 ("vhost: Huge page align and merge") and
e7b94a84b6cb ("vhost: Allow adjoining regions") added and clarified that
handling and how overlaps are impossible.

Consequently, each GPA can belong to at most one memory region, and
everything else doesn't make sense. Let's factor out our search to prepare
for further changes.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-10-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Don't search for duplicates when removing memory regions
David Hildenbrand [Wed, 14 Feb 2024 15:16:55 +0000 (16:16 +0100)]
libvhost-user: Don't search for duplicates when removing memory regions

We cannot have duplicate memory regions; otherwise something would be
deeply flawed elsewhere. Let's just stop the search once we have found
an entry.

We'll add more sanity checks when adding memory regions later.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-9-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Don't zero out memory for memory regions
David Hildenbrand [Wed, 14 Feb 2024 15:16:54 +0000 (16:16 +0100)]
libvhost-user: Don't zero out memory for memory regions

dev->nregions always covers only valid entries. Stop zeroing out other
array elements that are unused.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-8-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: No need to check for NULL when unmapping
David Hildenbrand [Wed, 14 Feb 2024 15:16:53 +0000 (16:16 +0100)]
libvhost-user: No need to check for NULL when unmapping

We never add a memory region if mmap() failed. Therefore, no need to check
for NULL.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-7-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Factor out adding a memory region
David Hildenbrand [Wed, 14 Feb 2024 15:16:52 +0000 (16:16 +0100)]
libvhost-user: Factor out adding a memory region

Let's factor it out, reducing quite some code duplication and preparing
for further changes.

If we fail to mmap a region and panic, we now simply don't add that
(broken) region.

Note that we now increment dev->nregions as we are successfully
adding memory regions, and don't increment dev->nregions if anything went
wrong.
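
The counting rule can be sketched as follows. SketchDev, SketchRegion,
SKETCH_MAX_REGIONS and add_region() are hypothetical, and the `mapped`
argument stands in for the result of mmap(), with NULL modelling a
failed mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_REGIONS 4   /* small illustrative cap, not the real one */

typedef struct { void *mmap_addr; } SketchRegion;

typedef struct {
    SketchRegion regions[SKETCH_MAX_REGIONS];
    size_t nregions;           /* counts valid entries only */
} SketchDev;

/* Record a region and bump nregions only if the mapping succeeded. */
bool add_region(SketchDev *dev, void *mapped)
{
    if (dev->nregions == SKETCH_MAX_REGIONS || mapped == NULL) {
        return false;          /* nothing recorded, count unchanged */
    }
    dev->regions[dev->nregions].mmap_addr = mapped;
    dev->nregions++;
    return true;
}
```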

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-6-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Merge vu_set_mem_table_exec_postcopy() into vu_set_mem_table_exec()
David Hildenbrand [Wed, 14 Feb 2024 15:16:51 +0000 (16:16 +0100)]
libvhost-user: Merge vu_set_mem_table_exec_postcopy() into vu_set_mem_table_exec()

Let's reduce some code duplication and prepare for further changes.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-5-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Factor out removing all mem regions
David Hildenbrand [Wed, 14 Feb 2024 15:16:50 +0000 (16:16 +0100)]
libvhost-user: Factor out removing all mem regions

Let's factor it out. Note that the check for MAP_FAILED was wrong as
we never set mmap_addr if mmap() failed. We'll remove the NULL check
separately.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-4-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Bump up VHOST_USER_MAX_RAM_SLOTS to 509
David Hildenbrand [Wed, 14 Feb 2024 15:16:49 +0000 (16:16 +0100)]
libvhost-user: Bump up VHOST_USER_MAX_RAM_SLOTS to 509

Let's support up to 509 mem slots, just like vhost in the kernel usually
does and the rust vhost-user implementation recently [1] started doing.
This is required to properly support memory hotplug, either using
multiple DIMMs (ACPI supports up to 256) or using virtio-mem.

509 used to be the KVM limit: it supported 512 slots, but 3 were
used for internal purposes. Currently, KVM supports more than 512, but
it usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot
memory), except when other memory devices like PCI devices with BARs are
used. So, 509 seems to work well for vhost in the kernel.

Details can be found in the QEMU change that made virtio-mem consume
up to 256 mem slots across all virtio-mem devices. [2]

509 mem slots implies 509 VMAs/mappings in the worst case (even though,
in practice with virtio-mem we won't be seeing more than ~260 in most
setups).

With max_map_count under Linux defaulting to 64k, 509 mem slots
still correspond to less than 1% of the maximum number of mappings.
There are plenty left for the application to consume.
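
As a quick check of that figure, assuming the usual Linux default of
65530 for vm.max_map_count (the helper name is hypothetical):

```c
#include <assert.h>

/* Share of the mapping budget in hundredths of a percent, rounded down. */
unsigned map_share_hundredths(unsigned mappings, unsigned max_map_count)
{
    assert(max_map_count != 0);
    return mappings * 10000u / max_map_count;
}
```

map_share_hundredths(509, 65530) yields 77, i.e. about 0.77% of the
available mappings.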

[1] https://github.com/rust-vmm/vhost/pull/224
[2] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-3-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agolibvhost-user: Dynamically allocate memory for memory slots
David Hildenbrand [Wed, 14 Feb 2024 15:16:48 +0000 (16:16 +0100)]
libvhost-user: Dynamically allocate memory for memory slots

Let's prepare for increasing VHOST_USER_MAX_RAM_SLOTS by dynamically
allocating dev->regions. We don't have any ABI guarantees (not
dynamically linked), so we can simply change the layout of VuDev.

Let's zero out the memory, just as we used to do.

Reviewed-by: Raphael Norwitz <raphael@enfabrica.net>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240214151701.29906-2-david@redhat.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: fix network breakage after cancelling migration
Si-Wei Liu [Wed, 14 Feb 2024 11:28:02 +0000 (03:28 -0800)]
vdpa: fix network breakage after cancelling migration

Fix an issue where cancellation of ongoing migration ends up
with no network connectivity.

When canceling migration, SVQ is switched back to passthrough
mode, but the right call fd is not programmed into the device and
the SVQ's own call fd is still used. During this transition
period, shadow_vqs_enabled has not yet been set back to false,
so the installation of the call fd is inadvertently
bypassed.

Message-Id: <1707910082-10243-13-git-send-email-si-wei.liu@oracle.com>
Fixes: a8ac88585da1 ("vhost: Add Shadow VirtQueue call forwarding capabilities")
Cc: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: indicate transitional state for SVQ switching
Si-Wei Liu [Wed, 14 Feb 2024 11:28:01 +0000 (03:28 -0800)]
vdpa: indicate transitional state for SVQ switching

svq_switching indicates whether SVQ mode switching is in
progress and, if so, in which direction. Add the necessary
state updates around the places where the switching takes
place.

Message-Id: <1707910082-10243-12-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: define SVQ transitioning state for mode switching
Si-Wei Liu [Wed, 14 Feb 2024 11:28:00 +0000 (03:28 -0800)]
vdpa: define SVQ transitioning state for mode switching

Will be used in following patches.

DISABLING(-1) means SVQ is being switched off to passthrough
mode.

ENABLING(1) means passthrough VQs are being switched to SVQ.

DONE(0) means SVQ switching is completed.
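
The three states could be modelled along these lines. The identifiers
are hypothetical and chosen for this sketch; only the numeric values
come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SVQ_SKETCH_DISABLING = -1,  /* SVQ being switched off to passthrough */
    SVQ_SKETCH_DONE = 0,        /* no switch in progress */
    SVQ_SKETCH_ENABLING = 1,    /* passthrough VQs being switched to SVQ */
} SvqSketchState;

/* True while a mode switch is under way, in either direction. */
bool svq_switch_in_progress(SvqSketchState s)
{
    return s != SVQ_SKETCH_DONE;
}

/* True only while switching back from SVQ to passthrough. */
bool svq_switching_to_passthrough(SvqSketchState s)
{
    return s == SVQ_SKETCH_DISABLING;
}
```

Using the sign for the direction lets a single variable answer both
"is a switch in progress?" and "which way?".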

Message-Id: <1707910082-10243-11-git-send-email-si-wei.liu@oracle.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add trace event for vhost_vdpa_net_load_mq
Si-Wei Liu [Wed, 14 Feb 2024 11:27:59 +0000 (03:27 -0800)]
vdpa: add trace event for vhost_vdpa_net_load_mq

For better debuggability and observability.

Message-Id: <1707910082-10243-10-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add trace events for vhost_vdpa_net_load_cmd
Si-Wei Liu [Wed, 14 Feb 2024 11:27:58 +0000 (03:27 -0800)]
vdpa: add trace events for vhost_vdpa_net_load_cmd

For better debuggability and observability.

Message-Id: <1707910082-10243-9-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add vhost_vdpa_set_dev_vring_base trace for svq mode
Si-Wei Liu [Wed, 14 Feb 2024 11:27:57 +0000 (03:27 -0800)]
vdpa: add vhost_vdpa_set_dev_vring_base trace for svq mode

For better debuggability and observability.

Message-Id: <1707910082-10243-8-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add vhost_vdpa_get_vring_base trace for svq mode
Si-Wei Liu [Wed, 14 Feb 2024 11:27:56 +0000 (03:27 -0800)]
vdpa: add vhost_vdpa_get_vring_base trace for svq mode

For better debuggability and observability.

Message-Id: <1707910082-10243-7-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add vhost_vdpa_set_address_space_id trace
Si-Wei Liu [Wed, 14 Feb 2024 11:27:55 +0000 (03:27 -0800)]
vdpa: add vhost_vdpa_set_address_space_id trace

For better debuggability and observability.

Message-Id: <1707910082-10243-6-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: factor out vhost_vdpa_net_get_nc_vdpa
Si-Wei Liu [Wed, 14 Feb 2024 11:27:54 +0000 (03:27 -0800)]
vdpa: factor out vhost_vdpa_net_get_nc_vdpa

Introduce new API. No functional change on existing API.

Message-Id: <1707910082-10243-5-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: factor out vhost_vdpa_last_dev
Si-Wei Liu [Wed, 14 Feb 2024 11:27:53 +0000 (03:27 -0800)]
vdpa: factor out vhost_vdpa_last_dev

Generalize duplicated condition check for the last vq of vdpa
device to a common function.

Message-Id: <1707910082-10243-4-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agovdpa: add back vhost_vdpa_net_first_nc_vdpa
Si-Wei Liu [Wed, 14 Feb 2024 11:27:51 +0000 (03:27 -0800)]
vdpa: add back vhost_vdpa_net_first_nc_vdpa

Previous commits removed it. Add it back because
this function will be needed by future patches.

Message-Id: <1707910082-10243-2-git-send-email-si-wei.liu@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
6 months agoMerge tag 'pull-tcg-20240312' of https://gitlab.com/rth7680/qemu into staging
Peter Maydell [Tue, 12 Mar 2024 21:33:15 +0000 (21:33 +0000)]
Merge tag 'pull-tcg-20240312' of https://gitlab.com/rth7680/qemu into staging

linux-user: Add FIFREEZE and FITHAW ioctls
linux-user: Implement PR_*_{CHILD_SUBREAPER,SPECULATION_CTRL,TID_ADDRESS}
linux-user/elfload: Fixes for two Coverity CIDs
tcg/aarch64: Fixes for two TCG_COND_TST{EQ,NE} bugs

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmXwoYwdHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV99KwgAlUxbn0dPTXKmCkIE
# X+FAUllPYCZJwpTCa1h3P8taczLLeAocI4/iJxUudBE77I0HY7jv4FRnWrrdHr/V
# rQXjNkpQUByWr0Y4MB6Gl1+AKYo2SNqVHNP5AI4DdgDeSASXhP1aSlT5h4V4gdeX
# 1OwSnTQfONInJaoOQ7QQRf3JShKSYZSO7/sjMlJrubgGJBP8ivPxyPKiGbX3zUBS
# 6fI/ICLewC/g1fLPKaMHmqdrPK30ubPSGtnKdcz0q5NsT3hy6QWgtrQs5WAf3Liz
# 9WKGbq/y+qaFyLHat2tBpDnzT1Jso1SlIMkxL8kau3g6Pvk91E/pZjF5K3JOG8By
# PR4uQA==
# =FckT
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 18:40:12 GMT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20240312' of https://gitlab.com/rth7680/qemu:
  tcg/aarch64: Fix tcg_out_brcond for test comparisons
  tcg/aarch64: Fix tcg_out_cmp for test comparisons
  linux-user/elfload: Fully initialize struct target_elf_prpsinfo
  linux-user/elfload: Don't close an unopened file descriptor
  linux-user: Implement PR_GET_TID_ADDRESS
  linux-user: Implement PR_{GET,SET}_SPECULATION_CTRL
  linux-user: Implement PR_{GET,SET}_CHILD_SUBREAPER
  linux-user: Add FIFREEZE and FITHAW ioctls

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months agoMerge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging
Peter Maydell [Tue, 12 Mar 2024 21:32:51 +0000 (21:32 +0000)]
Merge tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu into staging

hw/nvme updates

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEUigzqnXi3OaiR2bATeGvMW1PDekFAmXwj+wACgkQTeGvMW1P
# DelOsAf+Jg51zf3vtWpe4MS/WtULjSr5GtnXMJ5hkHS0WdKOiLW3P+pUZXbsohmh
# faVlYeCWptF1CFGfxBf4Trc7XzJy8J6W1YJEofs/07hIAnazo9pwk5shoVu4oiex
# HVsBg7/9y7DuiEEg1MRvVvW895cP60WmG1AqU63SYwrVgxZ51ZH0XNuyRhQeYC/6
# OSXJ3FDYu2iJQ58uEzGEwv8vhskIpEFTdz0J6gQVxIdzFBbuk87VgZo6pqwgfMBm
# /65K85TgFBT4SASc7a2iSUv+iAqSCA6Jdy0VWxCYCikiv5nuPCMCrlbvqcVp+i2B
# GKtgfFXhtgepxx6jmYd03EkRjCrxUA==
# =W3gg
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 17:25:00 GMT
# gpg:                using RSA key 522833AA75E2DCE6A24766C04DE1AF316D4F0DE9
# gpg: Good signature from "Klaus Jensen <its@irrelevant.dk>" [full]
# gpg:                 aka "Klaus Jensen <k.jensen@samsung.com>" [full]
# Primary key fingerprint: DDCA 4D9C 9EF9 31CC 3468  4272 63D5 6FC5 E55D A838
#      Subkey fingerprint: 5228 33AA 75E2 DCE6 A247  66C0 4DE1 AF31 6D4F 0DE9

* tag 'nvme-next-pull-request' of https://gitlab.com/birkelund/qemu:
  hw/nvme: add machine compatibility parameter to enable msix exclusive bar
  hw/nvme: generalize the mbar size helper
  hw/nvme: Add NVMe NGUID property
  MAINTAINERS: add Jesper as reviewer on hw/nvme
  hw/nvme: fix invalid check on mcl
  hw/nvme: separate 'serial' property for VFs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months agoMerge tag 'pull-xen-20240312' of https://xenbits.xen.org/git-http/people/aperard...
Peter Maydell [Tue, 12 Mar 2024 21:32:31 +0000 (21:32 +0000)]
Merge tag 'pull-xen-20240312' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm into staging

Xen queue:

* In Xen PCI passthrough, emulate multifunction bit.
* Fix in Xen mapcache.
* Improve performance of kernel+initrd loading in a Xen HVM Direct
  Kernel Boot scenario.

# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE+AwAYwjiLP2KkueYDPVXL9f7Va8FAmXwZbwACgkQDPVXL9f7
# Va+PhQgAusZBhy3b0hOCCoqC/1ffCE5J2JxUTnN3zN/2FSOe8/kqQYqt4Zk3vi2e
# Eq8FbGupU357eoJSz0gTEPKQ8y+FVBCmFKEHM1PS54TW1yUZchQg4RmlII6+Psoj
# 7u+qC1RqZu/ZQ9f1QZd8YDJ5oVOkfAZYwq5BkWVS6h5gJiQTSkekAXlMNOQBZxz4
# 48fzpokatiJBbyaBGEm6YKEOwkYG76eHhxB4SC0Rgx6zW+EDQpX0s/Lg19SXnj2C
# UOueiPod1GkE+iH6dQFJUSbsnrkAtJZf253bs3BQnoChGiqQLuXn4jC79ffjPzHI
# AKP2+u+bSJ+8C1SdPuoJN6sJIZmOfA==
# =FZ2n
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 14:25:00 GMT
# gpg:                using RSA key F80C006308E22CFD8A92E7980CF5572FD7FB55AF
# gpg: Good signature from "Anthony PERARD <anthony.perard@gmail.com>" [marginal]
# gpg:                 aka "Anthony PERARD <anthony.perard@citrix.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5379 2F71 024C 600F 778A  7161 D8D5 7199 DF83 42C8
#      Subkey fingerprint: F80C 0063 08E2 2CFD 8A92  E798 0CF5 572F D7FB 55AF

* tag 'pull-xen-20240312' of https://xenbits.xen.org/git-http/people/aperard/qemu-dm:
  i386: load kernel on xen using DMA
  xen: Drop out of coroutine context xen_invalidate_map_cache_entry
  xen/pt: Emulate multifunction bit in header type

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months agomeson: generate .stp files for tools too
Daniel P. Berrangé [Mon, 8 Jan 2024 17:13:56 +0000 (17:13 +0000)]
meson: generate .stp files for tools too

The qemu-img, qemu-io, qemu-nbd, qemu-storage-daemon tools all have
support for systemtap tracing built-in, so should be given corresponding
.stp files to define their probes.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20240108171356.1037059-3-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6 months agotracetool: remove redundant --target-type / --target-name args
Daniel P. Berrangé [Mon, 8 Jan 2024 17:13:55 +0000 (17:13 +0000)]
tracetool: remove redundant --target-type / --target-name args

The --target-type and --target-name args are used to construct
the default probe prefix if '--probe-prefix' is not given. The
meson.build will always pass '--probe-prefix', so the other args
are effectively redundant.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20240108171356.1037059-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
6 months agoMerge tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging
Peter Maydell [Tue, 12 Mar 2024 16:56:13 +0000 (16:56 +0000)]
Merge tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu into staging

display/ui: pending fixes

- ui/vnc: Respect bound console
- ui/dbus: optimize a bit message queuing
- virtio-gpu: fix blob scanout post-load

# -----BEGIN PGP SIGNATURE-----
#
# iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAmXwYCYcHG1hcmNhbmRy
# ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5bv9D/9J1g76mYND+ad++d+G
# YiewXtHVwrHm9g+TxUdWXaBcDFy+uFtGpwIBtYN76YjSSL47li74V7sQTZ2FQVys
# Y8W61xBzDoAcCLV7/m48WW/mov2+TtyUFYIC3ZOBFS6Ol5aiJ8uurJa11h2WTacq
# tQKlK5g//Yv0H0cxn1cYMqRFdsko3H2hSmYz36QuPWfxivC2VeMnN/iTSGfiVSb+
# hTkOdRu+5qmt3mbbYo0Z6YpvjhLqSLob6n29+P7/QlwrQxP+A/JSS4FrAHryXzvm
# qZ/wRsPmThjwpnt3ZV9AapagQ7908FRmh1EhyAxrWq2G8QGK/XvJ/JPwBOgZGEiy
# W48N5FQhdQUkxkVpkmQVpGhJFAzclqJh/duZiBtixw+25Md6DG04OwHy9k7qCph7
# qj2BZuaSafVcSE0JEG78bt5YHAO3Joyfjf7Jhb0Tqvn2kbv94tCTGtUIH6ngYv4Z
# r0vTmlDr7pe1xaa9HeFpaopckvj4uQhlcMHnrETnUtcdWKE5SaBlgNsIwHlNlKZ6
# wmUIMKymXNRIiCZrf2xxJr7PeZ8FJgTlHCy9poSJRwpZDKHaZQMecklELx+jECuU
# DPhAmTPTZjCKiXGCI+KlL6nDy/H7zA6boCMO2QpKVk0ehviWOQZvu94srTJL5nz/
# RX+rwGbf3+8LfIFJmLcQCD5qag==
# =oY0A
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 14:01:10 GMT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* tag 'ui-pull-request' of https://gitlab.com/marcandre.lureau/qemu:
  virtio-gpu: fix scanout migration post-load
  virtio-gpu: remove needless condition
  ui/dbus: filter out pending messages when scanout
  ui/dbus: factor out sending a scanout
  ui/vnc: Respect bound console

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months agoMerge tag 'pull-error-2024-03-12' of https://repo.or.cz/qemu/armbru into staging
Peter Maydell [Tue, 12 Mar 2024 16:55:56 +0000 (16:55 +0000)]
Merge tag 'pull-error-2024-03-12' of https://repo.or.cz/qemu/armbru into staging

Error reporting patches for 2024-03-12

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmXwWOYSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZT+voP/jAEyPfbtwggKLHjSCkHchn/uUziLJ2o
# //i7+ZV9soCizAEkW+AkIR17PsMCaRsa8W4AULLn+ZaDJNy1Vlj2WYIkgeFm/rba
# AWfNXywIg7dLnj0Hd98nz13hPuP52hO9vpakPhcua9L6mmk1htdqbbGIFIIfbQhp
# e6FM+sBEW44uGcZx+N0wMEpKF0F7RId/jzH4mfP35WE7CLaAr2EfTXFaadAM636e
# QsrM8wuiNAPQeyXz14gxYTWAnnMGglM5WQ4hoxSGN0y8c007gvff5vMKc7vapn4/
# DdiYJqpq/DIWaiGL0Fl8Cpry3WrQ8UY0st745kCLF/f9nlL0GvnBGdLdUaap7lQZ
# A/C1sDKNubAGwzcw643AhV73QHc9f5kDBdWIj5wj3k5DQmBmgKACzGs1edDVVB+2
# OaStqZZ/V9Q5gljjh6PiHEptTjPhsaftX7GGjbhXTJUDFB9GONSCEVwAdZZxJ0Pm
# 6cQLtcIMtcjL4xXNz6niVZkxGT/zu4kqbZ01LudIqEQAnULwRiVpyjkCmReSAOPP
# eBtkCQtn7WPlz4N3ZiV2+a1p4/e88KH9wvxF+XvPEJjgsdeUmxX44f82ouLPJzvE
# fOXE11tRr41u9m+UmoinVo581CKYGlkRJlNQWQwFOmnXoKP2nPZzADxraihkCR5p
# wT0Hz9uwJs94
# =6FSf
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 13:30:14 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-error-2024-03-12' of https://repo.or.cz/qemu/armbru:
  target/loongarch: Fix query-cpu-model-expansion to reject props
  target: Improve error reporting for CpuModelInfo member @props
  target/i386: Fix query-cpu-model-expansion to reject props
  target: Simplify type checks for CpuModelInfo member @props

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months ago Merge tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu into staging
Peter Maydell [Tue, 12 Mar 2024 16:55:41 +0000 (16:55 +0000)]
Merge tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu into staging

* Add missing ERRP_GUARD() statements in functions that need it
* Prefer fast cpu_env() over slower CPU QOM cast macro

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXwPhYRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbWHvBAAgKx5LHFjz3xREVA+LkDTQ49mz0lK3s32
# SGvNlIHjiaDGVttVYhVC4sinBWUruG4Lyv/2QN72OJBzn6WUsEUQE3KPH1d7Y3/s
# wS9X7mj70n4kugWJqeIJP5AXSRasHmWoQ4QJLVQRJd6+Eb9jqwep0x7bYkI1de6D
# bL1Q7bIfkFeNQBXaiPWAm2i+hqmT4C1r8HEAGZIjAsMFrjy/hzBEjNV+pnh6ZSq9
# Vp8BsPWRfLU2XHm4WX0o8d89WUMAfUGbVkddEl/XjIHDrUD+Zbd1HAhLyfhsmrnE
# jXIwSzm+ML1KX4MoF5ilGtg8Oo0gQDEBy9/xck6G0HCm9lIoLKlgTxK9glr2vdT8
# yxZmrM9Hder7F9hKKxmb127xgU6AmL7rYmVqsoQMNAq22D6Xr4UDpgFRXNk2/wO6
# zZZBkfZ4H4MpZXbd/KJpXvYH5mQA4IpkOy8LJdE+dbcHX7Szy9ksZdPA+Z10hqqf
# zqS13qTs3abxymy2Q/tO3hPKSJCk1+vCGUkN60Wm+9VoLWGoU43qMc7gnY/pCS7m
# 0rFKtvfwFHhokX1orK0lP/ppVzPv/5oFIeK8YDY9if+N+dU2LCwVZHIuf2/VJPRq
# wmgH2vAn3JDoRKPxTGX9ly6AMxuZaeP92qBTOPap0gDhihYzIpaCq9ecEBoTakI7
# tdFhV0iRr08=
# =NiP4
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 12 Mar 2024 11:35:50 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-03-12' of https://gitlab.com/thuth/qemu: (55 commits)
  user: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/xtensa: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/tricore: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sparc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/sh4: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/rx: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/ppc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/openrisc: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/nios2: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/mips: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/microblaze: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/m68k: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/loongarch: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/i386/hvf: Use CPUState typedef
  target/hexagon: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/cris: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/avr: Prefer fast cpu_env() over slower CPU QOM cast macro
  target/alpha: Prefer fast cpu_env() over slower CPU QOM cast macro
  target: Replace CPU_GET_CLASS(cpu -> obj) in cpu_reset_hold() handler
  bulk: Call in place single use cpu_env()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
6 months ago spapr: nested: Introduce cap-nested-papr for Nested PAPR API
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:40 +0000 (16:49 +0530)]
spapr: nested: Introduce cap-nested-papr for Nested PAPR API

Introduce a SPAPR capability, cap-nested-papr, which enables the nested
PAPR API for nested guests. This new API enables support for KVM on PowerVM;
the corresponding Linux kernel support has already been merged upstream.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
6 months ago spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:39 +0000 (16:49 +0530)]
spapr: nested: Introduce H_GUEST_RUN_VCPU hcall.

The H_GUEST_RUN_VCPU hcall is used to start execution of a Guest VCPU.
The Hypervisor will update the state of the Guest VCPU based on the
input buffer, restore the saved Guest VCPU state, and start its
execution.

The Guest VCPU can stop running for numerous reasons including HCALLs,
hypervisor exceptions, or an outstanding Host Partition Interrupt.
The reason that the Guest VCPU stopped running is communicated through
R4 and the output buffer will be filled in with any relevant state.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
6 months ago spapr: nested: Use correct source for parttbl info for nested PAPR API.
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:38 +0000 (16:49 +0530)]
spapr: nested: Use correct source for parttbl info for nested PAPR API.

For the nested PAPR API, the SpaprMachineStateNestedGuest struct stores the
partition table info, so use it in spapr_get_pate_nested() via a helper.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
6 months ago spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:37 +0000 (16:49 +0530)]
spapr: nested: Introduce H_GUEST_[GET|SET]_STATE hcalls.

Introduce the nested PAPR hcalls:
    - H_GUEST_GET_STATE, which is used to get the state of a nested guest
      or a guest VCPU. The value field of each element in the request is
      the destination that is updated to reflect the current state on
      success.
    - H_GUEST_SET_STATE, which is used to modify the state of a guest or
      a guest VCPU. On success, the guest (or its VCPU) state shall be
      updated as per the value field of the requested element(s).
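
The get/set semantics described above can be sketched roughly as follows.
This is a minimal illustration, not the QEMU implementation: the element
IDs, struct layouts, and function names are all hypothetical, and
validating the whole request before touching anything is an assumption
about what "on success" implies.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element IDs, for illustration only. */
enum { ELEM_GPR0 = 0x1000, ELEM_NIA = 0x1001 };

/* Hypothetical, heavily reduced VCPU state. */
struct vcpu_state {
    uint64_t gpr0;
    uint64_t nia;
};

/* One element of a request: an ID plus its value field. */
struct gsb_request_elem {
    uint16_t id;
    uint64_t value;
};

/* Map an element ID to its storage, or NULL if the ID is unknown. */
static uint64_t *elem_slot(struct vcpu_state *s, uint16_t id)
{
    switch (id) {
    case ELEM_GPR0: return &s->gpr0;
    case ELEM_NIA:  return &s->nia;
    default:        return NULL;
    }
}

/* Check every ID first so that a failing request changes nothing. */
static bool request_is_valid(struct vcpu_state *s,
                             const struct gsb_request_elem *req, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (elem_slot(s, req[i].id) == NULL) {
            return false;
        }
    }
    return true;
}

/* GET: the value field of each element is the destination. */
static bool guest_get_state(struct vcpu_state *s,
                            struct gsb_request_elem *req, size_t n)
{
    if (!request_is_valid(s, req, n)) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        req[i].value = *elem_slot(s, req[i].id);
    }
    return true;
}

/* SET: the value field of each element is the source. */
static bool guest_set_state(struct vcpu_state *s,
                            const struct gsb_request_elem *req, size_t n)
{
    if (!request_is_valid(s, req, n)) {
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        *elem_slot(s, req[i].id) = req[i].value;
    }
    return true;
}
```

In this sketch a request containing an unknown ID is rejected as a whole,
so neither the request buffer nor the VCPU state is partially updated.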

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
6 months ago spapr: nested: Initialize the GSB elements lookup table.
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:36 +0000 (16:49 +0530)]
spapr: nested: Initialize the GSB elements lookup table.

The nested PAPR API provides a standard Guest State Buffer (GSB) format
with unique IDs for each guest state element for which get/set state is
supported by the API. Some of the elements are read-only and/or guest-wide.
Introduce the additional required GSB elements, along with helper routines
for exchanging the state of each nested guest state element that the API's
get/set state calls should support.
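
As a rough illustration of what an ID-keyed element lookup table with
read-only and guest-wide attributes might look like, here is a minimal
sketch. The IDs, sizes, flag names, and helper functions are invented for
the example and are not taken from QEMU or from the PAPR specification;
the rule that a set request's scope must match the element's scope is
likewise an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute flags for a guest state element. */
enum {
    GSB_FLAG_READ_ONLY  = 1u << 0,  /* element cannot be set */
    GSB_FLAG_GUEST_WIDE = 1u << 1,  /* applies to the guest, not one VCPU */
};

/* One table row per guest state element (illustrative layout). */
struct gsb_elem_info {
    uint16_t id;     /* unique element ID (made-up values below) */
    uint16_t size;   /* size of the value field in bytes */
    uint32_t flags;  /* GSB_FLAG_* bits */
};

/* Made-up entries: two guest-wide elements and two per-VCPU ones. */
static const struct gsb_elem_info gsb_table[] = {
    { 0x0001, 8, GSB_FLAG_READ_ONLY | GSB_FLAG_GUEST_WIDE },
    { 0x0002, 8, GSB_FLAG_GUEST_WIDE },
    { 0x1000, 8, 0 },
    { 0x1001, 4, 0 },
};

/* Linear search by ID; returns NULL when the ID is not in the table. */
static const struct gsb_elem_info *gsb_lookup(uint16_t id)
{
    for (size_t i = 0; i < sizeof(gsb_table) / sizeof(gsb_table[0]); i++) {
        if (gsb_table[i].id == id) {
            return &gsb_table[i];
        }
    }
    return NULL;
}

/*
 * A set is allowed only for known, writable elements whose scope
 * (guest-wide vs. per-VCPU) matches the scope of the request.
 */
static bool gsb_can_set(uint16_t id, bool guest_wide_request)
{
    const struct gsb_elem_info *e = gsb_lookup(id);

    if (e == NULL || (e->flags & GSB_FLAG_READ_ONLY)) {
        return false;
    }
    bool elem_guest_wide = (e->flags & GSB_FLAG_GUEST_WIDE) != 0;
    return elem_guest_wide == guest_wide_request;
}
```

Keeping the attributes in one table means the get and set paths can share
a single validation step instead of special-casing individual elements.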

[amachhiw: set the PCR whenever logical PVR is set]

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
6 months ago spapr: nested: Extend nested_ppc_state for nested PAPR API
Harsh Prateek Bora [Fri, 8 Mar 2024 11:19:35 +0000 (16:49 +0530)]
spapr: nested: Extend nested_ppc_state for nested PAPR API

Currently, nested_ppc_state stores a certain set of registers and works
with nested_[load|save]_state() for state transfer, as required by the
nested-hv API. Extend these with the additional register state required by
the nested PAPR API.

Acked-by: Nicholas Piggin <npiggin@gmail.com>
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>