git.proxmox.com Git - mirror

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- Block graph change fixes (avoid loops, cope with non-tree graphs)
- bdrv_set_aio_context() related fixes
- HMP snapshot commands: Use only tag, not the ID to identify snapshots
- qmeu-img, commit: Error path fixes
- block/nvme: Build fix for gcc 9
- MAINTAINERS updates
- Fix various issues with bdrv_refresh_filename()
- Fix various iotests
- Include LUKS overhead in qemu-img measure for qcow2
- A fix for vmdk's image creation interface

# gpg: Signature made Mon 25 Feb 2019 14:18:15 GMT
# gpg:                using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (71 commits)
  iotests: Skip 211 on insufficient memory
  vmdk: false positive of compat6 with hwversion not set
  iotests: add LUKS payload overhead to 178 qemu-img measure test
  qcow2: include LUKS payload overhead in qemu-img measure
  iotests.py: s/_/-/g on keys in qmp_log()
  iotests: Let 045 be run concurrently
  iotests: Filter SSH paths
  iotests.py: Filter filename in any string value
  iotests.py: Add is_str()
  iotests: Fix 207 to use QMP filters for qmp_log
  iotests: Fix 232 for LUKS
  iotests: Remove superfluous rm from 232
  iotests: Fix 237 for Python 2.x
  iotests: Re-add filename filters
  iotests: Test json:{} filenames of internal BDSs
  block: BDS options may lack the "driver" option
  block/null: Generate filename even with latency-ns
  block/curl: Implement bdrv_refresh_filename()
  block/curl: Harmonize option defaults
  block/nvme: Fix bdrv_refresh_filename()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/berrange/tags/authz-core-pull-request' into staging

Add a standard authorization framework

The current network services now support encryption via TLS and in some
cases support authentication via SASL. In cases where SASL is not
available, x509 client certificates can be used as a crude authorization
scheme, but using a sub-CA and controlling who you give certs to. In
general this is not very flexible though, so this series introduces a
new standard authorization framework.

It comes with four initial authorization mechanisms

- Simple - an exact username match. This is useful when there is
   exactly one user that is known to connect. For example when live
   migrating from one QEMU to another with TLS, libvirt would use
   the simple scheme to whitelist the TLS cert of the source QEMU.

- List - an full access control list, with optional regex matching.
   This is more flexible and is used to provide 100% backcompat with
   the existing HMP ACL commands. The caveat is that we can't create
   these via the CLI -object arg yet.

- ListFile - the same as List, but with the rules stored in JSON
   format in an external file. This avoids the -object limitation
   while also allowing the admin to change list entries on the file.
   QEMU uses inotify to notice these changes and auto-reload the
   file contents. This is likely a good default choice for most
   network services, if the "simple" mechanism isn't sufficient.

- PAM - delegate the username lookup to a PAM module, which opens
   the door to many options including things like SQL/LDAP lookups.

# gpg: Signature made Tue 26 Feb 2019 15:33:46 GMT
# gpg:                using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/authz-core-pull-request:
  authz: delete existing ACL implementation
  authz: add QAuthZPAM object type for authorizing using PAM
  authz: add QAuthZListFile object type for a file access control list
  authz: add QAuthZList object type for an access control list
  authz: add QAuthZSimple object type for easy whitelist auth checks
  authz: add QAuthZ object as an authorization base class
  hw/usb: switch MTP to use new inotify APIs
  hw/usb: fix const-ness for string params in MTP driver
  hw/usb: don't set IN_ISDIR for inotify watch in MTP driver
  qom: don't require user creatable objects to be registered
  util: add helper APIs for dealing with inotify in portable manner

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

authz: delete existing ACL implementation

The 'qemu_acl' type was a previous non-QOM based attempt to provide an
authorization facility in QEMU. Because it is non-QOM based it cannot be
created via the command line and requires special monitor commands to
manipulate it.

The new QAuthZ subclasses provide a superset of the functionality in
qemu_acl, so the latter can now be deleted. The HMP 'acl_*' monitor
commands are converted to use the new QAuthZSimple data type instead
in order to provide temporary backwards compatibility.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

authz: add QAuthZPAM object type for authorizing using PAM

Add an authorization backend that talks to PAM to check whether the user
identity is allowed. This only uses the PAM account validation facility,
which is essentially just a check to see if the provided username is permitted
access. It doesn't use the authentication or session parts of PAM, since
that's dealt with by the relevant part of QEMU (eg VNC server).

Consider starting QEMU with a VNC server and telling it to use TLS with
x509 client certificates and configuring it to use an PAM to validate
the x509 distinguished name. In this example we're telling it to use PAM
for the QAuthZ impl with a service name of "qemu-vnc"

$ qemu-system-x86_64 \
     -object tls-creds-x509,id=tls0,dir=/home/berrange/security/qemutls,\
             endpoint=server,verify-peer=yes \
     -object authz-pam,id=authz0,service=qemu-vnc \
     -vnc :1,tls-creds=tls0,tls-authz=authz0

This requires an /etc/pam/qemu-vnc file to be created with the auth
rules. A very simple file based whitelist can be setup using

  $ cat > /etc/pam/qemu-vnc <<EOF
  account         requisite       pam_listfile.so item=user sense=allow file=/etc/qemu/vnc.allow
  EOF

The /etc/qemu/vnc.allow file simply contains one username per line. Any
username not in the file is denied. The usernames in this example are
the x509 distinguished name from the client's x509 cert.

  $ cat > /etc/qemu/vnc.allow <<EOF
  CN=laptop.berrange.com,O=Berrange Home,L=London,ST=London,C=GB
  EOF

More interesting would be to configure PAM to use an LDAP backend, so
that the QEMU authorization check data can be centralized instead of
requiring each compute host to have file maintained.

The main limitation with this PAM module is that the rules apply to all
QEMU instances on the host. Setting up different rules per VM, would
require creating a separate PAM service name & config file for every
guest. An alternative approach for the future might be to not pass in
the plain username to PAM, but instead combine the VM name or UUID with
the username. This requires further consideration though.

Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

authz: add QAuthZListFile object type for a file access control list

Add a QAuthZListFile object type that implements the QAuthZ interface. This
built-in implementation is a proxy around the QAuthZList object type,
initializing it from an external file, and optionally, automatically
reloading it whenever it changes.

To create an instance of this object via the QMP monitor, the syntax
used would be:

      {
        "execute": "object-add",
        "arguments": {
          "qom-type": "authz-list-file",
          "id": "authz0",
          "props": {
            "filename": "/etc/qemu/vnc.acl",
    "refresh": true
          }
        }
      }

If "refresh" is "yes", inotify is used to monitor the file,
automatically reloading changes. If an error occurs during reloading,
all authorizations will fail until the file is next successfully
loaded.

The /etc/qemu/vnc.acl file would contain a JSON representation of a
QAuthZList object

    {
      "rules": [
         { "match": "fred", "policy": "allow", "format": "exact" },
         { "match": "bob", "policy": "allow", "format": "exact" },
         { "match": "danb", "policy": "deny", "format": "glob" },
         { "match": "dan*", "policy": "allow", "format": "exact" },
      ],
      "policy": "deny"
    }

This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.

The object can be loaded on the comand line using

   -object authz-list-file,id=authz0,filename=/etc/qemu/vnc.acl,refresh=yes

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

authz: add QAuthZList object type for an access control list

Add a QAuthZList object type that implements the QAuthZ interface. This
built-in implementation maintains a trivial access control list with a
sequence of match rules and a final default policy. This replicates the
functionality currently provided by the qemu_acl module.

To create an instance of this object via the QMP monitor, the syntax
used would be:

  {
    "execute": "object-add",
    "arguments": {
      "qom-type": "authz-list",
      "id": "authz0",
      "props": {
        "rules": [
           { "match": "fred", "policy": "allow", "format": "exact" },
           { "match": "bob", "policy": "allow", "format": "exact" },
           { "match": "danb", "policy": "deny", "format": "glob" },
           { "match": "dan*", "policy": "allow", "format": "exact" },
        ],
        "policy": "deny"
      }
    }
  }

This sets up an authorization rule that allows 'fred', 'bob' and anyone
whose name starts with 'dan', except for 'danb'. Everyone unmatched is
denied.

It is not currently possible to create this via -object, since there is
no syntax supported to specify non-scalar properties for objects. This
is likely to be addressed by later support for using JSON with -object,
or an equivalent approach.

In any case the future "authz-listfile" object can be used from the
CLI and is likely a better choice, as it allows the ACL to be refreshed
automatically on change.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

authz: add QAuthZSimple object type for easy whitelist auth checks

In many cases a single VM will just need to whitelist a single identity
as the allowed user of network services. This is especially the case for
TLS live migration (optionally with NBD storage) where we just need to
whitelist the x509 certificate distinguished name of the source QEMU
host.

Via QMP this can be configured with:

  {
    "execute": "object-add",
    "arguments": {
      "qom-type": "authz-simple",
      "id": "authz0",
      "props": {
        "identity": "fred"
      }
    }
  }

Or via the command line

  -object authz-simple,id=authz0,identity=fred

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

authz: add QAuthZ object as an authorization base class

The current qemu_acl module provides a simple access control list
facility inside QEMU, which is used via a set of monitor commands
acl_show, acl_policy, acl_add, acl_remove & acl_reset.

Note there is no ability to create ACLs - the network services (eg VNC
server) were expected to create ACLs that they want to check.

There is also no way to define ACLs on the command line, nor potentially
integrate with external authorization systems like polkit, pam, ldap
lookup, etc.

The QAuthZ object defines a minimal abstract QOM class that can be
subclassed for creating different authorization providers.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>

hw/usb: switch MTP to use new inotify APIs

The internal inotify APIs allow a lot of conditional statements to be
cleared out, and provide a simpler callback for handling events.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

hw/usb: fix const-ness for string params in MTP driver

Various functions accepting 'char *' string parameters were missing
'const' qualifiers.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

hw/usb: don't set IN_ISDIR for inotify watch in MTP driver

IN_ISDIR is not a bit that one can request when registering a
watch with inotify_add_watch. Rather it is a bit that is set
automatically when reading events from the kernel.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

qom: don't require user creatable objects to be registered

When an object is in turn owned by another user object, it is not
desirable to expose this in the QOM object hierarchy. It is just an
internal implementation detail, we should be free to change without
exposure to apps managing QEMU.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

util: add helper APIs for dealing with inotify in portable manner

The inotify userspace API for reading events is quite horrible, so it is
useful to wrap it in a more friendly API to avoid duplicating code
across many users in QEMU. Wrapping it also allows introduction of a
platform portability layer, so that we can add impls for non-Linux based
equivalents in future.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Fri 22 Feb 2019 14:07:01 GMT
# gpg:                using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request: (27 commits)
  tests/virtio-blk: add test for DISCARD command
  tests/virtio-blk: add test for WRITE_ZEROES command
  tests/virtio-blk: add virtio_blk_fix_dwz_hdr() function
  tests/virtio-blk: change assert on data_size in virtio_blk_request()
  virtio-blk: add DISCARD and WRITE_ZEROES features
  virtio-blk: set config size depending on the features enabled
  virtio-net: make VirtIOFeature usable for other virtio devices
  virtio-blk: add "discard" and "write-zeroes" properties
  virtio-blk: add host_features field in VirtIOBlock
  virtio-blk: add acct_failed param to virtio_blk_handle_rw_error()
  hw/ide: drop iov field from IDEDMA
  hw/ide: drop iov field from IDEBufferedRequest
  hw/ide: drop iov field from IDEState
  tests/test-bdrv-drain: use QEMU_IOVEC_INIT_BUF
  migration/block: use qemu_iovec_init_buf
  qemu-img: use qemu_iovec_init_buf
  block/vmdk: use qemu_iovec_init_buf
  block/qed: use qemu_iovec_init_buf
  block/qcow2: use qemu_iovec_init_buf
  block/qcow: use qemu_iovec_init_buf
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'mreitz/tags/pull-block-2019-02-25' into queue-block

Block patches:
- Fix various issues with bdrv_refresh_filename()
- Fix various iotests
- Include LUKS overhead in qemu-img measure for qcow2
- A fix for vmdk's image creation interface

# gpg: Signature made Mon Feb 25 15:13:43 2019 CET
# gpg:                using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2019-02-25: (45 commits)
  iotests: Skip 211 on insufficient memory
  vmdk: false positive of compat6 with hwversion not set
  iotests: add LUKS payload overhead to 178 qemu-img measure test
  qcow2: include LUKS payload overhead in qemu-img measure
  iotests.py: s/_/-/g on keys in qmp_log()
  iotests: Let 045 be run concurrently
  iotests: Filter SSH paths
  iotests.py: Filter filename in any string value
  iotests.py: Add is_str()
  iotests: Fix 207 to use QMP filters for qmp_log
  iotests: Fix 232 for LUKS
  iotests: Remove superfluous rm from 232
  iotests: Fix 237 for Python 2.x
  iotests: Re-add filename filters
  iotests: Test json:{} filenames of internal BDSs
  block: BDS options may lack the "driver" option
  block/null: Generate filename even with latency-ns
  block/curl: Implement bdrv_refresh_filename()
  block/curl: Harmonize option defaults
  block/nvme: Fix bdrv_refresh_filename()
  ...

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

iotests: Skip 211 on insufficient memory

VDI keeps the whole bitmap in memory, and the maximum size (which is
tested here) is 2 GB. This may not be available on all machines, and it
rarely is available when running a 32 bit build.

Fix this by making VM.run_job() return the error string if an error
occurred, and checking whether that contains "Could not allocate bmap"
in 211. If so, the test is skipped.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190218180646.30282-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

vmdk: false positive of compat6 with hwversion not set

In vmdk_co_create_opts, when it finds hw_version is undefined, it will
set it to 4, which misleading the compat6 and hwversion in
vmdk_co_do_create. Simply set hw_version to NULL after free, let
the logic in vmdk_co_do_create to decide the value of hw_version.

This bug can be reproduced by:

$ qemu-img convert -O vmdk -o subformat=streamOptimized,compat6
/home/yuchenlin/syno.qcow2 /home/yuchenlin/syno.vmdk

qemu-img: /home/yuchenlin/syno.vmdk: error while converting vmdk:
compat6 cannot be enabled with hwversion set

Signed-off-by: yuchenlin <yuchenlin@synology.com>
Message-id: 20190221110805.28239-1-yuchenlin@synology.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: add LUKS payload overhead to 178 qemu-img measure test

The previous patch includes the LUKS payload overhead into the qemu-img
measure calculation for qcow2. Update qemu-iotests 178 to exercise this
new code path.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20190218104525.23674-3-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

qcow2: include LUKS payload overhead in qemu-img measure

LUKS encryption reserves clusters for its own payload data. The size of
this area must be included in the qemu-img measure calculation so that
we arrive at the correct minimum required image size.

(Ab)use the qcrypto_block_create() API to determine the payload
overhead. We discard the payload data that qcrypto thinks will be
written to the image.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190218104525.23674-2-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests.py: s/_/-/g on keys in qmp_log()

This follows what qmp() does, so the output will correspond to the
actual QMP command.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190210145736.1486-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Let 045 be run concurrently

Adding a telnet monitor for no real purpose on a fixed port is not so
great. Just use a null monitor instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Filter SSH paths

8908b253c4ad5f8874c8d13abec169c696a5cd32 has implemented filtering of
remote paths for NFS, but forgot SSH. This patch takes care of that.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests.py: Filter filename in any string value

filter_qmp_testfiles() currently filters the filename only for specific
keys. However, there are more keys that take filenames (such as
block-commit's @top and @base, or ssh's @path), and it does not make
sense to list them all here. "$TEST_DIR/$PID-" should have enough
entropy not to appear anywhere randomly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests.py: Add is_str()

On Python 2.x, strings are not always unicode strings. This function
checks whether a given value is a plain string, or a unicode string (if
there is a difference).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-7-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Fix 207 to use QMP filters for qmp_log

Fixes: 08fcd6111e1949f456e1b232ebeeb0cc17019a92
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Fix 232 for LUKS

With IMGOPTSSYNTAX, $TEST_IMG is useless for this test (it only tests
the file-posix protocol driver). Therefore, if $TEST_IMG_FILE is set,
use that instead.

Because this test requires the file protocol, $TEST_IMG_FILE will always
be set if $IMGOPTSSYNTAX is true.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Remove superfluous rm from 232

This test creates no such file.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Fix 237 for Python 2.x

math.ceil() returns an integer on Python 3.x, but a float on Python 2.x.
range() always needs integers, so we need an explicit conversion on 2.x
(which does not hurt on 3.x).

It is not quite clear whether we want to support Python 2.x for any
prolonged time, but this may as well be fixed along with the other
issues some iotests have right now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20190210145736.1486-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Re-add filename filters

A previous commit removed the default filters for qmp_log with the
intention to make them explicit; but this happened only for test 206.
There are more tests (for more exotic image formats than qcow2) which
require the filename filter, though.

Note that 237 is still broken for Python 2.x, which is fixed in the next
commit.

Fixes: f8ca8609d8549def45b28e82ecac64adaeee9f12
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190210145736.1486-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Test json:{} filenames of internal BDSs

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-32-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: BDS options may lack the "driver" option

When BDSs are created by qemu itself (e.g. as filters in block jobs),
they may not have a "driver" option in their options QDict. When
generating a json:{} filename, however, it must always be present.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-31-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/null: Generate filename even with latency-ns

While we cannot represent the latency-ns option in a filename, it is not
a strong option so not being able to should not stop us from generating
a filename nonetheless.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-30-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/curl: Implement bdrv_refresh_filename()

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-29-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/curl: Harmonize option defaults

Both of the defaults we currently have in the curl driver are named
based on a slightly different schema, let's unify that and call both
CURL_BLOCK_OPT_${NAME}_DEFAULT.

While at it, we can add a macro for the third option for which a default
exists, namely "sslverify".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-28-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/nvme: Fix bdrv_refresh_filename()

Currently, nvme's bdrv_refresh_filename() is an exact copy of null's
implementation. However, for null, "null-co://" and "null-aio://" are
indeed valid filenames -- for nvme, they are not, as a device address is
still required.

The correct implementation should generate a filename of the form
"nvme://[PCI address]/[namespace]" (as the comment above
nvme_parse_filename() describes).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-27-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Do not copy exact_filename from format file

If a format BDS's file BDS is in turn a format BDS, we cannot simply use
the same filename, because when opening a BDS tree based on a filename
alone, qemu will create only one format node on top of one protocol node
(disregarding a potential backing file).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-26-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Purify .bdrv_refresh_filename()

Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
Now that we have generic code in the central bdrv_refresh_filename() for
creating BDS.full_open_options, we can drop the latter part from all
BlockDriver.bdrv_refresh_filename() implementations.

This also means that we can drop all of the existing default code for
this from the global bdrv_refresh_filename() itself.

Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
after having set BDS.full_open_options, because the block driver's
implementation should now be allowed to depend on BDS.full_open_options
being set correctly.

Finally, with this patch we can drop the @options parameter from
BlockDriver.bdrv_refresh_filename(); also, add a comment on this
function's purpose in block/block_int.h while touching its interface.

This completely obsoletes blklogwrite's implementation of
.bdrv_refresh_filename().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-25-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Generically refresh runtime options

Instead of having every block driver which implements
bdrv_refresh_filename() copy all of the strong runtime options over to
bs->full_open_options, implement this process generically in
bdrv_refresh_filename().

This patch only adds this new generic implementation, it does not remove
the old functionality. This is done in a follow-up patch.

With this patch, some superfluous information (that should never have
been there) may be removed from some JSON filenames, as can be seen in
the change to iotests 110's and 228's reference outputs.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-24-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Add BlockDriver.bdrv_gather_child_options

Some follow-up patches will rework the way bs->full_open_options is
refreshed in bdrv_refresh_filename(). The new implementation will remove
the need for the block drivers' bdrv_refresh_filename() implementations
to set bs->full_open_options; instead, it will be generic and use static
information from each block driver.

However, by implementing bdrv_gather_child_options(), block drivers will
still be able to override the way the full_open_options of their
children are incorporated into their own.

We need to implement this function for VMDK because we have to prevent
the generic implementation from gathering the options of all children:
It is not possible to specify options for the extents through the
runtime options.

For quorum, the child names that would be used by the generic
implementation and the ones that we actually (currently) want to use
differ. See quorum_gather_child_options() for more information.

Note that both of these are cases which are not ideal: In case of VMDK
it would probably be nice to be able to specify options for all extents.
In case of quorum, the current runtime option structure is simply broken
and needs to be fixed (but that is left for another patch).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-23-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Add strong_runtime_opts to BlockDriver

This new field can be set by block drivers to list the runtime options
they accept that may influence the contents of the respective BDS. As of
a follow-up patch, this list will be used by the common
bdrv_refresh_filename() implementation to decide which options to put
into BDS.full_open_options (and consequently whether a JSON filename has
to be created), thus freeing the drivers of having to implement that
logic themselves.

Additionally, this patch adds the field to all of the block drivers that
need it and sets it accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-22-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Add quorum case to test 110

Test 110 tests relative backing filenames for complex BDS trees. Now
that the originally supposedly failing test passes, let us add a new
failing test: Quorum can never work automatically (without detecting
whether all child nodes have the same base directory, but that would be
rather inconsistent behavior).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-21-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Use bdrv_dirname() for relative filenames

bdrv_get_full_backing_filename_from_filename() breaks down when it comes
to JSON filenames. Using bdrv_dirname() as the basis is better because
since we have BDS, we can descend through the BDS tree to the protocol
layer, which gives us a greater probability of finding a non-JSON name;
also, bdrv_dirname() is more correct as it allows block drivers to
override the generation of that directory name in a protocol-specific
way.

We still need to keep bdrv_get_full_backing_filename_from_filename(),
though, because it has valid callers which need it during image creation
when no BDS is available yet.

This makes a test case in qemu-iotest 110, which was supposed to fail,
work. That is actually good, but we need to change the reference output
(and the comment in 110) accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-20-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/nfs: Implement bdrv_dirname()

While the basic idea is obvious and could be handled by the default
bdrv_dirname() implementation, we cannot generate a directory name if
the gid or uid are set, so we have to explicitly return NULL in those
cases.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-19-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/nbd: Make bdrv_dirname() return NULL

The generic bdrv_dirname() implementation would be able to generate some
form of directory name for many NBD nodes, but it would be always wrong.
Therefore, we have to explicitly make it an error (until NBD has some
form of specification for export paths, if it ever will).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20190201192935.18394-18-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

quorum: Make bdrv_dirname() return NULL

While the common implementation for bdrv_dirname() should return NULL
for quorum BDSs already (because they do not have a file node and their
exact_filename field should be empty), there is no reason not to make
that explicit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-17-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

blkverify: Make bdrv_dirname() return NULL

blkverify's BDSs have a file BDS, but we do not want this to be
preferred over the raw node. There is no way to decide between the two
(and not really a reason to, either), so just return NULL in blkverify's
implementation of bdrv_dirname().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-16-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Add bdrv_dirname()

This function may be implemented by block drivers to derive a directory
name from a BDS. Concatenating this g_free()-able string with a relative
filename must result in a valid (not necessarily existing) filename, so
this is a function that should generally be not implemented by format
drivers, because this is protocol-specific.

If a BDS's driver does not implement this function, bdrv_dirname() will
fall through to the BDS's file if it exists. If it does not, the
exact_filename field will be used to generate a directory name.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-15-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Fix bdrv_find_backing_image()

bdrv_find_backing_image() should use bdrv_get_full_backing_filename() or
bdrv_make_absolute_filename() instead of trying to do what those
functions do by itself.

path_combine_deprecated() can now be dropped, so let's do that.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Add bdrv_make_absolute_filename()

This is a general function for making a filename that is relative to a
certain BDS absolute.

It calls bdrv_get_full_backing_filename_from_filename() for now, but
that will be changed in a follow-up patch.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: bdrv_get_full_backing_filename's ret. val.

Make bdrv_get_full_backing_filename() return an allocated string instead
of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-12-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: bdrv_get_full_backing_filename_from_...'s ret. val.

Make bdrv_get_full_backing_filename_from_filename() return an allocated
string instead of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Make path_combine() return the path

Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.

In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20190201192935.18394-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests: Add test for backing file overrides

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests.py: Add node_info()

This function queries a node; since we cannot do that right now, it
executes query-named-block-nodes and returns the matching node's object.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20190201192935.18394-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

iotests.py: Add filter_imgfmt()

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-7-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Respect backing bs in bdrv_refresh_filename

Basically, bdrv_refresh_filename() should respect all children of a
BlockDriverState. However, generally those children are driver-specific,
so this function cannot handle the general case. On the other hand,
there are only few drivers which use other children than @file and
@backing (that being vmdk, quorum, and blkverify).

Most block drivers only use @file and/or @backing (if they use any
children at all). Both can be implemented directly in
bdrv_refresh_filename.

The user overriding the file's filename is already handled, however, the
user overriding the backing file is not. If this is done, opening the
BDS with the plain filename of its file will not be correct, so we may
not set bs->exact_filename in that case.

iotest 051 contains test cases for overriding the backing file, and so
its output changes with this patch applied.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Add BDS.auto_backing_file

If the backing file is overridden, this most probably does change the
guest-visible data of a BDS.  Therefore, we will need to consider this
in bdrv_refresh_filename().

To see whether it has been overridden, we might want to compare
bs->backing_file and bs->backing->bs->filename.  However,
bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
to change the backing child at runtime, without modifying the image
header), so bs->backing_file most of the time simply contains a copy of
bs->backing->bs->filename anyway, so it is useless for such a
comparison.

This patch adds an auto_backing_file BDS field which contains the
backing file path as indicated by the image header, which is not changed
by bdrv_set_backing_hd().

Because of bdrv_refresh_filename() magic, however, a BDS's filename may
differ from what has been specified during bdrv_open().  Then, the
comparison between bs->auto_backing_file and bs->backing->bs->filename
may fail even though bs->backing was opened from bs->auto_backing_file.
To mitigate this, we can copy the real BDS's filename (after the whole
bdrv_open() and bdrv_refresh_filename() process) into
bs->auto_backing_file, if we know the former has been opened based on
the latter.  This is only possible if no options modifying the backing
file's behavior have been specified, though.  To simplify things, this
patch only copies the filename from the backing file if no options have
been specified for it at all.

Furthermore, there are cases where an overlay is created by qemu which
already contains a BDS's filename (e.g. in blockdev-snapshot-sync).  We
do not need to worry about updating the overlay's bs->auto_backing_file
there, because we actually wrote a post-bdrv_refresh_filename() filename
into the image header.

So all in all, there will be false negatives where (as of a future
patch) bdrv_refresh_filename() will assume that the backing file differs
from what was specified in the image header, even though it really does
not.  However, these cases should be limited to where (1) the user
actually did override something in the backing chain (e.g. by specifying
options for the backing file), or (2) the user executed a QMP command to
change some node's backing file (e.g. change-backing-file or
block-commit with @backing-file given) where the given filename does not
happen to coincide with qemu's idea of the backing BDS's filename.

Then again, (1) really is limited to -drive.  With -blockdev or
blockdev-add, you have to adhere to the schema, so a user cannot give
partial "unimportant" options (e.g. by just setting backing.node-name
and leaving the rest to the image header).  Therefore, trying to fix
this would mean trying to fix something for -drive only.

To improve on (2), we would need a full infrastructure to "canonicalize"
an arbitrary filename (+ options), so it can be compared against
another.  That seems a bit over the top, considering that filenames
nowadays are there mostly for the user's entertainment.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Skip implicit nodes for filename info

bdrv_refresh_filename() should simply skip all implicit nodes. They are
supposed to be invisible to the user, so they should not appear in
filename information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Use children list in bdrv_refresh_filename

bdrv_refresh_filename() should invoke itself recursively on all
children, not just on file.

With that change, we can remove the manual invocations in blkverify,
quorum, commit, mirror, and blklogwrites.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>

block: Use bdrv_refresh_filename() to pull

Before this patch, bdrv_refresh_filename() is used in a pushing manner:
Whenever the BDS graph is modified, the parents of the modified edges
are supposed to be updated (recursively upwards).  However, that is
nonviable, considering that we want child changes not to concern
parents.

Also, in the long run we want a pull model anyway: Here, we would have a
bdrv_filename() function which returns a BDS's filename, freshly
constructed.

This patch is an intermediate step.  It adds bdrv_refresh_filename()
calls before every place a BDS.filename value is used.  The only
exceptions are protocol drivers that use their own filename, which
clearly would not profit from refreshing that filename before.

Also, bdrv_get_encrypted_filename() is removed along the way (as a user
of BDS.filename), since it is completely unused.

In turn, all of the calls to bdrv_refresh_filename() before this patch
are removed, because we no longer have to call this function on graph
changes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>

block/nvme: Remove QEMU_PACKED from naturally aligned NVMeRegs struct

The QEMU_PACKED is causing a compiler warning/error with GCC 9:

CC block/nvme.o
block/nvme.c: In function ‘nvme_create_queue_pair’:
block/nvme.c:209:22: error: taking address of packed member of
‘struct <anonymous>’ may result in an unaligned pointer value
[-Werror=address-of-packed-member]
209 | q->sq.doorbell = &s->regs->doorbells[idx * 2 * s->doorbell_scale];

All members of the struct are naturally aligned, so there should
not be the need for QEMU_PACKED here, and the following QEMU_BUILD_BUG_ON
also ensures that there is no padding. Thus simply remove the QEMU_PACKED
here.

Buglink: https://bugs.launchpad.net/qemu/+bug/1817525
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2: Assert that L2 table offsets fit in the L1 table

L1 table entries have a field to store the offset of an L2 table.
The rest of the bits of the entry are currently reserved except from
bit 63, which stores the COPIED flag.

The offset is always taken from the entry using L1E_OFFSET_MASK to
ensure that we only use the bits that belong to that field.

While that mask is used every time we read from the L1 table, it is
never used when we write to it. Due to the limits set elsewhere in the
code QEMU can never produce L2 table offsets that don't fit in that
field so any such offset when allocating an L2 table would indicate a
bug in QEMU.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-next-220219-1' into staging

Various testing fixes:

  - Travis updates (inc disable isapc cdrom test)
  - Add gitlab control
  - Fix docker image
  - keep softloat tests short

# gpg: Signature made Fri 22 Feb 2019 09:51:36 GMT
# gpg:                using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-testing-next-220219-1:
  tests/cdrom-test: only include isapc cdrom test when g_test_slow()
  tests/softfloat: always do quick softfloat tests
  Add a gitlab-ci file for Continuous Integration testing on Gitlab
  tests/docker: peg netmap code to a specific version
  tests/docker: squash initial update and install step for debian9
  .travis.yml: Remove disable-uuid
  .travis.yml: Test with disable-replication
  .travis.yml: split debug builds
  .travis.yml: the xcode10 image seems to be hosed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests: add test-bdrv-graph-mod

Add two tests of node graph modification.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: fix bdrv_check_perm for non-tree subgraph

bdrv_check_perm in it's recursion checks each node in context of new
permissions for one parent, because of nature of DFS. It works well,
while children subgraph of top-most updated node is a tree, i.e. it
doesn't have any kind of loops. But if we have a loop (not oriented,
of course), i.e. we have two different ways from top-node to some
child-node, then bdrv_check_perm will do wrong thing:

  top
  | \
  |  |
  v  v
  A  B
  |  |
  v  v
  node

It will once check new permissions of node in context of new A
permissions and old B permissions and once visa-versa. It's a wrong way
and may lead to corruption of permission system. We may start with
no-permissions and all-shared for both A->node and B->node relations
and finish up with non shared write permission for both ways.

The following commit will add a test, which shows this bug.

To fix this situation, let's really set BdrvChild permissions during
bdrv_check_perm procedure. And we are happy here, as check-perm is
already written in transaction manner, so we just need to restore
backed-up permissions in _abort.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: improve should_update_child

As it already said in the comment, we don't want to create loops in
parent->child relations. So, when we try to append @to to @c, we should
check that @c is not in @to children subtree, and we should check it
recursively, not only the first level. The patch provides BFS-based
search, to check the relations.

This is needed for further fleecing-hook filter usage: we need to
append it to source, when the hook is already a parent of target, and
source may be in a backing chain of target (fleecing-scheme). So, on
appending, the hook should not became a child (direct or through
children subtree) of the target.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

aio-posix: Assert that aio_poll() is always called in home thread

aio_poll() has an existing assertion that the function is only called
from the AioContext's home thread if blocking is allowed.

This is not enough, some handlers make assumptions about the thread they
run in. Extend the assertion to non-blocking calls, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: Use normal drain for bdrv_set_aio_context()

Now that bdrv_set_aio_context() works inside drained sections, it can
also use the real drain function instead of open coding something
similar.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

test-bdrv-drain: AioContext switch in drained section

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: Fix AioContext switch for drained node

When a drained node changes its AioContext, we need to move its
aio_disable_external() to the new context, too.

Without this fix, drain_end will try to reenable the new context, which
has never been disabled, so an assertion failure is triggered.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

block: Don't poll in bdrv_set_aio_context()

The explicit aio_poll() call in bdrv_set_aio_context() was added in
commit c2b6428d388 as a workaround for bdrv_drain() failing to achieve
to actually quiesce everything (specifically the NBD client code to
switch AioContext).

Now that the NBD client has been fixed to complete this operation during
bdrv_drain(), we don't need the workaround any more.

It was wrong anyway: aio_poll() must always be run in the home thread of
the AioContext.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

nbd: Increase bs->in_flight during AioContext switch

bdrv_drain() must not leave connection_co scheduled, so bs->in_flight
needs to be increased while the coroutine is waiting to be scheduled
in the new AioContext after nbd_client_attach_aio_context().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: Use low-level QIOChannel API in nbd_read_eof()

Instead of using the convenience wrapper qio_channel_read_all_eof(), use
the lower level QIOChannel API. This means duplicating some code, but
we'll need this because this coroutine yield is special: We want it to
be interruptible so that nbd_client_attach_aio_context() can correctly
reenter the coroutine.

This moves the bdrv_dec/inc_in_flight() pair into nbd_read_eof(), so
that connection_co will always sit in this exact qio_channel_yield()
call when bdrv_drain() returns.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

nbd: Move nbd_read_eof() to nbd/client.c

The only caller of nbd_read_eof() is nbd_receive_reply(), so it doesn't
have to live in the header file, but can move next to its caller.

Also add the missing coroutine_fn to the function and its caller.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

io: Remove redundant read/write_coroutine assignments

qio_channel_yield() now updates ioc->read_write/coroutine and calls
qio_channel_set_aio_fd_handlers(), so the code in the handlers has
become redundant and can be removed.

This does not make a difference in intermediate states because
aio_co_wake() really enters the coroutine immediately here: These
handlers are never run in coroutine context, and we're in the right
AioContext because qio_channel_attach_aio_context() asserts that the
handlers are inactive.

To make these conditions more obvious, assert the right AioContext.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

io: Make qio_channel_yield() interruptible

Similar to how qemu_co_sleep_ns() allows preemption from an external
coroutine entry, allow reentering qio_channel_yield() early.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

nbd: Restrict connection_co reentrance

nbd_client_attach_aio_context() schedules connection_co in the new
AioContext and this way reenters it in any arbitrary place that has
yielded. We can restrict this a bit to the function call where the
coroutine actually sits waiting when it's idle.

This doesn't solve any bug yet, but it shows where in the code we need
to support this random reentrance and where we don't have to care.

Add FIXME comments for the existing bugs that the rest of this series
will fix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

virtio-blk: Increase in_flight for request restart BH

virtio_blk_dma_restart_bh() submits new requests, so in order to make
sure that these requests are not started inside a drained section of the
attached BlockBackend, we need to make sure that draining the
BlockBackend waits for the BH to be executed.

This BH is still questionable because its scheduled in the main thread
instead of the configured iothread. Leave a FIXME comment for this.

But with this fix, enabling the data plane at least waits for these
requests (in bdrv_set_aio_context()) instead of changing the AioContext
under their feet and making them run in the wrong thread, causing
crashes and failures (e.g. due to missing locking).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block-backend: Make blk_inc/dec_in_flight public

For some users of BlockBackends, just increasing the in_flight counter
is easier than implementing separate handlers in BlockDevOps. Make the
helper functions for this public.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qemu-img: fix error reporting for -object

Error reporting for user_creatable_add_opts_foreach was changed so that
it no longer called 'error_report_err' in:

  commit 7e1e0c11127bde81cff260fc6859690435c509d6
  Author: Markus Armbruster <armbru@redhat.com>
  Date:   Wed Oct 17 10:26:43 2018 +0200

    qom: Clean up error reporting in user_creatable_add_opts_foreach()

Some callers were updated to pass in "&error_fatal" but all the ones in
qemu-img were left passing NULL. As a result all errors went to
/dev/null instead of being reported to the user.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

commit: Replace commit_top_bs on failure after deleting the block job

If there's an error in commit_start() then the block job must be
deleted before replacing commit_top_bs, otherwise it will fail because
of lack of permissions. This happens since the permission system was
introduced in 8dfba2797761d8a43744e4e6571c8175e448a478.

Fortunately this bug doesn't seem to be possible to reproduce at the
moment without changing the code.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block: don't set the same context

Adds a fast path on aio context setting preventing
unnecessary context setting routine.
Also, it prevents issues with cyclic walk of child
bds-es appeared because of registering aio walking
notifiers:

Call stack:

0  __GI_raise
1  __GI_abort
2  __assert_fail_base
3  __GI___assert_fail
4  bdrv_detach_aio_context (bs=0x55f54d65c000)      <<<
5  bdrv_detach_aio_context (bs=0x55f54fc8a800)
6  bdrv_set_aio_context (bs=0x55f54fc8a800, ...)
7  block_job_attached_aio_context
8  bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<<
9  bdrv_set_aio_context (bs=0x55f54d65c000)
10 blk_set_aio_context
11 virtio_blk_data_plane_stop
12 virtio_bus_stop_ioeventfd
13 virtio_vmstate_change
14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN)
15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true)
16 vm_stop (state=RUN_STATE_SHUTDOWN)
17 main_loop_should_exit
18 main_loop
19 main

This can happen because of "new" context attachment to VM disk bds.
When attaching a new context the corresponding aio context handler is
called for each of aio_notifiers registered on the VM disk bds context.
Among those handlers, there is the block_job_attached_aio_context handler
which sets a new aio context for the block job bds. When doing so,
the old context is detached from all the block job bds children and one of
them is the VM disk bds, serving as backing store for the blockjob bds,
although the VM disk bds is actually the initializer of that process.
Since the VM disk bds is protected with walking_aio_notifiers flag
from double processing in recursive calls, the assert fires.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

qcow2-snapshot: remove redundant find_snapshot_by_id_and_name call

In qcow2_snapshot_create there is the following code block:

    /* Generate an ID */
    find_new_snapshot_id(bs, sn_info->id_str, sizeof(sn_info->id_str));

    /* Check that the ID is unique */
    if (find_snapshot_by_id_and_name(bs, sn_info->id_str, NULL) >= 0) {
        return -EEXIST;
    }

find_new_snapshot_id cycles through all snapshots, getting the id_str
as an unsigned long int, calculating the max id_max value of all the
existing id_strs and writing in the id_str pointer id_max + 1:

    for(i = 0; i < s->nb_snapshots; i++) {
        sn = s->snapshots + i;
        id = strtoul(sn->id_str, NULL, 10);
        if (id > id_max)
            id_max = id;
    }
    snprintf(id_str, id_str_size, "%lu", id_max + 1);

Here, sn_info->id_str will have the unique value id_max + 1. Right
after that, find_snapshot_by_id_and_name is called with
id = sn_info->id_str and name = NULL. This will cause the function
to execute the following:

    } else if (id) {
        for (i = 0; i < s->nb_snapshots; i++) {
            if (!strcmp(s->snapshots[i].id_str, id)) {
                return i;
            }
        }
    }

In short, we're searching the existing snapshots to see if sn_info->id_str
matches any existing id, right after we set in the previous line a
sn_info->id_str value that is already unique.

The first code block goes way back to commit 585f8587ad, a 2006 commit from
Fabrice Bellard that simply says "new qcow2 disk image format". No more
info is provided about this logic in any subsequent commits that moved
this code block around.

I can't say about the original design, but the current logic is redundant.
bdrv_snapshot_create is called in aio_context lock, forbidding any
concurrent call to accidentally create a new snapshot between
the find_new_snapshot_id and find_snapshot_by_id_and_name calls. What
we're ending up doing is to cycle through the snapshots two times
for no viable reason.

This patch eliminates the redundancy by removing the 'id is unique'
check that calls find_snapshot_by_id_and_name.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/snapshot: remove bdrv_snapshot_delete_by_id_or_name

After the previous patch, the only instance of this function left
is inside qemu-img.c.

qemu-img is using it inside the 'img_snapshot' function to delete
snapshots in the SNAPSHOT_DELETE case, based on a "snapshot_name"
string that refers to the tag, not ID, of the QEMUSnapshotInfo struct.
This can be verified by checking the SNAPSHOT_CREATE case that
comes shortly before SNAPSHOT_DELETE. In that case, the same
"snapshot_name" variable is being strcpy to the 'name' field
of the QEMUSnapshotInfo struct sn:

pstrcpy(sn.name, sizeof(sn.name), snapshot_name);

Based on that, it is unlikely that "snapshot_name" might contain
an "id" in SNAPSHOT_DELETE.

This patch changes SNAPSHOT_DELETE to use snapshot_find() and
snapshot_delete() instead of bdrv_snapshot_delete_by_id_or_name.
After that, there is no instances left of bdrv_snapshot_delete_by_id_or_name
in the code, so it is safe to remove it entirely.

Suggested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

block/snapshot.c: eliminate use of ID input in snapshot operations

At this moment, QEMU attempts to create/load/delete snapshots
by using either an ID (id_str) or a name. The problem is that the code
isn't consistent of whether the entered argument is an ID or a name,
causing unexpected behaviors.

For example, when creating snapshots via savevm <arg>, what happens is that
"arg" is treated as both name and id_str. In a guest without snapshots, create
a single snapshot via savevm:

(qemu) savevm 0
(qemu) info snapshots
List of snapshots present on all disks:
ID        TAG                 VM SIZE                DATE       VM CLOCK
--        0                      741M 2018-07-31 13:39:56   00:41:25.313

A snapshot with name "0" is created. ID is hidden from the user, but the
ID is a non-zero integer that starts at "1". Thus, this snapshot has
id_str=1, TAG="0". Creating a second snapshot with arg = 1, the first one
is deleted:

(qemu) savevm 1
(qemu) info snapshots
List of snapshots present on all disks:
ID        TAG                 VM SIZE                DATE       VM CLOCK
--        1                      741M 2018-07-31 13:42:14   00:41:55.252

What happened?

- when creating the second snapshot, a verification is done inside
bdrv_all_delete_snapshot to delete any existing snapshots that matches an
string argument. Here, the code calls bdrv_all_delete_snapshot("1", ...);

- bdrv_all_delete_snapshot calls bdrv_snapshot_find(..., "1") for each
BlockDriverState of the guest. And this is where things goes tilting:
bdrv_snapshot_find does a search by both id_str and name. It finds
out that there is a snapshot that has id_str = 1, stores a reference
to the snapshot in the sn_info pointer and then returns match found;

- since a match was found, a call to bdrv_snapshot_delete_by_id_or_name() is
made. This function ignores the pointer written by bdrv_snapshot_find. Instead,
it deletes the snapshot using bdrv_snapshot_delete() calling it first with
id_str = 1. If it fails to delete, then it calls it again with name = 1.

- after all that, QEMU creates the new snapshot, that has id_str = 1 and
name = 1. The user is left wondering that happened with the first snapshot
created. Similar bugs can be triggered when using loadvm and delvm.

Before contemplating discarding the use of ID input in these operations,
I've searched the code of what would be the implications. My findings
are:

- the RBD and Sheepdog drivers don't care. Both uses the 'name' field as
key in their logic, making id_str = name when appropriate.
replay-snapshot.c does not make any special use of id_str;

- qcow2 uses id_str as an unique identifier but it is automatically
calculated, not being influenced by user input. Other than that, there are
no distinguish operations made only with id_str;

- in blockdev.c, the delete operation uses a match of both id_str AND
name. Given that id_str is either a copy of 'name' or auto-generated,
we're fine here.

This gives motivation to not consider ID as a valid user input in HMP
commands - sticking with 'name' input only is more consistent. To
accomplish that, the following changes were made in this patch:

- bdrv_snapshot_find() does not match for id_str anymore, only 'name'. The
function is called in save_snapshot(), load_snapshot(), bdrv_all_delete_snapshot()
and bdrv_all_find_snapshot(). This change makes the search function more
predictable and does not change the behavior of any underlying code that uses
these affected functions, which are related to HMP (which is fine) and the
main loop inside vl.c (which doesn't care about it anyways);

- bdrv_all_delete_snapshot() does not call bdrv_snapshot_delete_by_id_or_name
anymore. Instead, it uses the pointer returned by bdrv_snapshot_find to
erase the snapshot with the exact match of id_str an name. This function
is called in save_snapshot and hmp_delvm, thus this change  produces the
intended effect;

- documentation changes to reflect the new behavior. I consider this to
be an API fix instead of an API change - the user was already creating
snapshots using 'name', but now he/she will also enjoy a consistent
behavior.

Ideally we would get rid of the id_str field entirely, but this would have
repercussions on existing snapshots. Another day perhaps.

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

MAINTAINERS: Remove myself as block maintainer

I'll not be involved in day-to-day qemu development. Remove myself as
maintainer from the remainder of the network block drivers, and revert
them to the general block layer maintainership.

Move 'sheepdog' to the 'Odd Fixes' support level.

For VHDX, added my personal email address as a maintainer, as I can
answer questions or send the occassional bug fix. Leaving it as
'Supported', instead of 'Odd Fixes', because I think the rest of the
block layer maintainers and developers will upkeep it as well, if
needed.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-Id: <63e205cb84c8f0a10c1bc6d5d6856d72ceb56e41.1537984851.git.jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

MAINTAINERS: Replace myself with John Snow for block jobs

I'll not be involved with day-to-day qemu development, and John
Snow is a block jobs wizard. Have him take over block job
maintainership duties.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-Id: <d56d7c6592e7d68aa72764e9616878394bffbc14.1537984851.git.jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Merge remote-tracking branch 'remotes/kraxel/tags/vga-20190222-pull-request' into staging

vga: bugfixes and edid support for virtio-vga

# gpg: Signature made Fri 22 Feb 2019 08:24:25 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/vga-20190222-pull-request:
  display/virtio: add edid support.
  virtio-gpu: remove useless 'waiting' field
  virtio-gpu: block both 2d and 3d rendering
  virtio-gpu: remove unused config_size
  virtio-gpu: remove unused qdev

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/kraxel/tags/ui-20190222-pull-request' into staging

ui: add support for -display spice-app
ui: gtk+sdl bugfixes.

# gpg: Signature made Fri 22 Feb 2019 07:53:13 GMT
# gpg:                using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" [full]
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>" [full]
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>" [full]
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/ui-20190222-pull-request:
  display: add -display spice-app launching a Spice client
  spice: use a default name for the server
  qapi: document DisplayType enum
  build-sys: add gio-2.0 check
  char: register spice ports after spice started
  char: move SpiceChardev and open_spice_port() to spice.h header
  spice: do not stop spice if VM is paused
  spice: merge options lists
  spice: avoid spice runtime assert
  char/spice: discard write() if backend is disconnected
  char/spice: trigger HUP event
  ui/gtk: Fix the license information
  sdl2: drop qemu_input_event_send_key_qcode call
  spice: set device address and device display ID in QXL interface
  kbd-state: don't block auto-repeat events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20190221.0' into staging

VFIO updates 2019-02-21

- Workaround kernel overflow bug in vfio type1 DMA unmap
   (Alex Williamson)

- Refactor vfio container initialization (Eric Auger)

# gpg: Signature made Fri 22 Feb 2019 05:21:07 GMT
# gpg:                using RSA key 239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full]
# gpg:                 aka "Alex Williamson <alex@shazbot.org>" [full]
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>" [full]
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>" [full]
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20190221.0:
  hw/vfio/common: Refactor container initialization
  vfio/common: Work around kernel overflow bug in DMA unmap

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/rth/tags/pull-hppa-20190221' into staging

Fix dino pci config access.

# gpg: Signature made Thu 21 Feb 2019 19:03:26 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-hppa-20190221:
  hw/hppa/dino: mask out lower 2 bits of PCI config addr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190221' into staging

Allow const void * as argument to helpers.
Remove obsolete TODO file.

# gpg: Signature made Thu 21 Feb 2019 18:59:11 GMT
# gpg:                using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* remotes/rth/tags/pull-tcg-20190221:
  include/exec/helper-head.h: support "const void *" in helper calls
  tcg: Remove TODO file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-feb-21-2019-v2' into staging

MIPS queue for February 21st, 2019, v2

# gpg: Signature made Thu 21 Feb 2019 18:37:04 GMT
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-feb-21-2019-v2:
  target/mips: fulong2e: Dynamically generate SPD EEPROM data
  target/mips: fulong2e: Fix bios flash size
  hw/pci-host/bonito.c: Add PCI mem region mapped at the correct address
  target/mips: implement QMP query-cpu-definitions command
  tests/tcg: target/mips: Add wrappers for MSA integer compare instructions
  tests/tcg: target/mips: Change directory name 'bit-counting' to 'bit-count'
  tests/tcg: target/mips: Correct path to headers in some test source files
  hw/misc: mips_itu: Fix 32/64 bit issue in a line involving shift operator

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

tests/virtio-blk: add test for DISCARD command

If the DISCARD feature is enabled, we try this command in the
test_basic(), checking only the status returned by the request.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-11-sgarzare@redhat.com
Message-Id: <20190221103314.58500-11-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

tests/virtio-blk: add test for WRITE_ZEROES command

If the WRITE_ZEROES feature is enabled, we check this command
in the test_basic().

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-10-sgarzare@redhat.com
Message-Id: <20190221103314.58500-10-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

tests/virtio-blk: add virtio_blk_fix_dwz_hdr() function

This function is useful to fix the endianness of struct
virtio_blk_discard_write_zeroes headers.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-9-sgarzare@redhat.com
Message-Id: <20190221103314.58500-9-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

tests/virtio-blk: change assert on data_size in virtio_blk_request()

The size of data in the virtio_blk_request must be a multiple
of 512 bytes for IN and OUT requests, or a multiple of the size
of struct virtio_blk_discard_write_zeroes for DISCARD and
WRITE_ZEROES requests.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-8-sgarzare@redhat.com
Message-Id: <20190221103314.58500-8-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

virtio-blk: add DISCARD and WRITE_ZEROES features

This patch adds the support of DISCARD and WRITE_ZEROES commands,
that have been introduced in the virtio-blk protocol to have
better performance when using SSD backend.

We support only one segment per request since multiple segments
are not widely used and there are no userspace APIs that allow
applications to submit multiple segments in a single call.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-7-sgarzare@redhat.com
Message-Id: <20190221103314.58500-7-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

virtio-blk: set config size depending on the features enabled

Starting from DISABLE and WRITE_ZEROES features, we use an array of
VirtIOFeature (as virtio-net) to properly set the config size
depending on the features enabled.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-6-sgarzare@redhat.com
Message-Id: <20190221103314.58500-6-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

virtio-net: make VirtIOFeature usable for other virtio devices

In order to use VirtIOFeature also in other virtio devices, we move
its declaration and the endof() macro (renamed in virtio_endof())
in virtio.h.
We add virtio_feature_get_config_size() function to iterate the array
of VirtIOFeature and to return the config size depending on the
features enabled. (as virtio_net_set_config_size() did)

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20190221103314.58500-5-sgarzare@redhat.com
Message-Id: <20190221103314.58500-5-sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>