From: Wolfgang Bumiller
Date: Fri, 19 Apr 2019 07:53:37 +0000 (+0200)
Subject: bump version to 3.0.1-1
X-Git-Url: https://git.proxmox.com/?a=commitdiff_plain;h=0775f12b636756ead830d039601c83d372894ef9;p=pve-qemu.git

bump version to 3.0.1-1

Signed-off-by: Wolfgang Bumiller
---

diff --git a/Makefile b/Makefile
index d4be693..3daaf8d 100644
--- a/Makefile
+++ b/Makefile
@@ -1,6 +1,6 @@
 # also update debian/changelog
-KVMVER=3.0.0
-KVMPKGREL=1~pvetest2
+KVMVER=3.0.1
+KVMPKGREL=1
 
 KVMPACKAGE = pve-qemu-kvm
 KVMSRC = qemu
diff --git a/debian/changelog b/debian/changelog
index 5da43ab..28937ba 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,8 +1,42 @@
-pve-qemu-kvm (3.0.0-1~pvetest2) unstable; urgency=medium
+pve-qemu-kvm (3.0.1-1) unstable; urgency=medium
 
-  * update to 3.0.0
+  * update to 3.0.1
 
- -- Proxmox Support Team Thu, 30 Aug 2018 14:59:49 +0200
+ -- Proxmox Support Team Fri, 19 Apr 2019 09:51:15 +0200
+
+pve-qemu-kvm (2.12.1-3) stable; urgency=medium
+
+  * fix an error handling issue with live snapshots where if a storage runs
+    full the process would ignore the error
+
+ -- Proxmox Support Team Mon, 18 Mar 2019 11:34:08 +0100
+
+pve-qemu-kvm (2.12.1-2) stable; urgency=medium
+
+  * fix CVE-2019-3812: Out-of-bounds read in hw/i2c/i2c-ddc.c allows for memory
+    disclosure
+
+  * fix CVE-2018-18849: lsi53c895a: OOB msg buffer access leads to DoS
+
+  * fix CVE-2018-20124: rdma: OOB access when building scatter-gather array
+
+  * fix CVE-2019-6778: slirp: heap buffer overflow in tcp_emu()
+
+ -- Proxmox Support Team Tue, 19 Feb 2019 09:28:42 +0100
+
+pve-qemu-kvm (2.12.1-1) stable; urgency=medium
+
+  * update to 2.12.1 with some additional CVE fixes included
+
+  * fix CVE-2018-10839: ne2000: integer overflow leads to buffer overflow issue
+
+  * fix CVE-2018-17958: rtl8139: integer overflow leads to buffer overflow
+
+  * fix CVE-2018-17962: pcnet: integer overflow leads to buffer overflow
+
+  * fix CVE-2018-17963: net: ignore packets with large size
+
+ -- Proxmox Support Team Tue, 16 Oct 2018 14:22:11 +0200
 
 pve-qemu-kvm (2.11.2-1) stable; urgency=medium
 
diff --git a/debian/patches/extra/0001-monitor-guard-iothread-access-by-mon-use_io_thread.patch b/debian/patches/extra/0001-monitor-guard-iothread-access-by-mon-use_io_thread.patch
new file mode 100644
index 0000000..136a2a6
--- /dev/null
+++ b/debian/patches/extra/0001-monitor-guard-iothread-access-by-mon-use_io_thread.patch
@@ -0,0 +1,36 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Wolfgang Bumiller
+Date: Tue, 25 Sep 2018 10:15:06 +0200
+Subject: [PATCH] monitor: guard iothread access by mon->use_io_thread
+
+monitor_resume() and monitor_suspend() both want to
+"kick" the I/O thread if it is there, but in
+monitor_suspend() lacked the use_io_thread flag condition.
+This is required when we later only spawn the thread on
+first use.
+
+Signed-off-by: Wolfgang Bumiller
+Reviewed-by: Eric Blake
+Reviewed-by: Peter Xu
+Message-Id: <20180925081507.11873-2-w.bumiller@proxmox.com>
+Signed-off-by: Markus Armbruster
+---
+ monitor.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/monitor.c b/monitor.c
+index a1999e396c..836c0bbdaa 100644
+--- a/monitor.c
++++ b/monitor.c
+@@ -4376,7 +4376,7 @@ int monitor_suspend(Monitor *mon)
+ 
+     atomic_inc(&mon->suspend_cnt);
+ 
+-    if (monitor_is_qmp(mon)) {
++    if (monitor_is_qmp(mon) && mon->use_io_thread) {
+         /*
+          * Kick I/O thread to make sure this takes effect. It'll be
+          * evaluated again in prepare() of the watch object.
+-- 
+2.11.0
+
diff --git a/debian/patches/extra/0001-seccomp-use-SIGSYS-signal-instead-of-killing-the-thr.patch b/debian/patches/extra/0001-seccomp-use-SIGSYS-signal-instead-of-killing-the-thr.patch
deleted file mode 100644
index 5bdc035..0000000
--- a/debian/patches/extra/0001-seccomp-use-SIGSYS-signal-instead-of-killing-the-thr.patch
+++ /dev/null
@@ -1,47 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?=
-Date: Wed, 22 Aug 2018 19:02:47 +0200
-Subject: [PATCH] seccomp: use SIGSYS signal instead of killing the thread
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-The seccomp action SCMP_ACT_KILL results in immediate termination of
-the thread that made the bad system call. However, qemu being
-multi-threaded, it keeps running. There is no easy way for parent
-process / management layer (libvirt) to know about that situation.
-
-Instead, the default SIGSYS handler when invoked with SCMP_ACT_TRAP
-will terminate the program and core dump.
-
-This may not be the most secure solution, but probably better than
-just killing the offending thread. SCMP_ACT_KILL_PROCESS has been
-added in Linux 4.14 to improve the situation, which I propose to use
-by default if available in the next patch.
-
-Related to:
-https://bugzilla.redhat.com/show_bug.cgi?id=1594456
-
-Signed-off-by: Marc-André Lureau
-Reviewed-by: Daniel P. Berrangé
-Acked-by: Eduardo Otubo
----
- qemu-seccomp.c | 2 +-
- 1 file changed, 1 insertion(+), 1 deletion(-)
-
-diff --git a/qemu-seccomp.c b/qemu-seccomp.c
-index 9cd8eb9499..b117a92559 100644
---- a/qemu-seccomp.c
-+++ b/qemu-seccomp.c
-@@ -125,7 +125,7 @@ static int seccomp_start(uint32_t seccomp_opts)
-             continue;
-         }
- 
--        rc = seccomp_rule_add_array(ctx, SCMP_ACT_KILL, blacklist[i].num,
-+        rc = seccomp_rule_add_array(ctx, SCMP_ACT_TRAP, blacklist[i].num,
-                                     blacklist[i].narg, blacklist[i].arg_cmp);
-         if (rc < 0) {
-             goto seccomp_return;
--- 
-2.11.0
-
diff --git a/debian/patches/extra/0002-monitor-delay-monitor-iothread-creation.patch b/debian/patches/extra/0002-monitor-delay-monitor-iothread-creation.patch
new file mode 100644
index 0000000..7a9cda7
--- /dev/null
+++ b/debian/patches/extra/0002-monitor-delay-monitor-iothread-creation.patch
@@ -0,0 +1,114 @@
+From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
+From: Wolfgang Bumiller
+Date: Tue, 25 Sep 2018 10:15:07 +0200
+Subject: [PATCH] monitor: delay monitor iothread creation
+
+Commit d32749deb615 moved the call to monitor_init_globals()
+to before os_daemonize(), making it an unsuitable place to
+spawn the monitor iothread as it won't be inherited over the
+fork() in os_daemonize().
+
+We now spawn the thread the first time we instantiate a
+monitor which actually has use_io_thread == true.
+Instantiation of monitors happens only after os_daemonize().
+We still need to create the qmp_dispatcher_bh when not using
+iothreads, so this now still happens in
+monitor_init_globals().
+
+Signed-off-by: Wolfgang Bumiller
+Fixes: d32749deb615 ("monitor: move init global earlier")
+Message-Id: <20180925081507.11873-3-w.bumiller@proxmox.com>
+Reviewed-by: Eric Blake
+Reviewed-by: Peter Xu
+Tested-by: Peter Xu
+[This fixes a crash on shutdown with --daemonize]
+Signed-off-by: Markus Armbruster
+---
+ monitor.c | 36 ++++++++++++++++++++++--------------
+ 1 file changed, 22 insertions(+), 14 deletions(-)
+
+diff --git a/monitor.c b/monitor.c
+index 836c0bbdaa..c7eae64fd9 100644
+--- a/monitor.c
++++ b/monitor.c
+@@ -807,9 +807,14 @@ static void monitor_qapi_event_init(void)
+ 
+ static void handle_hmp_command(Monitor *mon, const char *cmdline);
+ 
++static void monitor_iothread_init(void);
++
+ static void monitor_data_init(Monitor *mon, bool skip_flush,
+                               bool use_io_thread)
+ {
++    if (use_io_thread && !mon_iothread) {
++        monitor_iothread_init();
++    }
+     memset(mon, 0, sizeof(Monitor));
+     qemu_mutex_init(&mon->mon_lock);
+     qemu_mutex_init(&mon->qmp.qmp_queue_lock);
+@@ -4544,6 +4549,15 @@ static AioContext *monitor_get_aio_context(void)
+ static void monitor_iothread_init(void)
+ {
+     mon_iothread = iothread_create("mon_iothread", &error_abort);
++}
++
++void monitor_init_globals(void)
++{
++    monitor_init_qmp_commands();
++    monitor_qapi_event_init();
++    sortcmdlist();
++    qemu_mutex_init(&monitor_lock);
++    qemu_mutex_init(&mon_fdsets_lock);
+ 
+     /*
+      * The dispatcher BH must run in the main loop thread, since we
+@@ -4559,21 +4573,11 @@ static void monitor_iothread_init(void)
+      * monitors that are using the I/O thread have their output
+      * written by the I/O thread.
+      */
+-    qmp_respond_bh = aio_bh_new(monitor_get_aio_context(),
++    qmp_respond_bh = aio_bh_new(iohandler_get_aio_context(),
+                                 monitor_qmp_bh_responder,
+                                 NULL);
+ }
+ 
+-void monitor_init_globals(void)
+-{
+-    monitor_init_qmp_commands();
+-    monitor_qapi_event_init();
+-    sortcmdlist();
+-    qemu_mutex_init(&monitor_lock);
+-    qemu_mutex_init(&mon_fdsets_lock);
+-    monitor_iothread_init();
+-}
+-
+ /* These functions just adapt the readline interface in a typesafe way. We
+  * could cast function pointers but that discards compiler checks.
+  */
+@@ -4711,7 +4715,9 @@ void monitor_cleanup(void)
+      * we need to unregister from chardev below in
+      * monitor_data_destroy(), and chardev is not thread-safe yet
+      */
+-    iothread_stop(mon_iothread);
++    if (mon_iothread) {
++        iothread_stop(mon_iothread);
++    }
+ 
+     /*
+      * Flush all response queues. Note that even after this flush,
+@@ -4735,8 +4741,10 @@ void monitor_cleanup(void)
+     qemu_bh_delete(qmp_respond_bh);
+     qmp_respond_bh = NULL;
+ 
+-    iothread_destroy(mon_iothread);
+-    mon_iothread = NULL;
++    if (mon_iothread) {
++        iothread_destroy(mon_iothread);
++        mon_iothread = NULL;
++    }
+ }
+ 
+ QemuOptsList qemu_mon_opts = {
+-- 
+2.11.0
+
diff --git a/debian/patches/extra/0002-seccomp-prefer-SCMP_ACT_KILL_PROCESS-if-available.patch b/debian/patches/extra/0002-seccomp-prefer-SCMP_ACT_KILL_PROCESS-if-available.patch
deleted file mode 100644
index 7f8ce25..0000000
--- a/debian/patches/extra/0002-seccomp-prefer-SCMP_ACT_KILL_PROCESS-if-available.patch
+++ /dev/null
@@ -1,90 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?=
-Date: Wed, 22 Aug 2018 19:02:48 +0200
-Subject: [PATCH] seccomp: prefer SCMP_ACT_KILL_PROCESS if available
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-The upcoming libseccomp release should have SCMP_ACT_KILL_PROCESS
-action (https://github.com/seccomp/libseccomp/issues/96).
-
-SCMP_ACT_KILL_PROCESS is preferable to immediately terminate the
-offending process, rather than having the SIGSYS handler running.
-
-Use SECCOMP_GET_ACTION_AVAIL to check availability of kernel support,
-as libseccomp will fallback on SCMP_ACT_KILL otherwise, and we still
-prefer SCMP_ACT_TRAP.
-
-Signed-off-by: Marc-André Lureau
-Reviewed-by: Daniel P. Berrangé
-Acked-by: Eduardo Otubo
----
- qemu-seccomp.c | 31 ++++++++++++++++++++++++++++++-
- 1 file changed, 30 insertions(+), 1 deletion(-)
-
-diff --git a/qemu-seccomp.c b/qemu-seccomp.c
-index b117a92559..f0c833f3ca 100644
---- a/qemu-seccomp.c
-+++ b/qemu-seccomp.c
-@@ -20,6 +20,7 @@
- #include
- #include
- #include "sysemu/seccomp.h"
-+#include
- 
- /* For some architectures (notably ARM) cacheflush is not supported until
-  * libseccomp 2.2.3, but configure enforces that we are using a more recent
-@@ -107,12 +108,40 @@ static const struct QemuSeccompSyscall blacklist[] = {
-     { SCMP_SYS(sched_get_priority_min), QEMU_SECCOMP_SET_RESOURCECTL },
- };
- 
-+static inline __attribute__((unused)) int
-+qemu_seccomp(unsigned int operation, unsigned int flags, void *args)
-+{
-+#ifdef __NR_seccomp
-+    return syscall(__NR_seccomp, operation, flags, args);
-+#else
-+    errno = ENOSYS;
-+    return -1;
-+#endif
-+}
-+
-+static uint32_t qemu_seccomp_get_kill_action(void)
-+{
-+#if defined(SECCOMP_GET_ACTION_AVAIL) && defined(SCMP_ACT_KILL_PROCESS) && \
-+    defined(SECCOMP_RET_KILL_PROCESS)
-+    {
-+        uint32_t action = SECCOMP_RET_KILL_PROCESS;
-+
-+        if (qemu_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0) {
-+            return SCMP_ACT_KILL_PROCESS;
-+        }
-+    }
-+#endif
-+
-+    return SCMP_ACT_TRAP;
-+}
-+
- 
- static int seccomp_start(uint32_t seccomp_opts)
- {
-     int rc = 0;
-     unsigned int i = 0;
-     scmp_filter_ctx ctx;
-+    uint32_t action = qemu_seccomp_get_kill_action();
- 
-     ctx = seccomp_init(SCMP_ACT_ALLOW);
-     if (ctx == NULL) {
-@@ -125,7 +154,7 @@ static int seccomp_start(uint32_t seccomp_opts)
-             continue;
-         }
- 
--        rc = seccomp_rule_add_array(ctx, SCMP_ACT_TRAP, blacklist[i].num,
-+        rc = seccomp_rule_add_array(ctx, action, blacklist[i].num,
-                                     blacklist[i].narg, blacklist[i].arg_cmp);
-         if (rc < 0) {
-             goto seccomp_return;
--- 
-2.11.0
-
diff --git a/debian/patches/extra/0003-configure-require-libseccomp-2.2.0.patch b/debian/patches/extra/0003-configure-require-libseccomp-2.2.0.patch
deleted file mode 100644
index 34ec05b..0000000
--- a/debian/patches/extra/0003-configure-require-libseccomp-2.2.0.patch
+++ /dev/null
@@ -1,53 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?=
-Date: Wed, 22 Aug 2018 19:02:49 +0200
-Subject: [PATCH] configure: require libseccomp 2.2.0
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-The following patch is going to require TSYNC, which is only available
-since libseccomp 2.2.0.
-
-libseccomp 2.2.0 was released February 12, 2015.
-
-According to repology, libseccomp version in different distros:
-
-  RHEL-7: 2.3.1
-  Debian (Stretch): 2.3.1
-  OpenSUSE Leap 15: 2.3.2
-  Ubuntu (Xenial): 2.3.1
-
-This will drop support for -sandbox on:
-
-  Debian (Jessie): 2.1.1 (but 2.2.3 in backports)
-
-Signed-off-by: Marc-André Lureau
-Acked-by: Eduardo Otubo
----
- configure | 7 ++-----
- 1 file changed, 2 insertions(+), 5 deletions(-)
-
-diff --git a/configure b/configure
-index 601c1f44f9..d2cc11cdbb 100755
---- a/configure
-+++ b/configure
-@@ -2222,13 +2222,10 @@ fi
- ##########################################
- # libseccomp check
- 
-+libseccomp_minver="2.2.0"
- if test "$seccomp" != "no" ; then
-     case "$cpu" in
--    i386|x86_64)
--        libseccomp_minver="2.1.0"
--        ;;
--    mips)
--        libseccomp_minver="2.2.0"
-+    i386|x86_64|mips)
-         ;;
-     arm|aarch64)
-         libseccomp_minver="2.2.3"
--- 
-2.11.0
-
diff --git a/debian/patches/extra/0004-seccomp-set-the-seccomp-filter-to-all-threads.patch b/debian/patches/extra/0004-seccomp-set-the-seccomp-filter-to-all-threads.patch
deleted file mode 100644
index 2363cb7..0000000
--- a/debian/patches/extra/0004-seccomp-set-the-seccomp-filter-to-all-threads.patch
+++ /dev/null
@@ -1,57 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: =?UTF-8?q?Marc-Andr=C3=A9=20Lureau?=
-Date: Wed, 22 Aug 2018 19:02:50 +0200
-Subject: [PATCH] seccomp: set the seccomp filter to all threads
-MIME-Version: 1.0
-Content-Type: text/plain; charset=UTF-8
-Content-Transfer-Encoding: 8bit
-
-When using "-seccomp on", the seccomp policy is only applied to the
-main thread, the vcpu worker thread and other worker threads created
-after seccomp policy is applied; the seccomp policy is not applied to
-e.g. the RCU thread because it is created before the seccomp policy is
-applied and SECCOMP_FILTER_FLAG_TSYNC isn't used.
-
-This can be verified with
-for task in /proc/`pidof qemu`/task/*; do cat $task/status | grep Secc ; done
-Seccomp: 2
-Seccomp: 0
-Seccomp: 0
-Seccomp: 2
-Seccomp: 2
-Seccomp: 2
-
-Starting with libseccomp 2.2.0 and kernel >= 3.17, we can use
-seccomp_attr_set(ctx, > SCMP_FLTATR_CTL_TSYNC, 1) to update the policy
-on all threads.
-
-libseccomp requirement was bumped to 2.2.0 in previous patch.
-libseccomp should fail to set the filter if it can't honour
-SCMP_FLTATR_CTL_TSYNC (untested), and thus -sandbox will now fail on
-kernel < 3.17.
-
-Signed-off-by: Marc-André Lureau
-Acked-by: Eduardo Otubo
----
- qemu-seccomp.c | 5 +++++
- 1 file changed, 5 insertions(+)
-
-diff --git a/qemu-seccomp.c b/qemu-seccomp.c
-index f0c833f3ca..4729eb107f 100644
---- a/qemu-seccomp.c
-+++ b/qemu-seccomp.c
-@@ -149,6 +149,11 @@ static int seccomp_start(uint32_t seccomp_opts)
-         goto seccomp_return;
-     }
- 
-+    rc = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_TSYNC, 1);
-+    if (rc != 0) {
-+        goto seccomp_return;
-+    }
-+
-     for (i = 0; i < ARRAY_SIZE(blacklist); i++) {
-         if (!(seccomp_opts & blacklist[i].set)) {
-             continue;
--- 
-2.11.0
-
diff --git a/debian/patches/extra/0005-monitor-create-iothread-after-daemonizing.patch b/debian/patches/extra/0005-monitor-create-iothread-after-daemonizing.patch
deleted file mode 100644
index df2159a..0000000
--- a/debian/patches/extra/0005-monitor-create-iothread-after-daemonizing.patch
+++ /dev/null
@@ -1,73 +0,0 @@
-From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
-From: Wolfgang Bumiller
-Date: Fri, 7 Sep 2018 14:45:51 +0200
-Subject: [PATCH] monitor: create iothread after daemonizing
-
-Commit d32749deb615 moved the call to monitor_init_globals()
-to before os_daemonize() in order to initialize locks used
-when parsing arguments and instantiating monitors.
-This function also creates an iothread which is now lost
-when fork()ing in os_daemonize(), causing its final join to
-fail.
-Fix this by exposing monitor_iothread_init() to be used in
-vl.c after the os_daemonize() call.
-
-FIXME: verify nothing between the new init() place and
-iothread spawning actually already depends on the iothread.
-
-Signed-off-by: Wolfgang Bumiller
-Fixes: d32749deb615 ("monitor: move init global earlier")
----
- include/monitor/monitor.h | 1 +
- monitor.c | 3 +--
- vl.c | 1 +
- 3 files changed, 3 insertions(+), 2 deletions(-)
-
-diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
-index 2ef5e04b37..119c4a393e 100644
---- a/include/monitor/monitor.h
-+++ b/include/monitor/monitor.h
-@@ -18,6 +18,7 @@ extern __thread Monitor *cur_mon;
- bool monitor_cur_is_qmp(void);
- 
- void monitor_init_globals(void);
-+void monitor_iothread_init(void);
- void monitor_init(Chardev *chr, int flags);
- void monitor_cleanup(void);
- 
-diff --git a/monitor.c b/monitor.c
-index 77861e96af..24bfa0266b 100644
---- a/monitor.c
-+++ b/monitor.c
-@@ -4539,7 +4539,7 @@ static AioContext *monitor_get_aio_context(void)
-     return iothread_get_aio_context(mon_iothread);
- }
- 
--static void monitor_iothread_init(void)
-+void monitor_iothread_init(void)
- {
-     mon_iothread = iothread_create("mon_iothread", &error_abort);
- 
-@@ -4569,7 +4569,6 @@ void monitor_init_globals(void)
-     sortcmdlist();
-     qemu_mutex_init(&monitor_lock);
-     qemu_mutex_init(&mon_fdsets_lock);
--    monitor_iothread_init();
- }
- 
- /* These functions just adapt the readline interface in a typesafe way. We
-diff --git a/vl.c b/vl.c
-index a03e4c2867..d96f4d0d2a 100644
---- a/vl.c
-+++ b/vl.c
-@@ -4008,6 +4008,7 @@ int main(int argc, char **argv, char **envp)
- 
-     os_daemonize();
-     rcu_disable_atfork();
-+    monitor_iothread_init();
- 
-     if (pid_file && qemu_create_pidfile(pid_file) != 0) {
-         error_report("could not acquire pid file: %s", strerror(errno));
--- 
-2.11.0
-
diff --git a/debian/patches/pve/0002-PVE-Config-Adjust-network-script-path-to-etc-kvm.patch b/debian/patches/pve/0002-PVE-Config-Adjust-network-script-path-to-etc-kvm.patch
index 52dda54..4811863 100644
--- a/debian/patches/pve/0002-PVE-Config-Adjust-network-script-path-to-etc-kvm.patch
+++ b/debian/patches/pve/0002-PVE-Config-Adjust-network-script-path-to-etc-kvm.patch
@@ -8,10 +8,10 @@ Subject: [PATCH] PVE: [Config] Adjust network script path to /etc/kvm/
  1 file changed, 3 insertions(+), 2 deletions(-)
 
 diff --git a/include/net/net.h b/include/net/net.h
-index 1425960f76..fdf0957642 100644
+index 3e4638b8c6..e4dfe43f75 100644
 --- a/include/net/net.h
 +++ b/include/net/net.h
-@@ -216,8 +216,9 @@ void qmp_netdev_add(QDict *qdict, QObject **ret, Error **errp);
+@@ -210,8 +210,9 @@ void qmp_netdev_add(QDict *qdict, QObject **ret, Error **errp);
  int net_hub_id_for_client(NetClientState *nc, int *id);
  NetClientState *net_hub_port_find(int hub_id);
  
diff --git a/debian/patches/pve/0008-PVE-Config-rbd-block-rbd-disable-rbd_cache_writethro.patch b/debian/patches/pve/0008-PVE-Config-rbd-block-rbd-disable-rbd_cache_writethro.patch
index 1df5d1e..ba42289 100644
--- a/debian/patches/pve/0008-PVE-Config-rbd-block-rbd-disable-rbd_cache_writethro.patch
+++ b/debian/patches/pve/0008-PVE-Config-rbd-block-rbd-disable-rbd_cache_writethro.patch
@@ -17,7 +17,7 @@ Signed-off-by: Wolfgang Bumiller
  1 file changed, 2 insertions(+)
 
 diff --git a/block/rbd.c b/block/rbd.c
-index ca8e5bbace..34ae730711 100644
+index 014c68d629..53293845f6 100644
 --- a/block/rbd.c
 +++ b/block/rbd.c
 @@ -634,6 +634,8 @@ static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx,
diff --git a/debian/patches/pve/0009-PVE-Up-qmp-add-get_link_status.patch b/debian/patches/pve/0009-PVE-Up-qmp-add-get_link_status.patch
index e210da2..1e05516 100644
--- a/debian/patches/pve/0009-PVE-Up-qmp-add-get_link_status.patch
+++ b/debian/patches/pve/0009-PVE-Up-qmp-add-get_link_status.patch
@@ -10,10 +10,10 @@ Subject: [PATCH] PVE: [Up] qmp: add get_link_status
  3 files changed, 43 insertions(+)
 
 diff --git a/net/net.c b/net/net.c
-index 2a3133990c..cd9178d6c9 100644
+index f8275843fb..8c8e100afa 100644
 --- a/net/net.c
 +++ b/net/net.c
-@@ -1331,6 +1331,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
+@@ -1342,6 +1342,33 @@ void hmp_info_network(Monitor *mon, const QDict *qdict)
      }
  }
  
diff --git a/debian/patches/pve/0011-PVE-Up-qemu-img-return-success-on-info-without-snaps.patch b/debian/patches/pve/0011-PVE-Up-qemu-img-return-success-on-info-without-snaps.patch
index 26cc868..f2232eb 100644
--- a/debian/patches/pve/0011-PVE-Up-qemu-img-return-success-on-info-without-snaps.patch
+++ b/debian/patches/pve/0011-PVE-Up-qemu-img-return-success-on-info-without-snaps.patch
@@ -8,10 +8,10 @@ Subject: [PATCH] PVE: [Up] qemu-img: return success on info without snapshots
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-img.c b/qemu-img.c
-index 1acddf693c..4438e0c2c9 100644
+index 4799e097dc..789217cd35 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
-@@ -2720,7 +2720,8 @@ static int img_info(int argc, char **argv)
+@@ -2719,7 +2719,8 @@ static int img_info(int argc, char **argv)
      list = collect_image_info_list(image_opts, filename, fmt, chain,
                                     force_share);
      if (!list) {
diff --git a/debian/patches/pve/0012-PVE-Up-qemu-img-dd-add-osize-and-read-from-to-stdin-.patch b/debian/patches/pve/0012-PVE-Up-qemu-img-dd-add-osize-and-read-from-to-stdin-.patch
index d545253..470cb1b 100644
--- a/debian/patches/pve/0012-PVE-Up-qemu-img-dd-add-osize-and-read-from-to-stdin-.patch
+++ b/debian/patches/pve/0012-PVE-Up-qemu-img-dd-add-osize-and-read-from-to-stdin-.patch
@@ -52,10 +52,10 @@ index 1526f327a5..0ea4b6ffb2 100644
  
  DEF("info", img_info,
 diff --git a/qemu-img.c b/qemu-img.c
-index 4438e0c2c9..f46eefce4f 100644
+index 789217cd35..f459dd8345 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
-@@ -4302,10 +4302,12 @@ out:
+@@ -4301,10 +4301,12 @@ out:
  #define C_IF 04
  #define C_OF 010
  #define C_SKIP 020
@@ -68,7 +68,7 @@ index 4438e0c2c9..f46eefce4f 100644
  };
  
  struct DdIo {
-@@ -4384,6 +4386,20 @@ static int img_dd_skip(const char *arg,
+@@ -4383,6 +4385,20 @@ static int img_dd_skip(const char *arg,
      return 0;
  }
  
@@ -89,7 +89,7 @@ index 4438e0c2c9..f46eefce4f 100644
  static int img_dd(int argc, char **argv)
  {
      int ret = 0;
-@@ -4424,6 +4440,7 @@ static int img_dd(int argc, char **argv)
+@@ -4423,6 +4439,7 @@ static int img_dd(int argc, char **argv)
          { "if", img_dd_if, C_IF },
          { "of", img_dd_of, C_OF },
          { "skip", img_dd_skip, C_SKIP },
@@ -97,7 +97,7 @@ index 4438e0c2c9..f46eefce4f 100644
          { NULL, NULL, 0 }
      };
      const struct option long_options[] = {
-@@ -4502,8 +4519,13 @@ static int img_dd(int argc, char **argv)
+@@ -4501,8 +4518,13 @@ static int img_dd(int argc, char **argv)
          arg = NULL;
      }
  
@@ -113,7 +113,7 @@ index 4438e0c2c9..f46eefce4f 100644
          ret = -1;
          goto out;
      }
-@@ -4515,85 +4537,101 @@ static int img_dd(int argc, char **argv)
+@@ -4514,85 +4536,101 @@ static int img_dd(int argc, char **argv)
          goto out;
      }
  
@@ -279,7 +279,7 @@ index 4438e0c2c9..f46eefce4f 100644
      }
  
      if (dd.flags & C_SKIP && (in.offset > INT64_MAX / in.bsz ||
-@@ -4611,11 +4649,17 @@ static int img_dd(int argc, char **argv)
+@@ -4610,11 +4648,17 @@ static int img_dd(int argc, char **argv)
      for (out_pos = 0; in_pos < size; block_count++) {
          int in_ret, out_ret;
  
@@ -301,7 +301,7 @@ index 4438e0c2c9..f46eefce4f 100644
          }
          if (in_ret < 0) {
              error_report("error while reading from input image file: %s",
-@@ -4625,9 +4669,13 @@ static int img_dd(int argc, char **argv)
+@@ -4624,9 +4668,13 @@ static int img_dd(int argc, char **argv)
          }
          in_pos += in_ret;
  
diff --git a/debian/patches/pve/0013-PVE-Up-qemu-img-dd-add-isize-parameter.patch b/debian/patches/pve/0013-PVE-Up-qemu-img-dd-add-isize-parameter.patch
index ec46951..3a751d8 100644
--- a/debian/patches/pve/0013-PVE-Up-qemu-img-dd-add-isize-parameter.patch
+++ b/debian/patches/pve/0013-PVE-Up-qemu-img-dd-add-isize-parameter.patch
@@ -14,10 +14,10 @@ Signed-off-by: Wolfgang Bumiller
  1 file changed, 26 insertions(+), 3 deletions(-)
 
 diff --git a/qemu-img.c b/qemu-img.c
-index f46eefce4f..ec546846c6 100644
+index f459dd8345..1f623f5bba 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
-@@ -4303,11 +4303,13 @@ out:
+@@ -4302,11 +4302,13 @@ out:
  #define C_OF 010
  #define C_SKIP 020
  #define C_OSIZE 040
@@ -31,7 +31,7 @@ index f46eefce4f..ec546846c6 100644
  };
  
  struct DdIo {
-@@ -4400,6 +4402,20 @@ static int img_dd_osize(const char *arg,
+@@ -4399,6 +4401,20 @@ static int img_dd_osize(const char *arg,
      return 0;
  }
  
@@ -52,7 +52,7 @@ index f46eefce4f..ec546846c6 100644
  static int img_dd(int argc, char **argv)
  {
      int ret = 0;
-@@ -4414,12 +4430,14 @@ static int img_dd(int argc, char **argv)
+@@ -4413,12 +4429,14 @@ static int img_dd(int argc, char **argv)
      int c, i;
      const char *out_fmt = "raw";
      const char *fmt = NULL;
@@ -68,7 +68,7 @@ index f46eefce4f..ec546846c6 100644
      };
      struct DdIo in = {
          .bsz = 512, /* Block size is by default 512 bytes */
-@@ -4441,6 +4459,7 @@ static int img_dd(int argc, char **argv)
+@@ -4440,6 +4458,7 @@ static int img_dd(int argc, char **argv)
          { "of", img_dd_of, C_OF },
          { "skip", img_dd_skip, C_SKIP },
          { "osize", img_dd_osize, C_OSIZE },
@@ -76,7 +76,7 @@ index f46eefce4f..ec546846c6 100644
          { NULL, NULL, 0 }
      };
      const struct option long_options[] = {
-@@ -4647,14 +4666,18 @@ static int img_dd(int argc, char **argv)
+@@ -4646,14 +4665,18 @@ static int img_dd(int argc, char **argv)
  
      in.buf = g_new(uint8_t, in.bsz);
  
diff --git a/debian/patches/pve/0014-PVE-Up-qemu-img-dd-add-n-skip_create.patch b/debian/patches/pve/0014-PVE-Up-qemu-img-dd-add-n-skip_create.patch
index d03c64b..531d22c 100644
--- a/debian/patches/pve/0014-PVE-Up-qemu-img-dd-add-n-skip_create.patch
+++ b/debian/patches/pve/0014-PVE-Up-qemu-img-dd-add-n-skip_create.patch
@@ -8,10 +8,10 @@ Subject: [PATCH] PVE: [Up] qemu-img dd : add -n skip_create
  1 file changed, 14 insertions(+), 9 deletions(-)
 
 diff --git a/qemu-img.c b/qemu-img.c
-index ec546846c6..afa6e26ccf 100644
+index 1f623f5bba..5d9322db33 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
-@@ -4432,7 +4432,7 @@ static int img_dd(int argc, char **argv)
+@@ -4431,7 +4431,7 @@ static int img_dd(int argc, char **argv)
      const char *fmt = NULL;
      int64_t size = 0, readsize = 0;
      int64_t block_count = 0, out_pos, in_pos;
@@ -20,7 +20,7 @@ index ec546846c6..afa6e26ccf 100644
      struct DdInfo dd = {
          .flags = 0,
          .count = 0,
-@@ -4470,7 +4470,7 @@ static int img_dd(int argc, char **argv)
+@@ -4469,7 +4469,7 @@ static int img_dd(int argc, char **argv)
          { 0, 0, 0, 0 }
      };
  
@@ -29,7 +29,7 @@ index ec546846c6..afa6e26ccf 100644
          if (c == EOF) {
              break;
          }
-@@ -4490,6 +4490,9 @@ static int img_dd(int argc, char **argv)
+@@ -4489,6 +4489,9 @@ static int img_dd(int argc, char **argv)
          case 'h':
              help();
              break;
@@ -39,7 +39,7 @@ index ec546846c6..afa6e26ccf 100644
          case 'U':
              force_share = true;
              break;
-@@ -4630,13 +4633,15 @@ static int img_dd(int argc, char **argv)
+@@ -4629,13 +4632,15 @@ static int img_dd(int argc, char **argv)
                                   size - in.bsz * in.offset, &error_abort);
      }
  
diff --git a/debian/patches/pve/0016-PVE-qapi-modify-query-machines.patch b/debian/patches/pve/0016-PVE-qapi-modify-query-machines.patch
index e0f7f6a..4abfa8b 100644
--- a/debian/patches/pve/0016-PVE-qapi-modify-query-machines.patch
+++ b/debian/patches/pve/0016-PVE-qapi-modify-query-machines.patch
@@ -32,7 +32,7 @@ index a7d890c076..4e8ebf9adc 100644
  
  ##
 diff --git a/vl.c b/vl.c
-index 16b913f9d5..c750b7c18e 100644
+index 12d27fa028..9c3a41bfe2 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -1455,6 +1455,11 @@ MachineInfoList *qmp_query_machines(Error **errp)
diff --git a/debian/patches/pve/0018-PVE-internal-snapshot-async.patch b/debian/patches/pve/0018-PVE-internal-snapshot-async.patch
index 987ee25..7347a8b 100644
--- a/debian/patches/pve/0018-PVE-internal-snapshot-async.patch
+++ b/debian/patches/pve/0018-PVE-internal-snapshot-async.patch
@@ -7,15 +7,15 @@ Subject: [PATCH] PVE: internal snapshot async
  Makefile.objs | 1 +
  hmp-commands-info.hx | 13 ++
  hmp-commands.hx | 32 +++
- hmp.c | 57 +++++
+ hmp.c | 57 ++++++
  hmp.h | 5 +
  include/migration/snapshot.h | 1 +
- qapi/migration.json | 34 +++
+ qapi/migration.json | 34 ++++
  qapi/misc.json | 32 +++
  qemu-options.hx | 13 ++
- savevm-async.c | 528 +++++++++++++++++++++++++++++++++++++++++++
+ savevm-async.c | 460 +++++++++++++++++++++++++++++++++++++++++++
  vl.c | 10 +
- 11 files changed, 726 insertions(+)
+ 11 files changed, 658 insertions(+)
  create mode 100644 savevm-async.c
 
 diff --git a/Makefile.objs b/Makefile.objs
@@ -310,10 +310,10 @@ index b1bf0f485f..31329e26e2 100644
      "-daemonize daemonize QEMU after initializing\n", QEMU_ARCH_ALL)
 diff --git a/savevm-async.c b/savevm-async.c
 new file mode 100644
-index 0000000000..0bf830c906
+index 0000000000..73b7fe75ed
 --- /dev/null
 +++ b/savevm-async.c
-@@ -0,0 +1,528 @@
+@@ -0,0 +1,460 @@
 +#include "qemu/osdep.h"
 +#include "migration/migration.h"
 +#include "migration/savevm.h"
@@ -321,9 +321,7 @@ index 0000000000..0bf830c906
 +#include "migration/global_state.h"
 +#include "migration/ram.h"
 +#include "migration/qemu-file.h"
-+#include "qapi/qmp/qerror.h"
 +#include "sysemu/sysemu.h"
-+#include "qmp-commands.h"
 +#include "block/block.h"
 +#include "sysemu/block-backend.h"
 +#include "qapi/error.h"
@@ -331,11 +329,13 @@ index 0000000000..0bf830c906
 +#include "qapi/qmp/qdict.h"
 +#include "qapi/qapi-commands-migration.h"
 +#include "qapi/qapi-commands-misc.h"
++#include "qapi/qapi-commands-block.h"
 +#include "qemu/cutils.h"
 +
 +/* #define DEBUG_SAVEVM_STATE */
 +
-+#define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
++/* used while emulated sync operation in progress */
++#define NOT_DONE -EINPROGRESS
 +
 +#ifdef DEBUG_SAVEVM_STATE
 +#define DPRINTF(fmt, ...) \
@@ -363,6 +363,8 @@ index 0000000000..0bf830c906
 +    int saved_vm_running;
 +    QEMUFile *file;
 +    int64_t total_time;
++    QEMUBH *cleanup_bh;
++    QemuThread thread;
 +} snap_state;
 +
 +SaveVMInfo *qmp_query_savevm(Error **errp)
@@ -450,19 +452,6 @@ index 0000000000..0bf830c906
 +    g_free (msg);
 +
 +    snap_state.state = SAVE_STATE_ERROR;
-+
-+    save_snapshot_cleanup();
-+}
-+
-+static void save_snapshot_completed(void)
-+{
-+    DPRINTF("save_snapshot_completed\n");
-+
-+    if (save_snapshot_cleanup() < 0) {
-+        snap_state.state = SAVE_STATE_ERROR;
-+    } else {
-+        snap_state.state = SAVE_STATE_COMPLETED;
-+    }
 +}
 +
 +static int block_state_close(void *opaque)
@@ -471,67 +460,123 @@ index 0000000000..0bf830c906
 +    return blk_flush(snap_state.target);
 +}
 +
++typedef struct BlkRwCo {
++    int64_t offset;
++    QEMUIOVector *qiov;
++    ssize_t ret;
++} BlkRwCo;
++
++static void coroutine_fn block_state_write_entry(void *opaque) {
++    BlkRwCo *rwco = opaque;
++    rwco->ret = blk_co_pwritev(snap_state.target, rwco->offset, rwco->qiov->size,
++                               rwco->qiov, 0);
++}
++
 +static ssize_t block_state_writev_buffer(void *opaque, struct iovec *iov,
 +                                         int iovcnt, int64_t pos)
 +{
-+    int ret;
 +    QEMUIOVector qiov;
++    BlkRwCo rwco;
++
++    assert(pos == snap_state.bs_pos);
++    rwco = (BlkRwCo) {
++        .offset = pos,
++        .qiov = &qiov,
++        .ret = NOT_DONE,
++    };
 +
 +    qemu_iovec_init_external(&qiov, iov, iovcnt);
-+    ret = blk_co_pwritev(snap_state.target, pos, qiov.size, &qiov, 0);
-+    if (ret < 0) {
-+        return ret;
++
++    if (qemu_in_coroutine()) {
++        block_state_write_entry(&rwco);
++    } else {
++        Coroutine *co = qemu_coroutine_create(&block_state_write_entry, &rwco);
++        bdrv_coroutine_enter(blk_bs(snap_state.target), co);
++        BDRV_POLL_WHILE(blk_bs(snap_state.target), rwco.ret == NOT_DONE);
 +    }
++    if (rwco.ret < 0) {
++        return rwco.ret;
++    }
++
 +    snap_state.bs_pos += qiov.size;
 +    return qiov.size;
 +}
 +
-+static int store_and_stop(void) {
-+    if (global_state_store()) {
-+        save_snapshot_error("Error saving global state");
-+        return 1;
++static const QEMUFileOps block_file_ops = {
++    .writev_buffer = block_state_writev_buffer,
++    .close = block_state_close,
++};
++
++static void process_savevm_cleanup(void *opaque)
++{
++    int ret;
++    qemu_bh_delete(snap_state.cleanup_bh);
++    snap_state.cleanup_bh = NULL;
++    qemu_mutex_unlock_iothread();
++    qemu_thread_join(&snap_state.thread);
++    qemu_mutex_lock_iothread();
++    ret = save_snapshot_cleanup();
++    if (ret < 0) {
++        save_snapshot_error("save_snapshot_cleanup error %d", ret);
++    } else if (snap_state.state == SAVE_STATE_ACTIVE) {
++        snap_state.state = SAVE_STATE_COMPLETED;
++    } else {
++        save_snapshot_error("process_savevm_cleanup: invalid state: %d",
++                            snap_state.state);
 +    }
-+    if (runstate_is_running()) {
-+        vm_stop(RUN_STATE_SAVE_VM);
++    if (snap_state.saved_vm_running) {
++        vm_start();
++        snap_state.saved_vm_running = false;
 +    }
-+    return 0;
 +}
 +
-+static void process_savevm_co(void *opaque)
++static void *process_savevm_thread(void *opaque)
 +{
 +    int ret;
 +    int64_t maxlen;
 +
-+    snap_state.state = SAVE_STATE_ACTIVE;
++    rcu_register_thread();
 +
-+    qemu_mutex_unlock_iothread();
 +    qemu_savevm_state_header(snap_state.file);
 +    qemu_savevm_state_setup(snap_state.file);
 +    ret = qemu_file_get_error(snap_state.file);
-+    qemu_mutex_lock_iothread();
 +
 +    if (ret < 0) {
 +        save_snapshot_error("qemu_savevm_state_setup failed");
-+        return;
++        rcu_unregister_thread();
++        return NULL;
 +    }
 +
 +    while (snap_state.state == SAVE_STATE_ACTIVE) {
-+        uint64_t pending_size, pend_post, pend_nonpost;
++        uint64_t pending_size, pend_precopy, pend_compatible, pend_postcopy;
 +
-+        qemu_savevm_state_pending(snap_state.file, 0, &pend_nonpost, &pend_post);
-+        pending_size = pend_post + pend_nonpost;
++        qemu_savevm_state_pending(snap_state.file, 0, &pend_precopy, &pend_compatible, &pend_postcopy);
++        pending_size = pend_precopy + pend_compatible + pend_postcopy;
 +
-+        if (pending_size) {
-+            ret = qemu_savevm_state_iterate(snap_state.file, false);
-+            if (ret < 0) {
-+                save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
-+                break;
-+            }
-+            DPRINTF("savevm inerate pending size %lu ret %d\n", pending_size, ret);
++        maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
++
++        if (pending_size > 400000 && snap_state.bs_pos + pending_size < maxlen) {
++            qemu_mutex_lock_iothread();
++            ret = qemu_savevm_state_iterate(snap_state.file, false);
++            if (ret < 0) {
++                save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
++                break;
++            }
++            qemu_mutex_unlock_iothread();
++            DPRINTF("savevm inerate pending size %lu ret %d\n", pending_size, ret);
 +        } else {
-+            DPRINTF("done iterating\n");
-+            if (store_and_stop())
++            qemu_mutex_lock_iothread();
++            qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
++            ret = global_state_store();
++            if (ret) {
++                save_snapshot_error("global_state_store error %d", ret);
 +                break;
++            }
++            ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
++            if (ret < 0) {
++                save_snapshot_error("vm_stop_force_state error %d", ret);
++                break;
++            }
 +            DPRINTF("savevm inerate finished\n");
 +            /* upstream made the return value here inconsistent
 +             * (-1 instead of 'ret' in one case and 0 after flush which can
@@ -543,36 +588,19 @@ index 0000000000..0bf830c906
 +                save_snapshot_error("qemu_savevm_state_iterate error %d", ret);
 +                break;
 +            }
++            qemu_savevm_state_cleanup();
 +            DPRINTF("save complete\n");
-+            save_snapshot_completed();
 +            break;
 +        }
-+
-+        /* stop the VM if we get to the end of available space,
-+         * or if pending_size is just a few MB
-+         */
-+        maxlen = blk_getlength(snap_state.target) - 30*1024*1024;
-+        if ((pending_size < 100000) ||
-+            ((snap_state.bs_pos + pending_size) >= maxlen)) {
-+            if (store_and_stop())
-+                break;
-+        }
 +    }
 +
-+    if(snap_state.state == SAVE_STATE_CANCELLED) {
-+
save_snapshot_completed(); -+ Error *errp = NULL; -+ qmp_savevm_end(&errp); -+ } ++ qemu_bh_schedule(snap_state.cleanup_bh); ++ qemu_mutex_unlock_iothread(); + ++ rcu_unregister_thread(); ++ return NULL; +} + -+static const QEMUFileOps block_file_ops = { -+ .writev_buffer = block_state_writev_buffer, -+ .close = block_state_close, -+}; -+ -+ +void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp) +{ + Error *local_err = NULL; @@ -627,8 +655,10 @@ index 0000000000..0bf830c906 + error_setg(&snap_state.blocker, "block device is in use by savevm"); + blk_op_block_all(snap_state.target, snap_state.blocker); + -+ Coroutine *co = qemu_coroutine_create(process_savevm_co, NULL); -+ qemu_coroutine_enter(co); ++ snap_state.state = SAVE_STATE_ACTIVE; ++ snap_state.cleanup_bh = qemu_bh_new(process_savevm_cleanup, &snap_state); ++ qemu_thread_create(&snap_state.thread, "savevm-async", process_savevm_thread, ++ NULL, QEMU_THREAD_JOINABLE); + + return; + @@ -661,118 +691,20 @@ index 0000000000..0bf830c906 + snap_state.state = SAVE_STATE_DONE; +} + ++// FIXME: Deprecated +void qmp_snapshot_drive(const char *device, const char *name, Error **errp) +{ -+ BlockBackend *blk; -+ BlockDriverState *bs; -+ QEMUSnapshotInfo sn1, *sn = &sn1; -+ int ret; -+#ifdef _WIN32 -+ struct _timeb tb; -+#else -+ struct timeval tv; -+#endif -+ -+ if (snap_state.state != SAVE_STATE_COMPLETED) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "VM snapshot not ready/started\n"); -+ return; -+ } -+ -+ blk = blk_by_name(device); -+ if (!blk) { -+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, -+ "Device '%s' not found", device); -+ return; -+ } -+ -+ bs = blk_bs(blk); -+ if (!bdrv_is_inserted(bs)) { -+ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); -+ return; -+ } -+ -+ if (bdrv_is_read_only(bs)) { -+ error_setg(errp, "Node '%s' is read only", device); -+ return; -+ } -+ -+ if (!bdrv_can_snapshot(bs)) { -+ error_setg(errp, QERR_UNSUPPORTED); -+ return; -+ } -+ -+ if 
(bdrv_snapshot_find(bs, sn, name) >= 0) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "snapshot '%s' already exists", name); -+ return; -+ } -+ -+ sn = &sn1; -+ memset(sn, 0, sizeof(*sn)); -+ -+#ifdef _WIN32 -+ _ftime(&tb); -+ sn->date_sec = tb.time; -+ sn->date_nsec = tb.millitm * 1000000; -+#else -+ gettimeofday(&tv, NULL); -+ sn->date_sec = tv.tv_sec; -+ sn->date_nsec = tv.tv_usec * 1000; -+#endif -+ sn->vm_clock_nsec = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL); -+ -+ pstrcpy(sn->name, sizeof(sn->name), name); -+ -+ sn->vm_state_size = 0; /* do not save state */ -+ -+ ret = bdrv_snapshot_create(bs, sn); -+ if (ret < 0) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "Error while creating snapshot on '%s'\n", device); -+ return; -+ } ++ // Compatibility to older qemu-server. ++ qmp_blockdev_snapshot_internal_sync(device, name, errp); +} + ++// FIXME: Deprecated +void qmp_delete_drive_snapshot(const char *device, const char *name, + Error **errp) +{ -+ BlockBackend *blk; -+ BlockDriverState *bs; -+ QEMUSnapshotInfo sn1, *sn = &sn1; -+ Error *local_err = NULL; -+ -+ int ret; -+ -+ blk = blk_by_name(device); -+ if (!blk) { -+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, -+ "Device '%s' not found", device); -+ return; -+ } -+ -+ bs = blk_bs(blk); -+ if (bdrv_is_read_only(bs)) { -+ error_setg(errp, "Node '%s' is read only", device); -+ return; -+ } -+ -+ if (!bdrv_can_snapshot(bs)) { -+ error_setg(errp, QERR_UNSUPPORTED); -+ return; -+ } -+ -+ if (bdrv_snapshot_find(bs, sn, name) < 0) { -+ /* return success if snapshot does not exists */ -+ return; -+ } -+ -+ ret = bdrv_snapshot_delete(bs, NULL, name, &local_err); -+ if (ret < 0) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "Error while deleting snapshot on '%s'\n", device); -+ return; -+ } ++ // Compatibility to older qemu-server. 
++ (void)qmp_blockdev_snapshot_delete_internal_sync(device, false, NULL, ++ true, name, errp); +} + +static ssize_t loadstate_get_buffer(void *opaque, uint8_t *buf, int64_t pos, @@ -843,7 +775,7 @@ index 0000000000..0bf830c906 + return ret; +} diff --git a/vl.c b/vl.c -index c750b7c18e..b2e3e23724 100644 +index 9c3a41bfe2..63107d82a3 100644 --- a/vl.c +++ b/vl.c @@ -2927,6 +2927,7 @@ int main(int argc, char **argv, char **envp) @@ -854,7 +786,7 @@ index c750b7c18e..b2e3e23724 100644 MachineClass *machine_class; const char *cpu_model; const char *vga_model = NULL; -@@ -3528,6 +3529,9 @@ int main(int argc, char **argv, char **envp) +@@ -3529,6 +3530,9 @@ int main(int argc, char **argv, char **envp) case QEMU_OPTION_loadvm: loadvm = optarg; break; @@ -864,7 +796,7 @@ index c750b7c18e..b2e3e23724 100644 case QEMU_OPTION_full_screen: dpy.has_full_screen = true; dpy.full_screen = true; -@@ -4623,6 +4627,12 @@ int main(int argc, char **argv, char **envp) +@@ -4624,6 +4628,12 @@ int main(int argc, char **argv, char **envp) error_report_err(local_err); autostart = 0; } diff --git a/debian/patches/pve/0019-PVE-block-add-the-zeroinit-block-driver-filter.patch b/debian/patches/pve/0019-PVE-block-add-the-zeroinit-block-driver-filter.patch new file mode 100644 index 0000000..dd2fc49 --- /dev/null +++ b/debian/patches/pve/0019-PVE-block-add-the-zeroinit-block-driver-filter.patch @@ -0,0 +1,235 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Thu, 17 Mar 2016 11:33:37 +0100 +Subject: [PATCH] PVE: block: add the zeroinit block driver filter + +--- + block/Makefile.objs | 1 + + block/zeroinit.c | 203 ++++++++++++++++++++++++++++++++++++++++++++++++++++ + 2 files changed, 204 insertions(+) + create mode 100644 block/zeroinit.c + +diff --git a/block/Makefile.objs b/block/Makefile.objs +index c8337bf186..c00f0b32d6 100644 +--- a/block/Makefile.objs ++++ b/block/Makefile.objs +@@ -4,6 +4,7 @@ block-obj-y += qed.o qed-l2-cache.o 
qed-table.o qed-cluster.o + block-obj-y += qed-check.o + block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o + block-obj-y += quorum.o ++block-obj-y += zeroinit.o + block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o + block-obj-y += blklogwrites.o + block-obj-y += block-backend.o snapshot.o qapi.o +diff --git a/block/zeroinit.c b/block/zeroinit.c +new file mode 100644 +index 0000000000..64c49ad0e0 +--- /dev/null ++++ b/block/zeroinit.c +@@ -0,0 +1,203 @@ ++/* ++ * Filter to fake a zero-initialized block device. ++ * ++ * Copyright (c) 2016 Wolfgang Bumiller ++ * Copyright (c) 2016 Proxmox Server Solutions GmbH ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. ++ */ ++ ++#include "qemu/osdep.h" ++#include "qapi/error.h" ++#include "block/block_int.h" ++#include "qapi/qmp/qdict.h" ++#include "qapi/qmp/qstring.h" ++#include "qemu/cutils.h" ++#include "qemu/option.h" ++ ++typedef struct { ++ bool has_zero_init; ++ int64_t extents; ++} BDRVZeroinitState; ++ ++/* Valid blkverify filenames look like blkverify:path/to/raw_image:path/to/image */ ++static void zeroinit_parse_filename(const char *filename, QDict *options, ++ Error **errp) ++{ ++ QString *raw_path; ++ ++ /* Parse the blkverify: prefix */ ++ if (!strstart(filename, "zeroinit:", &filename)) { ++ /* There was no prefix; therefore, all options have to be already ++ present in the QDict (except for the filename) */ ++ return; ++ } ++ ++ raw_path = qstring_from_str(filename); ++ qdict_put(options, "x-next", raw_path); ++} ++ ++static QemuOptsList runtime_opts = { ++ .name = "zeroinit", ++ .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), ++ .desc = { ++ { ++ .name = "x-next", ++ .type = QEMU_OPT_STRING, ++ .help = "[internal use only, will be removed]", ++ }, ++ { ++ .name = "x-zeroinit", ++ .type = QEMU_OPT_BOOL, ++ .help = "set has_initialized_zero flag", ++ }, ++ { /* end of list */ } ++ }, ++}; ++ ++static int 
zeroinit_open(BlockDriverState *bs, QDict *options, int flags, ++ Error **errp) ++{ ++ BDRVZeroinitState *s = bs->opaque; ++ QemuOpts *opts; ++ Error *local_err = NULL; ++ int ret; ++ ++ s->extents = 0; ++ ++ opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort); ++ qemu_opts_absorb_qdict(opts, options, &local_err); ++ if (local_err) { ++ error_propagate(errp, local_err); ++ ret = -EINVAL; ++ goto fail; ++ } ++ ++ /* Open the raw file */ ++ bs->file = bdrv_open_child(qemu_opt_get(opts, "x-next"), options, "next", ++ bs, &child_file, false, &local_err); ++ if (local_err) { ++ ret = -EINVAL; ++ error_propagate(errp, local_err); ++ goto fail; ++ } ++ ++ /* set the options */ ++ s->has_zero_init = qemu_opt_get_bool(opts, "x-zeroinit", true); ++ ++ ret = 0; ++fail: ++ if (ret < 0) { ++ bdrv_unref_child(bs, bs->file); ++ } ++ qemu_opts_del(opts); ++ return ret; ++} ++ ++static void zeroinit_close(BlockDriverState *bs) ++{ ++ BDRVZeroinitState *s = bs->opaque; ++ (void)s; ++} ++ ++static int64_t zeroinit_getlength(BlockDriverState *bs) ++{ ++ return bdrv_getlength(bs->file->bs); ++} ++ ++static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs, ++ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) ++{ ++ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); ++} ++ ++static int coroutine_fn zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, ++ int count, BdrvRequestFlags flags) ++{ ++ BDRVZeroinitState *s = bs->opaque; ++ if (offset >= s->extents) ++ return 0; ++ return bdrv_pwrite_zeroes(bs->file, offset, count, flags); ++} ++ ++static int coroutine_fn zeroinit_co_pwritev(BlockDriverState *bs, ++ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) ++{ ++ BDRVZeroinitState *s = bs->opaque; ++ int64_t extents = offset + bytes; ++ if (extents > s->extents) ++ s->extents = extents; ++ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags); ++} ++ ++static bool 
zeroinit_recurse_is_first_non_filter(BlockDriverState *bs, ++ BlockDriverState *candidate) ++{ ++ return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate); ++} ++ ++static coroutine_fn int zeroinit_co_flush(BlockDriverState *bs) ++{ ++ return bdrv_co_flush(bs->file->bs); ++} ++ ++static int zeroinit_has_zero_init(BlockDriverState *bs) ++{ ++ BDRVZeroinitState *s = bs->opaque; ++ return s->has_zero_init; ++} ++ ++static int coroutine_fn zeroinit_co_pdiscard(BlockDriverState *bs, ++ int64_t offset, int count) ++{ ++ return bdrv_co_pdiscard(bs->file, offset, count); ++} ++ ++static int zeroinit_co_truncate(BlockDriverState *bs, int64_t offset, ++ PreallocMode prealloc, Error **errp) ++{ ++ return bdrv_co_truncate(bs->file, offset, prealloc, errp); ++} ++ ++static int zeroinit_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) ++{ ++ return bdrv_get_info(bs->file->bs, bdi); ++} ++ ++static BlockDriver bdrv_zeroinit = { ++ .format_name = "zeroinit", ++ .protocol_name = "zeroinit", ++ .instance_size = sizeof(BDRVZeroinitState), ++ ++ .bdrv_parse_filename = zeroinit_parse_filename, ++ .bdrv_file_open = zeroinit_open, ++ .bdrv_close = zeroinit_close, ++ .bdrv_getlength = zeroinit_getlength, ++ .bdrv_child_perm = bdrv_filter_default_perms, ++ .bdrv_co_flush_to_disk = zeroinit_co_flush, ++ ++ .bdrv_co_pwrite_zeroes = zeroinit_co_pwrite_zeroes, ++ .bdrv_co_pwritev = zeroinit_co_pwritev, ++ .bdrv_co_preadv = zeroinit_co_preadv, ++ .bdrv_co_flush = zeroinit_co_flush, ++ ++ .is_filter = true, ++ .bdrv_recurse_is_first_non_filter = zeroinit_recurse_is_first_non_filter, ++ ++ .bdrv_has_zero_init = zeroinit_has_zero_init, ++ ++ .bdrv_co_block_status = bdrv_co_block_status_from_file, ++ ++ .bdrv_co_pdiscard = zeroinit_co_pdiscard, ++ ++ .bdrv_co_truncate = zeroinit_co_truncate, ++ .bdrv_get_info = zeroinit_get_info, ++}; ++ ++static void bdrv_zeroinit_init(void) ++{ ++ bdrv_register(&bdrv_zeroinit); ++} ++ ++block_init(bdrv_zeroinit_init); +-- +2.11.0 + diff --git 
a/debian/patches/pve/0019-PVE-convert-savevm-async-to-threads.patch b/debian/patches/pve/0019-PVE-convert-savevm-async-to-threads.patch deleted file mode 100644 index d4701a2..0000000 --- a/debian/patches/pve/0019-PVE-convert-savevm-async-to-threads.patch +++ /dev/null @@ -1,249 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Tue, 8 Nov 2016 11:13:06 +0100 -Subject: [PATCH] PVE: convert savevm-async to threads - ---- - savevm-async.c | 151 ++++++++++++++++++++++++++++++++++----------------------- - 1 file changed, 90 insertions(+), 61 deletions(-) - -diff --git a/savevm-async.c b/savevm-async.c -index 0bf830c906..157eb7a50d 100644 ---- a/savevm-async.c -+++ b/savevm-async.c -@@ -5,9 +5,7 @@ - #include "migration/global_state.h" - #include "migration/ram.h" - #include "migration/qemu-file.h" --#include "qapi/qmp/qerror.h" - #include "sysemu/sysemu.h" --#include "qmp-commands.h" - #include "block/block.h" - #include "sysemu/block-backend.h" - #include "qapi/error.h" -@@ -47,6 +45,8 @@ static struct SnapshotState { - int saved_vm_running; - QEMUFile *file; - int64_t total_time; -+ QEMUBH *cleanup_bh; -+ QemuThread thread; - } snap_state; - - SaveVMInfo *qmp_query_savevm(Error **errp) -@@ -134,19 +134,6 @@ static void save_snapshot_error(const char *fmt, ...) 
- g_free (msg); - - snap_state.state = SAVE_STATE_ERROR; -- -- save_snapshot_cleanup(); --} -- --static void save_snapshot_completed(void) --{ -- DPRINTF("save_snapshot_completed\n"); -- -- if (save_snapshot_cleanup() < 0) { -- snap_state.state = SAVE_STATE_ERROR; -- } else { -- snap_state.state = SAVE_STATE_COMPLETED; -- } - } - - static int block_state_close(void *opaque) -@@ -155,67 +142,118 @@ static int block_state_close(void *opaque) - return blk_flush(snap_state.target); - } - -+typedef struct BlkRwCo { -+ int64_t offset; -+ QEMUIOVector *qiov; -+ int ret; -+} BlkRwCo; -+ -+static void block_state_write_entry(void *opaque) { -+ BlkRwCo *rwco = opaque; -+ rwco->ret = blk_co_pwritev(snap_state.target, rwco->offset, rwco->qiov->size, -+ rwco->qiov, 0); -+} -+ - static ssize_t block_state_writev_buffer(void *opaque, struct iovec *iov, - int iovcnt, int64_t pos) - { -- int ret; - QEMUIOVector qiov; -+ AioContext *aio_context; -+ Coroutine *co; -+ BlkRwCo rwco; -+ -+ assert(pos == snap_state.bs_pos); -+ rwco = (BlkRwCo) { -+ .offset = pos, -+ .qiov = &qiov, -+ .ret = NOT_DONE, -+ }; - - qemu_iovec_init_external(&qiov, iov, iovcnt); -- ret = blk_co_pwritev(snap_state.target, pos, qiov.size, &qiov, 0); -- if (ret < 0) { -- return ret; -+ -+ aio_context = blk_get_aio_context(snap_state.target); -+ aio_context_acquire(aio_context); -+ co = qemu_coroutine_create(&block_state_write_entry, &rwco); -+ qemu_coroutine_enter(co); -+ while (rwco.ret == NOT_DONE) { -+ aio_poll(aio_context, true); - } -+ aio_context_release(aio_context); -+ - snap_state.bs_pos += qiov.size; - return qiov.size; - } - --static int store_and_stop(void) { -- if (global_state_store()) { -- save_snapshot_error("Error saving global state"); -- return 1; -+static void process_savevm_cleanup(void *opaque) -+{ -+ int ret; -+ qemu_bh_delete(snap_state.cleanup_bh); -+ snap_state.cleanup_bh = NULL; -+ qemu_mutex_unlock_iothread(); -+ qemu_thread_join(&snap_state.thread); -+ qemu_mutex_lock_iothread(); -+ 
ret = save_snapshot_cleanup(); -+ if (ret < 0) { -+ save_snapshot_error("save_snapshot_cleanup error %d", ret); -+ } else if (snap_state.state == SAVE_STATE_ACTIVE) { -+ snap_state.state = SAVE_STATE_COMPLETED; -+ } else { -+ save_snapshot_error("process_savevm_cleanup: invalid state: %d", -+ snap_state.state); - } -- if (runstate_is_running()) { -- vm_stop(RUN_STATE_SAVE_VM); -+ if (snap_state.saved_vm_running) { -+ vm_start(); -+ snap_state.saved_vm_running = false; - } -- return 0; - } - --static void process_savevm_co(void *opaque) -+static void *process_savevm_thread(void *opaque) - { - int ret; - int64_t maxlen; - -- snap_state.state = SAVE_STATE_ACTIVE; -+ rcu_register_thread(); - -- qemu_mutex_unlock_iothread(); - qemu_savevm_state_header(snap_state.file); - qemu_savevm_state_setup(snap_state.file); - ret = qemu_file_get_error(snap_state.file); -- qemu_mutex_lock_iothread(); - - if (ret < 0) { - save_snapshot_error("qemu_savevm_state_setup failed"); -- return; -+ rcu_unregister_thread(); -+ return NULL; - } - - while (snap_state.state == SAVE_STATE_ACTIVE) { -- uint64_t pending_size, pend_post, pend_nonpost; -+ uint64_t pending_size, pend_precopy, pend_compatible, pend_postcopy; - -- qemu_savevm_state_pending(snap_state.file, 0, &pend_nonpost, &pend_post); -- pending_size = pend_post + pend_nonpost; -+ qemu_savevm_state_pending(snap_state.file, 0, &pend_precopy, &pend_compatible, &pend_postcopy); -+ pending_size = pend_precopy + pend_compatible + pend_postcopy; - -- if (pending_size) { -- ret = qemu_savevm_state_iterate(snap_state.file, false); -- if (ret < 0) { -- save_snapshot_error("qemu_savevm_state_iterate error %d", ret); -- break; -- } -- DPRINTF("savevm inerate pending size %lu ret %d\n", pending_size, ret); -+ maxlen = blk_getlength(snap_state.target) - 30*1024*1024; -+ -+ if (pending_size > 400000 && snap_state.bs_pos + pending_size < maxlen) { -+ qemu_mutex_lock_iothread(); -+ ret = qemu_savevm_state_iterate(snap_state.file, false); -+ if (ret < 
0) { -+ save_snapshot_error("qemu_savevm_state_iterate error %d", ret); -+ break; -+ } -+ qemu_mutex_unlock_iothread(); -+ DPRINTF("savevm inerate pending size %lu ret %d\n", pending_size, ret); - } else { -- DPRINTF("done iterating\n"); -- if (store_and_stop()) -+ qemu_mutex_lock_iothread(); -+ qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); -+ ret = global_state_store(); -+ if (ret) { -+ save_snapshot_error("global_state_store error %d", ret); - break; -+ } -+ ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); -+ if (ret < 0) { -+ save_snapshot_error("vm_stop_force_state error %d", ret); -+ break; -+ } - DPRINTF("savevm inerate finished\n"); - /* upstream made the return value here inconsistent - * (-1 instead of 'ret' in one case and 0 after flush which can -@@ -227,28 +265,17 @@ static void process_savevm_co(void *opaque) - save_snapshot_error("qemu_savevm_state_iterate error %d", ret); - break; - } -+ qemu_savevm_state_cleanup(); - DPRINTF("save complete\n"); -- save_snapshot_completed(); - break; - } -- -- /* stop the VM if we get to the end of available space, -- * or if pending_size is just a few MB -- */ -- maxlen = blk_getlength(snap_state.target) - 30*1024*1024; -- if ((pending_size < 100000) || -- ((snap_state.bs_pos + pending_size) >= maxlen)) { -- if (store_and_stop()) -- break; -- } - } - -- if(snap_state.state == SAVE_STATE_CANCELLED) { -- save_snapshot_completed(); -- Error *errp = NULL; -- qmp_savevm_end(&errp); -- } -+ qemu_bh_schedule(snap_state.cleanup_bh); -+ qemu_mutex_unlock_iothread(); - -+ rcu_unregister_thread(); -+ return NULL; - } - - static const QEMUFileOps block_file_ops = { -@@ -311,8 +338,10 @@ void qmp_savevm_start(bool has_statefile, const char *statefile, Error **errp) - error_setg(&snap_state.blocker, "block device is in use by savevm"); - blk_op_block_all(snap_state.target, snap_state.blocker); - -- Coroutine *co = qemu_coroutine_create(process_savevm_co, NULL); -- qemu_coroutine_enter(co); -+ snap_state.state = 
SAVE_STATE_ACTIVE; -+ snap_state.cleanup_bh = qemu_bh_new(process_savevm_cleanup, &snap_state); -+ qemu_thread_create(&snap_state.thread, "savevm-async", process_savevm_thread, -+ NULL, QEMU_THREAD_JOINABLE); - - return; - --- -2.11.0 - diff --git a/debian/patches/pve/0020-PVE-backup-modify-job-api.patch b/debian/patches/pve/0020-PVE-backup-modify-job-api.patch new file mode 100644 index 0000000..6b040d4 --- /dev/null +++ b/debian/patches/pve/0020-PVE-backup-modify-job-api.patch @@ -0,0 +1,100 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Wed, 9 Dec 2015 15:04:57 +0100 +Subject: [PATCH] PVE: backup: modify job api + +Introduce a pause_count parameter to start a backup in +paused mode. This way backups of multiple drives can be +started up sequentially via the completion callback while +having been started at the same point in time. +--- + block/backup.c | 2 ++ + block/replication.c | 2 +- + blockdev.c | 4 ++-- + include/block/block_int.h | 1 + + job.c | 2 +- + 5 files changed, 7 insertions(+), 4 deletions(-) + +diff --git a/block/backup.c b/block/backup.c +index 8630d32926..3aaa75892a 100644 +--- a/block/backup.c ++++ b/block/backup.c +@@ -613,6 +613,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + BlockdevOnError on_target_error, + int creation_flags, + BlockCompletionFunc *cb, void *opaque, ++ int pause_count, + JobTxn *txn, Error **errp) + { + int64_t len; +@@ -746,6 +747,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, + &error_abort); + job->len = len; ++ job->common.job.pause_count += pause_count; + + return &job->common; + +diff --git a/block/replication.c b/block/replication.c +index 6349d6958e..84e07cc4d4 100644 +--- a/block/replication.c ++++ b/block/replication.c +@@ -571,7 +571,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode, + 0, 
MIRROR_SYNC_MODE_NONE, NULL, false, + BLOCKDEV_ON_ERROR_REPORT, + BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL, +- backup_job_completed, bs, NULL, &local_err); ++ backup_job_completed, bs, 0, NULL, &local_err); + if (local_err) { + error_propagate(errp, local_err); + backup_job_cleanup(bs); +diff --git a/blockdev.c b/blockdev.c +index dcf8c8d2ab..d5eb6b62ca 100644 +--- a/blockdev.c ++++ b/blockdev.c +@@ -3568,7 +3568,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn, + job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, + backup->sync, bmap, backup->compress, + backup->on_source_error, backup->on_target_error, +- job_flags, NULL, NULL, txn, &local_err); ++ job_flags, NULL, NULL, 0, txn, &local_err); + bdrv_unref(target_bs); + if (local_err != NULL) { + error_propagate(errp, local_err); +@@ -3660,7 +3660,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn, + job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, + backup->sync, NULL, backup->compress, + backup->on_source_error, backup->on_target_error, +- job_flags, NULL, NULL, txn, &local_err); ++ job_flags, NULL, NULL, 0, txn, &local_err); + if (local_err != NULL) { + error_propagate(errp, local_err); + } +diff --git a/include/block/block_int.h b/include/block/block_int.h +index 903b9c1034..0b2516c3cf 100644 +--- a/include/block/block_int.h ++++ b/include/block/block_int.h +@@ -1083,6 +1083,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + BlockdevOnError on_target_error, + int creation_flags, + BlockCompletionFunc *cb, void *opaque, ++ int pause_count, + JobTxn *txn, Error **errp); + + void hmp_drive_add_node(Monitor *mon, const char *optstr); +diff --git a/job.c b/job.c +index a3bec7fb22..950924ebad 100644 +--- a/job.c ++++ b/job.c +@@ -549,7 +549,7 @@ void job_start(Job *job) + job->co = qemu_coroutine_create(job_co_entry, job); + job->pause_count--; + job->busy = true; +- job->paused = false; ++ job->paused = 
job->pause_count > 0; + job_state_transition(job, JOB_STATUS_RUNNING); + aio_co_enter(job->aio_context, job->co); + } +-- +2.11.0 + diff --git a/debian/patches/pve/0020-PVE-block-snapshot-qmp_snapshot_drive-add-aiocontext.patch b/debian/patches/pve/0020-PVE-block-snapshot-qmp_snapshot_drive-add-aiocontext.patch deleted file mode 100644 index 003fcc7..0000000 --- a/debian/patches/pve/0020-PVE-block-snapshot-qmp_snapshot_drive-add-aiocontext.patch +++ /dev/null @@ -1,65 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Alexandre Derumier -Date: Tue, 13 Sep 2016 01:57:56 +0200 -Subject: [PATCH] PVE: block: snapshot: qmp_snapshot_drive: add aiocontext - -Signed-off-by: Alexandre Derumier ---- - savevm-async.c | 15 +++++++++++---- - 1 file changed, 11 insertions(+), 4 deletions(-) - -diff --git a/savevm-async.c b/savevm-async.c -index 157eb7a50d..87d5460a26 100644 ---- a/savevm-async.c -+++ b/savevm-async.c -@@ -379,6 +379,7 @@ void qmp_snapshot_drive(const char *device, const char *name, Error **errp) - BlockBackend *blk; - BlockDriverState *bs; - QEMUSnapshotInfo sn1, *sn = &sn1; -+ AioContext *aio_context; - int ret; - #ifdef _WIN32 - struct _timeb tb; -@@ -405,20 +406,23 @@ void qmp_snapshot_drive(const char *device, const char *name, Error **errp) - return; - } - -+ aio_context = bdrv_get_aio_context(bs); -+ aio_context_acquire(aio_context); -+ - if (bdrv_is_read_only(bs)) { - error_setg(errp, "Node '%s' is read only", device); -- return; -+ goto out; - } - - if (!bdrv_can_snapshot(bs)) { - error_setg(errp, QERR_UNSUPPORTED); -- return; -+ goto out; - } - - if (bdrv_snapshot_find(bs, sn, name) >= 0) { - error_set(errp, ERROR_CLASS_GENERIC_ERROR, - "snapshot '%s' already exists", name); -- return; -+ goto out; - } - - sn = &sn1; -@@ -443,8 +447,11 @@ void qmp_snapshot_drive(const char *device, const char *name, Error **errp) - if (ret < 0) { - error_set(errp, ERROR_CLASS_GENERIC_ERROR, - "Error while creating snapshot on '%s'\n", 
device); -- return; -+ goto out; - } -+ -+out: -+ aio_context_release(aio_context); - } - - void qmp_delete_drive_snapshot(const char *device, const char *name, --- -2.11.0 - diff --git a/debian/patches/pve/0021-PVE-backup-introduce-vma-archive-format.patch b/debian/patches/pve/0021-PVE-backup-introduce-vma-archive-format.patch new file mode 100644 index 0000000..66b1408 --- /dev/null +++ b/debian/patches/pve/0021-PVE-backup-introduce-vma-archive-format.patch @@ -0,0 +1,1564 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Wed, 2 Aug 2017 13:51:02 +0200 +Subject: [PATCH] PVE: backup: introduce vma archive format + +TODO: Move to a libvma block backend. +--- + MAINTAINERS | 6 + + block/Makefile.objs | 3 + + block/vma.c | 503 +++++++++++++++++++++++++++++++++++++++++++++++ + blockdev.c | 536 +++++++++++++++++++++++++++++++++++++++++++++++++++ + configure | 29 +++ + hmp-commands-info.hx | 13 ++ + hmp-commands.hx | 31 +++ + hmp.c | 63 ++++++ + hmp.h | 3 + + qapi/block-core.json | 109 ++++++++++- + qapi/common.json | 13 ++ + qapi/misc.json | 13 -- + 12 files changed, 1308 insertions(+), 14 deletions(-) + create mode 100644 block/vma.c + +diff --git a/MAINTAINERS b/MAINTAINERS +index 666e936812..299a73cd86 100644 +--- a/MAINTAINERS ++++ b/MAINTAINERS +@@ -2140,6 +2140,12 @@ L: qemu-block@nongnu.org + S: Supported + F: block/vvfat.c + ++VMA ++M: Wolfgang Bumiller . 
++L: pve-devel@proxmox.com ++S: Supported ++F: block/vma.c ++ + Image format fuzzer + M: Stefan Hajnoczi + L: qemu-block@nongnu.org +diff --git a/block/Makefile.objs b/block/Makefile.objs +index c00f0b32d6..abfd0f69d7 100644 +--- a/block/Makefile.objs ++++ b/block/Makefile.objs +@@ -24,6 +24,7 @@ block-obj-$(CONFIG_RBD) += rbd.o + block-obj-$(CONFIG_GLUSTERFS) += gluster.o + block-obj-$(CONFIG_VXHS) += vxhs.o + block-obj-$(CONFIG_LIBSSH2) += ssh.o ++block-obj-$(CONFIG_VMA) += vma.o + block-obj-y += accounting.o dirty-bitmap.o + block-obj-y += write-threshold.o + block-obj-y += backup.o +@@ -52,3 +53,5 @@ qcow.o-libs := -lz + linux-aio.o-libs := -laio + parallels.o-cflags := $(LIBXML2_CFLAGS) + parallels.o-libs := $(LIBXML2_LIBS) ++vma.o-cflags := $(VMA_CFLAGS) ++vma.o-libs := $(VMA_LIBS) +diff --git a/block/vma.c b/block/vma.c +new file mode 100644 +index 0000000000..b911b198dc +--- /dev/null ++++ b/block/vma.c +@@ -0,0 +1,503 @@ ++/* ++ * VMA archive backend for QEMU, container object ++ * ++ * Copyright (C) 2017 Proxmox Server Solutions GmbH ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. 
++ * ++ */ ++#include ++ ++#include "qemu/osdep.h" ++#include "qemu/uuid.h" ++#include "qemu/option.h" ++#include "qemu-common.h" ++#include "qapi/error.h" ++#include "qapi/qmp/qerror.h" ++#include "qapi/qmp/qstring.h" ++#include "qapi/qmp/qdict.h" ++#include "qom/object.h" ++#include "qom/object_interfaces.h" ++#include "block/block_int.h" ++ ++/* exported interface */ ++void vma_object_add_config_file(Object *obj, const char *name, ++ const char *contents, size_t len, ++ Error **errp); ++ ++#define TYPE_VMA_OBJECT "vma" ++#define VMA_OBJECT(obj) \ ++ OBJECT_CHECK(VMAObjectState, (obj), TYPE_VMA_OBJECT) ++#define VMA_OBJECT_GET_CLASS(obj) \ ++ OBJECT_GET_CLASS(VMAObjectClass, (obj), TYPE_VMA_OBJECT) ++ ++typedef struct VMAObjectClass { ++ ObjectClass parent_class; ++} VMAObjectClass; ++ ++typedef struct VMAObjectState { ++ Object parent; ++ ++ char *filename; ++ ++ QemuUUID uuid; ++ bool blocked; ++ VMAWriter *vma; ++ QemuMutex mutex; ++} VMAObjectState; ++ ++static VMAObjectState *vma_by_id(const char *name) ++{ ++ Object *container; ++ Object *obj; ++ ++ container = object_get_objects_root(); ++ obj = object_resolve_path_component(container, name); ++ ++ return VMA_OBJECT(obj); ++} ++ ++static void vma_object_class_complete(UserCreatable *uc, Error **errp) ++{ ++ int rc; ++ VMAObjectState *vo = VMA_OBJECT(uc); ++ VMAObjectClass *voc = VMA_OBJECT_GET_CLASS(uc); ++ (void)!vo; ++ (void)!voc; ++ ++ if (!vo->filename) { ++ error_setg(errp, "Parameter 'filename' is required"); ++ return; ++ } ++ ++ vo->vma = VMAWriter_fopen(vo->filename); ++ if (!vo->vma) { ++ error_setg_errno(errp, errno, "failed to create VMA archive"); ++ return; ++ } ++ ++ rc = VMAWriter_set_uuid(vo->vma, vo->uuid.data, sizeof(vo->uuid.data)); ++ if (rc < 0) { ++ error_setg_errno(errp, -rc, "failed to set UUID of VMA archive"); ++ return; ++ } ++ ++ qemu_mutex_init(&vo->mutex); ++} ++ ++static bool vma_object_can_be_deleted(UserCreatable *uc) ++{ ++ //VMAObjectState *vo = VMA_OBJECT(uc); ++ //if 
(!vo->vma) { ++ // return true; ++ //} ++ //return false; ++ return true; ++} ++ ++static void vma_object_class_init(ObjectClass *oc, void *data) ++{ ++ UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); ++ ++ ucc->can_be_deleted = vma_object_can_be_deleted; ++ ucc->complete = vma_object_class_complete; ++} ++ ++static char *vma_object_get_filename(Object *obj, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ return g_strdup(vo->filename); ++} ++ ++static void vma_object_set_filename(Object *obj, const char *str, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ if (vo->vma) { ++ error_setg(errp, "filename cannot be changed after creation"); ++ return; ++ } ++ ++ g_free(vo->filename); ++ vo->filename = g_strdup(str); ++} ++ ++static char *vma_object_get_uuid(Object *obj, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ return qemu_uuid_unparse_strdup(&vo->uuid); ++} ++ ++static void vma_object_set_uuid(Object *obj, const char *str, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ if (vo->vma) { ++ error_setg(errp, "uuid cannot be changed after creation"); ++ return; ++ } ++ ++ qemu_uuid_parse(str, &vo->uuid); ++} ++ ++static bool vma_object_get_blocked(Object *obj, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ return vo->blocked; ++} ++ ++static void vma_object_set_blocked(Object *obj, bool blocked, Error **errp) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ (void)errp; ++ ++ vo->blocked = blocked; ++} ++ ++void vma_object_add_config_file(Object *obj, const char *name, ++ const char *contents, size_t len, ++ Error **errp) ++{ ++ int rc; ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ ++ if (!vo || !vo->vma) { ++ error_setg(errp, "not a valid vma object to add config files to"); ++ return; ++ } ++ ++ rc = VMAWriter_addConfigFile(vo->vma, name, contents, len); ++ if (rc < 0) { ++ error_setg_errno(errp, -rc, "failed to add config file to VMA"); ++ return; ++ } ++} ++ ++static void 
vma_object_init(Object *obj) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ (void)!vo; ++ ++ object_property_add_str(obj, "filename", ++ vma_object_get_filename, vma_object_set_filename, ++ NULL); ++ object_property_add_str(obj, "uuid", ++ vma_object_get_uuid, vma_object_set_uuid, ++ NULL); ++ object_property_add_bool(obj, "blocked", ++ vma_object_get_blocked, vma_object_set_blocked, ++ NULL); ++} ++ ++static void vma_object_finalize(Object *obj) ++{ ++ VMAObjectState *vo = VMA_OBJECT(obj); ++ VMAObjectClass *voc = VMA_OBJECT_GET_CLASS(obj); ++ (void)!voc; ++ ++ qemu_mutex_destroy(&vo->mutex); ++ ++ VMAWriter_destroy(vo->vma, true); ++ g_free(vo->filename); ++} ++ ++static const TypeInfo vma_object_info = { ++ .name = TYPE_VMA_OBJECT, ++ .parent = TYPE_OBJECT, ++ .class_size = sizeof(VMAObjectClass), ++ .class_init = vma_object_class_init, ++ .instance_size = sizeof(VMAObjectState), ++ .instance_init = vma_object_init, ++ .instance_finalize = vma_object_finalize, ++ .interfaces = (InterfaceInfo[]) { ++ { TYPE_USER_CREATABLE }, ++ { } ++ } ++}; ++ ++static void register_types(void) ++{ ++ type_register_static(&vma_object_info); ++} ++ ++type_init(register_types); ++ ++typedef struct { ++ VMAObjectState *vma_obj; ++ char *name; ++ size_t device_id; ++ uint64_t byte_size; ++} BDRVVMAState; ++ ++static void qemu_vma_parse_filename(const char *filename, QDict *options, ++ Error **errp) ++{ ++ char *sep; ++ ++ if (strncmp(filename, "vma:", sizeof("vma:")-1) == 0) { ++ filename += sizeof("vma:")-1; ++ } ++ ++ sep = strchr(filename, '/'); ++ if (!sep || sep == filename) { ++ error_setg(errp, "VMA file should be //"); ++ return; ++ } ++ ++ qdict_put(options, "vma", qstring_from_substr(filename, 0, sep-filename)); ++ ++ while (*sep && *sep == '/') ++ ++sep; ++ if (!*sep) { ++ error_setg(errp, "missing device name\n"); ++ return; ++ } ++ ++ filename = sep; ++ sep = strchr(filename, '/'); ++ if (!sep || sep == filename) { ++ error_setg(errp, "VMA file should be //"); ++ 
return; ++ } ++ ++ qdict_put(options, "name", qstring_from_substr(filename, 0, sep-filename)); ++ ++ while (*sep && *sep == '/') ++ ++sep; ++ if (!*sep) { ++ error_setg(errp, "missing device size\n"); ++ return; ++ } ++ ++ filename = sep; ++ qdict_put_str(options, "size", filename); ++} ++ ++static QemuOptsList runtime_opts = { ++ .name = "vma-drive", ++ .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), ++ .desc = { ++ { ++ .name = "vma", ++ .type = QEMU_OPT_STRING, ++ .help = "VMA Object name", ++ }, ++ { ++ .name = "name", ++ .type = QEMU_OPT_STRING, ++ .help = "VMA device name", ++ }, ++ { ++ .name = BLOCK_OPT_SIZE, ++ .type = QEMU_OPT_SIZE, ++ .help = "Virtual disk size" ++ }, ++ { /* end of list */ } ++ }, ++}; ++static int qemu_vma_open(BlockDriverState *bs, QDict *options, int flags, ++ Error **errp) ++{ ++ Error *local_err = NULL; ++ BDRVVMAState *s = bs->opaque; ++ QemuOpts *opts; ++ const char *vma_id, *device_name; ++ ssize_t dev_id; ++ int64_t bytes = 0; ++ int ret; ++ ++ opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort); ++ qemu_opts_absorb_qdict(opts, options, &local_err); ++ if (local_err) { ++ error_propagate(errp, local_err); ++ ret = -EINVAL; ++ goto failed_opts; ++ } ++ ++ bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0), ++ BDRV_SECTOR_SIZE); ++ ++ vma_id = qemu_opt_get(opts, "vma"); ++ if (!vma_id) { ++ ret = -EINVAL; ++ error_setg(errp, "missing 'vma' property"); ++ goto failed_opts; ++ } ++ ++ device_name = qemu_opt_get(opts, "name"); ++ if (!device_name) { ++ ret = -EINVAL; ++ error_setg(errp, "missing 'name' property"); ++ goto failed_opts; ++ } ++ ++ VMAObjectState *vma = vma_by_id(vma_id); ++ if (!vma) { ++ ret = -EINVAL; ++ error_setg(errp, "no such VMA object: %s", vma_id); ++ goto failed_opts; ++ } ++ ++ dev_id = VMAWriter_findDevice(vma->vma, device_name); ++ if (dev_id >= 0) { ++ error_setg(errp, "drive already exists in VMA object"); ++ ret = -EIO; ++ goto failed_opts; ++ } ++ ++ dev_id = 
VMAWriter_addDevice(vma->vma, device_name, (uint64_t)bytes); ++ if (dev_id < 0) { ++ error_setg_errno(errp, -dev_id, "failed to add VMA device"); ++ ret = -EIO; ++ goto failed_opts; ++ } ++ ++ object_ref(OBJECT(vma)); ++ s->vma_obj = vma; ++ s->name = g_strdup(device_name); ++ s->device_id = (size_t)dev_id; ++ s->byte_size = bytes; ++ ++ ret = 0; ++ ++failed_opts: ++ qemu_opts_del(opts); ++ return ret; ++} ++ ++static void qemu_vma_close(BlockDriverState *bs) ++{ ++ BDRVVMAState *s = bs->opaque; ++ ++ (void)VMAWriter_finishDevice(s->vma_obj->vma, s->device_id); ++ object_unref(OBJECT(s->vma_obj)); ++ ++ g_free(s->name); ++} ++ ++static int64_t qemu_vma_getlength(BlockDriverState *bs) ++{ ++ BDRVVMAState *s = bs->opaque; ++ ++ return s->byte_size; ++} ++ ++static coroutine_fn int qemu_vma_co_writev(BlockDriverState *bs, ++ int64_t sector_num, ++ int nb_sectors, ++ QEMUIOVector *qiov, ++ int flags) ++{ ++ size_t i; ++ ssize_t rc; ++ BDRVVMAState *s = bs->opaque; ++ VMAObjectState *vo = s->vma_obj; ++ off_t offset = sector_num * BDRV_SECTOR_SIZE; ++ /* flags can be only values we set in supported_write_flags */ ++ assert(flags == 0); ++ ++ qemu_mutex_lock(&vo->mutex); ++ if (vo->blocked) { ++ return -EPERM; ++ } ++ for (i = 0; i != qiov->niov; ++i) { ++ const struct iovec *v = &qiov->iov[i]; ++ size_t blocks = v->iov_len / VMA_BLOCK_SIZE; ++ if (blocks * VMA_BLOCK_SIZE != v->iov_len) { ++ return -EIO; ++ } ++ rc = VMAWriter_writeBlocks(vo->vma, s->device_id, ++ v->iov_base, blocks, offset); ++ if (errno) { ++ return -errno; ++ } ++ if (rc != blocks) { ++ return -EIO; ++ } ++ offset += v->iov_len; ++ } ++ qemu_mutex_unlock(&vo->mutex); ++ return 0; ++} ++ ++static int qemu_vma_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) ++{ ++ bdi->cluster_size = VMA_CLUSTER_SIZE; ++ bdi->unallocated_blocks_are_zero = true; ++ return 0; ++} ++ ++static int qemu_vma_check_perm(BlockDriverState *bs, ++ uint64_t perm, ++ uint64_t shared, ++ Error **errp) ++{ ++ /* Nothing to do. 
*/ ++ return 0; ++} ++ ++static void qemu_vma_set_perm(BlockDriverState *bs, ++ uint64_t perm, ++ uint64_t shared) ++{ ++ /* Nothing to do. */ ++} ++ ++static void qemu_vma_abort_perm_update(BlockDriverState *bs) ++{ ++ /* Nothing to do. */ ++} ++ ++static void qemu_vma_refresh_limits(BlockDriverState *bs, Error **errp) ++{ ++ bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */ ++} ++static void qemu_vma_child_perm(BlockDriverState *bs, BdrvChild *c, ++ const BdrvChildRole *role, ++ BlockReopenQueue *reopen_queue, ++ uint64_t perm, uint64_t shared, ++ uint64_t *nperm, uint64_t *nshared) ++{ ++ *nperm = BLK_PERM_ALL; ++ *nshared = BLK_PERM_ALL; ++} ++ ++static BlockDriver bdrv_vma_drive = { ++ .format_name = "vma-drive", ++ .protocol_name = "vma", ++ .instance_size = sizeof(BDRVVMAState), ++ ++#if 0 ++ .bdrv_create = qemu_vma_create, ++ .create_opts = &qemu_vma_create_opts, ++#endif ++ ++ .bdrv_parse_filename = qemu_vma_parse_filename, ++ .bdrv_file_open = qemu_vma_open, ++ ++ .bdrv_close = qemu_vma_close, ++ .bdrv_has_zero_init = bdrv_has_zero_init_1, ++ .bdrv_getlength = qemu_vma_getlength, ++ .bdrv_get_info = qemu_vma_get_info, ++ ++ //.bdrv_co_preadv = qemu_vma_co_preadv, ++ .bdrv_co_writev = qemu_vma_co_writev, ++ ++ .bdrv_refresh_limits = qemu_vma_refresh_limits, ++ .bdrv_check_perm = qemu_vma_check_perm, ++ .bdrv_set_perm = qemu_vma_set_perm, ++ .bdrv_abort_perm_update = qemu_vma_abort_perm_update, ++ .bdrv_child_perm = qemu_vma_child_perm, ++}; ++ ++static void bdrv_vma_init(void) ++{ ++ bdrv_register(&bdrv_vma_drive); ++} ++ ++block_init(bdrv_vma_init); +diff --git a/blockdev.c b/blockdev.c +index d5eb6b62ca..4f18d3c3d7 100644 +--- a/blockdev.c ++++ b/blockdev.c +@@ -31,11 +31,13 @@ + */ + + #include "qemu/osdep.h" ++#include "qemu/uuid.h" + #include "sysemu/block-backend.h" + #include "sysemu/blockdev.h" + #include "hw/block/block.h" + #include "block/blockjob.h" + #include "block/qdict.h" ++#include "block/blockjob_int.h" + #include 
"block/throttle-groups.h" + #include "monitor/monitor.h" + #include "qemu/error-report.h" +@@ -44,6 +46,7 @@ + #include "qapi/qapi-commands-block.h" + #include "qapi/qapi-commands-transaction.h" + #include "qapi/qapi-visit-block-core.h" ++#include "qapi/qapi-types-misc.h" + #include "qapi/qmp/qdict.h" + #include "qapi/qmp/qnum.h" + #include "qapi/qmp/qstring.h" +@@ -3220,6 +3223,539 @@ out: + aio_context_release(aio_context); + } + ++/* PVE backup related function */ ++ ++static struct PVEBackupState { ++ Error *error; ++ bool cancel; ++ QemuUUID uuid; ++ char uuid_str[37]; ++ int64_t speed; ++ time_t start_time; ++ time_t end_time; ++ char *backup_file; ++ Object *vmaobj; ++ GList *di_list; ++ size_t next_job; ++ size_t total; ++ size_t transferred; ++ size_t zero_bytes; ++ QemuMutex backup_mutex; ++ bool backup_mutex_initialized; ++} backup_state; ++ ++typedef struct PVEBackupDevInfo { ++ BlockDriverState *bs; ++ size_t size; ++ uint8_t dev_id; ++ bool completed; ++ char targetfile[PATH_MAX]; ++ BlockDriverState *target; ++} PVEBackupDevInfo; ++ ++static void pvebackup_run_next_job(void); ++ ++static void pvebackup_cleanup(void) ++{ ++ qemu_mutex_lock(&backup_state.backup_mutex); ++ // Avoid race between block jobs and backup-cancel command: ++ if (!backup_state.vmaw) { ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ return; ++ } ++ ++ backup_state.end_time = time(NULL); ++ ++ if (backup_state.vmaobj) { ++ object_unparent(backup_state.vmaobj); ++ backup_state.vmaobj = NULL; ++ } ++ ++ g_list_free(backup_state.di_list); ++ backup_state.di_list = NULL; ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++} ++ ++static void pvebackup_complete_cb(void *opaque, int ret) ++{ ++ // This always runs in the main loop ++ ++ PVEBackupDevInfo *di = opaque; ++ ++ di->completed = true; ++ ++ if (ret < 0 && !backup_state.error) { ++ error_setg(&backup_state.error, "job failed with err %d - %s", ++ ret, strerror(-ret)); ++ } ++ ++ di->bs = NULL; ++ di->target = NULL; ++ ++ 
if (backup_state.vmaobj) { ++ object_unparent(backup_state.vmaobj); ++ backup_state.vmaobj = NULL; ++ } ++ ++ // remove self from job queue ++ qemu_mutex_lock(&backup_state.backup_mutex); ++ backup_state.di_list = g_list_remove(backup_state.di_list, di); ++ g_free(di); ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ ++ if (!backup_state.cancel) { ++ pvebackup_run_next_job(); ++ } ++} ++ ++static void pvebackup_cancel(void *opaque) ++{ ++ backup_state.cancel = true; ++ qemu_mutex_lock(&backup_state.backup_mutex); ++ // Avoid race between block jobs and backup-cancel command: ++ if (!backup_state.vmaw) { ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ return; ++ } ++ ++ if (!backup_state.error) { ++ error_setg(&backup_state.error, "backup cancelled"); ++ } ++ ++ if (backup_state.vmaobj) { ++ Error *err; ++ /* make sure vma writer does not block anymore */ ++ if (!object_set_props(backup_state.vmaobj, &err, "blocked", "yes", NULL)) { ++ if (err) { ++ error_report_err(err); ++ } ++ } ++ } ++ ++ GList *l = backup_state.di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ if (!di->completed && di->bs) { ++ BlockJob *job = di->bs->job; ++ if (job) { ++ AioContext *aio_context = blk_get_aio_context(job->blk); ++ aio_context_acquire(aio_context); ++ if (!di->completed) { ++ job_cancel(&job->job, false); ++ } ++ aio_context_release(aio_context); ++ } ++ } ++ } ++ ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ pvebackup_cleanup(); ++} ++ ++void qmp_backup_cancel(Error **errp) ++{ ++ if (!backup_state.backup_mutex_initialized) ++ return; ++ Coroutine *co = qemu_coroutine_create(pvebackup_cancel, NULL); ++ qemu_coroutine_enter(co); ++ ++ while (backup_state.vmaobj) { ++ /* FIXME: Find something better for this */ ++ aio_poll(qemu_get_aio_context(), true); ++ } ++} ++ ++void vma_object_add_config_file(Object *obj, const char *name, ++ const char *contents, size_t len, ++ Error **errp); ++static int 
config_to_vma(const char *file, BackupFormat format, ++ Object *vmaobj, ++ const char *backup_dir, ++ Error **errp) ++{ ++ char *cdata = NULL; ++ gsize clen = 0; ++ GError *err = NULL; ++ if (!g_file_get_contents(file, &cdata, &clen, &err)) { ++ error_setg(errp, "unable to read file '%s'", file); ++ return 1; ++ } ++ ++ char *basename = g_path_get_basename(file); ++ ++ if (format == BACKUP_FORMAT_VMA) { ++ vma_object_add_config_file(vmaobj, basename, cdata, clen, errp); ++ } else if (format == BACKUP_FORMAT_DIR) { ++ char config_path[PATH_MAX]; ++ snprintf(config_path, PATH_MAX, "%s/%s", backup_dir, basename); ++ if (!g_file_set_contents(config_path, cdata, clen, &err)) { ++ error_setg(errp, "unable to write config file '%s'", config_path); ++ g_free(cdata); ++ g_free(basename); ++ return 1; ++ } ++ } ++ ++ g_free(basename); ++ g_free(cdata); ++ return 0; ++} ++ ++static void pvebackup_run_next_job(void) ++{ ++ qemu_mutex_lock(&backup_state.backup_mutex); ++ ++ GList *next = g_list_nth(backup_state.di_list, backup_state.next_job); ++ while (next) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)next->data; ++ backup_state.next_job++; ++ if (!di->completed && di->bs && di->bs->job) { ++ BlockJob *job = di->bs->job; ++ AioContext *aio_context = blk_get_aio_context(job->blk); ++ aio_context_acquire(aio_context); ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ if (backup_state.error || backup_state.cancel) { ++ job_cancel_sync(job); ++ } else { ++ job_resume(job); ++ } ++ aio_context_release(aio_context); ++ return; ++ } ++ next = g_list_next(next); ++ } ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ ++ // no more jobs, run the cleanup ++ pvebackup_cleanup(); ++} ++ ++UuidInfo *qmp_backup(const char *backup_file, bool has_format, ++ BackupFormat format, ++ bool has_config_file, const char *config_file, ++ bool has_firewall_file, const char *firewall_file, ++ bool has_devlist, const char *devlist, ++ bool has_speed, int64_t speed, Error **errp) ++{ ++ 
BlockBackend *blk; ++ BlockDriverState *bs = NULL; ++ const char *backup_dir = NULL; ++ Error *local_err = NULL; ++ QemuUUID uuid; ++ gchar **devs = NULL; ++ GList *di_list = NULL; ++ GList *l; ++ UuidInfo *uuid_info; ++ BlockJob *job; ++ ++ if (!backup_state.backup_mutex_initialized) { ++ qemu_mutex_init(&backup_state.backup_mutex); ++ backup_state.backup_mutex_initialized = true; ++ } ++ ++ if (backup_state.di_list || backup_state.vmaobj) { ++ error_set(errp, ERROR_CLASS_GENERIC_ERROR, ++ "previous backup not finished"); ++ return NULL; ++ } ++ ++ /* Todo: try to auto-detect format based on file name */ ++ format = has_format ? format : BACKUP_FORMAT_VMA; ++ ++ if (has_devlist) { ++ devs = g_strsplit_set(devlist, ",;:", -1); ++ ++ gchar **d = devs; ++ while (d && *d) { ++ blk = blk_by_name(*d); ++ if (blk) { ++ bs = blk_bs(blk); ++ if (bdrv_is_read_only(bs)) { ++ error_setg(errp, "Node '%s' is read only", *d); ++ goto err; ++ } ++ if (!bdrv_is_inserted(bs)) { ++ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, *d); ++ goto err; ++ } ++ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); ++ di->bs = bs; ++ di_list = g_list_append(di_list, di); ++ } else { ++ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, ++ "Device '%s' not found", *d); ++ goto err; ++ } ++ d++; ++ } ++ ++ } else { ++ BdrvNextIterator it; ++ ++ bs = NULL; ++ for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) { ++ if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) { ++ continue; ++ } ++ ++ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); ++ di->bs = bs; ++ di_list = g_list_append(di_list, di); ++ } ++ } ++ ++ if (!di_list) { ++ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list"); ++ goto err; ++ } ++ ++ size_t total = 0; ++ ++ l = di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) { ++ goto err; ++ } ++ ++ ssize_t size = bdrv_getlength(di->bs); ++ if (size < 0) { ++ 
error_setg_errno(errp, -di->size, "bdrv_getlength failed"); ++ goto err; ++ } ++ di->size = size; ++ total += size; ++ } ++ ++ qemu_uuid_generate(&uuid); ++ ++ if (format == BACKUP_FORMAT_VMA) { ++ char uuidstr[UUID_FMT_LEN+1]; ++ qemu_uuid_unparse(&uuid, uuidstr); ++ uuidstr[UUID_FMT_LEN] = 0; ++ backup_state.vmaobj = ++ object_new_with_props("vma", object_get_objects_root(), ++ "vma-backup-obj", &local_err, ++ "filename", backup_file, ++ "uuid", uuidstr, ++ NULL); ++ if (!backup_state.vmaobj) { ++ if (local_err) { ++ error_propagate(errp, local_err); ++ } ++ goto err; ++ } ++ ++ l = di_list; ++ while (l) { ++ QDict *options = qdict_new(); ++ ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ ++ const char *devname = bdrv_get_device_name(di->bs); ++ snprintf(di->targetfile, PATH_MAX, "vma-backup-obj/%s.raw", devname); ++ ++ qdict_put(options, "driver", qstring_from_str("vma-drive")); ++ qdict_put(options, "size", qint_from_int(di->size)); ++ di->target = bdrv_open(di->targetfile, NULL, options, BDRV_O_RDWR, &local_err); ++ if (!di->target) { ++ error_propagate(errp, local_err); ++ goto err; ++ } ++ } ++ } else if (format == BACKUP_FORMAT_DIR) { ++ if (mkdir(backup_file, 0640) != 0) { ++ error_setg_errno(errp, errno, "can't create directory '%s'\n", ++ backup_file); ++ goto err; ++ } ++ backup_dir = backup_file; ++ ++ l = di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ ++ const char *devname = bdrv_get_device_name(di->bs); ++ snprintf(di->targetfile, PATH_MAX, "%s/%s.raw", backup_dir, devname); ++ ++ int flags = BDRV_O_RDWR; ++ bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL, ++ di->size, flags, false, &local_err); ++ if (local_err) { ++ error_propagate(errp, local_err); ++ goto err; ++ } ++ ++ di->target = bdrv_open(di->targetfile, NULL, NULL, flags, &local_err); ++ if (!di->target) { ++ error_propagate(errp, local_err); ++ goto err; ++ } ++ } ++ } else { ++ 
error_set(errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format"); ++ goto err; ++ } ++ ++ /* add configuration file to archive */ ++ if (has_config_file) { ++ if(config_to_vma(config_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { ++ goto err; ++ } ++ } ++ ++ /* add firewall file to archive */ ++ if (has_firewall_file) { ++ if(config_to_vma(firewall_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { ++ goto err; ++ } ++ } ++ /* initialize global backup_state now */ ++ ++ backup_state.cancel = false; ++ ++ if (backup_state.error) { ++ error_free(backup_state.error); ++ backup_state.error = NULL; ++ } ++ ++ backup_state.speed = (has_speed && speed > 0) ? speed : 0; ++ ++ backup_state.start_time = time(NULL); ++ backup_state.end_time = 0; ++ ++ if (backup_state.backup_file) { ++ g_free(backup_state.backup_file); ++ } ++ backup_state.backup_file = g_strdup(backup_file); ++ ++ memcpy(&backup_state.uuid, &uuid, sizeof(uuid)); ++ qemu_uuid_unparse(&uuid, backup_state.uuid_str); ++ ++ qemu_mutex_lock(&backup_state.backup_mutex); ++ backup_state.di_list = di_list; ++ backup_state.next_job = 0; ++ ++ backup_state.total = total; ++ backup_state.transferred = 0; ++ backup_state.zero_bytes = 0; ++ ++ /* start all jobs (paused state) */ ++ l = di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ ++ job = backup_job_create(NULL, di->bs, di->target, speed, MIRROR_SYNC_MODE_FULL, NULL, ++ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, ++ JOB_DEFAULT, ++ pvebackup_complete_cb, di, 2, NULL, &local_err); ++ if (di->target) { ++ bdrv_unref(di->target); ++ di->target = NULL; ++ } ++ if (!job || local_err != NULL) { ++ error_setg(&backup_state.error, "backup_job_create failed"); ++ pvebackup_cancel(NULL); ++ } else { ++ job_start(&job->job); ++ } ++ } ++ ++ qemu_mutex_unlock(&backup_state.backup_mutex); ++ ++ if (!backup_state.error) { ++ pvebackup_run_next_job(); // run one job ++ } ++ ++ uuid_info 
= g_malloc0(sizeof(*uuid_info)); ++ uuid_info->UUID = g_strdup(backup_state.uuid_str); ++ ++ return uuid_info; ++ ++err: ++ ++ l = di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); ++ ++ if (di->target) { ++ bdrv_unref(di->target); ++ } ++ ++ if (di->targetfile[0]) { ++ unlink(di->targetfile); ++ } ++ g_free(di); ++ } ++ g_list_free(di_list); ++ ++ if (devs) { ++ g_strfreev(devs); ++ } ++ ++ if (backup_state.vmaobj) { ++ object_unparent(backup_state.vmaobj); ++ backup_state.vmaobj = NULL; ++ } ++ ++ if (backup_dir) { ++ rmdir(backup_dir); ++ } ++ ++ return NULL; ++} ++ ++BackupStatus *qmp_query_backup(Error **errp) ++{ ++ BackupStatus *info = g_malloc0(sizeof(*info)); ++ ++ if (!backup_state.start_time) { ++ /* not started, return {} */ ++ return info; ++ } ++ ++ info->has_status = true; ++ info->has_start_time = true; ++ info->start_time = backup_state.start_time; ++ ++ if (backup_state.backup_file) { ++ info->has_backup_file = true; ++ info->backup_file = g_strdup(backup_state.backup_file); ++ } ++ ++ info->has_uuid = true; ++ info->uuid = g_strdup(backup_state.uuid_str); ++ ++ if (backup_state.end_time) { ++ if (backup_state.error) { ++ info->status = g_strdup("error"); ++ info->has_errmsg = true; ++ info->errmsg = g_strdup(error_get_pretty(backup_state.error)); ++ } else { ++ info->status = g_strdup("done"); ++ } ++ info->has_end_time = true; ++ info->end_time = backup_state.end_time; ++ } else { ++ info->status = g_strdup("active"); ++ } ++ ++ info->has_total = true; ++ info->total = backup_state.total; ++ info->has_zero_bytes = true; ++ info->zero_bytes = backup_state.zero_bytes; ++ info->has_transferred = true; ++ info->transferred = backup_state.transferred; ++ ++ return info; ++} ++ + void qmp_block_stream(bool has_job_id, const char *job_id, const char *device, + bool has_base, const char *base, + bool has_base_node, const char *base_node, +diff --git a/configure b/configure +index 7b3f80a49c..d2cc11cdbb 
100755 +--- a/configure ++++ b/configure +@@ -475,6 +475,7 @@ vxhs="" + libxml2="" + docker="no" + debug_mutex="no" ++vma="" + + # cross compilers defaults, can be overridden with --cross-cc-ARCH + cross_cc_aarch64="aarch64-linux-gnu-gcc" +@@ -1435,6 +1436,10 @@ for opt do + ;; + --disable-debug-mutex) debug_mutex=no + ;; ++ --enable-vma) vma=yes ++ ;; ++ --disable-vma) vma=no ++ ;; + *) + echo "ERROR: unknown option $opt" + echo "Try '$0 --help' for more information" +@@ -1710,6 +1715,7 @@ disabled with --disable-FEATURE, default is enabled if available: + vhost-user vhost-user support + capstone capstone disassembler support + debug-mutex mutex debugging support ++ vma VMA archive backend + + NOTE: The object files are built at the place where configure is launched + EOF +@@ -4121,6 +4127,22 @@ EOF + fi + + ########################################## ++# vma probe ++if test "$vma" != "no" ; then ++ if $pkg_config --exact-version=0.1.0 vma; then ++ vma="yes" ++ vma_cflags=$($pkg_config --cflags vma) ++ vma_libs=$($pkg_config --libs vma) ++ else ++ if test "$vma" = "yes" ; then ++ feature_not_found "VMA Archive backend support" \ ++ "Install libvma devel" ++ fi ++ vma="no" ++ fi ++fi ++ ++########################################## + # signalfd probe + signalfd="no" + cat > $TMPC << EOF +@@ -6007,6 +6029,7 @@ echo "replication support $replication" + echo "VxHS block device $vxhs" + echo "capstone $capstone" + echo "docker $docker" ++echo "VMA support $vma" + + if test "$sdl_too_old" = "yes"; then + echo "-> Your SDL version is too old - please upgrade to have SDL support" +@@ -6493,6 +6516,12 @@ if test "$usb_redir" = "yes" ; then + echo "USB_REDIR_LIBS=$usb_redir_libs" >> $config_host_mak + fi + ++if test "$vma" = "yes" ; then ++ echo "CONFIG_VMA=y" >> $config_host_mak ++ echo "VMA_CFLAGS=$vma_cflags" >> $config_host_mak ++ echo "VMA_LIBS=$vma_libs" >> $config_host_mak ++fi ++ + if test "$opengl" = "yes" ; then + echo "CONFIG_OPENGL=y" >> $config_host_mak + echo 
"OPENGL_LIBS=$opengl_libs" >> $config_host_mak +diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx +index 42c148fdc9..277e140092 100644 +--- a/hmp-commands-info.hx ++++ b/hmp-commands-info.hx +@@ -502,6 +502,19 @@ STEXI + Show CPU statistics. + ETEXI + ++ { ++ .name = "backup", ++ .args_type = "", ++ .params = "", ++ .help = "show backup status", ++ .cmd = hmp_info_backup, ++ }, ++ ++STEXI ++@item info backup ++show backup status ++ETEXI ++ + #if defined(CONFIG_SLIRP) + { + .name = "usernet", +diff --git a/hmp-commands.hx b/hmp-commands.hx +index a6f0720442..956cbf04b9 100644 +--- a/hmp-commands.hx ++++ b/hmp-commands.hx +@@ -107,6 +107,37 @@ STEXI + Copy data from a backing file into a block device. + ETEXI + ++ { ++ .name = "backup", ++ .args_type = "directory:-d,backupfile:s,speed:o?,devlist:s?", ++ .params = "[-d] backupfile [speed [devlist]]", ++ .help = "create a VM Backup." ++ "\n\t\t\t Use -d to dump data into a directory instead" ++ "\n\t\t\t of using VMA format.", ++ .cmd = hmp_backup, ++ }, ++ ++STEXI ++@item backup ++@findex backup ++Create a VM backup. ++ETEXI ++ ++ { ++ .name = "backup_cancel", ++ .args_type = "", ++ .params = "", ++ .help = "cancel the current VM backup", ++ .cmd = hmp_backup_cancel, ++ }, ++ ++STEXI ++@item backup_cancel ++@findex backup_cancel ++Cancel the current VM backup. 
++ ++ETEXI ++ + { + .name = "block_job_set_speed", + .args_type = "device:B,speed:o", +diff --git a/hmp.c b/hmp.c +index 7c975f3ead..8d659e20f6 100644 +--- a/hmp.c ++++ b/hmp.c +@@ -166,6 +166,44 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict) + qapi_free_MouseInfoList(mice_list); + } + ++void hmp_info_backup(Monitor *mon, const QDict *qdict) ++{ ++ BackupStatus *info; ++ ++ info = qmp_query_backup(NULL); ++ if (info->has_status) { ++ if (info->has_errmsg) { ++ monitor_printf(mon, "Backup status: %s - %s\n", ++ info->status, info->errmsg); ++ } else { ++ monitor_printf(mon, "Backup status: %s\n", info->status); ++ } ++ } ++ ++ if (info->has_backup_file) { ++ monitor_printf(mon, "Start time: %s", ctime(&info->start_time)); ++ if (info->end_time) { ++ monitor_printf(mon, "End time: %s", ctime(&info->end_time)); ++ } ++ ++ int per = (info->has_total && info->total && ++ info->has_transferred && info->transferred) ? ++ (info->transferred * 100)/info->total : 0; ++ int zero_per = (info->has_total && info->total && ++ info->has_zero_bytes && info->zero_bytes) ? 
++ (info->zero_bytes * 100)/info->total : 0; ++ monitor_printf(mon, "Backup file: %s\n", info->backup_file); ++ monitor_printf(mon, "Backup uuid: %s\n", info->uuid); ++ monitor_printf(mon, "Total size: %zd\n", info->total); ++ monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n", ++ info->transferred, per); ++ monitor_printf(mon, "Zero bytes: %zd (%d%%)\n", ++ info->zero_bytes, zero_per); ++ } ++ ++ qapi_free_BackupStatus(info); ++} ++ + void hmp_info_migrate(Monitor *mon, const QDict *qdict) + { + MigrationInfo *info; +@@ -1899,6 +1937,31 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict) + hmp_handle_error(mon, &error); + } + ++void hmp_backup_cancel(Monitor *mon, const QDict *qdict) ++{ ++ Error *error = NULL; ++ ++ qmp_backup_cancel(&error); ++ ++ hmp_handle_error(mon, &error); ++} ++ ++void hmp_backup(Monitor *mon, const QDict *qdict) ++{ ++ Error *error = NULL; ++ ++ int dir = qdict_get_try_bool(qdict, "directory", 0); ++ const char *backup_file = qdict_get_str(qdict, "backupfile"); ++ const char *devlist = qdict_get_try_str(qdict, "devlist"); ++ int64_t speed = qdict_get_try_int(qdict, "speed", 0); ++ ++ qmp_backup(backup_file, true, dir ? 
BACKUP_FORMAT_DIR : BACKUP_FORMAT_VMA, ++ false, NULL, false, NULL, !!devlist, ++ devlist, qdict_haskey(qdict, "speed"), speed, &error); ++ ++ hmp_handle_error(mon, &error); ++} ++ + void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict) + { + Error *error = NULL; +diff --git a/hmp.h b/hmp.h +index 98bb7a44db..853f233195 100644 +--- a/hmp.h ++++ b/hmp.h +@@ -29,6 +29,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict); + void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict); + void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict); + void hmp_info_migrate_cache_size(Monitor *mon, const QDict *qdict); ++void hmp_info_backup(Monitor *mon, const QDict *qdict); + void hmp_info_cpus(Monitor *mon, const QDict *qdict); + void hmp_info_block(Monitor *mon, const QDict *qdict); + void hmp_info_blockstats(Monitor *mon, const QDict *qdict); +@@ -86,6 +87,8 @@ void hmp_eject(Monitor *mon, const QDict *qdict); + void hmp_change(Monitor *mon, const QDict *qdict); + void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict); + void hmp_block_stream(Monitor *mon, const QDict *qdict); ++void hmp_backup(Monitor *mon, const QDict *qdict); ++void hmp_backup_cancel(Monitor *mon, const QDict *qdict); + void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict); + void hmp_block_job_cancel(Monitor *mon, const QDict *qdict); + void hmp_block_job_pause(Monitor *mon, const QDict *qdict); +diff --git a/qapi/block-core.json b/qapi/block-core.json +index 5b9084a394..9c3c2d4917 100644 +--- a/qapi/block-core.json ++++ b/qapi/block-core.json +@@ -718,6 +718,97 @@ + + + ## ++# @BackupStatus: ++# ++# Detailed backup status. ++# ++# @status: string describing the current backup status. ++# This can be 'active', 'done', 'error'. 
If this field is not ++# returned, no backup process has been initiated ++# ++# @errmsg: error message (only returned if status is 'error') ++# ++# @total: total amount of bytes involved in the backup process ++# ++# @transferred: amount of bytes already backed up. ++# ++# @zero-bytes: amount of 'zero' bytes detected. ++# ++# @start-time: time (epoch) when backup job started. ++# ++# @end-time: time (epoch) when backup job finished. ++# ++# @backup-file: backup file name ++# ++# @uuid: uuid for this backup job ++# ++## ++{ 'struct': 'BackupStatus', ++ 'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int', ++ '*transferred': 'int', '*zero-bytes': 'int', ++ '*start-time': 'int', '*end-time': 'int', ++ '*backup-file': 'str', '*uuid': 'str' } } ++ ++## ++# @BackupFormat: ++# ++# An enumeration of supported backup formats. ++# ++# @vma: Proxmox vma backup format ++## ++{ 'enum': 'BackupFormat', ++ 'data': [ 'vma', 'dir' ] } ++ ++## ++# @backup: ++# ++# Starts a VM backup. ++# ++# @backup-file: the backup file name ++# ++# @format: format of the backup file ++# ++# @config-file: a configuration file to include into ++# the backup archive. ++# ++# @speed: the maximum speed, in bytes per second ++# ++# @devlist: list of block device names (separated by ',', ';' ++# or ':'). By default the backup includes all writable block devices. ++# ++# Returns: the uuid of the backup job ++# ++## ++{ 'command': 'backup', 'data': { 'backup-file': 'str', ++ '*format': 'BackupFormat', ++ '*config-file': 'str', ++ '*firewall-file': 'str', ++ '*devlist': 'str', '*speed': 'int' }, ++ 'returns': 'UuidInfo' } ++ ++## ++# @query-backup: ++# ++# Returns information about current/last backup task. ++# ++# Returns: @BackupStatus ++# ++## ++{ 'command': 'query-backup', 'returns': 'BackupStatus' } ++ ++## ++# @backup-cancel: ++# ++# Cancel the current executing backup process. ++# ++# Returns: nothing on success ++# ++# Notes: This command succeeds even if there is no backup process running. 
++# ++## ++{ 'command': 'backup-cancel' } ++ ++## + # @BlockDeviceTimedStats: + # + # Statistics of a block device during a given interval of time. +@@ -2549,7 +2640,7 @@ + 'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks', + 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', + 'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog', +- 'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] } ++ 'ssh', 'throttle', 'vdi', 'vhdx', 'vma-drive', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] } + + ## + # @BlockdevOptionsFile: +@@ -3550,6 +3641,21 @@ + '*tls-creds': 'str' } } + + ## ++# @BlockdevOptionsVMADrive: ++# ++# Driver specific block device options for VMA Drives ++# ++# @filename: vma-drive path ++# ++# @size: drive size in bytes ++# ++# Since: 2.9 ++## ++{ 'struct': 'BlockdevOptionsVMADrive', ++ 'data': { 'filename': 'str', ++ 'size': 'int' } } ++ ++## + # @BlockdevOptionsThrottle: + # + # Driver specific block device options for the throttle driver +@@ -3633,6 +3739,7 @@ + 'throttle': 'BlockdevOptionsThrottle', + 'vdi': 'BlockdevOptionsGenericFormat', + 'vhdx': 'BlockdevOptionsGenericFormat', ++ 'vma-drive': 'BlockdevOptionsVMADrive', + 'vmdk': 'BlockdevOptionsGenericCOWFormat', + 'vpc': 'BlockdevOptionsGenericFormat', + 'vvfat': 'BlockdevOptionsVVFAT', +diff --git a/qapi/common.json b/qapi/common.json +index c367adc4b6..070b7b52c8 100644 +--- a/qapi/common.json ++++ b/qapi/common.json +@@ -149,3 +149,16 @@ + 'ppc64', 'ppcemb', 'riscv32', 'riscv64', 's390x', 'sh4', + 'sh4eb', 'sparc', 'sparc64', 'tricore', 'unicore32', + 'x86_64', 'xtensa', 'xtensaeb' ] } ++ ++## ++# @UuidInfo: ++# ++# Guest UUID information (Universally Unique Identifier). ++# ++# @UUID: the UUID of the guest ++# ++# Since: 0.14.0 ++# ++# Notes: If no UUID was specified for the guest, a null UUID is returned. 
++## ++{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} } +diff --git a/qapi/misc.json b/qapi/misc.json +index b6ad5f028d..3dd5117fc3 100644 +--- a/qapi/misc.json ++++ b/qapi/misc.json +@@ -275,19 +275,6 @@ + { 'command': 'query-kvm', 'returns': 'KvmInfo' } + + ## +-# @UuidInfo: +-# +-# Guest UUID information (Universally Unique Identifier). +-# +-# @UUID: the UUID of the guest +-# +-# Since: 0.14.0 +-# +-# Notes: If no UUID was specified for the guest, a null UUID is returned. +-## +-{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} } +- +-## + # @query-uuid: + # + # Query the guest UUID information. +-- +2.11.0 + diff --git a/debian/patches/pve/0021-PVE-block-snapshot-qmp_delete_drive_snapshot-add-aio.patch b/debian/patches/pve/0021-PVE-block-snapshot-qmp_delete_drive_snapshot-add-aio.patch deleted file mode 100644 index def2307..0000000 --- a/debian/patches/pve/0021-PVE-block-snapshot-qmp_delete_drive_snapshot-add-aio.patch +++ /dev/null @@ -1,60 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Alexandre Derumier -Date: Mon, 7 Nov 2016 11:47:50 +0100 -Subject: [PATCH] PVE: block: snapshot: qmp_delete_drive_snapshot : add - aiocontext - -this fix snapshot delete of qcow2 with iothread enabled - -Signed-off-by: Alexandre Derumier ---- - savevm-async.c | 13 ++++++++++--- - 1 file changed, 10 insertions(+), 3 deletions(-) - -diff --git a/savevm-async.c b/savevm-async.c -index 87d5460a26..aa578c4a49 100644 ---- a/savevm-async.c -+++ b/savevm-async.c -@@ -461,6 +461,7 @@ void qmp_delete_drive_snapshot(const char *device, const char *name, - BlockDriverState *bs; - QEMUSnapshotInfo sn1, *sn = &sn1; - Error *local_err = NULL; -+ AioContext *aio_context; - - int ret; - -@@ -477,22 +478,28 @@ void qmp_delete_drive_snapshot(const char *device, const char *name, - return; - } - -+ aio_context = bdrv_get_aio_context(bs); -+ aio_context_acquire(aio_context); -+ - if (!bdrv_can_snapshot(bs)) { - error_setg(errp, QERR_UNSUPPORTED); -- return; 
-+ goto out; - } - - if (bdrv_snapshot_find(bs, sn, name) < 0) { - /* return success if snapshot does not exists */ -- return; -+ goto out; - } - - ret = bdrv_snapshot_delete(bs, NULL, name, &local_err); - if (ret < 0) { - error_set(errp, ERROR_CLASS_GENERIC_ERROR, - "Error while deleting snapshot on '%s'\n", device); -- return; -+ goto out; - } -+ -+out: -+ aio_context_release(aio_context); - } - - static ssize_t loadstate_get_buffer(void *opaque, uint8_t *buf, int64_t pos, --- -2.11.0 - diff --git a/debian/patches/pve/0022-PVE-Deprecated-adding-old-vma-files.patch b/debian/patches/pve/0022-PVE-Deprecated-adding-old-vma-files.patch new file mode 100644 index 0000000..24cf94c --- /dev/null +++ b/debian/patches/pve/0022-PVE-Deprecated-adding-old-vma-files.patch @@ -0,0 +1,3294 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Mon, 7 Aug 2017 08:51:16 +0200 +Subject: [PATCH] PVE: [Deprecated] adding old vma files + +TODO: Move to using a libvma block backend +--- + Makefile | 3 +- + Makefile.objs | 1 + + block/backup.c | 107 ++++-- + block/replication.c | 1 + + blockdev.c | 208 +++++++---- + include/block/block_int.h | 4 + + job.c | 3 +- + vma-reader.c | 857 ++++++++++++++++++++++++++++++++++++++++++++++ + vma-writer.c | 771 +++++++++++++++++++++++++++++++++++++++++ + vma.c | 756 ++++++++++++++++++++++++++++++++++++++++ + vma.h | 150 ++++++++ + 11 files changed, 2754 insertions(+), 107 deletions(-) + create mode 100644 vma-reader.c + create mode 100644 vma-writer.c + create mode 100644 vma.c + create mode 100644 vma.h + +diff --git a/Makefile b/Makefile +index 2da686be33..5a0aad2004 100644 +--- a/Makefile ++++ b/Makefile +@@ -436,7 +436,7 @@ dummy := $(call unnest-vars,, \ + + include $(SRC_PATH)/tests/Makefile.include + +-all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules ++all: $(DOCS) $(TOOLS) vma$(EXESUF) $(HELPERS-y) recurse-all modules + + qemu-version.h: FORCE + $(call quiet-command, \ +@@ -537,6 
+537,7 @@ qemu-img.o: qemu-img-cmds.h + qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) + qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) + qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) ++vma$(EXESUF): vma.o vma-reader.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) + + qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o $(COMMON_LDADDS) + +diff --git a/Makefile.objs b/Makefile.objs +index a836ee87d7..92c7886dee 100644 +--- a/Makefile.objs ++++ b/Makefile.objs +@@ -70,6 +70,7 @@ block-obj-y += block.o blockjob.o job.o + block-obj-y += block/ scsi/ + block-obj-y += qemu-io-cmds.o + block-obj-$(CONFIG_REPLICATION) += replication.o ++block-obj-y += vma-writer.o + + block-obj-m = block/ + +diff --git a/block/backup.c b/block/backup.c +index 3aaa75892a..2410cca257 100644 +--- a/block/backup.c ++++ b/block/backup.c +@@ -34,6 +34,7 @@ typedef struct BackupBlockJob { + /* bitmap for sync=incremental */ + BdrvDirtyBitmap *sync_bitmap; + MirrorSyncMode sync_mode; ++ BackupDumpFunc *dump_cb; + BlockdevOnError on_source_error; + BlockdevOnError on_target_error; + CoRwlock flush_rwlock; +@@ -126,12 +127,20 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job, + } + + if (qemu_iovec_is_zero(&qiov)) { +- ret = blk_co_pwrite_zeroes(job->target, start, +- qiov.size, write_flags | BDRV_REQ_MAY_UNMAP); ++ if (job->dump_cb) { ++ ret = job->dump_cb(job->common.job.opaque, job->target, start, qiov.size, NULL); ++ } else { ++ ret = blk_co_pwrite_zeroes(job->target, start, ++ qiov.size, write_flags | BDRV_REQ_MAY_UNMAP); ++ } + } else { +- ret = blk_co_pwritev(job->target, start, +- qiov.size, &qiov, write_flags | +- (job->compress ? 
BDRV_REQ_WRITE_COMPRESSED : 0)); ++ if (job->dump_cb) { ++ ret = job->dump_cb(job->common.job.opaque, job->target, start, qiov.size, *bounce_buffer); ++ } else { ++ ret = blk_co_pwritev(job->target, start, ++ qiov.size, &qiov, write_flags | ++ (job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0)); ++ } + } + if (ret < 0) { + trace_backup_do_cow_write_fail(job, start, ret); +@@ -209,7 +218,11 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job, + trace_backup_do_cow_process(job, start); + + if (job->use_copy_range) { +- ret = backup_cow_with_offload(job, start, end, is_write_notifier); ++ if (job->dump_cb) { ++ ret = - 1; ++ } else { ++ ret = backup_cow_with_offload(job, start, end, is_write_notifier); ++ } + if (ret < 0) { + job->use_copy_range = false; + } +@@ -293,7 +306,9 @@ static void backup_abort(Job *job) + static void backup_clean(Job *job) + { + BackupBlockJob *s = container_of(job, BackupBlockJob, common.job); +- assert(s->target); ++ if (!s->target) { ++ return; ++ } + blk_unref(s->target); + s->target = NULL; + } +@@ -302,7 +317,9 @@ static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context) + { + BackupBlockJob *s = container_of(job, BackupBlockJob, common); + +- blk_set_aio_context(s->target, aio_context); ++ if (s->target) { ++ blk_set_aio_context(s->target, aio_context); ++ } + } + + void backup_do_checkpoint(BlockJob *job, Error **errp) +@@ -374,9 +391,11 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job, + if (read) { + return block_job_error_action(&job->common, job->on_source_error, + true, error); +- } else { ++ } else if (job->target) { + return block_job_error_action(&job->common, job->on_target_error, + false, error); ++ } else { ++ return BLOCK_ERROR_ACTION_REPORT; + } + } + +@@ -612,6 +631,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + BlockdevOnError on_source_error, + BlockdevOnError on_target_error, + int creation_flags, ++ BackupDumpFunc *dump_cb, + 
BlockCompletionFunc *cb, void *opaque, + int pause_count, + JobTxn *txn, Error **errp) +@@ -622,7 +642,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + int ret; + + assert(bs); +- assert(target); ++ assert(target || dump_cb); + + if (bs == target) { + error_setg(errp, "Source and target cannot be the same"); +@@ -635,13 +655,13 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + return NULL; + } + +- if (!bdrv_is_inserted(target)) { ++ if (target && !bdrv_is_inserted(target)) { + error_setg(errp, "Device is not inserted: %s", + bdrv_get_device_name(target)); + return NULL; + } + +- if (compress && target->drv->bdrv_co_pwritev_compressed == NULL) { ++ if (target && compress && target->drv->bdrv_co_pwritev_compressed == NULL) { + error_setg(errp, "Compression is not supported for this drive %s", + bdrv_get_device_name(target)); + return NULL; +@@ -651,7 +671,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + return NULL; + } + +- if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) { ++ if (target && bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) { + return NULL; + } + +@@ -691,15 +711,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + goto error; + } + +- /* The target must match the source in size, so no resize here either */ +- job->target = blk_new(BLK_PERM_WRITE, +- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | +- BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD); +- ret = blk_insert_bs(job->target, target, errp); +- if (ret < 0) { +- goto error; ++ if (target) { ++ /* The target must match the source in size, so no resize here either */ ++ job->target = blk_new(BLK_PERM_WRITE, ++ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | ++ BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD); ++ ret = blk_insert_bs(job->target, target, errp); ++ if (ret < 0) { ++ goto error; ++ } + } + ++ job->dump_cb = dump_cb; + job->on_source_error = on_source_error; 
+ job->on_target_error = on_target_error; + job->sync_mode = sync_mode; +@@ -710,6 +733,9 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + /* Detect image-fleecing (and similar) schemes */ + job->serialize_target_writes = bdrv_chain_contains(target, bs); + ++ if (!target) { ++ goto use_default_cluster_size; ++ } + /* If there is no backing file on the target, we cannot rely on COW if our + * backup cluster size is smaller than the target cluster size. Even for + * targets with a backing file, try to avoid COW if possible. */ +@@ -734,18 +760,35 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + /* Not fatal; just trudge on ahead. */ + job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; + } else { +- job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size); +- } +- job->use_copy_range = true; +- job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk), +- blk_get_max_transfer(job->target)); +- job->copy_range_size = MAX(job->cluster_size, +- QEMU_ALIGN_UP(job->copy_range_size, +- job->cluster_size)); +- +- /* Required permissions are already taken with target's blk_new() */ +- block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, +- &error_abort); ++ use_default_cluster_size: ++ ret = bdrv_get_info(bs, &bdi); ++ if (ret < 0) { ++ job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; ++ } else { ++ /* round down to nearest BACKUP_CLUSTER_SIZE_DEFAULT */ ++ job->cluster_size = (bdi.cluster_size / BACKUP_CLUSTER_SIZE_DEFAULT) * BACKUP_CLUSTER_SIZE_DEFAULT; ++ if (job->cluster_size == 0) { ++ /* but we can't go below it */ ++ job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; ++ } ++ } ++ } ++ if (target) { ++ job->use_copy_range = true; ++ job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk), ++ blk_get_max_transfer(job->target)); ++ job->copy_range_size = MAX(job->cluster_size, ++ QEMU_ALIGN_UP(job->copy_range_size, ++ job->cluster_size)); ++ } else { ++ 
job->use_copy_range = false; ++ } ++ ++ if (target) { ++ /* Required permissions are already taken with target's blk_new() */ ++ block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, ++ &error_abort); ++ } + job->len = len; + job->common.job.pause_count += pause_count; + +diff --git a/block/replication.c b/block/replication.c +index 84e07cc4d4..04fa448a5b 100644 +--- a/block/replication.c ++++ b/block/replication.c +@@ -571,6 +571,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode, + 0, MIRROR_SYNC_MODE_NONE, NULL, false, + BLOCKDEV_ON_ERROR_REPORT, + BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL, ++ NULL, + backup_job_completed, bs, 0, NULL, &local_err); + if (local_err) { + error_propagate(errp, local_err); +diff --git a/blockdev.c b/blockdev.c +index 4f18d3c3d7..86508066cc 100644 +--- a/blockdev.c ++++ b/blockdev.c +@@ -31,7 +31,6 @@ + */ + + #include "qemu/osdep.h" +-#include "qemu/uuid.h" + #include "sysemu/block-backend.h" + #include "sysemu/blockdev.h" + #include "hw/block/block.h" +@@ -63,6 +62,7 @@ + #include "qemu/cutils.h" + #include "qemu/help_option.h" + #include "qemu/throttle-options.h" ++#include "vma.h" + + static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states = + QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states); +@@ -3228,15 +3228,14 @@ out: + static struct PVEBackupState { + Error *error; + bool cancel; +- QemuUUID uuid; ++ uuid_t uuid; + char uuid_str[37]; + int64_t speed; + time_t start_time; + time_t end_time; + char *backup_file; +- Object *vmaobj; ++ VmaWriter *vmaw; + GList *di_list; +- size_t next_job; + size_t total; + size_t transferred; + size_t zero_bytes; +@@ -3255,6 +3254,71 @@ typedef struct PVEBackupDevInfo { + + static void pvebackup_run_next_job(void); + ++static int pvebackup_dump_cb(void *opaque, BlockBackend *target, ++ uint64_t start, uint64_t bytes, ++ const void *pbuf) ++{ ++ const uint64_t size = bytes; ++ const unsigned char *buf = pbuf; ++ PVEBackupDevInfo *di = opaque; ++ ++ if 
(backup_state.cancel) { ++ return size; // return success ++ } ++ ++ uint64_t cluster_num = start / VMA_CLUSTER_SIZE; ++ if ((cluster_num * VMA_CLUSTER_SIZE) != start) { ++ if (!backup_state.error) { ++ error_setg(&backup_state.error, ++ "got unaligned write inside backup dump " ++ "callback (sector %ld)", start); ++ } ++ return -1; // not aligned to cluster size ++ } ++ ++ int ret = -1; ++ ++ if (backup_state.vmaw) { ++ size_t zero_bytes = 0; ++ uint64_t remaining = size; ++ while (remaining > 0) { ++ ret = vma_writer_write(backup_state.vmaw, di->dev_id, cluster_num, ++ buf, &zero_bytes); ++ ++cluster_num; ++ if (buf) { ++ buf += VMA_CLUSTER_SIZE; ++ } ++ if (ret < 0) { ++ if (!backup_state.error) { ++ vma_writer_error_propagate(backup_state.vmaw, &backup_state.error); ++ } ++ if (di->bs && di->bs->job) { ++ job_cancel(&di->bs->job->job, true); ++ } ++ break; ++ } else { ++ backup_state.zero_bytes += zero_bytes; ++ if (remaining >= VMA_CLUSTER_SIZE) { ++ backup_state.transferred += VMA_CLUSTER_SIZE; ++ remaining -= VMA_CLUSTER_SIZE; ++ } else { ++ backup_state.transferred += remaining; ++ remaining = 0; ++ } ++ } ++ } ++ } else { ++ if (!buf) { ++ backup_state.zero_bytes += size; ++ } ++ backup_state.transferred += size; ++ } ++ ++ // Note: always return success, because we want that writes succeed anyways. 
++ ++ return size; ++} ++ + static void pvebackup_cleanup(void) + { + qemu_mutex_lock(&backup_state.backup_mutex); +@@ -3266,9 +3330,11 @@ static void pvebackup_cleanup(void) + + backup_state.end_time = time(NULL); + +- if (backup_state.vmaobj) { +- object_unparent(backup_state.vmaobj); +- backup_state.vmaobj = NULL; ++ if (backup_state.vmaw) { ++ Error *local_err = NULL; ++ vma_writer_close(backup_state.vmaw, &local_err); ++ error_propagate(&backup_state.error, local_err); ++ backup_state.vmaw = NULL; + } + + g_list_free(backup_state.di_list); +@@ -3276,6 +3342,13 @@ static void pvebackup_cleanup(void) + qemu_mutex_unlock(&backup_state.backup_mutex); + } + ++static void coroutine_fn backup_close_vma_stream(void *opaque) ++{ ++ PVEBackupDevInfo *di = opaque; ++ ++ vma_writer_close_stream(backup_state.vmaw, di->dev_id); ++} ++ + static void pvebackup_complete_cb(void *opaque, int ret) + { + // This always runs in the main loop +@@ -3292,9 +3365,9 @@ static void pvebackup_complete_cb(void *opaque, int ret) + di->bs = NULL; + di->target = NULL; + +- if (backup_state.vmaobj) { +- object_unparent(backup_state.vmaobj); +- backup_state.vmaobj = NULL; ++ if (backup_state.vmaw) { ++ Coroutine *co = qemu_coroutine_create(backup_close_vma_stream, di); ++ qemu_coroutine_enter(co); + } + + // remove self from job queue +@@ -3322,14 +3395,9 @@ static void pvebackup_cancel(void *opaque) + error_setg(&backup_state.error, "backup cancelled"); + } + +- if (backup_state.vmaobj) { +- Error *err; ++ if (backup_state.vmaw) { + /* make sure vma writer does not block anymore */ +- if (!object_set_props(backup_state.vmaobj, &err, "blocked", "yes", NULL)) { +- if (err) { +- error_report_err(err); +- } +- } ++ vma_writer_set_error(backup_state.vmaw, "backup cancelled"); + } + + GList *l = backup_state.di_list; +@@ -3360,18 +3428,14 @@ void qmp_backup_cancel(Error **errp) + Coroutine *co = qemu_coroutine_create(pvebackup_cancel, NULL); + qemu_coroutine_enter(co); + +- while 
(backup_state.vmaobj) { +- /* FIXME: Find something better for this */ ++ while (backup_state.vmaw) { ++ /* vma writer use main aio context */ + aio_poll(qemu_get_aio_context(), true); + } + } + +-void vma_object_add_config_file(Object *obj, const char *name, +- const char *contents, size_t len, +- Error **errp); + static int config_to_vma(const char *file, BackupFormat format, +- Object *vmaobj, +- const char *backup_dir, ++ const char *backup_dir, VmaWriter *vmaw, + Error **errp) + { + char *cdata = NULL; +@@ -3385,7 +3449,12 @@ static int config_to_vma(const char *file, BackupFormat format, + char *basename = g_path_get_basename(file); + + if (format == BACKUP_FORMAT_VMA) { +- vma_object_add_config_file(vmaobj, basename, cdata, clen, errp); ++ if (vma_writer_add_config(vmaw, basename, cdata, clen) != 0) { ++ error_setg(errp, "unable to add %s config data to vma archive", file); ++ g_free(cdata); ++ g_free(basename); ++ return 1; ++ } + } else if (format == BACKUP_FORMAT_DIR) { + char config_path[PATH_MAX]; + snprintf(config_path, PATH_MAX, "%s/%s", backup_dir, basename); +@@ -3402,28 +3471,30 @@ static int config_to_vma(const char *file, BackupFormat format, + return 0; + } + ++bool job_should_pause(Job *job); + static void pvebackup_run_next_job(void) + { + qemu_mutex_lock(&backup_state.backup_mutex); + +- GList *next = g_list_nth(backup_state.di_list, backup_state.next_job); +- while (next) { +- PVEBackupDevInfo *di = (PVEBackupDevInfo *)next->data; +- backup_state.next_job++; ++ GList *l = backup_state.di_list; ++ while (l) { ++ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; ++ l = g_list_next(l); + if (!di->completed && di->bs && di->bs->job) { + BlockJob *job = di->bs->job; + AioContext *aio_context = blk_get_aio_context(job->blk); + aio_context_acquire(aio_context); + qemu_mutex_unlock(&backup_state.backup_mutex); +- if (backup_state.error || backup_state.cancel) { +- job_cancel_sync(job); +- } else { +- job_resume(job); ++ if 
(job_should_pause(&job->job)) { ++ if (backup_state.error || backup_state.cancel) { ++ job_cancel_sync(&job->job); ++ } else { ++ job_resume(&job->job); ++ } + } + aio_context_release(aio_context); + return; + } +- next = g_list_next(next); + } + qemu_mutex_unlock(&backup_state.backup_mutex); + +@@ -3434,7 +3505,7 @@ static void pvebackup_run_next_job(void) + UuidInfo *qmp_backup(const char *backup_file, bool has_format, + BackupFormat format, + bool has_config_file, const char *config_file, +- bool has_firewall_file, const char *firewall_file, ++ bool has_firewall_file, const char *firewall_file, + bool has_devlist, const char *devlist, + bool has_speed, int64_t speed, Error **errp) + { +@@ -3442,7 +3513,8 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + BlockDriverState *bs = NULL; + const char *backup_dir = NULL; + Error *local_err = NULL; +- QemuUUID uuid; ++ uuid_t uuid; ++ VmaWriter *vmaw = NULL; + gchar **devs = NULL; + GList *di_list = NULL; + GList *l; +@@ -3454,7 +3526,7 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + backup_state.backup_mutex_initialized = true; + } + +- if (backup_state.di_list || backup_state.vmaobj) { ++ if (backup_state.di_list) { + error_set(errp, ERROR_CLASS_GENERIC_ERROR, + "previous backup not finished"); + return NULL; +@@ -3529,40 +3601,28 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + total += size; + } + +- qemu_uuid_generate(&uuid); ++ uuid_generate(uuid); + + if (format == BACKUP_FORMAT_VMA) { +- char uuidstr[UUID_FMT_LEN+1]; +- qemu_uuid_unparse(&uuid, uuidstr); +- uuidstr[UUID_FMT_LEN] = 0; +- backup_state.vmaobj = +- object_new_with_props("vma", object_get_objects_root(), +- "vma-backup-obj", &local_err, +- "filename", backup_file, +- "uuid", uuidstr, +- NULL); +- if (!backup_state.vmaobj) { ++ vmaw = vma_writer_create(backup_file, uuid, &local_err); ++ if (!vmaw) { + if (local_err) { + error_propagate(errp, local_err); + } + goto err; + } + ++ /* register all 
devices for vma writer */ + l = di_list; + while (l) { +- QDict *options = qdict_new(); +- + PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; + l = g_list_next(l); + + const char *devname = bdrv_get_device_name(di->bs); +- snprintf(di->targetfile, PATH_MAX, "vma-backup-obj/%s.raw", devname); +- +- qdict_put(options, "driver", qstring_from_str("vma-drive")); +- qdict_put(options, "size", qint_from_int(di->size)); +- di->target = bdrv_open(di->targetfile, NULL, options, BDRV_O_RDWR, &local_err); +- if (!di->target) { +- error_propagate(errp, local_err); ++ di->dev_id = vma_writer_register_stream(vmaw, devname, di->size); ++ if (di->dev_id <= 0) { ++ error_set(errp, ERROR_CLASS_GENERIC_ERROR, ++ "register_stream failed"); + goto err; + } + } +@@ -3603,14 +3663,14 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + + /* add configuration file to archive */ + if (has_config_file) { +- if(config_to_vma(config_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { ++ if (config_to_vma(config_file, format, backup_dir, vmaw, errp) != 0) { + goto err; + } + } + + /* add firewall file to archive */ + if (has_firewall_file) { +- if(config_to_vma(firewall_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { ++ if (config_to_vma(firewall_file, format, backup_dir, vmaw, errp) != 0) { + goto err; + } + } +@@ -3633,12 +3693,13 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + } + backup_state.backup_file = g_strdup(backup_file); + +- memcpy(&backup_state.uuid, &uuid, sizeof(uuid)); +- qemu_uuid_unparse(&uuid, backup_state.uuid_str); ++ backup_state.vmaw = vmaw; ++ ++ uuid_copy(backup_state.uuid, uuid); ++ uuid_unparse_lower(uuid, backup_state.uuid_str); + + qemu_mutex_lock(&backup_state.backup_mutex); + backup_state.di_list = di_list; +- backup_state.next_job = 0; + + backup_state.total = total; + backup_state.transferred = 0; +@@ -3649,21 +3710,21 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, + while (l) { + 
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; + l = g_list_next(l); +- + job = backup_job_create(NULL, di->bs, di->target, speed, MIRROR_SYNC_MODE_FULL, NULL, + false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, + JOB_DEFAULT, +- pvebackup_complete_cb, di, 2, NULL, &local_err); +- if (di->target) { +- bdrv_unref(di->target); +- di->target = NULL; +- } ++ pvebackup_dump_cb, pvebackup_complete_cb, di, ++ 1, NULL, &local_err); + if (!job || local_err != NULL) { + error_setg(&backup_state.error, "backup_job_create failed"); + pvebackup_cancel(NULL); + } else { + job_start(&job->job); + } ++ if (di->target) { ++ bdrv_unref(di->target); ++ di->target = NULL; ++ } + } + + qemu_mutex_unlock(&backup_state.backup_mutex); +@@ -3699,9 +3760,10 @@ err: + g_strfreev(devs); + } + +- if (backup_state.vmaobj) { +- object_unparent(backup_state.vmaobj); +- backup_state.vmaobj = NULL; ++ if (vmaw) { ++ Error *err = NULL; ++ vma_writer_close(vmaw, &err); ++ unlink(backup_file); + } + + if (backup_dir) { +@@ -4104,7 +4166,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn, + job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, + backup->sync, bmap, backup->compress, + backup->on_source_error, backup->on_target_error, +- job_flags, NULL, NULL, 0, txn, &local_err); ++ job_flags, NULL, NULL, NULL, 0, txn, &local_err); + bdrv_unref(target_bs); + if (local_err != NULL) { + error_propagate(errp, local_err); +@@ -4196,7 +4258,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn, + job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, + backup->sync, NULL, backup->compress, + backup->on_source_error, backup->on_target_error, +- job_flags, NULL, NULL, 0, txn, &local_err); ++ job_flags, NULL, NULL, NULL, 0, txn, &local_err); + if (local_err != NULL) { + error_propagate(errp, local_err); + } +diff --git a/include/block/block_int.h b/include/block/block_int.h +index 0b2516c3cf..ecd6243440 100644 +--- 
a/include/block/block_int.h ++++ b/include/block/block_int.h +@@ -59,6 +59,9 @@ + + #define BLOCK_PROBE_BUF_SIZE 512 + ++typedef int BackupDumpFunc(void *opaque, BlockBackend *be, ++ uint64_t offset, uint64_t bytes, const void *buf); ++ + enum BdrvTrackedRequestType { + BDRV_TRACKED_READ, + BDRV_TRACKED_WRITE, +@@ -1082,6 +1085,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, + BlockdevOnError on_source_error, + BlockdevOnError on_target_error, + int creation_flags, ++ BackupDumpFunc *dump_cb, + BlockCompletionFunc *cb, void *opaque, + int pause_count, + JobTxn *txn, Error **errp); +diff --git a/job.c b/job.c +index 950924ebad..b4eaf57e64 100644 +--- a/job.c ++++ b/job.c +@@ -248,7 +248,8 @@ static bool job_started(Job *job) + return job->co; + } + +-static bool job_should_pause(Job *job) ++bool job_should_pause(Job *job); ++bool job_should_pause(Job *job) + { + return job->pause_count > 0; + } +diff --git a/vma-reader.c b/vma-reader.c +new file mode 100644 +index 0000000000..2b1d1cdab3 +--- /dev/null ++++ b/vma-reader.c +@@ -0,0 +1,857 @@ ++/* ++ * VMA: Virtual Machine Archive ++ * ++ * Copyright (C) 2012 Proxmox Server Solutions ++ * ++ * Authors: ++ * Dietmar Maurer (dietmar@proxmox.com) ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. 
++ * ++ */ ++ ++#include "qemu/osdep.h" ++#include ++#include ++ ++#include "qemu-common.h" ++#include "qemu/timer.h" ++#include "qemu/ratelimit.h" ++#include "vma.h" ++#include "block/block.h" ++#include "sysemu/block-backend.h" ++ ++static unsigned char zero_vma_block[VMA_BLOCK_SIZE]; ++ ++typedef struct VmaRestoreState { ++ BlockBackend *target; ++ bool write_zeroes; ++ unsigned long *bitmap; ++ int bitmap_size; ++} VmaRestoreState; ++ ++struct VmaReader { ++ int fd; ++ GChecksum *md5csum; ++ GHashTable *blob_hash; ++ unsigned char *head_data; ++ VmaDeviceInfo devinfo[256]; ++ VmaRestoreState rstate[256]; ++ GList *cdata_list; ++ guint8 vmstate_stream; ++ uint32_t vmstate_clusters; ++ /* to show restore percentage if run with -v */ ++ time_t start_time; ++ int64_t cluster_count; ++ int64_t clusters_read; ++ int64_t zero_cluster_data; ++ int64_t partial_zero_cluster_data; ++ int clusters_read_per; ++}; ++ ++static guint ++g_int32_hash(gconstpointer v) ++{ ++ return *(const uint32_t *)v; ++} ++ ++static gboolean ++g_int32_equal(gconstpointer v1, gconstpointer v2) ++{ ++ return *((const uint32_t *)v1) == *((const uint32_t *)v2); ++} ++ ++static int vma_reader_get_bitmap(VmaRestoreState *rstate, int64_t cluster_num) ++{ ++ assert(rstate); ++ assert(rstate->bitmap); ++ ++ unsigned long val, idx, bit; ++ ++ idx = cluster_num / BITS_PER_LONG; ++ ++ assert(rstate->bitmap_size > idx); ++ ++ bit = cluster_num % BITS_PER_LONG; ++ val = rstate->bitmap[idx]; ++ ++ return !!(val & (1UL << bit)); ++} ++ ++static void vma_reader_set_bitmap(VmaRestoreState *rstate, int64_t cluster_num, ++ int dirty) ++{ ++ assert(rstate); ++ assert(rstate->bitmap); ++ ++ unsigned long val, idx, bit; ++ ++ idx = cluster_num / BITS_PER_LONG; ++ ++ assert(rstate->bitmap_size > idx); ++ ++ bit = cluster_num % BITS_PER_LONG; ++ val = rstate->bitmap[idx]; ++ if (dirty) { ++ if (!(val & (1UL << bit))) { ++ val |= 1UL << bit; ++ } ++ } else { ++ if (val & (1UL << bit)) { ++ val &= ~(1UL << bit); ++ } ++ 
} ++ rstate->bitmap[idx] = val; ++} ++ ++typedef struct VmaBlob { ++ uint32_t start; ++ uint32_t len; ++ void *data; ++} VmaBlob; ++ ++static const VmaBlob *get_header_blob(VmaReader *vmar, uint32_t pos) ++{ ++ assert(vmar); ++ assert(vmar->blob_hash); ++ ++ return g_hash_table_lookup(vmar->blob_hash, &pos); ++} ++ ++static const char *get_header_str(VmaReader *vmar, uint32_t pos) ++{ ++ const VmaBlob *blob = get_header_blob(vmar, pos); ++ if (!blob) { ++ return NULL; ++ } ++ const char *res = (char *)blob->data; ++ if (res[blob->len-1] != '\0') { ++ return NULL; ++ } ++ return res; ++} ++ ++static ssize_t ++safe_read(int fd, unsigned char *buf, size_t count) ++{ ++ ssize_t n; ++ ++ do { ++ n = read(fd, buf, count); ++ } while (n < 0 && errno == EINTR); ++ ++ return n; ++} ++ ++static ssize_t ++full_read(int fd, unsigned char *buf, size_t len) ++{ ++ ssize_t n; ++ size_t total; ++ ++ total = 0; ++ ++ while (len > 0) { ++ n = safe_read(fd, buf, len); ++ ++ if (n == 0) { ++ return total; ++ } ++ ++ if (n <= 0) { ++ break; ++ } ++ ++ buf += n; ++ total += n; ++ len -= n; ++ } ++ ++ if (len) { ++ return -1; ++ } ++ ++ return total; ++} ++ ++void vma_reader_destroy(VmaReader *vmar) ++{ ++ assert(vmar); ++ ++ if (vmar->fd >= 0) { ++ close(vmar->fd); ++ } ++ ++ if (vmar->cdata_list) { ++ g_list_free(vmar->cdata_list); ++ } ++ ++ int i; ++ for (i = 1; i < 256; i++) { ++ if (vmar->rstate[i].bitmap) { ++ g_free(vmar->rstate[i].bitmap); ++ } ++ } ++ ++ if (vmar->md5csum) { ++ g_checksum_free(vmar->md5csum); ++ } ++ ++ if (vmar->blob_hash) { ++ g_hash_table_destroy(vmar->blob_hash); ++ } ++ ++ if (vmar->head_data) { ++ g_free(vmar->head_data); ++ } ++ ++ g_free(vmar); ++ ++}; ++ ++static int vma_reader_read_head(VmaReader *vmar, Error **errp) ++{ ++ assert(vmar); ++ assert(errp); ++ assert(*errp == NULL); ++ ++ unsigned char md5sum[16]; ++ int i; ++ int ret = 0; ++ ++ vmar->head_data = g_malloc(sizeof(VmaHeader)); ++ ++ if (full_read(vmar->fd, vmar->head_data, 
sizeof(VmaHeader)) != ++ sizeof(VmaHeader)) { ++ error_setg(errp, "can't read vma header - %s", ++ errno ? g_strerror(errno) : "got EOF"); ++ return -1; ++ } ++ ++ VmaHeader *h = (VmaHeader *)vmar->head_data; ++ ++ if (h->magic != VMA_MAGIC) { ++ error_setg(errp, "not a vma file - wrong magic number"); ++ return -1; ++ } ++ ++ uint32_t header_size = GUINT32_FROM_BE(h->header_size); ++ int need = header_size - sizeof(VmaHeader); ++ if (need <= 0) { ++ error_setg(errp, "wrong vma header size %d", header_size); ++ return -1; ++ } ++ ++ vmar->head_data = g_realloc(vmar->head_data, header_size); ++ h = (VmaHeader *)vmar->head_data; ++ ++ if (full_read(vmar->fd, vmar->head_data + sizeof(VmaHeader), need) != ++ need) { ++ error_setg(errp, "can't read vma header data - %s", ++ errno ? g_strerror(errno) : "got EOF"); ++ return -1; ++ } ++ ++ memcpy(md5sum, h->md5sum, 16); ++ memset(h->md5sum, 0, 16); ++ ++ g_checksum_reset(vmar->md5csum); ++ g_checksum_update(vmar->md5csum, vmar->head_data, header_size); ++ gsize csize = 16; ++ g_checksum_get_digest(vmar->md5csum, (guint8 *)(h->md5sum), &csize); ++ ++ if (memcmp(md5sum, h->md5sum, 16) != 0) { ++ error_setg(errp, "wrong vma header chechsum"); ++ return -1; ++ } ++ ++ /* we can modify header data after checksum verify */ ++ h->header_size = header_size; ++ ++ h->version = GUINT32_FROM_BE(h->version); ++ if (h->version != 1) { ++ error_setg(errp, "wrong vma version %d", h->version); ++ return -1; ++ } ++ ++ h->ctime = GUINT64_FROM_BE(h->ctime); ++ h->blob_buffer_offset = GUINT32_FROM_BE(h->blob_buffer_offset); ++ h->blob_buffer_size = GUINT32_FROM_BE(h->blob_buffer_size); ++ ++ uint32_t bstart = h->blob_buffer_offset + 1; ++ uint32_t bend = h->blob_buffer_offset + h->blob_buffer_size; ++ ++ if (bstart <= sizeof(VmaHeader)) { ++ error_setg(errp, "wrong vma blob buffer offset %d", ++ h->blob_buffer_offset); ++ return -1; ++ } ++ ++ if (bend > header_size) { ++ error_setg(errp, "wrong vma blob buffer size %d/%d", ++ 
h->blob_buffer_offset, h->blob_buffer_size); ++ return -1; ++ } ++ ++ while ((bstart + 2) <= bend) { ++ uint32_t size = vmar->head_data[bstart] + ++ (vmar->head_data[bstart+1] << 8); ++ if ((bstart + size + 2) <= bend) { ++ VmaBlob *blob = g_new0(VmaBlob, 1); ++ blob->start = bstart - h->blob_buffer_offset; ++ blob->len = size; ++ blob->data = vmar->head_data + bstart + 2; ++ g_hash_table_insert(vmar->blob_hash, &blob->start, blob); ++ } ++ bstart += size + 2; ++ } ++ ++ ++ int count = 0; ++ for (i = 1; i < 256; i++) { ++ VmaDeviceInfoHeader *dih = &h->dev_info[i]; ++ uint32_t devname_ptr = GUINT32_FROM_BE(dih->devname_ptr); ++ uint64_t size = GUINT64_FROM_BE(dih->size); ++ const char *devname = get_header_str(vmar, devname_ptr); ++ ++ if (size && devname) { ++ count++; ++ vmar->devinfo[i].size = size; ++ vmar->devinfo[i].devname = devname; ++ ++ if (strcmp(devname, "vmstate") == 0) { ++ vmar->vmstate_stream = i; ++ } ++ } ++ } ++ ++ for (i = 0; i < VMA_MAX_CONFIGS; i++) { ++ uint32_t name_ptr = GUINT32_FROM_BE(h->config_names[i]); ++ uint32_t data_ptr = GUINT32_FROM_BE(h->config_data[i]); ++ ++ if (!(name_ptr && data_ptr)) { ++ continue; ++ } ++ const char *name = get_header_str(vmar, name_ptr); ++ const VmaBlob *blob = get_header_blob(vmar, data_ptr); ++ ++ if (!(name && blob)) { ++ error_setg(errp, "vma contains invalid data pointers"); ++ return -1; ++ } ++ ++ VmaConfigData *cdata = g_new0(VmaConfigData, 1); ++ cdata->name = name; ++ cdata->data = blob->data; ++ cdata->len = blob->len; ++ ++ vmar->cdata_list = g_list_append(vmar->cdata_list, cdata); ++ } ++ ++ return ret; ++}; ++ ++VmaReader *vma_reader_create(const char *filename, Error **errp) ++{ ++ assert(filename); ++ assert(errp); ++ ++ VmaReader *vmar = g_new0(VmaReader, 1); ++ ++ if (strcmp(filename, "-") == 0) { ++ vmar->fd = dup(0); ++ } else { ++ vmar->fd = open(filename, O_RDONLY); ++ } ++ ++ if (vmar->fd < 0) { ++ error_setg(errp, "can't open file %s - %s\n", filename, ++ g_strerror(errno)); ++ 
goto err; ++ } ++ ++ vmar->md5csum = g_checksum_new(G_CHECKSUM_MD5); ++ if (!vmar->md5csum) { ++ error_setg(errp, "can't allocate cmsum\n"); ++ goto err; ++ } ++ ++ vmar->blob_hash = g_hash_table_new_full(g_int32_hash, g_int32_equal, ++ NULL, g_free); ++ ++ if (vma_reader_read_head(vmar, errp) < 0) { ++ goto err; ++ } ++ ++ return vmar; ++ ++err: ++ if (vmar) { ++ vma_reader_destroy(vmar); ++ } ++ ++ return NULL; ++} ++ ++VmaHeader *vma_reader_get_header(VmaReader *vmar) ++{ ++ assert(vmar); ++ assert(vmar->head_data); ++ ++ return (VmaHeader *)(vmar->head_data); ++} ++ ++GList *vma_reader_get_config_data(VmaReader *vmar) ++{ ++ assert(vmar); ++ assert(vmar->head_data); ++ ++ return vmar->cdata_list; ++} ++ ++VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id) ++{ ++ assert(vmar); ++ assert(dev_id); ++ ++ if (vmar->devinfo[dev_id].size && vmar->devinfo[dev_id].devname) { ++ return &vmar->devinfo[dev_id]; ++ } ++ ++ return NULL; ++} ++ ++static void allocate_rstate(VmaReader *vmar, guint8 dev_id, ++ BlockBackend *target, bool write_zeroes) ++{ ++ assert(vmar); ++ assert(dev_id); ++ ++ vmar->rstate[dev_id].target = target; ++ vmar->rstate[dev_id].write_zeroes = write_zeroes; ++ ++ int64_t size = vmar->devinfo[dev_id].size; ++ ++ int64_t bitmap_size = (size/BDRV_SECTOR_SIZE) + ++ (VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE) * BITS_PER_LONG - 1; ++ bitmap_size /= (VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE) * BITS_PER_LONG; ++ ++ vmar->rstate[dev_id].bitmap_size = bitmap_size; ++ vmar->rstate[dev_id].bitmap = g_new0(unsigned long, bitmap_size); ++ ++ vmar->cluster_count += size/VMA_CLUSTER_SIZE; ++} ++ ++int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target, ++ bool write_zeroes, Error **errp) ++{ ++ assert(vmar); ++ assert(target != NULL); ++ assert(dev_id); ++ assert(vmar->rstate[dev_id].target == NULL); ++ ++ int64_t size = blk_getlength(target); ++ int64_t size_diff = size - vmar->devinfo[dev_id].size; ++ ++ /* storage types can have 
different size restrictions, so it ++ * is not always possible to create an image with exact size. ++ * So we tolerate a size difference up to 4MB. ++ */ ++ if ((size_diff < 0) || (size_diff > 4*1024*1024)) { ++ error_setg(errp, "vma_reader_register_bs for stream %s failed - " ++ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname, ++ size, vmar->devinfo[dev_id].size); ++ return -1; ++ } ++ ++ allocate_rstate(vmar, dev_id, target, write_zeroes); ++ ++ return 0; ++} ++ ++static ssize_t safe_write(int fd, void *buf, size_t count) ++{ ++ ssize_t n; ++ ++ do { ++ n = write(fd, buf, count); ++ } while (n < 0 && errno == EINTR); ++ ++ return n; ++} ++ ++static size_t full_write(int fd, void *buf, size_t len) ++{ ++ ssize_t n; ++ size_t total; ++ ++ total = 0; ++ ++ while (len > 0) { ++ n = safe_write(fd, buf, len); ++ if (n < 0) { ++ return n; ++ } ++ buf += n; ++ total += n; ++ len -= n; ++ } ++ ++ if (len) { ++ /* incomplete write ? */ ++ return -1; ++ } ++ ++ return total; ++} ++ ++static int restore_write_data(VmaReader *vmar, guint8 dev_id, ++ BlockBackend *target, int vmstate_fd, ++ unsigned char *buf, int64_t sector_num, ++ int nb_sectors, Error **errp) ++{ ++ assert(vmar); ++ ++ if (dev_id == vmar->vmstate_stream) { ++ if (vmstate_fd >= 0) { ++ int len = nb_sectors * BDRV_SECTOR_SIZE; ++ int res = full_write(vmstate_fd, buf, len); ++ if (res < 0) { ++ error_setg(errp, "write vmstate failed %d", res); ++ return -1; ++ } ++ } ++ } else { ++ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, buf, nb_sectors * BDRV_SECTOR_SIZE, 0); ++ if (res < 0) { ++ error_setg(errp, "blk_pwrite to %s failed (%d)", ++ bdrv_get_device_name(blk_bs(target)), res); ++ return -1; ++ } ++ } ++ return 0; ++} ++ ++static int restore_extent(VmaReader *vmar, unsigned char *buf, ++ int extent_size, int vmstate_fd, ++ bool verbose, bool verify, Error **errp) ++{ ++ assert(vmar); ++ assert(buf); ++ ++ VmaExtentHeader *ehead = (VmaExtentHeader *)buf; ++ int start = 
VMA_EXTENT_HEADER_SIZE; ++ int i; ++ ++ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { ++ uint64_t block_info = GUINT64_FROM_BE(ehead->blockinfo[i]); ++ uint64_t cluster_num = block_info & 0xffffffff; ++ uint8_t dev_id = (block_info >> 32) & 0xff; ++ uint16_t mask = block_info >> (32+16); ++ int64_t max_sector; ++ ++ if (!dev_id) { ++ continue; ++ } ++ ++ VmaRestoreState *rstate = &vmar->rstate[dev_id]; ++ BlockBackend *target = NULL; ++ ++ if (dev_id != vmar->vmstate_stream) { ++ target = rstate->target; ++ if (!verify && !target) { ++ error_setg(errp, "got wrong dev id %d", dev_id); ++ return -1; ++ } ++ ++ if (vma_reader_get_bitmap(rstate, cluster_num)) { ++ error_setg(errp, "found duplicated cluster %zd for stream %s", ++ cluster_num, vmar->devinfo[dev_id].devname); ++ return -1; ++ } ++ vma_reader_set_bitmap(rstate, cluster_num, 1); ++ ++ max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE; ++ } else { ++ max_sector = G_MAXINT64; ++ if (cluster_num != vmar->vmstate_clusters) { ++ error_setg(errp, "found out of order vmstate data"); ++ return -1; ++ } ++ vmar->vmstate_clusters++; ++ } ++ ++ vmar->clusters_read++; ++ ++ if (verbose) { ++ time_t duration = time(NULL) - vmar->start_time; ++ int percent = (vmar->clusters_read*100)/vmar->cluster_count; ++ if (percent != vmar->clusters_read_per) { ++ printf("progress %d%% (read %zd bytes, duration %zd sec)\n", ++ percent, vmar->clusters_read*VMA_CLUSTER_SIZE, ++ duration); ++ fflush(stdout); ++ vmar->clusters_read_per = percent; ++ } ++ } ++ ++ /* try to write whole clusters to speedup restore */ ++ if (mask == 0xffff) { ++ if ((start + VMA_CLUSTER_SIZE) > extent_size) { ++ error_setg(errp, "short vma extent - too many blocks"); ++ return -1; ++ } ++ int64_t sector_num = (cluster_num * VMA_CLUSTER_SIZE) / ++ BDRV_SECTOR_SIZE; ++ int64_t end_sector = sector_num + ++ VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE; ++ ++ if (end_sector > max_sector) { ++ end_sector = max_sector; ++ } ++ ++ if (end_sector <= sector_num) { ++ 
error_setg(errp, "got wrong block address - write beyond end"); ++ return -1; ++ } ++ ++ if (!verify) { ++ int nb_sectors = end_sector - sector_num; ++ if (restore_write_data(vmar, dev_id, target, vmstate_fd, ++ buf + start, sector_num, nb_sectors, ++ errp) < 0) { ++ return -1; ++ } ++ } ++ ++ start += VMA_CLUSTER_SIZE; ++ } else { ++ int j; ++ int bit = 1; ++ ++ for (j = 0; j < 16; j++) { ++ int64_t sector_num = (cluster_num*VMA_CLUSTER_SIZE + ++ j*VMA_BLOCK_SIZE)/BDRV_SECTOR_SIZE; ++ ++ int64_t end_sector = sector_num + ++ VMA_BLOCK_SIZE/BDRV_SECTOR_SIZE; ++ if (end_sector > max_sector) { ++ end_sector = max_sector; ++ } ++ ++ if (mask & bit) { ++ if ((start + VMA_BLOCK_SIZE) > extent_size) { ++ error_setg(errp, "short vma extent - too many blocks"); ++ return -1; ++ } ++ ++ if (end_sector <= sector_num) { ++ error_setg(errp, "got wrong block address - " ++ "write beyond end"); ++ return -1; ++ } ++ ++ if (!verify) { ++ int nb_sectors = end_sector - sector_num; ++ if (restore_write_data(vmar, dev_id, target, vmstate_fd, ++ buf + start, sector_num, ++ nb_sectors, errp) < 0) { ++ return -1; ++ } ++ } ++ ++ start += VMA_BLOCK_SIZE; ++ ++ } else { ++ ++ ++ if (end_sector > sector_num) { ++ /* Todo: use bdrv_co_write_zeroes (but that need to ++ * be run inside coroutine?) 
++ */ ++ int nb_sectors = end_sector - sector_num; ++ int zero_size = BDRV_SECTOR_SIZE*nb_sectors; ++ vmar->zero_cluster_data += zero_size; ++ if (mask != 0) { ++ vmar->partial_zero_cluster_data += zero_size; ++ } ++ ++ if (rstate->write_zeroes && !verify) { ++ if (restore_write_data(vmar, dev_id, target, vmstate_fd, ++ zero_vma_block, sector_num, ++ nb_sectors, errp) < 0) { ++ return -1; ++ } ++ } ++ } ++ } ++ ++ bit = bit << 1; ++ } ++ } ++ } ++ ++ if (start != extent_size) { ++ error_setg(errp, "vma extent error - missing blocks"); ++ return -1; ++ } ++ ++ return 0; ++} ++ ++static int vma_reader_restore_full(VmaReader *vmar, int vmstate_fd, ++ bool verbose, bool verify, ++ Error **errp) ++{ ++ assert(vmar); ++ assert(vmar->head_data); ++ ++ int ret = 0; ++ unsigned char buf[VMA_MAX_EXTENT_SIZE]; ++ int buf_pos = 0; ++ unsigned char md5sum[16]; ++ VmaHeader *h = (VmaHeader *)vmar->head_data; ++ ++ vmar->start_time = time(NULL); ++ ++ while (1) { ++ int bytes = full_read(vmar->fd, buf + buf_pos, sizeof(buf) - buf_pos); ++ if (bytes < 0) { ++ error_setg(errp, "read failed - %s", g_strerror(errno)); ++ return -1; ++ } ++ ++ buf_pos += bytes; ++ ++ if (!buf_pos) { ++ break; /* EOF */ ++ } ++ ++ if (buf_pos < VMA_EXTENT_HEADER_SIZE) { ++ error_setg(errp, "read short extent (%d bytes)", buf_pos); ++ return -1; ++ } ++ ++ VmaExtentHeader *ehead = (VmaExtentHeader *)buf; ++ ++ /* extract md5sum */ ++ memcpy(md5sum, ehead->md5sum, sizeof(ehead->md5sum)); ++ memset(ehead->md5sum, 0, sizeof(ehead->md5sum)); ++ ++ g_checksum_reset(vmar->md5csum); ++ g_checksum_update(vmar->md5csum, buf, VMA_EXTENT_HEADER_SIZE); ++ gsize csize = 16; ++ g_checksum_get_digest(vmar->md5csum, ehead->md5sum, &csize); ++ ++ if (memcmp(md5sum, ehead->md5sum, 16) != 0) { ++ error_setg(errp, "wrong vma extent header chechsum"); ++ return -1; ++ } ++ ++ if (memcmp(h->uuid, ehead->uuid, sizeof(ehead->uuid)) != 0) { ++ error_setg(errp, "wrong vma extent uuid"); ++ return -1; ++ } ++ ++ if (ehead->magic 
!= VMA_EXTENT_MAGIC || ehead->reserved1 != 0) { ++ error_setg(errp, "wrong vma extent header magic"); ++ return -1; ++ } ++ ++ int block_count = GUINT16_FROM_BE(ehead->block_count); ++ int extent_size = VMA_EXTENT_HEADER_SIZE + block_count*VMA_BLOCK_SIZE; ++ ++ if (buf_pos < extent_size) { ++ error_setg(errp, "short vma extent (%d < %d)", buf_pos, ++ extent_size); ++ return -1; ++ } ++ ++ if (restore_extent(vmar, buf, extent_size, vmstate_fd, verbose, ++ verify, errp) < 0) { ++ return -1; ++ } ++ ++ if (buf_pos > extent_size) { ++ memmove(buf, buf + extent_size, buf_pos - extent_size); ++ buf_pos = buf_pos - extent_size; ++ } else { ++ buf_pos = 0; ++ } ++ } ++ ++ bdrv_drain_all(); ++ ++ int i; ++ for (i = 1; i < 256; i++) { ++ VmaRestoreState *rstate = &vmar->rstate[i]; ++ if (!rstate->target) { ++ continue; ++ } ++ ++ if (blk_flush(rstate->target) < 0) { ++ error_setg(errp, "vma blk_flush %s failed", ++ vmar->devinfo[i].devname); ++ return -1; ++ } ++ ++ if (vmar->devinfo[i].size && ++ (strcmp(vmar->devinfo[i].devname, "vmstate") != 0)) { ++ assert(rstate->bitmap); ++ ++ int64_t cluster_num, end; ++ ++ end = (vmar->devinfo[i].size + VMA_CLUSTER_SIZE - 1) / ++ VMA_CLUSTER_SIZE; ++ ++ for (cluster_num = 0; cluster_num < end; cluster_num++) { ++ if (!vma_reader_get_bitmap(rstate, cluster_num)) { ++ error_setg(errp, "detected missing cluster %zd " ++ "for stream %s", cluster_num, ++ vmar->devinfo[i].devname); ++ return -1; ++ } ++ } ++ } ++ } ++ ++ if (verbose) { ++ if (vmar->clusters_read) { ++ printf("total bytes read %zd, sparse bytes %zd (%.3g%%)\n", ++ vmar->clusters_read*VMA_CLUSTER_SIZE, ++ vmar->zero_cluster_data, ++ (double)(100.0*vmar->zero_cluster_data)/ ++ (vmar->clusters_read*VMA_CLUSTER_SIZE)); ++ ++ int64_t datasize = vmar->clusters_read*VMA_CLUSTER_SIZE-vmar->zero_cluster_data; ++ if (datasize) { // this does not make sense for empty files ++ printf("space reduction due to 4K zero blocks %.3g%%\n", ++ (double)(100.0*vmar->partial_zero_cluster_data) / 
datasize); ++ } ++ } else { ++ printf("vma archive contains no image data\n"); ++ } ++ } ++ return ret; ++} ++ ++int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose, ++ Error **errp) ++{ ++ return vma_reader_restore_full(vmar, vmstate_fd, verbose, false, errp); ++} ++ ++int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp) ++{ ++ guint8 dev_id; ++ ++ for (dev_id = 1; dev_id < 255; dev_id++) { ++ if (vma_reader_get_device_info(vmar, dev_id)) { ++ allocate_rstate(vmar, dev_id, NULL, false); ++ } ++ } ++ ++ return vma_reader_restore_full(vmar, -1, verbose, true, errp); ++} ++ +diff --git a/vma-writer.c b/vma-writer.c +new file mode 100644 +index 0000000000..fd9567634d +--- /dev/null ++++ b/vma-writer.c +@@ -0,0 +1,771 @@ ++/* ++ * VMA: Virtual Machine Archive ++ * ++ * Copyright (C) 2012 Proxmox Server Solutions ++ * ++ * Authors: ++ * Dietmar Maurer (dietmar@proxmox.com) ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. 
++ * ++ */ ++ ++#include "qemu/osdep.h" ++#include ++#include ++ ++#include "vma.h" ++#include "block/block.h" ++#include "monitor/monitor.h" ++#include "qemu/main-loop.h" ++#include "qemu/coroutine.h" ++#include "qemu/cutils.h" ++ ++#define DEBUG_VMA 0 ++ ++#define DPRINTF(fmt, ...)\ ++ do { if (DEBUG_VMA) { printf("vma: " fmt, ## __VA_ARGS__); } } while (0) ++ ++#define WRITE_BUFFERS 5 ++#define HEADER_CLUSTERS 8 ++#define HEADERBUF_SIZE (VMA_CLUSTER_SIZE*HEADER_CLUSTERS) ++ ++struct VmaWriter { ++ int fd; ++ FILE *cmd; ++ int status; ++ char errmsg[8192]; ++ uuid_t uuid; ++ bool header_written; ++ bool closed; ++ ++ /* we always write extents */ ++ unsigned char *outbuf; ++ int outbuf_pos; /* in bytes */ ++ int outbuf_count; /* in VMA_BLOCKS */ ++ uint64_t outbuf_block_info[VMA_BLOCKS_PER_EXTENT]; ++ ++ unsigned char *headerbuf; ++ ++ GChecksum *md5csum; ++ CoMutex flush_lock; ++ Coroutine *co_writer; ++ ++ /* drive information */ ++ VmaStreamInfo stream_info[256]; ++ guint stream_count; ++ ++ guint8 vmstate_stream; ++ uint32_t vmstate_clusters; ++ ++ /* header blob table */ ++ char *header_blob_table; ++ uint32_t header_blob_table_size; ++ uint32_t header_blob_table_pos; ++ ++ /* store for config blobs */ ++ uint32_t config_names[VMA_MAX_CONFIGS]; /* offset into blob_buffer table */ ++ uint32_t config_data[VMA_MAX_CONFIGS]; /* offset into blob_buffer table */ ++ uint32_t config_count; ++}; ++ ++void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...) 
++{ ++ va_list ap; ++ ++ if (vmaw->status < 0) { ++ return; ++ } ++ ++ vmaw->status = -1; ++ ++ va_start(ap, fmt); ++ g_vsnprintf(vmaw->errmsg, sizeof(vmaw->errmsg), fmt, ap); ++ va_end(ap); ++ ++ DPRINTF("vma_writer_set_error: %s\n", vmaw->errmsg); ++} ++ ++static uint32_t allocate_header_blob(VmaWriter *vmaw, const char *data, ++ size_t len) ++{ ++ if (len > 65535) { ++ return 0; ++ } ++ ++ if (!vmaw->header_blob_table || ++ (vmaw->header_blob_table_size < ++ (vmaw->header_blob_table_pos + len + 2))) { ++ int newsize = vmaw->header_blob_table_size + ((len + 2 + 511)/512)*512; ++ ++ vmaw->header_blob_table = g_realloc(vmaw->header_blob_table, newsize); ++ memset(vmaw->header_blob_table + vmaw->header_blob_table_size, ++ 0, newsize - vmaw->header_blob_table_size); ++ vmaw->header_blob_table_size = newsize; ++ } ++ ++ uint32_t cpos = vmaw->header_blob_table_pos; ++ vmaw->header_blob_table[cpos] = len & 255; ++ vmaw->header_blob_table[cpos+1] = (len >> 8) & 255; ++ memcpy(vmaw->header_blob_table + cpos + 2, data, len); ++ vmaw->header_blob_table_pos += len + 2; ++ return cpos; ++} ++ ++static uint32_t allocate_header_string(VmaWriter *vmaw, const char *str) ++{ ++ assert(vmaw); ++ ++ size_t len = strlen(str) + 1; ++ ++ return allocate_header_blob(vmaw, str, len); ++} ++ ++int vma_writer_add_config(VmaWriter *vmaw, const char *name, gpointer data, ++ gsize len) ++{ ++ assert(vmaw); ++ assert(!vmaw->header_written); ++ assert(vmaw->config_count < VMA_MAX_CONFIGS); ++ assert(name); ++ assert(data); ++ ++ gchar *basename = g_path_get_basename(name); ++ uint32_t name_ptr = allocate_header_string(vmaw, basename); ++ g_free(basename); ++ ++ if (!name_ptr) { ++ return -1; ++ } ++ ++ uint32_t data_ptr = allocate_header_blob(vmaw, data, len); ++ if (!data_ptr) { ++ return -1; ++ } ++ ++ vmaw->config_names[vmaw->config_count] = name_ptr; ++ vmaw->config_data[vmaw->config_count] = data_ptr; ++ ++ vmaw->config_count++; ++ ++ return 0; ++} ++ ++int 
vma_writer_register_stream(VmaWriter *vmaw, const char *devname, ++ size_t size) ++{ ++ assert(vmaw); ++ assert(devname); ++ assert(!vmaw->status); ++ ++ if (vmaw->header_written) { ++ vma_writer_set_error(vmaw, "vma_writer_register_stream: header " ++ "already written"); ++ return -1; ++ } ++ ++ guint n = vmaw->stream_count + 1; ++ ++ /* we can have dev_ids from 1 to 255 (0 reserved) ++ * 255(-1) reserved for safety ++ */ ++ if (n > 254) { ++ vma_writer_set_error(vmaw, "vma_writer_register_stream: " ++ "too many drives"); ++ return -1; ++ } ++ ++ if (size <= 0) { ++ vma_writer_set_error(vmaw, "vma_writer_register_stream: " ++ "got strange size %zd", size); ++ return -1; ++ } ++ ++ DPRINTF("vma_writer_register_stream %s %zu %d\n", devname, size, n); ++ ++ vmaw->stream_info[n].devname = g_strdup(devname); ++ vmaw->stream_info[n].size = size; ++ ++ vmaw->stream_info[n].cluster_count = (size + VMA_CLUSTER_SIZE - 1) / ++ VMA_CLUSTER_SIZE; ++ ++ vmaw->stream_count = n; ++ ++ if (strcmp(devname, "vmstate") == 0) { ++ vmaw->vmstate_stream = n; ++ } ++ ++ return n; ++} ++ ++static void vma_co_continue_write(void *opaque) ++{ ++ VmaWriter *vmaw = opaque; ++ ++ DPRINTF("vma_co_continue_write\n"); ++ qemu_coroutine_enter(vmaw->co_writer); ++} ++ ++static ssize_t coroutine_fn ++vma_queue_write(VmaWriter *vmaw, const void *buf, size_t bytes) ++{ ++ DPRINTF("vma_queue_write enter %zd\n", bytes); ++ ++ assert(vmaw); ++ assert(buf); ++ assert(bytes <= VMA_MAX_EXTENT_SIZE); ++ ++ size_t done = 0; ++ ssize_t ret; ++ ++ assert(vmaw->co_writer == NULL); ++ ++ vmaw->co_writer = qemu_coroutine_self(); ++ ++ while (done < bytes) { ++ aio_set_fd_handler(qemu_get_aio_context(), vmaw->fd, false, NULL, vma_co_continue_write, NULL, vmaw); ++ qemu_coroutine_yield(); ++ aio_set_fd_handler(qemu_get_aio_context(), vmaw->fd, false, NULL, NULL, NULL, NULL); ++ if (vmaw->status < 0) { ++ DPRINTF("vma_queue_write detected canceled backup\n"); ++ done = -1; ++ break; ++ } ++ ret = write(vmaw->fd, buf 
+ done, bytes - done); ++ if (ret > 0) { ++ done += ret; ++ DPRINTF("vma_queue_write written %zd %zd\n", done, ret); ++ } else if (ret < 0) { ++ if (errno == EAGAIN || errno == EWOULDBLOCK) { ++ /* try again */ ++ } else { ++ vma_writer_set_error(vmaw, "vma_queue_write: write error - %s", ++ g_strerror(errno)); ++ done = -1; /* always return failure for partial writes */ ++ break; ++ } ++ } else if (ret == 0) { ++ /* should not happen - simply try again */ ++ } ++ } ++ ++ vmaw->co_writer = NULL; ++ ++ return (done == bytes) ? bytes : -1; ++} ++ ++VmaWriter *vma_writer_create(const char *filename, uuid_t uuid, Error **errp) ++{ ++ const char *p; ++ ++ assert(sizeof(VmaHeader) == (4096 + 8192)); ++ assert(G_STRUCT_OFFSET(VmaHeader, config_names) == 2044); ++ assert(G_STRUCT_OFFSET(VmaHeader, config_data) == 3068); ++ assert(G_STRUCT_OFFSET(VmaHeader, dev_info) == 4096); ++ assert(sizeof(VmaExtentHeader) == 512); ++ ++ VmaWriter *vmaw = g_new0(VmaWriter, 1); ++ vmaw->fd = -1; ++ ++ vmaw->md5csum = g_checksum_new(G_CHECKSUM_MD5); ++ if (!vmaw->md5csum) { ++ error_setg(errp, "can't allocate cmsum\n"); ++ goto err; ++ } ++ ++ if (strstart(filename, "exec:", &p)) { ++ vmaw->cmd = popen(p, "w"); ++ if (vmaw->cmd == NULL) { ++ error_setg(errp, "can't popen command '%s' - %s\n", p, ++ g_strerror(errno)); ++ goto err; ++ } ++ vmaw->fd = fileno(vmaw->cmd); ++ ++ /* try to use O_NONBLOCK */ ++ fcntl(vmaw->fd, F_SETFL, fcntl(vmaw->fd, F_GETFL)|O_NONBLOCK); ++ ++ } else { ++ struct stat st; ++ int oflags; ++ const char *tmp_id_str; ++ ++ if ((stat(filename, &st) == 0) && S_ISFIFO(st.st_mode)) { ++ oflags = O_NONBLOCK|O_WRONLY; ++ vmaw->fd = qemu_open(filename, oflags, 0644); ++ } else if (strstart(filename, "/dev/fdset/", &tmp_id_str)) { ++ oflags = O_NONBLOCK|O_WRONLY; ++ vmaw->fd = qemu_open(filename, oflags, 0644); ++ } else if (strstart(filename, "/dev/fdname/", &tmp_id_str)) { ++ vmaw->fd = monitor_get_fd(cur_mon, tmp_id_str, errp); ++ if (vmaw->fd < 0) { ++ goto err; ++ } 
++ /* try to use O_NONBLOCK */ ++ fcntl(vmaw->fd, F_SETFL, fcntl(vmaw->fd, F_GETFL)|O_NONBLOCK); ++ } else { ++ oflags = O_NONBLOCK|O_DIRECT|O_WRONLY|O_CREAT|O_EXCL; ++ vmaw->fd = qemu_open(filename, oflags, 0644); ++ } ++ ++ if (vmaw->fd < 0) { ++ error_setg(errp, "can't open file %s - %s\n", filename, ++ g_strerror(errno)); ++ goto err; ++ } ++ } ++ ++ /* we use O_DIRECT, so we need to align IO buffers */ ++ ++ vmaw->outbuf = qemu_memalign(512, VMA_MAX_EXTENT_SIZE); ++ vmaw->headerbuf = qemu_memalign(512, HEADERBUF_SIZE); ++ ++ vmaw->outbuf_count = 0; ++ vmaw->outbuf_pos = VMA_EXTENT_HEADER_SIZE; ++ ++ vmaw->header_blob_table_pos = 1; /* start at pos 1 */ ++ ++ qemu_co_mutex_init(&vmaw->flush_lock); ++ ++ uuid_copy(vmaw->uuid, uuid); ++ ++ return vmaw; ++ ++err: ++ if (vmaw) { ++ if (vmaw->cmd) { ++ pclose(vmaw->cmd); ++ } else if (vmaw->fd >= 0) { ++ close(vmaw->fd); ++ } ++ ++ if (vmaw->md5csum) { ++ g_checksum_free(vmaw->md5csum); ++ } ++ ++ g_free(vmaw); ++ } ++ ++ return NULL; ++} ++ ++static int coroutine_fn vma_write_header(VmaWriter *vmaw) ++{ ++ assert(vmaw); ++ unsigned char *buf = vmaw->headerbuf; ++ VmaHeader *head = (VmaHeader *)buf; ++ ++ int i; ++ ++ DPRINTF("VMA WRITE HEADER\n"); ++ ++ if (vmaw->status < 0) { ++ return vmaw->status; ++ } ++ ++ memset(buf, 0, HEADERBUF_SIZE); ++ ++ head->magic = VMA_MAGIC; ++ head->version = GUINT32_TO_BE(1); /* v1 */ ++ memcpy(head->uuid, vmaw->uuid, 16); ++ ++ time_t ctime = time(NULL); ++ head->ctime = GUINT64_TO_BE(ctime); ++ ++ for (i = 0; i < VMA_MAX_CONFIGS; i++) { ++ head->config_names[i] = GUINT32_TO_BE(vmaw->config_names[i]); ++ head->config_data[i] = GUINT32_TO_BE(vmaw->config_data[i]); ++ } ++ ++ /* 32 bytes per device (12 used currently) = 8192 bytes max */ ++ for (i = 1; i <= 254; i++) { ++ VmaStreamInfo *si = &vmaw->stream_info[i]; ++ if (si->size) { ++ assert(si->devname); ++ uint32_t devname_ptr = allocate_header_string(vmaw, si->devname); ++ if (!devname_ptr) { ++ return -1; ++ } ++ 
head->dev_info[i].devname_ptr = GUINT32_TO_BE(devname_ptr); ++ head->dev_info[i].size = GUINT64_TO_BE(si->size); ++ } ++ } ++ ++ uint32_t header_size = sizeof(VmaHeader) + vmaw->header_blob_table_size; ++ head->header_size = GUINT32_TO_BE(header_size); ++ ++ if (header_size > HEADERBUF_SIZE) { ++ return -1; /* just to be sure */ ++ } ++ ++ uint32_t blob_buffer_offset = sizeof(VmaHeader); ++ memcpy(buf + blob_buffer_offset, vmaw->header_blob_table, ++ vmaw->header_blob_table_size); ++ head->blob_buffer_offset = GUINT32_TO_BE(blob_buffer_offset); ++ head->blob_buffer_size = GUINT32_TO_BE(vmaw->header_blob_table_pos); ++ ++ g_checksum_reset(vmaw->md5csum); ++ g_checksum_update(vmaw->md5csum, (const guchar *)buf, header_size); ++ gsize csize = 16; ++ g_checksum_get_digest(vmaw->md5csum, (guint8 *)(head->md5sum), &csize); ++ ++ return vma_queue_write(vmaw, buf, header_size); ++} ++ ++static int coroutine_fn vma_writer_flush(VmaWriter *vmaw) ++{ ++ assert(vmaw); ++ ++ int ret; ++ int i; ++ ++ if (vmaw->status < 0) { ++ return vmaw->status; ++ } ++ ++ if (!vmaw->header_written) { ++ vmaw->header_written = true; ++ ret = vma_write_header(vmaw); ++ if (ret < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_flush: write header failed"); ++ return ret; ++ } ++ } ++ ++ DPRINTF("VMA WRITE FLUSH %d %d\n", vmaw->outbuf_count, vmaw->outbuf_pos); ++ ++ ++ VmaExtentHeader *ehead = (VmaExtentHeader *)vmaw->outbuf; ++ ++ ehead->magic = VMA_EXTENT_MAGIC; ++ ehead->reserved1 = 0; ++ ++ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { ++ ehead->blockinfo[i] = GUINT64_TO_BE(vmaw->outbuf_block_info[i]); ++ } ++ ++ guint16 block_count = (vmaw->outbuf_pos - VMA_EXTENT_HEADER_SIZE) / ++ VMA_BLOCK_SIZE; ++ ++ ehead->block_count = GUINT16_TO_BE(block_count); ++ ++ memcpy(ehead->uuid, vmaw->uuid, sizeof(ehead->uuid)); ++ memset(ehead->md5sum, 0, sizeof(ehead->md5sum)); ++ ++ g_checksum_reset(vmaw->md5csum); ++ g_checksum_update(vmaw->md5csum, vmaw->outbuf, VMA_EXTENT_HEADER_SIZE); ++ gsize csize = 
16; ++ g_checksum_get_digest(vmaw->md5csum, ehead->md5sum, &csize); ++ ++ int bytes = vmaw->outbuf_pos; ++ ret = vma_queue_write(vmaw, vmaw->outbuf, bytes); ++ if (ret != bytes) { ++ vma_writer_set_error(vmaw, "vma_writer_flush: failed write"); ++ } ++ ++ vmaw->outbuf_count = 0; ++ vmaw->outbuf_pos = VMA_EXTENT_HEADER_SIZE; ++ ++ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { ++ vmaw->outbuf_block_info[i] = 0; ++ } ++ ++ return vmaw->status; ++} ++ ++static int vma_count_open_streams(VmaWriter *vmaw) ++{ ++ g_assert(vmaw != NULL); ++ ++ int i; ++ int open_drives = 0; ++ for (i = 0; i <= 255; i++) { ++ if (vmaw->stream_info[i].size && !vmaw->stream_info[i].finished) { ++ open_drives++; ++ } ++ } ++ ++ return open_drives; ++} ++ ++ ++/** ++ * You need to call this if the vma archive does not contain ++ * any data stream. ++ */ ++int coroutine_fn ++vma_writer_flush_output(VmaWriter *vmaw) ++{ ++ qemu_co_mutex_lock(&vmaw->flush_lock); ++ int ret = vma_writer_flush(vmaw); ++ qemu_co_mutex_unlock(&vmaw->flush_lock); ++ if (ret < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_flush_header failed"); ++ } ++ return ret; ++} ++ ++/** ++ * all jobs should call this when there is no more data ++ * Returns: number of remaining stream (0 ==> finished) ++ */ ++int coroutine_fn ++vma_writer_close_stream(VmaWriter *vmaw, uint8_t dev_id) ++{ ++ g_assert(vmaw != NULL); ++ ++ DPRINTF("vma_writer_set_status %d\n", dev_id); ++ if (!vmaw->stream_info[dev_id].size) { ++ vma_writer_set_error(vmaw, "vma_writer_close_stream: " ++ "no such stream %d", dev_id); ++ return -1; ++ } ++ if (vmaw->stream_info[dev_id].finished) { ++ vma_writer_set_error(vmaw, "vma_writer_close_stream: " ++ "stream already closed %d", dev_id); ++ return -1; ++ } ++ ++ vmaw->stream_info[dev_id].finished = true; ++ ++ int open_drives = vma_count_open_streams(vmaw); ++ ++ if (open_drives <= 0) { ++ DPRINTF("vma_writer_set_status all drives completed\n"); ++ vma_writer_flush_output(vmaw); ++ } ++ ++ return open_drives; 
++} ++ ++int vma_writer_get_status(VmaWriter *vmaw, VmaStatus *status) ++{ ++ int i; ++ ++ g_assert(vmaw != NULL); ++ ++ if (status) { ++ status->status = vmaw->status; ++ g_strlcpy(status->errmsg, vmaw->errmsg, sizeof(status->errmsg)); ++ for (i = 0; i <= 255; i++) { ++ status->stream_info[i] = vmaw->stream_info[i]; ++ } ++ ++ uuid_unparse_lower(vmaw->uuid, status->uuid_str); ++ } ++ ++ status->closed = vmaw->closed; ++ ++ return vmaw->status; ++} ++ ++static int vma_writer_get_buffer(VmaWriter *vmaw) ++{ ++ int ret = 0; ++ ++ qemu_co_mutex_lock(&vmaw->flush_lock); ++ ++ /* wait until buffer is available */ ++ while (vmaw->outbuf_count >= (VMA_BLOCKS_PER_EXTENT - 1)) { ++ ret = vma_writer_flush(vmaw); ++ if (ret < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_get_buffer: flush failed"); ++ break; ++ } ++ } ++ ++ qemu_co_mutex_unlock(&vmaw->flush_lock); ++ ++ return ret; ++} ++ ++ ++int64_t coroutine_fn ++vma_writer_write(VmaWriter *vmaw, uint8_t dev_id, int64_t cluster_num, ++ const unsigned char *buf, size_t *zero_bytes) ++{ ++ g_assert(vmaw != NULL); ++ g_assert(zero_bytes != NULL); ++ ++ *zero_bytes = 0; ++ ++ if (vmaw->status < 0) { ++ return vmaw->status; ++ } ++ ++ if (!dev_id || !vmaw->stream_info[dev_id].size) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "no such stream %d", dev_id); ++ return -1; ++ } ++ ++ if (vmaw->stream_info[dev_id].finished) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "stream already closed %d", dev_id); ++ return -1; ++ } ++ ++ ++ if (cluster_num >= (((uint64_t)1)<<32)) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "cluster number out of range"); ++ return -1; ++ } ++ ++ if (dev_id == vmaw->vmstate_stream) { ++ if (cluster_num != vmaw->vmstate_clusters) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "non sequential vmstate write"); ++ } ++ vmaw->vmstate_clusters++; ++ } else if (cluster_num >= vmaw->stream_info[dev_id].cluster_count) { ++ vma_writer_set_error(vmaw, "vma_writer_write: 
cluster number too big"); ++ return -1; ++ } ++ ++ /* wait until buffer is available */ ++ if (vma_writer_get_buffer(vmaw) < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "vma_writer_get_buffer failed"); ++ return -1; ++ } ++ ++ DPRINTF("VMA WRITE %d %zd\n", dev_id, cluster_num); ++ ++ uint16_t mask = 0; ++ ++ if (buf) { ++ int i; ++ int bit = 1; ++ for (i = 0; i < 16; i++) { ++ const unsigned char *vmablock = buf + (i*VMA_BLOCK_SIZE); ++ if (!buffer_is_zero(vmablock, VMA_BLOCK_SIZE)) { ++ mask |= bit; ++ memcpy(vmaw->outbuf + vmaw->outbuf_pos, vmablock, ++ VMA_BLOCK_SIZE); ++ vmaw->outbuf_pos += VMA_BLOCK_SIZE; ++ } else { ++ DPRINTF("VMA WRITE %zd ZERO BLOCK %d\n", cluster_num, i); ++ vmaw->stream_info[dev_id].zero_bytes += VMA_BLOCK_SIZE; ++ *zero_bytes += VMA_BLOCK_SIZE; ++ } ++ ++ bit = bit << 1; ++ } ++ } else { ++ DPRINTF("VMA WRITE %zd ZERO CLUSTER\n", cluster_num); ++ vmaw->stream_info[dev_id].zero_bytes += VMA_CLUSTER_SIZE; ++ *zero_bytes += VMA_CLUSTER_SIZE; ++ } ++ ++ uint64_t block_info = ((uint64_t)mask) << (32+16); ++ block_info |= ((uint64_t)dev_id) << 32; ++ block_info |= (cluster_num & 0xffffffff); ++ vmaw->outbuf_block_info[vmaw->outbuf_count] = block_info; ++ ++ DPRINTF("VMA WRITE MASK %zd %zx\n", cluster_num, block_info); ++ ++ vmaw->outbuf_count++; ++ ++ /** NOTE: We always write whole clusters, but we correctly set ++ * transferred bytes. So transferred == size when everything ++ * went OK. 
++ */ ++ size_t transferred = VMA_CLUSTER_SIZE; ++ ++ if (dev_id != vmaw->vmstate_stream) { ++ uint64_t last = (cluster_num + 1) * VMA_CLUSTER_SIZE; ++ if (last > vmaw->stream_info[dev_id].size) { ++ uint64_t diff = last - vmaw->stream_info[dev_id].size; ++ if (diff >= VMA_CLUSTER_SIZE) { ++ vma_writer_set_error(vmaw, "vma_writer_write: " ++ "read after last cluster"); ++ return -1; ++ } ++ transferred -= diff; ++ } ++ } ++ ++ vmaw->stream_info[dev_id].transferred += transferred; ++ ++ return transferred; ++} ++ ++void vma_writer_error_propagate(VmaWriter *vmaw, Error **errp) ++{ ++ if (vmaw->status < 0 && *errp == NULL) { ++ error_setg(errp, "%s", vmaw->errmsg); ++ } ++} ++ ++int vma_writer_close(VmaWriter *vmaw, Error **errp) ++{ ++ g_assert(vmaw != NULL); ++ ++ int i; ++ ++ while (vmaw->co_writer) { ++ aio_poll(qemu_get_aio_context(), true); ++ } ++ ++ assert(vmaw->co_writer == NULL); ++ ++ if (vmaw->cmd) { ++ if (pclose(vmaw->cmd) < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_close: " ++ "pclose failed - %s", g_strerror(errno)); ++ } ++ } else { ++ if (close(vmaw->fd) < 0) { ++ vma_writer_set_error(vmaw, "vma_writer_close: " ++ "close failed - %s", g_strerror(errno)); ++ } ++ } ++ ++ for (i = 0; i <= 255; i++) { ++ VmaStreamInfo *si = &vmaw->stream_info[i]; ++ if (si->size) { ++ if (!si->finished) { ++ vma_writer_set_error(vmaw, "vma_writer_close: " ++ "detected open stream '%s'", si->devname); ++ } else if ((si->transferred != si->size) && ++ (i != vmaw->vmstate_stream)) { ++ vma_writer_set_error(vmaw, "vma_writer_close: " ++ "incomplete stream '%s' (%zd != %zd)", ++ si->devname, si->transferred, si->size); ++ } ++ } ++ } ++ ++ for (i = 0; i <= 255; i++) { ++ vmaw->stream_info[i].finished = 1; /* mark as closed */ ++ } ++ ++ vmaw->closed = 1; ++ ++ if (vmaw->status < 0 && *errp == NULL) { ++ error_setg(errp, "%s", vmaw->errmsg); ++ } ++ ++ return vmaw->status; ++} ++ ++void vma_writer_destroy(VmaWriter *vmaw) ++{ ++ assert(vmaw); ++ ++ int i; ++ ++ for (i = 
0; i <= 255; i++) { ++ if (vmaw->stream_info[i].devname) { ++ g_free(vmaw->stream_info[i].devname); ++ } ++ } ++ ++ if (vmaw->md5csum) { ++ g_checksum_free(vmaw->md5csum); ++ } ++ ++ g_free(vmaw); ++} +diff --git a/vma.c b/vma.c +new file mode 100644 +index 0000000000..1b59fd1555 +--- /dev/null ++++ b/vma.c +@@ -0,0 +1,756 @@ ++/* ++ * VMA: Virtual Machine Archive ++ * ++ * Copyright (C) 2012-2013 Proxmox Server Solutions ++ * ++ * Authors: ++ * Dietmar Maurer (dietmar@proxmox.com) ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. ++ * ++ */ ++ ++#include "qemu/osdep.h" ++#include ++ ++#include "vma.h" ++#include "qemu-common.h" ++#include "qemu/error-report.h" ++#include "qemu/main-loop.h" ++#include "qapi/qmp/qstring.h" ++#include "sysemu/block-backend.h" ++ ++static void help(void) ++{ ++ const char *help_msg = ++ "usage: vma command [command options]\n" ++ "\n" ++ "vma list \n" ++ "vma config [-c config]\n" ++ "vma create [-c config] pathname ...\n" ++ "vma extract [-r ] \n" ++ "vma verify [-v]\n" ++ ; ++ ++ printf("%s", help_msg); ++ exit(1); ++} ++ ++static const char *extract_devname(const char *path, char **devname, int index) ++{ ++ assert(path); ++ ++ const char *sep = strchr(path, '='); ++ ++ if (sep) { ++ *devname = g_strndup(path, sep - path); ++ path = sep + 1; ++ } else { ++ if (index >= 0) { ++ *devname = g_strdup_printf("disk%d", index); ++ } else { ++ *devname = NULL; ++ } ++ } ++ ++ return path; ++} ++ ++static void print_content(VmaReader *vmar) ++{ ++ assert(vmar); ++ ++ VmaHeader *head = vma_reader_get_header(vmar); ++ ++ GList *l = vma_reader_get_config_data(vmar); ++ while (l && l->data) { ++ VmaConfigData *cdata = (VmaConfigData *)l->data; ++ l = g_list_next(l); ++ printf("CFG: size: %d name: %s\n", cdata->len, cdata->name); ++ } ++ ++ int i; ++ VmaDeviceInfo *di; ++ for (i = 1; i < 255; i++) { ++ di = vma_reader_get_device_info(vmar, i); ++ if (di) { ++ 
if (strcmp(di->devname, "vmstate") == 0) { ++ printf("VMSTATE: dev_id=%d memory: %zd\n", i, di->size); ++ } else { ++ printf("DEV: dev_id=%d size: %zd devname: %s\n", ++ i, di->size, di->devname); ++ } ++ } ++ } ++ /* ctime is the last entry we print */ ++ printf("CTIME: %s", ctime(&head->ctime)); ++ fflush(stdout); ++} ++ ++static int list_content(int argc, char **argv) ++{ ++ int c, ret = 0; ++ const char *filename; ++ ++ for (;;) { ++ c = getopt(argc, argv, "h"); ++ if (c == -1) { ++ break; ++ } ++ switch (c) { ++ case '?': ++ case 'h': ++ help(); ++ break; ++ default: ++ g_assert_not_reached(); ++ } ++ } ++ ++ /* Get the filename */ ++ if ((optind + 1) != argc) { ++ help(); ++ } ++ filename = argv[optind++]; ++ ++ Error *errp = NULL; ++ VmaReader *vmar = vma_reader_create(filename, &errp); ++ ++ if (!vmar) { ++ g_error("%s", error_get_pretty(errp)); ++ } ++ ++ print_content(vmar); ++ ++ vma_reader_destroy(vmar); ++ ++ return ret; ++} ++ ++typedef struct RestoreMap { ++ char *devname; ++ char *path; ++ char *format; ++ bool write_zero; ++} RestoreMap; ++ ++static int extract_content(int argc, char **argv) ++{ ++ int c, ret = 0; ++ int verbose = 0; ++ const char *filename; ++ const char *dirname; ++ const char *readmap = NULL; ++ ++ for (;;) { ++ c = getopt(argc, argv, "hvr:"); ++ if (c == -1) { ++ break; ++ } ++ switch (c) { ++ case '?': ++ case 'h': ++ help(); ++ break; ++ case 'r': ++ readmap = optarg; ++ break; ++ case 'v': ++ verbose = 1; ++ break; ++ default: ++ help(); ++ } ++ } ++ ++ /* Get the filename */ ++ if ((optind + 2) != argc) { ++ help(); ++ } ++ filename = argv[optind++]; ++ dirname = argv[optind++]; ++ ++ Error *errp = NULL; ++ VmaReader *vmar = vma_reader_create(filename, &errp); ++ ++ if (!vmar) { ++ g_error("%s", error_get_pretty(errp)); ++ } ++ ++ if (mkdir(dirname, 0777) < 0) { ++ g_error("unable to create target directory %s - %s", ++ dirname, g_strerror(errno)); ++ } ++ ++ GList *l = vma_reader_get_config_data(vmar); ++ while (l && 
l->data) { ++ VmaConfigData *cdata = (VmaConfigData *)l->data; ++ l = g_list_next(l); ++ char *cfgfn = g_strdup_printf("%s/%s", dirname, cdata->name); ++ GError *err = NULL; ++ if (!g_file_set_contents(cfgfn, (gchar *)cdata->data, cdata->len, ++ &err)) { ++ g_error("unable to write file: %s", err->message); ++ } ++ } ++ ++ GHashTable *devmap = g_hash_table_new(g_str_hash, g_str_equal); ++ ++ if (readmap) { ++ print_content(vmar); ++ ++ FILE *map = fopen(readmap, "r"); ++ if (!map) { ++ g_error("unable to open fifo %s - %s", readmap, g_strerror(errno)); ++ } ++ ++ while (1) { ++ char inbuf[8192]; ++ char *line = fgets(inbuf, sizeof(inbuf), map); ++ if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { ++ break; ++ } ++ int len = strlen(line); ++ if (line[len - 1] == '\n') { ++ line[len - 1] = '\0'; ++ if (len == 1) { ++ break; ++ } ++ } ++ ++ char *format = NULL; ++ if (strncmp(line, "format=", sizeof("format=")-1) == 0) { ++ format = line + sizeof("format=")-1; ++ char *colon = strchr(format, ':'); ++ if (!colon) { ++ g_error("read map failed - found only a format ('%s')", inbuf); ++ } ++ format = g_strndup(format, colon - format); ++ line = colon+1; ++ } ++ ++ const char *path; ++ bool write_zero; ++ if (line[0] == '0' && line[1] == ':') { ++ path = line + 2; ++ write_zero = false; ++ } else if (line[0] == '1' && line[1] == ':') { ++ path = line + 2; ++ write_zero = true; ++ } else { ++ g_error("read map failed - parse error ('%s')", inbuf); ++ } ++ ++ char *devname = NULL; ++ path = extract_devname(path, &devname, -1); ++ if (!devname) { ++ g_error("read map failed - no dev name specified ('%s')", ++ inbuf); ++ } ++ ++ RestoreMap *map = g_new0(RestoreMap, 1); ++ map->devname = g_strdup(devname); ++ map->path = g_strdup(path); ++ map->format = format; ++ map->write_zero = write_zero; ++ ++ g_hash_table_insert(devmap, map->devname, map); ++ ++ }; ++ } ++ ++ int i; ++ int vmstate_fd = -1; ++ guint8 vmstate_stream = 0; ++ ++ BlockBackend *blk = NULL; ++ ++ for 
(i = 1; i < 255; i++) { ++ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i); ++ if (di && (strcmp(di->devname, "vmstate") == 0)) { ++ vmstate_stream = i; ++ char *statefn = g_strdup_printf("%s/vmstate.bin", dirname); ++ vmstate_fd = open(statefn, O_WRONLY|O_CREAT|O_EXCL, 0644); ++ if (vmstate_fd < 0) { ++ g_error("create vmstate file '%s' failed - %s", statefn, ++ g_strerror(errno)); ++ } ++ g_free(statefn); ++ } else if (di) { ++ char *devfn = NULL; ++ const char *format = NULL; ++ int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; ++ bool write_zero = true; ++ ++ if (readmap) { ++ RestoreMap *map; ++ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname); ++ if (map == NULL) { ++ g_error("no device name mapping for %s", di->devname); ++ } ++ devfn = map->path; ++ format = map->format; ++ write_zero = map->write_zero; ++ } else { ++ devfn = g_strdup_printf("%s/tmp-disk-%s.raw", ++ dirname, di->devname); ++ printf("DEVINFO %s %zd\n", devfn, di->size); ++ ++ bdrv_img_create(devfn, "raw", NULL, NULL, NULL, di->size, ++ flags, true, &errp); ++ if (errp) { ++ g_error("can't create file %s: %s", devfn, ++ error_get_pretty(errp)); ++ } ++ ++ /* Note: we created an empty file above, so there is no ++ * need to write zeroes (so we generate a sparse file) ++ */ ++ write_zero = false; ++ } ++ ++ size_t devlen = strlen(devfn); ++ QDict *options = NULL; ++ if (format) { ++ /* explicit format from commandline */ ++ options = qdict_new(); ++ qdict_put(options, "driver", qstring_from_str(format)); ++ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) || ++ strncmp(devfn, "/dev/", 5) == 0) ++ { ++ /* This part is now deprecated for PVE as well (just as qemu ++ * deprecated not specifying an explicit raw format, too. 
++ */ ++ /* explicit raw format */ ++ options = qdict_new(); ++ qdict_put(options, "driver", qstring_from_str("raw")); ++ } ++ ++ ++ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { ++ g_error("can't open file %s - %s", devfn, ++ error_get_pretty(errp)); ++ } ++ ++ if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) { ++ g_error("%s", error_get_pretty(errp)); ++ } ++ ++ if (!readmap) { ++ g_free(devfn); ++ } ++ } ++ } ++ ++ if (vma_reader_restore(vmar, vmstate_fd, verbose, &errp) < 0) { ++ g_error("restore failed - %s", error_get_pretty(errp)); ++ } ++ ++ if (!readmap) { ++ for (i = 1; i < 255; i++) { ++ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i); ++ if (di && (i != vmstate_stream)) { ++ char *tmpfn = g_strdup_printf("%s/tmp-disk-%s.raw", ++ dirname, di->devname); ++ char *fn = g_strdup_printf("%s/disk-%s.raw", ++ dirname, di->devname); ++ if (rename(tmpfn, fn) != 0) { ++ g_error("rename %s to %s failed - %s", ++ tmpfn, fn, g_strerror(errno)); ++ } ++ } ++ } ++ } ++ ++ vma_reader_destroy(vmar); ++ ++ blk_unref(blk); ++ ++ bdrv_close_all(); ++ ++ return ret; ++} ++ ++static int verify_content(int argc, char **argv) ++{ ++ int c, ret = 0; ++ int verbose = 0; ++ const char *filename; ++ ++ for (;;) { ++ c = getopt(argc, argv, "hv"); ++ if (c == -1) { ++ break; ++ } ++ switch (c) { ++ case '?': ++ case 'h': ++ help(); ++ break; ++ case 'v': ++ verbose = 1; ++ break; ++ default: ++ help(); ++ } ++ } ++ ++ /* Get the filename */ ++ if ((optind + 1) != argc) { ++ help(); ++ } ++ filename = argv[optind++]; ++ ++ Error *errp = NULL; ++ VmaReader *vmar = vma_reader_create(filename, &errp); ++ ++ if (!vmar) { ++ g_error("%s", error_get_pretty(errp)); ++ } ++ ++ if (verbose) { ++ print_content(vmar); ++ } ++ ++ if (vma_reader_verify(vmar, verbose, &errp) < 0) { ++ g_error("verify failed - %s", error_get_pretty(errp)); ++ } ++ ++ vma_reader_destroy(vmar); ++ ++ bdrv_close_all(); ++ ++ return ret; ++} ++ ++typedef struct 
BackupJob { ++ BlockBackend *target; ++ int64_t len; ++ VmaWriter *vmaw; ++ uint8_t dev_id; ++} BackupJob; ++ ++#define BACKUP_SECTORS_PER_CLUSTER (VMA_CLUSTER_SIZE / BDRV_SECTOR_SIZE) ++ ++static void coroutine_fn backup_run_empty(void *opaque) ++{ ++ VmaWriter *vmaw = (VmaWriter *)opaque; ++ ++ vma_writer_flush_output(vmaw); ++ ++ Error *err = NULL; ++ if (vma_writer_close(vmaw, &err) != 0) { ++ g_warning("vma_writer_close failed %s", error_get_pretty(err)); ++ } ++} ++ ++static void coroutine_fn backup_run(void *opaque) ++{ ++ BackupJob *job = (BackupJob *)opaque; ++ struct iovec iov; ++ QEMUIOVector qiov; ++ ++ int64_t start, end; ++ int ret = 0; ++ ++ unsigned char *buf = blk_blockalign(job->target, VMA_CLUSTER_SIZE); ++ ++ start = 0; ++ end = DIV_ROUND_UP(job->len / BDRV_SECTOR_SIZE, ++ BACKUP_SECTORS_PER_CLUSTER); ++ ++ for (; start < end; start++) { ++ iov.iov_base = buf; ++ iov.iov_len = VMA_CLUSTER_SIZE; ++ qemu_iovec_init_external(&qiov, &iov, 1); ++ ++ ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE, ++ VMA_CLUSTER_SIZE, &qiov, 0); ++ if (ret < 0) { ++ vma_writer_set_error(job->vmaw, "read error", -1); ++ goto out; ++ } ++ ++ size_t zb = 0; ++ if (vma_writer_write(job->vmaw, job->dev_id, start, buf, &zb) < 0) { ++ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed", -1); ++ goto out; ++ } ++ } ++ ++ ++out: ++ if (vma_writer_close_stream(job->vmaw, job->dev_id) <= 0) { ++ Error *err = NULL; ++ if (vma_writer_close(job->vmaw, &err) != 0) { ++ g_warning("vma_writer_close failed %s", error_get_pretty(err)); ++ } ++ } ++} ++ ++static int create_archive(int argc, char **argv) ++{ ++ int i, c; ++ int verbose = 0; ++ const char *archivename; ++ GList *config_files = NULL; ++ ++ for (;;) { ++ c = getopt(argc, argv, "hvc:"); ++ if (c == -1) { ++ break; ++ } ++ switch (c) { ++ case '?': ++ case 'h': ++ help(); ++ break; ++ case 'c': ++ config_files = g_list_append(config_files, optarg); ++ break; ++ case 'v': ++ verbose = 1; ++ 
break; ++ default: ++ g_assert_not_reached(); ++ } ++ } ++ ++ ++ /* make sure we have an archive name */ ++ if ((optind + 1) > argc) { ++ help(); ++ } ++ ++ archivename = argv[optind++]; ++ ++ uuid_t uuid; ++ uuid_generate(uuid); ++ ++ Error *local_err = NULL; ++ VmaWriter *vmaw = vma_writer_create(archivename, uuid, &local_err); ++ ++ if (vmaw == NULL) { ++ g_error("%s", error_get_pretty(local_err)); ++ } ++ ++ GList *l = config_files; ++ while (l && l->data) { ++ char *name = l->data; ++ char *cdata = NULL; ++ gsize clen = 0; ++ GError *err = NULL; ++ if (!g_file_get_contents(name, &cdata, &clen, &err)) { ++ unlink(archivename); ++ g_error("Unable to read file: %s", err->message); ++ } ++ ++ if (vma_writer_add_config(vmaw, name, cdata, clen) != 0) { ++ unlink(archivename); ++ g_error("Unable to append config data %s (len = %zd)", ++ name, clen); ++ } ++ l = g_list_next(l); ++ } ++ ++ int devcount = 0; ++ while (optind < argc) { ++ const char *path = argv[optind++]; ++ char *devname = NULL; ++ path = extract_devname(path, &devname, devcount++); ++ ++ Error *errp = NULL; ++ BlockBackend *target; ++ ++ target = blk_new_open(path, NULL, NULL, 0, &errp); ++ if (!target) { ++ unlink(archivename); ++ g_error("bdrv_open '%s' failed - %s", path, error_get_pretty(errp)); ++ } ++ int64_t size = blk_getlength(target); ++ int dev_id = vma_writer_register_stream(vmaw, devname, size); ++ if (dev_id <= 0) { ++ unlink(archivename); ++ g_error("vma_writer_register_stream '%s' failed", devname); ++ } ++ ++ BackupJob *job = g_new0(BackupJob, 1); ++ job->len = size; ++ job->target = target; ++ job->vmaw = vmaw; ++ job->dev_id = dev_id; ++ ++ Coroutine *co = qemu_coroutine_create(backup_run, job); ++ qemu_coroutine_enter(co); ++ } ++ ++ VmaStatus vmastat; ++ int percent = 0; ++ int last_percent = -1; ++ ++ if (devcount) { ++ while (1) { ++ main_loop_wait(false); ++ vma_writer_get_status(vmaw, &vmastat); ++ ++ if (verbose) { ++ ++ uint64_t total = 0; ++ uint64_t transferred = 0; ++ uint64_t 
zero_bytes = 0; ++ ++ int i; ++ for (i = 0; i < 256; i++) { ++ if (vmastat.stream_info[i].size) { ++ total += vmastat.stream_info[i].size; ++ transferred += vmastat.stream_info[i].transferred; ++ zero_bytes += vmastat.stream_info[i].zero_bytes; ++ } ++ } ++ percent = (transferred*100)/total; ++ if (percent != last_percent) { ++ fprintf(stderr, "progress %d%% %zd/%zd %zd\n", percent, ++ transferred, total, zero_bytes); ++ fflush(stderr); ++ ++ last_percent = percent; ++ } ++ } ++ ++ if (vmastat.closed) { ++ break; ++ } ++ } ++ } else { ++ Coroutine *co = qemu_coroutine_create(backup_run_empty, vmaw); ++ qemu_coroutine_enter(co); ++ while (1) { ++ main_loop_wait(false); ++ vma_writer_get_status(vmaw, &vmastat); ++ if (vmastat.closed) { ++ break; ++ } ++ } ++ } ++ ++ bdrv_drain_all(); ++ ++ vma_writer_get_status(vmaw, &vmastat); ++ ++ if (verbose) { ++ for (i = 0; i < 256; i++) { ++ VmaStreamInfo *si = &vmastat.stream_info[i]; ++ if (si->size) { ++ fprintf(stderr, "image %s: size=%zd zeros=%zd saved=%zd\n", ++ si->devname, si->size, si->zero_bytes, ++ si->size - si->zero_bytes); ++ } ++ } ++ } ++ ++ if (vmastat.status < 0) { ++ unlink(archivename); ++ g_error("creating vma archive failed"); ++ } ++ ++ return 0; ++} ++ ++static int dump_config(int argc, char **argv) ++{ ++ int c, ret = 0; ++ const char *filename; ++ const char *config_name = "qemu-server.conf"; ++ ++ for (;;) { ++ c = getopt(argc, argv, "hc:"); ++ if (c == -1) { ++ break; ++ } ++ switch (c) { ++ case '?': ++ case 'h': ++ help(); ++ break; ++ case 'c': ++ config_name = optarg; ++ break; ++ default: ++ help(); ++ } ++ } ++ ++ /* Get the filename */ ++ if ((optind + 1) != argc) { ++ help(); ++ } ++ filename = argv[optind++]; ++ ++ Error *errp = NULL; ++ VmaReader *vmar = vma_reader_create(filename, &errp); ++ ++ if (!vmar) { ++ g_error("%s", error_get_pretty(errp)); ++ } ++ ++ int found = 0; ++ GList *l = vma_reader_get_config_data(vmar); ++ while (l && l->data) { ++ VmaConfigData *cdata = (VmaConfigData 
*)l->data; ++ l = g_list_next(l); ++ if (strcmp(cdata->name, config_name) == 0) { ++ found = 1; ++ fwrite(cdata->data, cdata->len, 1, stdout); ++ break; ++ } ++ } ++ ++ vma_reader_destroy(vmar); ++ ++ bdrv_close_all(); ++ ++ if (!found) { ++ fprintf(stderr, "unable to find configuration data '%s'\n", config_name); ++ return -1; ++ } ++ ++ return ret; ++} ++ ++int main(int argc, char **argv) ++{ ++ const char *cmdname; ++ Error *main_loop_err = NULL; ++ ++ error_set_progname(argv[0]); ++ ++ if (qemu_init_main_loop(&main_loop_err)) { ++ g_error("%s", error_get_pretty(main_loop_err)); ++ } ++ ++ bdrv_init(); ++ ++ if (argc < 2) { ++ help(); ++ } ++ ++ cmdname = argv[1]; ++ argc--; argv++; ++ ++ ++ if (!strcmp(cmdname, "list")) { ++ return list_content(argc, argv); ++ } else if (!strcmp(cmdname, "create")) { ++ return create_archive(argc, argv); ++ } else if (!strcmp(cmdname, "extract")) { ++ return extract_content(argc, argv); ++ } else if (!strcmp(cmdname, "verify")) { ++ return verify_content(argc, argv); ++ } else if (!strcmp(cmdname, "config")) { ++ return dump_config(argc, argv); ++ } ++ ++ help(); ++ return 0; ++} +diff --git a/vma.h b/vma.h +new file mode 100644 +index 0000000000..c895c97f6d +--- /dev/null ++++ b/vma.h +@@ -0,0 +1,150 @@ ++/* ++ * VMA: Virtual Machine Archive ++ * ++ * Copyright (C) Proxmox Server Solutions ++ * ++ * Authors: ++ * Dietmar Maurer (dietmar@proxmox.com) ++ * ++ * This work is licensed under the terms of the GNU GPL, version 2 or later. ++ * See the COPYING file in the top-level directory. 
++ * ++ */ ++ ++#ifndef BACKUP_VMA_H ++#define BACKUP_VMA_H ++ ++#include ++#include "qapi/error.h" ++#include "block/block.h" ++ ++#define VMA_BLOCK_BITS 12 ++#define VMA_BLOCK_SIZE (1< -Date: Thu, 17 Mar 2016 11:33:37 +0100 -Subject: [PATCH] PVE: block: add the zeroinit block driver filter - ---- - block/Makefile.objs | 1 + - block/zeroinit.c | 203 ++++++++++++++++++++++++++++++++++++++++++++++++++++ - 2 files changed, 204 insertions(+) - create mode 100644 block/zeroinit.c - -diff --git a/block/Makefile.objs b/block/Makefile.objs -index c8337bf186..c00f0b32d6 100644 ---- a/block/Makefile.objs -+++ b/block/Makefile.objs -@@ -4,6 +4,7 @@ block-obj-y += qed.o qed-l2-cache.o qed-table.o qed-cluster.o - block-obj-y += qed-check.o - block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o - block-obj-y += quorum.o -+block-obj-y += zeroinit.o - block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o - block-obj-y += blklogwrites.o - block-obj-y += block-backend.o snapshot.o qapi.o -diff --git a/block/zeroinit.c b/block/zeroinit.c -new file mode 100644 -index 0000000000..64c49ad0e0 ---- /dev/null -+++ b/block/zeroinit.c -@@ -0,0 +1,203 @@ -+/* -+ * Filter to fake a zero-initialized block device. -+ * -+ * Copyright (c) 2016 Wolfgang Bumiller -+ * Copyright (c) 2016 Proxmox Server Solutions GmbH -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. 
-+ */ -+ -+#include "qemu/osdep.h" -+#include "qapi/error.h" -+#include "block/block_int.h" -+#include "qapi/qmp/qdict.h" -+#include "qapi/qmp/qstring.h" -+#include "qemu/cutils.h" -+#include "qemu/option.h" -+ -+typedef struct { -+ bool has_zero_init; -+ int64_t extents; -+} BDRVZeroinitState; -+ -+/* Valid blkverify filenames look like blkverify:path/to/raw_image:path/to/image */ -+static void zeroinit_parse_filename(const char *filename, QDict *options, -+ Error **errp) -+{ -+ QString *raw_path; -+ -+ /* Parse the blkverify: prefix */ -+ if (!strstart(filename, "zeroinit:", &filename)) { -+ /* There was no prefix; therefore, all options have to be already -+ present in the QDict (except for the filename) */ -+ return; -+ } -+ -+ raw_path = qstring_from_str(filename); -+ qdict_put(options, "x-next", raw_path); -+} -+ -+static QemuOptsList runtime_opts = { -+ .name = "zeroinit", -+ .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), -+ .desc = { -+ { -+ .name = "x-next", -+ .type = QEMU_OPT_STRING, -+ .help = "[internal use only, will be removed]", -+ }, -+ { -+ .name = "x-zeroinit", -+ .type = QEMU_OPT_BOOL, -+ .help = "set has_initialized_zero flag", -+ }, -+ { /* end of list */ } -+ }, -+}; -+ -+static int zeroinit_open(BlockDriverState *bs, QDict *options, int flags, -+ Error **errp) -+{ -+ BDRVZeroinitState *s = bs->opaque; -+ QemuOpts *opts; -+ Error *local_err = NULL; -+ int ret; -+ -+ s->extents = 0; -+ -+ opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort); -+ qemu_opts_absorb_qdict(opts, options, &local_err); -+ if (local_err) { -+ error_propagate(errp, local_err); -+ ret = -EINVAL; -+ goto fail; -+ } -+ -+ /* Open the raw file */ -+ bs->file = bdrv_open_child(qemu_opt_get(opts, "x-next"), options, "next", -+ bs, &child_file, false, &local_err); -+ if (local_err) { -+ ret = -EINVAL; -+ error_propagate(errp, local_err); -+ goto fail; -+ } -+ -+ /* set the options */ -+ s->has_zero_init = qemu_opt_get_bool(opts, "x-zeroinit", true); -+ -+ ret = 
0; -+fail: -+ if (ret < 0) { -+ bdrv_unref_child(bs, bs->file); -+ } -+ qemu_opts_del(opts); -+ return ret; -+} -+ -+static void zeroinit_close(BlockDriverState *bs) -+{ -+ BDRVZeroinitState *s = bs->opaque; -+ (void)s; -+} -+ -+static int64_t zeroinit_getlength(BlockDriverState *bs) -+{ -+ return bdrv_getlength(bs->file->bs); -+} -+ -+static int coroutine_fn zeroinit_co_preadv(BlockDriverState *bs, -+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) -+{ -+ return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags); -+} -+ -+static int coroutine_fn zeroinit_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, -+ int count, BdrvRequestFlags flags) -+{ -+ BDRVZeroinitState *s = bs->opaque; -+ if (offset >= s->extents) -+ return 0; -+ return bdrv_pwrite_zeroes(bs->file, offset, count, flags); -+} -+ -+static int coroutine_fn zeroinit_co_pwritev(BlockDriverState *bs, -+ uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags) -+{ -+ BDRVZeroinitState *s = bs->opaque; -+ int64_t extents = offset + bytes; -+ if (extents > s->extents) -+ s->extents = extents; -+ return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags); -+} -+ -+static bool zeroinit_recurse_is_first_non_filter(BlockDriverState *bs, -+ BlockDriverState *candidate) -+{ -+ return bdrv_recurse_is_first_non_filter(bs->file->bs, candidate); -+} -+ -+static coroutine_fn int zeroinit_co_flush(BlockDriverState *bs) -+{ -+ return bdrv_co_flush(bs->file->bs); -+} -+ -+static int zeroinit_has_zero_init(BlockDriverState *bs) -+{ -+ BDRVZeroinitState *s = bs->opaque; -+ return s->has_zero_init; -+} -+ -+static int coroutine_fn zeroinit_co_pdiscard(BlockDriverState *bs, -+ int64_t offset, int count) -+{ -+ return bdrv_co_pdiscard(bs->file, offset, count); -+} -+ -+static int zeroinit_co_truncate(BlockDriverState *bs, int64_t offset, -+ PreallocMode prealloc, Error **errp) -+{ -+ return bdrv_co_truncate(bs->file, offset, prealloc, errp); -+} -+ -+static int zeroinit_get_info(BlockDriverState 
*bs, BlockDriverInfo *bdi) -+{ -+ return bdrv_get_info(bs->file->bs, bdi); -+} -+ -+static BlockDriver bdrv_zeroinit = { -+ .format_name = "zeroinit", -+ .protocol_name = "zeroinit", -+ .instance_size = sizeof(BDRVZeroinitState), -+ -+ .bdrv_parse_filename = zeroinit_parse_filename, -+ .bdrv_file_open = zeroinit_open, -+ .bdrv_close = zeroinit_close, -+ .bdrv_getlength = zeroinit_getlength, -+ .bdrv_child_perm = bdrv_filter_default_perms, -+ .bdrv_co_flush_to_disk = zeroinit_co_flush, -+ -+ .bdrv_co_pwrite_zeroes = zeroinit_co_pwrite_zeroes, -+ .bdrv_co_pwritev = zeroinit_co_pwritev, -+ .bdrv_co_preadv = zeroinit_co_preadv, -+ .bdrv_co_flush = zeroinit_co_flush, -+ -+ .is_filter = true, -+ .bdrv_recurse_is_first_non_filter = zeroinit_recurse_is_first_non_filter, -+ -+ .bdrv_has_zero_init = zeroinit_has_zero_init, -+ -+ .bdrv_co_block_status = bdrv_co_block_status_from_file, -+ -+ .bdrv_co_pdiscard = zeroinit_co_pdiscard, -+ -+ .bdrv_co_truncate = zeroinit_co_truncate, -+ .bdrv_get_info = zeroinit_get_info, -+}; -+ -+static void bdrv_zeroinit_init(void) -+{ -+ bdrv_register(&bdrv_zeroinit); -+} -+ -+block_init(bdrv_zeroinit_init); --- -2.11.0 - diff --git a/debian/patches/pve/0023-PVE-backup-modify-job-api.patch b/debian/patches/pve/0023-PVE-backup-modify-job-api.patch deleted file mode 100644 index f89c6cd..0000000 --- a/debian/patches/pve/0023-PVE-backup-modify-job-api.patch +++ /dev/null @@ -1,100 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Wed, 9 Dec 2015 15:04:57 +0100 -Subject: [PATCH] PVE: backup: modify job api - -Introduce a pause_count parameter to start a backup in -paused mode. This way backups of multiple drives can be -started up sequentially via the completion callback while -having been started at the same point in time. 
---- - block/backup.c | 2 ++ - block/replication.c | 2 +- - blockdev.c | 4 ++-- - include/block/block_int.h | 1 + - job.c | 2 +- - 5 files changed, 7 insertions(+), 4 deletions(-) - -diff --git a/block/backup.c b/block/backup.c -index 8630d32926..7f970842d7 100644 ---- a/block/backup.c -+++ b/block/backup.c -@@ -613,6 +613,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - BlockdevOnError on_target_error, - int creation_flags, - BlockCompletionFunc *cb, void *opaque, -+ int pause_count, - JobTxn *txn, Error **errp) - { - int64_t len; -@@ -746,6 +747,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, - &error_abort); - job->len = len; -+ job->common.job.pause_count = pause_count; - - return &job->common; - -diff --git a/block/replication.c b/block/replication.c -index 6349d6958e..84e07cc4d4 100644 ---- a/block/replication.c -+++ b/block/replication.c -@@ -571,7 +571,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode, - 0, MIRROR_SYNC_MODE_NONE, NULL, false, - BLOCKDEV_ON_ERROR_REPORT, - BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL, -- backup_job_completed, bs, NULL, &local_err); -+ backup_job_completed, bs, 0, NULL, &local_err); - if (local_err) { - error_propagate(errp, local_err); - backup_job_cleanup(bs); -diff --git a/blockdev.c b/blockdev.c -index dcf8c8d2ab..d5eb6b62ca 100644 ---- a/blockdev.c -+++ b/blockdev.c -@@ -3568,7 +3568,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn, - job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, - backup->sync, bmap, backup->compress, - backup->on_source_error, backup->on_target_error, -- job_flags, NULL, NULL, txn, &local_err); -+ job_flags, NULL, NULL, 0, txn, &local_err); - bdrv_unref(target_bs); - if (local_err != NULL) { - error_propagate(errp, local_err); -@@ -3660,7 +3660,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn 
*txn, - job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, - backup->sync, NULL, backup->compress, - backup->on_source_error, backup->on_target_error, -- job_flags, NULL, NULL, txn, &local_err); -+ job_flags, NULL, NULL, 0, txn, &local_err); - if (local_err != NULL) { - error_propagate(errp, local_err); - } -diff --git a/include/block/block_int.h b/include/block/block_int.h -index 903b9c1034..0b2516c3cf 100644 ---- a/include/block/block_int.h -+++ b/include/block/block_int.h -@@ -1083,6 +1083,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - BlockdevOnError on_target_error, - int creation_flags, - BlockCompletionFunc *cb, void *opaque, -+ int pause_count, - JobTxn *txn, Error **errp); - - void hmp_drive_add_node(Monitor *mon, const char *optstr); -diff --git a/job.c b/job.c -index fa671b431a..72c50ee18e 100644 ---- a/job.c -+++ b/job.c -@@ -557,7 +557,7 @@ void job_start(Job *job) - job->co = qemu_coroutine_create(job_co_entry, job); - job->pause_count--; - job->busy = true; -- job->paused = false; -+ job->paused = job->pause_count > 0; - job_state_transition(job, JOB_STATUS_RUNNING); - aio_co_enter(job->aio_context, job->co); - } --- -2.11.0 - diff --git a/debian/patches/pve/0023-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch b/debian/patches/pve/0023-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch new file mode 100644 index 0000000..2c14854 --- /dev/null +++ b/debian/patches/pve/0023-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch @@ -0,0 +1,189 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Thu, 15 Feb 2018 11:07:56 +0100 +Subject: [PATCH] PVE: vma: add throttling options to drive mapping fifo + protocol + +We now need to call initialize the qom module as well. 
+ +Signed-off-by: Wolfgang Bumiller +--- + vma.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------- + 1 file changed, 76 insertions(+), 12 deletions(-) + +diff --git a/vma.c b/vma.c +index 1b59fd1555..f9f5c308fe 100644 +--- a/vma.c ++++ b/vma.c +@@ -18,7 +18,8 @@ + #include "qemu-common.h" + #include "qemu/error-report.h" + #include "qemu/main-loop.h" +-#include "qapi/qmp/qstring.h" ++#include "qemu/cutils.h" ++#include "qapi/qmp/qdict.h" + #include "sysemu/block-backend.h" + + static void help(void) +@@ -132,9 +133,39 @@ typedef struct RestoreMap { + char *devname; + char *path; + char *format; ++ uint64_t throttling_bps; ++ char *throttling_group; + bool write_zero; + } RestoreMap; + ++static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) { ++ size_t optlen = strlen(optname); ++ if (strncmp(*line, optname, optlen) != 0 || (*line)[optlen] != '=') { ++ return false; ++ } ++ if (*out) { ++ g_error("read map failed - duplicate value for option '%s'", optname); ++ } ++ char *value = (*line) + optlen + 1; /* including a '=' */ ++ char *colon = strchr(value, ':'); ++ if (!colon) { ++ g_error("read map failed - option '%s' not terminated ('%s')", ++ optname, inbuf); ++ } ++ *line = colon+1; ++ *out = g_strndup(value, colon - value); ++ return true; ++} ++ ++static uint64_t verify_u64(const char *text) { ++ uint64_t value; ++ const char *endptr = NULL; ++ if (qemu_strtou64(text, &endptr, 0, &value) != 0 || !endptr || *endptr) { ++ g_error("read map failed - not a number: %s", text); ++ } ++ return value; ++} ++ + static int extract_content(int argc, char **argv) + { + int c, ret = 0; +@@ -208,6 +239,9 @@ static int extract_content(int argc, char **argv) + while (1) { + char inbuf[8192]; + char *line = fgets(inbuf, sizeof(inbuf), map); ++ char *format = NULL; ++ char *bps = NULL; ++ char *group = NULL; + if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { + break; + } +@@ -219,15 +253,19 @@ static int 
extract_content(int argc, char **argv) + } + } + +- char *format = NULL; +- if (strncmp(line, "format=", sizeof("format=")-1) == 0) { +- format = line + sizeof("format=")-1; +- char *colon = strchr(format, ':'); +- if (!colon) { +- g_error("read map failed - found only a format ('%s')", inbuf); ++ while (1) { ++ if (!try_parse_option(&line, "format", &format, inbuf) && ++ !try_parse_option(&line, "throttling.bps", &bps, inbuf) && ++ !try_parse_option(&line, "throttling.group", &group, inbuf)) ++ { ++ break; + } +- format = g_strndup(format, colon - format); +- line = colon+1; ++ } ++ ++ uint64_t bps_value = 0; ++ if (bps) { ++ bps_value = verify_u64(bps); ++ g_free(bps); + } + + const char *path; +@@ -253,6 +291,8 @@ static int extract_content(int argc, char **argv) + map->devname = g_strdup(devname); + map->path = g_strdup(path); + map->format = format; ++ map->throttling_bps = bps_value; ++ map->throttling_group = group; + map->write_zero = write_zero; + + g_hash_table_insert(devmap, map->devname, map); +@@ -280,6 +320,8 @@ static int extract_content(int argc, char **argv) + } else if (di) { + char *devfn = NULL; + const char *format = NULL; ++ uint64_t throttling_bps = 0; ++ const char *throttling_group = NULL; + int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; + bool write_zero = true; + +@@ -291,6 +333,8 @@ static int extract_content(int argc, char **argv) + } + devfn = map->path; + format = map->format; ++ throttling_bps = map->throttling_bps; ++ throttling_group = map->throttling_group; + write_zero = map->write_zero; + } else { + devfn = g_strdup_printf("%s/tmp-disk-%s.raw", +@@ -315,7 +359,7 @@ static int extract_content(int argc, char **argv) + if (format) { + /* explicit format from commandline */ + options = qdict_new(); +- qdict_put(options, "driver", qstring_from_str(format)); ++ qdict_put_str(options, "driver", format); + } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) || + strncmp(devfn, "/dev/", 5) == 0) + { +@@ -324,15 +368,34 @@ static 
int extract_content(int argc, char **argv) + */ + /* explicit raw format */ + options = qdict_new(); +- qdict_put(options, "driver", qstring_from_str("raw")); ++ qdict_put_str(options, "driver", "raw"); + } + +- + if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { + g_error("can't open file %s - %s", devfn, + error_get_pretty(errp)); + } + ++ if (throttling_group) { ++ blk_io_limits_enable(blk, throttling_group); ++ } ++ ++ if (throttling_bps) { ++ if (!throttling_group) { ++ blk_io_limits_enable(blk, devfn); ++ } ++ ++ ThrottleConfig cfg; ++ throttle_config_init(&cfg); ++ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps; ++ Error *err = NULL; ++ if (!throttle_is_valid(&cfg, &err)) { ++ error_report_err(err); ++ g_error("failed to apply throttling"); ++ } ++ blk_set_io_limits(blk, &cfg); ++ } ++ + if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) { + g_error("%s", error_get_pretty(errp)); + } +@@ -730,6 +793,7 @@ int main(int argc, char **argv) + } + + bdrv_init(); ++ module_call_init(MODULE_INIT_QOM); + + if (argc < 2) { + help(); +-- +2.11.0 + diff --git a/debian/patches/pve/0024-PVE-backup-introduce-vma-archive-format.patch b/debian/patches/pve/0024-PVE-backup-introduce-vma-archive-format.patch deleted file mode 100644 index ab8b00c..0000000 --- a/debian/patches/pve/0024-PVE-backup-introduce-vma-archive-format.patch +++ /dev/null @@ -1,1485 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Wed, 2 Aug 2017 13:51:02 +0200 -Subject: [PATCH] PVE: backup: introduce vma archive format - -TODO: Move to a libvma block backend. 
---- - MAINTAINERS | 6 + - block/Makefile.objs | 3 + - block/vma.c | 424 ++++++++++++++++++++++++++++++++++++++++ - blockdev.c | 536 +++++++++++++++++++++++++++++++++++++++++++++++++++ - configure | 29 +++ - hmp-commands-info.hx | 13 ++ - hmp-commands.hx | 31 +++ - hmp.c | 63 ++++++ - hmp.h | 3 + - qapi/block-core.json | 109 ++++++++++- - qapi/common.json | 13 ++ - qapi/misc.json | 13 -- - 12 files changed, 1229 insertions(+), 14 deletions(-) - create mode 100644 block/vma.c - -diff --git a/MAINTAINERS b/MAINTAINERS -index 666e936812..299a73cd86 100644 ---- a/MAINTAINERS -+++ b/MAINTAINERS -@@ -2140,6 +2140,12 @@ L: qemu-block@nongnu.org - S: Supported - F: block/vvfat.c - -+VMA -+M: Wolfgang Bumiller . -+L: pve-devel@proxmox.com -+S: Supported -+F: block/vma.c -+ - Image format fuzzer - M: Stefan Hajnoczi - L: qemu-block@nongnu.org -diff --git a/block/Makefile.objs b/block/Makefile.objs -index c00f0b32d6..abfd0f69d7 100644 ---- a/block/Makefile.objs -+++ b/block/Makefile.objs -@@ -24,6 +24,7 @@ block-obj-$(CONFIG_RBD) += rbd.o - block-obj-$(CONFIG_GLUSTERFS) += gluster.o - block-obj-$(CONFIG_VXHS) += vxhs.o - block-obj-$(CONFIG_LIBSSH2) += ssh.o -+block-obj-$(CONFIG_VMA) += vma.o - block-obj-y += accounting.o dirty-bitmap.o - block-obj-y += write-threshold.o - block-obj-y += backup.o -@@ -52,3 +53,5 @@ qcow.o-libs := -lz - linux-aio.o-libs := -laio - parallels.o-cflags := $(LIBXML2_CFLAGS) - parallels.o-libs := $(LIBXML2_LIBS) -+vma.o-cflags := $(VMA_CFLAGS) -+vma.o-libs := $(VMA_LIBS) -diff --git a/block/vma.c b/block/vma.c -new file mode 100644 -index 0000000000..7151514f94 ---- /dev/null -+++ b/block/vma.c -@@ -0,0 +1,424 @@ -+/* -+ * VMA archive backend for QEMU, container object -+ * -+ * Copyright (C) 2017 Proxmox Server Solutions GmbH -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. 
-+ * -+ */ -+#include -+ -+#include "qemu/osdep.h" -+#include "qemu/uuid.h" -+#include "qemu-common.h" -+#include "qapi/error.h" -+#include "qapi/qmp/qerror.h" -+#include "qapi/qmp/qstring.h" -+#include "qom/object.h" -+#include "qom/object_interfaces.h" -+#include "block/block_int.h" -+ -+/* exported interface */ -+void vma_object_add_config_file(Object *obj, const char *name, -+ const char *contents, size_t len, -+ Error **errp); -+ -+#define TYPE_VMA_OBJECT "vma" -+#define VMA_OBJECT(obj) \ -+ OBJECT_CHECK(VMAObjectState, (obj), TYPE_VMA_OBJECT) -+#define VMA_OBJECT_GET_CLASS(obj) \ -+ OBJECT_GET_CLASS(VMAObjectClass, (obj), TYPE_VMA_OBJECT) -+ -+typedef struct VMAObjectClass { -+ ObjectClass parent_class; -+} VMAObjectClass; -+ -+typedef struct VMAObjectState { -+ Object parent; -+ -+ char *filename; -+ -+ QemuUUID uuid; -+ bool blocked; -+ VMAWriter *vma; -+ QemuMutex mutex; -+} VMAObjectState; -+ -+static VMAObjectState *vma_by_id(const char *name) -+{ -+ Object *container; -+ Object *obj; -+ -+ container = object_get_objects_root(); -+ obj = object_resolve_path_component(container, name); -+ -+ return VMA_OBJECT(obj); -+} -+ -+static void vma_object_class_complete(UserCreatable *uc, Error **errp) -+{ -+ int rc; -+ VMAObjectState *vo = VMA_OBJECT(uc); -+ VMAObjectClass *voc = VMA_OBJECT_GET_CLASS(uc); -+ (void)!vo; -+ (void)!voc; -+ -+ if (!vo->filename) { -+ error_setg(errp, "Parameter 'filename' is required"); -+ return; -+ } -+ -+ rc = VMAWriter_fopen(vo->filename, &vo->vma); -+ if (rc < 0) { -+ error_setg_errno(errp, -rc, "failed to create VMA archive"); -+ return; -+ } -+ -+ rc = VMAWriter_set_uuid(vo->vma, vo->uuid.data, sizeof(vo->uuid.data)); -+ if (rc < 0) { -+ error_setg_errno(errp, -rc, "failed to set UUID of VMA archive"); -+ return; -+ } -+ -+ qemu_mutex_init(&vo->mutex); -+} -+ -+static bool vma_object_can_be_deleted(UserCreatable *uc, Error **errp) -+{ -+ //VMAObjectState *vo = VMA_OBJECT(uc); -+ //if (!vo->vma) { -+ // return true; -+ //} -+ 
//return false; -+ return true; -+} -+ -+static void vma_object_class_init(ObjectClass *oc, void *data) -+{ -+ UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); -+ -+ ucc->can_be_deleted = vma_object_can_be_deleted; -+ ucc->complete = vma_object_class_complete; -+} -+ -+static char *vma_object_get_filename(Object *obj, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ return g_strdup(vo->filename); -+} -+ -+static void vma_object_set_filename(Object *obj, const char *str, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ if (vo->vma) { -+ error_setg(errp, "filename cannot be changed after creation"); -+ return; -+ } -+ -+ g_free(vo->filename); -+ vo->filename = g_strdup(str); -+} -+ -+static char *vma_object_get_uuid(Object *obj, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ return qemu_uuid_unparse_strdup(&vo->uuid); -+} -+ -+static void vma_object_set_uuid(Object *obj, const char *str, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ if (vo->vma) { -+ error_setg(errp, "uuid cannot be changed after creation"); -+ return; -+ } -+ -+ qemu_uuid_parse(str, &vo->uuid); -+} -+ -+static bool vma_object_get_blocked(Object *obj, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ return vo->blocked; -+} -+ -+static void vma_object_set_blocked(Object *obj, bool blocked, Error **errp) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ (void)errp; -+ -+ vo->blocked = blocked; -+} -+ -+void vma_object_add_config_file(Object *obj, const char *name, -+ const char *contents, size_t len, -+ Error **errp) -+{ -+ int rc; -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ -+ if (!vo || !vo->vma) { -+ error_setg(errp, "not a valid vma object to add config files to"); -+ return; -+ } -+ -+ rc = VMAWriter_addConfigFile(vo->vma, name, contents, len); -+ if (rc < 0) { -+ error_setg_errno(errp, -rc, "failed to add config file to VMA"); -+ return; -+ } -+} -+ -+static void vma_object_init(Object *obj) -+{ -+ 
VMAObjectState *vo = VMA_OBJECT(obj); -+ (void)!vo; -+ -+ object_property_add_str(obj, "filename", -+ vma_object_get_filename, vma_object_set_filename, -+ NULL); -+ object_property_add_str(obj, "uuid", -+ vma_object_get_uuid, vma_object_set_uuid, -+ NULL); -+ object_property_add_bool(obj, "blocked", -+ vma_object_get_blocked, vma_object_set_blocked, -+ NULL); -+} -+ -+static void vma_object_finalize(Object *obj) -+{ -+ VMAObjectState *vo = VMA_OBJECT(obj); -+ VMAObjectClass *voc = VMA_OBJECT_GET_CLASS(obj); -+ (void)!voc; -+ -+ qemu_mutex_destroy(&vo->mutex); -+ -+ VMAWriter_destroy(vo->vma, true); -+ g_free(vo->filename); -+} -+ -+static const TypeInfo vma_object_info = { -+ .name = TYPE_VMA_OBJECT, -+ .parent = TYPE_OBJECT, -+ .class_size = sizeof(VMAObjectClass), -+ .class_init = vma_object_class_init, -+ .instance_size = sizeof(VMAObjectState), -+ .instance_init = vma_object_init, -+ .instance_finalize = vma_object_finalize, -+ .interfaces = (InterfaceInfo[]) { -+ { TYPE_USER_CREATABLE }, -+ { } -+ } -+}; -+ -+static void register_types(void) -+{ -+ type_register_static(&vma_object_info); -+} -+ -+type_init(register_types); -+ -+typedef struct { -+ VMAObjectState *vma_obj; -+ char *name; -+ size_t device_id; -+ uint64_t byte_size; -+} BDRVVMAState; -+ -+static void qemu_vma_parse_filename(const char *filename, QDict *options, -+ Error **errp) -+{ -+ char *sep; -+ -+ sep = strchr(filename, '/'); -+ if (!sep || sep == filename) { -+ error_setg(errp, "VMA filename should be /"); -+ return; -+ } -+ -+ qdict_put(options, "vma", qstring_from_substr(filename, 0, sep-filename-1)); -+ -+ while (*sep && *sep == '/') -+ ++sep; -+ if (!*sep) { -+ error_setg(errp, "missing device name\n"); -+ return; -+ } -+ -+ qdict_put(options, "name", qstring_from_str(sep)); -+} -+ -+static QemuOptsList runtime_opts = { -+ .name = "vma-drive", -+ .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), -+ .desc = { -+ { -+ .name = "vma", -+ .type = QEMU_OPT_STRING, -+ .help = "VMA Object 
name", -+ }, -+ { -+ .name = "name", -+ .type = QEMU_OPT_STRING, -+ .help = "VMA device name", -+ }, -+ { -+ .name = BLOCK_OPT_SIZE, -+ .type = QEMU_OPT_SIZE, -+ .help = "Virtual disk size" -+ }, -+ { /* end of list */ } -+ }, -+}; -+static int qemu_vma_open(BlockDriverState *bs, QDict *options, int flags, -+ Error **errp) -+{ -+ Error *local_err = NULL; -+ BDRVVMAState *s = bs->opaque; -+ QemuOpts *opts; -+ const char *vma_id, *device_name; -+ ssize_t dev_id; -+ int64_t bytes = 0; -+ int ret; -+ -+ opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort); -+ qemu_opts_absorb_qdict(opts, options, &local_err); -+ if (local_err) { -+ error_propagate(errp, local_err); -+ ret = -EINVAL; -+ goto failed_opts; -+ } -+ -+ bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0), -+ BDRV_SECTOR_SIZE); -+ -+ vma_id = qemu_opt_get(opts, "vma"); -+ device_name = qemu_opt_get(opts, "name"); -+ -+ VMAObjectState *vma = vma_by_id(vma_id); -+ if (!vma) { -+ ret = -EINVAL; -+ error_setg(errp, "no such VMA object: %s", vma_id); -+ goto failed_opts; -+ } -+ -+ dev_id = VMAWriter_findDevice(vma->vma, device_name); -+ if (dev_id >= 0) { -+ error_setg(errp, "drive already exists in VMA object"); -+ ret = -EIO; -+ goto failed_opts; -+ } -+ -+ dev_id = VMAWriter_addDevice(vma->vma, device_name, (uint64_t)bytes); -+ if (dev_id < 0) { -+ error_setg_errno(errp, -dev_id, "failed to add VMA device"); -+ ret = -EIO; -+ goto failed_opts; -+ } -+ -+ object_ref(OBJECT(vma)); -+ s->vma_obj = vma; -+ s->name = g_strdup(device_name); -+ s->device_id = (size_t)dev_id; -+ s->byte_size = bytes; -+ -+ ret = 0; -+ -+failed_opts: -+ qemu_opts_del(opts); -+ return ret; -+} -+ -+static void qemu_vma_close(BlockDriverState *bs) -+{ -+ BDRVVMAState *s = bs->opaque; -+ -+ (void)VMAWriter_finishDevice(s->vma_obj->vma, s->device_id); -+ object_unref(OBJECT(s->vma_obj)); -+ -+ g_free(s->name); -+} -+ -+static int64_t qemu_vma_getlength(BlockDriverState *bs) -+{ -+ BDRVVMAState *s = bs->opaque; -+ -+ 
return s->byte_size; -+} -+ -+static coroutine_fn int qemu_vma_co_writev(BlockDriverState *bs, -+ int64_t sector_num, -+ int nb_sectors, -+ QEMUIOVector *qiov) -+{ -+ size_t i; -+ ssize_t rc; -+ BDRVVMAState *s = bs->opaque; -+ VMAObjectState *vo = s->vma_obj; -+ off_t offset = sector_num * BDRV_SECTOR_SIZE; -+ -+ qemu_mutex_lock(&vo->mutex); -+ if (vo->blocked) { -+ return -EPERM; -+ } -+ for (i = 0; i != qiov->niov; ++i) { -+ const struct iovec *v = &qiov->iov[i]; -+ size_t blocks = v->iov_len / VMA_BLOCK_SIZE; -+ if (blocks * VMA_BLOCK_SIZE != v->iov_len) { -+ return -EIO; -+ } -+ rc = VMAWriter_writeBlocks(vo->vma, s->device_id, -+ v->iov_base, blocks, offset); -+ if (errno) { -+ return -errno; -+ } -+ if (rc != blocks) { -+ return -EIO; -+ } -+ offset += v->iov_len; -+ } -+ qemu_mutex_unlock(&vo->mutex); -+ return 0; -+} -+ -+static int qemu_vma_get_info(BlockDriverState *bs, BlockDriverInfo *bdi) -+{ -+ bdi->cluster_size = VMA_CLUSTER_SIZE; -+ bdi->unallocated_blocks_are_zero = true; -+ bdi->can_write_zeroes_with_unmap = false; -+ return 0; -+} -+ -+static BlockDriver bdrv_vma_drive = { -+ .format_name = "vma-drive", -+ .instance_size = sizeof(BDRVVMAState), -+ -+#if 0 -+ .bdrv_create = qemu_vma_create, -+ .create_opts = &qemu_vma_create_opts, -+#endif -+ -+ .bdrv_parse_filename = qemu_vma_parse_filename, -+ .bdrv_file_open = qemu_vma_open, -+ -+ .bdrv_close = qemu_vma_close, -+ .bdrv_has_zero_init = bdrv_has_zero_init_1, -+ .bdrv_getlength = qemu_vma_getlength, -+ .bdrv_get_info = qemu_vma_get_info, -+ -+ .bdrv_co_writev = qemu_vma_co_writev, -+}; -+ -+static void bdrv_vma_init(void) -+{ -+ bdrv_register(&bdrv_vma_drive); -+} -+ -+block_init(bdrv_vma_init); -diff --git a/blockdev.c b/blockdev.c -index d5eb6b62ca..4f18d3c3d7 100644 ---- a/blockdev.c -+++ b/blockdev.c -@@ -31,11 +31,13 @@ - */ - - #include "qemu/osdep.h" -+#include "qemu/uuid.h" - #include "sysemu/block-backend.h" - #include "sysemu/blockdev.h" - #include "hw/block/block.h" - #include 
"block/blockjob.h" - #include "block/qdict.h" -+#include "block/blockjob_int.h" - #include "block/throttle-groups.h" - #include "monitor/monitor.h" - #include "qemu/error-report.h" -@@ -44,6 +46,7 @@ - #include "qapi/qapi-commands-block.h" - #include "qapi/qapi-commands-transaction.h" - #include "qapi/qapi-visit-block-core.h" -+#include "qapi/qapi-types-misc.h" - #include "qapi/qmp/qdict.h" - #include "qapi/qmp/qnum.h" - #include "qapi/qmp/qstring.h" -@@ -3220,6 +3223,539 @@ out: - aio_context_release(aio_context); - } - -+/* PVE backup related function */ -+ -+static struct PVEBackupState { -+ Error *error; -+ bool cancel; -+ QemuUUID uuid; -+ char uuid_str[37]; -+ int64_t speed; -+ time_t start_time; -+ time_t end_time; -+ char *backup_file; -+ Object *vmaobj; -+ GList *di_list; -+ size_t next_job; -+ size_t total; -+ size_t transferred; -+ size_t zero_bytes; -+ QemuMutex backup_mutex; -+ bool backup_mutex_initialized; -+} backup_state; -+ -+typedef struct PVEBackupDevInfo { -+ BlockDriverState *bs; -+ size_t size; -+ uint8_t dev_id; -+ bool completed; -+ char targetfile[PATH_MAX]; -+ BlockDriverState *target; -+} PVEBackupDevInfo; -+ -+static void pvebackup_run_next_job(void); -+ -+static void pvebackup_cleanup(void) -+{ -+ qemu_mutex_lock(&backup_state.backup_mutex); -+ // Avoid race between block jobs and backup-cancel command: -+ if (!backup_state.vmaw) { -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ return; -+ } -+ -+ backup_state.end_time = time(NULL); -+ -+ if (backup_state.vmaobj) { -+ object_unparent(backup_state.vmaobj); -+ backup_state.vmaobj = NULL; -+ } -+ -+ g_list_free(backup_state.di_list); -+ backup_state.di_list = NULL; -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+} -+ -+static void pvebackup_complete_cb(void *opaque, int ret) -+{ -+ // This always runs in the main loop -+ -+ PVEBackupDevInfo *di = opaque; -+ -+ di->completed = true; -+ -+ if (ret < 0 && !backup_state.error) { -+ error_setg(&backup_state.error, "job failed with err 
%d - %s", -+ ret, strerror(-ret)); -+ } -+ -+ di->bs = NULL; -+ di->target = NULL; -+ -+ if (backup_state.vmaobj) { -+ object_unparent(backup_state.vmaobj); -+ backup_state.vmaobj = NULL; -+ } -+ -+ // remove self from job queue -+ qemu_mutex_lock(&backup_state.backup_mutex); -+ backup_state.di_list = g_list_remove(backup_state.di_list, di); -+ g_free(di); -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ -+ if (!backup_state.cancel) { -+ pvebackup_run_next_job(); -+ } -+} -+ -+static void pvebackup_cancel(void *opaque) -+{ -+ backup_state.cancel = true; -+ qemu_mutex_lock(&backup_state.backup_mutex); -+ // Avoid race between block jobs and backup-cancel command: -+ if (!backup_state.vmaw) { -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ return; -+ } -+ -+ if (!backup_state.error) { -+ error_setg(&backup_state.error, "backup cancelled"); -+ } -+ -+ if (backup_state.vmaobj) { -+ Error *err; -+ /* make sure vma writer does not block anymore */ -+ if (!object_set_props(backup_state.vmaobj, &err, "blocked", "yes", NULL)) { -+ if (err) { -+ error_report_err(err); -+ } -+ } -+ } -+ -+ GList *l = backup_state.di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ if (!di->completed && di->bs) { -+ BlockJob *job = di->bs->job; -+ if (job) { -+ AioContext *aio_context = blk_get_aio_context(job->blk); -+ aio_context_acquire(aio_context); -+ if (!di->completed) { -+ job_cancel(&job->job, false); -+ } -+ aio_context_release(aio_context); -+ } -+ } -+ } -+ -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ pvebackup_cleanup(); -+} -+ -+void qmp_backup_cancel(Error **errp) -+{ -+ if (!backup_state.backup_mutex_initialized) -+ return; -+ Coroutine *co = qemu_coroutine_create(pvebackup_cancel, NULL); -+ qemu_coroutine_enter(co); -+ -+ while (backup_state.vmaobj) { -+ /* FIXME: Find something better for this */ -+ aio_poll(qemu_get_aio_context(), true); -+ } -+} -+ -+void vma_object_add_config_file(Object *obj, const char 
*name, -+ const char *contents, size_t len, -+ Error **errp); -+static int config_to_vma(const char *file, BackupFormat format, -+ Object *vmaobj, -+ const char *backup_dir, -+ Error **errp) -+{ -+ char *cdata = NULL; -+ gsize clen = 0; -+ GError *err = NULL; -+ if (!g_file_get_contents(file, &cdata, &clen, &err)) { -+ error_setg(errp, "unable to read file '%s'", file); -+ return 1; -+ } -+ -+ char *basename = g_path_get_basename(file); -+ -+ if (format == BACKUP_FORMAT_VMA) { -+ vma_object_add_config_file(vmaobj, basename, cdata, clen, errp); -+ } else if (format == BACKUP_FORMAT_DIR) { -+ char config_path[PATH_MAX]; -+ snprintf(config_path, PATH_MAX, "%s/%s", backup_dir, basename); -+ if (!g_file_set_contents(config_path, cdata, clen, &err)) { -+ error_setg(errp, "unable to write config file '%s'", config_path); -+ g_free(cdata); -+ g_free(basename); -+ return 1; -+ } -+ } -+ -+ g_free(basename); -+ g_free(cdata); -+ return 0; -+} -+ -+static void pvebackup_run_next_job(void) -+{ -+ qemu_mutex_lock(&backup_state.backup_mutex); -+ -+ GList *next = g_list_nth(backup_state.di_list, backup_state.next_job); -+ while (next) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)next->data; -+ backup_state.next_job++; -+ if (!di->completed && di->bs && di->bs->job) { -+ BlockJob *job = di->bs->job; -+ AioContext *aio_context = blk_get_aio_context(job->blk); -+ aio_context_acquire(aio_context); -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ if (backup_state.error || backup_state.cancel) { -+ job_cancel_sync(job); -+ } else { -+ job_resume(job); -+ } -+ aio_context_release(aio_context); -+ return; -+ } -+ next = g_list_next(next); -+ } -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ -+ // no more jobs, run the cleanup -+ pvebackup_cleanup(); -+} -+ -+UuidInfo *qmp_backup(const char *backup_file, bool has_format, -+ BackupFormat format, -+ bool has_config_file, const char *config_file, -+ bool has_firewall_file, const char *firewall_file, -+ bool has_devlist, const 
char *devlist, -+ bool has_speed, int64_t speed, Error **errp) -+{ -+ BlockBackend *blk; -+ BlockDriverState *bs = NULL; -+ const char *backup_dir = NULL; -+ Error *local_err = NULL; -+ QemuUUID uuid; -+ gchar **devs = NULL; -+ GList *di_list = NULL; -+ GList *l; -+ UuidInfo *uuid_info; -+ BlockJob *job; -+ -+ if (!backup_state.backup_mutex_initialized) { -+ qemu_mutex_init(&backup_state.backup_mutex); -+ backup_state.backup_mutex_initialized = true; -+ } -+ -+ if (backup_state.di_list || backup_state.vmaobj) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "previous backup not finished"); -+ return NULL; -+ } -+ -+ /* Todo: try to auto-detect format based on file name */ -+ format = has_format ? format : BACKUP_FORMAT_VMA; -+ -+ if (has_devlist) { -+ devs = g_strsplit_set(devlist, ",;:", -1); -+ -+ gchar **d = devs; -+ while (d && *d) { -+ blk = blk_by_name(*d); -+ if (blk) { -+ bs = blk_bs(blk); -+ if (bdrv_is_read_only(bs)) { -+ error_setg(errp, "Node '%s' is read only", *d); -+ goto err; -+ } -+ if (!bdrv_is_inserted(bs)) { -+ error_setg(errp, QERR_DEVICE_HAS_NO_MEDIUM, *d); -+ goto err; -+ } -+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); -+ di->bs = bs; -+ di_list = g_list_append(di_list, di); -+ } else { -+ error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND, -+ "Device '%s' not found", *d); -+ goto err; -+ } -+ d++; -+ } -+ -+ } else { -+ BdrvNextIterator it; -+ -+ bs = NULL; -+ for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) { -+ if (!bdrv_is_inserted(bs) || bdrv_is_read_only(bs)) { -+ continue; -+ } -+ -+ PVEBackupDevInfo *di = g_new0(PVEBackupDevInfo, 1); -+ di->bs = bs; -+ di_list = g_list_append(di_list, di); -+ } -+ } -+ -+ if (!di_list) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "empty device list"); -+ goto err; -+ } -+ -+ size_t total = 0; -+ -+ l = di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ if (bdrv_op_is_blocked(di->bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) { -+ goto err; -+ } 
-+ -+ ssize_t size = bdrv_getlength(di->bs); -+ if (size < 0) { -+ error_setg_errno(errp, -di->size, "bdrv_getlength failed"); -+ goto err; -+ } -+ di->size = size; -+ total += size; -+ } -+ -+ qemu_uuid_generate(&uuid); -+ -+ if (format == BACKUP_FORMAT_VMA) { -+ char uuidstr[UUID_FMT_LEN+1]; -+ qemu_uuid_unparse(&uuid, uuidstr); -+ uuidstr[UUID_FMT_LEN] = 0; -+ backup_state.vmaobj = -+ object_new_with_props("vma", object_get_objects_root(), -+ "vma-backup-obj", &local_err, -+ "filename", backup_file, -+ "uuid", uuidstr, -+ NULL); -+ if (!backup_state.vmaobj) { -+ if (local_err) { -+ error_propagate(errp, local_err); -+ } -+ goto err; -+ } -+ -+ l = di_list; -+ while (l) { -+ QDict *options = qdict_new(); -+ -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ -+ const char *devname = bdrv_get_device_name(di->bs); -+ snprintf(di->targetfile, PATH_MAX, "vma-backup-obj/%s.raw", devname); -+ -+ qdict_put(options, "driver", qstring_from_str("vma-drive")); -+ qdict_put(options, "size", qint_from_int(di->size)); -+ di->target = bdrv_open(di->targetfile, NULL, options, BDRV_O_RDWR, &local_err); -+ if (!di->target) { -+ error_propagate(errp, local_err); -+ goto err; -+ } -+ } -+ } else if (format == BACKUP_FORMAT_DIR) { -+ if (mkdir(backup_file, 0640) != 0) { -+ error_setg_errno(errp, errno, "can't create directory '%s'\n", -+ backup_file); -+ goto err; -+ } -+ backup_dir = backup_file; -+ -+ l = di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ -+ const char *devname = bdrv_get_device_name(di->bs); -+ snprintf(di->targetfile, PATH_MAX, "%s/%s.raw", backup_dir, devname); -+ -+ int flags = BDRV_O_RDWR; -+ bdrv_img_create(di->targetfile, "raw", NULL, NULL, NULL, -+ di->size, flags, false, &local_err); -+ if (local_err) { -+ error_propagate(errp, local_err); -+ goto err; -+ } -+ -+ di->target = bdrv_open(di->targetfile, NULL, NULL, flags, &local_err); -+ if (!di->target) { -+ 
error_propagate(errp, local_err); -+ goto err; -+ } -+ } -+ } else { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, "unknown backup format"); -+ goto err; -+ } -+ -+ /* add configuration file to archive */ -+ if (has_config_file) { -+ if(config_to_vma(config_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { -+ goto err; -+ } -+ } -+ -+ /* add firewall file to archive */ -+ if (has_firewall_file) { -+ if(config_to_vma(firewall_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { -+ goto err; -+ } -+ } -+ /* initialize global backup_state now */ -+ -+ backup_state.cancel = false; -+ -+ if (backup_state.error) { -+ error_free(backup_state.error); -+ backup_state.error = NULL; -+ } -+ -+ backup_state.speed = (has_speed && speed > 0) ? speed : 0; -+ -+ backup_state.start_time = time(NULL); -+ backup_state.end_time = 0; -+ -+ if (backup_state.backup_file) { -+ g_free(backup_state.backup_file); -+ } -+ backup_state.backup_file = g_strdup(backup_file); -+ -+ memcpy(&backup_state.uuid, &uuid, sizeof(uuid)); -+ qemu_uuid_unparse(&uuid, backup_state.uuid_str); -+ -+ qemu_mutex_lock(&backup_state.backup_mutex); -+ backup_state.di_list = di_list; -+ backup_state.next_job = 0; -+ -+ backup_state.total = total; -+ backup_state.transferred = 0; -+ backup_state.zero_bytes = 0; -+ -+ /* start all jobs (paused state) */ -+ l = di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ -+ job = backup_job_create(NULL, di->bs, di->target, speed, MIRROR_SYNC_MODE_FULL, NULL, -+ false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, -+ JOB_DEFAULT, -+ pvebackup_complete_cb, di, 2, NULL, &local_err); -+ if (di->target) { -+ bdrv_unref(di->target); -+ di->target = NULL; -+ } -+ if (!job || local_err != NULL) { -+ error_setg(&backup_state.error, "backup_job_create failed"); -+ pvebackup_cancel(NULL); -+ } else { -+ job_start(&job->job); -+ } -+ } -+ -+ qemu_mutex_unlock(&backup_state.backup_mutex); -+ -+ if 
(!backup_state.error) { -+ pvebackup_run_next_job(); // run one job -+ } -+ -+ uuid_info = g_malloc0(sizeof(*uuid_info)); -+ uuid_info->UUID = g_strdup(backup_state.uuid_str); -+ -+ return uuid_info; -+ -+err: -+ -+ l = di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); -+ -+ if (di->target) { -+ bdrv_unref(di->target); -+ } -+ -+ if (di->targetfile[0]) { -+ unlink(di->targetfile); -+ } -+ g_free(di); -+ } -+ g_list_free(di_list); -+ -+ if (devs) { -+ g_strfreev(devs); -+ } -+ -+ if (backup_state.vmaobj) { -+ object_unparent(backup_state.vmaobj); -+ backup_state.vmaobj = NULL; -+ } -+ -+ if (backup_dir) { -+ rmdir(backup_dir); -+ } -+ -+ return NULL; -+} -+ -+BackupStatus *qmp_query_backup(Error **errp) -+{ -+ BackupStatus *info = g_malloc0(sizeof(*info)); -+ -+ if (!backup_state.start_time) { -+ /* not started, return {} */ -+ return info; -+ } -+ -+ info->has_status = true; -+ info->has_start_time = true; -+ info->start_time = backup_state.start_time; -+ -+ if (backup_state.backup_file) { -+ info->has_backup_file = true; -+ info->backup_file = g_strdup(backup_state.backup_file); -+ } -+ -+ info->has_uuid = true; -+ info->uuid = g_strdup(backup_state.uuid_str); -+ -+ if (backup_state.end_time) { -+ if (backup_state.error) { -+ info->status = g_strdup("error"); -+ info->has_errmsg = true; -+ info->errmsg = g_strdup(error_get_pretty(backup_state.error)); -+ } else { -+ info->status = g_strdup("done"); -+ } -+ info->has_end_time = true; -+ info->end_time = backup_state.end_time; -+ } else { -+ info->status = g_strdup("active"); -+ } -+ -+ info->has_total = true; -+ info->total = backup_state.total; -+ info->has_zero_bytes = true; -+ info->zero_bytes = backup_state.zero_bytes; -+ info->has_transferred = true; -+ info->transferred = backup_state.transferred; -+ -+ return info; -+} -+ - void qmp_block_stream(bool has_job_id, const char *job_id, const char *device, - bool has_base, const char *base, - bool has_base_node, 
const char *base_node, -diff --git a/configure b/configure -index 2a7796ea80..601c1f44f9 100755 ---- a/configure -+++ b/configure -@@ -475,6 +475,7 @@ vxhs="" - libxml2="" - docker="no" - debug_mutex="no" -+vma="" - - # cross compilers defaults, can be overridden with --cross-cc-ARCH - cross_cc_aarch64="aarch64-linux-gnu-gcc" -@@ -1435,6 +1436,10 @@ for opt do - ;; - --disable-debug-mutex) debug_mutex=no - ;; -+ --enable-vma) vma=yes -+ ;; -+ --disable-vma) vma=no -+ ;; - *) - echo "ERROR: unknown option $opt" - echo "Try '$0 --help' for more information" -@@ -1710,6 +1715,7 @@ disabled with --disable-FEATURE, default is enabled if available: - vhost-user vhost-user support - capstone capstone disassembler support - debug-mutex mutex debugging support -+ vma VMA archive backend - - NOTE: The object files are built at the place where configure is launched - EOF -@@ -4124,6 +4130,22 @@ EOF - fi - - ########################################## -+# vma probe -+if test "$vma" != "no" ; then -+ if $pkg_config --exact-version=0.1.0 vma; then -+ vma="yes" -+ vma_cflags=$($pkg_config --cflags vma) -+ vma_libs=$($pkg_config --libs vma) -+ else -+ if test "$vma" = "yes" ; then -+ feature_not_found "VMA Archive backend support" \ -+ "Install libvma devel" -+ fi -+ vma="no" -+ fi -+fi -+ -+########################################## - # signalfd probe - signalfd="no" - cat > $TMPC << EOF -@@ -6010,6 +6032,7 @@ echo "replication support $replication" - echo "VxHS block device $vxhs" - echo "capstone $capstone" - echo "docker $docker" -+echo "VMA support $vma" - - if test "$sdl_too_old" = "yes"; then - echo "-> Your SDL version is too old - please upgrade to have SDL support" -@@ -6496,6 +6519,12 @@ if test "$usb_redir" = "yes" ; then - echo "USB_REDIR_LIBS=$usb_redir_libs" >> $config_host_mak - fi - -+if test "$vma" = "yes" ; then -+ echo "CONFIG_VMA=y" >> $config_host_mak -+ echo "VMA_CFLAGS=$vma_cflags" >> $config_host_mak -+ echo "VMA_LIBS=$vma_libs" >> $config_host_mak -+fi -+ 
- if test "$opengl" = "yes" ; then - echo "CONFIG_OPENGL=y" >> $config_host_mak - echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak -diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx -index 42c148fdc9..277e140092 100644 ---- a/hmp-commands-info.hx -+++ b/hmp-commands-info.hx -@@ -502,6 +502,19 @@ STEXI - Show CPU statistics. - ETEXI - -+ { -+ .name = "backup", -+ .args_type = "", -+ .params = "", -+ .help = "show backup status", -+ .cmd = hmp_info_backup, -+ }, -+ -+STEXI -+@item info backup -+show backup status -+ETEXI -+ - #if defined(CONFIG_SLIRP) - { - .name = "usernet", -diff --git a/hmp-commands.hx b/hmp-commands.hx -index a6f0720442..956cbf04b9 100644 ---- a/hmp-commands.hx -+++ b/hmp-commands.hx -@@ -107,6 +107,37 @@ STEXI - Copy data from a backing file into a block device. - ETEXI - -+ { -+ .name = "backup", -+ .args_type = "directory:-d,backupfile:s,speed:o?,devlist:s?", -+ .params = "[-d] backupfile [speed [devlist]]", -+ .help = "create a VM Backup." -+ "\n\t\t\t Use -d to dump data into a directory instead" -+ "\n\t\t\t of using VMA format.", -+ .cmd = hmp_backup, -+ }, -+ -+STEXI -+@item backup -+@findex backup -+Create a VM backup. -+ETEXI -+ -+ { -+ .name = "backup_cancel", -+ .args_type = "", -+ .params = "", -+ .help = "cancel the current VM backup", -+ .cmd = hmp_backup_cancel, -+ }, -+ -+STEXI -+@item backup_cancel -+@findex backup_cancel -+Cancel the current VM backup. 
-+ -+ETEXI -+ - { - .name = "block_job_set_speed", - .args_type = "device:B,speed:o", -diff --git a/hmp.c b/hmp.c -index 7c975f3ead..8d659e20f6 100644 ---- a/hmp.c -+++ b/hmp.c -@@ -166,6 +166,44 @@ void hmp_info_mice(Monitor *mon, const QDict *qdict) - qapi_free_MouseInfoList(mice_list); - } - -+void hmp_info_backup(Monitor *mon, const QDict *qdict) -+{ -+ BackupStatus *info; -+ -+ info = qmp_query_backup(NULL); -+ if (info->has_status) { -+ if (info->has_errmsg) { -+ monitor_printf(mon, "Backup status: %s - %s\n", -+ info->status, info->errmsg); -+ } else { -+ monitor_printf(mon, "Backup status: %s\n", info->status); -+ } -+ } -+ -+ if (info->has_backup_file) { -+ monitor_printf(mon, "Start time: %s", ctime(&info->start_time)); -+ if (info->end_time) { -+ monitor_printf(mon, "End time: %s", ctime(&info->end_time)); -+ } -+ -+ int per = (info->has_total && info->total && -+ info->has_transferred && info->transferred) ? -+ (info->transferred * 100)/info->total : 0; -+ int zero_per = (info->has_total && info->total && -+ info->has_zero_bytes && info->zero_bytes) ? 
-+ (info->zero_bytes * 100)/info->total : 0; -+ monitor_printf(mon, "Backup file: %s\n", info->backup_file); -+ monitor_printf(mon, "Backup uuid: %s\n", info->uuid); -+ monitor_printf(mon, "Total size: %zd\n", info->total); -+ monitor_printf(mon, "Transferred bytes: %zd (%d%%)\n", -+ info->transferred, per); -+ monitor_printf(mon, "Zero bytes: %zd (%d%%)\n", -+ info->zero_bytes, zero_per); -+ } -+ -+ qapi_free_BackupStatus(info); -+} -+ - void hmp_info_migrate(Monitor *mon, const QDict *qdict) - { - MigrationInfo *info; -@@ -1899,6 +1937,31 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict) - hmp_handle_error(mon, &error); - } - -+void hmp_backup_cancel(Monitor *mon, const QDict *qdict) -+{ -+ Error *error = NULL; -+ -+ qmp_backup_cancel(&error); -+ -+ hmp_handle_error(mon, &error); -+} -+ -+void hmp_backup(Monitor *mon, const QDict *qdict) -+{ -+ Error *error = NULL; -+ -+ int dir = qdict_get_try_bool(qdict, "directory", 0); -+ const char *backup_file = qdict_get_str(qdict, "backupfile"); -+ const char *devlist = qdict_get_try_str(qdict, "devlist"); -+ int64_t speed = qdict_get_try_int(qdict, "speed", 0); -+ -+ qmp_backup(backup_file, true, dir ? 
BACKUP_FORMAT_DIR : BACKUP_FORMAT_VMA, -+ false, NULL, false, NULL, !!devlist, -+ devlist, qdict_haskey(qdict, "speed"), speed, &error); -+ -+ hmp_handle_error(mon, &error); -+} -+ - void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict) - { - Error *error = NULL; -diff --git a/hmp.h b/hmp.h -index 98bb7a44db..853f233195 100644 ---- a/hmp.h -+++ b/hmp.h -@@ -29,6 +29,7 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict); - void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict); - void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict); - void hmp_info_migrate_cache_size(Monitor *mon, const QDict *qdict); -+void hmp_info_backup(Monitor *mon, const QDict *qdict); - void hmp_info_cpus(Monitor *mon, const QDict *qdict); - void hmp_info_block(Monitor *mon, const QDict *qdict); - void hmp_info_blockstats(Monitor *mon, const QDict *qdict); -@@ -86,6 +87,8 @@ void hmp_eject(Monitor *mon, const QDict *qdict); - void hmp_change(Monitor *mon, const QDict *qdict); - void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict); - void hmp_block_stream(Monitor *mon, const QDict *qdict); -+void hmp_backup(Monitor *mon, const QDict *qdict); -+void hmp_backup_cancel(Monitor *mon, const QDict *qdict); - void hmp_block_job_set_speed(Monitor *mon, const QDict *qdict); - void hmp_block_job_cancel(Monitor *mon, const QDict *qdict); - void hmp_block_job_pause(Monitor *mon, const QDict *qdict); -diff --git a/qapi/block-core.json b/qapi/block-core.json -index 5b9084a394..9c3c2d4917 100644 ---- a/qapi/block-core.json -+++ b/qapi/block-core.json -@@ -718,6 +718,97 @@ - - - ## -+# @BackupStatus: -+# -+# Detailed backup status. -+# -+# @status: string describing the current backup status. -+# This can be 'active', 'done', 'error'. 
If this field is not -+# returned, no backup process has been initiated -+# -+# @errmsg: error message (only returned if status is 'error') -+# -+# @total: total amount of bytes involved in the backup process -+# -+# @transferred: amount of bytes already backed up. -+# -+# @zero-bytes: amount of 'zero' bytes detected. -+# -+# @start-time: time (epoch) when backup job started. -+# -+# @end-time: time (epoch) when backup job finished. -+# -+# @backup-file: backup file name -+# -+# @uuid: uuid for this backup job -+# -+## -+{ 'struct': 'BackupStatus', -+ 'data': {'*status': 'str', '*errmsg': 'str', '*total': 'int', -+ '*transferred': 'int', '*zero-bytes': 'int', -+ '*start-time': 'int', '*end-time': 'int', -+ '*backup-file': 'str', '*uuid': 'str' } } -+ -+## -+# @BackupFormat: -+# -+# An enumeration of supported backup formats. -+# -+# @vma: Proxmox vma backup format -+## -+{ 'enum': 'BackupFormat', -+ 'data': [ 'vma', 'dir' ] } -+ -+## -+# @backup: -+# -+# Starts a VM backup. -+# -+# @backup-file: the backup file name -+# -+# @format: format of the backup file -+# -+# @config-file: a configuration file to include into -+# the backup archive. -+# -+# @speed: the maximum speed, in bytes per second -+# -+# @devlist: list of block device names (separated by ',', ';' -+# or ':'). By default the backup includes all writable block devices. -+# -+# Returns: the uuid of the backup job -+# -+## -+{ 'command': 'backup', 'data': { 'backup-file': 'str', -+ '*format': 'BackupFormat', -+ '*config-file': 'str', -+ '*firewall-file': 'str', -+ '*devlist': 'str', '*speed': 'int' }, -+ 'returns': 'UuidInfo' } -+ -+## -+# @query-backup: -+# -+# Returns information about current/last backup task. -+# -+# Returns: @BackupStatus -+# -+## -+{ 'command': 'query-backup', 'returns': 'BackupStatus' } -+ -+## -+# @backup-cancel: -+# -+# Cancel the current executing backup process. -+# -+# Returns: nothing on success -+# -+# Notes: This command succeeds even if there is no backup process running. 
-+# -+## -+{ 'command': 'backup-cancel' } -+ -+## - # @BlockDeviceTimedStats: - # - # Statistics of a block device during a given interval of time. -@@ -2549,7 +2640,7 @@ - 'host_cdrom', 'host_device', 'http', 'https', 'iscsi', 'luks', - 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels', 'qcow', - 'qcow2', 'qed', 'quorum', 'raw', 'rbd', 'replication', 'sheepdog', -- 'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] } -+ 'ssh', 'throttle', 'vdi', 'vhdx', 'vma-drive', 'vmdk', 'vpc', 'vvfat', 'vxhs' ] } - - ## - # @BlockdevOptionsFile: -@@ -3550,6 +3641,21 @@ - '*tls-creds': 'str' } } - - ## -+# @BlockdevOptionsVMADrive: -+# -+# Driver specific block device options for VMA Drives -+# -+# @filename: vma-drive path -+# -+# @size: drive size in bytes -+# -+# Since: 2.9 -+## -+{ 'struct': 'BlockdevOptionsVMADrive', -+ 'data': { 'filename': 'str', -+ 'size': 'int' } } -+ -+## - # @BlockdevOptionsThrottle: - # - # Driver specific block device options for the throttle driver -@@ -3633,6 +3739,7 @@ - 'throttle': 'BlockdevOptionsThrottle', - 'vdi': 'BlockdevOptionsGenericFormat', - 'vhdx': 'BlockdevOptionsGenericFormat', -+ 'vma-drive': 'BlockdevOptionsVMADrive', - 'vmdk': 'BlockdevOptionsGenericCOWFormat', - 'vpc': 'BlockdevOptionsGenericFormat', - 'vvfat': 'BlockdevOptionsVVFAT', -diff --git a/qapi/common.json b/qapi/common.json -index c367adc4b6..070b7b52c8 100644 ---- a/qapi/common.json -+++ b/qapi/common.json -@@ -149,3 +149,16 @@ - 'ppc64', 'ppcemb', 'riscv32', 'riscv64', 's390x', 'sh4', - 'sh4eb', 'sparc', 'sparc64', 'tricore', 'unicore32', - 'x86_64', 'xtensa', 'xtensaeb' ] } -+ -+## -+# @UuidInfo: -+# -+# Guest UUID information (Universally Unique Identifier). -+# -+# @UUID: the UUID of the guest -+# -+# Since: 0.14.0 -+# -+# Notes: If no UUID was specified for the guest, a null UUID is returned. 
-+## -+{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} } -diff --git a/qapi/misc.json b/qapi/misc.json -index b6ad5f028d..3dd5117fc3 100644 ---- a/qapi/misc.json -+++ b/qapi/misc.json -@@ -275,19 +275,6 @@ - { 'command': 'query-kvm', 'returns': 'KvmInfo' } - - ## --# @UuidInfo: --# --# Guest UUID information (Universally Unique Identifier). --# --# @UUID: the UUID of the guest --# --# Since: 0.14.0 --# --# Notes: If no UUID was specified for the guest, a null UUID is returned. --## --{ 'struct': 'UuidInfo', 'data': {'UUID': 'str'} } -- --## - # @query-uuid: - # - # Query the guest UUID information. --- -2.11.0 - diff --git a/debian/patches/pve/0024-PVE-vma-add-cache-option-to-device-map.patch b/debian/patches/pve/0024-PVE-vma-add-cache-option-to-device-map.patch new file mode 100644 index 0000000..15e0deb --- /dev/null +++ b/debian/patches/pve/0024-PVE-vma-add-cache-option-to-device-map.patch @@ -0,0 +1,95 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Thu, 22 Mar 2018 15:32:04 +0100 +Subject: [PATCH] PVE: vma: add cache option to device map + +Signed-off-by: Wolfgang Bumiller +--- + vma.c | 16 +++++++++++++++- + 1 file changed, 15 insertions(+), 1 deletion(-) + +diff --git a/vma.c b/vma.c +index f9f5c308fe..476b7bee00 100644 +--- a/vma.c ++++ b/vma.c +@@ -135,6 +135,7 @@ typedef struct RestoreMap { + char *format; + uint64_t throttling_bps; + char *throttling_group; ++ char *cache; + bool write_zero; + } RestoreMap; + +@@ -242,6 +243,7 @@ static int extract_content(int argc, char **argv) + char *format = NULL; + char *bps = NULL; + char *group = NULL; ++ char *cache = NULL; + if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { + break; + } +@@ -256,7 +258,8 @@ static int extract_content(int argc, char **argv) + while (1) { + if (!try_parse_option(&line, "format", &format, inbuf) && + !try_parse_option(&line, "throttling.bps", &bps, inbuf) && +- !try_parse_option(&line, "throttling.group", &group, 
inbuf)) ++ !try_parse_option(&line, "throttling.group", &group, inbuf) && ++ !try_parse_option(&line, "cache", &cache, inbuf)) + { + break; + } +@@ -293,6 +296,7 @@ static int extract_content(int argc, char **argv) + map->format = format; + map->throttling_bps = bps_value; + map->throttling_group = group; ++ map->cache = cache; + map->write_zero = write_zero; + + g_hash_table_insert(devmap, map->devname, map); +@@ -322,6 +326,7 @@ static int extract_content(int argc, char **argv) + const char *format = NULL; + uint64_t throttling_bps = 0; + const char *throttling_group = NULL; ++ const char *cache = NULL; + int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; + bool write_zero = true; + +@@ -335,6 +340,7 @@ static int extract_content(int argc, char **argv) + format = map->format; + throttling_bps = map->throttling_bps; + throttling_group = map->throttling_group; ++ cache = map->cache; + write_zero = map->write_zero; + } else { + devfn = g_strdup_printf("%s/tmp-disk-%s.raw", +@@ -356,6 +362,7 @@ static int extract_content(int argc, char **argv) + + size_t devlen = strlen(devfn); + QDict *options = NULL; ++ bool writethrough; + if (format) { + /* explicit format from commandline */ + options = qdict_new(); +@@ -370,12 +377,19 @@ static int extract_content(int argc, char **argv) + options = qdict_new(); + qdict_put_str(options, "driver", "raw"); + } ++ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) { ++ g_error("invalid cache option: %s\n", cache); ++ } + + if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { + g_error("can't open file %s - %s", devfn, + error_get_pretty(errp)); + } + ++ if (cache) { ++ blk_set_enable_write_cache(blk, !writethrough); ++ } ++ + if (throttling_group) { + blk_io_limits_enable(blk, throttling_group); + } +-- +2.11.0 + diff --git a/debian/patches/pve/0025-PVE-Deprecated-adding-old-vma-files.patch b/debian/patches/pve/0025-PVE-Deprecated-adding-old-vma-files.patch deleted file mode 100644 index ae1dd0b..0000000 
--- a/debian/patches/pve/0025-PVE-Deprecated-adding-old-vma-files.patch +++ /dev/null @@ -1,3294 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Mon, 7 Aug 2017 08:51:16 +0200 -Subject: [PATCH] PVE: [Deprecated] adding old vma files - -TODO: Move to using a libvma block backend ---- - Makefile | 3 +- - Makefile.objs | 1 + - block/backup.c | 107 ++++-- - block/replication.c | 1 + - blockdev.c | 208 +++++++---- - include/block/block_int.h | 4 + - job.c | 3 +- - vma-reader.c | 857 ++++++++++++++++++++++++++++++++++++++++++++++ - vma-writer.c | 771 +++++++++++++++++++++++++++++++++++++++++ - vma.c | 756 ++++++++++++++++++++++++++++++++++++++++ - vma.h | 150 ++++++++ - 11 files changed, 2754 insertions(+), 107 deletions(-) - create mode 100644 vma-reader.c - create mode 100644 vma-writer.c - create mode 100644 vma.c - create mode 100644 vma.h - -diff --git a/Makefile b/Makefile -index 2da686be33..5a0aad2004 100644 ---- a/Makefile -+++ b/Makefile -@@ -436,7 +436,7 @@ dummy := $(call unnest-vars,, \ - - include $(SRC_PATH)/tests/Makefile.include - --all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules -+all: $(DOCS) $(TOOLS) vma$(EXESUF) $(HELPERS-y) recurse-all modules - - qemu-version.h: FORCE - $(call quiet-command, \ -@@ -537,6 +537,7 @@ qemu-img.o: qemu-img-cmds.h - qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) - qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) - qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) -+vma$(EXESUF): vma.o vma-reader.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS) - - qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o $(COMMON_LDADDS) - -diff --git a/Makefile.objs b/Makefile.objs -index a836ee87d7..92c7886dee 100644 ---- a/Makefile.objs -+++ b/Makefile.objs -@@ -70,6 +70,7 @@ block-obj-y += 
block.o blockjob.o job.o - block-obj-y += block/ scsi/ - block-obj-y += qemu-io-cmds.o - block-obj-$(CONFIG_REPLICATION) += replication.o -+block-obj-y += vma-writer.o - - block-obj-m = block/ - -diff --git a/block/backup.c b/block/backup.c -index 7f970842d7..5f53163a77 100644 ---- a/block/backup.c -+++ b/block/backup.c -@@ -34,6 +34,7 @@ typedef struct BackupBlockJob { - /* bitmap for sync=incremental */ - BdrvDirtyBitmap *sync_bitmap; - MirrorSyncMode sync_mode; -+ BackupDumpFunc *dump_cb; - BlockdevOnError on_source_error; - BlockdevOnError on_target_error; - CoRwlock flush_rwlock; -@@ -126,12 +127,20 @@ static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job, - } - - if (qemu_iovec_is_zero(&qiov)) { -- ret = blk_co_pwrite_zeroes(job->target, start, -- qiov.size, write_flags | BDRV_REQ_MAY_UNMAP); -+ if (job->dump_cb) { -+ ret = job->dump_cb(job->common.job.opaque, job->target, start, qiov.size, NULL); -+ } else { -+ ret = blk_co_pwrite_zeroes(job->target, start, -+ qiov.size, write_flags | BDRV_REQ_MAY_UNMAP); -+ } - } else { -- ret = blk_co_pwritev(job->target, start, -- qiov.size, &qiov, write_flags | -- (job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0)); -+ if (job->dump_cb) { -+ ret = job->dump_cb(job->common.job.opaque, job->target, start, qiov.size, *bounce_buffer); -+ } else { -+ ret = blk_co_pwritev(job->target, start, -+ qiov.size, &qiov, write_flags | -+ (job->compress ? 
BDRV_REQ_WRITE_COMPRESSED : 0)); -+ } - } - if (ret < 0) { - trace_backup_do_cow_write_fail(job, start, ret); -@@ -209,7 +218,11 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job, - trace_backup_do_cow_process(job, start); - - if (job->use_copy_range) { -- ret = backup_cow_with_offload(job, start, end, is_write_notifier); -+ if (job->dump_cb) { -+ ret = - 1; -+ } else { -+ ret = backup_cow_with_offload(job, start, end, is_write_notifier); -+ } - if (ret < 0) { - job->use_copy_range = false; - } -@@ -293,7 +306,9 @@ static void backup_abort(Job *job) - static void backup_clean(Job *job) - { - BackupBlockJob *s = container_of(job, BackupBlockJob, common.job); -- assert(s->target); -+ if (!s->target) { -+ return; -+ } - blk_unref(s->target); - s->target = NULL; - } -@@ -302,7 +317,9 @@ static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context) - { - BackupBlockJob *s = container_of(job, BackupBlockJob, common); - -- blk_set_aio_context(s->target, aio_context); -+ if (s->target) { -+ blk_set_aio_context(s->target, aio_context); -+ } - } - - void backup_do_checkpoint(BlockJob *job, Error **errp) -@@ -374,9 +391,11 @@ static BlockErrorAction backup_error_action(BackupBlockJob *job, - if (read) { - return block_job_error_action(&job->common, job->on_source_error, - true, error); -- } else { -+ } else if (job->target) { - return block_job_error_action(&job->common, job->on_target_error, - false, error); -+ } else { -+ return BLOCK_ERROR_ACTION_REPORT; - } - } - -@@ -612,6 +631,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - BlockdevOnError on_source_error, - BlockdevOnError on_target_error, - int creation_flags, -+ BackupDumpFunc *dump_cb, - BlockCompletionFunc *cb, void *opaque, - int pause_count, - JobTxn *txn, Error **errp) -@@ -622,7 +642,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - int ret; - - assert(bs); -- assert(target); -+ assert(target || dump_cb); - - if (bs == target) 
{ - error_setg(errp, "Source and target cannot be the same"); -@@ -635,13 +655,13 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - return NULL; - } - -- if (!bdrv_is_inserted(target)) { -+ if (target && !bdrv_is_inserted(target)) { - error_setg(errp, "Device is not inserted: %s", - bdrv_get_device_name(target)); - return NULL; - } - -- if (compress && target->drv->bdrv_co_pwritev_compressed == NULL) { -+ if (target && compress && target->drv->bdrv_co_pwritev_compressed == NULL) { - error_setg(errp, "Compression is not supported for this drive %s", - bdrv_get_device_name(target)); - return NULL; -@@ -651,7 +671,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - return NULL; - } - -- if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) { -+ if (target && bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) { - return NULL; - } - -@@ -691,15 +711,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - goto error; - } - -- /* The target must match the source in size, so no resize here either */ -- job->target = blk_new(BLK_PERM_WRITE, -- BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | -- BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD); -- ret = blk_insert_bs(job->target, target, errp); -- if (ret < 0) { -- goto error; -+ if (target) { -+ /* The target must match the source in size, so no resize here either */ -+ job->target = blk_new(BLK_PERM_WRITE, -+ BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE | -+ BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD); -+ ret = blk_insert_bs(job->target, target, errp); -+ if (ret < 0) { -+ goto error; -+ } - } - -+ job->dump_cb = dump_cb; - job->on_source_error = on_source_error; - job->on_target_error = on_target_error; - job->sync_mode = sync_mode; -@@ -710,6 +733,9 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - /* Detect image-fleecing (and similar) schemes */ - job->serialize_target_writes = bdrv_chain_contains(target, 
bs); - -+ if (!target) { -+ goto use_default_cluster_size; -+ } - /* If there is no backing file on the target, we cannot rely on COW if our - * backup cluster size is smaller than the target cluster size. Even for - * targets with a backing file, try to avoid COW if possible. */ -@@ -734,18 +760,35 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - /* Not fatal; just trudge on ahead. */ - job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; - } else { -- job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size); -- } -- job->use_copy_range = true; -- job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk), -- blk_get_max_transfer(job->target)); -- job->copy_range_size = MAX(job->cluster_size, -- QEMU_ALIGN_UP(job->copy_range_size, -- job->cluster_size)); -- -- /* Required permissions are already taken with target's blk_new() */ -- block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, -- &error_abort); -+ use_default_cluster_size: -+ ret = bdrv_get_info(bs, &bdi); -+ if (ret < 0) { -+ job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; -+ } else { -+ /* round down to nearest BACKUP_CLUSTER_SIZE_DEFAULT */ -+ job->cluster_size = (bdi.cluster_size / BACKUP_CLUSTER_SIZE_DEFAULT) * BACKUP_CLUSTER_SIZE_DEFAULT; -+ if (job->cluster_size == 0) { -+ /* but we can't go below it */ -+ job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT; -+ } -+ } -+ } -+ if (target) { -+ job->use_copy_range = true; -+ job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk), -+ blk_get_max_transfer(job->target)); -+ job->copy_range_size = MAX(job->cluster_size, -+ QEMU_ALIGN_UP(job->copy_range_size, -+ job->cluster_size)); -+ } else { -+ job->use_copy_range = false; -+ } -+ -+ if (target) { -+ /* Required permissions are already taken with target's blk_new() */ -+ block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL, -+ &error_abort); -+ } - job->len = len; - job->common.job.pause_count = pause_count; - 
-diff --git a/block/replication.c b/block/replication.c -index 84e07cc4d4..04fa448a5b 100644 ---- a/block/replication.c -+++ b/block/replication.c -@@ -571,6 +571,7 @@ static void replication_start(ReplicationState *rs, ReplicationMode mode, - 0, MIRROR_SYNC_MODE_NONE, NULL, false, - BLOCKDEV_ON_ERROR_REPORT, - BLOCKDEV_ON_ERROR_REPORT, JOB_INTERNAL, -+ NULL, - backup_job_completed, bs, 0, NULL, &local_err); - if (local_err) { - error_propagate(errp, local_err); -diff --git a/blockdev.c b/blockdev.c -index 4f18d3c3d7..d5458f044e 100644 ---- a/blockdev.c -+++ b/blockdev.c -@@ -31,7 +31,6 @@ - */ - - #include "qemu/osdep.h" --#include "qemu/uuid.h" - #include "sysemu/block-backend.h" - #include "sysemu/blockdev.h" - #include "hw/block/block.h" -@@ -63,6 +62,7 @@ - #include "qemu/cutils.h" - #include "qemu/help_option.h" - #include "qemu/throttle-options.h" -+#include "vma.h" - - static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states = - QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states); -@@ -3228,15 +3228,14 @@ out: - static struct PVEBackupState { - Error *error; - bool cancel; -- QemuUUID uuid; -+ uuid_t uuid; - char uuid_str[37]; - int64_t speed; - time_t start_time; - time_t end_time; - char *backup_file; -- Object *vmaobj; -+ VmaWriter *vmaw; - GList *di_list; -- size_t next_job; - size_t total; - size_t transferred; - size_t zero_bytes; -@@ -3255,6 +3254,71 @@ typedef struct PVEBackupDevInfo { - - static void pvebackup_run_next_job(void); - -+static int pvebackup_dump_cb(void *opaque, BlockBackend *target, -+ uint64_t start, uint64_t bytes, -+ const void *pbuf) -+{ -+ const uint64_t size = bytes; -+ const unsigned char *buf = pbuf; -+ PVEBackupDevInfo *di = opaque; -+ -+ if (backup_state.cancel) { -+ return size; // return success -+ } -+ -+ uint64_t cluster_num = start / VMA_CLUSTER_SIZE; -+ if ((cluster_num * VMA_CLUSTER_SIZE) != start) { -+ if (!backup_state.error) { -+ error_setg(&backup_state.error, -+ "got unaligned write inside backup dump " -+ "callback 
(sector %ld)", start); -+ } -+ return -1; // not aligned to cluster size -+ } -+ -+ int ret = -1; -+ -+ if (backup_state.vmaw) { -+ size_t zero_bytes = 0; -+ uint64_t remaining = size; -+ while (remaining > 0) { -+ ret = vma_writer_write(backup_state.vmaw, di->dev_id, cluster_num, -+ buf, &zero_bytes); -+ ++cluster_num; -+ if (buf) { -+ buf += VMA_CLUSTER_SIZE; -+ } -+ if (ret < 0) { -+ if (!backup_state.error) { -+ vma_writer_error_propagate(backup_state.vmaw, &backup_state.error); -+ } -+ if (di->bs && di->bs->job) { -+ job_cancel(&di->bs->job->job, true); -+ } -+ break; -+ } else { -+ backup_state.zero_bytes += zero_bytes; -+ if (remaining >= VMA_CLUSTER_SIZE) { -+ backup_state.transferred += VMA_CLUSTER_SIZE; -+ remaining -= VMA_CLUSTER_SIZE; -+ } else { -+ backup_state.transferred += remaining; -+ remaining = 0; -+ } -+ } -+ } -+ } else { -+ if (!buf) { -+ backup_state.zero_bytes += size; -+ } -+ backup_state.transferred += size; -+ } -+ -+ // Note: always return success, because we want that writes succeed anyways. 
-+ -+ return size; -+} -+ - static void pvebackup_cleanup(void) - { - qemu_mutex_lock(&backup_state.backup_mutex); -@@ -3266,9 +3330,11 @@ static void pvebackup_cleanup(void) - - backup_state.end_time = time(NULL); - -- if (backup_state.vmaobj) { -- object_unparent(backup_state.vmaobj); -- backup_state.vmaobj = NULL; -+ if (backup_state.vmaw) { -+ Error *local_err = NULL; -+ vma_writer_close(backup_state.vmaw, &local_err); -+ error_propagate(&backup_state.error, local_err); -+ backup_state.vmaw = NULL; - } - - g_list_free(backup_state.di_list); -@@ -3276,6 +3342,13 @@ static void pvebackup_cleanup(void) - qemu_mutex_unlock(&backup_state.backup_mutex); - } - -+static void coroutine_fn backup_close_vma_stream(void *opaque) -+{ -+ PVEBackupDevInfo *di = opaque; -+ -+ vma_writer_close_stream(backup_state.vmaw, di->dev_id); -+} -+ - static void pvebackup_complete_cb(void *opaque, int ret) - { - // This always runs in the main loop -@@ -3292,9 +3365,9 @@ static void pvebackup_complete_cb(void *opaque, int ret) - di->bs = NULL; - di->target = NULL; - -- if (backup_state.vmaobj) { -- object_unparent(backup_state.vmaobj); -- backup_state.vmaobj = NULL; -+ if (backup_state.vmaw) { -+ Coroutine *co = qemu_coroutine_create(backup_close_vma_stream, di); -+ qemu_coroutine_enter(co); - } - - // remove self from job queue -@@ -3322,14 +3395,9 @@ static void pvebackup_cancel(void *opaque) - error_setg(&backup_state.error, "backup cancelled"); - } - -- if (backup_state.vmaobj) { -- Error *err; -+ if (backup_state.vmaw) { - /* make sure vma writer does not block anymore */ -- if (!object_set_props(backup_state.vmaobj, &err, "blocked", "yes", NULL)) { -- if (err) { -- error_report_err(err); -- } -- } -+ vma_writer_set_error(backup_state.vmaw, "backup cancelled"); - } - - GList *l = backup_state.di_list; -@@ -3360,18 +3428,14 @@ void qmp_backup_cancel(Error **errp) - Coroutine *co = qemu_coroutine_create(pvebackup_cancel, NULL); - qemu_coroutine_enter(co); - -- while 
(backup_state.vmaobj) { -- /* FIXME: Find something better for this */ -+ while (backup_state.vmaw) { -+ /* vma writer use main aio context */ - aio_poll(qemu_get_aio_context(), true); - } - } - --void vma_object_add_config_file(Object *obj, const char *name, -- const char *contents, size_t len, -- Error **errp); - static int config_to_vma(const char *file, BackupFormat format, -- Object *vmaobj, -- const char *backup_dir, -+ const char *backup_dir, VmaWriter *vmaw, - Error **errp) - { - char *cdata = NULL; -@@ -3385,7 +3449,12 @@ static int config_to_vma(const char *file, BackupFormat format, - char *basename = g_path_get_basename(file); - - if (format == BACKUP_FORMAT_VMA) { -- vma_object_add_config_file(vmaobj, basename, cdata, clen, errp); -+ if (vma_writer_add_config(vmaw, basename, cdata, clen) != 0) { -+ error_setg(errp, "unable to add %s config data to vma archive", file); -+ g_free(cdata); -+ g_free(basename); -+ return 1; -+ } - } else if (format == BACKUP_FORMAT_DIR) { - char config_path[PATH_MAX]; - snprintf(config_path, PATH_MAX, "%s/%s", backup_dir, basename); -@@ -3402,28 +3471,30 @@ static int config_to_vma(const char *file, BackupFormat format, - return 0; - } - -+bool job_should_pause(Job *job); - static void pvebackup_run_next_job(void) - { - qemu_mutex_lock(&backup_state.backup_mutex); - -- GList *next = g_list_nth(backup_state.di_list, backup_state.next_job); -- while (next) { -- PVEBackupDevInfo *di = (PVEBackupDevInfo *)next->data; -- backup_state.next_job++; -+ GList *l = backup_state.di_list; -+ while (l) { -+ PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; -+ l = g_list_next(l); - if (!di->completed && di->bs && di->bs->job) { - BlockJob *job = di->bs->job; - AioContext *aio_context = blk_get_aio_context(job->blk); - aio_context_acquire(aio_context); - qemu_mutex_unlock(&backup_state.backup_mutex); -- if (backup_state.error || backup_state.cancel) { -- job_cancel_sync(job); -- } else { -- job_resume(job); -+ if 
(job_should_pause(&job->job)) { -+ if (backup_state.error || backup_state.cancel) { -+ job_cancel_sync(&job->job); -+ } else { -+ job_resume(&job->job); -+ } - } - aio_context_release(aio_context); - return; - } -- next = g_list_next(next); - } - qemu_mutex_unlock(&backup_state.backup_mutex); - -@@ -3434,7 +3505,7 @@ static void pvebackup_run_next_job(void) - UuidInfo *qmp_backup(const char *backup_file, bool has_format, - BackupFormat format, - bool has_config_file, const char *config_file, -- bool has_firewall_file, const char *firewall_file, -+ bool has_firewall_file, const char *firewall_file, - bool has_devlist, const char *devlist, - bool has_speed, int64_t speed, Error **errp) - { -@@ -3442,7 +3513,8 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - BlockDriverState *bs = NULL; - const char *backup_dir = NULL; - Error *local_err = NULL; -- QemuUUID uuid; -+ uuid_t uuid; -+ VmaWriter *vmaw = NULL; - gchar **devs = NULL; - GList *di_list = NULL; - GList *l; -@@ -3454,7 +3526,7 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - backup_state.backup_mutex_initialized = true; - } - -- if (backup_state.di_list || backup_state.vmaobj) { -+ if (backup_state.di_list) { - error_set(errp, ERROR_CLASS_GENERIC_ERROR, - "previous backup not finished"); - return NULL; -@@ -3529,40 +3601,28 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - total += size; - } - -- qemu_uuid_generate(&uuid); -+ uuid_generate(uuid); - - if (format == BACKUP_FORMAT_VMA) { -- char uuidstr[UUID_FMT_LEN+1]; -- qemu_uuid_unparse(&uuid, uuidstr); -- uuidstr[UUID_FMT_LEN] = 0; -- backup_state.vmaobj = -- object_new_with_props("vma", object_get_objects_root(), -- "vma-backup-obj", &local_err, -- "filename", backup_file, -- "uuid", uuidstr, -- NULL); -- if (!backup_state.vmaobj) { -+ vmaw = vma_writer_create(backup_file, uuid, &local_err); -+ if (!vmaw) { - if (local_err) { - error_propagate(errp, local_err); - } - goto err; - } - -+ /* register all 
devices for vma writer */ - l = di_list; - while (l) { -- QDict *options = qdict_new(); -- - PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; - l = g_list_next(l); - - const char *devname = bdrv_get_device_name(di->bs); -- snprintf(di->targetfile, PATH_MAX, "vma-backup-obj/%s.raw", devname); -- -- qdict_put(options, "driver", qstring_from_str("vma-drive")); -- qdict_put(options, "size", qint_from_int(di->size)); -- di->target = bdrv_open(di->targetfile, NULL, options, BDRV_O_RDWR, &local_err); -- if (!di->target) { -- error_propagate(errp, local_err); -+ di->dev_id = vma_writer_register_stream(vmaw, devname, di->size); -+ if (di->dev_id <= 0) { -+ error_set(errp, ERROR_CLASS_GENERIC_ERROR, -+ "register_stream failed"); - goto err; - } - } -@@ -3603,14 +3663,14 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - - /* add configuration file to archive */ - if (has_config_file) { -- if(config_to_vma(config_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { -+ if (config_to_vma(config_file, format, backup_dir, vmaw, errp) != 0) { - goto err; - } - } - - /* add firewall file to archive */ - if (has_firewall_file) { -- if(config_to_vma(firewall_file, format, backup_state.vmaobj, backup_dir, errp) != 0) { -+ if (config_to_vma(firewall_file, format, backup_dir, vmaw, errp) != 0) { - goto err; - } - } -@@ -3633,12 +3693,13 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - } - backup_state.backup_file = g_strdup(backup_file); - -- memcpy(&backup_state.uuid, &uuid, sizeof(uuid)); -- qemu_uuid_unparse(&uuid, backup_state.uuid_str); -+ backup_state.vmaw = vmaw; -+ -+ uuid_copy(backup_state.uuid, uuid); -+ uuid_unparse_lower(uuid, backup_state.uuid_str); - - qemu_mutex_lock(&backup_state.backup_mutex); - backup_state.di_list = di_list; -- backup_state.next_job = 0; - - backup_state.total = total; - backup_state.transferred = 0; -@@ -3649,21 +3710,21 @@ UuidInfo *qmp_backup(const char *backup_file, bool has_format, - while (l) { - 
PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data; - l = g_list_next(l); -- - job = backup_job_create(NULL, di->bs, di->target, speed, MIRROR_SYNC_MODE_FULL, NULL, - false, BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT, - JOB_DEFAULT, -- pvebackup_complete_cb, di, 2, NULL, &local_err); -- if (di->target) { -- bdrv_unref(di->target); -- di->target = NULL; -- } -+ pvebackup_dump_cb, pvebackup_complete_cb, di, -+ 2, NULL, &local_err); - if (!job || local_err != NULL) { - error_setg(&backup_state.error, "backup_job_create failed"); - pvebackup_cancel(NULL); - } else { - job_start(&job->job); - } -+ if (di->target) { -+ bdrv_unref(di->target); -+ di->target = NULL; -+ } - } - - qemu_mutex_unlock(&backup_state.backup_mutex); -@@ -3699,9 +3760,10 @@ err: - g_strfreev(devs); - } - -- if (backup_state.vmaobj) { -- object_unparent(backup_state.vmaobj); -- backup_state.vmaobj = NULL; -+ if (vmaw) { -+ Error *err = NULL; -+ vma_writer_close(vmaw, &err); -+ unlink(backup_file); - } - - if (backup_dir) { -@@ -4104,7 +4166,7 @@ static BlockJob *do_drive_backup(DriveBackup *backup, JobTxn *txn, - job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, - backup->sync, bmap, backup->compress, - backup->on_source_error, backup->on_target_error, -- job_flags, NULL, NULL, 0, txn, &local_err); -+ job_flags, NULL, NULL, NULL, 0, txn, &local_err); - bdrv_unref(target_bs); - if (local_err != NULL) { - error_propagate(errp, local_err); -@@ -4196,7 +4258,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, JobTxn *txn, - job = backup_job_create(backup->job_id, bs, target_bs, backup->speed, - backup->sync, NULL, backup->compress, - backup->on_source_error, backup->on_target_error, -- job_flags, NULL, NULL, 0, txn, &local_err); -+ job_flags, NULL, NULL, NULL, 0, txn, &local_err); - if (local_err != NULL) { - error_propagate(errp, local_err); - } -diff --git a/include/block/block_int.h b/include/block/block_int.h -index 0b2516c3cf..ecd6243440 100644 ---- 
a/include/block/block_int.h -+++ b/include/block/block_int.h -@@ -59,6 +59,9 @@ - - #define BLOCK_PROBE_BUF_SIZE 512 - -+typedef int BackupDumpFunc(void *opaque, BlockBackend *be, -+ uint64_t offset, uint64_t bytes, const void *buf); -+ - enum BdrvTrackedRequestType { - BDRV_TRACKED_READ, - BDRV_TRACKED_WRITE, -@@ -1082,6 +1085,7 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs, - BlockdevOnError on_source_error, - BlockdevOnError on_target_error, - int creation_flags, -+ BackupDumpFunc *dump_cb, - BlockCompletionFunc *cb, void *opaque, - int pause_count, - JobTxn *txn, Error **errp); -diff --git a/job.c b/job.c -index 72c50ee18e..1b3bda275d 100644 ---- a/job.c -+++ b/job.c -@@ -256,7 +256,8 @@ static bool job_started(Job *job) - return job->co; - } - --static bool job_should_pause(Job *job) -+bool job_should_pause(Job *job); -+bool job_should_pause(Job *job) - { - return job->pause_count > 0; - } -diff --git a/vma-reader.c b/vma-reader.c -new file mode 100644 -index 0000000000..2b1d1cdab3 ---- /dev/null -+++ b/vma-reader.c -@@ -0,0 +1,857 @@ -+/* -+ * VMA: Virtual Machine Archive -+ * -+ * Copyright (C) 2012 Proxmox Server Solutions -+ * -+ * Authors: -+ * Dietmar Maurer (dietmar@proxmox.com) -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. 
-+ * -+ */ -+ -+#include "qemu/osdep.h" -+#include -+#include -+ -+#include "qemu-common.h" -+#include "qemu/timer.h" -+#include "qemu/ratelimit.h" -+#include "vma.h" -+#include "block/block.h" -+#include "sysemu/block-backend.h" -+ -+static unsigned char zero_vma_block[VMA_BLOCK_SIZE]; -+ -+typedef struct VmaRestoreState { -+ BlockBackend *target; -+ bool write_zeroes; -+ unsigned long *bitmap; -+ int bitmap_size; -+} VmaRestoreState; -+ -+struct VmaReader { -+ int fd; -+ GChecksum *md5csum; -+ GHashTable *blob_hash; -+ unsigned char *head_data; -+ VmaDeviceInfo devinfo[256]; -+ VmaRestoreState rstate[256]; -+ GList *cdata_list; -+ guint8 vmstate_stream; -+ uint32_t vmstate_clusters; -+ /* to show restore percentage if run with -v */ -+ time_t start_time; -+ int64_t cluster_count; -+ int64_t clusters_read; -+ int64_t zero_cluster_data; -+ int64_t partial_zero_cluster_data; -+ int clusters_read_per; -+}; -+ -+static guint -+g_int32_hash(gconstpointer v) -+{ -+ return *(const uint32_t *)v; -+} -+ -+static gboolean -+g_int32_equal(gconstpointer v1, gconstpointer v2) -+{ -+ return *((const uint32_t *)v1) == *((const uint32_t *)v2); -+} -+ -+static int vma_reader_get_bitmap(VmaRestoreState *rstate, int64_t cluster_num) -+{ -+ assert(rstate); -+ assert(rstate->bitmap); -+ -+ unsigned long val, idx, bit; -+ -+ idx = cluster_num / BITS_PER_LONG; -+ -+ assert(rstate->bitmap_size > idx); -+ -+ bit = cluster_num % BITS_PER_LONG; -+ val = rstate->bitmap[idx]; -+ -+ return !!(val & (1UL << bit)); -+} -+ -+static void vma_reader_set_bitmap(VmaRestoreState *rstate, int64_t cluster_num, -+ int dirty) -+{ -+ assert(rstate); -+ assert(rstate->bitmap); -+ -+ unsigned long val, idx, bit; -+ -+ idx = cluster_num / BITS_PER_LONG; -+ -+ assert(rstate->bitmap_size > idx); -+ -+ bit = cluster_num % BITS_PER_LONG; -+ val = rstate->bitmap[idx]; -+ if (dirty) { -+ if (!(val & (1UL << bit))) { -+ val |= 1UL << bit; -+ } -+ } else { -+ if (val & (1UL << bit)) { -+ val &= ~(1UL << bit); -+ } -+ 
} -+ rstate->bitmap[idx] = val; -+} -+ -+typedef struct VmaBlob { -+ uint32_t start; -+ uint32_t len; -+ void *data; -+} VmaBlob; -+ -+static const VmaBlob *get_header_blob(VmaReader *vmar, uint32_t pos) -+{ -+ assert(vmar); -+ assert(vmar->blob_hash); -+ -+ return g_hash_table_lookup(vmar->blob_hash, &pos); -+} -+ -+static const char *get_header_str(VmaReader *vmar, uint32_t pos) -+{ -+ const VmaBlob *blob = get_header_blob(vmar, pos); -+ if (!blob) { -+ return NULL; -+ } -+ const char *res = (char *)blob->data; -+ if (res[blob->len-1] != '\0') { -+ return NULL; -+ } -+ return res; -+} -+ -+static ssize_t -+safe_read(int fd, unsigned char *buf, size_t count) -+{ -+ ssize_t n; -+ -+ do { -+ n = read(fd, buf, count); -+ } while (n < 0 && errno == EINTR); -+ -+ return n; -+} -+ -+static ssize_t -+full_read(int fd, unsigned char *buf, size_t len) -+{ -+ ssize_t n; -+ size_t total; -+ -+ total = 0; -+ -+ while (len > 0) { -+ n = safe_read(fd, buf, len); -+ -+ if (n == 0) { -+ return total; -+ } -+ -+ if (n <= 0) { -+ break; -+ } -+ -+ buf += n; -+ total += n; -+ len -= n; -+ } -+ -+ if (len) { -+ return -1; -+ } -+ -+ return total; -+} -+ -+void vma_reader_destroy(VmaReader *vmar) -+{ -+ assert(vmar); -+ -+ if (vmar->fd >= 0) { -+ close(vmar->fd); -+ } -+ -+ if (vmar->cdata_list) { -+ g_list_free(vmar->cdata_list); -+ } -+ -+ int i; -+ for (i = 1; i < 256; i++) { -+ if (vmar->rstate[i].bitmap) { -+ g_free(vmar->rstate[i].bitmap); -+ } -+ } -+ -+ if (vmar->md5csum) { -+ g_checksum_free(vmar->md5csum); -+ } -+ -+ if (vmar->blob_hash) { -+ g_hash_table_destroy(vmar->blob_hash); -+ } -+ -+ if (vmar->head_data) { -+ g_free(vmar->head_data); -+ } -+ -+ g_free(vmar); -+ -+}; -+ -+static int vma_reader_read_head(VmaReader *vmar, Error **errp) -+{ -+ assert(vmar); -+ assert(errp); -+ assert(*errp == NULL); -+ -+ unsigned char md5sum[16]; -+ int i; -+ int ret = 0; -+ -+ vmar->head_data = g_malloc(sizeof(VmaHeader)); -+ -+ if (full_read(vmar->fd, vmar->head_data, 
sizeof(VmaHeader)) != -+ sizeof(VmaHeader)) { -+ error_setg(errp, "can't read vma header - %s", -+ errno ? g_strerror(errno) : "got EOF"); -+ return -1; -+ } -+ -+ VmaHeader *h = (VmaHeader *)vmar->head_data; -+ -+ if (h->magic != VMA_MAGIC) { -+ error_setg(errp, "not a vma file - wrong magic number"); -+ return -1; -+ } -+ -+ uint32_t header_size = GUINT32_FROM_BE(h->header_size); -+ int need = header_size - sizeof(VmaHeader); -+ if (need <= 0) { -+ error_setg(errp, "wrong vma header size %d", header_size); -+ return -1; -+ } -+ -+ vmar->head_data = g_realloc(vmar->head_data, header_size); -+ h = (VmaHeader *)vmar->head_data; -+ -+ if (full_read(vmar->fd, vmar->head_data + sizeof(VmaHeader), need) != -+ need) { -+ error_setg(errp, "can't read vma header data - %s", -+ errno ? g_strerror(errno) : "got EOF"); -+ return -1; -+ } -+ -+ memcpy(md5sum, h->md5sum, 16); -+ memset(h->md5sum, 0, 16); -+ -+ g_checksum_reset(vmar->md5csum); -+ g_checksum_update(vmar->md5csum, vmar->head_data, header_size); -+ gsize csize = 16; -+ g_checksum_get_digest(vmar->md5csum, (guint8 *)(h->md5sum), &csize); -+ -+ if (memcmp(md5sum, h->md5sum, 16) != 0) { -+ error_setg(errp, "wrong vma header chechsum"); -+ return -1; -+ } -+ -+ /* we can modify header data after checksum verify */ -+ h->header_size = header_size; -+ -+ h->version = GUINT32_FROM_BE(h->version); -+ if (h->version != 1) { -+ error_setg(errp, "wrong vma version %d", h->version); -+ return -1; -+ } -+ -+ h->ctime = GUINT64_FROM_BE(h->ctime); -+ h->blob_buffer_offset = GUINT32_FROM_BE(h->blob_buffer_offset); -+ h->blob_buffer_size = GUINT32_FROM_BE(h->blob_buffer_size); -+ -+ uint32_t bstart = h->blob_buffer_offset + 1; -+ uint32_t bend = h->blob_buffer_offset + h->blob_buffer_size; -+ -+ if (bstart <= sizeof(VmaHeader)) { -+ error_setg(errp, "wrong vma blob buffer offset %d", -+ h->blob_buffer_offset); -+ return -1; -+ } -+ -+ if (bend > header_size) { -+ error_setg(errp, "wrong vma blob buffer size %d/%d", -+ 
h->blob_buffer_offset, h->blob_buffer_size); -+ return -1; -+ } -+ -+ while ((bstart + 2) <= bend) { -+ uint32_t size = vmar->head_data[bstart] + -+ (vmar->head_data[bstart+1] << 8); -+ if ((bstart + size + 2) <= bend) { -+ VmaBlob *blob = g_new0(VmaBlob, 1); -+ blob->start = bstart - h->blob_buffer_offset; -+ blob->len = size; -+ blob->data = vmar->head_data + bstart + 2; -+ g_hash_table_insert(vmar->blob_hash, &blob->start, blob); -+ } -+ bstart += size + 2; -+ } -+ -+ -+ int count = 0; -+ for (i = 1; i < 256; i++) { -+ VmaDeviceInfoHeader *dih = &h->dev_info[i]; -+ uint32_t devname_ptr = GUINT32_FROM_BE(dih->devname_ptr); -+ uint64_t size = GUINT64_FROM_BE(dih->size); -+ const char *devname = get_header_str(vmar, devname_ptr); -+ -+ if (size && devname) { -+ count++; -+ vmar->devinfo[i].size = size; -+ vmar->devinfo[i].devname = devname; -+ -+ if (strcmp(devname, "vmstate") == 0) { -+ vmar->vmstate_stream = i; -+ } -+ } -+ } -+ -+ for (i = 0; i < VMA_MAX_CONFIGS; i++) { -+ uint32_t name_ptr = GUINT32_FROM_BE(h->config_names[i]); -+ uint32_t data_ptr = GUINT32_FROM_BE(h->config_data[i]); -+ -+ if (!(name_ptr && data_ptr)) { -+ continue; -+ } -+ const char *name = get_header_str(vmar, name_ptr); -+ const VmaBlob *blob = get_header_blob(vmar, data_ptr); -+ -+ if (!(name && blob)) { -+ error_setg(errp, "vma contains invalid data pointers"); -+ return -1; -+ } -+ -+ VmaConfigData *cdata = g_new0(VmaConfigData, 1); -+ cdata->name = name; -+ cdata->data = blob->data; -+ cdata->len = blob->len; -+ -+ vmar->cdata_list = g_list_append(vmar->cdata_list, cdata); -+ } -+ -+ return ret; -+}; -+ -+VmaReader *vma_reader_create(const char *filename, Error **errp) -+{ -+ assert(filename); -+ assert(errp); -+ -+ VmaReader *vmar = g_new0(VmaReader, 1); -+ -+ if (strcmp(filename, "-") == 0) { -+ vmar->fd = dup(0); -+ } else { -+ vmar->fd = open(filename, O_RDONLY); -+ } -+ -+ if (vmar->fd < 0) { -+ error_setg(errp, "can't open file %s - %s\n", filename, -+ g_strerror(errno)); -+ 
goto err; -+ } -+ -+ vmar->md5csum = g_checksum_new(G_CHECKSUM_MD5); -+ if (!vmar->md5csum) { -+ error_setg(errp, "can't allocate cmsum\n"); -+ goto err; -+ } -+ -+ vmar->blob_hash = g_hash_table_new_full(g_int32_hash, g_int32_equal, -+ NULL, g_free); -+ -+ if (vma_reader_read_head(vmar, errp) < 0) { -+ goto err; -+ } -+ -+ return vmar; -+ -+err: -+ if (vmar) { -+ vma_reader_destroy(vmar); -+ } -+ -+ return NULL; -+} -+ -+VmaHeader *vma_reader_get_header(VmaReader *vmar) -+{ -+ assert(vmar); -+ assert(vmar->head_data); -+ -+ return (VmaHeader *)(vmar->head_data); -+} -+ -+GList *vma_reader_get_config_data(VmaReader *vmar) -+{ -+ assert(vmar); -+ assert(vmar->head_data); -+ -+ return vmar->cdata_list; -+} -+ -+VmaDeviceInfo *vma_reader_get_device_info(VmaReader *vmar, guint8 dev_id) -+{ -+ assert(vmar); -+ assert(dev_id); -+ -+ if (vmar->devinfo[dev_id].size && vmar->devinfo[dev_id].devname) { -+ return &vmar->devinfo[dev_id]; -+ } -+ -+ return NULL; -+} -+ -+static void allocate_rstate(VmaReader *vmar, guint8 dev_id, -+ BlockBackend *target, bool write_zeroes) -+{ -+ assert(vmar); -+ assert(dev_id); -+ -+ vmar->rstate[dev_id].target = target; -+ vmar->rstate[dev_id].write_zeroes = write_zeroes; -+ -+ int64_t size = vmar->devinfo[dev_id].size; -+ -+ int64_t bitmap_size = (size/BDRV_SECTOR_SIZE) + -+ (VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE) * BITS_PER_LONG - 1; -+ bitmap_size /= (VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE) * BITS_PER_LONG; -+ -+ vmar->rstate[dev_id].bitmap_size = bitmap_size; -+ vmar->rstate[dev_id].bitmap = g_new0(unsigned long, bitmap_size); -+ -+ vmar->cluster_count += size/VMA_CLUSTER_SIZE; -+} -+ -+int vma_reader_register_bs(VmaReader *vmar, guint8 dev_id, BlockBackend *target, -+ bool write_zeroes, Error **errp) -+{ -+ assert(vmar); -+ assert(target != NULL); -+ assert(dev_id); -+ assert(vmar->rstate[dev_id].target == NULL); -+ -+ int64_t size = blk_getlength(target); -+ int64_t size_diff = size - vmar->devinfo[dev_id].size; -+ -+ /* storage types can have 
different size restrictions, so it -+ * is not always possible to create an image with exact size. -+ * So we tolerate a size difference up to 4MB. -+ */ -+ if ((size_diff < 0) || (size_diff > 4*1024*1024)) { -+ error_setg(errp, "vma_reader_register_bs for stream %s failed - " -+ "unexpected size %zd != %zd", vmar->devinfo[dev_id].devname, -+ size, vmar->devinfo[dev_id].size); -+ return -1; -+ } -+ -+ allocate_rstate(vmar, dev_id, target, write_zeroes); -+ -+ return 0; -+} -+ -+static ssize_t safe_write(int fd, void *buf, size_t count) -+{ -+ ssize_t n; -+ -+ do { -+ n = write(fd, buf, count); -+ } while (n < 0 && errno == EINTR); -+ -+ return n; -+} -+ -+static size_t full_write(int fd, void *buf, size_t len) -+{ -+ ssize_t n; -+ size_t total; -+ -+ total = 0; -+ -+ while (len > 0) { -+ n = safe_write(fd, buf, len); -+ if (n < 0) { -+ return n; -+ } -+ buf += n; -+ total += n; -+ len -= n; -+ } -+ -+ if (len) { -+ /* incomplete write ? */ -+ return -1; -+ } -+ -+ return total; -+} -+ -+static int restore_write_data(VmaReader *vmar, guint8 dev_id, -+ BlockBackend *target, int vmstate_fd, -+ unsigned char *buf, int64_t sector_num, -+ int nb_sectors, Error **errp) -+{ -+ assert(vmar); -+ -+ if (dev_id == vmar->vmstate_stream) { -+ if (vmstate_fd >= 0) { -+ int len = nb_sectors * BDRV_SECTOR_SIZE; -+ int res = full_write(vmstate_fd, buf, len); -+ if (res < 0) { -+ error_setg(errp, "write vmstate failed %d", res); -+ return -1; -+ } -+ } -+ } else { -+ int res = blk_pwrite(target, sector_num * BDRV_SECTOR_SIZE, buf, nb_sectors * BDRV_SECTOR_SIZE, 0); -+ if (res < 0) { -+ error_setg(errp, "blk_pwrite to %s failed (%d)", -+ bdrv_get_device_name(blk_bs(target)), res); -+ return -1; -+ } -+ } -+ return 0; -+} -+ -+static int restore_extent(VmaReader *vmar, unsigned char *buf, -+ int extent_size, int vmstate_fd, -+ bool verbose, bool verify, Error **errp) -+{ -+ assert(vmar); -+ assert(buf); -+ -+ VmaExtentHeader *ehead = (VmaExtentHeader *)buf; -+ int start = 
VMA_EXTENT_HEADER_SIZE; -+ int i; -+ -+ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { -+ uint64_t block_info = GUINT64_FROM_BE(ehead->blockinfo[i]); -+ uint64_t cluster_num = block_info & 0xffffffff; -+ uint8_t dev_id = (block_info >> 32) & 0xff; -+ uint16_t mask = block_info >> (32+16); -+ int64_t max_sector; -+ -+ if (!dev_id) { -+ continue; -+ } -+ -+ VmaRestoreState *rstate = &vmar->rstate[dev_id]; -+ BlockBackend *target = NULL; -+ -+ if (dev_id != vmar->vmstate_stream) { -+ target = rstate->target; -+ if (!verify && !target) { -+ error_setg(errp, "got wrong dev id %d", dev_id); -+ return -1; -+ } -+ -+ if (vma_reader_get_bitmap(rstate, cluster_num)) { -+ error_setg(errp, "found duplicated cluster %zd for stream %s", -+ cluster_num, vmar->devinfo[dev_id].devname); -+ return -1; -+ } -+ vma_reader_set_bitmap(rstate, cluster_num, 1); -+ -+ max_sector = vmar->devinfo[dev_id].size/BDRV_SECTOR_SIZE; -+ } else { -+ max_sector = G_MAXINT64; -+ if (cluster_num != vmar->vmstate_clusters) { -+ error_setg(errp, "found out of order vmstate data"); -+ return -1; -+ } -+ vmar->vmstate_clusters++; -+ } -+ -+ vmar->clusters_read++; -+ -+ if (verbose) { -+ time_t duration = time(NULL) - vmar->start_time; -+ int percent = (vmar->clusters_read*100)/vmar->cluster_count; -+ if (percent != vmar->clusters_read_per) { -+ printf("progress %d%% (read %zd bytes, duration %zd sec)\n", -+ percent, vmar->clusters_read*VMA_CLUSTER_SIZE, -+ duration); -+ fflush(stdout); -+ vmar->clusters_read_per = percent; -+ } -+ } -+ -+ /* try to write whole clusters to speedup restore */ -+ if (mask == 0xffff) { -+ if ((start + VMA_CLUSTER_SIZE) > extent_size) { -+ error_setg(errp, "short vma extent - too many blocks"); -+ return -1; -+ } -+ int64_t sector_num = (cluster_num * VMA_CLUSTER_SIZE) / -+ BDRV_SECTOR_SIZE; -+ int64_t end_sector = sector_num + -+ VMA_CLUSTER_SIZE/BDRV_SECTOR_SIZE; -+ -+ if (end_sector > max_sector) { -+ end_sector = max_sector; -+ } -+ -+ if (end_sector <= sector_num) { -+ 
error_setg(errp, "got wrong block address - write beyond end"); -+ return -1; -+ } -+ -+ if (!verify) { -+ int nb_sectors = end_sector - sector_num; -+ if (restore_write_data(vmar, dev_id, target, vmstate_fd, -+ buf + start, sector_num, nb_sectors, -+ errp) < 0) { -+ return -1; -+ } -+ } -+ -+ start += VMA_CLUSTER_SIZE; -+ } else { -+ int j; -+ int bit = 1; -+ -+ for (j = 0; j < 16; j++) { -+ int64_t sector_num = (cluster_num*VMA_CLUSTER_SIZE + -+ j*VMA_BLOCK_SIZE)/BDRV_SECTOR_SIZE; -+ -+ int64_t end_sector = sector_num + -+ VMA_BLOCK_SIZE/BDRV_SECTOR_SIZE; -+ if (end_sector > max_sector) { -+ end_sector = max_sector; -+ } -+ -+ if (mask & bit) { -+ if ((start + VMA_BLOCK_SIZE) > extent_size) { -+ error_setg(errp, "short vma extent - too many blocks"); -+ return -1; -+ } -+ -+ if (end_sector <= sector_num) { -+ error_setg(errp, "got wrong block address - " -+ "write beyond end"); -+ return -1; -+ } -+ -+ if (!verify) { -+ int nb_sectors = end_sector - sector_num; -+ if (restore_write_data(vmar, dev_id, target, vmstate_fd, -+ buf + start, sector_num, -+ nb_sectors, errp) < 0) { -+ return -1; -+ } -+ } -+ -+ start += VMA_BLOCK_SIZE; -+ -+ } else { -+ -+ -+ if (end_sector > sector_num) { -+ /* Todo: use bdrv_co_write_zeroes (but that need to -+ * be run inside coroutine?) 
-+ */ -+ int nb_sectors = end_sector - sector_num; -+ int zero_size = BDRV_SECTOR_SIZE*nb_sectors; -+ vmar->zero_cluster_data += zero_size; -+ if (mask != 0) { -+ vmar->partial_zero_cluster_data += zero_size; -+ } -+ -+ if (rstate->write_zeroes && !verify) { -+ if (restore_write_data(vmar, dev_id, target, vmstate_fd, -+ zero_vma_block, sector_num, -+ nb_sectors, errp) < 0) { -+ return -1; -+ } -+ } -+ } -+ } -+ -+ bit = bit << 1; -+ } -+ } -+ } -+ -+ if (start != extent_size) { -+ error_setg(errp, "vma extent error - missing blocks"); -+ return -1; -+ } -+ -+ return 0; -+} -+ -+static int vma_reader_restore_full(VmaReader *vmar, int vmstate_fd, -+ bool verbose, bool verify, -+ Error **errp) -+{ -+ assert(vmar); -+ assert(vmar->head_data); -+ -+ int ret = 0; -+ unsigned char buf[VMA_MAX_EXTENT_SIZE]; -+ int buf_pos = 0; -+ unsigned char md5sum[16]; -+ VmaHeader *h = (VmaHeader *)vmar->head_data; -+ -+ vmar->start_time = time(NULL); -+ -+ while (1) { -+ int bytes = full_read(vmar->fd, buf + buf_pos, sizeof(buf) - buf_pos); -+ if (bytes < 0) { -+ error_setg(errp, "read failed - %s", g_strerror(errno)); -+ return -1; -+ } -+ -+ buf_pos += bytes; -+ -+ if (!buf_pos) { -+ break; /* EOF */ -+ } -+ -+ if (buf_pos < VMA_EXTENT_HEADER_SIZE) { -+ error_setg(errp, "read short extent (%d bytes)", buf_pos); -+ return -1; -+ } -+ -+ VmaExtentHeader *ehead = (VmaExtentHeader *)buf; -+ -+ /* extract md5sum */ -+ memcpy(md5sum, ehead->md5sum, sizeof(ehead->md5sum)); -+ memset(ehead->md5sum, 0, sizeof(ehead->md5sum)); -+ -+ g_checksum_reset(vmar->md5csum); -+ g_checksum_update(vmar->md5csum, buf, VMA_EXTENT_HEADER_SIZE); -+ gsize csize = 16; -+ g_checksum_get_digest(vmar->md5csum, ehead->md5sum, &csize); -+ -+ if (memcmp(md5sum, ehead->md5sum, 16) != 0) { -+ error_setg(errp, "wrong vma extent header chechsum"); -+ return -1; -+ } -+ -+ if (memcmp(h->uuid, ehead->uuid, sizeof(ehead->uuid)) != 0) { -+ error_setg(errp, "wrong vma extent uuid"); -+ return -1; -+ } -+ -+ if (ehead->magic 
!= VMA_EXTENT_MAGIC || ehead->reserved1 != 0) { -+ error_setg(errp, "wrong vma extent header magic"); -+ return -1; -+ } -+ -+ int block_count = GUINT16_FROM_BE(ehead->block_count); -+ int extent_size = VMA_EXTENT_HEADER_SIZE + block_count*VMA_BLOCK_SIZE; -+ -+ if (buf_pos < extent_size) { -+ error_setg(errp, "short vma extent (%d < %d)", buf_pos, -+ extent_size); -+ return -1; -+ } -+ -+ if (restore_extent(vmar, buf, extent_size, vmstate_fd, verbose, -+ verify, errp) < 0) { -+ return -1; -+ } -+ -+ if (buf_pos > extent_size) { -+ memmove(buf, buf + extent_size, buf_pos - extent_size); -+ buf_pos = buf_pos - extent_size; -+ } else { -+ buf_pos = 0; -+ } -+ } -+ -+ bdrv_drain_all(); -+ -+ int i; -+ for (i = 1; i < 256; i++) { -+ VmaRestoreState *rstate = &vmar->rstate[i]; -+ if (!rstate->target) { -+ continue; -+ } -+ -+ if (blk_flush(rstate->target) < 0) { -+ error_setg(errp, "vma blk_flush %s failed", -+ vmar->devinfo[i].devname); -+ return -1; -+ } -+ -+ if (vmar->devinfo[i].size && -+ (strcmp(vmar->devinfo[i].devname, "vmstate") != 0)) { -+ assert(rstate->bitmap); -+ -+ int64_t cluster_num, end; -+ -+ end = (vmar->devinfo[i].size + VMA_CLUSTER_SIZE - 1) / -+ VMA_CLUSTER_SIZE; -+ -+ for (cluster_num = 0; cluster_num < end; cluster_num++) { -+ if (!vma_reader_get_bitmap(rstate, cluster_num)) { -+ error_setg(errp, "detected missing cluster %zd " -+ "for stream %s", cluster_num, -+ vmar->devinfo[i].devname); -+ return -1; -+ } -+ } -+ } -+ } -+ -+ if (verbose) { -+ if (vmar->clusters_read) { -+ printf("total bytes read %zd, sparse bytes %zd (%.3g%%)\n", -+ vmar->clusters_read*VMA_CLUSTER_SIZE, -+ vmar->zero_cluster_data, -+ (double)(100.0*vmar->zero_cluster_data)/ -+ (vmar->clusters_read*VMA_CLUSTER_SIZE)); -+ -+ int64_t datasize = vmar->clusters_read*VMA_CLUSTER_SIZE-vmar->zero_cluster_data; -+ if (datasize) { // this does not make sense for empty files -+ printf("space reduction due to 4K zero blocks %.3g%%\n", -+ (double)(100.0*vmar->partial_zero_cluster_data) / 
datasize); -+ } -+ } else { -+ printf("vma archive contains no image data\n"); -+ } -+ } -+ return ret; -+} -+ -+int vma_reader_restore(VmaReader *vmar, int vmstate_fd, bool verbose, -+ Error **errp) -+{ -+ return vma_reader_restore_full(vmar, vmstate_fd, verbose, false, errp); -+} -+ -+int vma_reader_verify(VmaReader *vmar, bool verbose, Error **errp) -+{ -+ guint8 dev_id; -+ -+ for (dev_id = 1; dev_id < 255; dev_id++) { -+ if (vma_reader_get_device_info(vmar, dev_id)) { -+ allocate_rstate(vmar, dev_id, NULL, false); -+ } -+ } -+ -+ return vma_reader_restore_full(vmar, -1, verbose, true, errp); -+} -+ -diff --git a/vma-writer.c b/vma-writer.c -new file mode 100644 -index 0000000000..fd9567634d ---- /dev/null -+++ b/vma-writer.c -@@ -0,0 +1,771 @@ -+/* -+ * VMA: Virtual Machine Archive -+ * -+ * Copyright (C) 2012 Proxmox Server Solutions -+ * -+ * Authors: -+ * Dietmar Maurer (dietmar@proxmox.com) -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. 
-+ *
-+ */
-+
-+#include "qemu/osdep.h"
-+#include
-+#include
-+
-+#include "vma.h"
-+#include "block/block.h"
-+#include "monitor/monitor.h"
-+#include "qemu/main-loop.h"
-+#include "qemu/coroutine.h"
-+#include "qemu/cutils.h"
-+
-+#define DEBUG_VMA 0
-+
-+#define DPRINTF(fmt, ...)\
-+    do { if (DEBUG_VMA) { printf("vma: " fmt, ## __VA_ARGS__); } } while (0)
-+
-+#define WRITE_BUFFERS 5
-+#define HEADER_CLUSTERS 8
-+#define HEADERBUF_SIZE (VMA_CLUSTER_SIZE*HEADER_CLUSTERS)
-+
-+struct VmaWriter {
-+    int fd;
-+    FILE *cmd;
-+    int status;
-+    char errmsg[8192];
-+    uuid_t uuid;
-+    bool header_written;
-+    bool closed;
-+
-+    /* we always write extents */
-+    unsigned char *outbuf;
-+    int outbuf_pos; /* in bytes */
-+    int outbuf_count; /* in VMA_BLOCKS */
-+    uint64_t outbuf_block_info[VMA_BLOCKS_PER_EXTENT];
-+
-+    unsigned char *headerbuf;
-+
-+    GChecksum *md5csum;
-+    CoMutex flush_lock;
-+    Coroutine *co_writer;
-+
-+    /* drive informations */
-+    VmaStreamInfo stream_info[256];
-+    guint stream_count;
-+
-+    guint8 vmstate_stream;
-+    uint32_t vmstate_clusters;
-+
-+    /* header blob table */
-+    char *header_blob_table;
-+    uint32_t header_blob_table_size;
-+    uint32_t header_blob_table_pos;
-+
-+    /* store for config blobs */
-+    uint32_t config_names[VMA_MAX_CONFIGS]; /* offset into blob_buffer table */
-+    uint32_t config_data[VMA_MAX_CONFIGS]; /* offset into blob_buffer table */
-+    uint32_t config_count;
-+};
-+
-+void vma_writer_set_error(VmaWriter *vmaw, const char *fmt, ...)
-+{
-+    va_list ap;
-+
-+    if (vmaw->status < 0) {
-+        return;
-+    }
-+
-+    vmaw->status = -1;
-+
-+    va_start(ap, fmt);
-+    g_vsnprintf(vmaw->errmsg, sizeof(vmaw->errmsg), fmt, ap);
-+    va_end(ap);
-+
-+    DPRINTF("vma_writer_set_error: %s\n", vmaw->errmsg);
-+}
-+
-+static uint32_t allocate_header_blob(VmaWriter *vmaw, const char *data,
-+                                     size_t len)
-+{
-+    if (len > 65535) {
-+        return 0;
-+    }
-+
-+    if (!vmaw->header_blob_table ||
-+        (vmaw->header_blob_table_size <
-+         (vmaw->header_blob_table_pos + len + 2))) {
-+        int newsize = vmaw->header_blob_table_size + ((len + 2 + 511)/512)*512;
-+
-+        vmaw->header_blob_table = g_realloc(vmaw->header_blob_table, newsize);
-+        memset(vmaw->header_blob_table + vmaw->header_blob_table_size,
-+               0, newsize - vmaw->header_blob_table_size);
-+        vmaw->header_blob_table_size = newsize;
-+    }
-+
-+    uint32_t cpos = vmaw->header_blob_table_pos;
-+    vmaw->header_blob_table[cpos] = len & 255;
-+    vmaw->header_blob_table[cpos+1] = (len >> 8) & 255;
-+    memcpy(vmaw->header_blob_table + cpos + 2, data, len);
-+    vmaw->header_blob_table_pos += len + 2;
-+    return cpos;
-+}
-+
-+static uint32_t allocate_header_string(VmaWriter *vmaw, const char *str)
-+{
-+    assert(vmaw);
-+
-+    size_t len = strlen(str) + 1;
-+
-+    return allocate_header_blob(vmaw, str, len);
-+}
-+
-+int vma_writer_add_config(VmaWriter *vmaw, const char *name, gpointer data,
-+                          gsize len)
-+{
-+    assert(vmaw);
-+    assert(!vmaw->header_written);
-+    assert(vmaw->config_count < VMA_MAX_CONFIGS);
-+    assert(name);
-+    assert(data);
-+
-+    gchar *basename = g_path_get_basename(name);
-+    uint32_t name_ptr = allocate_header_string(vmaw, basename);
-+    g_free(basename);
-+
-+    if (!name_ptr) {
-+        return -1;
-+    }
-+
-+    uint32_t data_ptr = allocate_header_blob(vmaw, data, len);
-+    if (!data_ptr) {
-+        return -1;
-+    }
-+
-+    vmaw->config_names[vmaw->config_count] = name_ptr;
-+    vmaw->config_data[vmaw->config_count] = data_ptr;
-+
-+    vmaw->config_count++;
-+
-+    return 0;
-+}
-+
-+int
vma_writer_register_stream(VmaWriter *vmaw, const char *devname, -+ size_t size) -+{ -+ assert(vmaw); -+ assert(devname); -+ assert(!vmaw->status); -+ -+ if (vmaw->header_written) { -+ vma_writer_set_error(vmaw, "vma_writer_register_stream: header " -+ "already written"); -+ return -1; -+ } -+ -+ guint n = vmaw->stream_count + 1; -+ -+ /* we can have dev_ids form 1 to 255 (0 reserved) -+ * 255(-1) reseverd for safety -+ */ -+ if (n > 254) { -+ vma_writer_set_error(vmaw, "vma_writer_register_stream: " -+ "too many drives"); -+ return -1; -+ } -+ -+ if (size <= 0) { -+ vma_writer_set_error(vmaw, "vma_writer_register_stream: " -+ "got strange size %zd", size); -+ return -1; -+ } -+ -+ DPRINTF("vma_writer_register_stream %s %zu %d\n", devname, size, n); -+ -+ vmaw->stream_info[n].devname = g_strdup(devname); -+ vmaw->stream_info[n].size = size; -+ -+ vmaw->stream_info[n].cluster_count = (size + VMA_CLUSTER_SIZE - 1) / -+ VMA_CLUSTER_SIZE; -+ -+ vmaw->stream_count = n; -+ -+ if (strcmp(devname, "vmstate") == 0) { -+ vmaw->vmstate_stream = n; -+ } -+ -+ return n; -+} -+ -+static void vma_co_continue_write(void *opaque) -+{ -+ VmaWriter *vmaw = opaque; -+ -+ DPRINTF("vma_co_continue_write\n"); -+ qemu_coroutine_enter(vmaw->co_writer); -+} -+ -+static ssize_t coroutine_fn -+vma_queue_write(VmaWriter *vmaw, const void *buf, size_t bytes) -+{ -+ DPRINTF("vma_queue_write enter %zd\n", bytes); -+ -+ assert(vmaw); -+ assert(buf); -+ assert(bytes <= VMA_MAX_EXTENT_SIZE); -+ -+ size_t done = 0; -+ ssize_t ret; -+ -+ assert(vmaw->co_writer == NULL); -+ -+ vmaw->co_writer = qemu_coroutine_self(); -+ -+ while (done < bytes) { -+ aio_set_fd_handler(qemu_get_aio_context(), vmaw->fd, false, NULL, vma_co_continue_write, NULL, vmaw); -+ qemu_coroutine_yield(); -+ aio_set_fd_handler(qemu_get_aio_context(), vmaw->fd, false, NULL, NULL, NULL, NULL); -+ if (vmaw->status < 0) { -+ DPRINTF("vma_queue_write detected canceled backup\n"); -+ done = -1; -+ break; -+ } -+ ret = write(vmaw->fd, buf 
+ done, bytes - done); -+ if (ret > 0) { -+ done += ret; -+ DPRINTF("vma_queue_write written %zd %zd\n", done, ret); -+ } else if (ret < 0) { -+ if (errno == EAGAIN || errno == EWOULDBLOCK) { -+ /* try again */ -+ } else { -+ vma_writer_set_error(vmaw, "vma_queue_write: write error - %s", -+ g_strerror(errno)); -+ done = -1; /* always return failure for partial writes */ -+ break; -+ } -+ } else if (ret == 0) { -+ /* should not happen - simply try again */ -+ } -+ } -+ -+ vmaw->co_writer = NULL; -+ -+ return (done == bytes) ? bytes : -1; -+} -+ -+VmaWriter *vma_writer_create(const char *filename, uuid_t uuid, Error **errp) -+{ -+ const char *p; -+ -+ assert(sizeof(VmaHeader) == (4096 + 8192)); -+ assert(G_STRUCT_OFFSET(VmaHeader, config_names) == 2044); -+ assert(G_STRUCT_OFFSET(VmaHeader, config_data) == 3068); -+ assert(G_STRUCT_OFFSET(VmaHeader, dev_info) == 4096); -+ assert(sizeof(VmaExtentHeader) == 512); -+ -+ VmaWriter *vmaw = g_new0(VmaWriter, 1); -+ vmaw->fd = -1; -+ -+ vmaw->md5csum = g_checksum_new(G_CHECKSUM_MD5); -+ if (!vmaw->md5csum) { -+ error_setg(errp, "can't allocate cmsum\n"); -+ goto err; -+ } -+ -+ if (strstart(filename, "exec:", &p)) { -+ vmaw->cmd = popen(p, "w"); -+ if (vmaw->cmd == NULL) { -+ error_setg(errp, "can't popen command '%s' - %s\n", p, -+ g_strerror(errno)); -+ goto err; -+ } -+ vmaw->fd = fileno(vmaw->cmd); -+ -+ /* try to use O_NONBLOCK */ -+ fcntl(vmaw->fd, F_SETFL, fcntl(vmaw->fd, F_GETFL)|O_NONBLOCK); -+ -+ } else { -+ struct stat st; -+ int oflags; -+ const char *tmp_id_str; -+ -+ if ((stat(filename, &st) == 0) && S_ISFIFO(st.st_mode)) { -+ oflags = O_NONBLOCK|O_WRONLY; -+ vmaw->fd = qemu_open(filename, oflags, 0644); -+ } else if (strstart(filename, "/dev/fdset/", &tmp_id_str)) { -+ oflags = O_NONBLOCK|O_WRONLY; -+ vmaw->fd = qemu_open(filename, oflags, 0644); -+ } else if (strstart(filename, "/dev/fdname/", &tmp_id_str)) { -+ vmaw->fd = monitor_get_fd(cur_mon, tmp_id_str, errp); -+ if (vmaw->fd < 0) { -+ goto err; -+ } 
-+ /* try to use O_NONBLOCK */ -+ fcntl(vmaw->fd, F_SETFL, fcntl(vmaw->fd, F_GETFL)|O_NONBLOCK); -+ } else { -+ oflags = O_NONBLOCK|O_DIRECT|O_WRONLY|O_CREAT|O_EXCL; -+ vmaw->fd = qemu_open(filename, oflags, 0644); -+ } -+ -+ if (vmaw->fd < 0) { -+ error_setg(errp, "can't open file %s - %s\n", filename, -+ g_strerror(errno)); -+ goto err; -+ } -+ } -+ -+ /* we use O_DIRECT, so we need to align IO buffers */ -+ -+ vmaw->outbuf = qemu_memalign(512, VMA_MAX_EXTENT_SIZE); -+ vmaw->headerbuf = qemu_memalign(512, HEADERBUF_SIZE); -+ -+ vmaw->outbuf_count = 0; -+ vmaw->outbuf_pos = VMA_EXTENT_HEADER_SIZE; -+ -+ vmaw->header_blob_table_pos = 1; /* start at pos 1 */ -+ -+ qemu_co_mutex_init(&vmaw->flush_lock); -+ -+ uuid_copy(vmaw->uuid, uuid); -+ -+ return vmaw; -+ -+err: -+ if (vmaw) { -+ if (vmaw->cmd) { -+ pclose(vmaw->cmd); -+ } else if (vmaw->fd >= 0) { -+ close(vmaw->fd); -+ } -+ -+ if (vmaw->md5csum) { -+ g_checksum_free(vmaw->md5csum); -+ } -+ -+ g_free(vmaw); -+ } -+ -+ return NULL; -+} -+ -+static int coroutine_fn vma_write_header(VmaWriter *vmaw) -+{ -+ assert(vmaw); -+ unsigned char *buf = vmaw->headerbuf; -+ VmaHeader *head = (VmaHeader *)buf; -+ -+ int i; -+ -+ DPRINTF("VMA WRITE HEADER\n"); -+ -+ if (vmaw->status < 0) { -+ return vmaw->status; -+ } -+ -+ memset(buf, 0, HEADERBUF_SIZE); -+ -+ head->magic = VMA_MAGIC; -+ head->version = GUINT32_TO_BE(1); /* v1 */ -+ memcpy(head->uuid, vmaw->uuid, 16); -+ -+ time_t ctime = time(NULL); -+ head->ctime = GUINT64_TO_BE(ctime); -+ -+ for (i = 0; i < VMA_MAX_CONFIGS; i++) { -+ head->config_names[i] = GUINT32_TO_BE(vmaw->config_names[i]); -+ head->config_data[i] = GUINT32_TO_BE(vmaw->config_data[i]); -+ } -+ -+ /* 32 bytes per device (12 used currently) = 8192 bytes max */ -+ for (i = 1; i <= 254; i++) { -+ VmaStreamInfo *si = &vmaw->stream_info[i]; -+ if (si->size) { -+ assert(si->devname); -+ uint32_t devname_ptr = allocate_header_string(vmaw, si->devname); -+ if (!devname_ptr) { -+ return -1; -+ } -+ 
head->dev_info[i].devname_ptr = GUINT32_TO_BE(devname_ptr); -+ head->dev_info[i].size = GUINT64_TO_BE(si->size); -+ } -+ } -+ -+ uint32_t header_size = sizeof(VmaHeader) + vmaw->header_blob_table_size; -+ head->header_size = GUINT32_TO_BE(header_size); -+ -+ if (header_size > HEADERBUF_SIZE) { -+ return -1; /* just to be sure */ -+ } -+ -+ uint32_t blob_buffer_offset = sizeof(VmaHeader); -+ memcpy(buf + blob_buffer_offset, vmaw->header_blob_table, -+ vmaw->header_blob_table_size); -+ head->blob_buffer_offset = GUINT32_TO_BE(blob_buffer_offset); -+ head->blob_buffer_size = GUINT32_TO_BE(vmaw->header_blob_table_pos); -+ -+ g_checksum_reset(vmaw->md5csum); -+ g_checksum_update(vmaw->md5csum, (const guchar *)buf, header_size); -+ gsize csize = 16; -+ g_checksum_get_digest(vmaw->md5csum, (guint8 *)(head->md5sum), &csize); -+ -+ return vma_queue_write(vmaw, buf, header_size); -+} -+ -+static int coroutine_fn vma_writer_flush(VmaWriter *vmaw) -+{ -+ assert(vmaw); -+ -+ int ret; -+ int i; -+ -+ if (vmaw->status < 0) { -+ return vmaw->status; -+ } -+ -+ if (!vmaw->header_written) { -+ vmaw->header_written = true; -+ ret = vma_write_header(vmaw); -+ if (ret < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_flush: write header failed"); -+ return ret; -+ } -+ } -+ -+ DPRINTF("VMA WRITE FLUSH %d %d\n", vmaw->outbuf_count, vmaw->outbuf_pos); -+ -+ -+ VmaExtentHeader *ehead = (VmaExtentHeader *)vmaw->outbuf; -+ -+ ehead->magic = VMA_EXTENT_MAGIC; -+ ehead->reserved1 = 0; -+ -+ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { -+ ehead->blockinfo[i] = GUINT64_TO_BE(vmaw->outbuf_block_info[i]); -+ } -+ -+ guint16 block_count = (vmaw->outbuf_pos - VMA_EXTENT_HEADER_SIZE) / -+ VMA_BLOCK_SIZE; -+ -+ ehead->block_count = GUINT16_TO_BE(block_count); -+ -+ memcpy(ehead->uuid, vmaw->uuid, sizeof(ehead->uuid)); -+ memset(ehead->md5sum, 0, sizeof(ehead->md5sum)); -+ -+ g_checksum_reset(vmaw->md5csum); -+ g_checksum_update(vmaw->md5csum, vmaw->outbuf, VMA_EXTENT_HEADER_SIZE); -+ gsize csize = 
16; -+ g_checksum_get_digest(vmaw->md5csum, ehead->md5sum, &csize); -+ -+ int bytes = vmaw->outbuf_pos; -+ ret = vma_queue_write(vmaw, vmaw->outbuf, bytes); -+ if (ret != bytes) { -+ vma_writer_set_error(vmaw, "vma_writer_flush: failed write"); -+ } -+ -+ vmaw->outbuf_count = 0; -+ vmaw->outbuf_pos = VMA_EXTENT_HEADER_SIZE; -+ -+ for (i = 0; i < VMA_BLOCKS_PER_EXTENT; i++) { -+ vmaw->outbuf_block_info[i] = 0; -+ } -+ -+ return vmaw->status; -+} -+ -+static int vma_count_open_streams(VmaWriter *vmaw) -+{ -+ g_assert(vmaw != NULL); -+ -+ int i; -+ int open_drives = 0; -+ for (i = 0; i <= 255; i++) { -+ if (vmaw->stream_info[i].size && !vmaw->stream_info[i].finished) { -+ open_drives++; -+ } -+ } -+ -+ return open_drives; -+} -+ -+ -+/** -+ * You need to call this if the vma archive does not contain -+ * any data stream. -+ */ -+int coroutine_fn -+vma_writer_flush_output(VmaWriter *vmaw) -+{ -+ qemu_co_mutex_lock(&vmaw->flush_lock); -+ int ret = vma_writer_flush(vmaw); -+ qemu_co_mutex_unlock(&vmaw->flush_lock); -+ if (ret < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_flush_header failed"); -+ } -+ return ret; -+} -+ -+/** -+ * all jobs should call this when there is no more data -+ * Returns: number of remaining stream (0 ==> finished) -+ */ -+int coroutine_fn -+vma_writer_close_stream(VmaWriter *vmaw, uint8_t dev_id) -+{ -+ g_assert(vmaw != NULL); -+ -+ DPRINTF("vma_writer_set_status %d\n", dev_id); -+ if (!vmaw->stream_info[dev_id].size) { -+ vma_writer_set_error(vmaw, "vma_writer_close_stream: " -+ "no such stream %d", dev_id); -+ return -1; -+ } -+ if (vmaw->stream_info[dev_id].finished) { -+ vma_writer_set_error(vmaw, "vma_writer_close_stream: " -+ "stream already closed %d", dev_id); -+ return -1; -+ } -+ -+ vmaw->stream_info[dev_id].finished = true; -+ -+ int open_drives = vma_count_open_streams(vmaw); -+ -+ if (open_drives <= 0) { -+ DPRINTF("vma_writer_set_status all drives completed\n"); -+ vma_writer_flush_output(vmaw); -+ } -+ -+ return open_drives; 
-+} -+ -+int vma_writer_get_status(VmaWriter *vmaw, VmaStatus *status) -+{ -+ int i; -+ -+ g_assert(vmaw != NULL); -+ -+ if (status) { -+ status->status = vmaw->status; -+ g_strlcpy(status->errmsg, vmaw->errmsg, sizeof(status->errmsg)); -+ for (i = 0; i <= 255; i++) { -+ status->stream_info[i] = vmaw->stream_info[i]; -+ } -+ -+ uuid_unparse_lower(vmaw->uuid, status->uuid_str); -+ } -+ -+ status->closed = vmaw->closed; -+ -+ return vmaw->status; -+} -+ -+static int vma_writer_get_buffer(VmaWriter *vmaw) -+{ -+ int ret = 0; -+ -+ qemu_co_mutex_lock(&vmaw->flush_lock); -+ -+ /* wait until buffer is available */ -+ while (vmaw->outbuf_count >= (VMA_BLOCKS_PER_EXTENT - 1)) { -+ ret = vma_writer_flush(vmaw); -+ if (ret < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_get_buffer: flush failed"); -+ break; -+ } -+ } -+ -+ qemu_co_mutex_unlock(&vmaw->flush_lock); -+ -+ return ret; -+} -+ -+ -+int64_t coroutine_fn -+vma_writer_write(VmaWriter *vmaw, uint8_t dev_id, int64_t cluster_num, -+ const unsigned char *buf, size_t *zero_bytes) -+{ -+ g_assert(vmaw != NULL); -+ g_assert(zero_bytes != NULL); -+ -+ *zero_bytes = 0; -+ -+ if (vmaw->status < 0) { -+ return vmaw->status; -+ } -+ -+ if (!dev_id || !vmaw->stream_info[dev_id].size) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "no such stream %d", dev_id); -+ return -1; -+ } -+ -+ if (vmaw->stream_info[dev_id].finished) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "stream already closed %d", dev_id); -+ return -1; -+ } -+ -+ -+ if (cluster_num >= (((uint64_t)1)<<32)) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "cluster number out of range"); -+ return -1; -+ } -+ -+ if (dev_id == vmaw->vmstate_stream) { -+ if (cluster_num != vmaw->vmstate_clusters) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "non sequential vmstate write"); -+ } -+ vmaw->vmstate_clusters++; -+ } else if (cluster_num >= vmaw->stream_info[dev_id].cluster_count) { -+ vma_writer_set_error(vmaw, "vma_writer_write: 
cluster number too big"); -+ return -1; -+ } -+ -+ /* wait until buffer is available */ -+ if (vma_writer_get_buffer(vmaw) < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "vma_writer_get_buffer failed"); -+ return -1; -+ } -+ -+ DPRINTF("VMA WRITE %d %zd\n", dev_id, cluster_num); -+ -+ uint16_t mask = 0; -+ -+ if (buf) { -+ int i; -+ int bit = 1; -+ for (i = 0; i < 16; i++) { -+ const unsigned char *vmablock = buf + (i*VMA_BLOCK_SIZE); -+ if (!buffer_is_zero(vmablock, VMA_BLOCK_SIZE)) { -+ mask |= bit; -+ memcpy(vmaw->outbuf + vmaw->outbuf_pos, vmablock, -+ VMA_BLOCK_SIZE); -+ vmaw->outbuf_pos += VMA_BLOCK_SIZE; -+ } else { -+ DPRINTF("VMA WRITE %zd ZERO BLOCK %d\n", cluster_num, i); -+ vmaw->stream_info[dev_id].zero_bytes += VMA_BLOCK_SIZE; -+ *zero_bytes += VMA_BLOCK_SIZE; -+ } -+ -+ bit = bit << 1; -+ } -+ } else { -+ DPRINTF("VMA WRITE %zd ZERO CLUSTER\n", cluster_num); -+ vmaw->stream_info[dev_id].zero_bytes += VMA_CLUSTER_SIZE; -+ *zero_bytes += VMA_CLUSTER_SIZE; -+ } -+ -+ uint64_t block_info = ((uint64_t)mask) << (32+16); -+ block_info |= ((uint64_t)dev_id) << 32; -+ block_info |= (cluster_num & 0xffffffff); -+ vmaw->outbuf_block_info[vmaw->outbuf_count] = block_info; -+ -+ DPRINTF("VMA WRITE MASK %zd %zx\n", cluster_num, block_info); -+ -+ vmaw->outbuf_count++; -+ -+ /** NOTE: We allways write whole clusters, but we correctly set -+ * transferred bytes. So transferred == size when when everything -+ * went OK. 
-+ */ -+ size_t transferred = VMA_CLUSTER_SIZE; -+ -+ if (dev_id != vmaw->vmstate_stream) { -+ uint64_t last = (cluster_num + 1) * VMA_CLUSTER_SIZE; -+ if (last > vmaw->stream_info[dev_id].size) { -+ uint64_t diff = last - vmaw->stream_info[dev_id].size; -+ if (diff >= VMA_CLUSTER_SIZE) { -+ vma_writer_set_error(vmaw, "vma_writer_write: " -+ "read after last cluster"); -+ return -1; -+ } -+ transferred -= diff; -+ } -+ } -+ -+ vmaw->stream_info[dev_id].transferred += transferred; -+ -+ return transferred; -+} -+ -+void vma_writer_error_propagate(VmaWriter *vmaw, Error **errp) -+{ -+ if (vmaw->status < 0 && *errp == NULL) { -+ error_setg(errp, "%s", vmaw->errmsg); -+ } -+} -+ -+int vma_writer_close(VmaWriter *vmaw, Error **errp) -+{ -+ g_assert(vmaw != NULL); -+ -+ int i; -+ -+ while (vmaw->co_writer) { -+ aio_poll(qemu_get_aio_context(), true); -+ } -+ -+ assert(vmaw->co_writer == NULL); -+ -+ if (vmaw->cmd) { -+ if (pclose(vmaw->cmd) < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_close: " -+ "pclose failed - %s", g_strerror(errno)); -+ } -+ } else { -+ if (close(vmaw->fd) < 0) { -+ vma_writer_set_error(vmaw, "vma_writer_close: " -+ "close failed - %s", g_strerror(errno)); -+ } -+ } -+ -+ for (i = 0; i <= 255; i++) { -+ VmaStreamInfo *si = &vmaw->stream_info[i]; -+ if (si->size) { -+ if (!si->finished) { -+ vma_writer_set_error(vmaw, "vma_writer_close: " -+ "detected open stream '%s'", si->devname); -+ } else if ((si->transferred != si->size) && -+ (i != vmaw->vmstate_stream)) { -+ vma_writer_set_error(vmaw, "vma_writer_close: " -+ "incomplete stream '%s' (%zd != %zd)", -+ si->devname, si->transferred, si->size); -+ } -+ } -+ } -+ -+ for (i = 0; i <= 255; i++) { -+ vmaw->stream_info[i].finished = 1; /* mark as closed */ -+ } -+ -+ vmaw->closed = 1; -+ -+ if (vmaw->status < 0 && *errp == NULL) { -+ error_setg(errp, "%s", vmaw->errmsg); -+ } -+ -+ return vmaw->status; -+} -+ -+void vma_writer_destroy(VmaWriter *vmaw) -+{ -+ assert(vmaw); -+ -+ int i; -+ -+ for (i = 
0; i <= 255; i++) { -+ if (vmaw->stream_info[i].devname) { -+ g_free(vmaw->stream_info[i].devname); -+ } -+ } -+ -+ if (vmaw->md5csum) { -+ g_checksum_free(vmaw->md5csum); -+ } -+ -+ g_free(vmaw); -+} -diff --git a/vma.c b/vma.c -new file mode 100644 -index 0000000000..1b59fd1555 ---- /dev/null -+++ b/vma.c -@@ -0,0 +1,756 @@ -+/* -+ * VMA: Virtual Machine Archive -+ * -+ * Copyright (C) 2012-2013 Proxmox Server Solutions -+ * -+ * Authors: -+ * Dietmar Maurer (dietmar@proxmox.com) -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. -+ * -+ */ -+ -+#include "qemu/osdep.h" -+#include -+ -+#include "vma.h" -+#include "qemu-common.h" -+#include "qemu/error-report.h" -+#include "qemu/main-loop.h" -+#include "qapi/qmp/qstring.h" -+#include "sysemu/block-backend.h" -+ -+static void help(void) -+{ -+ const char *help_msg = -+ "usage: vma command [command options]\n" -+ "\n" -+ "vma list \n" -+ "vma config [-c config]\n" -+ "vma create [-c config] pathname ...\n" -+ "vma extract [-r ] \n" -+ "vma verify [-v]\n" -+ ; -+ -+ printf("%s", help_msg); -+ exit(1); -+} -+ -+static const char *extract_devname(const char *path, char **devname, int index) -+{ -+ assert(path); -+ -+ const char *sep = strchr(path, '='); -+ -+ if (sep) { -+ *devname = g_strndup(path, sep - path); -+ path = sep + 1; -+ } else { -+ if (index >= 0) { -+ *devname = g_strdup_printf("disk%d", index); -+ } else { -+ *devname = NULL; -+ } -+ } -+ -+ return path; -+} -+ -+static void print_content(VmaReader *vmar) -+{ -+ assert(vmar); -+ -+ VmaHeader *head = vma_reader_get_header(vmar); -+ -+ GList *l = vma_reader_get_config_data(vmar); -+ while (l && l->data) { -+ VmaConfigData *cdata = (VmaConfigData *)l->data; -+ l = g_list_next(l); -+ printf("CFG: size: %d name: %s\n", cdata->len, cdata->name); -+ } -+ -+ int i; -+ VmaDeviceInfo *di; -+ for (i = 1; i < 255; i++) { -+ di = vma_reader_get_device_info(vmar, i); -+ if (di) { -+ 
if (strcmp(di->devname, "vmstate") == 0) { -+ printf("VMSTATE: dev_id=%d memory: %zd\n", i, di->size); -+ } else { -+ printf("DEV: dev_id=%d size: %zd devname: %s\n", -+ i, di->size, di->devname); -+ } -+ } -+ } -+ /* ctime is the last entry we print */ -+ printf("CTIME: %s", ctime(&head->ctime)); -+ fflush(stdout); -+} -+ -+static int list_content(int argc, char **argv) -+{ -+ int c, ret = 0; -+ const char *filename; -+ -+ for (;;) { -+ c = getopt(argc, argv, "h"); -+ if (c == -1) { -+ break; -+ } -+ switch (c) { -+ case '?': -+ case 'h': -+ help(); -+ break; -+ default: -+ g_assert_not_reached(); -+ } -+ } -+ -+ /* Get the filename */ -+ if ((optind + 1) != argc) { -+ help(); -+ } -+ filename = argv[optind++]; -+ -+ Error *errp = NULL; -+ VmaReader *vmar = vma_reader_create(filename, &errp); -+ -+ if (!vmar) { -+ g_error("%s", error_get_pretty(errp)); -+ } -+ -+ print_content(vmar); -+ -+ vma_reader_destroy(vmar); -+ -+ return ret; -+} -+ -+typedef struct RestoreMap { -+ char *devname; -+ char *path; -+ char *format; -+ bool write_zero; -+} RestoreMap; -+ -+static int extract_content(int argc, char **argv) -+{ -+ int c, ret = 0; -+ int verbose = 0; -+ const char *filename; -+ const char *dirname; -+ const char *readmap = NULL; -+ -+ for (;;) { -+ c = getopt(argc, argv, "hvr:"); -+ if (c == -1) { -+ break; -+ } -+ switch (c) { -+ case '?': -+ case 'h': -+ help(); -+ break; -+ case 'r': -+ readmap = optarg; -+ break; -+ case 'v': -+ verbose = 1; -+ break; -+ default: -+ help(); -+ } -+ } -+ -+ /* Get the filename */ -+ if ((optind + 2) != argc) { -+ help(); -+ } -+ filename = argv[optind++]; -+ dirname = argv[optind++]; -+ -+ Error *errp = NULL; -+ VmaReader *vmar = vma_reader_create(filename, &errp); -+ -+ if (!vmar) { -+ g_error("%s", error_get_pretty(errp)); -+ } -+ -+ if (mkdir(dirname, 0777) < 0) { -+ g_error("unable to create target directory %s - %s", -+ dirname, g_strerror(errno)); -+ } -+ -+ GList *l = vma_reader_get_config_data(vmar); -+ while (l && 
l->data) { -+ VmaConfigData *cdata = (VmaConfigData *)l->data; -+ l = g_list_next(l); -+ char *cfgfn = g_strdup_printf("%s/%s", dirname, cdata->name); -+ GError *err = NULL; -+ if (!g_file_set_contents(cfgfn, (gchar *)cdata->data, cdata->len, -+ &err)) { -+ g_error("unable to write file: %s", err->message); -+ } -+ } -+ -+ GHashTable *devmap = g_hash_table_new(g_str_hash, g_str_equal); -+ -+ if (readmap) { -+ print_content(vmar); -+ -+ FILE *map = fopen(readmap, "r"); -+ if (!map) { -+ g_error("unable to open fifo %s - %s", readmap, g_strerror(errno)); -+ } -+ -+ while (1) { -+ char inbuf[8192]; -+ char *line = fgets(inbuf, sizeof(inbuf), map); -+ if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { -+ break; -+ } -+ int len = strlen(line); -+ if (line[len - 1] == '\n') { -+ line[len - 1] = '\0'; -+ if (len == 1) { -+ break; -+ } -+ } -+ -+ char *format = NULL; -+ if (strncmp(line, "format=", sizeof("format=")-1) == 0) { -+ format = line + sizeof("format=")-1; -+ char *colon = strchr(format, ':'); -+ if (!colon) { -+ g_error("read map failed - found only a format ('%s')", inbuf); -+ } -+ format = g_strndup(format, colon - format); -+ line = colon+1; -+ } -+ -+ const char *path; -+ bool write_zero; -+ if (line[0] == '0' && line[1] == ':') { -+ path = line + 2; -+ write_zero = false; -+ } else if (line[0] == '1' && line[1] == ':') { -+ path = line + 2; -+ write_zero = true; -+ } else { -+ g_error("read map failed - parse error ('%s')", inbuf); -+ } -+ -+ char *devname = NULL; -+ path = extract_devname(path, &devname, -1); -+ if (!devname) { -+ g_error("read map failed - no dev name specified ('%s')", -+ inbuf); -+ } -+ -+ RestoreMap *map = g_new0(RestoreMap, 1); -+ map->devname = g_strdup(devname); -+ map->path = g_strdup(path); -+ map->format = format; -+ map->write_zero = write_zero; -+ -+ g_hash_table_insert(devmap, map->devname, map); -+ -+ }; -+ } -+ -+ int i; -+ int vmstate_fd = -1; -+ guint8 vmstate_stream = 0; -+ -+ BlockBackend *blk = NULL; -+ -+ for 
(i = 1; i < 255; i++) { -+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i); -+ if (di && (strcmp(di->devname, "vmstate") == 0)) { -+ vmstate_stream = i; -+ char *statefn = g_strdup_printf("%s/vmstate.bin", dirname); -+ vmstate_fd = open(statefn, O_WRONLY|O_CREAT|O_EXCL, 0644); -+ if (vmstate_fd < 0) { -+ g_error("create vmstate file '%s' failed - %s", statefn, -+ g_strerror(errno)); -+ } -+ g_free(statefn); -+ } else if (di) { -+ char *devfn = NULL; -+ const char *format = NULL; -+ int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; -+ bool write_zero = true; -+ -+ if (readmap) { -+ RestoreMap *map; -+ map = (RestoreMap *)g_hash_table_lookup(devmap, di->devname); -+ if (map == NULL) { -+ g_error("no device name mapping for %s", di->devname); -+ } -+ devfn = map->path; -+ format = map->format; -+ write_zero = map->write_zero; -+ } else { -+ devfn = g_strdup_printf("%s/tmp-disk-%s.raw", -+ dirname, di->devname); -+ printf("DEVINFO %s %zd\n", devfn, di->size); -+ -+ bdrv_img_create(devfn, "raw", NULL, NULL, NULL, di->size, -+ flags, true, &errp); -+ if (errp) { -+ g_error("can't create file %s: %s", devfn, -+ error_get_pretty(errp)); -+ } -+ -+ /* Note: we created an empty file above, so there is no -+ * need to write zeroes (so we generate a sparse file) -+ */ -+ write_zero = false; -+ } -+ -+ size_t devlen = strlen(devfn); -+ QDict *options = NULL; -+ if (format) { -+ /* explicit format from commandline */ -+ options = qdict_new(); -+ qdict_put(options, "driver", qstring_from_str(format)); -+ } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) || -+ strncmp(devfn, "/dev/", 5) == 0) -+ { -+ /* This part is now deprecated for PVE as well (just as qemu -+ * deprecated not specifying an explicit raw format, too. 
-+ */ -+ /* explicit raw format */ -+ options = qdict_new(); -+ qdict_put(options, "driver", qstring_from_str("raw")); -+ } -+ -+ -+ if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { -+ g_error("can't open file %s - %s", devfn, -+ error_get_pretty(errp)); -+ } -+ -+ if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) { -+ g_error("%s", error_get_pretty(errp)); -+ } -+ -+ if (!readmap) { -+ g_free(devfn); -+ } -+ } -+ } -+ -+ if (vma_reader_restore(vmar, vmstate_fd, verbose, &errp) < 0) { -+ g_error("restore failed - %s", error_get_pretty(errp)); -+ } -+ -+ if (!readmap) { -+ for (i = 1; i < 255; i++) { -+ VmaDeviceInfo *di = vma_reader_get_device_info(vmar, i); -+ if (di && (i != vmstate_stream)) { -+ char *tmpfn = g_strdup_printf("%s/tmp-disk-%s.raw", -+ dirname, di->devname); -+ char *fn = g_strdup_printf("%s/disk-%s.raw", -+ dirname, di->devname); -+ if (rename(tmpfn, fn) != 0) { -+ g_error("rename %s to %s failed - %s", -+ tmpfn, fn, g_strerror(errno)); -+ } -+ } -+ } -+ } -+ -+ vma_reader_destroy(vmar); -+ -+ blk_unref(blk); -+ -+ bdrv_close_all(); -+ -+ return ret; -+} -+ -+static int verify_content(int argc, char **argv) -+{ -+ int c, ret = 0; -+ int verbose = 0; -+ const char *filename; -+ -+ for (;;) { -+ c = getopt(argc, argv, "hv"); -+ if (c == -1) { -+ break; -+ } -+ switch (c) { -+ case '?': -+ case 'h': -+ help(); -+ break; -+ case 'v': -+ verbose = 1; -+ break; -+ default: -+ help(); -+ } -+ } -+ -+ /* Get the filename */ -+ if ((optind + 1) != argc) { -+ help(); -+ } -+ filename = argv[optind++]; -+ -+ Error *errp = NULL; -+ VmaReader *vmar = vma_reader_create(filename, &errp); -+ -+ if (!vmar) { -+ g_error("%s", error_get_pretty(errp)); -+ } -+ -+ if (verbose) { -+ print_content(vmar); -+ } -+ -+ if (vma_reader_verify(vmar, verbose, &errp) < 0) { -+ g_error("verify failed - %s", error_get_pretty(errp)); -+ } -+ -+ vma_reader_destroy(vmar); -+ -+ bdrv_close_all(); -+ -+ return ret; -+} -+ -+typedef struct 
BackupJob { -+ BlockBackend *target; -+ int64_t len; -+ VmaWriter *vmaw; -+ uint8_t dev_id; -+} BackupJob; -+ -+#define BACKUP_SECTORS_PER_CLUSTER (VMA_CLUSTER_SIZE / BDRV_SECTOR_SIZE) -+ -+static void coroutine_fn backup_run_empty(void *opaque) -+{ -+ VmaWriter *vmaw = (VmaWriter *)opaque; -+ -+ vma_writer_flush_output(vmaw); -+ -+ Error *err = NULL; -+ if (vma_writer_close(vmaw, &err) != 0) { -+ g_warning("vma_writer_close failed %s", error_get_pretty(err)); -+ } -+} -+ -+static void coroutine_fn backup_run(void *opaque) -+{ -+ BackupJob *job = (BackupJob *)opaque; -+ struct iovec iov; -+ QEMUIOVector qiov; -+ -+ int64_t start, end; -+ int ret = 0; -+ -+ unsigned char *buf = blk_blockalign(job->target, VMA_CLUSTER_SIZE); -+ -+ start = 0; -+ end = DIV_ROUND_UP(job->len / BDRV_SECTOR_SIZE, -+ BACKUP_SECTORS_PER_CLUSTER); -+ -+ for (; start < end; start++) { -+ iov.iov_base = buf; -+ iov.iov_len = VMA_CLUSTER_SIZE; -+ qemu_iovec_init_external(&qiov, &iov, 1); -+ -+ ret = blk_co_preadv(job->target, start * VMA_CLUSTER_SIZE, -+ VMA_CLUSTER_SIZE, &qiov, 0); -+ if (ret < 0) { -+ vma_writer_set_error(job->vmaw, "read error", -1); -+ goto out; -+ } -+ -+ size_t zb = 0; -+ if (vma_writer_write(job->vmaw, job->dev_id, start, buf, &zb) < 0) { -+ vma_writer_set_error(job->vmaw, "backup_dump_cb vma_writer_write failed", -1); -+ goto out; -+ } -+ } -+ -+ -+out: -+ if (vma_writer_close_stream(job->vmaw, job->dev_id) <= 0) { -+ Error *err = NULL; -+ if (vma_writer_close(job->vmaw, &err) != 0) { -+ g_warning("vma_writer_close failed %s", error_get_pretty(err)); -+ } -+ } -+} -+ -+static int create_archive(int argc, char **argv) -+{ -+ int i, c; -+ int verbose = 0; -+ const char *archivename; -+ GList *config_files = NULL; -+ -+ for (;;) { -+ c = getopt(argc, argv, "hvc:"); -+ if (c == -1) { -+ break; -+ } -+ switch (c) { -+ case '?': -+ case 'h': -+ help(); -+ break; -+ case 'c': -+ config_files = g_list_append(config_files, optarg); -+ break; -+ case 'v': -+ verbose = 1; -+ 
break; -+ default: -+ g_assert_not_reached(); -+ } -+ } -+ -+ -+ /* make sure we an archive name */ -+ if ((optind + 1) > argc) { -+ help(); -+ } -+ -+ archivename = argv[optind++]; -+ -+ uuid_t uuid; -+ uuid_generate(uuid); -+ -+ Error *local_err = NULL; -+ VmaWriter *vmaw = vma_writer_create(archivename, uuid, &local_err); -+ -+ if (vmaw == NULL) { -+ g_error("%s", error_get_pretty(local_err)); -+ } -+ -+ GList *l = config_files; -+ while (l && l->data) { -+ char *name = l->data; -+ char *cdata = NULL; -+ gsize clen = 0; -+ GError *err = NULL; -+ if (!g_file_get_contents(name, &cdata, &clen, &err)) { -+ unlink(archivename); -+ g_error("Unable to read file: %s", err->message); -+ } -+ -+ if (vma_writer_add_config(vmaw, name, cdata, clen) != 0) { -+ unlink(archivename); -+ g_error("Unable to append config data %s (len = %zd)", -+ name, clen); -+ } -+ l = g_list_next(l); -+ } -+ -+ int devcount = 0; -+ while (optind < argc) { -+ const char *path = argv[optind++]; -+ char *devname = NULL; -+ path = extract_devname(path, &devname, devcount++); -+ -+ Error *errp = NULL; -+ BlockBackend *target; -+ -+ target = blk_new_open(path, NULL, NULL, 0, &errp); -+ if (!target) { -+ unlink(archivename); -+ g_error("bdrv_open '%s' failed - %s", path, error_get_pretty(errp)); -+ } -+ int64_t size = blk_getlength(target); -+ int dev_id = vma_writer_register_stream(vmaw, devname, size); -+ if (dev_id <= 0) { -+ unlink(archivename); -+ g_error("vma_writer_register_stream '%s' failed", devname); -+ } -+ -+ BackupJob *job = g_new0(BackupJob, 1); -+ job->len = size; -+ job->target = target; -+ job->vmaw = vmaw; -+ job->dev_id = dev_id; -+ -+ Coroutine *co = qemu_coroutine_create(backup_run, job); -+ qemu_coroutine_enter(co); -+ } -+ -+ VmaStatus vmastat; -+ int percent = 0; -+ int last_percent = -1; -+ -+ if (devcount) { -+ while (1) { -+ main_loop_wait(false); -+ vma_writer_get_status(vmaw, &vmastat); -+ -+ if (verbose) { -+ -+ uint64_t total = 0; -+ uint64_t transferred = 0; -+ uint64_t 
zero_bytes = 0; -+ -+ int i; -+ for (i = 0; i < 256; i++) { -+ if (vmastat.stream_info[i].size) { -+ total += vmastat.stream_info[i].size; -+ transferred += vmastat.stream_info[i].transferred; -+ zero_bytes += vmastat.stream_info[i].zero_bytes; -+ } -+ } -+ percent = (transferred*100)/total; -+ if (percent != last_percent) { -+ fprintf(stderr, "progress %d%% %zd/%zd %zd\n", percent, -+ transferred, total, zero_bytes); -+ fflush(stderr); -+ -+ last_percent = percent; -+ } -+ } -+ -+ if (vmastat.closed) { -+ break; -+ } -+ } -+ } else { -+ Coroutine *co = qemu_coroutine_create(backup_run_empty, vmaw); -+ qemu_coroutine_enter(co); -+ while (1) { -+ main_loop_wait(false); -+ vma_writer_get_status(vmaw, &vmastat); -+ if (vmastat.closed) { -+ break; -+ } -+ } -+ } -+ -+ bdrv_drain_all(); -+ -+ vma_writer_get_status(vmaw, &vmastat); -+ -+ if (verbose) { -+ for (i = 0; i < 256; i++) { -+ VmaStreamInfo *si = &vmastat.stream_info[i]; -+ if (si->size) { -+ fprintf(stderr, "image %s: size=%zd zeros=%zd saved=%zd\n", -+ si->devname, si->size, si->zero_bytes, -+ si->size - si->zero_bytes); -+ } -+ } -+ } -+ -+ if (vmastat.status < 0) { -+ unlink(archivename); -+ g_error("creating vma archive failed"); -+ } -+ -+ return 0; -+} -+ -+static int dump_config(int argc, char **argv) -+{ -+ int c, ret = 0; -+ const char *filename; -+ const char *config_name = "qemu-server.conf"; -+ -+ for (;;) { -+ c = getopt(argc, argv, "hc:"); -+ if (c == -1) { -+ break; -+ } -+ switch (c) { -+ case '?': -+ case 'h': -+ help(); -+ break; -+ case 'c': -+ config_name = optarg; -+ break; -+ default: -+ help(); -+ } -+ } -+ -+ /* Get the filename */ -+ if ((optind + 1) != argc) { -+ help(); -+ } -+ filename = argv[optind++]; -+ -+ Error *errp = NULL; -+ VmaReader *vmar = vma_reader_create(filename, &errp); -+ -+ if (!vmar) { -+ g_error("%s", error_get_pretty(errp)); -+ } -+ -+ int found = 0; -+ GList *l = vma_reader_get_config_data(vmar); -+ while (l && l->data) { -+ VmaConfigData *cdata = (VmaConfigData 
*)l->data; -+ l = g_list_next(l); -+ if (strcmp(cdata->name, config_name) == 0) { -+ found = 1; -+ fwrite(cdata->data, cdata->len, 1, stdout); -+ break; -+ } -+ } -+ -+ vma_reader_destroy(vmar); -+ -+ bdrv_close_all(); -+ -+ if (!found) { -+ fprintf(stderr, "unable to find configuration data '%s'\n", config_name); -+ return -1; -+ } -+ -+ return ret; -+} -+ -+int main(int argc, char **argv) -+{ -+ const char *cmdname; -+ Error *main_loop_err = NULL; -+ -+ error_set_progname(argv[0]); -+ -+ if (qemu_init_main_loop(&main_loop_err)) { -+ g_error("%s", error_get_pretty(main_loop_err)); -+ } -+ -+ bdrv_init(); -+ -+ if (argc < 2) { -+ help(); -+ } -+ -+ cmdname = argv[1]; -+ argc--; argv++; -+ -+ -+ if (!strcmp(cmdname, "list")) { -+ return list_content(argc, argv); -+ } else if (!strcmp(cmdname, "create")) { -+ return create_archive(argc, argv); -+ } else if (!strcmp(cmdname, "extract")) { -+ return extract_content(argc, argv); -+ } else if (!strcmp(cmdname, "verify")) { -+ return verify_content(argc, argv); -+ } else if (!strcmp(cmdname, "config")) { -+ return dump_config(argc, argv); -+ } -+ -+ help(); -+ return 0; -+} -diff --git a/vma.h b/vma.h -new file mode 100644 -index 0000000000..c895c97f6d ---- /dev/null -+++ b/vma.h -@@ -0,0 +1,150 @@ -+/* -+ * VMA: Virtual Machine Archive -+ * -+ * Copyright (C) Proxmox Server Solutions -+ * -+ * Authors: -+ * Dietmar Maurer (dietmar@proxmox.com) -+ * -+ * This work is licensed under the terms of the GNU GPL, version 2 or later. -+ * See the COPYING file in the top-level directory. -+ * -+ */ -+ -+#ifndef BACKUP_VMA_H -+#define BACKUP_VMA_H -+ -+#include -+#include "qapi/error.h" -+#include "block/block.h" -+ -+#define VMA_BLOCK_BITS 12 -+#define VMA_BLOCK_SIZE (1< +Date: Tue, 27 Mar 2018 10:49:03 +0200 +Subject: [PATCH] PVE: vma: remove forced NO_FLUSH option + +This one's rbd specific and in no way a sane choice for all +types storages. Instead, we want to honor the cache option +passed along. 
+ +Signed-off-by: Wolfgang Bumiller +--- + vma.c | 2 +- + 1 file changed, 1 insertion(+), 1 deletion(-) + +diff --git a/vma.c b/vma.c +index 476b7bee00..3289fd722f 100644 +--- a/vma.c ++++ b/vma.c +@@ -327,7 +327,7 @@ static int extract_content(int argc, char **argv) + uint64_t throttling_bps = 0; + const char *throttling_group = NULL; + const char *cache = NULL; +- int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; ++ int flags = BDRV_O_RDWR; + bool write_zero = true; + + if (readmap) { +-- +2.11.0 + diff --git a/debian/patches/pve/0026-PVE-Add-dummy-id-command-line-parameter.patch b/debian/patches/pve/0026-PVE-Add-dummy-id-command-line-parameter.patch new file mode 100644 index 0000000..334125c --- /dev/null +++ b/debian/patches/pve/0026-PVE-Add-dummy-id-command-line-parameter.patch @@ -0,0 +1,57 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Thu, 30 Aug 2018 14:52:56 +0200 +Subject: [PATCH] PVE: Add dummy -id command line parameter + +This used to be part of the qemu-side PVE authentication for +VNC. Now this does nothing. 
+ +Signed-off-by: Wolfgang Bumiller +--- + qemu-options.hx | 3 +++ + vl.c | 8 ++++++++ + 2 files changed, 11 insertions(+) + +diff --git a/qemu-options.hx b/qemu-options.hx +index 31329e26e2..15df7e4fab 100644 +--- a/qemu-options.hx ++++ b/qemu-options.hx +@@ -591,6 +591,9 @@ STEXI + @table @option + ETEXI + ++DEF("id", HAS_ARG, QEMU_OPTION_id, ++ "-id n set the VMID", QEMU_ARCH_ALL) ++ + DEF("fda", HAS_ARG, QEMU_OPTION_fda, + "-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL) + DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL) +diff --git a/vl.c b/vl.c +index 63107d82a3..e349797245 100644 +--- a/vl.c ++++ b/vl.c +@@ -2915,6 +2915,7 @@ static void register_global_properties(MachineState *ms) + int main(int argc, char **argv, char **envp) + { + int i; ++ long vm_id; + int snapshot, linux_boot; + const char *initrd_filename; + const char *kernel_filename, *kernel_cmdline; +@@ -3660,6 +3661,13 @@ int main(int argc, char **argv, char **envp) + exit(1); + } + break; ++ case QEMU_OPTION_id: ++ vm_id = strtol(optarg, (char **)&optarg, 10); ++ if (*optarg != 0 || vm_id < 100 || vm_id > INT_MAX) { ++ error_report("invalid -id argument %s", optarg); ++ exit(1); ++ } ++ break; + case QEMU_OPTION_vnc: + vnc_parse(optarg, &error_fatal); + break; +-- +2.11.0 + diff --git a/debian/patches/pve/0026-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch b/debian/patches/pve/0026-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch deleted file mode 100644 index 2c14854..0000000 --- a/debian/patches/pve/0026-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch +++ /dev/null @@ -1,189 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Thu, 15 Feb 2018 11:07:56 +0100 -Subject: [PATCH] PVE: vma: add throttling options to drive mapping fifo - protocol - -We now need to call initialize the qom module as well. 
- -Signed-off-by: Wolfgang Bumiller ---- - vma.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------- - 1 file changed, 76 insertions(+), 12 deletions(-) - -diff --git a/vma.c b/vma.c -index 1b59fd1555..f9f5c308fe 100644 ---- a/vma.c -+++ b/vma.c -@@ -18,7 +18,8 @@ - #include "qemu-common.h" - #include "qemu/error-report.h" - #include "qemu/main-loop.h" --#include "qapi/qmp/qstring.h" -+#include "qemu/cutils.h" -+#include "qapi/qmp/qdict.h" - #include "sysemu/block-backend.h" - - static void help(void) -@@ -132,9 +133,39 @@ typedef struct RestoreMap { - char *devname; - char *path; - char *format; -+ uint64_t throttling_bps; -+ char *throttling_group; - bool write_zero; - } RestoreMap; - -+static bool try_parse_option(char **line, const char *optname, char **out, const char *inbuf) { -+ size_t optlen = strlen(optname); -+ if (strncmp(*line, optname, optlen) != 0 || (*line)[optlen] != '=') { -+ return false; -+ } -+ if (*out) { -+ g_error("read map failed - duplicate value for option '%s'", optname); -+ } -+ char *value = (*line) + optlen + 1; /* including a '=' */ -+ char *colon = strchr(value, ':'); -+ if (!colon) { -+ g_error("read map failed - option '%s' not terminated ('%s')", -+ optname, inbuf); -+ } -+ *line = colon+1; -+ *out = g_strndup(value, colon - value); -+ return true; -+} -+ -+static uint64_t verify_u64(const char *text) { -+ uint64_t value; -+ const char *endptr = NULL; -+ if (qemu_strtou64(text, &endptr, 0, &value) != 0 || !endptr || *endptr) { -+ g_error("read map failed - not a number: %s", text); -+ } -+ return value; -+} -+ - static int extract_content(int argc, char **argv) - { - int c, ret = 0; -@@ -208,6 +239,9 @@ static int extract_content(int argc, char **argv) - while (1) { - char inbuf[8192]; - char *line = fgets(inbuf, sizeof(inbuf), map); -+ char *format = NULL; -+ char *bps = NULL; -+ char *group = NULL; - if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { - break; - } -@@ -219,15 +253,19 @@ static int 
extract_content(int argc, char **argv) - } - } - -- char *format = NULL; -- if (strncmp(line, "format=", sizeof("format=")-1) == 0) { -- format = line + sizeof("format=")-1; -- char *colon = strchr(format, ':'); -- if (!colon) { -- g_error("read map failed - found only a format ('%s')", inbuf); -+ while (1) { -+ if (!try_parse_option(&line, "format", &format, inbuf) && -+ !try_parse_option(&line, "throttling.bps", &bps, inbuf) && -+ !try_parse_option(&line, "throttling.group", &group, inbuf)) -+ { -+ break; - } -- format = g_strndup(format, colon - format); -- line = colon+1; -+ } -+ -+ uint64_t bps_value = 0; -+ if (bps) { -+ bps_value = verify_u64(bps); -+ g_free(bps); - } - - const char *path; -@@ -253,6 +291,8 @@ static int extract_content(int argc, char **argv) - map->devname = g_strdup(devname); - map->path = g_strdup(path); - map->format = format; -+ map->throttling_bps = bps_value; -+ map->throttling_group = group; - map->write_zero = write_zero; - - g_hash_table_insert(devmap, map->devname, map); -@@ -280,6 +320,8 @@ static int extract_content(int argc, char **argv) - } else if (di) { - char *devfn = NULL; - const char *format = NULL; -+ uint64_t throttling_bps = 0; -+ const char *throttling_group = NULL; - int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; - bool write_zero = true; - -@@ -291,6 +333,8 @@ static int extract_content(int argc, char **argv) - } - devfn = map->path; - format = map->format; -+ throttling_bps = map->throttling_bps; -+ throttling_group = map->throttling_group; - write_zero = map->write_zero; - } else { - devfn = g_strdup_printf("%s/tmp-disk-%s.raw", -@@ -315,7 +359,7 @@ static int extract_content(int argc, char **argv) - if (format) { - /* explicit format from commandline */ - options = qdict_new(); -- qdict_put(options, "driver", qstring_from_str(format)); -+ qdict_put_str(options, "driver", format); - } else if ((devlen > 4 && strcmp(devfn+devlen-4, ".raw") == 0) || - strncmp(devfn, "/dev/", 5) == 0) - { -@@ -324,15 +368,34 @@ static 
int extract_content(int argc, char **argv) - */ - /* explicit raw format */ - options = qdict_new(); -- qdict_put(options, "driver", qstring_from_str("raw")); -+ qdict_put_str(options, "driver", "raw"); - } - -- - if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { - g_error("can't open file %s - %s", devfn, - error_get_pretty(errp)); - } - -+ if (throttling_group) { -+ blk_io_limits_enable(blk, throttling_group); -+ } -+ -+ if (throttling_bps) { -+ if (!throttling_group) { -+ blk_io_limits_enable(blk, devfn); -+ } -+ -+ ThrottleConfig cfg; -+ throttle_config_init(&cfg); -+ cfg.buckets[THROTTLE_BPS_WRITE].avg = throttling_bps; -+ Error *err = NULL; -+ if (!throttle_is_valid(&cfg, &err)) { -+ error_report_err(err); -+ g_error("failed to apply throttling"); -+ } -+ blk_set_io_limits(blk, &cfg); -+ } -+ - if (vma_reader_register_bs(vmar, i, blk, write_zero, &errp) < 0) { - g_error("%s", error_get_pretty(errp)); - } -@@ -730,6 +793,7 @@ int main(int argc, char **argv) - } - - bdrv_init(); -+ module_call_init(MODULE_INIT_QOM); - - if (argc < 2) { - help(); --- -2.11.0 - diff --git a/debian/patches/pve/0027-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch b/debian/patches/pve/0027-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch new file mode 100644 index 0000000..abbfaf6 --- /dev/null +++ b/debian/patches/pve/0027-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch @@ -0,0 +1,33 @@ +From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 +From: Wolfgang Bumiller +Date: Mon, 4 Jul 2016 15:02:26 +0200 +Subject: [PATCH] PVE: [Config] Revert "target-i386: disable LINT0 after reset" + +This reverts commit b8eb5512fd8a115f164edbbe897cdf8884920ccb. 
+--- + hw/intc/apic_common.c | 9 +++++++++ + 1 file changed, 9 insertions(+) + +diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c +index 78903ea909..cdfbec5e47 100644 +--- a/hw/intc/apic_common.c ++++ b/hw/intc/apic_common.c +@@ -257,6 +257,15 @@ static void apic_reset_common(DeviceState *dev) + info->vapic_base_update(s); + + apic_init_reset(dev); ++ ++ if (bsp) { ++ /* ++ * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization ++ * time typically by BIOS, so PIC interrupt can be delivered to the ++ * processor when local APIC is enabled. ++ */ ++ s->lvt[APIC_LVT_LINT0] = 0x700; ++ } + } + + /* This function is only used for old state version 1 and 2 */ +-- +2.11.0 + diff --git a/debian/patches/pve/0027-PVE-vma-add-cache-option-to-device-map.patch b/debian/patches/pve/0027-PVE-vma-add-cache-option-to-device-map.patch deleted file mode 100644 index 15e0deb..0000000 --- a/debian/patches/pve/0027-PVE-vma-add-cache-option-to-device-map.patch +++ /dev/null @@ -1,95 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Thu, 22 Mar 2018 15:32:04 +0100 -Subject: [PATCH] PVE: vma: add cache option to device map - -Signed-off-by: Wolfgang Bumiller ---- - vma.c | 16 +++++++++++++++- - 1 file changed, 15 insertions(+), 1 deletion(-) - -diff --git a/vma.c b/vma.c -index f9f5c308fe..476b7bee00 100644 ---- a/vma.c -+++ b/vma.c -@@ -135,6 +135,7 @@ typedef struct RestoreMap { - char *format; - uint64_t throttling_bps; - char *throttling_group; -+ char *cache; - bool write_zero; - } RestoreMap; - -@@ -242,6 +243,7 @@ static int extract_content(int argc, char **argv) - char *format = NULL; - char *bps = NULL; - char *group = NULL; -+ char *cache = NULL; - if (!line || line[0] == '\0' || !strcmp(line, "done\n")) { - break; - } -@@ -256,7 +258,8 @@ static int extract_content(int argc, char **argv) - while (1) { - if (!try_parse_option(&line, "format", &format, inbuf) && - !try_parse_option(&line, 
"throttling.bps", &bps, inbuf) && -- !try_parse_option(&line, "throttling.group", &group, inbuf)) -+ !try_parse_option(&line, "throttling.group", &group, inbuf) && -+ !try_parse_option(&line, "cache", &cache, inbuf)) - { - break; - } -@@ -293,6 +296,7 @@ static int extract_content(int argc, char **argv) - map->format = format; - map->throttling_bps = bps_value; - map->throttling_group = group; -+ map->cache = cache; - map->write_zero = write_zero; - - g_hash_table_insert(devmap, map->devname, map); -@@ -322,6 +326,7 @@ static int extract_content(int argc, char **argv) - const char *format = NULL; - uint64_t throttling_bps = 0; - const char *throttling_group = NULL; -+ const char *cache = NULL; - int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; - bool write_zero = true; - -@@ -335,6 +340,7 @@ static int extract_content(int argc, char **argv) - format = map->format; - throttling_bps = map->throttling_bps; - throttling_group = map->throttling_group; -+ cache = map->cache; - write_zero = map->write_zero; - } else { - devfn = g_strdup_printf("%s/tmp-disk-%s.raw", -@@ -356,6 +362,7 @@ static int extract_content(int argc, char **argv) - - size_t devlen = strlen(devfn); - QDict *options = NULL; -+ bool writethrough; - if (format) { - /* explicit format from commandline */ - options = qdict_new(); -@@ -370,12 +377,19 @@ static int extract_content(int argc, char **argv) - options = qdict_new(); - qdict_put_str(options, "driver", "raw"); - } -+ if (cache && bdrv_parse_cache_mode(cache, &flags, &writethrough)) { -+ g_error("invalid cache option: %s\n", cache); -+ } - - if (errp || !(blk = blk_new_open(devfn, NULL, options, flags, &errp))) { - g_error("can't open file %s - %s", devfn, - error_get_pretty(errp)); - } - -+ if (cache) { -+ blk_set_enable_write_cache(blk, !writethrough); -+ } -+ - if (throttling_group) { - blk_io_limits_enable(blk, throttling_group); - } --- -2.11.0 - diff --git a/debian/patches/pve/0028-PVE-vma-remove-forced-NO_FLUSH-option.patch 
b/debian/patches/pve/0028-PVE-vma-remove-forced-NO_FLUSH-option.patch deleted file mode 100644 index 6b814b6..0000000 --- a/debian/patches/pve/0028-PVE-vma-remove-forced-NO_FLUSH-option.patch +++ /dev/null @@ -1,30 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Tue, 27 Mar 2018 10:49:03 +0200 -Subject: [PATCH] PVE: vma: remove forced NO_FLUSH option - -This one's rbd specific and in no way a sane choice for all -types storages. Instead, we want to honor the cache option -passed along. - -Signed-off-by: Wolfgang Bumiller ---- - vma.c | 2 +- - 1 file changed, 1 insertion(+), 1 deletion(-) - -diff --git a/vma.c b/vma.c -index 476b7bee00..3289fd722f 100644 ---- a/vma.c -+++ b/vma.c -@@ -327,7 +327,7 @@ static int extract_content(int argc, char **argv) - uint64_t throttling_bps = 0; - const char *throttling_group = NULL; - const char *cache = NULL; -- int flags = BDRV_O_RDWR | BDRV_O_NO_FLUSH; -+ int flags = BDRV_O_RDWR; - bool write_zero = true; - - if (readmap) { --- -2.11.0 - diff --git a/debian/patches/pve/0029-PVE-Add-dummy-id-command-line-parameter.patch b/debian/patches/pve/0029-PVE-Add-dummy-id-command-line-parameter.patch deleted file mode 100644 index 3ae539d..0000000 --- a/debian/patches/pve/0029-PVE-Add-dummy-id-command-line-parameter.patch +++ /dev/null @@ -1,57 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Thu, 30 Aug 2018 14:52:56 +0200 -Subject: [PATCH] PVE: Add dummy -id command line parameter - -This used to be part of the qemu-side PVE authentication for -VNC. Now this does nothing. 
- -Signed-off-by: Wolfgang Bumiller ---- - qemu-options.hx | 3 +++ - vl.c | 8 ++++++++ - 2 files changed, 11 insertions(+) - -diff --git a/qemu-options.hx b/qemu-options.hx -index 31329e26e2..15df7e4fab 100644 ---- a/qemu-options.hx -+++ b/qemu-options.hx -@@ -591,6 +591,9 @@ STEXI - @table @option - ETEXI - -+DEF("id", HAS_ARG, QEMU_OPTION_id, -+ "-id n set the VMID", QEMU_ARCH_ALL) -+ - DEF("fda", HAS_ARG, QEMU_OPTION_fda, - "-fda/-fdb file use 'file' as floppy disk 0/1 image\n", QEMU_ARCH_ALL) - DEF("fdb", HAS_ARG, QEMU_OPTION_fdb, "", QEMU_ARCH_ALL) -diff --git a/vl.c b/vl.c -index b2e3e23724..a03e4c2867 100644 ---- a/vl.c -+++ b/vl.c -@@ -2915,6 +2915,7 @@ static void register_global_properties(MachineState *ms) - int main(int argc, char **argv, char **envp) - { - int i; -+ long vm_id; - int snapshot, linux_boot; - const char *initrd_filename; - const char *kernel_filename, *kernel_cmdline; -@@ -3659,6 +3660,13 @@ int main(int argc, char **argv, char **envp) - exit(1); - } - break; -+ case QEMU_OPTION_id: -+ vm_id = strtol(optarg, (char **)&optarg, 10); -+ if (*optarg != 0 || vm_id < 100 || vm_id > INT_MAX) { -+ error_report("invalid -id argument %s", optarg); -+ exit(1); -+ } -+ break; - case QEMU_OPTION_vnc: - vnc_parse(optarg, &error_fatal); - break; --- -2.11.0 - diff --git a/debian/patches/pve/0030-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch b/debian/patches/pve/0030-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch deleted file mode 100644 index abbfaf6..0000000 --- a/debian/patches/pve/0030-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch +++ /dev/null @@ -1,33 +0,0 @@ -From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001 -From: Wolfgang Bumiller -Date: Mon, 4 Jul 2016 15:02:26 +0200 -Subject: [PATCH] PVE: [Config] Revert "target-i386: disable LINT0 after reset" - -This reverts commit b8eb5512fd8a115f164edbbe897cdf8884920ccb. 
---- - hw/intc/apic_common.c | 9 +++++++++ - 1 file changed, 9 insertions(+) - -diff --git a/hw/intc/apic_common.c b/hw/intc/apic_common.c -index 78903ea909..cdfbec5e47 100644 ---- a/hw/intc/apic_common.c -+++ b/hw/intc/apic_common.c -@@ -257,6 +257,15 @@ static void apic_reset_common(DeviceState *dev) - info->vapic_base_update(s); - - apic_init_reset(dev); -+ -+ if (bsp) { -+ /* -+ * LINT0 delivery mode on CPU #0 is set to ExtInt at initialization -+ * time typically by BIOS, so PIC interrupt can be delivered to the -+ * processor when local APIC is enabled. -+ */ -+ s->lvt[APIC_LVT_LINT0] = 0x700; -+ } - } - - /* This function is only used for old state version 1 and 2 */ --- -2.11.0 - diff --git a/debian/patches/series b/debian/patches/series index c41d7da..abfe175 100644 --- a/debian/patches/series +++ b/debian/patches/series @@ -16,20 +16,14 @@ pve/0015-PVE-virtio-balloon-improve-query-balloon.patch pve/0016-PVE-qapi-modify-query-machines.patch pve/0017-PVE-qapi-modify-spice-query.patch pve/0018-PVE-internal-snapshot-async.patch -pve/0019-PVE-convert-savevm-async-to-threads.patch -pve/0020-PVE-block-snapshot-qmp_snapshot_drive-add-aiocontext.patch -pve/0021-PVE-block-snapshot-qmp_delete_drive_snapshot-add-aio.patch -pve/0022-PVE-block-add-the-zeroinit-block-driver-filter.patch -pve/0023-PVE-backup-modify-job-api.patch -pve/0024-PVE-backup-introduce-vma-archive-format.patch -pve/0025-PVE-Deprecated-adding-old-vma-files.patch -pve/0026-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch -pve/0027-PVE-vma-add-cache-option-to-device-map.patch -pve/0028-PVE-vma-remove-forced-NO_FLUSH-option.patch -pve/0029-PVE-Add-dummy-id-command-line-parameter.patch -pve/0030-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch -extra/0001-seccomp-use-SIGSYS-signal-instead-of-killing-the-thr.patch -extra/0002-seccomp-prefer-SCMP_ACT_KILL_PROCESS-if-available.patch -extra/0003-configure-require-libseccomp-2.2.0.patch 
-extra/0004-seccomp-set-the-seccomp-filter-to-all-threads.patch -extra/0005-monitor-create-iothread-after-daemonizing.patch +pve/0019-PVE-block-add-the-zeroinit-block-driver-filter.patch +pve/0020-PVE-backup-modify-job-api.patch +pve/0021-PVE-backup-introduce-vma-archive-format.patch +pve/0022-PVE-Deprecated-adding-old-vma-files.patch +pve/0023-PVE-vma-add-throttling-options-to-drive-mapping-fifo.patch +pve/0024-PVE-vma-add-cache-option-to-device-map.patch +pve/0025-PVE-vma-remove-forced-NO_FLUSH-option.patch +pve/0026-PVE-Add-dummy-id-command-line-parameter.patch +pve/0027-PVE-Config-Revert-target-i386-disable-LINT0-after-re.patch +extra/0001-monitor-guard-iothread-access-by-mon-use_io_thread.patch +extra/0002-monitor-delay-monitor-iothread-creation.patch diff --git a/qemu b/qemu index 3844175..1dfcf65 160000 --- a/qemu +++ b/qemu @@ -1 +1 @@ -Subproject commit 38441756b70eec5807b5f60dad11a93a91199866 +Subproject commit 1dfcf652e6ae5eb6b98d2c55a509e8eb054a2fab