Efficient VM backup for qemu

=Requirements=

* Backup to a single archive file
* Backup needs to contain all data to restore VM (full backup)
* Do not depend on storage type or image format
* Avoid use of temporary storage
* Store sparse images efficiently

=Introduction=

Most VM backup solutions use some kind of snapshot to get a consistent
VM view at a specific point in time. For example, we previously used
LVM to create a snapshot of all used VM images, which are then copied
into a tar file.

That basically means that any data written during backup involves
considerable overhead. For LVM we get the following steps:

1.) read original data (VM write)
2.) write original data into snapshot (VM write)
3.) write new data (VM write)
4.) read data from snapshot (backup)
5.) write data from snapshot into tar file (backup)

Another approach to back up VM images is to create a new qcow2 image
which uses the old image as base. During backup, writes are redirected
to the new image, so the old image represents a 'snapshot'. After
backup, data needs to be copied back from the new image into the old
one (commit). So a simple write during backup triggers the following
steps:

1.) write new data to new image (VM write)
2.) read data from old image (backup)
3.) write data from old image into tar file (backup)

4.) read data from new image (commit)
5.) write data to old image (commit)

This is in fact the same overhead as before. Other tools like qemu
livebackup produce similar overhead (2 reads, 3 writes).

Some storage types/formats support internal snapshots using some kind
of reference counting (rados, sheepdog, dm-thin, qcow2). It would be possible
to use that for backups, but for now we want to be storage-independent.

=Make it more efficient=

To be more efficient, we simply need to avoid unnecessary steps. The
following steps are always required:

1.) read old data before it gets overwritten
2.) write that data into the backup archive
3.) write new data (VM write)

As you can see, this involves only one read, and two writes.

To make that work, our backup archive needs to be able to store image
data 'out of order'. It is important to note that this will not work
with traditional archive formats like tar.

During backup we simply intercept writes, then read existing data and
store that directly into the archive. After that we can continue the
write.
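
The write interception described above can be sketched in a few lines
of Python. This is only an illustration on an in-memory 'disk', not
the qemu implementation. The names BackupJob, guest_write, copy_next
and the tiny CLUSTER size are assumptions made for this example.

```python
CLUSTER = 4  # hypothetical, tiny cluster size to keep the example readable


class BackupJob:
    """Copy-before-write backup of one in-memory 'disk' (illustrative only)."""

    def __init__(self, disk: bytearray, archive: list):
        self.disk = disk
        self.archive = archive  # (offset, data) records, in arrival order
        self.saved = set()      # cluster indexes already stored in the archive

    def _save_cluster(self, index: int) -> None:
        if index in self.saved:
            return
        start = index * CLUSTER
        old = bytes(self.disk[start:start + CLUSTER])  # read old data
        self.archive.append((start, old))              # write it to the archive
        self.saved.add(index)

    def guest_write(self, offset: int, data: bytes) -> None:
        """Intercept a VM write: save affected clusters first, then write."""
        first = offset // CLUSTER
        last = (offset + len(data) - 1) // CLUSTER
        for index in range(first, last + 1):
            self._save_cluster(index)
        self.disk[offset:offset + len(data)] = data    # write new data

    def copy_next(self) -> bool:
        """Background pass: save one cluster that is not in the archive yet."""
        total = (len(self.disk) + CLUSTER - 1) // CLUSTER
        for index in range(total):
            if index not in self.saved:
                self._save_cluster(index)
                return True
        return False


def demo():
    disk = bytearray(b"AAAABBBBCCCCDDDD")
    archive = []
    job = BackupJob(disk, archive)
    job.copy_next()            # background pass saves cluster 0
    job.guest_write(9, b"xx")  # VM write hits cluster 2, which is saved first
    while job.copy_next():     # background pass saves the remaining clusters
        pass
    return disk, archive
```

In demo() the archive receives the clusters at offsets 0, 8, 4 and 12,
in that order. The records are 'out of order', but together they still
describe the image as it was when the backup started.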

==Advantages==

* very good performance (1 read, 2 writes)
* works on any storage type and image format
* avoids use of temporary storage
* we can define a new and simple archive format, which is able to
  store sparse files efficiently.

Note: Storing sparse files is a mess with existing archive
formats. For example, tar requires information about holes at the
beginning of the archive.
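
To illustrate why per-record offsets help with sparse images, here is
a minimal sketch. It assumes that each archive record carries its own
image offset, so all-zero clusters can simply be skipped and no list
of holes is needed up front. The function name nonzero_clusters and
the CLUSTER size are hypothetical.

```python
CLUSTER = 4096  # hypothetical cluster size


def nonzero_clusters(image: bytes):
    """Yield (offset, data) only for clusters that contain non-zero bytes."""
    zero = bytes(CLUSTER)
    for offset in range(0, len(image), CLUSTER):
        chunk = image[offset:offset + CLUSTER]
        # Compare against zeros of the same length (the last chunk may be short).
        if chunk != zero[:len(chunk)]:
            yield offset, chunk
```

For an image of four clusters where only the second one contains
data, this yields a single record at offset 4096.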

==Disadvantages==

* we need to define a new archive format

Note: Most existing archive formats are optimized to store small files
including file attributes. We simply do not need that for VM archives.

* archive contains data 'out of order'

If you want to access image data in sequential order, you need to
re-order archive data. It would be possible to do that on the fly,
using temporary files.

Fortunately, a normal restore/extract works perfectly with 'out of
order' data, because the target files are seekable.
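
A minimal sketch of such a restore, assuming the archive has already
been parsed into (offset, data) records. The record shape and the
names restore and demo are assumptions for this example, not part of
any existing tool.

```python
import os
import tempfile


def restore(records, path, size):
    """Write (offset, data) records to `path` in whatever order they arrive."""
    with open(path, "wb") as target:
        # Set the final size first; regions that are never written
        # typically read back as zeros.
        target.truncate(size)
        for offset, data in records:
            target.seek(offset)  # the target is seekable, so order is irrelevant
            target.write(data)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "disk.raw")
    restore([(8, b"CCCC"), (0, b"AAAA")], path, 16)
    with open(path, "rb") as restored:
        return restored.read()
```

The same loop does not work when the target is a pipe or another
non-seekable stream, which is the case where re-ordering is needed.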

* slow backup storage can slow down VM during backup

It is important to note that we only do sequential writes to the
backup storage. Furthermore one can compress the backup stream. IMHO,
it is better to slow down the VM a bit. All other solutions create
large amounts of temporary data during backup.

=Archive format requirements=

The basic requirement for such a new format is that we can store image
data 'out of order'. It is also very likely that we have fewer than 256
drives/images per VM, and we want to be able to store VM configuration
files.
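
To make these requirements concrete, here is a hypothetical record
layout. It is an illustration only and not the actual layout of the
format we defined. The field sizes (1-byte device id, 8-byte offset,
4-byte length) and the names HEADER, pack_record and iter_records are
assumptions. With fewer than 256 drives per VM, the device id fits
into a single byte.

```python
import struct

# Hypothetical record header, big-endian and unpadded:
#   B = device id (0..255), Q = offset inside the image, I = payload length
HEADER = struct.Struct(">BQI")


def pack_record(dev_id: int, offset: int, data: bytes) -> bytes:
    """Serialize one chunk of image data together with its position."""
    if not 0 <= dev_id <= 255:
        raise ValueError("device id must fit in one byte")
    return HEADER.pack(dev_id, offset, len(data)) + data


def iter_records(blob: bytes):
    """Walk a byte string of concatenated records in stored order."""
    pos = 0
    while pos < len(blob):
        dev_id, offset, length = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        yield dev_id, offset, blob[pos:pos + length]
        pos += length
```

Because every record names its device and offset, records for
different drives can be interleaved and written in any order.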

We have defined a very simple format with those properties, see:

https://git.proxmox.com/?p=pve-qemu.git;a=blob;f=vma_spec.txt;

Please let us know if you know an existing format which provides the
same functionality.