debian/patches/old/0001-add-documenation-for-new-backup-framework.patch

   1 From 2f0dcd89a0de8b656d33ce6997c09879bd287af7 Mon Sep 17 00:00:00 2001
   2 From: Dietmar Maurer <dietmar@proxmox.com>
   3 Date: Tue, 13 Nov 2012 09:24:50 +0100
   4 Subject: [PATCH v5 1/6] add documenation for new backup framework
   5
   6
   7 Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
   8 ---
   9  docs/backup.txt |  116 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
  10  1 files changed, 116 insertions(+), 0 deletions(-)
  11  create mode 100644 docs/backup.txt
  12
  13 diff --git a/docs/backup.txt b/docs/backup.txt
  14 new file mode 100644
  15 index 0000000..927d787
  16 --- /dev/null
  17 +++ b/docs/backup.txt
  18 @@ -0,0 +1,116 @@
  19 +Efficient VM backup for qemu
  20 +
  21 +=Requirements=
  22 +
  23 +* Backup to a single archive file
  24 +* Backup needs to contain all data to restore VM (full backup)
  25 +* Do not depend on storage type or image format
  26 +* Avoid use of temporary storage
  27 +* Store sparse images efficiently
  28 +
  29 +=Introduction=
  30 +
  31 +Most VM backup solutions use some kind of snapshot to get a consistent
  32 +VM view at a specific point in time. For example, we previously used
  33 +LVM to create a snapshot of all used VM images, which are then copied
  34 +into a tar file.
  35 +
  36 +That means every write issued by the VM during backup involves
  37 +considerable overhead. For LVM, each such write triggers these steps:
  38 +
  39 +1.) read original data (VM write)
  40 +2.) write original data into snapshot (VM write)
  41 +3.) write new data (VM write)
  42 +4.) read data from snapshot (backup)
  43 +5.) write data from snapshot into tar file (backup)
  44 +
  45 +Another approach to back up VM images is to create a new qcow2 image
  46 +which uses the old image as base. During backup, writes are redirected
  47 +to the new image, so the old image represents a 'snapshot'. After
  48 +backup, data need to be copied back from new image into the old
  49 +one (commit). So a simple write during backup triggers the following
  50 +steps:
  51 +
  52 +1.) write new data to new image (VM write)
  53 +2.) read data from old image (backup)
  54 +3.) write data from old image into tar file (backup)
  55 +
  56 +4.) read data from new image (commit)
  57 +5.) write data to old image (commit)
  58 +
  59 +This is in fact the same overhead as before. Other tools like qemu
  60 +livebackup produce similar overhead (2 reads, 3 writes).
  61 +
  62 +Some storage types/formats support internal snapshots using some kind
  63 +of reference counting (rados, sheepdog, dm-thin, qcow2). It would be possible
  64 +to use that for backups, but for now we want to be storage-independent.
  65 +
  66 +=Make it more efficient=
  67 +
  68 +To be more efficient, we simply need to avoid unnecessary steps. The
  69 +following steps are always required:
  70 +
  71 +1.) read old data before it gets overwritten
  72 +2.) write that data into the backup archive
  73 +3.) write new data (VM write)
  74 +
  75 +As you can see, this involves only one read, and two writes.
  76 +
  77 +To make that work, our backup archive needs to be able to store image
  78 +data 'out of order'. It is important to note that this will not work
  79 +with traditional archive formats like tar.
  80 +
  81 +During backup we simply intercept writes, then read existing data and
  82 +store that directly into the archive. After that we can continue the
  83 +write.
  84 +
  85 +==Advantages==
  86 +
  87 +* very good performance (1 read, 2 writes)
  88 +* works on any storage type and image format
  89 +* avoids use of temporary storage
  90 +* we can define a new and simple archive format, which is able to
  91 +  store sparse files efficiently.
  92 +
  93 +Note: Storing sparse files is a mess with existing archive
  94 +formats. For example, tar requires information about holes at the
  95 +beginning of the archive.
  96 +
  97 +==Disadvantages==
  98 +
  99 +* we need to define a new archive format
 100 +
 101 +Note: Most existing archive formats are optimized to store small files
 102 +including file attributes. We simply do not need that for VM archives.
 103 +
 104 +* archive contains data 'out of order'
 105 +
 106 +If you want to access image data in sequential order, you need to
 107 +re-order archive data. It would be possible to do that on the fly,
 108 +using temporary files.
 109 +
 110 +Fortunately, a normal restore/extract works perfectly with 'out of
 111 +order' data, because the target files are seekable.
 112 +
 113 +* slow backup storage can slow down VM during backup
 114 +
 115 +It is important to note that we only do sequential writes to the
 116 +backup storage. Furthermore one can compress the backup stream. IMHO,
 117 +it is better to slow down the VM a bit. All other solutions create
 118 +large amounts of temporary data during backup.
 119 +
 120 +=Archive format requirements=
 121 +
 122 +The basic requirement for such a new format is that we can store image
 123 +data 'out of order'. It is also very likely that we have less than 256
 124 +drives/images per VM, and we want to be able to store VM configuration
 125 +files.
 126 +
 127 +We have defined a very simple format with those properties, see:
 128 +
 129 +docs/specs/vma_spec.txt
 130 +
 131 +Please let us know if you know an existing format which provides the
 132 +same functionality.
 133 +
 134 +
 135 --
 136 1.7.2.5
 137
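Reviewer note: the scheme described under "Make it more efficient" (intercept a write, read the old data, store it in the archive, then continue the write) can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the names `BackupJob`, `vm_write`, `copy_next` and `restore`, the tiny cluster size, and the record layout (a one-byte device id, which fits the "less than 256 drives" remark, plus a cluster index). It is not the format defined in docs/specs/vma_spec.txt and not the actual qemu implementation.

```python
import io
import struct

CLUSTER = 4                      # tiny cluster size, chosen only for the demo
HEADER = struct.Struct("<BI")    # hypothetical record header: device id, cluster index


class BackupJob:
    """Copy-before-write backup of one in-memory image (illustrative only)."""

    def __init__(self, image, archive, dev_id=0):
        # Assumes len(image) is a multiple of CLUSTER.
        self.image = image
        self.archive = archive
        self.dev_id = dev_id
        self.saved = set()                      # clusters already handled
        self.n_clusters = len(image) // CLUSTER

    def _save_cluster(self, idx):
        if idx in self.saved:
            return
        self.saved.add(idx)
        data = bytes(self.image[idx * CLUSTER:(idx + 1) * CLUSTER])  # 1 read
        if not any(data):
            return                              # all-zero cluster: store nothing (sparse)
        # Sequential append to the archive, whatever the cluster index is.
        self.archive.write(HEADER.pack(self.dev_id, idx) + data)

    def vm_write(self, offset, data):
        """Intercepted guest write: save old clusters first, then write."""
        first = offset // CLUSTER
        last = (offset + len(data) - 1) // CLUSTER
        for idx in range(first, last + 1):
            self._save_cluster(idx)
        self.image[offset:offset + len(data)] = data

    def copy_next(self):
        """Background copy of one unsaved cluster; returns False when done."""
        for idx in range(self.n_clusters):
            if idx not in self.saved:
                self._save_cluster(idx)
                return True
        return False


def restore(archive_bytes, size):
    """Rebuild the image; record order does not matter because we seek."""
    image = bytearray(size)                     # holes stay zero-filled
    stream = io.BytesIO(archive_bytes)
    while True:
        head = stream.read(HEADER.size)
        if not head:
            break
        _dev_id, idx = HEADER.unpack(head)
        image[idx * CLUSTER:(idx + 1) * CLUSTER] = stream.read(CLUSTER)
    return image


original = bytearray(b"AAAABBBB\0\0\0\0DDDD")
snapshot = bytes(original)                      # state at backup start
archive = io.BytesIO()
job = BackupJob(original, archive)

job.copy_next()                                 # background copy saves cluster 0
job.vm_write(12, b"XXXX")                       # guest write forces cluster 3 out of order
while job.copy_next():
    pass                                        # finish the remaining clusters

restored = restore(archive.getvalue(), len(snapshot))
```

In this sketch the archive holds clusters 0, 3 and 1 in that order, and the all-zero cluster 2 is never written. The restored image still equals the state at backup start, because each record says where its data belongs and the target can be written at arbitrary offsets.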