]> git.proxmox.com Git - proxmox-backup.git/blame - docs/technical-overview.rst
run cargo fmt
[proxmox-backup.git] / docs / technical-overview.rst
CommitLineData
ec1ae7e6
TL
1.. _tech_design_overview:
2
1531185d
DC
3Technical Overview
4==================
5
1531185d
DC
6Datastores
7----------
8
3253d8a2 9A Datastore is the logical place where :ref:`Backup Snapshots
a98e2287 10<term_backup_snapshot>` and their chunks are stored. Snapshots consist of a
7ccbce03 11manifest, blobs, and dynamic- and fixed-indexes (see :ref:`terms`), and are
3253d8a2 12stored in the following directory structure:
1531185d
DC
13
14 <datastore-root>/<type>/<id>/<time>/
15
16The deduplication of datastores is based on reusing chunks, which are
17referenced by the indexes in a backup snapshot. This means that multiple
3253d8a2
TL
18indexes can reference the same chunks, reducing the amount of space needed to
19contain the data (even across backup snapshots).
1531185d 20
d4827535
DC
21Snapshots
22---------
23
24A Snapshot is the collection of manifest, blobs and indexes that represent
25a backup. When a client creates a snapshot, it can upload blobs (single files
26which are not chunked, e.g. the client log), or one or more indexes
27(fixed or dynamic).
28
29When uploading an index, the client first has to read the source data, chunk it
30and send the data as chunks with their identifying checksum to the server.
c51f0d5e
CE
31When using the :ref:`change detection mode <change_detection_mode>` payload
32chunks for unchanged files are reused from the previous snapshot, thereby not
33reading the source data again.
d4827535
DC
34
35If there is a previous Snapshot in the backup group, the client can first
36download the chunk list of the previous Snapshot. If it detects a chunk that
37already exists on the server, it can send only the checksum instead of data
38and checksum. This way the actual upload of Snapshots is incremental while
39each Snapshot references all chunks and is thus a full backup.
40
41After uploading all data, the client has to signal to the server that the
42backup is finished. If that is not done before the connection closes, the
43server will remove the unfinished snapshot.
44
1531185d
DC
45Chunks
46------
47
3253d8a2
TL
48A chunk is some (possibly encrypted) data with a CRC-32 checksum at the end and
49a type marker at the beginning. It is identified by the SHA-256 checksum of its
50content.
1531185d
DC
51
52To generate such chunks, backup data is split either into fixed-size or
3253d8a2 53dynamically sized chunks. The same content will be hashed to the same checksum.
1531185d
DC
54
55The chunks of a datastore are found in
56
57 <datastore-root>/.chunks/
58
7ccbce03
DW
59This chunk directory is further subdivided by the first four bytes of the
60chunk's checksum, so a chunk with the checksum
1531185d
DC
61
62 a342e8151cbf439ce65f3df696b54c67a114982cc0aa751f2852c2f7acc19a8b
63
64lives in
65
66 <datastore-root>/.chunks/a342/
67
3253d8a2
TL
68This is done to reduce the number of files per directory, as having many files
69per directory can be bad for file system performance.
1531185d 70
3253d8a2
TL
71These chunk directories ('0000'-'ffff') will be preallocated when a datastore
72is created.
1531185d 73
7ccbce03 74Fixed-Sized Chunks
1531185d
DC
75^^^^^^^^^^^^^^^^^^
76
77For block based backups (like VMs), fixed-sized chunks are used. The content
78(disk image), is split into chunks of the same length (typically 4 MiB).
79
3253d8a2
TL
80This works very well for VM images, since the file system on the guest most
81often tries to allocate files in contiguous pieces, so new files get new
82blocks, and changing existing files changes only their own blocks.
1531185d 83
3253d8a2 84As an optimization, VMs in `Proxmox VE`_ can make use of 'dirty bitmaps', which
7ccbce03 85can track the changed blocks of an image. Since these bitmaps are also a
efc09f63 86representation of the image split into chunks, there is a direct relation
7ccbce03
DW
87between the dirty blocks of the image and chunks which need to be uploaded.
88Thus, only modified chunks of the disk need to be uploaded to a backup.
1531185d 89
efc09f63 90Since the image is always split into chunks of the same size, unchanged blocks
3253d8a2
TL
91will result in identical checksums for those chunks, so such chunks do not need
92to be backed up again. This way storage snapshots are not needed to find the
93changed blocks.
1531185d
DC
94
95For consistency, `Proxmox VE`_ uses a QEMU internal snapshot mechanism, that
96does not rely on storage snapshots either.
97
7ccbce03 98Dynamically Sized Chunks
1531185d
DC
99^^^^^^^^^^^^^^^^^^^^^^^^
100
7ccbce03
DW
101When working with file-based systems rather than block-based systems,
102using fixed-sized chunks is not a good idea, since every time a file
103would change in size, the remaining data would be shifted around,
104resulting in many chunks changing and the amount of deduplication being reduced.
1531185d
DC
105
106To improve this, `Proxmox Backup`_ Server uses dynamically sized chunks
3253d8a2
TL
107instead. Instead of splitting an image into fixed sizes, it first generates a
108consistent file archive (:ref:`pxar <pxar-format>`) and uses a rolling hash
1531185d
DC
109over this on-the-fly generated archive to calculate chunk boundaries.
110
7ccbce03 111We use a variant of Buzhash which is a cyclic polynomial algorithm. It works
3253d8a2 112by continuously calculating a checksum while iterating over the data, and on
7ccbce03 113certain conditions, it triggers a hash boundary.
1531185d 114
7ccbce03 115Assuming that most files on the system that is to be backed up have not
3253d8a2
TL
116changed, eventually the algorithm triggers the boundary on the same data as a
117previous backup, resulting in chunks that can be reused.
1531185d
DC
118
119Encrypted Chunks
120^^^^^^^^^^^^^^^^
121
3253d8a2
TL
122Encrypted chunks are a special case. Both fixed- and dynamically sized chunks
123can be encrypted, and they are handled in a slightly different manner than
124normal chunks.
1531185d
DC
125
126The hashes of encrypted chunks are calculated not with the actual (encrypted)
7ccbce03
DW
127chunk content, but with the plain-text content, concatenated with the encryption
128key. This way, two chunks with the same data but encrypted with different keys
3253d8a2
TL
129generate two different checksums and no collisions occur for multiple
130encryption keys.
1531185d 131
3253d8a2
TL
132This is done to speed up the client part of the backup, since it only needs to
133encrypt chunks that are actually getting uploaded. Chunks that exist already in
134the previous backup, do not need to be encrypted and uploaded.
1531185d
DC
135
136Caveats and Limitations
137-----------------------
138
7ccbce03 139Notes on Hash Collisions
1531185d
DC
140^^^^^^^^^^^^^^^^^^^^^^^^
141
3253d8a2
TL
142Every hashing algorithm has a chance to produce collisions, meaning two (or
143more) inputs generate the same checksum. For SHA-256, this chance is
7ccbce03
DW
144negligible. To calculate the chances of such a collision, one can use the ideas
145of the 'birthday problem' from probability theory. For big numbers, this is
146actually unfeasible to calculate with regular computers, but there is a good
3253d8a2 147approximation:
1531185d
DC
148
149.. math::
150
151 p(n, d) = 1 - e^{-n^2/(2d)}
152
efc09f63 153Where `n` is the number of tries, and `d` is the number of possibilities.
7ccbce03 154For a concrete example, lets assume a large datastore of 1 PiB and an average
efc09f63
TL
155chunk size of 4 MiB. That means :math:`n = 268435456` tries, and :math:`d =
1562^{256}` possibilities. Inserting those values in the formula from earlier you
157will see that the probability of a collision in that scenario is:
1531185d
DC
158
159.. math::
160
161 3.1115 * 10^{-61}
162
7ccbce03
DW
163For context, in a lottery game of guessing 6 numbers out of 45, the chance to
164correctly guess all 6 numbers is only :math:`1.2277 * 10^{-7}`. This means the
165chance of a collision is about the same as winning 13 such lottery games *in a
166row*.
1531185d 167
efc09f63
TL
168In conclusion, it is extremely unlikely that such a collision would occur by
169accident in a normal datastore.
1531185d 170
3253d8a2 171Additionally, SHA-256 is prone to length extension attacks, but since there is
7ccbce03 172an upper limit for how big the chunks are, this is not a problem, because a
3253d8a2
TL
173potential attacker cannot arbitrarily add content to the data beyond that
174limit.
1531185d 175
7ccbce03 176File-Based Backup
1531185d
DC
177^^^^^^^^^^^^^^^^^
178
179Since dynamically sized chunks (for file-based backups) are created on a custom
180archive format (pxar) and not over the files directly, there is no relation
7ccbce03 181between the files and chunks. This means that the Proxmox Backup Client has to
efc09f63 182read all files again for every backup, otherwise it would not be possible to
7ccbce03
DW
183generate a consistent, independent pxar archive where the original chunks can be
184reused. Note that in spite of this, only new or changed chunks will be uploaded.
1531185d 185
7ccbce03 186Verification of Encrypted Chunks
1531185d
DC
187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
188
3253d8a2 189For encrypted chunks, only the checksum of the original (plaintext) data is
7ccbce03 190available, making it impossible for the server (without the encryption key) to
3253d8a2 191verify its content against it. Instead only the CRC-32 checksum gets checked.
5a839306
HL
192
193Troubleshooting
194---------------
195
7ccbce03
DW
196Index files(*.fidx*, *.didx*) contain information about how to rebuild a file.
197More precisely, they contain an ordered list of references to the chunks that
198the original file was split into. If there is something wrong with a snapshot,
199it might be useful to find out which chunks are referenced in it, and check
8c83b75a 200whether they are present and intact. The ``proxmox-backup-debug`` command-line
7ccbce03
DW
201tool can be used to inspect such files and recover their contents. For example,
202to get a list of the referenced chunks of a *.fidx* index:
5a839306
HL
203
204.. code-block:: console
205
206 # proxmox-backup-debug inspect file drive-scsi0.img.fidx
207
7ccbce03
DW
208The same command can be used to inspect *.blob* files. Without the ``--decode``
209parameter, just the size and the encryption type, if any, are printed. If
210``--decode`` is set, the blob file is decoded into the specified file ('-' will
211decode it directly to stdout).
212
213The following example would print the decoded contents of
214`qemu-server.conf.blob`. If the file you're trying to inspect is encrypted, a
215path to the key file must be provided using ``--keyfile``.
5a839306
HL
216
217.. code-block:: console
218
219 # proxmox-backup-debug inspect file qemu-server.conf.blob --decode -
220
7ccbce03 221You can also check in which index files a specific chunk file is referenced
5a839306
HL
222with:
223
224.. code-block:: console
225
a2945884 226 # proxmox-backup-debug inspect chunk b531d3ffc9bd7c65748a61198c060678326a431db7eded874c327b7986e595e0 --reference-filter /path/in/a/datastore/directory
5a839306 227
7ccbce03 228Here ``--reference-filter`` specifies where index files should be searched. This
a2945884 229can be an arbitrary path. If, for some reason, the filename of the chunk was
7ccbce03
DW
230changed, you can explicitly specify the digest using ``--digest``. By default, the
231chunk filename is used as the digest to look for. If no ``--reference-filter``
232is specified, it will only print the CRC and encryption status of the chunk. You
233can also decode chunks, by setting the ``--decode`` flag. If the chunk is
234encrypted, a ``--keyfile`` must be provided, in order to decode it.
5a839306 235
7ccbce03
DW
236Restore without a Running Proxmox Backup Server
237^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5a839306 238
7ccbce03 239It's possible to restore specific files from a snapshot, without a running
34407477
AZ
240`Proxmox Backup`_ Server instance, using the ``recover`` subcommand, provided
241you have access to the intact index and chunk files. Note that you also need the
7ccbce03 242corresponding key file if the backup was encrypted.
5a839306
HL
243
244.. code-block:: console
245
a2945884 246 # proxmox-backup-debug recover index drive-scsi0.img.fidx /path/to/.chunks
5a839306 247
7ccbce03
DW
248In the above example, the `/path/to/.chunks` argument is the path to the
249directory that contains the chunks, and `drive-scsi0.img.fidx` is the index file
250of the file you'd like to restore. Both paths can be absolute or relative. With
251``--skip-crc``, it's possible to disable the CRC checks of the chunks. This
252will speed up the process slightly and allow for trying to restore (partially)
a2945884
TL
253corrupt chunks. It's recommended to always try without the skip-CRC option
254first.
5a839306 255