]> git.proxmox.com Git - proxmox-backup.git/blob - src/backup.rs
src/backup/crypt_setup.rs: allow compressed and uncompressed chunks
[proxmox-backup.git] / src / backup.rs
1 //! This module implements the proxmox backup chunked data storage
2 //!
3 //! A chunk is simply defined as binary blob. We store them inside a
4 //! `ChunkStore`, addressed by the SHA256 digest of the binary
5 //! blob. This technology is also known as content-addressable
6 //! storage.
7 //!
8 //! We store larger files by splitting them into chunks. The resulting
9 //! SHA256 digest list is stored as separate index file. The
10 //! `DynamicIndex*` format is able to deal with dynamic chunk sizes,
11 //! whereas the `FixedIndex*` format is an optimization to store a
12 //! list of equal sized chunks.
13 //!
14 //! # ChunkStore Locking
15 //!
16 //! We need to be able to restart the proxmox-backup service daemons,
17 //! so that we can update the software without rebooting the host. But
18 //! such restarts must not abort running backup jobs, so we need to
19 //! keep the old service running until those jobs are finished. This
20 //! implies that we need some kind of locking for the
21 //! ChunkStore. Please note that it is perfectly valid to have
22 //! multiple parallel ChunkStore writers, even when they write the
23 //! same chunk (because the chunk would have the same name and the
24 //! same data). The only real problem is garbage collection, because
25 //! we need to avoid deleting chunks which are still referenced.
26 //!
27 //! * Read Index Files:
28 //!
29 //! Acquire shared lock for .idx files.
30 //!
31 //!
32 //! * Delete Index Files:
33 //!
34 //! Acquire exclusive lock for .idx files. This makes sure that we do
35 //! not delete index files while they are still in use.
36 //!
37 //!
38 //! * Create Index Files:
39 //!
40 //! Acquire shared lock for ChunkStore (process wide).
41 //!
42 //! Note: When creating .idx files, we create temporary (.tmp) file,
43 //! then do an atomic rename ...
44 //!
45 //!
46 //! * Garbage Collect:
47 //!
48 //! Acquire exclusive lock for ChunkStore (process wide). If we have
49 //! already an shared lock for ChunkStore, try to updraged that
50 //! lock.
51 //!
52 //!
53 //! * Server Restart
54 //!
55 //! Try to abort running garbage collection to release exclusive
56 //! ChunkStore lock asap. Start new service with existing listening
57 //! socket.
58 //!
59 //!
60 //! # Garbage Collection (GC)
61 //!
62 //! Deleting backups is as easy as deleting the corresponding .idx
63 //! files. Unfortunately, this does not free up any storage, because
64 //! those files just contains references to chunks.
65 //!
66 //! To free up some storage, we run a garbage collection process at
67 //! regular intervals. The collector uses an mark and sweep
68 //! approach. In the first phase, it scans all .idx files to mark used
69 //! chunks. The second phase then removes all unmarked chunks from the
70 //! store.
71 //!
72 //! The above locking mechanism makes sure that we are the only
73 //! process running GC. But we still want to be able to create backups
74 //! during GC, so there may be multiple backup threads/tasks
75 //! running. Either started before GC started, or started while GC is
76 //! running.
77 //!
78 //! ## `atime` based GC
79 //!
80 //! The idea here is to mark chunks by updating the `atime` (access
81 //! timestamp) on the chunk file. This is quite simple and does not
82 //! need additional RAM.
83 //!
84 //! One minor problem is that recent Linux versions use the `relatime`
85 //! mount flag by default for performance reasons (yes, we want
86 //! that). When enabled, `atime` data is written to the disk only if
87 //! the file has been modified since the `atime` data was last updated
88 //! (`mtime`), or if the file was last accessed more than a certain
89 //! amount of time ago (by default 24h). So we may only delete chunks
90 //! with `atime` older than 24 hours.
91 //!
92 //! Another problem arise from running backups. The mark phase does
93 //! not find any chunks from those backups, because there is no .idx
94 //! file for them (created after the backup). Chunks created or
95 //! touched by those backups may have an `atime` as old as the start
96 //! time of those backup. Please not that the backup start time may
97 //! predate the GC start time. Se we may only delete chunk older than
98 //! the start time of those running backup jobs.
99 //!
100 //!
101 //! ## Store `marks` in RAM using a HASH
102 //!
103 //! Not sure if this is better. TODO
104
105 #[macro_export]
106 macro_rules! PROXMOX_BACKUP_PROTOCOL_ID_V1 {
107 () => { "proxmox-backup-protocol-v1" }
108 }
109
110 // WARNING: PLEASE DO NOT MODIFY THOSE MAGIC VALUES
111
112 // openssl::sha::sha256(b"Proxmox Backup uncompressed chunk v1.0")[0..8]
113 pub static UNCOMPRESSED_CHUNK_MAGIC_1_0: [u8; 8] = [79, 127, 200, 4, 121, 74, 135, 239];
114
115 // openssl::sha::sha256(b"Proxmox Backup encrypted chunk v1.0")[0..8]
116 pub static ENCRYPTED_CHUNK_MAGIC_1_0: [u8; 8] = [8, 54, 114, 153, 70, 156, 26, 151];
117
118 // openssl::sha::sha256(b"Proxmox Backup zstd compressed chunk v1.0")[0..8]
119 pub static COMPRESSED_CHUNK_MAGIC_1_0: [u8; 8] = [191, 237, 46, 195, 108, 17, 228, 235];
120
121 // openssl::sha::sha256(b"Proxmox Backup zstd compressed encrypted chunk v1.0")[0..8]
122 pub static ENCR_COMPR_CHUNK_MAGIC_1_0: [u8; 8] = [9, 40, 53, 200, 37, 150, 90, 196];
123
124 mod crypt_setup;
125 pub use crypt_setup::*;
126
127 mod chunk_stream;
128 pub use chunk_stream::*;
129
130 mod chunk_stat;
131 pub use chunk_stat::*;
132
133 pub use proxmox_protocol::Chunker;
134
135 mod chunk_store;
136 pub use chunk_store::*;
137
138 mod index;
139 pub use index::*;
140
141 mod fixed_index;
142 pub use fixed_index::*;
143
144 mod dynamic_index;
145 pub use dynamic_index::*;
146
147 mod backup_info;
148 pub use backup_info::*;
149
150 mod datastore;
151 pub use datastore::*;