]> git.proxmox.com Git - proxmox-backup.git/blame - src/backup.rs
remove proxmox-protocol subcrate
[proxmox-backup.git] / src / backup.rs
CommitLineData
39a4df61 1//! This module implements the proxmox backup data storage
d78345bc 2//!
39a4df61
DM
3//! Proxmox backup splits large files into chunks, and stores them
4//! deduplicated using a content addressable storage format.
d78345bc 5//!
39a4df61
DM
6//! A chunk is simply defined as binary blob, which is stored inside a
7//! `ChunkStore`, addressed by the SHA256 digest of the binary blob.
8//!
9//! Index files are used to reconstruct the original file. They
10//! basically contain a list of SHA256 checksums. The `DynamicIndex*`
11//! format is able to deal with dynamic chunk sizes, whereas the
12//! `FixedIndex*` format is an optimization to store a list of equal
13//! sized chunks.
04652189
DM
14//!
15//! # ChunkStore Locking
16//!
17//! We need to be able to restart the proxmox-backup service daemons,
18//! so that we can update the software without rebooting the host. But
19//! such restarts must not abort running backup jobs, so we need to
20//! keep the old service running until those jobs are finished. This
c8ec450e 21//! implies that we need some kind of locking for the
04652189
DM
22//! ChunkStore. Please note that it is perfectly valid to have
23//! multiple parallel ChunkStore writers, even when they write the
24//! same chunk (because the chunk would have the same name and the
25//! same data). The only real problem is garbage collection, because
26//! we need to avoid deleting chunks which are still referenced.
27//!
28//! * Read Index Files:
29//!
30//! Acquire shared lock for .idx files.
31//!
32//!
33//! * Delete Index Files:
34//!
35//! Acquire exclusive lock for .idx files. This makes sure that we do
36//! not delete index files while they are still in use.
37//!
38//!
39//! * Create Index Files:
40//!
8a475734 41//! Acquire shared lock for ChunkStore (process wide).
04652189 42//!
c8ec450e
DM
43//! Note: When creating .idx files, we create temporary (.tmp) file,
44//! then do an atomic rename ...
04652189
DM
45//!
46//!
47//! * Garbage Collect:
48//!
8a475734
DM
49//! Acquire exclusive lock for ChunkStore (process wide). If we have
50//! already an shared lock for ChunkStore, try to updraged that
51//! lock.
04652189
DM
52//!
53//!
54//! * Server Restart
55//!
56//! Try to abort running garbage collection to release exclusive
57//! ChunkStore lock asap. Start new service with existing listening
58//! socket.
59//!
8a475734 60//!
c8ec450e 61//! # Garbage Collection (GC)
8a475734
DM
62//!
63//! Deleting backups is as easy as deleting the corresponding .idx
64//! files. Unfortunately, this does not free up any storage, because
65//! those files just contains references to chunks.
66//!
67//! To free up some storage, we run a garbage collection process at
68//! regular intervals. The collector uses an mark and sweep
c374f054
DM
69//! approach. In the first phase, it scans all .idx files to mark used
70//! chunks. The second phase then removes all unmarked chunks from the
8a475734
DM
71//! store.
72//!
73//! The above locking mechanism makes sure that we are the only
c8ec450e
DM
74//! process running GC. But we still want to be able to create backups
75//! during GC, so there may be multiple backup threads/tasks
76//! running. Either started before GC started, or started while GC is
77//! running.
8a475734 78//!
c8ec450e 79//! ## `atime` based GC
8a475734 80//!
c8ec450e
DM
81//! The idea here is to mark chunks by updating the `atime` (access
82//! timestamp) on the chunk file. This is quite simple and does not
c374f054 83//! need additional RAM.
c8ec450e
DM
84//!
85//! One minor problem is that recent Linux versions use the `relatime`
86//! mount flag by default for performance reasons (yes, we want
87//! that). When enabled, `atime` data is written to the disk only if
88//! the file has been modified since the `atime` data was last updated
89//! (`mtime`), or if the file was last accessed more than a certain
c374f054
DM
90//! amount of time ago (by default 24h). So we may only delete chunks
91//! with `atime` older than 24 hours.
92//!
93//! Another problem arise from running backups. The mark phase does
94//! not find any chunks from those backups, because there is no .idx
95//! file for them (created after the backup). Chunks created or
96//! touched by those backups may have an `atime` as old as the start
97//! time of those backup. Please not that the backup start time may
98//! predate the GC start time. Se we may only delete chunk older than
99//! the start time of those running backup jobs.
c8ec450e 100//!
c8ec450e
DM
101//!
102//! ## Store `marks` in RAM using a HASH
103//!
104//! Not sure if this is better. TODO
cbdd8c54 105
986bef16
DM
106#[macro_export]
107macro_rules! PROXMOX_BACKUP_PROTOCOL_ID_V1 {
108 () => { "proxmox-backup-protocol-v1" }
109}
c9ec0956 110
dd066d28
DM
111#[macro_export]
112macro_rules! PROXMOX_BACKUP_READER_PROTOCOL_ID_V1 {
113 () => { "proxmox-backup-reader-protocol-v1" }
114}
115
991abfa8
DM
116mod file_formats;
117pub use file_formats::*;
a7dd4830 118
c38266c1
DM
119mod crypt_config;
120pub use crypt_config::*;
48b4b40b 121
826f309b
DM
122mod key_derivation;
123pub use key_derivation::*;
124
018d11bb
DM
125mod crypt_reader;
126pub use crypt_reader::*;
127
128mod crypt_writer;
129pub use crypt_writer::*;
130
131mod checksum_reader;
132pub use checksum_reader::*;
133
134mod checksum_writer;
135pub use checksum_writer::*;
136
7d83440c
WB
137mod chunker;
138pub use chunker::*;
139
b595cb9d
DM
140mod data_chunk;
141pub use data_chunk::*;
142
3025b3a5
DM
143mod data_blob;
144pub use data_blob::*;
145
018d11bb
DM
146mod data_blob_reader;
147pub use data_blob_reader::*;
148
149mod data_blob_writer;
150pub use data_blob_writer::*;
151
9d135fe6
DM
152mod catalog_blob;
153pub use catalog_blob::*;
154
dafc27ae
DM
155mod chunk_stream;
156pub use chunk_stream::*;
157
7e336555
DM
158mod chunk_stat;
159pub use chunk_stat::*;
160
b8506736
DM
161mod read_chunk;
162pub use read_chunk::*;
163
e5064ba6
DM
164mod chunk_store;
165pub use chunk_store::*;
166
7bc1d727
WB
167mod index;
168pub use index::*;
169
e5064ba6
DM
170mod fixed_index;
171pub use fixed_index::*;
172
173mod dynamic_index;
174pub use dynamic_index::*;
175
b3483782
DM
176mod backup_info;
177pub use backup_info::*;
178
e5064ba6
DM
179mod datastore;
180pub use datastore::*;