// proxmox-backup.git / src / backup.rs
//! This module implements the proxmox backup data storage.
//!
//! Proxmox backup splits large files into chunks, and stores them
//! deduplicated using a content addressable storage format.
//!
//! A chunk is simply defined as a binary blob, which is stored inside a
//! `ChunkStore`, addressed by the SHA256 digest of the binary blob.
//!
//! Index files are used to reconstruct the original file. They
//! basically contain a list of SHA256 checksums. The `DynamicIndex*`
//! format is able to deal with dynamic chunk sizes, whereas the
//! `FixedIndex*` format is an optimization to store a list of equal
//! sized chunks.
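/*!
The difference between the two index flavours can be illustrated with a
small lookup sketch. This is not the on-disk format used by this module:
the names `FixedIndexSketch` and `DynamicIndexSketch`, their fields, and
the idea of storing one end offset per chunk are assumptions made purely
for illustration.

```rust
/// SHA256 digest used to address a chunk.
type Digest = [u8; 32];

/// Hypothetical fixed-size index: every chunk has the same size.
struct FixedIndexSketch {
    chunk_size: u64, // must be non-zero
    digests: Vec<Digest>,
}

impl FixedIndexSketch {
    /// With equal sized chunks the position is plain arithmetic.
    fn chunk_for_offset(&self, offset: u64) -> Option<&Digest> {
        self.digests.get((offset / self.chunk_size) as usize)
    }
}

/// Hypothetical dynamic index: each entry records where its chunk ends.
struct DynamicIndexSketch {
    /// (end offset, digest), sorted by end offset.
    entries: Vec<(u64, Digest)>,
}

impl DynamicIndexSketch {
    /// Chunk sizes vary, so binary search for the first chunk whose
    /// end offset lies beyond `offset`.
    fn chunk_for_offset(&self, offset: u64) -> Option<&Digest> {
        let pos = self.entries.partition_point(|(end, _)| *end <= offset);
        self.entries.get(pos).map(|(_, digest)| digest)
    }
}

fn main() {
    let (a, b) = ([1u8; 32], [2u8; 32]);

    let fixed = FixedIndexSketch { chunk_size: 4, digests: vec![a, b] };
    assert_eq!(fixed.chunk_for_offset(5), Some(&b));

    let dynamic = DynamicIndexSketch { entries: vec![(3, a), (10, b)] };
    assert_eq!(dynamic.chunk_for_offset(2), Some(&a));
    assert_eq!(dynamic.chunk_for_offset(3), Some(&b));
    assert_eq!(dynamic.chunk_for_offset(10), None);
}
```

In this sketch the fixed variant needs no per-chunk offset at all, which is
why it can be seen as an optimization for equal sized chunks.
*/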
//!
//! # ChunkStore Locking
//!
//! We need to be able to restart the proxmox-backup service daemons,
//! so that we can update the software without rebooting the host. But
//! such restarts must not abort running backup jobs, so we need to
//! keep the old service running until those jobs are finished. This
//! implies that we need some kind of locking for the
//! ChunkStore. Please note that it is perfectly valid to have
//! multiple parallel ChunkStore writers, even when they write the
//! same chunk (because the chunk would have the same name and the
//! same data). The only real problem is garbage collection, because
//! we need to avoid deleting chunks which are still referenced.
//!
//! * Read Index Files:
//!
//!   Acquire shared lock for .idx files.
//!
//!
//! * Delete Index Files:
//!
//!   Acquire exclusive lock for .idx files. This makes sure that we do
//!   not delete index files while they are still in use.
//!
//!
//! * Create Index Files:
//!
//!   Acquire shared lock for ChunkStore (process wide).
//!
//!   Note: When creating .idx files, we create a temporary (.tmp) file,
//!   then do an atomic rename ...
//!
//!
//! * Garbage Collect:
//!
//!   Acquire exclusive lock for ChunkStore (process wide). If we
//!   already hold a shared lock for ChunkStore, try to upgrade that
//!   lock.
//!
//!
//! * Server Restart
//!
//!   Try to abort running garbage collection to release exclusive
//!   ChunkStore lock asap. Start new service with existing listening
//!   socket.
//!
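/*!
The shared/exclusive rules listed above can be sketched with
`std::sync::RwLock`. This is only a stand-in chosen for illustration: the
type `StoreLockSketch` and its methods are hypothetical, the real
ChunkStore lock may be implemented differently, and upgrading a shared
lock is not shown.

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

/// Hypothetical stand-in for the process wide ChunkStore lock.
struct StoreLockSketch {
    inner: RwLock<()>,
}

impl StoreLockSketch {
    fn new() -> Self {
        Self { inner: RwLock::new(()) }
    }

    /// Backup writers creating index files take the lock in shared mode,
    /// so any number of them can run in parallel.
    fn lock_shared(&self) -> RwLockReadGuard<'_, ()> {
        self.inner.read().unwrap()
    }

    /// Garbage collection needs the lock exclusively. Returns `None`
    /// instead of blocking while any shared lock is still held.
    fn try_lock_exclusive(&self) -> Option<RwLockWriteGuard<'_, ()>> {
        self.inner.try_write().ok()
    }
}

fn main() {
    let lock = StoreLockSketch::new();

    // Two writers hold the shared lock at the same time.
    let writer1 = lock.lock_shared();
    let writer2 = lock.lock_shared();

    // GC cannot start while writers are active.
    assert!(lock.try_lock_exclusive().is_none());

    drop(writer1);
    drop(writer2);

    // Once all writers are done, GC gets exclusive access.
    assert!(lock.try_lock_exclusive().is_some());
}
```

A non-blocking attempt is used for the exclusive side so that the caller
can decide whether to wait or give up.
*/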
//!
//! # Garbage Collection (GC)
//!
//! Deleting backups is as easy as deleting the corresponding .idx
//! files. Unfortunately, this does not free up any storage, because
//! those files just contain references to chunks.
//!
//! To free up some storage, we run a garbage collection process at
//! regular intervals. The collector uses a mark and sweep
//! approach. In the first phase, it scans all .idx files to mark used
//! chunks. The second phase then removes all unmarked chunks from the
//! store.
//!
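/*!
The two phases can be sketched on in-memory data. This is a simplified
model, not the code used by this module: index files are represented as
plain digest lists, the chunk store as a `HashMap`, and the function names
`mark` and `sweep` are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

/// SHA256 digest used to address a chunk.
type Digest = [u8; 32];

/// Phase 1: collect every digest referenced by any index file.
fn mark(indexes: &[Vec<Digest>]) -> HashSet<Digest> {
    indexes.iter().flatten().copied().collect()
}

/// Phase 2: drop every chunk that was not marked.
/// Returns the number of removed chunks.
fn sweep(store: &mut HashMap<Digest, Vec<u8>>, marked: &HashSet<Digest>) -> usize {
    let before = store.len();
    store.retain(|digest, _| marked.contains(digest));
    before - store.len()
}

fn main() {
    let (a, b, c) = ([1u8; 32], [2u8; 32], [3u8; 32]);

    let mut store: HashMap<Digest, Vec<u8>> = HashMap::new();
    for digest in [a, b, c] {
        store.insert(digest, vec![digest[0]]);
    }

    // Two index files share chunk `a`; nothing references chunk `c`.
    let indexes = vec![vec![a, b], vec![a]];

    let marked = mark(&indexes);
    assert_eq!(sweep(&mut store, &marked), 1);
    assert!(store.contains_key(&a) && store.contains_key(&b));
    assert!(!store.contains_key(&c));
}
```

Note that this model keeps the marks in RAM, which is only practical as
long as the set of referenced digests fits into memory.
*/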
//! The above locking mechanism makes sure that we are the only
//! process running GC. But we still want to be able to create backups
//! during GC, so there may be multiple backup threads/tasks
//! running, either started before GC started or started while GC is
//! running.
//!
//! ## `atime` based GC
//!
//! The idea here is to mark chunks by updating the `atime` (access
//! timestamp) on the chunk file. This is quite simple and does not
//! need additional RAM.
//!
//! One minor problem is that recent Linux versions use the `relatime`
//! mount flag by default for performance reasons (yes, we want
//! that). When enabled, `atime` data is written to the disk only if
//! the file has been modified since the `atime` data was last updated
//! (`mtime`), or if the file was last accessed more than a certain
//! amount of time ago (by default 24h). So we may only delete chunks
//! with `atime` older than 24 hours.
//!
//! Another problem arises from running backups. The mark phase does
//! not find any chunks from those backups, because there is no .idx
//! file for them (created after the backup). Chunks created or
//! touched by those backups may have an `atime` as old as the start
//! time of those backups. Please note that the backup start time may
//! predate the GC start time. So we may only delete chunks older than
//! the start time of those running backup jobs.
//!
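/*!
Both restrictions can be combined into a single cutoff timestamp. The
following is a minimal sketch under stated assumptions: timestamps are
Unix seconds, the 24 hour window is measured from the GC start time, and
the names `RELATIME_WINDOW_SECS`, `gc_cutoff` and `may_delete` are
hypothetical.

```rust
/// With `relatime`, an `atime` update is only guaranteed once the old
/// value is more than 24 hours old (the default window).
const RELATIME_WINDOW_SECS: i64 = 24 * 3600;

/// Chunks with an `atime` older than the returned value may be deleted.
/// The cutoff is the earlier of "GC start minus 24h" and the start time
/// of the oldest backup that is still running.
fn gc_cutoff(gc_start: i64, running_backup_starts: &[i64]) -> i64 {
    let relatime_limit = gc_start - RELATIME_WINDOW_SECS;
    running_backup_starts
        .iter()
        .copied()
        .fold(relatime_limit, i64::min)
}

/// An unmarked chunk is only safe to remove if it is older than the cutoff.
fn may_delete(chunk_atime: i64, cutoff: i64) -> bool {
    chunk_atime < cutoff
}

fn main() {
    let gc_start = 1_000_000;

    // No running backups: only the relatime window applies.
    assert_eq!(gc_cutoff(gc_start, &[]), gc_start - 86_400);

    // A backup that started 30 hours before GC lowers the cutoff further.
    let old_backup = gc_start - 30 * 3600;
    let cutoff = gc_cutoff(gc_start, &[old_backup, gc_start - 60]);
    assert_eq!(cutoff, old_backup);

    assert!(may_delete(old_backup - 1, cutoff));
    assert!(!may_delete(gc_start - 25 * 3600, cutoff));
}
```

Backups that started after the relatime limit do not change the result,
because the 24 hour window is already the stricter bound for them.
*/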
//!
//! ## Store `marks` in RAM using a HASH
//!
//! Not sure if this is better. TODO

use anyhow::{bail, Error};

// Note: .pcat1 => Proxmox Catalog Format version 1
pub const CATALOG_NAME: &str = "catalog.pcat1.didx";

#[macro_export]
macro_rules! PROXMOX_BACKUP_PROTOCOL_ID_V1 {
    () => { "proxmox-backup-protocol-v1" }
}

#[macro_export]
macro_rules! PROXMOX_BACKUP_READER_PROTOCOL_ID_V1 {
    () => { "proxmox-backup-reader-protocol-v1" }
}

/// Unix system user used by proxmox-backup-proxy
pub const BACKUP_USER_NAME: &str = "backup";

/// Return User info for the 'backup' user (``getpwnam_r(3)``)
pub fn backup_user() -> Result<nix::unistd::User, Error> {
    match nix::unistd::User::from_name(BACKUP_USER_NAME)? {
        Some(user) => Ok(user),
        None => bail!("Unable to lookup backup user."),
    }
}

mod file_formats;
pub use file_formats::*;

mod manifest;
pub use manifest::*;

mod crypt_config;
pub use crypt_config::*;

mod key_derivation;
pub use key_derivation::*;

mod crypt_reader;
pub use crypt_reader::*;

mod crypt_writer;
pub use crypt_writer::*;

mod checksum_reader;
pub use checksum_reader::*;

mod checksum_writer;
pub use checksum_writer::*;

mod chunker;
pub use chunker::*;

mod data_blob;
pub use data_blob::*;

mod data_blob_reader;
pub use data_blob_reader::*;

mod data_blob_writer;
pub use data_blob_writer::*;

mod catalog;
pub use catalog::*;

mod chunk_stream;
pub use chunk_stream::*;

mod chunk_stat;
pub use chunk_stat::*;

mod read_chunk;
pub use read_chunk::*;

mod chunk_store;
pub use chunk_store::*;

mod index;
pub use index::*;

mod fixed_index;
pub use fixed_index::*;

mod dynamic_index;
pub use dynamic_index::*;

mod backup_info;
pub use backup_info::*;

mod prune;
pub use prune::*;

mod datastore;
pub use datastore::*;

mod catalog_shell;
pub use catalog_shell::*;