]> git.proxmox.com Git - proxmox-backup.git/blob - src/backup.rs
src/api2/admin/datastore.rs: add status api call
[proxmox-backup.git] / src / backup.rs
1 //! This module implements the proxmox backup chunked data storage
2 //!
3 //! A chunk is simply defined as binary blob. We store them inside a
4 //! `ChunkStore`, addressed by the SHA256 digest of the binary
5 //! blob. This technology is also known as content-addressable
6 //! storage.
7 //!
8 //! We store larger files by splitting them into chunks. The resulting
9 //! SHA256 digest list is stored as separate index file. The
10 //! `DynamicIndex*` format is able to deal with dynamic chunk sizes,
11 //! whereas the `FixedIndex*` format is an optimization to store a
12 //! list of equal sized chunks.
13 //!
14 //! # ChunkStore Locking
15 //!
16 //! We need to be able to restart the proxmox-backup service daemons,
17 //! so that we can update the software without rebooting the host. But
18 //! such restarts must not abort running backup jobs, so we need to
19 //! keep the old service running until those jobs are finished. This
20 //! implies that we need some kind of locking for the
21 //! ChunkStore. Please note that it is perfectly valid to have
22 //! multiple parallel ChunkStore writers, even when they write the
23 //! same chunk (because the chunk would have the same name and the
24 //! same data). The only real problem is garbage collection, because
25 //! we need to avoid deleting chunks which are still referenced.
26 //!
27 //! * Read Index Files:
28 //!
29 //! Acquire shared lock for .idx files.
30 //!
31 //!
32 //! * Delete Index Files:
33 //!
34 //! Acquire exclusive lock for .idx files. This makes sure that we do
35 //! not delete index files while they are still in use.
36 //!
37 //!
38 //! * Create Index Files:
39 //!
40 //! Acquire shared lock for ChunkStore (process wide).
41 //!
42 //! Note: When creating .idx files, we create temporary (.tmp) file,
43 //! then do an atomic rename ...
44 //!
45 //!
46 //! * Garbage Collect:
47 //!
48 //! Acquire exclusive lock for ChunkStore (process wide). If we have
49 //! already an shared lock for ChunkStore, try to updraged that
50 //! lock.
51 //!
52 //!
53 //! * Server Restart
54 //!
55 //! Try to abort running garbage collection to release exclusive
56 //! ChunkStore lock asap. Start new service with existing listening
57 //! socket.
58 //!
59 //!
60 //! # Garbage Collection (GC)
61 //!
62 //! Deleting backups is as easy as deleting the corresponding .idx
63 //! files. Unfortunately, this does not free up any storage, because
64 //! those files just contains references to chunks.
65 //!
66 //! To free up some storage, we run a garbage collection process at
67 //! regular intervals. The collector uses an mark and sweep
68 //! approach. In the first phase, it scans all .idx files to mark used
69 //! chunks. The second phase then removes all unmarked chunks from the
70 //! store.
71 //!
72 //! The above locking mechanism makes sure that we are the only
73 //! process running GC. But we still want to be able to create backups
74 //! during GC, so there may be multiple backup threads/tasks
75 //! running. Either started before GC started, or started while GC is
76 //! running.
77 //!
78 //! ## `atime` based GC
79 //!
80 //! The idea here is to mark chunks by updating the `atime` (access
81 //! timestamp) on the chunk file. This is quite simple and does not
82 //! need additional RAM.
83 //!
84 //! One minor problem is that recent Linux versions use the `relatime`
85 //! mount flag by default for performance reasons (yes, we want
86 //! that). When enabled, `atime` data is written to the disk only if
87 //! the file has been modified since the `atime` data was last updated
88 //! (`mtime`), or if the file was last accessed more than a certain
89 //! amount of time ago (by default 24h). So we may only delete chunks
90 //! with `atime` older than 24 hours.
91 //!
92 //! Another problem arise from running backups. The mark phase does
93 //! not find any chunks from those backups, because there is no .idx
94 //! file for them (created after the backup). Chunks created or
95 //! touched by those backups may have an `atime` as old as the start
96 //! time of those backup. Please not that the backup start time may
97 //! predate the GC start time. Se we may only delete chunk older than
98 //! the start time of those running backup jobs.
99 //!
100 //!
101 //! ## Store `marks` in RAM using a HASH
102 //!
103 //! Not sure if this is better. TODO
104
105 #[macro_export]
106 macro_rules! PROXMOX_BACKUP_PROTOCOL_ID_V1 {
107 () => { "proxmox-backup-protocol-v1" }
108 }
109
110 #[macro_export]
111 macro_rules! PROXMOX_BACKUP_READER_PROTOCOL_ID_V1 {
112 () => { "proxmox-backup-reader-protocol-v1" }
113 }
114
115 mod file_formats;
116 pub use file_formats::*;
117
118 mod crypt_config;
119 pub use crypt_config::*;
120
121 mod key_derivation;
122 pub use key_derivation::*;
123
124 mod data_chunk;
125 pub use data_chunk::*;
126
127 mod data_blob;
128 pub use data_blob::*;
129
130 mod chunk_stream;
131 pub use chunk_stream::*;
132
133 mod chunk_stat;
134 pub use chunk_stat::*;
135
136 pub use proxmox_protocol::Chunker;
137
138 mod read_chunk;
139 pub use read_chunk::*;
140
141 mod chunk_store;
142 pub use chunk_store::*;
143
144 mod index;
145 pub use index::*;
146
147 mod fixed_index;
148 pub use fixed_index::*;
149
150 mod dynamic_index;
151 pub use dynamic_index::*;
152
153 mod backup_info;
154 pub use backup_info::*;
155
156 mod datastore;
157 pub use datastore::*;