//! This module implements the Proxmox Backup chunked data storage.
//!
//! A chunk is simply defined as a binary blob. We store chunks inside a
//! `ChunkStore`, addressed by the SHA256 digest of the binary
//! blob. This technique is also known as content-addressable
//! storage.
//!
//! We store larger files by splitting them into chunks. The resulting
//! list of SHA256 digests is stored as a separate index file. The
//! `DynamicIndex*` format is able to deal with dynamic chunk sizes,
//! whereas the `FixedIndex*` format is an optimization to store a
//! list of equal-sized chunks.
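//!
//! As a rough illustration of the two ideas above, the following sketch
//! keeps chunks in an in-memory map keyed by digest and builds an index
//! for equal-sized chunks. `ToyChunkStore`, `digest_of` and `store_fixed`
//! are hypothetical names, not part of this module, and the 64-bit
//! `DefaultHasher` value is only a stand-in for the SHA256 digest so that
//! the example needs nothing beyond `std`.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::collections::HashMap;
//! use std::hash::{Hash, Hasher};
//!
//! // Stand-in digest type (assumption); the real store uses SHA256.
//! type Digest = u64;
//!
//! // Hypothetical helper: digest a blob with the std `DefaultHasher`.
//! fn digest_of(data: &[u8]) -> Digest {
//!     let mut hasher = DefaultHasher::new();
//!     data.hash(&mut hasher);
//!     hasher.finish()
//! }
//!
//! // Toy in-memory chunk store, keyed by digest.
//! #[derive(Default)]
//! struct ToyChunkStore {
//!     chunks: HashMap<Digest, Vec<u8>>,
//! }
//!
//! impl ToyChunkStore {
//!     // Insert a chunk and return its digest. Identical chunks are stored once.
//!     fn insert(&mut self, data: &[u8]) -> Digest {
//!         let digest = digest_of(data);
//!         self.chunks.entry(digest).or_insert_with(|| data.to_vec());
//!         digest
//!     }
//!
//!     // Split `data` into equal-sized chunks (the last one may be shorter)
//!     // and return the digest list, which plays the role of the index.
//!     fn store_fixed(&mut self, data: &[u8], chunk_size: usize) -> Vec<Digest> {
//!         assert!(chunk_size > 0, "chunk size must be non-zero");
//!         data.chunks(chunk_size).map(|c| self.insert(c)).collect()
//!     }
//!
//!     // Reassemble the data from an index; `None` if a chunk is missing.
//!     fn read(&self, index: &[Digest]) -> Option<Vec<u8>> {
//!         let mut out = Vec::new();
//!         for digest in index {
//!             out.extend_from_slice(self.chunks.get(digest)?);
//!         }
//!         Some(out)
//!     }
//! }
//! ```
//!
//! Because identical chunks map to the same digest, storing
//! `b"aaaabbbbaaaa"` with a chunk size of 4 yields three index entries
//! but only two stored chunks.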
//!
//! # ChunkStore Locking
//!
//! We need to be able to restart the proxmox-backup service daemons,
//! so that we can update the software without rebooting the host. But
//! such restarts must not abort running backup jobs, so we need to
//! keep the old service running until those jobs are finished. This
//! implies that we need some kind of locking for the
//! ChunkStore. Please note that it is perfectly valid to have
//! multiple parallel ChunkStore writers, even when they write the
//! same chunk (because the chunk would have the same name and the
//! same data). The only real problem is garbage collection, because
//! we need to avoid deleting chunks which are still referenced.
//!
//! * Read Index Files:
//!
//!   Acquire a shared lock for .idx files.
//!
//! * Delete Index Files:
//!
//!   Acquire an exclusive lock for .idx files. This makes sure that we do
//!   not delete index files while they are still in use.
//!
//! * Create Index Files:
//!
//!   Acquire a shared lock for the ChunkStore (process wide).
//!
//!   Note: When creating .idx files, we create a temporary (.tmp) file,
//!   then do an atomic rename ...
//!
//! * Garbage Collect:
//!
//!   Acquire an exclusive lock for the ChunkStore (process wide). If we
//!   already hold a shared lock for the ChunkStore, try to upgrade that
//!   lock.
//!
//! * Server Restart:
//!
//!   Try to abort a running garbage collection to release the exclusive
//!   ChunkStore lock as soon as possible. Start the new service with the
//!   existing listening socket.
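//!
//! The shared/exclusive rules above can be sketched with an in-process
//! `std::sync::RwLock`. This is only a model of the lock modes under
//! simplifying assumptions: `StoreLock` and its methods are hypothetical
//! names, and lock upgrade and service restart are not covered.
//!
//! ```rust
//! use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};
//!
//! // Stand-in for the ChunkStore lock (assumption, not the real type).
//! struct StoreLock {
//!     inner: RwLock<()>,
//! }
//!
//! impl StoreLock {
//!     fn new() -> Self {
//!         StoreLock { inner: RwLock::new(()) }
//!     }
//!
//!     // Writers creating index files take the shared lock. Several
//!     // writers may hold it at the same time.
//!     fn lock_shared(&self) -> RwLockReadGuard<'_, ()> {
//!         self.inner.read().unwrap()
//!     }
//!
//!     // Garbage collection needs the exclusive lock. Returns `None`
//!     // instead of blocking while any other holder exists.
//!     fn try_lock_exclusive(&self) -> Option<RwLockWriteGuard<'_, ()>> {
//!         self.inner.try_write().ok()
//!     }
//! }
//! ```
//!
//! While a guard returned by `lock_shared` is alive, `try_lock_exclusive`
//! returns `None`; once all shared guards are dropped it succeeds.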
//!
//! # Garbage Collection (GC)
//!
//! Deleting backups is as easy as deleting the corresponding .idx
//! files. Unfortunately, this does not free up any storage, because
//! those files just contain references to chunks.
//!
//! To free up some storage, we run a garbage collection process at
//! regular intervals. The collector uses a mark and sweep
//! approach. In the first phase, it scans all .idx files to mark used
//! chunks. The second phase then removes all unmarked chunks from the
//! store.
//!
//! The above locking mechanism makes sure that we are the only
//! process running GC. But we still want to be able to create backups
//! during GC, so there may be multiple backup threads/tasks
//! running, either started before GC started, or started while GC is
//! running.
//!
//! ## `atime` based GC
//!
//! The idea here is to mark chunks by updating the `atime` (access
//! timestamp) on the chunk file. This is quite simple and does not
//! need additional RAM.
//!
//! One minor problem is that recent Linux versions use the `relatime`
//! mount flag by default for performance reasons (yes, we want
//! that). When enabled, `atime` data is written to the disk only if
//! the file has been modified since the `atime` data was last updated
//! (`mtime`), or if the file was last accessed more than a certain
//! amount of time ago (by default 24h). So we may only delete chunks
//! with `atime` older than 24 hours.
//!
//! Another problem arises from running backups. The mark phase does
//! not find any chunks from those backups, because there is no .idx
//! file for them (created after the backup). Chunks created or
//! touched by those backups may have an `atime` as old as the start
//! time of those backups. Please note that the backup start time may
//! predate the GC start time. So we may only delete chunks older than
//! the start time of those running backup jobs.
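//!
//! Taken together, the two constraints above suggest a single cutoff time
//! for the sweep phase. The sketch below is one possible reading, not the
//! actual implementation: `sweep_cutoff` and `may_delete` are hypothetical
//! helpers, and combining the two rules by taking the earlier time is an
//! assumption.
//!
//! ```rust
//! use std::time::{Duration, SystemTime};
//!
//! // Default `relatime` window mentioned above (24 hours).
//! const RELATIME_WINDOW: Duration = Duration::from_secs(24 * 3600);
//!
//! // Chunks with an `atime` older than the returned time are candidates
//! // for removal. `oldest_running_backup` is the start time of the oldest
//! // backup job still running, if there is one.
//! fn sweep_cutoff(
//!     gc_start: SystemTime,
//!     oldest_running_backup: Option<SystemTime>,
//! ) -> SystemTime {
//!     let by_relatime = gc_start - RELATIME_WINDOW;
//!     match oldest_running_backup {
//!         // A backup started before the relatime cutoff wins.
//!         Some(start) if start < by_relatime => start,
//!         _ => by_relatime,
//!     }
//! }
//!
//! // A chunk may be removed only if its `atime` is strictly older than
//! // the cutoff.
//! fn may_delete(chunk_atime: SystemTime, cutoff: SystemTime) -> bool {
//!     chunk_atime < cutoff
//! }
//! ```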
//!
//! ## Store `marks` in RAM using a HASH
//!
//! Not sure if this is better. TODO
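//!
//! For comparison, a minimal sketch of what marking in RAM could look
//! like, assuming the marks are kept in a `HashSet` of digests. `mark`
//! and `sweep` are hypothetical helpers, the chunk store is modelled as
//! an in-memory map, and chunks written by backups that have no index
//! yet are not handled here.
//!
//! ```rust
//! use std::collections::{HashMap, HashSet};
//!
//! // SHA256 digests are 32 bytes long.
//! type Digest = [u8; 32];
//!
//! // Phase 1: collect every digest referenced by any index.
//! fn mark(indexes: &[Vec<Digest>]) -> HashSet<Digest> {
//!     indexes.iter().flatten().copied().collect()
//! }
//!
//! // Phase 2: drop all unmarked chunks and return how many were removed.
//! fn sweep(store: &mut HashMap<Digest, Vec<u8>>, marked: &HashSet<Digest>) -> usize {
//!     let before = store.len();
//!     store.retain(|digest, _| marked.contains(digest));
//!     before - store.len()
//! }
//! ```
//!
//! The cost of this variant is one set entry per referenced chunk, which
//! is the additional RAM that the `atime` approach avoids.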

macro_rules! PROXMOX_BACKUP_PROTOCOL_ID_V1 {
    () => { "proxmox-backup-protocol-v1" }
}

macro_rules! PROXMOX_BACKUP_READER_PROTOCOL_ID_V1 {
    () => { "proxmox-backup-reader-protocol-v1" }
}

pub use file_formats::*;

pub use crypt_config::*;

pub use key_derivation::*;

pub use data_chunk::*;

pub use data_blob::*;

pub use chunk_stream::*;

pub use chunk_stat::*;

pub use proxmox_protocol::Chunker;

pub use read_chunk::*;

pub use chunk_store::*;

pub use fixed_index::*;

pub use dynamic_index::*;

pub use backup_info::*;

pub use datastore::*;