% UBIFS Authentication
% sigma star gmbh
% 2018

# Introduction

UBIFS utilizes the fscrypt framework to provide confidentiality for file
contents and file names. This prevents attacks where an attacker is able to
read contents of the filesystem at a single point in time. A classic example
is a lost smartphone where the attacker is unable to read personal data stored
on the device without the filesystem decryption key.

In its current state, however, UBIFS encryption does not prevent attacks where
the attacker is able to modify the filesystem contents and the user uses the
device afterwards. In such a scenario an attacker can modify filesystem
contents arbitrarily without the user noticing. One example is to modify a
binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since
most of the filesystem metadata of UBIFS is stored in plain text, this makes it
fairly easy to swap files and replace their contents.

Other full disk encryption systems like dm-crypt cover all filesystem metadata,
which makes such kinds of attacks more complicated, but not impossible,
especially if the attacker is given access to the device at multiple points in
time. For dm-crypt and other filesystems that build upon the Linux block IO
layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY]
can be used to get full data authentication at the block layer.
These can also be combined with dm-crypt [CRYPTSETUP2].

This document describes an approach to get file contents _and_ full metadata
authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file
name encryption, the authentication system could be tied into fscrypt such that
existing features like key derivation can be utilized. It should however also
be possible to use UBIFS authentication without using encryption.


## MTD, UBI & UBIFS

On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform
interface to access raw flash devices. One of the more prominent subsystems that
work on top of MTD is UBI (Unsorted Block Images). It provides volume management
for flash devices and is thus somewhat similar to LVM for block devices. In
addition, it deals with flash-specific wear-leveling and transparent I/O error
handling. UBI offers logical erase blocks (LEBs) to the layers on top of it
and maps them transparently to physical erase blocks (PEBs) on the flash.

UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear
leveling and some flash specifics are left to UBI, while UBIFS focuses on
scalability, performance and recoverability.



            +------------+ +*******+ +-----------+ +-----+
            |            | * UBIFS * | UBI-BLOCK | | ... |
            | JFFS/JFFS2 | +*******+ +-----------+ +-----+
            |            | +-----------------------------+ +-----------+ +-----+
            |            | |              UBI            | | MTD-BLOCK | | ... |
            +------------+ +-----------------------------+ +-----------+ +-----+
            +------------------------------------------------------------------+
            |                  MEMORY TECHNOLOGY DEVICES (MTD)                 |
            +------------------------------------------------------------------+
            +-----------------------------+ +--------------------------+ +-----+
            |         NAND DRIVERS        | |        NOR DRIVERS       | | ... |
            +-----------------------------+ +--------------------------+ +-----+

               Figure 1: Linux kernel subsystems for dealing with raw flash



Internally, UBIFS maintains multiple data structures which are persisted on
the flash:

- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data
- *Journal*: an additional data structure to collect FS changes before updating
  the on-flash index and reduce flash wear.
- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS
  state to avoid frequent flash reads. It is basically the in-memory
  representation of the index, but contains additional attributes.
- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per
  UBI LEB.

In the remainder of this section we will cover the on-flash UBIFS data
structures in more detail. The TNC is of less importance here since it is never
persisted onto the flash directly. More details on UBIFS can also be found in
[UBIFS-WP].


### UBIFS Index & Tree Node Cache

Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types
of nodes, e.g. data nodes (`struct ubifs_data_node`) which store chunks of file
contents or inode nodes (`struct ubifs_ino_node`) which represent VFS inodes.
Almost all types of nodes share a common header (`ubifs_ch`) containing basic
information like node type, node length, a sequence number, etc. (see
`fs/ubifs/ubifs-media.h` in kernel source). Exceptions are entries of the LPT
and some less important node types like padding nodes which are used to pad
unusable content at the end of LEBs.

To avoid re-writing the whole B+ tree on every single change, it is implemented
as a *wandering tree*, where only the changed nodes are re-written and previous
versions of them are obsoleted without erasing them right away. As a result,
the index is not stored in a single place on the flash, but *wanders* around
and there are obsolete parts on the flash as long as the LEB containing them is
not reused by UBIFS. To find the most recent version of the index, UBIFS stores
a special node called *master node* into UBI LEB 1 which always points to the
most recent root node of the UBIFS index. For recoverability, the master node
is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read of
LEB 1 and 2 to get the current master node and from there get the location of
the most recent on-flash index.
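
The wandering tree idea can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: the `flash` list, `write_node`, `update_leaf` and the dictionary-based node layout are hypothetical names invented here and do not correspond to UBIFS code. The flash is modelled as an append-only list, and a position in that list stands in for a node's on-flash address. Updating a leaf rewrites only the nodes on the path to the root, while the previous versions remain on the "flash" as obsolete data.

```python
# Illustrative model only: "flash" is an append-only list, and a node's
# position in that list stands in for its on-flash address.
flash = []


def write_node(node):
    """Append a node to the flash and return its position (never overwrites)."""
    flash.append(node)
    return len(flash) - 1


def update_leaf(pos, path, new_data):
    """Rewrite only the nodes along `path` (branch indices from the node at
    `pos` down to the leaf). Returns the position of the rewritten node."""
    if not path:
        # Reached the leaf: write a new version, the old one becomes obsolete.
        return write_node({"data": new_data})
    children = list(flash[pos]["children"])
    children[path[0]] = update_leaf(children[path[0]], path[1:], new_data)
    # The parent changes too, because it now points at the new child.
    return write_node({"children": children})


# Build a tiny tree: one root with two leaves.
leaf_a = write_node({"data": b"A"})
leaf_b = write_node({"data": b"B"})
root = write_node({"children": [leaf_a, leaf_b]})
master = {"root": root}  # stands in for the master node

# Changing leaf B writes a new leaf and a new root; leaf A is reused as-is.
old_root = master["root"]
master["root"] = update_leaf(old_root, [1], b"B2")
```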

The TNC is the in-memory representation of the on-flash index. It contains some
additional runtime attributes per node which are not persisted. One of these is
a dirty-flag which marks nodes that have to be persisted the next time the
index is written onto the flash. The TNC acts as a write-back cache and all
modifications of the on-flash index are done through the TNC. Like other caches,
the TNC does not have to mirror the full index into memory, but reads parts of
it from flash whenever needed. A *commit* is the UBIFS operation of updating the
on-flash filesystem structures like the index. On every commit, the TNC nodes
marked as dirty are written to the flash to update the persisted index.
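
The write-back behaviour described above might be sketched as follows. The classes `TncNode` and `Tnc` are hypothetical and use a flat dictionary instead of a B+ tree. The sketch only shows the three properties named in the text: nodes are read from flash on demand, modifications set a dirty flag, and a commit persists exactly the dirty nodes.

```python
class TncNode:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.dirty = False  # runtime-only attribute, never persisted


class Tnc:
    """Toy write-back cache; `flash_index` is a dict standing in for the
    on-flash index (an assumption made for this sketch)."""

    def __init__(self, flash_index):
        self.flash_index = flash_index
        self.cache = {}

    def lookup(self, key):
        # Read from "flash" only when the node is not cached yet.
        if key not in self.cache:
            self.cache[key] = TncNode(key, self.flash_index[key])
        return self.cache[key]

    def update(self, key, value):
        # All modifications go through the cache and mark the node dirty.
        node = self.cache.setdefault(key, TncNode(key, None))
        node.value = value
        node.dirty = True

    def commit(self):
        """Persist dirty nodes and return the keys that were written."""
        written = []
        for node in self.cache.values():
            if node.dirty:
                self.flash_index[node.key] = node.value
                node.dirty = False
                written.append(node.key)
        return written
```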


### Journal

To avoid wearing out the flash, the index is only persisted (*committed*) when
certain conditions are met (e.g. `fsync(2)`). The journal is used to record
any changes (in form of inode nodes, data nodes etc.) between commits
of the index. During mount, the journal is read from the flash and replayed
onto the TNC (which will be created on-demand from the on-flash index).

UBIFS reserves a number of LEBs just for the journal, called the *log area*. The
number of log area LEBs is configured on filesystem creation (using
`mkfs.ubifs`) and stored in the superblock node. The log area contains only
two types of nodes: *reference nodes* and *commit start nodes*. A commit start
node is written whenever an index commit is performed. Reference nodes are
written on every journal update. Each reference node points to the position of
other nodes (inode nodes, data nodes etc.) on the flash that are part of this
journal entry. These nodes are called *buds* and describe the actual filesystem
changes including their data.

The log area is maintained as a ring. Whenever the journal is almost full,
a commit is initiated. This also writes a commit start node so that during
mount, UBIFS will look for the most recent commit start node and just replay
every reference node after that. Every reference node before the commit start
node will be ignored as they are already part of the on-flash index.
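
As a rough illustration of this replay rule, the following sketch selects the reference nodes that have to be replayed. It assumes the log has already been read into a list in write order, which ignores the ring wrap-around; the function name `nodes_to_replay` and the `(kind, payload)` tuples are hypothetical.

```python
def nodes_to_replay(log):
    """Return the payloads of all reference nodes written after the most
    recent commit start node.

    `log` is a list of (kind, payload) tuples in write order, where kind is
    "CS" (commit start node) or "REF" (reference node).
    """
    last_cs = max(
        (i for i, (kind, _) in enumerate(log) if kind == "CS"),
        default=-1,  # no commit start node found: consider the whole log
    )
    return [payload for kind, payload in log[last_cs + 1:] if kind == "REF"]
```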

When writing a journal entry, UBIFS first ensures that enough space is
available to write the reference node and the buds that are part of this entry.
Then, the reference node is written and afterwards the buds describing the file
changes. On replay, UBIFS will record every reference node and inspect the
location of the referenced LEBs to discover the buds. If these are corrupt or
missing, UBIFS will attempt to recover them by re-reading the LEB. This is
however only done for the last referenced LEB of the journal, since only this
one can become corrupt because of a power cut. If the recovery fails, UBIFS
will not mount. An error in any other LEB will directly cause UBIFS to fail the
mount operation.


       | ----    LOG AREA     ---- | ----------    MAIN AREA    ------------ |

        -----+------+-----+--------+----   ------+-----+-----+---------------
        \    |      |     |        |   /  /      |     |     |               \
        / CS |  REF | REF |        |   \  \ DENT | INO | INO |               /
        \    |      |     |        |   /  /      |     |     |               \
         ----+------+-----+--------+---   -------+-----+-----+----------------
                 |     |                  ^            ^
                 |     |                  |            |
                 +------------------------+            |
                       |                               |
                       +-------------------------------+


                Figure 2: UBIFS flash layout of log area with commit start nodes
                          (CS) and reference nodes (REF) pointing to main area
                          containing their buds


### LEB Property Tree/Table

The LEB property tree is used to store per-LEB information. This includes the
LEB type and amount of free and *dirty* (old, obsolete content) space [1] on
the LEB. The type is important, because UBIFS never mixes index nodes with data
nodes on a single LEB and thus each LEB has a specific purpose. This again is
useful for free space calculations. See [UBIFS-WP] for more details.

The LEB property tree again is a B+ tree, but it is much smaller than the
index. Due to its smaller size it is always written as one chunk on every
commit. Thus, saving the LPT is an atomic operation.


[1] Since LEBs can only be appended to and never overwritten, there is a
difference between free space, i.e. the remaining space left on the LEB to be
written to without erasing it, and previously written content that is obsolete
but can't be overwritten without erasing the full LEB.
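
The difference between free and dirty space can be made concrete with a small accounting sketch. The `LebProps` class and its fields are hypothetical and only model the arithmetic described in the footnote; they are not the on-flash LPT format.

```python
from dataclasses import dataclass


@dataclass
class LebProps:
    size: int         # total bytes in the LEB
    written: int = 0  # bytes appended so far (valid and obsolete)
    dirty: int = 0    # bytes that were written but are now obsolete

    @property
    def free(self):
        # Free space is what can still be appended without erasing.
        return self.size - self.written

    def append(self, nbytes):
        if nbytes > self.free:
            raise ValueError("not enough free space; LEB must be erased first")
        self.written += nbytes

    def obsolete(self, nbytes):
        # Obsoleting content does not make the space writable again.
        self.dirty += nbytes

    def erase(self):
        # Only erasing the full LEB turns dirty space back into free space.
        self.written = 0
        self.dirty = 0
```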


# UBIFS Authentication

This chapter introduces UBIFS authentication which enables UBIFS to verify
the authenticity and integrity of metadata and file contents stored on flash.


## Threat Model

UBIFS authentication enables detection of offline data modification. While it
does not prevent it, it enables (trusted) code to check the integrity and
authenticity of on-flash file contents and filesystem metadata. This covers
attacks where file contents are swapped.

UBIFS authentication will not protect against rollback of full flash contents.
I.e. an attacker can still dump the flash and restore it at a later time without
detection. It will also not protect against partial rollback of individual
index commits. That means that an attacker is able to partially undo changes.
This is possible because UBIFS does not immediately overwrite obsolete
versions of the index tree or the journal, but instead marks them as obsolete
and garbage collection erases them at a later time. An attacker can use this by
erasing parts of the current tree and restoring old versions that are still on
the flash and have not yet been erased. This is possible, because every commit
will always write a new version of the index root node and the master node
without overwriting the previous version. This is further helped by the
wear-leveling operations of UBI, which copy contents from one physical
eraseblock to another and do not atomically erase the first eraseblock.

UBIFS authentication does not cover attacks where an attacker is able to
execute code on the device after the authentication key was provided.
Additional measures like secure boot and trusted boot have to be taken to
ensure that only trusted code is executed on a device.


## Authentication

To be able to fully trust data read from flash, all UBIFS data structures
stored on flash are authenticated. That is:

- The index which includes file contents, file metadata like extended
  attributes, file length etc.
- The journal which also contains file contents and metadata by recording changes
  to the filesystem
- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting


### Index Authentication

Through UBIFS' concept of a wandering tree, it already takes care of only
updating and persisting changed parts from leaf node up to the root node
of the full B+ tree. This enables us to augment the index nodes of the tree
with a hash over each node's child nodes. As a result, the index is basically
also a Merkle tree. Since the leaf nodes of the index contain the actual
filesystem data, the hashes of their parent index nodes thus cover all the file
contents and file metadata. When a file changes, the UBIFS index is updated
accordingly from the leaf nodes up to the root node including the master node.
This process can be hooked to recompute the hash only for each changed node at
the same time. Whenever a file is read, UBIFS can verify the hashes from each
leaf node up to the root node to ensure the node's integrity.
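
A minimal sketch of such a hash tree is shown below. It is not the UBIFS on-flash format: the dictionary-based nodes, the helper names and the choice of SHA-256 are assumptions made for illustration. Each index node stores one hash per branch next to the child reference, and the hash of an index node is computed over its stored branch hashes. For brevity, `verify` walks the whole tree from the top instead of a single leaf-to-root path.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(data: bytes) -> dict:
    return {"data": data}


def node_hash(node: dict) -> bytes:
    if "data" in node:
        return h(b"leaf" + node["data"])
    # An index node is hashed over the branch hashes it stores.
    return h(b"index" + b"".join(bh for bh, _ in node["branches"]))


def index_node(children: list) -> dict:
    # Each branch keeps the child's hash next to the child reference.
    return {"branches": [(node_hash(c), c) for c in children]}


def verify(node: dict, expected_hash: bytes) -> bool:
    """Check a node against the hash stored in its parent, then recurse."""
    if node_hash(node) != expected_hash:
        return False
    if "data" in node:
        return True
    return all(verify(child, bh) for bh, child in node["branches"])
```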

To ensure the authenticity of the whole index, the UBIFS master node stores a
keyed hash (HMAC) over its own contents and a hash of the root node of the index
tree. As mentioned above, the master node is always written to the flash whenever
the index is persisted (ie. on index commit).
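
The master node HMAC could look roughly like this in Python. The function names, the byte layout (master node fields followed by the root hash) and the use of HMAC-SHA256 are assumptions for this sketch; the text above does not fix a particular algorithm or layout.

```python
import hashlib
import hmac


def sign_master(key: bytes, master_fields: bytes, root_hash: bytes) -> bytes:
    """HMAC over the master node contents and the index root hash."""
    return hmac.new(key, master_fields + root_hash, hashlib.sha256).digest()


def master_is_authentic(key: bytes, master_fields: bytes,
                        root_hash: bytes, stored_hmac: bytes) -> bool:
    expected = sign_master(key, master_fields, root_hash)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(expected, stored_hmac)
```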

Using this approach only UBIFS index nodes and the master node are changed to
include a hash. All other types of nodes will remain unchanged. This reduces
the storage overhead which is precious for users of UBIFS (ie. embedded
devices).


                             +---------------+
                             |  Master Node  |
                             |    (hash)     |
                             +---------------+
                                     |
                                     v
                            +-------------------+
                            |  Index Node #1    |
                            |                   |
                            | branch0   branchn |
                            | (hash)    (hash)  |
                            +-------------------+
                               |    ...   |  (fanout: 8)
                               |          |
                       +-------+          +------+
                       |                         |
                       v                         v
            +-------------------+       +-------------------+
            |  Index Node #2    |       |  Index Node #3    |
            |                   |       |                   |
            | branch0   branchn |       | branch0   branchn |
            | (hash)    (hash)  |       | (hash)    (hash)  |
            +-------------------+       +-------------------+
                 |   ...                     |   ...   |
                 v                           v         v
               +-----------+         +----------+  +-----------+
               | Data Node |         | INO Node |  | DENT Node |
               +-----------+         +----------+  +-----------+


           Figure 3: Coverage areas of index node hash and master node HMAC



The most important part for robustness and power-cut safety is to atomically
persist the hash and file contents. Here the existing UBIFS logic for how
changed nodes are persisted is already designed for this purpose such that
UBIFS can safely recover if a power-cut occurs while persisting. Adding
hashes to index nodes does not change this since each hash will be persisted
atomically together with its respective node.


### Journal Authentication

The journal is authenticated too. Since the journal is continuously written
it is necessary to also add authentication information frequently to the
journal so that in case of a power cut only a small amount of data cannot be
authenticated. This is done by creating a continuous hash beginning from the
commit start node over the previous reference nodes, the current reference
node, and the bud nodes. From time to time, whenever it is suitable,
authentication nodes are added between the bud nodes. This new node type
contains an HMAC over the current state of the hash chain. That way a journal
can be authenticated up to the last authentication node. The tail of the
journal which may not have an authentication node cannot be authenticated and
is skipped during journal replay.

We get this picture for journal authentication:

    ,,,,,,,,
    ,......,...........................................
    ,. CS  ,               hash1.----.           hash2.----.
    ,.  |  ,                    .    |hmac            .    |hmac
    ,.  v  ,                    .    v                .    v
    ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ...
    ,..|...,...........................................
    ,  |   ,
    ,  |   ,,,,,,,,,,,,,,,
    .  |            hash3,----.
    ,  |                 ,    |hmac
    ,  v                 ,    v
    , REF#1 -> bud -> bud,-> auth ...
    ,,,|,,,,,,,,,,,,,,,,,,
       v
      REF#2 -> ...
       |
       V
      ...

Since the hash also includes the reference nodes an attacker cannot reorder or
skip any journal heads for replay. An attacker can only remove bud nodes or
reference nodes from the end of the journal, effectively rewinding the
filesystem at most back to the last commit.
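
The hash chain with interleaved authentication nodes might be sketched like this. Everything here is a simplification under stated assumptions: nodes are plain byte strings, the chain is seeded with the placeholder `b"commit-start"` instead of a real commit start node, an authentication node is emitted after every `auth_every` nodes, and HMAC-SHA256 is used. The function names are hypothetical.

```python
import hashlib
import hmac

CHAIN_SEED = b"commit-start"  # placeholder for the commit start node contents


def auth_node(key: bytes, chain) -> bytes:
    # HMAC over the current state of the running hash. In Python, digest()
    # does not finalize the hash object, so the chain can keep growing.
    return hmac.new(key, chain.digest(), hashlib.sha256).digest()


def write_journal(key: bytes, entries: list, auth_every: int = 2) -> list:
    """entries: reference and bud nodes (bytes) in write order."""
    chain = hashlib.sha256(CHAIN_SEED)
    records = []
    for count, entry in enumerate(entries, 1):
        chain.update(entry)
        records.append(("node", entry))
        if count % auth_every == 0:
            records.append(("auth", auth_node(key, chain)))
    return records


def authenticated_prefix(key: bytes, records: list) -> list:
    """Return the nodes covered by a valid authentication node. Nodes after
    the last valid authentication node are skipped."""
    chain = hashlib.sha256(CHAIN_SEED)
    pending, good = [], []
    for kind, payload in records:
        if kind == "node":
            chain.update(payload)
            pending.append(payload)
        elif hmac.compare_digest(payload, auth_node(key, chain)):
            good += pending
            pending = []
        else:
            break  # chain mismatch: nothing further can be trusted
    return good
```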

The location of the log area is stored in the master node. Since the master
node is authenticated with an HMAC as described above, it is not possible to
tamper with that without detection. The size of the log area is specified when
the filesystem is created using `mkfs.ubifs` and stored in the superblock node.
To avoid tampering with this and other values stored there, an HMAC is added to
the superblock struct. The superblock node is stored in LEB 0 and is only
modified on feature flag or similar changes, but never on file changes.


### LPT Authentication

The location of the LPT root node on the flash is stored in the UBIFS master
node. Since the LPT is written and read atomically on every commit, there is
no need to authenticate individual nodes of the tree. It suffices to
protect the integrity of the full LPT by a simple hash stored in the master
node. Since the master node itself is authenticated, the LPT's authenticity can
be verified by verifying the authenticity of the master node and comparing the
LPT hash stored there with the hash computed from the read on-flash LPT.
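
Expressed as code, the LPT check is a plain hash comparison. The sketch below treats the LPT as one opaque byte string and uses SHA-256; both are assumptions, as are the function names. The comparison is only meaningful after the master node itself has been authenticated.

```python
import hashlib


def lpt_hash(lpt_bytes: bytes) -> bytes:
    """Unkeyed hash over the full LPT as read from flash."""
    return hashlib.sha256(lpt_bytes).digest()


def lpt_is_intact(lpt_bytes_from_flash: bytes, hash_from_master: bytes) -> bool:
    # The trust anchor is the authenticated master node that stores the hash.
    return lpt_hash(lpt_bytes_from_flash) == hash_from_master
```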


## Key Management

For simplicity, UBIFS authentication uses a single key to compute the HMACs
of superblock, master, commit start and reference nodes. This key has to be
available on creation of the filesystem (`mkfs.ubifs`) to authenticate the
superblock node. Further, it has to be available on mount of the filesystem
to verify authenticated nodes and generate new HMACs for changes.

UBIFS authentication is intended to operate side-by-side with UBIFS encryption
(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption
has a different approach of encryption policies per directory, there can be
multiple fscrypt master keys and there might be directories without encryption.
UBIFS authentication on the other hand has an all-or-nothing approach in the
sense that it either authenticates everything in the filesystem or nothing.
Because of this and because UBIFS authentication should also be usable without
encryption, it does not share the same master key with fscrypt, but manages
a dedicated authentication key.

The API for providing the authentication key has yet to be defined, but the
key can e.g. be provided by userspace through a keyring similar to the way it
is currently done in fscrypt. It should however be noted that the current
fscrypt approach has shown its flaws and the userspace API will eventually
change [FSCRYPT-POLICY2].

Nevertheless, it will be possible for a user to provide a single passphrase
or key in userspace that covers UBIFS authentication and encryption. This can
be solved by the corresponding userspace tools which derive a second key for
authentication in addition to the derived fscrypt master key used for
encryption.
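
One way a userspace tool could derive two independent keys from a single passphrase is sketched below. This is purely an assumption for illustration: the text does not specify a derivation scheme, and the labels, the salt handling, the iteration count and the function name `derive_keys` are all invented here. The sketch stretches the passphrase with PBKDF2 and then separates the two keys with distinct HMAC labels.

```python
import hashlib
import hmac


def derive_keys(passphrase: bytes, salt: bytes):
    """Return (encryption_key, authentication_key) from one passphrase."""
    # Stretch the passphrase into an intermediate secret.
    secret = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000)
    # Distinct labels yield independent keys from the same secret.
    enc_key = hmac.new(secret, b"ubifs-encryption", hashlib.sha256).digest()
    auth_key = hmac.new(secret, b"ubifs-authentication", hashlib.sha256).digest()
    return enc_key, auth_key
```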

To be able to check if the proper key is available on mount, the UBIFS
superblock node will additionally store a hash of the authentication key. This
approach is similar to the approach proposed for fscrypt encryption policy v2
[FSCRYPT-POLICY2].
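
The key check on mount can be pictured as follows. The function names and the use of a plain SHA-256 over the key are assumptions for this sketch; the text only states that a hash of the authentication key is stored in the superblock node.

```python
import hashlib


def key_fingerprint(auth_key: bytes) -> bytes:
    """Hash of the authentication key, as it would be stored at mkfs time."""
    return hashlib.sha256(auth_key).digest()


def key_matches(auth_key: bytes, fingerprint_in_superblock: bytes) -> bool:
    # A mismatch means the provided key does not belong to this filesystem,
    # so mounting can be refused before any HMAC is verified.
    return key_fingerprint(auth_key) == fingerprint_in_superblock
```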


# Future Extensions

In certain cases where a vendor wants to provide an authenticated filesystem
image to customers, it should be possible to do so without sharing the secret
UBIFS authentication key. Instead, in addition to each HMAC a digital
signature could be stored where the vendor shares the public key alongside the
filesystem image. In case this filesystem has to be modified afterwards,
UBIFS can exchange all digital signatures with HMACs on first mount similar
to the way the IMA/EVM subsystem deals with such situations. The HMAC key
will then have to be provided beforehand in the normal way.


# References

[CRYPTSETUP2] http://www.saout.de/pipermail/dm-crypt/2017-November/005745.html

[DMC-CBC-ATTACK] http://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/

[DM-INTEGRITY] https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.txt

[DM-VERITY] https://www.kernel.org/doc/Documentation/device-mapper/verity.txt

[FSCRYPT-POLICY2] https://www.spinics.net/lists/linux-ext4/msg58710.html

[UBIFS-WP] http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf