=====================================
Filesystem-level encryption (fscrypt)
=====================================

Introduction
============

fscrypt is a library which filesystems can hook into to support
transparent encryption of files and directories.

Note: "fscrypt" in this document refers to the kernel-level portion,
implemented in ``fs/crypto/``, as opposed to the userspace tool
`fscrypt <https://github.com/google/fscrypt>`_.  This document only
covers the kernel-level portion.  For command-line examples of how to
use encryption, see the documentation for the userspace tool `fscrypt
<https://github.com/google/fscrypt>`_.  Also, it is recommended to use
the fscrypt userspace tool, or other existing userspace tools such as
`fscryptctl <https://github.com/google/fscryptctl>`_ or `Android's key
management system
<https://source.android.com/security/encryption/file-based>`_, over
using the kernel's API directly.  Using existing tools reduces the
chance of introducing your own security bugs.  (Nevertheless, for
completeness this documentation covers the kernel's API anyway.)

Unlike dm-crypt, fscrypt operates at the filesystem level rather than
at the block device level.  This allows it to encrypt different files
with different keys and to have unencrypted files on the same
filesystem.  This is useful for multi-user systems where each user's
data-at-rest needs to be cryptographically isolated from the others.
However, except for filenames, fscrypt does not encrypt filesystem
metadata.

Unlike eCryptfs, which is a stacked filesystem, fscrypt is integrated
directly into supported filesystems --- currently ext4, F2FS, and
UBIFS.  This allows encrypted files to be read and written without
caching both the decrypted and encrypted pages in the pagecache,
thereby nearly halving the memory used and bringing it in line with
unencrypted files.  Similarly, half as many dentries and inodes are
needed.  eCryptfs also limits encrypted filenames to 143 bytes,
causing application compatibility issues; fscrypt allows the full 255
bytes (NAME_MAX).  Finally, unlike eCryptfs, the fscrypt API can be
used by unprivileged users, with no need to mount anything.

fscrypt does not support encrypting files in-place.  Instead, it
supports marking an empty directory as encrypted.  Then, after
userspace provides the key, all regular files, directories, and
symbolic links created in that directory tree are transparently
encrypted.

Threat model
============

Offline attacks
---------------

Provided that userspace chooses a strong encryption key, fscrypt
protects the confidentiality of file contents and filenames in the
event of a single point-in-time permanent offline compromise of the
block device content.  fscrypt does not protect the confidentiality of
non-filename metadata, e.g. file sizes, file permissions, file
timestamps, and extended attributes.  Also, the existence and location
of holes (unallocated blocks which logically contain all zeroes) in
files is not protected.

fscrypt is not guaranteed to protect confidentiality or authenticity
if an attacker is able to manipulate the filesystem offline prior to
an authorized user later accessing the filesystem.

Online attacks
--------------

fscrypt (and storage encryption in general) can only provide limited
protection, if any at all, against online attacks.  In detail:

fscrypt is only resistant to side-channel attacks, such as timing or
electromagnetic attacks, to the extent that the underlying Linux
Cryptographic API algorithms are.  If a vulnerable algorithm is used,
such as a table-based implementation of AES, it may be possible for an
attacker to mount a side-channel attack against the online system.
Side-channel attacks may also be mounted against applications
consuming decrypted data.

After an encryption key has been provided, fscrypt is not designed to
hide the plaintext file contents or filenames from other users on the
same system, regardless of the visibility of the keyring key.
Instead, existing access control mechanisms such as file mode bits,
POSIX ACLs, LSMs, or mount namespaces should be used for this purpose.
Also note that as long as the encryption keys are *anywhere* in
memory, an online attacker can necessarily compromise them by mounting
a physical attack or by exploiting any kernel security vulnerability
which provides an arbitrary memory read primitive.

While it is ostensibly possible to "evict" keys from the system,
recently accessed encrypted files will remain accessible at least
until the filesystem is unmounted or the VFS caches are dropped, e.g.
using ``echo 2 > /proc/sys/vm/drop_caches``.  Even after that, if the
RAM is compromised before being powered off, it will likely still be
possible to recover portions of the plaintext file contents, if not
some of the encryption keys as well.  (Since Linux v4.12, all
in-kernel keys related to fscrypt are sanitized before being freed.
However, userspace would need to do its part as well.)

Currently, fscrypt does not prevent a user from maliciously providing
an incorrect key for another user's existing encrypted files.  A
protection against this is planned.

Key hierarchy
=============

Master Keys
-----------

Each encrypted directory tree is protected by a *master key*.  Master
keys can be up to 64 bytes long, and must be at least as long as the
greater of the key length needed by the contents and filenames
encryption modes being used.  For example, if AES-256-XTS is used for
contents encryption, the master key must be 64 bytes (512 bits).  Note
that the XTS mode is defined to require a key twice as long as that
required by the underlying block cipher.

To "unlock" an encrypted directory tree, userspace must provide the
appropriate master key.  There can be any number of master keys, each
of which protects any number of directory trees on any number of
filesystems.

Userspace should generate master keys either using a cryptographically
secure random number generator, or by using a KDF (Key Derivation
Function).  Note that whenever a KDF is used to "stretch" a
lower-entropy secret such as a passphrase, it is critical that a KDF
designed for this purpose be used, such as scrypt, PBKDF2, or Argon2.

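As an illustration of the first option, the following is a minimal
sketch of generating a 64-byte master key from the kernel's random
number generator.  It assumes a C library that provides ``getrandom()``
in ``<sys/random.h>``; the helper name ``generate_master_key`` and the
``MASTER_KEY_SIZE`` constant are hypothetical and not part of any
kernel or library API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

#define MASTER_KEY_SIZE 64  /* hypothetical name; 64 bytes suits AES-256-XTS */

/*
 * Fill key[0..len-1] with bytes from the kernel CSPRNG.
 * Returns 0 on success, or -1 on failure with errno set by getrandom().
 */
static int generate_master_key(unsigned char *key, size_t len)
{
    size_t filled = 0;

    assert(key != NULL && len <= MASTER_KEY_SIZE);
    while (filled < len) {
        ssize_t n = getrandom(key + filled, len - filled, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal; retry */
            return -1;
        }
        filled += (size_t)n;
    }
    return 0;
}
```

A caller would typically wipe the buffer once the key has been handed
to the kernel, since the key otherwise lingers in process memory.
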
Per-file keys
-------------

Master keys are not used to encrypt file contents or names directly.
Instead, a unique key is derived for each encrypted file, including
each regular file, directory, and symbolic link.  This has several
advantages:

- In cryptosystems, the same key material should never be used for
  different purposes.  Using the master key as both an XTS key for
  contents encryption and as a CTS-CBC key for filenames encryption
  would violate this rule.
- Per-file keys simplify the choice of IVs (Initialization Vectors)
  for contents encryption.  Without per-file keys, to ensure IV
  uniqueness both the inode and logical block number would need to be
  encoded in the IVs.  This would make it impossible to renumber
  inodes, which e.g. ``resize2fs`` can do when resizing an ext4
  filesystem.  With per-file keys, it is sufficient to encode just the
  logical block number in the IVs.
- Per-file keys strengthen the encryption of filenames, where IVs are
  reused out of necessity.  With a unique key per directory, IV reuse
  is limited to within a single directory.
- Per-file keys allow individual files to be securely erased simply by
  securely erasing their keys.  (Not yet implemented.)

A KDF (Key Derivation Function) is used to derive per-file keys from
the master key.  This is done instead of wrapping a randomly-generated
key for each file because it reduces the size of the encryption xattr,
which for some filesystems makes the xattr more likely to fit in-line
in the filesystem's inode table.  With a KDF, only a 16-byte nonce is
required --- long enough to make key reuse extremely unlikely.  A
wrapped key, on the other hand, would need to be up to 64 bytes ---
the length of an AES-256-XTS key.  Furthermore, currently there is no
requirement to support unlocking a file with multiple alternative
master keys or to support rotating master keys.  Instead, the master
keys may be wrapped in userspace, e.g. as done by the `fscrypt
<https://github.com/google/fscrypt>`_ tool.

The current KDF encrypts the master key using the 16-byte nonce as an
AES-128-ECB key.  The output is used as the derived key.  If the
output is longer than needed, then it is truncated to the needed
length.  Truncation is the norm for directories and symlinks, since
those use the CTS-CBC encryption mode which requires a key half as
long as that required by the XTS encryption mode.

Note: this KDF meets the primary security requirement, which is to
produce unique derived keys that preserve the entropy of the master
key, assuming that the master key is already a good pseudorandom key.
However, it is nonstandard and has some problems such as being
reversible, so it is generally considered to be a mistake!  It may be
replaced with HKDF or another more standard KDF in the future.

Encryption modes and usage
==========================

fscrypt allows one encryption mode to be specified for file contents
and one encryption mode to be specified for filenames.  Different
directory trees are permitted to use different encryption modes.
Currently, the following pairs of encryption modes are supported:

- AES-256-XTS for contents and AES-256-CTS-CBC for filenames
- AES-128-CBC for contents and AES-128-CTS-CBC for filenames

It is strongly recommended to use AES-256-XTS for contents encryption.
AES-128-CBC was added only for low-powered embedded devices with
crypto accelerators such as CAAM or CESA that do not support XTS.

New encryption modes can be added relatively easily, without changes
to individual filesystems.  However, authenticated encryption (AE)
modes are not currently supported because of the difficulty of dealing
with ciphertext expansion.

For file contents, each filesystem block is encrypted independently.
Currently, only the case where the filesystem block size is equal to
the system's page size (usually 4096 bytes) is supported.  With the
XTS mode of operation (recommended), the logical block number within
the file is used as the IV.  With the CBC mode of operation (not
recommended), ESSIV is used; specifically, the IV for CBC is the
logical block number encrypted with AES-256, where the AES-256 key is
the SHA-256 hash of the inode's data encryption key.

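To make the XTS case concrete, the sketch below turns a logical block
number into a 16-byte IV.  The byte layout is an assumption that this
document does not spell out: the block number is taken to be stored as
a little-endian 64-bit integer followed by zero padding.  The helper
name ``lblk_to_iv`` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IV_SIZE 16  /* AES block size in bytes */

/*
 * Encode a logical block number as an IV.
 * Assumed layout: 8 bytes little-endian block number, then 8 zero bytes.
 */
static void lblk_to_iv(uint64_t lblk_num, uint8_t iv[IV_SIZE])
{
    assert(iv != NULL);
    memset(iv, 0, IV_SIZE);
    for (int i = 0; i < 8; i++)
        iv[i] = (uint8_t)(lblk_num >> (8 * i));
}
```

Because every file has its own key, two files may use the same IV for
the same block number without weakening the encryption.
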
For filenames, the full filename is encrypted at once.  Because of the
requirements to retain support for efficient directory lookups and
filenames of up to 255 bytes, a constant initialization vector (IV) is
used.  However, each encrypted directory uses a unique key, which
limits IV reuse to within a single directory.  Note that IV reuse in
the context of CTS-CBC encryption means that when the original
filenames share a common prefix at least as long as the cipher block
size (16 bytes for AES), the corresponding encrypted filenames will
also share a common prefix.  This is undesirable; it may be fixed in
the future by switching to an encryption mode that is a strong
pseudorandom permutation on arbitrary-length messages, e.g. the HEH
(Hash-Encrypt-Hash) mode.

Since filenames are encrypted with the CTS-CBC mode of operation, the
plaintext and ciphertext filenames need not be multiples of the AES
block size, i.e. 16 bytes.  However, the minimum size that can be
encrypted is 16 bytes, so shorter filenames are NUL-padded to 16 bytes
before being encrypted.  In addition, to reduce leakage of filename
lengths via their ciphertexts, all filenames are NUL-padded to the
next 4, 8, 16, or 32-byte boundary (configurable).  32 is recommended
since this provides the best confidentiality, at the cost of making
directory entries consume slightly more space.  Note that since NUL
(``\0``) is not otherwise a valid character in filenames, the padding
will never produce duplicate plaintexts.

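The two padding rules can be summarized with a small length
calculation.  The following is a sketch only; the helper name
``padded_name_len`` and the ``FNAME_MAX`` constant are hypothetical,
and the final cap at 255 bytes is an assumption made so that the
result never exceeds the filename limit mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define AES_BLOCK_SIZE 16
#define FNAME_MAX 255  /* hypothetical name for the NAME_MAX limit */

/*
 * Length of the NUL-padded plaintext (and hence of the ciphertext)
 * for a filename of name_len bytes with padding boundary pad.
 */
static size_t padded_name_len(size_t name_len, size_t pad)
{
    size_t len;

    assert(pad == 4 || pad == 8 || pad == 16 || pad == 32);
    assert(name_len >= 1 && name_len <= FNAME_MAX);

    /* Rule 1: at least one full AES block. */
    len = name_len < AES_BLOCK_SIZE ? AES_BLOCK_SIZE : name_len;
    /* Rule 2: round up to the next multiple of the padding boundary. */
    len = (len + pad - 1) / pad * pad;
    /* Assumed cap, so the padded name still fits in a directory entry. */
    return len > FNAME_MAX ? FNAME_MAX : len;
}
```

For example, with 32-byte padding every name of 1 to 32 bytes yields a
32-byte ciphertext, so an observer cannot tell those lengths apart.
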
Symbolic link targets are considered a type of filename and are
encrypted in the same way as filenames in directory entries.  Each
symlink also uses a unique key; hence, the hardcoded IV is not a
problem for symlinks.

User API
========

Setting an encryption policy
----------------------------

The FS_IOC_SET_ENCRYPTION_POLICY ioctl sets an encryption policy on an
empty directory or verifies that a directory or regular file already
has the specified encryption policy.  It takes in a pointer to a
:c:type:`struct fscrypt_policy`, defined as follows::

    #define FS_KEY_DESCRIPTOR_SIZE  8

    struct fscrypt_policy {
            __u8 version;
            __u8 contents_encryption_mode;
            __u8 filenames_encryption_mode;
            __u8 flags;
            __u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
    };

This structure must be initialized as follows:

- ``version`` must be 0.

- ``contents_encryption_mode`` and ``filenames_encryption_mode`` must
  be set to constants from ``<linux/fs.h>`` which identify the
  encryption modes to use.  If unsure, use
  FS_ENCRYPTION_MODE_AES_256_XTS (1) for ``contents_encryption_mode``
  and FS_ENCRYPTION_MODE_AES_256_CTS (4) for
  ``filenames_encryption_mode``.

- ``flags`` must be set to a value from ``<linux/fs.h>`` which
  identifies the amount of NUL-padding to use when encrypting
  filenames.  If unsure, use FS_POLICY_FLAGS_PAD_32 (0x3).

- ``master_key_descriptor`` specifies how to find the master key in
  the keyring; see `Adding keys`_.  It is up to userspace to choose a
  unique ``master_key_descriptor`` for each master key.  The e4crypt
  and fscrypt tools use the first 8 bytes of
  ``SHA-512(SHA-512(master_key))``, but this particular scheme is not
  required.  Also, the master key need not be in the keyring yet when
  FS_IOC_SET_ENCRYPTION_POLICY is executed.  However, it must be added
  before any files can be created in the encrypted directory.

If the file is not yet encrypted, then FS_IOC_SET_ENCRYPTION_POLICY
verifies that the file is an empty directory.  If so, the specified
encryption policy is assigned to the directory, turning it into an
encrypted directory.  After that, and after providing the
corresponding master key as described in `Adding keys`_, all regular
files, directories (recursively), and symlinks created in the
directory will be encrypted, inheriting the same encryption policy.
The filenames in the directory's entries will be encrypted as well.

Alternatively, if the file is already encrypted, then
FS_IOC_SET_ENCRYPTION_POLICY validates that the specified encryption
policy exactly matches the actual one.  If they match, then the ioctl
returns 0.  Otherwise, it fails with EEXIST.  This works on both
regular files and directories, including nonempty directories.

Note that the ext4 filesystem does not allow the root directory to be
encrypted, even if it is empty.  Users who want to encrypt an entire
filesystem with one key should consider using dm-crypt instead.

FS_IOC_SET_ENCRYPTION_POLICY can fail with the following errors:

- ``EACCES``: the file is not owned by the process's uid, nor does the
  process have the CAP_FOWNER capability in a namespace with the file
  owner's uid mapped
- ``EEXIST``: the file is already encrypted with an encryption policy
  different from the one specified
- ``EINVAL``: an invalid encryption policy was specified (invalid
  version, mode(s), or flags)
- ``ENOTDIR``: the file is unencrypted and is a regular file, not a
  directory
- ``ENOTEMPTY``: the file is unencrypted and is a nonempty directory
- ``ENOTTY``: this type of filesystem does not implement encryption
- ``EOPNOTSUPP``: the kernel was not configured with encryption
  support for this filesystem, or the filesystem superblock has not
  had encryption enabled on it.  (For example, to use encryption on an
  ext4 filesystem, CONFIG_EXT4_ENCRYPTION must be enabled in the
  kernel config, and the superblock must have had the "encrypt"
  feature flag enabled using ``tune2fs -O encrypt`` or ``mkfs.ext4 -O
  encrypt``.)
- ``EPERM``: this directory may not be encrypted, e.g. because it is
  the root directory of an ext4 filesystem
- ``EROFS``: the filesystem is readonly

332----------------------------
333
334The FS_IOC_GET_ENCRYPTION_POLICY ioctl retrieves the :c:type:`struct
335fscrypt_policy`, if any, for a directory or regular file. See above
336for the struct definition. No additional permissions are required
337beyond the ability to open the file.
338
339FS_IOC_GET_ENCRYPTION_POLICY can fail with the following errors:
340
341- ``EINVAL``: the file is encrypted, but it uses an unrecognized
342 encryption context format
343- ``ENODATA``: the file is not encrypted
344- ``ENOTTY``: this type of filesystem does not implement encryption
345- ``EOPNOTSUPP``: the kernel was not configured with encryption
346 support for this filesystem
347
348Note: if you only need to know whether a file is encrypted or not, on
349most filesystems it is also possible to use the FS_IOC_GETFLAGS ioctl
350and check for FS_ENCRYPT_FL, or to use the statx() system call and
351check for STATX_ATTR_ENCRYPTED in stx_attributes.
352
Getting the per-filesystem salt
-------------------------------

Some filesystems, such as ext4 and F2FS, also support the deprecated
ioctl FS_IOC_GET_ENCRYPTION_PWSALT.  This ioctl retrieves a randomly
generated 16-byte value stored in the filesystem superblock.  This
value is intended to be used as a salt when deriving an encryption key
from a passphrase or other low-entropy user credential.

FS_IOC_GET_ENCRYPTION_PWSALT is deprecated.  Instead, prefer to
generate and manage any needed salt(s) in userspace.

Adding keys
-----------

To provide a master key, userspace must add it to an appropriate
keyring using the add_key() system call (see:
``Documentation/security/keys/core.rst``).  The key type must be
"logon"; keys of this type are kept in kernel memory and cannot be
read back by userspace.  The key description must be "fscrypt:"
followed by the 16-character lower case hex representation of the
``master_key_descriptor`` that was set in the encryption policy.  The
key payload must conform to the following structure::

    #define FS_MAX_KEY_SIZE 64

    struct fscrypt_key {
            u32 mode;
            u8 raw[FS_MAX_KEY_SIZE];
            u32 size;
    };

``mode`` is ignored; just set it to 0.  The actual key is provided in
``raw`` with ``size`` indicating its size in bytes.  That is, the
bytes ``raw[0..size-1]`` (inclusive) are the actual key.

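The following sketch shows how the description and payload described
above could be assembled and passed to add_key().  It invokes the
system call through ``syscall()`` rather than a keyutils wrapper and
targets the session keyring purely as an example.  The payload struct
is a local mirror of :c:type:`struct fscrypt_key`; the names
``key_payload``, ``format_key_description``, and ``add_master_key`` are
hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* for syscall() and explicit_bzero() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/keyctl.h>

#define KEY_DESCRIPTOR_SIZE 8
#define MAX_KEY_SIZE 64
#define KEY_DESC_LEN 25  /* "fscrypt:" + 16 hex digits + NUL */

/* Local mirror of struct fscrypt_key as shown above. */
struct key_payload {
    uint32_t mode;
    uint8_t raw[MAX_KEY_SIZE];
    uint32_t size;
};

/* Build "fscrypt:" followed by the descriptor in lower case hex. */
static void format_key_description(const uint8_t desc[KEY_DESCRIPTOR_SIZE],
                                   char out[KEY_DESC_LEN])
{
    int n = snprintf(out, KEY_DESC_LEN, "fscrypt:");

    for (int i = 0; i < KEY_DESCRIPTOR_SIZE; i++)
        n += snprintf(out + n, (size_t)(KEY_DESC_LEN - n), "%02x", desc[i]);
}

/* Returns the new key's serial number, or -1 with errno set. */
static long add_master_key(const uint8_t desc[KEY_DESCRIPTOR_SIZE],
                           const uint8_t *key, size_t key_len)
{
    struct key_payload payload;
    char description[KEY_DESC_LEN];
    long id;

    assert(key != NULL && key_len <= MAX_KEY_SIZE);

    memset(&payload, 0, sizeof(payload));  /* mode = 0, as required */
    memcpy(payload.raw, key, key_len);
    payload.size = (uint32_t)key_len;
    format_key_description(desc, description);

    id = syscall(SYS_add_key, "logon", description, &payload,
                 sizeof(payload), KEY_SPEC_SESSION_KEYRING);

    /* Wipe the userspace copy of the key. */
    explicit_bzero(&payload, sizeof(payload));
    return id;
}
```

For the descriptor bytes ``01 23 45 67 89 ab cd ef`` the description
passed to the kernel is ``fscrypt:0123456789abcdef``.
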
The key description prefix "fscrypt:" may alternatively be replaced
with a filesystem-specific prefix such as "ext4:".  However, the
filesystem-specific prefixes are deprecated and should not be used in
new programs.

There are several different types of keyrings in which encryption keys
may be placed, such as a session keyring, a user session keyring, or a
user keyring.  Each key must be placed in a keyring that is "attached"
to all processes that might need to access files encrypted with it, in
the sense that request_key() will find the key.  Generally, if only
processes belonging to a specific user need to access a given
encrypted directory and no session keyring has been installed, then
that directory's key should be placed in that user's user session
keyring or user keyring.  Otherwise, a session keyring should be
installed if needed, and the key should be linked into that session
keyring, or in a keyring linked into that session keyring.

Note: introducing the complex visibility semantics of keyrings here
was arguably a mistake --- especially given that by design, after any
process successfully opens an encrypted file (thereby setting up the
per-file key), possessing the keyring key is not actually required for
any process to read/write the file until its in-memory inode is
evicted.  In the future there probably should be a way to provide keys
directly to the filesystem instead, which would make the intended
semantics clearer.

Access semantics
================

With the key
------------

With the encryption key, encrypted regular files, directories, and
symlinks behave very similarly to their unencrypted counterparts ---
after all, the encryption is intended to be transparent.  However,
astute users may notice some differences in behavior:

- Unencrypted files, or files encrypted with a different encryption
  policy (i.e. different key, modes, or flags), cannot be renamed or
  linked into an encrypted directory; see `Encryption policy
  enforcement`_.  Attempts to do so will fail with EPERM.  However,
  encrypted files can be renamed within an encrypted directory, or
  into an unencrypted directory.

- Direct I/O is not supported on encrypted files.  Attempts to use
  direct I/O on such files will fall back to buffered I/O.

- The fallocate operations FALLOC_FL_COLLAPSE_RANGE,
  FALLOC_FL_INSERT_RANGE, and FALLOC_FL_ZERO_RANGE are not supported
  on encrypted files and will fail with EOPNOTSUPP.

- Online defragmentation of encrypted files is not supported.  The
  EXT4_IOC_MOVE_EXT and F2FS_IOC_MOVE_RANGE ioctls will fail with
  EOPNOTSUPP.

- The ext4 filesystem does not support data journaling with encrypted
  regular files.  It will fall back to ordered data mode instead.

- DAX (Direct Access) is not supported on encrypted files.

- The st_size of an encrypted symlink will not necessarily give the
  length of the symlink target as required by POSIX.  It will actually
  give the length of the ciphertext, which may be slightly longer than
  the plaintext due to the NUL-padding.

Note that mmap *is* supported.  This is possible because the pagecache
for an encrypted file contains the plaintext, not the ciphertext.

Without the key
---------------

Some filesystem operations may be performed on encrypted regular
files, directories, and symlinks even before their encryption key has
been provided:

- File metadata may be read, e.g. using stat().

- Directories may be listed, in which case the filenames will be
  listed in an encoded form derived from their ciphertext.  The
  current encoding algorithm is described in `Filename hashing and
  encoding`_.  The algorithm is subject to change, but it is
  guaranteed that the presented filenames will be no longer than
  NAME_MAX bytes, will not contain the ``/`` or ``\0`` characters, and
  will uniquely identify directory entries.

  The ``.`` and ``..`` directory entries are special.  They are always
  present and are not encrypted or encoded.

- Files may be deleted.  That is, nondirectory files may be deleted
  with unlink() as usual, and empty directories may be deleted with
  rmdir() as usual.  Therefore, ``rm`` and ``rm -r`` will work as
  expected.

- Symlink targets may be read and followed, but they will be presented
  in encrypted form, similar to filenames in directories.  Hence, they
  are unlikely to point to anywhere useful.

Without the key, regular files cannot be opened or truncated.
Attempts to do so will fail with ENOKEY.  This implies that any
regular file operations that require a file descriptor, such as
read(), write(), mmap(), fallocate(), and ioctl(), are also forbidden.

Also without the key, files of any type (including directories) cannot
be created or linked into an encrypted directory, nor can a name in an
encrypted directory be the source or target of a rename, nor can an
O_TMPFILE temporary file be created in an encrypted directory.  All
such operations will fail with ENOKEY.

It is not currently possible to back up and restore encrypted files
without the encryption key.  This would require special APIs which
have not yet been implemented.

Encryption policy enforcement
=============================

After an encryption policy has been set on a directory, all regular
files, directories, and symbolic links created in that directory
(recursively) will inherit that encryption policy.  Special files ---
that is, named pipes, device nodes, and UNIX domain sockets --- will
not be encrypted.

Except for those special files, it is forbidden to have unencrypted
files, or files encrypted with a different encryption policy, in an
encrypted directory tree.  Attempts to link or rename such a file into
an encrypted directory will fail with EPERM.  This is also enforced
during ->lookup() to provide limited protection against offline
attacks that try to disable or downgrade encryption in known locations
where applications may later write sensitive data.  It is recommended
that systems implementing a form of "verified boot" take advantage of
this by validating all top-level encryption policies prior to access.

Implementation details
======================

Encryption context
------------------

An encryption policy is represented on-disk by a :c:type:`struct
fscrypt_context`.  It is up to individual filesystems to decide where
to store it, but normally it would be stored in a hidden extended
attribute.  It should *not* be exposed by the xattr-related system
calls such as getxattr() and setxattr() because of the special
semantics of the encryption xattr.  (In particular, there would be
much confusion if an encryption policy were to be added to or removed
from anything other than an empty directory.)  The struct is defined
as follows::

    #define FS_KEY_DESCRIPTOR_SIZE  8
    #define FS_KEY_DERIVATION_NONCE_SIZE 16

    struct fscrypt_context {
            u8 format;
            u8 contents_encryption_mode;
            u8 filenames_encryption_mode;
            u8 flags;
            u8 master_key_descriptor[FS_KEY_DESCRIPTOR_SIZE];
            u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
    };

Note that :c:type:`struct fscrypt_context` contains the same
information as :c:type:`struct fscrypt_policy` (see `Setting an
encryption policy`_), except that :c:type:`struct fscrypt_context`
also contains a nonce.  The nonce is randomly generated by the kernel
and is used to derive the inode's encryption key as described in
`Per-file keys`_.

Data path changes
-----------------

For the read path (->readpage()) of regular files, filesystems can
read the ciphertext into the page cache and decrypt it in-place.  The
page lock must be held until decryption has finished, to prevent the
page from becoming visible to userspace prematurely.

For the write path (->writepage()) of regular files, filesystems
cannot encrypt data in-place in the page cache, since the cached
plaintext must be preserved.  Instead, filesystems must encrypt into a
temporary buffer or "bounce page", then write out the temporary
buffer.  Some filesystems, such as UBIFS, already use temporary
buffers regardless of encryption.  Other filesystems, such as ext4 and
F2FS, have to allocate bounce pages specially for encryption.

Filename hashing and encoding
-----------------------------

Modern filesystems accelerate directory lookups by using indexed
directories.  An indexed directory is organized as a tree keyed by
filename hashes.  When a ->lookup() is requested, the filesystem
normally hashes the filename being looked up so that it can quickly
find the corresponding directory entry, if any.

With encryption, lookups must be supported and efficient both with and
without the encryption key.  Clearly, it would not work to hash the
plaintext filenames, since the plaintext filenames are unavailable
without the key.  (Hashing the plaintext filenames would also make it
impossible for the filesystem's fsck tool to optimize encrypted
directories.)  Instead, filesystems hash the ciphertext filenames,
i.e. the bytes actually stored on-disk in the directory entries.  When
asked to do a ->lookup() with the key, the filesystem just encrypts
the user-supplied name to get the ciphertext.

Lookups without the key are more complicated.  The raw ciphertext may
contain the ``\0`` and ``/`` characters, which are illegal in
filenames.  Therefore, readdir() must base64-encode the ciphertext for
presentation.  For most filenames, this works fine; on ->lookup(), the
filesystem just base64-decodes the user-supplied name to get back to
the raw ciphertext.

However, for very long filenames, base64 encoding would cause the
filename length to exceed NAME_MAX.  To prevent this, readdir()
actually presents long filenames in an abbreviated form which encodes
a strong "hash" of the ciphertext filename, along with the optional
filesystem-specific hash(es) needed for directory lookups.  This
allows the filesystem to still, with a high degree of confidence, map
the filename given in ->lookup() back to a particular directory entry
that was previously listed by readdir().  See :c:type:`struct
fscrypt_digested_name` in the source for more details.

Note that the precise way that filenames are presented to userspace
without the key is subject to change in the future.  It is only meant
as a way to temporarily present valid filenames so that commands like
``rm -r`` work as expected on encrypted directories.
606
607Note that the precise way that filenames are presented to userspace
608without the key is subject to change in the future. It is only meant
609as a way to temporarily present valid filenames so that commands like
610``rm -r`` work as expected on encrypted directories.