# Copyright (C) 2005-2017 Junjiro R. Okajima
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

Lookup in a Branch
----------------------------------------------------------------------
Since aufs has the character of a sub-VFS (see Introduction), it performs
lookups on its branches as VFS does. This may be heavy work, but almost
all lookup operations in aufs are the simplest case, i.e. looking up only
an entry directly connected to its parent. Digging down the directory
hierarchy is unnecessary. VFS has a function lookup_one_len() for that
use, and aufs calls it.

When a branch is a remote filesystem, aufs basically relies upon the
branch's ->d_revalidate(), and aufs also forces the hardest revalidate
tests for such branches.
For d_revalidate, aufs implements three levels of revalidate tests. See
"Revalidate Dentry and UDBA" for details.
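
The single-component lookup described at the start of this section can be
modelled in userspace. The following is only a sketch under assumptions:
struct node and lookup_one() are hypothetical names made up for this
illustration, and this is not the kernel's lookup_one_len(). It only shows
the idea of resolving one name directly under a known parent, without
walking a path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory directory entry (not a kernel dentry). */
struct node {
	const char *name;       /* entry name, no '/' inside */
	struct node *children;  /* array of direct children */
	size_t nchildren;
};

/*
 * Look up exactly one name directly under 'parent'.
 * Names containing '/' are rejected, so no hierarchy is ever walked.
 * Returns the child, or NULL when nothing matches.
 */
static struct node *lookup_one(struct node *parent, const char *name,
			       size_t len)
{
	assert(parent != NULL && name != NULL);

	if (len == 0 || memchr(name, '/', len))
		return NULL;

	for (size_t i = 0; i < parent->nchildren; i++) {
		struct node *c = &parent->children[i];

		if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
			return c;
	}
	return NULL;
}
```

Because the parent is already known, the cost is one directory search per
branch instead of a full path walk.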


Test Only the Highest One for the Directory Permission (dirperm1 option)
----------------------------------------------------------------------
Let's try a case study.
- aufs has two branches, upper readwrite and lower readonly.
  /au = /rw + /ro
- "dirA" exists under /ro, but not under /rw, and its mode is 0700.
- a user invokes "chmod a+rx /au/dirA"
- the internal copy-up is activated, "/rw/dirA" is created, and its
  permission bits are set to world readable.
- does "/au/dirA" then become world readable?

In this case, /ro/dirA is still 0700 since it exists in a readonly
branch, or the branch may be a natively readonly filesystem. If aufs
respects the lower branch, it should not respond to readdir requests
from other users. But the user allowed it by chmod. Should aufs really
refuse to show the entries under /ro/dirA?

To be honest, I don't have a good solution for this case. So aufs
implements the 'dirperm1' and 'nodirperm1' mount options, and leaves the
choice to users.
When dirperm1 is specified, aufs checks only the highest directory for
the directory permission, and shows the entries. Otherwise, as usual,
aufs checks every dir existing on all branches and rejects the request.

As a side effect, the dirperm1 option improves the performance of aufs
because the number of permission checks is reduced when there are many
branches.
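
The difference between dirperm1 and nodirperm1 in the case study above can
be sketched as a small userspace model. This is an illustration under
assumptions, not the aufs implementation: struct branch_dir,
other_can_read() and may_readdir() are hypothetical names, and only the
"other" permission bits are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one branch's copy of a directory. */
struct branch_dir {
	bool exists;        /* does the dir exist on this branch? */
	unsigned int mode;  /* permission bits, e.g. 0700 */
};

/* True when "other" users have both read and search permission. */
static bool other_can_read(unsigned int mode)
{
	return (mode & 0005) == 0005;
}

/*
 * dirs[0] is the highest branch. Returns true when a user who is neither
 * the owner nor in the group may list the directory.
 * With dirperm1 only the highest existing dir is tested; without it
 * every existing dir on every branch must allow the access.
 */
static bool may_readdir(const struct branch_dir *dirs, size_t n,
			bool dirperm1)
{
	bool found = false;

	assert(dirs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!dirs[i].exists)
			continue;
		found = true;
		if (!other_can_read(dirs[i].mode))
			return false;
		if (dirperm1)
			return true;  /* stop at the highest existing dir */
	}
	return found;
}
```

With /rw/dirA at 0755 and /ro/dirA at 0700, may_readdir() allows the
request under dirperm1 and rejects it otherwise. The early return also
shows why dirperm1 needs fewer checks when there are many branches.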


Revalidate Dentry and UDBA (User's Direct Branch Access)
----------------------------------------------------------------------
Generally VFS helpers revalidate a dentry as a part of lookup.
0. dig down the directory hierarchy.
1. lock the parent dir by its i_mutex.
2. look up the final (child) entry.
3. revalidate it.
4. call the actual operation (create, unlink, etc.)
5. unlock the parent dir

If the filesystem implements its ->d_revalidate() (step 3), then it is
called. Aufs does implement it, and checks that the dentry on a branch is
still valid.
But that is not enough, because aufs has to release the lock for the
parent dir on a branch at the end of ->lookup() (step 2) and
->d_revalidate() (step 3) while the i_mutex of the aufs dir is still
held by VFS.
If the file on a branch is changed directly, e.g. bypassing aufs, after
aufs released the lock, then the subsequent operation may cause
an unpleasant result.

This situation is a result of the VFS architecture, in which ->lookup()
and ->d_revalidate() are separated. But I am not saying it is wrong. It
is a good design from VFS's point of view. It is just not suitable for
the sub-VFS character of aufs.

Aufs supports such cases with three levels of revalidation, which are
selectable by the user.
1. Simple Revalidate
   In addition to the native flow in VFS, confirm the child-parent
   relationship on the branch just after locking the parent dir on the
   branch in the "actual operation" (step 4). When this validation
   fails, aufs returns EBUSY. ->d_revalidate() (step 3) in aufs still
   checks the validity of the dentry on branches.
2. Monitor Changes Internally by Inotify/Fsnotify
   In addition to the above, in the "actual operation" (step 4) aufs
   looks up the dentry on the branch again, and returns EBUSY if it
   finds a different dentry.
   Additionally, aufs sets an inotify/fsnotify watch on every dir on
   branches while it is in cache. When an event is notified, aufs
   registers a function to the kernel 'events' thread by
   schedule_work(). The function sets a special status on the cached
   aufs dentry and inode private data. If they are not cached, then aufs
   has nothing to do. When the same file is accessed through aufs
   (step 0-3) later, aufs will detect the status and refresh all
   necessary data.
   In this mode, aufs has to ignore the events which are fired by aufs
   itself.
3. No Extra Validation
   This is the simplest mode: it doesn't add any revalidation test, and
   it skips the revalidation in step 4. It is useful and improves aufs
   performance when the system surely hides the aufs branches from
   users, by over-mounting something (or by another method).
113 by over-mounting something (or another method).