Commit | Line | Data |
---|---|---|
1da177e4 LT |
1 | Mandatory File Locking For The Linux Operating System |
2 | ||
3 | Andy Walker <andy@lysaker.kvaerner.no> | |
4 | ||
5 | 15 April 1996 | |
6 | ||
7 | ||
8 | 1. What is mandatory locking? | |
9 | ------------------------------ | |
10 | ||
11 | Mandatory locking is kernel-enforced file locking, as opposed to the more usual | |
12 | cooperative file locking used to guarantee sequential access to files among | |
13 | processes. File locks are applied using the flock() and fcntl() system calls | |
14 | (and the lockf() library routine which is a wrapper around fcntl().) It is | |
15 | normally a process' responsibility to check for locks on a file it wishes to | |
16 | update, before applying its own lock, updating the file and unlocking it again. | |
17 | The most commonly used example of this (and in the case of sendmail, the most | |
18 | troublesome) is access to a user's mailbox. The mail user agent and the mail | |
19 | transfer agent must guard against updating the mailbox at the same time, and | |
20 | prevent reading the mailbox while it is being updated. | |
21 | ||
22 | In a perfect world all processes would use and honour a cooperative, or | |
23 | "advisory" locking scheme. However, the world isn't perfect, and there's | |
24 | a lot of poorly written code out there. | |
25 | ||
26 | In trying to address this problem, the designers of System V UNIX came up | |
27 | with a "mandatory" locking scheme, whereby the operating system kernel would | |
28 | block attempts by a process to write to a file that another process holds a | |
29 | "read" -or- "shared" lock on, and block attempts to both read and write to a | |
30 | file that a process holds a "write" -or- "exclusive" lock on. | |
31 | ||
32 | The System V mandatory locking scheme was intended to have as little impact as | |
33 | possible on existing user code. The scheme is based on marking individual files | |
34 | as candidates for mandatory locking, and using the existing fcntl()/lockf() | |
35 | interface for applying locks just as if they were normal, advisory locks. | |
36 | ||
37 | Note 1: In saying "file" in the paragraphs above I am actually not telling | |
38 | the whole truth. System V locking is based on fcntl(). The granularity of | |
39 | fcntl() is such that it allows the locking of byte ranges in files, in addition | |
40 | to entire files, so the mandatory locking rules also have byte level | |
41 | granularity. | |
42 | ||
43 | Note 2: POSIX.1 does not specify any scheme for mandatory locking, despite | |
44 | borrowing the fcntl() locking scheme from System V. The mandatory locking | |
45 | scheme is defined by the System V Interface Definition (SVID) Version 3. | |
46 | ||
47 | 2. Marking a file for mandatory locking | |
48 | --------------------------------------- | |
49 | ||
50 | A file is marked as a candidate for mandatory locking by setting the group-id | |
51 | bit in its file mode but removing the group-execute bit. This is an otherwise | |
52 | meaningless combination, and was chosen by the System V implementors so as not | |
53 | to break existing user programs. | |
54 | ||
55 | Note that the group-id bit is usually automatically cleared by the kernel when | |
56 | a setgid file is written to. This is a security measure. The kernel has been | |
57 | modified to recognize the special case of a mandatory lock candidate and to | |
58 | refrain from clearing this bit. Similarly the kernel has been modified not | |
59 | to run mandatory lock candidates with setgid privileges. | |
60 | ||
61 | 3. Available implementations | |
62 | ---------------------------- | |
63 | ||
64 | I have considered the implementations of mandatory locking available with | |
65 | SunOS 4.1.x, Solaris 2.x and HP-UX 9.x. | |
66 | ||
67 | Generally I have tried to make the most sense out of the behaviour exhibited | |
68 | by these three reference systems. There are many anomalies. | |
69 | ||
70 | All the reference systems reject all calls to open() for a file on which | |
71 | another process has outstanding mandatory locks. This is in direct | |
72 | contravention of SVID 3, which states that only calls to open() with the | |
73 | O_TRUNC flag set should be rejected. The Linux implementation follows the SVID | |
74 | definition, which is the "Right Thing", since only calls with O_TRUNC can | |
75 | modify the contents of the file. | |
76 | ||
77 | HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not | |
78 | just mandatory locks. That would appear to contravene POSIX.1. | |
79 | ||
80 | mmap() is another interesting case. All the operating systems mentioned | |
81 | prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX | |
82 | also disallows advisory locks for such a file. SVID actually specifies the | |
83 | paranoid HP-UX behaviour. | |
84 | ||
85 | In my opinion only MAP_SHARED mappings should be immune from locking, and then | |
86 | only from mandatory locks - that is what is currently implemented. | |
87 | ||
88 | SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for | |
89 | mandatory locks, so reads and writes to locked files always block when they | |
90 | should return EAGAIN. | |
91 | ||
92 | I'm afraid that this is such an esoteric area that the semantics described | |
93 | below are just as valid as any others, so long as the main points seem to | |
94 | agree. | |
95 | ||
96 | 4. Semantics | |
97 | ------------ | |
98 | ||
99 | 1. Mandatory locks can only be applied via the fcntl()/lockf() locking | |
100 | interface - in other words the System V/POSIX interface. BSD style | |
101 | locks using flock() never result in a mandatory lock. | |
102 | ||
103 | 2. If a process has locked a region of a file with a mandatory read lock, then | |
104 | other processes are permitted to read from that region. If any of these | |
105 | processes attempts to write to the region it will block until the lock is | |
106 | released, unless the process has opened the file with the O_NONBLOCK | |
107 | flag in which case the system call will return immediately with the error | |
108 | status EAGAIN. | |
109 | ||
110 | 3. If a process has locked a region of a file with a mandatory write lock, all | |
111 | attempts to read or write to that region block until the lock is released, | |
112 | unless the process has opened the file with the O_NONBLOCK flag in which case | |
113 | the system call will return immediately with the error status EAGAIN. | |
114 | ||
115 | 4. Calls to open() with O_TRUNC, or to creat(), on an existing file that has | |
116 | any mandatory locks owned by other processes will be rejected with the | |
117 | error status EAGAIN. | |
118 | ||
119 | 5. Attempts to apply a mandatory lock to a file that is memory mapped and | |
120 | shared (via mmap() with MAP_SHARED) will be rejected with the error status | |
121 | EAGAIN. | |
122 | ||
123 | 6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED) | |
124 | that has any mandatory locks in effect will be rejected with the error status | |
125 | EAGAIN. | |
126 | ||
127 | 5. Which system calls are affected? | |
128 | ----------------------------------- | |
129 | ||
130 | Those which read or modify a file's contents, not just the inode. That gives | |
131 | read(), write(), readv(), writev(), open(), creat(), mmap(), truncate() and | |
132 | ftruncate(). truncate() and ftruncate() are considered to be "write" actions | |
133 | for the purposes of mandatory locking. | |
134 | ||
135 | The affected region is usually defined as stretching from the current position | |
136 | for the total number of bytes read or written. For the truncate calls it is | |
137 | defined as the bytes of a file removed or added (we must also consider bytes | |
138 | added, as a lock can specify just "the whole file", rather than a specific | |
139 | range of bytes.) | |
140 | ||
141 | Note 3: I may have overlooked some system calls that need mandatory lock | |
142 | checking in my eagerness to get this code out the door. Please let me know, or | |
143 | better still fix the system calls yourself and submit a patch to me or Linus. | |
144 | ||
145 | 6. Warning! | |
146 | ----------- | |
147 | ||
148 | Not even root can override a mandatory lock, so runaway processes can wreak | |
149 | havoc if they lock crucial files. The way around it is to change the file | |
150 | permissions (remove the setgid bit) before trying to read or write to it. | |
151 | Of course, that might be a bit tricky if the system is hung :-( | |
152 |
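
As an illustration of sections 2 and 4, the following is a minimal sketch in C
of how a program might mark a file as a mandatory lock candidate and then lock
a byte range through the ordinary fcntl() interface. The helper names
(is_mandlock_candidate, mandlock_mode, mark_mandlock_candidate,
write_lock_range) are hypothetical and do not come from the kernel sources.
The sketch uses only standard calls; whether the resulting lock is actually
enforced as mandatory depends on the kernel and filesystem in use.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Section 2 rule: the group-id bit is set and the group-execute bit is clear. */
int is_mandlock_candidate(mode_t mode)
{
    return (mode & S_ISGID) != 0 && (mode & S_IXGRP) == 0;
}

/* Return the mode with the group-id bit added and group-execute removed. */
mode_t mandlock_mode(mode_t mode)
{
    return (mode | S_ISGID) & ~(mode_t)S_IXGRP;
}

/*
 * Mark an open file as a mandatory lock candidate.
 * Returns 0 on success, -1 on failure with errno set.
 */
int mark_mandlock_candidate(int fd)
{
    struct stat st;

    if (fstat(fd, &st) == -1)
        return -1;

    /* Keep only the permission bits; the file type bits are not for fchmod(). */
    return fchmod(fd, mandlock_mode(st.st_mode) & 07777);
}

/*
 * Request a write ("exclusive") lock on a byte range, exactly as for an
 * advisory lock. F_SETLK does not wait: it returns -1 if the range is
 * already locked by another process. A len of 0 means "to end of file".
 */
int write_lock_range(int fd, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;

    return fcntl(fd, F_SETLK, &fl);
}
```

A caller would typically open the file read-write, call
mark_mandlock_candidate(fd) once, and then call write_lock_range(fd, 0, 0) to
lock the whole file. The mode change matches the shell command
"chmod g+s,g-x file".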