======================================
Sequence counters and sequential locks
======================================

Introduction
============

Sequence counters are a reader-writer consistency mechanism with
lockless readers (read-only retry loops), and no writer starvation. They
are used for data that's rarely written to (e.g. system time), where the
reader wants a consistent set of information and is willing to retry if
that information changes.

A data set is consistent when the sequence count at the beginning of the
read side critical section is even and the same sequence count value is
read again at the end of the critical section. The data in the set must
be copied out inside the read side critical section. If the sequence
count has changed between the start and the end of the critical section,
the reader must retry.

Writers increment the sequence count at the start and the end of their
critical section. After starting the critical section the sequence count
is odd and indicates to the readers that an update is in progress. At
the end of the write side critical section the sequence count becomes
even again which lets readers make progress.
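
The even/odd protocol described above can be sketched in plain userspace
C with C11 atomics. Everything here is illustrative: ``toy_seqcount``
and friends are hypothetical names, the memory ordering is simplified,
and this is not the kernel's ``seqcount_t`` implementation.

```c
/*
 * Illustrative userspace model of the even/odd sequence protocol,
 * built on C11 atomics. NOT the kernel's seqcount_t; the memory
 * ordering here is a deliberately simplified approximation.
 */
#include <assert.h>	/* used by the usage demo */
#include <stdatomic.h>

struct toy_seqcount {
	atomic_uint sequence;	/* even: idle, odd: write in progress */
};

static unsigned int toy_read_begin(struct toy_seqcount *s)
{
	unsigned int seq;

	/* Spin while a writer is in progress (odd count). */
	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;
	return seq;
}

static int toy_read_retry(struct toy_seqcount *s, unsigned int start)
{
	/* A changed count means the copied data may be torn: retry. */
	return atomic_load_explicit(&s->sequence,
				    memory_order_acquire) != start;
}

static void toy_write_begin(struct toy_seqcount *s)
{
	/* Count becomes odd: readers see an update in progress. */
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}

static void toy_write_end(struct toy_seqcount *s)
{
	/* Count becomes even again: readers can make progress. */
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_acq_rel);
}
```

A reader copies the protected data between ``toy_read_begin()`` and
``toy_read_retry()`` and loops until the count is unchanged; a
(serialized) writer brackets its updates with ``toy_write_begin()`` and
``toy_write_end()``.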

A sequence counter write side critical section must never be preempted
or interrupted by read side sections. Otherwise the reader will spin for
the entire scheduler tick due to the odd sequence count value and the
interrupted writer. If that reader belongs to a real-time scheduling
class, it can spin forever and the kernel will livelock.

This mechanism cannot be used if the protected data contains pointers,
as the writer can invalidate a pointer that the reader is following.


.. _seqcount_t:

Sequence counters (``seqcount_t``)
==================================

This is the raw counting mechanism, which does not protect against
multiple writers. Write side critical sections must thus be serialized
by an external lock.

If the write serialization primitive is not implicitly disabling
preemption, preemption must be explicitly disabled before entering the
write side section. If the read section can be invoked from hardirq or
softirq contexts, interrupts or bottom halves must also be respectively
disabled before entering the write section.

If it's desired to automatically handle the sequence counter
requirements of writer serialization and non-preemptibility, use
:ref:`seqlock_t` instead.

Initialization::

	/* dynamic */
	seqcount_t foo_seqcount;
	seqcount_init(&foo_seqcount);

	/* static */
	static seqcount_t foo_seqcount = SEQCNT_ZERO(foo_seqcount);

	/* C99 struct init */
	struct {
		.seq = SEQCNT_ZERO(foo.seq),
	} foo;

Write path::

	/* Serialized context with disabled preemption */

	write_seqcount_begin(&foo_seqcount);

	/* ... [[write-side critical section]] ... */

	write_seqcount_end(&foo_seqcount);

Read path::

	do {
		seq = read_seqcount_begin(&foo_seqcount);

		/* ... [[read-side critical section]] ... */

	} while (read_seqcount_retry(&foo_seqcount, seq));
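
Putting the write and read paths together: the following self-contained
userspace sketch uses hypothetical names throughout, with a plain atomic
counter standing in for ``seqcount_t`` and a tiny test-and-set spinlock
standing in for the required external writer lock. It models the pattern
only; it is not the kernel implementation.

```c
/*
 * Userspace sketch of the seqcount usage pattern. A plain atomic
 * counter stands in for seqcount_t; an atomic_flag spinlock stands
 * in for the external lock that serializes writers.
 */
#include <assert.h>	/* used by the usage demo */
#include <stdatomic.h>

static atomic_uint foo_seqcount;	/* even: idle, odd: write in progress */
static atomic_flag foo_writer_lock = ATOMIC_FLAG_INIT;	/* external lock */

/* Protected data set: both fields must form a consistent snapshot. */
static unsigned int foo_a, foo_b;

static void foo_update(unsigned int val)
{
	while (atomic_flag_test_and_set(&foo_writer_lock))
		;				/* serialize writers */

	atomic_fetch_add(&foo_seqcount, 1);	/* count goes odd */
	foo_a = val;
	foo_b = val * 2;
	atomic_fetch_add(&foo_seqcount, 1);	/* count even again */

	atomic_flag_clear(&foo_writer_lock);
}

static void foo_get(unsigned int *a, unsigned int *b)
{
	unsigned int seq;

	do {
		/* begin: wait for an even (idle) count */
		while ((seq = atomic_load(&foo_seqcount)) & 1)
			;

		*a = foo_a;			/* copy the data out */
		*b = foo_b;			/* inside the section */

	} while (atomic_load(&foo_seqcount) != seq);	/* changed? retry */
}
```

``foo_get()`` always returns a consistent ``(foo_a, foo_b)`` pair: any
concurrent ``foo_update()`` changes the count and sends the reader
around the loop again.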


.. _seqlock_t:

Sequential locks (``seqlock_t``)
================================

This contains the :ref:`seqcount_t` mechanism earlier discussed, plus an
embedded spinlock for writer serialization and non-preemptibility.

If the read side section can be invoked from hardirq or softirq context,
use the write side function variants which disable interrupts or bottom
halves respectively.

Initialization::

	/* dynamic */
	seqlock_t foo_seqlock;
	seqlock_init(&foo_seqlock);

	/* static */
	static DEFINE_SEQLOCK(foo_seqlock);

	/* C99 struct init */
	struct {
		.seql = __SEQLOCK_UNLOCKED(foo.seql)
	} foo;

Write path::

	write_seqlock(&foo_seqlock);

	/* ... [[write-side critical section]] ... */

	write_sequnlock(&foo_seqlock);

Read path, three categories:
1. Normal sequence readers, which never block a writer but must retry
   if a writer is in progress, as detected by a change in the sequence
   number. Writers do not wait for sequence readers::

	do {
		seq = read_seqbegin(&foo_seqlock);

		/* ... [[read-side critical section]] ... */

	} while (read_seqretry(&foo_seqlock, seq));

2. Locking readers which will wait if a writer or another locking reader
   is in progress. A locking reader in progress will also block a writer
   from entering its critical section. This read lock is
   exclusive. Unlike rwlock_t, only one locking reader can acquire it::

	read_seqlock_excl(&foo_seqlock);

	/* ... [[read-side critical section]] ... */

	read_sequnlock_excl(&foo_seqlock);

3. Conditional lockless reader (as in 1), or locking reader (as in 2),
   according to a passed marker. This is used to avoid lockless reader
   starvation (too many retry loops) in case of a sharp spike in write
   activity. First, a lockless read is tried (even marker passed). If
   that trial fails (an odd sequence counter is returned, which is used
   as the next iteration's marker), the lockless read is transformed
   into a full locking read and no retry loop is necessary::

	/* marker; even initialization */
	int seq = 0;
	do {
		read_seqbegin_or_lock(&foo_seqlock, &seq);

		/* ... [[read-side critical section]] ... */

	} while (need_seqretry(&foo_seqlock, seq));
	done_seqretry(&foo_seqlock, seq);
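
The behavior described in category 3 — try a lockless read first, fall
back to an exclusive lock if it fails — can be modeled in userspace.
This is a hypothetical reimplementation of the marker logic, not the
kernel's ``seqlock_t`` code, and it makes the marker update explicit
(the kernel API differs in detail):

```c
/*
 * Userspace model of the "lockless or locking" reader pattern.
 * All names are hypothetical; an atomic_flag spinlock stands in
 * for the seqlock's embedded lock.
 */
#include <assert.h>	/* used by the usage demo */
#include <stdatomic.h>

struct toy_seqlock {
	atomic_uint sequence;	/* even: idle, odd: writer active */
	atomic_flag lock;	/* exclusive lock (spinlock stand-in) */
};

static void toy_lock(struct toy_seqlock *sl)
{
	while (atomic_flag_test_and_set(&sl->lock))
		;
}

static void toy_unlock(struct toy_seqlock *sl)
{
	atomic_flag_clear(&sl->lock);
}

/* Even marker: lockless trial. Odd marker: take the exclusive lock. */
static void toy_read_begin_or_lock(struct toy_seqlock *sl, int *seq)
{
	if (!(*seq & 1)) {
		while ((*seq = atomic_load(&sl->sequence)) & 1)
			;
	} else {
		toy_lock(sl);
	}
}

/* Only a lockless reader (even marker) can need a retry. */
static int toy_need_seqretry(struct toy_seqlock *sl, int *seq)
{
	if (!(*seq & 1) &&
	    atomic_load(&sl->sequence) != (unsigned int)*seq) {
		*seq |= 1;	/* switch to the locking path */
		return 1;
	}
	return 0;
}

static void toy_done_seqretry(struct toy_seqlock *sl, int seq)
{
	if (seq & 1)		/* locking path: drop the lock */
		toy_unlock(sl);
}
```

The first pass runs lockless; a failed trial flips the marker odd, so
the second pass takes the lock and is guaranteed to complete without
further retries.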


API documentation
=================

.. kernel-doc:: include/linux/seqlock.h