1 Checkout Internals
2 ==================
3
4 Checkout has to handle a lot of different cases. It examines the
5 differences between the target tree, the baseline tree and the working
6 directory, plus the contents of the index, and groups files into five
7 categories:
8
9 1. UNMODIFIED - Files that match in all places.
10 2. SAFE - Files whose working directory content matches the baseline,
11 so they can be safely updated to the target.
12 3. DIRTY/MISSING - Files where the working directory differs from the
13 baseline but there is no conflicting change with the target. One
14 example is a file that doesn't exist in the working directory - no
15 data would be lost as a result of writing this file. Which action
16 will be taken with these files depends on the options you use.
17 4. CONFLICTS - Files where changes in the working directory conflict
18 with changes to be applied by the target. If conflicts are found,
19 they prevent any other modifications from being made (although there
20 are options to override that and force the update, of course).
21 5. UNTRACKED/IGNORED - Files in the working directory that are untracked
22 or ignored (i.e. only in the working directory, not the other places).
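
The five categories can be illustrated with a toy classifier. This is only a sketch under simplifying assumptions, not libgit2 code: it looks at a single blob path, ignores the index and trees entirely, and represents each side as a content hash or `None` when the path is absent. The name `classify` is hypothetical.

```python
from typing import Optional


def classify(base: Optional[str], target: Optional[str],
             workdir: Optional[str]) -> str:
    """Toy model: each argument is a content hash, or None if the path is absent."""
    if base is None and target is None:
        # Only the working directory can have anything here.
        return "UNTRACKED/IGNORED" if workdir is not None else "UNMODIFIED"
    if base == target == workdir:
        return "UNMODIFIED"
    if workdir == base:
        # Workdir still matches the baseline, so the target can be written.
        return "SAFE"
    if workdir is None or target == base:
        # Workdir differs from the baseline, but no change in the target
        # collides with it (or the file is simply missing).
        return "DIRTY/MISSING"
    # Workdir and target both moved away from the baseline.
    return "CONFLICTS"
```

For example, `classify("b1", "b2", "b1")` falls in SAFE, while `classify("b1", "b2", "b3")` falls in CONFLICTS.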
23
24 Right now, this classification is done via 3 iterators (for the three
25 trees), with a final lookup in the index. At some point, this may move to
26 a 4-iterator version to incorporate the index better.
27
28 The actual checkout is done in five phases (at least right now).
29
30 1. The diff between the baseline and the target tree is used as a base
31 list of possible updates to be applied.
32 2. Iterate through the diff and the working directory, building a list of
33 actions to be taken (and sending notifications about conflicts and
34 dirty files).
35 3. Remove any files / directories as needed (because alphabetical
36 iteration means that an untracked directory will end up sorted *after*
37 a blob that should be checked out with the same name).
38 4. Update all blobs.
39 5. Update all submodules (after 4 in case a new .gitmodules blob was
40 checked out).
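
The ordering of phases 3 to 5 can be sketched as a stable sort over a list of planned actions. This is an illustrative model only; the `(path, kind)` tuples and the name `order_actions` are assumptions, not libgit2 data structures.

```python
# Phase order from the list above: removals, then blobs, then submodules.
PHASE_ORDER = {"remove": 0, "blob": 1, "submodule": 2}


def order_actions(actions):
    """actions: list of (path, kind) with kind in PHASE_ORDER.

    sorted() is stable, so actions within a phase keep their original order.
    """
    return sorted(actions, key=lambda action: PHASE_ORDER[action[1]])


planned = [
    ("vendor/lib", "submodule"),
    ("a.txt", "blob"),
    ("a.txt.d", "remove"),
    (".gitmodules", "blob"),
]
ordered = order_actions(planned)
```

In this model the `.gitmodules` blob is always written before any submodule is updated, which is the reason given for running phase 5 after phase 4.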
41
42 Checkout could be driven either off a target-to-workdir diff or a
43 baseline-to-target diff. There are pros and cons of each.
44
45 Target-to-workdir means the diff includes every file that could be
46 modified, which simplifies bookkeeping, but the code to constantly refer
47 back to the baseline gets complicated.
48
49 Baseline-to-target has simpler code because the diff defines the action to
50 take, but needs special handling for untracked and ignored files, if they
51 need to be removed.
52
53 The current checkout implementation is based on a baseline-to-target diff.
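
The trade-off between the two diffs can be seen in a small model where each tree is a dict from path to content hash. The helper `tree_diff` and the sample data are hypothetical and greatly simplified (flat paths, blobs only).

```python
def tree_diff(old, new):
    """Return {path: (old_hash, new_hash)} for every path that differs."""
    paths = sorted(set(old) | set(new))
    return {p: (old.get(p), new.get(p)) for p in paths if old.get(p) != new.get(p)}


baseline = {"a.txt": "b1", "b.txt": "b1"}
target = {"a.txt": "b2", "b.txt": "b1"}
workdir = {"a.txt": "b1", "b.txt": "b1", "notes.txt": "b9"}

# Baseline-to-target: directly names the update, but never sees notes.txt.
base_to_target = tree_diff(baseline, target)

# Target-to-workdir: lists every path that could be touched, including the
# untracked file, but says nothing about what the baseline looked like.
target_to_workdir = tree_diff(target, workdir)
```

Here `base_to_target` contains only `a.txt`, so the untracked `notes.txt` needs separate handling, whereas `target_to_workdir` lists both paths but needs the baseline to tell an update from a local edit.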
54
55
56 Picking Actions
57 ===============
58
59 The most interesting aspect of this is phase 2, picking the actions that
60 should be taken. There are a lot of corner cases, so it may be easier to
61 start by looking at the rules for a simple 2-iterator diff:
62
63 Key
64 ---
65 - B1,B2,B3 - blobs with different SHAs
66 - Bi - ignored blob (WD only)
67 - T1,T2,T3 - trees with different SHAs
68 - Ti - ignored tree (WD only)
69 - S1,S2 - submodules with different SHAs
70 - Sd - dirty submodule (WD only)
71 - x - nothing
72
73 Diff with 2 non-workdir iterators
74 ---------------------------------
75
76 | | Old | New | |
77 |----|-----|-----|------------------------------------------------------------|
78 | 0 | x | x | nothing |
79 | 1 | x | B1 | added blob |
80 | 2 | x | T1 | added tree |
81 | 3 | B1 | x | removed blob |
82 | 4 | B1 | B1 | unmodified blob |
83 | 5 | B1 | B2 | modified blob |
84 | 6 | B1 | T1 | typechange blob -> tree |
85 | 7 | T1 | x | removed tree |
86 | 8 | T1 | B1 | typechange tree -> blob |
87 | 9 | T1 | T1 | unmodified tree |
88 | 10 | T1 | T2 | modified tree (implies modified/added/removed blob inside) |
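
The rows of this table can be reproduced by a small function. It is a sketch for illustration, assuming each side is either `None` (the `x` in the table) or a `(kind, sha)` tuple; the name `diff_status` is hypothetical.

```python
def diff_status(old, new):
    """old/new: None for 'x', or a (kind, sha) tuple with kind 'blob' or 'tree'."""
    if old is None and new is None:
        return "nothing"
    if old is None:
        return f"added {new[0]}"
    if new is None:
        return f"removed {old[0]}"
    if old[0] != new[0]:
        return f"typechange {old[0]} -> {new[0]}"
    state = "unmodified" if old[1] == new[1] else "modified"
    return f"{state} {old[0]}"
```

For instance, row 6 corresponds to `diff_status(("blob", "B1"), ("tree", "T1"))`, which yields `"typechange blob -> tree"`.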
89
90
91 Now, let's make the "New" iterator into a working directory iterator, so
92 we replace "added" items with either untracked or ignored, like this:
93
94 Diff with non-work & workdir iterators
95 --------------------------------------
96
97 | | Old | New | |
98 |----|-----|-----|------------------------------------------------------------|
99 | 0 | x | x | nothing |
100 | 1 | x | B1 | untracked blob |
101 | 2 | x | Bi | ignored file |
102 | 3 | x | T1 | untracked tree |
103 | 4 | x | Ti | ignored tree |
104 | 5 | B1 | x | removed blob |
105 | 6 | B1 | B1 | unmodified blob |
106 | 7 | B1 | B2 | modified blob |
107 | 8 | B1 | T1 | typechange blob -> tree |
108 | 9 | B1 | Ti | removed blob AND ignored tree as separate items |
109 | 10 | T1 | x | removed tree |
110 | 11 | T1 | B1 | typechange tree -> blob |
111 | 12 | T1 | Bi | removed tree AND ignored blob as separate items |
112 | 13 | T1 | T1 | unmodified tree |
113 | 14 | T1 | T2 | modified tree (implies modified/added/removed blob inside) |
114
115 Note: if there is a corresponding entry in the old tree, then a working
116 directory item won't be ignored (i.e. no Bi or Ti for tracked items).
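
A sketch of the same idea with a working directory on the "New" side follows. It assumes a hypothetical `ignored` flag on the workdir item and returns a list, because rows 9 and 12 report two separate items. Following the note above, the flag is disregarded when the old tree has an entry of the same kind.

```python
def workdir_status(old, new, ignored=False):
    """old: None or (kind, sha) from the tree; new: same shape for the workdir item."""
    if new is None:
        return [f"removed {old[0]}"] if old else ["nothing"]
    kind = new[0]
    if old is None:
        return [f"{'ignored' if ignored else 'untracked'} {kind}"]
    if old[0] != kind:
        if ignored:
            # Rows 9 and 12: the tracked item is gone and an ignored item
            # of the other kind sits at the same path.
            return [f"removed {old[0]}", f"ignored {kind}"]
        return [f"typechange {old[0]} -> {kind}"]
    # Same kind on both sides: a tracked item is never reported as ignored.
    state = "unmodified" if old[1] == new[1] else "modified"
    return [f"{state} {kind}"]
```

Row 9, for example, is `workdir_status(("blob", "B1"), ("tree", "T1"), ignored=True)`, which returns both `"removed blob"` and `"ignored tree"`.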
117
118
119 Now, expand this to three iterators: a baseline tree, a target tree, and
120 an actual working directory tree:
121
122 Checkout From 3 Iterators (2 not workdir, 1 workdir)
123 ----------------------------------------------------
124
125 (base == old HEAD; target == what to checkout; actual == working dir)
126
127 | | base | target | actual/workdir | |
128 |-----|------|--------|----------------|--------------------------------------------------------------------|
129 | 0 | x | x | x | nothing |
130 | 1 | x | x | B1/Bi/T1/Ti | untracked/ignored blob/tree (SAFE) |
131 | 2+ | x | B1 | x | add blob (SAFE) |
132 | 3 | x | B1 | B1 | independently added blob (FORCEABLE-2) |
133 | 4* | x | B1 | B2/Bi/T1/Ti | add blob with content conflict (FORCEABLE-2) |
134 | 5+ | x | T1 | x | add tree (SAFE) |
135 | 6* | x | T1 | B1/Bi | add tree with blob conflict (FORCEABLE-2) |
136 | 7 | x | T1 | T1/Ti | independently added tree (SAFE+MISSING) |
137 | 8 | B1 | x | x | independently deleted blob (SAFE+MISSING) |
138 | 9- | B1 | x | B1 | delete blob (SAFE) |
139 | 10- | B1 | x | B2 | delete of modified blob (FORCEABLE-1) |
140 | 11 | B1 | x | T1/Ti | independently deleted blob AND untrack/ign tree (SAFE+MISSING !!!) |
141 | 12 | B1 | B1 | x | locally deleted blob (DIRTY || SAFE+CREATE) |
142 | 13+ | B1 | B2 | x | update to deleted blob (SAFE+MISSING) |
143 | 14 | B1 | B1 | B1 | unmodified file (SAFE) |
144 | 15 | B1 | B1 | B2 | locally modified file (DIRTY) |
145 | 16+ | B1 | B2 | B1 | update unmodified blob (SAFE) |
146 | 17 | B1 | B2 | B2 | independently updated blob (FORCEABLE-1) |
147 | 18+ | B1 | B2 | B3 | update to modified blob (FORCEABLE-1) |
148 | 19 | B1 | B1 | T1/Ti | locally deleted blob AND untrack/ign tree (DIRTY) |
149 | 20* | B1 | B2 | T1/Ti | update to deleted blob AND untrack/ign tree (F-1) |
150 | 21+ | B1 | T1 | x | add tree with locally deleted blob (SAFE+MISSING) |
151 | 22* | B1 | T1 | B1 | add tree AND deleted blob (SAFE) |
152 | 23* | B1 | T1 | B2 | add tree with delete of modified blob (F-1) |
153 | 24 | B1 | T1 | T1 | add tree with deleted blob (F-1) |
154 | 25 | T1 | x | x | independently deleted tree (SAFE+MISSING) |
155 | 26 | T1 | x | B1/Bi | independently deleted tree AND untrack/ign blob (F-1) |
156 | 27- | T1 | x | T1 | deleted tree (MAYBE SAFE) |
157 | 28+ | T1 | B1 | x | deleted tree AND added blob (SAFE+MISSING) |
158 | 29 | T1 | B1 | B1 | independently typechanged tree -> blob (F-1) |
159 | 30+ | T1 | B1 | B2 | typechange tree->blob with conflicting blob (F-1) |
160 | 31* | T1 | B1 | T1/T2 | typechange tree->blob (MAYBE SAFE) |
161 | 32+ | T1 | T1 | x | restore locally deleted tree (SAFE+MISSING) |
162 | 33 | T1 | T1 | B1/Bi | locally typechange tree->untrack/ign blob (DIRTY) |
163 | 34 | T1 | T1 | T1/T2 | unmodified tree (MAYBE SAFE) |
164 | 35+ | T1 | T2 | x | update locally deleted tree (SAFE+MISSING) |
165 | 36* | T1 | T2 | B1/Bi | update to tree with typechanged tree->blob conflict (F-1) |
166 | 37 | T1 | T2 | T1/T2/T3 | update to existing tree (MAYBE SAFE) |
167 | 38+ | x | S1 | x | add submodule (SAFE) |
168 | 39 | x | S1 | S1/Sd | independently added submodule (SUBMODULE) |
169 | 40* | x | S1 | B1 | add submodule with blob conflict (FORCEABLE) |
170 | 41* | x | S1 | T1 | add submodule with tree conflict (FORCEABLE) |
171 | 42 | S1 | x | S1/Sd | deleted submodule (SUBMODULE) |
172 | 43 | S1 | x | x | independently deleted submodule (SUBMODULE) |
173 | 44 | S1 | x | B1 | independently deleted submodule with added blob (SAFE+MISSING) |
174 | 45 | S1 | x | T1 | independently deleted submodule with added tree (SAFE+MISSING) |
175 | 46 | S1 | S1 | x | locally deleted submodule (SUBMODULE) |
176 | 47+ | S1 | S2 | x | update locally deleted submodule (SAFE) |
177 | 48 | S1 | S1 | S2 | locally updated submodule commit (SUBMODULE) |
178 | 49 | S1 | S2 | S1 | updated submodule commit (SUBMODULE) |
179 | 50+ | S1 | B1 | x | add blob with locally deleted submodule (SAFE+MISSING) |
180 | 51* | S1 | B1 | S1 | typechange submodule->blob (SAFE) |
181 | 52* | S1 | B1 | Sd | typechange dirty submodule->blob (SAFE!?!?) |
182 | 53+ | S1 | T1 | x | add tree with locally deleted submodule (SAFE+MISSING) |
183 | 54* | S1 | T1 | S1/Sd | typechange submodule->tree (MAYBE SAFE) |
184 | 55+ | B1 | S1 | x | add submodule with locally deleted blob (SAFE+MISSING) |
185 | 56* | B1 | S1 | B1 | typechange blob->submodule (SAFE) |
186 | 57+ | T1 | S1 | x | add submodule with locally deleted tree (SAFE+MISSING) |
187 | 58* | T1 | S1 | T1 | typechange tree->submodule (SAFE) |
188
189
190 The case number is followed by ' ' if no change is needed, '+' if the
191 case needs to write to disk, '-' if something must be deleted, and '*'
192 if there should be a delete followed by a write.
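
The suffix convention can be decoded mechanically. The following sketch maps a case label from the first column of the table to the disk operations it implies; `MARKER_ACTIONS` and `disk_actions` are hypothetical names used only for illustration.

```python
# Suffix after the case number -> ordered disk operations.
MARKER_ACTIONS = {
    "": [],                      # no change needed
    "+": ["write"],              # write to disk
    "-": ["delete"],             # delete something
    "*": ["delete", "write"],    # delete followed by a write
}


def disk_actions(case_label):
    """case_label is a first-column entry such as '14', '2+', '9-' or '4*'."""
    marker = case_label.strip().lstrip("0123456789")
    return MARKER_ACTIONS[marker]
```

So `disk_actions("4*")` gives a delete followed by a write, while `disk_actions("14")` gives no operations.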
193
194 Each case is labeled with one of the following tiers:
195
196 * SAFE == completely safe to update
197 * SAFE+MISSING == safe except the workdir is missing the expected content
198 * MAYBE SAFE == safe if workdir tree matches (or is missing) baseline
199 content, which is unknown at this point
200 * FORCEABLE == conflict unless FORCE is given
201 * DIRTY == no conflict but change is not applied unless FORCE
202 * SUBMODULE == no conflict and no change is applied unless a deleted
203 submodule dir is empty
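
How the FORCE option interacts with these tiers can be sketched as a lookup. This is a toy model, not libgit2 behavior: the function name `resolve` and the result strings are assumptions, and the tiers whose outcome depends on other options or on workdir state not known at this point simply return `"depends"`.

```python
def resolve(tier, force=False):
    """Return 'apply', 'conflict', 'skip' or 'depends' for a tier label."""
    if tier == "SAFE":
        return "apply"
    if tier == "FORCEABLE":
        # Conflict unless FORCE is given.
        return "apply" if force else "conflict"
    if tier == "DIRTY":
        # No conflict, but the change is not applied unless FORCE is given.
        return "apply" if force else "skip"
    if tier in ("SAFE+MISSING", "MAYBE SAFE", "SUBMODULE"):
        # Outcome depends on other options or on the workdir contents.
        return "depends"
    raise ValueError(f"unknown tier: {tier}")
```

For example, `resolve("FORCEABLE")` reports a conflict, while `resolve("FORCEABLE", force=True)` applies the change.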
204
205 Some slightly unusual circumstances:
206
207 * 8 - parent dir is only deleted when file is, so parent will be left if
208 empty even though it would be deleted if the file were present
209 * 11 - core git does not consider this a conflict but attempts to delete T1
210 and gives "unable to unlink file" error yet does not skip the rest
211 of the operation
212 * 12 - without FORCE, the file is left deleted (i.e. not restored), so the
213 new wd is dirty (and the warning message "D file" is printed); with
214 FORCE, the file is restored.
215 * 24 - This should be considered MAYBE SAFE since effectively it is 7 and 8
216 combined, but core git considers this a conflict unless forced.
217 * 26 - This combines two cases (1 & 25) (and also implied 8 for tree content)
218 which are ok on their own, but core git treats this as a conflict.
219 If not forced, this is a conflict. If forced, this actually doesn't
220 have to write anything and leaves the new blob as an untracked file.
221 * 32 - This is the only case where the baseline and target values match
222 and yet we will still write to the working directory. In all other
223 cases, if baseline == target, we don't touch the workdir (it is
224 either already right or is "dirty"). However, since this case also
225 implies that a ?/B1/x case will exist as well, it can be skipped.
226 * 41 - It's not clear how core git distinguishes this case from 39 (mode?).
227 * 52 - Core git makes destructive changes without any warning when the
228 submodule is dirty and the type changes to a blob.
229
230 Cases 3, 17, 24, 26, and 29 are all considered conflicts even though
231 none of them will require making any updates to the working directory.