When replacing an index with a new one, we need to iterate
through all index entries in order to determine which entries are
equal. When it is not possible to re-use old entries for the new
index, we move it into a list of entries that are to be removed
and thus free'd.
When we encounter a non-zero error code, though, we skip adding
the current index entry to the remove-queue. `INSERT_MAP_EX`,
which is the function last run before adding to the remove-queue,
may return a positive non-zero code that indicates what exactly
happened while inserting the element. In this case we skip adding
the entry to the remove-queue but still continue the current
operation, leading to a leak of the current entry.
Fix this by checking for a negative return value instead of a
non-zero one when we want to add the current index entry to the
remove-queue.
Edward Thomson [Thu, 3 Dec 2015 21:27:15 +0000 (16:27 -0500)]
index: canonicalize inserted paths safely
When adding to the index, we look to see if a portion of the given
path matches a portion of a path in the index. If so, we will use
the existing path information. For example, when adding `foo/bar.c`,
if there is an index entry to `FOO/other` and the filesystem is case
insensitive, then we will put `bar.c` into the existing tree instead
of creating a new one with a different case.
Use `strncmp` to do that instead of `memcmp`. When we `bsearch`
into the index, we locate the position where the new entry would
go. The index entry at that position does not necessarily have
a relation to the entry we're adding, so we cannot make assumptions
and use `memcmp`. Instead, compare them as strings.
When canonicalizing paths, we look for the first index entry that
matches a given substring.
Michał Górny [Tue, 1 Dec 2015 19:41:23 +0000 (20:41 +0100)]
checkout test: Apply umask to file-mode test as well
Fix the file-mode test to expect system umask being applied to the
created file as well (it is currently applied to the directory only).
This fixes the test on systems where umask != 022.
When duplicating a `struct git_tree_entry` with
`git_tree_entry_dup` the resulting structure is not allocated
inside a memory pool. As we do a 1:1 copy of the original struct,
though, we also copy the `pooled` field, which is set to `true`
for pooled entries. This results in a huge memory leak as we
never free tree entries that were duplicated from a pooled
tree entry.
Fix this by marking the newly duplicated entry as un-pooled.
diff: include commit message when formatting patch
When formatting a patch as email we do not include the commit's
message in the formatted patch output. Implement this and add a
test that verifies behavior.
It is already possible to get a commit's summary with the
`git_commit_summary` function. It is not possible to get the
remaining part of the commit message, that is the commit
message's body.
Fix this by introducing a new function `git_commit_body`.
This reduces the size of the struct from 32 to 26 bytes, and leaves a
single padding byte at the end of the struct (which comes from the
zero-length array).
These are rather small allocations, so we end up spending a non-trivial
amount of time asking the OS for memory. Since these entries are tied to
the lifetime of their tree, we can give the tree a pool so we speed up
the allocations.
tree: avoid advancing over the filename multiple times
We've already looked at the filename with `memchr()` and then used
`strlen()` to allocate the entry. We already know how much we have to
advance to get to the object id, so add the filename length instead of
looking at each byte again.
Edward Thomson [Fri, 20 Nov 2015 22:33:49 +0000 (17:33 -0500)]
merge: handle conflicts in recursive base building
When building a recursive merge base, allow conflicts to occur.
Use the file (with conflict markers) as the common ancestor.
The user has already seen and dealt with this conflict by virtue
of having a criss-cross merge. If they resolved this conflict
identically in both branches, then there will be no conflict in the
result. This is the best case scenario.
If they did not resolve the conflict identically in the two branches,
then we will generate a new conflict. If the user is simply using
standard conflict output then the results will be fairly sensible.
But if the user is using a mergetool or using diff3 output, then the
common ancestor will be a conflict file (itself with diff3 output,
haha!). This is quite terrible, but it matches git's behavior.
Edward Thomson [Wed, 11 Nov 2015 05:21:26 +0000 (21:21 -0800)]
merge: use annotated commits for recursion
Use annotated commits to act as our virtual bases, instead of regular
commits, to avoid polluting the odb with virtual base commits and
trees. Instead, build an annotated commit with an index and pointers
to the commits that it was merged from.
Edward Thomson [Wed, 28 Oct 2015 15:00:55 +0000 (11:00 -0400)]
merge: octopus merge common ancestors when >2
When there are more than two common ancestors, continue merging the
virtual base with the additional common ancestors, effectively
octopus merging a new virtual base.
Edward Thomson [Mon, 23 Nov 2015 20:49:54 +0000 (15:49 -0500)]
checkout: only consider nsecs when built that way
When examining the working directory and determining whether it's
up-to-date, only consider the nanoseconds in the index entry when
built with `GIT_USE_NSEC`. This prevents us from believing that
the working directory is always dirty when the index was originally
written with a git client that uinderstands nsecs (like git 2.x).
Edward Thomson [Tue, 17 Nov 2015 16:22:01 +0000 (11:22 -0500)]
tests: use out-of-the-way config dir in sandbox
Don't put the configuration in a subdir of the sandbox named
`config`, lest some tests decide to create their own directory
called `config`. Prefix with some underscores for uniqueness.
Edward Thomson [Tue, 17 Nov 2015 04:31:19 +0000 (23:31 -0500)]
settings: allow users to set PROGRAMDATA
Allow users to set the `git_libgit2_opts` search path for the
`GIT_CONFIG_LEVEL_PROGRAMDATA`. Convert `GIT_CONFIG_LEVEL_PROGRAMDATA`
to `GIT_SYSDIR_PROGRAMDATA` for setting the configuration.
Edward Thomson [Mon, 16 Nov 2015 23:06:52 +0000 (18:06 -0500)]
racy: make git_index_read_index handle raciness
Ensure that `git_index_read_index` clears the uptodate bit on
files that it modifies.
Further, do not propagate the cache from an on-disk index into
another on-disk index. Although this should not be done, as
`git_index_read_index` is used to bring an in-memory index into
another index (that may or may not be on-disk), ensure that we do
not accidentally bring in these bits when misused.
Edward Thomson [Fri, 13 Nov 2015 21:31:51 +0000 (16:31 -0500)]
index: clear uptodate bit on save
The uptodate bit should have a lifecycle of a single read->write
on the index. Once the index is written, the files within it should
be scanned for racy timestamps against the new index timestamp.
Edward Thomson [Fri, 13 Nov 2015 21:30:39 +0000 (16:30 -0500)]
index: test for smudged entries on write only
Test that entries are only smudged when we write the index: the
entry smudging is to prevent us from updating an index in a way
that it would be impossible to tell that an item was racy.
Consider when we load an index: any entries that have the same
(or newer) timestamp than the index itself are considered racy,
and are subject to further scrutiny.
If we *save* that index with the same entries that we loaded,
then the index would now have a newer timestamp than the entries,
and they would no longer be given that additional scrutiny, failing
our racy detection! So test that we smudge those entries only on
writing the new index, but that we can detect them (in diff) without
having to write.
Edward Thomson [Fri, 13 Nov 2015 20:36:45 +0000 (15:36 -0500)]
checkout::crlf test: don't crash when no idx entry
When there's no matching index entry (for whatever reason), don't
try to dereference the null return value to get at the id.
Otherwise when we break something in the index API, the checkout
test crashes for confusing reasons and causes us to step through
it in a debugger thinking that we had broken much more than we
actually did.
Edward Thomson [Fri, 13 Nov 2015 20:32:48 +0000 (15:32 -0500)]
index: don't detect raciness in uptodate entries
Keep track of entries that we believe are up-to-date, because we
added the index entries since the index was loaded. This prevents
us from unnecessarily examining files that we wrote during the
cleanup of racy entries (when we smudge racily clean files that have
a timestamp newer than or equal to the index's timestamp when we
read it). Without keeping track of this, we would examine every
file that we just checked out for raciness, since all their timestamps
would be newer than the index's timestamp.
Edward Thomson [Wed, 4 Nov 2015 21:17:51 +0000 (16:17 -0500)]
submodule: reload HEAD/index after reading config
Reload the HEAD and index data for a submodule after reading the
configuration. The configuration may specify a `path`, so we must
update HEAD and index data with that path in mind.
Edward Thomson [Tue, 3 Nov 2015 22:02:07 +0000 (17:02 -0500)]
filebuf: detect directories in our way
When creating a filebuf, detect a directory that exists in our
target file location. This prevents a failure later, when we try
to move the lock file to the destination.