We don't support using an index object from multiple threads at the same
time, so the locking doesn't have any effect when following the
rules. If not following the rules, things are going to break down
anyway.
Edward Thomson [Sun, 27 Dec 2015 04:39:22 +0000 (22:39 -0600)]
git_repository_init: include dotfiles when copying templates
Include dotfiles when copying template directory, which will handle
both a template directory itself that begins with a dotfile, and
any dotfiles inside the directory.
Michał Górny [Sat, 26 Dec 2015 16:17:05 +0000 (17:17 +0100)]
ssh_stream_read(): fix possible *bytes_read < 0 branch
Fix the possibility of returning successfully from ssh_stream_read()
with *bytes_read < 0. This would occur if stdout channel read resulted
in 0, and stderr channel read failed afterwards.
Vicent Marti [Wed, 16 Dec 2015 18:36:50 +0000 (19:36 +0100)]
index: Also size-hint the hash table
Note that we're not checking whether the resize succeeds; in OOM cases,
we let it run with a "small" vector and hash table and see if by chance
we can grow it dynamically as we insert the new entries. Nothing to
lose really.
Vicent Marti [Wed, 16 Dec 2015 11:30:52 +0000 (12:30 +0100)]
merge: Use `git_index__fill` to populate the index
Instead of calling `git_index_add` in a loop, use the new
`git_index_fill` internal API to fill the index with the initial staged
entries.
The new `fill` helper assumes that all the entries will be unique and
valid, so it can append them at the end of the entries vector and only
sort it once at the end. It performs no validation checks.
This prevents the quadratic behavior caused by having to sort the
entries list once after every insertion.
When replacing an index with a new one, we need to iterate
through all index entries in order to determine which entries are
equal. When it is not possible to re-use old entries for the new
index, we move it into a list of entries that are to be removed
and thus free'd.
When we encounter a non-zero error code, though, we skip adding
the current index entry to the remove-queue. `INSERT_MAP_EX`,
which is the function last run before adding to the remove-queue,
may return a positive non-zero code that indicates what exactly
happened while inserting the element. In this case we skip adding
the entry to the remove-queue but still continue the current
operation, leading to a leak of the current entry.
Fix this by checking for a negative return value instead of a
non-zero one when we want to add the current index entry to the
remove-queue.
Edward Thomson [Thu, 3 Dec 2015 21:27:15 +0000 (16:27 -0500)]
index: canonicalize inserted paths safely
When adding to the index, we look to see if a portion of the given
path matches a portion of a path in the index. If so, we will use
the existing path information. For example, when adding `foo/bar.c`,
if there is an index entry to `FOO/other` and the filesystem is case
insensitive, then we will put `bar.c` into the existing tree instead
of creating a new one with a different case.
Use `strncmp` to do that instead of `memcmp`. When we `bsearch`
into the index, we locate the position where the new entry would
go. The index entry at that position does not necessarily have
a relation to the entry we're adding, so we cannot make assumptions
and use `memcmp`. Instead, compare them as strings.
When canonicalizing paths, we look for the first index entry that
matches a given substring.
Michał Górny [Tue, 1 Dec 2015 19:41:23 +0000 (20:41 +0100)]
checkout test: Apply umask to file-mode test as well
Fix the file-mode test to expect system umask being applied to the
created file as well (it is currently applied to the directory only).
This fixes the test on systems where umask != 022.
When duplicating a `struct git_tree_entry` with
`git_tree_entry_dup` the resulting structure is not allocated
inside a memory pool. As we do a 1:1 copy of the original struct,
though, we also copy the `pooled` field, which is set to `true`
for pooled entries. This results in a huge memory leak as we
never free tree entries that were duplicated from a pooled
tree entry.
Fix this by marking the newly duplicated entry as un-pooled.
diff: include commit message when formatting patch
When formatting a patch as email we do not include the commit's
message in the formatted patch output. Implement this and add a
test that verifies behavior.
It is already possible to get a commit's summary with the
`git_commit_summary` function. It is not possible to get the
remaining part of the commit message, that is the commit
message's body.
Fix this by introducing a new function `git_commit_body`.
blame: use size_t for line counts in git_blame__entry
The `git_blame__entry` struct keeps track of line counts with
`int` fields. Since `int` is only guaranteed to be at least 16
bits we may overflow on certain platforms when line counts exceed
2^15.
Fix this by instead storing line counts in `size_t`.
blame: use size_t for line counts in git_blame_hunk
It is not unreasonable to have versioned files with a line count
exceeding 2^16. Upon blaming such files we fail to correctly keep
track of the lines as `git_blame_hunk` stores them in `uint16_t`
fields.
Fix this by converting the line fields of `git_blame_hunk` to
`size_t`. Add test to verify behavior.
This reduces the size of the struct from 32 to 26 bytes, and leaves a
single padding byte at the end of the struct (which comes from the
zero-length array).
These are rather small allocations, so we end up spending a non-trivial
amount of time asking the OS for memory. Since these entries are tied to
the lifetime of their tree, we can give the tree a pool so we speed up
the allocations.
tree: avoid advancing over the filename multiple times
We've already looked at the filename with `memchr()` and then used
`strlen()` to allocate the entry. We already know how much we have to
advance to get to the object id, so add the filename length instead of
looking at each byte again.
Edward Thomson [Fri, 20 Nov 2015 22:33:49 +0000 (17:33 -0500)]
merge: handle conflicts in recursive base building
When building a recursive merge base, allow conflicts to occur.
Use the file (with conflict markers) as the common ancestor.
The user has already seen and dealt with this conflict by virtue
of having a criss-cross merge. If they resolved this conflict
identically in both branches, then there will be no conflict in the
result. This is the best case scenario.
If they did not resolve the conflict identically in the two branches,
then we will generate a new conflict. If the user is simply using
standard conflict output then the results will be fairly sensible.
But if the user is using a mergetool or using diff3 output, then the
common ancestor will be a conflict file (itself with diff3 output,
haha!). This is quite terrible, but it matches git's behavior.
Edward Thomson [Wed, 11 Nov 2015 05:21:26 +0000 (21:21 -0800)]
merge: use annotated commits for recursion
Use annotated commits to act as our virtual bases, instead of regular
commits, to avoid polluting the odb with virtual base commits and
trees. Instead, build an annotated commit with an index and pointers
to the commits that it was merged from.
Edward Thomson [Wed, 28 Oct 2015 15:00:55 +0000 (11:00 -0400)]
merge: octopus merge common ancestors when >2
When there are more than two common ancestors, continue merging the
virtual base with the additional common ancestors, effectively
octopus merging a new virtual base.
Edward Thomson [Mon, 23 Nov 2015 20:49:54 +0000 (15:49 -0500)]
checkout: only consider nsecs when built that way
When examining the working directory and determining whether it's
up-to-date, only consider the nanoseconds in the index entry when
built with `GIT_USE_NSEC`. This prevents us from believing that
the working directory is always dirty when the index was originally
written with a git client that uinderstands nsecs (like git 2.x).