Only write index if updated when passing GIT_DIFF_UPDATE_INDEX
When diffing the index with the workdir and GIT_DIFF_UPDATE_INDEX has been passed,
the previous implementation was always writing the index to disk even if it wasn't
modified.
When a refspec contains no rhs and thus won't cause an explicit update,
we skip all the logic, but that means that we don't update FETCH_HEAD
with it, which is what the implicit rhs is.
Add another bit of logic which puts those remote heads in the list of
updates so we put them into FETCH_HEAD.
We currently recommend using `git_buf_grow` to make a buffer take an
owned copy of the memory it points to. This is not behaviour we
should encourage, so remove this recommendation.
The function itself is not changed, as we need to remain compatible, but
it will be changed not to allow usage on borrowed buffers.
When we don't own a buffer (asize=0) we currently allow the usage of
grow to copy the memory into a buffer we do own. This muddles the
meaning of grow, and lets us be a bit cavalier with ownership semantics.
Don't allow this any more. Usage of grow should be restricted to buffers
which we know own their own memory. If unsure, we must not attempt to
modify it.
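
A minimal sketch of the ownership rule described above, assuming a simplified buffer type. `sketch_buf` and `sketch_buf_grow` are hypothetical names for illustration and are not libgit2's `git_buf` implementation; the only detail taken from the message is that `asize == 0` marks memory the buffer does not own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer type; not libgit2's git_buf. */
typedef struct {
    char  *ptr;    /* start of the data, or NULL when empty */
    size_t asize;  /* allocated size; 0 means the memory is not owned */
    size_t size;   /* bytes currently in use */
} sketch_buf;

/*
 * Grow an owned (or empty) buffer to at least `target` bytes.
 * Returns 0 on success, -1 on allocation failure or when the buffer
 * points at borrowed memory.
 */
int sketch_buf_grow(sketch_buf *buf, size_t target)
{
    char *new_ptr;

    assert(buf != NULL);

    /* Borrowed memory: refuse rather than silently copying it. */
    if (buf->ptr != NULL && buf->asize == 0)
        return -1;

    /* Already large enough, nothing to do. */
    if (target <= buf->asize)
        return 0;

    /* realloc(NULL, n) behaves like malloc(n) for an empty buffer. */
    new_ptr = realloc(buf->ptr, target);
    if (new_ptr == NULL)
        return -1;

    buf->ptr = new_ptr;
    buf->asize = target;
    return 0;
}
```

In this sketch a caller that holds a borrowed buffer gets an error back and has to make its own copy explicitly, which keeps the ownership decision visible at the call site.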
Edward Thomson [Wed, 24 Jun 2015 16:06:41 +0000 (12:06 -0400)]
diff: determine DIFFABLE-ness for binaries
Always set `GIT_DIFF_PATCH_DIFFABLE` for all files, regardless of
binary-ness, so that the binary callback is invoked to either
show the binary contents, or just print the standard "Binary files
differ" message. We may need to do deeper inspection for binary
files where we have avoided loading the contents into a file map.
If the libcurl stream is available, use that as the underlying stream
instead of the socket stream. This allows us to set a proxy for HTTPS
connections.
http: ask for the curl stream for non-encrypted connections
The TLS streams talk over the curl stream themselves, so we don't need
to ask for it explicitly. Do so in the case of the non-encrypted one so
we can still make use of proxies in that case.
When linking against libcurl, use it as the underlying transport instead
of straight sockets. We can't quite just give over the file descriptor,
as curl puts it into non-blocking mode, so we build a custom BIO so
OpenSSL sends the data through our stream, be it the socket or curl
streams.
cURL has a mode in which it acts a lot like our streams, providing send
and recv functions and taking care of the TLS and proxy setup for us.
Implement a new stream which uses libcurl instead of raw sockets or the
TLS libraries directly. This version does not support reporting
certificates or proxies yet.
Edward Thomson [Tue, 23 Jun 2015 20:27:33 +0000 (16:27 -0400)]
stash: save the workdir file when deleted in index
When stashing the workdir tree, examine the index as well. Using
a mechanism similar to `git_diff_tree_to_workdir_with_index`
allows us to determine that a file was added in the index and
subsequently modified in the working directory. Without examining
the index, we would erroneously believe that this file was
untracked and fail to include it in the working directory tree.
Use a slightly modified `git_diff_tree_to_workdir_with_index` in
order to avoid some of the behavior custom to `git diff`. In
particular, be sure to include the working directory side of a
file when it was deleted in the index.
Edward Thomson [Tue, 23 Jun 2015 20:27:17 +0000 (16:27 -0400)]
stash tests: ensure we save the workdir file
Ensure that when a file is added in the index and subsequently
modified in the working directory, the stashed working directory
tree contains the actual working directory contents.
This is something we do on re-init but not when opening a
repository. This hasn't particularly mattered up to now as the version
has been 0 ever since the first release of git, but the times, they're
a-changing and we will soon see version 1 in the wild. We need to make
sure we don't open those.
Fixed GIT_DELTA_CONFLICTED not returned in some cases
If an index entry for a file that is not in HEAD is in a conflicted
state, then when diffing HEAD with the index, the status field of the
corresponding git_diff_delta was incorrectly reported as
GIT_DELTA_ADDED instead of GIT_DELTA_CONFLICTED.
This was due to handle_unmatched_new_item() initially setting the status
to GIT_DELTA_CONFLICTED but then overriding it later with GIT_DELTA_ADDED.
Edward Thomson [Mon, 8 Jun 2015 15:55:04 +0000 (11:55 -0400)]
clar: support hierarchical test resource data
Support hierarchical test resource data, such that you can have
`tests/resources/foo/bar` and move the `bar` directory in as
a fixture.
Calling `cl_fixture_sandbox` on a path that is not directly beneath
the test resources directory succeeds, placing that directory into
the test fixture. (For example, `cl_fixture_sandbox("foo/bar")`
will sandbox the `foo/bar` directory as `bar`).
Add support for cleaning up directories created this way, by only
cleaning up the basename (in this example, `bar`) from the fixture
directory.
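
The basename rule above can be sketched as follows. `sketch_fixture_basename` is a hypothetical helper, not part of clar, and it assumes `/` separators and no trailing slash.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the last component of `path`.
 * Assumes '/' separators and no trailing slash (illustrative only).
 */
const char *sketch_fixture_basename(const char *path)
{
    const char *slash;

    assert(path != NULL);

    slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}
```

With this helper, a fixture requested as `foo/bar` would be created and later cleaned up under the name `bar`, while a top-level fixture name is returned unchanged.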
Edward Thomson [Mon, 8 Jun 2015 13:08:01 +0000 (09:08 -0400)]
crlf tests: use known-good data produced by git
Given a variety of combinations of core.autocrlf settings and
attributes settings, test that we check out data into the working
directory the same as a known-good test resource created by git.git.
submodule: handle writing out all enum values for settings
We currently do not handle those enum values which require us to set
"true" or unset variables in all cases. Use a common function which does
understand this by looking at our mapping directly.
The current code will always fail, but only because it's asking for a
string on a live config. Take a snapshot and make sure we fail with
ENOTFOUND instead of any old error.
submodule: make `_set_update_fetch_recurse_submodules()` affect the config
Similarly to the other setters. In this test we copy over the testing of
`RECURSE_YES`, which shows an error in our handling of the `YES` variant
which we may have to port to the rest.
submodule: correct detection of existing submodules
During the cache deletion, the check for whether we consider a submodule
to exist got changed regarding submodules which are in the worktree but
not configured.
Instead of checking for the url field to be populated, check the
location where we've found it.
This lets us specify in the status call which ignore rules we want to
use (optionally falling back to whatever the submodule has in its
configuration).
This removes one of the reasons for having `_set_ignore()` set the value
in-memory. We re-use the `IGNORE_RESET` value for this as it is no
longer relevant but has a similar purpose to `IGNORE_FALLBACK`.
Similarly, we remove `IGNORE_DEFAULT`, which has no use outside of
initializers, and move that to fall back to the configuration as well.
submodule: don't let status change an existing instance
As submodules become more like values, we should not let a status
check update their properties. Instead of taking a submodule, have
status take a repo and submodule name.
Having this cache and giving out its submodules goes against our
multithreading guarantees and makes it impossible to use submodules in a
multi-threaded environment, as any thread can ask for a refresh which
may reallocate some string in the submodule struct which we've accessed
in a different one via a getter.
This makes the submodules behave more like remotes, where each object is
created upon request and not shared except explicitly by the user. This
means that some tests won't pass yet, as they assume they can affect the
submodule objects in the cache and that will affect later operations.
merge: work around write-side racy protection when hacking the index
As we attempt to replicate a situation in which an older checkout has
put a file on disk with different filtering settings from us, set the
timestamp on the entry and file to a second before we're performing the
operation so the entry in the index counts as old.
This way we can test that we're not looking at the on-disk file when the
index has the entry and we detect it as clean.
commit: allow retrieving an arbitrary header field
This allows the user to look up fields which we don't parse in libgit2,
and allows them to access gpgsig or mergetag fields if they wish to
check the signature.
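
As a rough illustration of what looking up an arbitrary header field involves, here is a sketch that scans the header of a raw commit object. `sketch_header_field` is a hypothetical function, not the libgit2 API, and it assumes the usual raw layout in which header lines have the form `name value` and the header ends at the first empty line. It returns only the first line of a value and does not follow the continuation lines that a multi-line field such as gpgsig uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Find `field` in the header of a raw commit object.
 * On success returns 0 and points *value / *len at the first line of
 * the field's value (no copy is made). Returns -1 if the field is
 * not present in the header.
 */
int sketch_header_field(const char *raw, const char *field,
                        const char **value, size_t *len)
{
    const char *line = raw;
    size_t flen;

    assert(raw != NULL && field != NULL && value != NULL && len != NULL);
    flen = strlen(field);

    /* The header ends at the first empty line; the message follows. */
    while (*line != '\0' && *line != '\n') {
        const char *eol = strchr(line, '\n');
        if (eol == NULL)
            eol = line + strlen(line);

        /* Match "<field> " at the start of the line. */
        if (strncmp(line, field, flen) == 0 && line[flen] == ' ') {
            *value = line + flen + 1;
            *len = (size_t)(eol - *value);
            return 0;
        }

        line = (*eol == '\n') ? eol + 1 : eol;
    }

    return -1;
}
```

Because the scan stops at the blank line, text in the commit message that happens to look like a header is never matched.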
When an entry has a racy timestamp, we need to check whether the file
itself has changed since we put its entry in the index. Only then do we
smudge the size field to force a check the next time around.
diff: check files with the same or newer timestamps
When a file on the workdir has the same or a newer timestamp than the
index, we need to perform a full check of the contents, as the update of
the file may have happened just after we wrote the index.
The iterator changes are such that we can reach inside the workdir
iterator from the diff, though it may be better to have an accessor
instead of moving these structs into the header.
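
The timestamp rule above reduces to a single comparison. The sketch below uses a hypothetical function name and assumes both timestamps have one-second resolution.

```c
#include <time.h>

/*
 * Return 1 when the file's contents must be compared because its
 * timestamp is the same as or newer than the index's, 0 otherwise.
 */
int sketch_needs_content_check(time_t file_mtime, time_t index_mtime)
{
    return file_mtime >= index_mtime;
}
```

The equal case matters: a file modified in the same second the index was written has the same timestamp as an unmodified one, so only a content comparison can tell them apart.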
Edward Thomson [Fri, 19 Jun 2015 15:32:26 +0000 (08:32 -0700)]
diff: preserve original mode in the index
When updating the index during a diff, preserve the original mode,
which prevents us from dropping the mode to what we have interpreted
as on our system (e.g., what the working directory claims it to be,
which may be a lie on some systems.)
index: make relative comparison use the checksum as well
This is used by the submodule in order to figure out if the index has
changed since it last read it. Using a timestamp is racy, so let's make
it use the checksum, just like we now do for reloading the index itself.
When ticking over one second, it can happen that the actual time ticks
over the same second between the time that we undermine our own race
protections and the time in which we perform the index update. Such
timing would make the time in the entries match the index's timestamp and
we would not have gained anything.
Ticking over five seconds makes it so that if real-time rolls over that
second, our index is still ahead. This is still suboptimal as we're
dealing with timing, but five seconds should be long enough for any
reasonable test runner to finish the tests.
index: use the checksum to check whether it's been modified
We currently use a timestamp to check whether an index file has been
modified since we last read it, but this is racy. If two updates happen
in the same second and we read after the first one, we won't detect the
second one.
Instead, read the SHA-1 checksum of the file, which is its last 20
bytes. This gives us a sure-fire way to detect whether the file has
changed since we last read it.
As we're now keeping track of it, expose an accessor to this data.
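
A minimal sketch of the checksum comparison described above, using only the C standard library. The function names are hypothetical and this is not the code from the commit; it only illustrates reading the trailing 20 bytes of a file and comparing them with the value remembered from the previous read.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_CHECKSUM_LEN 20

/*
 * Read the trailing 20 bytes of `path` into `out`.
 * Returns 0 on success, -1 if the file cannot be read or is too short.
 */
int sketch_read_trailer(const char *path,
                        unsigned char out[SKETCH_CHECKSUM_LEN])
{
    FILE *fp;
    int ok;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Seek to 20 bytes before the end, then read the trailer. */
    ok = fseek(fp, -(long)SKETCH_CHECKSUM_LEN, SEEK_END) == 0 &&
         fread(out, 1, SKETCH_CHECKSUM_LEN, fp) == SKETCH_CHECKSUM_LEN;

    fclose(fp);
    return ok ? 0 : -1;
}

/*
 * Returns 1 if the file's trailer differs from `last_seen`,
 * 0 if it is identical, -1 on read error.
 */
int sketch_index_changed(const char *path,
                         const unsigned char last_seen[SKETCH_CHECKSUM_LEN])
{
    unsigned char current[SKETCH_CHECKSUM_LEN];

    if (sketch_read_trailer(path, current) < 0)
        return -1;

    return memcmp(current, last_seen, SKETCH_CHECKSUM_LEN) != 0;
}
```

Unlike a timestamp, the trailer changes whenever the file's contents change, so two writes within the same second are still told apart.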