Auto merge of #11032 - lqd:priority_pending_queue, r=Eh2406
Take priority into account within the pending queue
This is the PR for the work discussed in [this zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/pending.20queue.20scheduling.20experiments) and whose detailed description and some results are available [here](https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-sorted) with graphs, summaries and raw data -- much of which was shown in the thread as well.
Units of work have a computed priority that is used in the dependency queue, so that higher priorities are dequeued sooner, as documented [here](https://github.com/rust-lang/cargo/blob/996a6363ce4b9109d4ca757407dd6dcb4805c86f/src/cargo/util/dependency_queue.rs#L34-L35).
This PR further applies that principle to the next step before being executed: if multiple pieces of work are waiting in the pending queue, we can sort that according to their priorities. Here as well, higher priorities should be scheduled sooner.
Higher-priority units are more often than not wider than pure chains of dependencies, so this should create more parallelism opportunities earlier in the pipeline and try to sustain CPU utilization longer: a high priority piece of work represents more future pieces of work down the line. (This comes at the potential cost of that parallelism being distributed differently than today, between cargo invoking rustc and rustc's own codegen threads, when applicable.)
This is a scheduling tradeoff that behaves differently for each project, machine configuration, amount of available parallelism at a given point in time, etc., but it seems to help more often than it hinders, particularly at low core counts and with enough units of work that there is jobserver token contention, where choosing a "better" piece of work to run next may be possible.
There's of course a bit of noise in the results linked above and 800 or so of the most popular crates.io crates is still a limited sample, but they're mostly meant to show a hopefully positive trend: while there are improvements and regressions, that trend looks to be more positive than negative, with the wins being more numerous and with higher amplitudes than the corresponding losses.
(A report on another scheduling experiment -- a superset of this PR, where I also simulate users manually giving a higher priority to `syn`, `quote`, `serde_derive` -- [is available here](https://github.com/lqd/rustc-benchmarking-data/tree/main/experiments/cargo-schedules/pending-queue-prioritized) and also improves this PR's results: the regressions are decreased in number and amplitude, whereas the improvements are bigger and more numerous. So that could be further work to iterate upon this one)
Since this mostly applies to clean builds, for low core counts, and with a sufficient number of dependencies to have some items in the pending queue, I feel this also applies well to CI use-cases (esp. on the free tiers).
Somewhat reassuring as well, and discussed in the thread but not in the report: I've also tried to make sure cargo and bootstrapping rustc are not negatively affected. cargo saw some improvements, and bootstrap stayed within today's variance of +/- 2 to 3%. Similarly, since 3y old versions of some tokio crates (`0.2.0-alpha.1`) were the most negatively affected, I've also checked that recent tokio versions (`1.19`) are not disproportionately impacted: their simple readme example, the more idiomatic `mini-redis` sample, and some of my friends' tokio projects were either unaffected or saw some interesting improvements.
And here's a `cargo check -j2` graph to liven up this wall of text:
![some results of `cargo check -j2`](https://github.com/lqd/rustc-benchmarking-data/raw/main/experiments/cargo-schedules/pending-queue-sorted/images/check-j2-sorted.png)
---
I'm not a cargo expert so I'm not sure whether it would be preferable to integrate priorities deeper than just the dependency queue, and e.g. have `Unit`s contain a dedicated field or similar. So in the meantime I've done the simplest thing: just sort the pending queue and ask the dep queue for the units' priorities.
We could just as well have the priority recorded as part of the pending queue tuples themselves, or have that be some kind of priority queue/max heap instead of a Vec.
Let me know which you prefer, but it's in essence a very simple change as-is.
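To make the idea concrete, here is a minimal sketch of sorting a pending queue by priority before taking the next unit. It is not cargo's actual code: `Unit`, `sort_pending`, `next_unit` and the `HashMap` of priorities are hypothetical stand-ins for the real unit type and the dependency queue lookup.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a unit of work waiting for a jobserver token.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Unit(pub &'static str);

/// Sorts the pending queue by ascending priority, so the highest-priority
/// unit ends up last, where `Vec::pop` can take it cheaply.
/// Units without a recorded priority count as 0 (an assumption of this sketch).
pub fn sort_pending(pending: &mut Vec<Unit>, priorities: &HashMap<Unit, usize>) {
    pending.sort_by_key(|unit| priorities.get(unit).copied().unwrap_or(0));
}

/// Takes the next unit to run: the one with the highest priority.
pub fn next_unit(pending: &mut Vec<Unit>, priorities: &HashMap<Unit, usize>) -> Option<Unit> {
    sort_pending(pending, priorities);
    pending.pop()
}
```

With priorities `syn = 10`, `serde = 5` and `leaf = 1`, repeated calls to `next_unit` would hand out `syn`, then `serde`, then `leaf`.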
fix(add): Clarify which version the features are added for
### What does this PR try to resolve?
This gives a hint to users that we might not be showing the feature list
for the latest version but the earliest version.
Also when using a workspace dependency, this is a good reminder of what the version requirement is that was selected. That could also be useful for reused dependencies but didn't want to bother with the relevant plumbing for that.
ie we are going from
```console
$ cargo add chrono@0.4
Updating crates.io index
Adding chrono v0.4 to dependencies.
Features:
- rustc-serialize
- serde
```
to
```console
$ cargo add chrono@0.4
Updating crates.io index
Adding chrono v0.4 to dependencies.
Features as of v0.4.2:
- rustc-serialize
- serde
```
### How should we test and review this PR?
I'd recommend looking at this commit-by-commit. This is broken up into several refactors leading up to the main change. The refactors are focused on pulling UI logic out of dependency editing so we can more easily evolve the UI without breaking the editing API. I then tweaked the behavior in the final commit to be less redundant / noisy.
The existing tests are used to demonstrate this is working.
### Additional information
I'm also mixed on whether the meta version should show up.
Auto merge of #11081 - WaffleLapkin:no_for_in_option, r=epage
Don't use `for` loop on an `Option`
This PR removes a single `for` loop over `Option`, replacing it with an `if let` to improve code clarity. This currently blocks https://github.com/rust-lang/rust/pull/99696 that adds a lint against this pattern.
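For illustration, the pattern and its replacement look roughly like this. The function names are made up for the example and are not taken from the cargo source.

```rust
/// Discouraged form: a `for` loop over an `Option`.
/// It compiles because `Option` implements `IntoIterator`, but it reads like
/// a loop even though the body runs at most once. Newer compilers may warn.
pub fn greet_with_for(name: Option<&str>) -> Vec<String> {
    let mut out = Vec::new();
    for n in name {
        out.push(format!("hello {}", n));
    }
    out
}

/// Preferred form: `if let` states directly that there is zero or one value.
pub fn greet_with_if_let(name: Option<&str>) -> Vec<String> {
    let mut out = Vec::new();
    if let Some(n) = name {
        out.push(format!("hello {}", n));
    }
    out
}
```

Both functions behave identically; only the second makes the "at most once" intent obvious.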
Ed Page [Mon, 12 Sep 2022 14:46:22 +0000 (09:46 -0500)]
fix(add): Limit 'Features as of vX.Y.Z' to when relevant
This will only show the message if we didn't already show a version req
with full precision specified ... mostly. We'll also skip it if it's a
local or git dependency but we never show the version in those cases
because it doesn't matter.
The `precise_version` logic came from cargo-upgrade.
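As a rough sketch of what a "full precision" check could look like (this is not the actual `precise_version` code; `is_full_precision` and `should_show_features_version` are hypothetical names, and the parsing is deliberately simplistic):

```rust
/// Returns true when a version requirement such as `1.2.3` or `^1.2.3`
/// spells out major, minor and patch. A sketch only: it handles a single
/// comparator and ignores wildcards and comma-separated requirements.
pub fn is_full_precision(req: &str) -> bool {
    let req = req
        .trim()
        .trim_start_matches(|c: char| c == '^' || c == '~' || c == '=');
    // Drop any pre-release or build metadata suffix before counting parts.
    let core = req.split(|c: char| c == '-' || c == '+').next().unwrap_or("");
    let parts: Vec<&str> = core.split('.').collect();
    parts.len() == 3
        && parts
            .iter()
            .all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()))
}

/// The "Features as of vX.Y.Z" header only adds information when the
/// requirement shown to the user did not already pin all three components.
pub fn should_show_features_version(req: &str) -> bool {
    !is_full_precision(req)
}
```

Under these assumptions, `0.4` would get the extra header while `0.4.2` would not.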
Ed Page [Mon, 12 Sep 2022 14:37:05 +0000 (09:37 -0500)]
fix(add): Clarify which version the features are added for
This gives a hint to users that we might not be showing the feature list
for the latest version but the earliest version.
Also when using a workspace dependency or re-using an existing
dependency, this is a good reminder of what the version requirement is
that was selected.
However, when the user or add (the common case) selected a full
precision requirement, this is redundant.
I'm also mixed on whether the meta version should show up.
The progress indicator for sparse registries previously could go backwards as new dependencies are discovered, which confused users.
The new indicator looks like this:
```
Updating crates.io index
Fetch [====================> ] 46 complete; 29 pending
```
The progress bar percentage is based on the current depth in the dependency tree, with a hard coded limit at `10/11`. This provides natural feeling progress for many projects that I tested.
`complete` represents the number of index files downloaded, `pending` represents the number of index files that Cargo knows need to be downloaded but have not yet finished.
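One way to read that description is sketched below. The exact formula cargo uses is not given here, so the clamp-then-divide calculation, `MAX_DEPTH`, `progress_fraction` and `status_line` are all assumptions made for illustration.

```rust
/// Assumed cap: depth is clamped to 10 and divided by 11, so the bar can
/// approach but never reach 100% while downloads are still in flight.
const MAX_DEPTH: u32 = 10;

/// Maps the current dependency-tree depth to a progress fraction in [0, 10/11].
pub fn progress_fraction(depth: u32) -> f64 {
    f64::from(depth.min(MAX_DEPTH)) / f64::from(MAX_DEPTH + 1)
}

/// Formats the counters shown next to the bar.
pub fn status_line(complete: usize, pending: usize) -> String {
    format!("{} complete; {} pending", complete, pending)
}
```

Because the fraction depends only on depth, discovering more dependencies at the same depth changes the `pending` counter but does not move the bar backwards.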
The plan is to move the architecture documents over to rustdoc so they can more easily stay up-to-date. To do so, we'll need to enforce that the intradoc links stay valid.
As part of this, the PR run for `cargo doc` was updated to the command in #11019
Auto merge of #11044 - Eh2406:file_hash, r=weihanglo
Cache index files based on contents hash
Since #10507 Cargo has known how to read registry cached files whose index version starts with the hash of the file contents. Git makes it very cheap to determine the hash of a file. This PR switches cargo to start writing the new format.
Cargo versions from before #10507 will not know how to read, and will therefore overwrite, cached files written by Cargo versions after this PR.
Cargo versions after this PR can still read, and will consider up-to-date, cached files written by all older Cargo versions.
As I'm writing this out I'm thinking that there may not be any point in writing a file that has both. An alternative implementation just writes the file contents hash. 🤔
This cleans up the priority-sorted scheduling by removing the need
for a priority accessor that would hash the nodes, and allows inserting
in the queue at the correctly sorted position to remove the insert +
sort combination.
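A minimal sketch of inserting at the sorted position, assuming the priority is stored alongside each entry. `Pending` and `insert_sorted` are hypothetical names, not the types used in cargo.

```rust
/// Hypothetical pending-queue entry: a priority plus a unit name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Pending {
    pub priority: usize,
    pub name: &'static str,
}

/// Inserts `item` so that `queue` stays sorted by ascending priority.
/// The highest-priority entry is then always last and can be popped.
pub fn insert_sorted(queue: &mut Vec<Pending>, item: Pending) {
    // `partition_point` binary-searches for the first entry whose priority is
    // greater than the new one, so the new entry lands after any existing
    // entries of equal priority.
    let idx = queue.partition_point(|p| p.priority <= item.priority);
    queue.insert(idx, item);
}
```

Compared with pushing and then re-sorting, this keeps the queue ordered at all times and needs no separate priority lookup during the sort.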
Rémy Rakic [Mon, 29 Aug 2022 14:24:17 +0000 (16:24 +0200)]
sort the pending queue according to cost/priority
If multiple pieces of work are waiting in the pending queue, we can sort it according to
their priorities: higher priorities should be scheduled sooner.
They are more often than not wider than pure chains, and this should create more parallelism
opportunities earlier in the pipeline: a high priority piece of work represents more future
pieces of work down the line.
This is a scheduling tradeoff that behaves differently for each project, machine configuration,
amount of available parallelism at a given point in time, etc, but seems to help more often than it
hinders, at low-core counts and with enough units of work to be done, so that there is jobserver
token contention where choosing a "better" piece of work to work on next is possible.
This is trying to clarify `-C` support when it is implemented in #10952
Cargo currently has two initialization states for Config, `Config::default` (process start) and `config.configure` (after parsing args). The most help we provide for a developer touching this code is a giant `CAUTION` comment in one of the relevant functions.
Currently, #10952 adds another configuration state in the middle where the `current_dir` has been set.
The goal of this PR is to remove that third configuration state by
- Lazy loading `Config::default` so it can be called after parsing `-C`
- Allowing `-C` support to assert that the config isn't loaded yet to catch bugs with it
The hope is this will make the intent of the code clearer and reduce the chance for bugs.
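The two goals above can be sketched as follows. This is an illustrative model only: `Config`, `LazyConfig`, `get_or_load` and `apply_change_dir` are hypothetical and much simpler than cargo's real types.

```rust
use std::path::PathBuf;

/// Hypothetical, heavily simplified configuration.
#[derive(Debug)]
pub struct Config {
    pub cwd: PathBuf,
}

/// Holds a config that is only built the first time it is requested.
pub struct LazyConfig {
    config: Option<Config>,
}

impl LazyConfig {
    pub fn new() -> Self {
        Self { config: None }
    }

    /// Reports whether the config has been loaded yet.
    pub fn is_init(&self) -> bool {
        self.config.is_some()
    }

    /// Loads the config on first access, using the directory `cwd` returns
    /// at that moment; later calls reuse the already loaded value.
    pub fn get_or_load(&mut self, cwd: impl FnOnce() -> PathBuf) -> &mut Config {
        self.config.get_or_insert_with(|| Config { cwd: cwd() })
    }
}

/// Hypothetical `-C <dir>` handler: it must run before the config is loaded,
/// and asserts that this is the case to catch ordering bugs early.
pub fn apply_change_dir(lazy: &LazyConfig, current: &mut PathBuf, dir: &str) {
    assert!(!lazy.is_init(), "config must not be loaded before `-C` is applied");
    *current = current.join(dir);
}
```

Because loading is deferred, argument parsing can apply `-C` first and the config then picks up the already adjusted directory, with no intermediate state in between.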
In doing this, there are two intermediate refactorings
- Make help behave like other subcommands
- Before, we had hacks to intercept raw arguments and to intercept clap errors and assume what their intention was to be able to implement our help system.
- This flips it around and makes help like any other subcommand,
simplifying cargo initialization.
- We had to upgrade clap because this exposed a bug where `Command::print_help` wasn't respecting `disable_colored_help(true)`
- Delay fix's access to config
Personally, I also find both changes make the intent of the code clearer.
To review this, I recommend looking at the individual commits. As this is just refactors, this has no impact on testing.
Ed Page [Mon, 29 Aug 2022 16:29:11 +0000 (11:29 -0500)]
refactor(cli): Make help behave like other subcommands
Before, we had hacks to intercept raw arguments and to intercept clap
errors and assume what their intention was to be able to implement our
help system.
This flips it around and makes help like any other subcommand,
simplifying cargo initialization.
bors [Wed, 31 Aug 2022 21:14:13 +0000 (21:14 +0000)]
Auto merge of #11039 - ehuss:ci-names, r=epage
Add names to CI jobs
This adds names to the CI jobs. I've often found the existing auto-generated names to be confusing, and I think it would help to make them a little more succinct and clearer.
Ed Page [Wed, 31 Aug 2022 21:11:36 +0000 (16:11 -0500)]
chore: Don't show genned docs in ripgrep
The goal is to help new (and existing) users more quickly find the
appropriate files to edit (like in #11033). The main downside is for
someone trying to find output to verify what it looks like, a simple
search won't turn up results but there are other ways around that
(`--no-ignore`, `git status` after doing a man generation, etc).
bors [Tue, 30 Aug 2022 22:59:43 +0000 (22:59 +0000)]
Auto merge of #11028 - ehuss:test-errors, r=weihanglo
Rework test error handling
This reworks how errors are handled when running tests and benchmarks. There were some cases where Cargo was eating the actual error and not displaying it. For example, if a test process fails to launch, it only displayed the `could not execute process` message, but didn't explain why it failed to execute. This fixes it to ensure that the full error chain is displayed.
This also tries to simplify how the errors are handled, and makes them more uniform across `test` and `bench`, and with doctests.
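Displaying a full error chain generally means walking `std::error::Error::source` until it returns `None`. The sketch below shows that idea with standard-library types only; `Context` and `render_chain` are hypothetical helpers, and the output layout is modeled on the `Caused by:` format rather than copied from cargo.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper that adds a message on top of an underlying error.
#[derive(Debug)]
pub struct Context {
    msg: String,
    source: Box<dyn Error + 'static>,
}

impl Context {
    pub fn new(msg: &str, source: impl Error + 'static) -> Self {
        Self { msg: msg.to_string(), source: Box::new(source) }
    }
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.msg)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&*self.source)
    }
}

/// Renders the top-level message followed by every cause in the chain.
pub fn render_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = format!("error: {}", err);
    let mut cause = err.source();
    if cause.is_some() {
        out.push_str("\n\nCaused by:");
    }
    while let Some(e) = cause {
        out.push_str(&format!("\n  {}", e));
        cause = e.source();
    }
    out
}
```

Printing only the top-level `Display` output would stop at "could not execute process"; walking the chain also surfaces the underlying reason.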
This also changes the `--no-fail-fast` behavior to report errors as they happen instead of grouped at the end (and prints a summary at the end). This helps to make it clearer when a nonstandard error happens. For example, before:
```diff
 running 1 test
+error: test failed, to rerun pass `--test t1`
+
+Caused by:
+ process didn't exit successfully: `/Users/eric/Temp/z12/target/debug/deps/t1-bb449dfa37379ba1` (signal: 11, SIGSEGV: invalid memory reference)
 Running tests/t2.rs (target/debug/deps/t2-1770ae8367bc97ce)
 running 1 test
@@ -18,8 +22,7 @@
 test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
-error: test failed.
-
-Caused by:
- process didn't exit successfully: `/Users/eric/Temp/z12/target/debug/deps/t1-bb449dfa37379ba1` (signal: 11, SIGSEGV: invalid memory reference)
- process didn't exit successfully: `/Users/eric/Temp/z12/target/debug/deps/t2-1770ae8367bc97ce` (exit status: 101)
+error: test failed, to rerun pass `--test t2`
+error: 2 targets failed:
+ `--test t1`
+ `--test t2`
```
In the first example, when it says `Running tests/t1.rs`, there is no error message displayed until after all the tests finish, and that error message is not associated with the original test. This also includes the "to rerun" hint with `--no-fail-fast`.
bors [Tue, 30 Aug 2022 21:50:14 +0000 (21:50 +0000)]
Auto merge of #11033 - buggymcbugfix:buggymcbugfix/cargo-add-docs, r=epage
Very slight `cargo add` documentation improvements
As discussed in https://github.com/rust-lang/book/pull/3331, a quick explanation of the `Features` part of the message that gets printed to stdout when adding some dependency.
Consider the following example:
```
cargo add my-crate
Updating crates.io index
Adding my-crate v0.1.0 to dependencies.
Features:
+ foo
- bar
```
It was not clear to me what `+foo` and `-bar` meant until @carols10cents kindly explained it to me. Hopefully the documentation now clarifies this.
bors [Tue, 30 Aug 2022 00:26:35 +0000 (00:26 +0000)]
Auto merge of #11030 - ehuss:update-installation-requirements, r=epage
Update compiling requirements.
This updates the requirements for building cargo itself. It adds a little more clarification on exactly what is needed, and what some of the options are.
This may be a little bit too much detail, as usually I suspect most users will just run `cargo build` and follow the error message instructions on what to install next.
bors [Wed, 24 Aug 2022 07:05:49 +0000 (07:05 +0000)]
Auto merge of #10807 - dtolnay-contrib:sha256, r=weihanglo
Apply GitHub fast path even for partial hashes
### What does this PR try to resolve?
As flagged in https://github.com/rust-lang/cargo/pull/10079#issuecomment-1170940132, it's not great to assume that git SHAs would always be 40-character hex strings. In the future they will be longer.
> Git is on a long-term trajectory to move to SHA256 hashes ([current status](https://lwn.net/SubscriberLink/898522/f267d0e9b4fe9983/)). I suppose when that becomes available/the default it's possible for a 40-digit hex-encoded hash not to be the full hash. Will this fail for that case?
The implementation from #10079 fails in that situation because it turns dependencies of the form `{ git = "…", rev = "[…40 hex…]" }` into fetches with a refspec `+[…40 hex…]:refs/commit/[…40 hex…]`. That only works if the 40 hex digits are the *full* long hash of your commit. If it's really a prefix ("short hash") of a 64-hex-digit SHA-256 commit hash, you'd get a failure that resembles:
```console
error: failed to get `dependency` as a dependency of package `repro v0.0.0`
Caused by:
failed to load source for dependency `dependency`
```
This PR updates the implementation so that Cargo will curl GitHub to get a resolved long commit hash *even if* the `rev` specified for the git dependency in Cargo.toml already looks like a SHA-1 long hash.
### Performance considerations
⛔ This reverses a (questionable, negligible) benefit of #10079 of skipping the curl when `rev` is a long hash and is not already present in the local clone. These curls take 200-250ms on my machine.
🟰 We retain the much larger benefit of #10079 which comes from being able to precisely fetch a single `rev`, instead of fetching all branches and tags in the upstream repo and hoping to find the rev somewhere in there. This accounts for the entire performance difference explained in the summary of that PR.
🟰 We still skip the curl when `rev` is a **long hash** of a commit that is already previously fetched.
🥳 After this PR, we also curl and hit fast path when `rev` is a **short hash** of some upstream commit. For example `{ git = "https://github.com/rust-lang/cargo", rev = "b30694b4d9" }` would previously have done the download-all-branches-and-tags codepath because `b30694b4d9` is not a long hash. After this PR, the curl to GitHub informs us that `b30694b4d9` resolves to the long hash `b30694b4d9b29141298870b7993e9aee10940524`, and we download just that commit instead of all-branches-and-tags.
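The distinction between "looks like a commit hash" and "is a known full hash" can be sketched like this. The helper names and the minimum length of 7 hex digits are assumptions for the example; only the refspec shape `+<hash>:refs/commit/<hash>` comes from the description above.

```rust
/// True when `rev` consists only of hex digits, so it may be a commit hash
/// (full or abbreviated) rather than a branch or tag name.
/// The minimum length of 7 is an assumption of this sketch.
pub fn looks_like_commit_hash(rev: &str) -> bool {
    rev.len() >= 7 && rev.chars().all(|c| c.is_ascii_hexdigit())
}

/// Builds the refspec used to fetch exactly one commit, given the full hash
/// that the remote resolved `rev` to.
pub fn single_commit_refspec(full_hash: &str) -> String {
    format!("+{0}:refs/commit/{0}", full_hash)
}
```

Note that the refspec is built from the resolved long hash, not from the `rev` string in `Cargo.toml`, which is what lets a short hash such as `b30694b4d9` take the fast path.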
### How should we test and review this PR?
I tested with the following dependency specs, using `/path/to/target/release/cargo generate-lockfile`.
```toml
# Before and after: fast path
cargo = { git = "https://github.com/rust-lang/cargo", rev = "refs/heads/rust-1.14.0" }
```
```toml
# Before and after: same error "revspec 'rust-1.14.0' not found"
# You are supposed to use `branch = "rust-1.14.0"`, this is not considered a `rev`
cargo = { git = "https://github.com/rust-lang/cargo", rev = "rust-1.14.0" }
```
I made sure these all work both with and without `rm -rf ~/.cargo/git/db/cargo-* ~/.cargo/git/checkouts/cargo-*` in between each cargo invocation.
bors [Tue, 23 Aug 2022 21:39:56 +0000 (21:39 +0000)]
Auto merge of #11017 - BlackHoleFox:non-ascii-names, r=weihanglo
Update non-ASCII crate name warning message
This PR fixes an outdated warning that is sometimes shown when initializing crates.
### What does this PR try to resolve?
Per [a Zulip convo](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Non-ASCII.20crate.20name.20status/near/294876491) on the topic, non-ASCII crate names are no longer allowed on any toolchain since https://github.com/rust-lang/rust/pull/73305, during the `non_ascii_idents` feature's development. Cargo however tells the user that they are accepted on Nightly. Rust and Cargo should agree on this point to avoid future confusion.
### How should we test and review this PR?
This should be covered by the existing test that was changed, but if desired it's easy to test with a checkout:
```
Running `/Users/fox/x/Forks/cargo/target/release/cargo init 'ああああ'`
warning: the name `ああああ` contains non-ASCII characters
Non-ASCII crate names are not supported by Rust.
```
Weihang Lo [Fri, 19 Aug 2022 22:30:01 +0000 (23:30 +0100)]
Ignore broken but excluded file during traversing
Walkdir's [`filter_entry()`][1] won't call the predicate if the entry
is essentially an `Err` from its underlying `IntoIter`. That means
Cargo never gets a chance to call `filter` on an entry that should be
excluded but instead yields an `Err`, which causes the loop to stop.
For instance, a broken symlink which should be excluded by `filter`
will generate an error since the `filter` closure is not called with it.
The solution is to call `filter` if an error occurs with a path
(because it has not yet been called with that path).
If the path is indeed excluded, ignore the error.
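The control flow can be sketched without the `walkdir` crate by modeling each entry as a `Result`. `WalkError` and `collect_included` are hypothetical stand-ins; the real code works with walkdir's own entry and error types.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a directory-walk error that remembers which
/// path it was raised for, if any.
#[derive(Debug)]
pub struct WalkError {
    pub path: Option<PathBuf>,
    pub cause: io::Error,
}

/// Collects included paths. An error is ignored only when it carries a path
/// that the filter would have excluded anyway; any other error aborts.
pub fn collect_included<I, F>(entries: I, mut is_included: F) -> Result<Vec<PathBuf>, WalkError>
where
    I: IntoIterator<Item = Result<PathBuf, WalkError>>,
    F: FnMut(&Path) -> bool,
{
    let mut out = Vec::new();
    for entry in entries {
        match entry {
            Ok(path) => {
                if is_included(&path) {
                    out.push(path);
                }
            }
            Err(err) => {
                // The filter never saw this path, so consult it now.
                let excluded = err.path.as_deref().map_or(false, |p| !is_included(p));
                if !excluded {
                    return Err(err);
                }
            }
        }
    }
    Ok(out)
}
```

A broken symlink under an excluded directory is then skipped silently, while the same breakage in an included location still surfaces as an error.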
bors [Wed, 17 Aug 2022 20:18:30 +0000 (20:18 +0000)]
Auto merge of #11001 - Muscraft:fix-unstable-docs, r=weihanglo
remove missed reference to workspace inheritance in unstable.md
Currently on the nightly docs, workspace inheritance is [under the stable table](https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#workspace-inheritance-1) and the [unstable table](https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#workspace-inheritance). It looks like I forgot to remove it from the unstable table when working on stabilization.
I am not sure if it is worth a beta backport but I will happily open a PR for it if needed.
bors [Tue, 16 Aug 2022 12:29:32 +0000 (12:29 +0000)]
Auto merge of #10975 - theCapypara:flock-enosys-android, r=weihanglo
Fix file locking being not supported on Android raising an error
This PR fixes #10972 by not failing Cargo operations when the `target_os` is Android and file locking is being reported as not being implemented by the kernel.
I am sadly unable to actually test this at the moment, since despite my best efforts I am not able to get Cargo actually cross-compiled for Android (aarch64-linux-android).
I however don't see any reason why this wouldn't work. `target_os` is "android" on Android and not "linux".
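A minimal sketch of the check, assuming the kernel reports the missing feature as `ENOSYS`. The function name is hypothetical, the errno value is hard-coded so the example needs no external crate, and the platform is passed in as a flag so both branches can be exercised anywhere.

```rust
use std::io;

/// Errno for "function not implemented" on common Linux/Android targets.
/// Hard-coded as an assumption of this sketch; real code would take the
/// constant from the `libc` crate instead.
const ENOSYS: i32 = 38;

/// Decides whether a failed lock attempt should be treated as
/// "locking unsupported" rather than as a hard error.
/// In real code `is_android` would be `cfg!(target_os = "android")`.
pub fn lock_error_is_ignorable(err: &io::Error, is_android: bool) -> bool {
    is_android && err.raw_os_error() == Some(ENOSYS)
}
```

Any other error, or the same error on a different target, is still reported as before.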
bors [Thu, 11 Aug 2022 22:30:26 +0000 (22:30 +0000)]
Auto merge of #10930 - ehuss:enable-windows-tests, r=weihanglo
Enable two windows tests
These two tests were disabled on Windows a long time ago. AFAICT, the reasons are no longer relevant, and it should be safe to enable these tests. See each commit for a more detailed exposition.
bors [Thu, 11 Aug 2022 03:23:09 +0000 (03:23 +0000)]
Auto merge of #10968 - hi-rustin:rustin-patch-msg, r=ehuss
Improve error msg for get target runner
Actually, we'll get this config from three places. So this msg may be confusing when you set it up in `.cargo/config.toml` or pass it by `--config`.
We already printed the location of the config, so I think it's OK to change it to `configurations`.