# camino - UTF-8 paths

[![camino on crates.io](https://img.shields.io/crates/v/camino)](https://crates.io/crates/camino)
[![crates.io download count](https://img.shields.io/crates/d/camino)](https://crates.io/crates/camino)
[![Documentation (latest release)](https://img.shields.io/badge/docs-latest%20version-brightgreen.svg)](https://docs.rs/camino)
[![Documentation (main)](https://img.shields.io/badge/docs-main-purple.svg)](https://camino-rs.github.io/camino/rustdoc/camino/)
[![License](https://img.shields.io/badge/license-Apache-green.svg)](LICENSE-APACHE)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE-MIT)

This repository contains the source code for `camino`, an extension of the `std::path` module that adds new
[`Utf8PathBuf`] and [`Utf8Path`] types.

## What is camino?

`camino`'s [`Utf8PathBuf`] and [`Utf8Path`] types are like the standard library's [`PathBuf`] and [`Path`] types, except
they are guaranteed to only contain UTF-8 encoded data. As a result, they can expose their contents as strings,
implement `Display`, and so on.

The `std::path` types are not guaranteed to be valid UTF-8. This is the right decision for the standard library,
since it must be as general as possible. However, on all platforms, non-Unicode paths are vanishingly rare for a
number of reasons:
* Unicode won. There are still some legacy codebases that store paths in encodings like [Shift JIS], but most
  have been converted to Unicode at this point.
* Unicode is the common subset of supported paths across Windows and Unix platforms. (On Windows, Rust stores paths
  as [an extension to UTF-8](https://simonsapin.github.io/wtf-8/), and converts them to UTF-16 at Win32
  API boundaries.)
* There are already many systems, such as Cargo, that only support UTF-8 paths. If your own tool interacts with any such
  system, you can assume that paths are valid UTF-8 without creating any additional burdens on consumers.
* The ["makefile problem"](https://www.mercurial-scm.org/wiki/EncodingStrategy#The_.22makefile_problem.22) asks: given a
  Makefile or other metadata file (such as `Cargo.toml`) that lists the names of other files, how should the names in
  the Makefile be matched with the ones on disk? This has *no general, cross-platform solution* in systems that support
  non-UTF-8 paths. However, restricting paths to UTF-8 eliminates this problem.

[Shift JIS]: https://en.wikipedia.org/wiki/Shift_JIS

Therefore, many programs that want to manipulate paths *do* assume they contain UTF-8 data, and convert them to `str`s
as necessary. However, because this invariant is not encoded in the `Path` type, conversions such as
`path.to_str().unwrap()` need to be repeated again and again, creating a frustrating experience.

Instead, `camino` allows you to check that your paths are UTF-8 *once*, and then manipulate them
as valid UTF-8 from there on, avoiding repeated lossy and confusing conversions.

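To make the "check once" idea concrete, here is a minimal sketch that uses only the standard library. The `CheckedUtf8PathBuf` type and its methods are hypothetical names invented for this illustration. They are not `camino`'s actual implementation; they only show how moving the UTF-8 check into a constructor lets later string access be infallible.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a UTF-8-checked path; NOT camino's actual type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CheckedUtf8PathBuf(PathBuf);

impl CheckedUtf8PathBuf {
    /// Validate once; hand the original path back if it is not valid UTF-8.
    fn from_path_buf(path: PathBuf) -> Result<Self, PathBuf> {
        if path.to_str().is_some() {
            Ok(Self(path))
        } else {
            Err(path)
        }
    }

    /// Infallible for callers because the constructor already checked the invariant.
    fn as_str(&self) -> &str {
        self.0
            .to_str()
            .expect("validated as UTF-8 in from_path_buf")
    }
}

impl fmt::Display for CheckedUtf8PathBuf {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl AsRef<Path> for CheckedUtf8PathBuf {
    fn as_ref(&self) -> &Path {
        &self.0
    }
}

fn main() {
    let path = CheckedUtf8PathBuf::from_path_buf(PathBuf::from("src/lib.rs"))
        .expect("the literal is valid UTF-8");
    // No `Option` and no `unwrap` at each use site.
    assert!(path.as_str().ends_with(".rs"));
    println!("checked path: {}", path);
}
```

The only fallible step is the constructor. Everything downstream works with `&str` directly, which is the experience the paragraph above describes.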
## Examples

The documentation for [`Utf8PathBuf`] and [`Utf8Path`] contains several examples.

For examples of how to use `camino` with other libraries like `serde` and `clap`, see the [`camino-examples`] directory.

## API design

`camino` is a very thin wrapper around `std::path`. [`Utf8Path`] and [`Utf8PathBuf`] are drop-in replacements
for [`Path`] and [`PathBuf`].

Most APIs are the same, but those at the boundary with `str` are different. Some examples:
* `Path::to_str() -> Option<&str>` has been renamed to `Utf8Path::as_str() -> &str`.
* [`Utf8Path`] implements `Display`, and `Path::display()` has been removed.
* Iterating over a [`Utf8Path`] returns `&str`, not `&OsStr`.

Every [`Utf8Path`] is a valid [`Path`], so [`Utf8Path`] implements `AsRef<Path>`. Any APIs that accept `impl AsRef<Path>`
will continue to work with [`Utf8Path`] instances.

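For comparison, the following sketch shows the standard-library side of the three differences listed above, using only `std`. The helper name `std_components` is made up for this illustration. Each step here involves an `Option`, a lossy adapter, or an `&OsStr`, which is where the list above says the `Utf8Path` equivalents return `&str` or implement `Display` directly.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Collect the pieces produced by iterating over a std `Path`.
/// (Hypothetical helper, named for this example only.)
fn std_components(path: &Path) -> Vec<&OsStr> {
    path.iter().collect()
}

fn main() {
    let path = Path::new("src/lib.rs");

    // `to_str` is fallible: it returns `None` if the path is not valid Unicode.
    let as_str: Option<&str> = path.to_str();
    assert_eq!(as_str, Some("src/lib.rs"));

    // `Path` does not implement `Display`; `display()` returns a lossy adapter.
    assert_eq!(path.display().to_string(), "src/lib.rs");

    // Iteration yields `&OsStr`, which needs another conversion before use as text.
    let parts = std_components(path);
    assert_eq!(parts, vec![OsStr::new("src"), OsStr::new("lib.rs")]);
}
```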
## Should you use camino?

`camino` trades off some utility for a great deal of simplicity. Whether `camino` is appropriate for a project or not
is ultimately a case-by-case decision. Here are some general guidelines that may help.

*You should consider using camino if...*

* **You're building portable, cross-platform software.** While both Unix and Windows platforms support different kinds
  of non-Unicode paths, Unicode is the common subset that's supported across them.
* **Your system has files that contain the names of other files.** If you don't use UTF-8 paths, you will run into the
  makefile problem described above, which has no general, cross-platform solution.
* **You're interacting with existing systems that already assume UTF-8 paths.** In that case you won't be adding any new
  burdens on downstream consumers.
* **You're building something brand new and are willing to ask your users to rename their paths if necessary.** Projects
  that don't have to worry about legacy compatibility have more flexibility in choosing what paths they support.

In general, using camino is the right choice for most projects.

*You should **NOT** use camino if...*

* **You're writing a core system utility.** If you're writing, say, an `mv` or `cat` replacement, you should
  **not** use camino. Instead, use [`std::path::Path`] and add extensive tests for non-UTF-8 paths.
* **You have legacy compatibility constraints.** For example, Git supports non-UTF-8 paths. If your tool needs to handle
  arbitrary Git repositories, it should use its own path type that's a wrapper around `Vec<u8>`.
  * [`std::path::Path`] supports arbitrary bytestrings [on Unix] but not on Windows.
* **There's some other reason you need to support non-UTF-8 paths.** Some tools like disk recovery utilities need to
  handle potentially corrupt filenames: only being able to handle UTF-8 paths would greatly diminish their utility.

[on Unix]: https://doc.rust-lang.org/std/os/unix/ffi/index.html

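As a small illustration of the kind of path that falls outside UTF-8, the sketch below builds one with the standard library on Unix. It compiles only on Unix targets because it relies on `std::os::unix::ffi::OsStrExt`; the file name and the helper `non_utf8_path` are invented for the example.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::path::{Path, PathBuf};

/// Build a path whose file name contains the byte 0xFF, which never appears in valid UTF-8.
/// (Hypothetical helper and file name, for illustration only.)
fn non_utf8_path() -> PathBuf {
    let name = OsStr::from_bytes(b"report-\xFF.txt");
    Path::new("/tmp").join(name)
}

fn main() {
    let path = non_utf8_path();

    // The standard library represents the path without complaint...
    assert_eq!(path.parent(), Some(Path::new("/tmp")));

    // ...but it cannot be viewed as a `&str`, so a UTF-8-only path type would reject it.
    assert!(path.to_str().is_none());

    // `display()` substitutes U+FFFD for the invalid byte.
    println!("{}", path.display());
}
```

Tests for a tool that must accept such paths could construct inputs this way rather than depending on files that already exist on disk.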
## Optional features

By default, `camino` has **no dependencies** other than `std`. There are some optional features that enable
dependencies:
* `serde1` adds serde [`Serialize`] and [`Deserialize`] impls for [`Utf8PathBuf`] and [`Utf8Path`]
  (zero-copy).
* `proptest1` adds [proptest](https://altsysrq.github.io/proptest-book/) [`Arbitrary`]
  implementations for [`Utf8PathBuf`] and `Box<Utf8Path>`.

## Rust version support

The minimum supported Rust version (MSRV) for `camino` with default features is **1.34**. This project is tested in CI
against the latest stable version of Rust and the MSRV.
* *Stable APIs* added in later Rust versions are supported either through conditional compilation in `build.rs`, or through backfills that also work on older versions.
* *Deprecations* are kept in sync with the version of Rust they're added in.
* *Unstable APIs* are currently not supported. Please
  [file an issue on GitHub](https://github.com/camino-rs/camino/issues/new) if you need an unstable API.

`camino` is designed to be a core library and has a conservative MSRV policy. MSRV increases will only happen for
a compelling enough reason, and will involve at least a minor version bump.

Optional features may pull in dependencies that require a newer version of Rust.

## License

This project is available under the terms of either the [Apache 2.0 license](LICENSE-APACHE) or the [MIT
license](LICENSE-MIT).

This project's documentation is adapted from [The Rust Programming Language](https://github.com/rust-lang/rust/), which is
available under the terms of either the [Apache 2.0 license](https://github.com/rust-lang/rust/blob/master/LICENSE-APACHE)
or the [MIT license](https://github.com/rust-lang/rust/blob/master/LICENSE-MIT).

[`Utf8PathBuf`]: https://docs.rs/camino/*/camino/struct.Utf8PathBuf.html
[`Utf8Path`]: https://docs.rs/camino/*/camino/struct.Utf8Path.html
[`PathBuf`]: https://doc.rust-lang.org/std/path/struct.PathBuf.html
[`Path`]: https://doc.rust-lang.org/std/path/struct.Path.html
[`std::path::Path`]: https://doc.rust-lang.org/std/path/struct.Path.html
[`Serialize`]: https://docs.rs/serde/1/serde/trait.Serialize.html
[`Deserialize`]: https://docs.rs/serde/1/serde/trait.Deserialize.html
[`camino-examples`]: https://github.com/camino-rs/camino/tree/main/camino-examples
[`Arbitrary`]: https://docs.rs/proptest/1/proptest/arbitrary/trait.Arbitrary.html