A Rust library for parsing, compiling, and executing regular expressions. Its
syntax is similar to Perl-style regular expressions, but lacks a few features
like look-around and backreferences. In exchange, all searches execute in
linear time with respect to the size of the regular expression and search text.
Much of the syntax and implementation is inspired
by [RE2](https://github.com/google/re2).

[![Build Status](https://travis-ci.com/rust-lang/regex.svg?branch=master)](https://travis-ci.com/rust-lang/regex)
[![Build status](https://ci.appveyor.com/api/projects/status/github/rust-lang/regex?svg=true)](https://ci.appveyor.com/project/rust-lang-libs/regex)
[![Coverage Status](https://coveralls.io/repos/github/rust-lang/regex/badge.svg?branch=master)](https://coveralls.io/github/rust-lang/regex?branch=master)
[![](https://meritbadge.herokuapp.com/regex)](https://crates.io/crates/regex)
[![Rust](https://img.shields.io/badge/rust-1.24.1%2B-blue.svg?maxAge=3600)](https://github.com/rust-lang/regex)

### Documentation

[Module documentation with examples](https://docs.rs/regex).
The module documentation also includes a comprehensive description of the
syntax supported.

Documentation with examples for the various matching functions and iterators
can be found on the
[`Regex` type](https://docs.rs/regex/*/regex/struct.Regex.html).

### Usage

Add `regex = "1"` to the `[dependencies]` section of your `Cargo.toml`. If
you're using Rust 2015, also add `extern crate regex;` to your crate root.

Here's a simple example that matches a date in YYYY-MM-DD format and extracts
the year, month and day:

```rust
use regex::Regex;

fn main() {
    let re = Regex::new(r"(?x)
(?P<year>\d{4})  # the year
-
(?P<month>\d{2}) # the month
-
(?P<day>\d{2})   # the day
").unwrap();
    let caps = re.captures("2010-03-14").unwrap();

    assert_eq!("2010", &caps["year"]);
    assert_eq!("03", &caps["month"]);
    assert_eq!("14", &caps["day"]);
}
```

If you have lots of dates in text that you'd like to iterate over, then it's
easy to adapt the above example with an iterator:

```rust
use regex::Regex;

const TO_SEARCH: &str = "
On 2010-03-14, foo happened. On 2014-10-14, bar happened.
";

fn main() {
    let re = Regex::new(r"(\d{4})-(\d{2})-(\d{2})").unwrap();

    for caps in re.captures_iter(TO_SEARCH) {
        // Note that all of the unwraps are actually OK for this regex
        // because the only way for the regex to match is if all of the
        // capture groups match. This is not true in general though!
        println!("year: {}, month: {}, day: {}",
                 caps.get(1).unwrap().as_str(),
                 caps.get(2).unwrap().as_str(),
                 caps.get(3).unwrap().as_str());
    }
}
```

This example outputs:

```text
year: 2010, month: 03, day: 14
year: 2014, month: 10, day: 14
```

### Usage: Avoid compiling the same regex in a loop

It is an anti-pattern to compile the same regular expression in a loop since
compilation is typically expensive. (It takes anywhere from a few microseconds
to a few **milliseconds** depending on the size of the regex.) Not only is
compilation itself expensive, but this also prevents optimizations that reuse
allocations internally to the matching engines.

In Rust, it can sometimes be a pain to pass regular expressions around if
they're used from inside a helper function. Instead, we recommend using the
[`lazy_static`](https://crates.io/crates/lazy_static) crate to ensure that
regular expressions are compiled exactly once.

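```rust
#[macro_use]
extern crate lazy_static;
extern crate regex;

use regex::Regex;

fn some_helper_function(text: &str) -> bool {
    lazy_static! {
        static ref RE: Regex = Regex::new("...").unwrap();
    }
    RE.is_match(text)
}
```
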
Specifically, in this example, the regex will be compiled when it is used for
the first time. On subsequent uses, it will reuse the previous compilation.

### Usage: match regular expressions on `&[u8]`

The main API of this crate (`regex::Regex`) requires the caller to pass a
`&str` for searching. In Rust, an `&str` is required to be valid UTF-8, which
means the main API can't be used for searching arbitrary bytes.

To match on arbitrary bytes, use the `regex::bytes::Regex` API. The API
is identical to the main API, except that it takes an `&[u8]` to search
on instead of an `&str`. With Unicode mode disabled via the `(?-u)` flag, `.`
will match any *byte* (other than `\n`) using `regex::bytes::Regex`, while `.`
will match any *UTF-8 encoded Unicode scalar value* (other than `\n`) using
the main API.

This example shows how to find all null-terminated strings in a slice of bytes:

```rust
use regex::bytes::Regex;

// `(?-u)` disables Unicode mode so that `[^\x00]` matches arbitrary bytes.
let re = Regex::new(r"(?-u)(?P<cstr>[^\x00]+)\x00").unwrap();
let text = b"foo\x00bar\x00baz\x00";

// Extract all of the strings without the null terminator from each match.
// The unwrap is OK here since a match requires the `cstr` capture to match.
let cstrs: Vec<&[u8]> =
    re.captures_iter(text)
      .map(|c| c.name("cstr").unwrap().as_bytes())
      .collect();
assert_eq!(vec![&b"foo"[..], &b"bar"[..], &b"baz"[..]], cstrs);
```

Notice here that, with Unicode mode disabled, `[^\x00]+` will match any *byte*
except for `NUL`, including bytes such as `\xFF` that are not valid UTF-8. When
using the main API, `[^\x00]+` would instead match any valid UTF-8 sequence
except for `NUL`.

### Usage: match multiple regular expressions simultaneously

This demonstrates how to use a `RegexSet` to match multiple (possibly
overlapping) regular expressions in a single scan of the search text:

```rust
use regex::RegexSet;

let set = RegexSet::new(&[
    r"\w+", r"\d+", r"\pL+", r"foo", r"bar", r"barfoo", r"foobar",
]).unwrap();

// Iterate over and collect all of the matches.
let matches: Vec<_> = set.matches("foobar").into_iter().collect();
assert_eq!(matches, vec![0, 2, 3, 4, 6]);

// You can also test whether a particular regex matched:
let matches = set.matches("foobar");
assert!(!matches.matched(5));
assert!(matches.matched(6));
```

### Usage: enable SIMD optimizations

SIMD optimizations are enabled automatically on Rust stable 1.27 and newer.
For nightly versions of Rust, this requires a recent version with the SIMD
features stabilized.

### Usage: a regular expression parser

This repository contains a crate that provides a well-tested regular expression
parser, abstract syntax and a high-level intermediate representation for
convenient analysis. It provides no facilities for compilation or execution.
This may be useful if you're implementing your own regex engine or otherwise
need to do analysis on the syntax of a regular expression. It is otherwise not
recommended for general use.

[Documentation for `regex-syntax`.](https://docs.rs/regex-syntax)

### Minimum Rust version policy

This crate's minimum supported `rustc` version is `1.24.1`.

The current **tentative** policy is that the minimum Rust version required
to use this crate can be increased in minor version updates. For example, if
regex 1.0 requires Rust 1.20.0, then regex 1.0.z for all values of `z` will
also require Rust 1.20.0 or newer. However, regex 1.y for `y > 0` may require a
newer minimum version of Rust.

In general, this crate will be conservative with respect to the minimum
supported version of Rust.

### License

This project is licensed under either of

 * Apache License, Version 2.0, ([LICENSE-APACHE](LICENSE-APACHE) or
   http://www.apache.org/licenses/LICENSE-2.0)
 * MIT license ([LICENSE-MIT](LICENSE-MIT) or
   http://opensource.org/licenses/MIT)

at your option.

The data in `regex-syntax/src/unicode_tables/` is licensed under the Unicode
License Agreement
([LICENSE-UNICODE](http://www.unicode.org/copyright.html#License)).