HTML Sanitization
=================

[![Crates.IO](https://img.shields.io/crates/v/ammonia.svg)](https://crates.rs/crates/ammonia)
![Requires rustc 1.36.0](https://img.shields.io/badge/rustc-1.36.0+-green.svg)

Ammonia is a whitelist-based HTML sanitization library. It is designed to
prevent cross-site scripting, layout breaking, and clickjacking caused
by untrusted user-provided HTML being mixed into a larger web page.

Ammonia uses [html5ever] to parse and serialize document fragments the same way browsers do,
so it is extremely resilient to syntactic obfuscation.

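To make "syntactic obfuscation" concrete, here is a minimal sketch of the kind of string-level filter that a parser-based sanitizer avoids. `naive_strip_script` and `obfuscated_payload` are hypothetical names used only for illustration; neither is part of Ammonia's API. Applied to the payload, the filter returns `<script>alert(1)</script>`, because deleting each decoy tag reassembles the real tag that was wrapped around it.

```rust
/// Hypothetical naive filter (NOT part of Ammonia): deletes the literal
/// substrings `<script>` and `</script>`, each in a single pass, without
/// parsing the input as HTML.
fn naive_strip_script(input: &str) -> String {
    input.replace("<script>", "").replace("</script>", "")
}

/// Hypothetical attack input: each tag is split around a decoy copy of
/// itself, so removing the decoy leaves a complete tag behind.
fn obfuscated_payload() -> String {
    String::from("<scr<script>ipt>alert(1)</scr</script>ipt>")
}
```
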
Ammonia parses its input exactly according to the HTML5 specification;
it will not linkify bare URLs, insert line or paragraph breaks, or convert `(C)` into ©.
If you want any of that, run a markup processor such as [pulldown-cmark] before the sanitizer.

[html5ever]: https://github.com/servo/html5ever "The HTML parser in Servo"
[pulldown-cmark]: https://github.com/google/pulldown-cmark


Installation
------------

To use `ammonia`, add it to your project's `Cargo.toml` file:

```toml
[dependencies]
ammonia = "3"
```


Changes
-------

Please see the [CHANGELOG](CHANGELOG.md) for the release history.


Example
-------

This example uses [pulldown-cmark] together with Ammonia to build a friendly
user-facing comment site.

```rust
use ammonia::clean;
use pulldown_cmark::{Parser, Options, html::push_html};

let text = "[a link](http://www.notriddle.com/)";

// Enable the Markdown extensions you want; tables are used here.
let mut options = Options::empty();
options.insert(Options::ENABLE_TABLES);

// Render the untrusted Markdown to HTML. The result is not yet safe to embed.
let md_parse = Parser::new_ext(text, options);
let mut unsafe_html = String::new();
push_html(&mut unsafe_html, md_parse);

// Sanitize with Ammonia's defaults, which keep `<p>` and `<a>` and add a
// `rel` attribute to links.
let safe_html = clean(&*unsafe_html);
assert_eq!(
    safe_html,
    "<p><a href=\"http://www.notriddle.com/\" rel=\"noopener noreferrer\">a link</a></p>\n"
);
```


Performance
-----------

Ammonia builds a DOM, traverses it (replacing unwanted nodes along the way),
and serializes it again. This leaves room for optimization, and if you do not
want to allow any HTML at all, an even faster approach is possible.

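The traverse-and-serialize steps can be sketched on a toy tree. This is a simplified illustration, not Ammonia's implementation: the `Node` enum, `sanitize`, `serialize`, and `escape_text` are hypothetical names, the parsing step is skipped (the tree is assumed to be built already), and attributes are ignored. The toy unwraps every disallowed element and keeps its children, whereas a real sanitizer also has to decide which elements, such as `<script>`, should lose their content entirely. If no HTML is allowed at all, a function like `escape_text` applied to the whole input would suffice, with no tree needed.

```rust
use std::collections::HashSet;

/// A toy document tree, standing in for the DOM a real HTML parser builds.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    Element { tag: String, children: Vec<Node> },
}

/// Walks the tree. Elements whose tag is not in `allowed` are replaced by
/// their (already sanitized) children, so their text content survives.
fn sanitize(nodes: Vec<Node>, allowed: &HashSet<&str>) -> Vec<Node> {
    let mut out = Vec::new();
    for node in nodes {
        match node {
            Node::Text(t) => out.push(Node::Text(t)),
            Node::Element { tag, children } => {
                let kept = sanitize(children, allowed);
                if allowed.contains(tag.as_str()) {
                    out.push(Node::Element { tag, children: kept });
                } else {
                    out.extend(kept);
                }
            }
        }
    }
    out
}

/// Escapes the characters that are significant in HTML text content.
fn escape_text(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for c in text.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            _ => out.push(c),
        }
    }
    out
}

/// Serializes the tree back into an HTML string, escaping text nodes.
fn serialize(nodes: &[Node]) -> String {
    let mut out = String::new();
    for node in nodes {
        match node {
            Node::Text(t) => out.push_str(&escape_text(t)),
            Node::Element { tag, children } => {
                out.push_str(&format!("<{}>", tag));
                out.push_str(&serialize(children));
                out.push_str(&format!("</{}>", tag));
            }
        }
    }
    out
}
```
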
For comparison, sanitizing an HTML string using [bleach]-2.0.0 with
html5lib-0.999999999 takes about fifteen times longer than using Ammonia 1.0
(roughly seventeen times in the sample run shown here):

    $ cd benchmarks
    $ cargo run --release
         Running `target/release/ammonia_bench`
    87539 nanoseconds to clean up the intro to the Ammonia docs.
    $ python bleach_bench.py
    (1498800.015449524, 'nanoseconds to clean up the intro to the Ammonia docs.')


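If you want to take a comparable measurement in your own project, a minimal timing sketch using only the standard library might look like the following. It is not the code in the `benchmarks` directory: `mean_nanos` and `slowdown` are hypothetical helpers, and the closure you pass in would wrap whatever sanitizer call you want to measure.

```rust
use std::time::Instant;

/// Runs `work` `iterations` times and returns the mean wall-clock time
/// per call, in nanoseconds.
fn mean_nanos<F: FnMut()>(iterations: u32, mut work: F) -> f64 {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed().as_nanos() as f64 / f64::from(iterations)
}

/// Returns how many times slower `slow_ns` is than `fast_ns`.
fn slowdown(slow_ns: f64, fast_ns: f64) -> f64 {
    slow_ns / fast_ns
}
```
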
License
-------

Licensed under either of these:

* Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or
  http://www.apache.org/licenses/LICENSE-2.0)
* MIT license ([LICENSE-MIT](LICENSE-MIT) or
  http://opensource.org/licenses/MIT)


Thanks
------

Thanks to the other sanitizer libraries, particularly [Bleach] for Python and [sanitize-html] for Node,
which we blatantly copied most of our API from.

Thanks to ChALkeR, whose [Improper Markup Sanitization] document helped us find high-level semantic holes in Ammonia,
and to [ssokolow](https://github.com/ssokolow), whose review and experience were also very helpful.

And finally, thanks to [the contributors].


[sanitize-html]: https://www.npmjs.com/package/sanitize-html
[Bleach]: https://bleach.readthedocs.io/
[Improper Markup Sanitization]: https://github.com/ChALkeR/notes/blob/master/Improper-markup-sanitization.md
[the contributors]: https://github.com/notriddle/ammonia/graphs/contributors