src/vendor/pulldown-cmark-0.0.14/README.md

# pulldown-cmark

[Documentation](https://docs.rs/pulldown-cmark/)

This library is a pull parser for [CommonMark](http://commonmark.org/), written
in [Rust](http://www.rust-lang.org/). It comes with a simple command-line tool,
useful for rendering to HTML, and is also designed to be easy to use as
a library.

It is designed to be:

* Fast; a bare minimum of allocation and copying
* Safe; written in pure Rust with no unsafe blocks
* Versatile; in particular source-maps are supported
* Correct; the goal is 100% compliance with the [CommonMark spec](http://spec.commonmark.org/)

## Why a pull parser?

There are many parsers for Markdown and its variants, but to my knowledge none
use pull parsing. Pull parsing has become popular for XML, especially for
memory-conscious applications, because it uses dramatically less memory than
constructing a document tree, but is much easier to use than push parsers. Push
parsers are notoriously difficult to use, and also often error-prone because of
the need for the user to delicately juggle state in a series of callbacks.
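
To make the contrast concrete, here is a minimal sketch that collects heading text in both styles. The `Event` enum, the `Handler` trait, and the tag names are hypothetical stand-ins invented for this illustration; they are not this crate's types. In the push version the "am I inside a heading?" state has to live in a struct that survives between callbacks, while in the pull version it is simply the position in the control flow.

```rust
// Hypothetical stand-in for a parser's event type (not this crate's `Event`).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(&'static str),
    End(&'static str),
    Text(&'static str),
}

// Push style: the parser owns the loop and calls back into a handler.
trait Handler {
    fn start(&mut self, tag: &str);
    fn end(&mut self, tag: &str);
    fn text(&mut self, text: &str);
}

// State must be kept in fields so it survives between callbacks.
struct HeadingCollector {
    in_heading: bool,
    headings: Vec<String>,
}

impl Handler for HeadingCollector {
    fn start(&mut self, tag: &str) {
        if tag == "h1" {
            self.in_heading = true;
            self.headings.push(String::new());
        }
    }
    fn end(&mut self, tag: &str) {
        if tag == "h1" {
            self.in_heading = false;
        }
    }
    fn text(&mut self, text: &str) {
        if self.in_heading {
            if let Some(heading) = self.headings.last_mut() {
                heading.push_str(text);
            }
        }
    }
}

// A toy "push parser" that drives the handler from a fixed event list.
fn push_parse(events: &[Event], handler: &mut dyn Handler) {
    for event in events {
        match event {
            Event::Start(tag) => handler.start(tag),
            Event::End(tag) => handler.end(tag),
            Event::Text(text) => handler.text(text),
        }
    }
}

fn headings_push(events: &[Event]) -> Vec<String> {
    let mut collector = HeadingCollector { in_heading: false, headings: Vec::new() };
    push_parse(events, &mut collector);
    collector.headings
}

// Pull style: the caller owns the loop, so a nested loop replaces the flag.
fn headings_pull(mut events: impl Iterator<Item = Event>) -> Vec<String> {
    let mut headings = Vec::new();
    while let Some(event) = events.next() {
        if event == Event::Start("h1") {
            let mut title = String::new();
            // Keep pulling until the heading closes.
            for inner in events.by_ref() {
                match inner {
                    Event::Text(text) => title.push_str(text),
                    Event::End("h1") => break,
                    _ => {}
                }
            }
            headings.push(title);
        }
    }
    headings
}

fn sample() -> Vec<Event> {
    vec![
        Event::Start("h1"), Event::Text("Intro"), Event::End("h1"),
        Event::Start("p"), Event::Text("body"), Event::End("p"),
        Event::Start("h1"), Event::Text("Next "), Event::Text("steps"), Event::End("h1"),
    ]
}

fn main() {
    println!("{:?}", headings_push(&sample()));
    println!("{:?}", headings_pull(sample().into_iter()));
}
```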

In a clean design, the parsing and rendering stages are neatly separated, but
this is often sacrificed in the name of performance and expedience. Many Markdown
implementations mix parsing and rendering together, and even designs that try
to separate them (such as the popular [hoedown](https://github.com/hoedown/hoedown))
assume that the rendering process can be fully represented as a
serialized string.

Pull parsing is in some sense the most versatile architecture. It's possible to
drive a push interface, also with minimal memory, and quite straightforward to
construct an AST. Another advantage is that source-map information (the mapping
between parsed blocks and offsets within the source text) is readily available;
you basically just call `get_offset()` as you consume events.
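
As a rough illustration of how an AST can be built on top of a pull interface, the sketch below folds a stream of start, end, and text events into a tree using a stack. The `Event` and `Node` types are hypothetical and defined only for this example (they are not this crate's API), and the code assumes the events are well nested; any element still open at the end of input is closed implicitly.

```rust
// Hypothetical event type for illustration (not this crate's `Event`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(&'static str),
    End(&'static str),
    Text(&'static str),
}

// Hypothetical tree node type for illustration.
#[derive(Debug, PartialEq)]
enum Node {
    Element { tag: &'static str, children: Vec<Node> },
    Text(&'static str),
}

type Frame = (&'static str, Vec<Node>);

// Pop the innermost open element and attach it to its parent.
// The bottom frame (the document root) is never popped here.
fn close_top(stack: &mut Vec<Frame>) {
    if stack.len() < 2 {
        return;
    }
    if let Some((tag, children)) = stack.pop() {
        if let Some((_, parent)) = stack.last_mut() {
            parent.push(Node::Element { tag, children });
        }
    }
}

fn build_tree(events: impl Iterator<Item = Event>) -> Vec<Node> {
    // Each frame holds an open tag and the children collected so far.
    // The first frame, with an empty tag, collects the top-level nodes.
    let mut stack: Vec<Frame> = vec![("", Vec::new())];
    for event in events {
        match event {
            Event::Start(tag) => stack.push((tag, Vec::new())),
            Event::Text(text) => {
                if let Some((_, children)) = stack.last_mut() {
                    children.push(Node::Text(text));
                }
            }
            Event::End(_) => close_top(&mut stack),
        }
    }
    // Close anything left open at end of input.
    while stack.len() > 1 {
        close_top(&mut stack);
    }
    stack.pop().map(|(_, children)| children).unwrap_or_default()
}

fn sample_events() -> Vec<Event> {
    vec![
        Event::Start("p"),
        Event::Text("a "),
        Event::Start("em"),
        Event::Text("b"),
        Event::End("em"),
        Event::End("p"),
    ]
}

fn main() {
    println!("{:#?}", build_tree(sample_events().into_iter()));
}
```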

While manipulating ASTs is the most flexible way to transform documents,
operating on iterators is surprisingly easy, and quite efficient. Here, for
example, is the code to transform soft line breaks into hard breaks:

```rust
// `parser` is any iterator of events; every other event passes through unchanged.
let parser = parser.map(|event| match event {
    Event::SoftBreak => Event::HardBreak,
    _ => event,
});
```

Or expanding an abbreviation in text:

```rust
// Only text events are rewritten; every other event passes through unchanged.
let parser = parser.map(|event| match event {
    Event::Str(text) => Event::Str(text.replace("abbr", "abbreviation")),
    _ => event,
});
```

Another simple example is code to determine the max nesting level:

```rust
let mut max_nesting = 0;
let mut level = 0;
for event in parser {
    match event {
        // Each start tag opens one more level of nesting.
        Event::Start(_) => {
            level += 1;
            max_nesting = std::cmp::max(max_nesting, level);
        }
        // Each end tag closes the innermost open level.
        Event::End(_) => level -= 1,
        _ => (),
    }
}
```

## Using Rust idiomatically

A lot of the internal scanning code is written at a pretty low level (it
pretty much scans byte patterns for the bits of syntax), but the external
interface is designed to be idiomatic Rust.

Pull parsers are at heart an iterator of events (start and end tags, text,
and other bits and pieces). The parser data structure implements the
Rust `Iterator` trait directly, and `Event` is an enum. Thus, you can use the
full power and expressivity of Rust's iterator infrastructure, including
for loops and `map` (as in the examples above), collecting the events into
a vector (for recording, playback, and manipulation), and more.
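
The recording, playback, and manipulation idea can be sketched as follows. The `Event` enum and `sample_events` are hypothetical stand-ins for a real parser (not this crate's API); everything else is plain standard-library iterator and `Vec` usage. Once the events are collected, they can be replayed any number of times and edited in place.

```rust
// Hypothetical event type for illustration (not this crate's `Event`).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(&'static str),
    End(&'static str),
    Text(&'static str),
    SoftBreak,
}

// Stand-in for a parser: any iterator of events would do.
fn sample_events() -> impl Iterator<Item = Event> {
    vec![
        Event::Start("p"),
        Event::Text("one"),
        Event::SoftBreak,
        Event::Start("em"),
        Event::Text("two"),
        Event::End("em"),
        Event::End("p"),
    ]
    .into_iter()
}

// Playback: walk the recorded events without consuming them.
fn plain_text(events: &[Event]) -> String {
    events
        .iter()
        .filter_map(|event| match event {
            Event::Text(text) => Some(*text),
            Event::SoftBreak => Some(" "),
            _ => None,
        })
        .collect()
}

// Manipulation: drop the start and end events of one tag, in place.
fn strip_tag(events: &mut Vec<Event>, tag: &str) {
    events.retain(|event| match event {
        Event::Start(t) | Event::End(t) => *t != tag,
        _ => true,
    });
}

fn main() {
    // Recording: drain the iterator into a vector.
    let mut recorded: Vec<Event> = sample_events().collect();
    println!("{}", plain_text(&recorded));

    strip_tag(&mut recorded, "em");
    println!("{:?}", recorded);
}
```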

Further, the `Str` event (representing text) is a copy-on-write string (note:
this isn't quite true yet). The vast majority of text fragments are just
slices of the source document. For these, copy-on-write gives a convenient
representation that requires no allocation or copying, but allocated
strings are available when they're needed. Thus, when rendering text to
HTML, most text is copied just once, from the source document to the
HTML buffer.
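
The copy-on-write idea can be illustrated with the standard library's `std::borrow::Cow`. The `unescape` and `render` helpers below are hypothetical and deliberately simplified (they drop the backslash before any character, which is looser than the CommonMark escaping rules); they are not taken from this crate. The point is only that a fragment without escapes is returned as a borrowed slice of the source, and an allocation happens only when the text actually has to change.

```rust
use std::borrow::Cow;

// Hypothetical, simplified helper: remove backslash escapes from a fragment.
// Fragments without a backslash are returned as a borrowed slice (no allocation).
fn unescape(source: &str) -> Cow<'_, str> {
    if !source.contains('\\') {
        return Cow::Borrowed(source);
    }
    let mut out = String::with_capacity(source.len());
    let mut chars = source.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                // Keep the escaped character, drop the backslash.
                Some(next) => out.push(next),
                // A trailing backslash is kept literally.
                None => out.push('\\'),
            }
        } else {
            out.push(c);
        }
    }
    Cow::Owned(out)
}

// Borrowed fragments are copied exactly once, into the output buffer.
fn render(fragments: &[&str]) -> String {
    let mut buffer = String::new();
    for fragment in fragments {
        buffer.push_str(&unescape(fragment));
    }
    buffer
}

fn main() {
    for fragment in ["plain text", r"a \* b"] {
        match unescape(fragment) {
            Cow::Borrowed(s) => println!("borrowed: {}", s),
            Cow::Owned(s) => println!("owned: {}", s),
        }
    }
    println!("{}", render(&["x ", r"\*y"]));
}
```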

## Building only the pulldown-cmark library

By default, the binary is built as well. If you don't need it, build like this:

```bash
cargo build --no-default-features
```

Or put this in your `Cargo.toml` file:

```toml
pulldown-cmark = { version = "0.0.11", default-features = false }
```

## Authors

The main author is Raph Levien.

## Contributions

We gladly accept contributions via GitHub pull requests, as long as the author
has signed the Google Contributor License. Please see CONTRIBUTIONS.md for
more details.

### Disclaimer

This is not an official Google product (experimental or otherwise); it
is just code that happens to be owned by Google.