1 <?xml version="1.0" encoding="utf-8"?>
2 <fields>
3 <h1>Introduction</h1>
4
5 <p>
6 This document aims to comprehensively document all of the fields,
7 both standard and non-standard, supported by OpenFlow or Open
8 vSwitch, regardless of origin.
9 </p>
10
11 <h2>Fields</h2>
12
13 <p>
14 A <dfn>field</dfn> is a property of a packet. Most familiarly, <dfn>data
15 fields</dfn> are fields that can be extracted from a packet. Most data
16 fields are copied directly from protocol headers, e.g. at layer 2, the
17 Ethernet source and destination addresses, or the VLAN ID; at layer 3, the
18 IPv4 or IPv6 source and destination; and at layer 4, the TCP or UDP ports.
19 Other data fields are computed, e.g. <ref field="ip_frag"/> describes
20 whether a packet is a fragment but it is not copied directly from the IP
21 header.
22 </p>
23
24 <p>
25 Some data fields, called <dfn>root fields</dfn>, are always present as a
26 consequence of the basic networking technology in use. The Ethernet header
27 fields are root fields in current versions of Open vSwitch, though future
28 versions might support other roots. (Currently, to support LISP tunnels,
29 which do not encapsulate an Ethernet header, Open vSwitch synthesizes one.)
30 </p>
31
32 <!-- future directions: EXT-112 -->
33 <p>
34 Other data fields are not always present. A packet contains ARP fields,
35 for example, only when its Ethernet header indicates the Ethertype for ARP,
36 0x0806. In this documentation, we say that a field is
37 <dfn>applicable</dfn> when it is present in a packet, and
38 <dfn>inapplicable</dfn> when it is not. (These are not standard terms.)
39 We refer to the conditions that determine whether a field is applicable as
40 <dfn>prerequisites</dfn>. Some VLAN-related fields are a special case:
41 these fields are always applicable, but have a designated value or bit that
42 indicates whether a VLAN header is present, with the remaining values or
43 bits indicating the VLAN header's content (if it is present). <!-- XXX
44 also ethertype -->
45 </p>
46
47 <p>
48 An inapplicable field does not have a value, not even a nominal
49 ``value'' such as all-zero-bits. In many circumstances, OpenFlow
50 and Open vSwitch allow references only to applicable fields. For
51 example, one may match (see <cite>Matching</cite>, below) a given
52 field only if the match includes the field's prerequisite,
53 e.g. matching an ARP field is only allowed if one also matches on
54 Ethertype 0x0806.
55 </p>
56
57 <p>
58 Sometimes a packet may contain multiple instances of a header.
59 For example, a packet may contain multiple VLAN or MPLS headers,
60 and tunnels can cause any data field to recur. OpenFlow and Open
61 vSwitch do not address these cases uniformly. For VLAN and MPLS
62 headers, only the outermost header is accessible, so that inner
63 headers may be accessed only by ``popping'' (removing) the outer
64 header. (Open vSwitch supports only a single VLAN header in any
65 case.) For tunnels, e.g. GRE or VXLAN, the outer header and inner
66 headers are treated as different data fields.
67 </p>
68
69 <p>
70 Many network protocols are built in layers as a stack of concatenated
71 headers. Each header typically contains a ``next type'' field that
72 indicates the type of the protocol header that follows, e.g. Ethernet
73 contains an Ethertype and IPv4 contains an IP protocol type. The
74 exceptional cases, where protocols are layered but an outer layer does not
75 indicate the protocol type for the inner layer, or gives only an ambiguous
76 indication, are troublesome. An MPLS header, for example, only indicates
77 whether another MPLS header or some other protocol follows, and in the
78 latter case the inner protocol must be known from the context. In these
79 exceptional cases, OpenFlow and Open vSwitch cannot provide insight into
80 the inner protocol data fields without additional context, and thus they
81 treat all later data fields as inapplicable until an OpenFlow action
82 explicitly specifies what protocol follows. In the case of MPLS, the
83 OpenFlow ``pop MPLS'' action that removes the last MPLS header from a
84 packet provides this context, as the Ethertype of the payload. See
85 <cite>Layer 2.5: MPLS</cite> for more information.
86 </p>
87
88 <p>
89 OpenFlow and Open vSwitch support some fields other than data
90 fields. <dfn>Metadata fields</dfn> relate to the origin or
91 treatment of a packet, but they are not extracted from the packet
92 data itself. One example is the physical port on which a packet
93 arrived at the switch. <dfn>Register fields</dfn> act like
94 variables: they give an OpenFlow switch space for temporary
95 storage while processing a packet. Existing metadata and register
96 fields have no prerequisites.
97 </p>
98
99 <p>
100 A field's value consists of an integral number of bytes. For data
101 fields, sometimes those bytes are taken directly from the packet.
102 Other data fields are copied from a packet with padding (usually
103 with zeros and in the most significant positions). The remaining
104 data fields are transformed in other ways as they are copied from
105 the packets, to make them more useful for matching.
106 </p>
107
108 <h2>Matching</h2>
109
110 <p>
111 The most important use of fields in OpenFlow is
112 <dfn>matching</dfn>, to determine whether particular field values
113 agree with a set of constraints called a <dfn>match</dfn>. A
114 match consists of zero or more constraints on individual fields,
115 all of which must be met to satisfy the match. (A match that
116 contains no constraints is always satisfied.) OpenFlow and Open
117 vSwitch support a number of forms of matching on individual
118 fields:
119 </p>
120
121 <dl>
122 <dt><dfn>Exact match</dfn>, e.g. <code>nw_src=10.1.2.3</code></dt>
123 <dd>
124 <p>
125 Only a particular value of the field is matched; for example, only one
126 particular source IP address. Exact matches are written as
127 <code><var>field</var>=<var>value</var></code>. The forms accepted for
128 <var>value</var> depend on the field.
129 </p>
130
131 <p>
132 All fields support exact matches.
133 </p>
134 </dd>
135
136 <dt>
137 <dfn>Bitwise match</dfn>, e.g. <code>nw_src=10.1.0.0/255.255.0.0</code>
138 </dt>
139 <dd>
140 <p>
141 Specific bits in the field must have specified values; for example,
142 only source IP addresses in a particular subnet. Bitwise matches are
143 written as
144 <code><var>field</var>=<var>value</var>/<var>mask</var></code>, where
145 <var>value</var> and <var>mask</var> take one of the forms accepted for
146 an exact match on <var>field</var>. Some fields accept other forms for
147 bitwise matches; for example, <code>nw_src=10.1.0.0/255.255.0.0</code>
148 may also be written <code>nw_src=10.1.0.0/16</code>.
149 </p>
150
151 <p>
152 Most OpenFlow switches do not allow bitwise matching on every
153 field (and before OpenFlow 1.2, the protocol did not even provide for
154 the possibility for most fields). Even switches that do allow bitwise
155 matching on a given field may restrict the masks that are allowed, e.g.
156 by allowing matches only on contiguous sets of bits starting from the
157 most significant bit, that is, ``CIDR'' masks [RFC 4632]. Open vSwitch
158 does not allow bitwise matching on every field, but it allows
159 arbitrary bitwise masks on any field that does support bitwise
160 matching. (Older versions had some restrictions, as documented in the
161 descriptions of individual fields.)
162 </p>
163 </dd>
164
165 <dt><dfn>Wildcard</dfn>, e.g. ``any <code>nw_src</code>''</dt>
166 <dd>
167 <p>
168 The value of the field is not constrained. Wildcarded fields may be
169 written as <code><var>field</var>=*</code>, although it is unusual to
170 mention them at all. (When specifying a wildcard explicitly in a
171 command invocation, be sure to use quoting to protect against shell
172 expansion.)
173 </p>
174
175 <p>
176 There is a tiny difference between wildcarding a field and not
177 specifying any match on a field: wildcarding a field requires
178 satisfying the field's prerequisites.
179 </p>
180 </dd>
181 </dl>
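<p>
The three forms above can all be read as one operation on a value and a
mask. The following Python sketch illustrates that reading for a 32-bit
IPv4 field. The helper names (<code>ip</code>, <code>cidr_mask</code>,
<code>bitwise_match</code>) are hypothetical and are not part of Open
vSwitch; this is an illustration under those assumptions, not the
classifier that Open vSwitch implements.
</p>

```python
import ipaddress


def ip(text):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


def cidr_mask(prefix_len, width=32):
    """Mask with the top prefix_len bits set, e.g. 16 gives 255.255.0.0."""
    return ((1 << prefix_len) - 1) << (width - prefix_len)


def bitwise_match(field_value, value, mask):
    """True when field_value agrees with value on every 1-bit of mask."""
    return (field_value & mask) == (value & mask)


EXACT = 0xFFFFFFFF     # exact match: every bit is significant
WILDCARD = 0x00000000  # wildcard: no bit is significant

# nw_src=10.1.2.3
exact_ok = bitwise_match(ip("10.1.2.3"), ip("10.1.2.3"), EXACT)
# nw_src=10.1.0.0/255.255.0.0, equivalently nw_src=10.1.0.0/16
subnet_ok = bitwise_match(ip("10.1.77.5"), ip("10.1.0.0"), cidr_mask(16))
subnet_miss = bitwise_match(ip("10.2.0.1"), ip("10.1.0.0"), cidr_mask(16))
# nw_src=*
any_ok = bitwise_match(ip("192.0.2.9"), 0, WILDCARD)
```

<p>
Note that this sketch ignores prerequisites, so it does not capture the
difference between wildcarding a field and omitting it.
</p>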
182
183 <p>
184 Some types of matches on individual fields cannot be expressed directly
185 with OpenFlow and Open vSwitch. These can be expressed indirectly:
186 </p>
187
188 <dl>
189 <dt><dfn>Set match</dfn>, e.g. ``<code>tcp_dst</code> ∈ {80, 443,
190 8080}''</dt>
191 <dd>
192 <p>
193 The value of a field is one of a specified set of values; for
194 example, the TCP destination port is 80, 443, or 8080.
195 </p>
196
197 <p>
198 For matches used in flows (see <cite>Flows</cite>, below), multiple
199 flows can simulate set matches.
200 </p>
201 </dd>
202
203 <dt><dfn>Range match</dfn>, e.g. ``1000 ≤ <code>tcp_dst</code> ≤
204 1999''</dt>
205 <dd>
206 <p>
207 The value of the field must lie within a numerical range, for
208 example, TCP destination ports between 1000 and 1999.
209 </p>
210
211 <p>
212 Range matches can be expressed as a collection of bitwise matches. For
213 example, suppose that the goal is to match TCP source ports 1000 to
214 1999, inclusive. The binary representations of 1000 and 1999 are:
215 </p>
216
217 <pre fixed="yes">
218 01111101000
219 11111001111
220 </pre>
221
222 <p>
223 The following series of bitwise matches will match 1000 and
224 1999 and all the values in between:
225 </p>
226
227 <pre fixed="yes">
228 01111101xxx
229 0111111xxxx
230 10xxxxxxxxx
231 110xxxxxxxx
232 1110xxxxxxx
233 11110xxxxxx
234 1111100xxxx
235 </pre>
236
237 <p>
238 which can be written as the following matches:
239 </p>
240
241 <pre>
242 tcp,tp_src=0x03e8/0xfff8
243 tcp,tp_src=0x03f0/0xfff0
244 tcp,tp_src=0x0400/0xfe00
245 tcp,tp_src=0x0600/0xff00
246 tcp,tp_src=0x0700/0xff80
247 tcp,tp_src=0x0780/0xffc0
248 tcp,tp_src=0x07c0/0xfff0
249 </pre>
250 </dd>
251
252 <dt><dfn>Inequality match</dfn>, e.g. ``<code>tcp_dst</code> ≠ 80''</dt>
253 <dd>
254 <p>
255 The value of the field differs from a specified value, for
256 example, all TCP destination ports except 80.
257 </p>
258
259 <p>
260 An inequality match on an <var>n</var>-bit field can be expressed as a
261 disjunction of <var>n</var> 1-bit matches. For example, the inequality
262 match ``<code>vlan_pcp</code> ≠ 5'' can be expressed as
263 ``<code>vlan_pcp</code> = 0/4 or <code>vlan_pcp</code> = 2/2 or
264 <code>vlan_pcp</code> = 0/1.'' For matches used in flows (see
265 <cite>Flows</cite>, below), sometimes one can more compactly express
266 inequality as a higher-priority flow that matches the exceptional case
267 paired with a lower-priority flow that matches the general case.
268 </p>
269
270 <p>
271 Alternatively, an inequality match may be converted to a pair of range
272 matches, e.g. <code>tcp_src ≠ 80</code> may be expressed as ``0 ≤
273 <code>tcp_src</code> &lt; 80 or 80 &lt; <code>tcp_src</code> ≤ 65535'',
274 and then each range match may in turn be converted to a bitwise match.
275 </p>
276 </dd>
277
278 <dt><dfn>Conjunctive match</dfn>, e.g. ``<code>tcp_src</code> ∈ {80, 443, 8080} and <code>tcp_dst</code> ∈ {80, 443, 8080}''</dt>
279 <dd>
280 As an OpenFlow extension, Open vSwitch supports matching on conditions on
281 conjunctions of the previously mentioned forms of matching. See the
282 documentation for <ref field="conj_id"/> for more information.
283 </dd>
284 </dl>
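<p>
The range and inequality conversions described above can be sketched in
Python as follows. Both functions return (value, mask) pairs in the sense
of a bitwise match, where a field matches when its masked bits equal the
value. The function names are hypothetical and the code is only an
illustration of the technique, not something Open vSwitch provides.
</p>

```python
def range_to_matches(lo, hi, width=16):
    """Cover the inclusive range [lo, hi] with aligned power-of-two blocks.

    Returns a list of (value, mask) pairs for a field of the given width.
    """
    full = (1 << width) - 1
    matches = []
    while lo <= hi:
        # Largest block that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... shrunk until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        matches.append((lo, full & ~(size - 1)))
        lo += size
    return matches


def not_equal_to_matches(excluded, width):
    """One single-bit match per bit position, most significant bit first.

    Each pair matches values that differ from `excluded` in that one bit.
    """
    return [((~excluded) & (1 << bit), 1 << bit)
            for bit in reversed(range(width))]


tcp_range = range_to_matches(1000, 1999)
pcp_not_5 = not_equal_to_matches(5, 3)

for value, mask in tcp_range:
    print("tcp,tp_src=0x%04x/0x%04x" % (value, mask))
```

<p>
For the range 1000 to 1999 this yields the same seven matches listed in
the range match example, and for ``<code>vlan_pcp</code> ≠ 5'' it yields
0/4, 2/2, and 0/1.
</p>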
285
286 <p>
287 All of these supported forms of matching are special cases of bitwise
288 matching. In some cases this influences the design of field values. <ref
289 field="ip_frag"/> is the most prominent example: it is designed to make all
290 of the practically useful checks for IP fragmentation possible as a single
291 bitwise match.
292 </p>
293
294 <h3>Shorthands</h3>
295
296 <p>
297 Some matches are very commonly used, so Open vSwitch accepts shorthand
298 notations. In some cases, Open vSwitch also uses shorthand notations when
299 it displays matches. The following shorthands are defined, with their long
300 forms shown on the right side:
301 </p>
302
303 <dl>
304 <dt><code>ip</code></dt> <dd><code>eth_type=0x0800</code></dd>
305 <dt><code>ipv6</code></dt> <dd><code>eth_type=0x86dd</code></dd>
306 <dt><code>icmp</code></dt> <dd><code>eth_type=0x0800,ip_proto=1</code></dd>
307 <dt><code>icmp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=58</code></dd>
308 <dt><code>tcp</code></dt> <dd><code>eth_type=0x0800,ip_proto=6</code></dd>
309 <dt><code>tcp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=6</code></dd>
310 <dt><code>udp</code></dt> <dd><code>eth_type=0x0800,ip_proto=17</code></dd>
311 <dt><code>udp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=17</code></dd>
312 <dt><code>sctp</code></dt> <dd><code>eth_type=0x0800,ip_proto=132</code></dd>
313 <dt><code>sctp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=132</code></dd>
314 <dt><code>arp</code></dt> <dd><code>eth_type=0x0806</code></dd>
315 <dt><code>rarp</code></dt> <dd><code>eth_type=0x8035</code></dd>
316 <dt><code>mpls</code></dt> <dd><code>eth_type=0x8847</code></dd>
317 <dt><code>mplsm</code></dt> <dd><code>eth_type=0x8848</code></dd>
318 </dl>
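<p>
The shorthand table above amounts to a simple lookup. The following
Python sketch expands shorthands in a comma-separated match string. It is
a toy parser written for illustration only: the function name
<code>expand</code> is hypothetical, it accepts only plain integer values,
and it does not reproduce the parser in <code>ovs-ofctl</code>.
</p>

```python
# Shorthand keyword -> the field constraints it stands for.
SHORTHANDS = {
    "ip":    {"eth_type": 0x0800},
    "ipv6":  {"eth_type": 0x86DD},
    "icmp":  {"eth_type": 0x0800, "ip_proto": 1},
    "icmp6": {"eth_type": 0x86DD, "ip_proto": 58},
    "tcp":   {"eth_type": 0x0800, "ip_proto": 6},
    "tcp6":  {"eth_type": 0x86DD, "ip_proto": 6},
    "udp":   {"eth_type": 0x0800, "ip_proto": 17},
    "udp6":  {"eth_type": 0x86DD, "ip_proto": 17},
    "sctp":  {"eth_type": 0x0800, "ip_proto": 132},
    "sctp6": {"eth_type": 0x86DD, "ip_proto": 132},
    "arp":   {"eth_type": 0x0806},
    "rarp":  {"eth_type": 0x8035},
    "mpls":  {"eth_type": 0x8847},
    "mplsm": {"eth_type": 0x8848},
}


def expand(match):
    """Expand shorthand keywords in a comma-separated match string.

    Non-shorthand tokens must have the form field=integer; masks and
    other value syntaxes are deliberately not handled in this sketch.
    """
    fields = {}
    for token in match.split(","):
        if token in SHORTHANDS:
            fields.update(SHORTHANDS[token])
        else:
            name, _, value = token.partition("=")
            fields[name] = int(value, 0)
    return fields


expanded = expand("tcp6,tp_dst=443")
```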
319
320 <h2>Evolution of OpenFlow Fields</h2>
321
322 <p>
323 The discussion so far applies to all OpenFlow and Open vSwitch
324 versions. This section starts to draw in specific information by
325 explaining, in broad terms, the treatment of fields and matches in
326 each OpenFlow version.
327 </p>
328
329 <h3>OpenFlow 1.0</h3>
330
331 <p>
332 OpenFlow 1.0 defined the OpenFlow protocol format of a match as a
333 fixed-length data structure that could match on the following
334 fields:
335 </p>
336
337 <ul>
338 <li>Ingress port.</li>
339 <li>Ethernet source and destination MAC.</li>
340 <li>Ethertype (with a special value to match frames that lack an
341 Ethertype).</li>
342 <li>VLAN ID and priority.</li>
343 <li>IPv4 source, destination, protocol, and DSCP.</li>
344 <li>TCP source and destination port.</li>
345 <li>UDP source and destination port.</li>
346 <li>ICMPv4 type and code.</li>
347 <li>ARP IPv4 addresses (SPA and TPA) and opcode.</li>
348 </ul>
349
350 <p>
351 Each supported field corresponded to some member of the data
352 structure. Some members represented multiple fields, in the case
353 of the TCP, UDP, ICMPv4, and ARP fields whose presence is mutually
354 exclusive. This also meant that some members were poor fits for
355 their fields: only the low 8 bits of the 16-bit ARP opcode could
356 be represented, and the ICMPv4 type and code were padded with 8 bits
357 of zeros to fit in the 16-bit members primarily meant for TCP and
358 UDP ports. An additional bitmap member indicated, for each
359 member, whether its field should be an ``exact'' or ``wildcarded''
360 match (see <cite>Matching</cite>), with additional support for
361 CIDR prefix matching on the IPv4 source and destination fields.
362 </p>
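<p>
As a rough illustration of this fixed-length layout, the following Python
sketch describes a 40-byte structure with one member per field or group
of mutually exclusive fields. The member names and their order are
recalled from the <code>ofp_match</code> structure in the OpenFlow 1.0
specification and should be checked against it; they do not appear in this
document.
</p>

```python
import struct

# Assumed layout of the OpenFlow 1.0 match, in network byte order.
OFP10_MATCH = struct.Struct(
    "!"
    "I"   # wildcards: bitmap of exact vs. wildcarded members
    "H"   # in_port: ingress port (16 bits in OpenFlow 1.0)
    "6s"  # dl_src: Ethernet source
    "6s"  # dl_dst: Ethernet destination
    "H"   # dl_vlan: VLAN ID
    "B"   # dl_vlan_pcp: VLAN priority
    "x"   # padding
    "H"   # dl_type: Ethertype
    "B"   # nw_tos: IPv4 DSCP
    "B"   # nw_proto: IP protocol, or low 8 bits of the ARP opcode
    "2x"  # padding
    "I"   # nw_src: IPv4 source, or ARP SPA
    "I"   # nw_dst: IPv4 destination, or ARP TPA
    "H"   # tp_src: TCP/UDP source port, or ICMPv4 type
    "H"   # tp_dst: TCP/UDP destination port, or ICMPv4 code
)

# An ICMPv4 echo request (type 8, code 0): the 8-bit type and code are
# zero-extended into the 16-bit members meant for TCP and UDP ports.
icmp_echo = OFP10_MATCH.pack(
    0, 1, bytes(6), bytes(6), 0xFFFF, 0,
    0x0800, 0, 1, 0x0A010203, 0x0A010204, 8, 0,
)
```

<p>
The shared <code>tp_src</code> and <code>tp_dst</code> members show why
some members were poor fits for the fields they carried.
</p>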
363
364 <p>
365 Simplicity was recognized early on as the main virtue of this
366 approach. Obviously, any fixed-length data structure cannot
367 support matching new protocols that do not fit. There was no
368 room, for example, for matching IPv6 fields, which was not a
369 priority at the time. Lack of room to support matching the
370 Ethernet addresses inside ARP packets actually caused more of a
371 design problem later, leading to an Open vSwitch extension action
372 specialized for dropping ``spoofed'' ARP packets in which the
373 frame and ARP Ethernet source addresses differed. (This extension
374 was never standardized. Open vSwitch dropped support for it a few
375 releases after it added support for full ARP matching.)
376 </p>
377
378 <p>
379 The design of the OpenFlow fixed-length matches also illustrates
380 compromises, in both directions, between the strengths and
381 weaknesses of software and hardware that have always influenced
382 the design of OpenFlow. Support for matching ARP fields that do
383 fit in the data structure was only added late in the design
384 process (and remained optional in OpenFlow 1.0), for example,
385 because common switch ASICs did not support matching these fields.
386 </p>
387
388 <p>
389 The compromises in favor of software occurred for more complicated
390 reasons. The OpenFlow designers did not know how to implement
391 matching in software that was fast, dynamic, and general. (A way
392 was later found [Srinivasan].) Thus, the designers sought to
393 support dynamic, general matching that would be fast in realistic
394 special cases, in particular when all of the matches were
395 <dfn>microflows</dfn>, that is, matches that specify every field
396 present in a packet, because such matches can be implemented as a
397 single hash table lookup. Contemporary research supported the
398 feasibility of this approach: the number of microflows in a campus
399 network had been measured to peak at about 10,000 [Casado, section
400 3.2]. (Calculations show that this can only be true in a lightly
401 loaded network [Pepelnjak].)
402 </p>
403
404 <p>
405 As a result, OpenFlow 1.0 required switches to treat microflow
406 matches as the highest possible priority. This let software
407 switches perform the microflow hash table lookup first. Only on
408 failure to match a microflow did the switch need to fall back to
409 checking the more general and presumed slower matches. Also, the
410 OpenFlow 1.0 flow match was minimally flexible, with no support
411 for general bitwise matching, partly on the basis that this seemed
412 more likely amenable to relatively efficient software
413 implementation. (CIDR masking for IPv4 addresses was added
414 relatively late in the OpenFlow 1.0 design process.)
415 </p>
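<p>
The lookup order that this rule permits can be sketched as follows. This
Python example is an assumption-laden illustration, not the reference
switch or Open vSwitch code: the class name <code>TwoStageTable</code>, the
representation of a packet as a dictionary of field values, and the
linear scan over wildcarded flows are all hypothetical simplifications.
</p>

```python
class TwoStageTable:
    """Microflows in a hash table, checked before any wildcarded flow."""

    def __init__(self):
        self.microflows = {}  # exact key -> actions
        self.wildcarded = []  # (priority, match, actions), highest first

    @staticmethod
    def _key(fields):
        # A microflow specifies every field present in the packet, so the
        # sorted field/value pairs identify it exactly.
        return tuple(sorted(fields.items()))

    def add_microflow(self, fields, actions):
        self.microflows[self._key(fields)] = actions

    def add_wildcarded(self, priority, match, actions):
        self.wildcarded.append((priority, match, actions))
        self.wildcarded.sort(key=lambda entry: -entry[0])

    def lookup(self, packet):
        # Stage 1: a single hash table lookup for an exact microflow.
        actions = self.microflows.get(self._key(packet))
        if actions is not None:
            return actions
        # Stage 2: fall back to the slower scan of wildcarded flows.
        for _, match, flow_actions in self.wildcarded:
            if all(packet.get(f) == v for f, v in match.items()):
                return flow_actions
        return None


table = TwoStageTable()
packet = {"in_port": 1, "eth_type": 0x0800, "ip_proto": 6, "tp_dst": 80}
table.add_wildcarded(100, {"eth_type": 0x0800}, "normal")
table.add_microflow(packet, "output:2")
```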
416
417 <p>
418 Microflow matching was later discovered to aid some hardware
419 implementations. The TCAM chips used for matching in hardware do
420 not support priority in the same way as OpenFlow but instead tie
421 priority to ordering [Pagiamtzis]. Thus, adding a new match with
422 a priority between the priorities of existing matches can require
423 reordering an arbitrary number of TCAM entries. On the other
424 hand, when microflows are highest priority, they can be managed as
425 a set-aside portion of the TCAM entries.
426 </p>
427
428 <p>
429 The emphasis on matching microflows also led designers to
430 carefully consider the bandwidth requirements between switch and
431 controller: to maximize the number of microflow setups per second,
432 one must minimize the size of each flow's description. This
433 favored the fixed-length format in use, because it expressed
434 common TCP and UDP microflows in fewer bytes than more flexible
435 ``type-length-value'' (TLV) formats. (Early versions of OpenFlow
436 also avoided TLVs in general to head off protocol fragmentation.)
437 </p>
438
439 <h4>Inapplicable Fields</h4>
440
441 <p>
442 OpenFlow 1.0 does not clearly specify how to treat inapplicable
443 fields. The members for inapplicable fields are always present in
444 the match data structure, as are the bits that indicate whether
445 the fields are matched, and the ``correct'' member and bit values
446 for inapplicable fields are unclear. OpenFlow 1.0 implementations
447 changed their behavior over time as priorities shifted. The early
448 OpenFlow reference implementation, motivated to make every flow a
449 microflow to enable hashing, treated inapplicable fields as exact
450 matches on a value of 0. Initially, this behavior was implemented
451 in the reference controller only.
452 </p>
453
454 <p>
455 Later, the reference switch was also changed to actually force any
456 wildcarded inapplicable fields into exact matches on 0. The
457 latter behavior sometimes caused problems, because the modified
458 flow was the one reported back to the controller later when it
459 queried the flow table, and the modifications sometimes meant that
460 the controller could not properly recognize the flow that it had
461 added. In retrospect, perhaps this problem should have alerted
462 the designers to a design error, but the ability to use a single
463 hash table was held to be more important than almost every other
464 consideration at the time.
465 </p>
466
467 <p>
468 When more flexible match formats were introduced much later, they
469 disallowed any mention of inapplicable fields as part of a match.
470 This raised the question of how to translate between this new
471 format and the OpenFlow 1.0 fixed format. It seemed somewhat
472 inconsistent and backward to treat fields as exact-match in one
473 format and forbid matching them in the other, so instead the
474 treatment of inapplicable fields in the fixed-length format was
475 changed from exact match on 0 to wildcarding. (A better
476 classifier had by now eliminated software performance problems
477 with wildcards.)
478 </p>
479
480 <p>
481 The OpenFlow 1.0.1 errata (released only in 2012) added some
482 additional explanation [OpenFlow 1.0.1, section 3.4], but it did
483 not mandate specific behavior because of variation among
484 implementations.
485 </p>
486
487 <h3>OpenFlow 1.1</h3>
488
489 <p>
490 The OpenFlow 1.1 protocol match format was designed as a type/length/value
491 (TLV) format to allow for future flexibility. The specification
492 standardized only a single type <code>OFPMT_STANDARD</code> (0) with a
493 fixed-size payload, described here. The additional fields and bitwise
494 masks in OpenFlow 1.1 cause this match structure to be over twice as large
495 as in OpenFlow 1.0, 88 bytes versus 40.
496 </p>
497
498 <p>
499 OpenFlow 1.1 added support for the following fields:
500 </p>
501
502 <ul>
503 <li>SCTP source and destination port.</li>
504 <li>MPLS label and traffic class (TC) fields.</li>
505 <li>One 64-bit register (named ``metadata'').</li>
506 </ul>
507
508 <p>
509 OpenFlow 1.1 increased the width of the ingress port number field (and all
510 other port numbers in the protocol) from 16 bits to 32 bits.
511 </p>
512
513 <p>
514 OpenFlow 1.1 increased matching flexibility by introducing
515 arbitrary bitwise matching on Ethernet and IPv4 address fields and
516 on the new ``metadata'' register field. Switches were not
517 required to support all possible masks [OpenFlow 1.1, section
518 4.3].
519 </p>
520
521 <p>
522 By a strict reading of the specification, OpenFlow 1.1 removed
523 support for matching ICMPv4 type and code [OpenFlow 1.1, section
524 A.2.3], but this is likely an editing error because ICMP
525 matching is described elsewhere [OpenFlow 1.1, Table 3, Table 4,
526 Figure 4]. Open vSwitch does support ICMPv4 type and code
527 matching with OpenFlow 1.1.
528 </p>
529
530 <p>
531 OpenFlow 1.1 avoided the pitfalls of inapplicable fields that
532 OpenFlow 1.0 encountered, by requiring the switch to ignore the
533 specified field values [OpenFlow 1.1, section A.2.3]. It also
534 implied that the switch should ignore the bits that indicate
535 whether to match inapplicable fields.
536 </p>
537
538 <h4>Physical Ingress Port</h4>
539
540 <p>
541 OpenFlow 1.1 introduced a new pseudo-field, the physical ingress port. The
542 physical ingress port is only a pseudo-field because it cannot be used for
543 matching. It appears in only one place in the protocol, in the ``packet-in''
544 message that passes a packet received at the switch to an OpenFlow
545 controller.
546 </p>
547
548 <p>
549 A packet's ingress port and physical ingress port are identical except for
550 packets processed by a switch feature such as bonding or tunneling that
551 makes a packet appear to arrive on a ``virtual'' port associated with the
552 bond or the tunnel. For such packets, the ingress port is the virtual port
553 and the physical ingress port is, naturally, the physical port. Open
554 vSwitch implements both bonding and tunneling, but its bonding
555 implementation does not use virtual ports and its tunnels are typically not
556 on the same OpenFlow switch as their physical ingress ports (which need not
557 be part of any switch), so the ingress port and physical ingress port are
558 always the same in Open vSwitch.
559 </p>
560
561 <h3>OpenFlow 1.2</h3>
562
563 <p>
564 OpenFlow 1.2 abandoned the fixed-length approach to matching. One reason
565 was size, since adding support for IPv6 address matching (now seen as
566 important), with bitwise masks, would have added 64 bytes to the match
567 length, increasing it from 88 bytes in OpenFlow 1.1 to over 150 bytes.
568 Extensibility had also become important as controller writers increasingly
569 wanted support for new fields without having to change messages throughout
570 the OpenFlow protocol. The challenges of carefully defining fixed-length
571 matches to avoid problems with inapplicable fields had also become clear
572 over time.
573 </p>
574
575 <p>
576 Therefore, OpenFlow 1.2 adopted a flow format using a flexible
577 type-length-value (TLV) representation, in which each TLV expresses a match
578 on one field. These TLVs were in turn encapsulated inside the outer TLV
579 wrapper introduced in OpenFlow 1.1 with the new identifier
580 <code>OFPMT_OXM</code> (1). (This wrapper fulfilled its intended purpose
581 of reducing the amount of churn in the protocol when changing match
582 formats; some messages that included matches remained unchanged from
583 OpenFlow 1.1 to 1.2 and later versions.)
584 </p>
585
586 <p>
587 OpenFlow 1.2 added support for the following fields:
588 </p>
589
590 <ul>
591 <li>ARP hardware addresses (SHA and THA).</li>
592 <li>IPv4 ECN.</li>
593 <li>IPv6 source and destination addresses, flow label, DSCP, ECN,
594 and protocol.</li>
595 <li>TCP, UDP, and SCTP port numbers when encapsulated inside IPv6.</li>
596 <li>ICMPv6 type and code.</li>
597 <li>ICMPv6 Neighbor Discovery target address and source and target
598 Ethernet addresses.</li>
599 </ul>
600
601 <!-- mention tun_id_from_cookie extension? -->
602
603 <p>
604 The OpenFlow 1.2 format, called <dfn>OXM</dfn> (<dfn>OpenFlow Extensible
605 Match</dfn>), was modeled closely on an extension to OpenFlow 1.0
606 introduced in Open vSwitch 1.1 called <dfn>NXM</dfn> (<dfn>Nicira Extended
607 Match</dfn>). Each OXM or NXM TLV has the following format:
608 </p>
609
610 <diagram>
611 <header name="type">
612 <bits name="vendor/class" above="16" width=".75"/>
613 <bits name="field" above="7" width=".4"/>
614 </header>
615 <nospace/>
616 <header name="">
617 <bits name="HM" above="1" width=".25"/>
618 <bits name="length" above="8" width=".4"/>
619 </header>
620 <header name="">
621 <bits name="body" above="length bytes" width="1.7"/>
622 </header>
623 </diagram>
624
625 <p>
626 The most significant 16 bits of the NXM or OXM header, called
627 <code>vendor</code> by NXM and <code>class</code> by OXM, identify
628 an organization permitted to allocate identifiers for fields. NXM
629 allocates only two vendors, 0x0000 for fields supported by
630 OpenFlow 1.0 and 0x0001 for fields implemented as an Open vSwitch
631 extension. OXM assigns classes as follows:
632 </p>
633
634 <dl>
635 <dt>0x0000 (<code>OFPXMC_NXM_0</code>).</dt>
636 <dt>0x0001 (<code>OFPXMC_NXM_1</code>).</dt>
637 <dd>Reserved for NXM compatibility.</dd>
638
639 <dt>0x0002 to 0x7fff</dt>
640 <dd>
641 Reserved for allocation to ONF members, but none yet assigned.
642 </dd>
643
644 <dt>0x8000 (<code>OFPXMC_OPENFLOW_BASIC</code>)</dt>
645 <dd>
646 Used for most standard OpenFlow fields.
647 </dd>
648
649 <dt>0x8001 (<code>OFPXMC_PACKET_REGS</code>)</dt>
650 <dd>
651 Used for packet register fields in OpenFlow 1.5 and later.
652 </dd>
653
654 <dt>0x8002 to 0xfffe</dt>
655 <dd>
656 Reserved for the OpenFlow specification.
657 </dd>
658
659 <dt>0xffff (<code>OFPXMC_EXPERIMENTER</code>)</dt>
660 <dd>Experimental use.</dd>
661 </dl>
662
663 <p>
664 When <code>class</code> is 0xffff, the OXM header is extended to 64 bits by
665 using the first 32 bits of the body as an <code>experimenter</code> field
666 whose most significant byte is zero and whose remaining bytes are an
667 Organizationally Unique Identifier (OUI) assigned by the IEEE [IEEE OUI],
668 as shown below. OpenFlow says that support for experimenter fields is
669 optional. Open vSwitch 2.4 and later does support them, primarily so that
670 it can support the <code>ONFOXM_ET_</code>* code points defined by official
671 Open Networking Foundation extensions to OpenFlow 1.3 in e.g. [TCP Flags
672 Match Field Extension].
673 </p>
674
675 <diagram>
676 <header name="type">
677 <bits name="class" above="16" below="0xffff" width=".75"/>
678 <bits name="field" above="7" width=".4"/>
679 </header>
680 <nospace/>
681 <header name="">
682 <bits name="HM" above="1" width=".25"/>
683 <bits name="length" above="8" width=".4"/>
684 </header>
685
686 <header name="experimenter">
687 <bits name="zero" above="8" below="0x00" width=".4"/>
688 <bits name="OUI" above="24" width="1"/>
689 </header>
690 <header name="">
691 <bits name="body" above="(length - 4) bytes" width="1.7"/>
692 </header>
693 </diagram>
694
695 <p>
696 Taken as a unit, <code>class</code> (or <code>vendor</code>),
697 <code>field</code>, and <code>experimenter</code> (when present) uniquely
698 identify a particular field.
699 </p>
700
701 <p>
702 When <code>hasmask</code> (abbreviated <code>HM</code> above) is 0, the OXM
703 is an exact match on an entire field. In this case, the body (excluding
704 the experimenter field, if present) is a single value to be matched.
705 </p>
706
707 <p>
708 When <code>hasmask</code> is 1, the OXM is a bitwise match. The body
709 (excluding the experimenter field) consists of a value to match, followed
710 by the bitwise mask to apply. A 1-bit in the mask indicates that the
711 corresponding bit in the value should be matched and a 0-bit that it should
712 be ignored. For example, for an IP address field, a value of 192.168.0.0
713 followed by a mask of 255.255.0.0 would match addresses in the
714 192.168.0.0/16 subnet.
715 </p>
716
717 <ul>
718 <li>
719 Some fields might not support masking at all, and some fields that do
720 support masking might restrict it to certain patterns. For example,
721 fields that have IP address values might be restricted to CIDR masks.
722 The descriptions of individual fields note these restrictions.
723 </li>
724
725 <li>
726 An OXM TLV with a mask that is all zeros is not useful (although it is
727 not forbidden), because it has the same effect as omitting the TLV
728 entirely.
729 </li>
730
731 <li>
732 It is not meaningful to pair a 0-bit in an OXM mask with a 1-bit in its
733 value, and Open vSwitch rejects such an OXM with the error
734 <code>OFPBMC_BAD_WILDCARDS</code>, as required by OpenFlow 1.3 and later.
735 </li>
736 </ul>
737
738 <p>
739 The <code>length</code> identifies the number of bytes in the body,
740 including the 4-byte <code>experimenter</code> header, if it is present.
741 Each OXM TLV has a fixed length; that is, given <code>class</code>,
742 <code>field</code>, <code>experimenter</code> (if present), and
743 <code>hasmask</code>, <code>length</code> is a constant. The
744 <code>length</code> is included explicitly to allow software to minimally
745 parse OXM TLVs of unknown types.
746 </p>
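<p>
The header layout, the <code>hasmask</code> bit, and the
<code>length</code> rule described above can be sketched in Python as
follows. The function names are hypothetical. The example assumes that the
IPv4 source address is field number 11 in class 0x8000, a value recalled
from the OpenFlow specification rather than stated in this document, so it
should be verified before being relied on. Experimenter headers are not
handled.
</p>

```python
import struct


def oxm_header(oxm_class, field, hasmask, length):
    """Pack class (16 bits), field (7), hasmask (1), and length (8)."""
    return (oxm_class << 16) | (field << 9) | (int(hasmask) << 8) | length


def parse_oxm_header(header):
    """Split a 32-bit header into (class, field, hasmask, length)."""
    return (header >> 16, (header >> 9) & 0x7F,
            bool((header >> 8) & 1), header & 0xFF)


def encode_oxm(oxm_class, field, value, mask=None):
    """Encode one OXM TLV; value and mask are bytes of the field's width."""
    body = value
    if mask is not None:
        if len(mask) != len(value):
            raise ValueError("value and mask must have the same width")
        # A 1-bit in the value under a 0-bit in the mask is not meaningful.
        if any(v & ~m & 0xFF for v, m in zip(value, mask)):
            raise ValueError("value has a 1-bit where the mask has a 0-bit")
        body = value + mask
    header = oxm_header(oxm_class, field, mask is not None, len(body))
    return struct.pack("!I", header) + body


OFPXMC_OPENFLOW_BASIC = 0x8000
IPV4_SRC = 11  # assumed field number; check the OpenFlow specification

exact = encode_oxm(OFPXMC_OPENFLOW_BASIC, IPV4_SRC, bytes([192, 168, 0, 1]))
masked = encode_oxm(OFPXMC_OPENFLOW_BASIC, IPV4_SRC,
                    bytes([192, 168, 0, 0]), bytes([255, 255, 0, 0]))
```

<p>
The masked form doubles the body, so <code>length</code> changes from 4
to 8 while remaining a constant for a given field and
<code>hasmask</code> value.
</p>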
747
748 <p>
749 OXM TLVs must be ordered so that a field's prerequisites are satisfied
750 before it is parsed. For example, an OXM TLV that matches on the IPv4
751 source address field is only allowed following an OXM TLV that matches on
752 the Ethertype for IPv4. Similarly, an OXM TLV that matches on the TCP
753 source port must follow a TLV that matches an Ethertype of IPv4 or IPv6 and
754 one that matches an IP protocol of TCP (in that order). The order of OXM
755 TLVs is not otherwise restricted; no canonical ordering is defined.
756 </p>
757
758 <p>
759 A given field may be matched only once in a series of OXM TLVs.
760 </p>
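<p>
The ordering and uniqueness rules can be checked with a single pass over
the TLVs. The following Python sketch assumes a small, hand-written
prerequisite table covering only a few fields; the table, the field names
used as keys, and the function name <code>check_oxm_sequence</code> are
hypothetical and are not taken from Open vSwitch.
</p>

```python
# Illustrative subset: field -> list of (prerequisite field, allowed values).
PREREQS = {
    "eth_type": [],
    "ip_proto": [("eth_type", {0x0800, 0x86DD})],
    "ipv4_src": [("eth_type", {0x0800})],
    "tcp_src": [("eth_type", {0x0800, 0x86DD}), ("ip_proto", {6})],
}


def check_oxm_sequence(tlvs):
    """Validate a list of (field, value) exact matches, in wire order.

    Raises ValueError if a field repeats or if a field appears before a
    TLV that satisfies one of its prerequisites.
    """
    seen = {}
    for field, value in tlvs:
        if field in seen:
            raise ValueError("%s matched more than once" % field)
        for prereq, allowed in PREREQS.get(field, []):
            if seen.get(prereq) not in allowed:
                raise ValueError(
                    "%s requires an earlier match on %s" % (field, prereq))
        seen[field] = value


# Ethertype IPv4, then IP protocol TCP, then the TCP source port.
check_oxm_sequence([("eth_type", 0x0800), ("ip_proto", 6), ("tcp_src", 80)])
```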
761
762 <!-- EXT-482? -->
763
764 <h3>OpenFlow 1.3</h3>
765
766 <p>
767 OpenFlow 1.3 showed OXM to be largely successful, by adding new fields
768 without making any changes to how flow matches otherwise worked. It added
769 OXMs for the following fields supported by Open vSwitch:
770 </p>
771
772 <ul>
773 <li>Tunnel ID for ports associated with e.g. VXLAN or keyed GRE.</li>
774 <li>MPLS ``bottom of stack'' (BOS) bit.</li>
775 </ul>
776
777 <p>
778 OpenFlow 1.3 also added OXMs for the following fields not documented here
779 and not yet implemented by Open vSwitch:
780 </p>
781
782 <ul>
783 <li>IPv6 extension header handling.</li>
784 <li>PBB I-SID.</li>
785 </ul>
786
787 <h3>OpenFlow 1.4</h3>
788
789 <p>
790 OpenFlow 1.4 added OXMs for the following fields not documented here and
791 not yet implemented by Open vSwitch:
792 </p>
793
794 <ul>
795 <li>PBB UCA.</li>
796 </ul>
797
798 <h3>OpenFlow 1.5</h3>
799
800 <p>
801 OpenFlow 1.5 added OXMs for the following fields supported by Open vSwitch:
802 </p>
803
804 <ul>
805 <li>TCP flags.</li>
806 <li>Packet registers.</li>
807 <li>The output port in the OpenFlow action set.</li>
808 </ul>
809
810 <p>
811 OpenFlow 1.5 also added OXMs for the following fields not documented here
812 and not yet implemented by Open vSwitch:
813 </p>
814
815 <ul>
816 <li>Packet type.</li>
817 </ul>
818
819 <h1>Fields Reference</h1>
820
821 <p>
822 The following sections document the fields that Open vSwitch supports.
823 Each section provides introductory material on a group of related fields,
824 followed by information on each individual field. In addition to
825 field-specific information, each field begins with a table with entries for
826 the following important properties:
827 </p>
828
829 <dl>
830 <dt>Name</dt>
831 <dd>
832 The field's name, used for parsing and formatting the field, e.g. in
833 <code>ovs-ofctl</code> commands. For historical reasons, some fields
834 have an additional name that is accepted as an alternative in parsing.
835 This name, when there is one, is listed as well, e.g. ``<code>tun</code>
836 (aka <code>tunnel_id</code>).''
837 </dd>
838
839 <dt>Width</dt>
840 <dd>
841 The field's width, always a multiple of 8 bits. Some fields don't use
842 all of the bits, so this may be accompanied by an explanation. For
843 example, OpenFlow embeds the 2-bit IP ECN field as the low bits in an
844 8-bit byte, and so its width is expressed as ``8 bits (only the
845 least-significant 2 bits may be nonzero).''
846 </dd>
847
848 <dt>Format</dt>
849 <dd>
850 <p>
851 How a value for the field is formatted or parsed by, e.g.,
852 <code>ovs-ofctl</code>. Some possibilities are generic:
853 </p>
854
855 <dl>
856 <dt>decimal</dt>
857 <dd>
858 Formats as a decimal number. On input, accepts decimal numbers or
859 hexadecimal numbers prefixed by <code>0x</code>.
860 </dd>
861
862 <dt>hexadecimal</dt>
863 <dd>
864 Formats as a hexadecimal number prefixed by <code>0x</code>. On
865 input, accepts decimal numbers or hexadecimal numbers prefixed by
866 <code>0x</code>. (The default for parsing is <em>not</em>
867 hexadecimal: only a <code>0x</code> prefix causes input to be treated
868 as hexadecimal.)
869 </dd>
870
871 <dt>Ethernet</dt>
872 <dd>
873 Formats and accepts the common Ethernet address format
874 <code><var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var></code>.
875 </dd>
876
877 <dt>IPv4</dt>
878 <dd>
879 Formats and accepts the dotted-quad format
880 <code><var>a</var>.<var>b</var>.<var>c</var>.<var>d</var></code>.
881 For bitwise matches, formats and accepts
882 <code><var>address</var>/<var>length</var></code> CIDR notation in
883 addition to <code><var>address</var>/<var>mask</var></code>.
884 </dd>
885
886 <dt>IPv6</dt>
887 <dd>
888 Formats and accepts the common IPv6 address formats, plus CIDR
889 notation for bitwise matches.
890 </dd>
891
892 <dt>OpenFlow 1.0 port</dt>
893 <dd>
894 Accepts 16-bit port numbers in decimal, plus OpenFlow well-known port
895 names (e.g. <code>IN_PORT</code>) in uppercase or lowercase.
896 </dd>
897
898 <dt>OpenFlow 1.1+ port</dt>
899 <dd>
900 Same syntax as OpenFlow 1.0 ports but for 32-bit OpenFlow 1.1+ port
901 number fields.
902 </dd>
903 </dl>
904
905 <p>
906 Other, field-specific formats are explained along with their fields.
907 </p>
908 </dd>
909
910 <dt>Masking</dt>
911 <dd>
912 For most fields, this says ``arbitrary bitwise masks,'' meaning that a
913 flow may match any combination of bits in the field. Some fields
914 instead say ``exact match only,'' which means that a flow that matches
915 on this field must match on the whole field instead of just certain
916 bits. Either way, this reports masking support for the latest version
917 of Open vSwitch using OXM or NXM (that is, either OpenFlow 1.2+ or
918 OpenFlow 1.0 plus Open vSwitch NXM extensions). In particular,
919 OpenFlow 1.0 (without NXM) and 1.1 don't always support masking even if
920 Open vSwitch itself does; refer to the <em>OpenFlow 1.0</em> and
921 <em>OpenFlow 1.1</em> rows to learn about masking with these protocol
922 versions.
923 </dd>
924
925 <dt>Prerequisites</dt>
926 <dd>
927 <p>
928 Requirements that must be met to match on this field. For example,
929 <ref field="ip_src"/> has IPv4 as a prerequisite, meaning that a match
930 must include <code>eth_type=0x0800</code> to match on the IPv4 source
931 address. The following prerequisites, with their requirements, are
932 currently in use:
933 </p>
934
935 <dl>
936 <dt>none</dt>
937 <dd>(no requirements)</dd>
938
939 <dt>VLAN VID</dt>
940 <dd><code>vlan_tci=0x1000/0x1000</code> (i.e. a VLAN header is
941 present)</dd>
942
943 <dt>ARP</dt>
944 <dd><code>eth_type=0x0806</code> (ARP) or <code>eth_type=0x8035</code> (RARP)</dd>
945
946 <dt>IPv4</dt>
947 <dd><code>eth_type=0x0800</code></dd>
948
949 <dt>IPv6</dt>
950 <dd><code>eth_type=0x86dd</code></dd>
951
952 <dt>IPv4/IPv6</dt>
953 <dd>IPv4 or IPv6</dd>
954
955 <dt>MPLS</dt>
956 <dd><code>eth_type=0x8847</code> or <code>eth_type=0x8848</code></dd>
957
958 <dt>TCP</dt>
959 <dd>IPv4/IPv6 and <code>ip_proto=6</code></dd>
960
961 <dt>UDP</dt>
962 <dd>IPv4/IPv6 and <code>ip_proto=17</code></dd>
963
964 <dt>SCTP</dt>
965 <dd>IPv4/IPv6 and <code>ip_proto=132</code></dd>
966
967 <dt>ICMPv4</dt>
968 <dd>IPv4 and <code>ip_proto=1</code></dd>
969
970 <dt>ICMPv6</dt>
971 <dd>IPv6 and <code>ip_proto=58</code></dd>
972
973 <dt>ND solicit</dt>
974 <dd>ICMPv6 and <code>icmp_type=135</code> and <code>icmp_code=0</code></dd>
975
976 <dt>ND advert</dt>
977 <dd>ICMPv6 and <code>icmp_type=136</code> and <code>icmp_code=0</code></dd>
978
979 <dt>ND</dt>
980 <dd>ND solicit or ND advert</dd>
981 </dl>
982
983 <p>
984 The TCP, UDP, and SCTP prerequisites also have the special requirement
985 that <code>nw_frag</code> is not being used to select ``later
986 fragments.'' This is because only the first fragment of a fragmented
987 IPv4 or IPv6 datagram contains the TCP, UDP, or SCTP header.
988 </p>
989 </dd>
990
991 <dt>Access</dt>
992 <dd>
993 Most fields are ``read/write,'' which means that common OpenFlow actions
994 like <code>set_field</code> can modify them. Fields that are
995 ``read-only'' cannot be modified in these general-purpose ways, although
996 there may be other ways that actions can modify them.
997 </dd>
998
999 <dt>OpenFlow 1.0</dt>
1000 <dt>OpenFlow 1.1</dt>
1001 <dd>
1002 These rows report the level of support that OpenFlow 1.0 or OpenFlow 1.1,
1003 respectively, has for a field. For OpenFlow 1.0, supported fields are
1004 reported as either ``yes (exact match only)'' for fields that do not
1005 support any bitwise masking or ``yes (CIDR match only)'' for fields that
1006 support CIDR masking. OpenFlow 1.1 supported fields report either ``yes
1007 (exact match only)'' or simply ``yes'' for fields that do support
1008 arbitrary masks. These OpenFlow versions supported a fixed collection of
1009 fields that cannot be extended, so many more fields are reported as ``not
1010 supported.''
1011 </dd>
1012
1013 <dt>OXM</dt>
1014 <dt>NXM</dt>
1015 <dd>
1016 <p>
1017 These rows report the OXM and NXM code points that correspond to a
1018 given field. Either or both may be ``none.''
1019 </p>
1020
1021 <p>
1022 A field that has only an OXM code point is usually one that was
1023 standardized before it was added to Open vSwitch. A field that has
1024 only an NXM code point is usually one that is not yet standardized.
1025 When a field has both OXM and NXM code points, it usually indicates
1026 that it was introduced as an Open vSwitch extension under the NXM code
1027 point, then later standardized under the OXM code point. A field can
1028 have more than one OXM code point if it was standardized in OpenFlow
1029 1.4 or later and additionally introduced as an official ONF extension
1030 for OpenFlow 1.3. (A field that has neither OXM nor NXM code point is
1031 typically an obsolete field that is supported in some other form using
1032 OXM or NXM.)
1033 </p>
1034
1035 <p>
1036 Each code point in these rows is described in the form
1037 ``<code>NAME</code> (<var>number</var>) since OpenFlow <var>spec</var>
1038 and Open vSwitch <var>version</var>,''
1039 e.g. ``<code>OXM_OF_ETH_TYPE</code> (5) since OpenFlow 1.2 and Open
1040 vSwitch 1.7.'' First, <code>NAME</code>, which specifies a name for
1041 the code point, starts with a prefix that designates a class and, in
1042 some cases, a vendor, as listed in the following table:
1043 </p>
1044
1045 <oxm_classes/>
1046
1047 <p>
1048 For more information on OXM/NXM classes and vendors, refer back to
1049 <em>OpenFlow 1.2</em> under <em>Evolution of OpenFlow Fields</em>. The
1050 <var>number</var> is the field number within the class and vendor. The
1051 OpenFlow <var>spec</var> is the version of OpenFlow that standardized
1052 the code point. It is omitted for NXM code points because they are
1053 nonstandard. The <var>version</var> is the version of Open vSwitch
1054 that first supported the code point.
1055 </p>
1056 </dd>
1057 </dl>
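
<p>
  As a rough model of the generic <em>Format</em> rules listed above, the
  following Python sketch parses numbers (decimal unless prefixed by
  <code>0x</code>) and IPv4 matches in plain, CIDR, or address/mask form.
  It is not the <code>ovs-ofctl</code> parser; the function names are
  hypothetical and error handling is minimal.
</p>

```python
import ipaddress


def parse_number(text):
    """Decimal by default; hexadecimal only with a 0x prefix."""
    if text.startswith("0x"):
        return int(text[2:], 16)
    return int(text, 10)


def parse_ipv4_match(text):
    """Return (address, mask) as integers.

    Accepts a.b.c.d, a.b.c.d/length (CIDR), or a.b.c.d/mask.
    """
    addr_text, _, mask_text = text.partition("/")
    address = int(ipaddress.IPv4Address(addr_text))
    if not mask_text:
        mask = 0xFFFFFFFF  # No mask given: exact match.
    elif "." in mask_text:
        mask = int(ipaddress.IPv4Address(mask_text))
    else:
        length = int(mask_text)
        if length not in range(33):
            raise ValueError("prefix length must be between 0 and 32")
        # Clear the low (32 - length) bits of an all-ones mask.
        mask = 0xFFFFFFFF - (2 ** (32 - length) - 1)
    return address, mask


print(parse_number("0x10"), parse_number("10"))
print(parse_ipv4_match("192.168.0.0/16") == parse_ipv4_match("192.168.0.0/255.255.0.0"))
```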
1058
1059 <group title="Conjunctive Match">
1060 <p>
1061 An individual OpenFlow flow can match only a single value for each field.
1062 However, situations often arise where one wants to match one of a set of
1063 values within a field or fields. For matching a single field against a
1064 set, it is straightforward and efficient to add multiple flows to the
1065 flow table, one for each value in the set. For example, one might use
1066 the following flows to send packets with IP source address <var>a</var>,
1067 <var>b</var>, <var>c</var>, or <var>d</var> to the OpenFlow controller:
1068 </p>
1069
1070 <pre>
1071 ip,ip_src=<var>a</var> actions=controller
1072 ip,ip_src=<var>b</var> actions=controller
1073 ip,ip_src=<var>c</var> actions=controller
1074 ip,ip_src=<var>d</var> actions=controller
1075 </pre>
1076
1077 <p>
1078 Similarly, these flows send packets with IP destination address
1079 <var>e</var>, <var>f</var>, <var>g</var>, or <var>h</var> to the OpenFlow
1080 controller:
1081 </p>
1082
1083 <pre>
1084 ip,ip_dst=<var>e</var> actions=controller
1085 ip,ip_dst=<var>f</var> actions=controller
1086 ip,ip_dst=<var>g</var> actions=controller
1087 ip,ip_dst=<var>h</var> actions=controller
1088 </pre>
1089
1090 <p>
1091 Installing all of the above flows in a single flow table yields a
1092 disjunctive effect: a packet is sent to the controller if
1093 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>}
1094 or <code>ip_dst</code> ∈
1095 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} (or both).
1096 (Pedantically, if both of the above sets of flows are present in the flow
1097 table, they should have different priorities, because OpenFlow says that
1098 the results are undefined when two flows with the same priority can both match
1099 a single packet.)
1100 </p>
1101
1102 <p>
1103 Suppose, on the other hand, one wishes to match conjunctively, that is, to
1104 send a packet to the controller only if both <code>ip_src</code> ∈
1105 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1106 <code>ip_dst</code> ∈
1107 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>}. This requires 4 × 4
1108 = 16 flows, one for each possible pairing of <code>ip_src</code> and
1109 <code>ip_dst</code>. That is acceptable for our small example, but it does
1110 not gracefully extend to larger sets or greater numbers of dimensions.
1111 </p>
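
<p>
  To make the growth concrete, the following Python sketch enumerates the
  cross product for the example sets. The helper name is hypothetical, and
  the placeholder letters stand in for real IP addresses.
</p>

```python
from itertools import product


def cross_product_flows(sources, destinations):
    """Build one flow per (source, destination) pair."""
    return [
        f"ip,ip_src={src},ip_dst={dst} actions=controller"
        for src, dst in product(sources, destinations)
    ]


flows = cross_product_flows(["a", "b", "c", "d"], ["e", "f", "g", "h"])
for flow in flows:
    print(flow)
print(len(flows))  # 4 x 4 = 16 flows
```

<p>
  Adding a third dimension with four values would multiply the count again,
  to 64 flows.
</p>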
1112
1113 <p>
1114 The <code>conjunction</code> action is a solution for conjunctive matches
1115 that is built into Open vSwitch. A <code>conjunction</code> action ties groups of
1116 individual OpenFlow flows into higher-level ``conjunctive flows''. Each
1117 group corresponds to one dimension, and each flow within the group matches
1118 one possible value for the dimension. A packet that matches one flow from
1119 each group matches the conjunctive flow.
1120 </p>
1121
1122 <p>
1123 To implement a conjunctive flow with <code>conjunction</code>, assign the
1124 conjunctive flow a 32-bit <var>id</var>, which must be unique within an
1125 OpenFlow table. Assign each of the <var>n</var> ≥ 2 dimensions a unique
1126 number from 1 to <var>n</var>; the ordering is unimportant. Add one flow
1127 to the OpenFlow flow table for each possible value of each dimension with
1128 <code>conjunction(<var>id</var>, <var>k</var>/<var>n</var>)</code> as the
1129 flow's actions, where <var>k</var> is the number assigned to the flow's
1130 dimension. Together, these flows specify the conjunctive flow's match
1131 condition. When the conjunctive match condition is met, Open vSwitch looks
1132 up one more flow that specifies the conjunctive flow's actions and receives
1133 its statistics. This flow is found by setting <code>conj_id</code> to the
1134 specified <var>id</var> and then again searching the flow table.
1135 </p>
1136
1137 <p>
1138 The following flows provide an example. Whenever the IP source is one of
1139 the values in the flows that match on the IP source (dimension 1 of 2),
1140 <em>and</em> the IP destination is one of the values in the flows that
1141 match on IP destination (dimension 2 of 2), Open vSwitch searches for a
1142 flow that matches <code>conj_id</code> against the conjunction ID (1234),
1143 finding the first flow listed below.
1144 </p>
1145
1146 <pre>
1147 conj_id=1234 actions=controller
1148 ip,ip_src=10.0.0.1 actions=conjunction(1234, 1/2)
1149 ip,ip_src=10.0.0.4 actions=conjunction(1234, 1/2)
1150 ip,ip_src=10.0.0.6 actions=conjunction(1234, 1/2)
1151 ip,ip_src=10.0.0.7 actions=conjunction(1234, 1/2)
1152 ip,ip_dst=10.0.0.2 actions=conjunction(1234, 2/2)
1153 ip,ip_dst=10.0.0.5 actions=conjunction(1234, 2/2)
1154 ip,ip_dst=10.0.0.7 actions=conjunction(1234, 2/2)
1155 ip,ip_dst=10.0.0.8 actions=conjunction(1234, 2/2)
1156 </pre>
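
<p>
  The recipe can be sketched in Python as follows. This is a simplified
  model under stated assumptions, not Open vSwitch code: the function names
  are hypothetical, every flow in a dimension matches one IP field exactly,
  and priorities are ignored.
</p>

```python
def conjunctive_flows(conj_id, dimensions, actions):
    """Generate flow strings for one conjunctive flow.

    dimensions is a list with one entry per dimension; each entry is a
    list of (field, value) pairs, one pair per flow in that dimension.
    """
    n = len(dimensions)
    if not n >= 2:
        raise ValueError("a conjunctive match needs at least 2 dimensions")
    flows = [f"conj_id={conj_id} actions={actions}"]
    for k, dimension in enumerate(dimensions, start=1):
        for field, value in dimension:
            flows.append(
                f"ip,{field}={value} actions=conjunction({conj_id}, {k}/{n})"
            )
    return flows


def packet_matches(packet, dimensions):
    """True if the packet matches at least one flow in every dimension."""
    return all(
        any(packet.get(field) == value for field, value in dimension)
        for dimension in dimensions
    )


dims = [
    [("ip_src", a) for a in ("10.0.0.1", "10.0.0.4", "10.0.0.6", "10.0.0.7")],
    [("ip_dst", a) for a in ("10.0.0.2", "10.0.0.5", "10.0.0.7", "10.0.0.8")],
]
example = conjunctive_flows(1234, dims, "controller")
print(len(example))  # 1 + 4 + 4 = 9 flows instead of 16
print(packet_matches({"ip_src": "10.0.0.7", "ip_dst": "10.0.0.7"}, dims))
```

<p>
  The number of flows grows with the sum of the set sizes rather than their
  product, which is the point of the <code>conjunction</code> action.
</p>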
1157
1158 <p>
1159 Many subtleties exist:
1160 </p>
1161
1162 <ul>
1163 <li>
1164 In the example above, every flow in a single dimension has the same form,
1165 that is, dimension 1 matches on <code>ip_src</code> and dimension 2 on
1166 <code>ip_dst</code>, but this is not a requirement. Different flows
1167 within a dimension may match on different bits within a field (e.g. IP
1168 network prefixes of different lengths, or TCP/UDP port ranges as bitwise
1169 matches), or even on entirely different fields (e.g. to match packets for
1170 TCP source port 80 or TCP destination port 80).
1171 </li>
1172
1173 <li>
1174 The flows within a dimension can vary their matches across more than
1175 one field, e.g. to match only specific pairs of IP source and
1176 destination addresses or L4 port numbers.
1177 </li>
1178
1179 <li>
1180 A flow may have multiple <code>conjunction</code> actions, with different
1181 <code>id</code> values. This is useful for multiple conjunctive flows with
1182 overlapping sets. If one conjunctive flow matches packets with both
1183 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>} and <code>ip_dst</code> ∈
1184 {<var>d</var>,<var>e</var>} and a second conjunctive flow matches <code>ip_src</code>
1185 ∈ {<var>b</var>,<var>c</var>} and <code>ip_dst</code> ∈ {<var>f</var>,<var>g</var>}, for
1186 example, then the flow that matches <code>ip_src=</code><var>b</var> would have two
1187 <code>conjunction</code> actions, one for each conjunctive flow. The order
1188 of <code>conjunction</code> actions within a list of actions is not
1189 significant.
1190 </li>
1191 <li>
1192 A flow with <code>conjunction</code> actions may also include <code>note</code>
1193 actions for annotations, but not any other kind of actions. (They
1194 would not be useful because they would never be executed.)
1195 </li>
1196 <li>
1197 All of the flows that constitute a conjunctive flow with a given
1198 <var>id</var> must have the same priority. (Flows with the same <var>id</var>
1199 but different priorities are currently treated as different
1200 conjunctive flows, that is, currently <var>id</var> values need only be
1201 unique within an OpenFlow table at a given priority. This behavior
1202 isn't guaranteed to stay the same in later releases, so please use
1203 <var>id</var> values unique within an OpenFlow table.)
1204 </li>
1205 <li>
1206 Conjunctive flows must not overlap with each other, at a given
1207 priority, that is, any given packet must be able to match at most one
1208 conjunctive flow at a given priority. Overlapping conjunctive flows
1209 yield unpredictable results.
1210 </li>
1211 <li>
1212 Following a conjunctive flow match, the search for the flow with
1213 <code>conj_id=</code><var>id</var> is done in the same general-purpose way as
1214 other flow table searches, so one can use flows with
1215 <code>conj_id=</code><var>id</var> to act differently depending on
1216 circumstances. (One exception is that the search for the
1217 <code>conj_id=</code><var>id</var> flow itself ignores conjunctive flows, to
1218 avoid recursion.) If the search with <code>conj_id=</code><var>id</var> fails,
1219 Open vSwitch acts as if the conjunctive flow had not matched at all, and
1220 continues searching the flow table for other matching flows.
1221 </li>
1222 <li>
1223 <p>
1224 OpenFlow prerequisite checking occurs for the flow with
1225 <code>conj_id=</code><var>id</var> in the same way as any other flow, e.g. in
1226 an OpenFlow 1.1+ context, putting a <code>mod_nw_src</code> action into the example
1227 above would require adding an <code>ip</code> match, like this:
1228 </p>
1229 <pre>
1230 conj_id=1234,ip actions=mod_nw_src:1.2.3.4,controller
1231 </pre>
1232 </li>
1233 <li>
1234 OpenFlow prerequisite checking also occurs for the individual flows
1235 that comprise a conjunctive match in the same way as any other flow.
1236 </li>
1237 <li>
1238 The flows that constitute a conjunctive flow do not have useful
1239 statistics. They are never updated with byte or packet counts, and so
1240 on. (For such a flow, therefore, the idle and hard timeouts work much
1241 the same way.)
1242 </li>
1243 <li>
1244 <p>
1245 Sometimes there is a choice of which flows include a particular match.
1246 For example, suppose that we added an extra constraint to our example,
1247 to match on <code>ip_src</code> ∈
1248 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1249 <code>ip_dst</code> ∈
1250 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} and
1251 <code>tcp_dst</code> = <var>i</var>. One way to implement this is to
1252 add the new constraint to the <code>conj_id</code> flow, like this:
1253 </p>
1254 <pre>
1255 conj_id=1234,tcp,tcp_dst=<var>i</var> actions=mod_nw_src:1.2.3.4,controller
1256 </pre>
1257 <p>
1258 but <em>this is not recommended</em> because of the cost of the extra
1259 flow table lookup. Instead, add the constraint to the individual
1260 flows, either in one of the dimensions or (slightly better) all of
1261 them.
1262 </p>
1263 </li>
1264 <li>
1265 A conjunctive match must have <var>n</var> ≥ 2 dimensions (otherwise a
1266 conjunctive match is not necessary). Open vSwitch enforces this.
1267 </li>
1268 <li>
1269 Each dimension within a conjunctive match should ordinarily have more
1270 than one flow. Open vSwitch does not enforce this.
1271 </li>
1272 </ul>
1273
1274 <field id="MFF_CONJ_ID" title="Conjunction ID">
1275 Used for conjunctive matching. See above for more information.
1276 </field>
1277 </group>
1278
1279 <group title="Tunnel">
1280 <p>
1281 The fields in this group relate to tunnels, which Open vSwitch
1282 supports in several forms (GRE, VXLAN, and so on). Most of
1283 these fields do appear in the wire format of a packet, so they
1284 are data fields from that point of view, but they are metadata
1285 from an OpenFlow flow table point of view because they do not
1286 appear in packets that are forwarded to the controller or to
1287 ordinary (non-tunnel) output ports.
1288 </p>
1289
1290 <p>
1291 Open vSwitch supports a spectrum of usage models for mapping
1292 tunnels to OpenFlow ports:
1293 </p>
1294
1295 <dl>
1296 <dt>``Port-based'' tunnels</dt>
1297 <dd>
1298 <p>
1299 In this model, an OpenFlow port represents one tunnel: it matches a
1300 particular type of tunnel traffic between two IP endpoints, with a
1301 particular tunnel key (if keys are in use). In this situation, <ref
1302 field="in_port"/> suffices to distinguish one tunnel from another, so
1303 the tunnel header fields have little importance for OpenFlow
1304 processing. (They are still populated and may be used if it is
1305 convenient.) The tunnel header fields play no role in sending
1306 packets out such an OpenFlow port, either, because the OpenFlow port
1307 itself fully specifies the tunnel headers.
1308 </p>
1309
1310 <p>
1311 The following Open vSwitch commands create a bridge
1312 <code>br-int</code>, add port <code>tap0</code> to the bridge as
1313 OpenFlow port 1, establish a port-based GRE tunnel between the local
1314 host and remote IP 192.168.1.1 using GRE key 5001 as OpenFlow port 2,
1315 and arrange to forward all traffic from <code>tap0</code> to the
1316 tunnel and vice versa:
1317 </p>
1318
1319 <pre>
1320 ovs-vsctl add-br br-int
1321 ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1322 ovs-vsctl add-port br-int gre0 -- \
1323 set interface gre0 ofport_request=2 type=gre \
1324 options:remote_ip=192.168.1.1 options:key=5001
1325 ovs-ofctl add-flow br-int in_port=1,actions=2
1326 ovs-ofctl add-flow br-int in_port=2,actions=1
1327 </pre>
1328 </dd>
1329
1330 <dt>``Flow-based'' tunnels</dt>
1331 <dd>
1332 <p>
1333 In this model, one OpenFlow port represents all possible tunnels of a
1334 given type with an endpoint on the current host, for example, all GRE
1335 tunnels. In this situation, <ref field="in_port"/> only indicates
1336 that traffic was received on the particular kind of tunnel. This is
1337 where the tunnel header fields are most important: they allow the
1338 OpenFlow tables to discriminate among tunnels based on their IP
1339 endpoints or keys. Tunnel header fields also determine the IP
1340 endpoints and keys of packets sent out such a tunnel port.
1341 </p>
1342
1343 <p>
1344 The following Open vSwitch commands create a bridge
1345 <code>br-int</code>, add port <code>tap0</code> to the
1346 bridge as OpenFlow port 1, establish a flow-based GRE tunnel
1347 port 3, and arrange to forward all traffic from
1348 <code>tap0</code> to remote IP 192.168.1.1 over a GRE tunnel
1349 with key 5001 and vice versa:
1350 </p>
1351
1352 <pre>
1353 ovs-vsctl add-br br-int
1354 ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1355 ovs-vsctl add-port br-int allgre -- \
1356 set interface allgre ofport_request=3 type=gre \
1357 options:remote_ip=flow options:key=flow
1358 ovs-ofctl add-flow br-int \
1359 'in_port=1 actions=set_tunnel:5001,set_field:192.168.1.1->tun_dst,3'
1360 ovs-ofctl add-flow br-int 'in_port=3,tun_src=192.168.1.1,tun_id=5001 actions=1'
1361 </pre>
1362 </dd>
1363
1364 <dt>Mixed models.</dt>
1365 <dd>
1366 <p>
1367 One may define both flow-based and port-based tunnels at the
1368 same time. For example, it is valid and possibly useful to
1369 create and configure both <code>gre0</code> and
1370 <code>allgre</code> tunnel ports described above.
1371 </p>
1372
1373 <p>
1374 Traffic is attributed on ingress to the most specific
1375 matching tunnel. For example, <code>gre0</code> is more
1376 specific than <code>allgre</code>. Therefore, if both
1377 exist, then <code>gre0</code> will be the ingress port for any
1378 GRE traffic received from 192.168.1.1 with key 5001.
1379 </p>
1380
1381 <p>
1382 On egress, traffic may be directed to any appropriate tunnel
1383 port. If both <code>gre0</code> and <code>allgre</code> are
1384 configured as already described, then the actions
1385 <code>2</code> and
1386 <code>set_tunnel:5001,set_field:192.168.1.1->tun_dst,3</code>
1387 send the same tunnel traffic.
1388 </p>
1389 </dd>
1390
1391 <dt>Intermediate models.</dt>
1392 <dd>
1393 Ports may be configured as partially flow-based. For example,
1394 one may define an OpenFlow port that represents tunnels
1395 between a pair of endpoints but leaves the flow table to
1396 discriminate on the flow key.
1397 </dd>
1398 </dl>
1399
1400 <p>
1401 <code>ovs-vswitchd.conf.db</code>(5) describes all the details of tunnel
1402 configuration.
1403 </p>
1404
1405 <p>
1406 These fields do not have any prerequisites, which means that a
1407 flow may match on any or all of them, in any combination.
1408 </p>
1409
1410 <p>
1411 These fields are zeros for packets that did not arrive on a tunnel.
1412 </p>
1413
1414 <field id="MFF_TUN_ID" title="Tunnel ID">
1415 <p>
1416 Many kinds of tunnels support a tunnel ID:
1417 </p>
1418
1419 <ul>
1420 <li>
1421 VXLAN and Geneve have a 24-bit virtual network identifier (VNI).
1422 </li>
1423 <li>LISP has a 24-bit instance ID.</li>
1424 <li>GRE has an optional 32-bit key.</li>
1425 <li>STT has a 64-bit key.</li>
1426 </ul>
1427
1428 <p>
1429 When a packet is received from a tunnel, this field holds the
1430 tunnel ID in its least significant bits, zero-extended to fit.
1431 This field is zero if the tunnel does not support an ID, or if
1432 no ID is in use for a tunnel type that has an optional ID, or
1433 if an ID of zero is received, or if the packet was not received
1434 over a tunnel.
1435 </p>
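
<p>
  As an illustration of the zero extension, the following Python sketch
  places a 24-bit VNI into the least significant bits of the field. It
  assumes a 64-bit field, the size of the largest ID listed above; the
  function names are hypothetical.
</p>

```python
def tun_id_from_vni(vni_bytes: bytes) -> int:
    """Interpret a 3-byte VNI in network byte order as an integer."""
    if len(vni_bytes) != 3:
        raise ValueError("a VNI is exactly 24 bits (3 bytes)")
    return int.from_bytes(vni_bytes, "big")


def tun_id_field(tunnel_id: int) -> bytes:
    """Encode a tunnel ID as an assumed 64-bit field, zero-extended."""
    return tunnel_id.to_bytes(8, "big")


vni = tun_id_from_vni(b"\x00\x13\x89")
print(vni)                      # 5001
print(tun_id_field(vni).hex())  # 0000000000001389
```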
1436
1437 <p>
1438 When a packet is output to a tunnel port, the tunnel
1439 configuration determines whether the tunnel ID is taken from
1440 this field or bound to a fixed value. See the earlier
1441 description of ``port-based'' and ``flow-based'' tunnels for
1442 more information.
1443 </p>
1444
1445 <p>
1446 The following diagram shows the origin of this field in a
1447 typical keyed GRE tunnel:
1448 </p>
1449
1450 <diagram>
1451 <header name="Ethernet">
1452 <bits name="dst" above="48" width="0.4"/>
1453 <bits name="src" above="48" width="0.4"/>
1454 <bits name="type" above="16" below="0x800" width="0.4"/>
1455 </header>
1456 <header name="IPv4">
1457 <bits name="..." width="0.4"/>
1458 <bits name="proto" above="8" below="47" width="0.4"/>
1459 <bits name="src" above="32" width="0.4"/>
1460 <bits name="dst" above="32" width="0.4"/>
1461 </header>
1462 <header name="GRE">
1463 <bits name="..." above="16" width="0.4"/>
1464 <bits name="type" above="16" below="0x6558" width="0.4"/>
1465 <bits name="key" above="32" width=".4" fill="yes"/>
1466 </header>
1467 <header name="Ethernet">
1468 <bits name="dst" above="48" width="0.4"/>
1469 <bits name="src" above="48" width="0.4"/>
1470 <bits name="type" above="16" width="0.4"/>
1471 </header>
1472 <dots/>
1473 </diagram>
1474 </field>
1475
1476 <field id="MFF_TUN_SRC" title="Tunnel IPv4 Source">
1477 <p>
1478 When a packet is received from a tunnel, this field is the
1479 source address in the outer IP header of the tunneled packet.
1480 This field is zero if the packet was not received over a
1481 tunnel.
1482 </p>
1483
1484 <p>
1485 When a packet is output to a flow-based tunnel port, this
1486 field influences the IPv4 source address used to send the
1487 packet. If it is zero, then the kernel chooses an appropriate
1488 IP address using the routing table.
1489 </p>
1490
1491 <p>
1492 The following diagram shows the origin of this field in a
1493 typical keyed GRE tunnel:
1494 </p>
1495
1496 <diagram>
1497 <header name="Ethernet">
1498 <bits name="dst" above="48" width="0.4"/>
1499 <bits name="src" above="48" width="0.4"/>
1500 <bits name="type" above="16" below="0x800" width="0.4"/>
1501 </header>
1502 <header name="IPv4">
1503 <bits name="..." width="0.4"/>
1504 <bits name="proto" above="8" below="47" width="0.4"/>
1505 <bits name="src" above="32" width="0.4" fill="yes"/>
1506 <bits name="dst" above="32" width="0.4"/>
1507 </header>
1508 <header name="GRE">
1509 <bits name="..." above="16" width="0.4"/>
1510 <bits name="type" above="16" below="0x6558" width="0.4"/>
1511 <bits name="key" above="32" width=".4"/>
1512 </header>
1513 <header name="Ethernet">
1514 <bits name="dst" above="48" width="0.4"/>
1515 <bits name="src" above="48" width="0.4"/>
1516 <bits name="type" above="16" width="0.4"/>
1517 </header>
1518 <dots/>
1519 </diagram>
1520 </field>
1521
1522 <field id="MFF_TUN_DST" title="Tunnel IPv4 Destination">
1523 <p>
1524 When a packet is received from a tunnel, this field is the
1525 destination address in the outer IP header of the tunneled
1526 packet. This field is zero if the packet was not received
1527 over a tunnel.
1528 </p>
1529
1530 <p>
1531 When a packet is output to a flow-based tunnel port, this
1532 field specifies the destination to which the tunnel packet is
1533 sent.
1534 </p>
1535
1536 <p>
1537 The following diagram shows the origin of this field in a
1538 typical keyed GRE tunnel:
1539 </p>
1540
1541 <diagram>
1542 <header name="Ethernet">
1543 <bits name="dst" above="48" width="0.4"/>
1544 <bits name="src" above="48" width="0.4"/>
1545 <bits name="type" above="16" below="0x800" width="0.4"/>
1546 </header>
1547 <header name="IPv4">
1548 <bits name="..." width="0.4"/>
1549 <bits name="proto" above="8" below="47" width="0.4"/>
1550 <bits name="src" above="32" width="0.4"/>
1551 <bits name="dst" above="32" width="0.4" fill="yes"/>
1552 </header>
1553 <header name="GRE">
1554 <bits name="..." above="16" width="0.4"/>
1555 <bits name="type" above="16" below="0x6558" width="0.4"/>
1556 <bits name="key" above="32" width=".4"/>
1557 </header>
1558 <header name="Ethernet">
1559 <bits name="dst" above="48" width="0.4"/>
1560 <bits name="src" above="48" width="0.4"/>
1561 <bits name="type" above="16" width="0.4"/>
1562 </header>
1563 <dots/>
1564 </diagram>
1565 </field>
1566
1567 <field id="MFF_TUN_IPV6_SRC" title="Tunnel IPv6 Source">
1568 Similar to <ref field="tun_src"/>, but for tunnels over IPv6.
1569 </field>
1570
1571 <field id="MFF_TUN_IPV6_DST" title="Tunnel IPv6 Destination">
1572 Similar to <ref field="tun_dst"/>, but for tunnels over IPv6.
1573 </field>
1574
1575 <h2>VXLAN Group-Based Policy Fields</h2>
1576
1577 <p>
1578 The VXLAN header is defined as follows [RFC 7348], where the
1579 <code>I</code> bit must be set to 1, unlabeled bits or those labeled
1580 <code>reserved</code> must be set to 0, and Open vSwitch makes the VNI
1581 available via <ref field="tun_id"/>:
1582 </p>
1583
1584 <diagram>
1585 <header name="VXLAN flags">
1586 <bits name="" above="1" width="0.15"/>
1587 <bits name="" above="1" width="0.15"/>
1588 <bits name="" above="1" width="0.15"/>
1589 <bits name="" above="1" width="0.15"/>
1590 <bits name="I" above="1" width="0.15"/>
1591 <bits name="" above="1" width="0.15"/>
1592 <bits name="" above="1" width="0.15"/>
1593 <bits name="" above="1" width="0.15"/>
1594 </header>
1595 <nospace/>
1596 <header>
1597 <bits name="reserved" above="24" width="1.2"/>
1598 <bits name="VNI" above="24" width="1.2"/>
1599 <bits name="reserved" above="8" width=".5"/>
1600 </header>
1601 </diagram>
1602
1603 <p>
1604 VXLAN Group-Based Policy [VXLAN Group Policy Option] adds new
1605 interpretations to existing bits in the VXLAN header, reinterpreting it
1606 as follows, with changes highlighted:
1607 </p>
1608
1609 <diagram>
1610 <header name="GBP flags">
1611 <bits name="" above="1" width="0.15"/>
1612 <bits name="D" above="1" width="0.15" fill="yes"/>
1613 <bits name="" above="1" width="0.15"/>
1614 <bits name="" above="1" width="0.15"/>
1615 <bits name="A" above="1" width="0.15" fill="yes"/>
1616 <bits name="" above="1" width="0.15"/>
1617 <bits name="" above="1" width="0.15"/>
1618 <bits name="" above="1" width="0.15"/>
1619 </header>
1620 <nospace/>
1621 <header>
1622 <bits name="group policy ID" above="24" width="1.2" fill="yes"/>
1623 <bits name="VNI" above="24" width="1.2"/>
1624 <bits name="reserved" above="8" width=".5"/>
1625 </header>
1626 </diagram>
1627
1628 <p>
1629 Open vSwitch makes GBP fields and flags available through the following
1630 fields. Only packets that arrive over a VXLAN tunnel with the GBP
1631 extension enabled have these fields set. In other packets they are zero
1632 on receive and ignored on transmit.
1633 </p>
1634
1635 <field id="MFF_TUN_GBP_ID" title="VXLAN Group-Based Policy ID">
1636 <p>
1637 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1638 extension, this field represents the group policy ID, as shown above.
1639 </p>
1640 </field>
1641
1642 <field id="MFF_TUN_GBP_FLAGS" title="VXLAN Group-Based Policy Flags">
1643 <p>
1644 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1645 extension, this field represents the GBP flags, as shown above.
1646 </p>
1647
1648 <p>
1649 The field has the format shown below:
1650 </p>
1651
1652 <diagram>
1653 <header name="GBP Flags">
1654 <bits name="" above="1" width="0.15"/>
1655 <bits name="D" above="1" width="0.15"/>
1656 <bits name="" above="1" width="0.15"/>
1657 <bits name="" above="1" width="0.15"/>
1658 <bits name="A" above="1" width="0.15"/>
1659 <bits name="" above="1" width="0.15"/>
1660 <bits name="" above="1" width="0.15"/>
1661 <bits name="" above="1" width="0.15"/>
1662 </header>
1663 </diagram>
1664
1665 <p>
1666 Unlabeled bits are reserved and must be transmitted as 0. The VXLAN
1667 GBP draft defines the other bits' meanings as:
1668 </p>
1669
1670 <dl>
1671 <dt><code>D</code> (Don't Learn)</dt>
1672 <dd>
1673 When set, this bit indicates that the egress tunnel endpoint must not
1674 learn the source address of the encapsulated frame.
1675 </dd>
1676
1677 <dt><code>A</code> (Applied)</dt>
1678 <dd>
1679 When set, indicates that the group policy has already been applied to
1680 this packet. Devices must not apply policies when the A bit is set.
1681 </dd>
1682 </dl>
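<p>
As a rough illustration, the following Python sketch decodes a GBP flags
value. It assumes that the diagram above is drawn most-significant bit
first, so that <code>D</code> is <code>0x40</code> and <code>A</code> is
<code>0x08</code>. The function names are hypothetical and are not part of
Open vSwitch.
</p>

<pre><![CDATA[
```python
# Bit values assume the diagram above is drawn most-significant bit first.
GBP_FLAG_DONT_LEARN = 0x40  # "D" (assumed position)
GBP_FLAG_APPLIED = 0x08     # "A" (assumed position)


def decode_gbp_flags(value):
    """Return the names of the GBP flags set in an 8-bit flags value."""
    if not 0 <= value <= 0xFF:
        raise ValueError("GBP flags are an 8-bit field")
    names = []
    if value & GBP_FLAG_DONT_LEARN:
        names.append("D")
    if value & GBP_FLAG_APPLIED:
        names.append("A")
    return names


def has_reserved_bits(value):
    """Return True if any unlabeled (reserved) bit is set."""
    known = GBP_FLAG_DONT_LEARN | GBP_FLAG_APPLIED
    return bool(value & ~known & 0xFF)
```
]]></pre>

<p>
A sender following the rule above would only transmit values for which
<code>has_reserved_bits</code> returns false.
</p>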
1683 </field>
1684
1685 <h2>Geneve Fields</h2>
1686
1687 <p>
1688 These fields provide access to additional features in the Geneve
1689 tunneling protocol [Geneve]. Their names are somewhat generic in the
1690 hope that the same fields could be reused for other protocols in the
1691 future; for example, the NSH protocol [NSH] supports TLV options whose
1692 form is identical to that for Geneve options.
1693 </p>
1694
1695 <field id="MFF_TUN_METADATA0" title="Generic Tunnel Option 0">
1696 <p>
1697 The above information specifically covers generic tunnel option 0, but
1698 Open vSwitch supports 64 options, numbered 0 through 63, whose
1699 NXM field numbers are 40 through 103.
1700 </p>
1701
1702 <p>
1703 These fields provide OpenFlow access to the generic type-length-value
1704 options defined by the Geneve tunneling protocol or other protocols
1705 with options in the same TLV format as Geneve options. Each of these
1706 options has the following wire format:
1707 </p>
1708
1709 <diagram>
1710 <header name="header">
1711 <bits name="class" above="16" width="0.6"/>
1712 <bits name="type" above="8" width="0.5"/>
1713 <bits name="res" above="3" below="0" width="0.25"/>
1714 <bits name="length" above="5" width="0.4"/>
1715 </header>
1716 <nospace/>
1717 <header name="body">
1718 <bits name="value" above="4×length bytes" width="1.7"/>
1719 </header>
1720 </diagram>
1721
1722 <p>
1723 Taken together, the <code>class</code> and <code>type</code> in the
1724 option format mean that there are about 16 million distinct kinds of
1725 TLV options, too many to give individual OXM code points. Thus, Open
1726 vSwitch requires the user to define the TLV options of interest, by
1727 binding up to 64 TLV options to generic tunnel option NXM code points.
1728 Each option may have up to 124 bytes in its body, the maximum allowed
1729 by the TLV format, but bound options may total at most 252 bytes of
1730 body.
1731 </p>
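<p>
The wire format can be sketched in Python as follows. This is only an
illustration, not Open vSwitch code: the function names are hypothetical,
and the sketch assumes that the 5-bit <code>length</code> counts the option
body in 4-byte units, which is what makes 124 bytes the largest possible
body.
</p>

<pre><![CDATA[
```python
import struct

MAX_BODY_LEN = 124  # 31 four-byte words, the largest 5-bit length


def pack_option(opt_class, opt_type, body):
    """Build one TLV option: a 4-byte header followed by the body."""
    if len(body) % 4 or len(body) > MAX_BODY_LEN:
        raise ValueError("body must be a multiple of 4 bytes, at most 124")
    length_words = len(body) // 4  # low 5 bits; the 3 "res" bits stay 0
    return struct.pack("!HBB", opt_class, opt_type, length_words) + body


def parse_option(data):
    """Split one TLV option off the front of data.

    Returns (class, type, body, remaining bytes).
    """
    opt_class, opt_type, res_len = struct.unpack_from("!HBB", data)
    body_len = 4 * (res_len & 0x1F)  # mask off the reserved bits
    body = data[4:4 + body_len]
    if len(body) != body_len:
        raise ValueError("truncated option")
    return opt_class, opt_type, body, data[4 + body_len:]
```
]]></pre>

<p>
For example, <code>pack_option(0xffff, 0, (1234).to_bytes(4, "big"))</code>
builds an option with <code>class</code> <code>0xffff</code>,
<code>type</code> <code>0</code>, and a 4-byte body holding the value 1234.
</p>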
1732
1733 <p>
1734 Open vSwitch extensions to the OpenFlow protocol bind TLV options to
1735 NXM code points. The <code>ovs-ofctl</code>(8) program offers one way
1736 to use these extensions, e.g. to configure a mapping from a TLV option
1737 with <code>class</code> <code>0xffff</code>, <code>type</code>
1738 <code>0</code>, and a body length of 4 bytes:
1739 </p>
1740
1741 <pre>
1742 ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"
1743 </pre>
1744
1745 <p>
1746 Once a TLV option is properly bound, it can be accessed and modified
1747 like any other field, e.g. to send packets that have value 1234 for the
1748 option described above to the controller:
1749 </p>
1750
1751 <pre>
1752 ovs-ofctl add-flow br0 tun_metadata0=1234,actions=controller
1753 </pre>
1754
1755 <p>
1756 An option not received or not bound is matched as all zeros.
1757 </p>
1758 </field>
1759 <!--- XXX need a way to define a range of OXMs -->
1760 <field id="MFF_TUN_METADATA1" title="Generic Tunnel Option 1" hidden="yes"/>
1761 <field id="MFF_TUN_METADATA2" title="Generic Tunnel Option 2" hidden="yes"/>
1762 <field id="MFF_TUN_METADATA3" title="Generic Tunnel Option 3" hidden="yes"/>
1763 <field id="MFF_TUN_METADATA4" title="Generic Tunnel Option 4" hidden="yes"/>
1764 <field id="MFF_TUN_METADATA5" title="Generic Tunnel Option 5" hidden="yes"/>
1765 <field id="MFF_TUN_METADATA6" title="Generic Tunnel Option 6" hidden="yes"/>
1766 <field id="MFF_TUN_METADATA7" title="Generic Tunnel Option 7" hidden="yes"/>
1767 <field id="MFF_TUN_METADATA8" title="Generic Tunnel Option 8" hidden="yes"/>
1768 <field id="MFF_TUN_METADATA9" title="Generic Tunnel Option 9" hidden="yes"/>
1769 <field id="MFF_TUN_METADATA10" title="Generic Tunnel Option 10" hidden="yes"/>
1770 <field id="MFF_TUN_METADATA11" title="Generic Tunnel Option 11" hidden="yes"/>
1771 <field id="MFF_TUN_METADATA12" title="Generic Tunnel Option 12" hidden="yes"/>
1772 <field id="MFF_TUN_METADATA13" title="Generic Tunnel Option 13" hidden="yes"/>
1773 <field id="MFF_TUN_METADATA14" title="Generic Tunnel Option 14" hidden="yes"/>
1774 <field id="MFF_TUN_METADATA15" title="Generic Tunnel Option 15" hidden="yes"/>
1775 <field id="MFF_TUN_METADATA16" title="Generic Tunnel Option 16" hidden="yes"/>
1776 <field id="MFF_TUN_METADATA17" title="Generic Tunnel Option 17" hidden="yes"/>
1777 <field id="MFF_TUN_METADATA18" title="Generic Tunnel Option 18" hidden="yes"/>
1778 <field id="MFF_TUN_METADATA19" title="Generic Tunnel Option 19" hidden="yes"/>
1779 <field id="MFF_TUN_METADATA20" title="Generic Tunnel Option 20" hidden="yes"/>
1780 <field id="MFF_TUN_METADATA21" title="Generic Tunnel Option 21" hidden="yes"/>
1781 <field id="MFF_TUN_METADATA22" title="Generic Tunnel Option 22" hidden="yes"/>
1782 <field id="MFF_TUN_METADATA23" title="Generic Tunnel Option 23" hidden="yes"/>
1783 <field id="MFF_TUN_METADATA24" title="Generic Tunnel Option 24" hidden="yes"/>
1784 <field id="MFF_TUN_METADATA25" title="Generic Tunnel Option 25" hidden="yes"/>
1785 <field id="MFF_TUN_METADATA26" title="Generic Tunnel Option 26" hidden="yes"/>
1786 <field id="MFF_TUN_METADATA27" title="Generic Tunnel Option 27" hidden="yes"/>
1787 <field id="MFF_TUN_METADATA28" title="Generic Tunnel Option 28" hidden="yes"/>
1788 <field id="MFF_TUN_METADATA29" title="Generic Tunnel Option 29" hidden="yes"/>
1789 <field id="MFF_TUN_METADATA30" title="Generic Tunnel Option 30" hidden="yes"/>
1790 <field id="MFF_TUN_METADATA31" title="Generic Tunnel Option 31" hidden="yes"/>
1791 <field id="MFF_TUN_METADATA32" title="Generic Tunnel Option 32" hidden="yes"/>
1792 <field id="MFF_TUN_METADATA33" title="Generic Tunnel Option 33" hidden="yes"/>
1793 <field id="MFF_TUN_METADATA34" title="Generic Tunnel Option 34" hidden="yes"/>
1794 <field id="MFF_TUN_METADATA35" title="Generic Tunnel Option 35" hidden="yes"/>
1795 <field id="MFF_TUN_METADATA36" title="Generic Tunnel Option 36" hidden="yes"/>
1796 <field id="MFF_TUN_METADATA37" title="Generic Tunnel Option 37" hidden="yes"/>
1797 <field id="MFF_TUN_METADATA38" title="Generic Tunnel Option 38" hidden="yes"/>
1798 <field id="MFF_TUN_METADATA39" title="Generic Tunnel Option 39" hidden="yes"/>
1799 <field id="MFF_TUN_METADATA40" title="Generic Tunnel Option 40" hidden="yes"/>
1800 <field id="MFF_TUN_METADATA41" title="Generic Tunnel Option 41" hidden="yes"/>
1801 <field id="MFF_TUN_METADATA42" title="Generic Tunnel Option 42" hidden="yes"/>
1802 <field id="MFF_TUN_METADATA43" title="Generic Tunnel Option 43" hidden="yes"/>
1803 <field id="MFF_TUN_METADATA44" title="Generic Tunnel Option 44" hidden="yes"/>
1804 <field id="MFF_TUN_METADATA45" title="Generic Tunnel Option 45" hidden="yes"/>
1805 <field id="MFF_TUN_METADATA46" title="Generic Tunnel Option 46" hidden="yes"/>
1806 <field id="MFF_TUN_METADATA47" title="Generic Tunnel Option 47" hidden="yes"/>
1807 <field id="MFF_TUN_METADATA48" title="Generic Tunnel Option 48" hidden="yes"/>
1808 <field id="MFF_TUN_METADATA49" title="Generic Tunnel Option 49" hidden="yes"/>
1809 <field id="MFF_TUN_METADATA50" title="Generic Tunnel Option 50" hidden="yes"/>
1810 <field id="MFF_TUN_METADATA51" title="Generic Tunnel Option 51" hidden="yes"/>
1811 <field id="MFF_TUN_METADATA52" title="Generic Tunnel Option 52" hidden="yes"/>
1812 <field id="MFF_TUN_METADATA53" title="Generic Tunnel Option 53" hidden="yes"/>
1813 <field id="MFF_TUN_METADATA54" title="Generic Tunnel Option 54" hidden="yes"/>
1814 <field id="MFF_TUN_METADATA55" title="Generic Tunnel Option 55" hidden="yes"/>
1815 <field id="MFF_TUN_METADATA56" title="Generic Tunnel Option 56" hidden="yes"/>
1816 <field id="MFF_TUN_METADATA57" title="Generic Tunnel Option 57" hidden="yes"/>
1817 <field id="MFF_TUN_METADATA58" title="Generic Tunnel Option 58" hidden="yes"/>
1818 <field id="MFF_TUN_METADATA59" title="Generic Tunnel Option 59" hidden="yes"/>
1819 <field id="MFF_TUN_METADATA60" title="Generic Tunnel Option 60" hidden="yes"/>
1820 <field id="MFF_TUN_METADATA61" title="Generic Tunnel Option 61" hidden="yes"/>
1821 <field id="MFF_TUN_METADATA62" title="Generic Tunnel Option 62" hidden="yes"/>
1822 <field id="MFF_TUN_METADATA63" title="Generic Tunnel Option 63" hidden="yes"/>
1823
1824 <field id="MFF_TUN_FLAGS" title="Tunnel Flags">
1825 <p>
1826 Flags indicating various aspects of the tunnel encapsulation.
1827 </p>
1828
1829 <p>
1830 Matches on this field are most conveniently written in terms of
1831 symbolic names (listed below), each preceded by either
1832 <code>+</code> for a flag that must be set, or <code>-</code> for a
1833 flag that must be unset, without any other delimiters between the
1834 flags. Flags not mentioned are wildcarded. For example,
1835 <code>tun_flags=+oam</code> matches only OAM packets. Matches can also
1836 be written as <code><var>flags</var>/<var>mask</var></code>, where
1837 <var>flags</var> and <var>mask</var> are 16-bit numbers in decimal or
1838 in hexadecimal prefixed by <code>0x</code>.
1839 </p>
1840
1841 <p>
1842 Currently, only one flag is defined:
1843 </p>
1844
1845 <dl>
1846 <dt><code>oam</code></dt>
1847 <dd>
1848 The tunnel protocol indicated that this is an OAM (Operations,
1849 Administration, and Maintenance) control packet.
1850 </dd>
1851 </dl>
1852
1853 <p>
1854 The switch may reject matches against unknown flags.
1855 </p>
1856
1857 <p>
1858 Newer versions of Open vSwitch may introduce additional flags with new
1859 meanings. Exact matches on this field are therefore not recommended:
1860 the meaning of future flags is unknown, so flows should wildcard every
1861 flag that they do not explicitly test.
1862 </p>
1863
1864 <p>
1865 For non-tunneled packets, the value is 0.
1866 </p>
1867 </field>
1868
1869 <!-- Open vSwitch uses the following fields internally, but it
1870 does not expose them to the user via OpenFlow, so we do not
1871 document them. -->
1872 <field id="MFF_TUN_TTL" title="Tunnel IPv4 Time-to-Live" internal="yes"/>
1873 <field id="MFF_TUN_TOS" title="Tunnel IPv4 Type of Service" internal="yes"/>
1874 </group>
1875
1876 <group title="Metadata">
1877 <p>
1878 These fields relate to the origin or treatment of a packet, but
1879 they are not extracted from the packet data itself.
1880 </p>
1881
1882 <field id="MFF_IN_PORT" title="Ingress Port">
1883 <p>
1884 The OpenFlow port on which the packet being processed arrived.
1885 This is a 16-bit field that holds an OpenFlow 1.0 port number.
1886 When a packet is received, the only values that appear in this
1887 field are:
1888 </p>
1889
1890 <dl>
1891 <dt>1 through <code>0xfeff</code> (65,279), inclusive.</dt>
1892 <dd>
1893 Conventional OpenFlow port numbers.
1894 </dd>
1895
1896 <dt><code>OFPP_LOCAL</code> (<code>0xfffe</code> or 65,534).</dt>
1897 <dd>
1898 <p>
1899 The ``local'' port, which in Open vSwitch is always named
1900 the same as the bridge itself. This represents a
1901 connection between the switch and the local TCP/IP stack.
1902 This port is where an IP address is most commonly
1903 configured on an Open vSwitch switch.
1904 </p>
1905
1906 <p>
1907 OpenFlow does not require a switch to have a local port,
1908 but all existing versions of Open vSwitch have always
1909 included a local port. <b>Future Directions:</b> Future
1910 versions of Open vSwitch might be able to optionally omit
1911 the local port, if someone submits code to implement such
1912 a feature.
1913 </p>
1914 </dd>
1915
1916 <dt><code>OFPP_NONE</code> (OpenFlow 1.0) or <code>OFPP_ANY</code> (OpenFlow 1.1+) (<code>0xffff</code> or 65,535).</dt>
1917 <dt><code>OFPP_CONTROLLER</code> (<code>0xfffd</code> or 65,533).</dt>
1918 <dd>
1919 <p>
1920 When a controller injects a packet into an OpenFlow switch
1921 with a ``packet-out'' request, it can specify one of these
1922 ingress ports to indicate that the packet was generated
1923 internally rather than having been received on some port.
1924 </p>
1925
1926 <p>
1927 OpenFlow 1.0 specified <code>OFPP_NONE</code> for this
1928 purpose. Despite that, some controllers used
1929 <code>OFPP_CONTROLLER</code>, and some switches only
1930 accepted <code>OFPP_CONTROLLER</code>, so OpenFlow 1.0.2
1931 required support for both ports. OpenFlow 1.1 and later
1932 were more clearly drafted to allow only
1933 <code>OFPP_CONTROLLER</code>. For maximum compatibility,
1934 Open vSwitch allows both ports with all OpenFlow versions.
1935 </p>
1936 </dd>
1937 </dl>
1938
1939 <p>
1940 Values not mentioned above will never appear when receiving a
1941 packet, including the following notable values:
1942 </p>
1943
1944 <dl>
1945 <dt>0</dt>
1946 <dd>
1947 Zero is not a valid OpenFlow port number.
1948 </dd>
1949
1950 <dt><code>OFPP_MAX</code> (<code>0xff00</code> or 65,280).</dt>
1951 <dd>
1952 This value has only been clearly specified as a valid port
1953 number as of OpenFlow 1.3.3. Because its status was unclear
1954 before that, Open vSwitch has never allowed
1955 <code>OFPP_MAX</code> to be used as a port number, so
1956 packets will never be received on this port. (Other
1957 OpenFlow switches, of course, might use it.)
1958 </dd>
1959
1960 <dt><code>OFPP_UNSET</code> (<code>0xfff7</code> or 65,527)</dt>
1961 <dt><code>OFPP_IN_PORT</code> (<code>0xfff8</code> or 65,528)</dt>
1962 <dt><code>OFPP_TABLE</code> (<code>0xfff9</code> or 65,529)</dt>
1963 <dt><code>OFPP_NORMAL</code> (<code>0xfffa</code> or 65,530)</dt>
1964 <dt><code>OFPP_FLOOD</code> (<code>0xfffb</code> or 65,531)</dt>
1965 <dt><code>OFPP_ALL</code> (<code>0xfffc</code> or 65,532)</dt>
1966 <dd>
1967 <p>
1968 These port numbers are used only in output actions and never
1969 appear as ingress ports.
1970 </p>
1971
1972 <p>
1973 Most of these port numbers were defined in OpenFlow 1.0, but
1974 <code>OFPP_UNSET</code> was only introduced in OpenFlow 1.5.
1975 </p>
1976 </dd>
1977 </dl>
1978
1979 <p>
1980 Flows may nonetheless match on values that never appear when
1981 receiving a packet. Such flows can still be hit in the following
1982 circumstances:
1983 </p>
1984
1985 <ul>
1986 <li>
1987 The <code>resubmit</code> Open vSwitch extension action allows a
1988 flow table lookup with an arbitrary ingress port.
1989 </li>
1990
1991 <li>
1992 An action that modifies the ingress port field (see below),
1993 such as <code>load</code> or <code>set_field</code>,
1994 followed by an action or instruction that performs another
1995 flow table lookup, such as <code>resubmit</code> or
1996 <code>goto_table</code>.
1997 </li>
1998 </ul>
1999
2000 <p>
2001 This field is heavily used for matching in OpenFlow tables,
2002 but for packet egress, it has only very limited roles:
2003 </p>
2004
2005 <ul>
2006 <li>
2007 <p>
2008 OpenFlow requires suppressing output actions to <ref
2009 field="in_port"/>. That is, the following two flows both drop all
2010 packets that arrive on port 1:
2011 </p>
2012
2013 <pre>
2014 in_port=1,actions=1
2015 in_port=1,actions=drop
2016 </pre>
2017
2018 <p>
2019 (This behavior is occasionally useful for flooding to a
2020 subset of ports. Specifying <code>actions=1,2,3,4</code>,
2021 for example, outputs to ports 1, 2, 3, and 4, omitting the
2022 ingress port.)
2023 </p>
2024 </li>
2025
2026 <li>
2027 OpenFlow has a special port <code>OFPP_IN_PORT</code> (with
2028 value 0xfff8) that outputs to the ingress port. For example,
2029 in a switch that has four ports numbered 1 through 4,
2030 <code>actions=1,2,3,4,in_port</code> outputs to ports 1, 2,
2031 3, and 4, including the ingress port.
2032 </li>
2033 </ul>
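<p>
The two egress rules above can be modeled with a few lines of Python. This
is a simplified sketch rather than the Open vSwitch implementation: the
function name is hypothetical, and it handles only plain port outputs and
the special <code>in_port</code> output.
</p>

<pre><![CDATA[
```python
def output_ports(in_port, actions):
    """Return the ports a packet is sent to, in action order.

    actions is a list whose items are port numbers or the string "in_port".
    """
    out = []
    for action in actions:
        if action == "in_port":
            out.append(in_port)  # explicit request to use the ingress port
        elif action != in_port:
            out.append(action)   # plain outputs skip the ingress port
    return out
```
]]></pre>

<p>
With this model, a packet that arrives on port 1 and hits
<code>actions=1</code> is sent nowhere, which has the same effect as
<code>actions=drop</code>.
</p>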
2034
2035 <p>
2036 Because the ingress port field has so little influence on packet
2037 processing, it does not ordinarily make sense to modify the
2038 ingress port field. The field is writable only to support the
2039 occasional use case where the ingress port's roles in packet
2040 egress, described above, become troublesome. For example,
2041 <code>actions=load:0-&gt;NXM_OF_IN_PORT[],output:123</code>
2042 will output to port 123 regardless of whether it is the
2043 ingress port. If the ingress port is important, then one may save
2044 and restore it on the stack:
2045 </p>
2046
2047 <pre>
2048 actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],output:123,pop:NXM_OF_IN_PORT[]
2049 </pre>
2050
2051 <p>
2052 or, in Open vSwitch 2.7 or later, use the <code>clone</code> action to
2053 save and restore it:
2054 </p>
2055
2056 <pre>
2057 actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
2058 </pre>
2059
2060 <p>
2061 The ability to modify the ingress port is an Open vSwitch
2062 extension to OpenFlow.
2063 </p>
2064 </field>
2065
2066 <field id="MFF_IN_PORT_OXM" title="OXM Ingress Port">
2067 <p>
2068 OpenFlow 1.1 and later use a 32-bit port number, so this field
2069 supplies a 32-bit view of the ingress port. Current versions of
2070 Open vSwitch support only a 16-bit range of ports:
2071 </p>
2072
2073 <ul>
2074 <li>
2075 OpenFlow 1.0 ports <code>0x0000</code> to
2076 <code>0xfeff</code>, inclusive, map to OpenFlow 1.1
2077 port numbers with the same values.
2078 </li>
2079
2080 <li>
2081 OpenFlow 1.0 ports <code>0xff00</code> to
2082 <code>0xffff</code>, inclusive, map to OpenFlow 1.1 port
2083 numbers <code>0xffffff00</code> to <code>0xffffffff</code>.
2084 </li>
2085
2086 <li>
2087 OpenFlow 1.1 ports <code>0x0000ff00</code> to
2088 <code>0xfffffeff</code> are not mapped and not supported.
2089 </li>
2090 </ul>
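<p>
The mapping above can be written as a pair of small conversion functions.
The following Python sketch uses hypothetical names and is not taken from
the Open vSwitch sources. It rejects the OpenFlow 1.1 values that the list
above describes as unsupported.
</p>

<pre><![CDATA[
```python
OFPP_MAX = 0xFF00          # first 16-bit port number that is remapped
OFP11_OFFSET = 0xFFFF0000  # 0xffffff00 - 0xff00


def ofp10_to_ofp11_port(port10):
    """Map a 16-bit OpenFlow 1.0 port number to its 32-bit form."""
    if not 0 <= port10 <= 0xFFFF:
        raise ValueError("not a 16-bit port number")
    return port10 + OFP11_OFFSET if port10 >= OFPP_MAX else port10


def ofp11_to_ofp10_port(port11):
    """Map a 32-bit OpenFlow 1.1 port number back to its 16-bit form."""
    if 0 <= port11 < OFPP_MAX:
        return port11
    if 0xFFFFFF00 <= port11 <= 0xFFFFFFFF:
        return port11 - OFP11_OFFSET
    raise ValueError("port number outside the supported 16-bit range")
```
]]></pre>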
2091
2092 <p>
2093 <ref field="in_port"/> and <ref field="in_port_oxm"/> are two views of
2094 the same information, so all of the comments on <ref field="in_port"/>
2095 apply to <ref field="in_port_oxm"/> too. Modifying <ref
2096 field="in_port"/> changes <ref field="in_port_oxm"/>, and vice versa.
2097 </p>
2098
2099 <p>
2100 Setting <ref field="in_port_oxm"/> to an unsupported value yields
2101 unspecified behavior.
2102 </p>
2103 </field>
2104
2105 <field id="MFF_SKB_PRIORITY" title="Output Queue">
2106 <p>
2107 <b>Future Directions:</b> Open vSwitch implements the output queue as a
2108 field, but does not currently expose it through OXM or NXM for matching
2109 purposes. If this turns out to be a useful feature, it could be
2110 implemented in future versions. Only the <code>set_queue</code>,
2111 <code>enqueue</code>, and <code>pop_queue</code> actions currently
2112 influence the output queue.
2113 </p>
2114
2115 <p>
2116 This field influences how packets in the flow will be queued,
2117 for quality of service (QoS) purposes, when they egress the
2118 switch. Its range of meaningful values, and their meanings,
2119 varies greatly from one OpenFlow implementation to another.
2120 Even within a single implementation, there is no guarantee
2121 that all OpenFlow ports have the same queues configured or
2122 that all OpenFlow ports in an implementation can be configured
2123 the same way queue-wise.
2124 </p>
2125
2126 <p>
2127 Configuring queues on OpenFlow is not well standardized. On
2128 Linux, Open vSwitch supports queue configuration via OVSDB,
2129 specifically the <code>QoS</code> and <code>Queue</code>
2130 tables (see <code>ovs-vswitchd.conf.db(5)</code> for details).
2131 Ports of Open vSwitch to other platforms might require queue
2132 configuration through some separate protocol (such as a CLI).
2133 Even on Linux, Open vSwitch exposes only a fraction of the
2134 kernel's queuing features through OVSDB, so advanced or
2135 unusual uses might require use of separate utilities
2136 (e.g. <code>tc</code>). OpenFlow switches other than Open
2137 vSwitch might use OF-CONFIG or any of the configuration
2138 methods mentioned above. Finally, some OpenFlow switches have
2139 a fixed number of fixed-function queues (e.g. eight queues
2140 with strictly defined priorities) and others do not support
2141 any control over queuing.
2142 </p>
2143
2144 <p>
2145 The only output queue that all OpenFlow implementations must
2146 support is zero, to identify a default queue, whose properties
2147 are implementation-defined. Outputting a packet to a queue
2148 that does not exist on the output port yields unpredictable
2149 behavior: among the possibilities are that the packet might be
2150 dropped or transmitted with a very high or very low priority.
2151 </p>
2152
2153 <p>
2154 OpenFlow 1.0 only allowed output queues to be specified as part of an
2155 <code>enqueue</code> action that specified both a queue and an output
2156 port. That is, OpenFlow 1.0 treats the queue as an argument to an
2157 action, not as a field.
2158 </p>
2159
2160 <p>
2161 To increase flexibility, OpenFlow 1.1 added an action to set the output
2162 queue. This model was carried forward, without change, through
2163 OpenFlow 1.5.
2164 </p>
2165
2166 <p>
2167 Open vSwitch implements the native queuing model of each
2168 OpenFlow version it supports. Open vSwitch also includes an
2169 extension for setting the output queue as an action in
2170 OpenFlow 1.0.
2171 </p>
2172
2173 <p>
2174 When a packet ingresses into an OpenFlow switch, the output
2175 queue is ordinarily set to 0, indicating the default queue.
2176 However, Open vSwitch supports various ways to forward a
2177 packet from one OpenFlow switch to another within a single
2178 host. In these cases, Open vSwitch maintains the output queue
2179 across the forwarding step. For example:
2180 </p>
2181
2182 <ul>
2183 <li>
2184 A hop across an Open vSwitch ``patch port'' (which does not
2185 actually involve queuing) preserves the output queue.
2186 </li>
2187
2188 <li>
2189 <p>
2190 When a flow sets the output queue then outputs to an
2191 OpenFlow tunnel port, the encapsulation preserves the
2192 output queue. If the kernel TCP/IP stack routes the
2193 encapsulated packet directly to a physical interface, then
2194 that output honors the output queue. Alternatively, if
2195 the kernel routes the encapsulated packet to another Open
2196 vSwitch bridge, then the output queue set previously
2197 becomes the initial output queue on ingress to the second
2198 bridge and will thus be used for further output actions
2199 (unless overridden by a new ``set queue'' action).
2200 </p>
2201
2202 <p>
2203 (This description reflects the current behavior of Open
2204 vSwitch on Linux. This behavior relies on details of the
2205 Linux TCP/IP stack. It could be difficult to make ports
2206 to other operating systems behave the same way.)
2207 </p>
2208 </li>
2209 </ul>
2210 </field>
2211
2212 <field id="MFF_PKT_MARK" title="Packet Mark">
2213 <p>
2214 Packet mark comes to Open vSwitch from the Linux kernel, in
2215 which the <code>sk_buff</code> data structure that represents
2216 a packet contains a 32-bit member, <code>mark</code>, called <code>skb_mark</code> here.
2217 The value of <code>skb_mark</code> propagates along with the
2218 packet it accompanies wherever the packet goes in the kernel.
2219 It has no predefined semantics but various kernel-user
2220 interfaces can set and match on it, which makes it suitable
2221 for ``marking'' packets at one point in their handling and
2222 then acting on the mark later. With <code>iptables</code>,
2223 for example, one can mark some traffic specially at ingress
2224 and then handle that traffic differently at egress based on
2225 the marked value.
2226 </p>
2227
2228 <p>
2229 Packet mark is an attempt at a generalization of the
2230 <code>skb_mark</code> concept beyond Linux, at least through more
2231 generic naming. Like <ref field="skb_priority"/>, packet mark is
2232 preserved across forwarding steps within a machine. Unlike <ref
2233 field="skb_priority"/>, packet mark has no direct effect on packet
2234 forwarding: the value set in packet mark does not matter unless some
2235 later OpenFlow table or switch matches on packet mark, or unless the
2236 packet passes through some other kernel subsystem that has been
2237 configured to interpret packet mark in specific ways, e.g. through
2238 <code>iptables</code> configuration mentioned above.
2239 </p>
2240
2241 <p>
2242 Preserving packet mark across kernel forwarding steps relies
2243 heavily on kernel support, which ports to non-Linux operating
2244 systems may not have. Regardless of operating system support,
2245 Open vSwitch supports packet mark within a single bridge and
2246 across patch ports.
2247 </p>
2248
2249 <p>
2250 The value of packet mark when a packet ingresses into the
2251 first Open vSwitch bridge is typically zero, but it could be
2252 nonzero if its value was previously set by some kernel
2253 subsystem.
2254 </p>
2255 </field>
2256
2257 <field id="MFF_ACTSET_OUTPUT" title="Action Set Output Port">
2258 <p>
2259 Holds the output port currently in the OpenFlow action set (i.e. from
2260 an <code>output</code> action within a <code>write_actions</code>
2261 instruction). Its value is an OpenFlow port number. If there is no
2262 output port in the OpenFlow action set, or if the output port will be
2263 ignored (e.g. because there is an output group in the OpenFlow action
2264 set), then the value will be <code>OFPP_UNSET</code>.
2265 </p>
2266
2267 <p>
2268 Open vSwitch allows any table to match this field. OpenFlow, however,
2269 only requires this field to be matchable from within an OpenFlow egress
2270 table (a feature that Open vSwitch does not yet implement).
2271 </p>
2272 </field>
2273
2274 <field id="MFF_DP_HASH" title="Datapath Hash" internal="yes"/>
2275 <field id="MFF_RECIRC_ID" title="Datapath Recirculation ID" internal="yes"/>
2276 </group>
2277
2278 <group title="Connection Tracking">
2279 <p>
2280 Open vSwitch 2.5 and later support ``connection tracking,'' which allows
2281 bidirectional streams of packets to be statefully grouped into
2282 connections. Open vSwitch connection tracking, for example, identifies
2283 the patterns of TCP packets that indicate a successfully initiated
2284 connection, as well as those that indicate that a connection has been
2285 torn down. Open vSwitch connection tracking can also identify related
2286 connections, such as FTP data connections spawned from FTP control
2287 connections.
2288 </p>
2289
2290 <p>
2291 An individual packet passing through the pipeline may be in one of two
2292 states, ``untracked'' or ``tracked,'' which may be distinguished via the
2293 ``trk'' flag in <ref field="ct_state"/>. A packet is
2294 <dfn>untracked</dfn> at the beginning of the Open vSwitch pipeline and
2295 continues to be untracked until the pipeline invokes the <code>ct</code>
2296 action. The connection tracking fields are all zeroes in an untracked
2297 packet. When a flow in the Open vSwitch pipeline invokes the
2298 <code>ct</code> action, the action initializes the connection tracking
2299 fields and the packet becomes <dfn>tracked</dfn> for the remainder of its
2300 processing.
2301 </p>
2302
2303 <p>
2304 The connection tracker stores connection state in an internal table, but
2305 it only adds a new entry to this table when the pipeline invokes the
2306 <code>ct</code> action with the <code>commit</code> parameter for a new
2307 connection. For a given connection, when a pipeline has executed
2308 <code>ct</code>, but not yet with <code>commit</code>, the connection is
2309 said to be <dfn>uncommitted</dfn>. State for an uncommitted connection
2310 is ephemeral and does not persist past the end of the pipeline, so some
2311 features are only available to committed connections. A connection would
2312 typically be left uncommitted as a way to drop its packets.
2313 </p>
2314
2315 <p>
2316 Connection tracking is an Open vSwitch extension to OpenFlow.
2317 </p>
2318
2319 <field id="MFF_CT_STATE" title="Connection Tracking State">
2320 <p>
2321 This field holds several flags that can be used to determine the state
2322 of the connection to which the packet belongs.
2323 </p>
2324
2325 <p>
2326 Matches on this field are most conveniently written in terms of
2327 symbolic names (listed below), each preceded by either <code>+</code>
2328 for a flag that must be set, or <code>-</code> for a flag that must be
2329 unset, without any other delimiters between the flags. Flags not
2330 mentioned are wildcarded. For example,
2331 <code>tcp,ct_state=+trk-new</code> matches TCP packets that have been
2332 run through the connection tracker and do not establish a new
2333 connection. Matches can also be written as
2334 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
2335 and <var>mask</var> are 32-bit numbers in decimal or in hexadecimal
2336 prefixed by <code>0x</code>.
2337 </p>
2338
2339 <p>
2340 The following flags are defined:
2341 </p>
2342
2343 <dl>
2344 <dt><code>new</code> (0x01)</dt>
2345 <dd>
2346 A new connection. Set to 1 if this is an uncommitted connection.
2347 </dd>
2348
2349 <dt><code>est</code> (0x02)</dt>
2350 <dd>
2351 Part of an existing connection. Set to 1 if this is a committed
2352 connection.
2353 </dd>
2354
2355 <dt><code>rel</code> (0x04)</dt>
2356 <dd>
2357 <p>
2358 Related to an existing connection, e.g. an ICMP ``destination
2359 unreachable'' message or an FTP data connection. This flag will
2360 only be 1 if the connection to which this one is related is
2361 committed.
2362 </p>
2363
2364 <p>
2365 Connections identified as <code>rel</code> are separate from the
2366 originating connection and must be committed separately. All
2367 packets for a related connection will have the <code>rel</code>
2368 flag set, not just the initial packet.
2369 </p>
2370 </dd>
2371
2372 <dt><code>rpl</code> (0x08)</dt>
2373 <dd>
2374 This packet is in the reply direction, meaning that it is in the
2375 opposite direction from the packet that initiated the connection.
2376 This flag will only be 1 if the connection is committed.
2377 </dd>
2378
2379 <dt><code>inv</code> (0x10)</dt>
2380 <dd>
2381 <p>
2382 The state is invalid, meaning that the connection tracker couldn't
2383 identify the connection. This flag is a catch-all for problems
2384 in the connection or the connection tracker, such as:
2385 </p>
2386
2387 <ul>
2388 <li>
2389 L3/L4 protocol handler is not loaded/unavailable. With the Linux
2390 kernel datapath, this may mean that the
2391 <code>nf_conntrack_ipv4</code> or <code>nf_conntrack_ipv6</code>
2392 modules are not loaded.
2393 </li>
2394
2395 <li>
2396 L3/L4 protocol handler determines that the packet is malformed.
2397 </li>
2398
2399 <li>
2400 The packet has an unexpected length for its protocol.
2401 </li>
2402 </ul>
2403 </dd>
2404
2405 <dt><code>trk</code> (0x20)</dt>
2406 <dd>
2407 This packet is tracked, meaning that it has previously traversed the
2408 connection tracker. If this flag is not set, then no other flags
2409 will be set. If this flag is set, then the packet is tracked and
2410 other flags may also be set.
2411 </dd>
2412
2413 <dt><code>snat</code> (0x40)</dt>
2414 <dd>
2415 This packet was transformed by source address/port translation by a
2416 preceding <code>ct</code> action. Open vSwitch 2.6 added this flag.
2417 </dd>
2418
2419 <dt><code>dnat</code> (0x80)</dt>
2420 <dd>
2421 This packet was transformed by destination address/port translation
2422 by a preceding <code>ct</code> action. Open vSwitch 2.6 added this
2423 flag.
2424 </dd>
2425 </dl>
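<p>
As a rough illustration of the two notations described above, the
following Python sketch converts a symbolic match into the equivalent
flags/mask pair. The flag values are taken from the list above; the
helper names <code>parse_ct_state</code> and
<code>ct_state_matches</code> are hypothetical and are not part of Open
vSwitch.
</p>

```python
import re

# Flag values as listed in the definitions above.
CT_STATE_FLAGS = {
    "new": 0x01, "est": 0x02, "rel": 0x04, "rpl": 0x08,
    "inv": 0x10, "trk": 0x20, "snat": 0x40, "dnat": 0x80,
}


def parse_ct_state(spec):
    """Convert a symbolic match such as '+trk-new' into (flags, mask)."""
    if not re.fullmatch(r"(?:[+-][a-z]+)+", spec):
        raise ValueError(f"malformed ct_state match: {spec!r}")
    flags = mask = 0
    for sign, name in re.findall(r"([+-])([a-z]+)", spec):
        bit = CT_STATE_FLAGS[name]  # raises KeyError for an unknown flag
        mask |= bit                 # every mentioned flag is matched on
        if sign == "+":
            flags |= bit            # '+' requires the flag to be set
    return flags, mask


def ct_state_matches(ct_state, flags, mask):
    """Flags outside the mask are wildcarded."""
    return (ct_state & mask) == flags


flags, mask = parse_ct_state("+trk-new")
print(hex(flags), hex(mask))
```

<p>
Under these assumptions, <code>+trk-new</code> corresponds to the
numeric form <code>0x20/0x21</code>.
</p>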
2426
2427 <p>
2428 There are additional constraints on these flags, listed in decreasing
2429 order of precedence below:
2430 </p>
2431
2432 <ol>
2433 <li>
2434 If <code>trk</code> is unset, no other flags are set.
2435 </li>
2436
2437 <li>
2438 If <code>trk</code> is set, one or more other flags may be set.
2439 </li>
2440
2441 <li>
2442 If <code>inv</code> is set, only the <code>trk</code> flag is also
2443 set.
2444 </li>
2445
2446 <li>
2447 <code>new</code> and <code>est</code> are mutually exclusive.
2448 </li>
2449
2450 <li>
2451 <code>new</code> and <code>rpl</code> are mutually exclusive.
2452 </li>
2453
2454 <li>
2455 <code>rel</code> may be set in conjunction with any other flags.
2456 </li>
2457 </ol>
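<p>
The constraints above can be read as a consistency check on a
<code>ct_state</code> value. The sketch below is only one
interpretation of the list, applied in the stated order; the function
name <code>ct_state_is_consistent</code> is hypothetical, and the check
covers only the flags named in the constraints.
</p>

```python
NEW, EST, REL, RPL, INV, TRK = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def ct_state_is_consistent(state):
    """Return True if 'state' obeys the constraints listed above."""
    if not state & TRK:
        return state == 0            # untracked: no other flag may be set
    if state & INV:
        return state == (TRK | INV)  # invalid: only trk accompanies inv
    if state & NEW and state & EST:
        return False                 # new and est are mutually exclusive
    if state & NEW and state & RPL:
        return False                 # new and rpl are mutually exclusive
    return True                      # rel may combine with anything else
```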
2458
2459 <p>
2460 Future versions of Open vSwitch may define new flags.
2461 </p>
2462 </field>
2463
2464 <field id="MFF_CT_ZONE" title="Connection Tracking Zone">
2465 A connection tracking zone, the zone value passed to the most recent
2466 <code>ct</code> action. Each zone is an independent connection tracking
2467 context, so tracking the same packet in multiple contexts requires using
2468 the <code>ct</code> action multiple times.
2469 </field>
2470
2471 <field id="MFF_CT_MARK" title="Connection Tracking Mark">
2472 The metadata committed, by an action within the <code>exec</code>
2473 parameter to the <code>ct</code> action, to the connection to which the
2474 current packet belongs.
2475 </field>
2476
2477 <field id="MFF_CT_LABEL" title="Connection Tracking Label">
2478 The label committed, by an action within the <code>exec</code>
2479 parameter to the <code>ct</code> action, to the connection to which the
2480 current packet belongs.
2481 </field>
2482
2483 <p>
2484 Open vSwitch 2.8 introduced matching support for the connection
2485 tracker's original direction 5-tuple fields.
2486 </p>
2487
2488 <p>
2489 For non-committed, non-related connections, the conntrack original
2490 direction tuple fields always have the same values as the
2491 corresponding headers in the packet itself. For any other packet of
2492 a committed connection, the conntrack original direction tuple fields
2493 reflect the values from that initial non-committed, non-related packet,
2494 and thus may differ from the actual packet headers, because the
2495 actual packet headers may be in the reverse direction (for reply packets),
2496 transformed by NAT (when the <code>nat</code> option was applied to the
2497 connection), or of a different protocol (e.g., when an ICMP response
2498 is sent to a UDP packet). In the case of related connections, e.g., an
2499 FTP data connection, the original direction tuple contains the
2500 original direction headers from the master connection, e.g., an FTP
2501 control connection.
2502 </p>
2503
2504 <p>
2505 The following fields are populated by the <code>ct</code> action, and require a
2506 match on a valid connection tracking state as a prerequisite, in
2507 addition to the IP or IPv6 ethertype match. Examples of valid
2508 connection tracking state matches include <code>ct_state=+new</code>,
2509 <code>ct_state=+est</code>, <code>ct_state=+rel</code>, and <code>ct_state=+trk-inv</code>.
2510 </p>
2511
2512 <field id="MFF_CT_NW_SRC" title="Connection Tracking Original Direction IPv4 Source Address">
2513 Matches IPv4 conntrack original direction tuple source address.
2514 See the paragraphs above for a general description of the
2515 conntrack original direction tuple. Introduced in Open vSwitch
2516 2.8.
2517 </field>
2518
2519 <field id="MFF_CT_NW_DST" title="Connection Tracking Original Direction IPv4 Destination Address">
2520 Matches IPv4 conntrack original direction tuple destination address.
2521 See the paragraphs above for a general description of the
2522 conntrack original direction tuple. Introduced in Open vSwitch
2523 2.8.
2524 </field>
2525
2526 <field id="MFF_CT_IPV6_SRC" title="Connection Tracking Original Direction IPv6 Source Address">
2527 Matches IPv6 conntrack original direction tuple source address.
2528 See the paragraphs above for a general description of the
2529 conntrack original direction tuple. Introduced in Open vSwitch
2530 2.8.
2531 </field>
2532
2533 <field id="MFF_CT_IPV6_DST" title="Connection Tracking Original Direction IPv6 Destination Address">
2534 Matches IPv6 conntrack original direction tuple destination address.
2535 See the paragraphs above for a general description of the
2536 conntrack original direction tuple. Introduced in Open vSwitch
2537 2.8.
2538 </field>
2539
2540 <field id="MFF_CT_NW_PROTO" title="Connection Tracking Original Direction IP Protocol">
2541 Matches conntrack original direction tuple IP protocol type,
2542 which is specified as a decimal number between 0 and 255,
2543 inclusive (e.g. 1 to match ICMP packets or 6 to match TCP
2544 packets). In the case of, for example, an ICMP response to a UDP
2545 packet, this may be different from the IP protocol type of the
2546 packet itself. See the paragraphs above for a general description
2547 of the conntrack original direction tuple. Introduced in Open
2548 vSwitch 2.8.
2549 </field>
2550
2551 <field id="MFF_CT_TP_SRC" title="Connection Tracking Original Direction Transport Layer Source Port">
2552 Bitwise match on the conntrack original direction tuple
2553 transport source port, when
2554 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2555 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2556 ICMP, or 58 for ICMPv6, the lower 8 bits of
2557 <code>MFF_CT_TP_SRC</code> match the conntrack original
2558 direction ICMP type. See the paragraphs above for a general
2559 description of the conntrack original direction
2560 tuple. Introduced in Open vSwitch 2.8.
2561 </field>
2562
2563 <field id="MFF_CT_TP_DST" title="Connection Tracking Original Direction Transport Layer Destination Port">
2564 Bitwise match on the conntrack original direction tuple
2565 transport destination port, when
2566 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2567 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2568 ICMP, or 58 for ICMPv6, the lower 8 bits of
2569 <code>MFF_CT_TP_DST</code> match the conntrack original
2570 direction ICMP code. See the paragraphs above for a general
2571 description of the conntrack original direction
2572 tuple. Introduced in Open vSwitch 2.8.
2573 </field>
2574 </group>
2575
2576 <group title="Register">
2577 <p>
2578 These fields give an OpenFlow switch space for temporary storage while
2579 the pipeline is running. Whereas metadata fields can have a meaningful
2580 initial value and can persist across some hops between OpenFlow switches,
2581 registers are always initially 0 and their values never persist across
2582 inter-switch hops (not even across patch ports).
2583 </p>
2584
2585 <field id="MFF_METADATA" title="OpenFlow Metadata">
2586 <p>
2587 This field is the oldest standardized OpenFlow register field,
2588 introduced in OpenFlow 1.1. It was introduced to model the limited
2589 number of user-defined bits that some ASIC-based switches can carry
2590 through their pipelines. Because of hardware limitations, OpenFlow
2591 allows switches to support writing and masking only an
2592 implementation-defined subset of bits, even no bits at all. The Open
2593 vSwitch software switch always supports all 64 bits, but of course an
2594 Open vSwitch port to an ASIC would have the same restriction as the
2595 ASIC itself.
2596 </p>
2597
2598 <p>
2599 This field has an OXM code point, but OpenFlow 1.4 and earlier allow it
2600 to be modified only with a specialized instruction, not with a
2601 ``set-field'' action. OpenFlow 1.5 removes this restriction. Open
2602 vSwitch does not enforce this restriction, regardless of OpenFlow
2603 version.
2604 </p>
2605 </field>
2606
2607 <field id="MFF_REG0" title="Register 0">
2608 This is the first of several Open vSwitch registers, all of which have
2609 the same properties. Open vSwitch 1.1 introduced registers 0, 1, 2, and
2610 3, version 1.3 added register 4, version 1.7 added registers 5, 6, and 7,
2611 and version 2.6 added registers 8 through 15.
2612 </field>
2613 <!-- XXX series -->
2614 <field id="MFF_REG1" title="Register 1" hidden="yes"/>
2615 <field id="MFF_REG2" title="Register 2" hidden="yes"/>
2616 <field id="MFF_REG3" title="Register 3" hidden="yes"/>
2617 <field id="MFF_REG4" title="Register 4" hidden="yes"/>
2618 <field id="MFF_REG5" title="Register 5" hidden="yes"/>
2619 <field id="MFF_REG6" title="Register 6" hidden="yes"/>
2620 <field id="MFF_REG7" title="Register 7" hidden="yes"/>
2621 <field id="MFF_REG8" title="Register 8" hidden="yes"/>
2622 <field id="MFF_REG9" title="Register 9" hidden="yes"/>
2623 <field id="MFF_REG10" title="Register 10" hidden="yes"/>
2624 <field id="MFF_REG11" title="Register 11" hidden="yes"/>
2625 <field id="MFF_REG12" title="Register 12" hidden="yes"/>
2626 <field id="MFF_REG13" title="Register 13" hidden="yes"/>
2627 <field id="MFF_REG14" title="Register 14" hidden="yes"/>
2628 <field id="MFF_REG15" title="Register 15" hidden="yes"/>
2629
2630 <field id="MFF_XREG0" title="Extended Register 0">
2631 <p>
2632 This is the first of the registers introduced in OpenFlow 1.5.
2633 OpenFlow 1.5 calls these fields just the ``packet registers,'' but Open
2634 vSwitch already had 32-bit registers by that name, so Open vSwitch uses
2635 the name ``extended registers'' in an attempt to reduce confusion. The
2636 standard allows for up to 128 registers, each 64 bits wide, but Open
2637 vSwitch only implements 4 (in versions 2.4 and 2.5) or 8 (in version
2638 2.6 and later).
2639 </p>
2640
2641 <p>
2642 Each of the 64-bit extended registers overlays two of the 32-bit
2643 registers: <code>xreg0</code> overlays <code>reg0</code> and
2644 <code>reg1</code>, with <code>reg0</code> supplying the
2645 most-significant bits of <code>xreg0</code> and <code>reg1</code> the
2646 least-significant. Similarly, <code>xreg1</code> overlays
2647 <code>reg2</code> and <code>reg3</code>, and so on.
2648 </p>
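<p>
The overlay described above amounts to simple bit arithmetic. The
following Python sketch illustrates it; the helper names are
hypothetical and exist only for this example.
</p>

```python
MASK32 = 0xFFFFFFFF


def xreg_member_regs(n):
    """Indexes of the two 32-bit registers overlaid by xreg<n>."""
    return 2 * n, 2 * n + 1


def xreg_from_regs(reg_hi, reg_lo):
    """Combine two 32-bit registers into one 64-bit extended register.

    reg_hi supplies the most-significant bits (e.g. reg0 for xreg0) and
    reg_lo the least-significant bits (e.g. reg1 for xreg0).
    """
    return ((reg_hi & MASK32) << 32) | (reg_lo & MASK32)


def regs_from_xreg(xreg):
    """Split a 64-bit extended register back into (reg_hi, reg_lo)."""
    return (xreg >> 32) & MASK32, xreg & MASK32
```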
2649
2650 <p>
2651 The OpenFlow specification says, ``In most cases, the packet registers
2652 can not be matched in tables, i.e. they usually can not be used in the
2653 flow entry match structure'' [OpenFlow 1.5, section 7.2.3.10], but
2654 there is no reason for a software switch to impose such a restriction,
2655 and Open vSwitch does not.
2656 </p>
2657 </field>
2658
2659 <!-- XXX series -->
2660 <field id="MFF_XREG1" title="Extended Register 1" hidden="yes"/>
2661 <field id="MFF_XREG2" title="Extended Register 2" hidden="yes"/>
2662 <field id="MFF_XREG3" title="Extended Register 3" hidden="yes"/>
2663 <field id="MFF_XREG4" title="Extended Register 4" hidden="yes"/>
2664 <field id="MFF_XREG5" title="Extended Register 5" hidden="yes"/>
2665 <field id="MFF_XREG6" title="Extended Register 6" hidden="yes"/>
2666 <field id="MFF_XREG7" title="Extended Register 7" hidden="yes"/>
2667
2668 <field id="MFF_XXREG0" title="Double-Extended Register 0">
2669 <p>
2670 This is the first of the double-extended registers introduced in Open
2671 vSwitch 2.6. Each of the 128-bit extended registers overlays four of
2672 the 32-bit registers: <code>xxreg0</code> overlays <code>reg0</code>
2673 through <code>reg3</code>, with <code>reg0</code> supplying the
2674 most-significant bits of <code>xxreg0</code> and <code>reg3</code> the
2675 least-significant. <code>xxreg1</code> similarly overlays
2676 <code>reg4</code> through <code>reg7</code>, and so on.
2677 </p>
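<p>
The same idea extends to four 32-bit registers. This Python sketch,
with hypothetical helper names, shows how a 128-bit value could be
assembled from its member registers:
</p>

```python
def xxreg_member_regs(n):
    """Indexes of the four 32-bit registers overlaid by xxreg<n>."""
    return list(range(4 * n, 4 * n + 4))


def xxreg_from_regs(regs):
    """Combine four 32-bit registers, most-significant first, into 128 bits."""
    if len(regs) != 4:
        raise ValueError("a double-extended register overlays four registers")
    value = 0
    for reg in regs:
        value = (value << 32) | (reg & 0xFFFFFFFF)
    return value
```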
2678 </field>
2679
2680 <!-- XXX series -->
2681 <field id="MFF_XXREG1" title="Double-Extended Register 1" hidden="yes"/>
2682 <field id="MFF_XXREG2" title="Double-Extended Register 2" hidden="yes"/>
2683 <field id="MFF_XXREG3" title="Double-Extended Register 3" hidden="yes"/>
2684 </group>
2685
2686 <group title="Layer 2 (Ethernet)">
2687 <p>
2688 Ethernet is the only layer-2 protocol that Open vSwitch
2689 supports. As with most software, Open vSwitch and OpenFlow
2690 regard an Ethernet frame as beginning with the 14-byte header and
2691 ending with the final byte of the payload; that is, the frame check
2692 sequence is not considered part of the frame.
2693 </p>
2694
2695 <field id="MFF_ETH_SRC" title="Ethernet Source">
2696 <p>
2697 The Ethernet source address:
2698 </p>
2699
2700 <diagram>
2701 <header name="Ethernet">
2702 <bits name="dst" above="48" width=".75"/>
2703 <bits name="src" above="48" width=".75" fill="yes"/>
2704 <bits name="type" above="16" width="0.4"/>
2705 </header>
2706 <dots/>
2707 </diagram>
2708 </field>
2709
2710 <field id="MFF_ETH_DST" title="Ethernet Destination">
2711 <p>
2712 The Ethernet destination address:
2713 </p>
2714
2715 <diagram>
2716 <header name="Ethernet">
2717 <bits name="dst" above="48" width=".75" fill="yes"/>
2718 <bits name="src" above="48" width=".75"/>
2719 <bits name="type" above="16" width="0.4"/>
2720 </header>
2721 <dots/>
2722 </diagram>
2723
2724 <p>
2725 Open vSwitch 1.8 and later support arbitrary masks for source and/or
2726 destination. Earlier versions only support masking the destination
2727 with the following masks:
2728 </p>
2729
2730 <dl>
2731 <dt><code>01:00:00:00:00:00</code></dt>
2732 <dd>
2733 Match only the multicast bit. Thus,
2734 <code>dl_dst=01:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2735 multicast (including broadcast) Ethernet packets, and
2736 <code>dl_dst=00:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2737 unicast Ethernet packets.
2738 </dd>
2739
2740 <dt><code>fe:ff:ff:ff:ff:ff</code></dt>
2741 <dd>
2742 Match all bits except the multicast bit. This is probably not
2743 useful.
2744 </dd>
2745
2746 <dt><code>ff:ff:ff:ff:ff:ff</code></dt>
2747 <dd>
2748 Exact match (equivalent to omitting the mask).
2749 </dd>
2750
2751 <dt><code>00:00:00:00:00:00</code></dt>
2752 <dd>
2753 Wildcard all bits (equivalent to <code>dl_dst=*</code>).
2754 </dd>
2755 </dl>
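<p>
A masked Ethernet address match compares only the bits selected by the
mask. The Python sketch below illustrates the multicast-bit example
above; <code>mac_to_int</code> and <code>eth_masked_match</code> are
hypothetical helpers, not Open vSwitch functions.
</p>

```python
def mac_to_int(mac):
    """Convert 'xx:xx:xx:xx:xx:xx' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def eth_masked_match(addr, value, mask):
    """Return True if addr equals value on every bit set in mask."""
    a, v, m = mac_to_int(addr), mac_to_int(value), mac_to_int(mask)
    return (a & m) == (v & m)


MULTICAST_BIT = "01:00:00:00:00:00"

# Broadcast is a special case of multicast, so it matches as well.
print(eth_masked_match("ff:ff:ff:ff:ff:ff", MULTICAST_BIT, MULTICAST_BIT))
```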
2756 </field>
2757
2758 <field id="MFF_ETH_TYPE" title="Ethernet Type">
2759 <p>
2760 The most commonly seen Ethernet frames today use a format
2761 called ``Ethernet II,'' in which the last two bytes of the
2762 Ethernet header specify the Ethertype. For such a frame, this
2763 field is copied from those bytes of the header, like so:
2764 </p>
2765
2766 <diagram>
2767 <header name="Ethernet">
2768 <bits name="dst" above="48" width=".75"/>
2769 <bits name="src" above="48" width=".75"/>
2770 <bits name="type" above="16" below="\[&gt;=]0x600" width="0.4" fill="yes"/>
2771 </header>
2772 <dots/>
2773 </diagram>
2774
2775 <p>
2776 Every Ethernet type has a value 0x600 (1,536) or greater.
2777 When the last two bytes of the Ethernet header have a value
2778 too small to be an Ethernet type, then the value found there
2779 is the total length of the frame in bytes, excluding the
2780 Ethernet header. An 802.2 LLC header typically follows the
2781 Ethernet header. OpenFlow and Open vSwitch only support LLC
2782 headers with DSAP and SSAP <code>0xaa</code> and control byte
2783 <code>0x03</code>, which indicate that a SNAP header follows
2784 the LLC header. In turn, OpenFlow and Open vSwitch only
2785 support a SNAP header with organization <code>0x000000</code>.
2786 In such a case, this field is copied from the type field in
2787 the SNAP header, like this:
2788 </p>
2789
2790 <diagram>
2791 <header name="Ethernet">
2792 <bits name="dst" above="48" width=".75"/>
2793 <bits name="src" above="48" width=".75"/>
2794 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2795 </header>
2796 <header name="LLC">
2797 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2798 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2799 <bits name="cntl" above="8" below="0x03" width=".4"/>
2800 </header>
2801 <header name="SNAP">
2802 <bits name="org" above="24" below="0x000000" width=".75"/>
2803 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2804 </header>
2805 <dots/>
2806 </diagram>
2807
2808 <p>
2809 When an 802.1Q header is inserted after the Ethernet source
2810 and destination, this field is populated with the encapsulated
2811 Ethertype, not the 802.1Q Ethertype. With an Ethernet II
2812 inner frame, the result looks like this:
2813 </p>
2814
2815 <diagram>
2816 <header name="Ethernet">
2817 <bits name="dst" above="48" width=".75"/>
2818 <bits name="src" above="48" width=".75"/>
2819 </header>
2820 <header name="802.1Q">
2821 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2822 <bits name="TCI" above="16" width=".4"/>
2823 </header>
2824 <header name="Ethertype">
2825 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2826 </header>
2827 <dots/>
2828 </diagram>
2829
2830 <p>
2831 LLC and SNAP encapsulation look like this with an 802.1Q header:
2832 </p>
2833
2834 <diagram>
2835 <header name="Ethernet">
2836 <bits name="dst" above="48" width=".75"/>
2837 <bits name="src" above="48" width=".75"/>
2838 </header>
2839 <header name="802.1Q">
2840 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2841 <bits name="TCI" above="16" width=".4"/>
2842 </header>
2843 <header name="Ethertype">
2844 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2845 </header>
2846 <header name="LLC">
2847 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2848 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2849 <bits name="cntl" above="8" below="0x03" width=".4"/>
2850 </header>
2851 <header name="SNAP">
2852 <bits name="org" above="24" below="0x000000" width=".75"/>
2853 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2854 </header>
2855 <dots/>
2856 </diagram>
2857
2858 <p>
2859 When a packet doesn't match any of the header formats described
2860 above, Open vSwitch and OpenFlow set this field to
2861 <code>0x5ff</code> (<code>OFP_DL_TYPE_NOT_ETH_TYPE</code>).
2862 </p>
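<p>
The rules above can be summarized in a short Python sketch that derives
this field from raw frame bytes. It is a simplified illustration, not
the Open vSwitch parser: the function name <code>eth_type_of</code> is
hypothetical, and returning <code>0x5ff</code> for truncated frames or
for a SNAP type below <code>0x600</code> is an assumption of this
sketch.
</p>

```python
import struct

ETH_TYPE_MIN = 0x600                 # smallest valid Ethertype
NOT_ETH_TYPE = 0x5ff                 # OFP_DL_TYPE_NOT_ETH_TYPE
VLAN_TPID = 0x8100
LLC_SNAP = bytes([0xaa, 0xaa, 0x03])  # DSAP, SSAP, control byte
SNAP_ORG = bytes(3)                   # organization 0x000000


def eth_type_of(frame):
    """Derive the eth_type field from an Ethernet frame given as bytes."""
    offset = 12                       # skip destination and source
    if len(frame) < offset + 2:
        return NOT_ETH_TYPE           # assumption: truncated frame
    (value,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if value == VLAN_TPID:
        # Skip the TCI and read the encapsulated type instead.
        if len(frame) < offset + 4:
            return NOT_ETH_TYPE       # assumption: truncated frame
        (value,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if value >= ETH_TYPE_MIN:
        return value                  # Ethernet II
    # Otherwise the value is a length; only LLC plus SNAP is supported.
    header = frame[offset:offset + 8]
    if len(header) == 8 and header[:3] == LLC_SNAP and header[3:6] == SNAP_ORG:
        (snap_type,) = struct.unpack_from("!H", header, 6)
        if snap_type >= ETH_TYPE_MIN:
            return snap_type
    return NOT_ETH_TYPE
```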
2863 </field>
2864 </group>
2865
2866 <group title="VLAN">
2867 <p>
2868 The 802.1Q VLAN header causes more trouble than any other 4
2869 bytes in networking. OpenFlow 1.0, 1.1, and 1.2+ all treat VLANs
2870 differently. Open vSwitch extensions add another variant to the mix.
2871 Open vSwitch reconciles all four treatments as best it can.
2872 </p>
2873
2874 <h2>VLAN Header Format</h2>
2875
2876 <p>
2877 An 802.1Q VLAN header consists of two 16-bit fields:
2878 </p>
2879
2880 <diagram>
2881 <header name="TPID">
2882 <bits name="Ethertype" above="16" below="0x8100" width="1.8"/>
2883 </header>
2884 <nospace/>
2885 <header name="TCI">
2886 <bits name="PCP" above="3" width=".6"/>
2887 <bits name="CFI" above="1" below="0" width=".3"/>
2888 <bits name="VID" above="12" width=".9"/>
2889 </header>
2890 </diagram>
2891
2892 <p>
2893 The first 16 bits of the VLAN header, the <dfn>TPID</dfn> (Tag Protocol
2894 IDentifier), is an Ethertype. When the VLAN header is inserted just
2895 after the source and destination MAC addresses in an Ethernet frame, the
2896 TPID serves to identify the presence of the VLAN. The standard TPID, the
2897 only one that Open vSwitch supports, is <code>0x8100</code>. OpenFlow
2898 1.0 explicitly supports only TPID <code>0x8100</code>. OpenFlow 1.1, but
2899 not earlier or later versions, also requires support for TPID
2900 <code>0x88a8</code> (Open vSwitch does not support this). OpenFlow 1.2
2901 through 1.5 do not require support for specific TPIDs (the ``push vlan
2902 header'' action does say that only <code>0x8100</code> and
2903 <code>0x88a8</code> should be pushed). No version of OpenFlow provides a
2904 way to distinguish or match on the TPID.
2905 </p>
2906
2907 <p>
2908 The remaining 16 bits of the VLAN header, the <dfn>TCI</dfn>
2909 (Tag Control Information), is subdivided into three subfields:
2910 </p>
2911
2912 <ul>
2913 <li>
2914 <dfn>PCP</dfn> (Priority Code Point) is a 3-bit 802.1p
2915 <dfn>priority</dfn>. The lowest priority is value 1, the
2916 second-lowest is value 0, and priority increases from 2 up to
2917 highest priority 7.
2918 </li>
2919
2920 <li>
2921 <p>
2922 <dfn>CFI</dfn> (Canonical Format Indicator) is a 1-bit field. On an
2923 Ethernet network, its value is always 0. This led to it later being
2924 repurposed under the name <dfn>DEI</dfn> (Drop Eligibility
2925 Indicator). By either name, OpenFlow and Open vSwitch don't provide
2926 any way to match or set this bit.
2927 </p>
2928 </li>
2929
2930 <li>
2931 <dfn>VID</dfn> (VLAN IDentifier) is a 12-bit VLAN ID. If the
2932 VID is 0, then the frame is not part of a VLAN. In that case,
2933 the VLAN header is called a <dfn>priority tag</dfn> because it
2934 is only meaningful for assigning the frame a priority. VID
2935 <code>0xfff</code> (4,095) is reserved.
2936 </li>
2937 </ul>
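<p>
The subfield layout above can be illustrated with a few lines of
Python. The helper names are hypothetical; the priority ordering
follows the description of PCP given above.
</p>

```python
def split_tci(tci):
    """Split a 16-bit TCI into (pcp, cfi, vid)."""
    pcp = (tci >> 13) & 0x7    # top 3 bits
    cfi = (tci >> 12) & 0x1    # next bit (also known as DEI)
    vid = tci & 0xFFF          # low 12 bits
    return pcp, cfi, vid


def is_priority_tag(tci):
    """A VID of 0 means the header only assigns a priority."""
    return (tci & 0xFFF) == 0


# PCP values ordered from lowest to highest priority: 1 is the lowest,
# 0 the second-lowest, then 2 through 7.
PCP_LOW_TO_HIGH = (1, 0, 2, 3, 4, 5, 6, 7)


def pcp_rank(pcp):
    """Position of a PCP value in the priority ordering (0 is lowest)."""
    return PCP_LOW_TO_HIGH.index(pcp)
```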
2938
2939 <p>
2940 See <ref field="eth_type"/> for illustrations of a complete Ethernet
2941 frame with 802.1Q tag included.
2942 </p>
2943
2944 <h2>Multiple VLANs</h2>
2945
2946 <p>
2947 Open vSwitch can match only a single VLAN header. If more than
2948 one VLAN header is present, then <ref field="eth_type"/>
2949 holds the TPID of the inner VLAN header. Open vSwitch stops
2950 parsing the packet after the inner TPID, so matching further
2951 into the packet (e.g. on the inner TCI or L3 fields) is not
2952 possible.
2953 </p>
2954
2955 <p>
2956 OpenFlow only directly supports matching a single VLAN header. In
2957 OpenFlow 1.1 or later, one OpenFlow table can match on the outermost VLAN
2958 header and pop it off, and a later OpenFlow table can match on the next
2959 outermost header. Open vSwitch does not support this.
2960 </p>
2961
2962 <h2>VLAN Field Details</h2>
2963
2964 <p>
2965 The four variants have three different levels of expressiveness: OpenFlow
2966 1.0 and 1.1 VLAN matching are less powerful than OpenFlow 1.2+ VLAN
2967 matching, which is less powerful than Open vSwitch extension VLAN
2968 matching.
2969 </p>
2970
2971 <h2>OpenFlow 1.0 VLAN Fields</h2>
2972
2973 <p>
2974 OpenFlow 1.0 uses two fields, called <code>dl_vlan</code> and
2975 <code>dl_vlan_pcp</code>, each of which can be either exact-matched or
2976 wildcarded, to specify VLAN matches:
2977 </p>
2978
2979 <ul>
2980 <li>
2981 When both <code>dl_vlan</code> and <code>dl_vlan_pcp</code> are
2982 wildcarded, the flow matches packets without an 802.1Q header or
2983 with any 802.1Q header.
2984 </li>
2985
2986 <li>
2987 The match <code>dl_vlan=0xffff</code> causes a flow to match only
2988 packets without an 802.1Q header. Such a flow should also wildcard
2989 <code>dl_vlan_pcp</code>, since a packet without an 802.1Q header does
2990 not have a PCP. OpenFlow does not specify what to do if a match on PCP
2991 is actually present, but Open vSwitch ignores it.
2992 </li>
2993
2994 <li>
2995 <p>
2996 Otherwise, the flow matches only packets with an 802.1Q
2997 header. If <code>dl_vlan</code> is not wildcarded, then the
2998 flow only matches packets with the VLAN ID specified in
2999 <code>dl_vlan</code>'s low 12 bits. If
3000 <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3001 only matches packets with the priority specified in
3002 <code>dl_vlan_pcp</code>'s low 3 bits.
3003 </p>
3004
3005 <p>
3006 OpenFlow does not specify how to interpret the high 4 bits of
3007 <code>dl_vlan</code> or the high 5 bits of <code>dl_vlan_pcp</code>.
3008 Open vSwitch ignores them.
3009 </p>
3010 </li>
3011 </ul>
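<p>
The three cases above can be sketched in Python as follows. This is an
illustration only: the function name <code>of10_vlan_match</code> is
hypothetical, a packet without an 802.1Q header is represented by a
<code>tci</code> of <code>None</code>, and a wildcarded field is
represented by <code>None</code>.
</p>

```python
OFP_VLAN_NONE = 0xffff


def of10_vlan_match(tci, dl_vlan=None, dl_vlan_pcp=None):
    """Return True if a packet matches OpenFlow 1.0 style VLAN fields."""
    if dl_vlan is None and dl_vlan_pcp is None:
        return True                   # both wildcarded: match anything
    if dl_vlan == OFP_VLAN_NONE:
        return tci is None            # any PCP match is ignored here
    if tci is None:
        return False                  # remaining cases need an 802.1Q header
    # Only the low 12 bits of dl_vlan and low 3 bits of dl_vlan_pcp count.
    if dl_vlan is not None and (tci & 0xFFF) != (dl_vlan & 0xFFF):
        return False
    if dl_vlan_pcp is not None and ((tci >> 13) & 0x7) != (dl_vlan_pcp & 0x7):
        return False
    return True
```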
3012
3013 <field id="MFF_DL_VLAN" title="OpenFlow 1.0 VLAN ID" hidden="yes"/>
3014 <field id="MFF_DL_VLAN_PCP" title="OpenFlow 1.0 VLAN Priority"
3015 hidden="yes"/>
3016
3017 <h2>OpenFlow 1.1 VLAN Fields</h2>
3018
3019 <p>
3020 VLAN matching in OpenFlow 1.1 is similar to OpenFlow 1.0.
3021 The one refinement is that when <code>dl_vlan</code> matches on
3022 <code>0xfffe</code> (<code>OFPVID_ANY</code>), the flow matches
3023 only packets with an 802.1Q header, with any VLAN ID. If
3024 <code>dl_vlan_pcp</code> is wildcarded, the flow matches any
3025 packet with an 802.1Q header, regardless of VLAN ID or priority.
3026 If <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3027 only matches packets with the priority specified in
3028 <code>dl_vlan_pcp</code>'s low 3 bits.
3029 </p>
3030
3031 <p>
3032 OpenFlow 1.1 uses the name <code>OFPVID_NONE</code>, instead of
3033 <code>OFP_VLAN_NONE</code>, for a <code>dl_vlan</code> of
3034 <code>0xffff</code>, but it has the same meaning.
3035 </p>
3036
3037 <p>
3038 In OpenFlow 1.1, Open vSwitch reports error
3039 <code>OFPBMC_BAD_VALUE</code> for an attempt to match on
3040 <code>dl_vlan</code> between 4,096 and <code>0xfffd</code>,
3041 inclusive, or <code>dl_vlan_pcp</code> greater than 7.
3042 </p>
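<p>
As a hedged illustration of the OpenFlow 1.1 refinement and of the
value ranges that are rejected, consider the Python sketch below. The
function names are hypothetical, <code>None</code> stands for a
wildcarded field or a packet without an 802.1Q header, and the error is
returned as a plain string purely for the sake of the example.
</p>

```python
OFPVID_ANY = 0xfffe
OFPVID_NONE = 0xffff


def of11_check_vlan_match(dl_vlan=None, dl_vlan_pcp=None):
    """Return 'OFPBMC_BAD_VALUE' for the rejected ranges, else None."""
    if dl_vlan is not None and 4096 <= dl_vlan <= 0xfffd:
        return "OFPBMC_BAD_VALUE"
    if dl_vlan_pcp is not None and dl_vlan_pcp > 7:
        return "OFPBMC_BAD_VALUE"
    return None


def of11_vlan_match(tci, dl_vlan=None, dl_vlan_pcp=None):
    """Return True if a packet matches OpenFlow 1.1 style VLAN fields."""
    if dl_vlan is None and dl_vlan_pcp is None:
        return True
    if dl_vlan == OFPVID_NONE:
        return tci is None
    if tci is None:
        return False
    # OFPVID_ANY accepts every VLAN ID as long as a header is present.
    if dl_vlan not in (None, OFPVID_ANY) and (tci & 0xFFF) != dl_vlan:
        return False
    if dl_vlan_pcp is not None and ((tci >> 13) & 0x7) != dl_vlan_pcp:
        return False
    return True
```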
3043
3044 <h2>OpenFlow 1.2 VLAN Fields</h2>
3045
3046 <field id="MFF_VLAN_VID" title="OpenFlow 1.2+ VLAN ID">
3047 <p>
3048 The OpenFlow standard describes this field as consisting of
3049 ``12+1'' bits. On ingress, its value is 0 if no 802.1Q header
3050 is present, and otherwise it holds the VLAN VID in its least
3051 significant 12 bits, with bit 12 (<code>0x1000</code> aka
3052 <code>OFPVID_PRESENT</code>) also set to 1. The three most
3053 significant bits are always zero:
3054 </p>
3055
3056 <diagram>
3057 <header name="OXM_OF_VLAN_VID">
3058 <bits name="" above="3" below="0" width=".6"/>
3059 <bits name="P" above="1" width=".1"/>
3060 <bits name="VLAN ID" above="12" width=".9"/>
3061 </header>
3062 </diagram>
3063
3064 <p>
3065 As a consequence of this field's format, one may use it to match the
3066 VLAN ID in all of the ways available with the OpenFlow 1.0 and 1.1
3067 formats, and a few new ways:
3068 </p>
3069
3070 <dl>
3071 <dt>Fully wildcarded</dt>
3072 <dd>
3073 Matches any packet, that is, one without an 802.1Q header or
3074 with an 802.1Q header with any TCI value.
3075 </dd>
3076
3077 <dt>
3078 Value <code>0x0000</code> (<code>OFPVID_NONE</code>), mask
3079 <code>0xffff</code> (or no mask)
3080 </dt>
3081 <dd>
3082 Matches only packets without an 802.1Q header.
3083 </dd>
3084
3085 <dt>
3086 Value <code>0x1000</code>, mask <code>0x1000</code>
3087 </dt>
3088 <dd>
3089 Matches any packet with an 802.1Q header, regardless of VLAN
3090 ID.
3091 </dd>
3092
3093 <dt>
3094 Value <code>0x1009</code>, mask <code>0xffff</code> (or no mask)
3095 </dt>
3096 <dd>
3097 Matches only packets with an 802.1Q header with VLAN ID 9.
3098 </dd>
3099
3100 <dt>Value <code>0x1001</code>, mask <code>0x1001</code></dt>
3101 <dd>
3102 Matches only packets that have an 802.1Q header with an
3103 odd-numbered VLAN ID. (This is just an example; one can
3104 match on any desired VLAN ID bit pattern.)
3105 </dd>
3106 </dl>
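<p>
The matches listed above follow directly from the ``12+1'' bit layout.
The Python sketch below computes the field value for a packet and
applies a masked comparison; the helper names are hypothetical, and a
packet without an 802.1Q header is represented by a <code>tci</code> of
<code>None</code>.
</p>

```python
OFPVID_PRESENT = 0x1000
OFPVID_NONE = 0x0000


def vlan_vid_field(tci):
    """Value of the field on ingress for a packet with the given TCI."""
    if tci is None:
        return OFPVID_NONE                  # no 802.1Q header
    return OFPVID_PRESENT | (tci & 0xFFF)   # present bit plus 12-bit VID


def vlan_vid_matches(tci, value, mask=0xffff):
    """Masked comparison; a mask of 0 wildcards the field entirely."""
    return (vlan_vid_field(tci) & mask) == (value & mask)
```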
3107 </field>
3108
3109 <field id="MFF_VLAN_PCP" title="OpenFlow 1.2+ VLAN Priority">
3110 <p>
3111 The 3 least significant bits may be used to match the PCP bits
3112 in an 802.1Q header. Other bits are always zero:
3113 </p>
3114
3115 <diagram>
3116 <header name="OXM_OF_VLAN_PCP">
3117 <bits name="zero" above="5" below="0" width="1.0"/>
3118 <bits name="PCP" above="3" width=".6"/>
3119 </header>
3120 </diagram>
3121
3122 <p>
3123 This field may only be used when <ref field="vlan_vid"/> is not
3124 wildcarded and does not exact match on 0 (which only matches
3125 when there is no 802.1Q header).
3126 </p>
3127
3128 <p>
3129 See <cite>VLAN Comparison Chart</cite>, below, for some examples.
3130 </p>
3131 </field>
3132
3133 <h2>Open vSwitch Extension VLAN Field</h2>
3134
3135 <p>
3136 The <ref field="vlan_tci"/> extension can describe more kinds of VLAN
3137 matches than the other variants. It is also simpler than the other
3138 variants.
3139 </p>
3140
3141 <field id="MFF_VLAN_TCI" title="VLAN TCI">
3142 <p>
3143 For a packet without an 802.1Q header, this field is zero. For a
3144 packet with an 802.1Q header, this field is the TCI with the bit in
3145 CFI's position (marked <code>P</code> for ``present'' below) forced to
3146 1. Thus, for a packet in VLAN 9 with priority 7, it has the value
3147 <code>0xf009</code>:
3148 </p>
3149
3150 <diagram>
3151 <header name="NXM_VLAN_TCI">
3152 <bits name="PCP" above="3" below="7" width=".6"/>
3153 <bits name="P" above="1" below="1" width=".2"/>
3154 <bits name="VID" above="12" below="9" width=".9"/>
3155 </header>
3156 </diagram>
3157
3158 <p>
3159 Usage examples:
3160 </p>
3161
3162 <dl>
3163 <dt><code>vlan_tci=0</code></dt>
3164 <dd>
3165 Match packets without an 802.1Q header.
3166 </dd>
3167
3168 <dt><code>vlan_tci=0x1000/0x1000</code></dt>
3169 <dd>
3170 Match packets with an 802.1Q header, regardless of VLAN
3171 and priority values.
3172 </dd>
3173
3174 <dt><code>vlan_tci=0xf123</code></dt>
3175 <dd>
3176 Match packets tagged with priority 7 in VLAN 0x123.
3177 </dd>
3178
3179 <dt><code>vlan_tci=0x1123/0x1fff</code></dt>
3180 <dd>
3181 Match packets tagged with VLAN 0x123 (and any priority).
3182 </dd>
3183
3184 <dt><code>vlan_tci=0x5000/0xf000</code></dt>
3185 <dd>
3186 Match packets tagged with priority 2 (in any VLAN).
3187 </dd>
3188
3189 <dt><code>vlan_tci=0/0xfff</code></dt>
3190 <dd>
3191 Match packets with no 802.1Q header or tagged with VLAN 0
3192 (and any priority).
3193 </dd>
3194
3195 <dt><code>vlan_tci=0x5000/0xe000</code></dt>
3196 <dd>
3197 Match packets with no 802.1Q header or tagged with priority 2 (in any VLAN).
3198 </dd>
3199
3200 <dt><code>vlan_tci=0/0xefff</code></dt>
3201 <dd>
3202 Match packets with no 802.1Q header or tagged with VLAN 0
3203 and priority 0.
3204 </dd>
3205 </dl>
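<p>
Several of the usage examples above can be checked with a small Python
sketch. The helper names are hypothetical, a packet without an 802.1Q
header is represented by a <code>tci</code> of <code>None</code>, and
ignoring value bits that fall outside the mask is an assumption of this
sketch rather than documented behavior.
</p>

```python
def vlan_tci_field(tci):
    """Field value for a packet: 0 without a header, else TCI with P set."""
    if tci is None:
        return 0
    return (tci | 0x1000) & 0xffff   # force the bit in CFI's position to 1


def vlan_tci_matches(tci, value, mask=0xffff):
    """Masked comparison against the computed field value."""
    return (vlan_tci_field(tci) & mask) == (value & mask)


# A packet in VLAN 9 with priority 7 carries TCI 0xe009 on the wire.
print(hex(vlan_tci_field(0xe009)))
```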
3206
3207 <p>
3208 See <cite>VLAN Comparison Chart</cite>, below, for more examples.
3209 </p>
3210 </field>
3211
3212 <h2>VLAN Comparison Chart</h2>
3213
3214 <p>
3215 The following table describes how each of several possible matching
3216 criteria on the 802.1Q header may be expressed with each variation
3217 of the VLAN matching fields:
3218 </p>
3219
3220 <tbl>
3221 r r r r r.
3222 Criteria OpenFlow 1.0 OpenFlow 1.1 OpenFlow 1.2+ NXM
3223 \_ \_ \_ \_ \_
3224 [1] \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fL0000\fR,\fL--\fR \fL0000\fR/\fL0000\fR
3225 [2] \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fLffff\fR,\fL--\fR \fL0000\fR/\fLffff\fR
3226 [3] \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL1xxx\fR/\fLffff\fR,\fL--\fR \fL1xxx\fR/\fL1fff\fR
3227 [4] \fL????\fR/\fL1\fR,\fL0y\fR/\fL0\fR \fLfffe\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1000\fR/\fL1000\fR,\fL0y\fR \fLz000\fR/\fLf000\fR
3228 [5] \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1xxx\fR/\fLffff\fR,\fL0y\fR \fLzxxx\fR/\fLffff\fR
3229 .T&amp;
3230 r r c c r.
3231 [6] (none) (none) \fL1001\fR/\fL1001\fR,\fL--\fR \fL1001\fR/\fL1001\fR
3232 .T&amp;
3233 r r c c c.
3234 [7] (none) (none) (none) \fL3000\fR/\fL3000\fR
3235 [8] (none) (none) (none) \fL0000\fR/\fL0fff\fR
3236 [9] (none) (none) (none) \fL0000\fR/\fLf000\fR
3237 [10] (none) (none) (none) \fL0000\fR/\fLefff\fR
3238 </tbl>
3239
3240 <p>
3241 All numbers in the table are expressed in hexadecimal. The
3242 columns in the table are interpreted as follows:
3243 </p>
3244
3245 <dl>
3246 <dt>Criteria</dt>
3247 <dd>See the list below.</dd>
3248
3249 <dt>OpenFlow 1.0</dt>
3250 <dt>OpenFlow 1.1</dt>
3251 <dd>
3252 <literal>wwww/x,yy/z</literal> means VLAN ID match value
3253 <literal>wwww</literal> with wildcard bit <literal>x</literal>
3254 and VLAN PCP match value <literal>yy</literal> with wildcard
3255 bit <literal>z</literal>. <literal>?</literal> means that the
3256 given bits are ignored (and conventionally
3257 <literal>0</literal> for <literal>wwww</literal> or
3258 <literal>yy</literal>, conventionally <literal>1</literal> for
3259 <literal>x</literal> or <literal>z</literal>). ``(none)''
3260 means that OpenFlow 1.0 (or 1.1) cannot match with these
3261 criteria.
3262 </dd>
3263
3264 <dt>OpenFlow 1.2+</dt>
3265 <dd>
3266 <literal>xxxx/yyyy,zz</literal> means <ref field="vlan_vid"/> with
3267 value <literal>xxxx</literal> and mask <literal>yyyy</literal>, and
3268 <ref field="vlan_pcp"/> (which is not maskable) with value
3269 <literal>zz</literal>. <literal>--</literal> means that <ref
3270 field="vlan_pcp"/> is omitted. ``(none)'' means that OpenFlow 1.2
3271 cannot match with these criteria.
3272 </dd>
3273
3274 <dt>NXM</dt>
3275 <dd>
3276 <literal>xxxx/yyyy</literal> means <ref field="vlan_tci"/> with value
3277 <literal>xxxx</literal> and mask <literal>yyyy</literal>.
3278 </dd>
3279 </dl>
3280
3281 <p>
3282 The matching criteria described by the table are:
3283 </p>
3284
3285 <dl>
3286 <dt>[1]</dt>
3287 <dd>
3288 Matches any packet, that is, one without an 802.1Q header or
3289 with an 802.1Q header with any TCI value.
3290 </dd>
3291
3292 <dt>[2]</dt>
3293 <dd>
3294 <p>
3295 Matches only packets without an 802.1Q header.
3296 </p>
3297
3298 <p>
3299 OpenFlow 1.0 doesn't define the behavior if <ref field="dl_vlan"/> is
3300 set to <code>0xffff</code> and <ref field="dl_vlan_pcp"/> is not
3301 wildcarded. (Open vSwitch always ignores <ref field="dl_vlan_pcp"/>
3302 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.)
3303 </p>
3304
3305 <p>
3306 OpenFlow 1.1 says explicitly to ignore <ref field="dl_vlan_pcp"/>
3307 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.
3308 </p>
3309
3310 <p>
3311 OpenFlow 1.2 doesn't say how to interpret a match with <ref
3312 field="vlan_vid"/> value 0 and a mask with
3313 <code>OFPVID_PRESENT</code> (<code>0x1000</code>) set to 1 and some
3314 other bits in the mask set to 1 also. Open vSwitch interprets it the
3315 same way as a mask of <code>0x1000</code>.
3316 </p>
3317
3318 <p>
3319 Any NXM match with <ref field="vlan_tci"/> value 0 and the CFI bit
3320 set to 1 in the mask is equivalent to the one listed in the table.
3321 </p>
3322 </dd>
3323
3324 <dt>[3]</dt>
3325 <dd>
3326 Matches only packets that have an 802.1Q header with VID
3327 <literal>xxx</literal> (and any PCP).
3328 </dd>
3329
3330 <dt>[4]</dt>
3331 <dd>
3332 <p>
3333 Matches only packets that have an 802.1Q header with PCP
3334 <literal>y</literal> (and any VID).
3335 </p>
3336
3337 <p>
3338 OpenFlow 1.0 doesn't clearly define the behavior for this
3339 case. Open vSwitch implements it this way.
3340 </p>
3341
3342 <p>
3343 In the NXM value, <literal>z</literal> equals
3344 (<literal>y</literal> &lt;&lt; 1) | 1.
3345 </p>
3346 </dd>
3347
3348 <dt>[5]</dt>
3349 <dd>
3350 <p>
3351 Matches only packets that have an 802.1Q header with VID
3352 <literal>xxx</literal> and PCP <literal>y</literal>.
3353 </p>
3354
3355 <p>
3356 In the NXM value, <literal>z</literal> equals
3357 (<literal>y</literal> &lt;&lt; 1) | 1.
3358 </p>
3359 </dd>
3360
3361 <dt>[6]</dt>
3362 <dd>
3363 Matches only packets that have an 802.1Q header with an
3364 odd-numbered VID (and any PCP). Only possible with OpenFlow
3365 1.2 and NXM. (This is just an example; one can match on any
3366 desired VID bit pattern.)
3367 </dd>
3368
3369 <dt>[7]</dt>
3370 <dd>
3371 Matches only packets that have an 802.1Q header with an
3372 odd-numbered PCP (and any VID). Only possible with NXM.
3373 (This is just an example; one can match on any desired PCP bit
3374 pattern.)
3375 </dd>
3376
3377 <dt>[8]</dt>
3378 <dd>
3379 Matches packets with no 802.1Q header or with an 802.1Q header
3380 with a VID of 0. Only possible with NXM.
3381 </dd>
3382
3383 <dt>[9]</dt>
3384 <dd>
3385 Matches packets with no 802.1Q header or with an 802.1Q header
3386 with a PCP of 0. Only possible with NXM.
3387 </dd>
3388
3389 <dt>[10]</dt>
3390 <dd>
3391 Matches packets with no 802.1Q header or with an 802.1Q header
3392 with both VID and PCP of 0. Only possible with NXM.
3393 </dd>
3394 </dl>
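
<p>
As a rough illustration of the NXM column for criteria [3], [4], and [5], the
Python sketch below builds a <code>vlan_tci</code> value and mask from an
optional VID and PCP. The function name <code>nxm_vlan_match</code> is
hypothetical. The sketch always requires the CFI bit, so it only describes
matches on packets that have an 802.1Q header:
</p>

<pre><![CDATA[
```python
def nxm_vlan_match(vid=None, pcp=None):
    """Return (value, mask) for a vlan_tci match on a tagged packet.

    vid and pcp are optional; None leaves that part of the TCI wildcarded.
    """
    value = 0x1000  # CFI bit: require an 802.1Q header
    mask = 0x1000
    if vid is not None:
        value |= vid
        mask |= 0x0FFF
    if pcp is not None:
        value |= pcp << 13
        mask |= 0xE000
    return value, mask


for label, kwargs in (("[3]", {"vid": 0x123}),
                      ("[4]", {"pcp": 2}),
                      ("[5]", {"vid": 0x123, "pcp": 2})):
    value, mask = nxm_vlan_match(**kwargs)
    print("%s %04x/%04x" % (label, value, mask))

# With a PCP of y, the top nibble of the value is z = (y << 1) | 1.
value, _ = nxm_vlan_match(pcp=2)
print(value >> 12 == (2 << 1) | 1)
```
]]></pre>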
3395 </group>
3396
3397 <group title="Layer 2.5: MPLS">
3398 <p>
3399 One or more MPLS headers (more commonly called <dfn>MPLS
3400 labels</dfn>) follow an Ethernet type field that specifies an
3401 MPLS Ethernet type [RFC 3032]. Ethertype <code>0x8847</code> is
3402 used for all unicast. Multicast MPLS is divided into two
3403 specific classes, one of which uses Ethertype
3404 <code>0x8847</code> and the other <code>0x8848</code> [RFC
3405 5332].
3406 </p>
3407
3408 <p>
3409 The most common overall packet format is Ethernet II, shown
3410 below (SNAP encapsulation may be used but is not ordinarily seen
3411 in Ethernet networks):
3412 </p>
3413
3414 <diagram>
3415 <header name="Ethernet">
3416 <bits name="dst" above="48" width="0.75"/>
3417 <bits name="src" above="48" width="0.75"/>
3418 <bits name="type" above="16" below="0x8847" width="0.4"/>
3419 </header>
3420 <header name="MPLS">
3421 <bits name="label" above="20" width=".6"/>
3422 <bits name="TC" above="3" width=".3"/>
3423 <bits name="S" above="1" width=".1"/>
3424 <bits name="TTL" above="8" width=".4"/>
3425 </header>
3426 <dots/>
3427 </diagram>
3428
3429 <p>
3430 MPLS can be encapsulated inside an 802.1Q header, in which case
3431 the combination looks like this:
3432 </p>
3433
3434 <diagram>
3435 <header name="Ethernet">
3436 <bits name="dst" above="48" width=".75"/>
3437 <bits name="src" above="48" width=".75"/>
3438 </header>
3439 <header name="802.1Q">
3440 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3441 <bits name="TCI" above="16" width=".4"/>
3442 </header>
3443 <header name="Ethertype">
3444 <bits name="type" above="16" below="0x8847" width=".4"/>
3445 </header>
3446 <header name="MPLS">
3447 <bits name="label" above="20" width=".6"/>
3448 <bits name="TC" above="3" width=".3"/>
3449 <bits name="S" above="1" width=".1"/>
3450 <bits name="TTL" above="8" width=".4"/>
3451 </header>
3452 <dots/>
3453 </diagram>
3454
3455 <p>
3456 The fields within an MPLS label are:
3457 </p>
3458
3459 <dl>
3460 <dt>Label, 20 bits.</dt>
3461 <dd>
3462 An identifier.
3463 </dd>
3464
3465 <dt>Traffic class (TC), 3 bits.</dt>
3466 <dd>
3467 Used for quality of service.
3468 </dd>
3469
3470 <dt>Bottom of stack (BOS), 1 bit (labeled just ``S'' above).</dt>
3471 <dd>
3472 <p>
3473 0 indicates that another MPLS label follows this one.
3474 </p>
3475
3476 <p>
3477 1 indicates that this MPLS label is the last one in the
3478 stack, so that some other protocol follows this one.
3479 </p>
3480 </dd>
3481
3482 <dt>Time to live (TTL), 8 bits.</dt>
3483 <dd>
3484 <p>
3485 Each hop across an MPLS network decrements the TTL by 1. If
3486 it reaches 0, the packet is discarded.
3487 </p>
3488
3489 <p>
3490 OpenFlow does not make the MPLS TTL available as a match field, but
3491 actions are available to set and decrement the TTL. Open vSwitch 2.6
3492 and later make the MPLS TTL available as an extension.
3493 </p>
3494 </dd>
3495 </dl>
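
<p>
The bit layout above can be sketched in Python as follows. This is only an
illustration of the 32-bit label format; <code>MplsEntry</code>,
<code>unpack_mpls</code>, and <code>pack_mpls</code> are hypothetical names,
not Open vSwitch functions:
</p>

<pre><![CDATA[
```python
import struct
from collections import namedtuple

MplsEntry = namedtuple("MplsEntry", "label tc bos ttl")


def unpack_mpls(data):
    """Decode one 4-byte MPLS label from network byte order."""
    (word,) = struct.unpack("!I", data[:4])
    return MplsEntry(
        label=word >> 12,       # top 20 bits
        tc=(word >> 9) & 0x7,   # next 3 bits
        bos=(word >> 8) & 0x1,  # bottom-of-stack bit
        ttl=word & 0xFF,        # low 8 bits
    )


def pack_mpls(entry):
    """Encode an MplsEntry back into 4 bytes."""
    word = (entry.label << 12) | (entry.tc << 9) | (entry.bos << 8) | entry.ttl
    return struct.pack("!I", word)


entry = unpack_mpls(bytes.fromhex("00010140"))
print(entry)
print(pack_mpls(entry).hex())
```
]]></pre>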
3496
3497 <h2>MPLS Label Stacks</h2>
3498
3499 <p>
3500 Unlike the other encapsulations supported by OpenFlow and Open vSwitch,
3501 MPLS labels are routinely used in ``stacks'' two or three deep and
3502 sometimes even deeper. Open vSwitch currently supports up to three
3503 labels.
3504 </p>
3505
3506 <p>
3507 The OpenFlow specification only supports matching on the outermost MPLS
3508 label at any given time. To match on the second label, one must first
3509 ``pop'' the outer label and advance to another OpenFlow table, where the
3510 inner label may be matched. To match on the third label, one must pop
3511 the two outer labels, and so on. The Open Networking Foundation is
3512 considering support for directly matching on multiple MPLS labels for
3513 OpenFlow 1.6.<!-- XXX add EXT-* link -->
3514 </p>
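
<p>
To make the notion of a label stack concrete, the following Python sketch
walks the labels that follow the MPLS Ethertype until it finds one with the
bottom-of-stack bit set. The function name <code>mpls_labels</code> and its
<code>max_depth</code> parameter are assumptions of this sketch; the default
of 3 only mirrors the three-label limit mentioned above and is not an Open
vSwitch setting:
</p>

<pre><![CDATA[
```python
import struct


def mpls_labels(payload, max_depth=3):
    """Return the label values in an MPLS stack, outermost first."""
    labels = []
    for offset in range(0, 4 * max_depth, 4):
        chunk = payload[offset:offset + 4]
        if len(chunk) != 4:
            raise ValueError("truncated MPLS label stack")
        (word,) = struct.unpack("!I", chunk)
        labels.append(word >> 12)
        if (word >> 8) & 1:  # BOS bit set: this was the last label
            return labels
    raise ValueError("more than %d labels" % max_depth)


outer = struct.pack("!I", (100 << 12) | 64)             # label 100, BOS 0, TTL 64
inner = struct.pack("!I", (200 << 12) | (1 << 8) | 64)  # label 200, BOS 1, TTL 64
print(mpls_labels(outer + inner + b"\x45"))
```
]]></pre>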
3515
3516 <h2>MPLS Inner Protocol</h2>
3517
3518 <p>
3519 Unlike all other forms of encapsulation that Open vSwitch and
3520 OpenFlow support, an MPLS label does not indicate what inner
3521 protocol it encapsulates. Different deployments determine the
3522 inner protocol in different ways [RFC 3032]:
3523 </p>
3524
3525 <ul>
3526 <li>
3527 A few reserved label values do indicate an inner protocol.
3528 Label 0, the ``IPv4 Explicit NULL Label,'' indicates inner
3529 IPv4. Label 2, the ``IPv6 Explicit NULL Label,'' indicates
3530 inner IPv6.
3531 </li>
3532
3533 <li>
3534 Some deployments use a single inner protocol consistently.
3535 </li>
3536
3537 <li>
3538 In some deployments, the inner protocol must be inferred from
3539 the innermost label.
3540 </li>
3541
3542 <li>
3543 In some deployments, the inner protocol must be inferred from
3544 the innermost label and the encapsulated data, e.g. to
3545 distinguish between inner IPv4 and IPv6 based on whether the
3546 first nibble of the inner protocol data is <code>4</code> or
3547 <code>6</code>. OpenFlow and Open vSwitch do not currently
3548 support these cases.
3549 </li>
3550 </ul>
3551
3552 <p>
3553 Open vSwitch and OpenFlow do not infer the inner protocol, even if
3554 reserved label values are in use. Instead, the flow table must specify
3555 the inner protocol at the time it pops the bottommost MPLS label, using
3556 the Ethertype argument to the <code>pop_mpls</code> action.
3557 </p>
3558
3559 <h2>Field Details</h2>
3560
3561 <field id="MFF_MPLS_LABEL" title="MPLS Label">
3562 <p>
3563 The least significant 20 bits hold the ``label'' field from
3564 the MPLS label. Other bits are zero:
3565 </p>
3566
3567 <diagram>
3568 <header name="OXM_OF_MPLS_LABEL">
3569 <bits name="zero" above="12" below="0" width=".6"/>
3570 <bits name="label" above="20" width="1.0"/>
3571 </header>
3572 </diagram>
3573
3574 <p>
3575 Most label values are available for any use by deployments.
3576 Values under 16 are reserved.
3577 </p>
3578 </field>
3579
3580 <field id="MFF_MPLS_TC" title="MPLS Traffic Class">
3581 <p>
3582 The least significant 3 bits hold the TC field from the MPLS
3583 label. Other bits are zero:
3584 </p>
3585
3586 <diagram>
3587 <header name="OXM_OF_MPLS_TC">
3588 <bits name="zero" above="5" below="0" width="1.0"/>
3589 <bits name="TC" above="3" width=".6"/>
3590 </header>
3591 </diagram>
3592
3593 <p>
3594 This field is intended for use for Quality of Service (QoS)
3595 and Explicit Congestion Notification purposes, but its
3596 particular interpretation is deployment specific.
3597 </p>
3598
3599 <p>
3600 Before 2009, this field was named EXP and reserved for
3601 experimental use [RFC 5462].
3602 </p>
3603 </field>
3604
3605 <field id="MFF_MPLS_BOS" title="MPLS Bottom of Stack">
3606 <p>
3607 The least significant bit holds the BOS field from the MPLS
3608 label. Other bits are zero:
3609 </p>
3610
3611 <diagram>
3612 <header name="OXM_OF_MPLS_BOS">
3613 <bits name="zero" above="7" below="0" width="1.3"/>
3614 <bits name="BOS" above="1" width=".3"/>
3615 </header>
3616 </diagram>
3617
3618 <p>
3619 This field is useful as part of processing a series of incoming MPLS
3620 labels. A flow that includes a <code>pop_mpls</code> action should
3621 generally match on <ref field="mpls_bos"/>:
3622 </p>
3623
3624 <ul>
3625 <li>
3626 When <ref field="mpls_bos"/> is 0, there is another MPLS label
3627 following this one, so the Ethertype passed to <code>pop_mpls</code>
3628 should be an MPLS Ethertype. For example: <code>table=0,
3629 dl_type=0x8847, mpls_bos=0, actions=pop_mpls:0x8847,
3630 goto_table:1</code>
3631 </li>
3632
3633 <li>
3634 When <ref field="mpls_bos"/> is 1, this MPLS label is the last one,
3635 so the Ethertype passed to <code>pop_mpls</code> should be a non-MPLS
3636 Ethertype such as IPv4. For example: <code>table=1, dl_type=0x8847,
3637 mpls_bos=1, actions=pop_mpls:0x0800, goto_table:2</code>
3638 </li>
3639 </ul>
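
<p>
Because a BOS bit of 1 marks the last label in the stack, the choice of
Ethertype for <code>pop_mpls</code> can be sketched as a small Python helper
that formats flows in the style of the examples above. The function
<code>pop_mpls_flow</code> is hypothetical, and the inner Ethertype must be
supplied by the caller because the label itself does not identify the
protocol it encapsulates:
</p>

<pre><![CDATA[
```python
MPLS_UNICAST = 0x8847
IPV4 = 0x0800


def pop_mpls_flow(table, bos, inner_ethertype=IPV4):
    """Return a flow string that pops one MPLS label in the given table.

    bos is the mpls_bos value to match.  inner_ethertype is used only when
    bos is 1, that is, when the popped label is the last one in the stack.
    """
    next_type = inner_ethertype if bos else MPLS_UNICAST
    return (
        "table=%d, dl_type=0x%04x, mpls_bos=%d, "
        "actions=pop_mpls:0x%04x, goto_table:%d"
        % (table, MPLS_UNICAST, bos, next_type, table + 1)
    )


print(pop_mpls_flow(0, bos=0))
print(pop_mpls_flow(1, bos=1))
```
]]></pre>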
3640 </field>
3641
3642 <field id="MFF_MPLS_TTL" title="MPLS Time-to-Live">
3643 <p>
3644 Holds the 8-bit time-to-live field from the MPLS label:
3645 </p>
3646
3647 <diagram>
3648 <header name="NXM_NX_MPLS_TTL">
3649 <bits name="TTL" above="8" width=".4"/>
3650 </header>
3651 </diagram>
3652 </field>
3653 </group>
3654
3655 <group title="Layer 3: IPv4 and IPv6">
3656 <h2>IPv4 Specific Fields</h2>
3657
3658 <p>
3659 These fields are applicable only to IPv4 flows, that is, flows that match
3660 on the IPv4 Ethertype <code>0x0800</code>.
3661 </p>
3662
3663 <field id="MFF_IPV4_SRC" title="IPv4 Source Address">
3664 <p>
3665 The source address from the IPv4 header:
3666 </p>
3667
3668 <diagram>
3669 <header name="Ethernet">
3670 <bits name="dst" above="48" width="0.4"/>
3671 <bits name="src" above="48" width="0.4"/>
3672 <bits name="type" above="16" below="0x800" width="0.4"/>
3673 </header>
3674 <header name="IPv4">
3675 <bits name="..." width="0.4"/>
3676 <bits name="proto" above="8" width="0.4"/>
3677 <bits name="src" above="32" width="0.4" fill="yes"/>
3678 <bits name="dst" above="32" width="0.4"/>
3679 </header>
3680 <dots/>
3681 </diagram>
3682
3683 <p>
3684 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3685 matches on <code>nw_src</code> as actually referring to the ARP SPA.
3686 </p>
3687 </field>
3688
3689 <field id="MFF_IPV4_DST" title="IPv4 Destination Address">
3690 <p>
3691 The destination address from the IPv4 header:
3692 </p>
3693
3694 <diagram>
3695 <header name="Ethernet">
3696 <bits name="dst" above="48" width="0.4"/>
3697 <bits name="src" above="48" width="0.4"/>
3698 <bits name="type" above="16" below="0x800" width="0.4"/>
3699 </header>
3700 <header name="IPv4">
3701 <bits name="..." width="0.4"/>
3702 <bits name="proto" above="8" width="0.4"/>
3703 <bits name="src" above="32" width="0.4"/>
3704 <bits name="dst" above="32" width="0.4" fill="yes"/>
3705 </header>
3706 <dots/>
3707 </diagram>
3708
3709 <p>
3710 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3711 matches on <code>nw_dst</code> as actually referring to the ARP TPA.
3712 </p>
3713 </field>
3714
3715 <h2>IPv6 Specific Fields</h2>
3716
3717 <p>
3718 These fields apply only to IPv6 flows, that is, flows that match
3719 on the IPv6 Ethertype <code>0x86dd</code>.
3720 </p>
3721
3722 <field id="MFF_IPV6_SRC" title="IPv6 Source Address">
3723 <p>
3724 The source address from the IPv6 header:
3725 </p>
3726
3727 <diagram>
3728 <header name="Ethernet">
3729 <bits name="dst" above="48" width="0.4"/>
3730 <bits name="src" above="48" width="0.4"/>
3731 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3732 </header>
3733 <header name="IPv6">
3734 <bits name="..." width="0.4"/>
3735 <bits name="next" above="8" width="0.3"/>
3736 <bits name="src" above="128" width="0.8" fill="yes"/>
3737 <bits name="dst" above="128" width="0.8"/>
3738 </header>
3739 <dots/>
3740 </diagram>
3741
3742 <p>
3743 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3744 supported only CIDR masks.
3745 </p>
3746 </field>
3747 <field id="MFF_IPV6_DST" title="IPv6 Destination Address">
3748 <p>
3749 The destination address from the IPv6 header:
3750 </p>
3751 <diagram>
3752 <header name="Ethernet">
3753 <bits name="dst" above="48" width="0.4"/>
3754 <bits name="src" above="48" width="0.4"/>
3755 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3756 </header>
3757 <header name="IPv6">
3758 <bits name="..." width="0.4"/>
3759 <bits name="next" above="8" width="0.3"/>
3760 <bits name="src" above="128" width="0.8"/>
3761 <bits name="dst" above="128" width="0.8" fill="yes"/>
3762 </header>
3763 <dots/>
3764 </diagram>
3765
3766 <p>
3767 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3768 supported only CIDR masks.
3769 </p>
3770 </field>
3771 <field id="MFF_IPV6_LABEL" title="IPv6 Flow Label">
3772 <p>
3773 The least significant 20 bits hold the flow label field from
3774 the IPv6 header. Other bits are zero:
3775 </p>
3776
3777 <diagram>
3778 <header name="OXM_OF_IPV6_FLABEL">
3779 <bits name="zero" above="12" below="0" width=".6"/>
3780 <bits name="label" above="20" width="1.0"/>
3781 </header>
3782 </diagram>
3783 </field>
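
<p>
As an illustration of where the flow label comes from, the Python sketch
below extracts it from the first 32 bits of an IPv6 header, which hold the
4-bit version, the 8-bit traffic class, and the 20-bit flow label. The
function name <code>ipv6_flow_label</code> is hypothetical:
</p>

<pre><![CDATA[
```python
import struct


def ipv6_flow_label(header):
    """Return the 20-bit flow label from the start of an IPv6 header."""
    (word,) = struct.unpack("!I", header[:4])
    # Layout of the first word: version (4), traffic class (8), flow label (20).
    return word & 0xFFFFF


print(hex(ipv6_flow_label(bytes.fromhex("6ff12345"))))
```
]]></pre>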
3784
3785 <h2>IPv4/IPv6 Fields</h2>
3786
3787 <p>
3788 These fields exist with at least approximately the same meaning in both
3789 IPv4 and IPv6, so they are treated as a single field for matching
3790 purposes. Any flow that matches on the IPv4 Ethertype
3791 <code>0x0800</code> or the IPv6 Ethertype <code>0x86dd</code> may match
3792 on these fields.
3793 </p>
3794
3795 <field id="MFF_IP_PROTO" title="IPv4/v6 Protocol">
3796 <p>
3797 Matches the IPv4 or IPv6 protocol type.
3798 </p>
3799
3800 <p>
3801 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3802 matches on <code>nw_proto</code> as actually referring to the ARP
3803 opcode. The ARP opcode is a 16-bit field, so for matching purposes ARP
3804 opcodes greater than 255 are treated as 0; this works adequately
3805 because in practice ARP and RARP only use opcodes 1 through 4.
3806 </p>
3807 </field>
3808
3809 <field id="MFF_IP_TTL" title="IPv4/v6 TTL/Hop Limit">
3810 The main reason to match on the TTL or hop limit field is to detect
3811 whether a <code>dec_ttl</code> action will fail due to a TTL exceeded
3812 error. Another way that a controller can detect TTL exceeded is to
3813 listen for <code>OFPR_INVALID_TTL</code> ``packet-in'' messages via
3814 OpenFlow.
3815 </field>
3816
3817 <field id="MFF_IP_FRAG" title="IPv4/v6 Fragment Bitmask">
3818 <p>
3819 Specifies what kinds of IP fragments or non-fragments to match. The
3820 value for this field is most conveniently specified as one of the
3821 following:
3822 </p>
3823
3824 <dl>
3825 <dt><code>no</code></dt>
3826 <dd>
3827 Matches only non-fragmented packets.
3828 </dd>
3829
3830 <dt><code>yes</code></dt>
3831 <dd>
3832 Matches all fragments.
3833 </dd>
3834
3835 <dt><code>first</code></dt>
3836 <dd>
3837 Matches only fragments with offset 0.
3838 </dd>
3839
3840 <dt><code>later</code></dt>
3841 <dd>
3842 Matches only fragments with nonzero offset.
3843 </dd>
3844
3845 <dt><code>not_later</code></dt>
3846 <dd>
3847 Matches non-fragmented packets and fragments with zero offset.
3848 </dd>
3849 </dl>
3850
3851 <p>
3852 The field is internally formatted as 2 bits: bit 0 is 1 for an IP
3853 fragment with any offset (and otherwise 0), and bit 1 is 1 for an IP
3854 fragment with nonzero offset (and otherwise 0), like so:
3855 </p>
3856
3857 <diagram>
3858 <header name="NXM_NX_IP_FRAG">
3859 <bits name="zero" above="6" below="0" width=".9"/>
3860 <bits name="later" above="1" width=".3"/>
3861 <bits name="any" above="1" width=".3"/>
3862 </header>
3863 </diagram>
3864
3865 <p>
3866 Even though 2 bits have 4 possible values, this field only uses 3 of
3867 them:
3868 </p>
3869
3870 <ul>
3871 <li>
3872 A packet that is not an IP fragment has value 0.
3873 </li>
3874
3875 <li>
3876 A packet that is an IP fragment with offset 0 (the first fragment)
3877 has bit 0 set and thus value 1.
3878 </li>
3879
3880 <li>
3881 A packet that is an IP fragment with nonzero offset has bits 0 and 1
3882 set and thus value 3.
3883 </li>
3884 </ul>
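
<p>
The three values, and the symbolic names listed earlier, can be sketched in
Python as follows. The helper names are hypothetical, the fragment test uses
the IPv4 ``more fragments'' flag and fragment offset, and the value and mask
pairs are one encoding derived from the descriptions above rather than copied
from the Open vSwitch source:
</p>

<pre><![CDATA[
```python
FRAG_ANY = 1    # bit 0: the packet is a fragment
FRAG_LATER = 2  # bit 1: the fragment has a nonzero offset


def ip_frag_value(more_fragments, offset):
    """Return the 2-bit field value for an IPv4 packet."""
    if not more_fragments and offset == 0:
        return 0                      # not a fragment
    if offset == 0:
        return FRAG_ANY               # first fragment
    return FRAG_ANY | FRAG_LATER      # later fragment


# Symbolic match names as (value, mask) pairs.
FRAG_MATCHES = {
    "no": (0, FRAG_ANY),
    "yes": (FRAG_ANY, FRAG_ANY),
    "first": (FRAG_ANY, FRAG_ANY | FRAG_LATER),
    "later": (FRAG_LATER, FRAG_LATER),
    "not_later": (0, FRAG_LATER),
}


def frag_matches(name, field):
    """True if a packet with the given field value matches the named choice."""
    value, mask = FRAG_MATCHES[name]
    return (field & mask) == value


first = ip_frag_value(more_fragments=True, offset=0)
print(first, frag_matches("first", first), frag_matches("later", first))
```
]]></pre>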
3885
3886 <p>
3887 The switch may reject matches against values that can never appear.
3888 </p>
3889
3890 <p>
3891 It is important to understand how this field interacts with the
3892 OpenFlow fragment handling mode:
3893 </p>
3894
3895 <ul>
3896 <li>
3897 In <code>OFPC_FRAG_DROP</code> mode, the OpenFlow switch drops all IP
3898 fragments before they reach the flow table, so every packet that is
3899 available for matching will have value 0 in this field.
3900 </li>
3901
3902 <li>
3903 Open vSwitch does not implement <code>OFPC_FRAG_REASM</code> mode,
3904 but if it did then IP fragments would be reassembled before they
3905 reached the flow table and again every packet available for matching
3906 would always have value 0.
3907 </li>
3908
3909 <li>
3910 In <code>OFPC_FRAG_NORMAL</code> mode, all three values are possible,
3911 but OpenFlow 1.0 says that fragments' transport ports are always 0,
3912 even for the first fragment, so this does not provide much extra
3913 information.
3914 </li>
3915
3916 <li>
3917 In <code>OFPC_FRAG_NX_MATCH</code> mode, all three values are
3918 possible. For fragments with offset 0, Open vSwitch makes L4 header
3919 information available.
3920 </li>
3921 </ul>
3922
3923 <p>
3924 Thus, this field is likely to be most useful for an Open vSwitch switch
3925 configured in <code>OFPC_FRAG_NX_MATCH</code> mode. See the
3926 description of the <code>set-frags</code> command in
3927 <code>ovs-ofctl</code>(8) for more details.
3928 </p>
3929 </field>
3930
3931 <h3>IPv4/IPv6 TOS Fields</h3>
3932
3933 <p>
3934 IPv4 and IPv6 contain a one-byte ``type of service'' or TOS field that
3935 has the following format:
3936 </p>
3937
3938 <diagram>
3939 <header name="type of service">
3940 <bits name="DSCP" above="6" width=".9"/>
3941 <bits name="ECN" above="2" width=".3"/>
3942 </header>
3943 </diagram>
3944
3945 <field id="MFF_IP_DSCP" title="IPv4/v6 DSCP (Bits 2-7)">
3946 <p>
3947 This field is the TOS byte with the two ECN bits cleared to 0:
3948 </p>
3949
3950 <diagram>
3951 <header name="NXM_OF_IP_TOS">
3952 <bits name="DSCP" above="6" width=".9"/>
3953 <bits name="zero" above="2" below="0" width=".3"/>
3954 </header>
3955 </diagram>
3956 </field>
3957 <field id="MFF_IP_DSCP_SHIFTED" title="IPv4/v6 DSCP (Bits 0-5)">
3958 <p>
3959 This field is the TOS byte shifted right to put the DSCP bits in the
3960 6 least-significant bits:
3961 </p>
3962
3963 <diagram>
3964 <header name="OXM_OF_IP_DSCP">
3965 <bits name="zero" above="2" below="0" width=".3"/>
3966 <bits name="DSCP" above="6" width=".9"/>
3967 </header>
3968 </diagram>
3969 </field>
3970 <field id="MFF_IP_ECN" title="IPv4/v6 ECN">
3971 <p>
3972 This field is the TOS byte with the DSCP bits cleared to 0:
3973 </p>
3974
3975 <diagram>
3976 <header name="OXM_OF_IP_ECN">
3977 <bits name="zero" above="6" below="0" width=".9"/>
3978 <bits name="ECN" above="2" width=".35"/>
3979 </header>
3980 </diagram>
3981 </field>
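
<p>
The relationship between the three TOS-derived fields can be sketched in
Python. The function name <code>split_tos</code> and the dictionary keys are
labels chosen for this illustration only:
</p>

<pre><![CDATA[
```python
def split_tos(tos):
    """Return the three field views of a one-byte TOS value."""
    return {
        "dscp_bits_2_7": tos & 0xFC,  # DSCP in place, ECN bits cleared
        "dscp_bits_0_5": tos >> 2,    # DSCP shifted into the low 6 bits
        "ecn": tos & 0x03,            # ECN only, DSCP bits cleared
    }


# TOS 0xb9 is DSCP 46 with ECN 1.
print(split_tos(0xB9))
```
]]></pre>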
3982
3983 </group>
3984
3985 <group title="Layer 3: ARP">
3986 <p>
3987 In theory, Address Resolution Protocol, or ARP, is a
3988 generic protocol that can be used to obtain the hardware address that
3989 corresponds to any higher-level protocol address. In contemporary usage,
3990 ARP is used only in Ethernet networks to obtain the Ethernet address for
3991 a given IPv4 address. OpenFlow and Open vSwitch only support this usage
3992 of ARP. For this use case, an ARP packet has the following format, with
3993 the ARP fields exposed as Open vSwitch fields highlighted:
3994 </p>
3995
3996 <diagram>
3997 <header name="Ethernet">
3998 <bits name="dst" above="48" width="0.4"/>
3999 <bits name="src" above="48" width="0.4"/>
4000 <bits name="type" above="16" below="0x806" width="0.4"/>
4001 </header>
4002 <header name="ARP">
4003 <bits name="hrd" above="16" below="1" width=".3"/>
4004 <bits name="pro" above="16" below="0x800" width=".3"/>
4005 <bits name="hln" above="8" below="6" width=".2"/>
4006 <bits name="pln" above="8" below="4" width=".2"/>
4007 <bits name="op" above="16" width=".2" fill="yes"/>
4008 <bits name="sha" above="48" width="0.5" fill="yes"/>
4009 <bits name="spa" above="32" width="0.3" fill="yes"/>
4010 <bits name="tha" above="48" width="0.5" fill="yes"/>
4011 <bits name="tpa" above="32" width="0.3" fill="yes"/>
4012 </header>
4013 </diagram>
4014
4015 <p>
4016 The ARP fields are also used for RARP, the Reverse Address Resolution
4017 Protocol, which shares ARP's wire format.
4018 </p>
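
<p>
For the Ethernet and IPv4 case described above, the ARP body is 28 bytes
long. The following Python sketch decodes it, assuming 6-byte hardware
addresses and 4-byte protocol addresses as given by the <code>hln</code> and
<code>pln</code> values in the diagram. The helper names
<code>parse_arp</code> and <code>format_mac</code> are hypothetical:
</p>

<pre><![CDATA[
```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"  # hrd, pro, hln, pln, op, sha, spa, tha, tpa


def format_mac(raw):
    """Format 6 raw bytes as a colon-separated Ethernet address."""
    return ":".join("%02x" % byte for byte in raw)


def parse_arp(payload):
    """Decode the Ethernet/IPv4 ARP body that follows Ethertype 0x0806."""
    size = struct.calcsize(ARP_FORMAT)  # 28 bytes
    hrd, pro, hln, pln, op, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:size])
    if (hrd, pro, hln, pln) != (1, 0x0800, 6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP packet")
    return {
        "arp_op": op,
        "arp_sha": format_mac(sha),
        "arp_spa": socket.inet_ntoa(spa),
        "arp_tha": format_mac(tha),
        "arp_tpa": socket.inet_ntoa(tpa),
    }


request = struct.pack(
    ARP_FORMAT, 1, 0x0800, 6, 4, 1,
    bytes.fromhex("001122334455"), socket.inet_aton("10.0.0.1"),
    bytes(6), socket.inet_aton("10.0.0.2"))
print(parse_arp(request))
```
]]></pre>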
4019
4020 <field id="MFF_ARP_OP" title="ARP Opcode">
4021 Even though this is a 16-bit field, Open vSwitch does not support ARP
4022 opcodes greater than 255; it treats them as zero. This works adequately
4023 because in practice ARP and RARP only use opcodes 1 through 4.
4024 </field>
4025
4026 <field id="MFF_ARP_SPA" title="ARP Source IPv4 Address"/>
4027 <field id="MFF_ARP_TPA" title="ARP Target IPv4 Address"/>
4028 <field id="MFF_ARP_SHA" title="ARP Source Ethernet Address"/>
4029 <field id="MFF_ARP_THA" title="ARP Target Ethernet Address"/>
4030 </group>
4031
4032 <group title="Layer 4: TCP, UDP, and SCTP">
4033 <p>
4034 For matching purposes, it makes no difference whether these protocols are
4035 encapsulated within IPv4 or IPv6.
4036 </p>
4037
4038 <h2>TCP</h2>
4039
4040 <p>
4041 The following diagram shows TCP within IPv4. Open vSwitch also supports
4042 TCP in IPv6. Only TCP fields implemented as Open vSwitch fields are
4043 shown:
4044 </p>
4045
4046 <diagram>
4047 <header name="Ethernet">
4048 <bits name="dst" above="48" width="0.4"/>
4049 <bits name="src" above="48" width="0.4"/>
4050 <bits name="type" above="16" below="0x800" width="0.4"/>
4051 </header>
4052 <header name="IPv4">
4053 <bits name="..." width="0.4"/>
4054 <bits name="proto" above="8" below="6" width="0.3"/>
4055 <bits name="src" above="32" width="0.4"/>
4056 <bits name="dst" above="32" width="0.4"/>
4057 </header>
4058 <header name="TCP">
4059 <bits name="src" above="16" width=".2"/>
4060 <bits name="dst" above="16" width=".2"/>
4061 <bits name="..." width=".75"/>
4062 <bits name="flags" above="12" width=".3"/>
4063 <bits name="..." width=".6"/>
4064 </header>
4065 <dots/>
4066 </diagram>
4067 <field id="MFF_TCP_SRC" title="TCP Source Port">
4068 Open vSwitch 1.6 added support for bitwise matching.
4069 </field>
4070 <field id="MFF_TCP_DST" title="TCP Destination Port">
4071 Open vSwitch 1.6 added support for bitwise matching.
4072 </field>
4073 <field id="MFF_TCP_FLAGS" title="TCP Flags">
4074 <p>
4075 This field holds the TCP flags. TCP currently defines 9 flag bits. An
4076 additional 3 bits are reserved. For more information, see [RFC 793],
4077 [RFC 3168], and [RFC 3540].
4078 </p>
4079
4080 <p>
4081 Matches on this field are most conveniently written in terms of
4082 symbolic names (given in the diagram below), each preceded by either
4083 <code>+</code> for a flag that must be set, or <code>-</code> for a
4084 flag that must be unset, without any other delimiters between the
4085 flags. Flags not mentioned are wildcarded. For example,
4086 <code>tcp,tcp_flags=+syn-ack</code> matches TCP SYNs that are not ACKs,
4087 and <code>tcp,tcp_flags=+[200]</code> matches TCP packets with the
4088 reserved [200] flag set. Matches can also be written as
4089 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
4090 and <var>mask</var> are 16-bit numbers in decimal or in hexadecimal
4091 prefixed by <code>0x</code>.
4092 </p>
4093
4094 <p>
4095 The flag bits are:
4096 </p>
4097
4098 <diagram>
4099 <header>
4100 <bits name="zero" above="4" below="0" width=".9"/>
4101 </header>
4102 <nospace/>
4103 <header name="reserved">
4104 <bits name="[800]" above="1" width=".35"/>
4105 <bits name="[400]" above="1" width=".35"/>
4106 <bits name="[200]" above="1" width=".35"/>
4107 </header>
4108 <nospace/>
4109 <header name="later RFCs">
4110 <bits name="NS" above="1" width=".35"/>
4111 <bits name="CWR" above="1" width=".35"/>
4112 <bits name="ECE" above="1" width=".35"/>
4113 </header>
4114 <nospace/>
4115 <header name="RFC 793">
4116 <bits name="URG" above="1" width=".35"/>
4117 <bits name="ACK" above="1" width=".35"/>
4118 <bits name="PSH" above="1" width=".35"/>
4119 <bits name="RST" above="1" width=".35"/>
4120 <bits name="SYN" above="1" width=".35"/>
4121 <bits name="FIN" above="1" width=".35"/>
4122 </header>
4123 </diagram>
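
<p>
The symbolic syntax can be related to the
<code><var>flags</var>/<var>mask</var></code> form with a short Python
sketch. The bit values follow the diagram above, with FIN as the least
significant bit. The function <code>parse_tcp_flags</code> is a hypothetical
helper and is not the parser that <code>ovs-ofctl</code> uses:
</p>

<pre><![CDATA[
```python
import re

TCP_FLAGS = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "psh": 0x008,
    "ack": 0x010, "urg": 0x020, "ece": 0x040, "cwr": 0x080,
    "ns": 0x100, "[200]": 0x200, "[400]": 0x400, "[800]": 0x800,
}

# One token: a sign followed by a flag name or a bracketed reserved bit.
TOKEN = re.compile(r"([+-])(\[[0-9]+\]|[a-z]+)")


def parse_tcp_flags(text):
    """Convert a string such as '+syn-ack' to a (flags, mask) pair."""
    flags = mask = 0
    pos = 0
    while pos != len(text):
        token = TOKEN.match(text, pos)
        if token is None or token.group(2) not in TCP_FLAGS:
            raise ValueError("bad tcp_flags syntax: %r" % text)
        bit = TCP_FLAGS[token.group(2)]
        mask |= bit              # every flag mentioned is matched exactly
        if token.group(1) == "+":
            flags |= bit         # '+' requires the flag to be set
        pos = token.end()
    return flags, mask


for text in ("+syn-ack", "+[200]"):
    print(text, "0x%03x/0x%03x" % parse_tcp_flags(text))
```
]]></pre>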
4124 </field>
4125
4126 <h2>UDP</h2>
4127
4128 <p>
4129 The following diagram shows UDP within IPv4. Open vSwitch also supports
4130 UDP in IPv6. Only UDP fields that Open vSwitch exposes as fields are
4131 shown:
4132 </p>
4133
4134 <diagram>
4135 <header name="Ethernet">
4136 <bits name="dst" above="48" width="0.4"/>
4137 <bits name="src" above="48" width="0.4"/>
4138 <bits name="type" above="16" below="0x800" width="0.4"/>
4139 </header>
4140 <header name="IPv4">
4141 <bits name="..." width="0.4"/>
4142 <bits name="proto" above="8" below="17" width="0.3"/>
4143 <bits name="src" above="32" width="0.4"/>
4144 <bits name="dst" above="32" width="0.4"/>
4145 </header>
4146 <header name="UDP">
4147 <bits name="src" above="16" width=".2"/>
4148 <bits name="dst" above="16" width=".2"/>
4149 <bits name="..." width=".4"/>
4150 </header>
4151 <dots/>
4152 </diagram>
4153 <field id="MFF_UDP_SRC" title="UDP Source Port"/>
4154 <field id="MFF_UDP_DST" title="UDP Destination Port"/>
4155
4156 <h2>SCTP</h2>
4157
4158 <p>
4159 The following diagram shows SCTP within IPv4. Open vSwitch also supports
4160 SCTP in IPv6. Only SCTP fields that Open vSwitch exposes as fields are
4161 shown:
4162 </p>
4163
4164 <diagram>
4165 <header name="Ethernet">
4166 <bits name="dst" above="48" width="0.4"/>
4167 <bits name="src" above="48" width="0.4"/>
4168 <bits name="type" above="16" below="0x800" width="0.4"/>
4169 </header>
4170 <header name="IPv4">
4171 <bits name="..." width="0.4"/>
4172 <bits name="proto" above="8" below="132" width="0.3"/>
4173 <bits name="src" above="32" width="0.4"/>
4174 <bits name="dst" above="32" width="0.4"/>
4175 </header>
4176 <header name="SCTP">
4177 <bits name="src" above="16" width=".2"/>
4178 <bits name="dst" above="16" width=".2"/>
4179 <bits name="..." width=".8"/>
4180 </header>
4181 <dots/>
4182 </diagram>
4183 <field id="MFF_SCTP_SRC" title="SCTP Source Port"/>
4184 <field id="MFF_SCTP_DST" title="SCTP Destination Port"/>
4185 </group>
4186
4187 <group title="Layer 4: ICMPv4 and ICMPv6">
4188 <h2>ICMPv4</h2>
4189 <diagram>
4190 <header name="Ethernet">
4191 <bits name="dst" above="48" width="0.4"/>
4192 <bits name="src" above="48" width="0.4"/>
4193 <bits name="type" above="16" below="0x800" width="0.4"/>
4194 </header>
4195 <header name="IPv4">
4196 <bits name="..." width="0.4"/>
4197 <bits name="proto" above="8" below="1" width="0.3"/>
4198 <bits name="src" above="32" width="0.4"/>
4199 <bits name="dst" above="32" width="0.4"/>
4200 </header>
4201 <header name="ICMPv4">
4202 <bits name="type" above="8" width=".3"/>
4203 <bits name="code" above="8" width=".3"/>
4204 <bits name="..." width=".8"/>
4205 </header>
4206 <dots/>
4207 </diagram>
4208 <field id="MFF_ICMPV4_TYPE" title="ICMPv4 Type">
4209 <p>
4210 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4211 matches on <code>tp_src</code> as actually referring to the ICMP type.
4212 </p>
4213 </field>
4214 <field id="MFF_ICMPV4_CODE" title="ICMPv4 Code">
4215 <p>
4216 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4217 matches on <code>tp_dst</code> as actually referring to the ICMP code.
4218 </p>
4219 </field>
4220
4221 <h2>ICMPv6</h2>
4222 <diagram>
4223 <header name="Ethernet">
4224 <bits name="dst" above="48" width="0.4"/>
4225 <bits name="src" above="48" width="0.4"/>
4226 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4227 </header>
4228 <header name="IPv6">
4229 <bits name="..." width="0.2"/>
4230 <bits name="next" above="8" below="58" width="0.3"/>
4231 <bits name="src" above="128" width="0.4"/>
4232 <bits name="dst" above="128" width="0.4"/>
4233 </header>
4234 <header name="ICMPv6">
4235 <bits name="type" above="8" width=".3"/>
4236 <bits name="code" above="8" width=".3"/>
4237 <bits name="..." width=".8"/>
4238 </header>
4239 <dots/>
4240 </diagram>
4241 <field id="MFF_ICMPV6_TYPE" title="ICMPv6 Type"/>
4242 <field id="MFF_ICMPV6_CODE" title="ICMPv6 Code"/>
4243
4244 <h2>ICMPv6 Neighbor Discovery</h2>
4245 <diagram>
4246 <header name="Ethernet">
4247 <bits name="dst" above="48" width="0.4"/>
4248 <bits name="src" above="48" width="0.4"/>
4249 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4250 </header>
4251 <header name="IPv6">
4252 <bits name="..." width="0.2"/>
4253 <bits name="next" above="8" below="58" width="0.3"/>
4254 <bits name="src" above="128" width="0.4"/>
4255 <bits name="dst" above="128" width="0.4"/>
4256 </header>
4257 <header name="ICMPv6">
4258 <bits name="type" above="8" below="135/136" width=".3"/>
4259 <bits name="code" above="8" below="0" width=".3"/>
4260 <bits name="..." width=".8"/>
4261 </header>
4262 <header name="ICMPv6 ND">
4263 <bits name="target" above="128" width=".4"/>
4264 <bits name="option ..." width=".6"/>
4265 </header>
4266 </diagram>
4267 <field id="MFF_ND_TARGET" title="ICMPv6 Neighbor Discovery Target IPv6"/>
4268 <field id="MFF_ND_SLL"
4269 title="ICMPv6 Neighbor Discovery Source Ethernet Address"/>
4270 <field id="MFF_ND_TLL"
4271 title="ICMPv6 Neighbor Discovery Target Ethernet Address"/>
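<p>
The neighbor discovery layout in the diagram above can be sketched in Python
as follows. This is only an illustration, not Open vSwitch's parser. The
function name and the dictionary keys are chosen for this example, and the
option details (option type 1 for the source link-layer address, type 2 for
the target link-layer address, option lengths counted in units of 8 octets)
are taken from RFC 4861 rather than from this document. The input is assumed
to start at the ICMPv6 type byte.
</p>

```python
import ipaddress

ND_NEIGHBOR_SOLICIT = 135
ND_NEIGHBOR_ADVERT = 136
ND_OPT_SOURCE_LL = 1  # source link-layer address option (RFC 4861)
ND_OPT_TARGET_LL = 2  # target link-layer address option (RFC 4861)


def parse_nd(icmp6):
    """Parse an ICMPv6 ND message that starts at the ICMPv6 type byte.

    Returns a dict with illustrative keys, or None if the message is not
    a neighbor solicitation or advertisement with code 0.
    """
    is_nd = (len(icmp6) >= 24
             and icmp6[0] in (ND_NEIGHBOR_SOLICIT, ND_NEIGHBOR_ADVERT)
             and icmp6[1] == 0)
    if not is_nd:
        return None

    # Type, code, checksum, and 4 more bytes precede the 128-bit target.
    target = ipaddress.IPv6Address(bytes(icmp6[8:24]))
    fields = {"nd_target": str(target), "nd_sll": None, "nd_tll": None}

    if icmp6[0] == ND_NEIGHBOR_SOLICIT:
        wanted, key = ND_OPT_SOURCE_LL, "nd_sll"
    else:
        wanted, key = ND_OPT_TARGET_LL, "nd_tll"

    pos = 24
    while len(icmp6) >= pos + 8:
        opt_type = icmp6[pos]
        opt_len = icmp6[pos + 1] * 8   # length field is in 8-octet units
        if opt_len == 0 or pos + opt_len > len(icmp6):
            break                      # malformed option: stop scanning
        if opt_type == wanted and opt_len == 8:
            fields[key] = bytes(icmp6[pos + 2:pos + 8]).hex(":")
        pos += opt_len
    return fields


# A hand-built neighbor solicitation for fe80::1 with one source
# link-layer address option.
SAMPLE_NS = (
    bytes([135, 0, 0, 0, 0, 0, 0, 0])
    + ipaddress.IPv6Address("fe80::1").packed
    + bytes([1, 1]) + bytes.fromhex("020000000001")
)

print(parse_nd(SAMPLE_NS))
```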
4272 </group>
4273
4274 <h1>References</h1>
4275
4276 <dl>
4277 <dt>Casado</dt>
4278 <dd>
4279 M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and
4280 S. Shenker, ``Ethane: Taking Control of the Enterprise,''
4281 ACM SIGCOMM Computer Communication Review, October 2007.
4282 </dd>
4283
4284 <dt>EXT-56</dt>
4285 <dd>
4286 J. Tonsing, ``Permit one of a set of prerequisites to apply, e.g. don't
4287 preclude non-Ethernet media,'' <url
4288 href="https://rs.opennetworking.org/bugs/browse/EXT-56"/> (ONF
4289 members only).
4290 </dd>
4291
4292 <dt>EXT-112</dt>
4293 <dd>
4294 J. Tourrilhes, ``Support non-Ethernet packets throughout the
4295 pipeline,'' <url
4296 href="https://rs.opennetworking.org/bugs/browse/EXT-112"/> (ONF
4297 members only).
4298 </dd>
4299
4300 <dt>EXT-134</dt>
4301 <dd>
4302 J. Tourrilhes, ``Match first nibble of the MPLS payload,'' <url
4303 href="https://rs.opennetworking.org/bugs/browse/EXT-134"/> (ONF
4304 members only).
4305 </dd>
4306
4307 <dt>Geneve</dt>
4308 <dd>
4309 J. Gross, I. Ganga, and T. Sridhar, editors, ``Geneve: Generic Network
4310 Virtualization Encapsulation,'' <url
4311 href="https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/"/>.
4312 </dd>
4313
4314 <dt>IEEE OUI</dt>
4315 <dd>
4316 IEEE Standards Association, ``MAC Address Block Large (MA-L),''
4317 <url
4318 href="https://standards.ieee.org/develop/regauth/oui/index.html"/>.
4319 </dd>
4320
4321 <dt>NSH</dt>
4322 <dd>
4323 P. Quinn and U. Elzur, editors, ``Network Service Header,'' <url
4324 href="https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/"/>.
4325 </dd>
4326
4327 <dt>OpenFlow 1.0.1</dt>
4328 <dd>
4329 Open Networking Foundation, ``OpenFlow Switch Errata, Version
4330 1.0.1,'' June 2012.
4331 </dd>
4332
4333 <dt>OpenFlow 1.1</dt>
4334 <dd>
4335 OpenFlow Consortium, ``OpenFlow Switch Specification Version
4336 1.1.0 Implemented (Wire Protocol 0x02),'' February 2011.
4337 </dd>
4338
4339 <dt>OpenFlow 1.5</dt>
4340 <dd>
4341 Open Networking Foundation, ``OpenFlow Switch Specification Version
4342 1.5.0 (Protocol version 0x06),'' December 2014.
4343 </dd>
4344
4345 <dt>OpenFlow Extensions 1.3.x Package 2</dt>
4346 <dd>
4347 Open Networking Foundation, ``OpenFlow Extensions 1.3.x Package 2,''
4348 December 2013.
4349 </dd>
4350
4351 <dt>TCP Flags Match Field Extension</dt>
4352 <dd>
4353 Open Networking Foundation, ``TCP flags match field Extension,'' December
4354 2014. In [OpenFlow Extensions 1.3.x Package 2].
4355 </dd>
4356
4357 <dt>Pepelnjak</dt>
4358 <dd>
4359 I. Pepelnjak, ``OpenFlow and Fermi Estimates,'' <url
4360 href="http://blog.ipspace.net/2013/09/openflow-and-fermi-estimates.html"/>.
4361 </dd>
4362
4363 <dt>RFC 793</dt>
4364 <dd>
4365 ``Transmission Control Protocol,'' <url
4366 href="http://www.ietf.org/rfc/rfc793.txt"/>.
4367 </dd>
4368
4369 <dt>RFC 3032</dt>
4370 <dd>
4371 E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci,
4372 T. Li, and A. Conta, ``MPLS Label Stack Encoding,'' <url
4373 href="http://www.ietf.org/rfc/rfc3032.txt"/>.
4374 </dd>
4375
4376 <dt>RFC 3168</dt>
4377 <dd>
4378 K. Ramakrishnan, S. Floyd, and D. Black, ``The Addition of Explicit
4379 Congestion Notification (ECN) to IP,'' <url href="https://tools.ietf.org/html/rfc3168"/>.
4380 </dd>
4381
4382 <dt>RFC 3540</dt>
4383 <dd>
4384 N. Spring, D. Wetherall, and D. Ely, ``Robust Explicit Congestion
4385 Notification (ECN) Signaling with Nonces,'' <url
4386 href="https://tools.ietf.org/html/rfc3540"/>.
4387 </dd>
4388
4389 <dt>RFC 4632</dt>
4390 <dd>
4391 V. Fuller and T. Li, ``Classless Inter-domain Routing (CIDR): The
4392 Internet Address Assignment and Aggregation Plan,'' <url
4393 href="https://tools.ietf.org/html/rfc4632"/>.
4394 </dd>
4395
4396 <dt>RFC 5462</dt>
4397 <dd>
4398 L. Andersson and R. Asati, ``Multiprotocol Label Switching
4399 (MPLS) Label Stack Entry: ``EXP'' Field Renamed to ``Traffic
4400 Class'' Field,'' <url
4401 href="http://www.ietf.org/rfc/rfc5462.txt"/>.
4402 </dd>
4403
4404 <dt>RFC 6830</dt>
4405 <dd>
4406 D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, ``The
4407 Locator/ID Separation Protocol (LISP),'' <url
4408 href="http://www.ietf.org/rfc/rfc6830.txt"/>.
4409 </dd>
4410
4411 <dt>RFC 7348</dt>
4412 <dd>
4413 M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar,
4414 M. Bursell, and C. Wright, ``Virtual eXtensible Local Area Network
4415 (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over
4416 Layer 3 Networks,'' <url href="https://tools.ietf.org/html/rfc7348"/>.
4417 </dd>
4418
4419 <dt>Srinivasan</dt>
4420 <dd>
4421 V. Srinivasan, S. Suri, and G. Varghese, ``Packet
4422 Classification using Tuple Space Search,'' SIGCOMM 1999.
4423 </dd>
4424
4425 <dt>Pagiamtzis</dt>
4426 <dd>
4427 K. Pagiamtzis and A. Sheikholeslami, ``Content-addressable
4428 memory (CAM) circuits and architectures: A tutorial and
4429 survey,'' IEEE Journal of Solid-State Circuits, vol. 41, no. 3,
4430 pp. 712-727, March 2006.
4431 </dd>
4432
4433 <dt>VXLAN Group Policy Option</dt>
4434 <dd>
4435 M. Smith and L. Kreeger, ``VXLAN Group Policy Option,'' Internet-Draft,
4436 <url href="https://tools.ietf.org/html/draft-smith-vxlan-group-policy"/>.
4437 </dd>
4438 </dl>
4439
4440 <h1>Authors</h1>
4441
4442 <p>
4443 Ben Pfaff, with advice from Justin Pettit and Jean Tourrilhes.
4444 </p>
4445
4446 </fields>
4447
4448 <!--
4449 OXM fields not yet supported Future Directions References/See Also
4450 OXM fields required by various versions and by the "Conformance Test Specification for OpenFlow Switch Specification 1.0.1"
4451 -->