1<?xml version="1.0" encoding="utf-8"?>
2<fields>
3 <h1>Introduction</h1>
4
5 <p>
6 This document aims to comprehensively document all of the fields,
7 both standard and non-standard, supported by OpenFlow or Open
8 vSwitch, regardless of origin.
9 </p>
10
11 <h2>Fields</h2>
12
13 <p>
14 A <dfn>field</dfn> is a property of a packet. Most familiarly, <dfn>data
15 fields</dfn> are fields that can be extracted from a packet. Most data
16 fields are copied directly from protocol headers, e.g. at layer 2, the
17 Ethernet source and destination addresses, or the VLAN ID; at layer 3, the
18 IPv4 or IPv6 source and destination; and at layer 4, the TCP or UDP ports.
19 Other data fields are computed, e.g. <ref field="ip_frag"/> describes
20 whether a packet is a fragment but it is not copied directly from the IP
21 header.
22 </p>
23
24 <p>
    Data fields that are always present as a consequence of the basic
    networking technology in use are called <dfn>root fields</dfn>.
    Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
    and this remains the default mode of operation for Open vSwitch bridges.
    When a packet is received from a non-Ethernet interface, such as a layer-3
    LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
    Ethernet-centric point of view by pretending that an Ethernet header is
    present whose Ethernet type indicates the packet's actual type (and
    whose source and destination addresses are all-zero).
34 </p>
35
  <p>
37 Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
38 concept introduced in OpenFlow 1.5. Such a pipeline does not have any root
39 fields. Instead, a new metadata field, <ref field="packet_type"/>,
40 indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
41 or another type. For backward compatibility, by default Open vSwitch 2.8
42 imitates the behavior of Open vSwitch 2.7 and earlier. Later versions of
43 Open vSwitch may change the default, and in the meantime controllers can
44 turn off this legacy behavior, on a port-by-port basis, by setting
45 <code>options:packet_type</code> to <code>ptap</code> in the
46 <code>Interface</code> table. This is significant only for ports that can
47 handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
    GRE tunnel ports. See <code>ovs-vswitchd.conf.db</code>(5) for more
    information.
50 </p>
51
52 <p>
53 Non-root data fields are not always present. A packet contains ARP
54 fields, for example, only when its packet type is ARP or when it is an
55 Ethernet packet whose Ethernet header indicates the Ethertype for ARP,
56 0x0806. In this documentation, we say that a field is
57 <dfn>applicable</dfn> when it is present in a packet, and
58 <dfn>inapplicable</dfn> when it is not. (These are not standard terms.)
59 We refer to the conditions that determine whether a field is applicable as
60 <dfn>prerequisites</dfn>. Some VLAN-related fields are a special case:
61 these fields are always applicable for Ethernet packets, but have a
62 designated value or bit that indicates whether a VLAN header is present,
63 with the remaining values or bits indicating the VLAN header's content
64 (if it is present). <!-- XXX also ethertype -->
65 </p>
66
67 <p>
68 An inapplicable field does not have a value, not even a nominal
69 ``value'' such as all-zero-bits. In many circumstances, OpenFlow
70 and Open vSwitch allow references only to applicable fields. For
71 example, one may match (see <cite>Matching</cite>, below) a given
72 field only if the match includes the field's prerequisite,
73 e.g. matching an ARP field is only allowed if one also matches on
74 Ethertype 0x0806 or the <ref field="packet_type"/> for ARP in a packet
75 type-aware bridge.
76 </p>
77
78 <p>
79 Sometimes a packet may contain multiple instances of a header.
80 For example, a packet may contain multiple VLAN or MPLS headers,
81 and tunnels can cause any data field to recur. OpenFlow and Open
82 vSwitch do not address these cases uniformly. For VLAN and MPLS
83 headers, only the outermost header is accessible, so that inner
84 headers may be accessed only by ``popping'' (removing) the outer
85 header. (Open vSwitch supports only a single VLAN header in any
86 case.) For tunnels, e.g. GRE or VXLAN, the outer header and inner
87 headers are treated as different data fields.
88 </p>
89
90 <p>
91 Many network protocols are built in layers as a stack of concatenated
92 headers. Each header typically contains a ``next type'' field that
93 indicates the type of the protocol header that follows, e.g. Ethernet
    contains an Ethertype and IPv4 contains an IP protocol type. The
95 exceptional cases, where protocols are layered but an outer layer does not
96 indicate the protocol type for the inner layer, or gives only an ambiguous
97 indication, are troublesome. An MPLS header, for example, only indicates
98 whether another MPLS header or some other protocol follows, and in the
99 latter case the inner protocol must be known from the context. In these
100 exceptional cases, OpenFlow and Open vSwitch cannot provide insight into
101 the inner protocol data fields without additional context, and thus they
102 treat all later data fields as inapplicable until an OpenFlow action
103 explicitly specifies what protocol follows. In the case of MPLS, the
104 OpenFlow ``pop MPLS'' action that removes the last MPLS header from a
105 packet provides this context, as the Ethertype of the payload. See
106 <cite>Layer 2.5: MPLS</cite> for more information.
107 </p>
108
109 <p>
110 OpenFlow and Open vSwitch support some fields other than data
111 fields. <dfn>Metadata fields</dfn> relate to the origin or
112 treatment of a packet, but they are not extracted from the packet
113 data itself. One example is the physical port on which a packet
114 arrived at the switch. <dfn>Register fields</dfn> act like
115 variables: they give an OpenFlow switch space for temporary
116 storage while processing a packet. Existing metadata and register
117 fields have no prerequisites.
118 </p>
119
120 <p>
121 A field's value consists of an integral number of bytes. For data
122 fields, sometimes those bytes are taken directly from the packet.
123 Other data fields are copied from a packet with padding (usually
124 with zeros and in the most significant positions). The remaining
125 data fields are transformed in other ways as they are copied from
126 the packets, to make them more useful for matching.
127 </p>
128
129 <h2>Matching</h2>
130
131 <p>
132 The most important use of fields in OpenFlow is
133 <dfn>matching</dfn>, to determine whether particular field values
134 agree with a set of constraints called a <dfn>match</dfn>. A
135 match consists of zero or more constraints on individual fields,
136 all of which must be met to satisfy the match. (A match that
137 contains no constraints is always satisfied.) OpenFlow and Open
138 vSwitch support a number of forms of matching on individual
139 fields:
140 </p>
141
142 <dl>
143 <dt><dfn>Exact match</dfn>, e.g. <code>nw_src=10.1.2.3</code></dt>
144 <dd>
145 <p>
146 Only a particular value of the field is matched; for example, only one
147 particular source IP address. Exact matches are written as
148 <code><var>field</var>=<var>value</var></code>. The forms accepted for
149 <var>value</var> depend on the field.
150 </p>
151
152 <p>
        All fields support exact matches.
154 </p>
155 </dd>
156
157 <dt>
158 <dfn>Bitwise match</dfn>, e.g. <code>nw_src=10.1.0.0/255.255.0.0</code>
159 </dt>
160 <dd>
161 <p>
162 Specific bits in the field must have specified values; for example,
163 only source IP addresses in a particular subnet. Bitwise matches are
164 written as
165 <code><var>field</var>=<var>value</var>/<var>mask</var></code>, where
166 <var>value</var> and <var>mask</var> take one of the forms accepted for
167 an exact match on <var>field</var>. Some fields accept other forms for
168 bitwise matches; for example, <code>nw_src=10.1.0.0/255.255.0.0</code>
169 may also be written <code>nw_src=10.1.0.0/16</code>.
170 </p>
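      <p>
        The equivalence between the mask form and the prefix-length form can
        be checked with a short sketch. It uses Python's standard
        <code>ipaddress</code> module purely to illustrate the arithmetic; it
        is not how Open vSwitch parses matches, and the helper name
        <code>matches_subnet</code> is made up for this example.
      </p>

```python
import ipaddress

# The same subnet written with a dotted-quad mask and with a prefix length.
by_mask = ipaddress.ip_network("10.1.0.0/255.255.0.0")
by_prefix = ipaddress.ip_network("10.1.0.0/16")


def matches_subnet(address, network):
    """Return True if address agrees with network on every masked bit."""
    return ipaddress.ip_address(address) in network


print(by_mask == by_prefix)                  # the two spellings are equal
print(by_mask.netmask, by_mask.prefixlen)    # mask and prefix length
print(matches_subnet("10.1.2.3", by_mask))   # inside the subnet
print(matches_subnet("10.2.0.1", by_mask))   # outside the subnet
```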
171
172 <p>
        Most OpenFlow switches do not allow bitwise matching on every
        field (and before OpenFlow 1.2, the protocol did not even provide for
        the possibility for most fields). Even switches that do allow bitwise
        matching on a given field may restrict the masks that are allowed, e.g.
        by allowing matches only on contiguous sets of bits starting from the
        most significant bit, that is, ``CIDR'' masks [RFC 4632]. Open vSwitch
        does not allow bitwise matching on every field, but it allows
        arbitrary bitwise masks on any field that does support bitwise
        matching. (Older versions had some restrictions, as documented in the
        descriptions of individual fields.)
183 </p>
184 </dd>
185
186 <dt><dfn>Wildcard</dfn>, e.g. ``any <code>nw_src</code>''</dt>
187 <dd>
188 <p>
        The value of the field is not constrained. Wildcarded fields may be
        written as <code><var>field</var>=*</code>, although it is unusual to
        mention them at all. (When specifying a wildcard explicitly in a
        command invocation, be sure to use quoting to protect against shell
        expansion.)
194 </p>
195
196 <p>
197 There is a tiny difference between wildcarding a field and not
198 specifying any match on a field: wildcarding a field requires
199 satisfying the field's prerequisites.
200 </p>
201 </dd>
202 </dl>
203
204 <p>
205 Some types of matches on individual fields cannot be expressed directly
206 with OpenFlow and Open vSwitch. These can be expressed indirectly:
207 </p>
208
209 <dl>
210 <dt><dfn>Set match</dfn>, e.g. ``<code>tcp_dst</code> ∈ {80, 443,
211 8080}''</dt>
212 <dd>
213 <p>
214 The value of a field is one of a specified set of values; for
215 example, the TCP destination port is 80, 443, or 8080.
216 </p>
217
218 <p>
219 For matches used in flows (see <cite>Flows</cite>, below), multiple
220 flows can simulate set matches.
221 </p>
222 </dd>
223
224 <dt><dfn>Range match</dfn>, e.g. ``1000 ≤ <code>tcp_dst</code> ≤
225 1999''</dt>
226 <dd>
227 <p>
228 The value of the field must lie within a numerical range, for
229 example, TCP destination ports between 1000 and 1999.
230 </p>
231
232 <p>
233 Range matches can be expressed as a collection of bitwise matches. For
234 example, suppose that the goal is to match TCP source ports 1000 to
235 1999, inclusive. The binary representations of 1000 and 1999 are:
236 </p>
237
      <pre fixed="yes">
01111101000
11111001111
      </pre>
242
243 <p>
244 The following series of bitwise matches will match 1000 and
245 1999 and all the values in between:
246 </p>
247
      <pre fixed="yes">
01111101xxx
0111111xxxx
10xxxxxxxxx
110xxxxxxxx
1110xxxxxxx
11110xxxxxx
1111100xxxx
      </pre>
257
258 <p>
        which can be written as the following matches:
260 </p>
261
      <pre>
tcp,tp_src=0x03e8/0xfff8
tcp,tp_src=0x03f0/0xfff0
tcp,tp_src=0x0400/0xfe00
tcp,tp_src=0x0600/0xff00
tcp,tp_src=0x0700/0xff80
tcp,tp_src=0x0780/0xffc0
tcp,tp_src=0x07c0/0xfff0
      </pre>
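      <p>
        One way to derive such a list mechanically is a greedy split into
        aligned power-of-two blocks, sketched below. The function name
        <code>range_to_bitwise</code> and the 16-bit default width are
        assumptions made for this illustration; the sketch is not taken from
        Open vSwitch. For the range 1000 to 1999 it reproduces the seven
        matches listed above.
      </p>

```python
def range_to_bitwise(lo, hi, width=16):
    """Cover the inclusive range lo..hi with (value, mask) pairs.

    Each pair matches one aligned power-of-two block, chosen greedily as the
    largest block that starts at the current lower bound and fits the range.
    """
    all_ones = 2 ** width - 1
    pairs = []
    while hi >= lo:
        size = 1
        # Double the block while it stays aligned and inside the range.
        while lo % (size * 2) == 0 and hi >= lo + size * 2 - 1:
            size *= 2
        # The mask has 0-bits exactly in the low bits that the block spans.
        pairs.append((lo, all_ones - (size - 1)))
        lo += size
    return pairs


for value, mask in range_to_bitwise(1000, 1999):
    print(f"tcp,tp_src=0x{value:04x}/0x{mask:04x}")
```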
271 </dd>
272
273 <dt><dfn>Inequality match</dfn>, e.g. ``<code>tcp_dst</code> ≠ 80''</dt>
274 <dd>
275 <p>
276 The value of the field differs from a specified value, for
277 example, all TCP destination ports except 80.
278 </p>
279
280 <p>
281 An inequality match on an <var>n</var>-bit field can be expressed as a
282 disjunction of <var>n</var> 1-bit matches. For example, the inequality
283 match ``<code>vlan_pcp</code> ≠ 5'' can be expressed as
284 ``<code>vlan_pcp</code> = 0/4 or <code>vlan_pcp</code> = 2/2 or
285 <code>vlan_pcp</code> = 0/1.'' For matches used in flows (see
286 <cite>Flows</cite>, below), sometimes one can more compactly express
287 inequality as a higher-priority flow that matches the exceptional case
288 paired with a lower-priority flow that matches the general case.
289 </p>
290
291 <p>
292 Alternatively, an inequality match may be converted to a pair of range
293 matches, e.g. <code>tcp_src ≠ 80</code> may be expressed as ``0 ≤
294 <code>tcp_src</code> &lt; 80 or 80 &lt; <code>tcp_src</code> ≤ 65535'',
295 and then each range match may in turn be converted to a bitwise match.
296 </p>
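      <p>
        The decomposition into <var>n</var> 1-bit matches can be sketched as
        follows. The function name <code>inequality_to_bitwise</code> is
        hypothetical and the code only illustrates the idea; each returned
        pair is a <var>value</var>/<var>mask</var> match on a single bit.
      </p>

```python
def inequality_to_bitwise(value, width):
    """Return (value, mask) pairs whose disjunction means "field != value".

    One pair is produced per bit, most significant first; each pair requires
    that single bit to differ from the corresponding bit of value.
    """
    pairs = []
    for position in reversed(range(width)):
        bit = 2 ** position
        bit_is_set = (value // bit) % 2 == 1
        # Match the opposite of the excluded value's bit.
        pairs.append((0 if bit_is_set else bit, bit))
    return pairs


# vlan_pcp is a 3-bit field; exclude the value 5.
print(" or ".join(f"vlan_pcp = {v}/{m}" for v, m in inequality_to_bitwise(5, 3)))
```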
297 </dd>
298
299 <dt><dfn>Conjunctive match</dfn>, e.g. ``<code>tcp_src</code> ∈ {80, 443, 8080} and <code>tcp_dst</code> ∈ {80, 443, 8080}''</dt>
300 <dd>
301 As an OpenFlow extension, Open vSwitch supports matching on conditions on
302 conjunctions of the previously mentioned forms of matching. See the
303 documentation for <ref field="conj_id"/> for more information.
304 </dd>
305 </dl>
306
307 <p>
308 All of these supported forms of matching are special cases of bitwise
309 matching. In some cases this influences the design of field values. <ref
310 field="ip_frag"/> is the most prominent example: it is designed to make all
311 of the practically useful checks for IP fragmentation possible as a single
312 bitwise match.
313 </p>
314
315 <h3>Shorthands</h3>
316
317 <p>
318 Some matches are very commonly used, so Open vSwitch accepts shorthand
319 notations. In some cases, Open vSwitch also uses shorthand notations when
320 it displays matches. The following shorthands are defined, with their long
321 forms shown on the right side:
322 </p>
323
324 <dl>
325 <dt><code>eth</code></dt>
326 <dd><code>packet_type=(0,0)</code> (Open vSwitch 2.8 and later)</dd>
327 <dt><code>ip</code></dt> <dd><code>eth_type=0x0800</code></dd>
328 <dt><code>ipv6</code></dt> <dd><code>eth_type=0x86dd</code></dd>
329 <dt><code>icmp</code></dt> <dd><code>eth_type=0x0800,ip_proto=1</code></dd>
330 <dt><code>icmp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=58</code></dd>
331 <dt><code>tcp</code></dt> <dd><code>eth_type=0x0800,ip_proto=6</code></dd>
332 <dt><code>tcp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=6</code></dd>
333 <dt><code>udp</code></dt> <dd><code>eth_type=0x0800,ip_proto=17</code></dd>
334 <dt><code>udp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=17</code></dd>
335 <dt><code>sctp</code></dt> <dd><code>eth_type=0x0800,ip_proto=132</code></dd>
336 <dt><code>sctp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=132</code></dd>
337 <dt><code>arp</code></dt> <dd><code>eth_type=0x0806</code></dd>
338 <dt><code>rarp</code></dt> <dd><code>eth_type=0x8035</code></dd>
339 <dt><code>mpls</code></dt> <dd><code>eth_type=0x8847</code></dd>
340 <dt><code>mplsm</code></dt> <dd><code>eth_type=0x8848</code></dd>
341 </dl>
342
344 <h2>Evolution of OpenFlow Fields</h2>
345
346 <p>
347 The discussion so far applies to all OpenFlow and Open vSwitch
348 versions. This section starts to draw in specific information by
349 explaining, in broad terms, the treatment of fields and matches in
350 each OpenFlow version.
351 </p>
352
353 <h3>OpenFlow 1.0</h3>
354
355 <p>
356 OpenFlow 1.0 defined the OpenFlow protocol format of a match as a
357 fixed-length data structure that could match on the following
358 fields:
359 </p>
360
361 <ul>
362 <li>Ingress port.</li>
363 <li>Ethernet source and destination MAC.</li>
364 <li>Ethertype (with a special value to match frames that lack an
365 Ethertype).</li>
366 <li>VLAN ID and priority.</li>
367 <li>IPv4 source, destination, protocol, and DSCP.</li>
368 <li>TCP source and destination port.</li>
369 <li>UDP source and destination port.</li>
370 <li>ICMPv4 type and code.</li>
371 <li>ARP IPv4 addresses (SPA and TPA) and opcode.</li>
372 </ul>
373
374 <p>
375 Each supported field corresponded to some member of the data
376 structure. Some members represented multiple fields, in the case
377 of the TCP, UDP, ICMPv4, and ARP fields whose presence is mutually
378 exclusive. This also meant that some members were poor fits for
379 their fields: only the low 8 bits of the 16-bit ARP opcode could
380 be represented, and the ICMPv4 type and code were padded with 8 bits
381 of zeros to fit in the 16-bit members primarily meant for TCP and
382 UDP ports. An additional bitmap member indicated, for each
383 member, whether its field should be an ``exact'' or ``wildcarded''
384 match (see <cite>Matching</cite>), with additional support for
385 CIDR prefix matching on the IPv4 source and destination fields.
386 </p>
387
388 <p>
389 Simplicity was recognized early on as the main virtue of this
390 approach. Obviously, any fixed-length data structure cannot
391 support matching new protocols that do not fit. There was no
392 room, for example, for matching IPv6 fields, which was not a
393 priority at the time. Lack of room to support matching the
394 Ethernet addresses inside ARP packets actually caused more of a
395 design problem later, leading to an Open vSwitch extension action
396 specialized for dropping ``spoofed'' ARP packets in which the
    frame and ARP Ethernet source addresses differed. (This extension
398 was never standardized. Open vSwitch dropped support for it a few
399 releases after it added support for full ARP matching.)
400 </p>
401
402 <p>
403 The design of the OpenFlow fixed-length matches also illustrates
404 compromises, in both directions, between the strengths and
405 weaknesses of software and hardware that have always influenced
406 the design of OpenFlow. Support for matching ARP fields that do
407 fit in the data structure was only added late in the design
408 process (and remained optional in OpenFlow 1.0), for example,
409 because common switch ASICs did not support matching these fields.
410 </p>
411
412 <p>
413 The compromises in favor of software occurred for more complicated
414 reasons. The OpenFlow designers did not know how to implement
415 matching in software that was fast, dynamic, and general. (A way
416 was later found [Srinivasan].) Thus, the designers sought to
417 support dynamic, general matching that would be fast in realistic
418 special cases, in particular when all of the matches were
419 <dfn>microflows</dfn>, that is, matches that specify every field
420 present in a packet, because such matches can be implemented as a
421 single hash table lookup. Contemporary research supported the
422 feasibility of this approach: the number of microflows in a campus
423 network had been measured to peak at about 10,000 [Casado, section
424 3.2]. (Calculations show that this can only be true in a lightly
425 loaded network [Pepelnjak].)
426 </p>
427
428 <p>
429 As a result, OpenFlow 1.0 required switches to treat microflow
430 matches as the highest possible priority. This let software
431 switches perform the microflow hash table lookup first. Only on
432 failure to match a microflow did the switch need to fall back to
433 checking the more general and presumed slower matches. Also, the
434 OpenFlow 1.0 flow match was minimally flexible, with no support
435 for general bitwise matching, partly on the basis that this seemed
436 more likely amenable to relatively efficient software
437 implementation. (CIDR masking for IPv4 addresses was added
438 relatively late in the OpenFlow 1.0 design process.)
439 </p>
440
441 <p>
442 Microflow matching was later discovered to aid some hardware
443 implementations. The TCAM chips used for matching in hardware do
444 not support priority in the same way as OpenFlow but instead tie
445 priority to ordering [Pagiamtzis]. Thus, adding a new match with
446 a priority between the priorities of existing matches can require
447 reordering an arbitrary number of TCAM entries. On the other
448 hand, when microflows are highest priority, they can be managed as
449 a set-aside portion of the TCAM entries.
450 </p>
451
452 <p>
453 The emphasis on matching microflows also led designers to
454 carefully consider the bandwidth requirements between switch and
455 controller: to maximize the number of microflow setups per second,
456 one must minimize the size of each flow's description. This
457 favored the fixed-length format in use, because it expressed
458 common TCP and UDP microflows in fewer bytes than more flexible
459 ``type-length-value'' (TLV) formats. (Early versions of OpenFlow
460 also avoided TLVs in general to head off protocol fragmentation.)
461 </p>
462
463 <h4>Inapplicable Fields</h4>
464
465 <p>
466 OpenFlow 1.0 does not clearly specify how to treat inapplicable
467 fields. The members for inapplicable fields are always present in
468 the match data structure, as are the bits that indicate whether
469 the fields are matched, and the ``correct'' member and bit values
470 for inapplicable fields is unclear. OpenFlow 1.0 implementations
471 changed their behavior over time as priorities shifted. The early
472 OpenFlow reference implementation, motivated to make every flow a
473 microflow to enable hashing, treated inapplicable fields as exact
474 matches on a value of 0. Initially, this behavior was implemented
475 in the reference controller only.
476 </p>
477
478 <p>
479 Later, the reference switch was also changed to actually force any
480 wildcarded inapplicable fields into exact matches on 0. The
481 latter behavior sometimes caused problems, because the modified
482 flow was the one reported back to the controller later when it
483 queried the flow table, and the modifications sometimes meant that
484 the controller could not properly recognize the flow that it had
485 added. In retrospect, perhaps this problem should have alerted
486 the designers to a design error, but the ability to use a single
487 hash table was held to be more important than almost every other
488 consideration at the time.
489 </p>
490
491 <p>
492 When more flexible match formats were introduced much later, they
493 disallowed any mention of inapplicable fields as part of a match.
494 This raised the question of how to translate between this new
495 format and the OpenFlow 1.0 fixed format. It seemed somewhat
496 inconsistent and backward to treat fields as exact-match in one
497 format and forbid matching them in the other, so instead the
498 treatment of inapplicable fields in the fixed-length format was
499 changed from exact match on 0 to wildcarding. (A better
500 classifier had by now eliminated software performance problems
501 with wildcards.)
502 </p>
503
504 <p>
505 The OpenFlow 1.0.1 errata (released only in 2012) added some
506 additional explanation [OpenFlow 1.0.1, section 3.4], but it did
507 not mandate specific behavior because of variation among
508 implementations.
509 </p>
510
511 <h3>OpenFlow 1.1</h3>
512
513 <p>
514 The OpenFlow 1.1 protocol match format was designed as a type/length/value
515 (TLV) format to allow for future flexibility. The specification
516 standardized only a single type <code>OFPMT_STANDARD</code> (0) with a
517 fixed-size payload, described here. The additional fields and bitwise
518 masks in OpenFlow 1.1 cause this match structure to be over twice as large
519 as in OpenFlow 1.0, 88 bytes versus 40.
520 </p>
521
522 <p>
523 OpenFlow 1.1 added support for the following fields:
524 </p>
525
526 <ul>
527 <li>SCTP source and destination port.</li>
    <li>MPLS label and traffic class (TC) fields.</li>
529 <li>One 64-bit register (named ``metadata'').</li>
530 </ul>
531
532 <p>
533 OpenFlow 1.1 increased the width of the ingress port number field (and all
534 other port numbers in the protocol) from 16 bits to 32 bits.
535 </p>
536
537 <p>
538 OpenFlow 1.1 increased matching flexibility by introducing
539 arbitrary bitwise matching on Ethernet and IPv4 address fields and
540 on the new ``metadata'' register field. Switches were not
541 required to support all possible masks [OpenFlow 1.1, section
542 4.3].
543 </p>
544
545 <p>
546 By a strict reading of the specification, OpenFlow 1.1 removed
547 support for matching ICMPv4 type and code [OpenFlow 1.1, section
548 A.2.3], but this is likely an editing error because ICMP
549 matching is described elsewhere [OpenFlow 1.1, Table 3, Table 4,
550 Figure 4]. Open vSwitch does support ICMPv4 type and code
551 matching with OpenFlow 1.1.
552 </p>
553
554 <p>
555 OpenFlow 1.1 avoided the pitfalls of inapplicable fields that
556 OpenFlow 1.0 encountered, by requiring the switch to ignore the
557 specified field values [OpenFlow 1.1, section A.2.3]. It also
558 implied that the switch should ignore the bits that indicate
559 whether to match inapplicable fields.
560 </p>
561
562 <h4>Physical Ingress Port</h4>
563
564 <p>
565 OpenFlow 1.1 introduced a new pseudo-field, the physical ingress port. The
566 physical ingress port is only a pseudo-field because it cannot be used for
    matching. It appears in only one place in the protocol, in the ``packet-in''
568 message that passes a packet received at the switch to an OpenFlow
569 controller.
570 </p>
571
572 <p>
573 A packet's ingress port and physical ingress port are identical except for
574 packets processed by a switch feature such as bonding or tunneling that
575 makes a packet appear to arrive on a ``virtual'' port associated with the
576 bond or the tunnel. For such packets, the ingress port is the virtual port
577 and the physical ingress port is, naturally, the physical port. Open
578 vSwitch implements both bonding and tunneling, but its bonding
579 implementation does not use virtual ports and its tunnels are typically not
580 on the same OpenFlow switch as their physical ingress ports (which need not
581 be part of any switch), so the ingress port and physical ingress port are
582 always the same in Open vSwitch.
583 </p>
584
585 <h3>OpenFlow 1.2</h3>
586
587 <p>
588 OpenFlow 1.2 abandoned the fixed-length approach to matching. One reason
589 was size, since adding support for IPv6 address matching (now seen as
590 important), with bitwise masks, would have added 64 bytes to the match
591 length, increasing it from 88 bytes in OpenFlow 1.1 to over 150 bytes.
592 Extensibility had also become important as controller writers increasingly
593 wanted support for new fields without having to change messages throughout
594 the OpenFlow protocol. The challenges of carefully defining fixed-length
595 matches to avoid problems with inapplicable fields had also become clear
596 over time.
597 </p>
598
599 <p>
600 Therefore, OpenFlow 1.2 adopted a flow format using a flexible
601 type-length-value (TLV) representation, in which each TLV expresses a match
602 on one field. These TLVs were in turn encapsulated inside the outer TLV
603 wrapper introduced in OpenFlow 1.1 with the new identifier
604 <code>OFPMT_OXM</code> (1). (This wrapper fulfilled its intended purpose
605 of reducing the amount of churn in the protocol when changing match
606 formats; some messages that included matches remained unchanged from
607 OpenFlow 1.1 to 1.2 and later versions.)
608 </p>
609
610 <p>
611 OpenFlow 1.2 added support for the following fields:
612 </p>
613
614 <ul>
615 <li>ARP hardware addresses (SHA and THA).</li>
616 <li>IPv4 ECN.</li>
617 <li>IPv6 source and destination addresses, flow label, DSCP, ECN,
618 and protocol.</li>
619 <li>TCP, UDP, and SCTP port numbers when encapsulated inside IPv6.</li>
620 <li>ICMPv6 type and code.</li>
621 <li>ICMPv6 Neighbor Discovery target address and source and target
622 Ethernet addresses.</li>
623 </ul>
624
625 <!-- mention tun_id_from_cookie extension? -->
626
627 <p>
628 The OpenFlow 1.2 format, called <dfn>OXM</dfn> (<dfn>OpenFlow Extensible
629 Match</dfn>), was modeled closely on an extension to OpenFlow 1.0
630 introduced in Open vSwitch 1.1 called <dfn>NXM</dfn> (<dfn>Nicira Extended
631 Match</dfn>). Each OXM or NXM TLV has the following format:
632 </p>
633
634 <diagram>
635 <header name="type">
636 <bits name="vendor/class" above="16" width=".75"/>
637 <bits name="field" above="7" width=".4"/>
638 </header>
639 <nospace/>
640 <header name="">
641 <bits name="HM" above="1" width=".25"/>
642 <bits name="length" above="8" width=".4"/>
643 </header>
644 <header name="">
645 <bits name="body" above="length bytes" width="1.7"/>
646 </header>
647 </diagram>
648
649 <p>
650 The most significant 16 bits of the NXM or OXM header, called
651 <code>vendor</code> by NXM and <code>class</code> by OXM, identify
652 an organization permitted to allocate identifiers for fields. NXM
653 allocates only two vendors, 0x0000 for fields supported by
654 OpenFlow 1.0 and 0x0001 for fields implemented as an Open vSwitch
655 extension. OXM assigns classes as follows:
656 </p>
657
658 <dl>
659 <dt>0x0000 (<code>OFPXMC_NXM_0</code>).</dt>
660 <dt>0x0001 (<code>OFPXMC_NXM_1</code>).</dt>
661 <dd>Reserved for NXM compatibility.</dd>
662
663 <dt>0x0002 to 0x7fff</dt>
664 <dd>
665 Reserved for allocation to ONF members, but none yet assigned.
666 </dd>
667
668 <dt>0x8000 (<code>OFPXMC_OPENFLOW_BASIC</code>)</dt>
669 <dd>
670 Used for most standard OpenFlow fields.
671 </dd>
672
673 <dt>0x8001 (<code>OFPXMC_PACKET_REGS</code>)</dt>
674 <dd>
675 Used for packet register fields in OpenFlow 1.5 and later.
676 </dd>
677
678 <dt>0x8002 to 0xfffe</dt>
679 <dd>
680 Reserved for the OpenFlow specification.
681 </dd>
682
683 <dt>0xffff (<code>OFPXMC_EXPERIMENTER</code>)</dt>
684 <dd>Experimental use.</dd>
685 </dl>
686
687 <p>
688 When <code>class</code> is 0xffff, the OXM header is extended to 64 bits by
689 using the first 32 bits of the body as an <code>experimenter</code> field
690 whose most significant byte is zero and whose remaining bytes are an
691 Organizationally Unique Identifier (OUI) assigned by the IEEE [IEEE OUI],
    as shown below.
693 </p>
694
695 <diagram>
696 <header name="type">
697 <bits name="class" above="16" below="0xffff" width=".75"/>
698 <bits name="field" above="7" width=".4"/>
699 </header>
700 <nospace/>
701 <header name="">
702 <bits name="HM" above="1" width=".25"/>
703 <bits name="length" above="8" width=".4"/>
704 </header>
705
706 <header name="experimenter">
707 <bits name="zero" above="8" below="0x00" width=".4"/>
708 <bits name="OUI" above="24" width="1"/>
709 </header>
710 <header name="">
711 <bits name="body" above="(length - 4) bytes" width="1.7"/>
712 </header>
713 </diagram>
714
715 <p>
716 OpenFlow says that support for experimenter fields is optional. Open
717 vSwitch 2.4 and later does support them, so that it can support the
718 following experimenter classes:
719 </p>
720
721 <dl>
722 <dt>0x4f4e4600 (<code>ONFOXM_ET</code>)</dt>
723 <dd>
      Used by official Open Networking Foundation extensions in OpenFlow 1.3
      and later, e.g. [TCP Flags Match Field Extension].
727 </dd>
728
729 <dt>0x005ad650 (<code>NXOXM_NSH</code>)</dt>
730 <dd>
731 Used by Open vSwitch for NSH extensions, in the absence of an official
732 ONF-assigned class. (This OUI is randomly generated.)
733 </dd>
734 </dl>
735
736 <p>
737 Taken as a unit, <code>class</code> (or <code>vendor</code>),
738 <code>field</code>, and <code>experimenter</code> (when present) uniquely
739 identify a particular field.
740 </p>
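  <p>
    The 32-bit header layout shown in the diagrams can be sketched with
    Python's <code>struct</code> module. The helper names
    <code>pack_oxm_header</code> and <code>unpack_oxm_header</code> are
    hypothetical, and the field number 5 used for the Ethertype comes from
    the OpenFlow field numbering rather than from this document, so treat it
    as an assumption.
  </p>

```python
import struct

OFPXMC_OPENFLOW_BASIC = 0x8000


def pack_oxm_header(oxm_class, field, hasmask, length):
    """Pack 16-bit class, 7-bit field, 1-bit HM and 8-bit length."""
    # The field number and the HM bit share the third byte.
    return struct.pack("!HBB", oxm_class, field * 2 + int(hasmask), length)


def unpack_oxm_header(data):
    """Split the first 4 bytes of data back into the four header parts."""
    oxm_class, field_and_hm, length = struct.unpack("!HBB", data[:4])
    return oxm_class, field_and_hm // 2, field_and_hm % 2 == 1, length


# Assumed example: field 5 of the basic class, exact match, 2-byte body.
header = pack_oxm_header(OFPXMC_OPENFLOW_BASIC, 5, False, 2)
print(header.hex())
print(unpack_oxm_header(header))
```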
741
742 <p>
743 When <code>hasmask</code> (abbreviated <code>HM</code> above) is 0, the OXM
744 is an exact match on an entire field. In this case, the body (excluding
745 the experimenter field, if present) is a single value to be matched.
746 </p>
747
748 <p>
749 When <code>hasmask</code> is 1, the OXM is a bitwise match. The body
750 (excluding the experimenter field) consists of a value to match, followed
751 by the bitwise mask to apply. A 1-bit in the mask indicates that the
752 corresponding bit in the value should be matched and a 0-bit that it should
753 be ignored. For example, for an IP address field, a value of 192.168.0.0
754 followed by a mask of 255.255.0.0 would match addresses in the
755 192.168.0.0/16 subnet.
756 </p>
757
758 <ul>
759 <li>
760 Some fields might not support masking at all, and some fields that do
761 support masking might restrict it to certain patterns. For example,
762 fields that have IP address values might be restricted to CIDR masks.
763 The descriptions of individual fields note these restrictions.
764 </li>
765
766 <li>
767 An OXM TLV with a mask that is all zeros is not useful (although it is
768 not forbidden), because it has the same effect as omitting the TLV
769 entirely.
770 </li>
771
772 <li>
773 It is not meaningful to pair a 0-bit in an OXM mask with a 1-bit in its
774 value, and Open vSwitch rejects such an OXM with the error
775 <code>OFPBMC_BAD_WILDCARDS</code>, as required by OpenFlow 1.3 and later.
776 </li>
777 </ul>
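 <p>
 The masking rules above can be sketched in a few lines of Python. This is
 an illustration only, not Open vSwitch code: the function names
 <code>oxm_masked_match</code> and <code>oxm_value_consistent</code> are
 hypothetical, and fields are modeled as plain integers.
 </p>

```python
import ipaddress


def oxm_masked_match(field_value: int, value: int, mask: int) -> bool:
    """Return True if field_value agrees with value on every 1-bit of mask."""
    return (field_value & mask) == (value & mask)


def oxm_value_consistent(value: int, mask: int) -> bool:
    """Return True if value has no 1-bit where mask has a 0-bit.

    An OXM that fails this check is the kind that the text says is
    rejected with OFPBMC_BAD_WILDCARDS.
    """
    return (value & ~mask) == 0


def ip(text: str) -> int:
    """Convert a dotted-quad IPv4 address to an integer (helper for the demo)."""
    return int(ipaddress.IPv4Address(text))


if __name__ == "__main__":
    value, mask = ip("192.168.0.0"), ip("255.255.0.0")
    print(oxm_masked_match(ip("192.168.7.9"), value, mask))   # inside the /16
    print(oxm_masked_match(ip("192.169.0.1"), value, mask))   # outside the /16
    print(oxm_value_consistent(ip("192.168.0.1"), mask))      # stray 1-bit
    print(oxm_masked_match(ip("10.0.0.1"), 0, 0))             # all-zero mask
```

 <p>
 Note that an all-zero mask makes every packet match, which is why such a
 TLV has the same effect as omitting it.
 </p>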
778
779 <p>
780 The <code>length</code> identifies the number of bytes in the body,
781 including the 4-byte <code>experimenter</code> header, if it is present.
782 Each OXM TLV has a fixed length; that is, given <code>class</code>,
783 <code>field</code>, <code>experimenter</code> (if present), and
784 <code>hasmask</code>, <code>length</code> is a constant. The
785 <code>length</code> is included explicitly to allow software to minimally
786 parse OXM TLVs of unknown types.
787 </p>
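 <p>
 As a rough sketch of what ``minimally parsing'' an OXM TLV can look like,
 the following Python splits a byte string into TLVs using only the 32-bit
 header layout shown in the diagram above (16-bit <code>class</code>, 7-bit
 <code>field</code>, 1-bit <code>hasmask</code>, 8-bit <code>length</code>).
 The function names are hypothetical, and the sample bytes are illustrative
 rather than captured from a real switch.
 </p>

```python
import struct


def parse_oxm_header(data: bytes) -> dict:
    """Decode the 4-byte OXM header: class(16) field(7) hasmask(1) length(8)."""
    (header,) = struct.unpack("!I", data[:4])
    return {
        "class": header >> 16,
        "field": (header >> 9) & 0x7F,
        "hasmask": (header >> 8) & 0x1,
        "length": header & 0xFF,
    }


def split_oxm_tlvs(data: bytes):
    """Yield (header, body) pairs, using length to step over each body.

    The body is returned undecoded, so TLVs of unknown types are still
    skipped correctly. For experimenter TLVs the body would begin with the
    4-byte experimenter ID; this sketch does not interpret it.
    """
    offset = 0
    while len(data) - offset >= 4:
        header = parse_oxm_header(data[offset:offset + 4])
        body = data[offset + 4:offset + 4 + header["length"]]
        if len(body) != header["length"]:
            raise ValueError("truncated OXM TLV body")
        yield header, body
        offset += 4 + header["length"]


if __name__ == "__main__":
    # Illustrative input: an exact match on Ethertype 0x0800 (class 0x8000,
    # field 5), then a masked 8-byte TLV (4-byte value plus 4-byte mask).
    sample = bytes.fromhex("80000a020800" "80001708c0a80000ffff0000")
    for header, body in split_oxm_tlvs(sample):
        print(header, body.hex())
```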
788
789 <p>
790 OXM TLVs must be ordered so that a field's prerequisites are satisfied
791 before it is parsed. For example, an OXM TLV that matches on the IPv4
792 source address field is only allowed following an OXM TLV that matches on
793 the Ethertype for IPv4. Similarly, an OXM TLV that matches on the TCP
794 source port must follow a TLV that matches an Ethertype of IPv4 or IPv6 and
795 one that matches an IP protocol of TCP (in that order). The order of OXM
796 TLVs is not otherwise restricted; no canonical ordering is defined.
797 </p>
798
799 <p>
800 A given field may be matched only once in a series of OXM TLVs.
801 </p>
802
803 <!-- EXT-482? -->
804
805 <h3>OpenFlow 1.3</h3>
806
807 <p>
808 OpenFlow 1.3 showed OXM to be largely successful, by adding new fields
809 without making any changes to how flow matches otherwise worked. It added
810 OXMs for the following fields supported by Open vSwitch:
811 </p>
812
813 <ul>
814 <li>Tunnel ID for ports associated with e.g. VXLAN or keyed GRE.</li>
815 <li>MPLS ``bottom of stack'' (BOS) bit.</li>
816 </ul>
817
818 <p>
819 OpenFlow 1.3 also added OXMs for the following fields not documented here
820 and not yet implemented by Open vSwitch:
821 </p>
822
823 <ul>
824 <li>IPv6 extension header handling.</li>
825 <li>PBB I-SID.</li>
826 </ul>
827
828 <h3>OpenFlow 1.4</h3>
829
830 <p>
831 OpenFlow 1.4 added OXMs for the following fields not documented here and
832 not yet implemented by Open vSwitch:
833 </p>
834
835 <ul>
836 <li>PBB UCA.</li>
837 </ul>
838
839 <h3>OpenFlow 1.5</h3>
840
841 <p>
842 OpenFlow 1.5 added OXMs for the following fields supported by Open vSwitch:
843 </p>
844
845 <ul>
3d4b2e6e 846 <li>Packet type.</li>
96fee5e0
BP
847 <li>TCP flags.</li>
848 <li>Packet registers.</li>
849 <li>The output port in the OpenFlow action set.</li>
850 </ul>
851
96fee5e0
BP
852 <h1>Fields Reference</h1>
853
854 <p>
855 The following sections document the fields that Open vSwitch supports.
856 Each section provides introductory material on a group of related fields,
857 followed by information on each individual field. In addition to
858 field-specific information, each field begins with a table with entries for
859 the following important properties:
860 </p>
861
862 <dl>
863 <dt>Name</dt>
864 <dd>
865 The field's name, used for parsing and formatting the field, e.g. in
866 <code>ovs-ofctl</code> commands. For historical reasons, some fields
867 have an additional name that is accepted as an alternative in parsing.
868 This name, when there is one, is listed as well, e.g. ``<code>tun</code>
869 (aka <code>tunnel_id</code>).''
870 </dd>
871
872 <dt>Width</dt>
873 <dd>
874 The field's width, always a multiple of 8 bits. Some fields don't use
875 all of the bits, so this may be accompanied by an explanation. For
876 example, OpenFlow embeds the 2-bit IP ECN field as the low bits in an
877 8-bit byte, and so its width is expressed as ``8 bits (only the
878 least-significant 2 bits may be nonzero).''
879 </dd>
880
881 <dt>Format</dt>
882 <dd>
883 <p>
884 How a value for the field is formatted or parsed by, e.g.,
885 <code>ovs-ofctl</code>. Some possibilities are generic:
886 </p>
887
888 <dl>
889 <dt>decimal</dt>
890 <dd>
891 Formats as a decimal number. On input, accepts decimal numbers or
892 hexadecimal numbers prefixed by <code>0x</code>.
893 </dd>
894
895 <dt>hexadecimal</dt>
896 <dd>
897 Formats as a hexadecimal number prefixed by <code>0x</code>. On
898 input, accepts decimal numbers or hexadecimal numbers prefixed by
899 <code>0x</code>. (The default for parsing is <em>not</em>
900 hexadecimal: only a <code>0x</code> prefix causes input to be treated
901 as hexadecimal.)
902 </dd>
903
904 <dt>Ethernet</dt>
905 <dd>
906 Formats and accepts the common Ethernet address format
907 <code><var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var></code>.
908 </dd>
909
910 <dt>IPv4</dt>
911 <dd>
912 Formats and accepts the dotted-quad format
913 <code><var>a</var>.<var>b</var>.<var>c</var>.<var>d</var></code>.
914 For bitwise matches, formats and accepts
915 <code><var>address</var>/<var>length</var></code> CIDR notation in
916 addition to <code><var>address</var>/<var>mask</var></code>.
917 </dd>
918
919 <dt>IPv6</dt>
920 <dd>
921 Formats and accepts the common IPv6 address formats, plus CIDR
922 notation for bitwise matches.
923 </dd>
924
925 <dt>OpenFlow 1.0 port</dt>
926 <dd>
927 Accepts 16-bit port numbers in decimal, plus OpenFlow well-known port
928 names (e.g. <code>IN_PORT</code>) in uppercase or lowercase.
929 </dd>
930
931 <dt>OpenFlow 1.1+ port</dt>
932 <dd>
933 Same syntax as OpenFlow 1.0 ports but for 32-bit OpenFlow 1.1+ port
934 number fields.
935 </dd>
936 </dl>
937
938 <p>
939 Other, field-specific formats are explained along with their fields.
940 </p>
941 </dd>
942
943 <dt>Masking</dt>
944 <dd>
945 For most fields, this says ``arbitrary bitwise masks,'' meaning that a
946 flow may match any combination of bits in the field. Some fields
947 instead say ``exact match only,'' which means that a flow that matches
948 on this field must match on the whole field instead of just certain
949 bits. Either way, this reports masking support for the latest version
950 of Open vSwitch using OXM or NXM (that is, either OpenFlow 1.2+ or
951 OpenFlow 1.0 plus Open vSwitch NXM extensions). In particular,
952 OpenFlow 1.0 (without NXM) and 1.1 don't always support masking even if
953 Open vSwitch itself does; refer to the <em>OpenFlow 1.0</em> and
954 <em>OpenFlow 1.1</em> rows to learn about masking with these protocol
955 versions.
956 </dd>
957
958 <dt>Prerequisites</dt>
959 <dd>
960 <p>
961 Requirements that must be met to match on this field. For example,
962 <ref field="ip_src"/> has IPv4 as a prerequisite, meaning that a match
963 must include <code>eth_type=0x0800</code> to match on the IPv4 source
964 address. The following prerequisites, with their requirements, are
965 currently in use:
966 </p>
967
968 <dl>
969 <dt>none</dt>
970 <dd>(no requirements)</dd>
971
972 <dt>VLAN VID</dt>
973 <dd><code>vlan_tci=0x1000/0x1000</code> (i.e. a VLAN header is
974 present)</dd>
975
976 <dt>ARP</dt>
977 <dd><code>eth_type=0x0806</code> (ARP) or <code>eth_type=0x8035</code> (RARP)</dd>
978
979 <dt>IPv4</dt>
980 <dd><code>eth_type=0x0800</code></dd>
981
982 <dt>IPv6</dt>
983 <dd><code>eth_type=0x86dd</code></dd>
984
985 <dt>IPv4/IPv6</dt>
986 <dd>IPv4 or IPv6</dd>
987
988 <dt>MPLS</dt>
989 <dd><code>eth_type=0x8847</code> or <code>eth_type=0x8848</code></dd>
990
991 <dt>TCP</dt>
992 <dd>IPv4/IPv6 and <code>ip_proto=6</code></dd>
993
994 <dt>UDP</dt>
995 <dd>IPv4/IPv6 and <code>ip_proto=17</code></dd>
996
997 <dt>SCTP</dt>
998 <dd>IPv4/IPv6 and <code>ip_proto=132</code></dd>
999
1000 <dt>ICMPv4</dt>
1001 <dd>IPv4 and <code>ip_proto=1</code></dd>
1002
1003 <dt>ICMPv6</dt>
1004 <dd>IPv6 and <code>ip_proto=58</code></dd>
1005
1006 <dt>ND solicit</dt>
1007 <dd>ICMPv6 and <code>icmp_type=135</code> and <code>icmp_code=0</code></dd>
1008
1009 <dt>ND advert</dt>
1010 <dd>ICMPv6 and <code>icmp_type=136</code> and <code>icmp_code=0</code></dd>
1011
1012 <dt>ND</dt>
1013 <dd>ND solicit or ND advert</dd>
1014 </dl>
1015
1016 <p>
1017 The TCP, UDP, and SCTP prerequisites also have the special requirement
1018 that <code>nw_frag</code> is not being used to select ``later
1019 fragments.'' This is because only the first fragment of a fragmented
1020 IPv4 or IPv6 datagram contains the TCP, UDP, or SCTP header.
1021 </p>
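 <p>
 Because prerequisites nest (TCP requires IPv4/IPv6, which in turn requires
 one of two Ethertypes), it can help to see them expanded mechanically. The
 sketch below covers only a subset of the table above; the
 <code>RULES</code> table and <code>expand</code> function are hypothetical
 names for this illustration and are not part of Open vSwitch.
 </p>

```python
# Each entry maps a prerequisite name to (alternative parents, own matches).
# More than one parent means "any one of these", as with IPv4/IPv6.
RULES = {
    "IPv4": ([], ["eth_type=0x0800"]),
    "IPv6": ([], ["eth_type=0x86dd"]),
    "IPv4/IPv6": (["IPv4", "IPv6"], []),
    "TCP": (["IPv4/IPv6"], ["ip_proto=6"]),
    "UDP": (["IPv4/IPv6"], ["ip_proto=17"]),
    "SCTP": (["IPv4/IPv6"], ["ip_proto=132"]),
    "ICMPv4": (["IPv4"], ["ip_proto=1"]),
    "ICMPv6": (["IPv6"], ["ip_proto=58"]),
}


def expand(name: str) -> list:
    """Return every alternative list of matches that satisfies a prerequisite.

    Matches are listed outermost first, so each one appears after the
    matches that it depends on.
    """
    parents, own = RULES[name]
    if not parents:
        return [list(own)]
    alternatives = []
    for parent in parents:
        for parent_matches in expand(parent):
            alternatives.append(parent_matches + own)
    return alternatives


if __name__ == "__main__":
    for alternative in expand("TCP"):
        print(",".join(alternative))
```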
1022 </dd>
1023
1024 <dt>Access</dt>
1025 <dd>
1026 Most fields are ``read/write,'' which means that common OpenFlow actions
1027 like <code>set_field</code> can modify them. Fields that are
1028 ``read-only'' cannot be modified in these general-purpose ways, although
1029 there may be other ways that actions can modify them.
1030 </dd>
1031
1032 <dt>OpenFlow 1.0</dt>
1033 <dt>OpenFlow 1.1</dt>
1034 <dd>
1035 These rows report the level of support that OpenFlow 1.0 or OpenFlow 1.1,
1036 respectively, has for a field. For OpenFlow 1.0, supported fields are
1037 reported as either ``yes (exact match only)'' for fields that do not
1038 support any bitwise masking or ``yes (CIDR match only)'' for fields that
1039 support CIDR masking. OpenFlow 1.1 supported fields report either ``yes
1040 (exact match only)'' or simply ``yes'' for fields that do support
1041 arbitrary masks. These OpenFlow versions supported a fixed collection of
1042 fields that cannot be extended, so many more fields are reported as ``not
1043 supported.''
1044 </dd>
1045
1046 <dt>OXM</dt>
1047 <dt>NXM</dt>
1048 <dd>
1049 <p>
1050 These rows report the OXM and NXM code points that correspond to a
1051 given field. Either or both may be ``none.''
1052 </p>
1053
1054 <p>
1055 A field that has only an OXM code point is usually one that was
1056 standardized before it was added to Open vSwitch. A field that has
1057 only an NXM code point is usually one that is not yet standardized.
1058 When a field has both OXM and NXM code points, it usually indicates
1059 that it was introduced as an Open vSwitch extension under the NXM code
1060 point, then later standardized under the OXM code point. A field can
1061 have more than one OXM code point if it was standardized in OpenFlow
1062 1.4 or later and additionally introduced as an official ONF extension
1063 for OpenFlow 1.3. (A field that has neither OXM nor NXM code point is
1064 typically an obsolete field that is supported in some other form using
1065 OXM or NXM.)
1066 </p>
1067
1068 <p>
1069 Each code point in these rows is described in the form
1070 ``<code>NAME</code> (<var>number</var>) since OpenFlow <var>spec</var>
1071 and Open vSwitch <var>version</var>,''
1072 e.g. ``<code>OXM_OF_ETH_TYPE</code> (5) since OpenFlow 1.2 and Open
1073 vSwitch 1.7.'' First, <code>NAME</code>, which specifies a name for
1074 the code point, starts with a prefix that designates a class and, in
1075 some cases, a vendor, as listed in the following table:
1076 </p>
1077
1078 <oxm_classes/>
1079
1080 <p>
1081 For more information on OXM/NXM classes and vendors, refer back to
1082 <em>OpenFlow 1.2</em> under <em>Evolution of OpenFlow Fields</em>. The
1083 <var>number</var> is the field number within the class and vendor. The
1084 OpenFlow <var>spec</var> is the version of OpenFlow that standardized
1085 the code point. It is omitted for NXM code points because they are
1086 nonstandard. The <var>version</var> is the version of Open vSwitch
1087 that first supported the code point.
1088 </p>
1089 </dd>
1090 </dl>
1091
1092 <group title="Conjunctive Match">
1093 <p>
1094 An individual OpenFlow flow can match only a single value for each field.
1095 However, situations often arise where one wants to match one of a set of
1096 values within a field or fields. For matching a single field against a
1097 set, it is straightforward and efficient to add multiple flows to the
1098 flow table, one for each value in the set. For example, one might use
1099 the following flows to send packets with IP source address <var>a</var>,
1100 <var>b</var>, <var>c</var>, or <var>d</var> to the OpenFlow controller:
1101 </p>
1102
1103 <pre>
1104 ip,ip_src=<var>a</var> actions=controller
1105 ip,ip_src=<var>b</var> actions=controller
1106 ip,ip_src=<var>c</var> actions=controller
1107 ip,ip_src=<var>d</var> actions=controller
1108 </pre>
1109
1110 <p>
1111 Similarly, these flows send packets with IP destination address
1112 <var>e</var>, <var>f</var>, <var>g</var>, or <var>h</var> to the OpenFlow
1113 controller:
1114 </p>
1115
1116 <pre>
1117 ip,ip_dst=<var>e</var> actions=controller
1118 ip,ip_dst=<var>f</var> actions=controller
1119 ip,ip_dst=<var>g</var> actions=controller
1120 ip,ip_dst=<var>h</var> actions=controller
1121 </pre>
1122
1123 <p>
1124 Installing all of the above flows in a single flow table yields a
1125 disjunctive effect: a packet is sent to the controller if
1126 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>}
1127 or <code>ip_dst</code> ∈
1128 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} (or both).
1129 (Pedantically, if both of the above sets of flows are present in the flow
1130 table, they should have different priorities, because OpenFlow says that
1131 the results are undefined when two flows with the same priority can both match
1132 a single packet.)
1133 </p>
1134
1135 <p>
1136 Suppose, on the other hand, one wishes to match conjunctively, that is, to
1137 send a packet to the controller only if both <code>ip_src</code> ∈
1138 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1139 <code>ip_dst</code> ∈
1140 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>}. This requires 4 × 4
1141 = 16 flows, one for each possible pairing of <code>ip_src</code> and
1142 <code>ip_dst</code>. That is acceptable for our small example, but it does
1143 not gracefully extend to larger sets or greater numbers of dimensions.
1144 </p>
1145
1146 <p>
1147 The <code>conjunction</code> action is a solution for conjunctive matches
1148 that is built into Open vSwitch. A <code>conjunction</code> action ties groups of
1149 individual OpenFlow flows into higher-level ``conjunctive flows''. Each
1150 group corresponds to one dimension, and each flow within the group matches
1151 one possible value for the dimension. A packet that matches one flow from
1152 each group matches the conjunctive flow.
1153 </p>
1154
1155 <p>
1156 To implement a conjunctive flow with <code>conjunction</code>, assign the
1157 conjunctive flow a 32-bit <var>id</var>, which must be unique within an
1158 OpenFlow table. Assign each of the <var>n</var> ≥ 2 dimensions a unique
1159 number from 1 to <var>n</var>; the ordering is unimportant. Add one flow
1160 to the OpenFlow flow table for each possible value of each dimension with
1161 <code>conjunction(<var>id</var>, <var>k</var>/<var>n</var>)</code> as the
1162 flow's actions, where <var>k</var> is the number assigned to the flow's
1163 dimension. Together, these flows specify the conjunctive flow's match
1164 condition. When the conjunctive match condition is met, Open vSwitch looks
1165 up one more flow that specifies the conjunctive flow's actions and receives
1166 its statistics. This flow is found by setting <code>conj_id</code> to the
1167 specified <var>id</var> and then again searching the flow table.
1168 </p>
1169
1170 <p>
1171 The following flows provide an example. Whenever the IP source is one of
1172 the values in the flows that match on the IP source (dimension 1 of 2),
1173 <em>and</em> the IP destination is one of the values in the flows that
1174 match on IP destination (dimension 2 of 2), Open vSwitch searches for a
1175 flow that matches <code>conj_id</code> against the conjunction ID (1234),
1176 finding the first flow listed below.
1177 </p>
1178
1179 <pre>
1180 conj_id=1234 actions=controller
1181 ip,ip_src=10.0.0.1 actions=conjunction(1234, 1/2)
1182 ip,ip_src=10.0.0.4 actions=conjunction(1234, 1/2)
1183 ip,ip_src=10.0.0.6 actions=conjunction(1234, 1/2)
1184 ip,ip_src=10.0.0.7 actions=conjunction(1234, 1/2)
1185 ip,ip_dst=10.0.0.2 actions=conjunction(1234, 2/2)
1186 ip,ip_dst=10.0.0.5 actions=conjunction(1234, 2/2)
1187 ip,ip_dst=10.0.0.7 actions=conjunction(1234, 2/2)
1188 ip,ip_dst=10.0.0.8 actions=conjunction(1234, 2/2)
1189 </pre>
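 <p>
 Flow lists of this shape are easy to generate. The following Python is a
 minimal sketch that builds the flow strings for one conjunctive flow; the
 helper name <code>conjunction_flows</code> is hypothetical, and the output
 is plain text in the syntax shown above, not something this code installs
 into a switch. With two dimensions of four values each it produces
 1 + 4 + 4 = 9 flows, compared with the 16 needed for the explicit
 cross-product.
 </p>

```python
def conjunction_flows(conj_id: int, dimensions: list, actions: str) -> list:
    """Build flow strings for one conjunctive flow.

    dimensions is a list of lists; each inner list holds the match strings
    for one dimension. The first returned flow carries the real actions.
    """
    n = len(dimensions)
    if not n >= 2:
        raise ValueError("a conjunctive match needs at least 2 dimensions")
    flows = [f"conj_id={conj_id} actions={actions}"]
    for k, matches in enumerate(dimensions, start=1):
        for match in matches:
            flows.append(f"{match} actions=conjunction({conj_id}, {k}/{n})")
    return flows


if __name__ == "__main__":
    sources = [f"ip,ip_src=10.0.0.{i}" for i in (1, 4, 6, 7)]
    destinations = [f"ip,ip_dst=10.0.0.{i}" for i in (2, 5, 7, 8)]
    for flow in conjunction_flows(1234, [sources, destinations], "controller"):
        print(flow)
```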
1190
1191 <p>
1192 Many subtleties exist:
1193 </p>
1194
1195 <ul>
1196 <li>
1197 In the example above, every flow in a single dimension has the same form,
1198 that is, dimension 1 matches on <code>ip_src</code> and dimension 2 on
1199 <code>ip_dst</code>, but this is not a requirement. Different flows
1200 within a dimension may match on different bits within a field (e.g. IP
1201 network prefixes of different lengths, or TCP/UDP port ranges as bitwise
1202 matches), or even on entirely different fields (e.g. to match packets for
1203 TCP source port 80 or TCP destination port 80).
1204 </li>
1205
1206 <li>
1207 The flows within a dimension can vary their matches across more than
1208 one field, e.g. to match only specific pairs of IP source and
1209 destination addresses or L4 port numbers.
1210 </li>
1211
1212 <li>
1213 A flow may have multiple <code>conjunction</code> actions, with different
1214 <code>id</code> values. This is useful for multiple conjunctive flows with
1215 overlapping sets. If one conjunctive flow matches packets with both
1216 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>} and <code>ip_dst</code> ∈
1217 {<var>d</var>,<var>e</var>} and a second conjunctive flow matches <code>ip_src</code>
1218 ∈ {<var>b</var>,<var>c</var>} and <code>ip_dst</code> ∈ {<var>f</var>,<var>g</var>}, for
1219 example, then the flow that matches <code>ip_src=</code><var>b</var> would have two
1220 <code>conjunction</code> actions, one for each conjunctive flow. The order
1221 of <code>conjunction</code> actions within a list of actions is not
1222 significant.
1223 </li>
1224 <li>
1225 A flow with <code>conjunction</code> actions may also include <code>note</code>
1226 actions for annotations, but not any other kind of actions. (They
1227 would not be useful because they would never be executed.)
1228 </li>
1229 <li>
1230 All of the flows that constitute a conjunctive flow with a given
1231 <var>id</var> must have the same priority. (Flows with the same <var>id</var>
1232 but different priorities are currently treated as different
1233 conjunctive flows, that is, currently <var>id</var> values need only be
1234 unique within an OpenFlow table at a given priority. This behavior
1235 isn't guaranteed to stay the same in later releases, so please use
1236 <var>id</var> values unique within an OpenFlow table.)
1237 </li>
1238 <li>
1239 Conjunctive flows must not overlap with each other, at a given
1240 priority, that is, any given packet must be able to match at most one
1241 conjunctive flow at a given priority. Overlapping conjunctive flows
1242 yield unpredictable results.
1243 </li>
1244 <li>
1245 Following a conjunctive flow match, the search for the flow with
1246 <code>conj_id=</code><var>id</var> is done in the same general-purpose way as
1247 other flow table searches, so one can use flows with
1248 <code>conj_id=</code><var>id</var> to act differently depending on
1249 circumstances. (One exception is that the search for the
1250 <code>conj_id=</code><var>id</var> flow itself ignores conjunctive flows, to
1251 avoid recursion.) If the search with <code>conj_id=</code><var>id</var> fails,
1252 Open vSwitch acts as if the conjunctive flow had not matched at all, and
1253 continues searching the flow table for other matching flows.
1254 </li>
1255 <li>
1256 <p>
1257 OpenFlow prerequisite checking occurs for the flow with
1258 <code>conj_id=</code><var>id</var> in the same way as any other flow, e.g. in
1259 an OpenFlow 1.1+ context, putting a <code>mod_nw_src</code> action into the example
1260 above would require adding an <code>ip</code> match, like this:
1261 </p>
1262 <pre>
1263 conj_id=1234,ip actions=mod_nw_src:1.2.3.4,controller
1264 </pre>
1265 </li>
1266 <li>
1267 OpenFlow prerequisite checking also occurs for the individual flows
1268 that make up a conjunctive match in the same way as any other flow.
1269 </li>
1270 <li>
1271 The flows that constitute a conjunctive flow do not have useful
1272 statistics. They are never updated with byte or packet counts, and so
1273 on. (For such a flow, therefore, the idle and hard timeouts work much
1274 the same way.)
1275 </li>
1276 <li>
1277 <p>
1278 Sometimes there is a choice of which flows include a particular match.
1279 For example, suppose that we added an extra constraint to our example,
1280 to match on <code>ip_src</code> ∈
1281 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1282 <code>ip_dst</code> ∈
1283 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} and
1284 <code>tcp_dst</code> = <var>i</var>. One way to implement this is to
1285 add the new constraint to the <code>conj_id</code> flow, like this:
1286 </p>
1287 <pre>
1288 conj_id=1234,tcp,tcp_dst=<var>i</var> actions=mod_nw_src:1.2.3.4,controller
1289 </pre>
1290 <p>
1291 but <em>this is not recommended</em> because of the cost of the extra
1292 flow table lookup. Instead, add the constraint to the individual
1293 flows, either in one of the dimensions or (slightly better) all of
1294 them.
1295 </p>
1296 </li>
1297 <li>
1298 A conjunctive match must have <var>n</var> ≥ 2 dimensions (otherwise a
1299 conjunctive match is not necessary). Open vSwitch enforces this.
1300 </li>
1301 <li>
1302 Each dimension within a conjunctive match should ordinarily have more
1303 than one flow. Open vSwitch does not enforce this.
1304 </li>
1305 </ul>
1306
1307 <field id="MFF_CONJ_ID" title="Conjunction ID">
1308 Used for conjunctive matching. See above for more information.
1309 </field>
1310 </group>
1311
1312 <group title="Tunnel">
1313 <p>
1314 The fields in this group relate to tunnels, which Open vSwitch
1315 supports in several forms (GRE, VXLAN, and so on). Most of
1316 these fields do appear in the wire format of a packet, so they
1317 are data fields from that point of view, but they are metadata
1318 from an OpenFlow flow table point of view because they do not
1319 appear in packets that are forwarded to the controller or to
1320 ordinary (non-tunnel) output ports.
1321 </p>
1322
1323 <p>
1324 Open vSwitch supports a spectrum of usage models for mapping
1325 tunnels to OpenFlow ports:
1326 </p>
1327
1328 <dl>
1329 <dt>``Port-based'' tunnels</dt>
1330 <dd>
5a0e4aec
BP
1331 <p>
1332 In this model, an OpenFlow port represents one tunnel: it matches a
1333 particular type of tunnel traffic between two IP endpoints, with a
1334 particular tunnel key (if keys are in use). In this situation, <ref
1335 field="in_port"/> suffices to distinguish one tunnel from another, so
1336 the tunnel header fields have little importance for OpenFlow
1337 processing. (They are still populated and may be used if it is
1338 convenient.) The tunnel header fields play no role in sending
1339 packets out such an OpenFlow port, either, because the OpenFlow port
1340 itself fully specifies the tunnel headers.
1341 </p>
1342
1343 <p>
1344 The following Open vSwitch commands create a bridge
1345 <code>br-int</code>, add port <code>tap0</code> to the bridge as
1346 OpenFlow port 1, establish a port-based GRE tunnel between the local
1347 host and remote IP 192.168.1.1 using GRE key 5001 as OpenFlow port 2,
1348 and arrange to forward all traffic from <code>tap0</code> to the
1349 tunnel and vice versa:
1350 </p>
1351
1352 <pre>
96fee5e0
BP
1353ovs-vsctl add-br br-int
1354ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1355ovs-vsctl add-port br-int gre0 -- \
1356 set interface gre0 ofport_request=2 type=gre \
1357 options:remote_ip=192.168.1.1 options:key=5001
1358ovs-ofctl add-flow br-int in_port=1,actions=2
1359ovs-ofctl add-flow br-int in_port=2,actions=1
5a0e4aec 1360 </pre>
96fee5e0
BP
1361 </dd>
1362
1363 <dt>``Flow-based'' tunnels</dt>
1364 <dd>
5a0e4aec
BP
1365 <p>
1366 In this model, one OpenFlow port represents all possible tunnels of a
1367 given type with an endpoint on the current host, for example, all GRE
1368 tunnels. In this situation, <ref field="in_port"/> only indicates
1369 that traffic was received on the particular kind of tunnel. This is
1370 where the tunnel header fields are most important: they allow the
1371 OpenFlow tables to discriminate among tunnels based on their IP
1372 endpoints or keys. Tunnel header fields also determine the IP
1373 endpoints and keys of packets sent out such a tunnel port.
1374 </p>
1375
1376 <p>
1377 The following Open vSwitch commands create a bridge
1378 <code>br-int</code>, add port <code>tap0</code> to the
1379 bridge as OpenFlow port 1, establish a flow-based GRE tunnel
1380 port 3, and arrange to forward all traffic from
1381 <code>tap0</code> to remote IP 192.168.1.1 over a GRE tunnel
1382 with key 5001 and vice versa:
1383 </p>
1384
1385 <pre>
96fee5e0
BP
1386ovs-vsctl add-br br-int
1387ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1388ovs-vsctl add-port br-int allgre -- \
1389 set interface allgre ofport_request=3 type=gre \
1390 options:remote_ip=flow options:key=flow
1391ovs-ofctl add-flow br-int \
1392 'in_port=1 actions=set_tunnel:5001,set_field:192.168.1.1->tun_dst,3'
1393ovs-ofctl add-flow br-int 'in_port=3,tun_src=192.168.1.1,tun_id=5001 actions=1'
5a0e4aec 1394 </pre>
96fee5e0
BP
1395 </dd>
1396
1397 <dt>Mixed models.</dt>
1398 <dd>
5a0e4aec
BP
1399 <p>
1400 One may define both flow-based and port-based tunnels at the
1401 same time. For example, it is valid and possibly useful to
1402 create and configure both <code>gre0</code> and
1403 <code>allgre</code> tunnel ports described above.
1404 </p>
1405
1406 <p>
1407 Traffic is attributed on ingress to the most specific
1408 matching tunnel. For example, <code>gre0</code> is more
1409 specific than <code>allgre</code>. Therefore, if both
1410 exist, then <code>gre0</code> will be the ingress port for any
1411 GRE traffic received from 192.168.1.1 with key 5001.
1412 </p>
1413
1414 <p>
1415 On egress, traffic may be directed to any appropriate tunnel
1416 port. If both <code>gre0</code> and <code>allgre</code> are
1417 configured as already described, then the actions
1418 <code>2</code> and
1419 <code>set_tunnel:5001,set_field:192.168.1.1->tun_dst,3</code>
1420 send the same tunnel traffic.
1421 </p>
96fee5e0
BP
1422 </dd>
1423
1424 <dt>Intermediate models.</dt>
1425 <dd>
5a0e4aec
BP
1426 Ports may be configured as partially flow-based. For example,
1427 one may define an OpenFlow port that represents tunnels
1428 between a pair of endpoints but leaves the flow table to
1429 discriminate on the flow key.
96fee5e0
BP
1430 </dd>
1431 </dl>
1432
1433 <p>
1434 <code>ovs-vswitchd.conf.db</code>(5) describes all the details of tunnel
1435 configuration.
1436 </p>
1437
1438 <p>
1439 These fields do not have any prerequisites, which means that a
1440 flow may match on any or all of them, in any combination.
1441 </p>
1442
1443 <p>
1444 These fields are zeros for packets that did not arrive on a tunnel.
1445 </p>
1446
1447 <field id="MFF_TUN_ID" title="Tunnel ID">
1448 <p>
5a0e4aec 1449 Many kinds of tunnels support a tunnel ID:
96fee5e0
BP
1450 </p>
1451
1452 <ul>
5a0e4aec 1453 <li>
96fee5e0
BP
1454 VXLAN and Geneve have a 24-bit virtual network identifier (VNI).
1455 </li>
5a0e4aec
BP
1456 <li>LISP has a 24-bit instance ID.</li>
1457 <li>GRE has an optional 32-bit key.</li>
1458 <li>STT has a 64-bit key.</li>
7dc18ae9 1459 <li>ERSPAN has a 10-bit key (Session ID).</li>
96fee5e0
BP
1460 </ul>
1461
1462 <p>
5a0e4aec
BP
1463 When a packet is received from a tunnel, this field holds the
1464 tunnel ID in its least significant bits, zero-extended to fit.
1465 This field is zero if the tunnel does not support an ID, or if
1466 no ID is in use for a tunnel type that has an optional ID, or
1467 if an ID of zero was received, or if the packet was not received
1468 over a tunnel.
96fee5e0
BP
1469 </p>
1470
1471 <p>
5a0e4aec
BP
1472 When a packet is output to a tunnel port, the tunnel
1473 configuration determines whether the tunnel ID is taken from
1474 this field or bound to a fixed value. See the earlier
1475 description of ``port-based'' and ``flow-based'' tunnels for
1476 more information.
96fee5e0
BP
1477 </p>
1478
1479 <p>
5a0e4aec
BP
1480 The following diagram shows the origin of this field in a
1481 typical keyed GRE tunnel:
96fee5e0
BP
1482 </p>
1483
1484 <diagram>
5a0e4aec
BP
1485 <header name="Ethernet">
1486 <bits name="dst" above="48" width="0.4"/>
1487 <bits name="src" above="48" width="0.4"/>
1488 <bits name="type" above="16" below="0x800" width="0.4"/>
1489 </header>
1490 <header name="IPv4">
1491 <bits name="..." width="0.4"/>
1492 <bits name="proto" above="8" below="47" width="0.4"/>
1493 <bits name="src" above="32" width="0.4"/>
1494 <bits name="dst" above="32" width="0.4"/>
1495 </header>
1496 <header name="GRE">
1497 <bits name="..." above="16" width="0.4"/>
1498 <bits name="type" above="16" below="0x6558" width="0.4"/>
1499 <bits name="key" above="32" width=".4" fill="yes"/>
1500 </header>
1501 <header name="Ethernet">
1502 <bits name="dst" above="48" width="0.4"/>
1503 <bits name="src" above="48" width="0.4"/>
1504 <bits name="type" above="16" width="0.4"/>
1505 </header>
1506 <dots/>
96fee5e0
BP
1507 </diagram>
1508 </field>
1509
1510 <field id="MFF_TUN_SRC" title="Tunnel IPv4 Source">
1511 <p>
5a0e4aec
BP
1512 When a packet is received from a tunnel, this field is the
1513 source address in the outer IP header of the tunneled packet.
1514 This field is zero if the packet was not received over a
1515 tunnel.
96fee5e0
BP
1516 </p>
1517
1518 <p>
5a0e4aec
BP
1519 When a packet is output to a flow-based tunnel port, this
1520 field influences the IPv4 source address used to send the
1521 packet. If it is zero, then the kernel chooses an appropriate
1522 IP address using the routing table.
96fee5e0
BP
1523 </p>
1524
1525 <p>
5a0e4aec
BP
1526 The following diagram shows the origin of this field in a
1527 typical keyed GRE tunnel:
96fee5e0
BP
1528 </p>
1529
1530 <diagram>
5a0e4aec
BP
1531 <header name="Ethernet">
1532 <bits name="dst" above="48" width="0.4"/>
1533 <bits name="src" above="48" width="0.4"/>
1534 <bits name="type" above="16" below="0x800" width="0.4"/>
1535 </header>
1536 <header name="IPv4">
1537 <bits name="..." width="0.4"/>
1538 <bits name="proto" above="8" below="47" width="0.4"/>
1539 <bits name="src" above="32" width="0.4" fill="yes"/>
1540 <bits name="dst" above="32" width="0.4"/>
1541 </header>
1542 <header name="GRE">
1543 <bits name="..." above="16" width="0.4"/>
1544 <bits name="type" above="16" below="0x6558" width="0.4"/>
1545 <bits name="key" above="32" width=".4"/>
1546 </header>
1547 <header name="Ethernet">
1548 <bits name="dst" above="48" width="0.4"/>
1549 <bits name="src" above="48" width="0.4"/>
1550 <bits name="type" above="16" width="0.4"/>
1551 </header>
1552 <dots/>
96fee5e0
BP
1553 </diagram>
1554 </field>
1555
1556 <field id="MFF_TUN_DST" title="Tunnel IPv4 Destination">
1557 <p>
5a0e4aec
BP
1558 When a packet is received from a tunnel, this field is the
1559 destination address in the outer IP header of the tunneled
1560 packet. This field is zero if the packet was not received
1561 over a tunnel.
96fee5e0
BP
1562 </p>
1563
1564 <p>
5a0e4aec
BP
1565 When a packet is output to a flow-based tunnel port, this
1566 field specifies the destination to which the tunnel packet is
1567 sent.
96fee5e0
BP
1568 </p>
1569
1570 <p>
5a0e4aec
BP
1571 The following diagram shows the origin of this field in a
1572 typical keyed GRE tunnel:
96fee5e0
BP
1573 </p>
1574
1575 <diagram>
5a0e4aec
BP
1576 <header name="Ethernet">
1577 <bits name="dst" above="48" width="0.4"/>
1578 <bits name="src" above="48" width="0.4"/>
1579 <bits name="type" above="16" below="0x800" width="0.4"/>
1580 </header>
1581 <header name="IPv4">
1582 <bits name="..." width="0.4"/>
1583 <bits name="proto" above="8" below="47" width="0.4"/>
1584 <bits name="src" above="32" width="0.4"/>
1585 <bits name="dst" above="32" width="0.4" fill="yes"/>
1586 </header>
1587 <header name="GRE">
1588 <bits name="..." above="16" width="0.4"/>
1589 <bits name="type" above="16" below="0x6558" width="0.4"/>
1590 <bits name="key" above="32" width=".4"/>
1591 </header>
1592 <header name="Ethernet">
1593 <bits name="dst" above="48" width="0.4"/>
1594 <bits name="src" above="48" width="0.4"/>
1595 <bits name="type" above="16" width="0.4"/>
1596 </header>
1597 <dots/>
96fee5e0
BP
1598 </diagram>
1599 </field>
1600
1601 <field id="MFF_TUN_IPV6_SRC" title="Tunnel IPv6 Source">
1602 Similar to <ref field="tun_src"/>, but for tunnels over IPv6.
1603 </field>
1604
1605 <field id="MFF_TUN_IPV6_DST" title="Tunnel IPv6 Destination">
1606 Similar to <ref field="tun_dst"/>, but for tunnels over IPv6.
1607 </field>
1608
1609 <h2>VXLAN Group-Based Policy Fields</h2>
1610
1611 <p>
1612 The VXLAN header is defined as follows [RFC 7348], where the
1613 <code>I</code> bit must be set to 1, unlabeled bits or those labeled
1614 <code>reserved</code> must be set to 0, and Open vSwitch makes the VNI
1615 available via <ref field="tun_id"/>:
1616 </p>
1617
1618 <diagram>
1619 <header name="VXLAN flags">
1620 <bits name="" above="1" width="0.15"/>
1621 <bits name="" above="1" width="0.15"/>
1622 <bits name="" above="1" width="0.15"/>
1623 <bits name="" above="1" width="0.15"/>
1624 <bits name="I" above="1" width="0.15"/>
1625 <bits name="" above="1" width="0.15"/>
1626 <bits name="" above="1" width="0.15"/>
1627 <bits name="" above="1" width="0.15"/>
1628 </header>
1629 <nospace/>
1630 <header>
1631 <bits name="reserved" above="24" width="1.2"/>
1632 <bits name="VNI" above="24" width="1.2"/>
1633 <bits name="reserved" above="8" width=".5"/>
1634 </header>
1635 </diagram>
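<p>
As a rough illustration of the layout above, the following Python sketch
packs and unpacks the base VXLAN header. The helper names
<code>pack_vxlan_header</code> and <code>unpack_vni</code> are invented for
this example and are not part of Open vSwitch.
</p>

```python
import struct

VXLAN_FLAG_I = 0x08  # the I bit: fifth bit from the left of the flags byte


def pack_vxlan_header(vni):
    """Return the 8-byte VXLAN header for a 24-bit VNI (illustrative helper)."""
    if vni not in range(2**24):
        raise ValueError("VNI must fit in 24 bits")
    # flags (8 bits), reserved (24 bits), VNI (24 bits), reserved (8 bits)
    return struct.pack("!B3s3sB", VXLAN_FLAG_I, bytes(3), vni.to_bytes(3, "big"), 0)


def unpack_vni(header):
    """Extract the VNI, requiring I to be the only flag bit that is set."""
    flags, _, vni, _ = struct.unpack("!B3s3sB", header)
    if flags != VXLAN_FLAG_I:
        raise ValueError("unexpected VXLAN flags")
    return int.from_bytes(vni, "big")
```

<p>
The strict flags check mirrors the rule that the <code>I</code> bit must be 1
and unlabeled bits must be 0; a receiver for the GBP variant would need a
looser check.
</p>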
1636
1637 <p>
1638 VXLAN Group-Based Policy [VXLAN Group Policy Option] adds new
1639 interpretations to existing bits in the VXLAN header, reinterpreting it
1640 as follows, with changes highlighted:
1641 </p>
1642
1643 <diagram>
1644 <header name="GBP flags">
1645 <bits name="" above="1" width="0.15"/>
1646 <bits name="D" above="1" width="0.15" fill="yes"/>
1647 <bits name="" above="1" width="0.15"/>
1648 <bits name="" above="1" width="0.15"/>
1649 <bits name="A" above="1" width="0.15" fill="yes"/>
1650 <bits name="" above="1" width="0.15"/>
1651 <bits name="" above="1" width="0.15"/>
1652 <bits name="" above="1" width="0.15"/>
1653 </header>
1654 <nospace/>
1655 <header>
1656 <bits name="group policy ID" above="24" width="1.2" fill="yes"/>
1657 <bits name="VNI" above="24" width="1.2"/>
1658 <bits name="reserved" above="8" width=".5"/>
1659 </header>
1660 </diagram>
1661
1662 <p>
1663 Open vSwitch makes GBP fields and flags available through the following
1664 fields. Only packets that arrive over a VXLAN tunnel with the GBP
1665 extension enabled have these fields set. In other packets they are zero
1666 on receive and ignored on transmit.
1667 </p>
1668
1669 <field id="MFF_TUN_GBP_ID" title="VXLAN Group-Based Policy ID">
1670 <p>
1671 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1672 extension, this field represents the GBP policy ID, as shown above.
1673 </p>
1674 </field>
1675
1676 <field id="MFF_TUN_GBP_FLAGS" title="VXLAN Group-Based Policy Flags">
1677 <p>
1678 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1679 extension, this field represents the GBP policy flags, as shown above.
1680 </p>
1681
1682 <p>
1683 The field has the format shown below:
1684 </p>
1685
1686 <diagram>
1687 <header name="GBP Flags">
1688 <bits name="" above="1" width="0.15"/>
1689 <bits name="D" above="1" width="0.15"/>
1690 <bits name="" above="1" width="0.15"/>
1691 <bits name="" above="1" width="0.15"/>
1692 <bits name="A" above="1" width="0.15"/>
1693 <bits name="" above="1" width="0.15"/>
1694 <bits name="" above="1" width="0.15"/>
1695 <bits name="" above="1" width="0.15"/>
1696 </header>
1697 </diagram>
1698
1699 <p>
1700 Unlabeled bits are reserved and must be transmitted as 0. The VXLAN
1701 GBP draft defines the other bits' meanings as:
1702 </p>
1703
1704 <dl>
1705 <dt><code>D</code> (Don't Learn)</dt>
1706 <dd>
1707 When set, this bit indicates that the egress tunnel endpoint must not
1708 learn the source address of the encapsulated frame.
1709 </dd>
1710
1711 <dt><code>A</code> (Applied)</dt>
1712 <dd>
1713 When set, this bit indicates that the group policy has already been applied to
1714 this packet. Devices must not apply policies when the A bit is set.
1715 </dd>
1716 </dl>
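<p>
A minimal Python sketch of decoding this field follows. It assumes the
diagram above is drawn most significant bit first, which would put
<code>D</code> at <code>0x40</code> and <code>A</code> at <code>0x08</code>;
the class and function names are hypothetical.
</p>

```python
from enum import IntFlag


class GbpFlags(IntFlag):
    """Defined bits of the 8-bit GBP flags value (assumed MSB-first layout)."""
    DONT_LEARN = 0x40  # D: second bit from the left
    APPLIED = 0x08     # A: fifth bit from the left


def describe_gbp_flags(value):
    """Return the names of the defined flags that are set in value."""
    flags = GbpFlags(value)
    return [flag.name for flag in GbpFlags if flag in flags]
```

<p>
For example, <code>describe_gbp_flags(0x08)</code> reports only
<code>APPLIED</code>, the case in which a device must not apply policy again.
</p>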
1717 </field>
1718
7dc18ae9
WT
1719 <h2>ERSPAN Metadata Fields</h2>
1720 <p>
1721 These fields provide access to features in the ERSPAN tunneling protocol
1722 [ERSPAN], which has two major versions: version 1 (aka type II) and
1723 version 2 (aka type III).
1724 </p>
1725
1726 <p>
1727 Regardless of version, ERSPAN is encapsulated within a fixed 8-byte GRE
1728 header that consists of a 4-byte GRE base header and a 4-byte sequence
1729 number. The ERSPAN version 1 header format is:
1730 </p>
1731
1732 <diagram>
1733 <header name="GRE">
1734 <bits name="..." above="16" width="0.4"/>
1735 <bits name="type" above="16" below="0x88be" width="0.4"/>
1736 <bits name="seq" above="32" width=".4"/>
1737 </header>
1738 <header name="ERSPAN v1">
1739 <bits name="ver" above="4" below="1" width="0.4"/>
1740 <bits name="..." above="18" width="0.4"/>
1741 <bits name="session" above="10" below="tun_id" width="0.5"/>
1742 <bits name="..." above="12" width="0.4"/>
1743 <bits name="idx" above="20" width="0.6"/>
1744 </header>
1745 <header name="Ethernet">
1746 <bits name="dst" above="48" width="0.4"/>
1747 <bits name="src" above="48" width="0.4"/>
1748 <bits name="type" above="16" width="0.4"/>
1749 </header>
1750 <dots/>
1751 </diagram>
1752
1753 <p>
1754 The ERSPAN version 2 header format is:
1755 </p>
1756
1757 <diagram>
1758 <header name="GRE">
1759 <bits name="..." above="16" width="0.4"/>
1760 <bits name="type" above="16" below="0x22eb" width="0.4"/>
1761 <bits name="seq" above="32" width=".4"/>
1762 </header>
1763 <header name="ERSPAN v2">
1764 <bits name="ver" above="4" below="2" width="0.4"/>
1765 <bits name="..." above="18" width="0.4"/>
1766 <bits name="session" above="10" below="tun_id" width="0.5"/>
1767 <bits name="timestamp" above="32" width=".7"/>
1768 <bits name="..." above="22" width="0.4"/>
1769 <bits name="hwid" above="6" width="0.4"/>
1770 <bits name="dir" above="1" below="0/1" width="0.4"/>
1771 <bits name="..." above="3" width="0.4"/>
1772 </header>
1773 <header name="Ethernet">
1774 <bits name="dst" above="48" width="0.4"/>
1775 <bits name="src" above="48" width="0.4"/>
1776 <bits name="type" above="16" width="0.4"/>
1777 </header>
1778 <dots/>
1779 </diagram>
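<p>
The two layouts above can be decoded with a few shifts, as in the following
Python sketch. It takes the bytes that follow the 8-byte GRE header and
relies only on the bit widths drawn in the diagrams; the function name and
the dictionary keys are chosen for illustration.
</p>

```python
ERSPAN_HEADER_LEN = {1: 8, 2: 12}  # header size in bytes, per the diagrams


def parse_erspan(header):
    """Decode the ERSPAN fields from the bytes that follow the GRE header."""
    first = int.from_bytes(header[:4], "big")
    ver = first >> 28                      # top 4 bits
    size = ERSPAN_HEADER_LEN.get(ver)
    if size is None:
        raise ValueError("unsupported ERSPAN version")
    if size > len(header):
        raise ValueError("truncated ERSPAN header")
    fields = {"ver": ver, "session": first % 2**10}  # low 10 bits: session
    if ver == 1:
        # second word: 12 unlabeled bits, then the 20-bit index
        fields["idx"] = int.from_bytes(header[4:8], "big") % 2**20
    else:
        # third word: 22 unlabeled bits, hwid (6), dir (1), 3 unlabeled bits
        third = int.from_bytes(header[8:12], "big")
        fields["hwid"] = (third >> 4) % 2**6
        fields["dir"] = (third >> 3) % 2   # 0 = ingress, 1 = egress
    return fields
```

<p>
The <code>session</code> value is the one the diagrams label
<code>tun_id</code>; the other keys correspond to the fields described below.
</p>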
1780
1781 <field id="MFF_TUN_ERSPAN_VER" title="ERSPAN Version">
1782 ERSPAN version number: 1 for version 1, or 2 for version 2.
1783 </field>
1784
1785 <field id="MFF_TUN_ERSPAN_IDX" title="ERSPAN Index">
1786 This field is a 20-bit index/port number associated with the ERSPAN
1787 traffic's source port and direction (ingress/egress). This field is
1788 platform dependent.
1789 </field>
1790
1791 <field id="MFF_TUN_ERSPAN_DIR" title="ERSPAN Direction">
1792 For ERSPAN v2, the mirrored traffic's direction: 0 for ingress traffic, 1
1793 for egress traffic.
1794 </field>
1795
1796 <field id="MFF_TUN_ERSPAN_HWID" title="ERSPAN Hardware ID">
1797 A 6-bit unique identifier of an ERSPAN v2 engine within a system.
1798 </field>
1799
96fee5e0
BP
1800 <h2>Geneve Fields</h2>
1801
1802 <p>
1803 These fields provide access to additional features in the Geneve
1804 tunneling protocol [Geneve]. Their names are somewhat generic in the
1805 hope that the same fields could be reused for other protocols in the
1806 future; for example, the NSH protocol [NSH] supports TLV options whose
1807 form is identical to that for Geneve options.
1808 </p>
1809
1810 <field id="MFF_TUN_METADATA0" title="Generic Tunnel Option 0">
1811 <p>
1812 The above information specifically covers generic tunnel option 0, but
1813 Open vSwitch supports 64 options, numbered 0 through 63, whose
1814 NXM field numbers are 40 through 103.
1815 </p>
1816
1817 <p>
1818 These fields provide OpenFlow access to the generic type-length-value
1819 options defined by the Geneve tunneling protocol or other protocols
1820 with options in the same TLV format as Geneve options. Each of these
1821 options has the following wire format:
1822 </p>
1823
1824 <diagram>
5a0e4aec
BP
1825 <header name="header">
1826 <bits name="class" above="16" width="0.6"/>
1827 <bits name="type" above="8" width="0.5"/>
1828 <bits name="res" above="3" below="0" width="0.25"/>
1829 <bits name="length" above="5" width="0.4"/>
1830 </header>
96fee5e0 1831 <nospace/>
5a0e4aec
BP
1832 <header name="body">
1833 <bits name="value" above="4×length bytes" width="1.7"/>
1834 </header>
96fee5e0
BP
1835 </diagram>
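<p>
The wire format can be sketched in Python as follows. This assumes, as in
the Geneve specification, that the 5-bit <code>length</code> counts the body
in units of 4 bytes and excludes the 4-byte option header; the function name
is hypothetical.
</p>

```python
import struct


def pack_geneve_option(opt_class, opt_type, body):
    """Encode one TLV option: a 4-byte header followed by the body."""
    if len(body) % 4 != 0:
        raise ValueError("body length must be a multiple of 4 bytes")
    length = len(body) // 4          # body size in 4-byte units
    if length > 31:                  # the length field is only 5 bits wide
        raise ValueError("body may be at most 124 bytes")
    # The 3 reserved bits are zero, so the last header byte is just the length.
    return struct.pack("!HBB", opt_class, opt_type, length) + body
```

<p>
Under that assumption, a 4-byte body is encoded with <code>length</code> 1,
and the largest body is 31 × 4 = 124 bytes.
</p>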
1836
1837 <p>
1838 Taken together, the <code>class</code> and <code>type</code> in the
1839 option format mean that there are about 16 million distinct kinds of
1840 TLV options, too many to give individual OXM code points. Thus, Open
1841 vSwitch requires the user to define the TLV options of interest, by
1842 binding up to 64 TLV options to generic tunnel option NXM code points.
1843 Each option may have up to 124 bytes in its body, the maximum allowed
1844 by the TLV format, but bound options may total at most 252 bytes of
1845 body.
1846 </p>
1847
1848 <p>
1849 Open vSwitch extensions to the OpenFlow protocol bind TLV options to
1850 NXM code points. The <code>ovs-ofctl</code>(8) program offers one way
1851 to use these extensions, e.g. to configure a mapping from a TLV option
1852 with <code>class</code> <code>0xffff</code>, <code>type</code>
1853 <code>0</code>, and a body length of 4 bytes:
1854 </p>
1855
1856 <pre>
ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"
1858 </pre>
1859
1860 <p>
1861 Once a TLV option is properly bound, it can be accessed and modified
1862 like any other field, e.g. to send packets that have value 1234 for the
1863 option described above to the controller:
1864 </p>
1865
1866 <pre>
ovs-ofctl add-flow br0 tun_metadata0=1234,actions=controller
1868 </pre>
1869
1870 <p>
1871 An option not received or not bound is matched as all zeros.
1872 </p>
1873 </field>
1874 <!--- XXX need a way to define a range of OXMs -->
1875 <field id="MFF_TUN_METADATA1" title="Generic Tunnel Option 1" hidden="yes"/>
1876 <field id="MFF_TUN_METADATA2" title="Generic Tunnel Option 2" hidden="yes"/>
1877 <field id="MFF_TUN_METADATA3" title="Generic Tunnel Option 3" hidden="yes"/>
1878 <field id="MFF_TUN_METADATA4" title="Generic Tunnel Option 4" hidden="yes"/>
1879 <field id="MFF_TUN_METADATA5" title="Generic Tunnel Option 5" hidden="yes"/>
1880 <field id="MFF_TUN_METADATA6" title="Generic Tunnel Option 6" hidden="yes"/>
1881 <field id="MFF_TUN_METADATA7" title="Generic Tunnel Option 7" hidden="yes"/>
1882 <field id="MFF_TUN_METADATA8" title="Generic Tunnel Option 8" hidden="yes"/>
1883 <field id="MFF_TUN_METADATA9" title="Generic Tunnel Option 9" hidden="yes"/>
1884 <field id="MFF_TUN_METADATA10" title="Generic Tunnel Option 10" hidden="yes"/>
1885 <field id="MFF_TUN_METADATA11" title="Generic Tunnel Option 11" hidden="yes"/>
1886 <field id="MFF_TUN_METADATA12" title="Generic Tunnel Option 12" hidden="yes"/>
1887 <field id="MFF_TUN_METADATA13" title="Generic Tunnel Option 13" hidden="yes"/>
1888 <field id="MFF_TUN_METADATA14" title="Generic Tunnel Option 14" hidden="yes"/>
1889 <field id="MFF_TUN_METADATA15" title="Generic Tunnel Option 15" hidden="yes"/>
1890 <field id="MFF_TUN_METADATA16" title="Generic Tunnel Option 16" hidden="yes"/>
1891 <field id="MFF_TUN_METADATA17" title="Generic Tunnel Option 17" hidden="yes"/>
1892 <field id="MFF_TUN_METADATA18" title="Generic Tunnel Option 18" hidden="yes"/>
1893 <field id="MFF_TUN_METADATA19" title="Generic Tunnel Option 19" hidden="yes"/>
1894 <field id="MFF_TUN_METADATA20" title="Generic Tunnel Option 20" hidden="yes"/>
1895 <field id="MFF_TUN_METADATA21" title="Generic Tunnel Option 21" hidden="yes"/>
1896 <field id="MFF_TUN_METADATA22" title="Generic Tunnel Option 22" hidden="yes"/>
1897 <field id="MFF_TUN_METADATA23" title="Generic Tunnel Option 23" hidden="yes"/>
1898 <field id="MFF_TUN_METADATA24" title="Generic Tunnel Option 24" hidden="yes"/>
1899 <field id="MFF_TUN_METADATA25" title="Generic Tunnel Option 25" hidden="yes"/>
1900 <field id="MFF_TUN_METADATA26" title="Generic Tunnel Option 26" hidden="yes"/>
1901 <field id="MFF_TUN_METADATA27" title="Generic Tunnel Option 27" hidden="yes"/>
1902 <field id="MFF_TUN_METADATA28" title="Generic Tunnel Option 28" hidden="yes"/>
1903 <field id="MFF_TUN_METADATA29" title="Generic Tunnel Option 29" hidden="yes"/>
1904 <field id="MFF_TUN_METADATA30" title="Generic Tunnel Option 30" hidden="yes"/>
1905 <field id="MFF_TUN_METADATA31" title="Generic Tunnel Option 31" hidden="yes"/>
1906 <field id="MFF_TUN_METADATA32" title="Generic Tunnel Option 32" hidden="yes"/>
1907 <field id="MFF_TUN_METADATA33" title="Generic Tunnel Option 33" hidden="yes"/>
1908 <field id="MFF_TUN_METADATA34" title="Generic Tunnel Option 34" hidden="yes"/>
1909 <field id="MFF_TUN_METADATA35" title="Generic Tunnel Option 35" hidden="yes"/>
1910 <field id="MFF_TUN_METADATA36" title="Generic Tunnel Option 36" hidden="yes"/>
1911 <field id="MFF_TUN_METADATA37" title="Generic Tunnel Option 37" hidden="yes"/>
1912 <field id="MFF_TUN_METADATA38" title="Generic Tunnel Option 38" hidden="yes"/>
1913 <field id="MFF_TUN_METADATA39" title="Generic Tunnel Option 39" hidden="yes"/>
1914 <field id="MFF_TUN_METADATA40" title="Generic Tunnel Option 40" hidden="yes"/>
1915 <field id="MFF_TUN_METADATA41" title="Generic Tunnel Option 41" hidden="yes"/>
1916 <field id="MFF_TUN_METADATA42" title="Generic Tunnel Option 42" hidden="yes"/>
1917 <field id="MFF_TUN_METADATA43" title="Generic Tunnel Option 43" hidden="yes"/>
1918 <field id="MFF_TUN_METADATA44" title="Generic Tunnel Option 44" hidden="yes"/>
1919 <field id="MFF_TUN_METADATA45" title="Generic Tunnel Option 45" hidden="yes"/>
1920 <field id="MFF_TUN_METADATA46" title="Generic Tunnel Option 46" hidden="yes"/>
1921 <field id="MFF_TUN_METADATA47" title="Generic Tunnel Option 47" hidden="yes"/>
1922 <field id="MFF_TUN_METADATA48" title="Generic Tunnel Option 48" hidden="yes"/>
1923 <field id="MFF_TUN_METADATA49" title="Generic Tunnel Option 49" hidden="yes"/>
1924 <field id="MFF_TUN_METADATA50" title="Generic Tunnel Option 50" hidden="yes"/>
1925 <field id="MFF_TUN_METADATA51" title="Generic Tunnel Option 51" hidden="yes"/>
1926 <field id="MFF_TUN_METADATA52" title="Generic Tunnel Option 52" hidden="yes"/>
1927 <field id="MFF_TUN_METADATA53" title="Generic Tunnel Option 53" hidden="yes"/>
1928 <field id="MFF_TUN_METADATA54" title="Generic Tunnel Option 54" hidden="yes"/>
1929 <field id="MFF_TUN_METADATA55" title="Generic Tunnel Option 55" hidden="yes"/>
1930 <field id="MFF_TUN_METADATA56" title="Generic Tunnel Option 56" hidden="yes"/>
1931 <field id="MFF_TUN_METADATA57" title="Generic Tunnel Option 57" hidden="yes"/>
1932 <field id="MFF_TUN_METADATA58" title="Generic Tunnel Option 58" hidden="yes"/>
1933 <field id="MFF_TUN_METADATA59" title="Generic Tunnel Option 59" hidden="yes"/>
1934 <field id="MFF_TUN_METADATA60" title="Generic Tunnel Option 60" hidden="yes"/>
1935 <field id="MFF_TUN_METADATA61" title="Generic Tunnel Option 61" hidden="yes"/>
1936 <field id="MFF_TUN_METADATA62" title="Generic Tunnel Option 62" hidden="yes"/>
1937 <field id="MFF_TUN_METADATA63" title="Generic Tunnel Option 63" hidden="yes"/>
1938
1939 <field id="MFF_TUN_FLAGS" title="Tunnel Flags">
1940 <p>
1941 Flags indicating various aspects of the tunnel encapsulation.
1942 </p>
1943
1944 <p>
1945 Matches on this field are most conveniently written in terms of
1946 symbolic names (given in the list below), each preceded by either
1947 <code>+</code> for a flag that must be set, or <code>-</code> for a
1948 flag that must be unset, without any other delimiters between the
1949 flags. Flags not mentioned are wildcarded. For example,
1950 <code>tun_flags=+oam</code> matches only OAM packets. Matches can also
1951 be written as <code><var>flags</var>/<var>mask</var></code>, where
1952 <var>flags</var> and <var>mask</var> are 16-bit numbers in decimal or
1953 in hexadecimal prefixed by <code>0x</code>.
1954 </p>
1955
1956 <p>
1957 Currently, only one flag is defined:
1958 </p>
1959
1960 <dl>
1961 <dt><code>oam</code></dt>
1962 <dd>
1963 The tunnel protocol indicated that this is an OAM (Operations,
1964 Administration, and Maintenance) control packet.
1965 </dd>
1966 </dl>
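<p>
To make the relationship between the symbolic and numeric match forms
concrete, here is a small Python sketch that converts a string such as
<code>+oam</code> into a <var>flags</var>/<var>mask</var> pair. The bit value
assigned to <code>oam</code> is an assumption made for illustration, and the
parser is not the one Open vSwitch uses.
</p>

```python
import re

# Assumed bit assignment, for illustration only.
FLAG_BITS = {"oam": 0x1}


def parse_tun_flags(text):
    """Turn a string such as "+oam" into a (flags, mask) pair."""
    tokens = re.findall(r"([+-])([a-z_]+)", text)
    if "".join(sign + name for sign, name in tokens) != text:
        raise ValueError("malformed flag string")
    flags = mask = 0
    for sign, name in tokens:
        bit = FLAG_BITS[name]        # KeyError for an unknown flag name
        mask |= bit                  # every mentioned flag is matched exactly
        if sign == "+":
            flags |= bit             # "+" requires the flag to be set
    return flags, mask
```

<p>
Flags that are not mentioned stay out of the mask and are therefore
wildcarded, so an empty string yields <code>(0, 0)</code>.
</p>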
1967
1968 <p>
1969 The switch may reject matches against unknown flags.
1970 </p>
1971
1972 <p>
1973 Newer versions of Open vSwitch may introduce additional flags with new
1974 meanings. Exact matches on this field are therefore not recommended:
1975 the meaning of flags added later cannot be known in advance, so flags
1976 that a flow does not care about should be left wildcarded.
1977 </p>
1978
1979 <p>
1980 For non-tunneled packets, the value is 0.
1981 </p>
1982 </field>
1983
1984 <!-- Open vSwitch uses the following fields internally, but it
1985 does not expose them to the user via OpenFlow, so we do not
1986 document them. -->
1987 <field id="MFF_TUN_TTL" title="Tunnel IPv4 Time-to-Live" internal="yes"/>
1988 <field id="MFF_TUN_TOS" title="Tunnel IPv4 Type of Service" internal="yes"/>
1989 </group>
1990
1991 <group title="Metadata">
1992 <p>
1993 These fields relate to the origin or treatment of a packet, but
1994 they are not extracted from the packet data itself.
1995 </p>
1996
1997 <field id="MFF_IN_PORT" title="Ingress Port">
1998 <p>
5a0e4aec
BP
1999 The OpenFlow port on which the packet being processed arrived.
2000 This is a 16-bit field that holds an OpenFlow 1.0 port number.
2001 When a packet is received, the only values that appear in this
2002 field are:
96fee5e0
BP
2003 </p>
2004
2005 <dl>
5a0e4aec
BP
2006 <dt>1 through <code>0xfeff</code> (65,279), inclusive.</dt>
2007 <dd>
2008 Conventional OpenFlow port numbers.
2009 </dd>
2010
2011 <dt><code>OFPP_LOCAL</code> (<code>0xfffe</code> or 65,534).</dt>
2012 <dd>
2013 <p>
2014 The ``local'' port, which in Open vSwitch is always named
2015 the same as the bridge itself. This represents a
2016 connection between the switch and the local TCP/IP stack.
2017 This port is where an IP address is most commonly
2018 configured on an Open vSwitch switch.
2019 </p>
2020
2021 <p>
2022 OpenFlow does not require a switch to have a local port,
2023 but every version of Open vSwitch to date has
2024 included a local port. <b>Future Directions:</b> Future
2025 versions of Open vSwitch might be able to optionally omit
2026 the local port, if someone submits code to implement such
2027 a feature.
2028 </p>
2029 </dd>
2030
2031 <dt><code>OFPP_NONE</code> (OpenFlow 1.0) or <code>OFPP_ANY</code> (OpenFlow 1.1+) (<code>0xffff</code> or 65,535).</dt>
2032 <dt><code>OFPP_CONTROLLER</code> (<code>0xfffd</code> or 65,533).</dt>
2033 <dd>
2034 <p>
2035 When a controller injects a packet into an OpenFlow switch
2036 with a ``packet-out'' request, it can specify one of these
2037 ingress ports to indicate that the packet was generated
2038 internally rather than having been received on some port.
2039 </p>
2040
2041 <p>
2042 OpenFlow 1.0 specified <code>OFPP_NONE</code> for this
2043 purpose. Despite that, some controllers used
2044 <code>OFPP_CONTROLLER</code>, and some switches only
2045 accepted <code>OFPP_CONTROLLER</code>, so OpenFlow 1.0.2
2046 required support for both ports. OpenFlow 1.1 and later
2047 were more clearly drafted to allow only
2048 <code>OFPP_CONTROLLER</code>. For maximum compatibility,
2049 Open vSwitch allows both ports with all OpenFlow versions.
2050 </p>
2051 </dd>
96fee5e0
BP
2052 </dl>
2053
2054 <p>
5a0e4aec
BP
2055 Values not mentioned above will never appear when receiving a
2056 packet, including the following notable values:
96fee5e0
BP
2057 </p>
2058
2059 <dl>
5a0e4aec
BP
2060 <dt>0</dt>
2061 <dd>
2062 Zero is not a valid OpenFlow port number.
2063 </dd>
2064
2065 <dt><code>OFPP_MAX</code> (<code>0xff00</code> or 65,280).</dt>
2066 <dd>
2067 This value has only been clearly specified as a valid port
2068 number as of OpenFlow 1.3.3. Before that, its status was
2069 unclear, so Open vSwitch has never allowed
2070 <code>OFPP_MAX</code> to be used as a port number, and
2071 packets are therefore never received on this port. (Other
2072 OpenFlow switches, of course, might use it.)
2073 </dd>
96fee5e0
BP
2074
2075 <dt><code>OFPP_UNSET</code> (<code>0xfff7</code> or 65,527)</dt>
5a0e4aec
BP
2076 <dt><code>OFPP_IN_PORT</code> (<code>0xfff8</code> or 65,528)</dt>
2077 <dt><code>OFPP_TABLE</code> (<code>0xfff9</code> or 65,529)</dt>
2078 <dt><code>OFPP_NORMAL</code> (<code>0xfffa</code> or 65,530)</dt>
2079 <dt><code>OFPP_FLOOD</code> (<code>0xfffb</code> or 65,531)</dt>
2080 <dt><code>OFPP_ALL</code> (<code>0xfffc</code> or 65,532)</dt>
2081 <dd>
96fee5e0 2082 <p>
5a0e4aec
BP
2083 These port numbers are used only in output actions and never
2084 appear as ingress ports.
96fee5e0
BP
2085 </p>
2086
2087 <p>
2088 Most of these port numbers were defined in OpenFlow 1.0, but
2089 <code>OFPP_UNSET</code> was only introduced in OpenFlow 1.5.
2090 </p>
5a0e4aec 2091 </dd>
96fee5e0
BP
2092 </dl>
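<p>
The two lists above amount to a simple membership test, sketched here in
Python. The constants use the 16-bit OpenFlow 1.0 values quoted above; the
function name is made up for this example.
</p>

```python
OFPP_MAX = 0xff00
OFPP_CONTROLLER = 0xfffd
OFPP_LOCAL = 0xfffe
OFPP_NONE = 0xffff  # called OFPP_ANY in OpenFlow 1.1+


def can_appear_as_ingress(port):
    """True if a 16-bit port number can be the ingress port of a received packet."""
    if port in range(1, OFPP_MAX):       # conventional ports 1 through 0xfeff
        return True
    return port in (OFPP_LOCAL, OFPP_CONTROLLER, OFPP_NONE)
```

<p>
Everything else, including 0, <code>OFPP_MAX</code>, and the output-only
ports such as <code>OFPP_FLOOD</code>, yields <code>False</code>.
</p>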
2093
2094 <p>
5a0e4aec
BP
2095 Values that never appear when receiving a packet may still
2096 be matched in the flow table, and flows that match them can
2097 still be hit in the following circumstances:
96fee5e0
BP
2098 </p>
2099
2100 <ul>
5a0e4aec
BP
2101 <li>
2102 The <code>resubmit</code> Open vSwitch extension action allows a
2103 flow table lookup with an arbitrary ingress port.
2104 </li>
2105
2106 <li>
2107 An action that modifies the ingress port field (see below),
2108 such as <code>load</code> or <code>set_field</code>,
2109 followed by an action or instruction that performs another
2110 flow table lookup, such as <code>resubmit</code> or
2111 <code>goto_table</code>.
2112 </li>
96fee5e0
BP
2113 </ul>
2114
2115 <p>
5a0e4aec
BP
2116 This field is heavily used for matching in OpenFlow tables,
2117 but for packet egress, it has only very limited roles:
96fee5e0
BP
2118 </p>
2119
2120 <ul>
5a0e4aec
BP
2121 <li>
2122 <p>
2123 OpenFlow requires suppressing output actions to <ref
2124 field="in_port"/>. That is, the following two flows both drop all
2125 packets that arrive on port 1:
2126 </p>
2127
2128 <pre>
in_port=1,actions=1
in_port=1,actions=drop
2131 </pre>
2132
2133 <p>
2134 (This behavior is occasionally useful for flooding to a
2135 subset of ports. Specifying <code>actions=1,2,3,4</code>,
2136 for example, outputs to ports 1, 2, 3, and 4, omitting the
2137 ingress port.)
2138 </p>
2139 </li>
2140
2141 <li>
2142 OpenFlow has a special port <code>OFPP_IN_PORT</code> (with
2143 value 0xfff8) that outputs to the ingress port. For example,
2144 in a switch that has four ports numbered 1 through 4,
2145 <code>actions=1,2,3,4,in_port</code> outputs to ports 1, 2,
2146 3, and 4, including the ingress port.
2147 </li>
96fee5e0
BP
2148 </ul>
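<p>
The two rules above can be modeled in a few lines of Python. This is a toy
model of the output semantics only, not Open vSwitch code; the string
<code>"in_port"</code> stands in for <code>OFPP_IN_PORT</code>.
</p>

```python
def effective_outputs(in_port, outputs):
    """Ports a packet is actually sent to, in order, for a list of outputs.

    Each entry is a port number or the string "in_port" (OFPP_IN_PORT).
    """
    sent = []
    for port in outputs:
        if port == "in_port":
            sent.append(in_port)     # explicit request to send back out
        elif port != in_port:
            sent.append(port)        # ordinary output to another port
        # an ordinary output to the ingress port is silently suppressed
    return sent
```

<p>
With a packet that arrived on port 2, the outputs <code>1,2,3,4</code> reach
ports 1, 3, and 4, while <code>1,2,3,4,in_port</code> also reaches port 2.
</p>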
2149
2150 <p>
5a0e4aec
BP
2151 Because the ingress port field has so little influence on packet
2152 processing, it does not ordinarily make sense to modify the
2153 ingress port field. The field is writable only to support the
2154 occasional use case where the ingress port's roles in packet
2155 egress, described above, become troublesome. For example,
2156 <code>actions=load:0-&gt;NXM_OF_IN_PORT[],output:123</code>
2157 will output to port 123 regardless of whether it is the
2158 ingress port. If the ingress port is important, then one may save
2159 and restore it on the stack:
96fee5e0
BP
2160 </p>
2161
2162 <pre>
actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],output:123,pop:NXM_OF_IN_PORT[]
2164 </pre>
2165
2166 <p>
2167 or, in Open vSwitch 2.7 or later, use the <code>clone</code> action to
2168 save and restore it:
2169 </p>
2170
2171 <pre>
actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
2173 </pre>
2174
2175 <p>
5a0e4aec
BP
2176 The ability to modify the ingress port is an Open vSwitch
2177 extension to OpenFlow.
96fee5e0
BP
2178 </p>
2179 </field>
2180
2181 <field id="MFF_IN_PORT_OXM" title="OXM Ingress Port">
2182 <p>
5a0e4aec
BP
2183 OpenFlow 1.1 and later use a 32-bit port number, so this field
2184 supplies a 32-bit view of the ingress port. Current versions of
2185 Open vSwitch support only a 16-bit range of ports:
96fee5e0
BP
2186 </p>
2187
2188 <ul>
5a0e4aec
BP
2189 <li>
2190 OpenFlow 1.0 ports <code>0x0000</code> to
2191 <code>0xfeff</code>, inclusive, map to OpenFlow 1.1
2192 port numbers with the same values.
2193 </li>
2194
2195 <li>
2196 OpenFlow 1.0 ports <code>0xff00</code> to
2197 <code>0xffff</code>, inclusive, map to OpenFlow 1.1 port
2198 numbers <code>0xffffff00</code> to <code>0xffffffff</code>.
2199 </li>
2200
2201 <li>
2202 OpenFlow 1.1 ports <code>0x0000ff00</code> to
2203 <code>0xfffffeff</code> are not mapped and not supported.
2204 </li>
96fee5e0
BP
2205 </ul>
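<p>
The mapping in the list above can be written out as a pair of Python
functions. The function and constant names are invented for this sketch.
</p>

```python
OFPP_MAX = 0xff00          # first reserved OpenFlow 1.0 port number
OFP11_OFFSET = 0xffff0000  # distance between the two reserved ranges


def ofp10_to_ofp11(port16):
    """Map a 16-bit OpenFlow 1.0 port number to its 32-bit equivalent."""
    if port16 not in range(0x10000):
        raise ValueError("not a 16-bit port number")
    return port16 + OFP11_OFFSET if port16 >= OFPP_MAX else port16


def ofp11_to_ofp10(port32):
    """Map a 32-bit port number back to 16 bits, if it is in a supported range."""
    if port32 in range(OFPP_MAX):
        return port32
    if port32 in range(OFPP_MAX + OFP11_OFFSET, 2**32):
        return port32 - OFP11_OFFSET
    raise ValueError("port number not supported")
```

<p>
For instance, <code>OFPP_LOCAL</code> (<code>0xfffe</code>) maps to
<code>0xfffffffe</code>, and any 32-bit value between the two ranges is
rejected.
</p>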
2206
2207 <p>
5a0e4aec
BP
2208 <ref field="in_port"/> and <ref field="in_port_oxm"/> are two views of
2209 the same information, so all of the comments on <ref field="in_port"/>
2210 apply to <ref field="in_port_oxm"/> too. Modifying <ref
2211 field="in_port"/> changes <ref field="in_port_oxm"/>, and vice versa.
96fee5e0
BP
2212 </p>
2213
2214 <p>
5a0e4aec
BP
2215 Setting <ref field="in_port_oxm"/> to an unsupported value yields
2216 unspecified behavior.
96fee5e0
BP
2217 </p>
2218 </field>
2219
2220 <field id="MFF_SKB_PRIORITY" title="Output Queue">
2221 <p>
5a0e4aec
BP
2222 <b>Future Directions:</b> Open vSwitch implements the output queue as a
2223 field, but does not currently expose it through OXM or NXM for matching
2224 purposes. If this turns out to be a useful feature, it could be
2225 implemented in future versions. Only the <code>set_queue</code>,
2226 <code>enqueue</code>, and <code>pop_queue</code> actions currently
2227 influence the output queue.
96fee5e0
BP
2228 </p>
2229
2230 <p>
5a0e4aec
BP
2231 This field influences how packets in the flow will be queued,
2232 for quality of service (QoS) purposes, when they egress the
2233 switch. Its range of meaningful values, and their meanings,
2234 vary greatly from one OpenFlow implementation to another.
2235 Even within a single implementation, there is no guarantee
2236 that all OpenFlow ports have the same queues configured or
2237 that all OpenFlow ports in an implementation can be configured
2238 the same way queue-wise.
96fee5e0
BP
2239 </p>
2240
2241 <p>
5a0e4aec
BP
2242 Configuring queues on OpenFlow is not well standardized. On
2243 Linux, Open vSwitch supports queue configuration via OVSDB,
2244 specifically the <code>QoS</code> and <code>Queue</code>
2245 tables (see <code>ovs-vswitchd.conf.db(5)</code> for details).
2246 Ports of Open vSwitch to other platforms might require queue
2247 configuration through some separate protocol (such as a CLI).
2248 Even on Linux, Open vSwitch exposes only a fraction of the
2249 kernel's queuing features through OVSDB, so advanced or
2250 unusual uses might require use of separate utilities
2251 (e.g. <code>tc</code>). OpenFlow switches other than Open
2252 vSwitch might use OF-CONFIG or any of the configuration
2253 methods mentioned above. Finally, some OpenFlow switches have
2254 a fixed number of fixed-function queues (e.g. eight queues
2255 with strictly defined priorities) and others do not support
2256 any control over queuing.
96fee5e0
BP
2257 </p>
2258
2259 <p>
5a0e4aec
BP
2260 The only output queue that all OpenFlow implementations must
2261 support is zero, to identify a default queue, whose properties
2262 are implementation-defined. Outputting a packet to a queue
2263 that does not exist on the output port yields unpredictable
2264 behavior: among the possibilities are that the packet might be
2265 dropped or transmitted with a very high or very low priority.
96fee5e0
BP
2266 </p>
2267
2268 <p>
5a0e4aec
BP
2269 OpenFlow 1.0 only allowed output queues to be specified as part of an
2270 <code>enqueue</code> action that specified both a queue and an output
2271 port. That is, OpenFlow 1.0 treats the queue as an argument to an
2272 action, not as a field.
96fee5e0
BP
2273 </p>
2274
2275 <p>
5a0e4aec
BP
2276 To increase flexibility, OpenFlow 1.1 added an action to set the output
2277 queue. This model was carried forward, without change, through
2278 OpenFlow 1.5.
96fee5e0
BP
2279 </p>
2280
2281 <p>
5a0e4aec
BP
2282 Open vSwitch implements the native queuing model of each
2283 OpenFlow version it supports. Open vSwitch also includes an
2284 extension for setting the output queue as an action in
2285 OpenFlow 1.0.
96fee5e0
BP
2286 </p>
2287
2288 <p>
5a0e4aec
BP
2289 When a packet ingresses into an OpenFlow switch, the output
2290 queue is ordinarily set to 0, indicating the default queue.
2291 However, Open vSwitch supports various ways to forward a
2292 packet from one OpenFlow switch to another within a single
2293 host. In these cases, Open vSwitch maintains the output queue
2294 across the forwarding step. For example:
96fee5e0
BP
2295 </p>
2296
2297 <ul>
5a0e4aec
BP
2298 <li>
2299 A hop across an Open vSwitch ``patch port'' (which does not
2300 actually involve queuing) preserves the output queue.
2301 </li>
2302
2303 <li>
2304 <p>
2305 When a flow sets the output queue and then outputs to an
2306 OpenFlow tunnel port, the encapsulation preserves the
2307 output queue. If the kernel TCP/IP stack routes the
2308 encapsulated packet directly to a physical interface, then
2309 that output honors the output queue. Alternatively, if
2310 the kernel routes the encapsulated packet to another Open
2311 vSwitch bridge, then the output queue set previously
2312 becomes the initial output queue on ingress to the second
2313 bridge and will thus be used for further output actions
2314 (unless overridden by a new ``set queue'' action).
2315 </p>
2316
2317 <p>
2318 (This description reflects the current behavior of Open
2319 vSwitch on Linux. This behavior relies on details of the
2320 Linux TCP/IP stack. It could be difficult to make ports
2321 to other operating systems behave the same way.)
2322 </p>
2323 </li>
96fee5e0
BP
2324 </ul>
2325 </field>
2326
2327 <field id="MFF_PKT_MARK" title="Packet Mark">
2328 <p>
2329 Packet mark comes to Open vSwitch from the Linux kernel, in
2330 which the <code>sk_buff</code> data structure that represents
2331 a packet contains a 32-bit member named <code>skb_mark</code>.
2332 The value of <code>skb_mark</code> propagates along with the
2333 packet it accompanies wherever the packet goes in the kernel.
2334 It has no predefined semantics but various kernel-user
2335 interfaces can set and match on it, which makes it suitable
2336 for ``marking'' packets at one point in their handling and
2337 then acting on the mark later. With <code>iptables</code>,
2338 for example, one can mark some traffic specially at ingress
2339 and then handle that traffic differently at egress based on
2340 the marked value.
2341 </p>
2342
2343 <p>
5a0e4aec
BP
2344 Packet mark is an attempt at a generalization of the
2345 <code>skb_mark</code> concept beyond Linux, at least through more
2346 generic naming. Like <ref field="skb_priority"/>, packet mark is
2347 preserved across forwarding steps within a machine. Unlike <ref
2348 field="skb_priority"/>, packet mark has no direct effect on packet
2349 forwarding: the value set in packet mark does not matter unless some
2350 later OpenFlow table or switch matches on packet mark, or unless the
2351 packet passes through some other kernel subsystem that has been
2352 configured to interpret packet mark in specific ways, e.g. through the
2353 <code>iptables</code> configuration mentioned above.
96fee5e0
BP
2354 </p>
2355
2356 <p>
5a0e4aec
BP
2357 Preserving packet mark across kernel forwarding steps relies
2358 heavily on kernel support, which ports to non-Linux operating
2359 systems may not have. Regardless of operating system support,
2360 Open vSwitch supports packet mark within a single bridge and
2361 across patch ports.
96fee5e0
BP
2362 </p>
2363
2364 <p>
5a0e4aec
BP
2365 The value of packet mark when a packet ingresses into the
2366 first Open vSwitch bridge is typically zero, but it could be
2367 nonzero if its value was previously set by some kernel
2368 subsystem.
96fee5e0
BP
2369 </p>
2370 </field>
2371
2372 <field id="MFF_ACTSET_OUTPUT" title="Action Set Output Port">
2373 <p>
2374 Holds the output port currently in the OpenFlow action set (i.e. from
2375 an <code>output</code> action within a <code>write_actions</code>
2376 instruction). Its value is an OpenFlow port number. If there is no
2377 output port in the OpenFlow action set, or if the output port will be
2378 ignored (e.g. because there is an output group in the OpenFlow action
2379 set), then the value will be <code>OFPP_UNSET</code>.
2380 </p>
2381
2382 <p>
2383 Open vSwitch allows any table to match this field. OpenFlow, however,
2384 only requires this field to be matchable from within an OpenFlow egress
2385 table (a feature that Open vSwitch does not yet implement).
2386 </p>
2387 </field>
2388
2389 <field id="MFF_DP_HASH" title="Datapath Hash" internal="yes"/>
2390 <field id="MFF_RECIRC_ID" title="Datapath Recirculation ID" internal="yes"/>
3d4b2e6e
JS
2391
2392 <field id="MFF_PACKET_TYPE" title="Packet Type">
2393 <p>
2394 The type of the packet in the format specified in OpenFlow 1.5:
2395 </p>
2396
2397 <diagram>
5a0e4aec
BP
2398 <header name="Packet type">
2399 <bits name="ns" above="16" width=".75"/>
2400 <bits name="ns_type" above="16" width=".75"/>
2401 </header>
2402 <dots/>
3d4b2e6e
JS
2403 </diagram>
2404
2405 <p>
2406 The upper 16 bits, <var>ns</var>, are a namespace. The meaning of
2407 <var>ns_type</var> depends on the namespace. The packet type field is
2408 specified and displayed in the format
2409 <code>(<var>ns</var>,<var>ns_type</var>)</code>.
2410 </p>
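<p>
  As a rough illustration of the layout above, the following Python sketch
  packs and unpacks the 32-bit packet type. It is not part of Open vSwitch:
  the function names are hypothetical, and printing <var>ns_type</var> in
  hexadecimal is an assumption made to mirror the examples in this section.
</p>

```python
def pack_packet_type(ns, ns_type):
    """Combine a 16-bit namespace and a 16-bit ns_type into one 32-bit value."""
    if not 0 <= ns <= 0xFFFF or not 0 <= ns_type <= 0xFFFF:
        raise ValueError("ns and ns_type must each fit in 16 bits")
    return (ns << 16) | ns_type


def unpack_packet_type(value):
    """Split a 32-bit packet type into (ns, ns_type); ns is the upper 16 bits."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def format_packet_type(value):
    """Render as (ns,ns_type); hex for a nonzero ns_type is an assumption."""
    ns, ns_type = unpack_packet_type(value)
    return "({},{})".format(ns, hex(ns_type) if ns_type else "0")
```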
2411
2412 <p>
2413 Open vSwitch currently supports the following classes of packet types
2414 for matching:
2415 <dl>
2416 <dt><code>(0,0)</code></dt>
2417 <dd>Ethernet.</dd>
2418 <dt><code>(1,<var>ethertype</var>)</code></dt>
2419 <dd>
2420 <p>
2421 The specified <var>ethertype</var>. Open vSwitch can forward
2422 packets with any <var>ethertype</var>, but it can only match on
2423 and process data fields for the following supported packet types:
2424 </p>
2425 <dl>
2426 <dt><code>(1,0x800)</code></dt> <dd>IPv4</dd>
2427 <dt><code>(1,0x806)</code></dt> <dd>ARP</dd>
2428 <dt><code>(1,0x86dd)</code></dt> <dd>IPv6</dd>
2429 <dt><code>(1,0x8847)</code></dt> <dd>MPLS</dd>
2430 <dt><code>(1,0x8848)</code></dt> <dd>MPLS multicast</dd>
2431 <dt><code>(1,0x8035)</code></dt> <dd>RARP</dd>
2432 <dt><code>(1,0x894f)</code></dt> <dd>NSH</dd>
2433 </dl>
2434 </dd>
2435 </dl>
2436 </p>
2437
2438 <p>
2439 Consider the distinction between a packet with <code>packet_type=(0,0),
2440 dl_type=0x800</code> and one with <code>packet_type=(1,0x800)</code>.
2441 The former is an Ethernet frame that contains an IPv4 packet, like
2442 this:
2443 </p>
2444
2445 <diagram>
5a0e4aec
BP
2446 <header name="Ethernet">
2447 <bits name="dst" above="48" width="0.4"/>
2448 <bits name="src" above="48" width="0.4"/>
2449 <bits name="type" above="16" below="0x800" width="0.4"/>
2450 </header>
2451 <header name="IPv4">
2452 <bits name="..." width="0.4"/>
2453 <bits name="proto" above="8" width="0.4"/>
2454 <bits name="src" above="32" width="0.4"/>
2455 <bits name="dst" above="32" width="0.4"/>
2456 </header>
2457 <dots/>
3d4b2e6e
JS
2458 </diagram>
2459
2460 <p>
2461 The latter is an IPv4 packet not encapsulated inside any outer frame,
2462 like this:
2463 </p>
2464
2465 <diagram>
5a0e4aec
BP
2466 <header name="IPv4">
2467 <bits name="..." width="0.4"/>
2468 <bits name="proto" above="8" width="0.4"/>
2469 <bits name="src" above="32" width="0.4"/>
2470 <bits name="dst" above="32" width="0.4"/>
2471 </header>
2472 <dots/>
3d4b2e6e
JS
2473 </diagram>
2474
2475 <p>
2476 Matching on <ref field="packet_type"/> is a prerequisite for matching
2477 on any data field, but for backward compatibility, when a match on a
2478 data field is present without a <ref field="packet_type"/> match, Open
2479 vSwitch acts as though a match on <code>(0,0)</code> (Ethernet) had
2480 been supplied. Similarly, when Open vSwitch sends flow match
2481 information to a controller, e.g. in a reply to a request to dump the
2482 flow table, Open vSwitch omits a match on packet type (0,0) if it would
2483 be implied by a data field match.
2484 </p>
2485 </field>
2486
96fee5e0
BP
2487 </group>
2488
2489 <group title="Connection Tracking">
2490 <p>
2491 Open vSwitch 2.5 and later support ``connection tracking,'' which allows
2492 bidirectional streams of packets to be statefully grouped into
2493 connections. Open vSwitch connection tracking, for example, identifies
2494 the patterns of TCP packets that indicate a successfully initiated
2495 connection, as well as those that indicate that a connection has been
2496 torn down. Open vSwitch connection tracking can also identify related
2497 connections, such as FTP data connections spawned from FTP control
2498 connections.
2499 </p>
2500
2501 <p>
2502 An individual packet passing through the pipeline may be in one of two
2503 states, ``untracked'' or ``tracked,'' which may be distinguished via the
2504 ``trk'' flag in <ref field="ct_state"/>. A packet is
2505 <dfn>untracked</dfn> at the beginning of the Open vSwitch pipeline and
2506 continues to be untracked until the pipeline invokes the <code>ct</code>
2507 action. The connection tracking fields are all zeroes in an untracked
2508 packet. When a flow in the Open vSwitch pipeline invokes the
2509 <code>ct</code> action, the action initializes the connection tracking
2510 fields and the packet becomes <dfn>tracked</dfn> for the remainder of its
2511 processing.
2512 </p>
2513
2514 <p>
2515 The connection tracker stores connection state in an internal table, but
2516 it only adds a new entry to this table when the pipeline invokes
2517 <code>ct</code> with the <code>commit</code> parameter for a new
2518 connection. For a given connection, when a pipeline has executed
2519 <code>ct</code>, but not yet with <code>commit</code>, the connection is
2520 said to be <dfn>uncommitted</dfn>. State for an uncommitted connection
2521 is ephemeral and does not persist past the end of the pipeline, so some
2522 features are only available to committed connections. A connection would
2523 typically be left uncommitted as a way to drop its packets.
2524 </p>
2525
2526 <p>
2527 Connection tracking is an Open vSwitch extension to OpenFlow.
2528 </p>
2529
2530 <field id="MFF_CT_STATE" title="Connection Tracking State">
2531 <p>
2532 This field holds several flags that can be used to determine the state
2533 of the connection to which the packet belongs.
2534 </p>
2535
2536 <p>
2537 Matches on this field are most conveniently written in terms of
2538 symbolic names (listed below), each preceded by either <code>+</code>
2539 for a flag that must be set, or <code>-</code> for a flag that must be
2540 unset, without any other delimiters between the flags. Flags not
2541 mentioned are wildcarded. For example,
2542 <code>tcp,ct_state=+trk-new</code> matches TCP packets that have been
2543 run through the connection tracker and do not establish a new
2544 connection. Matches can also be written as
2545 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
2546 and <var>mask</var> are 32-bit numbers in decimal or in hexadecimal
2547 prefixed by <code>0x</code>.
2548 </p>
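<p>
  The symbolic syntax corresponds to a
  <code><var>flags</var>/<var>mask</var></code> pair. The Python sketch below
  shows one way that translation could work, using the flag values listed in
  this section. It is an illustration only, not the Open vSwitch parser, and
  the names <code>parse_ct_state</code> and <code>ct_state_matches</code> are
  hypothetical.
</p>

```python
import re

# Flag bit values as listed in this section.
CT_STATE_FLAGS = {
    "new": 0x01, "est": 0x02, "rel": 0x04, "rpl": 0x08,
    "inv": 0x10, "trk": 0x20, "snat": 0x40, "dnat": 0x80,
}


def parse_ct_state(spec):
    """Turn a symbolic match such as '+trk-new' into a (flags, mask) pair."""
    tokens = re.findall(r"([+-])([a-z]+)", spec)
    # Reject anything left over, e.g. delimiters between the flags.
    if "".join(sign + name for sign, name in tokens) != spec:
        raise ValueError("malformed ct_state match: " + repr(spec))
    flags = mask = 0
    for sign, name in tokens:
        bit = CT_STATE_FLAGS[name]  # raises KeyError for an unknown name
        mask |= bit                 # every mentioned flag is significant
        if sign == "+":
            flags |= bit            # '+' means the flag must be set
    return flags, mask


def ct_state_matches(state, flags, mask):
    """A packet matches when its masked state bits equal the wanted flags."""
    return (state & mask) == flags
```

<p>
  Under these assumptions, <code>parse_ct_state("+trk-new")</code> yields
  flags <code>0x20</code> and mask <code>0x21</code>; flags that are not
  mentioned stay out of the mask and are therefore wildcarded.
</p>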
2549
2550 <p>
2551 The following flags are defined:
2552 </p>
2553
2554 <dl>
2555 <dt><code>new</code> (0x01)</dt>
2556 <dd>
2557 A new connection. Set to 1 if this is an uncommitted connection.
2558 </dd>
2559
2560 <dt><code>est</code> (0x02)</dt>
2561 <dd>
2562 Part of an existing connection. Set to 1 if this is a committed
2563 connection.
2564 </dd>
2565
2566 <dt><code>rel</code> (0x04)</dt>
2567 <dd>
2568 <p>
2569 Related to an existing connection, e.g. an ICMP ``destination
2570 unreachable'' message or an FTP data connection. This flag will
2571 only be 1 if the connection to which this one is related is
2572 committed.
2573 </p>
2574
2575 <p>
2576 Connections identified as <code>rel</code> are separate from the
2577 originating connection and must be committed separately. All
2578 packets for a related connection will have the <code>rel</code>
2579 flag set, not just the initial packet.
2580 </p>
2581 </dd>
2582
2583 <dt><code>rpl</code> (0x08)</dt>
2584 <dd>
2585 This packet is in the reply direction, meaning that it is in the
2586 opposite direction from the packet that initiated the connection.
2587 This flag will only be 1 if the connection is committed.
2588 </dd>
2589
2590 <dt><code>inv</code> (0x10)</dt>
2591 <dd>
2592 <p>
2593 The state is invalid, meaning that the connection tracker couldn't
2594 identify the connection. This flag is a catch-all for problems
2595 in the connection or the connection tracker, such as:
2596 </p>
2597
2598 <ul>
2599 <li>
2600 L3/L4 protocol handler is not loaded/unavailable. With the Linux
2601 kernel datapath, this may mean that the
2602 <code>nf_conntrack_ipv4</code> or <code>nf_conntrack_ipv6</code>
2603 modules are not loaded.
2604 </li>
2605
2606 <li>
2607 L3/L4 protocol handler determines that the packet is malformed.
2608 </li>
2609
2610 <li>
2611 Packets have an unexpected length for the protocol.
2612 </li>
2613 </ul>
2614 </dd>
2615
2616 <dt><code>trk</code> (0x20)</dt>
2617 <dd>
2618 This packet is tracked, meaning that it has previously traversed the
2619 connection tracker. If this flag is not set, then no other flags
2620 will be set. If this flag is set, then the packet is tracked and
2621 other flags may also be set.
2622 </dd>
2623
2624 <dt><code>snat</code> (0x40)</dt>
2625 <dd>
2626 This packet was transformed by source address/port translation by a
2627 preceding <code>ct</code> action. Open vSwitch 2.6 added this flag.
2628 </dd>
2629
2630 <dt><code>dnat</code> (0x80)</dt>
2631 <dd>
2632 This packet was transformed by destination address/port translation
2633 by a preceding <code>ct</code> action. Open vSwitch 2.6 added this
2634 flag.
2635 </dd>
2636 </dl>
2637
2638 <p>
2639 There are additional constraints on these flags, listed in decreasing
2640 order of precedence below:
2641 </p>
2642
2643 <ol>
2644 <li>
2645 If <code>trk</code> is unset, no other flags are set.
2646 </li>
2647
2648 <li>
2649 If <code>trk</code> is set, one or more other flags may be set.
2650 </li>
2651
2652 <li>
2653 If <code>inv</code> is set, only the <code>trk</code> flag is also
2654 set.
2655 </li>
2656
2657 <li>
2658 <code>new</code> and <code>est</code> are mutually exclusive.
2659 </li>
2660
2661 <li>
2662 <code>new</code> and <code>rpl</code> are mutually exclusive.
2663 </li>
2664
2665 <li>
2666 <code>rel</code> may be set in conjunction with any other flags.
2667 </li>
2668 </ol>
2669
2670 <p>
2671 Future versions of Open vSwitch may define new flags.
2672 </p>
2673 </field>
2674
2675 <field id="MFF_CT_ZONE" title="Connection Tracking Zone">
2676 A connection tracking zone, the zone value passed to the most recent
2677 <code>ct</code> action. Each zone is an independent connection tracking
2678 context, so tracking the same packet in multiple contexts requires using
2679 the <code>ct</code> action multiple times.
2680 </field>
2681
2682 <field id="MFF_CT_MARK" title="Connection Tracking Mark">
2683 The metadata committed, by an action within the <code>exec</code>
2684 parameter to the <code>ct</code> action, to the connection to which the
2685 current packet belongs.
2686 </field>
2687
2688 <field id="MFF_CT_LABEL" title="Connection Tracking Label">
2689 The label committed, by an action within the <code>exec</code>
2690 parameter to the <code>ct</code> action, to the connection to which the
2691 current packet belongs.
2692 </field>
daf4d3c1
JR
2693
2694 <p>
2695 Open vSwitch 2.8 introduced matching support for the connection
2696 tracker's original direction 5-tuple fields.
2697 </p>
2698
2699 <p>
2700 For non-committed non-related connections the conntrack original
2701 direction tuple fields always have the same values as the
2702 corresponding headers in the packet itself. For any other packets of
2703 a committed connection the conntrack original direction tuple fields
2704 reflect the values from that initial non-committed non-related packet,
2705 and thus may be different from the actual packet headers, as the
2706 actual packet headers may be in reverse direction (for reply packets),
3efd46c8 2707 transformed by NAT (when <code>nat</code> option was applied to the
daf4d3c1
JR
2708 connection), or be of a different protocol (e.g., when an ICMP response
2709 is sent to a UDP packet). In case of related connections, e.g., an
2710 FTP data connection, the original direction tuple contains the
2711 original direction headers from the master connection, e.g., an FTP
2712 control connection.
2713 </p>
2714
2715 <p>
2716 The following fields are populated by the ct action, and require a
2717 match to a valid connection tracking state as a prerequisite, in
2718 addition to the IP or IPv6 ethertype match. Examples of valid
3efd46c8
YHW
2719 connection tracking state matches include <code>ct_state=+new</code>,
2720 <code>ct_state=+est</code>, <code>ct_state=+rel</code>, and
2721 <code>ct_state=+trk-inv</code>.
daf4d3c1
JR
2722 </p>
2723
2724 <field id="MFF_CT_NW_SRC" title="Connection Tracking Original Direction IPv4 Source Address">
2725 Matches IPv4 conntrack original direction tuple source address.
2726 See the paragraphs above for a general description of the
2727 conntrack original direction tuple. Introduced in Open vSwitch
2728 2.8.
2729 </field>
2730
2731 <field id="MFF_CT_NW_DST" title="Connection Tracking Original Direction IPv4 Destination Address">
2732 Matches IPv4 conntrack original direction tuple destination address.
2733 See the paragraphs above for a general description of the
2734 conntrack original direction tuple. Introduced in Open vSwitch
2735 2.8.
2736 </field>
2737
2738 <field id="MFF_CT_IPV6_SRC" title="Connection Tracking Original Direction IPv6 Source Address">
2739 Matches IPv6 conntrack original direction tuple source address.
2740 See the paragraphs above for a general description of the
2741 conntrack original direction tuple. Introduced in Open vSwitch
2742 2.8.
2743 </field>
2744
2745 <field id="MFF_CT_IPV6_DST" title="Connection Tracking Original Direction IPv6 Destination Address">
2746 Matches IPv6 conntrack original direction tuple destination address.
2747 See the paragraphs above for a general description of the
2748 conntrack original direction tuple. Introduced in Open vSwitch
2749 2.8.
2750 </field>
2751
2752 <field id="MFF_CT_NW_PROTO" title="Connection Tracking Original Direction IP Protocol">
2753 Matches conntrack original direction tuple IP protocol type,
2754 which is specified as a decimal number between 0 and 255,
2755 inclusive (e.g. 1 to match ICMP packets or 6 to match TCP
2756 packets). In case of, for example, an ICMP response to a UDP
2757 packet, this may be different from the IP protocol type of the
2758 packet itself. See the paragraphs above for a general description
2759 of the conntrack original direction tuple. Introduced in Open
2760 vSwitch 2.8.
2761 </field>
2762
2763 <field id="MFF_CT_TP_SRC" title="Connection Tracking Original Direction Transport Layer Source Port">
2764 Bitwise match on the conntrack original direction tuple
2765 transport source port, when
2766 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2767 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2768 ICMP, or 58 for ICMPv6, the lower 8 bits of
2769 <code>MFF_CT_TP_SRC</code> match the conntrack original
2770 direction ICMP type. See the paragraphs above for a general
2771 description of the conntrack original direction
2772 tuple. Introduced in Open vSwitch 2.8.
2773 </field>
2774
2775 <field id="MFF_CT_TP_DST" title="Connection Tracking Original Direction Transport Layer Destination Port">
2776 Bitwise match on the conntrack original direction tuple
2777 transport destination port, when
2778 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2779 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2780 ICMP, or 58 for ICMPv6, the lower 8 bits of
2781 <code>MFF_CT_TP_DST</code> match the conntrack original
2782 direction ICMP code. See the paragraphs above for a general
2783 description of the conntrack original direction
2784 tuple. Introduced in Open vSwitch 2.8.
2785 </field>
96fee5e0
BP
2786 </group>
2787
2788 <group title="Register">
2789 <p>
2790 These fields give an OpenFlow switch space for temporary storage while
2791 the pipeline is running. Whereas metadata fields can have a meaningful
2792 initial value and can persist across some hops across OpenFlow switches,
2793 registers are always initially 0 and their values never persist across
2794 inter-switch hops (not even across patch ports).
2795 </p>
2796
2797 <field id="MFF_METADATA" title="OpenFlow Metadata">
2798 <p>
5a0e4aec
BP
2799 This field is the oldest standardized OpenFlow register field,
2800 introduced in OpenFlow 1.1. It was introduced to model the limited
2801 number of user-defined bits that some ASIC-based switches can carry
2802 through their pipelines. Because of hardware limitations, OpenFlow
2803 allows switches to support writing and masking only an
2804 implementation-defined subset of bits, even no bits at all. The Open
2805 vSwitch software switch always supports all 64 bits, but of course an
2806 Open vSwitch port to an ASIC would have the same restriction as the
2807 ASIC itself.
96fee5e0
BP
2808 </p>
2809
2810 <p>
5a0e4aec
BP
2811 This field has an OXM code point, but OpenFlow 1.4 and earlier allow it
2812 to be modified only with a specialized instruction, not with a
2813 ``set-field'' action. OpenFlow 1.5 removes this restriction. Open
2814 vSwitch does not enforce this restriction, regardless of OpenFlow
2815 version.
96fee5e0
BP
2816 </p>
2817 </field>
2818
2819 <field id="MFF_REG0" title="Register 0">
2820 This is the first of several Open vSwitch registers, all of which have
2821 the same properties. Open vSwitch 1.1 introduced registers 0, 1, 2, and
2822 3, version 1.3 added register 4, version 1.7 added registers 5, 6, and 7,
2823 and version 2.6 added registers 8 through 15.
2824 </field>
2825 <!-- XXX series -->
2826 <field id="MFF_REG1" title="Register 1" hidden="yes"/>
2827 <field id="MFF_REG2" title="Register 2" hidden="yes"/>
2828 <field id="MFF_REG3" title="Register 3" hidden="yes"/>
2829 <field id="MFF_REG4" title="Register 4" hidden="yes"/>
2830 <field id="MFF_REG5" title="Register 5" hidden="yes"/>
2831 <field id="MFF_REG6" title="Register 6" hidden="yes"/>
2832 <field id="MFF_REG7" title="Register 7" hidden="yes"/>
2833 <field id="MFF_REG8" title="Register 8" hidden="yes"/>
2834 <field id="MFF_REG9" title="Register 9" hidden="yes"/>
2835 <field id="MFF_REG10" title="Register 10" hidden="yes"/>
2836 <field id="MFF_REG11" title="Register 11" hidden="yes"/>
2837 <field id="MFF_REG12" title="Register 12" hidden="yes"/>
2838 <field id="MFF_REG13" title="Register 13" hidden="yes"/>
2839 <field id="MFF_REG14" title="Register 14" hidden="yes"/>
2840 <field id="MFF_REG15" title="Register 15" hidden="yes"/>
2841
2842 <field id="MFF_XREG0" title="Extended Register 0">
2843 <p>
2844 This is the first of the registers introduced in OpenFlow 1.5.
2845 OpenFlow 1.5 calls these fields just the ``packet registers,'' but Open
2846 vSwitch already had 32-bit registers by that name, so Open vSwitch uses
2847 the name ``extended registers'' in an attempt to reduce confusion. The
2848 standard allows for up to 128 registers, each 64 bits wide, but Open
2849 vSwitch only implements 4 (in versions 2.4 and 2.5) or 8 (in version
2850 2.6 and later).
2851 </p>
2852
2853 <p>
2854 Each of the 64-bit extended registers overlays two of the 32-bit
2855 registers: <code>xreg0</code> overlays <code>reg0</code> and
2856 <code>reg1</code>, with <code>reg0</code> supplying the
2857 most-significant bits of <code>xreg0</code> and <code>reg1</code> the
2858 least-significant. Similarly, <code>xreg1</code> overlays
2859 <code>reg2</code> and <code>reg3</code>, and so on.
2860 </p>
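<p>
  The overlay can be pictured with a short Python sketch. This is an
  illustration only: it assumes the 32-bit registers are held in a list
  named <code>regs</code>, and the helper names are hypothetical.
</p>

```python
def xreg_from_regs(regs, n):
    """Build 64-bit xreg<n> from reg<2n> (high half) and reg<2n+1> (low half)."""
    high, low = regs[2 * n], regs[2 * n + 1]
    return (high << 32) | low


def write_xreg(regs, n, value):
    """Store a 64-bit value into xreg<n> by updating the two overlaid registers."""
    regs[2 * n] = (value >> 32) & 0xFFFFFFFF
    regs[2 * n + 1] = value & 0xFFFFFFFF
```

<p>
  Because the registers share storage, a write to <code>xreg1</code> in this
  model changes <code>reg2</code> and <code>reg3</code> and leaves every
  other register untouched.
</p>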
2861
2862 <p>
2863 The OpenFlow specification says, ``In most cases, the packet registers
2864 can not be matched in tables, i.e. they usually can not be used in the
2865 flow entry match structure'' [OpenFlow 1.5, section 7.2.3.10], but
2866 there is no reason for a software switch to impose such a restriction,
2867 and Open vSwitch does not.
2868 </p>
2869 </field>
2870
2871 <!-- XXX series -->
2872 <field id="MFF_XREG1" title="Extended Register 1" hidden="yes"/>
2873 <field id="MFF_XREG2" title="Extended Register 2" hidden="yes"/>
2874 <field id="MFF_XREG3" title="Extended Register 3" hidden="yes"/>
2875 <field id="MFF_XREG4" title="Extended Register 4" hidden="yes"/>
2876 <field id="MFF_XREG5" title="Extended Register 5" hidden="yes"/>
2877 <field id="MFF_XREG6" title="Extended Register 6" hidden="yes"/>
2878 <field id="MFF_XREG7" title="Extended Register 7" hidden="yes"/>
2879
2880 <field id="MFF_XXREG0" title="Double-Extended Register 0">
2881 <p>
2882 This is the first of the double-extended registers introduced in Open
2883 vSwitch 2.6. Each of the 128-bit double-extended registers overlays four of
2884 the 32-bit registers: <code>xxreg0</code> overlays <code>reg0</code>
2885 through <code>reg3</code>, with <code>reg0</code> supplying the
2886 most-significant bits of <code>xxreg0</code> and <code>reg3</code> the
2887 least-significant. <code>xxreg1</code> similarly overlays
2888 <code>reg4</code> through <code>reg7</code>, and so on.
2889 </p>
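<p>
  A minimal Python sketch of this overlay follows, assuming the 32-bit
  registers are available as a list named <code>regs</code>. The function
  name is hypothetical and the code is illustrative, not taken from Open
  vSwitch.
</p>

```python
def xxreg_from_regs(regs, n):
    """Build 128-bit xxreg<n> from reg<4n> (most significant) to reg<4n+3>."""
    value = 0
    for reg in regs[4 * n:4 * n + 4]:
        # Shift what we have so far left by 32 bits and append the next register.
        value = (value << 32) | reg
    return value
```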
2890 </field>
2891
2892 <!-- XXX series -->
2893 <field id="MFF_XXREG1" title="Double-Extended Register 1" hidden="yes"/>
2894 <field id="MFF_XXREG2" title="Double-Extended Register 2" hidden="yes"/>
2895 <field id="MFF_XXREG3" title="Double-Extended Register 3" hidden="yes"/>
2896 </group>
2897
2898 <group title="Layer 2 (Ethernet)">
2899 <p>
2900 Ethernet is the only layer-2 protocol that Open vSwitch
2901 supports. As with most software, Open vSwitch and OpenFlow
2902 regard an Ethernet frame as beginning with the 14-byte header and
2903 ending with the final byte of the payload; that is, the frame check
2904 sequence is not considered part of the frame.
2905 </p>
2906
2907 <field id="MFF_ETH_SRC" title="Ethernet Source">
2908 <p>
2909 The Ethernet source address:
2910 </p>
2911
2912 <diagram>
5a0e4aec
BP
2913 <header name="Ethernet">
2914 <bits name="dst" above="48" width=".75"/>
2915 <bits name="src" above="48" width=".75" fill="yes"/>
2916 <bits name="type" above="16" width="0.4"/>
2917 </header>
2918 <dots/>
96fee5e0
BP
2919 </diagram>
2920 </field>
2921
2922 <field id="MFF_ETH_DST" title="Ethernet Destination">
2923 <p>
5a0e4aec 2924 The Ethernet destination address:
96fee5e0
BP
2925 </p>
2926
2927 <diagram>
5a0e4aec
BP
2928 <header name="Ethernet">
2929 <bits name="dst" above="48" width=".75" fill="yes"/>
2930 <bits name="src" above="48" width=".75"/>
2931 <bits name="type" above="16" width="0.4"/>
2932 </header>
2933 <dots/>
96fee5e0
BP
2934 </diagram>
2935
2936 <p>
2937 Open vSwitch 1.8 and later support arbitrary masks for source and/or
2938 destination. Earlier versions only support masking the destination
2939 with the following masks:
2940 </p>
2941
2942 <dl>
2943 <dt><code>01:00:00:00:00:00</code></dt>
2944 <dd>
2945 Match only the multicast bit. Thus,
2946 <code>dl_dst=01:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2947 multicast (including broadcast) Ethernet packets, and
2948 <code>dl_dst=00:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2949 unicast Ethernet packets.
2950 </dd>
2951
2952 <dt><code>fe:ff:ff:ff:ff:ff</code></dt>
2953 <dd>
2954 Match all bits except the multicast bit. This is probably not
2955 useful.
2956 </dd>
2957
2958 <dt><code>ff:ff:ff:ff:ff:ff</code></dt>
2959 <dd>
2960 Exact match (equivalent to omitting the mask).
2961 </dd>
2962
2963 <dt><code>00:00:00:00:00:00</code></dt>
2964 <dd>
2965 Wildcard all bits (equivalent to <code>dl_dst=*</code>).
2966 </dd>
2967 </dl>
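<p>
  The effect of a mask can be sketched in Python as a bitwise comparison
  restricted to the bits set in the mask. The helper names below are
  hypothetical and the code only illustrates the matching rule; it is not
  how Open vSwitch implements it.
</p>

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets: " + repr(mac))
    return int("".join(octet.zfill(2) for octet in octets), 16)


def dl_dst_matches(packet_dst, value, mask="ff:ff:ff:ff:ff:ff"):
    """True when packet_dst agrees with value on every bit set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(packet_dst) & m) == (mac_to_int(value) & m)
```

<p>
  With mask <code>01:00:00:00:00:00</code>, only the multicast bit is
  compared, so in this sketch a broadcast destination matches value
  <code>01:00:00:00:00:00</code> and a unicast destination matches value
  <code>00:00:00:00:00:00</code>.
</p>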
2968 </field>
2969
2970 <field id="MFF_ETH_TYPE" title="Ethernet Type">
2971 <p>
5a0e4aec
BP
2972 The most commonly seen Ethernet frames today use a format
2973 called ``Ethernet II,'' in which the last two bytes of the
2974 Ethernet header specify the Ethertype. For such a frame, this
2975 field is copied from those bytes of the header, like so:
96fee5e0
BP
2976 </p>
2977
2978 <diagram>
5a0e4aec
BP
2979 <header name="Ethernet">
2980 <bits name="dst" above="48" width=".75"/>
2981 <bits name="src" above="48" width=".75"/>
2982 <bits name="type" above="16" below="\[&gt;=]0x600" width="0.4" fill="yes"/>
2983 </header>
2984 <dots/>
96fee5e0
BP
2985 </diagram>
2986
2987 <p>
5a0e4aec
BP
2988 Every Ethernet type has a value 0x600 (1,536) or greater.
2989 When the last two bytes of the Ethernet header have a value
2990 too small to be an Ethernet type, then the value found there
2991 is the total length of the frame in bytes, excluding the
2992 Ethernet header. An 802.2 LLC header typically follows the
2993 Ethernet header. OpenFlow and Open vSwitch only support LLC
2994 headers with DSAP and SSAP <code>0xaa</code> and control byte
2995 <code>0x03</code>, which indicate that a SNAP header follows
2996 the LLC header. In turn, OpenFlow and Open vSwitch only
2997 support a SNAP header with organization <code>0x000000</code>.
2998 In such a case, this field is copied from the type field in
2999 the SNAP header, like this:
96fee5e0
BP
3000 </p>
3001
3002 <diagram>
5a0e4aec
BP
3003 <header name="Ethernet">
3004 <bits name="dst" above="48" width=".75"/>
3005 <bits name="src" above="48" width=".75"/>
3006 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
3007 </header>
3008 <header name="LLC">
3009 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
3010 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
3011 <bits name="cntl" above="8" below="0x03" width=".4"/>
3012 </header>
3013 <header name="SNAP">
3014 <bits name="org" above="24" below="0x000000" width=".75"/>
3015 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
3016 </header>
3017 <dots/>
96fee5e0
BP
3018 </diagram>
3019
3020 <p>
5a0e4aec
BP
3021 When an 802.1Q header is inserted after the Ethernet source
3022 and destination, this field is populated with the encapsulated
3023 Ethertype, not the 802.1Q Ethertype. With an Ethernet II
3024 inner frame, the result looks like this:
96fee5e0
BP
3025 </p>
3026
3027 <diagram>
5a0e4aec
BP
3028 <header name="Ethernet">
3029 <bits name="dst" above="48" width=".75"/>
3030 <bits name="src" above="48" width=".75"/>
3031 </header>
3032 <header name="802.1Q">
3033 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3034 <bits name="TCI" above="16" width=".4"/>
3035 </header>
3036 <header name="Ethertype">
3037 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
3038 </header>
3039 <dots/>
96fee5e0
BP
3040 </diagram>
3041
3042 <p>
5a0e4aec 3043 LLC and SNAP encapsulation look like this with an 802.1Q header:
96fee5e0
BP
3044 </p>
3045
3046 <diagram>
5a0e4aec
BP
3047 <header name="Ethernet">
3048 <bits name="dst" above="48" width=".75"/>
3049 <bits name="src" above="48" width=".75"/>
3050 </header>
3051 <header name="802.1Q">
3052 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3053 <bits name="TCI" above="16" width=".4"/>
3054 </header>
3055 <header name="Ethertype">
3056 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
3057 </header>
3058 <header name="LLC">
3059 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
3060 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
3061 <bits name="cntl" above="8" below="0x03" width=".4"/>
3062 </header>
3063 <header name="SNAP">
3064 <bits name="org" above="24" below="0x000000" width=".75"/>
3065 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
3066 </header>
3067 <dots/>
96fee5e0
BP
3068 </diagram>
3069
3070 <p>
5a0e4aec
BP
3071 When a packet doesn't match any of the header formats described
3072 above, Open vSwitch and OpenFlow set this field to
3073 <code>0x5ff</code> (<code>OFP_DL_TYPE_NOT_ETH_TYPE</code>).
96fee5e0
BP
3074 </p>
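<p>
  The cases described for this field can be summarized in a Python sketch
  that derives the field value from raw frame bytes. It is a simplified
  illustration under stated assumptions, not the Open vSwitch parser: it
  handles at most one 802.1Q tag, and the function and constant names other
  than <code>OFP_DL_TYPE_NOT_ETH_TYPE</code> are hypothetical.
</p>

```python
import struct

OFP_DL_TYPE_NOT_ETH_TYPE = 0x5FF
ETH_TYPE_MIN = 0x600
TPID_8021Q = 0x8100
# DSAP 0xaa, SSAP 0xaa, control 0x03, then SNAP organization 0x000000.
LLC_SNAP = bytes([0xAA, 0xAA, 0x03, 0x00, 0x00, 0x00])


def eth_type_field(frame):
    """Derive the Ethernet type field from raw frame bytes."""
    offset = 12                      # skip destination and source addresses
    if len(frame) < offset + 2:
        return OFP_DL_TYPE_NOT_ETH_TYPE
    (value,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if value == TPID_8021Q:          # one 802.1Q tag: skip the 16-bit TCI
        if len(frame) < offset + 4:
            return OFP_DL_TYPE_NOT_ETH_TYPE
        (value,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if value >= ETH_TYPE_MIN:        # Ethernet II: the value is the Ethertype
        return value
    # Otherwise the value is a length; accept only LLC plus SNAP as above.
    if frame[offset:offset + 6] == LLC_SNAP and len(frame) >= offset + 8:
        (snap_type,) = struct.unpack_from("!H", frame, offset + 6)
        if snap_type >= ETH_TYPE_MIN:
            return snap_type
    return OFP_DL_TYPE_NOT_ETH_TYPE
```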
3075 </field>
3076 </group>
3077
3078 <group title="VLAN">
3079 <p>
3080 The 802.1Q VLAN header causes more trouble than any other 4
3081 bytes in networking. OpenFlow 1.0, 1.1, and 1.2+ all treat VLANs
3082 differently. Open vSwitch extensions add another variant to the mix.
3083 Open vSwitch reconciles all four treatments as best it can.
3084 </p>
3085
3086 <h2>VLAN Header Format</h2>
3087
3088 <p>
3089 An 802.1Q VLAN header consists of two 16-bit fields:
3090 </p>
3091
3092 <diagram>
3093 <header name="TPID">
5a0e4aec 3094 <bits name="Ethertype" above="16" below="0x8100" width="1.8"/>
96fee5e0
BP
3095 </header>
3096 <nospace/>
3097 <header name="TCI">
5a0e4aec
BP
3098 <bits name="PCP" above="3" width=".6"/>
3099 <bits name="CFI" above="1" below="0" width=".3"/>
3100 <bits name="VID" above="12" width=".9"/>
96fee5e0
BP
3101 </header>
3102 </diagram>
3103
3104 <p>
3105 The first 16 bits of the VLAN header, the <dfn>TPID</dfn> (Tag Protocol
3106 IDentifier), is an Ethertype. When the VLAN header is inserted just
3107 after the source and destination MAC addresses in an Ethernet frame, the
3108 TPID serves to identify the presence of the VLAN. The standard TPID, the
3109 only one that Open vSwitch supports, is <code>0x8100</code>. OpenFlow
3110 1.0 explicitly supports only TPID <code>0x8100</code>. OpenFlow 1.1, but
3111 not earlier or later versions, also requires support for TPID
3112 <code>0x88a8</code> (Open vSwitch does not support this). OpenFlow 1.2
3113 through 1.5 do not require support for specific TPIDs (the ``push vlan
3114 header'' action does say that only <code>0x8100</code> and
3115 <code>0x88a8</code> should be pushed). No version of OpenFlow provides a
3116 way to distinguish or match on the TPID.
3117 </p>
3118
3119 <p>
3120 The remaining 16 bits of the VLAN header, the <dfn>TCI</dfn>
3121 (Tag Control Information), is subdivided into three subfields:
3122 </p>
3123
3124 <ul>
3125 <li>
5a0e4aec
BP
3126 <dfn>PCP</dfn> (Priority Code Point), is a 3-bit 802.1p
3127 <dfn>priority</dfn>. The lowest priority is value 1, the
3128 second-lowest is value 0, and priority increases from 2 up to
3129 highest priority 7.
96fee5e0
BP
3130 </li>
3131
3132 <li>
3133 <p>
5a0e4aec
BP
3134 <dfn>CFI</dfn> (Canonical Format Indicator), is a 1-bit field. On an
3135 Ethernet network, its value is always 0. This led to it later being
3136 repurposed under the name <dfn>DEI</dfn> (Drop Eligibility
3137 Indicator). By either name, OpenFlow and Open vSwitch don't provide
3138 any way to match or set this bit.
96fee5e0
BP
3139 </p>
3140 </li>
3141
3142 <li>
5a0e4aec
BP
3143 <dfn>VID</dfn> (VLAN IDentifier), is a 12-bit VLAN ID. If the
3144 VID is 0, then the frame is not part of a VLAN. In that case,
3145 the VLAN header is called a <dfn>priority tag</dfn> because it
3146 is only meaningful for assigning the frame a priority. VID
3147 <code>0xfff</code> (4,095) is reserved.
96fee5e0
BP
3148 </li>
3149 </ul>
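<p>
As a rough illustration of the TCI layout described above, the following
Python sketch splits a 16-bit TCI into its three subfields. The function
names <code>split_tci</code> and <code>is_priority_tag</code> are
hypothetical and are not part of Open vSwitch.
</p>

<pre><![CDATA[
```python
def split_tci(tci):
    """Split a 16-bit 802.1Q TCI into its (PCP, CFI, VID) subfields."""
    pcp = (tci >> 13) & 0x7   # top 3 bits: priority
    cfi = (tci >> 12) & 0x1   # next bit: CFI (later renamed DEI)
    vid = tci & 0x0FFF        # low 12 bits: VLAN ID
    return pcp, cfi, vid


def is_priority_tag(tci):
    """A VLAN header whose VID is 0 only assigns the frame a priority."""
    return (tci & 0x0FFF) == 0


# Priority 7, CFI 0, VLAN 9.
example = split_tci(0xE009)
```
]]></pre>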
3150
3151 <p>
3152 See <ref field="eth_type"/> for illustrations of a complete Ethernet
3153 frame with 802.1Q tag included.
3154 </p>
3155
3156 <h2>Multiple VLANs</h2>
3157
3158 <p>
3159 Open vSwitch can match only a single VLAN header. If more than
3160 one VLAN header is present, then <ref field="eth_type"/>
3161 holds the TPID of the inner VLAN header. Open vSwitch stops
3162 parsing the packet after the inner TPID, so matching further
3163 into the packet (e.g. on the inner TCI or L3 fields) is not
3164 possible.
3165 </p>
3166
3167 <p>
3168 OpenFlow only directly supports matching a single VLAN header. In
3169 OpenFlow 1.1 or later, one OpenFlow table can match on the outermost VLAN
3170 header and pop it off, and a later OpenFlow table can match on the next
3171 outermost header. Open vSwitch does not support this.
3172 </p>
3173
3174 <h2>VLAN Field Details</h2>
3175
3176 <p>
3177 The four variants have three different levels of expressiveness: OpenFlow
3178 1.0 and 1.1 VLAN matching are less powerful than OpenFlow 1.2+ VLAN
3179 matching, which is less powerful than Open vSwitch extension VLAN
3180 matching.
3181 </p>
3182
3183 <h2>OpenFlow 1.0 VLAN Fields</h2>
3184
3185 <p>
3186 OpenFlow 1.0 uses two fields, called <code>dl_vlan</code> and
3187 <code>dl_vlan_pcp</code>, each of which can be either exact-matched or
3188 wildcarded, to specify VLAN matches:
3189 </p>
3190
3191 <ul>
3192 <li>
5a0e4aec
BP
3193 When both <code>dl_vlan</code> and <code>dl_vlan_pcp</code> are
3194 wildcarded, the flow matches packets without an 802.1Q header or
3195 with any 802.1Q header.
96fee5e0
BP
3196 </li>
3197
3198 <li>
3199 The match <code>dl_vlan=0xffff</code> causes a flow to match only
3200 packets without an 802.1Q header. Such a flow should also wildcard
3201 <code>dl_vlan_pcp</code>, since a packet without an 802.1Q header does
3202 not have a PCP. OpenFlow does not specify what to do if a match on PCP
3203 is actually present, but Open vSwitch ignores it.
3204 </li>
3205
3206 <li>
5a0e4aec
BP
3207 <p>
3208 Otherwise, the flow matches only packets with an 802.1Q
3209 header. If <code>dl_vlan</code> is not wildcarded, then the
3210 flow only matches packets with the VLAN ID specified in
3211 <code>dl_vlan</code>'s low 12 bits. If
3212 <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3213 only matches packets with the priority specified in
3214 <code>dl_vlan_pcp</code>'s low 3 bits.
3215 </p>
3216
3217 <p>
3218 OpenFlow does not specify how to interpret the high 4 bits of
3219 <code>dl_vlan</code> or the high 5 bits of <code>dl_vlan_pcp</code>.
3220 Open vSwitch ignores them.
3221 </p>
96fee5e0
BP
3222 </li>
3223 </ul>
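<p>
The three cases above can be sketched in Python as follows. This is a
simplified model rather than Open vSwitch code: the function name and the
use of <code>None</code> for a wildcarded field or an untagged packet are
assumptions made for illustration.
</p>

<pre><![CDATA[
```python
OFP_VLAN_NONE = 0xFFFF  # dl_vlan value meaning "no 802.1Q header"


def of10_vlan_matches(pkt_tci, dl_vlan=None, dl_vlan_pcp=None):
    """Model OpenFlow 1.0 VLAN matching as Open vSwitch interprets it.

    pkt_tci is the packet's 16-bit TCI, or None if it has no 802.1Q
    header.  dl_vlan and dl_vlan_pcp are None when wildcarded.
    """
    if dl_vlan is None and dl_vlan_pcp is None:
        return True                      # tagged or untagged, anything goes
    if dl_vlan == OFP_VLAN_NONE:
        return pkt_tci is None           # any dl_vlan_pcp match is ignored
    if pkt_tci is None:
        return False                     # remaining cases need a tag
    if dl_vlan is not None and (pkt_tci & 0x0FFF) != (dl_vlan & 0x0FFF):
        return False                     # only the low 12 bits count
    if dl_vlan_pcp is not None:
        if ((pkt_tci >> 13) & 0x7) != (dl_vlan_pcp & 0x7):
            return False                 # only the low 3 bits count
    return True
```
]]></pre>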
3224
3225 <field id="MFF_DL_VLAN" title="OpenFlow 1.0 VLAN ID" hidden="yes"/>
3226 <field id="MFF_DL_VLAN_PCP" title="OpenFlow 1.0 VLAN Priority"
5a0e4aec 3227 hidden="yes"/>
96fee5e0
BP
3228
3229 <h2>OpenFlow 1.1 VLAN Fields</h2>
3230
3231 <p>
3232 VLAN matching in OpenFlow 1.1 is similar to OpenFlow 1.0.
3233 The one refinement is that when <code>dl_vlan</code> matches on
3234 <code>0xfffe</code> (<code>OFPVID_ANY</code>), the flow matches
3235 only packets with an 802.1Q header, with any VLAN ID. If
3236 <code>dl_vlan_pcp</code> is wildcarded, the flow matches any
3237 packet with an 802.1Q header, regardless of VLAN ID or priority.
3238 If <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3239 only matches packets with the priority specified in
3240 <code>dl_vlan_pcp</code>'s low 3 bits.
3241 </p>
3242
3243 <p>
3244 OpenFlow 1.1 uses the name <code>OFPVID_NONE</code>, instead of
3245 <code>OFP_VLAN_NONE</code>, for a <code>dl_vlan</code> of
3246 <code>0xffff</code>, but it has the same meaning.
3247 </p>
3248
3249 <p>
3250 In OpenFlow 1.1, Open vSwitch reports error
3251 <code>OFPBMC_BAD_VALUE</code> for an attempt to match on
3252 <code>dl_vlan</code> between 4,096 and <code>0xfffd</code>,
3253 inclusive, or <code>dl_vlan_pcp</code> greater than 7.
3254 </p>
3255
3256 <h2>OpenFlow 1.2 VLAN Fields</h2>
3257
3258 <field id="MFF_VLAN_VID" title="OpenFlow 1.2+ VLAN ID">
3259 <p>
5a0e4aec
BP
3260 The OpenFlow standard describes this field as consisting of
3261 ``12+1'' bits. On ingress, its value is 0 if no 802.1Q header
3262 is present, and otherwise it holds the VLAN VID in its least
3263 significant 12 bits, with bit 12 (<code>0x1000</code> aka
3264 <code>OFPVID_PRESENT</code>) also set to 1. The three most
3265 significant bits are always zero:
96fee5e0
BP
3266 </p>
3267
3268 <diagram>
5a0e4aec
BP
3269 <header name="OXM_OF_VLAN_VID">
3270 <bits name="" above="3" below="0" width=".6"/>
3271 <bits name="P" above="1" width=".1"/>
3272 <bits name="VLAN ID" above="12" width=".9"/>
3273 </header>
96fee5e0
BP
3274 </diagram>
3275
3276 <p>
5a0e4aec
BP
3277 As a consequence of this field's format, one may use it to match the
3278 VLAN ID in all of the ways available with the OpenFlow 1.0 and 1.1
3279 formats, and a few new ways:
96fee5e0
BP
3280 </p>
3281
3282 <dl>
5a0e4aec
BP
3283 <dt>Fully wildcarded</dt>
3284 <dd>
3285 Matches any packet, that is, one without an 802.1Q header or
3286 with an 802.1Q header with any TCI value.
3287 </dd>
3288
3289 <dt>
3290 Value <code>0x0000</code> (<code>OFPVID_NONE</code>), mask
3291 <code>0xffff</code> (or no mask)
3292 </dt>
3293 <dd>
3294 Matches only packets without an 802.1Q header.
3295 </dd>
3296
3297 <dt>
3298 Value <code>0x1000</code>, mask <code>0x1000</code>
3299 </dt>
3300 <dd>
3301 Matches any packet with an 802.1Q header, regardless of VLAN
3302 ID.
3303 </dd>
3304
3305 <dt>
3306 Value <code>0x1009</code>, mask <code>0xffff</code> (or no mask)
3307 </dt>
3308 <dd>
3309 Matches only packets with an 802.1Q header with VLAN ID 9.
3310 </dd>
3311
3312 <dt>Value <code>0x1001</code>, mask <code>0x1001</code></dt>
3313 <dd>
3314 Matches only packets that have an 802.1Q header with an
3315 odd-numbered VLAN ID. (This is just an example; one can
3316 match on any desired VLAN ID bit pattern.)
3317 </dd>
96fee5e0
BP
3318 </dl>
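<p>
The value and mask combinations listed above can be checked with a small
Python model. It is only a sketch: <code>vlan_vid_field</code> and
<code>masked_match</code> are hypothetical helper names, and an untagged
packet is represented by <code>None</code>.
</p>

<pre><![CDATA[
```python
OFPVID_PRESENT = 0x1000  # bit 12, set whenever an 802.1Q header is present


def vlan_vid_field(tci=None):
    """Return the 12+1 bit field value for a packet.

    tci is the TCI from the 802.1Q header, or None if there is no header.
    """
    if tci is None:
        return 0
    return (tci & 0x0FFF) | OFPVID_PRESENT


def masked_match(field, value, mask=0xFFFF):
    """A packet matches when field and value agree on every masked bit."""
    return (field & mask) == (value & mask)


untagged = vlan_vid_field()
vlan9 = vlan_vid_field(0xE009)   # priority bits are not part of this field
vlan8 = vlan_vid_field(0x0008)
```
]]></pre>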
3319 </field>
3320
3321 <field id="MFF_VLAN_PCP" title="OpenFlow 1.2+ VLAN Priority">
3322 <p>
5a0e4aec
BP
3323 The 3 least significant bits may be used to match the PCP bits
3324 in an 802.1Q header. Other bits are always zero:
96fee5e0
BP
3325 </p>
3326
3327 <diagram>
5a0e4aec
BP
3328 <header name="OXM_OF_VLAN_PCP">
3329 <bits name="zero" above="5" below="0" width="1.0"/>
3330 <bits name="PCP" above="3" width=".6"/>
3331 </header>
96fee5e0
BP
3332 </diagram>
3333
3334 <p>
5a0e4aec
BP
3335 This field may only be used when <ref field="vlan_vid"/> is not
3336 wildcarded and does not exact match on 0 (which only matches
3337 when there is no 802.1Q header).
96fee5e0
BP
3338 </p>
3339
3340 <p>
5a0e4aec 3341 See <cite>VLAN Comparison Chart</cite>, below, for some examples.
96fee5e0
BP
3342 </p>
3343 </field>
3344
3345 <h2>Open vSwitch Extension VLAN Field</h2>
3346
3347 <p>
3348 The <ref field="vlan_tci"/> extension can describe more kinds of VLAN
3349 matches than the other variants. It is also simpler than the other
3350 variants.
3351 </p>
3352
3353 <field id="MFF_VLAN_TCI" title="VLAN TCI">
3354 <p>
5a0e4aec
BP
3355 For a packet without an 802.1Q header, this field is zero. For a
3356 packet with an 802.1Q header, this field is the TCI with the bit in
3357 CFI's position (marked <code>P</code> for ``present'' below) forced to
3358 1. Thus, for a packet in VLAN 9 with priority 7, it has the value
3359 <code>0xf009</code>:
96fee5e0
BP
3360 </p>
3361
3362 <diagram>
5a0e4aec
BP
3363 <header name="NXM_VLAN_TCI">
3364 <bits name="PCP" above="3" below="7" width=".6"/>
3365 <bits name="P" above="1" below="1" width=".2"/>
3366 <bits name="VID" above="12" below="9" width=".9"/>
3367 </header>
96fee5e0
BP
3368 </diagram>
3369
3370 <p>
3371 Usage examples:
3372 </p>
3373
3374 <dl>
3375 <dt><code>vlan_tci=0</code></dt>
3376 <dd>
3377 Match packets without an 802.1Q header.
3378 </dd>
3379
3380 <dt><code>vlan_tci=0x1000/0x1000</code></dt>
3381 <dd>
3382 Match packets with an 802.1Q header, regardless of VLAN
3383 and priority values.
3384 </dd>
3385
3386 <dt><code>vlan_tci=0xf123</code></dt>
3387 <dd>
3388 Match packets tagged with priority 7 in VLAN 0x123.
3389 </dd>
3390
3391 <dt><code>vlan_tci=0x1123/0x1fff</code></dt>
3392 <dd>
3393 Match packets tagged with VLAN 0x123 (and any priority).
3394 </dd>
3395
3396 <dt><code>vlan_tci=0x5000/0xf000</code></dt>
3397 <dd>
3398 Match packets tagged with priority 2 (in any VLAN).
3399 </dd>
3400
3401 <dt><code>vlan_tci=0/0xfff</code></dt>
3402 <dd>
3403 Match packets with no 802.1Q header or tagged with VLAN 0
3404 (and any priority).
3405 </dd>
3406
3407 <dt><code>vlan_tci=0/0xe000</code></dt>
3408 <dd>
a0a81b57 3409 Match packets with no 802.1Q header or tagged with priority 0 (in any VLAN).
96fee5e0
BP
3410 </dd>
3411
3412 <dt><code>vlan_tci=0/0xefff</code></dt>
3413 <dd>
3414 Match packets with no 802.1Q header or tagged with VLAN 0
3415 and priority 0.
3416 </dd>
3417 </dl>
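<p>
Several of the usage examples above can be verified with a short Python
model of the field. This is an illustrative sketch, not Open vSwitch code;
the helper names are hypothetical and an untagged packet is represented by
<code>None</code>.
</p>

<pre><![CDATA[
```python
PRESENT = 0x1000  # bit in CFI's position, forced to 1 for tagged packets


def vlan_tci_field(tci=None):
    """Return the field value: 0 if untagged, else the TCI with P set."""
    if tci is None:
        return 0
    return (tci | PRESENT) & 0xFFFF


def tci_matches(field, value, mask=0xFFFF):
    """Masked comparison, as in vlan_tci=value/mask."""
    return (field & mask) == (value & mask)


untagged = vlan_tci_field()
prio7_vlan123 = vlan_tci_field(0xE123)   # priority 7, VLAN 0x123
prio2_vlan0 = vlan_tci_field(0x4000)     # priority 2, VLAN 0
prio0_vlan0 = vlan_tci_field(0x0000)     # priority 0, VLAN 0
```
]]></pre>

<p>
With these definitions, <code>tci_matches(prio2_vlan0, 0x5000, 0xf000)</code>
is true, while <code>tci_matches(untagged, 0x1000, 0x1000)</code> is false.
</p>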
3418
3419 <p>
5a0e4aec 3420 See <cite>VLAN Comparison Chart</cite>, below, for more examples.
96fee5e0
BP
3421 </p>
3422 </field>
3423
3424 <h2>VLAN Comparison Chart</h2>
3425
3426 <p>
3427 The following table describes how each of several possible matching
3428 criteria on the 802.1Q header may be expressed with each variation
3429 of the VLAN matching fields:
3430 </p>
3431
3432 <tbl>
3433r r r r r.
5a0e4aec
BP
3434Criteria OpenFlow 1.0 OpenFlow 1.1 OpenFlow 1.2+ NXM
3435\_ \_ \_ \_ \_
3436[1] \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fL0000\fR,\fL--\fR \fL0000\fR/\fL0000\fR
3437[2] \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fLffff\fR,\fL--\fR \fL0000\fR/\fLffff\fR
3438[3] \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL1xxx\fR/\fLffff\fR,\fL--\fR \fL1xxx\fR/\fL1fff\fR
3439[4] \fL????\fR/\fL1\fR,\fL0y\fR/\fL0\fR \fLfffe\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1000\fR/\fL1000\fR,\fL0y\fR \fLz000\fR/\fLf000\fR
3440[5] \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1xxx\fR/\fLffff\fR,\fL0y\fR \fLzxxx\fR/\fLffff\fR
96fee5e0
BP
3441.T&amp;
3442r r c c r.
5a0e4aec 3443[6] (none) (none) \fL1001\fR/\fL1001\fR,\fL--\fR \fL1001\fR/\fL1001\fR
96fee5e0
BP
3444.T&amp;
3445r r c c c.
5a0e4aec
BP
3446[7] (none) (none) (none) \fL3000\fR/\fL3000\fR
3447[8] (none) (none) (none) \fL0000\fR/\fL0fff\fR
3448[9] (none) (none) (none) \fL0000\fR/\fLe000\fR
3449[10] (none) (none) (none) \fL0000\fR/\fLefff\fR
96fee5e0
BP
3450 </tbl>
3451
3452 <p>
3453 All numbers in the table are expressed in hexadecimal. The
3454 columns in the table are interpreted as follows:
3455 </p>
3456
3457 <dl>
3458 <dt>Criteria</dt>
3459 <dd>See the list below.</dd>
3460
3461 <dt>OpenFlow 1.0</dt>
3462 <dt>OpenFlow 1.1</dt>
3463 <dd>
5a0e4aec
BP
3464 <literal>wwww/x,yy/z</literal> means VLAN ID match value
3465 <literal>wwww</literal> with wildcard bit <literal>x</literal>
3466 and VLAN PCP match value <literal>yy</literal> with wildcard
3467 bit <literal>z</literal>. <literal>?</literal> means that the
3468 given bits are ignored (and conventionally
3469 <literal>0</literal> for <literal>wwww</literal> or
3470 <literal>yy</literal>, conventionally <literal>1</literal> for
3471 <literal>x</literal> or <literal>z</literal>). ``(none)''
3472 means that OpenFlow 1.0 (or 1.1) cannot match with these
3473 criteria.
96fee5e0
BP
3474 </dd>
3475
3476 <dt>OpenFlow 1.2+</dt>
3477 <dd>
5a0e4aec
BP
3478 <literal>xxxx/yyyy,zz</literal> means <ref field="vlan_vid"/> with
3479 value <literal>xxxx</literal> and mask <literal>yyyy</literal>, and
3480 <ref field="vlan_pcp"/> (which is not maskable) with value
3481 <literal>zz</literal>. <literal>--</literal> means that <ref
3482 field="vlan_pcp"/> is omitted. ``(none)'' means that OpenFlow 1.2
3483 cannot match with these criteria.
96fee5e0
BP
3484 </dd>
3485
3486 <dt>NXM</dt>
3487 <dd>
5a0e4aec
BP
3488 <literal>xxxx/yyyy</literal> means <ref field="vlan_tci"/> with value
3489 <literal>xxxx</literal> and mask <literal>yyyy</literal>.
96fee5e0
BP
3490 </dd>
3491 </dl>
3492
3493 <p>
3494 The matching criteria described by the table are:
3495 </p>
3496
3497 <dl>
3498 <dt>[1]</dt>
3499 <dd>
5a0e4aec
BP
3500 Matches any packet, that is, one without an 802.1Q header or
3501 with an 802.1Q header with any TCI value.
96fee5e0
BP
3502 </dd>
3503
3504 <dt>[2]</dt>
3505 <dd>
5a0e4aec
BP
3506 <p>
3507 Matches only packets without an 802.1Q header.
3508 </p>
3509
3510 <p>
3511 OpenFlow 1.0 doesn't define the behavior if <ref field="dl_vlan"/> is
3512 set to <code>0xffff</code> and <ref field="dl_vlan_pcp"/> is not
3513 wildcarded. (Open vSwitch always ignores <ref field="dl_vlan_pcp"/>
3514 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.)
3515 </p>
3516
3517 <p>
3518 OpenFlow 1.1 says explicitly to ignore <ref field="dl_vlan_pcp"/>
3519 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.
3520 </p>
3521
3522 <p>
3523 OpenFlow 1.2 doesn't say how to interpret a match with <ref
3524 field="vlan_vid"/> value 0 and a mask with
3525 <code>OFPVID_PRESENT</code> (<code>0x1000</code>) set to 1 and some
3526 other bits in the mask set to 1 also. Open vSwitch interprets it the
3527 same way as a mask of <code>0x1000</code>.
3528 </p>
3529
3530 <p>
3531 Any NXM match with <ref field="vlan_tci"/> value 0 and the CFI bit
3532 set to 1 in the mask is equivalent to the one listed in the table.
3533 </p>
96fee5e0
BP
3534 </dd>
3535
3536 <dt>[3]</dt>
3537 <dd>
5a0e4aec
BP
3538 Matches only packets that have an 802.1Q header with VID
3539 <literal>xxx</literal> (and any PCP).
96fee5e0
BP
3540 </dd>
3541
3542 <dt>[4]</dt>
3543 <dd>
5a0e4aec
BP
3544 <p>
3545 Matches only packets that have an 802.1Q header with PCP
3546 <literal>y</literal> (and any VID).
3547 </p>
3548
3549 <p>
3550 OpenFlow 1.0 doesn't clearly define the behavior for this
3551 case. Open vSwitch implements it this way.
3552 </p>
3553
3554 <p>
3555 In the NXM value, <literal>z</literal> equals
3556 (<literal>y</literal> &lt;&lt; 1) | 1.
3557 </p>
96fee5e0
BP
3558 </dd>
3559
3560 <dt>[5]</dt>
3561 <dd>
5a0e4aec
BP
3562 <p>
3563 Matches only packets that have an 802.1Q header with VID
3564 <literal>xxx</literal> and PCP <literal>y</literal>.
3565 </p>
3566
3567 <p>
3568 In the NXM value, <literal>z</literal> equals
3569 (<literal>y</literal> &lt;&lt; 1) | 1.
3570 </p>
96fee5e0
BP
3571 </dd>
3572
3573 <dt>[6]</dt>
3574 <dd>
5a0e4aec
BP
3575 Matches only packets that have an 802.1Q header with an
3576 odd-numbered VID (and any PCP). Only possible with OpenFlow
3577 1.2 and NXM. (This is just an example; one can match on any
3578 desired VID bit pattern.)
96fee5e0
BP
3579 </dd>
3580
3581 <dt>[7]</dt>
3582 <dd>
5a0e4aec
BP
3583 Matches only packets that have an 802.1Q header with an
3584 odd-numbered PCP (and any VID). Only possible with NXM.
3585 (This is just an example; one can match on any desired PCP bit
3586 pattern.)
96fee5e0
BP
3587 </dd>
3588
3589 <dt>[8]</dt>
3590 <dd>
5a0e4aec
BP
3591 Matches packets with no 802.1Q header or with an 802.1Q header
3592 with a VID of 0. Only possible with NXM.
96fee5e0
BP
3593 </dd>
3594
3595 <dt>[9]</dt>
3596 <dd>
5a0e4aec
BP
3597 Matches packets with no 802.1Q header or with an 802.1Q header
3598 with a PCP of 0. Only possible with NXM.
96fee5e0
BP
3599 </dd>
3600
3601 <dt>[10]</dt>
3602 <dd>
5a0e4aec
BP
3603 Matches packets with no 802.1Q header or with an 802.1Q header
3604 with both VID and PCP of 0. Only possible with NXM.
96fee5e0
BP
3605 </dd>
3606 </dl>
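<p>
For criteria [3], [4], and [5], the NXM value and mask follow mechanically
from the VID and PCP being matched. The Python sketch below shows one way to
compute them; <code>nxm_vlan_tci</code> is a hypothetical helper, not an
Open vSwitch function.
</p>

<pre><![CDATA[
```python
def nxm_vlan_tci(vid=None, pcp=None):
    """Return (value, mask) matching tagged packets with the given fields.

    vid and pcp are None when that subfield should not be constrained.
    """
    value, mask = 0x1000, 0x1000      # always require the "present" bit
    if vid is not None:
        value |= vid & 0x0FFF
        mask |= 0x0FFF
    if pcp is not None:
        value |= (pcp & 0x7) << 13
        mask |= 0xE000
    return value, mask


# The top hex digit z of the value equals (y << 1) | 1 for PCP y.
value, mask = nxm_vlan_tci(pcp=2)
top_digit = value >> 12
```
]]></pre>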
3607 </group>
3608
3609 <group title="Layer 2.5: MPLS">
3610 <p>
3611 One or more MPLS headers (more commonly called <dfn>MPLS
3612 labels</dfn>) follow an Ethernet type field that specifies an
3613 MPLS Ethernet type [RFC 3032]. Ethertype <code>0x8847</code> is
3614 used for all unicast. Multicast MPLS is divided into two
3615 specific classes, one of which uses Ethertype
3616 <code>0x8847</code> and the other <code>0x8848</code> [RFC
3617 5332].
3618 </p>
3619
3620 <p>
3621 The most common overall packet format is Ethernet II, shown
3622 below (SNAP encapsulation may be used but is not ordinarily seen
3623 in Ethernet networks):
3624 </p>
3625
3626 <diagram>
3627 <header name="Ethernet">
5a0e4aec
BP
3628 <bits name="dst" above="48" width="0.75"/>
3629 <bits name="src" above="48" width="0.75"/>
3630 <bits name="type" above="16" below="0x8847" width="0.4"/>
96fee5e0
BP
3631 </header>
3632 <header name="MPLS">
5a0e4aec
BP
3633 <bits name="label" above="20" width=".6"/>
3634 <bits name="TC" above="3" width=".3"/>
3635 <bits name="S" above="1" width=".1"/>
3636 <bits name="TTL" above="8" width=".4"/>
96fee5e0
BP
3637 </header>
3638 <dots/>
3639 </diagram>
3640
3641 <p>
3642 MPLS can be encapsulated inside an 802.1Q header, in which case
3643 the combination looks like this:
3644 </p>
3645
3646 <diagram>
3647 <header name="Ethernet">
5a0e4aec
BP
3648 <bits name="dst" above="48" width=".75"/>
3649 <bits name="src" above="48" width=".75"/>
96fee5e0
BP
3650 </header>
3651 <header name="802.1Q">
5a0e4aec
BP
3652 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3653 <bits name="TCI" above="16" width=".4"/>
96fee5e0
BP
3654 </header>
3655 <header name="Ethertype">
5a0e4aec 3656 <bits name="type" above="16" below="0x8847" width=".4"/>
96fee5e0
BP
3657 </header>
3658 <header name="MPLS">
5a0e4aec
BP
3659 <bits name="label" above="20" width=".6"/>
3660 <bits name="TC" above="3" width=".3"/>
3661 <bits name="S" above="1" width=".1"/>
3662 <bits name="TTL" above="8" width=".4"/>
96fee5e0
BP
3663 </header>
3664 <dots/>
3665 </diagram>
3666
3667 <p>
3668 The fields within an MPLS label are:
3669 </p>
3670
3671 <dl>
3672 <dt>Label, 20 bits.</dt>
3673 <dd>
5a0e4aec 3674 An identifier.
96fee5e0
BP
3675 </dd>
3676
3677 <dt>Traffic class (TC), 3 bits.</dt>
3678 <dd>
5a0e4aec 3679 Used for quality of service.
96fee5e0
BP
3680 </dd>
3681
3682 <dt>Bottom of stack (BOS), 1 bit (labeled just ``S'' above).</dt>
3683 <dd>
5a0e4aec
BP
3684 <p>
3685 0 indicates that another MPLS label follows this one.
3686 </p>
3687
3688 <p>
3689 1 indicates that this MPLS label is the last one in the
3690 stack, so that some other protocol follows this one.
3691 </p>
96fee5e0
BP
3692 </dd>
3693
3694 <dt>Time to live (TTL), 8 bits.</dt>
3695 <dd>
5a0e4aec
BP
3696 <p>
3697 Each hop across an MPLS network decrements the TTL by 1. If
3698 it reaches 0, the packet is discarded.
3699 </p>
3700
3701 <p>
3702 OpenFlow does not make the MPLS TTL available as a match field, but
3703 actions are available to set and decrement the TTL. Open vSwitch 2.6
3704 and later make the MPLS TTL available as an extension.
3705 </p>
96fee5e0
BP
3706 </dd>
3707 </dl>
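<p>
The following Python sketch decodes one 32-bit MPLS label stack entry into
the four fields listed above. The function name
<code>parse_mpls_lse</code> is hypothetical, and the input is assumed to be
at least four bytes in network byte order.
</p>

<pre><![CDATA[
```python
import struct


def parse_mpls_lse(data):
    """Decode the first 4 bytes of data as an MPLS label stack entry."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,          # top 20 bits
        "tc": (word >> 9) & 0x7,      # 3 bits
        "bos": (word >> 8) & 0x1,     # 1 bit, the "S" bit
        "ttl": word & 0xFF,           # low 8 bits
    }


# Label 16, TC 5, bottom of stack, TTL 64.
entry = parse_mpls_lse(bytes.fromhex("00010b40"))
```
]]></pre>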
3708
3709 <h2>MPLS Label Stacks</h2>
3710
3711 <p>
3712 Unlike the other encapsulations supported by OpenFlow and Open vSwitch,
3713 MPLS labels are routinely used in ``stacks'' two or three deep and
3714 sometimes even deeper. Open vSwitch currently supports up to three
3715 labels.
3716 </p>
3717
3718 <p>
3719 The OpenFlow specification only supports matching on the outermost MPLS
3720 label at any given time. To match on the second label, one must first
3721 ``pop'' the outer label and advance to another OpenFlow table, where the
3722 inner label may be matched. To match on the third label, one must pop
3723 the two outer labels, and so on. The Open Networking Foundation is
3724 considering support for directly matching on multiple MPLS labels for
3725 OpenFlow 1.6.<!-- XXX add EXT-* link -->
3726 </p>
3727
3728 <h2>MPLS Inner Protocol</h2>
3729
3730 <p>
3731 Unlike all other forms of encapsulation that Open vSwitch and
3732 OpenFlow support, an MPLS label does not indicate what inner
3733 protocol it encapsulates. Different deployments determine the
3734 inner protocol in different ways [RFC 3032]:
3735 </p>
3736
3737 <ul>
3738 <li>
5a0e4aec
BP
3739 A few reserved label values do indicate an inner protocol.
3740 Label 0, the ``IPv4 Explicit NULL Label,'' indicates inner
3741 IPv4. Label 2, the ``IPv6 Explicit NULL Label,'' indicates
3742 inner IPv6.
96fee5e0
BP
3743 </li>
3744
3745 <li>
5a0e4aec 3746 Some deployments use a single inner protocol consistently.
96fee5e0
BP
3747 </li>
3748
3749 <li>
5a0e4aec
BP
3750 In some deployments, the inner protocol must be inferred from
3751 the innermost label.
96fee5e0
BP
3752 </li>
3753
3754 <li>
5a0e4aec
BP
3755 In some deployments, the inner protocol must be inferred from
3756 the innermost label and the encapsulated data, e.g. to
3757 distinguish between inner IPv4 and IPv6 based on whether the
3758 first nibble of the inner protocol data is <code>4</code> or
3759 <code>6</code>. OpenFlow and Open vSwitch do not currently
3760 support these cases.
96fee5e0
BP
3761 </li>
3762 </ul>
3763
3764 <p>
3765 Open vSwitch and OpenFlow do not infer the inner protocol, even if
3766 reserved label values are in use. Instead, the flow table must specify
3767 the inner protocol at the time it pops the bottommost MPLS label, using
3768 the Ethertype argument to the <code>pop_mpls</code> action.
3769 </p>
3770
3771 <h2>Field Details</h2>
3772
3773 <field id="MFF_MPLS_LABEL" title="MPLS Label">
3774 <p>
5a0e4aec
BP
3775 The least significant 20 bits hold the ``label'' field from
3776 the MPLS label. Other bits are zero:
96fee5e0
BP
3777 </p>
3778
3779 <diagram>
5a0e4aec
BP
3780 <header name="OXM_OF_MPLS_LABEL">
3781 <bits name="zero" above="12" below="0" width=".6"/>
3782 <bits name="label" above="20" width="1.0"/>
3783 </header>
96fee5e0
BP
3784 </diagram>
3785
3786 <p>
5a0e4aec
BP
3787 Most label values are available for any use by deployments.
3788 Values under 16 are reserved.
96fee5e0
BP
3789 </p>
3790 </field>
3791
3792 <field id="MFF_MPLS_TC" title="MPLS Traffic Class">
3793 <p>
5a0e4aec
BP
3794 The least significant 3 bits hold the TC field from the MPLS
3795 label. Other bits are zero:
96fee5e0
BP
3796 </p>
3797
3798 <diagram>
5a0e4aec
BP
3799 <header name="OXM_OF_MPLS_TC">
3800 <bits name="zero" above="5" below="0" width="1.0"/>
3801 <bits name="TC" above="3" width=".6"/>
3802 </header>
96fee5e0
BP
3803 </diagram>
3804
3805 <p>
5a0e4aec
BP
3806 This field is intended for use for Quality of Service (QoS)
3807 and Explicit Congestion Notification purposes, but its
3808 particular interpretation is deployment specific.
96fee5e0
BP
3809 </p>
3810
3811 <p>
5a0e4aec
BP
3812 Before 2009, this field was named EXP and reserved for
3813 experimental use [RFC 5462].
96fee5e0
BP
3814 </p>
3815 </field>
3816
3817 <field id="MFF_MPLS_BOS" title="MPLS Bottom of Stack">
3818 <p>
5a0e4aec
BP
3819 The least significant bit holds the BOS field from the MPLS
3820 label. Other bits are zero:
96fee5e0
BP
3821 </p>
3822
3823 <diagram>
5a0e4aec
BP
3824 <header name="OXM_OF_MPLS_BOS">
3825 <bits name="zero" above="7" below="0" width="1.3"/>
3826 <bits name="BOS" above="1" width=".3"/>
3827 </header>
96fee5e0
BP
3828 </diagram>
3829
3830 <p>
5a0e4aec
BP
3831 This field is useful as part of processing a series of incoming MPLS
3832 labels. A flow that includes a <code>pop_mpls</code> action should
3833 generally match on <ref field="mpls_bos"/>:
96fee5e0
BP
3834 </p>
3835
3836 <ul>
5a0e4aec
BP
3837 <li>
3838 When <ref field="mpls_bos"/> is 0, there is another MPLS label
3839 following this one, so the Ethertype passed to <code>pop_mpls</code>
3840 should be an MPLS Ethertype. For example: <code>table=0,
3841 dl_type=0x8847, mpls_bos=0, actions=pop_mpls:0x8847,
3842 goto_table:1</code>
3843 </li>
3844
3845 <li>
3846 When <ref field="mpls_bos"/> is 1, this MPLS label is the last one,
3847 so the Ethertype passed to <code>pop_mpls</code> should be a non-MPLS
3848 Ethertype such as IPv4. For example: <code>table=1, dl_type=0x8847,
3849 mpls_bos=1, actions=pop_mpls:0x0800, goto_table:2</code>
3850 </li>
96fee5e0
BP
3851 </ul>
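<p>
Using the definition of the S bit given earlier (1 marks the last label in
the stack), the choice of Ethertype for each <code>pop_mpls</code> can be
sketched in Python. The helper name is hypothetical, and the inner protocol
defaults to IPv4 purely as an assumption, since MPLS itself does not
indicate it.
</p>

<pre><![CDATA[
```python
ETH_TYPE_MPLS = 0x8847
ETH_TYPE_IPV4 = 0x0800


def pop_mpls_ethertypes(bos_bits, inner_ethertype=ETH_TYPE_IPV4):
    """Return the Ethertype to pass to pop_mpls for each label.

    bos_bits lists the bottom-of-stack bit of each label, outermost first.
    """
    return [
        inner_ethertype if bos else ETH_TYPE_MPLS
        for bos in bos_bits
    ]


# A three-label stack: only the innermost label has BOS set.
ethertypes = pop_mpls_ethertypes([0, 0, 1])
```
]]></pre>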
3852 </field>
3853
3854 <field id="MFF_MPLS_TTL" title="MPLS Time-to-Live">
3855 <p>
3856 Holds the 8-bit time-to-live field from the MPLS label:
3857 </p>
3858
3859 <diagram>
5a0e4aec
BP
3860 <header name="NXM_NX_MPLS_TTL">
3861 <bits name="TTL" above="8" width=".4"/>
3862 </header>
96fee5e0
BP
3863 </diagram>
3864 </field>
3865 </group>
3866
3867 <group title="Layer 3: IPv4 and IPv6">
3868 <h2>IPv4 Specific Fields</h2>
3869
3870 <p>
3871 These fields are applicable only to IPv4 flows, that is, flows that match
3872 on the IPv4 Ethertype <code>0x0800</code>.
3873 </p>
3874
3875 <field id="MFF_IPV4_SRC" title="IPv4 Source Address">
3876 <p>
3877 The source address from the IPv4 header:
3878 </p>
3879
3880 <diagram>
5a0e4aec
BP
3881 <header name="Ethernet">
3882 <bits name="dst" above="48" width="0.4"/>
3883 <bits name="src" above="48" width="0.4"/>
3884 <bits name="type" above="16" below="0x800" width="0.4"/>
3885 </header>
3886 <header name="IPv4">
3887 <bits name="..." width="0.4"/>
3888 <bits name="proto" above="8" width="0.4"/>
3889 <bits name="src" above="32" width="0.4" fill="yes"/>
3890 <bits name="dst" above="32" width="0.4"/>
3891 </header>
3892 <dots/>
96fee5e0
BP
3893 </diagram>
3894
3895 <p>
3896 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3897 matches on <code>nw_src</code> as actually referring to the ARP SPA.
3898 </p>
3899 </field>
3900
3901 <field id="MFF_IPV4_DST" title="IPv4 Destination Address">
3902 <p>
3903 The destination address from the IPv4 header:
3904 </p>
3905
3906 <diagram>
5a0e4aec
BP
3907 <header name="Ethernet">
3908 <bits name="dst" above="48" width="0.4"/>
3909 <bits name="src" above="48" width="0.4"/>
3910 <bits name="type" above="16" below="0x800" width="0.4"/>
3911 </header>
3912 <header name="IPv4">
3913 <bits name="..." width="0.4"/>
3914 <bits name="proto" above="8" width="0.4"/>
3915 <bits name="src" above="32" width="0.4"/>
3916 <bits name="dst" above="32" width="0.4" fill="yes"/>
3917 </header>
3918 <dots/>
96fee5e0
BP
3919 </diagram>
3920
3921 <p>
3922 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3923 matches on <code>nw_dst</code> as actually referring to the ARP TPA.
3924 </p>
3925 </field>
3926
3927 <h2>IPv6 Specific Fields</h2>
3928
3929 <p>
3930 These fields apply only to IPv6 flows, that is, flows that match
3931 on the IPv6 Ethertype <code>0x86dd</code>.
3932 </p>
3933
3934 <field id="MFF_IPV6_SRC" title="IPv6 Source Address">
3935 <p>
3936 The source address from the IPv6 header:
3937 </p>
3938
3939 <diagram>
5a0e4aec
BP
3940 <header name="Ethernet">
3941 <bits name="dst" above="48" width="0.4"/>
3942 <bits name="src" above="48" width="0.4"/>
3943 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3944 </header>
3945 <header name="IPv6">
3946 <bits name="..." width="0.4"/>
3947 <bits name="next" above="8" width="0.3"/>
3948 <bits name="src" above="128" width="0.8" fill="yes"/>
3949 <bits name="dst" above="128" width="0.8"/>
3950 </header>
3951 <dots/>
96fee5e0
BP
3952 </diagram>
3953
3954 <p>
3955 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3956 supported only CIDR masks.
3957 </p>
3958 </field>
3959 <field id="MFF_IPV6_DST" title="IPv6 Destination Address">
3960 <p>
3961 The destination address from the IPv6 header:
3962 </p>
3963 <diagram>
5a0e4aec
BP
3964 <header name="Ethernet">
3965 <bits name="dst" above="48" width="0.4"/>
3966 <bits name="src" above="48" width="0.4"/>
3967 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3968 </header>
3969 <header name="IPv6">
3970 <bits name="..." width="0.4"/>
3971 <bits name="next" above="8" width="0.3"/>
3972 <bits name="src" above="128" width="0.8"/>
3973 <bits name="dst" above="128" width="0.8" fill="yes"/>
3974 </header>
3975 <dots/>
96fee5e0
BP
3976 </diagram>
3977
3978 <p>
3979 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3980 supported only CIDR masks.
3981 </p>
3982 </field>
3983 <field id="MFF_IPV6_LABEL" title="IPv6 Flow Label">
3984 <p>
5a0e4aec
BP
3985 The least significant 20 bits hold the flow label field from
3986 the IPv6 header. Other bits are zero:
96fee5e0
BP
3987 </p>
3988
3989 <diagram>
5a0e4aec
BP
3990 <header name="OXM_OF_IPV6_FLABEL">
3991 <bits name="zero" above="12" below="0" width=".6"/>
3992 <bits name="label" above="20" width="1.0"/>
3993 </header>
96fee5e0
BP
3994 </diagram>
3995 </field>
3996
3997 <h2>IPv4/IPv6 Fields</h2>
3998
3999 <p>
4000 These fields exist with at least approximately the same meaning in both
4001 IPv4 and IPv6, so they are treated as a single field for matching
4002 purposes. Any flow that matches on the IPv4 Ethertype
4003 <code>0x0800</code> or the IPv6 Ethertype <code>0x86dd</code> may match
4004 on these fields.
4005 </p>
4006
4007 <field id="MFF_IP_PROTO" title="IPv4/v6 Protocol">
4008 <p>
4009 Matches the IPv4 or IPv6 protocol type.
4010 </p>
4011
4012 <p>
4013 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
4014 matches on <code>nw_proto</code> as actually referring to the ARP
4015 opcode. The ARP opcode is a 16-bit field, so for matching purposes ARP
4016 opcodes greater than 255 are treated as 0; this works adequately
4017 because in practice ARP and RARP only use opcodes 1 through 4.
4018 </p>
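<p>
The treatment of ARP opcodes described above amounts to the following
Python sketch, in which <code>nw_proto_for_arp</code> is a hypothetical
name used only for illustration.
</p>

<pre><![CDATA[
```python
def nw_proto_for_arp(opcode):
    """Map a 16-bit ARP opcode onto the 8-bit value used for matching."""
    # Opcodes that do not fit in 8 bits are treated as 0.
    return opcode if opcode <= 255 else 0
```
]]></pre>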
4019 </field>
4020
4021 <field id="MFF_IP_TTL" title="IPv4/v6 TTL/Hop Limit">
4022 The main reason to match on the TTL or hop limit field is to detect
4023 whether a <code>dec_ttl</code> action will fail due to a TTL exceeded
4024 error. Another way that a controller can detect TTL exceeded is to
4025 listen for <code>OFPR_INVALID_TTL</code> ``packet-in'' messages via
4026 OpenFlow.
4027 </field>
17553f27 4028
96fee5e0
BP
4029 <field id="MFF_IP_FRAG" title="IPv4/v6 Fragment Bitmask">
4030 <p>
4031 Specifies what kinds of IP fragments or non-fragments to match. The
4032 value for this field is most conveniently specified as one of the
4033 following:
4034 </p>
4035
4036 <dl>
4037 <dt><code>no</code></dt>
4038 <dd>
4039 Matches only non-fragmented packets.
4040 </dd>
4041
4042 <dt><code>yes</code></dt>
4043 <dd>
4044 Matches all fragments.
4045 </dd>
4046
4047 <dt><code>first</code></dt>
4048 <dd>
4049 Matches only fragments with offset 0.
4050 </dd>
4051
4052 <dt><code>later</code></dt>
4053 <dd>
4054 Matches only fragments with nonzero offset.
4055 </dd>
4056
4057 <dt><code>not_later</code></dt>
4058 <dd>
4059 Matches non-fragmented packets and fragments with zero offset.
4060 </dd>
4061 </dl>
4062
4063 <p>
4064 The field is internally formatted as 2 bits: bit 0 is 1 for an IP
4065 fragment with any offset (and otherwise 0), and bit 1 is 1 for an IP
4066 fragment with nonzero offset (and otherwise 0), like so:
4067 </p>
4068
4069 <diagram>
5a0e4aec
BP
4070 <header name="NXM_NX_IP_FRAG">
4071 <bits name="zero" above="6" below="0" width=".9"/>
4072 <bits name="later" above="1" width=".3"/>
4073 <bits name="any" above="1" width=".3"/>
4074 </header>
96fee5e0
BP
4075 </diagram>
4076
4077 <p>
4078 Even though 2 bits have 4 possible values, this field only uses 3 of
4079 them:
4080 </p>
4081
4082 <ul>
4083 <li>
4084 A packet that is not an IP fragment has value 0.
4085 </li>
4086
4087 <li>
4088 A packet that is an IP fragment with offset 0 (the first fragment)
4089 has bit 0 set and thus value 1.
4090 </li>
4091
4092 <li>
4093 A packet that is an IP fragment with nonzero offset has bits 0 and 1
4094 set and thus value 3.
4095 </li>
4096 </ul>
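<p>
The Python sketch below models the two bits and the five keywords. It
assumes IPv4-style inputs (a ``more fragments'' flag and a fragment offset),
and the value and mask chosen for each keyword are one encoding consistent
with the descriptions above, not necessarily the representation that Open
vSwitch uses internally. All names are hypothetical.
</p>

<pre><![CDATA[
```python
FRAG_ANY = 0x1    # bit 0: packet is a fragment with any offset
FRAG_LATER = 0x2  # bit 1: packet is a fragment with nonzero offset


def ip_frag_value(more_fragments, frag_offset):
    """Compute the field value for a packet."""
    if not more_fragments and frag_offset == 0:
        return 0                       # not a fragment
    if frag_offset == 0:
        return FRAG_ANY                # first fragment
    return FRAG_ANY | FRAG_LATER       # later fragment


# keyword -> (value, mask)
FRAG_KEYWORDS = {
    "no": (0, FRAG_ANY | FRAG_LATER),
    "yes": (FRAG_ANY, FRAG_ANY),
    "first": (FRAG_ANY, FRAG_ANY | FRAG_LATER),
    "later": (FRAG_ANY | FRAG_LATER, FRAG_ANY | FRAG_LATER),
    "not_later": (0, FRAG_LATER),
}


def frag_matches(keyword, field):
    """Return True if a packet with this field value matches the keyword."""
    value, mask = FRAG_KEYWORDS[keyword]
    return (field & mask) == value
```
]]></pre>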
4097
4098 <p>
4099 The switch may reject matches against values that can never appear.
4100 </p>
4101
4102 <p>
4103 It is important to understand how this field interacts with the
4104 OpenFlow fragment handling mode:
4105 </p>
4106
4107 <ul>
4108 <li>
4109 In <code>OFPC_FRAG_DROP</code> mode, the OpenFlow switch drops all IP
4110 fragments before they reach the flow table, so every packet that is
4111 available for matching will have value 0 in this field.
4112 </li>
4113
4114 <li>
4115 Open vSwitch does not implement <code>OFPC_FRAG_REASM</code> mode,
4116 but if it did then IP fragments would be reassembled before they
4117 reached the flow table and again every packet available for matching
4118 would always have value 0.
4119 </li>
4120
4121 <li>
4122 In <code>OFPC_FRAG_NORMAL</code> mode, all three values are possible,
4123 but OpenFlow 1.0 says that fragments' transport ports are always 0,
4124 even for the first fragment, so this does not provide much extra
4125 information.
4126 </li>
4127
4128 <li>
4129 In <code>OFPC_FRAG_NX_MATCH</code> mode, all three values are
4130 possible. For fragments with offset 0, Open vSwitch makes L4 header
4131 information available.
4132 </li>
4133 </ul>
4134
4135 <p>
4136 Thus, this field is likely to be most useful for an Open vSwitch switch
4137 configured in <code>OFPC_FRAG_NX_MATCH</code> mode. See the
4138 description of the <code>set-frags</code> command in
4139 <code>ovs-ofctl</code>(8) for more details.
4140 </p>
4141 </field>
4142
4143 <h3>IPv4/IPv6 TOS Fields</h3>
4144
4145 <p>
4146 IPv4 and IPv6 contain a one-byte ``type of service'' or TOS field that
4147 has the following format:
4148 </p>
4149
4150 <diagram>
4151 <header name="type of service">
5a0e4aec
BP
4152 <bits name="DSCP" above="6" width=".9"/>
4153 <bits name="ECN" above="2" width=".3"/>
96fee5e0
BP
4154 </header>
4155 </diagram>
4156
4157 <field id="MFF_IP_DSCP" title="IPv4/v6 DSCP (Bits 2-7)">
4158 <p>
4159 This field is the TOS byte with the two ECN bits cleared to 0:
4160 </p>
4161
4162 <diagram>
5a0e4aec
BP
4163 <header name="NXM_OF_IP_TOS">
4164 <bits name="DSCP" above="6" width=".9"/>
4165 <bits name="zero" above="2" below="0" width=".3"/>
4166 </header>
96fee5e0
BP
4167 </diagram>
4168 </field>
4169 <field id="MFF_IP_DSCP_SHIFTED" title="IPv4/v6 DSCP (Bits 0-5)">
4170 <p>
4171 This field is the TOS byte shifted right to put the DSCP bits in the
4172 6 least-significant bits:
4173 </p>
4174
4175 <diagram>
5a0e4aec
BP
4176 <header name="OXM_OF_IP_DSCP">
4177 <bits name="zero" above="2" below="0" width=".3"/>
4178 <bits name="DSCP" above="6" width=".9"/>
4179 </header>
96fee5e0
BP
4180 </diagram>
4181 </field>
4182 <field id="MFF_IP_ECN" title="IPv4/v6 ECN">
4183 <p>
4184 This field is the TOS byte with the DSCP bits cleared to 0:
4185 </p>
4186
4187 <diagram>
5a0e4aec
BP
4188 <header name="OXM_OF_IP_ECN">
4189 <bits name="zero" above="6" below="0" width=".9"/>
4190 <bits name="ECN" above="2" width=".35"/>
4191 </header>
96fee5e0
BP
4192 </diagram>
4193 </field>
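<p>
As a rough illustration of how the three TOS-derived fields relate to one
another, the following Python sketch splits a TOS byte into each
representation. The function name <code>split_tos</code> is hypothetical and
the code is not taken from Open vSwitch.
</p>

```python
def split_tos(tos):
    """Split a one-byte TOS value into (dscp, dscp_shifted, ecn).

    dscp         -- TOS byte with the two ECN bits cleared (DSCP in bits 2-7)
    dscp_shifted -- DSCP moved into the 6 least-significant bits
    ecn          -- TOS byte with the six DSCP bits cleared
    """
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    dscp = tos & 0xFC
    dscp_shifted = tos >> 2
    ecn = tos & 0x03
    return dscp, dscp_shifted, ecn
```

<p>
For example, a TOS byte of <code>0xb9</code> yields <code>0xb8</code> for the
unshifted DSCP form, 46 for the shifted form, and 1 for ECN.
</p>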
4194
4195 </group>
4196
4197 <group title="Layer 3: ARP">
4198 <p>
4199 In theory, Address Resolution Protocol, or ARP, is a generic protocol
4200 that can be used to obtain the hardware address that
4201 corresponds to any higher-level protocol address. In contemporary usage,
4202 ARP is used only in Ethernet networks to obtain the Ethernet address for
4203 a given IPv4 address. OpenFlow and Open vSwitch only support this usage
4204 of ARP. For this use case, an ARP packet has the following format, with
4205 the ARP fields exposed as Open vSwitch fields highlighted:
4206 </p>
4207
4208 <diagram>
4209 <header name="Ethernet">
5a0e4aec
BP
4210 <bits name="dst" above="48" width="0.4"/>
4211 <bits name="src" above="48" width="0.4"/>
4212 <bits name="type" above="16" below="0x806" width="0.4"/>
96fee5e0
BP
4213 </header>
4214 <header name="ARP">
5a0e4aec
BP
4215 <bits name="hrd" above="16" below="1" width=".3"/>
4216 <bits name="pro" above="16" below="0x800" width=".3"/>
4217 <bits name="hln" above="8" below="6" width=".2"/>
4218 <bits name="pln" above="8" below="4" width=".2"/>
4219 <bits name="op" above="16" width=".2" fill="yes"/>
4220 <bits name="sha" above="48" width="0.5" fill="yes"/>
4221 <bits name="spa" above="32" width="0.3" fill="yes"/>
4222 <bits name="tha" above="48" width="0.5" fill="yes"/>
4223 <bits name="tpa" above="32" width="0.3" fill="yes"/>
96fee5e0
BP
4224 </header>
4225 </diagram>
4226
4227 <p>
4228 The ARP fields are also used for RARP, the Reverse Address Resolution
4229 Protocol, which shares ARP's wire format.
4230 </p>
4231
4232 <field id="MFF_ARP_OP" title="ARP Opcode">
4233 Even though this is a 16-bit field, Open vSwitch does not support ARP
4234 opcodes greater than 255; it treats them as zero. This works adequately
4235 because in practice ARP and RARP only use opcodes 1 through 4.
4236 </field>
4237
4238 <field id="MFF_ARP_SPA" title="ARP Source IPv4 Address"/>
4239 <field id="MFF_ARP_TPA" title="ARP Target IPv4 Address"/>
4240 <field id="MFF_ARP_SHA" title="ARP Source Ethernet Address"/>
4241 <field id="MFF_ARP_THA" title="ARP Target Ethernet Address"/>
4242 </group>
4243
17553f27
YY
4244 <group title="Layer 3: NSH">
4245 <p>
4246 Service functions are widely deployed and essential in many networks.
4247 These service functions provide a range of features such as security,
4248 WAN acceleration, and server load balancing. Service functions may
4249 be instantiated at different points in the network infrastructure
4250 such as the wide area network, data center, and so forth.
4251 </p>
4252
4253 <p>
4254 Prior to development of the SFC architecture [RFC 7665] and the NSH
4255 protocol described here, service function deployment models were
4256 relatively static and bound to topology for insertion and policy
4257 selection. Furthermore, they did not adapt well to elastic service
4258 environments enabled by virtualization.
4259 </p>
4260
4261 <p>
4262 New data center network and cloud architectures require more flexible
4263 service function deployment models. Additionally, the transition to
4264 virtual platforms demands an agile service insertion model that
4265 supports dynamic and elastic service delivery. Specifically, the
4266 following functions are necessary:
4267 </p>
4268
3d628928
BP
4269 <ol>
4270 <li>
4271 The movement of service functions and application workloads in
4272 the network.
4273 </li>
17553f27 4274
3d628928
BP
4275 <li>
4276 The ability to easily bind service policy to granular information, such
4277 as per-subscriber state.
4278 </li>
17553f27 4279
3d628928
BP
4280 <li>
4281 The capability to steer traffic to the requisite service function(s).
4282 </li>
4283 </ol>
17553f27
YY
4284
4285 <p>
4286 The Network Service Header (NSH) specification defines a new data
4287 plane protocol, which is an encapsulation for service function
4288 chains. The NSH is designed to encapsulate an original packet or
4289 frame, and in turn be encapsulated by an outer transport
4290 encapsulation (which is used to deliver the NSH to NSH-aware network
3d628928 4291 elements), as shown below:
17553f27
YY
4292 </p>
4293
3d628928
BP
4294 <diagram>
4295 <header>
4296 <bits name="Transport Encapsulation" width="1.8"/>
4297 </header>
4298 <nospace/>
4299 <header>
4300 <bits name="Network Service Header (NSH)" width="2.0"/>
4301 </header>
4302 <nospace/>
4303 <header>
4304 <bits name="Original Packet/Frame" width="1.8"/>
4305 </header>
4306 </diagram>
17553f27
YY
4307
4308 <p>
4309 The NSH is composed of the following elements:
4310 </p>
4311
3d628928
BP
4312 <ol>
4313 <li>Service Function Path identification.</li>
4314 <li>Indication of location within a Service Function Path.</li>
4315 <li>Optional, per-packet metadata (fixed length or variable).</li>
4316 </ol>
17553f27
YY
4317
4318 <p>
4319 [RFC 7665] provides an overview of a service chaining architecture
4320 that clearly defines the roles of the various elements and the scope
4321 of a service function chaining encapsulation. Figure 3 of [RFC 7665]
4322 depicts the SFC architectural components after classification. The
4323 NSH is the SFC encapsulation referenced in [RFC 7665].
4324 </p>
4325
4326 <field id="MFF_NSH_FLAGS"
4327 title="flags field (2 bits)"/>
4328 <field id="MFF_NSH_TTL"
4329 title="TTL field (6 bits)"/>
4330 <field id="MFF_NSH_MDTYPE"
4331 title="mdtype field (8 bits)"/>
4332 <field id="MFF_NSH_NP"
4333 title="np (next protocol) field (8 bits)"/>
4334 <field id="MFF_NSH_SPI"
4335 title="spi (service path identifier) field (24 bits)"/>
4336 <field id="MFF_NSH_SI"
4337 title="si (service index) field (8 bits)"/>
4338 <field id="MFF_NSH_C1"
4339 title="c1 (Network Platform Context) field (32 bits)"/>
4340 <field id="MFF_NSH_C2"
4341 title="c2 (Network Shared Context) field (32 bits)"/>
4342 <field id="MFF_NSH_C3"
4343 title="c3 (Service Platform Context) field (32 bits)"/>
4344 <field id="MFF_NSH_C4"
4345 title="c4 (Service Shared Context) field (32 bits)"/>
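<p>
The 24-bit <code>spi</code> and 8-bit <code>si</code> fields listed above add
up to 32 bits. The Python sketch below assumes, following the NSH
specification rather than anything stated in this document, that the two
share a single 32-bit word in network byte order with the service path
identifier in the upper 24 bits. The function names are hypothetical.
</p>

```python
import struct


def pack_service_path(spi, si):
    """Pack a 24-bit SPI and an 8-bit SI into 4 network-order bytes.

    Assumes the SPI occupies the upper 24 bits and the SI the lower 8.
    """
    if not 0 <= spi < (1 << 24):
        raise ValueError("spi must fit in 24 bits")
    if not 0 <= si < (1 << 8):
        raise ValueError("si must fit in 8 bits")
    return struct.pack("!I", (spi << 8) | si)


def unpack_service_path(data):
    """Inverse of pack_service_path: return (spi, si) from 4 bytes."""
    (word,) = struct.unpack("!I", data)
    return word >> 8, word & 0xFF
```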
4346 </group>
4347
4348
96fee5e0
BP
4349 <group title="Layer 4: TCP, UDP, and SCTP">
4350 <p>
4351 For matching purposes, it makes no difference whether these protocols are
4352 encapsulated within IPv4 or IPv6.
4353 </p>
4354
4355 <h2>TCP</h2>
4356
4357 <p>
4358 The following diagram shows TCP within IPv4. Open vSwitch also supports
4359 TCP in IPv6. Only TCP fields implemented as Open vSwitch fields are
4360 shown:
4361 </p>
4362
4363 <diagram>
4364 <header name="Ethernet">
5a0e4aec
BP
4365 <bits name="dst" above="48" width="0.4"/>
4366 <bits name="src" above="48" width="0.4"/>
4367 <bits name="type" above="16" below="0x800" width="0.4"/>
96fee5e0
BP
4368 </header>
4369 <header name="IPv4">
5a0e4aec
BP
4370 <bits name="..." width="0.4"/>
4371 <bits name="proto" above="8" below="6" width="0.3"/>
4372 <bits name="src" above="32" width="0.4"/>
4373 <bits name="dst" above="32" width="0.4"/>
96fee5e0
BP
4374 </header>
4375 <header name="TCP">
5a0e4aec
BP
4376 <bits name="src" above="16" width=".2"/>
4377 <bits name="dst" above="16" width=".2"/>
4378 <bits name="..." width=".75"/>
4379 <bits name="flags" above="12" width=".3"/>
4380 <bits name="..." width=".6"/>
96fee5e0
BP
4381 </header>
4382 <dots/>
4383 </diagram>
4384 <field id="MFF_TCP_SRC" title="TCP Source Port">
4385 Open vSwitch 1.6 added support for bitwise matching.
4386 </field>
4387 <field id="MFF_TCP_DST" title="TCP Destination Port">
4388 Open vSwitch 1.6 added support for bitwise matching.
4389 </field>
4390 <field id="MFF_TCP_FLAGS" title="TCP Flags">
4391 <p>
4392 This field holds the TCP flags. TCP currently defines 9 flag bits. An
4393 additional 3 bits are reserved. For more information, see [RFC 793],
4394 [RFC 3168], and [RFC 3540].
4395 </p>
4396
4397 <p>
4398 Matches on this field are most conveniently written in terms of
4399 symbolic names (given in the diagram below), each preceded by either
4400 <code>+</code> for a flag that must be set, or <code>-</code> for a
4401 flag that must be unset, without any other delimiters between the
4402 flags. Flags not mentioned are wildcarded. For example,
4403 <code>tcp,tcp_flags=+syn-ack</code> matches TCP SYNs that are not ACKs,
4404 and <code>tcp,tcp_flags=+[200]</code> matches TCP packets with the
4405 reserved [200] flag set. Matches can also be written as
4406 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
4407 and <var>mask</var> are 16-bit numbers in decimal or in hexadecimal
4408 prefixed by <code>0x</code>.
4409 </p>
4410
4411 <p>
4412 The flag bits are:
4413 </p>
4414
4415 <diagram>
5a0e4aec
BP
4416 <header>
4417 <bits name="zero" above="4" below="0" width=".9"/>
4418 </header>
4419 <nospace/>
4420 <header name="reserved">
4421 <bits name="[800]" above="1" width=".35"/>
4422 <bits name="[400]" above="1" width=".35"/>
4423 <bits name="[200]" above="1" width=".35"/>
4424 </header>
4425 <nospace/>
4426 <header name="later RFCs">
4427 <bits name="NS" above="1" width=".35"/>
4428 <bits name="CWR" above="1" width=".35"/>
4429 <bits name="ECE" above="1" width=".35"/>
4430 </header>
4431 <nospace/>
4432 <header name="RFC 793">
4433 <bits name="URG" above="1" width=".35"/>
4434 <bits name="ACK" above="1" width=".35"/>
4435 <bits name="PSH" above="1" width=".35"/>
4436 <bits name="RST" above="1" width=".35"/>
4437 <bits name="SYN" above="1" width=".35"/>
4438 <bits name="FIN" above="1" width=".35"/>
4439 </header>
96fee5e0
BP
4440 </diagram>
4441 </field>
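<p>
To make the symbolic syntax concrete, the Python sketch below converts a
specification such as <code>+syn-ack</code> into an equivalent
<code><var>flags</var>/<var>mask</var></code> pair. It is an illustration
under stated assumptions, not the parser used by Open vSwitch: the table
<code>TCP_FLAG_BITS</code> and the function <code>parse_tcp_flags</code> are
hypothetical names, lowercase flag names are assumed, and the numeric
<code><var>flags</var>/<var>mask</var></code> form is not handled.
</p>

```python
import re

# Bit values follow the diagram above, with FIN as the least-significant bit.
TCP_FLAG_BITS = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "psh": 0x008,
    "ack": 0x010, "urg": 0x020, "ece": 0x040, "cwr": 0x080,
    "ns": 0x100, "[200]": 0x200, "[400]": 0x400, "[800]": 0x800,
}

# One token: a sign followed by a flag name or a bracketed reserved bit.
_TOKEN = re.compile(r"([+-])(\[[0-9a-f]+\]|[a-z]+)")


def parse_tcp_flags(spec):
    """Return (flags, mask) for a symbolic spec such as '+syn-ack'."""
    flags = mask = 0
    pos = 0
    while pos < len(spec):
        m = _TOKEN.match(spec, pos)
        if m is None or m.group(2) not in TCP_FLAG_BITS:
            raise ValueError("bad TCP flag spec at offset %d" % pos)
        bit = TCP_FLAG_BITS[m.group(2)]
        if mask & bit:
            raise ValueError("flag %s given twice" % m.group(2))
        mask |= bit              # every mentioned flag is matched exactly
        if m.group(1) == "+":
            flags |= bit         # '+' requires the flag to be set
        pos = m.end()
    return flags, mask
```

<p>
With this sketch, <code>+syn-ack</code> becomes flags <code>0x002</code> with
mask <code>0x012</code>. Flags that are not mentioned stay out of the mask
and so remain wildcarded.
</p>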
4442
4443 <h2>UDP</h2>
4444
4445 <p>
4446 The following diagram shows UDP within IPv4. Open vSwitch also supports
4447 UDP in IPv6. Only UDP fields that Open vSwitch exposes as fields are
4448 shown:
4449 </p>
4450
4451 <diagram>
4452 <header name="Ethernet">
5a0e4aec
BP
4453 <bits name="dst" above="48" width="0.4"/>
4454 <bits name="src" above="48" width="0.4"/>
4455 <bits name="type" above="16" below="0x800" width="0.4"/>
96fee5e0
BP
4456 </header>
4457 <header name="IPv4">
5a0e4aec
BP
4458 <bits name="..." width="0.4"/>
4459 <bits name="proto" above="8" below="17" width="0.3"/>
4460 <bits name="src" above="32" width="0.4"/>
4461 <bits name="dst" above="32" width="0.4"/>
96fee5e0
BP
4462 </header>
4463 <header name="UDP">
5a0e4aec
BP
4464 <bits name="src" above="16" width=".2"/>
4465 <bits name="dst" above="16" width=".2"/>
4466 <bits name="..." width=".4"/>
96fee5e0
BP
4467 </header>
4468 <dots/>
4469 </diagram>
4470 <field id="MFF_UDP_SRC" title="UDP Source Port"/>
4471 <field id="MFF_UDP_DST" title="UDP Destination Port"/>
4472
4473 <h2>SCTP</h2>
4474
4475 <p>
4476 The following diagram shows SCTP within IPv4. Open vSwitch also supports
4477 SCTP in IPv6. Only SCTP fields that Open vSwitch exposes as fields are
4478 shown:
4479 </p>
4480
4481 <diagram>
4482 <header name="Ethernet">
5a0e4aec
BP
4483 <bits name="dst" above="48" width="0.4"/>
4484 <bits name="src" above="48" width="0.4"/>
4485 <bits name="type" above="16" below="0x800" width="0.4"/>
96fee5e0
BP
4486 </header>
4487 <header name="IPv4">
5a0e4aec
BP
4488 <bits name="..." width="0.4"/>
4489 <bits name="proto" above="8" below="132" width="0.3"/>
4490 <bits name="src" above="32" width="0.4"/>
4491 <bits name="dst" above="32" width="0.4"/>
96fee5e0
BP
4492 </header>
4493 <header name="SCTP">
5a0e4aec
BP
4494 <bits name="src" above="16" width=".2"/>
4495 <bits name="dst" above="16" width=".2"/>
4496 <bits name="..." width=".8"/>
96fee5e0
BP
4497 </header>
4498 <dots/>
4499 </diagram>
4500 <field id="MFF_SCTP_SRC" title="SCTP Source Port"/>
4501 <field id="MFF_SCTP_DST" title="SCTP Destination Port"/>
4502 </group>
4503
4504 <group title="Layer 4: ICMPv4 and ICMPv6">
4505 <h2>ICMPv4</h2>
4506 <diagram>
4507 <header name="Ethernet">
5a0e4aec
BP
4508 <bits name="dst" above="48" width="0.4"/>
4509 <bits name="src" above="48" width="0.4"/>
4510 <bits name="type" above="16" below="0x800" width="0.4"/>
96fee5e0
BP
4511 </header>
4512 <header name="IPv4">
5a0e4aec
BP
4513 <bits name="..." width="0.4"/>
4514 <bits name="proto" above="8" below="1" width="0.3"/>
4515 <bits name="src" above="32" width="0.4"/>
4516 <bits name="dst" above="32" width="0.4"/>
96fee5e0
BP
4517 </header>
4518 <header name="ICMPv4">
5a0e4aec
BP
4519 <bits name="type" above="8" width=".3"/>
4520 <bits name="code" above="8" width=".3"/>
4521 <bits name="..." width=".8"/>
96fee5e0
BP
4522 </header>
4523 <dots/>
4524 </diagram>
4525 <field id="MFF_ICMPV4_TYPE" title="ICMPv4 Type">
4526 <p>
4527 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4528 matches on <code>tp_src</code> as actually referring to the ICMP type.
4529 </p>
4530 </field>
4531 <field id="MFF_ICMPV4_CODE" title="ICMPv4 Code">
4532 <p>
4533 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4534 matches on <code>tp_dst</code> as actually referring to the ICMP code.
4535 </p>
4536 </field>
4537
4538 <h2>ICMPv6</h2>
4539 <diagram>
4540 <header name="Ethernet">
5a0e4aec
BP
4541 <bits name="dst" above="48" width="0.4"/>
4542 <bits name="src" above="48" width="0.4"/>
4543 <bits name="type" above="16" below="0x86dd" width="0.4"/>
96fee5e0
BP
4544 </header>
4545 <header name="IPv6">
5a0e4aec
BP
4546 <bits name="..." width="0.2"/>
4547 <bits name="next" above="8" below="58" width="0.3"/>
4548 <bits name="src" above="128" width="0.4"/>
4549 <bits name="dst" above="128" width="0.4"/>
96fee5e0
BP
4550 </header>
4551 <header name="ICMPv6">
5a0e4aec
BP
4552 <bits name="type" above="8" width=".3"/>
4553 <bits name="code" above="8" width=".3"/>
4554 <bits name="..." width=".8"/>
96fee5e0
BP
4555 </header>
4556 <dots/>
4557 </diagram>
4558 <field id="MFF_ICMPV6_TYPE" title="ICMPv6 Type"/>
4559 <field id="MFF_ICMPV6_CODE" title="ICMPv6 Code"/>
4560
4561 <h2>ICMPv6 Neighbor Discovery</h2>
4562 <diagram>
4563 <header name="Ethernet">
5a0e4aec
BP
4564 <bits name="dst" above="48" width="0.4"/>
4565 <bits name="src" above="48" width="0.4"/>
4566 <bits name="type" above="16" below="0x86dd" width="0.4"/>
96fee5e0
BP
4567 </header>
4568 <header name="IPv6">
5a0e4aec
BP
4569 <bits name="..." width="0.2"/>
4570 <bits name="next" above="8" below="58" width="0.3"/>
4571 <bits name="src" above="128" width="0.4"/>
4572 <bits name="dst" above="128" width="0.4"/>
96fee5e0
BP
4573 </header>
4574 <header name="ICMPv6">
5a0e4aec
BP
4575 <bits name="type" above="8" below="135/136" width=".3"/>
4576 <bits name="code" above="8" below="0" width=".3"/>
4577 <bits name="..." width=".8"/>
96fee5e0
BP
4578 </header>
4579 <header name="ICMPv6 ND">
5a0e4aec
BP
4580 <bits name="target" above="128" width=".4"/>
4581 <bits name="option ..." width=".6"/>
96fee5e0
BP
4582 </header>
4583 </diagram>
4584 <field id="MFF_ND_TARGET" title="ICMPv6 Neighbor Discovery Target IPv6"/>
4585 <field id="MFF_ND_SLL"
5a0e4aec 4586 title="ICMPv6 Neighbor Discovery Source Ethernet Address"/>
96fee5e0 4587 <field id="MFF_ND_TLL"
5a0e4aec 4588 title="ICMPv6 Neighbor Discovery Target Ethernet Address"/>
96fee5e0
BP
4589 </group>
4590
4591 <h1>References</h1>
4592
4593 <dl>
4594 <dt>Casado</dt>
4595 <dd>
4596 M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and
4597 S. Shenker, ``Ethane: Taking Control of the Enterprise,''
4598 Computer Communications Review, October 2007.
4599 </dd>
4600
7dc18ae9
WT
4601 <dt>ERSPAN</dt>
4602 <dd>
4603 M. Foschiano, K. Ghosh, M. Mehta, ``Cisco Systems' Encapsulated Remote
4604 Switch Port Analyzer (ERSPAN),'' <url
4605 href="https://tools.ietf.org/html/draft-foschiano-erspan-03"/>.
4606 </dd>
4607
96fee5e0
BP
4608 <dt>EXT-56</dt>
4609 <dd>
4610 J. Tonsing, ``Permit one of a set of prerequisites to apply, e.g. don't
4611 preclude non-Ethernet media,'' <url
4612 href="https://rs.opennetworking.org/bugs/browse/EXT-56"/> (ONF
4613 members only).
4614 </dd>
4615
4616 <dt>EXT-112</dt>
4617 <dd>
4618 J. Tourrilhes, ``Support non-Ethernet packets throughout the
4619 pipeline,'' <url
4620 href="https://rs.opennetworking.org/bugs/browse/EXT-112"/> (ONF
4621 members only).
4622 </dd>
4623
4624 <dt>EXT-134</dt>
4625 <dd>
4626 J. Tourrilhes, ``Match first nibble of the MPLS payload,'' <url
4627 href="https://rs.opennetworking.org/bugs/browse/EXT-134"/> (ONF
4628 members only).
4629 </dd>
4630
4631 <dt>Geneve</dt>
4632 <dd>
4633 J. Gross, I. Ganga, and T. Sridhar, editors, ``Geneve: Generic Network
4634 Virtualization Encapsulation,'' <url
4635 href="https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/"/>.
4636 </dd>
4637
4638 <dt>IEEE OUI</dt>
4639 <dd>
4640 IEEE Standards Association, ``MAC Address Block Large (MA-L),''
4641 <url
4642 href="https://standards.ieee.org/develop/regauth/oui/index.html"/>.
4643 </dd>
4644
4645 <dt>NSH</dt>
4646 <dd>
4647 P. Quinn and U. Elzur, editors, ``Network Service Header,'' <url
4648 href="https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/"/>.
4649 </dd>
4650
4651 <dt>OpenFlow 1.0.1</dt>
4652 <dd>
4653 Open Networking Foundation, ``OpenFlow Switch Errata, Version
4654 1.0.1,'' June 2012.
4655 </dd>
4656
4657 <dt>OpenFlow 1.1</dt>
4658 <dd>
4659 OpenFlow Consortium, ``OpenFlow Switch Specification Version
4660 1.1.0 Implemented (Wire Protocol 0x02),'' February 2011.
4661 </dd>
4662
4663 <dt>OpenFlow 1.5</dt>
4664 <dd>
4665 Open Networking Foundation, ``OpenFlow Switch Specification Version
4666 1.5.0 (Protocol version 0x06),'' December 2014.
4667 </dd>
4668
4669 <dt>OpenFlow Extensions 1.3.x Package 2</dt>
4670 <dd>
4671 Open Networking Foundation, ``OpenFlow Extensions 1.3.x Package 2,''
4672 December 2013.
4673 </dd>
4674
4675 <dt>TCP Flags Match Field Extension</dt>
4676 <dd>
4677 Open Networking Foundation, ``TCP flags match field Extension,'' December
4678 2014. In [OpenFlow Extensions 1.3.x Package 2].
4679 </dd>
4680
4681 <dt>Pepelnjak</dt>
4682 <dd>
4683 I. Pepelnjak, ``OpenFlow and Fermi Estimates,'' <url
4684 href="http://blog.ipspace.net/2013/09/openflow-and-fermi-estimates.html"/>.
4685 </dd>
4686
4687 <dt>RFC 793</dt>
4688 <dd>
4689 ``Transmission Control Protocol,'' <url
4690 href="http://www.ietf.org/rfc/rfc793.txt"/>.
4691 </dd>
4692
4693 <dt>RFC 3032</dt>
4694 <dd>
4695 E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci,
4696 T. Li, and A. Conta, ``MPLS Label Stack Encoding,'' <url
4697 href="http://www.ietf.org/rfc/rfc3032.txt"/>.
4698 </dd>
4699
4700 <dt>RFC 3168</dt>
4701 <dd>
4702 K. Ramakrishnan, S. Floyd, and D. Black, ``The Addition of Explicit
4703 Congestion Notification (ECN) to IP,'' <url href="https://tools.ietf.org/html/rfc3168"/>.
4704 </dd>
4705
4706 <dt>RFC 3540</dt>
4707 <dd>
4708 N. Spring, D. Wetherall, and D. Ely, ``Robust Explicit Congestion
4709 Notification (ECN) Signaling with Nonces,'' <url
4710 href="https://tools.ietf.org/html/rfc3540"/>.
4711 </dd>
4712
4713 <dt>RFC 4632</dt>
4714 <dd>
4715 V. Fuller and T. Li, ``Classless Inter-domain Routing (CIDR): The
4716 Internet Address Assignment and Aggregation Plan,'' <url
4717 href="https://tools.ietf.org/html/rfc4632"/>.
4718 </dd>
4719
4720 <dt>RFC 5462</dt>
4721 <dd>
4722 L. Andersson and R. Asati, ``Multiprotocol Label Switching
4723 (MPLS) Label Stack Entry: ``EXP'' Field Renamed to ``Traffic
4724 Class'' Field,'' <url
4725 href="http://www.ietf.org/rfc/rfc5462.txt"/>.
4726 </dd>
4727
4728 <dt>RFC 6830</dt>
4729 <dd>
4730 D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, ``The
4731 Locator/ID Separation Protocol (LISP),'' <url
4732 href="http://www.ietf.org/rfc/rfc6830.txt"/>.
4733 </dd>
4734
4735 <dt>RFC 7348</dt>
4736 <dd>
4737 M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar,
4738 M. Bursell, and C. Wright, ``Virtual eXtensible Local Area Network
4739 (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over
4740 Layer 3 Networks,'' <url href="https://tools.ietf.org/html/rfc7348"/>.
4741 </dd>
4742
17553f27
YY
4743 <dt>RFC 7665</dt>
4744 <dd>
4745 J. Halpern, Ed. and C. Pignataro, Ed.,
4746 ``Service Function Chaining (SFC) Architecture,''
4747 <url href="https://tools.ietf.org/html/rfc7665"/>.
4748 </dd>
4749
96fee5e0
BP
4750 <dt>Srinivasan</dt>
4751 <dd>
4752 V. Srinivasan, S. Suri, and G. Varghese, ``Packet
4753 Classification using Tuple Space Search,'' SIGCOMM 1999.
4754 </dd>
4755
4756 <dt>Pagiamtzis</dt>
4757 <dd>
4758 K. Pagiamtzis and A. Sheikholeslami, ``Content-addressable
4759 memory (CAM) circuits and architectures: A tutorial and
4760 survey,'' IEEE Journal of Solid-State Circuits, vol. 41, no. 3,
4761 pp. 712-727, March 2006.
4762 </dd>
4763
4764 <dt>VXLAN Group Policy Option</dt>
4765 <dd>
4766 M. Smith and L. Kreeger, ``VXLAN Group Policy Option,'' Internet-Draft,
4767 <url href="https://tools.ietf.org/html/draft-smith-vxlan-group-policy"/>.
4768 </dd>
4769 </dl>
4770
4771 <h1>Authors</h1>
4772
4773 <p>
4774 Ben Pfaff, with advice from Justin Pettit and Jean Tourrilhes.
4775 </p>
4776
4777</fields>
4778
4779<!--
4780 OXM fields not yet supported Future Directions References/See Also
4781 OXM fields required by various versions and by the "Conformance Test Specification for OpenFlow Switch Specification 1.0.1"
4782-->