1 <?xml version="1.0" encoding="utf-8"?>
2 <fields>
3 <h1>Introduction</h1>
4
5 <p>
6 This document aims to comprehensively document all of the fields,
7 both standard and non-standard, supported by OpenFlow or Open
8 vSwitch, regardless of origin.
9 </p>
10
11 <h2>Fields</h2>
12
13 <p>
14 A <dfn>field</dfn> is a property of a packet. Most familiarly, <dfn>data
15 fields</dfn> are fields that can be extracted from a packet. Most data
16 fields are copied directly from protocol headers, e.g. at layer 2, the
17 Ethernet source and destination addresses, or the VLAN ID; at layer 3, the
18 IPv4 or IPv6 source and destination; and at layer 4, the TCP or UDP ports.
19 Other data fields are computed, e.g. <ref field="ip_frag"/> describes
20 whether a packet is a fragment but it is not copied directly from the IP
21 header.
22 </p>
23
24 <p>
25 Data fields that are always present as a consequence of the basic
26 networking technology in use are called <dfn>root fields</dfn>.
27 Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
28 and this remains the default mode of operation for Open vSwitch bridges.
29 When a packet is received from a non-Ethernet interface, such as a layer-3
30 LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
31 Ethernet-centric point of view by pretending that an Ethernet header is
32 present whose Ethernet type indicates the packet's actual type (and
33 whose source and destination addresses are all-zero).
34 </p>
35
36 <p>
37 Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
38 concept introduced in OpenFlow 1.5. Such a pipeline does not have any root
39 fields. Instead, a new metadata field, <ref field="packet_type"/>,
40 indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
41 or another type. For backward compatibility, by default Open vSwitch 2.8
42 imitates the behavior of Open vSwitch 2.7 and earlier. Later versions of
43 Open vSwitch may change the default, and in the meantime controllers can
44 turn off this legacy behavior, on a port-by-port basis, by setting
45 <code>options:packet_type</code> to <code>ptap</code> in the
46 <code>Interface</code> table. This is significant only for ports that can
47 handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
48 GRE tunnel ports. See <code>ovs-vswitchd.conf.db</code>(5) for more
49 information.
50 </p>
51
52 <p>
53 Non-root data fields are not always present. A packet contains ARP
54 fields, for example, only when its packet type is ARP or when it is an
55 Ethernet packet whose Ethernet header indicates the Ethertype for ARP,
56 0x0806. In this documentation, we say that a field is
57 <dfn>applicable</dfn> when it is present in a packet, and
58 <dfn>inapplicable</dfn> when it is not. (These are not standard terms.)
59 We refer to the conditions that determine whether a field is applicable as
60 <dfn>prerequisites</dfn>. Some VLAN-related fields are a special case:
61 these fields are always applicable for Ethernet packets, but have a
62 designated value or bit that indicates whether a VLAN header is present,
63 with the remaining values or bits indicating the VLAN header's content
64 (if it is present). <!-- XXX also ethertype -->
65 </p>
66
67 <p>
68 An inapplicable field does not have a value, not even a nominal
69 ``value'' such as all-zero-bits. In many circumstances, OpenFlow
70 and Open vSwitch allow references only to applicable fields. For
71 example, one may match (see <cite>Matching</cite>, below) a given
72 field only if the match includes the field's prerequisite,
73 e.g. matching an ARP field is only allowed if one also matches on
74 Ethertype 0x0806 or the <ref field="packet_type"/> for ARP in a packet
75 type-aware bridge.
76 </p>
77
78 <p>
79 Sometimes a packet may contain multiple instances of a header.
80 For example, a packet may contain multiple VLAN or MPLS headers,
81 and tunnels can cause any data field to recur. OpenFlow and Open
82 vSwitch do not address these cases uniformly. For VLAN and MPLS
83 headers, only the outermost header is accessible, so that inner
84 headers may be accessed only by ``popping'' (removing) the outer
85 header. (Open vSwitch supports only a single VLAN header in any
86 case.) For tunnels, e.g. GRE or VXLAN, the outer header and inner
87 headers are treated as different data fields.
88 </p>
89
90 <p>
91 Many network protocols are built in layers as a stack of concatenated
92 headers. Each header typically contains a ``next type'' field that
93 indicates the type of the protocol header that follows, e.g. Ethernet
94 contains an Ethertype and IPv4 contains an IP protocol type. The
95 exceptional cases, where protocols are layered but an outer layer does not
96 indicate the protocol type for the inner layer, or gives only an ambiguous
97 indication, are troublesome. An MPLS header, for example, only indicates
98 whether another MPLS header or some other protocol follows, and in the
99 latter case the inner protocol must be known from the context. In these
100 exceptional cases, OpenFlow and Open vSwitch cannot provide insight into
101 the inner protocol data fields without additional context, and thus they
102 treat all later data fields as inapplicable until an OpenFlow action
103 explicitly specifies what protocol follows. In the case of MPLS, the
104 OpenFlow ``pop MPLS'' action that removes the last MPLS header from a
105 packet provides this context, as the Ethertype of the payload. See
106 <cite>Layer 2.5: MPLS</cite> for more information.
107 </p>
108
109 <p>
110 OpenFlow and Open vSwitch support some fields other than data
111 fields. <dfn>Metadata fields</dfn> relate to the origin or
112 treatment of a packet, but they are not extracted from the packet
113 data itself. One example is the physical port on which a packet
114 arrived at the switch. <dfn>Register fields</dfn> act like
115 variables: they give an OpenFlow switch space for temporary
116 storage while processing a packet. Existing metadata and register
117 fields have no prerequisites.
118 </p>
119
120 <p>
121 A field's value consists of an integral number of bytes. For data
122 fields, sometimes those bytes are taken directly from the packet.
123 Other data fields are copied from a packet with padding (usually
124 with zeros and in the most significant positions). The remaining
125 data fields are transformed in other ways as they are copied from
126 the packets, to make them more useful for matching.
127 </p>
128
129 <h2>Matching</h2>
130
131 <p>
132 The most important use of fields in OpenFlow is
133 <dfn>matching</dfn>, to determine whether particular field values
134 agree with a set of constraints called a <dfn>match</dfn>. A
135 match consists of zero or more constraints on individual fields,
136 all of which must be met to satisfy the match. (A match that
137 contains no constraints is always satisfied.) OpenFlow and Open
138 vSwitch support a number of forms of matching on individual
139 fields:
140 </p>
141
142 <dl>
143 <dt><dfn>Exact match</dfn>, e.g. <code>nw_src=10.1.2.3</code></dt>
144 <dd>
145 <p>
146 Only a particular value of the field is matched; for example, only one
147 particular source IP address. Exact matches are written as
148 <code><var>field</var>=<var>value</var></code>. The forms accepted for
149 <var>value</var> depend on the field.
150 </p>
151
152 <p>
153 All fields support exact matches.
154 </p>
155 </dd>
156
157 <dt>
158 <dfn>Bitwise match</dfn>, e.g. <code>nw_src=10.1.0.0/255.255.0.0</code>
159 </dt>
160 <dd>
161 <p>
162 Specific bits in the field must have specified values; for example,
163 only source IP addresses in a particular subnet. Bitwise matches are
164 written as
165 <code><var>field</var>=<var>value</var>/<var>mask</var></code>, where
166 <var>value</var> and <var>mask</var> take one of the forms accepted for
167 an exact match on <var>field</var>. Some fields accept other forms for
168 bitwise matches; for example, <code>nw_src=10.1.0.0/255.255.0.0</code>
169 may also be written <code>nw_src=10.1.0.0/16</code>.
170 </p>
171
172 <p>
173 Most OpenFlow switches do not allow bitwise matching on every
174 field (and before OpenFlow 1.2, the protocol did not even provide for
175 the possibility for most fields). Even switches that do allow bitwise
176 matching on a given field may restrict the masks that are allowed, e.g.
177 by allowing matches only on contiguous sets of bits starting from the
178 most significant bit, that is, ``CIDR'' masks [RFC 4632]. Open vSwitch
179 does not allow bitwise matching on every field, but it allows
180 arbitrary bitwise masks on any field that does support bitwise
181 matching. (Older versions had some restrictions, as documented in the
182 descriptions of individual fields.)
183 </p>
184 </dd>
185
186 <dt><dfn>Wildcard</dfn>, e.g. ``any <code>nw_src</code>''</dt>
187 <dd>
188 <p>
189 The value of the field is not constrained. Wildcarded fields may be
190 written as <code><var>field</var>=*</code>, although it is unusual to
191 mention them at all. (When specifying a wildcard explicitly in a
192 command invocation, be sure to use quoting to protect against shell
193 expansion.)
194 </p>
195
196 <p>
197 There is a tiny difference between wildcarding a field and not
198 specifying any match on a field: wildcarding a field requires
199 satisfying the field's prerequisites.
200 </p>
201 </dd>
202 </dl>
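<p>
The three forms above can all be modeled as a value paired with a mask. The
following is a minimal sketch in Python, not Open vSwitch code: the names
<code>FieldMatch</code>, <code>matches</code>, and <code>ip</code> are
hypothetical and exist only for illustration.
</p>

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMatch:
    """One constraint on a field: bits set in `mask` must equal `value`."""
    value: int
    mask: int

    def matches(self, field_value: int) -> bool:
        # Only the bits selected by the mask take part in the comparison.
        return ((field_value ^ self.value) & self.mask) == 0


def ip(text: str) -> int:
    """Convert dotted-quad notation to a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


exact = FieldMatch(ip("10.1.2.3"), 0xFFFFFFFF)           # nw_src=10.1.2.3
bitwise = FieldMatch(ip("10.1.0.0"), ip("255.255.0.0"))  # nw_src=10.1.0.0/16
wildcard = FieldMatch(0, 0)                              # nw_src=*
```

<p>
In this model an exact match is a bitwise match whose mask is all-ones, and a
wildcard is a bitwise match whose mask is all-zeros. The sketch ignores
prerequisites, which is where wildcarding a field and omitting it differ.
</p>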
203
204 <p>
205 Some types of matches on individual fields cannot be expressed directly
206 with OpenFlow and Open vSwitch. These can be expressed indirectly:
207 </p>
208
209 <dl>
210 <dt><dfn>Set match</dfn>, e.g. ``<code>tcp_dst</code> ∈ {80, 443,
211 8080}''</dt>
212 <dd>
213 <p>
214 The value of a field is one of a specified set of values; for
215 example, the TCP destination port is 80, 443, or 8080.
216 </p>
217
218 <p>
219 For matches used in flows (see <cite>Flows</cite>, below), multiple
220 flows can simulate set matches.
221 </p>
222 </dd>
223
224 <dt><dfn>Range match</dfn>, e.g. ``1000 ≤ <code>tcp_dst</code> ≤
225 1999''</dt>
226 <dd>
227 <p>
228 The value of the field must lie within a numerical range, for
229 example, TCP destination ports between 1000 and 1999.
230 </p>
231
232 <p>
233 Range matches can be expressed as a collection of bitwise matches. For
234 example, suppose that the goal is to match TCP source ports 1000 to
235 1999, inclusive. The binary representations of 1000 and 1999 are:
236 </p>
237
238 <pre fixed="yes">
239 01111101000
240 11111001111
241 </pre>
242
243 <p>
244 The following series of bitwise matches will match 1000 and
245 1999 and all the values in between:
246 </p>
247
248 <pre fixed="yes">
249 01111101xxx
250 0111111xxxx
251 10xxxxxxxxx
252 110xxxxxxxx
253 1110xxxxxxx
254 11110xxxxxx
255 1111100xxxx
256 </pre>
257
258 <p>
259 which can be written as the following matches:
260 </p>
261
262 <pre>
263 tcp,tp_src=0x03e8/0xfff8
264 tcp,tp_src=0x03f0/0xfff0
265 tcp,tp_src=0x0400/0xfe00
266 tcp,tp_src=0x0600/0xff00
267 tcp,tp_src=0x0700/0xff80
268 tcp,tp_src=0x0780/0xffc0
269 tcp,tp_src=0x07c0/0xfff0
270 </pre>
271 </dd>
272
273 <dt><dfn>Inequality match</dfn>, e.g. ``<code>tcp_dst</code> ≠ 80''</dt>
274 <dd>
275 <p>
276 The value of the field differs from a specified value, for
277 example, all TCP destination ports except 80.
278 </p>
279
280 <p>
281 An inequality match on an <var>n</var>-bit field can be expressed as a
282 disjunction of <var>n</var> 1-bit matches. For example, the inequality
283 match ``<code>vlan_pcp</code> ≠ 5'' can be expressed as
284 ``<code>vlan_pcp</code> = 0/4 or <code>vlan_pcp</code> = 2/2 or
285 <code>vlan_pcp</code> = 0/1.'' For matches used in flows (see
286 <cite>Flows</cite>, below), sometimes one can more compactly express
287 inequality as a higher-priority flow that matches the exceptional case
288 paired with a lower-priority flow that matches the general case.
289 </p>
290
291 <p>
292 Alternatively, an inequality match may be converted to a pair of range
293 matches, e.g. <code>tcp_src ≠ 80</code> may be expressed as ``0 ≤
294 <code>tcp_src</code> &lt; 80 or 80 &lt; <code>tcp_src</code> ≤ 65535'',
295 and then each range match may in turn be converted to a bitwise match.
296 </p>
297 </dd>
298
299 <dt><dfn>Conjunctive match</dfn>, e.g. ``<code>tcp_src</code> ∈ {80, 443, 8080} and <code>tcp_dst</code> ∈ {80, 443, 8080}''</dt>
300 <dd>
301 As an OpenFlow extension, Open vSwitch supports matching on conditions on
302 conjunctions of the previously mentioned forms of matching. See the
303 documentation for <ref field="conj_id"/> for more information.
304 </dd>
305 </dl>
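<p>
The range-match decomposition described above can be computed mechanically.
The following Python sketch shows one way to do so, under the assumption of a
16-bit field; <code>range_to_bitwise</code> is a hypothetical helper, not part
of Open vSwitch. It repeatedly takes the largest power-of-2 block that is
aligned at the low end of the remaining range and does not extend past the
high end.
</p>

```python
def range_to_bitwise(lo, hi, width=16):
    """Cover the inclusive range [lo, hi] with (value, mask) pairs."""
    full = (1 << width) - 1
    matches = []
    while lo <= hi:
        # Largest block size for which `lo` is aligned (lowest set bit of lo).
        size = (lo & -lo) if lo else (1 << width)
        # Shrink the block until it no longer overshoots `hi`.
        while lo + size - 1 > hi:
            size >>= 1
        # Bits below the block size are wildcarded; the rest must equal `lo`.
        matches.append((lo, full & ~(size - 1)))
        lo += size
    return matches


for value, mask in range_to_bitwise(1000, 1999):
    print("tcp,tp_src=0x%04x/0x%04x" % (value, mask))
```

<p>
For the range 1000 to 1999 this yields the same seven value/mask pairs that
are listed in the range-match example.
</p>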
306
307 <p>
308 All of these supported forms of matching are special cases of bitwise
309 matching. In some cases this influences the design of field values. <ref
310 field="ip_frag"/> is the most prominent example: it is designed to make all
311 of the practically useful checks for IP fragmentation possible as a single
312 bitwise match.
313 </p>
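<p>
As an illustration of how a non-bitwise form reduces to bitwise matches, the
inequality match described earlier can be generated as follows. This is a
sketch only; <code>not_equal</code> is a hypothetical name.
</p>

```python
def not_equal(value, width):
    """Return (value, mask) pairs; a field differs from `value`
    exactly when at least one of the pairs matches it."""
    pairs = []
    for bit in reversed(range(width)):
        mask = 1 << bit
        # Require this single bit to be the opposite of the bit in `value`.
        pairs.append((~value & mask, mask))
    return pairs


# vlan_pcp is a 3-bit field; this prints 0/4, 2/2, and 0/1.
for v, m in not_equal(5, 3):
    print("vlan_pcp=%d/%d" % (v, m))
```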
314
315 <h3>Shorthands</h3>
316
317 <p>
318 Some matches are very commonly used, so Open vSwitch accepts shorthand
319 notations. In some cases, Open vSwitch also uses shorthand notations when
320 it displays matches. The following shorthands are defined, with their long
321 forms shown on the right side:
322 </p>
323
324 <dl>
325 <dt><code>eth</code></dt>
326 <dd><code>packet_type=(0,0)</code> (Open vSwitch 2.8 and later)</dd>
327 <dt><code>ip</code></dt> <dd><code>eth_type=0x0800</code></dd>
328 <dt><code>ipv6</code></dt> <dd><code>eth_type=0x86dd</code></dd>
329 <dt><code>icmp</code></dt> <dd><code>eth_type=0x0800,ip_proto=1</code></dd>
330 <dt><code>icmp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=58</code></dd>
331 <dt><code>tcp</code></dt> <dd><code>eth_type=0x0800,ip_proto=6</code></dd>
332 <dt><code>tcp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=6</code></dd>
333 <dt><code>udp</code></dt> <dd><code>eth_type=0x0800,ip_proto=17</code></dd>
334 <dt><code>udp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=17</code></dd>
335 <dt><code>sctp</code></dt> <dd><code>eth_type=0x0800,ip_proto=132</code></dd>
336 <dt><code>sctp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=132</code></dd>
337 <dt><code>arp</code></dt> <dd><code>eth_type=0x0806</code></dd>
338 <dt><code>rarp</code></dt> <dd><code>eth_type=0x8035</code></dd>
339 <dt><code>mpls</code></dt> <dd><code>eth_type=0x8847</code></dd>
340 <dt><code>mplsm</code></dt> <dd><code>eth_type=0x8848</code></dd>
341 </dl>
342
343
344 <h2>Evolution of OpenFlow Fields</h2>
345
346 <p>
347 The discussion so far applies to all OpenFlow and Open vSwitch
348 versions. This section starts to draw in specific information by
349 explaining, in broad terms, the treatment of fields and matches in
350 each OpenFlow version.
351 </p>
352
353 <h3>OpenFlow 1.0</h3>
354
355 <p>
356 OpenFlow 1.0 defined the OpenFlow protocol format of a match as a
357 fixed-length data structure that could match on the following
358 fields:
359 </p>
360
361 <ul>
362 <li>Ingress port.</li>
363 <li>Ethernet source and destination MAC.</li>
364 <li>Ethertype (with a special value to match frames that lack an
365 Ethertype).</li>
366 <li>VLAN ID and priority.</li>
367 <li>IPv4 source, destination, protocol, and DSCP.</li>
368 <li>TCP source and destination port.</li>
369 <li>UDP source and destination port.</li>
370 <li>ICMPv4 type and code.</li>
371 <li>ARP IPv4 addresses (SPA and TPA) and opcode.</li>
372 </ul>
373
374 <p>
375 Each supported field corresponded to some member of the data
376 structure. Some members represented multiple fields, in the case
377 of the TCP, UDP, ICMPv4, and ARP fields whose presence is mutually
378 exclusive. This also meant that some members were poor fits for
379 their fields: only the low 8 bits of the 16-bit ARP opcode could
380 be represented, and the ICMPv4 type and code were padded with 8 bits
381 of zeros to fit in the 16-bit members primarily meant for TCP and
382 UDP ports. An additional bitmap member indicated, for each
383 member, whether its field should be an ``exact'' or ``wildcarded''
384 match (see <cite>Matching</cite>), with additional support for
385 CIDR prefix matching on the IPv4 source and destination fields.
386 </p>
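<p>
To make the fixed-length layout concrete, the following Python sketch packs an
ICMPv4 match into a 40-byte structure. The member order and the wildcard bit
positions are reproduced from memory of <code>struct ofp_match</code> in the
OpenFlow 1.0 specification and should be checked against it;
<code>icmp_match</code> is a hypothetical helper.
</p>

```python
import struct

# Assumed member order, in network byte order:
# wildcards, in_port, dl_src, dl_dst, dl_vlan, dl_vlan_pcp, pad,
# dl_type, nw_tos, nw_proto, pad[2], nw_src, nw_dst, tp_src, tp_dst
OFP10_MATCH = struct.Struct("!IH6s6sHBxHBB2xIIHH")

# Assumed wildcard bits: a 1-bit means "ignore this member".
OFPFW_DL_TYPE = 1 << 4
OFPFW_NW_PROTO = 1 << 5
OFPFW_TP_SRC = 1 << 6
OFPFW_TP_DST = 1 << 7
OFPFW_ALL = (1 << 22) - 1


def icmp_match(icmp_type, icmp_code):
    """Match ICMPv4 type and code exactly; wildcard everything else."""
    exact = OFPFW_DL_TYPE | OFPFW_NW_PROTO | OFPFW_TP_SRC | OFPFW_TP_DST
    return OFP10_MATCH.pack(
        OFPFW_ALL & ~exact,
        0, bytes(6), bytes(6), 0, 0,  # in_port, dl_src, dl_dst, dl_vlan, pcp
        0x0800, 0, 1,                 # dl_type=IPv4, nw_tos, nw_proto=ICMP
        0, 0,                         # nw_src, nw_dst
        icmp_type, icmp_code,         # 8-bit values padded into 16-bit members
    )
```

<p>
The last two members are the ones shared among TCP, UDP, and ICMPv4, which is
why the 8-bit ICMPv4 type and code end up zero-padded to 16 bits.
</p>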
387
388 <p>
389 Simplicity was recognized early on as the main virtue of this
390 approach. Obviously, any fixed-length data structure cannot
391 support matching new protocols that do not fit. There was no
392 room, for example, for matching IPv6 fields, which was not a
393 priority at the time. Lack of room to support matching the
394 Ethernet addresses inside ARP packets actually caused more of a
395 design problem later, leading to an Open vSwitch extension action
396 specialized for dropping ``spoofed'' ARP packets in which the
397 frame and ARP Ethernet source addresses differed. (This extension
398 was never standardized. Open vSwitch dropped support for it a few
399 releases after it added support for full ARP matching.)
400 </p>
401
402 <p>
403 The design of the OpenFlow fixed-length matches also illustrates
404 compromises, in both directions, between the strengths and
405 weaknesses of software and hardware that have always influenced
406 the design of OpenFlow. Support for matching ARP fields that do
407 fit in the data structure was only added late in the design
408 process (and remained optional in OpenFlow 1.0), for example,
409 because common switch ASICs did not support matching these fields.
410 </p>
411
412 <p>
413 The compromises in favor of software occurred for more complicated
414 reasons. The OpenFlow designers did not know how to implement
415 matching in software that was fast, dynamic, and general. (A way
416 was later found [Srinivasan].) Thus, the designers sought to
417 support dynamic, general matching that would be fast in realistic
418 special cases, in particular when all of the matches were
419 <dfn>microflows</dfn>, that is, matches that specify every field
420 present in a packet, because such matches can be implemented as a
421 single hash table lookup. Contemporary research supported the
422 feasibility of this approach: the number of microflows in a campus
423 network had been measured to peak at about 10,000 [Casado, section
424 3.2]. (Calculations show that this can only be true in a lightly
425 loaded network [Pepelnjak].)
426 </p>
427
428 <p>
429 As a result, OpenFlow 1.0 required switches to treat microflow
430 matches as the highest possible priority. This let software
431 switches perform the microflow hash table lookup first. Only on
432 failure to match a microflow did the switch need to fall back to
433 checking the more general and presumed slower matches. Also, the
434 OpenFlow 1.0 flow match was minimally flexible, with no support
435 for general bitwise matching, partly on the basis that this seemed
436 more likely amenable to relatively efficient software
437 implementation. (CIDR masking for IPv4 addresses was added
438 relatively late in the OpenFlow 1.0 design process.)
439 </p>
440
441 <p>
442 Microflow matching was later discovered to aid some hardware
443 implementations. The TCAM chips used for matching in hardware do
444 not support priority in the same way as OpenFlow but instead tie
445 priority to ordering [Pagiamtzis]. Thus, adding a new match with
446 a priority between the priorities of existing matches can require
447 reordering an arbitrary number of TCAM entries. On the other
448 hand, when microflows are highest priority, they can be managed as
449 a set-aside portion of the TCAM entries.
450 </p>
451
452 <p>
453 The emphasis on matching microflows also led designers to
454 carefully consider the bandwidth requirements between switch and
455 controller: to maximize the number of microflow setups per second,
456 one must minimize the size of each flow's description. This
457 favored the fixed-length format in use, because it expressed
458 common TCP and UDP microflows in fewer bytes than more flexible
459 ``type-length-value'' (TLV) formats. (Early versions of OpenFlow
460 also avoided TLVs in general to head off protocol fragmentation.)
461 </p>
462
463 <h4>Inapplicable Fields</h4>
464
465 <p>
466 OpenFlow 1.0 does not clearly specify how to treat inapplicable
467 fields. The members for inapplicable fields are always present in
468 the match data structure, as are the bits that indicate whether
469 the fields are matched, and the ``correct'' member and bit values
470 for inapplicable fields are unclear. OpenFlow 1.0 implementations
471 changed their behavior over time as priorities shifted. The early
472 OpenFlow reference implementation, motivated to make every flow a
473 microflow to enable hashing, treated inapplicable fields as exact
474 matches on a value of 0. Initially, this behavior was implemented
475 in the reference controller only.
476 </p>
477
478 <p>
479 Later, the reference switch was also changed to actually force any
480 wildcarded inapplicable fields into exact matches on 0. The
481 latter behavior sometimes caused problems, because the modified
482 flow was the one reported back to the controller later when it
483 queried the flow table, and the modifications sometimes meant that
484 the controller could not properly recognize the flow that it had
485 added. In retrospect, perhaps this problem should have alerted
486 the designers to a design error, but the ability to use a single
487 hash table was held to be more important than almost every other
488 consideration at the time.
489 </p>
490
491 <p>
492 When more flexible match formats were introduced much later, they
493 disallowed any mention of inapplicable fields as part of a match.
494 This raised the question of how to translate between this new
495 format and the OpenFlow 1.0 fixed format. It seemed somewhat
496 inconsistent and backward to treat fields as exact-match in one
497 format and forbid matching them in the other, so instead the
498 treatment of inapplicable fields in the fixed-length format was
499 changed from exact match on 0 to wildcarding. (A better
500 classifier had by now eliminated software performance problems
501 with wildcards.)
502 </p>
503
504 <p>
505 The OpenFlow 1.0.1 errata (released only in 2012) added some
506 additional explanation [OpenFlow 1.0.1, section 3.4], but it did
507 not mandate specific behavior because of variation among
508 implementations.
509 </p>
510
511 <h3>OpenFlow 1.1</h3>
512
513 <p>
514 The OpenFlow 1.1 protocol match format was designed as a type/length/value
515 (TLV) format to allow for future flexibility. The specification
516 standardized only a single type <code>OFPMT_STANDARD</code> (0) with a
517 fixed-size payload, described here. The additional fields and bitwise
518 masks in OpenFlow 1.1 cause this match structure to be over twice as large
519 as in OpenFlow 1.0, 88 bytes versus 40.
520 </p>
521
522 <p>
523 OpenFlow 1.1 added support for the following fields:
524 </p>
525
526 <ul>
527 <li>SCTP source and destination port.</li>
528 <li>MPLS label and traffic control (TC) fields.</li>
529 <li>One 64-bit register (named ``metadata'').</li>
530 </ul>
531
532 <p>
533 OpenFlow 1.1 increased the width of the ingress port number field (and all
534 other port numbers in the protocol) from 16 bits to 32 bits.
535 </p>
536
537 <p>
538 OpenFlow 1.1 increased matching flexibility by introducing
539 arbitrary bitwise matching on Ethernet and IPv4 address fields and
540 on the new ``metadata'' register field. Switches were not
541 required to support all possible masks [OpenFlow 1.1, section
542 4.3].
543 </p>
544
545 <p>
546 By a strict reading of the specification, OpenFlow 1.1 removed
547 support for matching ICMPv4 type and code [OpenFlow 1.1, section
548 A.2.3], but this is likely an editing error because ICMP
549 matching is described elsewhere [OpenFlow 1.1, Table 3, Table 4,
550 Figure 4]. Open vSwitch does support ICMPv4 type and code
551 matching with OpenFlow 1.1.
552 </p>
553
554 <p>
555 OpenFlow 1.1 avoided the pitfalls of inapplicable fields that
556 OpenFlow 1.0 encountered, by requiring the switch to ignore the
557 specified field values [OpenFlow 1.1, section A.2.3]. It also
558 implied that the switch should ignore the bits that indicate
559 whether to match inapplicable fields.
560 </p>
561
562 <h4>Physical Ingress Port</h4>
563
564 <p>
565 OpenFlow 1.1 introduced a new pseudo-field, the physical ingress port. The
566 physical ingress port is only a pseudo-field because it cannot be used for
567 matching. It appears in only one place in the protocol, in the ``packet-in''
568 message that passes a packet received at the switch to an OpenFlow
569 controller.
570 </p>
571
572 <p>
573 A packet's ingress port and physical ingress port are identical except for
574 packets processed by a switch feature such as bonding or tunneling that
575 makes a packet appear to arrive on a ``virtual'' port associated with the
576 bond or the tunnel. For such packets, the ingress port is the virtual port
577 and the physical ingress port is, naturally, the physical port. Open
578 vSwitch implements both bonding and tunneling, but its bonding
579 implementation does not use virtual ports and its tunnels are typically not
580 on the same OpenFlow switch as their physical ingress ports (which need not
581 be part of any switch), so the ingress port and physical ingress port are
582 always the same in Open vSwitch.
583 </p>
584
585 <h3>OpenFlow 1.2</h3>
586
587 <p>
588 OpenFlow 1.2 abandoned the fixed-length approach to matching. One reason
589 was size, since adding support for IPv6 address matching (now seen as
590 important), with bitwise masks, would have added 64 bytes to the match
591 length, increasing it from 88 bytes in OpenFlow 1.1 to over 150 bytes.
592 Extensibility had also become important as controller writers increasingly
593 wanted support for new fields without having to change messages throughout
594 the OpenFlow protocol. The challenges of carefully defining fixed-length
595 matches to avoid problems with inapplicable fields had also become clear
596 over time.
597 </p>
598
599 <p>
600 Therefore, OpenFlow 1.2 adopted a flow format using a flexible
601 type-length-value (TLV) representation, in which each TLV expresses a match
602 on one field. These TLVs were in turn encapsulated inside the outer TLV
603 wrapper introduced in OpenFlow 1.1 with the new identifier
604 <code>OFPMT_OXM</code> (1). (This wrapper fulfilled its intended purpose
605 of reducing the amount of churn in the protocol when changing match
606 formats; some messages that included matches remained unchanged from
607 OpenFlow 1.1 to 1.2 and later versions.)
608 </p>
609
610 <p>
611 OpenFlow 1.2 added support for the following fields:
612 </p>
613
614 <ul>
615 <li>ARP hardware addresses (SHA and THA).</li>
616 <li>IPv4 ECN.</li>
617 <li>IPv6 source and destination addresses, flow label, DSCP, ECN,
618 and protocol.</li>
619 <li>TCP, UDP, and SCTP port numbers when encapsulated inside IPv6.</li>
620 <li>ICMPv6 type and code.</li>
621 <li>ICMPv6 Neighbor Discovery target address and source and target
622 Ethernet addresses.</li>
623 </ul>
624
625 <!-- mention tun_id_from_cookie extension? -->
626
627 <p>
628 The OpenFlow 1.2 format, called <dfn>OXM</dfn> (<dfn>OpenFlow Extensible
629 Match</dfn>), was modeled closely on an extension to OpenFlow 1.0
630 introduced in Open vSwitch 1.1 called <dfn>NXM</dfn> (<dfn>Nicira Extended
631 Match</dfn>). Each OXM or NXM TLV has the following format:
632 </p>
633
634 <diagram>
635 <header name="type">
636 <bits name="vendor/class" above="16" width=".75"/>
637 <bits name="field" above="7" width=".4"/>
638 </header>
639 <nospace/>
640 <header name="">
641 <bits name="HM" above="1" width=".25"/>
642 <bits name="length" above="8" width=".4"/>
643 </header>
644 <header name="">
645 <bits name="body" above="length bytes" width="1.7"/>
646 </header>
647 </diagram>
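<p>
The 32-bit header in the diagram can be packed and unpacked with a few shifts.
The Python sketch below assumes only the bit widths shown above (16, 7, 1, and
8 bits); the function names are hypothetical, and the field number 11 used in
the example (understood here to be the IPv4 source address in class 0x8000) is
an assumption to be checked against the OpenFlow specification.
</p>

```python
import struct


def oxm_header(oxm_class, field, hasmask, length):
    """Build the 32-bit header: class(16) | field(7) | HM(1) | length(8)."""
    return (oxm_class << 16) | (field << 9) | (int(hasmask) << 8) | length


def parse_oxm_header(data):
    """Split the first 4 bytes of an OXM or NXM TLV into its components."""
    (header,) = struct.unpack_from("!I", data)
    return {
        "class": header >> 16,
        "field": (header >> 9) & 0x7F,
        "hasmask": bool((header >> 8) & 1),
        "length": header & 0xFF,
    }


# Unmasked 4-byte field, then the same field with a mask (body doubles).
print(hex(oxm_header(0x8000, 11, False, 4)))
print(hex(oxm_header(0x8000, 11, True, 8)))
```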
648
649 <p>
650 The most significant 16 bits of the NXM or OXM header, called
651 <code>vendor</code> by NXM and <code>class</code> by OXM, identify
652 an organization permitted to allocate identifiers for fields. NXM
653 allocates only two vendors, 0x0000 for fields supported by
654 OpenFlow 1.0 and 0x0001 for fields implemented as an Open vSwitch
655 extension. OXM assigns classes as follows:
656 </p>
657
658 <dl>
659 <dt>0x0000 (<code>OFPXMC_NXM_0</code>).</dt>
660 <dt>0x0001 (<code>OFPXMC_NXM_1</code>).</dt>
661 <dd>Reserved for NXM compatibility.</dd>
662
663 <dt>0x0002 to 0x7fff</dt>
664 <dd>
665 Reserved for allocation to ONF members, but none yet assigned.
666 </dd>
667
668 <dt>0x8000 (<code>OFPXMC_OPENFLOW_BASIC</code>)</dt>
669 <dd>
670 Used for most standard OpenFlow fields.
671 </dd>
672
673 <dt>0x8001 (<code>OFPXMC_PACKET_REGS</code>)</dt>
674 <dd>
675 Used for packet register fields in OpenFlow 1.5 and later.
676 </dd>
677
678 <dt>0x8002 to 0xfffe</dt>
679 <dd>
680 Reserved for the OpenFlow specification.
681 </dd>
682
683 <dt>0xffff (<code>OFPXMC_EXPERIMENTER</code>)</dt>
684 <dd>Experimental use.</dd>
685 </dl>
686
687 <p>
688 When <code>class</code> is 0xffff, the OXM header is extended to 64 bits by
689 using the first 32 bits of the body as an <code>experimenter</code> field
690 whose most significant byte is zero and whose remaining bytes are an
691 Organizationally Unique Identifier (OUI) assigned by the IEEE [IEEE OUI],
692 as shown below.
693 </p>
694
695 <diagram>
696 <header name="type">
697 <bits name="class" above="16" below="0xffff" width=".75"/>
698 <bits name="field" above="7" width=".4"/>
699 </header>
700 <nospace/>
701 <header name="">
702 <bits name="HM" above="1" width=".25"/>
703 <bits name="length" above="8" width=".4"/>
704 </header>
705
706 <header name="experimenter">
707 <bits name="zero" above="8" below="0x00" width=".4"/>
708 <bits name="OUI" above="24" width="1"/>
709 </header>
710 <header name="">
711 <bits name="body" above="(length - 4) bytes" width="1.7"/>
712 </header>
713 </diagram>
714
715 <p>
716 OpenFlow says that support for experimenter fields is optional. Open
717 vSwitch 2.4 and later does support them, so that it can support the
718 following experimenter classes:
719 </p>
720
721 <dl>
722 <dt>0x4f4e4600 (<code>ONFOXM_ET</code>)</dt>
723 <dd>
724 Used by official Open Networking Foundation extensions in OpenFlow 1.3
725 and later,
726 e.g. [TCP Flags Match Field Extension].
727 </dd>
728
729 <dt>0x005ad650 (<code>NXOXM_NSH</code>)</dt>
730 <dd>
731 Used by Open vSwitch for NSH extensions, in the absence of an official
732 ONF-assigned class. (This OUI is randomly generated.)
733 </dd>
734 </dl>
735
736 <p>
737 Taken as a unit, <code>class</code> (or <code>vendor</code>),
738 <code>field</code>, and <code>experimenter</code> (when present) uniquely
739 identify a particular field.
740 </p>
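<p>
A sketch of how an experimenter OXM might be encoded and identified, following
the 64-bit header layout shown above. The helper names are hypothetical, and
the field number 1 in the usage note is an arbitrary illustration rather than
a real NSH field assignment.
</p>

```python
import struct

OFPXMC_EXPERIMENTER = 0xFFFF
NXOXM_NSH = 0x005AD650  # experimenter ID quoted in the list above


def encode_experimenter_oxm(experimenter, field, value, mask=None):
    """Encode header, experimenter ID, then value (and mask, if any)."""
    body = value + (mask or b"")
    length = 4 + len(body)  # length also counts the 4-byte experimenter ID
    hasmask = int(mask is not None)
    header = (OFPXMC_EXPERIMENTER << 16) | (field << 9) | (hasmask << 8) | length
    return struct.pack("!II", header, experimenter) + body


def identify(data):
    """Return the (class, field, experimenter) triple that names a field."""
    (header,) = struct.unpack_from("!I", data)
    oxm_class, field = header >> 16, (header >> 9) & 0x7F
    if oxm_class == OFPXMC_EXPERIMENTER:
        (experimenter,) = struct.unpack_from("!I", data, 4)
        return (oxm_class, field, experimenter)
    return (oxm_class, field, None)
```

<p>
For example, <code>identify(encode_experimenter_oxm(NXOXM_NSH, 1,
b"\x01"))</code> returns the class, field number, and experimenter ID that
together name the field.
</p>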
741
742 <p>
743 When <code>hasmask</code> (abbreviated <code>HM</code> above) is 0, the OXM
744 is an exact match on an entire field. In this case, the body (excluding
745 the experimenter field, if present) is a single value to be matched.
746 </p>
747
748 <p>
749 When <code>hasmask</code> is 1, the OXM is a bitwise match. The body
750 (excluding the experimenter field) consists of a value to match, followed
751 by the bitwise mask to apply. A 1-bit in the mask indicates that the
752 corresponding bit in the value should be matched and a 0-bit that it should
753 be ignored. For example, for an IP address field, a value of 192.168.0.0
754 followed by a mask of 255.255.0.0 would match addresses in the
755 192.168.0.0/16 subnet.
756 </p>
757
758 <ul>
759 <li>
760 Some fields might not support masking at all, and some fields that do
761 support masking might restrict it to certain patterns. For example,
762 fields that have IP address values might be restricted to CIDR masks.
763 The descriptions of individual fields note these restrictions.
764 </li>
765
766 <li>
767 An OXM TLV with a mask that is all zeros is not useful (although it is
768 not forbidden), because it has the same effect as omitting the TLV
769 entirely.
770 </li>
771
772 <li>
773 It is not meaningful to pair a 0-bit in an OXM mask with a 1-bit in its
774 value, and Open vSwitch rejects such an OXM with the error
775 <code>OFPBMC_BAD_WILDCARDS</code>, as required by OpenFlow 1.3 and later.
776 </li>
777 </ul>
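
<p>
The bitwise-match rule and the mask restriction above can be sketched in
Python by treating a field as an unsigned integer. This is only an
illustration: the helper names <code>ip</code>, <code>check_mask</code>, and
<code>oxm_matches</code> are hypothetical and are not part of Open vSwitch.
</p>

```python
import ipaddress


def ip(text):
    """Convert a dotted-quad string to an integer."""
    return int(ipaddress.IPv4Address(text))


def check_mask(value, mask):
    """Return True if no 1-bit in value is paired with a 0-bit in mask.

    A value/mask pair that fails this check is the kind of OXM that is
    rejected with OFPBMC_BAD_WILDCARDS.
    """
    return (value & ~mask) == 0


def oxm_matches(field, value, mask):
    """Bitwise match: compare only the bits selected by 1-bits in mask."""
    return (field & mask) == (value & mask)


value, mask = ip("192.168.0.0"), ip("255.255.0.0")
print(check_mask(value, mask))                      # well-formed pair
print(oxm_matches(ip("192.168.7.9"), value, mask))  # address in 192.168.0.0/16
print(oxm_matches(ip("196.168.7.9"), value, mask))  # address outside it
```

<p>
An all-zeros mask makes <code>oxm_matches</code> true for every field value,
which is why such a TLV has the same effect as omitting it.
</p>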
778
779 <p>
780 The <code>length</code> identifies the number of bytes in the body,
781 including the 4-byte <code>experimenter</code> header, if it is present.
782 Each OXM TLV has a fixed length; that is, given <code>class</code>,
783 <code>field</code>, <code>experimenter</code> (if present), and
784 <code>hasmask</code>, <code>length</code> is a constant. The
785 <code>length</code> is included explicitly to allow software to minimally
786 parse OXM TLVs of unknown types.
787 </p>
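
<p>
The following Python sketch shows how explicit lengths let software step over
OXM TLVs it does not understand. It assumes the header layout shown in the
diagrams earlier in this section (16-bit class, 7-bit field, 1-bit
<code>hasmask</code>, 8-bit length) and assumes that class 0xffff marks an
experimenter OXM, as in the OpenFlow specification. The function name
<code>parse_oxm_tlvs</code> is hypothetical.
</p>

```python
import struct

EXPERIMENTER_CLASS = 0xFFFF  # assumed class value for experimenter OXMs


def parse_oxm_tlvs(data):
    """Split a byte string of back-to-back OXM TLVs into dictionaries.

    Bodies are not interpreted; the length byte alone is enough to find
    where the next TLV starts.
    """
    tlvs = []
    offset = 0
    while offset != len(data):
        (header,) = struct.unpack_from("!I", data, offset)
        oxm_class = header >> 16
        field = (header >> 9) & 0x7F
        hasmask = (header >> 8) & 1
        length = header & 0xFF  # bytes that follow the 4-byte header
        body = data[offset + 4:offset + 4 + length]
        if len(body) != length:
            raise ValueError("truncated OXM TLV")
        experimenter = None
        if oxm_class == EXPERIMENTER_CLASS:
            # The 4-byte experimenter ID is counted in the length.
            (experimenter,) = struct.unpack_from("!I", body, 0)
            body = body[4:]
        tlvs.append({"class": oxm_class, "field": field, "hasmask": hasmask,
                     "experimenter": experimenter, "body": body})
        offset += 4 + length
    return tlvs


# OXM_OF_ETH_TYPE (class 0x8000, field 5), exact match on 0x0800.
print(parse_oxm_tlvs(bytes.fromhex("80000a02" "0800")))
```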
788
789 <p>
790 OXM TLVs must be ordered so that a field's prerequisites are satisfied
791 before it is parsed. For example, an OXM TLV that matches on the IPv4
792 source address field is only allowed following an OXM TLV that matches on
793 the Ethertype for IPv4. Similarly, an OXM TLV that matches on the TCP
794 source port must follow a TLV that matches an Ethertype of IPv4 or IPv6 and
795 one that matches an IP protocol of TCP (in that order). The order of OXM
796 TLVs is not otherwise restricted; no canonical ordering is defined.
797 </p>
798
799 <p>
800 A given field may be matched only once in a series of OXM TLVs.
801 </p>
802
803 <!-- EXT-482? -->
804
805 <h3>OpenFlow 1.3</h3>
806
807 <p>
808 OpenFlow 1.3 showed OXM to be largely successful, by adding new fields
809 without making any changes to how flow matches otherwise worked. It added
810 OXMs for the following fields supported by Open vSwitch:
811 </p>
812
813 <ul>
814 <li>Tunnel ID for ports associated with e.g. VXLAN or keyed GRE.</li>
815 <li>MPLS ``bottom of stack'' (BOS) bit.</li>
816 </ul>
817
818 <p>
819 OpenFlow 1.3 also added OXMs for the following fields not documented here
820 and not yet implemented by Open vSwitch:
821 </p>
822
823 <ul>
824 <li>IPv6 extension header handling.</li>
825 <li>PBB I-SID.</li>
826 </ul>
827
828 <h3>OpenFlow 1.4</h3>
829
830 <p>
831 OpenFlow 1.4 added OXMs for the following fields not documented here and
832 not yet implemented by Open vSwitch:
833 </p>
834
835 <ul>
836 <li>PBB UCA.</li>
837 </ul>
838
839 <h3>OpenFlow 1.5</h3>
840
841 <p>
842 OpenFlow 1.5 added OXMs for the following fields supported by Open vSwitch:
843 </p>
844
845 <ul>
846 <li>Packet type.</li>
847 <li>TCP flags.</li>
848 <li>Packet registers.</li>
849 <li>The output port in the OpenFlow action set.</li>
850 </ul>
851
852 <h1>Fields Reference</h1>
853
854 <p>
855 The following sections document the fields that Open vSwitch supports.
856 Each section provides introductory material on a group of related fields,
857 followed by information on each individual field. In addition to
858 field-specific information, each field begins with a table with entries for
859 the following important properties:
860 </p>
861
862 <dl>
863 <dt>Name</dt>
864 <dd>
865 The field's name, used for parsing and formatting the field, e.g. in
866 <code>ovs-ofctl</code> commands. For historical reasons, some fields
867 have an additional name that is accepted as an alternative in parsing.
868 This name, when there is one, is listed as well, e.g. ``<code>tun</code>
869 (aka <code>tunnel_id</code>).''
870 </dd>
871
872 <dt>Width</dt>
873 <dd>
874 The field's width, always a multiple of 8 bits. Some fields don't use
875 all of the bits, so this may be accompanied by an explanation. For
876 example, OpenFlow embeds the 2-bit IP ECN field as the low bits in an
877 8-bit byte, and so its width is expressed as ``8 bits (only the
878 least-significant 2 bits may be nonzero).''
879 </dd>
880
881 <dt>Format</dt>
882 <dd>
883 <p>
884 How a value for the field is formatted or parsed by, e.g.,
885 <code>ovs-ofctl</code>. Some possibilities are generic:
886 </p>
887
888 <dl>
889 <dt>decimal</dt>
890 <dd>
891 Formats as a decimal number. On input, accepts decimal numbers or
892 hexadecimal numbers prefixed by <code>0x</code>.
893 </dd>
894
895 <dt>hexadecimal</dt>
896 <dd>
897 Formats as a hexadecimal number prefixed by <code>0x</code>. On
898 input, accepts decimal numbers or hexadecimal numbers prefixed by
899 <code>0x</code>. (The default for parsing is <em>not</em>
900 hexadecimal: only a <code>0x</code> prefix causes input to be treated
901 as hexadecimal.)
902 </dd>
903
904 <dt>Ethernet</dt>
905 <dd>
906 Formats and accepts the common Ethernet address format
907 <code><var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var></code>.
908 </dd>
909
910 <dt>IPv4</dt>
911 <dd>
912 Formats and accepts the dotted-quad format
913 <code><var>a</var>.<var>b</var>.<var>c</var>.<var>d</var></code>.
914 For bitwise matches, formats and accepts
915 <code><var>address</var>/<var>length</var></code> CIDR notation in
916 addition to <code><var>address</var>/<var>mask</var></code>.
917 </dd>
918
919 <dt>IPv6</dt>
920 <dd>
921 Formats and accepts the common IPv6 address formats, plus CIDR
922 notation for bitwise matches.
923 </dd>
924
925 <dt>OpenFlow 1.0 port</dt>
926 <dd>
927 Accepts 16-bit port numbers in decimal, plus OpenFlow well-known port
928 names (e.g. <code>IN_PORT</code>) in uppercase or lowercase.
929 </dd>
930
931 <dt>OpenFlow 1.1+ port</dt>
932 <dd>
933 Same syntax as OpenFlow 1.0 ports but for 32-bit OpenFlow 1.1+ port
934 number fields.
935 </dd>
936 </dl>
937
938 <p>
939 Other, field-specific formats are explained along with their fields.
940 </p>
941 </dd>
942
943 <dt>Masking</dt>
944 <dd>
945 For most fields, this says ``arbitrary bitwise masks,'' meaning that a
946 flow may match any combination of bits in the field. Some fields
947 instead say ``exact match only,'' which means that a flow that matches
948 on this field must match on the whole field instead of just certain
949 bits. Either way, this reports masking support for the latest version
950 of Open vSwitch using OXM or NXM (that is, either OpenFlow 1.2+ or
951 OpenFlow 1.0 plus Open vSwitch NXM extensions). In particular,
952 OpenFlow 1.0 (without NXM) and 1.1 don't always support masking even if
953 Open vSwitch itself does; refer to the <em>OpenFlow 1.0</em> and
954 <em>OpenFlow 1.1</em> rows to learn about masking with these protocol
955 versions.
956 </dd>
957
958 <dt>Prerequisites</dt>
959 <dd>
960 <p>
961 Requirements that must be met to match on this field. For example,
962 <ref field="ip_src"/> has IPv4 as a prerequisite, meaning that a match
963 must include <code>eth_type=0x0800</code> to match on the IPv4 source
964 address. The following prerequisites, with their requirements, are
965 currently in use:
966 </p>
967
968 <dl>
969 <dt>none</dt>
970 <dd>(no requirements)</dd>
971
972 <dt>VLAN VID</dt>
973 <dd><code>vlan_tci=0x1000/0x1000</code> (i.e. a VLAN header is
974 present)</dd>
975
976 <dt>ARP</dt>
977 <dd><code>eth_type=0x0806</code> (ARP) or <code>eth_type=0x8035</code> (RARP)</dd>
978
979 <dt>IPv4</dt>
980 <dd><code>eth_type=0x0800</code></dd>
981
982 <dt>IPv6</dt>
983 <dd><code>eth_type=0x86dd</code></dd>
984
985 <dt>IPv4/IPv6</dt>
986 <dd>IPv4 or IPv6</dd>
987
988 <dt>MPLS</dt>
989 <dd><code>eth_type=0x8847</code> or <code>eth_type=0x8848</code></dd>
990
991 <dt>TCP</dt>
992 <dd>IPv4/IPv6 and <code>ip_proto=6</code></dd>
993
994 <dt>UDP</dt>
995 <dd>IPv4/IPv6 and <code>ip_proto=17</code></dd>
996
997 <dt>SCTP</dt>
998 <dd>IPv4/IPv6 and <code>ip_proto=132</code></dd>
999
1000 <dt>ICMPv4</dt>
1001 <dd>IPv4 and <code>ip_proto=1</code></dd>
1002
1003 <dt>ICMPv6</dt>
1004 <dd>IPv6 and <code>ip_proto=58</code></dd>
1005
1006 <dt>ND solicit</dt>
1007 <dd>ICMPv6 and <code>icmp_type=135</code> and <code>icmp_code=0</code></dd>
1008
1009 <dt>ND advert</dt>
1010 <dd>ICMPv6 and <code>icmp_type=136</code> and <code>icmp_code=0</code></dd>
1011
1012 <dt>ND</dt>
1013 <dd>ND solicit or ND advert</dd>
1014 </dl>
1015
1016 <p>
1017 The TCP, UDP, and SCTP prerequisites also have the special requirement
1018 that <code>nw_frag</code> is not being used to select ``later
1019 fragments.'' This is because only the first fragment of a fragmented
1020 IPv4 or IPv6 datagram contains the TCP, UDP, or SCTP header.
1021 </p>
1022 </dd>
1023
1024 <dt>Access</dt>
1025 <dd>
1026 Most fields are ``read/write,'' which means that common OpenFlow actions
1027 like <code>set_field</code> can modify them. Fields that are
1028 ``read-only'' cannot be modified in these general-purpose ways, although
1029 there may be other ways that actions can modify them.
1030 </dd>
1031
1032 <dt>OpenFlow 1.0</dt>
1033 <dt>OpenFlow 1.1</dt>
1034 <dd>
1035 These rows report the level of support that OpenFlow 1.0 or OpenFlow 1.1,
1036 respectively, has for a field. For OpenFlow 1.0, supported fields are
1037 reported as either ``yes (exact match only)'' for fields that do not
1038 support any bitwise masking or ``yes (CIDR match only)'' for fields that
1039 support CIDR masking. OpenFlow 1.1 supported fields report either ``yes
1040 (exact match only)'' or simply ``yes'' for fields that do support
1041 arbitrary masks. These OpenFlow versions supported a fixed collection of
1042 fields that cannot be extended, so many more fields are reported as ``not
1043 supported.''
1044 </dd>
1045
1046 <dt>OXM</dt>
1047 <dt>NXM</dt>
1048 <dd>
1049 <p>
1050 These rows report the OXM and NXM code points that correspond to a
1051 given field. Either or both may be ``none.''
1052 </p>
1053
1054 <p>
1055 A field that has only an OXM code point is usually one that was
1056 standardized before it was added to Open vSwitch. A field that has
1057 only an NXM code point is usually one that is not yet standardized.
1058 When a field has both OXM and NXM code points, it usually indicates
1059 that it was introduced as an Open vSwitch extension under the NXM code
1060 point, then later standardized under the OXM code point. A field can
1061 have more than one OXM code point if it was standardized in OpenFlow
1062 1.4 or later and additionally introduced as an official ONF extension
1063 for OpenFlow 1.3. (A field that has neither OXM nor NXM code point is
1064 typically an obsolete field that is supported in some other form using
1065 OXM or NXM.)
1066 </p>
1067
1068 <p>
1069 Each code point in these rows is described in the form
1070 ``<code>NAME</code> (<var>number</var>) since OpenFlow <var>spec</var>
1071 and Open vSwitch <var>version</var>,''
1072 e.g. ``<code>OXM_OF_ETH_TYPE</code> (5) since OpenFlow 1.2 and Open
1073 vSwitch 1.7.'' First, <code>NAME</code>, which specifies a name for
1074 the code point, starts with a prefix that designates a class and, in
1075 some cases, a vendor, as listed in the following table:
1076 </p>
1077
1078 <oxm_classes/>
1079
1080 <p>
1081 For more information on OXM/NXM classes and vendors, refer back to
1082 <em>OpenFlow 1.2</em> under <em>Evolution of OpenFlow Fields</em>. The
1083 <var>number</var> is the field number within the class and vendor. The
1084 OpenFlow <var>spec</var> is the version of OpenFlow that standardized
1085 the code point. It is omitted for NXM code points because they are
1086 nonstandard. The <var>version</var> is the version of Open vSwitch
1087 that first supported the code point.
1088 </p>
1089 </dd>
1090 </dl>
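
<p>
The prerequisite table under <em>Prerequisites</em> above can be read as a
small dependency graph. The Python sketch below transcribes that table and
expands a prerequisite name into the concrete matches it implies, one list
per alternative. The names <code>PREREQS</code> and <code>expand</code> are
hypothetical, and the <code>nw_frag</code> restriction on TCP, UDP, and SCTP
is not modeled.
</p>

```python
# Each prerequisite maps to a list of alternatives. Each alternative is a
# list whose items are either literal matches or names of other prerequisites.
PREREQS = {
    "none": [[]],
    "VLAN VID": [["vlan_tci=0x1000/0x1000"]],
    "ARP": [["eth_type=0x0806"], ["eth_type=0x8035"]],
    "IPv4": [["eth_type=0x0800"]],
    "IPv6": [["eth_type=0x86dd"]],
    "IPv4/IPv6": [["IPv4"], ["IPv6"]],
    "MPLS": [["eth_type=0x8847"], ["eth_type=0x8848"]],
    "TCP": [["IPv4/IPv6", "ip_proto=6"]],
    "UDP": [["IPv4/IPv6", "ip_proto=17"]],
    "SCTP": [["IPv4/IPv6", "ip_proto=132"]],
    "ICMPv4": [["IPv4", "ip_proto=1"]],
    "ICMPv6": [["IPv6", "ip_proto=58"]],
    "ND solicit": [["ICMPv6", "icmp_type=135", "icmp_code=0"]],
    "ND advert": [["ICMPv6", "icmp_type=136", "icmp_code=0"]],
    "ND": [["ND solicit"], ["ND advert"]],
}


def expand(name):
    """Return every way to satisfy a prerequisite as lists of matches."""
    results = []
    for alternative in PREREQS[name]:
        partial = [[]]
        for item in alternative:
            # Recurse into named prerequisites; keep literal matches as-is.
            choices = expand(item) if item in PREREQS else [[item]]
            partial = [done + choice for done in partial for choice in choices]
        results.extend(partial)
    return results


for matches in expand("TCP"):
    print(",".join(matches))
```

<p>
Each expanded list places lower-layer matches first, which is also the order
in which OXM requires a field's prerequisites to appear.
</p>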
1091
1092 <group title="Conjunctive Match">
1093 <p>
1094 An individual OpenFlow flow can match only a single value for each field.
1095 However, situations often arise where one wants to match one of a set of
1096 values within a field or fields. For matching a single field against a
1097 set, it is straightforward and efficient to add multiple flows to the
1098 flow table, one for each value in the set. For example, one might use
1099 the following flows to send packets with IP source address <var>a</var>,
1100 <var>b</var>, <var>c</var>, or <var>d</var> to the OpenFlow controller:
1101 </p>
1102
1103 <pre>
1104 ip,ip_src=<var>a</var> actions=controller
1105 ip,ip_src=<var>b</var> actions=controller
1106 ip,ip_src=<var>c</var> actions=controller
1107 ip,ip_src=<var>d</var> actions=controller
1108 </pre>
1109
1110 <p>
1111 Similarly, these flows send packets with IP destination address
1112 <var>e</var>, <var>f</var>, <var>g</var>, or <var>h</var> to the OpenFlow
1113 controller:
1114 </p>
1115
1116 <pre>
1117 ip,ip_dst=<var>e</var> actions=controller
1118 ip,ip_dst=<var>f</var> actions=controller
1119 ip,ip_dst=<var>g</var> actions=controller
1120 ip,ip_dst=<var>h</var> actions=controller
1121 </pre>
1122
1123 <p>
1124 Installing all of the above flows in a single flow table yields a
1125 disjunctive effect: a packet is sent to the controller if
1126 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>}
1127 or <code>ip_dst</code> ∈
1128 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} (or both).
1129 (Pedantically, if both of the above sets of flows are present in the flow
1130 table, they should have different priorities, because OpenFlow says that
1131 the results are undefined when two flows with the same priority can both match
1132 a single packet.)
1133 </p>
1134
1135 <p>
1136 Suppose, on the other hand, one wishes to match conjunctively, that is, to
1137 send a packet to the controller only if both <code>ip_src</code> ∈
1138 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1139 <code>ip_dst</code> ∈
1140 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>}. This requires 4 × 4
1141 = 16 flows, one for each possible pairing of <code>ip_src</code> and
1142 <code>ip_dst</code>. That is acceptable for our small example, but it does
1143 not gracefully extend to larger sets or greater numbers of dimensions.
1144 </p>
1145
1146 <p>
1147 The <code>conjunction</code> action is a solution for conjunctive matches
1148 that is built into Open vSwitch. A <code>conjunction</code> action ties groups of
1149 individual OpenFlow flows into higher-level ``conjunctive flows''. Each
1150 group corresponds to one dimension, and each flow within the group matches
1151 one possible value for the dimension. A packet that matches one flow from
1152 each group matches the conjunctive flow.
1153 </p>
1154
1155 <p>
1156 To implement a conjunctive flow with <code>conjunction</code>, assign the
1157 conjunctive flow a 32-bit <var>id</var>, which must be unique within an
1158 OpenFlow table. Assign each of the <var>n</var> ≥ 2 dimensions a unique
1159 number from 1 to <var>n</var>; the ordering is unimportant. Add one flow
1160 to the OpenFlow flow table for each possible value of each dimension with
1161 <code>conjunction(<var>id</var>, <var>k</var>/<var>n</var>)</code> as the
1162 flow's actions, where <var>k</var> is the number assigned to the flow's
1163 dimension. Together, these flows specify the conjunctive flow's match
1164 condition. When the conjunctive match condition is met, Open vSwitch looks
1165 up one more flow that specifies the conjunctive flow's actions and receives
1166 its statistics. This flow is found by setting <code>conj_id</code> to the
1167 specified <var>id</var> and then again searching the flow table.
1168 </p>
1169
1170 <p>
1171 The following flows provide an example. Whenever the IP source is one of
1172 the values in the flows that match on the IP source (dimension 1 of 2),
1173 <em>and</em> the IP destination is one of the values in the flows that
1174 match on IP destination (dimension 2 of 2), Open vSwitch searches for a
1175 flow that matches <code>conj_id</code> against the conjunction ID (1234),
1176 finding the first flow listed below.
1177 </p>
1178
1179 <pre>
1180 conj_id=1234 actions=controller
1181 ip,ip_src=10.0.0.1 actions=conjunction(1234, 1/2)
1182 ip,ip_src=10.0.0.4 actions=conjunction(1234, 1/2)
1183 ip,ip_src=10.0.0.6 actions=conjunction(1234, 1/2)
1184 ip,ip_src=10.0.0.7 actions=conjunction(1234, 1/2)
1185 ip,ip_dst=10.0.0.2 actions=conjunction(1234, 2/2)
1186 ip,ip_dst=10.0.0.5 actions=conjunction(1234, 2/2)
1187 ip,ip_dst=10.0.0.7 actions=conjunction(1234, 2/2)
1188 ip,ip_dst=10.0.0.8 actions=conjunction(1234, 2/2)
1189 </pre>
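
<p>
As a rough model of the lookup just described, the Python sketch below encodes
the example flows and reports the actions for a packet. It is a simplified
illustration, not Open vSwitch's classifier: it ignores priorities and other
match fields, and the names <code>CONJUNCTION_FLOWS</code>,
<code>CONJ_ID_ACTIONS</code>, and <code>lookup</code> are hypothetical.
</p>

```python
# (match field, value, conjunction id, dimension k, number of dimensions n),
# one entry per flow whose action is conjunction(id, k/n).
CONJUNCTION_FLOWS = [
    ("ip_src", "10.0.0.1", 1234, 1, 2),
    ("ip_src", "10.0.0.4", 1234, 1, 2),
    ("ip_src", "10.0.0.6", 1234, 1, 2),
    ("ip_src", "10.0.0.7", 1234, 1, 2),
    ("ip_dst", "10.0.0.2", 1234, 2, 2),
    ("ip_dst", "10.0.0.5", 1234, 2, 2),
    ("ip_dst", "10.0.0.7", 1234, 2, 2),
    ("ip_dst", "10.0.0.8", 1234, 2, 2),
]

# Flows that match on conj_id and carry the real actions.
CONJ_ID_ACTIONS = {1234: "controller"}


def lookup(packet):
    """Return the actions for packet, or None if no conjunctive flow matches."""
    matched = {}  # conjunction id -> set of dimensions matched so far
    sizes = {}    # conjunction id -> n
    for field, value, conj_id, k, n in CONJUNCTION_FLOWS:
        if packet.get(field) == value:
            matched.setdefault(conj_id, set()).add(k)
            sizes[conj_id] = n
    for conj_id, dims in matched.items():
        # The conjunctive flow matches only if every dimension 1..n matched.
        if dims == set(range(1, sizes[conj_id] + 1)):
            # Second lookup: find the flow that matches on conj_id.
            return CONJ_ID_ACTIONS.get(conj_id)
    return None


print(lookup({"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2"}))  # both dimensions
print(lookup({"ip_src": "10.0.0.1", "ip_dst": "10.0.0.3"}))  # dimension 2 fails
```

<p>
The model needs 4 + 4 + 1 = 9 flows, compared with the 16 flows required to
enumerate every pairing of source and destination.
</p>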
1190
1191 <p>
1192 Many subtleties exist:
1193 </p>
1194
1195 <ul>
1196 <li>
1197 In the example above, every flow in a single dimension has the same form,
1198 that is, dimension 1 matches on <code>ip_src</code> and dimension 2 on
1199 <code>ip_dst</code>, but this is not a requirement. Different flows
1200 within a dimension may match on different bits within a field (e.g. IP
1201 network prefixes of different lengths, or TCP/UDP port ranges as bitwise
1202 matches), or even on entirely different fields (e.g. to match packets for
1203 TCP source port 80 or TCP destination port 80).
1204 </li>
1205
1206 <li>
1207 The flows within a dimension can vary their matches across more than
1208 one field, e.g. to match only specific pairs of IP source and
1209 destination addresses or L4 port numbers.
1210 </li>
1211
1212 <li>
1213 A flow may have multiple <code>conjunction</code> actions, with different
1214 <code>id</code> values. This is useful for multiple conjunctive flows with
1215 overlapping sets. If one conjunctive flow matches packets with both
1216 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>} and <code>ip_dst</code> ∈
1217 {<var>d</var>,<var>e</var>} and a second conjunctive flow matches <code>ip_src</code>
1218 ∈ {<var>b</var>,<var>c</var>} and <code>ip_dst</code> ∈ {<var>f</var>,<var>g</var>}, for
1219 example, then the flow that matches <code>ip_src=</code><var>b</var> would have two
1220 <code>conjunction</code> actions, one for each conjunctive flow. The order
1221 of <code>conjunction</code> actions within a list of actions is not
1222 significant.
1223 </li>
1224 <li>
1225 A flow with <code>conjunction</code> actions may also include <code>note</code>
1226 actions for annotations, but not any other kind of actions. (They
1227 would not be useful because they would never be executed.)
1228 </li>
1229 <li>
1230 All of the flows that constitute a conjunctive flow with a given
1231 <var>id</var> must have the same priority. (Flows with the same <var>id</var>
1232 but different priorities are currently treated as different
1233 conjunctive flows, that is, currently <var>id</var> values need only be
1234 unique within an OpenFlow table at a given priority. This behavior
1235 isn't guaranteed to stay the same in later releases, so please use
1236 <var>id</var> values unique within an OpenFlow table.)
1237 </li>
1238 <li>
1239 Conjunctive flows must not overlap with each other, at a given
1240 priority, that is, any given packet must be able to match at most one
1241 conjunctive flow at a given priority. Overlapping conjunctive flows
1242 yield unpredictable results.
1243 </li>
1244 <li>
1245 Following a conjunctive flow match, the search for the flow with
1246 <code>conj_id=</code><var>id</var> is done in the same general-purpose way as
1247 other flow table searches, so one can use flows with
1248 <code>conj_id=</code><var>id</var> to act differently depending on
1249 circumstances. (One exception is that the search for the
1250 <code>conj_id=</code><var>id</var> flow itself ignores conjunctive flows, to
1251 avoid recursion.) If the search with <code>conj_id=</code><var>id</var> fails,
1252 Open vSwitch acts as if the conjunctive flow had not matched at all, and
1253 continues searching the flow table for other matching flows.
1254 </li>
1255 <li>
1256 <p>
1257 OpenFlow prerequisite checking occurs for the flow with
1258 <code>conj_id=</code><var>id</var> in the same way as any other flow, e.g. in
1259 an OpenFlow 1.1+ context, putting a <code>mod_nw_src</code> action into the example
1260 above would require adding an <code>ip</code> match, like this:
1261 </p>
1262 <pre>
1263 conj_id=1234,ip actions=mod_nw_src:1.2.3.4,controller
1264 </pre>
1265 </li>
1266 <li>
1267 OpenFlow prerequisite checking also occurs for the individual flows
1268 that comprise a conjunctive match in the same way as any other flow.
1269 </li>
1270 <li>
1271 The flows that constitute a conjunctive flow do not have useful
1272 statistics. They are never updated with byte or packet counts, and so
1273 on. (For such a flow, therefore, the idle and hard timeouts work much
1274 the same way.)
1275 </li>
1276 <li>
1277 <p>
1278 Sometimes there is a choice of which flows include a particular match.
1279 For example, suppose that we added an extra constraint to our example,
1280 to match on <code>ip_src</code> ∈
1281 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1282 <code>ip_dst</code> ∈
1283 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} and
1284 <code>tcp_dst</code> = <var>i</var>. One way to implement this is to
1285 add the new constraint to the <code>conj_id</code> flow, like this:
1286 </p>
1287 <pre>
1288 conj_id=1234,tcp,tcp_dst=<var>i</var> actions=mod_nw_src:1.2.3.4,controller
1289 </pre>
1290 <p>
1291 but <em>this is not recommended</em> because of the cost of the extra
1292 flow table lookup. Instead, add the constraint to the individual
1293 flows, either in one of the dimensions or (slightly better) all of
1294 them.
1295 </p>
1296 </li>
1297 <li>
1298 A conjunctive match must have <var>n</var> ≥ 2 dimensions (otherwise a
1299 conjunctive match is not necessary). Open vSwitch enforces this.
1300 </li>
1301 <li>
1302 Each dimension within a conjunctive match should ordinarily have more
1303 than one flow. Open vSwitch does not enforce this.
1304 </li>
1305 </ul>
1306
1307 <field id="MFF_CONJ_ID" title="Conjunction ID">
1308 Used for conjunctive matching. See above for more information.
1309 </field>
1310 </group>
1311
1312 <group title="Network Service Header">
1313 <field id="MFF_NSH_FLAGS"
1314 title="flags field (8 bits)"/>
1315 <field id="MFF_NSH_MDTYPE"
1316 title="mdtype field (8 bits)"/>
1317 <field id="MFF_NSH_NP"
1318 title="np (next protocol) field (8 bits)"/>
1319 <field id="MFF_NSH_SPI"
1320 title="spi (service path identifier) field (24 bits)"/>
1321 <field id="MFF_NSH_SI"
1322 title="si (service index) field (8 bits)"/>
1323 <field id="MFF_NSH_C1"
1324 title="c1 (Network Platform Context) field (32 bits)"/>
1325 <field id="MFF_NSH_C2"
1326 title="c2 (Network Shared Context) field (32 bits)"/>
1327 <field id="MFF_NSH_C3"
1328 title="c3 (Service Platform Context) field (32 bits)"/>
1329 <field id="MFF_NSH_C4"
1330 title="c4 (Service Shared Context) field (32 bits)"/>
1331 </group>
1332
1333 <group title="Tunnel">
1334 <p>
1335 The fields in this group relate to tunnels, which Open vSwitch
1336 supports in several forms (GRE, VXLAN, and so on). Most of
1337 these fields do appear in the wire format of a packet, so they
1338 are data fields from that point of view, but they are metadata
1339 from an OpenFlow flow table point of view because they do not
1340 appear in packets that are forwarded to the controller or to
1341 ordinary (non-tunnel) output ports.
1342 </p>
1343
1344 <p>
1345 Open vSwitch supports a spectrum of usage models for mapping
1346 tunnels to OpenFlow ports:
1347 </p>
1348
1349 <dl>
1350 <dt>``Port-based'' tunnels</dt>
1351 <dd>
1352 <p>
1353 In this model, an OpenFlow port represents one tunnel: it matches a
1354 particular type of tunnel traffic between two IP endpoints, with a
1355 particular tunnel key (if keys are in use). In this situation, <ref
1356 field="in_port"/> suffices to distinguish one tunnel from another, so
1357 the tunnel header fields have little importance for OpenFlow
1358 processing. (They are still populated and may be used if it is
1359 convenient.) The tunnel header fields play no role in sending
1360 packets out such an OpenFlow port, either, because the OpenFlow port
1361 itself fully specifies the tunnel headers.
1362 </p>
1363
1364 <p>
1365 The following Open vSwitch commands create a bridge
1366 <code>br-int</code>, add port <code>tap0</code> to the bridge as
1367 OpenFlow port 1, establish a port-based GRE tunnel between the local
1368 host and remote IP 192.168.1.1 using GRE key 5001 as OpenFlow port 2,
1369 and arrange to forward all traffic from <code>tap0</code> to the
1370 tunnel and vice versa:
1371 </p>
1372
1373 <pre>
1374 ovs-vsctl add-br br-int
1375 ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1376 ovs-vsctl add-port br-int gre0 -- \
1377 set interface gre0 ofport_request=2 type=gre \
1378 options:remote_ip=192.168.1.1 options:key=5001
1379 ovs-ofctl add-flow br-int in_port=1,actions=2
1380 ovs-ofctl add-flow br-int in_port=2,actions=1
1381 </pre>
1382 </dd>
1383
1384 <dt>``Flow-based'' tunnels</dt>
1385 <dd>
1386 <p>
1387 In this model, one OpenFlow port represents all possible tunnels of a
1388 given type with an endpoint on the current host, for example, all GRE
1389 tunnels. In this situation, <ref field="in_port"/> only indicates
1390 that traffic was received on the particular kind of tunnel. This is
1391 where the tunnel header fields are most important: they allow the
1392 OpenFlow tables to discriminate among tunnels based on their IP
1393 endpoints or keys. Tunnel header fields also determine the IP
1394 endpoints and keys of packets sent out such a tunnel port.
1395 </p>
1396
1397 <p>
1398 The following Open vSwitch commands create a bridge
1399 <code>br-int</code>, add port <code>tap0</code> to the
1400 bridge as OpenFlow port 1, establish a flow-based GRE tunnel
1401 port 3, and arrange to forward all traffic from
1402 <code>tap0</code> to remote IP 192.168.1.1 over a GRE tunnel
1403 with key 5001 and vice versa:
1404 </p>
1405
1406 <pre>
1407 ovs-vsctl add-br br-int
1408 ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1409 ovs-vsctl add-port br-int allgre -- \
1410 set interface allgre ofport_request=3 type=gre \
1411 options:remote_ip=flow options:key=flow
1412 ovs-ofctl add-flow br-int \
1413 'in_port=1 actions=set_tunnel:5001,set_field:192.168.1.1->tun_dst,3'
1414 ovs-ofctl add-flow br-int 'in_port=3,tun_src=192.168.1.1,tun_id=5001 actions=1'
1415 </pre>
1416 </dd>
1417
1418 <dt>Mixed models</dt>
1419 <dd>
1420 <p>
1421 One may define both flow-based and port-based tunnels at the
1422 same time. For example, it is valid and possibly useful to
1423 create and configure both <code>gre0</code> and
1424 <code>allgre</code> tunnel ports described above.
1425 </p>
1426
1427 <p>
1428 Traffic is attributed on ingress to the most specific
1429 matching tunnel. For example, <code>gre0</code> is more
1430 specific than <code>allgre</code>. Therefore, if both
1431 exist, then <code>gre0</code> will be the ingress port for any
1432 GRE traffic received from 192.168.1.1 with key 5001.
1433 </p>
1434
1435 <p>
1436 On egress, traffic may be directed to any appropriate tunnel
1437 port. If both <code>gre0</code> and <code>allgre</code> are
1438 configured as already described, then the actions
1439 <code>2</code> and
1440 <code>set_tunnel:5001,set_field:192.168.1.1->tun_dst,3</code>
1441 send the same tunnel traffic.
1442 </p>
1443 </dd>
1444
1445 <dt>Intermediate models</dt>
1446 <dd>
1447 Ports may be configured as partially flow-based. For example,
1448 one may define an OpenFlow port that represents tunnels
1449 between a pair of endpoints but leaves the flow table to
1450 discriminate on the flow key.
1451 </dd>
1452 </dl>
1453
1454 <p>
1455 <code>ovs-vswitchd.conf.db</code>(5) describes all the details of tunnel
1456 configuration.
1457 </p>
1458
1459 <p>
1460 These fields do not have any prerequisites, which means that a
1461 flow may match on any or all of them, in any combination.
1462 </p>
1463
1464 <p>
1465 These fields are zeros for packets that did not arrive on a tunnel.
1466 </p>
1467
1468 <field id="MFF_TUN_ID" title="Tunnel ID">
1469 <p>
1470 Many kinds of tunnels support a tunnel ID:
1471 </p>
1472
1473 <ul>
1474 <li>
1475 VXLAN and Geneve have a 24-bit virtual network identifier (VNI).
1476 </li>
1477 <li>LISP has a 24-bit instance ID.</li>
1478 <li>GRE has an optional 32-bit key.</li>
1479 <li>STT has a 64-bit key.</li>
1480 </ul>
1481
1482 <p>
1483 When a packet is received from a tunnel, this field holds the
1484 tunnel ID in its least significant bits, zero-extended to fit.
1485 This field is zero if the tunnel does not support an ID, or if
1486 no ID is in use for a tunnel type that has an optional ID, or
1487 if an ID of zero was received, or if the packet was not received
1488 over a tunnel.
1489 </p>
1490
1491 <p>
1492 When a packet is output to a tunnel port, the tunnel
1493 configuration determines whether the tunnel ID is taken from
1494 this field or bound to a fixed value. See the earlier
1495 description of ``port-based'' and ``flow-based'' tunnels for
1496 more information.
1497 </p>
1498
1499 <p>
1500 The following diagram shows the origin of this field in a
1501 typical keyed GRE tunnel:
1502 </p>
1503
1504 <diagram>
1505 <header name="Ethernet">
1506 <bits name="dst" above="48" width="0.4"/>
1507 <bits name="src" above="48" width="0.4"/>
1508 <bits name="type" above="16" below="0x800" width="0.4"/>
1509 </header>
1510 <header name="IPv4">
1511 <bits name="..." width="0.4"/>
1512 <bits name="proto" above="8" below="47" width="0.4"/>
1513 <bits name="src" above="32" width="0.4"/>
1514 <bits name="dst" above="32" width="0.4"/>
1515 </header>
1516 <header name="GRE">
1517 <bits name="..." above="16" width="0.4"/>
1518 <bits name="type" above="16" below="0x6558" width="0.4"/>
1519 <bits name="key" above="32" width=".4" fill="yes"/>
1520 </header>
1521 <header name="Ethernet">
1522 <bits name="dst" above="48" width="0.4"/>
1523 <bits name="src" above="48" width="0.4"/>
1524 <bits name="type" above="16" width="0.4"/>
1525 </header>
1526 <dots/>
1527 </diagram>
1528 </field>
1529
1530 <field id="MFF_TUN_SRC" title="Tunnel IPv4 Source">
1531 <p>
1532 When a packet is received from a tunnel, this field is the
1533 source address in the outer IP header of the tunneled packet.
1534 This field is zero if the packet was not received over a
1535 tunnel.
1536 </p>
1537
1538 <p>
1539 When a packet is output to a flow-based tunnel port, this
1540 field influences the IPv4 source address used to send the
1541 packet. If it is zero, then the kernel chooses an appropriate
1542 IP address based on the routing table.
1543 </p>
1544
1545 <p>
1546 The following diagram shows the origin of this field in a
1547 typical keyed GRE tunnel:
1548 </p>
1549
1550 <diagram>
1551 <header name="Ethernet">
1552 <bits name="dst" above="48" width="0.4"/>
1553 <bits name="src" above="48" width="0.4"/>
1554 <bits name="type" above="16" below="0x800" width="0.4"/>
1555 </header>
1556 <header name="IPv4">
1557 <bits name="..." width="0.4"/>
1558 <bits name="proto" above="8" below="47" width="0.4"/>
1559 <bits name="src" above="32" width="0.4" fill="yes"/>
1560 <bits name="dst" above="32" width="0.4"/>
1561 </header>
1562 <header name="GRE">
1563 <bits name="..." above="16" width="0.4"/>
1564 <bits name="type" above="16" below="0x6558" width="0.4"/>
1565 <bits name="key" above="32" width=".4"/>
1566 </header>
1567 <header name="Ethernet">
1568 <bits name="dst" above="48" width="0.4"/>
1569 <bits name="src" above="48" width="0.4"/>
1570 <bits name="type" above="16" width="0.4"/>
1571 </header>
1572 <dots/>
1573 </diagram>
1574 </field>
1575
1576 <field id="MFF_TUN_DST" title="Tunnel IPv4 Destination">
1577 <p>
1578 When a packet is received from a tunnel, this field is the
1579 destination address in the outer IP header of the tunneled
1580 packet. This field is zero if the packet was not received
1581 over a tunnel.
1582 </p>
1583
1584 <p>
1585 When a packet is output to a flow-based tunnel port, this
1586 field specifies the destination to which the tunnel packet is
1587 sent.
1588 </p>
1589
1590 <p>
1591 The following diagram shows the origin of this field in a
1592 typical keyed GRE tunnel:
1593 </p>
1594
1595 <diagram>
1596 <header name="Ethernet">
1597 <bits name="dst" above="48" width="0.4"/>
1598 <bits name="src" above="48" width="0.4"/>
1599 <bits name="type" above="16" below="0x800" width="0.4"/>
1600 </header>
1601 <header name="IPv4">
1602 <bits name="..." width="0.4"/>
1603 <bits name="proto" above="8" below="47" width="0.4"/>
1604 <bits name="src" above="32" width="0.4"/>
1605 <bits name="dst" above="32" width="0.4" fill="yes"/>
1606 </header>
1607 <header name="GRE">
1608 <bits name="..." above="16" width="0.4"/>
1609 <bits name="type" above="16" below="0x6558" width="0.4"/>
1610 <bits name="key" above="32" width=".4"/>
1611 </header>
1612 <header name="Ethernet">
1613 <bits name="dst" above="48" width="0.4"/>
1614 <bits name="src" above="48" width="0.4"/>
1615 <bits name="type" above="16" width="0.4"/>
1616 </header>
1617 <dots/>
1618 </diagram>
1619 </field>
1620
1621 <field id="MFF_TUN_IPV6_SRC" title="Tunnel IPv6 Source">
1622 Similar to <ref field="tun_src"/>, but for tunnels over IPv6.
1623 </field>
1624
1625 <field id="MFF_TUN_IPV6_DST" title="Tunnel IPv6 Destination">
1626 Similar to <ref field="tun_dst"/>, but for tunnels over IPv6.
1627 </field>
1628
1629 <h2>VXLAN Group-Based Policy Fields</h2>
1630
1631 <p>
1632 The VXLAN header is defined as follows [RFC 7348], where the
1633 <code>I</code> bit must be set to 1, unlabeled bits or those labeled
1634 <code>reserved</code> must be set to 0, and Open vSwitch makes the VNI
1635 available via <ref field="tun_id"/>:
1636 </p>
1637
1638 <diagram>
1639 <header name="VXLAN flags">
1640 <bits name="" above="1" width="0.15"/>
1641 <bits name="" above="1" width="0.15"/>
1642 <bits name="" above="1" width="0.15"/>
1643 <bits name="" above="1" width="0.15"/>
1644 <bits name="I" above="1" width="0.15"/>
1645 <bits name="" above="1" width="0.15"/>
1646 <bits name="" above="1" width="0.15"/>
1647 <bits name="" above="1" width="0.15"/>
1648 </header>
1649 <nospace/>
1650 <header>
1651 <bits name="reserved" above="24" width="1.2"/>
1652 <bits name="VNI" above="24" width="1.2"/>
1653 <bits name="reserved" above="8" width=".5"/>
1654 </header>
1655 </diagram>
1656
1657 <p>
1658 VXLAN Group-Based Policy [VXLAN Group Policy Option] adds new
1659 interpretations to existing bits in the VXLAN header, reinterpreting it
1660 as follows, with changes highlighted:
1661 </p>
1662
1663 <diagram>
1664 <header name="GBP flags">
1665 <bits name="" above="1" width="0.15"/>
1666 <bits name="D" above="1" width="0.15" fill="yes"/>
1667 <bits name="" above="1" width="0.15"/>
1668 <bits name="" above="1" width="0.15"/>
1669 <bits name="A" above="1" width="0.15" fill="yes"/>
1670 <bits name="" above="1" width="0.15"/>
1671 <bits name="" above="1" width="0.15"/>
1672 <bits name="" above="1" width="0.15"/>
1673 </header>
1674 <nospace/>
1675 <header>
1676 <bits name="group policy ID" above="24" width="1.2" fill="yes"/>
1677 <bits name="VNI" above="24" width="1.2"/>
1678 <bits name="reserved" above="8" width=".5"/>
1679 </header>
1680 </diagram>
1681
1682 <p>
1683 Open vSwitch makes GBP fields and flags available through the following
1684 fields. Only packets that arrive over a VXLAN tunnel with the GBP
1685 extension enabled have these fields set. In other packets they are zero
1686 on receive and ignored on transmit.
1687 </p>
1688
1689 <field id="MFF_TUN_GBP_ID" title="VXLAN Group-Based Policy ID">
1690 <p>
1691 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1692 extension, this field represents the group policy ID, as shown above.
1693 </p>
1694 </field>
1695
1696 <field id="MFF_TUN_GBP_FLAGS" title="VXLAN Group-Based Policy Flags">
1697 <p>
1698 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1699 extension, this field represents the GBP flags, as shown above.
1700 </p>
1701
1702 <p>
1703 The field has the format shown below:
1704 </p>
1705
1706 <diagram>
1707 <header name="GBP Flags">
1708 <bits name="" above="1" width="0.15"/>
1709 <bits name="D" above="1" width="0.15"/>
1710 <bits name="" above="1" width="0.15"/>
1711 <bits name="" above="1" width="0.15"/>
1712 <bits name="A" above="1" width="0.15"/>
1713 <bits name="" above="1" width="0.15"/>
1714 <bits name="" above="1" width="0.15"/>
1715 <bits name="" above="1" width="0.15"/>
1716 </header>
1717 </diagram>
1718
1719 <p>
1720 Unlabeled bits are reserved and must be transmitted as 0. The VXLAN
1721 GBP draft defines the other bits' meanings as:
1722 </p>
1723
1724 <dl>
1725 <dt><code>D</code> (Don't Learn)</dt>
1726 <dd>
1727 When set, this bit indicates that the egress tunnel endpoint must not
1728 learn the source address of the encapsulated frame.
1729 </dd>
1730
1731 <dt><code>A</code> (Applied)</dt>
1732 <dd>
1733 When set, indicates that the group policy has already been applied to
1734 this packet. Devices must not apply policies when the A bit is set.
1735 </dd>
1736 </dl>
1737 </field>
1738
1739 <h2>Geneve Fields</h2>
1740
1741 <p>
1742 These fields provide access to additional features in the Geneve
1743 tunneling protocol [Geneve]. Their names are somewhat generic in the
1744 hope that the same fields could be reused for other protocols in the
1745 future; for example, the NSH protocol [NSH] supports TLV options whose
1746 form is identical to that for Geneve options.
1747 </p>
1748
1749 <field id="MFF_TUN_METADATA0" title="Generic Tunnel Option 0">
1750 <p>
1751 The above information specifically covers generic tunnel option 0, but
1752 Open vSwitch supports 64 options, numbered 0 through 63, whose
1753 NXM field numbers are 40 through 103.
1754 </p>
1755
1756 <p>
1757 These fields provide OpenFlow access to the generic type-length-value
1758 options defined by the Geneve tunneling protocol or other protocols
1759 with options in the same TLV format as Geneve options. Each of these
1760 options has the following wire format:
1761 </p>
1762
1763 <diagram>
1764 <header name="header">
1765 <bits name="class" above="16" width="0.6"/>
1766 <bits name="type" above="8" width="0.5"/>
1767 <bits name="res" above="3" below="0" width="0.25"/>
1768 <bits name="length" above="5" width="0.4"/>
1769 </header>
1770 <nospace/>
1771 <header name="body">
1772 <bits name="value" above="4×length bytes" width="1.7"/>
1773 </header>
1774 </diagram>
1775
1776 <p>
1777 Taken together, the <code>class</code> and <code>type</code> in the
1778 option format mean that there are about 16 million distinct kinds of
1779 TLV options, too many to give individual OXM code points. Thus, Open
1780 vSwitch requires the user to define the TLV options of interest, by
1781 binding up to 64 TLV options to generic tunnel option NXM code points.
1782 Each option may have up to 124 bytes in its body, the maximum allowed
1783 by the TLV format, but bound options may total at most 252 bytes of
1784 body.
1785 </p>
1786
1787 <p>
1788 Open vSwitch extensions to the OpenFlow protocol bind TLV options to
1789 NXM code points. The <code>ovs-ofctl</code>(8) program offers one way
1790 to use these extensions, e.g. to configure a mapping from a TLV option
1791 with <code>class</code> <code>0xffff</code>, <code>type</code>
1792 <code>0</code>, and a body length of 4 bytes:
1793 </p>
1794
1795 <pre>
1796 ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"
1797 </pre>
1798
1799 <p>
1800 Once a TLV option is properly bound, it can be accessed and modified
1801 like any other field, e.g. to send packets that have value 1234 for the
1802 option described above to the controller:
1803 </p>
1804
1805 <pre>
1806 ovs-ofctl add-flow br0 tun_metadata0=1234,actions=controller
1807 </pre>
1808
1809 <p>
1810 An option not received or not bound is matched as all zeros.
1811 </p>
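<p>
As a rough illustration of the wire format, the following Python sketch
packs a single TLV option. It assumes, as in the Geneve specification,
that <code>length</code> counts the body in 4-byte units, and that a field
value is carried in the body as a big-endian integer. The function name
<code>pack_tlv_option</code> is hypothetical and not part of Open vSwitch.
</p>

```python
import struct

MAX_BODY_BYTES = 31 * 4  # 5-bit length field counting 4-byte units


def pack_tlv_option(opt_class, opt_type, body):
    """Return the wire bytes of one TLV option: 4-byte header plus body."""
    if len(body) % 4 != 0 or len(body) > MAX_BODY_BYTES:
        raise ValueError("body must be a multiple of 4 bytes, at most 124")
    length_units = len(body) // 4
    # The 3 reserved bits are zero, so the last header byte is just the length.
    header = struct.pack("!HBB", opt_class, opt_type, length_units)
    return header + body


# The option bound above: class 0xffff, type 0, 4-byte body holding 1234.
option = pack_tlv_option(0xFFFF, 0, (1234).to_bytes(4, "big"))
print(option.hex())  # ffff0001000004d2
```

<p>
The same arithmetic gives the limits quoted above: 16 + 8 header bits
identify 2**24 (about 16 million) kinds of option, and 31 units of 4 bytes
give a 124-byte maximum body.
</p>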
1812 </field>
1813 <!--- XXX need a way to define a range of OXMs -->
1814 <field id="MFF_TUN_METADATA1" title="Generic Tunnel Option 1" hidden="yes"/>
1815 <field id="MFF_TUN_METADATA2" title="Generic Tunnel Option 2" hidden="yes"/>
1816 <field id="MFF_TUN_METADATA3" title="Generic Tunnel Option 3" hidden="yes"/>
1817 <field id="MFF_TUN_METADATA4" title="Generic Tunnel Option 4" hidden="yes"/>
1818 <field id="MFF_TUN_METADATA5" title="Generic Tunnel Option 5" hidden="yes"/>
1819 <field id="MFF_TUN_METADATA6" title="Generic Tunnel Option 6" hidden="yes"/>
1820 <field id="MFF_TUN_METADATA7" title="Generic Tunnel Option 7" hidden="yes"/>
1821 <field id="MFF_TUN_METADATA8" title="Generic Tunnel Option 8" hidden="yes"/>
1822 <field id="MFF_TUN_METADATA9" title="Generic Tunnel Option 9" hidden="yes"/>
1823 <field id="MFF_TUN_METADATA10" title="Generic Tunnel Option 10" hidden="yes"/>
1824 <field id="MFF_TUN_METADATA11" title="Generic Tunnel Option 11" hidden="yes"/>
1825 <field id="MFF_TUN_METADATA12" title="Generic Tunnel Option 12" hidden="yes"/>
1826 <field id="MFF_TUN_METADATA13" title="Generic Tunnel Option 13" hidden="yes"/>
1827 <field id="MFF_TUN_METADATA14" title="Generic Tunnel Option 14" hidden="yes"/>
1828 <field id="MFF_TUN_METADATA15" title="Generic Tunnel Option 15" hidden="yes"/>
1829 <field id="MFF_TUN_METADATA16" title="Generic Tunnel Option 16" hidden="yes"/>
1830 <field id="MFF_TUN_METADATA17" title="Generic Tunnel Option 17" hidden="yes"/>
1831 <field id="MFF_TUN_METADATA18" title="Generic Tunnel Option 18" hidden="yes"/>
1832 <field id="MFF_TUN_METADATA19" title="Generic Tunnel Option 19" hidden="yes"/>
1833 <field id="MFF_TUN_METADATA20" title="Generic Tunnel Option 20" hidden="yes"/>
1834 <field id="MFF_TUN_METADATA21" title="Generic Tunnel Option 21" hidden="yes"/>
1835 <field id="MFF_TUN_METADATA22" title="Generic Tunnel Option 22" hidden="yes"/>
1836 <field id="MFF_TUN_METADATA23" title="Generic Tunnel Option 23" hidden="yes"/>
1837 <field id="MFF_TUN_METADATA24" title="Generic Tunnel Option 24" hidden="yes"/>
1838 <field id="MFF_TUN_METADATA25" title="Generic Tunnel Option 25" hidden="yes"/>
1839 <field id="MFF_TUN_METADATA26" title="Generic Tunnel Option 26" hidden="yes"/>
1840 <field id="MFF_TUN_METADATA27" title="Generic Tunnel Option 27" hidden="yes"/>
1841 <field id="MFF_TUN_METADATA28" title="Generic Tunnel Option 28" hidden="yes"/>
1842 <field id="MFF_TUN_METADATA29" title="Generic Tunnel Option 29" hidden="yes"/>
1843 <field id="MFF_TUN_METADATA30" title="Generic Tunnel Option 30" hidden="yes"/>
1844 <field id="MFF_TUN_METADATA31" title="Generic Tunnel Option 31" hidden="yes"/>
1845 <field id="MFF_TUN_METADATA32" title="Generic Tunnel Option 32" hidden="yes"/>
1846 <field id="MFF_TUN_METADATA33" title="Generic Tunnel Option 33" hidden="yes"/>
1847 <field id="MFF_TUN_METADATA34" title="Generic Tunnel Option 34" hidden="yes"/>
1848 <field id="MFF_TUN_METADATA35" title="Generic Tunnel Option 35" hidden="yes"/>
1849 <field id="MFF_TUN_METADATA36" title="Generic Tunnel Option 36" hidden="yes"/>
1850 <field id="MFF_TUN_METADATA37" title="Generic Tunnel Option 37" hidden="yes"/>
1851 <field id="MFF_TUN_METADATA38" title="Generic Tunnel Option 38" hidden="yes"/>
1852 <field id="MFF_TUN_METADATA39" title="Generic Tunnel Option 39" hidden="yes"/>
1853 <field id="MFF_TUN_METADATA40" title="Generic Tunnel Option 40" hidden="yes"/>
1854 <field id="MFF_TUN_METADATA41" title="Generic Tunnel Option 41" hidden="yes"/>
1855 <field id="MFF_TUN_METADATA42" title="Generic Tunnel Option 42" hidden="yes"/>
1856 <field id="MFF_TUN_METADATA43" title="Generic Tunnel Option 43" hidden="yes"/>
1857 <field id="MFF_TUN_METADATA44" title="Generic Tunnel Option 44" hidden="yes"/>
1858 <field id="MFF_TUN_METADATA45" title="Generic Tunnel Option 45" hidden="yes"/>
1859 <field id="MFF_TUN_METADATA46" title="Generic Tunnel Option 46" hidden="yes"/>
1860 <field id="MFF_TUN_METADATA47" title="Generic Tunnel Option 47" hidden="yes"/>
1861 <field id="MFF_TUN_METADATA48" title="Generic Tunnel Option 48" hidden="yes"/>
1862 <field id="MFF_TUN_METADATA49" title="Generic Tunnel Option 49" hidden="yes"/>
1863 <field id="MFF_TUN_METADATA50" title="Generic Tunnel Option 50" hidden="yes"/>
1864 <field id="MFF_TUN_METADATA51" title="Generic Tunnel Option 51" hidden="yes"/>
1865 <field id="MFF_TUN_METADATA52" title="Generic Tunnel Option 52" hidden="yes"/>
1866 <field id="MFF_TUN_METADATA53" title="Generic Tunnel Option 53" hidden="yes"/>
1867 <field id="MFF_TUN_METADATA54" title="Generic Tunnel Option 54" hidden="yes"/>
1868 <field id="MFF_TUN_METADATA55" title="Generic Tunnel Option 55" hidden="yes"/>
1869 <field id="MFF_TUN_METADATA56" title="Generic Tunnel Option 56" hidden="yes"/>
1870 <field id="MFF_TUN_METADATA57" title="Generic Tunnel Option 57" hidden="yes"/>
1871 <field id="MFF_TUN_METADATA58" title="Generic Tunnel Option 58" hidden="yes"/>
1872 <field id="MFF_TUN_METADATA59" title="Generic Tunnel Option 59" hidden="yes"/>
1873 <field id="MFF_TUN_METADATA60" title="Generic Tunnel Option 60" hidden="yes"/>
1874 <field id="MFF_TUN_METADATA61" title="Generic Tunnel Option 61" hidden="yes"/>
1875 <field id="MFF_TUN_METADATA62" title="Generic Tunnel Option 62" hidden="yes"/>
1876 <field id="MFF_TUN_METADATA63" title="Generic Tunnel Option 63" hidden="yes"/>
1877
1878 <field id="MFF_TUN_FLAGS" title="Tunnel Flags">
1879 <p>
1880 Flags indicating various aspects of the tunnel encapsulation.
1881 </p>
1882
1883 <p>
1884 Matches on this field are most conveniently written in terms of
1885 symbolic names (given in the diagram below), each preceded by either
1886 <code>+</code> for a flag that must be set, or <code>-</code> for a
1887 flag that must be unset, without any other delimiters between the
1888 flags. Flags not mentioned are wildcarded. For example,
1889 <code>tun_flags=+oam</code> matches only OAM packets. Matches can also
1890 be written as <code><var>flags</var>/<var>mask</var></code>, where
1891 <var>flags</var> and <var>mask</var> are 16-bit numbers in decimal or
1892 in hexadecimal prefixed by <code>0x</code>.
1893 </p>
1894
1895 <p>
1896 Currently, only one flag is defined:
1897 </p>
1898
1899 <dl>
1900 <dt><code>oam</code></dt>
1901 <dd>
1902 The tunnel protocol indicated that this is an OAM (Operations,
1903 Administration, and Maintenance) control packet.
1904 </dd>
1905 </dl>
1906
1907 <p>
1908 The switch may reject matches against unknown flags.
1909 </p>
1910
1911 <p>
1912 Newer versions of Open vSwitch may introduce additional flags with new
1913 meanings. Exact matches on this field are therefore not recommended:
1914 the meaning of such flags is unknown in advance, so flows should leave
1915 them wildcarded.
1916 </p>
1917
1918 <p>
1919 For non-tunneled packets, the value is 0.
1920 </p>
1921 </field>
1922
1923 <!-- Open vSwitch uses the following fields internally, but it
1924 does not expose them to the user via OpenFlow, so we do not
1925 document them. -->
1926 <field id="MFF_TUN_TTL" title="Tunnel IPv4 Time-to-Live" internal="yes"/>
1927 <field id="MFF_TUN_TOS" title="Tunnel IPv4 Type of Service" internal="yes"/>
1928 </group>
1929
1930 <group title="Metadata">
1931 <p>
1932 These fields relate to the origin or treatment of a packet, but
1933 they are not extracted from the packet data itself.
1934 </p>
1935
1936 <field id="MFF_IN_PORT" title="Ingress Port">
1937 <p>
1938 The OpenFlow port on which the packet being processed arrived.
1939 This is a 16-bit field that holds an OpenFlow 1.0 port number.
1940 For receiving a packet, the only values that appear in this
1941 field are:
1942 </p>
1943
1944 <dl>
1945 <dt>1 through <code>0xfeff</code> (65,279), inclusive.</dt>
1946 <dd>
1947 Conventional OpenFlow port numbers.
1948 </dd>
1949
1950 <dt><code>OFPP_LOCAL</code> (<code>0xfffe</code> or 65,534).</dt>
1951 <dd>
1952 <p>
1953 The ``local'' port, which in Open vSwitch is always named
1954 the same as the bridge itself. This represents a
1955 connection between the switch and the local TCP/IP stack.
1956 This port is where an IP address is most commonly
1957 configured on an Open vSwitch switch.
1958 </p>
1959
1960 <p>
1961 OpenFlow does not require a switch to have a local port,
1962 but all existing versions of Open vSwitch have always
1963 included a local port. <b>Future Directions:</b> Future
1964 versions of Open vSwitch might be able to optionally omit
1965 the local port, if someone submits code to implement such
1966 a feature.
1967 </p>
1968 </dd>
1969
1970 <dt><code>OFPP_NONE</code> (OpenFlow 1.0) or <code>OFPP_ANY</code> (OpenFlow 1.1+) (<code>0xffff</code> or 65,535).</dt>
1971 <dt><code>OFPP_CONTROLLER</code> (<code>0xfffd</code> or 65,533).</dt>
1972 <dd>
1973 <p>
1974 When a controller injects a packet into an OpenFlow switch
1975 with a ``packet-out'' request, it can specify one of these
1976 ingress ports to indicate that the packet was generated
1977 internally rather than having been received on some port.
1978 </p>
1979
1980 <p>
1981 OpenFlow 1.0 specified <code>OFPP_NONE</code> for this
1982 purpose. Despite that, some controllers used
1983 <code>OFPP_CONTROLLER</code>, and some switches only
1984 accepted <code>OFPP_CONTROLLER</code>, so OpenFlow 1.0.2
1985 required support for both ports. OpenFlow 1.1 and later
1986 were more clearly drafted to allow only
1987 <code>OFPP_CONTROLLER</code>. For maximum compatibility,
1988 Open vSwitch allows both ports with all OpenFlow versions.
1989 </p>
1990 </dd>
1991 </dl>
1992
1993 <p>
1994 Values not mentioned above will never appear when receiving a
1995 packet, including the following notable values:
1996 </p>
1997
1998 <dl>
1999 <dt>0</dt>
2000 <dd>
2001 Zero is not a valid OpenFlow port number.
2002 </dd>
2003
2004 <dt><code>OFPP_MAX</code> (<code>0xff00</code> or 65,280).</dt>
2005 <dd>
2006 This value has only been clearly specified as a valid port
2007 number as of OpenFlow 1.3.3. Before that, its status was
2008 unclear, and so Open vSwitch has never allowed
2009 <code>OFPP_MAX</code> to be used as a port number, so
2010 packets will never be received on this port. (Other
2011 OpenFlow switches, of course, might use it.)
2012 </dd>
2013
2014 <dt><code>OFPP_UNSET</code> (<code>0xfff7</code> or 65,527)</dt>
2015 <dt><code>OFPP_IN_PORT</code> (<code>0xfff8</code> or 65,528)</dt>
2016 <dt><code>OFPP_TABLE</code> (<code>0xfff9</code> or 65,529)</dt>
2017 <dt><code>OFPP_NORMAL</code> (<code>0xfffa</code> or 65,530)</dt>
2018 <dt><code>OFPP_FLOOD</code> (<code>0xfffb</code> or 65,531)</dt>
2019 <dt><code>OFPP_ALL</code> (<code>0xfffc</code> or 65,532)</dt>
2020 <dd>
2021 <p>
2022 These port numbers are used only in output actions and never
2023 appear as ingress ports.
2024 </p>
2025
2026 <p>
2027 Most of these port numbers were defined in OpenFlow 1.0, but
2028 <code>OFPP_UNSET</code> was only introduced in OpenFlow 1.5.
2029 </p>
2030 </dd>
2031 </dl>
2032
2033 <p>
2034 Values that will never appear when receiving a packet may
2035 still be matched against in the flow table. There are still
2036 circumstances in which those flows can be matched:
2037 </p>
2038
2039 <ul>
2040 <li>
2041 The <code>resubmit</code> Open vSwitch extension action allows a
2042 flow table lookup with an arbitrary ingress port.
2043 </li>
2044
2045 <li>
2046 An action that modifies the ingress port field (see below),
2047 such as <code>load</code> or <code>set_field</code>,
2048 followed by an action or instruction that performs another
2049 flow table lookup, such as <code>resubmit</code> or
2050 <code>goto_table</code>.
2051 </li>
2052 </ul>
2053
2054 <p>
2055 This field is heavily used for matching in OpenFlow tables,
2056 but for packet egress, it has only very limited roles:
2057 </p>
2058
2059 <ul>
2060 <li>
2061 <p>
2062 OpenFlow requires suppressing output actions to <ref
2063 field="in_port"/>. That is, the following two flows both drop all
2064 packets that arrive on port 1:
2065 </p>
2066
2067 <pre>
2068 in_port=1,actions=1
2069 in_port=1,actions=drop
2070 </pre>
2071
2072 <p>
2073 (This behavior is occasionally useful for flooding to a
2074 subset of ports. Specifying <code>actions=1,2,3,4</code>,
2075 for example, outputs to ports 1, 2, 3, and 4, omitting the
2076 ingress port.)
2077 </p>
2078 </li>
2079
2080 <li>
2081 OpenFlow has a special port <code>OFPP_IN_PORT</code> (with
2082 value 0xfff8) that outputs to the ingress port. For example,
2083 in a switch that has four ports numbered 1 through 4,
2084 <code>actions=1,2,3,4,in_port</code> outputs to ports 1, 2,
2085 3, and 4, including the ingress port.
2086 </li>
2087 </ul>
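<p>
The two egress rules above can be modeled in a few lines of Python. This
is only a sketch of the semantics described here, not Open vSwitch code.
The function name <code>egress_ports</code> and the use of the string
<code>"in_port"</code> to stand for <code>OFPP_IN_PORT</code> are
assumptions made for illustration.
</p>

```python
def egress_ports(in_port, outputs):
    """Return the ports a packet is actually sent to, in action order.

    `outputs` lists port numbers, plus the string "in_port" for the
    special OFPP_IN_PORT port.
    """
    sent = []
    for out in outputs:
        if out == "in_port":
            sent.append(in_port)  # explicit request to send back out
        elif out != in_port:
            sent.append(out)      # ordinary output to another port
        # An ordinary output to the ingress port is suppressed.
    return sent


print(egress_ports(1, [1]))                      # []
print(egress_ports(2, [1, 2, 3, 4]))             # [1, 3, 4]
print(egress_ports(2, [1, 2, 3, 4, "in_port"]))  # [1, 3, 4, 2]
```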
2088
2089 <p>
2090 Because the ingress port field has so little influence on packet
2091 processing, it does not ordinarily make sense to modify the
2092 ingress port field. The field is writable only to support the
2093 occasional use case where the ingress port's roles in packet
2094 egress, described above, become troublesome. For example,
2095 <code>actions=load:0-&gt;NXM_OF_IN_PORT[],output:123</code>
2096 will output to port 123 regardless of whether it is the
2097 ingress port. If the ingress port is important, then one may save
2098 and restore it on the stack:
2099 </p>
2100
2101 <pre>
2102 actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],output:123,pop:NXM_OF_IN_PORT[]
2103 </pre>
2104
2105 <p>
2106 or, in Open vSwitch 2.7 or later, use the <code>clone</code> action to
2107 save and restore it:
2108 </p>
2109
2110 <pre>
2111 actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
2112 </pre>
2113
2114 <p>
2115 The ability to modify the ingress port is an Open vSwitch
2116 extension to OpenFlow.
2117 </p>
2118 </field>
2119
2120 <field id="MFF_IN_PORT_OXM" title="OXM Ingress Port">
2121 <p>
2122 OpenFlow 1.1 and later use a 32-bit port number, so this field
2123 supplies a 32-bit view of the ingress port. Current versions of
2124 Open vSwitch support only a 16-bit range of ports:
2125 </p>
2126
2127 <ul>
2128 <li>
2129 OpenFlow 1.0 ports <code>0x0000</code> to
2130 <code>0xfeff</code>, inclusive, map to OpenFlow 1.1
2131 port numbers with the same values.
2132 </li>
2133
2134 <li>
2135 OpenFlow 1.0 ports <code>0xff00</code> to
2136 <code>0xffff</code>, inclusive, map to OpenFlow 1.1 port
2137 numbers <code>0xffffff00</code> to <code>0xffffffff</code>.
2138 </li>
2139
2140 <li>
2141 OpenFlow 1.1 ports <code>0x0000ff00</code> to
2142 <code>0xfffffeff</code> are not mapped and not supported.
2143 </li>
2144 </ul>
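<p>
The mapping in the list above amounts to shifting the reserved range by a
constant. The following Python sketch illustrates it. The function and
constant names are hypothetical, and the code is a model of the rules as
stated here rather than the Open vSwitch implementation.
</p>

```python
FIRST_RESERVED = 0xFF00       # start of the reserved 16-bit range
RESERVED_OFFSET = 0xFFFF0000  # distance between the two reserved ranges


def port_10_to_11(port):
    """Map a 16-bit OpenFlow 1.0 port number to its 32-bit equivalent."""
    if port not in range(0x10000):
        raise ValueError("not a 16-bit port number")
    return port + RESERVED_OFFSET if port >= FIRST_RESERVED else port


def port_11_to_10(port):
    """Map a 32-bit port number back to 16 bits, or raise if unsupported."""
    if port in range(FIRST_RESERVED):
        return port
    if port in range(0xFFFFFF00, 0x100000000):
        return port - RESERVED_OFFSET
    raise ValueError("port is not mapped and not supported")


print(hex(port_10_to_11(0x0001)))      # 0x1
print(hex(port_10_to_11(0xFFFE)))      # 0xfffffffe (OFPP_LOCAL)
print(hex(port_11_to_10(0xFFFFFFFD)))  # 0xfffd (OFPP_CONTROLLER)
```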
2145
2146 <p>
2147 <ref field="in_port"/> and <ref field="in_port_oxm"/> are two views of
2148 the same information, so all of the comments on <ref field="in_port"/>
2149 apply to <ref field="in_port_oxm"/> too. Modifying <ref
2150 field="in_port"/> changes <ref field="in_port_oxm"/>, and vice versa.
2151 </p>
2152
2153 <p>
2154 Setting <ref field="in_port_oxm"/> to an unsupported value yields
2155 unspecified behavior.
2156 </p>
2157 </field>
2158
2159 <field id="MFF_SKB_PRIORITY" title="Output Queue">
2160 <p>
2161 <b>Future Directions:</b> Open vSwitch implements the output queue as a
2162 field, but does not currently expose it through OXM or NXM for matching
2163 purposes. If this turns out to be a useful feature, it could be
2164 implemented in future versions. Only the <code>set_queue</code>,
2165 <code>enqueue</code>, and <code>pop_queue</code> actions currently
2166 influence the output queue.
2167 </p>
2168
2169 <p>
2170 This field influences how packets in the flow will be queued,
2171 for quality of service (QoS) purposes, when they egress the
2172 switch. Its range of meaningful values, and their meanings,
2173 varies greatly from one OpenFlow implementation to another.
2174 Even within a single implementation, there is no guarantee
2175 that all OpenFlow ports have the same queues configured or
2176 that all OpenFlow ports in an implementation can be configured
2177 the same way queue-wise.
2178 </p>
2179
2180 <p>
2181 Configuring queues on OpenFlow is not well standardized. On
2182 Linux, Open vSwitch supports queue configuration via OVSDB,
2183 specifically the <code>QoS</code> and <code>Queue</code>
2184 tables (see <code>ovs-vswitchd.conf.db(5)</code> for details).
2185 Ports of Open vSwitch to other platforms might require queue
2186 configuration through some separate protocol (such as a CLI).
2187 Even on Linux, Open vSwitch exposes only a fraction of the
2188 kernel's queuing features through OVSDB, so advanced or
2189 unusual uses might require use of separate utilities
2190 (e.g. <code>tc</code>). OpenFlow switches other than Open
2191 vSwitch might use OF-CONFIG or any of the configuration
2192 methods mentioned above. Finally, some OpenFlow switches have
2193 a fixed number of fixed-function queues (e.g. eight queues
2194 with strictly defined priorities) and others do not support
2195 any control over queuing.
2196 </p>
2197
2198 <p>
2199 The only output queue that all OpenFlow implementations must
2200 support is zero, to identify a default queue, whose properties
2201 are implementation-defined. Outputting a packet to a queue
2202 that does not exist on the output port yields unpredictable
2203 behavior: among the possibilities are that the packet might be
2204 dropped or transmitted with a very high or very low priority.
2205 </p>
2206
2207 <p>
2208 OpenFlow 1.0 only allowed output queues to be specified as part of an
2209 <code>enqueue</code> action that specified both a queue and an output
2210 port. That is, OpenFlow 1.0 treats the queue as an argument to an
2211 action, not as a field.
2212 </p>
2213
2214 <p>
2215 To increase flexibility, OpenFlow 1.1 added an action to set the output
2216 queue. This model was carried forward, without change, through
2217 OpenFlow 1.5.
2218 </p>
2219
2220 <p>
2221 Open vSwitch implements the native queuing model of each
2222 OpenFlow version it supports. Open vSwitch also includes an
2223 extension for setting the output queue as an action in
2224 OpenFlow 1.0.
2225 </p>
2226
2227 <p>
2228 When a packet ingresses into an OpenFlow switch, the output
2229 queue is ordinarily set to 0, indicating the default queue.
2230 However, Open vSwitch supports various ways to forward a
2231 packet from one OpenFlow switch to another within a single
2232 host. In these cases, Open vSwitch maintains the output queue
2233 across the forwarding step. For example:
2234 </p>
2235
2236 <ul>
2237 <li>
2238 A hop across an Open vSwitch ``patch port'' (which does not
2239 actually involve queuing) preserves the output queue.
2240 </li>
2241
2242 <li>
2243 <p>
2244 When a flow sets the output queue then outputs to an
2245 OpenFlow tunnel port, the encapsulation preserves the
2246 output queue. If the kernel TCP/IP stack routes the
2247 encapsulated packet directly to a physical interface, then
2248 that output honors the output queue. Alternatively, if
2249 the kernel routes the encapsulated packet to another Open
2250 vSwitch bridge, then the output queue set previously
2251 becomes the initial output queue on ingress to the second
2252 bridge and will thus be used for further output actions
2253 (unless overridden by a new ``set queue'' action).
2254 </p>
2255
2256 <p>
2257 (This description reflects the current behavior of Open
2258 vSwitch on Linux. This behavior relies on details of the
2259 Linux TCP/IP stack. It could be difficult to make ports
2260 to other operating systems behave the same way.)
2261 </p>
2262 </li>
2263 </ul>
2264 </field>
2265
2266 <field id="MFF_PKT_MARK" title="Packet Mark">
2267 <p>
2268 Packet mark comes to Open vSwitch from the Linux kernel, in
2269 which the <code>sk_buff</code> data structure that represents
2270 a packet contains a 32-bit member named <code>skb_mark</code>.
2271 The value of <code>skb_mark</code> propagates along with the
2272 packet it accompanies wherever the packet goes in the kernel.
2273 It has no predefined semantics but various kernel-user
2274 interfaces can set and match on it, which makes it suitable
2275 for ``marking'' packets at one point in their handling and
2276 then acting on the mark later. With <code>iptables</code>,
2277 for example, one can mark some traffic specially at ingress
2278 and then handle that traffic differently at egress based on
2279 the marked value.
2280 </p>
2281
2282 <p>
2283 Packet mark is an attempt at a generalization of the
2284 <code>skb_mark</code> concept beyond Linux, at least through more
2285 generic naming. Like <ref field="skb_priority"/>, packet mark is
2286 preserved across forwarding steps within a machine. Unlike <ref
2287 field="skb_priority"/>, packet mark has no direct effect on packet
2288 forwarding: the value set in packet mark does not matter unless some
2289 later OpenFlow table or switch matches on packet mark, or unless the
2290 packet passes through some other kernel subsystem that has been
2291 configured to interpret packet mark in specific ways, e.g. through
2292 <code>iptables</code> configuration mentioned above.
2293 </p>
2294
2295 <p>
2296 Preserving packet mark across kernel forwarding steps relies
2297 heavily on kernel support, which ports to non-Linux operating
2298 systems may not have. Regardless of operating system support,
2299 Open vSwitch supports packet mark within a single bridge and
2300 across patch ports.
2301 </p>
2302
2303 <p>
2304 The value of packet mark when a packet ingresses into the
2305 first Open vSwitch bridge is typically zero, but it could be
2306 nonzero if its value was previously set by some kernel
2307 subsystem.
2308 </p>
2309 </field>
2310
2311 <field id="MFF_ACTSET_OUTPUT" title="Action Set Output Port">
2312 <p>
2313 Holds the output port currently in the OpenFlow action set (i.e. from
2314 an <code>output</code> action within a <code>write_actions</code>
2315 instruction). Its value is an OpenFlow port number. If there is no
2316 output port in the OpenFlow action set, or if the output port will be
2317 ignored (e.g. because there is an output group in the OpenFlow action
2318 set), then the value will be <code>OFPP_UNSET</code>.
2319 </p>
2320
2321 <p>
2322 Open vSwitch allows any table to match this field. OpenFlow, however,
2323 only requires this field to be matchable from within an OpenFlow egress
2324 table (a feature that Open vSwitch does not yet implement).
2325 </p>
2326 </field>
2327
2328 <field id="MFF_DP_HASH" title="Datapath Hash" internal="yes"/>
2329 <field id="MFF_RECIRC_ID" title="Datapath Recirculation ID" internal="yes"/>
2330
2331 <field id="MFF_PACKET_TYPE" title="Packet Type">
2332 <p>
2333 The type of the packet in the format specified in OpenFlow 1.5:
2334 </p>
2335
2336 <diagram>
2337 <header name="Packet type">
2338 <bits name="ns" above="16" width=".75"/>
2339 <bits name="ns_type" above="16" width=".75"/>
2340 </header>
2341 <dots/>
2342 </diagram>
2343
2344 <p>
2345 The upper 16 bits, <var>ns</var>, are a namespace. The meaning of
2346 <var>ns_type</var> depends on the namespace. The packet type field is
2347 specified and displayed in the format
2348 <code>(<var>ns</var>,<var>ns_type</var>)</code>.
2349 </p>
2350
2351 <p>
2352 Open vSwitch currently supports the following classes of packet types
2353 for matching:
2354 <dl>
2355 <dt><code>(0,0)</code></dt>
2356 <dd>Ethernet.</dd>
2357 <dt><code>(1,<var>ethertype</var>)</code></dt>
2358 <dd>
2359 <p>
2360 The specified <var>ethertype</var>. Open vSwitch can forward
2361 packets with any <var>ethertype</var>, but it can only match on
2362 and process data fields for the following supported packet types:
2363 </p>
2364 <dl>
2365 <dt><code>(1,0x800)</code></dt> <dd>IPv4</dd>
2366 <dt><code>(1,0x806)</code></dt> <dd>ARP</dd>
2367 <dt><code>(1,0x86dd)</code></dt> <dd>IPv6</dd>
2368 <dt><code>(1,0x8847)</code></dt> <dd>MPLS</dd>
2369 <dt><code>(1,0x8848)</code></dt> <dd>MPLS multicast</dd>
2370 <dt><code>(1,0x8035)</code></dt> <dd>RARP</dd>
2371 <dt><code>(1,0x894f)</code></dt> <dd>NSH</dd>
2372 </dl>
2373 </dd>
2374 </dl>
2375 </p>
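<p>
To make the layout concrete, the following Python sketch converts between
an <code>(<var>ns</var>,<var>ns_type</var>)</code> pair and the 32-bit
field value, with <var>ns</var> in the upper 16 bits as described above.
The helper names are hypothetical, and the text rendering merely imitates
the notation used in this document.
</p>

```python
import struct


def pack_packet_type(ns, ns_type):
    """Return the 32-bit packet_type value for a (ns, ns_type) pair."""
    return int.from_bytes(struct.pack("!HH", ns, ns_type), "big")


def format_packet_type(value):
    """Render a 32-bit packet_type value as "(ns,ns_type)"."""
    ns, ns_type = divmod(value, 0x10000)
    shown = "0" if ns_type == 0 else hex(ns_type)
    return "({},{})".format(ns, shown)


ethernet = pack_packet_type(0, 0)
bare_ipv4 = pack_packet_type(1, 0x800)
print(hex(bare_ipv4))                 # 0x10800
print(format_packet_type(ethernet))   # (0,0)
print(format_packet_type(bare_ipv4))  # (1,0x800)
```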
2376
2377 <p>
2378 Consider the distinction between a packet with <code>packet_type=(0,0),
2379 dl_type=0x800</code> and one with <code>packet_type=(1,0x800)</code>.
2380 The former is an Ethernet frame that contains an IPv4 packet, like
2381 this:
2382 </p>
2383
2384 <diagram>
2385 <header name="Ethernet">
2386 <bits name="dst" above="48" width="0.4"/>
2387 <bits name="src" above="48" width="0.4"/>
2388 <bits name="type" above="16" below="0x800" width="0.4"/>
2389 </header>
2390 <header name="IPv4">
2391 <bits name="..." width="0.4"/>
2392 <bits name="proto" above="8" width="0.4"/>
2393 <bits name="src" above="32" width="0.4"/>
2394 <bits name="dst" above="32" width="0.4"/>
2395 </header>
2396 <dots/>
2397 </diagram>
2398
2399 <p>
2400 The latter is an IPv4 packet not encapsulated inside any outer frame,
2401 like this:
2402 </p>
2403
2404 <diagram>
2405 <header name="IPv4">
2406 <bits name="..." width="0.4"/>
2407 <bits name="proto" above="8" width="0.4"/>
2408 <bits name="src" above="32" width="0.4"/>
2409 <bits name="dst" above="32" width="0.4"/>
2410 </header>
2411 <dots/>
2412 </diagram>
2413
2414 <p>
2415 Matching on <ref field="packet_type"/> is a prerequisite for matching
2416 on any data field, but for backward compatibility, when a match on a
2417 data field is present without a <ref field="packet_type"/> match, Open
2418 vSwitch acts as though a match on <code>(0,0)</code> (Ethernet) had
2419 been supplied. Similarly, when Open vSwitch sends flow match
2420 information to a controller, e.g. in a reply to a request to dump the
2421 flow table, Open vSwitch omits a match on packet type (0,0) if it would
2422 be implied by a data field match.
2423 </p>
2424 </field>
2425
2426 </group>
2427
2428 <group title="Connection Tracking">
2429 <p>
2430 Open vSwitch 2.5 and later support ``connection tracking,'' which allows
2431 bidirectional streams of packets to be statefully grouped into
2432 connections. Open vSwitch connection tracking, for example, identifies
2433 the patterns of TCP packets that indicate a successfully initiated
2434 connection, as well as those that indicate that a connection has been
2435 torn down. Open vSwitch connection tracking can also identify related
2436 connections, such as FTP data connections spawned from FTP control
2437 connections.
2438 </p>
2439
2440 <p>
2441 An individual packet passing through the pipeline may be in one of two
2442 states, ``untracked'' or ``tracked,'' which may be distinguished via the
2443 ``trk'' flag in <ref field="ct_state"/>. A packet is
2444 <dfn>untracked</dfn> at the beginning of the Open vSwitch pipeline and
2445 continues to be untracked until the pipeline invokes the <code>ct</code>
2446 action. The connection tracking fields are all zeroes in an untracked
2447 packet. When a flow in the Open vSwitch pipeline invokes the
2448 <code>ct</code> action, the action initializes the connection tracking
2449 fields and the packet becomes <dfn>tracked</dfn> for the remainder of its
2450 processing.
2451 </p>
2452
2453 <p>
2454 The connection tracker stores connection state in an internal table, but
2455 it only adds a new entry to this table when the pipeline invokes the
2456 <code>ct</code> action with the <code>commit</code> parameter for a new
2457 connection. For a given connection, when a pipeline has executed
2458 <code>ct</code>, but not yet with <code>commit</code>, the connection is
2459 said to be <dfn>uncommitted</dfn>. State for an uncommitted connection
2460 is ephemeral and does not persist past the end of the pipeline, so some
2461 features are only available to committed connections. A connection would
2462 typically be left uncommitted as a way to drop its packets.
2463 </p>
2464
2465 <p>
2466 Connection tracking is an Open vSwitch extension to OpenFlow.
2467 </p>
2468
2469 <field id="MFF_CT_STATE" title="Connection Tracking State">
2470 <p>
2471 This field holds several flags that can be used to determine the state
2472 of the connection to which the packet belongs.
2473 </p>
2474
2475 <p>
2476 Matches on this field are most conveniently written in terms of
2477 symbolic names (listed below), each preceded by either <code>+</code>
2478 for a flag that must be set, or <code>-</code> for a flag that must be
2479 unset, without any other delimiters between the flags. Flags not
2480 mentioned are wildcarded. For example,
2481 <code>tcp,ct_state=+trk-new</code> matches TCP packets that have been
2482 run through the connection tracker and do not establish a new
2483 connection. Matches can also be written as
2484 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
2485 and <var>mask</var> are 32-bit numbers in decimal or in hexadecimal
2486 prefixed by <code>0x</code>.
2487 </p>
2488
2489 <p>
2490 The following flags are defined:
2491 </p>
2492
2493 <dl>
2494 <dt><code>new</code> (0x01)</dt>
2495 <dd>
2496 A new connection. Set to 1 if this is an uncommitted connection.
2497 </dd>
2498
2499 <dt><code>est</code> (0x02)</dt>
2500 <dd>
2501 Part of an existing connection. Set to 1 if this is a committed
2502 connection.
2503 </dd>
2504
2505 <dt><code>rel</code> (0x04)</dt>
2506 <dd>
2507 <p>
2508 Related to an existing connection, e.g. an ICMP ``destination
2509 unreachable'' message or an FTP data connection. This flag will
2510 only be 1 if the connection to which this one is related is
2511 committed.
2512 </p>
2513
2514 <p>
2515 Connections identified as <code>rel</code> are separate from the
2516 originating connection and must be committed separately. All
2517 packets for a related connection will have the <code>rel</code>
2518 flag set, not just the initial packet.
2519 </p>
2520 </dd>
2521
2522 <dt><code>rpl</code> (0x08)</dt>
2523 <dd>
2524 This packet is in the reply direction, meaning that it is in the
2525 opposite direction from the packet that initiated the connection.
2526 This flag will only be 1 if the connection is committed.
2527 </dd>
2528
2529 <dt><code>inv</code> (0x10)</dt>
2530 <dd>
2531 <p>
2532 The state is invalid, meaning that the connection tracker couldn't
2533 identify the connection. This flag is a catch-all for problems
2534 in the connection or the connection tracker, such as:
2535 </p>
2536
2537 <ul>
2538 <li>
2539 L3/L4 protocol handler is not loaded/unavailable. With the Linux
2540 kernel datapath, this may mean that the
2541 <code>nf_conntrack_ipv4</code> or <code>nf_conntrack_ipv6</code>
2542 modules are not loaded.
2543 </li>
2544
2545 <li>
2546 L3/L4 protocol handler determines that the packet is malformed.
2547 </li>
2548
2549 <li>
2550 The packet has an unexpected length for its protocol.
2551 </li>
2552 </ul>
2553 </dd>
2554
2555 <dt><code>trk</code> (0x20)</dt>
2556 <dd>
2557 This packet is tracked, meaning that it has previously traversed the
2558 connection tracker. If this flag is not set, then no other flags
2559 will be set. If this flag is set, then the packet is tracked and
2560 other flags may also be set.
2561 </dd>
2562
2563 <dt><code>snat</code> (0x40)</dt>
2564 <dd>
2565 This packet was transformed by source address/port translation by a
2566 preceding <code>ct</code> action. Open vSwitch 2.6 added this flag.
2567 </dd>
2568
2569 <dt><code>dnat</code> (0x80)</dt>
2570 <dd>
2571 This packet was transformed by destination address/port translation
2572 by a preceding <code>ct</code> action. Open vSwitch 2.6 added this
2573 flag.
2574 </dd>
2575 </dl>
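<p>
The symbolic and <code><var>flags</var>/<var>mask</var></code> notations
describe the same match. The Python sketch below converts one to the other
using the flag values listed above. It is only an illustration, not the
parser that Open vSwitch uses, and the names <code>CtState</code>,
<code>parse_ct_state</code>, and <code>ct_state_matches</code> are
hypothetical.
</p>

<pre><![CDATA[
```python
import re
from enum import IntFlag


class CtState(IntFlag):
    """ct_state flag bits, using the values listed above."""
    NEW = 0x01
    EST = 0x02
    REL = 0x04
    RPL = 0x08
    INV = 0x10
    TRK = 0x20
    SNAT = 0x40
    DNAT = 0x80


def parse_ct_state(spec):
    """Convert a symbolic match such as '+trk-new' into (flags, mask)."""
    if not re.fullmatch(r"(?:[+-][a-z]+)+", spec):
        raise ValueError("malformed ct_state match: %r" % spec)
    flags = mask = 0
    for sign, name in re.findall(r"([+-])([a-z]+)", spec):
        bit = int(CtState[name.upper()])  # KeyError for an unknown name
        mask |= bit          # every mentioned flag is significant
        if sign == "+":
            flags |= bit     # '+' requires the flag to be set
    return flags, mask


def ct_state_matches(ct_state, flags, mask):
    """Return True if ct_state agrees with flags on every bit in mask."""
    return (ct_state & mask) == flags
```
]]></pre>

<p>
Under these assumptions, <code>+trk-new</code> becomes flags
<code>0x20</code> with mask <code>0x21</code>; the flags that are not
mentioned stay out of the mask and are therefore wildcarded.
</p>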
2576
2577 <p>
2578 There are additional constraints on these flags, listed in decreasing
2579 order of precedence below:
2580 </p>
2581
2582 <ol>
2583 <li>
2584 If <code>trk</code> is unset, no other flags are set.
2585 </li>
2586
2587 <li>
2588 If <code>trk</code> is set, one or more other flags may be set.
2589 </li>
2590
2591 <li>
2592 If <code>inv</code> is set, only the <code>trk</code> flag is also
2593 set.
2594 </li>
2595
2596 <li>
2597 <code>new</code> and <code>est</code> are mutually exclusive.
2598 </li>
2599
2600 <li>
2601 <code>new</code> and <code>rpl</code> are mutually exclusive.
2602 </li>
2603
2604 <li>
2605 <code>rel</code> may be set in conjunction with any other flags.
2606 </li>
2607 </ol>
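<p>
The constraints above can be restated as a small consistency check. This
Python sketch is an illustration only: the function name is hypothetical,
and it encodes nothing beyond the rules as they are written in this list.
</p>

<pre><![CDATA[
```python
NEW, EST, REL, RPL, INV, TRK = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def ct_state_is_consistent(state):
    """Check a ct_state value against the constraints, in order."""
    if not state & TRK:
        # Untracked packets carry no other flags.
        return state == 0
    if state & INV:
        # An invalid state carries only trk alongside inv.
        return state == (TRK | INV)
    if (state & NEW) and (state & (EST | RPL)):
        # new excludes both est and rpl.
        return False
    # rel may accompany any remaining combination.
    return True
```
]]></pre>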
2608
2609 <p>
2610 Future versions of Open vSwitch may define new flags.
2611 </p>
2612 </field>
2613
2614 <field id="MFF_CT_ZONE" title="Connection Tracking Zone">
2615 A connection tracking zone, the zone value passed to the most recent
2616 <code>ct</code> action. Each zone is an independent connection tracking
2617 context, so tracking the same packet in multiple contexts requires using
2618 the <code>ct</code> action multiple times.
2619 </field>
2620
2621 <field id="MFF_CT_MARK" title="Connection Tracking Mark">
2622 The metadata committed, by an action within the <code>exec</code>
2623 parameter to the <code>ct</code> action, to the connection to which the
2624 current packet belongs.
2625 </field>
2626
2627 <field id="MFF_CT_LABEL" title="Connection Tracking Label">
2628 The label committed, by an action within the <code>exec</code>
2629 parameter to the <code>ct</code> action, to the connection to which the
2630 current packet belongs.
2631 </field>
2632
2633 <p>
2634 Open vSwitch 2.8 introduced matching support for the connection
2635 tracker's original direction 5-tuple fields.
2636 </p>
2637
2638 <p>
2639 For non-committed non-related connections the conntrack original
2640 direction tuple fields always have the same values as the
2641 corresponding headers in the packet itself. For any other packets of
2642 a committed connection the conntrack original direction tuple fields
2643 reflect the values from that initial non-committed non-related packet,
2644 and thus may be different from the actual packet headers, as the
2645 actual packet headers may be in reverse direction (for reply packets),
2646 transformed by NAT (when the <code>nat</code> option was applied to the
2647 connection), or be of a different protocol (e.g., when an ICMP response
2648 is sent to a UDP packet). In the case of related connections, e.g., an
2649 FTP data connection, the original direction tuple contains the
2650 original direction headers from the master connection, e.g., an FTP
2651 control connection.
2652 </p>
2653
2654 <p>
2655 The following fields are populated by the <code>ct</code> action, and
2656 require a match to a valid connection tracking state as a prerequisite,
2657 in addition to the IPv4 or IPv6 ethertype match. Examples of valid
2658 connection tracking state matches include <code>ct_state=+new</code>,
2659 <code>ct_state=+est</code>, <code>ct_state=+rel</code>, and <code>ct_state=+trk-inv</code>.
2660 </p>
2661
2662 <field id="MFF_CT_NW_SRC" title="Connection Tracking Original Direction IPv4 Source Address">
2663 Matches IPv4 conntrack original direction tuple source address.
2664 See the paragraphs above for a general description of the
2665 conntrack original direction tuple. Introduced in Open vSwitch
2666 2.8.
2667 </field>
2668
2669 <field id="MFF_CT_NW_DST" title="Connection Tracking Original Direction IPv4 Destination Address">
2670 Matches IPv4 conntrack original direction tuple destination address.
2671 See the paragraphs above for a general description of the
2672 conntrack original direction tuple. Introduced in Open vSwitch
2673 2.8.
2674 </field>
2675
2676 <field id="MFF_CT_IPV6_SRC" title="Connection Tracking Original Direction IPv6 Source Address">
2677 Matches IPv6 conntrack original direction tuple source address.
2678 See the paragraphs above for a general description of the
2679 conntrack original direction tuple. Introduced in Open vSwitch
2680 2.8.
2681 </field>
2682
2683 <field id="MFF_CT_IPV6_DST" title="Connection Tracking Original Direction IPv6 Destination Address">
2684 Matches IPv6 conntrack original direction tuple destination address.
2685 See the paragraphs above for a general description of the
2686 conntrack original direction tuple. Introduced in Open vSwitch
2687 2.8.
2688 </field>
2689
2690 <field id="MFF_CT_NW_PROTO" title="Connection Tracking Original Direction IP Protocol">
2691 Matches conntrack original direction tuple IP protocol type,
2692 which is specified as a decimal number between 0 and 255,
2693 inclusive (e.g. 1 to match ICMP packets or 6 to match TCP
2694 packets). In the case of, for example, an ICMP response to a UDP
2695 packet, this may be different from the IP protocol type of the
2696 packet itself. See the paragraphs above for a general description
2697 of the conntrack original direction tuple. Introduced in Open
2698 vSwitch 2.8.
2699 </field>
2700
2701 <field id="MFF_CT_TP_SRC" title="Connection Tracking Original Direction Transport Layer Source Port">
2702 Bitwise match on the conntrack original direction tuple
2703 transport source port, when
2704 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2705 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2706 ICMP, or 58 for ICMPv6, the lower 8 bits of
2707 <code>MFF_CT_TP_SRC</code> match the conntrack original
2708 direction ICMP type. See the paragraphs above for a general
2709 description of the conntrack original direction
2710 tuple. Introduced in Open vSwitch 2.8.
2711 </field>
2712
2713 <field id="MFF_CT_TP_DST" title="Connection Tracking Original Direction Transport Layer Destination Port">
2714 Bitwise match on the conntrack original direction tuple
2715 transport destination port, when
2716 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2717 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2718 ICMP, or 58 for ICMPv6, the lower 8 bits of
2719 <code>MFF_CT_TP_DST</code> match the conntrack original
2720 direction ICMP code. See the paragraphs above for a general
2721 description of the conntrack original direction
2722 tuple. Introduced in Open vSwitch 2.8.
2723 </field>
2724 </group>
2725
2726 <group title="Register">
2727 <p>
2728 These fields give an OpenFlow switch space for temporary storage while
2729 the pipeline is running. Whereas metadata fields can have a meaningful
2730 initial value and can persist across some hops across OpenFlow switches,
2731 registers are always initially 0 and their values never persist across
2732 inter-switch hops (not even across patch ports).
2733 </p>
2734
2735 <field id="MFF_METADATA" title="OpenFlow Metadata">
2736 <p>
2737 This field is the oldest standardized OpenFlow register field,
2738 introduced in OpenFlow 1.1. It was introduced to model the limited
2739 number of user-defined bits that some ASIC-based switches can carry
2740 through their pipelines. Because of hardware limitations, OpenFlow
2741 allows switches to support writing and masking only an
2742 implementation-defined subset of bits, even no bits at all. The Open
2743 vSwitch software switch always supports all 64 bits, but of course an
2744 Open vSwitch port to an ASIC would have the same restriction as the
2745 ASIC itself.
2746 </p>
2747
2748 <p>
2749 This field has an OXM code point, but OpenFlow 1.4 and earlier allow it
2750 to be modified only with a specialized instruction, not with a
2751 ``set-field'' action. OpenFlow 1.5 removes this restriction. Open
2752 vSwitch does not enforce this restriction, regardless of OpenFlow
2753 version.
2754 </p>
2755 </field>
2756
2757 <field id="MFF_REG0" title="Register 0">
2758 This is the first of several Open vSwitch registers, all of which have
2759 the same properties. Open vSwitch 1.1 introduced registers 0, 1, 2, and
2760 3, version 1.3 added register 4, version 1.7 added registers 5, 6, and 7,
2761 and version 2.6 added registers 8 through 15.
2762 </field>
2763 <!-- XXX series -->
2764 <field id="MFF_REG1" title="Register 1" hidden="yes"/>
2765 <field id="MFF_REG2" title="Register 2" hidden="yes"/>
2766 <field id="MFF_REG3" title="Register 3" hidden="yes"/>
2767 <field id="MFF_REG4" title="Register 4" hidden="yes"/>
2768 <field id="MFF_REG5" title="Register 5" hidden="yes"/>
2769 <field id="MFF_REG6" title="Register 6" hidden="yes"/>
2770 <field id="MFF_REG7" title="Register 7" hidden="yes"/>
2771 <field id="MFF_REG8" title="Register 8" hidden="yes"/>
2772 <field id="MFF_REG9" title="Register 9" hidden="yes"/>
2773 <field id="MFF_REG10" title="Register 10" hidden="yes"/>
2774 <field id="MFF_REG11" title="Register 11" hidden="yes"/>
2775 <field id="MFF_REG12" title="Register 12" hidden="yes"/>
2776 <field id="MFF_REG13" title="Register 13" hidden="yes"/>
2777 <field id="MFF_REG14" title="Register 14" hidden="yes"/>
2778 <field id="MFF_REG15" title="Register 15" hidden="yes"/>
2779
2780 <field id="MFF_XREG0" title="Extended Register 0">
2781 <p>
2782 This is the first of the registers introduced in OpenFlow 1.5.
2783 OpenFlow 1.5 calls these fields just the ``packet registers,'' but Open
2784 vSwitch already had 32-bit registers by that name, so Open vSwitch uses
2785 the name ``extended registers'' in an attempt to reduce confusion. The
2786 standard allows for up to 128 registers, each 64 bits wide, but Open
2787 vSwitch only implements 4 (in versions 2.4 and 2.5) or 8 (in version
2788 2.6 and later).
2789 </p>
2790
2791 <p>
2792 Each of the 64-bit extended registers overlays two of the 32-bit
2793 registers: <code>xreg0</code> overlays <code>reg0</code> and
2794 <code>reg1</code>, with <code>reg0</code> supplying the
2795 most-significant bits of <code>xreg0</code> and <code>reg1</code> the
2796 least-significant. Similarly, <code>xreg1</code> overlays
2797 <code>reg2</code> and <code>reg3</code>, and so on.
2798 </p>
2799
2800 <p>
2801 The OpenFlow specification says, ``In most cases, the packet registers
2802 can not be matched in tables, i.e. they usually can not be used in the
2803 flow entry match structure'' [OpenFlow 1.5, section 7.2.3.10], but
2804 there is no reason for a software switch to impose such a restriction,
2805 and Open vSwitch does not.
2806 </p>
2807 </field>
2808
2809 <!-- XXX series -->
2810 <field id="MFF_XREG1" title="Extended Register 1" hidden="yes"/>
2811 <field id="MFF_XREG2" title="Extended Register 2" hidden="yes"/>
2812 <field id="MFF_XREG3" title="Extended Register 3" hidden="yes"/>
2813 <field id="MFF_XREG4" title="Extended Register 4" hidden="yes"/>
2814 <field id="MFF_XREG5" title="Extended Register 5" hidden="yes"/>
2815 <field id="MFF_XREG6" title="Extended Register 6" hidden="yes"/>
2816 <field id="MFF_XREG7" title="Extended Register 7" hidden="yes"/>
2817
2818 <field id="MFF_XXREG0" title="Double-Extended Register 0">
2819 <p>
2820 This is the first of the double-extended registers introduced in Open
2821 vSwitch 2.6. Each of the 128-bit double-extended registers overlays four of
2822 the 32-bit registers: <code>xxreg0</code> overlays <code>reg0</code>
2823 through <code>reg3</code>, with <code>reg0</code> supplying the
2824 most-significant bits of <code>xxreg0</code> and <code>reg3</code> the
2825 least-significant. <code>xxreg1</code> similarly overlays
2826 <code>reg4</code> through <code>reg7</code>, and so on.
2827 </p>
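<p>
The overlay rule is the same for the 64-bit and 128-bit registers: the
lowest-numbered 32-bit register supplies the most-significant bits. The
following Python sketch shows the arithmetic. It models the registers as a
plain list of integers, which is an assumption made for illustration, and
the function names are hypothetical.
</p>

<pre><![CDATA[
```python
def overlay(regs, index, count):
    """Value of wide register `index` built from `count` 32-bit regs."""
    value = 0
    # Earlier registers end up in the more significant positions.
    for reg in regs[index * count:(index + 1) * count]:
        value = (value << 32) | (reg & 0xFFFFFFFF)
    return value


def xreg(regs, index):
    """64-bit extended register: overlays two 32-bit registers."""
    return overlay(regs, index, 2)


def xxreg(regs, index):
    """128-bit double-extended register: overlays four registers."""
    return overlay(regs, index, 4)


# reg0..reg3 set to distinct patterns, reg4..reg15 left at zero.
regs = [0x11111111, 0x22222222, 0x33333333, 0x44444444] + [0] * 12
```
]]></pre>

<p>
With these values, <code>xreg(regs, 0)</code> is
<code>0x1111111122222222</code> and <code>xxreg(regs, 0)</code> is
<code>0x11111111222222223333333344444444</code>.
</p>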
2828 </field>
2829
2830 <!-- XXX series -->
2831 <field id="MFF_XXREG1" title="Double-Extended Register 1" hidden="yes"/>
2832 <field id="MFF_XXREG2" title="Double-Extended Register 2" hidden="yes"/>
2833 <field id="MFF_XXREG3" title="Double-Extended Register 3" hidden="yes"/>
2834 </group>
2835
2836 <group title="Layer 2 (Ethernet)">
2837 <p>
2838 Ethernet is the only layer-2 protocol that Open vSwitch
2839 supports. As with most software, Open vSwitch and OpenFlow
2840 regard an Ethernet frame as beginning with the 14-byte header and
2841 ending with the final byte of the payload; that is, the frame check
2842 sequence is not considered part of the frame.
2843 </p>
2844
2845 <field id="MFF_ETH_SRC" title="Ethernet Source">
2846 <p>
2847 The Ethernet source address:
2848 </p>
2849
2850 <diagram>
2851 <header name="Ethernet">
2852 <bits name="dst" above="48" width=".75"/>
2853 <bits name="src" above="48" width=".75" fill="yes"/>
2854 <bits name="type" above="16" width="0.4"/>
2855 </header>
2856 <dots/>
2857 </diagram>
2858 </field>
2859
2860 <field id="MFF_ETH_DST" title="Ethernet Destination">
2861 <p>
2862 The Ethernet destination address:
2863 </p>
2864
2865 <diagram>
2866 <header name="Ethernet">
2867 <bits name="dst" above="48" width=".75" fill="yes"/>
2868 <bits name="src" above="48" width=".75"/>
2869 <bits name="type" above="16" width="0.4"/>
2870 </header>
2871 <dots/>
2872 </diagram>
2873
2874 <p>
2875 Open vSwitch 1.8 and later support arbitrary masks for source and/or
2876 destination. Earlier versions only support masking the destination
2877 with the following masks:
2878 </p>
2879
2880 <dl>
2881 <dt><code>01:00:00:00:00:00</code></dt>
2882 <dd>
2883 Match only the multicast bit. Thus,
2884 <code>dl_dst=01:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2885 multicast (including broadcast) Ethernet packets, and
2886 <code>dl_dst=00:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2887 unicast Ethernet packets.
2888 </dd>
2889
2890 <dt><code>fe:ff:ff:ff:ff:ff</code></dt>
2891 <dd>
2892 Match all bits except the multicast bit. This is probably not
2893 useful.
2894 </dd>
2895
2896 <dt><code>ff:ff:ff:ff:ff:ff</code></dt>
2897 <dd>
2898 Exact match (equivalent to omitting the mask).
2899 </dd>
2900
2901 <dt><code>00:00:00:00:00:00</code></dt>
2902 <dd>
2903 Wildcard all bits (equivalent to <code>dl_dst=*</code>).
2904 </dd>
2905 </dl>
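<p>
A masked match compares only the bits that are set in the mask. The Python
sketch below applies that rule to the multicast-bit example above. The
helper names are hypothetical, and the code illustrates the semantics only,
not the Open vSwitch implementation.
</p>

<pre><![CDATA[
```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' notation to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def dl_dst_matches(address, value, mask="ff:ff:ff:ff:ff:ff"):
    """Return True if address equals value on every bit set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(address) & m) == (mac_to_int(value) & m)


MULTICAST_BIT = "01:00:00:00:00:00"

# Broadcast has the multicast bit set, so it matches the multicast flow.
is_multicast = dl_dst_matches("ff:ff:ff:ff:ff:ff", MULTICAST_BIT, MULTICAST_BIT)
# A unicast address matches value 00:00:00:00:00:00 under the same mask.
is_unicast = dl_dst_matches("00:11:22:33:44:55", "00:00:00:00:00:00", MULTICAST_BIT)
```
]]></pre>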
2906 </field>
2907
2908 <field id="MFF_ETH_TYPE" title="Ethernet Type">
2909 <p>
2910 The most commonly seen Ethernet frames today use a format
2911 called ``Ethernet II,'' in which the last two bytes of the
2912 Ethernet header specify the Ethertype. For such a frame, this
2913 field is copied from those bytes of the header, like so:
2914 </p>
2915
2916 <diagram>
2917 <header name="Ethernet">
2918 <bits name="dst" above="48" width=".75"/>
2919 <bits name="src" above="48" width=".75"/>
2920 <bits name="type" above="16" below="\[&gt;=]0x600" width="0.4" fill="yes"/>
2921 </header>
2922 <dots/>
2923 </diagram>
2924
2925 <p>
2926 Every Ethernet type has a value 0x600 (1,536) or greater.
2927 When the last two bytes of the Ethernet header have a value
2928 too small to be an Ethernet type, then the value found there
2929 is the total length of the frame in bytes, excluding the
2930 Ethernet header. An 802.2 LLC header typically follows the
2931 Ethernet header. OpenFlow and Open vSwitch only support LLC
2932 headers with DSAP and SSAP <code>0xaa</code> and control byte
2933 <code>0x03</code>, which indicate that a SNAP header follows
2934 the LLC header. In turn, OpenFlow and Open vSwitch only
2935 support a SNAP header with organization <code>0x000000</code>.
2936 In such a case, this field is copied from the type field in
2937 the SNAP header, like this:
2938 </p>
2939
2940 <diagram>
2941 <header name="Ethernet">
2942 <bits name="dst" above="48" width=".75"/>
2943 <bits name="src" above="48" width=".75"/>
2944 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2945 </header>
2946 <header name="LLC">
2947 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2948 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2949 <bits name="cntl" above="8" below="0x03" width=".4"/>
2950 </header>
2951 <header name="SNAP">
2952 <bits name="org" above="24" below="0x000000" width=".75"/>
2953 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2954 </header>
2955 <dots/>
2956 </diagram>
2957
2958 <p>
2959 When an 802.1Q header is inserted after the Ethernet source
2960 and destination, this field is populated with the encapsulated
2961 Ethertype, not the 802.1Q Ethertype. With an Ethernet II
2962 inner frame, the result looks like this:
2963 </p>
2964
2965 <diagram>
2966 <header name="Ethernet">
2967 <bits name="dst" above="48" width=".75"/>
2968 <bits name="src" above="48" width=".75"/>
2969 </header>
2970 <header name="802.1Q">
2971 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2972 <bits name="TCI" above="16" width=".4"/>
2973 </header>
2974 <header name="Ethertype">
2975 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2976 </header>
2977 <dots/>
2978 </diagram>
2979
2980 <p>
2981 LLC and SNAP encapsulation look like this with an 802.1Q header:
2982 </p>
2983
2984 <diagram>
2985 <header name="Ethernet">
2986 <bits name="dst" above="48" width=".75"/>
2987 <bits name="src" above="48" width=".75"/>
2988 </header>
2989 <header name="802.1Q">
2990 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2991 <bits name="TCI" above="16" width=".4"/>
2992 </header>
2993 <header name="Ethertype">
2994 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2995 </header>
2996 <header name="LLC">
2997 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2998 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2999 <bits name="cntl" above="8" below="0x03" width=".4"/>
3000 </header>
3001 <header name="SNAP">
3002 <bits name="org" above="24" below="0x000000" width=".75"/>
3003 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
3004 </header>
3005 <dots/>
3006 </diagram>
3007
3008 <p>
3009 When a packet doesn't match any of the header formats described
3010 above, Open vSwitch and OpenFlow set this field to
3011 <code>0x5ff</code> (<code>OFP_DL_TYPE_NOT_ETH_TYPE</code>).
3012 </p>
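<p>
The cases described for this field can be summarized as one lookup
procedure. The following Python sketch follows the formats shown above for
a raw frame that carries at most one 802.1Q header. It is a simplified
illustration rather than the Open vSwitch parser, the function name is
hypothetical, and it assumes the frame is long enough to contain the fixed
header (14 bytes, or 18 with an 802.1Q header).
</p>

<pre><![CDATA[
```python
import struct

ETH_TYPE_MIN = 0x600     # smallest value that is an Ethertype
VLAN_TPID = 0x8100       # 802.1Q TPID
NOT_ETH_TYPE = 0x5FF     # OFP_DL_TYPE_NOT_ETH_TYPE
LLC_SNAP_PREFIX = b"\xaa\xaa\x03\x00\x00\x00"  # DSAP, SSAP, cntl, org


def eth_type(frame):
    """Return the value this field would hold for a raw Ethernet frame."""
    offset = 12                      # skip destination and source
    (value,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if value == VLAN_TPID:
        # Skip the 16-bit TCI and use the encapsulated type instead.
        (value,) = struct.unpack_from("!H", frame, offset + 2)
        offset += 4
    if value >= ETH_TYPE_MIN:
        return value                 # Ethernet II
    # Otherwise the value is a length; look for LLC followed by SNAP.
    llc_snap = frame[offset:offset + 8]
    if len(llc_snap) == 8 and llc_snap[:6] == LLC_SNAP_PREFIX:
        (snap_type,) = struct.unpack_from("!H", llc_snap, 6)
        if snap_type >= ETH_TYPE_MIN:
            return snap_type
    return NOT_ETH_TYPE
```
]]></pre>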
3013 </field>
3014 </group>
3015
3016 <group title="VLAN">
3017 <p>
3018 The 802.1Q VLAN header causes more trouble than any other 4
3019 bytes in networking. OpenFlow 1.0, 1.1, and 1.2+ all treat VLANs
3020 differently. Open vSwitch extensions add another variant to the mix.
3021 Open vSwitch reconciles all four treatments as best it can.
3022 </p>
3023
3024 <h2>VLAN Header Format</h2>
3025
3026 <p>
3027 An 802.1Q VLAN header consists of two 16-bit fields:
3028 </p>
3029
3030 <diagram>
3031 <header name="TPID">
3032 <bits name="Ethertype" above="16" below="0x8100" width="1.8"/>
3033 </header>
3034 <nospace/>
3035 <header name="TCI">
3036 <bits name="PCP" above="3" width=".6"/>
3037 <bits name="CFI" above="1" below="0" width=".3"/>
3038 <bits name="VID" above="12" width=".9"/>
3039 </header>
3040 </diagram>
3041
3042 <p>
3043 The first 16 bits of the VLAN header, the <dfn>TPID</dfn> (Tag Protocol
3044 IDentifier), is an Ethertype. When the VLAN header is inserted just
3045 after the source and destination MAC addresses in an Ethernet frame, the
3046 TPID serves to identify the presence of the VLAN. The standard TPID, the
3047 only one that Open vSwitch supports, is <code>0x8100</code>. OpenFlow
3048 1.0 explicitly supports only TPID <code>0x8100</code>. OpenFlow 1.1, but
3049 not earlier or later versions, also requires support for TPID
3050 <code>0x88a8</code> (Open vSwitch does not support this). OpenFlow 1.2
3051 through 1.5 do not require support for specific TPIDs (the ``push vlan
3052 header'' action does say that only <code>0x8100</code> and
3053 <code>0x88a8</code> should be pushed). No version of OpenFlow provides a
3054 way to distinguish or match on the TPID.
3055 </p>
3056
3057 <p>
3058 The remaining 16 bits of the VLAN header, the <dfn>TCI</dfn>
3059 (Tag Control Information), is subdivided into three subfields:
3060 </p>
3061
3062 <ul>
3063 <li>
3064 <dfn>PCP</dfn> (Priority Code Point), is a 3-bit 802.1p
3065 <dfn>priority</dfn>. The lowest priority is value 1, the
3066 second-lowest is value 0, and priority increases from 2 up to
3067 highest priority 7.
3068 </li>
3069
3070 <li>
3071 <p>
3072 <dfn>CFI</dfn> (Canonical Format Indicator), is a 1-bit field. On an
3073 Ethernet network, its value is always 0. This led to it later being
3074 repurposed under the name <dfn>DEI</dfn> (Drop Eligibility
3075 Indicator). By either name, OpenFlow and Open vSwitch don't provide
3076 any way to match or set this bit.
3077 </p>
3078 </li>
3079
3080 <li>
3081 <dfn>VID</dfn> (VLAN IDentifier), is a 12-bit VLAN ID. If the
3082 VID is 0, then the frame is not part of a VLAN. In that case,
3083 the VLAN header is called a <dfn>priority tag</dfn> because it
3084 is only meaningful for assigning the frame a priority. VID
3085 <code>0xfff</code> (4,095) is reserved.
3086 </li>
3087 </ul>
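<p>
As a sketch of the subdivision above, the following Python code splits a
16-bit TCI into its three subfields. The type and function names are
hypothetical and exist only for this example.
</p>

<pre><![CDATA[
```python
from typing import NamedTuple


class Tci(NamedTuple):
    pcp: int   # 3-bit priority
    cfi: int   # 1-bit CFI (later DEI)
    vid: int   # 12-bit VLAN ID


def split_tci(tci):
    """Split a 16-bit TCI into PCP (top 3 bits), CFI, and VID (low 12)."""
    return Tci(pcp=(tci >> 13) & 0x7, cfi=(tci >> 12) & 0x1, vid=tci & 0xFFF)


def is_priority_tag(tci):
    """A VID of 0 means the header only assigns a priority."""
    return split_tci(tci).vid == 0
```
]]></pre>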
3088
3089 <p>
3090 See <ref field="eth_type"/> for illustrations of a complete Ethernet
3091 frame with 802.1Q tag included.
3092 </p>
3093
3094 <h2>Multiple VLANs</h2>
3095
3096 <p>
3097 Open vSwitch can match only a single VLAN header. If more than
3098 one VLAN header is present, then <ref field="eth_type"/>
3099 holds the TPID of the inner VLAN header. Open vSwitch stops
3100 parsing the packet after the inner TPID, so matching further
3101 into the packet (e.g. on the inner TCI or L3 fields) is not
3102 possible.
3103 </p>
3104
3105 <p>
3106 OpenFlow only directly supports matching a single VLAN header. In
3107 OpenFlow 1.1 or later, one OpenFlow table can match on the outermost VLAN
3108 header and pop it off, and a later OpenFlow table can match on the next
3109 outermost header. Open vSwitch does not support this.
3110 </p>
3111
3112 <h2>VLAN Field Details</h2>
3113
3114 <p>
3115 The four variants have three different levels of expressiveness: OpenFlow
3116 1.0 and 1.1 VLAN matching are less powerful than OpenFlow 1.2+ VLAN
3117 matching, which is less powerful than Open vSwitch extension VLAN
3118 matching.
3119 </p>
3120
3121 <h2>OpenFlow 1.0 VLAN Fields</h2>
3122
3123 <p>
3124 OpenFlow 1.0 uses two fields, called <code>dl_vlan</code> and
3125 <code>dl_vlan_pcp</code>, each of which can be either exact-matched or
3126 wildcarded, to specify VLAN matches:
3127 </p>
3128
3129 <ul>
3130 <li>
3131 When both <code>dl_vlan</code> and <code>dl_vlan_pcp</code> are
3132 wildcarded, the flow matches packets without an 802.1Q header or
3133 with any 802.1Q header.
3134 </li>
3135
3136 <li>
3137 The match <code>dl_vlan=0xffff</code> causes a flow to match only
3138 packets without an 802.1Q header. Such a flow should also wildcard
3139 <code>dl_vlan_pcp</code>, since a packet without an 802.1Q header does
3140 not have a PCP. OpenFlow does not specify what to do if a match on PCP
3141 is actually present, but Open vSwitch ignores it.
3142 </li>
3143
3144 <li>
3145 <p>
3146 Otherwise, the flow matches only packets with an 802.1Q
3147 header. If <code>dl_vlan</code> is not wildcarded, then the
3148 flow only matches packets with the VLAN ID specified in
3149 <code>dl_vlan</code>'s low 12 bits. If
3150 <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3151 only matches packets with the priority specified in
3152 <code>dl_vlan_pcp</code>'s low 3 bits.
3153 </p>
3154
3155 <p>
3156 OpenFlow does not specify how to interpret the high 4 bits of
3157 <code>dl_vlan</code> or the high 5 bits of <code>dl_vlan_pcp</code>.
3158 Open vSwitch ignores them.
3159 </p>
3160 </li>
3161 </ul>
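<p>
The three cases above can be expressed as a single predicate. In this
Python sketch, a wildcarded field is represented by <code>None</code>, and
a packet without an 802.1Q header is represented by a TCI of
<code>None</code>. These conventions and the function name are assumptions
made for illustration, not part of OpenFlow or Open vSwitch.
</p>

<pre><![CDATA[
```python
OFP_VLAN_NONE = 0xFFFF


def of10_vlan_matches(tci, dl_vlan=None, dl_vlan_pcp=None):
    """Apply the OpenFlow 1.0 VLAN match rules to a packet's TCI."""
    if dl_vlan is None and dl_vlan_pcp is None:
        return True                  # both wildcarded: tagged or untagged
    if dl_vlan == OFP_VLAN_NONE:
        return tci is None           # untagged only; any PCP match is ignored
    if tci is None:
        return False                 # remaining matches need an 802.1Q header
    if dl_vlan is not None and (tci & 0xFFF) != (dl_vlan & 0xFFF):
        return False                 # compare the low 12 bits only
    if dl_vlan_pcp is not None and ((tci >> 13) & 0x7) != (dl_vlan_pcp & 0x7):
        return False                 # compare the low 3 bits only
    return True
```
]]></pre>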
3162
3163 <field id="MFF_DL_VLAN" title="OpenFlow 1.0 VLAN ID" hidden="yes"/>
3164 <field id="MFF_DL_VLAN_PCP" title="OpenFlow 1.0 VLAN Priority"
3165 hidden="yes"/>
3166
3167 <h2>OpenFlow 1.1 VLAN Fields</h2>
3168
3169 <p>
3170 VLAN matching in OpenFlow 1.1 is similar to OpenFlow 1.0.
3171 The one refinement is that when <code>dl_vlan</code> matches on
3172 <code>0xfffe</code> (<code>OFPVID_ANY</code>), the flow matches
3173 only packets with an 802.1Q header, with any VLAN ID. If
3174 <code>dl_vlan_pcp</code> is wildcarded, the flow matches any
3175 packet with an 802.1Q header, regardless of VLAN ID or priority.
3176 If <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3177 only matches packets with the priority specified in
3178 <code>dl_vlan_pcp</code>'s low 3 bits.
3179 </p>
3180
3181 <p>
3182 OpenFlow 1.1 uses the name <code>OFPVID_NONE</code>, instead of
3183 <code>OFP_VLAN_NONE</code>, for a <code>dl_vlan</code> of
3184 <code>0xffff</code>, but it has the same meaning.
3185 </p>
3186
3187 <p>
3188 In OpenFlow 1.1, Open vSwitch reports error
3189 <code>OFPBMC_BAD_VALUE</code> for an attempt to match on
3190 <code>dl_vlan</code> between 4,096 and <code>0xfffd</code>,
3191 inclusive, or <code>dl_vlan_pcp</code> greater than 7.
3192 </p>
3193
3194 <h2>OpenFlow 1.2 VLAN Fields</h2>
3195
3196 <field id="MFF_VLAN_VID" title="OpenFlow 1.2+ VLAN ID">
3197 <p>
3198 The OpenFlow standard describes this field as consisting of
3199 ``12+1'' bits. On ingress, its value is 0 if no 802.1Q header
3200 is present, and otherwise it holds the VLAN VID in its least
3201 significant 12 bits, with bit 12 (<code>0x1000</code> aka
3202 <code>OFPVID_PRESENT</code>) also set to 1. The three most
3203 significant bits are always zero:
3204 </p>
3205
3206 <diagram>
3207 <header name="OXM_OF_VLAN_VID">
3208 <bits name="" above="3" below="0" width=".6"/>
3209 <bits name="P" above="1" width=".1"/>
3210 <bits name="VLAN ID" above="12" width=".9"/>
3211 </header>
3212 </diagram>
3213
3214 <p>
3215 As a consequence of this field's format, one may use it to match the
3216 VLAN ID in all of the ways available with the OpenFlow 1.0 and 1.1
3217 formats, and a few new ways:
3218 </p>
3219
3220 <dl>
3221 <dt>Fully wildcarded</dt>
3222 <dd>
3223 Matches any packet, that is, one without an 802.1Q header or
3224 with an 802.1Q header with any TCI value.
3225 </dd>
3226
3227 <dt>
3228 Value <code>0x0000</code> (<code>OFPVID_NONE</code>), mask
3229 <code>0xffff</code> (or no mask)
3230 </dt>
3231 <dd>
3232 Matches only packets without an 802.1Q header.
3233 </dd>
3234
3235 <dt>
3236 Value <code>0x1000</code>, mask <code>0x1000</code>
3237 </dt>
3238 <dd>
3239 Matches any packet with an 802.1Q header, regardless of VLAN
3240 ID.
3241 </dd>
3242
3243 <dt>
3244 Value <code>0x1009</code>, mask <code>0xffff</code> (or no mask)
3245 </dt>
3246 <dd>
3247 Matches only packets with an 802.1Q header with VLAN ID 9.
3248 </dd>
3249
3250 <dt>Value <code>0x1001</code>, mask <code>0x1001</code></dt>
3251 <dd>
3252 Matches only packets that have an 802.1Q header with an
3253 odd-numbered VLAN ID. (This is just an example; one can
3254 match on any desired VLAN ID bit pattern.)
3255 </dd>
3256 </dl>
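<p>
The value/mask combinations above can be checked with a small sketch. It
is illustrative only: the helper names <code>vlan_vid_field</code> and
<code>masked_match</code> are hypothetical and are not part of OpenFlow or
Open vSwitch. It assumes the ``12+1'' bit layout described earlier in this
section.
</p>

```python
OFPVID_PRESENT = 0x1000  # bit 12: an 802.1Q header is present


def vlan_vid_field(vid=None):
    """Return the 16-bit vlan_vid value for a packet.

    vid=None models a packet without an 802.1Q header (hypothetical helper).
    """
    if vid is None:
        return 0x0000
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    return OFPVID_PRESENT | vid


def masked_match(field, value, mask=0xFFFF):
    """True if field agrees with value on every bit that is set in mask."""
    return (field & mask) == (value & mask)


untagged = vlan_vid_field()
vlan9 = vlan_vid_field(9)
vlan10 = vlan_vid_field(10)

print(hex(vlan9))                            # 0x1009
print(masked_match(untagged, 0x0000))        # True: no 802.1Q header
print(masked_match(vlan10, 0x1000, 0x1000))  # True: any tagged packet
print(masked_match(vlan10, 0x1001, 0x1001))  # False: VLAN 10 is even
```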
3257 </field>
3258
3259 <field id="MFF_VLAN_PCP" title="OpenFlow 1.2+ VLAN Priority">
3260 <p>
3261 The 3 least significant bits may be used to match the PCP bits
3262 in an 802.1Q header. Other bits are always zero:
3263 </p>
3264
3265 <diagram>
3266 <header name="OXM_OF_VLAN_PCP">
3267 <bits name="zero" above="5" below="0" width="1.0"/>
3268 <bits name="PCP" above="3" width=".6"/>
3269 </header>
3270 </diagram>
3271
3272 <p>
3273 This field may only be used when <ref field="vlan_vid"/> is not
3274 wildcarded and does not exact match on 0 (which only matches
3275 when there is no 802.1Q header).
3276 </p>
3277
3278 <p>
3279 See <cite>VLAN Comparison Chart</cite>, below, for some examples.
3280 </p>
3281 </field>
3282
3283 <h2>Open vSwitch Extension VLAN Field</h2>
3284
3285 <p>
3286 The <ref field="vlan_tci"/> extension can describe more kinds of VLAN
3287 matches than the other variants. It is also simpler than the other
3288 variants.
3289 </p>
3290
3291 <field id="MFF_VLAN_TCI" title="VLAN TCI">
3292 <p>
3293 For a packet without an 802.1Q header, this field is zero. For a
3294 packet with an 802.1Q header, this field is the TCI with the bit in
3295 CFI's position (marked <code>P</code> for ``present'' below) forced to
3296 1. Thus, for a packet in VLAN 9 with priority 7, it has the value
3297 <code>0xf009</code>:
3298 </p>
3299
3300 <diagram>
3301 <header name="NXM_VLAN_TCI">
3302 <bits name="PCP" above="3" below="7" width=".6"/>
3303 <bits name="P" above="1" below="1" width=".2"/>
3304 <bits name="VID" above="12" below="9" width=".9"/>
3305 </header>
3306 </diagram>
3307
3308 <p>
3309 Usage examples:
3310 </p>
3311
3312 <dl>
3313 <dt><code>vlan_tci=0</code></dt>
3314 <dd>
3315 Match packets without an 802.1Q header.
3316 </dd>
3317
3318 <dt><code>vlan_tci=0x1000/0x1000</code></dt>
3319 <dd>
3320 Match packets with an 802.1Q header, regardless of VLAN
3321 and priority values.
3322 </dd>
3323
3324 <dt><code>vlan_tci=0xf123</code></dt>
3325 <dd>
3326 Match packets tagged with priority 7 in VLAN 0x123.
3327 </dd>
3328
3329 <dt><code>vlan_tci=0x1123/0x1fff</code></dt>
3330 <dd>
3331 Match packets tagged with VLAN 0x123 (and any priority).
3332 </dd>
3333
3334 <dt><code>vlan_tci=0x5000/0xf000</code></dt>
3335 <dd>
3336 Match packets tagged with priority 2 (in any VLAN).
3337 </dd>
3338
3339 <dt><code>vlan_tci=0/0xfff</code></dt>
3340 <dd>
3341 Match packets with no 802.1Q header or tagged with VLAN 0
3342 (and any priority).
3343 </dd>
3344
3345 <dt><code>vlan_tci=0/0xe000</code></dt>
3346 <dd>
3347 Match packets with no 802.1Q header or tagged with priority 0 (in any VLAN).
3348 </dd>
3349
3350 <dt><code>vlan_tci=0/0xefff</code></dt>
3351 <dd>
3352 Match packets with no 802.1Q header or tagged with VLAN 0
3353 and priority 0.
3354 </dd>
3355 </dl>
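<p>
As a rough illustration of the layout above, the following sketch packs and
unpacks a <code>vlan_tci</code> value. The function names are hypothetical
and do not correspond to any Open vSwitch API. The sketch assumes only the
bit positions shown in the diagram (PCP in bits 13-15, the ``present'' bit
at <code>0x1000</code>, VID in bits 0-11).
</p>

```python
PRESENT = 0x1000  # the bit in CFI's position, forced to 1 for tagged packets


def pack_vlan_tci(pcp, vid):
    """Build the vlan_tci value for a packet that has an 802.1Q header."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits")
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VID must fit in 12 bits")
    return (pcp << 13) | PRESENT | vid


def unpack_vlan_tci(tci):
    """Return (pcp, vid), or None if the value describes an untagged packet."""
    if not tci & PRESENT:
        return None
    return (tci >> 13) & 0x7, tci & 0xFFF


print(hex(pack_vlan_tci(7, 9)))      # 0xf009
print(unpack_vlan_tci(0xF123))       # (7, 291), i.e. priority 7 in VLAN 0x123
print(unpack_vlan_tci(0))            # None: no 802.1Q header

# A match such as vlan_tci=0x1123/0x1fff ignores the three PCP bits:
print((pack_vlan_tci(5, 0x123) & 0x1FFF) == 0x1123)  # True
```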
3356
3357 <p>
3358 See <cite>VLAN Comparison Chart</cite>, below, for more examples.
3359 </p>
3360 </field>
3361
3362 <h2>VLAN Comparison Chart</h2>
3363
3364 <p>
3365 The following table describes how each of several possible matching
3366 criteria on the 802.1Q header may be expressed with each variation
3367 of the VLAN matching fields:
3368 </p>
3369
3370 <tbl>
3371 r r r r r.
3372 Criteria OpenFlow 1.0 OpenFlow 1.1 OpenFlow 1.2+ NXM
3373 \_ \_ \_ \_ \_
3374 [1] \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fL0000\fR,\fL--\fR \fL0000\fR/\fL0000\fR
3375 [2] \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fLffff\fR,\fL--\fR \fL0000\fR/\fLffff\fR
3376 [3] \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL1xxx\fR/\fLffff\fR,\fL--\fR \fL1xxx\fR/\fL1fff\fR
3377 [4] \fL????\fR/\fL1\fR,\fL0y\fR/\fL0\fR \fLfffe\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1000\fR/\fL1000\fR,\fL0y\fR \fLz000\fR/\fLf000\fR
3378 [5] \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1xxx\fR/\fLffff\fR,\fL0y\fR \fLzxxx\fR/\fLffff\fR
3379 .T&amp;
3380 r r c c r.
3381 [6] (none) (none) \fL1001\fR/\fL1001\fR,\fL--\fR \fL1001\fR/\fL1001\fR
3382 .T&amp;
3383 r r c c c.
3384 [7] (none) (none) (none) \fL3000\fR/\fL3000\fR
3385 [8] (none) (none) (none) \fL0000\fR/\fL0fff\fR
3386 [9] (none) (none) (none) \fL0000\fR/\fLe000\fR
3387 [10] (none) (none) (none) \fL0000\fR/\fLefff\fR
3388 </tbl>
3389
3390 <p>
3391 All numbers in the table are expressed in hexadecimal. The
3392 columns in the table are interpreted as follows:
3393 </p>
3394
3395 <dl>
3396 <dt>Criteria</dt>
3397 <dd>See the list below.</dd>
3398
3399 <dt>OpenFlow 1.0</dt>
3400 <dt>OpenFlow 1.1</dt>
3401 <dd>
3402 <literal>wwww/x,yy/z</literal> means VLAN ID match value
3403 <literal>wwww</literal> with wildcard bit <literal>x</literal>
3404 and VLAN PCP match value <literal>yy</literal> with wildcard
3405 bit <literal>z</literal>. <literal>?</literal> means that the
3406 given bits are ignored (and conventionally
3407 <literal>0</literal> for <literal>wwww</literal> or
3408 <literal>yy</literal>, conventionally <literal>1</literal> for
3409 <literal>x</literal> or <literal>z</literal>). ``(none)''
3410 means that OpenFlow 1.0 (or 1.1) cannot match with these
3411 criteria.
3412 </dd>
3413
3414 <dt>OpenFlow 1.2+</dt>
3415 <dd>
3416 <literal>xxxx/yyyy,zz</literal> means <ref field="vlan_vid"/> with
3417 value <literal>xxxx</literal> and mask <literal>yyyy</literal>, and
3418 <ref field="vlan_pcp"/> (which is not maskable) with value
3419 <literal>zz</literal>. <literal>--</literal> means that <ref
3420 field="vlan_pcp"/> is omitted. ``(none)'' means that OpenFlow 1.2
3421 cannot match with these criteria.
3422 </dd>
3423
3424 <dt>NXM</dt>
3425 <dd>
3426 <literal>xxxx/yyyy</literal> means <ref field="vlan_tci"/> with value
3427 <literal>xxxx</literal> and mask <literal>yyyy</literal>.
3428 </dd>
3429 </dl>
3430
3431 <p>
3432 The matching criteria described by the table are:
3433 </p>
3434
3435 <dl>
3436 <dt>[1]</dt>
3437 <dd>
3438 Matches any packet, that is, one without an 802.1Q header or
3439 with an 802.1Q header with any TCI value.
3440 </dd>
3441
3442 <dt>[2]</dt>
3443 <dd>
3444 <p>
3445 Matches only packets without an 802.1Q header.
3446 </p>
3447
3448 <p>
3449 OpenFlow 1.0 doesn't define the behavior if <ref field="dl_vlan"/> is
3450 set to <code>0xffff</code> and <ref field="dl_vlan_pcp"/> is not
3451 wildcarded. (Open vSwitch always ignores <ref field="dl_vlan_pcp"/>
3452 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.)
3453 </p>
3454
3455 <p>
3456 OpenFlow 1.1 says explicitly to ignore <ref field="dl_vlan_pcp"/>
3457 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.
3458 </p>
3459
3460 <p>
3461 OpenFlow 1.2 doesn't say how to interpret a match with <ref
3462 field="vlan_vid"/> value 0 and a mask with
3463 <code>OFPVID_PRESENT</code> (<code>0x1000</code>) set to 1 and some
3464 other bits in the mask set to 1 also. Open vSwitch interprets it the
3465 same way as a mask of <code>0x1000</code>.
3466 </p>
3467
3468 <p>
3469 Any NXM match with <ref field="vlan_tci"/> value 0 and the CFI bit
3470 set to 1 in the mask is equivalent to the one listed in the table.
3471 </p>
3472 </dd>
3473
3474 <dt>[3]</dt>
3475 <dd>
3476 Matches only packets that have an 802.1Q header with VID
3477 <literal>xxx</literal> (and any PCP).
3478 </dd>
3479
3480 <dt>[4]</dt>
3481 <dd>
3482 <p>
3483 Matches only packets that have an 802.1Q header with PCP
3484 <literal>y</literal> (and any VID).
3485 </p>
3486
3487 <p>
3488 OpenFlow 1.0 doesn't clearly define the behavior for this
3489 case. Open vSwitch implements it this way.
3490 </p>
3491
3492 <p>
3493 In the NXM value, <literal>z</literal> equals
3494 (<literal>y</literal> &lt;&lt; 1) | 1.
3495 </p>
3496 </dd>
3497
3498 <dt>[5]</dt>
3499 <dd>
3500 <p>
3501 Matches only packets that have an 802.1Q header with VID
3502 <literal>xxx</literal> and PCP <literal>y</literal>.
3503 </p>
3504
3505 <p>
3506 In the NXM value, <literal>z</literal> equals
3507 (<literal>y</literal> &lt;&lt; 1) | 1.
3508 </p>
3509 </dd>
3510
3511 <dt>[6]</dt>
3512 <dd>
3513 Matches only packets that have an 802.1Q header with an
3514 odd-numbered VID (and any PCP). Only possible with OpenFlow
3515 1.2 and NXM. (This is just an example; one can match on any
3516 desired VID bit pattern.)
3517 </dd>
3518
3519 <dt>[7]</dt>
3520 <dd>
3521 Matches only packets that have an 802.1Q header with an
3522 odd-numbered PCP (and any VID). Only possible with NXM.
3523 (This is just an example; one can match on any desired PCP bit
3524 pattern.)
3525 </dd>
3526
3527 <dt>[8]</dt>
3528 <dd>
3529 Matches packets with no 802.1Q header or with an 802.1Q header
3530 with a VID of 0. Only possible with NXM.
3531 </dd>
3532
3533 <dt>[9]</dt>
3534 <dd>
3535 Matches packets with no 802.1Q header or with an 802.1Q header
3536 with a PCP of 0. Only possible with NXM.
3537 </dd>
3538
3539 <dt>[10]</dt>
3540 <dd>
3541 Matches packets with no 802.1Q header or with an 802.1Q header
3542 with both VID and PCP of 0. Only possible with NXM.
3543 </dd>
3544 </dl>
3545 </group>
3546
3547 <group title="Layer 2.5: MPLS">
3548 <p>
3549 One or more MPLS headers (more commonly called <dfn>MPLS
3550 labels</dfn>) follow an Ethernet type field that specifies an
3551 MPLS Ethernet type [RFC 3032]. Ethertype <code>0x8847</code> is
3552 used for all unicast. Multicast MPLS is divided into two
3553 specific classes, one of which uses Ethertype
3554 <code>0x8847</code> and the other <code>0x8848</code> [RFC
3555 5332].
3556 </p>
3557
3558 <p>
3559 The most common overall packet format is Ethernet II, shown
3560 below (SNAP encapsulation may be used but is not ordinarily seen
3561 in Ethernet networks):
3562 </p>
3563
3564 <diagram>
3565 <header name="Ethernet">
3566 <bits name="dst" above="48" width="0.75"/>
3567 <bits name="src" above="48" width="0.75"/>
3568 <bits name="type" above="16" below="0x8847" width="0.4"/>
3569 </header>
3570 <header name="MPLS">
3571 <bits name="label" above="20" width=".6"/>
3572 <bits name="TC" above="3" width=".3"/>
3573 <bits name="S" above="1" width=".1"/>
3574 <bits name="TTL" above="8" width=".4"/>
3575 </header>
3576 <dots/>
3577 </diagram>
3578
3579 <p>
3580 MPLS can be encapsulated inside an 802.1Q header, in which case
3581 the combination looks like this:
3582 </p>
3583
3584 <diagram>
3585 <header name="Ethernet">
3586 <bits name="dst" above="48" width=".75"/>
3587 <bits name="src" above="48" width=".75"/>
3588 </header>
3589 <header name="802.1Q">
3590 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3591 <bits name="TCI" above="16" width=".4"/>
3592 </header>
3593 <header name="Ethertype">
3594 <bits name="type" above="16" below="0x8847" width=".4"/>
3595 </header>
3596 <header name="MPLS">
3597 <bits name="label" above="20" width=".6"/>
3598 <bits name="TC" above="3" width=".3"/>
3599 <bits name="S" above="1" width=".1"/>
3600 <bits name="TTL" above="8" width=".4"/>
3601 </header>
3602 <dots/>
3603 </diagram>
3604
3605 <p>
3606 The fields within an MPLS label are:
3607 </p>
3608
3609 <dl>
3610 <dt>Label, 20 bits.</dt>
3611 <dd>
3612 An identifier.
3613 </dd>
3614
3615 <dt>Traffic class (TC), 3 bits.</dt>
3616 <dd>
3617 Used for quality of service.
3618 </dd>
3619
3620 <dt>Bottom of stack (BOS), 1 bit (labeled just ``S'' above).</dt>
3621 <dd>
3622 <p>
3623 0 indicates that another MPLS label follows this one.
3624 </p>
3625
3626 <p>
3627 1 indicates that this MPLS label is the last one in the
3628 stack, so that some other protocol follows this one.
3629 </p>
3630 </dd>
3631
3632 <dt>Time to live (TTL), 8 bits.</dt>
3633 <dd>
3634 <p>
3635 Each hop across an MPLS network decrements the TTL by 1. If
3636 it reaches 0, the packet is discarded.
3637 </p>
3638
3639 <p>
3640 OpenFlow does not make the MPLS TTL available as a match field, but
3641 actions are available to set and decrement the TTL. Open vSwitch 2.6
3642 and later make the MPLS TTL available as an extension.
3643 </p>
3644 </dd>
3645 </dl>
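<p>
The field layout above can be sketched as a small parser for MPLS label
stack entries. This is an illustrative sketch, not how Open vSwitch parses
packets; the function names and the dictionary keys are hypothetical. It
assumes the input begins immediately after the MPLS Ethertype.
</p>

```python
import struct


def parse_mpls_lse(data):
    """Split one 32-bit label stack entry into its four fields."""
    (word,) = struct.unpack("!I", data[:4])  # network byte order
    return {
        "label": word >> 12,        # top 20 bits
        "tc": (word >> 9) & 0x7,    # next 3 bits
        "bos": (word >> 8) & 0x1,   # bottom-of-stack bit ("S")
        "ttl": word & 0xFF,         # low 8 bits
    }


def parse_mpls_stack(data):
    """Return (entries, payload), reading entries until BOS is 1."""
    entries = []
    offset = 0
    while True:
        if len(data) - offset < 4:
            raise ValueError("truncated MPLS label stack")
        entry = parse_mpls_lse(data[offset:offset + 4])
        entries.append(entry)
        offset += 4
        if entry["bos"]:
            return entries, data[offset:]


# Two labels: label 16 (BOS 0, TTL 64), then label 2 (TC 5, BOS 1, TTL 63).
raw = bytes.fromhex("00010040" "00002b3f") + b"payload"
stack, payload = parse_mpls_stack(raw)
for entry in stack:
    print(entry)
print(payload)
```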
3646
3647 <h2>MPLS Label Stacks</h2>
3648
3649 <p>
3650 Unlike the other encapsulations supported by OpenFlow and Open vSwitch,
3651 MPLS labels are routinely used in ``stacks'' two or three deep and
3652 sometimes even deeper. Open vSwitch currently supports up to three
3653 labels.
3654 </p>
3655
3656 <p>
3657 The OpenFlow specification only supports matching on the outermost MPLS
3658 label at any given time. To match on the second label, one must first
3659 ``pop'' the outer label and advance to another OpenFlow table, where the
3660 inner label may be matched. To match on the third label, one must pop
3661 the two outer labels, and so on. The Open Networking Foundation is
3662 considering support for directly matching on multiple MPLS labels for
3663 OpenFlow 1.6.<!-- XXX add EXT-* link -->
3664 </p>
3665
3666 <h2>MPLS Inner Protocol</h2>
3667
3668 <p>
3669 Unlike all other forms of encapsulation that Open vSwitch and
3670 OpenFlow support, an MPLS label does not indicate what inner
3671 protocol it encapsulates. Different deployments determine the
3672 inner protocol in different ways [RFC 3032]:
3673 </p>
3674
3675 <ul>
3676 <li>
3677 A few reserved label values do indicate an inner protocol.
3678 Label 0, the ``IPv4 Explicit NULL Label,'' indicates inner
3679 IPv4. Label 2, the ``IPv6 Explicit NULL Label,'' indicates
3680 inner IPv6.
3681 </li>
3682
3683 <li>
3684 Some deployments use a single inner protocol consistently.
3685 </li>
3686
3687 <li>
3688 In some deployments, the inner protocol must be inferred from
3689 the innermost label.
3690 </li>
3691
3692 <li>
3693 In some deployments, the inner protocol must be inferred from
3694 the innermost label and the encapsulated data, e.g. to
3695 distinguish between inner IPv4 and IPv6 based on whether the
3696 first nibble of the inner protocol data is <code>4</code> or
3697 <code>6</code>. OpenFlow and Open vSwitch do not currently
3698 support these cases.
3699 </li>
3700 </ul>
3701
3702 <p>
3703 Open vSwitch and OpenFlow do not infer the inner protocol, even if
3704 reserved label values are in use. Instead, the flow table must specify
3705 the inner protocol at the time it pops the bottommost MPLS label, using
3706 the Ethertype argument to the <code>pop_mpls</code> action.
3707 </p>
3708
3709 <h2>Field Details</h2>
3710
3711 <field id="MFF_MPLS_LABEL" title="MPLS Label">
3712 <p>
3713 The least significant 20 bits hold the ``label'' field from
3714 the MPLS label. Other bits are zero:
3715 </p>
3716
3717 <diagram>
3718 <header name="OXM_OF_MPLS_LABEL">
3719 <bits name="zero" above="12" below="0" width=".6"/>
3720 <bits name="label" above="20" width="1.0"/>
3721 </header>
3722 </diagram>
3723
3724 <p>
3725 Most label values are available for any use by deployments.
3726 Values under 16 are reserved.
3727 </p>
3728 </field>
3729
3730 <field id="MFF_MPLS_TC" title="MPLS Traffic Class">
3731 <p>
3732 The least significant 3 bits hold the TC field from the MPLS
3733 label. Other bits are zero:
3734 </p>
3735
3736 <diagram>
3737 <header name="OXM_OF_MPLS_TC">
3738 <bits name="zero" above="5" below="0" width="1.0"/>
3739 <bits name="TC" above="3" width=".6"/>
3740 </header>
3741 </diagram>
3742
3743 <p>
3744 This field is intended for use for Quality of Service (QoS)
3745 and Explicit Congestion Notification purposes, but its
3746 particular interpretation is deployment specific.
3747 </p>
3748
3749 <p>
3750 Before 2009, this field was named EXP and reserved for
3751 experimental use [RFC 5462].
3752 </p>
3753 </field>
3754
3755 <field id="MFF_MPLS_BOS" title="MPLS Bottom of Stack">
3756 <p>
3757 The least significant bit holds the BOS field from the MPLS
3758 label. Other bits are zero:
3759 </p>
3760
3761 <diagram>
3762 <header name="OXM_OF_MPLS_BOS">
3763 <bits name="zero" above="7" below="0" width="1.3"/>
3764 <bits name="BOS" above="1" width=".3"/>
3765 </header>
3766 </diagram>
3767
3768 <p>
3769 This field is useful as part of processing a series of incoming MPLS
3770 labels. A flow that includes a <code>pop_mpls</code> action should
3771 generally match on <ref field="mpls_bos"/>:
3772 </p>
3773
3774 <ul>
3775 <li>
3776 When <ref field="mpls_bos"/> is 0, there is another MPLS label
3777 following this one, so the Ethertype passed to <code>pop_mpls</code>
3778 should be an MPLS Ethertype. For example: <code>table=0,
3779 dl_type=0x8847, mpls_bos=0, actions=pop_mpls:0x8847,
3780 goto_table:1</code>
3781 </li>
3782
3783 <li>
3784 When <ref field="mpls_bos"/> is 1, this MPLS label is the last one,
3785 so the Ethertype passed to <code>pop_mpls</code> should be a non-MPLS
3786 Ethertype such as IPv4. For example: <code>table=1, dl_type=0x8847,
3787 mpls_bos=1, actions=pop_mpls:0x0800, goto_table:2</code>
3788 </li>
3789 </ul>
3790 </field>
3791
3792 <field id="MFF_MPLS_TTL" title="MPLS Time-to-Live">
3793 <p>
3794 Holds the 8-bit time-to-live field from the MPLS label:
3795 </p>
3796
3797 <diagram>
3798 <header name="NXM_NX_MPLS_TTL">
3799 <bits name="TTL" above="8" width=".4"/>
3800 </header>
3801 </diagram>
3802 </field>
3803 </group>
3804
3805 <group title="Layer 3: IPv4 and IPv6">
3806 <h2>IPv4 Specific Fields</h2>
3807
3808 <p>
3809 These fields are applicable only to IPv4 flows, that is, flows that match
3810 on the IPv4 Ethertype <code>0x0800</code>.
3811 </p>
3812
3813 <field id="MFF_IPV4_SRC" title="IPv4 Source Address">
3814 <p>
3815 The source address from the IPv4 header:
3816 </p>
3817
3818 <diagram>
3819 <header name="Ethernet">
3820 <bits name="dst" above="48" width="0.4"/>
3821 <bits name="src" above="48" width="0.4"/>
3822 <bits name="type" above="16" below="0x800" width="0.4"/>
3823 </header>
3824 <header name="IPv4">
3825 <bits name="..." width="0.4"/>
3826 <bits name="proto" above="8" width="0.4"/>
3827 <bits name="src" above="32" width="0.4" fill="yes"/>
3828 <bits name="dst" above="32" width="0.4"/>
3829 </header>
3830 <dots/>
3831 </diagram>
3832
3833 <p>
3834 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3835 matches on <code>nw_src</code> as actually referring to the ARP SPA.
3836 </p>
3837 </field>
3838
3839 <field id="MFF_IPV4_DST" title="IPv4 Destination Address">
3840 <p>
3841 The destination address from the IPv4 header:
3842 </p>
3843
3844 <diagram>
3845 <header name="Ethernet">
3846 <bits name="dst" above="48" width="0.4"/>
3847 <bits name="src" above="48" width="0.4"/>
3848 <bits name="type" above="16" below="0x800" width="0.4"/>
3849 </header>
3850 <header name="IPv4">
3851 <bits name="..." width="0.4"/>
3852 <bits name="proto" above="8" width="0.4"/>
3853 <bits name="src" above="32" width="0.4"/>
3854 <bits name="dst" above="32" width="0.4" fill="yes"/>
3855 </header>
3856 <dots/>
3857 </diagram>
3858
3859 <p>
3860 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3861 matches on <code>nw_dst</code> as actually referring to the ARP TPA.
3862 </p>
3863 </field>
3864
3865 <h2>IPv6 Specific Fields</h2>
3866
3867 <p>
3868 These fields apply only to IPv6 flows, that is, flows that match
3869 on the IPv6 Ethertype <code>0x86dd</code>.
3870 </p>
3871
3872 <field id="MFF_IPV6_SRC" title="IPv6 Source Address">
3873 <p>
3874 The source address from the IPv6 header:
3875 </p>
3876
3877 <diagram>
3878 <header name="Ethernet">
3879 <bits name="dst" above="48" width="0.4"/>
3880 <bits name="src" above="48" width="0.4"/>
3881 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3882 </header>
3883 <header name="IPv6">
3884 <bits name="..." width="0.4"/>
3885 <bits name="next" above="8" width="0.3"/>
3886 <bits name="src" above="128" width="0.8" fill="yes"/>
3887 <bits name="dst" above="128" width="0.8"/>
3888 </header>
3889 <dots/>
3890 </diagram>
3891
3892 <p>
3893 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3894 supported only CIDR masks.
3895 </p>
3896 </field>
3897 <field id="MFF_IPV6_DST" title="IPv6 Destination Address">
3898 <p>
3899 The destination address from the IPv6 header:
3900 </p>
3901 <diagram>
3902 <header name="Ethernet">
3903 <bits name="dst" above="48" width="0.4"/>
3904 <bits name="src" above="48" width="0.4"/>
3905 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3906 </header>
3907 <header name="IPv6">
3908 <bits name="..." width="0.4"/>
3909 <bits name="next" above="8" width="0.3"/>
3910 <bits name="src" above="128" width="0.8"/>
3911 <bits name="dst" above="128" width="0.8" fill="yes"/>
3912 </header>
3913 <dots/>
3914 </diagram>
3915
3916 <p>
3917 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3918 supported only CIDR masks.
3919 </p>
3920 </field>
3921 <field id="MFF_IPV6_LABEL" title="IPv6 Flow Label">
3922 <p>
3923 The least significant 20 bits hold the flow label field from
3924 the IPv6 header. Other bits are zero:
3925 </p>
3926
3927 <diagram>
3928 <header name="OXM_OF_IPV6_FLABEL">
3929 <bits name="zero" above="12" below="0" width=".6"/>
3930 <bits name="label" above="20" width="1.0"/>
3931 </header>
3932 </diagram>
3933 </field>
3934
3935 <h2>IPv4/IPv6 Fields</h2>
3936
3937 <p>
3938 These fields exist with at least approximately the same meaning in both
3939 IPv4 and IPv6, so they are treated as a single field for matching
3940 purposes. Any flow that matches on the IPv4 Ethertype
3941 <code>0x0800</code> or the IPv6 Ethertype <code>0x86dd</code> may match
3942 on these fields.
3943 </p>
3944
3945 <field id="MFF_IP_PROTO" title="IPv4/v6 Protocol">
3946 <p>
3947 Matches the IPv4 or IPv6 protocol type.
3948 </p>
3949
3950 <p>
3951 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3952 matches on <code>nw_proto</code> as actually referring to the ARP
3953 opcode. The ARP opcode is a 16-bit field, so for matching purposes ARP
3954 opcodes greater than 255 are treated as 0; this works adequately
3955 because in practice ARP and RARP only use opcodes 1 through 4.
3956 </p>
3957 </field>
3958
3959 <field id="MFF_IP_TTL" title="IPv4/v6 TTL/Hop Limit">
3960 The main reason to match on the TTL or hop limit field is to detect
3961 whether a <code>dec_ttl</code> action will fail due to a TTL exceeded
3962 error. Another way that a controller can detect TTL exceeded is to
3963 listen for <code>OFPR_INVALID_TTL</code> ``packet-in'' messages via
3964 OpenFlow.
3965 </field>
3966
3967 <field id="MFF_IP_FRAG" title="IPv4/v6 Fragment Bitmask">
3968 <p>
3969 Specifies what kinds of IP fragments or non-fragments to match. The
3970 value for this field is most conveniently specified as one of the
3971 following:
3972 </p>
3973
3974 <dl>
3975 <dt><code>no</code></dt>
3976 <dd>
3977 Matches only non-fragmented packets.
3978 </dd>
3979
3980 <dt><code>yes</code></dt>
3981 <dd>
3982 Matches all fragments.
3983 </dd>
3984
3985 <dt><code>first</code></dt>
3986 <dd>
3987 Matches only fragments with offset 0.
3988 </dd>
3989
3990 <dt><code>later</code></dt>
3991 <dd>
3992 Matches only fragments with nonzero offset.
3993 </dd>
3994
3995 <dt><code>not_later</code></dt>
3996 <dd>
3997 Matches non-fragmented packets and fragments with zero offset.
3998 </dd>
3999 </dl>
4000
4001 <p>
4002 The field is internally formatted as 2 bits: bit 0 is 1 for an IP
4003 fragment with any offset (and otherwise 0), and bit 1 is 1 for an IP
4004 fragment with nonzero offset (and otherwise 0), like so:
4005 </p>
4006
4007 <diagram>
4008 <header name="NXM_NX_IP_FRAG">
4009 <bits name="zero" above="6" below="0" width=".9"/>
4010 <bits name="later" above="1" width=".3"/>
4011 <bits name="any" above="1" width=".3"/>
4012 </header>
4013 </diagram>
4014
4015 <p>
4016 Even though 2 bits have 4 possible values, this field only uses 3 of
4017 them:
4018 </p>
4019
4020 <ul>
4021 <li>
4022 A packet that is not an IP fragment has value 0.
4023 </li>
4024
4025 <li>
4026 A packet that is an IP fragment with offset 0 (the first fragment)
4027 has bit 0 set and thus value 1.
4028 </li>
4029
4030 <li>
4031 A packet that is an IP fragment with nonzero offset has bits 0 and 1
4032 set and thus value 3.
4033 </li>
4034 </ul>
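<p>
The three values above can be derived as in the following sketch. It is
illustrative only and assumes IPv4 semantics, where a packet is a fragment
if its ``more fragments'' flag is set or its fragment offset is nonzero.
The names are hypothetical, and the value/mask pairs in
<code>NAMED_MATCHES</code> are one encoding that is consistent with the
descriptions of the symbolic names above, not a statement of how Open
vSwitch encodes them internally.
</p>

```python
FRAG_ANY = 0x1    # bit 0: packet is an IP fragment (any offset)
FRAG_LATER = 0x2  # bit 1: packet is a fragment with nonzero offset


def ip_frag_value(more_fragments, fragment_offset):
    """Compute the 2-bit ip_frag value from IPv4 header information."""
    if fragment_offset:
        return FRAG_ANY | FRAG_LATER  # later fragment: value 3
    if more_fragments:
        return FRAG_ANY               # first fragment: value 1
    return 0                          # not a fragment


# Assumed (value, mask) pairs for the symbolic names.
NAMED_MATCHES = {
    "no": (0, FRAG_ANY),
    "yes": (FRAG_ANY, FRAG_ANY),
    "first": (FRAG_ANY, FRAG_ANY | FRAG_LATER),
    "later": (FRAG_LATER, FRAG_LATER),
    "not_later": (0, FRAG_LATER),
}


def frag_matches(name, field):
    """True if a packet with the given ip_frag value matches the name."""
    value, mask = NAMED_MATCHES[name]
    return (field & mask) == value


first = ip_frag_value(more_fragments=True, fragment_offset=0)
later = ip_frag_value(more_fragments=False, fragment_offset=185)
print(first, later)                      # 1 3
print(frag_matches("not_later", first))  # True
print(frag_matches("first", later))      # False
```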
4035
4036 <p>
4037 The switch may reject matches against values that can never appear.
4038 </p>
4039
4040 <p>
4041 It is important to understand how this field interacts with the
4042 OpenFlow fragment handling mode:
4043 </p>
4044
4045 <ul>
4046 <li>
4047 In <code>OFPC_FRAG_DROP</code> mode, the OpenFlow switch drops all IP
4048 fragments before they reach the flow table, so every packet that is
4049 available for matching will have value 0 in this field.
4050 </li>
4051
4052 <li>
4053 Open vSwitch does not implement <code>OFPC_FRAG_REASM</code> mode,
4054 but if it did then IP fragments would be reassembled before they
4055 reached the flow table and again every packet available for matching
4056 would always have value 0.
4057 </li>
4058
4059 <li>
4060 In <code>OFPC_FRAG_NORMAL</code> mode, all three values are possible,
4061 but OpenFlow 1.0 says that fragments' transport ports are always 0,
4062 even for the first fragment, so this does not provide much extra
4063 information.
4064 </li>
4065
4066 <li>
4067 In <code>OFPC_FRAG_NX_MATCH</code> mode, all three values are
4068 possible. For fragments with offset 0, Open vSwitch makes L4 header
4069 information available.
4070 </li>
4071 </ul>
4072
4073 <p>
4074 Thus, this field is likely to be most useful for an Open vSwitch switch
4075 configured in <code>OFPC_FRAG_NX_MATCH</code> mode. See the
4076 description of the <code>set-frags</code> command in
4077 <code>ovs-ofctl</code>(8), for more details.
4078 </p>
4079 </field>
4080
4081 <h3>IPv4/IPv6 TOS Fields</h3>
4082
4083 <p>
4084 IPv4 and IPv6 contain a one-byte ``type of service'' or TOS field that
4085 has the following format:
4086 </p>
4087
4088 <diagram>
4089 <header name="type of service">
4090 <bits name="DSCP" above="6" width=".9"/>
4091 <bits name="ECN" above="2" width=".3"/>
4092 </header>
4093 </diagram>
4094
4095 <field id="MFF_IP_DSCP" title="IPv4/v6 DSCP (Bits 2-7)">
4096 <p>
4097 This field is the TOS byte with the two ECN bits cleared to 0:
4098 </p>
4099
4100 <diagram>
4101 <header name="NXM_OF_IP_TOS">
4102 <bits name="DSCP" above="6" width=".9"/>
4103 <bits name="zero" above="2" below="0" width=".3"/>
4104 </header>
4105 </diagram>
4106 </field>
4107 <field id="MFF_IP_DSCP_SHIFTED" title="IPv4/v6 DSCP (Bits 0-5)">
4108 <p>
4109 This field is the TOS byte shifted right to put the DSCP bits in the
4110 6 least-significant bits:
4111 </p>
4112
4113 <diagram>
4114 <header name="OXM_OF_IP_DSCP">
4115 <bits name="zero" above="2" below="0" width=".3"/>
4116 <bits name="DSCP" above="6" width=".9"/>
4117 </header>
4118 </diagram>
4119 </field>
4120 <field id="MFF_IP_ECN" title="IPv4/v6 ECN">
4121 <p>
4122 This field is the TOS byte with the DSCP bits cleared to 0:
4123 </p>
4124
4125 <diagram>
4126 <header name="OXM_OF_IP_ECN">
4127 <bits name="zero" above="6" below="0" width=".9"/>
4128 <bits name="ECN" above="2" width=".35"/>
4129 </header>
4130 </diagram>
4131 </field>
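<p>
The relationship between the TOS byte and the three fields above can be
sketched as follows. The function and key names are hypothetical and chosen
for illustration; only the bit arithmetic follows from the diagrams.
</p>

```python
def split_tos(tos):
    """Derive the three TOS-related field values from a one-byte TOS value."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("TOS must fit in one byte")
    return {
        "dscp_in_place": tos & 0xFC,  # DSCP in bits 2-7, ECN bits cleared
        "dscp_shifted": tos >> 2,     # DSCP in bits 0-5
        "ecn": tos & 0x03,            # ECN in bits 0-1, DSCP bits cleared
    }


# TOS 0xb9 is binary 101110 01: DSCP 46 with ECN 1.
fields = split_tos(0xB9)
print(hex(fields["dscp_in_place"]))  # 0xb8
print(fields["dscp_shifted"])        # 46
print(fields["ecn"])                 # 1
```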
4132
4133 </group>
4134
4135 <group title="Layer 3: ARP">
4136 <p>
4137 In theory, Address Resolution Protocol, or ARP, is a
4138 generic protocol that can be used to obtain the hardware address that
4139 corresponds to any higher-level protocol address. In contemporary usage,
4140 ARP is used only in Ethernet networks to obtain the Ethernet address for
4141 a given IPv4 address. OpenFlow and Open vSwitch only support this usage
4142 of ARP. For this use case, an ARP packet has the following format, with
4143 the ARP fields exposed as Open vSwitch fields highlighted:
4144 </p>
4145
4146 <diagram>
4147 <header name="Ethernet">
4148 <bits name="dst" above="48" width="0.4"/>
4149 <bits name="src" above="48" width="0.4"/>
4150 <bits name="type" above="16" below="0x806" width="0.4"/>
4151 </header>
4152 <header name="ARP">
4153 <bits name="hrd" above="16" below="1" width=".3"/>
4154 <bits name="pro" above="16" below="0x800" width=".3"/>
4155 <bits name="hln" above="8" below="6" width=".2"/>
4156 <bits name="pln" above="8" below="4" width=".2"/>
4157 <bits name="op" above="16" width=".2" fill="yes"/>
4158 <bits name="sha" above="48" width="0.5" fill="yes"/>
4159 <bits name="spa" above="32" width="0.3" fill="yes"/>
4160 <bits name="tha" above="48" width="0.5" fill="yes"/>
4161 <bits name="tpa" above="32" width="0.3" fill="yes"/>
4162 </header>
4163 </diagram>
4164
4165 <p>
4166 The ARP fields are also used for RARP, the Reverse Address Resolution
4167 Protocol, which shares ARP's wire format.
4168 </p>
4169
4170 <field id="MFF_ARP_OP" title="ARP Opcode">
4171 Even though this is a 16-bit field, Open vSwitch does not support ARP
4172 opcodes greater than 255; it treats them as zero. This works adequately
4173 because in practice ARP and RARP only use opcodes 1 through 4.
4174 </field>
4175
4176 <field id="MFF_ARP_SPA" title="ARP Source IPv4 Address"/>
4177 <field id="MFF_ARP_TPA" title="ARP Target IPv4 Address"/>
4178 <field id="MFF_ARP_SHA" title="ARP Source Ethernet Address"/>
4179 <field id="MFF_ARP_THA" title="ARP Target Ethernet Address"/>
4180 </group>
4181
4182 <group title="Layer 4: TCP, UDP, and SCTP">
4183 <p>
4184 For matching purposes, no distinction is made whether these protocols are
4185 encapsulated within IPv4 or IPv6.
4186 </p>
4187
4188 <h2>TCP</h2>
4189
4190 <p>
4191 The following diagram shows TCP within IPv4. Open vSwitch also supports
4192 TCP in IPv6. Only TCP fields implemented as Open vSwitch fields are
4193 shown:
4194 </p>
4195
4196 <diagram>
4197 <header name="Ethernet">
4198 <bits name="dst" above="48" width="0.4"/>
4199 <bits name="src" above="48" width="0.4"/>
4200 <bits name="type" above="16" below="0x800" width="0.4"/>
4201 </header>
4202 <header name="IPv4">
4203 <bits name="..." width="0.4"/>
4204 <bits name="proto" above="8" below="6" width="0.3"/>
4205 <bits name="src" above="32" width="0.4"/>
4206 <bits name="dst" above="32" width="0.4"/>
4207 </header>
4208 <header name="TCP">
4209 <bits name="src" above="16" width=".2"/>
4210 <bits name="dst" above="16" width=".2"/>
4211 <bits name="..." width=".75"/>
4212 <bits name="flags" above="12" width=".3"/>
4213 <bits name="..." width=".6"/>
4214 </header>
4215 <dots/>
4216 </diagram>
4217 <field id="MFF_TCP_SRC" title="TCP Source Port">
4218 Open vSwitch 1.6 added support for bitwise matching.
4219 </field>
4220 <field id="MFF_TCP_DST" title="TCP Destination Port">
4221 Open vSwitch 1.6 added support for bitwise matching.
4222 </field>
4223 <field id="MFF_TCP_FLAGS" title="TCP Flags">
4224 <p>
4225 This field holds the TCP flags. TCP currently defines 9 flag bits. An
4226 additional 3 bits are reserved. For more information, see [RFC 793],
4227 [RFC 3168], and [RFC 3540].
4228 </p>
4229
4230 <p>
4231 Matches on this field are most conveniently written in terms of
4232 symbolic names (given in the diagram below), each preceded by either
4233 <code>+</code> for a flag that must be set, or <code>-</code> for a
4234 flag that must be unset, without any other delimiters between the
4235 flags. Flags not mentioned are wildcarded. For example,
4236 <code>tcp,tcp_flags=+syn-ack</code> matches TCP SYNs that are not ACKs,
4237 and <code>tcp,tcp_flags=+[200]</code> matches TCP packets with the
4238 reserved [200] flag set. Matches can also be written as
4239 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
4240 and <var>mask</var> are 16-bit numbers in decimal or in hexadecimal
4241 prefixed by <code>0x</code>.
4242 </p>
4243
4244 <p>
4245 The flag bits are:
4246 </p>
4247
4248 <diagram>
4249 <header>
4250 <bits name="zero" above="4" below="0" width=".9"/>
4251 </header>
4252 <nospace/>
4253 <header name="reserved">
4254 <bits name="[800]" above="1" width=".35"/>
4255 <bits name="[400]" above="1" width=".35"/>
4256 <bits name="[200]" above="1" width=".35"/>
4257 </header>
4258 <nospace/>
4259 <header name="later RFCs">
4260 <bits name="NS" above="1" width=".35"/>
4261 <bits name="CWR" above="1" width=".35"/>
4262 <bits name="ECE" above="1" width=".35"/>
4263 </header>
4264 <nospace/>
4265 <header name="RFC 793">
4266 <bits name="URG" above="1" width=".35"/>
4267 <bits name="ACK" above="1" width=".35"/>
4268 <bits name="PSH" above="1" width=".35"/>
4269 <bits name="RST" above="1" width=".35"/>
4270 <bits name="SYN" above="1" width=".35"/>
4271 <bits name="FIN" above="1" width=".35"/>
4272 </header>
4273 </diagram>
4274 </field>
4275
4276 <h2>UDP</h2>
4277
4278 <p>
4279 The following diagram shows UDP within IPv4. Open vSwitch also supports
4280 UDP in IPv6. Only UDP fields that Open vSwitch exposes as fields are
4281 shown:
4282 </p>
4283
4284 <diagram>
4285 <header name="Ethernet">
4286 <bits name="dst" above="48" width="0.4"/>
4287 <bits name="src" above="48" width="0.4"/>
4288 <bits name="type" above="16" below="0x800" width="0.4"/>
4289 </header>
4290 <header name="IPv4">
4291 <bits name="..." width="0.4"/>
4292 <bits name="proto" above="8" below="17" width="0.3"/>
4293 <bits name="src" above="32" width="0.4"/>
4294 <bits name="dst" above="32" width="0.4"/>
4295 </header>
4296 <header name="UDP">
4297 <bits name="src" above="16" width=".2"/>
4298 <bits name="dst" above="16" width=".2"/>
4299 <bits name="..." width=".4"/>
4300 </header>
4301 <dots/>
4302 </diagram>
4303 <field id="MFF_UDP_SRC" title="UDP Source Port"/>
4304 <field id="MFF_UDP_DST" title="UDP Destination Port"/>
4305
4306 <h2>SCTP</h2>
4307
4308 <p>
4309 The following diagram shows SCTP within IPv4. Open vSwitch also supports
4310 SCTP in IPv6. Only SCTP fields that Open vSwitch exposes as fields are
4311 shown:
4312 </p>
4313
4314 <diagram>
4315 <header name="Ethernet">
4316 <bits name="dst" above="48" width="0.4"/>
4317 <bits name="src" above="48" width="0.4"/>
4318 <bits name="type" above="16" below="0x800" width="0.4"/>
4319 </header>
4320 <header name="IPv4">
4321 <bits name="..." width="0.4"/>
4322 <bits name="proto" above="8" below="132" width="0.3"/>
4323 <bits name="src" above="32" width="0.4"/>
4324 <bits name="dst" above="32" width="0.4"/>
4325 </header>
4326 <header name="SCTP">
4327 <bits name="src" above="16" width=".2"/>
4328 <bits name="dst" above="16" width=".2"/>
4329 <bits name="..." width=".8"/>
4330 </header>
4331 <dots/>
4332 </diagram>
4333 <field id="MFF_SCTP_SRC" title="SCTP Source Port"/>
4334 <field id="MFF_SCTP_DST" title="SCTP Destination Port"/>
4335 </group>
4336
4337 <group title="Layer 4: ICMPv4 and ICMPv6">
4338 <h2>ICMPv4</h2>
4339 <diagram>
4340 <header name="Ethernet">
4341 <bits name="dst" above="48" width="0.4"/>
4342 <bits name="src" above="48" width="0.4"/>
4343 <bits name="type" above="16" below="0x800" width="0.4"/>
4344 </header>
4345 <header name="IPv4">
4346 <bits name="..." width="0.4"/>
4347 <bits name="proto" above="8" below="1" width="0.3"/>
4348 <bits name="src" above="32" width="0.4"/>
4349 <bits name="dst" above="32" width="0.4"/>
4350 </header>
4351 <header name="ICMPv4">
4352 <bits name="type" above="8" width=".3"/>
4353 <bits name="code" above="8" width=".3"/>
4354 <bits name="..." width=".8"/>
4355 </header>
4356 <dots/>
4357 </diagram>
4358 <field id="MFF_ICMPV4_TYPE" title="ICMPv4 Type">
4359 <p>
4360 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4361 matches on <code>tp_src</code> as actually referring to the ICMP type.
4362 </p>
4363 </field>
4364 <field id="MFF_ICMPV4_CODE" title="ICMPv4 Code">
4365 <p>
4366 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4367 matches on <code>tp_dst</code> as actually referring to the ICMP code.
4368 </p>
4369 </field>
4370
4371 <h2>ICMPv6</h2>
4372 <diagram>
4373 <header name="Ethernet">
4374 <bits name="dst" above="48" width="0.4"/>
4375 <bits name="src" above="48" width="0.4"/>
4376 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4377 </header>
4378 <header name="IPv6">
4379 <bits name="..." width="0.2"/>
4380 <bits name="next" above="8" below="58" width="0.3"/>
4381 <bits name="src" above="128" width="0.4"/>
4382 <bits name="dst" above="128" width="0.4"/>
4383 </header>
4384 <header name="ICMPv6">
4385 <bits name="type" above="8" width=".3"/>
4386 <bits name="code" above="8" width=".3"/>
4387 <bits name="..." width=".8"/>
4388 </header>
4389 <dots/>
4390 </diagram>
4391 <field id="MFF_ICMPV6_TYPE" title="ICMPv6 Type"/>
4392 <field id="MFF_ICMPV6_CODE" title="ICMPv6 Code"/>
4393
4394 <h2>ICMPv6 Neighbor Discovery</h2>
4395 <diagram>
4396 <header name="Ethernet">
4397 <bits name="dst" above="48" width="0.4"/>
4398 <bits name="src" above="48" width="0.4"/>
4399 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4400 </header>
4401 <header name="IPv6">
4402 <bits name="..." width="0.2"/>
4403 <bits name="next" above="8" below="58" width="0.3"/>
4404 <bits name="src" above="128" width="0.4"/>
4405 <bits name="dst" above="128" width="0.4"/>
4406 </header>
4407 <header name="ICMPv6">
4408 <bits name="type" above="8" below="135/136" width=".3"/>
4409 <bits name="code" above="8" below="0" width=".3"/>
4410 <bits name="..." width=".8"/>
4411 </header>
4412 <header name="ICMPv6 ND">
4413 <bits name="target" above="128" width=".4"/>
4414 <bits name="option ..." width=".6"/>
4415 </header>
4416 </diagram>
4417 <field id="MFF_ND_TARGET" title="ICMPv6 Neighbor Discovery Target IPv6"/>
4418 <field id="MFF_ND_SLL"
4419 title="ICMPv6 Neighbor Discovery Source Ethernet Address"/>
4420 <field id="MFF_ND_TLL"
4421 title="ICMPv6 Neighbor Discovery Target Ethernet Address"/>
4422 </group>
4423
4424 <h1>References</h1>
4425
4426 <dl>
4427 <dt>Casado</dt>
4428 <dd>
4429 M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and
4430 S. Shenker, ``Ethane: Taking Control of the Enterprise,''
4431 Computer Communications Review, October 2007.
4432 </dd>
4433
4434 <dt>EXT-56</dt>
4435 <dd>
4436 J. Tonsing, ``Permit one of a set of prerequisites to apply, e.g. don't
4437 preclude non-Ethernet media,'' <url
4438 href="https://rs.opennetworking.org/bugs/browse/EXT-56"/> (ONF
4439 members only).
4440 </dd>
4441
4442 <dt>EXT-112</dt>
4443 <dd>
4444 J. Tourrilhes, ``Support non-Ethernet packets throughout the
4445 pipeline,'' <url
4446 href="https://rs.opennetworking.org/bugs/browse/EXT-112"/> (ONF
4447 members only).
4448 </dd>
4449
4450 <dt>EXT-134</dt>
4451 <dd>
4452 J. Tourrilhes, ``Match first nibble of the MPLS payload,'' <url
4453 href="https://rs.opennetworking.org/bugs/browse/EXT-134"/> (ONF
4454 members only).
4455 </dd>
4456
4457 <dt>Geneve</dt>
4458 <dd>
4459 J. Gross, I. Ganga, and T. Sridhar, editors, ``Geneve: Generic Network
4460 Virtualization Encapsulation,'' <url
4461 href="https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/"/>.
4462 </dd>
4463
4464 <dt>IEEE OUI</dt>
4465 <dd>
4466 IEEE Standards Association, ``MAC Address Block Large (MA-L),''
4467 <url
4468 href="https://standards.ieee.org/develop/regauth/oui/index.html"/>.
4469 </dd>
4470
4471 <dt>NSH</dt>
4472 <dd>
4473 P. Quinn and U. Elzur, editors, ``Network Service Header,'' <url
4474 href="https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/"/>.
4475 </dd>
4476
4477 <dt>OpenFlow 1.0.1</dt>
4478 <dd>
4479 Open Networking Foundation, ``OpenFlow Switch Errata, Version
4480 1.0.1,'' June 2012.
4481 </dd>
4482
4483 <dt>OpenFlow 1.1</dt>
4484 <dd>
4485 OpenFlow Consortium, ``OpenFlow Switch Specification Version
4486 1.1.0 Implemented (Wire Protocol 0x02),'' February 2011.
4487 </dd>
4488
4489 <dt>OpenFlow 1.5</dt>
4490 <dd>
4491 Open Networking Foundation, ``OpenFlow Switch Specification Version
4492 1.5.0 (Protocol version 0x06),'' December 2014.
4493 </dd>
4494
4495 <dt>OpenFlow Extensions 1.3.x Package 2</dt>
4496 <dd>
4497 Open Networking Foundation, ``OpenFlow Extensions 1.3.x Package 2,''
4498 December 2013.
4499 </dd>
4500
4501 <dt>TCP Flags Match Field Extension</dt>
4502 <dd>
4503 Open Networking Foundation, ``TCP flags match field Extension,'' December
4504 2014. In [OpenFlow Extensions 1.3.x Package 2].
4505 </dd>
4506
4507 <dt>Pepelnjak</dt>
4508 <dd>
4509 I. Pepelnjak, ``OpenFlow and Fermi Estimates,'' <url
4510 href="http://blog.ipspace.net/2013/09/openflow-and-fermi-estimates.html"/>.
4511 </dd>
4512
4513 <dt>RFC 793</dt>
4514 <dd>
4515 ``Transmission Control Protocol,'' <url
4516 href="http://www.ietf.org/rfc/rfc793.txt"/>.
4517 </dd>
4518
4519 <dt>RFC 3032</dt>
4520 <dd>
4521 E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci,
4522 T. Li, and A. Conta, ``MPLS Label Stack Encoding,'' <url
4523 href="http://www.ietf.org/rfc/rfc3032.txt"/>.
4524 </dd>
4525
4526 <dt>RFC 3168</dt>
4527 <dd>
4528 K. Ramakrishnan, S. Floyd, and D. Black, ``The Addition of Explicit
4529 Congestion Notification (ECN) to IP,'' <url href="https://tools.ietf.org/html/rfc3168"/>.
4530 </dd>
4531
4532 <dt>RFC 3540</dt>
4533 <dd>
4534 N. Spring, D. Wetherall, and D. Ely, ``Robust Explicit Congestion
4535 Notification (ECN) Signaling with Nonces,'' <url
4536 href="https://tools.ietf.org/html/rfc3540"/>.
4537 </dd>
4538
4539 <dt>RFC 4632</dt>
4540 <dd>
4541 V. Fuller and T. Li, ``Classless Inter-domain Routing (CIDR): The
4542 Internet Address Assignment and Aggregation Plan,'' <url
4543 href="https://tools.ietf.org/html/rfc4632"/>.
4544 </dd>
4545
4546 <dt>RFC 5462</dt>
4547 <dd>
4548 L. Andersson and R. Asati, ``Multiprotocol Label Switching
4549 (MPLS) Label Stack Entry: ``EXP'' Field Renamed to ``Traffic
4550 Class'' Field,'' <url
4551 href="http://www.ietf.org/rfc/rfc5462.txt"/>.
4552 </dd>
4553
4554 <dt>RFC 6830</dt>
4555 <dd>
4556 D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, ``The
4557 Locator/ID Separation Protocol (LISP),'' <url
4558 href="http://www.ietf.org/rfc/rfc6830.txt"/>.
4559 </dd>
4560
4561 <dt>RFC 7348</dt>
4562 <dd>
4563 M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar,
4564 M. Bursell, and C. Wright, ``Virtual eXtensible Local Area Network
4565 (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over
4566 Layer 3 Networks,'' <url href="https://tools.ietf.org/html/rfc7348"/>.
4567 </dd>
4568
4569 <dt>Srinivasan</dt>
4570 <dd>
4571 V. Srinivasan, S. Suri, and G. Varghese, ``Packet
4572 Classification using Tuple Space Search,'' SIGCOMM 1999.
4573 </dd>
4574
4575 <dt>Pagiamtzis</dt>
4576 <dd>
4577 K. Pagiamtzis and A. Sheikholeslami, ``Content-addressable
4578 memory (CAM) circuits and architectures: A tutorial and
4579 survey,'' IEEE Journal of Solid-State Circuits, vol. 41, no. 3,
4580 pp. 712-727, March 2006.
4581 </dd>
4582
4583 <dt>VXLAN Group Policy Option</dt>
4584 <dd>
4585 M. Smith and L. Kreeger, ``VXLAN Group Policy Option.'' Internet-Draft.
4586 <url href="https://tools.ietf.org/html/draft-smith-vxlan-group-policy"/>.
4587 </dd>
4588 </dl>
4589
4590 <h1>Authors</h1>
4591
4592 <p>
4593 Ben Pfaff, with advice from Justin Pettit and Jean Tourrilhes.
4594 </p>
4595
4596 </fields>
4597
4598 <!--
4599 OXM fields not yet supported Future Directions References/See Also
4600 OXM fields required by various versions and by the "Conformance Test Specification for OpenFlow Switch Specification 1.0.1"
4601 -->