1<?xml version="1.0" encoding="utf-8"?>
2<fields>
3 <h1>Introduction</h1>
4
5 <p>
6 This document aims to comprehensively document all of the fields,
7 both standard and non-standard, supported by OpenFlow or Open
8 vSwitch, regardless of origin.
9 </p>
10
11 <h2>Fields</h2>
12
13 <p>
14 A <dfn>field</dfn> is a property of a packet. Most familiarly, <dfn>data
15 fields</dfn> are fields that can be extracted from a packet. Most data
16 fields are copied directly from protocol headers, e.g. at layer 2, the
17 Ethernet source and destination addresses, or the VLAN ID; at layer 3, the
18 IPv4 or IPv6 source and destination; and at layer 4, the TCP or UDP ports.
19 Other data fields are computed, e.g. <ref field="ip_frag"/> describes
20 whether a packet is a fragment but it is not copied directly from the IP
21 header.
22 </p>
23
24 <p>
 Data fields that are always present as a consequence of the basic
 networking technology in use are called <dfn>root fields</dfn>.
 Open vSwitch 2.7 and earlier considered Ethernet fields to be root fields,
 and this remains the default mode of operation for Open vSwitch bridges.
 When a packet is received from a non-Ethernet interface, such as a layer-3
 LISP tunnel, Open vSwitch 2.7 and earlier force-fit the packet to this
 Ethernet-centric point of view by pretending that an Ethernet header is
 present whose Ethernet type indicates the packet's actual type (and
 whose source and destination addresses are all-zero).
34 </p>
35
 <p>
 Open vSwitch 2.8 and later implement the ``packet type-aware pipeline''
 concept introduced in OpenFlow 1.5. Such a pipeline does not have any root
 fields. Instead, a new metadata field, <ref field="packet_type"/>,
 indicates the basic type of the packet, which can be Ethernet, IPv4, IPv6,
 or another type. For backward compatibility, by default Open vSwitch 2.8
 imitates the behavior of Open vSwitch 2.7 and earlier. Later versions of
 Open vSwitch may change the default, and in the meantime controllers can
 turn off this legacy behavior, on a port-by-port basis, by setting
 <code>options:packet_type</code> to <code>ptap</code> in the
 <code>Interface</code> table. This is significant only for ports that can
 handle non-Ethernet packets, which is currently just LISP, VXLAN-GPE, and
 GRE tunnel ports. See <code>ovs-vswitchd.conf.db</code>(5) for more
 information.
50 </p>
51
52 <p>
53 Non-root data fields are not always present. A packet contains ARP
54 fields, for example, only when its packet type is ARP or when it is an
55 Ethernet packet whose Ethernet header indicates the Ethertype for ARP,
56 0x0806. In this documentation, we say that a field is
57 <dfn>applicable</dfn> when it is present in a packet, and
58 <dfn>inapplicable</dfn> when it is not. (These are not standard terms.)
59 We refer to the conditions that determine whether a field is applicable as
60 <dfn>prerequisites</dfn>. Some VLAN-related fields are a special case:
61 these fields are always applicable for Ethernet packets, but have a
62 designated value or bit that indicates whether a VLAN header is present,
63 with the remaining values or bits indicating the VLAN header's content
64 (if it is present). <!-- XXX also ethertype -->
65 </p>
66
67 <p>
68 An inapplicable field does not have a value, not even a nominal
69 ``value'' such as all-zero-bits. In many circumstances, OpenFlow
70 and Open vSwitch allow references only to applicable fields. For
71 example, one may match (see <cite>Matching</cite>, below) a given
72 field only if the match includes the field's prerequisite,
73 e.g. matching an ARP field is only allowed if one also matches on
74 Ethertype 0x0806 or the <ref field="packet_type"/> for ARP in a packet
75 type-aware bridge.
76 </p>
77
78 <p>
79 Sometimes a packet may contain multiple instances of a header.
80 For example, a packet may contain multiple VLAN or MPLS headers,
81 and tunnels can cause any data field to recur. OpenFlow and Open
82 vSwitch do not address these cases uniformly. For VLAN and MPLS
83 headers, only the outermost header is accessible, so that inner
84 headers may be accessed only by ``popping'' (removing) the outer
85 header. (Open vSwitch supports only a single VLAN header in any
86 case.) For tunnels, e.g. GRE or VXLAN, the outer header and inner
87 headers are treated as different data fields.
88 </p>
89
90 <p>
91 Many network protocols are built in layers as a stack of concatenated
92 headers. Each header typically contains a ``next type'' field that
93 indicates the type of the protocol header that follows, e.g. Ethernet
 contains an Ethertype and IPv4 contains an IP protocol type. The
95 exceptional cases, where protocols are layered but an outer layer does not
96 indicate the protocol type for the inner layer, or gives only an ambiguous
97 indication, are troublesome. An MPLS header, for example, only indicates
98 whether another MPLS header or some other protocol follows, and in the
99 latter case the inner protocol must be known from the context. In these
100 exceptional cases, OpenFlow and Open vSwitch cannot provide insight into
101 the inner protocol data fields without additional context, and thus they
102 treat all later data fields as inapplicable until an OpenFlow action
103 explicitly specifies what protocol follows. In the case of MPLS, the
104 OpenFlow ``pop MPLS'' action that removes the last MPLS header from a
105 packet provides this context, as the Ethertype of the payload. See
106 <cite>Layer 2.5: MPLS</cite> for more information.
107 </p>
108
109 <p>
110 OpenFlow and Open vSwitch support some fields other than data
111 fields. <dfn>Metadata fields</dfn> relate to the origin or
112 treatment of a packet, but they are not extracted from the packet
113 data itself. One example is the physical port on which a packet
114 arrived at the switch. <dfn>Register fields</dfn> act like
115 variables: they give an OpenFlow switch space for temporary
116 storage while processing a packet. Existing metadata and register
117 fields have no prerequisites.
118 </p>
119
120 <p>
121 A field's value consists of an integral number of bytes. For data
122 fields, sometimes those bytes are taken directly from the packet.
123 Other data fields are copied from a packet with padding (usually
124 with zeros and in the most significant positions). The remaining
125 data fields are transformed in other ways as they are copied from
126 the packets, to make them more useful for matching.
127 </p>
128
129 <h2>Matching</h2>
130
131 <p>
132 The most important use of fields in OpenFlow is
133 <dfn>matching</dfn>, to determine whether particular field values
134 agree with a set of constraints called a <dfn>match</dfn>. A
135 match consists of zero or more constraints on individual fields,
136 all of which must be met to satisfy the match. (A match that
137 contains no constraints is always satisfied.) OpenFlow and Open
138 vSwitch support a number of forms of matching on individual
139 fields:
140 </p>
141
142 <dl>
143 <dt><dfn>Exact match</dfn>, e.g. <code>nw_src=10.1.2.3</code></dt>
144 <dd>
145 <p>
146 Only a particular value of the field is matched; for example, only one
147 particular source IP address. Exact matches are written as
148 <code><var>field</var>=<var>value</var></code>. The forms accepted for
149 <var>value</var> depend on the field.
150 </p>
151
152 <p>
153 All fields support exact matches.
154 </p>
155 </dd>
156
157 <dt>
158 <dfn>Bitwise match</dfn>, e.g. <code>nw_src=10.1.0.0/255.255.0.0</code>
159 </dt>
160 <dd>
161 <p>
162 Specific bits in the field must have specified values; for example,
163 only source IP addresses in a particular subnet. Bitwise matches are
164 written as
165 <code><var>field</var>=<var>value</var>/<var>mask</var></code>, where
166 <var>value</var> and <var>mask</var> take one of the forms accepted for
167 an exact match on <var>field</var>. Some fields accept other forms for
168 bitwise matches; for example, <code>nw_src=10.1.0.0/255.255.0.0</code>
169 may also be written <code>nw_src=10.1.0.0/16</code>.
170 </p>
171
172 <p>
 Most OpenFlow switches do not allow bitwise matching on every
 field (and before OpenFlow 1.2, the protocol did not even provide for
 the possibility for most fields). Even switches that do allow bitwise
 matching on a given field may restrict the masks that are allowed, e.g.
 by allowing matches only on contiguous sets of bits starting from the
 most significant bit, that is, ``CIDR'' masks [RFC 4632]. Open vSwitch
 does not allow bitwise matching on every field, but it allows
 arbitrary bitwise masks on any field that does support bitwise
 matching. (Older versions had some restrictions, as documented in the
 descriptions of individual fields.)
183 </p>
184 </dd>
185
186 <dt><dfn>Wildcard</dfn>, e.g. ``any <code>nw_src</code>''</dt>
187 <dd>
188 <p>
189 The value of the field is not constrained. Wildcarded fields may be
190 written as <code><var>field</var>=*</code>, although it is unusual to
 mention them at all. (When specifying a wildcard explicitly in a
 command invocation, be sure to use quoting to protect against shell
 expansion.)
194 </p>
195
196 <p>
197 There is a tiny difference between wildcarding a field and not
198 specifying any match on a field: wildcarding a field requires
199 satisfying the field's prerequisites.
200 </p>
201 </dd>
202 </dl>
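
 <p>
 The three forms above can be sketched as one test on a value and a mask.
 This is only an illustration, not Open vSwitch code: the helper names
 <code>prefix_to_mask</code>, <code>bitwise_match</code>, and
 <code>ip</code> are hypothetical. An exact match uses an all-ones mask, a
 wildcard uses an all-zeros mask, and a CIDR prefix such as <code>/16</code>
 is shorthand for a mask with that many leading 1-bits.
 </p>

```python
import ipaddress


def prefix_to_mask(prefix_len, width=32):
    """Return the mask with prefix_len leading 1-bits in a width-bit field."""
    if not 0 <= prefix_len <= width:
        raise ValueError("prefix length out of range")
    return ((1 << prefix_len) - 1) << (width - prefix_len)


def bitwise_match(field_value, value, mask):
    """True if field_value agrees with value on every bit that is set in mask."""
    return (field_value & mask) == (value & mask)


def ip(text):
    """Convert dotted-quad text to an integer (illustrative helper)."""
    return int(ipaddress.IPv4Address(text))


if __name__ == "__main__":
    mask16 = prefix_to_mask(16)
    print(ipaddress.IPv4Address(mask16))                    # the /16 mask
    print(bitwise_match(ip("10.1.2.3"), ip("10.1.0.0"), mask16))
    print(bitwise_match(ip("10.2.2.3"), ip("10.1.0.0"), mask16))
```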
203
204 <p>
205 Some types of matches on individual fields cannot be expressed directly
206 with OpenFlow and Open vSwitch. These can be expressed indirectly:
207 </p>
208
209 <dl>
210 <dt><dfn>Set match</dfn>, e.g. ``<code>tcp_dst</code> ∈ {80, 443,
211 8080}''</dt>
212 <dd>
213 <p>
214 The value of a field is one of a specified set of values; for
215 example, the TCP destination port is 80, 443, or 8080.
216 </p>
217
218 <p>
219 For matches used in flows (see <cite>Flows</cite>, below), multiple
220 flows can simulate set matches.
221 </p>
222 </dd>
223
224 <dt><dfn>Range match</dfn>, e.g. ``1000 ≤ <code>tcp_dst</code> ≤
225 1999''</dt>
226 <dd>
227 <p>
228 The value of the field must lie within a numerical range, for
229 example, TCP destination ports between 1000 and 1999.
230 </p>
231
232 <p>
233 Range matches can be expressed as a collection of bitwise matches. For
234 example, suppose that the goal is to match TCP source ports 1000 to
235 1999, inclusive. The binary representations of 1000 and 1999 are:
236 </p>
237
238 <pre fixed="yes">
01111101000
11111001111
241 </pre>
242
243 <p>
244 The following series of bitwise matches will match 1000 and
245 1999 and all the values in between:
246 </p>
247
248 <pre fixed="yes">
01111101xxx
0111111xxxx
10xxxxxxxxx
110xxxxxxxx
1110xxxxxxx
11110xxxxxx
1111100xxxx
256 </pre>
257
258 <p>
259 which can be written as the following matches:
260 </p>
261
262 <pre>
tcp,tp_src=0x03e8/0xfff8
tcp,tp_src=0x03f0/0xfff0
tcp,tp_src=0x0400/0xfe00
tcp,tp_src=0x0600/0xff00
tcp,tp_src=0x0700/0xff80
tcp,tp_src=0x0780/0xffc0
tcp,tp_src=0x07c0/0xfff0
270 </pre>
271 </dd>
272
273 <dt><dfn>Inequality match</dfn>, e.g. ``<code>tcp_dst</code> ≠ 80''</dt>
274 <dd>
275 <p>
276 The value of the field differs from a specified value, for
277 example, all TCP destination ports except 80.
278 </p>
279
280 <p>
281 An inequality match on an <var>n</var>-bit field can be expressed as a
282 disjunction of <var>n</var> 1-bit matches. For example, the inequality
283 match ``<code>vlan_pcp</code> ≠ 5'' can be expressed as
284 ``<code>vlan_pcp</code> = 0/4 or <code>vlan_pcp</code> = 2/2 or
285 <code>vlan_pcp</code> = 0/1.'' For matches used in flows (see
286 <cite>Flows</cite>, below), sometimes one can more compactly express
287 inequality as a higher-priority flow that matches the exceptional case
288 paired with a lower-priority flow that matches the general case.
289 </p>
290
291 <p>
292 Alternatively, an inequality match may be converted to a pair of range
293 matches, e.g. <code>tcp_src ≠ 80</code> may be expressed as ``0 ≤
294 <code>tcp_src</code> &lt; 80 or 80 &lt; <code>tcp_src</code> ≤ 65535'',
295 and then each range match may in turn be converted to a bitwise match.
296 </p>
297 </dd>
298
299 <dt><dfn>Conjunctive match</dfn>, e.g. ``<code>tcp_src</code> ∈ {80, 443, 8080} and <code>tcp_dst</code> ∈ {80, 443, 8080}''</dt>
300 <dd>
 As an OpenFlow extension, Open vSwitch supports matching on
 conjunctions of the previously mentioned forms of matching. See the
 documentation for <ref field="conj_id"/> for more information.
304 </dd>
305 </dl>
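
 <p>
 The range decomposition described under <dfn>Range match</dfn> can be
 computed mechanically. The following sketch is one possible greedy
 approach, not necessarily how any particular controller does it: it
 repeatedly takes the largest aligned power-of-two block that starts at the
 low end of the remaining range. The function names are hypothetical, and
 the <code>tcp,tp_src=</code> prefix is only there to reproduce the example
 above.
 </p>

```python
def range_to_masks(lo, hi, width=16):
    """Cover the inclusive range [lo, hi] with (value, mask) bitwise matches."""
    if not 0 <= lo <= hi < (1 << width):
        raise ValueError("invalid range")
    full = (1 << width) - 1
    matches = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... shrunk until it no longer runs past hi.
        while lo + size - 1 > hi:
            size >>= 1
        matches.append((lo, full & ~(size - 1)))
        lo += size
    return matches


def format_tcp_src(matches):
    """Render (value, mask) pairs in the value/mask notation used above."""
    return ["tcp,tp_src=0x%04x/0x%04x" % (v, m) for v, m in matches]


if __name__ == "__main__":
    for line in format_tcp_src(range_to_masks(1000, 1999)):
        print(line)
```

 <p>
 For the range 1000 to 1999 this produces the same seven matches listed
 above.
 </p>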
306
307 <p>
308 All of these supported forms of matching are special cases of bitwise
309 matching. In some cases this influences the design of field values. <ref
310 field="ip_frag"/> is the most prominent example: it is designed to make all
311 of the practically useful checks for IP fragmentation possible as a single
312 bitwise match.
313 </p>
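
 <p>
 As an illustration of how an indirect form reduces to bitwise matching,
 the following sketch builds the disjunction of 1-bit matches for an
 inequality. The function name is hypothetical. Each returned pair is a
 <var>value</var>/<var>mask</var> match that requires one bit to differ
 from the excluded value.
 </p>

```python
def inequality_to_bit_matches(excluded, width):
    """Return (value, mask) pairs; a field differs from `excluded`
    exactly when it satisfies at least one of them."""
    matches = []
    for bit in reversed(range(width)):
        mask = 1 << bit
        # Require this bit to be the opposite of the excluded value's bit.
        matches.append(((excluded & mask) ^ mask, mask))
    return matches


def satisfies_any(field_value, matches):
    """True if field_value satisfies at least one (value, mask) match."""
    return any((field_value & mask) == value for value, mask in matches)


if __name__ == "__main__":
    # vlan_pcp is a 3-bit field; exclude the value 5.
    for value, mask in inequality_to_bit_matches(5, 3):
        print("vlan_pcp=%d/%d" % (value, mask))
```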
314
315 <h3>Shorthands</h3>
316
317 <p>
318 Some matches are very commonly used, so Open vSwitch accepts shorthand
319 notations. In some cases, Open vSwitch also uses shorthand notations when
320 it displays matches. The following shorthands are defined, with their long
321 forms shown on the right side:
322 </p>
323
324 <dl>
325 <dt><code>eth</code></dt>
326 <dd><code>packet_type=(0,0)</code> (Open vSwitch 2.8 and later)</dd>
327 <dt><code>ip</code></dt> <dd><code>eth_type=0x0800</code></dd>
328 <dt><code>ipv6</code></dt> <dd><code>eth_type=0x86dd</code></dd>
329 <dt><code>icmp</code></dt> <dd><code>eth_type=0x0800,ip_proto=1</code></dd>
330 <dt><code>icmp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=58</code></dd>
331 <dt><code>tcp</code></dt> <dd><code>eth_type=0x0800,ip_proto=6</code></dd>
332 <dt><code>tcp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=6</code></dd>
333 <dt><code>udp</code></dt> <dd><code>eth_type=0x0800,ip_proto=17</code></dd>
334 <dt><code>udp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=17</code></dd>
335 <dt><code>sctp</code></dt> <dd><code>eth_type=0x0800,ip_proto=132</code></dd>
336 <dt><code>sctp6</code></dt> <dd><code>eth_type=0x86dd,ip_proto=132</code></dd>
337 <dt><code>arp</code></dt> <dd><code>eth_type=0x0806</code></dd>
338 <dt><code>rarp</code></dt> <dd><code>eth_type=0x8035</code></dd>
339 <dt><code>mpls</code></dt> <dd><code>eth_type=0x8847</code></dd>
340 <dt><code>mplsm</code></dt> <dd><code>eth_type=0x8848</code></dd>
341 </dl>
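
 <p>
 The table above amounts to a simple textual substitution. The sketch
 below shows that substitution for a comma-separated match string. It is
 purely illustrative and is not how Open vSwitch parses matches; the names
 <code>SHORTHANDS</code> and <code>expand_shorthands</code> are
 hypothetical.
 </p>

```python
# Long forms copied from the table of shorthands above.
SHORTHANDS = {
    "eth": "packet_type=(0,0)",
    "ip": "eth_type=0x0800",
    "ipv6": "eth_type=0x86dd",
    "icmp": "eth_type=0x0800,ip_proto=1",
    "icmp6": "eth_type=0x86dd,ip_proto=58",
    "tcp": "eth_type=0x0800,ip_proto=6",
    "tcp6": "eth_type=0x86dd,ip_proto=6",
    "udp": "eth_type=0x0800,ip_proto=17",
    "udp6": "eth_type=0x86dd,ip_proto=17",
    "sctp": "eth_type=0x0800,ip_proto=132",
    "sctp6": "eth_type=0x86dd,ip_proto=132",
    "arp": "eth_type=0x0806",
    "rarp": "eth_type=0x8035",
    "mpls": "eth_type=0x8847",
    "mplsm": "eth_type=0x8848",
}


def expand_shorthands(match):
    """Replace each bare shorthand keyword in a comma-separated match
    with its long form; everything else is passed through unchanged."""
    return ",".join(SHORTHANDS.get(part, part) for part in match.split(","))


if __name__ == "__main__":
    print(expand_shorthands("tcp,tp_src=0x03e8/0xfff8"))
```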
342
344 <h2>Evolution of OpenFlow Fields</h2>
345
346 <p>
347 The discussion so far applies to all OpenFlow and Open vSwitch
348 versions. This section starts to draw in specific information by
349 explaining, in broad terms, the treatment of fields and matches in
350 each OpenFlow version.
351 </p>
352
353 <h3>OpenFlow 1.0</h3>
354
355 <p>
356 OpenFlow 1.0 defined the OpenFlow protocol format of a match as a
357 fixed-length data structure that could match on the following
358 fields:
359 </p>
360
361 <ul>
362 <li>Ingress port.</li>
363 <li>Ethernet source and destination MAC.</li>
364 <li>Ethertype (with a special value to match frames that lack an
365 Ethertype).</li>
366 <li>VLAN ID and priority.</li>
367 <li>IPv4 source, destination, protocol, and DSCP.</li>
368 <li>TCP source and destination port.</li>
369 <li>UDP source and destination port.</li>
370 <li>ICMPv4 type and code.</li>
371 <li>ARP IPv4 addresses (SPA and TPA) and opcode.</li>
372 </ul>
373
374 <p>
375 Each supported field corresponded to some member of the data
376 structure. Some members represented multiple fields, in the case
377 of the TCP, UDP, ICMPv4, and ARP fields whose presence is mutually
378 exclusive. This also meant that some members were poor fits for
379 their fields: only the low 8 bits of the 16-bit ARP opcode could
380 be represented, and the ICMPv4 type and code were padded with 8 bits
381 of zeros to fit in the 16-bit members primarily meant for TCP and
382 UDP ports. An additional bitmap member indicated, for each
383 member, whether its field should be an ``exact'' or ``wildcarded''
384 match (see <cite>Matching</cite>), with additional support for
385 CIDR prefix matching on the IPv4 source and destination fields.
386 </p>
387
388 <p>
389 Simplicity was recognized early on as the main virtue of this
390 approach. Obviously, any fixed-length data structure cannot
391 support matching new protocols that do not fit. There was no
392 room, for example, for matching IPv6 fields, which was not a
393 priority at the time. Lack of room to support matching the
394 Ethernet addresses inside ARP packets actually caused more of a
395 design problem later, leading to an Open vSwitch extension action
396 specialized for dropping ``spoofed'' ARP packets in which the
 frame and ARP Ethernet source addresses differed. (This extension
398 was never standardized. Open vSwitch dropped support for it a few
399 releases after it added support for full ARP matching.)
400 </p>
401
402 <p>
403 The design of the OpenFlow fixed-length matches also illustrates
404 compromises, in both directions, between the strengths and
405 weaknesses of software and hardware that have always influenced
406 the design of OpenFlow. Support for matching ARP fields that do
407 fit in the data structure was only added late in the design
408 process (and remained optional in OpenFlow 1.0), for example,
409 because common switch ASICs did not support matching these fields.
410 </p>
411
412 <p>
413 The compromises in favor of software occurred for more complicated
414 reasons. The OpenFlow designers did not know how to implement
415 matching in software that was fast, dynamic, and general. (A way
416 was later found [Srinivasan].) Thus, the designers sought to
417 support dynamic, general matching that would be fast in realistic
418 special cases, in particular when all of the matches were
419 <dfn>microflows</dfn>, that is, matches that specify every field
420 present in a packet, because such matches can be implemented as a
421 single hash table lookup. Contemporary research supported the
422 feasibility of this approach: the number of microflows in a campus
423 network had been measured to peak at about 10,000 [Casado, section
424 3.2]. (Calculations show that this can only be true in a lightly
425 loaded network [Pepelnjak].)
426 </p>
427
428 <p>
429 As a result, OpenFlow 1.0 required switches to treat microflow
430 matches as the highest possible priority. This let software
431 switches perform the microflow hash table lookup first. Only on
432 failure to match a microflow did the switch need to fall back to
433 checking the more general and presumed slower matches. Also, the
434 OpenFlow 1.0 flow match was minimally flexible, with no support
435 for general bitwise matching, partly on the basis that this seemed
436 more likely amenable to relatively efficient software
437 implementation. (CIDR masking for IPv4 addresses was added
438 relatively late in the OpenFlow 1.0 design process.)
439 </p>
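
 <p>
 The lookup order described above can be sketched as a two-stage table.
 This is a simplified illustration under stated assumptions, not the
 OpenFlow reference implementation or Open vSwitch code: packets are
 modeled as dictionaries of field values, and the class and method names
 are hypothetical.
 </p>

```python
class TwoStageTable:
    """Microflows are checked first with one hash lookup; wildcarded
    matches are only scanned, in priority order, on a miss."""

    def __init__(self):
        self.microflows = {}   # frozenset of (field, value) pairs -> action
        self.wildcards = []    # (priority, constraints dict, action)

    def add_microflow(self, fields, action):
        # A microflow specifies every field present in the packet.
        self.microflows[frozenset(fields.items())] = action

    def add_wildcard(self, priority, constraints, action):
        self.wildcards.append((priority, dict(constraints), action))
        # Keep the list sorted from highest to lowest priority.
        self.wildcards.sort(key=lambda entry: -entry[0])

    def lookup(self, packet):
        # Stage 1: single hash table lookup on the complete set of fields.
        action = self.microflows.get(frozenset(packet.items()))
        if action is not None:
            return action
        # Stage 2: linear scan of the more general, presumed slower matches.
        for _priority, constraints, action in self.wildcards:
            if all(packet.get(f) == v for f, v in constraints.items()):
                return action
        return None
```

 <p>
 Because microflows always win, the first stage never needs to consult
 priorities, which is what made the hash lookup sufficient.
 </p>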
440
441 <p>
442 Microflow matching was later discovered to aid some hardware
443 implementations. The TCAM chips used for matching in hardware do
444 not support priority in the same way as OpenFlow but instead tie
445 priority to ordering [Pagiamtzis]. Thus, adding a new match with
446 a priority between the priorities of existing matches can require
447 reordering an arbitrary number of TCAM entries. On the other
448 hand, when microflows are highest priority, they can be managed as
449 a set-aside portion of the TCAM entries.
450 </p>
451
452 <p>
453 The emphasis on matching microflows also led designers to
454 carefully consider the bandwidth requirements between switch and
455 controller: to maximize the number of microflow setups per second,
456 one must minimize the size of each flow's description. This
457 favored the fixed-length format in use, because it expressed
458 common TCP and UDP microflows in fewer bytes than more flexible
459 ``type-length-value'' (TLV) formats. (Early versions of OpenFlow
460 also avoided TLVs in general to head off protocol fragmentation.)
461 </p>
462
463 <h4>Inapplicable Fields</h4>
464
465 <p>
466 OpenFlow 1.0 does not clearly specify how to treat inapplicable
467 fields. The members for inapplicable fields are always present in
468 the match data structure, as are the bits that indicate whether
 the fields are matched, and the ``correct'' member and bit values
 for inapplicable fields are unclear. OpenFlow 1.0 implementations
471 changed their behavior over time as priorities shifted. The early
472 OpenFlow reference implementation, motivated to make every flow a
473 microflow to enable hashing, treated inapplicable fields as exact
474 matches on a value of 0. Initially, this behavior was implemented
475 in the reference controller only.
476 </p>
477
478 <p>
479 Later, the reference switch was also changed to actually force any
480 wildcarded inapplicable fields into exact matches on 0. The
481 latter behavior sometimes caused problems, because the modified
482 flow was the one reported back to the controller later when it
483 queried the flow table, and the modifications sometimes meant that
484 the controller could not properly recognize the flow that it had
485 added. In retrospect, perhaps this problem should have alerted
486 the designers to a design error, but the ability to use a single
487 hash table was held to be more important than almost every other
488 consideration at the time.
489 </p>
490
491 <p>
492 When more flexible match formats were introduced much later, they
493 disallowed any mention of inapplicable fields as part of a match.
494 This raised the question of how to translate between this new
495 format and the OpenFlow 1.0 fixed format. It seemed somewhat
496 inconsistent and backward to treat fields as exact-match in one
497 format and forbid matching them in the other, so instead the
498 treatment of inapplicable fields in the fixed-length format was
499 changed from exact match on 0 to wildcarding. (A better
500 classifier had by now eliminated software performance problems
501 with wildcards.)
502 </p>
503
504 <p>
505 The OpenFlow 1.0.1 errata (released only in 2012) added some
506 additional explanation [OpenFlow 1.0.1, section 3.4], but it did
507 not mandate specific behavior because of variation among
508 implementations.
509 </p>
510
511 <h3>OpenFlow 1.1</h3>
512
513 <p>
514 The OpenFlow 1.1 protocol match format was designed as a type/length/value
515 (TLV) format to allow for future flexibility. The specification
516 standardized only a single type <code>OFPMT_STANDARD</code> (0) with a
517 fixed-size payload, described here. The additional fields and bitwise
518 masks in OpenFlow 1.1 cause this match structure to be over twice as large
519 as in OpenFlow 1.0, 88 bytes versus 40.
520 </p>
521
522 <p>
523 OpenFlow 1.1 added support for the following fields:
524 </p>
525
526 <ul>
527 <li>SCTP source and destination port.</li>
528 <li>MPLS label and traffic control (TC) fields.</li>
529 <li>One 64-bit register (named ``metadata'').</li>
530 </ul>
531
532 <p>
533 OpenFlow 1.1 increased the width of the ingress port number field (and all
534 other port numbers in the protocol) from 16 bits to 32 bits.
535 </p>
536
537 <p>
538 OpenFlow 1.1 increased matching flexibility by introducing
539 arbitrary bitwise matching on Ethernet and IPv4 address fields and
540 on the new ``metadata'' register field. Switches were not
541 required to support all possible masks [OpenFlow 1.1, section
542 4.3].
543 </p>
544
545 <p>
546 By a strict reading of the specification, OpenFlow 1.1 removed
547 support for matching ICMPv4 type and code [OpenFlow 1.1, section
548 A.2.3], but this is likely an editing error because ICMP
549 matching is described elsewhere [OpenFlow 1.1, Table 3, Table 4,
550 Figure 4]. Open vSwitch does support ICMPv4 type and code
551 matching with OpenFlow 1.1.
552 </p>
553
554 <p>
555 OpenFlow 1.1 avoided the pitfalls of inapplicable fields that
556 OpenFlow 1.0 encountered, by requiring the switch to ignore the
557 specified field values [OpenFlow 1.1, section A.2.3]. It also
558 implied that the switch should ignore the bits that indicate
559 whether to match inapplicable fields.
560 </p>
561
562 <h4>Physical Ingress Port</h4>
563
564 <p>
565 OpenFlow 1.1 introduced a new pseudo-field, the physical ingress port. The
566 physical ingress port is only a pseudo-field because it cannot be used for
 matching. It appears in only one place in the protocol, in the ``packet-in''
568 message that passes a packet received at the switch to an OpenFlow
569 controller.
570 </p>
571
572 <p>
573 A packet's ingress port and physical ingress port are identical except for
574 packets processed by a switch feature such as bonding or tunneling that
575 makes a packet appear to arrive on a ``virtual'' port associated with the
576 bond or the tunnel. For such packets, the ingress port is the virtual port
577 and the physical ingress port is, naturally, the physical port. Open
578 vSwitch implements both bonding and tunneling, but its bonding
579 implementation does not use virtual ports and its tunnels are typically not
580 on the same OpenFlow switch as their physical ingress ports (which need not
581 be part of any switch), so the ingress port and physical ingress port are
582 always the same in Open vSwitch.
583 </p>
584
585 <h3>OpenFlow 1.2</h3>
586
587 <p>
588 OpenFlow 1.2 abandoned the fixed-length approach to matching. One reason
589 was size, since adding support for IPv6 address matching (now seen as
590 important), with bitwise masks, would have added 64 bytes to the match
591 length, increasing it from 88 bytes in OpenFlow 1.1 to over 150 bytes.
592 Extensibility had also become important as controller writers increasingly
593 wanted support for new fields without having to change messages throughout
594 the OpenFlow protocol. The challenges of carefully defining fixed-length
595 matches to avoid problems with inapplicable fields had also become clear
596 over time.
597 </p>
598
599 <p>
600 Therefore, OpenFlow 1.2 adopted a flow format using a flexible
601 type-length-value (TLV) representation, in which each TLV expresses a match
602 on one field. These TLVs were in turn encapsulated inside the outer TLV
603 wrapper introduced in OpenFlow 1.1 with the new identifier
604 <code>OFPMT_OXM</code> (1). (This wrapper fulfilled its intended purpose
605 of reducing the amount of churn in the protocol when changing match
606 formats; some messages that included matches remained unchanged from
607 OpenFlow 1.1 to 1.2 and later versions.)
608 </p>
609
610 <p>
611 OpenFlow 1.2 added support for the following fields:
612 </p>
613
614 <ul>
615 <li>ARP hardware addresses (SHA and THA).</li>
616 <li>IPv4 ECN.</li>
617 <li>IPv6 source and destination addresses, flow label, DSCP, ECN,
618 and protocol.</li>
619 <li>TCP, UDP, and SCTP port numbers when encapsulated inside IPv6.</li>
620 <li>ICMPv6 type and code.</li>
621 <li>ICMPv6 Neighbor Discovery target address and source and target
622 Ethernet addresses.</li>
623 </ul>
624
625 <!-- mention tun_id_from_cookie extension? -->
626
627 <p>
628 The OpenFlow 1.2 format, called <dfn>OXM</dfn> (<dfn>OpenFlow Extensible
629 Match</dfn>), was modeled closely on an extension to OpenFlow 1.0
630 introduced in Open vSwitch 1.1 called <dfn>NXM</dfn> (<dfn>Nicira Extended
631 Match</dfn>). Each OXM or NXM TLV has the following format:
632 </p>
633
634 <diagram>
635 <header name="type">
636 <bits name="vendor/class" above="16" width=".75"/>
637 <bits name="field" above="7" width=".4"/>
638 </header>
639 <nospace/>
640 <header name="">
641 <bits name="HM" above="1" width=".25"/>
642 <bits name="length" above="8" width=".4"/>
643 </header>
644 <header name="">
645 <bits name="body" above="length bytes" width="1.7"/>
646 </header>
647 </diagram>
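
 <p>
 The 32-bit header in the diagram above can be packed and unpacked with a
 few shifts. The sketch below assumes network byte order, as used
 throughout OpenFlow. The function names are hypothetical, and the sample
 field number 11 (the IPv4 source address in class 0x8000) is taken from
 the OpenFlow 1.2 numbering as an assumption for illustration.
 </p>

```python
import struct


def pack_oxm_header(oxm_class, field, hasmask, length):
    """Pack class (16 bits), field (7), hasmask (1), and length (8)."""
    if not 0 <= oxm_class <= 0xffff:
        raise ValueError("class must fit in 16 bits")
    if not 0 <= field <= 0x7f:
        raise ValueError("field must fit in 7 bits")
    if not 0 <= length <= 0xff:
        raise ValueError("length must fit in 8 bits")
    header = (oxm_class << 16) | (field << 9) | (int(bool(hasmask)) << 8) | length
    return struct.pack("!I", header)


def unpack_oxm_header(data):
    """Return (class, field, hasmask, length) from the first 4 bytes."""
    (header,) = struct.unpack("!I", data[:4])
    return (header >> 16, (header >> 9) & 0x7f,
            bool((header >> 8) & 1), header & 0xff)


if __name__ == "__main__":
    # Exact match on a 4-byte field: body is 4 bytes of value.
    print(pack_oxm_header(0x8000, 11, False, 4).hex())
    # Bitwise match on the same field: body is value plus mask, 8 bytes.
    print(pack_oxm_header(0x8000, 11, True, 8).hex())
```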
648
649 <p>
650 The most significant 16 bits of the NXM or OXM header, called
651 <code>vendor</code> by NXM and <code>class</code> by OXM, identify
652 an organization permitted to allocate identifiers for fields. NXM
653 allocates only two vendors, 0x0000 for fields supported by
654 OpenFlow 1.0 and 0x0001 for fields implemented as an Open vSwitch
655 extension. OXM assigns classes as follows:
656 </p>
657
658 <dl>
659 <dt>0x0000 (<code>OFPXMC_NXM_0</code>).</dt>
660 <dt>0x0001 (<code>OFPXMC_NXM_1</code>).</dt>
661 <dd>Reserved for NXM compatibility.</dd>
662
663 <dt>0x0002 to 0x7fff</dt>
664 <dd>
665 Reserved for allocation to ONF members, but none yet assigned.
666 </dd>
667
668 <dt>0x8000 (<code>OFPXMC_OPENFLOW_BASIC</code>)</dt>
669 <dd>
670 Used for most standard OpenFlow fields.
671 </dd>
672
673 <dt>0x8001 (<code>OFPXMC_PACKET_REGS</code>)</dt>
674 <dd>
675 Used for packet register fields in OpenFlow 1.5 and later.
676 </dd>
677
678 <dt>0x8002 to 0xfffe</dt>
679 <dd>
680 Reserved for the OpenFlow specification.
681 </dd>
682
683 <dt>0xffff (<code>OFPXMC_EXPERIMENTER</code>)</dt>
684 <dd>Experimental use.</dd>
685 </dl>
686
687 <p>
688 When <code>class</code> is 0xffff, the OXM header is extended to 64 bits by
689 using the first 32 bits of the body as an <code>experimenter</code> field
690 whose most significant byte is zero and whose remaining bytes are an
691 Organizationally Unique Identifier (OUI) assigned by the IEEE [IEEE OUI],
692 as shown below. OpenFlow says that support for experimenter fields is
693 optional. Open vSwitch 2.4 and later does support them, primarily so that
694 it can support the <code>ONFOXM_ET_</code>* code points defined by official
695 Open Networking Foundation extensions to OpenFlow 1.3 in e.g. [TCP Flags
696 Match Field Extension].
697 </p>
698
699 <diagram>
700 <header name="type">
701 <bits name="class" above="16" below="0xffff" width=".75"/>
702 <bits name="field" above="7" width=".4"/>
703 </header>
704 <nospace/>
705 <header name="">
706 <bits name="HM" above="1" width=".25"/>
707 <bits name="length" above="8" width=".4"/>
708 </header>
709
710 <header name="experimenter">
711 <bits name="zero" above="8" below="0x00" width=".4"/>
712 <bits name="OUI" above="24" width="1"/>
713 </header>
714 <header name="">
715 <bits name="body" above="(length - 4) bytes" width="1.7"/>
716 </header>
717 </diagram>
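
 <p>
 A minimal sketch of the extended layout shown above, again assuming
 network byte order. The function name is hypothetical, and the OUI and
 field number used in the example are placeholders rather than real
 assignments. Note that <code>length</code> counts the 4-byte
 <code>experimenter</code> field in addition to the payload.
 </p>

```python
import struct

OFPXMC_EXPERIMENTER = 0xffff


def pack_experimenter_oxm(field, hasmask, oui, payload):
    """Build a complete experimenter OXM TLV: 32-bit header, 32-bit
    experimenter field (zero byte plus 24-bit OUI), then the payload."""
    if not 0 <= field <= 0x7f:
        raise ValueError("field must fit in 7 bits")
    if not 0 <= oui <= 0xffffff:
        raise ValueError("OUI must fit in 24 bits")
    length = 4 + len(payload)          # experimenter field plus payload
    if length > 0xff:
        raise ValueError("body too long for the 8-bit length")
    header = ((OFPXMC_EXPERIMENTER << 16) | (field << 9)
              | (int(bool(hasmask)) << 8) | length)
    # Packing the OUI as a 32-bit integer leaves the top byte zero.
    return struct.pack("!II", header, oui) + payload


if __name__ == "__main__":
    # Placeholder OUI 0x123456 and field 1, with a 2-byte value.
    print(pack_experimenter_oxm(1, False, 0x123456, b"\xab\xcd").hex())
```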
718
719 <p>
720 Taken as a unit, <code>class</code> (or <code>vendor</code>),
721 <code>field</code>, and <code>experimenter</code> (when present) uniquely
722 identify a particular field.
723 </p>
724
725 <p>
726 When <code>hasmask</code> (abbreviated <code>HM</code> above) is 0, the OXM
727 is an exact match on an entire field. In this case, the body (excluding
728 the experimenter field, if present) is a single value to be matched.
729 </p>
730
731 <p>
732 When <code>hasmask</code> is 1, the OXM is a bitwise match. The body
733 (excluding the experimenter field) consists of a value to match, followed
734 by the bitwise mask to apply. A 1-bit in the mask indicates that the
735 corresponding bit in the value should be matched and a 0-bit that it should
736 be ignored. For example, for an IP address field, a value of 192.168.0.0
737 followed by a mask of 255.255.0.0 would match addresses in the
 192.168.0.0/16 subnet.
739 </p>
740
741 <ul>
742 <li>
743 Some fields might not support masking at all, and some fields that do
744 support masking might restrict it to certain patterns. For example,
745 fields that have IP address values might be restricted to CIDR masks.
746 The descriptions of individual fields note these restrictions.
747 </li>
748
749 <li>
750 An OXM TLV with a mask that is all zeros is not useful (although it is
751 not forbidden), because it has the same effect as omitting the TLV
752 entirely.
753 </li>
754
755 <li>
756 It is not meaningful to pair a 0-bit in an OXM mask with a 1-bit in its
757 value, and Open vSwitch rejects such an OXM with the error
758 <code>OFPBMC_BAD_WILDCARDS</code>, as required by OpenFlow 1.3 and later.
759 </li>
760 </ul>
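<p>
The bitwise-match rule and the value/mask consistency check described above
can be sketched as follows. This is an illustrative model only, not Open
vSwitch code: the function names <code>check_oxm_mask</code>,
<code>masked_match</code>, and <code>ip</code> are hypothetical, and the error
is modeled as a Python exception rather than an OpenFlow error message.
</p>

```python
import ipaddress


def check_oxm_mask(value: int, mask: int) -> None:
    """Reject a value that has a 1-bit where the mask has a 0-bit."""
    if value & ~mask:
        # Models the OFPBMC_BAD_WILDCARDS rejection described in the text.
        raise ValueError("OFPBMC_BAD_WILDCARDS: value has bits outside mask")


def masked_match(field: int, value: int, mask: int) -> bool:
    """Return True if field agrees with value on every 1-bit of mask."""
    check_oxm_mask(value, mask)
    return (field & mask) == value


def ip(text: str) -> int:
    """Convert a dotted-quad IPv4 address to an integer."""
    return int(ipaddress.IPv4Address(text))


value, mask = ip("192.168.0.0"), ip("255.255.0.0")
print(masked_match(ip("192.168.37.9"), value, mask))  # True
print(masked_match(ip("192.169.0.1"), value, mask))   # False
```

<p>
An all-zeros mask makes <code>masked_match</code> return true for every
packet, which is why such a TLV has the same effect as omitting it.
</p>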
761
762 <p>
763 The <code>length</code> identifies the number of bytes in the body,
764 including the 4-byte <code>experimenter</code> header, if it is present.
765 Each OXM TLV has a fixed length; that is, given <code>class</code>,
766 <code>field</code>, <code>experimenter</code> (if present), and
767 <code>hasmask</code>, <code>length</code> is a constant. The
768 <code>length</code> is included explicitly to allow software to minimally
769 parse OXM TLVs of unknown types.
770 </p>
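<p>
Because <code>length</code> is explicit, software can split an OXM TLV into
its parts without knowing the field. The sketch below follows the header
layout in the diagram above (16-bit class, 7-bit field, 1-bit HM, 8-bit
length, and a 4-byte experimenter ID when the class is 0xffff). The function
name <code>parse_oxm</code> and the returned dictionary keys are assumptions
made for illustration; they are not an Open vSwitch API.
</p>

```python
import struct

OXM_CLASS_EXPERIMENTER = 0xFFFF  # class value marked in the diagram above


def parse_oxm(buf: bytes) -> dict:
    """Split one OXM TLV into header fields, value, and optional mask."""
    if len(buf) < 4:
        raise ValueError("OXM header needs 4 bytes")
    (header,) = struct.unpack_from("!I", buf, 0)
    oxm_class = header >> 16        # 16-bit class
    field = (header >> 9) & 0x7F    # 7-bit field number
    hasmask = (header >> 8) & 0x1   # 1-bit HM flag
    length = header & 0xFF          # 8-bit body length in bytes

    body = buf[4:4 + length]
    if len(body) != length:
        raise ValueError("truncated OXM body")

    experimenter = None
    if oxm_class == OXM_CLASS_EXPERIMENTER:
        if length < 4:
            raise ValueError("experimenter OXM needs a 4-byte experimenter ID")
        (experimenter,) = struct.unpack_from("!I", body, 0)
        body = body[4:]

    if hasmask:
        if len(body) % 2:
            raise ValueError("masked OXM body must hold a value and a mask")
        half = len(body) // 2
        value, mask = body[:half], body[half:]
    else:
        value, mask = body, None

    return {"class": oxm_class, "field": field, "hasmask": bool(hasmask),
            "length": length, "experimenter": experimenter,
            "value": value, "mask": mask}


# OXM_OF_ETH_TYPE (field 5 in class 0x8000), exact match on 0x0800.
tlv = parse_oxm(bytes.fromhex("80000a02" "0800"))
print(hex(tlv["class"]), tlv["field"], tlv["hasmask"], tlv["value"].hex())
# prints: 0x8000 5 False 0800
```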
771
772 <p>
773 OXM TLVs must be ordered so that a field's prerequisites are satisfied
774 before it is parsed. For example, an OXM TLV that matches on the IPv4
775 source address field is only allowed following an OXM TLV that matches on
776 the Ethertype for IPv4. Similarly, an OXM TLV that matches on the TCP
777 source port must follow a TLV that matches an Ethertype of IPv4 or IPv6 and
778 one that matches an IP protocol of TCP (in that order). The order of OXM
779 TLVs is not otherwise restricted; no canonical ordering is defined.
780 </p>
781
782 <p>
783 A given field may be matched only once in a series of OXM TLVs.
784 </p>
785
786 <!-- EXT-482? -->
787
788 <h3>OpenFlow 1.3</h3>
789
790 <p>
791 OpenFlow 1.3 showed OXM to be largely successful, by adding new fields
792 without making any changes to how flow matches otherwise worked. It added
793 OXMs for the following fields supported by Open vSwitch:
794 </p>
795
796 <ul>
797 <li>Tunnel ID for ports associated with e.g. VXLAN or keyed GRE.</li>
798 <li>MPLS ``bottom of stack'' (BOS) bit.</li>
799 </ul>
800
801 <p>
802 OpenFlow 1.3 also added OXMs for the following fields not documented here
803 and not yet implemented by Open vSwitch:
804 </p>
805
806 <ul>
807 <li>IPv6 extension header handling.</li>
808 <li>PBB I-SID.</li>
809 </ul>
810
811 <h3>OpenFlow 1.4</h3>
812
813 <p>
814 OpenFlow 1.4 added OXMs for the following fields not documented here and
815 not yet implemented by Open vSwitch:
816 </p>
817
818 <ul>
819 <li>PBB UCA.</li>
820 </ul>
821
822 <h3>OpenFlow 1.5</h3>
823
824 <p>
825 OpenFlow 1.5 added OXMs for the following fields supported by Open vSwitch:
826 </p>
827
828 <ul>
829 <li>Packet type.</li>
830 <li>TCP flags.</li>
831 <li>Packet registers.</li>
832 <li>The output port in the OpenFlow action set.</li>
833 </ul>
834
835 <h1>Fields Reference</h1>
836
837 <p>
838 The following sections document the fields that Open vSwitch supports.
839 Each section provides introductory material on a group of related fields,
840 followed by information on each individual field. In addition to
841 field-specific information, each field begins with a table with entries for
842 the following important properties:
843 </p>
844
845 <dl>
846 <dt>Name</dt>
847 <dd>
848 The field's name, used for parsing and formatting the field, e.g. in
849 <code>ovs-ofctl</code> commands. For historical reasons, some fields
850 have an additional name that is accepted as an alternative in parsing.
851 This name, when there is one, is listed as well, e.g. ``<code>tun</code>
852 (aka <code>tunnel_id</code>).''
853 </dd>
854
855 <dt>Width</dt>
856 <dd>
857 The field's width, always a multiple of 8 bits. Some fields don't use
858 all of the bits, so this may be accompanied by an explanation. For
859 example, OpenFlow embeds the 2-bit IP ECN field as the low bits in an
860 8-bit byte, and so its width is expressed as ``8 bits (only the
861 least-significant 2 bits may be nonzero).''
862 </dd>
863
864 <dt>Format</dt>
865 <dd>
866 <p>
867 How a value for the field is formatted or parsed by, e.g.,
868 <code>ovs-ofctl</code>. Some possibilities are generic:
869 </p>
870
871 <dl>
872 <dt>decimal</dt>
873 <dd>
874 Formats as a decimal number. On input, accepts decimal numbers or
875 hexadecimal numbers prefixed by <code>0x</code>.
876 </dd>
877
878 <dt>hexadecimal</dt>
879 <dd>
880 Formats as a hexadecimal number prefixed by <code>0x</code>. On
881 input, accepts decimal numbers or hexadecimal numbers prefixed by
882 <code>0x</code>. (The default for parsing is <em>not</em>
883 hexadecimal: only a <code>0x</code> prefix causes input to be treated
884 as hexadecimal.)
885 </dd>
886
887 <dt>Ethernet</dt>
888 <dd>
889 Formats and accepts the common Ethernet address format
890 <code><var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var></code>.
891 </dd>
892
893 <dt>IPv4</dt>
894 <dd>
895 Formats and accepts the dotted-quad format
896 <code><var>a</var>.<var>b</var>.<var>c</var>.<var>d</var></code>.
897 For bitwise matches, formats and accepts
898 <code><var>address</var>/<var>length</var></code> CIDR notation in
899 addition to <code><var>address</var>/<var>mask</var></code>.
900 </dd>
901
902 <dt>IPv6</dt>
903 <dd>
904 Formats and accepts the common IPv6 address formats, plus CIDR
905 notation for bitwise matches.
906 </dd>
907
908 <dt>OpenFlow 1.0 port</dt>
909 <dd>
910 Accepts 16-bit port numbers in decimal, plus OpenFlow well-known port
911 names (e.g. <code>IN_PORT</code>) in uppercase or lowercase.
912 </dd>
913
914 <dt>OpenFlow 1.1+ port</dt>
915 <dd>
916 Same syntax as OpenFlow 1.0 ports but for 32-bit OpenFlow 1.1+ port
917 number fields.
918 </dd>
919 </dl>
920
921 <p>
922 Other, field-specific formats are explained along with their fields.
923 </p>
924 </dd>
925
926 <dt>Masking</dt>
927 <dd>
928 For most fields, this says ``arbitrary bitwise masks,'' meaning that a
929 flow may match any combination of bits in the field. Some fields
930 instead say ``exact match only,'' which means that a flow that matches
931 on this field must match on the whole field instead of just certain
932 bits. Either way, this reports masking support for the latest version
933 of Open vSwitch using OXM or NXM (that is, either OpenFlow 1.2+ or
934 OpenFlow 1.0 plus Open vSwitch NXM extensions). In particular,
935 OpenFlow 1.0 (without NXM) and 1.1 don't always support masking even if
936 Open vSwitch itself does; refer to the <em>OpenFlow 1.0</em> and
937 <em>OpenFlow 1.1</em> rows to learn about masking with these protocol
938 versions.
939 </dd>
940
941 <dt>Prerequisites</dt>
942 <dd>
943 <p>
944 Requirements that must be met to match on this field. For example,
945 <ref field="ip_src"/> has IPv4 as a prerequisite, meaning that a match
946 must include <code>eth_type=0x0800</code> to match on the IPv4 source
947 address. The following prerequisites, with their requirements, are
948 currently in use:
949 </p>
950
951 <dl>
952 <dt>none</dt>
953 <dd>(no requirements)</dd>
954
955 <dt>VLAN VID</dt>
956 <dd><code>vlan_tci=0x1000/0x1000</code> (i.e. a VLAN header is
957 present)</dd>
958
959 <dt>ARP</dt>
960 <dd><code>eth_type=0x0806</code> (ARP) or <code>eth_type=0x8035</code> (RARP)</dd>
961
962 <dt>IPv4</dt>
963 <dd><code>eth_type=0x0800</code></dd>
964
965 <dt>IPv6</dt>
966 <dd><code>eth_type=0x86dd</code></dd>
967
968 <dt>IPv4/IPv6</dt>
969 <dd>IPv4 or IPv6</dd>
970
971 <dt>MPLS</dt>
972 <dd><code>eth_type=0x8847</code> or <code>eth_type=0x8848</code></dd>
973
974 <dt>TCP</dt>
975 <dd>IPv4/IPv6 and <code>ip_proto=6</code></dd>
976
977 <dt>UDP</dt>
978 <dd>IPv4/IPv6 and <code>ip_proto=17</code></dd>
979
980 <dt>SCTP</dt>
981 <dd>IPv4/IPv6 and <code>ip_proto=132</code></dd>
982
983 <dt>ICMPv4</dt>
984 <dd>IPv4 and <code>ip_proto=1</code></dd>
985
986 <dt>ICMPv6</dt>
987 <dd>IPv6 and <code>ip_proto=58</code></dd>
988
989 <dt>ND solicit</dt>
990 <dd>ICMPv6 and <code>icmp_type=135</code> and <code>icmp_code=0</code></dd>
991
992 <dt>ND advert</dt>
993 <dd>ICMPv6 and <code>icmp_type=136</code> and <code>icmp_code=0</code></dd>
994
995 <dt>ND</dt>
996 <dd>ND solicit or ND advert</dd>
997 </dl>
998
999 <p>
1000 The TCP, UDP, and SCTP prerequisites also have the special requirement
1001 that <code>nw_frag</code> is not being used to select ``later
1002 fragments.'' This is because only the first fragment of a fragmented
1003 IPv4 or IPv6 datagram contains the TCP, UDP, or SCTP header.
1004 </p>
1005 </dd>
1006
1007 <dt>Access</dt>
1008 <dd>
1009 Most fields are ``read/write,'' which means that common OpenFlow actions
1010 like <code>set_field</code> can modify them. Fields that are
1011 ``read-only'' cannot be modified in these general-purpose ways, although
1012 there may be other ways that actions can modify them.
1013 </dd>
1014
1015 <dt>OpenFlow 1.0</dt>
1016 <dt>OpenFlow 1.1</dt>
1017 <dd>
1018 These rows report the level of support that OpenFlow 1.0 or OpenFlow 1.1,
1019 respectively, has for a field. For OpenFlow 1.0, supported fields are
1020 reported as either ``yes (exact match only)'' for fields that do not
1021 support any bitwise masking or ``yes (CIDR match only)'' for fields that
1022 support CIDR masking. OpenFlow 1.1 supported fields report either ``yes
1023 (exact match only)'' or simply ``yes'' for fields that do support
1024 arbitrary masks. These OpenFlow versions supported a fixed collection of
1025 fields that cannot be extended, so many more fields are reported as ``not
1026 supported.''
1027 </dd>
1028
1029 <dt>OXM</dt>
1030 <dt>NXM</dt>
1031 <dd>
1032 <p>
1033 These rows report the OXM and NXM code points that correspond to a
1034 given field. Either or both may be ``none.''
1035 </p>
1036
1037 <p>
1038 A field that has only an OXM code point is usually one that was
1039 standardized before it was added to Open vSwitch. A field that has
1040 only an NXM code point is usually one that is not yet standardized.
1041 When a field has both OXM and NXM code points, it usually indicates
1042 that it was introduced as an Open vSwitch extension under the NXM code
1043 point, then later standardized under the OXM code point. A field can
1044 have more than one OXM code point if it was standardized in OpenFlow
1045 1.4 or later and additionally introduced as an official ONF extension
1046 for OpenFlow 1.3. (A field that has neither OXM nor NXM code point is
1047 typically an obsolete field that is supported in some other form using
1048 OXM or NXM.)
1049 </p>
1050
1051 <p>
1052 Each code point in these rows is described in the form
1053 ``<code>NAME</code> (<var>number</var>) since OpenFlow <var>spec</var>
1054 and Open vSwitch <var>version</var>,''
1055 e.g. ``<code>OXM_OF_ETH_TYPE</code> (5) since OpenFlow 1.2 and Open
1056 vSwitch 1.7.'' First, <code>NAME</code>, which specifies a name for
1057 the code point, starts with a prefix that designates a class and, in
1058 some cases, a vendor, as listed in the following table:
1059 </p>
1060
1061 <oxm_classes/>
1062
1063 <p>
1064 For more information on OXM/NXM classes and vendors, refer back to
1065 <em>OpenFlow 1.2</em> under <em>Evolution of OpenFlow Fields</em>. The
1066 <var>number</var> is the field number within the class and vendor. The
1067 OpenFlow <var>spec</var> is the version of OpenFlow that standardized
1068 the code point. It is omitted for NXM code points because they are
1069 nonstandard. The <var>version</var> is the version of Open vSwitch
1070 that first supported the code point.
1071 </p>
1072 </dd>
1073 </dl>
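<p>
The <em>Prerequisites</em> entries listed above nest: TCP, for instance,
expands to IPv4/IPv6 plus <code>ip_proto=6</code>. The following sketch shows
one way to check part of that table against a match expressed as a Python
dictionary. It is an illustration under assumptions, not how Open vSwitch
implements the check: the function <code>meets</code>, the dictionary keys,
and the <code>later_frag</code> flag (standing in for an
<code>nw_frag</code> match on later fragments) are all hypothetical, and only
a subset of the prerequisites is covered.
</p>

```python
ETH_TYPE_IPV4 = 0x0800
ETH_TYPE_IPV6 = 0x86DD

# ip_proto values for the layer-4 prerequisites, as listed in the table.
L4_PROTO = {"TCP": 6, "UDP": 17, "SCTP": 132}


def meets(prereq: str, match: dict) -> bool:
    """Return True if match satisfies the named prerequisite."""
    eth_type = match.get("eth_type")
    ip_proto = match.get("ip_proto")
    if prereq == "none":
        return True
    if prereq == "IPv4":
        return eth_type == ETH_TYPE_IPV4
    if prereq == "IPv6":
        return eth_type == ETH_TYPE_IPV6
    if prereq == "IPv4/IPv6":
        return eth_type in (ETH_TYPE_IPV4, ETH_TYPE_IPV6)
    if prereq in L4_PROTO:
        # Hypothetical flag: the match must not select later fragments.
        return (meets("IPv4/IPv6", match)
                and ip_proto == L4_PROTO[prereq]
                and not match.get("later_frag", False))
    if prereq == "ICMPv4":
        return meets("IPv4", match) and ip_proto == 1
    if prereq == "ICMPv6":
        return meets("IPv6", match) and ip_proto == 58
    raise KeyError(f"prerequisite not covered by this sketch: {prereq}")


print(meets("TCP", {"eth_type": 0x0800, "ip_proto": 6}))     # True
print(meets("ICMPv6", {"eth_type": 0x0800, "ip_proto": 58}))  # False
```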
1074
1075 <group title="Conjunctive Match">
1076 <p>
1077 An individual OpenFlow flow can match only a single value for each field.
1078 However, situations often arise where one wants to match one of a set of
1079 values within a field or fields. For matching a single field against a
1080 set, it is straightforward and efficient to add multiple flows to the
1081 flow table, one for each value in the set. For example, one might use
1082 the following flows to send packets with IP source address <var>a</var>,
1083 <var>b</var>, <var>c</var>, or <var>d</var> to the OpenFlow controller:
1084 </p>
1085
1086 <pre>
1087 ip,ip_src=<var>a</var> actions=controller
1088 ip,ip_src=<var>b</var> actions=controller
1089 ip,ip_src=<var>c</var> actions=controller
1090 ip,ip_src=<var>d</var> actions=controller
1091 </pre>
1092
1093 <p>
1094 Similarly, these flows send packets with IP destination address
1095 <var>e</var>, <var>f</var>, <var>g</var>, or <var>h</var> to the OpenFlow
1096 controller:
1097 </p>
1098
1099 <pre>
1100 ip,ip_dst=<var>e</var> actions=controller
1101 ip,ip_dst=<var>f</var> actions=controller
1102 ip,ip_dst=<var>g</var> actions=controller
1103 ip,ip_dst=<var>h</var> actions=controller
1104 </pre>
1105
1106 <p>
1107 Installing all of the above flows in a single flow table yields a
1108 disjunctive effect: a packet is sent to the controller if
1109 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>}
1110 or <code>ip_dst</code> ∈
1111 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} (or both).
1112 (Pedantically, if both of the above sets of flows are present in the flow
1113 table, they should have different priorities, because OpenFlow says that
1114 the results are undefined when two flows with same priority can both match
1115 a single packet.)
1116 </p>
1117
1118 <p>
1119 Suppose, on the other hand, one wishes to match conjunctively, that is, to
1120 send a packet to the controller only if both <code>ip_src</code> ∈
1121 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1122 <code>ip_dst</code> ∈
1123 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>}. This requires 4 × 4
1124 = 16 flows, one for each possible pairing of <code>ip_src</code> and
1125 <code>ip_dst</code>. That is acceptable for our small example, but it does
1126 not gracefully extend to larger sets or greater numbers of dimensions.
1127 </p>
1128
1129 <p>
1130 The <code>conjunction</code> action is a solution for conjunctive matches
1131 that is built into Open vSwitch. A <code>conjunction</code> action ties groups of
1132 individual OpenFlow flows into higher-level ``conjunctive flows''. Each
1133 group corresponds to one dimension, and each flow within the group matches
1134 one possible value for the dimension. A packet that matches one flow from
1135 each group matches the conjunctive flow.
1136 </p>
1137
1138 <p>
1139 To implement a conjunctive flow with <code>conjunction</code>, assign the
1140 conjunctive flow a 32-bit <var>id</var>, which must be unique within an
1141 OpenFlow table. Assign each of the <var>n</var> ≥ 2 dimensions a unique
1142 number from 1 to <var>n</var>; the ordering is unimportant. Add one flow
1143 to the OpenFlow flow table for each possible value of each dimension with
1144 <code>conjunction(<var>id</var>, <var>k</var>/<var>n</var>)</code> as the
1145 flow's actions, where <var>k</var> is the number assigned to the flow's
1146 dimension. Together, these flows specify the conjunctive flow's match
1147 condition. When the conjunctive match condition is met, Open vSwitch looks
1148 up one more flow that specifies the conjunctive flow's actions and receives
1149 its statistics. This flow is found by setting <code>conj_id</code> to the
1150 specified <var>id</var> and then again searching the flow table.
1151 </p>
1152
1153 <p>
1154 The following flows provide an example. Whenever the IP source is one of
1155 the values in the flows that match on the IP source (dimension 1 of 2),
1156 <em>and</em> the IP destination is one of the values in the flows that
1157 match on IP destination (dimension 2 of 2), Open vSwitch searches for a
1158 flow that matches <code>conj_id</code> against the conjunction ID (1234),
1159 finding the first flow listed below.
1160 </p>
1161
1162 <pre>
1163 conj_id=1234 actions=controller
1164 ip,ip_src=10.0.0.1 actions=conjunction(1234, 1/2)
1165 ip,ip_src=10.0.0.4 actions=conjunction(1234, 1/2)
1166 ip,ip_src=10.0.0.6 actions=conjunction(1234, 1/2)
1167 ip,ip_src=10.0.0.7 actions=conjunction(1234, 1/2)
1168 ip,ip_dst=10.0.0.2 actions=conjunction(1234, 2/2)
1169 ip,ip_dst=10.0.0.5 actions=conjunction(1234, 2/2)
1170 ip,ip_dst=10.0.0.7 actions=conjunction(1234, 2/2)
1171 ip,ip_dst=10.0.0.8 actions=conjunction(1234, 2/2)
1172 </pre>
1173
1174 <p>
1175 Many subtleties exist:
1176 </p>
1177
1178 <ul>
1179 <li>
1180 In the example above, every flow in a single dimension has the same form,
1181 that is, dimension 1 matches on <code>ip_src</code> and dimension 2 on
1182 <code>ip_dst</code>, but this is not a requirement. Different flows
1183 within a dimension may match on different bits within a field (e.g. IP
1184 network prefixes of different lengths, or TCP/UDP port ranges as bitwise
1185 matches), or even on entirely different fields (e.g. to match packets for
1186 TCP source port 80 or TCP destination port 80).
1187 </li>
1188
1189 <li>
1190 The flows within a dimension can vary their matches across more than
1191 one field, e.g. to match only specific pairs of IP source and
1192 destination addresses or L4 port numbers.
1193 </li>
1194
1195 <li>
1196 A flow may have multiple <code>conjunction</code> actions, with different
1197 <code>id</code> values. This is useful for multiple conjunctive flows with
1198 overlapping sets. If one conjunctive flow matches packets with both
1199 <code>ip_src</code> ∈ {<var>a</var>,<var>b</var>} and <code>ip_dst</code> ∈
1200 {<var>d</var>,<var>e</var>} and a second conjunctive flow matches <code>ip_src</code>
1201 ∈ {<var>b</var>,<var>c</var>} and <code>ip_dst</code> ∈ {<var>f</var>,<var>g</var>}, for
1202 example, then the flow that matches <code>ip_src=</code><var>b</var> would have two
1203 <code>conjunction</code> actions, one for each conjunctive flow. The order
1204 of <code>conjunction</code> actions within a list of actions is not
1205 significant.
1206 </li>
1207 <li>
1208 A flow with <code>conjunction</code> actions may also include <code>note</code>
1209 actions for annotations, but not any other kind of actions. (They
1210 would not be useful because they would never be executed.)
1211 </li>
1212 <li>
1213 All of the flows that constitute a conjunctive flow with a given
1214 <var>id</var> must have the same priority. (Flows with the same <var>id</var>
1215 but different priorities are currently treated as different
1216 conjunctive flows, that is, currently <var>id</var> values need only be
1217 unique within an OpenFlow table at a given priority. This behavior
1218 isn't guaranteed to stay the same in later releases, so please use
1219 <var>id</var> values unique within an OpenFlow table.)
1220 </li>
1221 <li>
1222 Conjunctive flows must not overlap with each other, at a given
1223 priority, that is, any given packet must be able to match at most one
1224 conjunctive flow at a given priority. Overlapping conjunctive flows
1225 yield unpredictable results.
1226 </li>
1227 <li>
1228 Following a conjunctive flow match, the search for the flow with
1229 <code>conj_id=</code><var>id</var> is done in the same general-purpose way as
1230 other flow table searches, so one can use flows with
1231 <code>conj_id=</code><var>id</var> to act differently depending on
1232 circumstances. (One exception is that the search for the
1233 <code>conj_id=</code><var>id</var> flow itself ignores conjunctive flows, to
1234 avoid recursion.) If the search with <code>conj_id=</code><var>id</var> fails,
1235 Open vSwitch acts as if the conjunctive flow had not matched at all, and
1236 continues searching the flow table for other matching flows.
1237 </li>
1238 <li>
1239 <p>
1240 OpenFlow prerequisite checking occurs for the flow with
1241 <code>conj_id=</code><var>id</var> in the same way as any other flow, e.g. in
1242 an OpenFlow 1.1+ context, putting a <code>mod_nw_src</code> action into the example
1243 above would require adding an <code>ip</code> match, like this:
1244 </p>
1245 <pre>
1246 conj_id=1234,ip actions=mod_nw_src:1.2.3.4,controller
1247 </pre>
1248 </li>
1249 <li>
1250 OpenFlow prerequisite checking also occurs for the individual flows
1251 that comprise a conjunctive match in the same way as any other flow.
1252 </li>
1253 <li>
1254 The flows that constitute a conjunctive flow do not have useful
1255 statistics. They are never updated with byte or packet counts, and so
1256 on. (For such a flow, therefore, the idle and hard timeouts work much
1257 the same way.)
1258 </li>
1259 <li>
1260 <p>
1261 Sometimes there is a choice of which flows include a particular match.
1262 For example, suppose that we added an extra constraint to our example,
1263 to match on <code>ip_src</code> ∈
1264 {<var>a</var>,<var>b</var>,<var>c</var>,<var>d</var>} and
1265 <code>ip_dst</code> ∈
1266 {<var>e</var>,<var>f</var>,<var>g</var>,<var>h</var>} and
1267 <code>tcp_dst</code> = <var>i</var>. One way to implement this is to
1268 add the new constraint to the <code>conj_id</code> flow, like this:
1269 </p>
1270 <pre>
1271 conj_id=1234,tcp,tcp_dst=<var>i</var> actions=mod_nw_src:1.2.3.4,controller
1272 </pre>
1273 <p>
1274 but <em>this is not recommended</em> because of the cost of the extra
1275 flow table lookup. Instead, add the constraint to the individual
1276 flows, either in one of the dimensions or (slightly better) all of
1277 them.
1278 </p>
1279 </li>
1280 <li>
1281 A conjunctive match must have <var>n</var> ≥ 2 dimensions (otherwise a
1282 conjunctive match is not necessary). Open vSwitch enforces this.
1283 </li>
1284 <li>
1285 Each dimension within a conjunctive match should ordinarily have more
1286 than one flow. Open vSwitch does not enforce this.
1287 </li>
1288 </ul>
1289
1290 <field id="MFF_CONJ_ID" title="Conjunction ID">
1291 Used for conjunctive matching. See above for more information.
1292 </field>
1293 </group>
1294
1295 <group title="Tunnel">
1296 <p>
1297 The fields in this group relate to tunnels, which Open vSwitch
1298 supports in several forms (GRE, VXLAN, and so on). Most of
1299 these fields do appear in the wire format of a packet, so they
1300 are data fields from that point of view, but they are metadata
1301 from an OpenFlow flow table point of view because they do not
1302 appear in packets that are forwarded to the controller or to
1303 ordinary (non-tunnel) output ports.
1304 </p>
1305
1306 <p>
1307 Open vSwitch supports a spectrum of usage models for mapping
1308 tunnels to OpenFlow ports:
1309 </p>
1310
1311 <dl>
1312 <dt>``Port-based'' tunnels</dt>
1313 <dd>
1314 <p>
1315 In this model, an OpenFlow port represents one tunnel: it matches a
1316 particular type of tunnel traffic between two IP endpoints, with a
1317 particular tunnel key (if keys are in use). In this situation, <ref
1318 field="in_port"/> suffices to distinguish one tunnel from another, so
1319 the tunnel header fields have little importance for OpenFlow
1320 processing. (They are still populated and may be used if it is
1321 convenient.) The tunnel header fields play no role in sending
1322 packets out such an OpenFlow port, either, because the OpenFlow port
1323 itself fully specifies the tunnel headers.
1324 </p>
1325
1326 <p>
1327 The following Open vSwitch commands create a bridge
1328 <code>br-int</code>, add port <code>tap0</code> to the bridge as
1329 OpenFlow port 1, establish a port-based GRE tunnel between the local
1330 host and remote IP 192.168.1.1 using GRE key 5001 as OpenFlow port 2,
1331 and arrange to forward all traffic from <code>tap0</code> to the
1332 tunnel and vice versa:
1333 </p>
1334
1335 <pre>
1336ovs-vsctl add-br br-int
1337ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1338ovs-vsctl add-port br-int gre0 -- \
1339 set interface gre0 ofport_request=2 type=gre \
1340 options:remote_ip=192.168.1.1 options:key=5001
1341ovs-ofctl add-flow br-int in_port=1,actions=2
1342ovs-ofctl add-flow br-int in_port=2,actions=1
1343 </pre>
1344 </dd>
1345
1346 <dt>``Flow-based'' tunnels</dt>
1347 <dd>
1348 <p>
1349 In this model, one OpenFlow port represents all possible tunnels of a
1350 given type with an endpoint on the current host, for example, all GRE
1351 tunnels. In this situation, <ref field="in_port"/> only indicates
1352 that traffic was received on the particular kind of tunnel. This is
1353 where the tunnel header fields are most important: they allow the
1354 OpenFlow tables to discriminate among tunnels based on their IP
1355 endpoints or keys. Tunnel header fields also determine the IP
1356 endpoints and keys of packets sent out such a tunnel port.
1357 </p>
1358
1359 <p>
1360 The following Open vSwitch commands create a bridge
1361 <code>br-int</code>, add port <code>tap0</code> to the
1362 bridge as OpenFlow port 1, establish a flow-based GRE tunnel
1363 port 3, and arrange to forward all traffic from
1364 <code>tap0</code> to remote IP 192.168.1.1 over a GRE tunnel
1365 with key 5001 and vice versa:
1366 </p>
1367
1368 <pre>
1369ovs-vsctl add-br br-int
1370ovs-vsctl add-port br-int tap0 -- set interface tap0 ofport_request=1
1371ovs-vsctl add-port br-int allgre -- \
1372 set interface allgre ofport_request=3 type=gre \
1373 options:remote_ip=flow options:key=flow
1374ovs-ofctl add-flow br-int \
1375 'in_port=1 actions=set_tunnel:5001,set_field:192.168.1.1->tun_dst,3'
1376ovs-ofctl add-flow br-int 'in_port=3,tun_src=192.168.1.1,tun_id=5001 actions=1'
1377 </pre>
1378 </dd>
1379
1380 <dt>Mixed models</dt>
1381 <dd>
1382 <p>
1383 One may define both flow-based and port-based tunnels at the
1384 same time. For example, it is valid and possibly useful to
1385 create and configure both <code>gre0</code> and
1386 <code>allgre</code> tunnel ports described above.
1387 </p>
1388
1389 <p>
1390 Traffic is attributed on ingress to the most specific
1391 matching tunnel. For example, <code>gre0</code> is more
1392 specific than <code>allgre</code>. Therefore, if both
1393 exist, then <code>gre0</code> will be the ingress port for any
1394 GRE traffic received from 192.168.1.1 with key 5001.
1395 </p>
1396
1397 <p>
1398 On egress, traffic may be directed to any appropriate tunnel
1399 port. If both <code>gre0</code> and <code>allgre</code> are
1400 configured as already described, then the actions
1401 <code>2</code> and
1402 <code>set_tunnel:5001,set_field:192.168.1.1->tun_dst,3</code>
1403 send the same tunnel traffic.
1404 </p>
1405 </dd>
1406
1407 <dt>Intermediate models</dt>
1408 <dd>
1409 Ports may be configured as partially flow-based. For example,
1410 one may define an OpenFlow port that represents tunnels
1411 between a pair of endpoints but leaves the flow table to
1412 discriminate on the flow key.
1413 </dd>
1414 </dl>
1415
1416 <p>
1417 <code>ovs-vswitchd.conf.db</code>(5) describes all the details of tunnel
1418 configuration.
1419 </p>
1420
1421 <p>
1422 These fields do not have any prerequisites, which means that a
1423 flow may match on any or all of them, in any combination.
1424 </p>
1425
1426 <p>
1427 These fields are zeros for packets that did not arrive on a tunnel.
1428 </p>
1429
1430 <field id="MFF_TUN_ID" title="Tunnel ID">
1431 <p>
1432 Many kinds of tunnels support a tunnel ID:
1433 </p>
1434
1435 <ul>
1436 <li>
1437 VXLAN and Geneve have a 24-bit virtual network identifier (VNI).
1438 </li>
1439 <li>LISP has a 24-bit instance ID.</li>
1440 <li>GRE has an optional 32-bit key.</li>
1441 <li>STT has a 64-bit key.</li>
1442 </ul>
1443
1444 <p>
1445 When a packet is received from a tunnel, this field holds the
1446 tunnel ID in its least significant bits, zero-extended to fit.
1447 This field is zero if the tunnel does not support an ID, or if
1448 no ID is in use for a tunnel type that has an optional ID, or
1449 if an ID of zero was received, or if the packet was not received
1450 over a tunnel.
1451 </p>
1452
1453 <p>
1454 When a packet is output to a tunnel port, the tunnel
1455 configuration determines whether the tunnel ID is taken from
1456 this field or bound to a fixed value. See the earlier
1457 description of ``port-based'' and ``flow-based'' tunnels for
1458 more information.
1459 </p>
1460
1461 <p>
1462 The following diagram shows the origin of this field in a
1463 typical keyed GRE tunnel:
1464 </p>
1465
1466 <diagram>
1467 <header name="Ethernet">
1468 <bits name="dst" above="48" width="0.4"/>
1469 <bits name="src" above="48" width="0.4"/>
1470 <bits name="type" above="16" below="0x800" width="0.4"/>
1471 </header>
1472 <header name="IPv4">
1473 <bits name="..." width="0.4"/>
1474 <bits name="proto" above="8" below="47" width="0.4"/>
1475 <bits name="src" above="32" width="0.4"/>
1476 <bits name="dst" above="32" width="0.4"/>
1477 </header>
1478 <header name="GRE">
1479 <bits name="..." above="16" width="0.4"/>
1480 <bits name="type" above="16" below="0x6558" width="0.4"/>
1481 <bits name="key" above="32" width=".4" fill="yes"/>
1482 </header>
1483 <header name="Ethernet">
1484 <bits name="dst" above="48" width="0.4"/>
1485 <bits name="src" above="48" width="0.4"/>
1486 <bits name="type" above="16" width="0.4"/>
1487 </header>
1488 <dots/>
1489 </diagram>
1490 </field>
1491
1492 <field id="MFF_TUN_SRC" title="Tunnel IPv4 Source">
1493 <p>
1494 When a packet is received from a tunnel, this field is the
1495 source address in the outer IP header of the tunneled packet.
1496 This field is zero if the packet was not received over a
1497 tunnel.
1498 </p>
1499
1500 <p>
1501 When a packet is output to a flow-based tunnel port, this
1502 field influences the IPv4 source address used to send the
1503 packet. If it is zero, then the kernel chooses an appropriate
1504 IP address using the routing table.
1505 </p>
1506
1507 <p>
1508 The following diagram shows the origin of this field in a
1509 typical keyed GRE tunnel:
1510 </p>
1511
1512 <diagram>
1513 <header name="Ethernet">
1514 <bits name="dst" above="48" width="0.4"/>
1515 <bits name="src" above="48" width="0.4"/>
1516 <bits name="type" above="16" below="0x800" width="0.4"/>
1517 </header>
1518 <header name="IPv4">
1519 <bits name="..." width="0.4"/>
1520 <bits name="proto" above="8" below="47" width="0.4"/>
1521 <bits name="src" above="32" width="0.4" fill="yes"/>
1522 <bits name="dst" above="32" width="0.4"/>
1523 </header>
1524 <header name="GRE">
1525 <bits name="..." above="16" width="0.4"/>
1526 <bits name="type" above="16" below="0x6558" width="0.4"/>
1527 <bits name="key" above="32" width=".4"/>
1528 </header>
1529 <header name="Ethernet">
1530 <bits name="dst" above="48" width="0.4"/>
1531 <bits name="src" above="48" width="0.4"/>
1532 <bits name="type" above="16" width="0.4"/>
1533 </header>
1534 <dots/>
1535 </diagram>
1536 </field>
1537
1538 <field id="MFF_TUN_DST" title="Tunnel IPv4 Destination">
1539 <p>
1540 When a packet is received from a tunnel, this field is the
1541 destination address in the outer IP header of the tunneled
1542 packet. This field is zero if the packet was not received
1543 over a tunnel.
1544 </p>
1545
1546 <p>
1547 When a packet is output to a flow-based tunnel port, this
1548 field specifies the destination to which the tunnel packet is
1549 sent.
1550 </p>
1551
1552 <p>
1553 The following diagram shows the origin of this field in a
1554 typical keyed GRE tunnel:
1555 </p>
1556
1557 <diagram>
1558 <header name="Ethernet">
1559 <bits name="dst" above="48" width="0.4"/>
1560 <bits name="src" above="48" width="0.4"/>
1561 <bits name="type" above="16" below="0x800" width="0.4"/>
1562 </header>
1563 <header name="IPv4">
1564 <bits name="..." width="0.4"/>
1565 <bits name="proto" above="8" below="47" width="0.4"/>
1566 <bits name="src" above="32" width="0.4"/>
1567 <bits name="dst" above="32" width="0.4" fill="yes"/>
1568 </header>
1569 <header name="GRE">
1570 <bits name="..." above="16" width="0.4"/>
1571 <bits name="type" above="16" below="0x6558" width="0.4"/>
1572 <bits name="key" above="32" width=".4"/>
1573 </header>
1574 <header name="Ethernet">
1575 <bits name="dst" above="48" width="0.4"/>
1576 <bits name="src" above="48" width="0.4"/>
1577 <bits name="type" above="16" width="0.4"/>
1578 </header>
1579 <dots/>
1580 </diagram>
1581 </field>
1582
1583 <field id="MFF_TUN_IPV6_SRC" title="Tunnel IPv6 Source">
1584 Similar to <ref field="tun_src"/>, but for tunnels over IPv6.
1585 </field>
1586
1587 <field id="MFF_TUN_IPV6_DST" title="Tunnel IPv6 Destination">
1588 Similar to <ref field="tun_dst"/>, but for tunnels over IPv6.
1589 </field>
1590
1591 <h2>VXLAN Group-Based Policy Fields</h2>
1592
1593 <p>
1594 The VXLAN header is defined as follows [RFC 7348], where the
1595 <code>I</code> bit must be set to 1, unlabeled bits or those labeled
1596 <code>reserved</code> must be set to 0, and Open vSwitch makes the VNI
1597 available via <ref field="tun_id"/>:
1598 </p>
1599
1600 <diagram>
1601 <header name="VXLAN flags">
1602 <bits name="" above="1" width="0.15"/>
1603 <bits name="" above="1" width="0.15"/>
1604 <bits name="" above="1" width="0.15"/>
1605 <bits name="" above="1" width="0.15"/>
1606 <bits name="I" above="1" width="0.15"/>
1607 <bits name="" above="1" width="0.15"/>
1608 <bits name="" above="1" width="0.15"/>
1609 <bits name="" above="1" width="0.15"/>
1610 </header>
1611 <nospace/>
1612 <header>
1613 <bits name="reserved" above="24" width="1.2"/>
1614 <bits name="VNI" above="24" width="1.2"/>
1615 <bits name="reserved" above="8" width=".5"/>
1616 </header>
1617 </diagram>
1618
1619 <p>
1620 VXLAN Group-Based Policy [VXLAN Group Policy Option] adds new
1621 interpretations to existing bits in the VXLAN header, reinterpreting it
1622 as follows, with changes highlighted:
1623 </p>
1624
1625 <diagram>
1626 <header name="GBP flags">
1627 <bits name="" above="1" width="0.15"/>
1628 <bits name="D" above="1" width="0.15" fill="yes"/>
1629 <bits name="" above="1" width="0.15"/>
1630 <bits name="" above="1" width="0.15"/>
1631 <bits name="A" above="1" width="0.15" fill="yes"/>
1632 <bits name="" above="1" width="0.15"/>
1633 <bits name="" above="1" width="0.15"/>
1634 <bits name="" above="1" width="0.15"/>
1635 </header>
1636 <nospace/>
1637 <header>
1638 <bits name="group policy ID" above="24" width="1.2" fill="yes"/>
1639 <bits name="VNI" above="24" width="1.2"/>
1640 <bits name="reserved" above="8" width=".5"/>
1641 </header>
1642 </diagram>
1643
1644 <p>
1645 Open vSwitch makes GBP fields and flags available through the following
1646 fields. Only packets that arrive over a VXLAN tunnel with the GBP
1647 extension enabled have these fields set. In other packets they are zero
1648 on receive and ignored on transmit.
1649 </p>
1650
1651 <field id="MFF_TUN_GBP_ID" title="VXLAN Group-Based Policy ID">
1652 <p>
1653 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1654 extension, this field represents the GBP policy ID, as shown above.
1655 </p>
1656 </field>
1657
1658 <field id="MFF_TUN_GBP_FLAGS" title="VXLAN Group-Based Policy Flags">
1659 <p>
1660 For a packet tunneled over VXLAN with the Group-Based Policy (GBP)
1661 extension, this field represents the GBP policy flags, as shown above.
1662 </p>
1663
1664 <p>
1665 The field has the format shown below:
1666 </p>
1667
1668 <diagram>
1669 <header name="GBP Flags">
1670 <bits name="" above="1" width="0.15"/>
1671 <bits name="D" above="1" width="0.15"/>
1672 <bits name="" above="1" width="0.15"/>
1673 <bits name="" above="1" width="0.15"/>
1674 <bits name="A" above="1" width="0.15"/>
1675 <bits name="" above="1" width="0.15"/>
1676 <bits name="" above="1" width="0.15"/>
1677 <bits name="" above="1" width="0.15"/>
1678 </header>
1679 </diagram>
1680
1681 <p>
1682 Unlabeled bits are reserved and must be transmitted as 0. The VXLAN
1683 GBP draft defines the other bits' meanings as:
1684 </p>
1685
1686 <dl>
1687 <dt><code>D</code> (Don't Learn)</dt>
1688 <dd>
1689 When set, this bit indicates that the egress tunnel endpoint must not
1690 learn the source address of the encapsulated frame.
1691 </dd>
1692
1693 <dt><code>A</code> (Applied)</dt>
1694 <dd>
1695 When set, indicates that the group policy has already been applied to
1696 this packet. Devices must not apply policies when the A bit is set.
1697 </dd>
1698 </dl>
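
      <p>
        As an illustration only, the following Python sketch decodes an 8-bit
        <ref field="tun_gbp_flags"/> value into the flag names defined above.
        It assumes that the diagram lists bits from most significant to least
        significant. The names <code>GBP_FLAG_LAYOUT</code> and
        <code>decode_gbp_flags</code> are hypothetical and are not part of
        Open vSwitch.
      </p>

      <pre>
```python
# Bit names of the 8-bit GBP flags field, most significant bit first,
# following the diagram above (assumed ordering); None marks a reserved bit.
GBP_FLAG_LAYOUT = (None, "D", None, None, "A", None, None, None)


def decode_gbp_flags(value):
    """Return the set of flag names that are set in an 8-bit flags value."""
    if value not in range(256):
        raise ValueError("GBP flags must fit in 8 bits")
    bits = format(value, "08b")  # e.g. 0x48 gives "01001000"
    names = set()
    for name, bit in zip(GBP_FLAG_LAYOUT, bits):
        # Reserved bits are ignored on decode in this sketch.
        if bit == "1" and name is not None:
            names.add(name)
    return names


print(sorted(decode_gbp_flags(0x48)))  # both D and A are set
```
      </pre>

      <p>
        Under the stated assumption, <code>D</code> corresponds to
        <code>0x40</code> and <code>A</code> to <code>0x08</code>.
      </p>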
1699 </field>
1700
1701 <h2>Geneve Fields</h2>
1702
1703 <p>
1704 These fields provide access to additional features in the Geneve
1705 tunneling protocol [Geneve]. Their names are somewhat generic in the
1706 hope that the same fields could be reused for other protocols in the
1707 future; for example, the NSH protocol [NSH] supports TLV options whose
1708 form is identical to that for Geneve options.
1709 </p>
1710
1711 <field id="MFF_TUN_METADATA0" title="Generic Tunnel Option 0">
1712 <p>
1713 The above information specifically covers generic tunnel option 0, but
1714 Open vSwitch supports 64 options, numbered 0 through 63, whose
1715 NXM field numbers are 40 through 103.
1716 </p>
1717
1718 <p>
1719 These fields provide OpenFlow access to the generic type-length-value
1720 options defined by the Geneve tunneling protocol or other protocols
1721 with options in the same TLV format as Geneve options. Each of these
1722 options has the following wire format:
1723 </p>
1724
1725 <diagram>
1726 <header name="header">
1727 <bits name="class" above="16" width="0.6"/>
1728 <bits name="type" above="8" width="0.5"/>
1729 <bits name="res" above="3" below="0" width="0.25"/>
1730 <bits name="length" above="5" width="0.4"/>
1731 </header>
1732 <nospace/>
1733 <header name="body">
          <bits name="value" above="4×length bytes" width="1.7"/>
1735 </header>
1736 </diagram>
1737
1738 <p>
1739 Taken together, the <code>class</code> and <code>type</code> in the
1740 option format mean that there are about 16 million distinct kinds of
1741 TLV options, too many to give individual OXM code points. Thus, Open
1742 vSwitch requires the user to define the TLV options of interest, by
1743 binding up to 64 TLV options to generic tunnel option NXM code points.
1744 Each option may have up to 124 bytes in its body, the maximum allowed
1745 by the TLV format, but bound options may total at most 252 bytes of
1746 body.
1747 </p>
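
      <p>
        The wire format and size limit described above can be sketched in
        Python as follows. This is a minimal illustration, not Open vSwitch
        code: the function name <code>pack_tlv_option</code> is hypothetical,
        and the sketch assumes the Geneve convention in which
        <code>length</code> counts 4-byte words of option body and excludes
        the 4-byte option header, so that the 5-bit maximum of 31 words yields
        the 124-byte limit.
      </p>

      <pre>
```python
import struct


def pack_tlv_option(opt_class, opt_type, body):
    """Serialize one option: a 4-byte header followed by the body."""
    if len(body) % 4 != 0:
        raise ValueError("body length must be a multiple of 4 bytes")
    if len(body) > 124:
        raise ValueError("body may be at most 124 bytes")
    # Assumption: length is the body size in 4-byte words (header excluded).
    length_words = len(body) // 4
    # "!HBB" is a 16-bit class, an 8-bit type, and one byte whose upper
    # 3 bits (res) are zero and whose lower 5 bits hold the length.
    return struct.pack("!HBB", opt_class, opt_type, length_words) + body


option = pack_tlv_option(0xFFFF, 0, (1234).to_bytes(4, "big"))
print(option.hex())
```
      </pre>

      <p>
        For class <code>0xffff</code>, type <code>0</code>, and a 4-byte body
        holding 1234, the sketch prints <code>ffff0001000004d2</code>.
      </p>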
1748
1749 <p>
1750 Open vSwitch extensions to the OpenFlow protocol bind TLV options to
1751 NXM code points. The <code>ovs-ofctl</code>(8) program offers one way
1752 to use these extensions, e.g. to configure a mapping from a TLV option
1753 with <code>class</code> <code>0xffff</code>, <code>type</code>
1754 <code>0</code>, and a body length of 4 bytes:
1755 </p>
1756
1757 <pre>
ovs-ofctl add-tlv-map br0 "{class=0xffff,type=0,len=4}->tun_metadata0"
1759 </pre>
1760
1761 <p>
1762 Once a TLV option is properly bound, it can be accessed and modified
1763 like any other field, e.g. to send packets that have value 1234 for the
1764 option described above to the controller:
1765 </p>
1766
1767 <pre>
ovs-ofctl add-flow br0 tun_metadata0=1234,actions=controller
1769 </pre>
1770
1771 <p>
1772 An option not received or not bound is matched as all zeros.
1773 </p>
1774 </field>
1775 <!--- XXX need a way to define a range of OXMs -->
1776 <field id="MFF_TUN_METADATA1" title="Generic Tunnel Option 1" hidden="yes"/>
1777 <field id="MFF_TUN_METADATA2" title="Generic Tunnel Option 2" hidden="yes"/>
1778 <field id="MFF_TUN_METADATA3" title="Generic Tunnel Option 3" hidden="yes"/>
1779 <field id="MFF_TUN_METADATA4" title="Generic Tunnel Option 4" hidden="yes"/>
1780 <field id="MFF_TUN_METADATA5" title="Generic Tunnel Option 5" hidden="yes"/>
1781 <field id="MFF_TUN_METADATA6" title="Generic Tunnel Option 6" hidden="yes"/>
1782 <field id="MFF_TUN_METADATA7" title="Generic Tunnel Option 7" hidden="yes"/>
1783 <field id="MFF_TUN_METADATA8" title="Generic Tunnel Option 8" hidden="yes"/>
1784 <field id="MFF_TUN_METADATA9" title="Generic Tunnel Option 9" hidden="yes"/>
1785 <field id="MFF_TUN_METADATA10" title="Generic Tunnel Option 10" hidden="yes"/>
1786 <field id="MFF_TUN_METADATA11" title="Generic Tunnel Option 11" hidden="yes"/>
1787 <field id="MFF_TUN_METADATA12" title="Generic Tunnel Option 12" hidden="yes"/>
1788 <field id="MFF_TUN_METADATA13" title="Generic Tunnel Option 13" hidden="yes"/>
1789 <field id="MFF_TUN_METADATA14" title="Generic Tunnel Option 14" hidden="yes"/>
1790 <field id="MFF_TUN_METADATA15" title="Generic Tunnel Option 15" hidden="yes"/>
1791 <field id="MFF_TUN_METADATA16" title="Generic Tunnel Option 16" hidden="yes"/>
1792 <field id="MFF_TUN_METADATA17" title="Generic Tunnel Option 17" hidden="yes"/>
1793 <field id="MFF_TUN_METADATA18" title="Generic Tunnel Option 18" hidden="yes"/>
1794 <field id="MFF_TUN_METADATA19" title="Generic Tunnel Option 19" hidden="yes"/>
1795 <field id="MFF_TUN_METADATA20" title="Generic Tunnel Option 20" hidden="yes"/>
1796 <field id="MFF_TUN_METADATA21" title="Generic Tunnel Option 21" hidden="yes"/>
1797 <field id="MFF_TUN_METADATA22" title="Generic Tunnel Option 22" hidden="yes"/>
1798 <field id="MFF_TUN_METADATA23" title="Generic Tunnel Option 23" hidden="yes"/>
1799 <field id="MFF_TUN_METADATA24" title="Generic Tunnel Option 24" hidden="yes"/>
1800 <field id="MFF_TUN_METADATA25" title="Generic Tunnel Option 25" hidden="yes"/>
1801 <field id="MFF_TUN_METADATA26" title="Generic Tunnel Option 26" hidden="yes"/>
1802 <field id="MFF_TUN_METADATA27" title="Generic Tunnel Option 27" hidden="yes"/>
1803 <field id="MFF_TUN_METADATA28" title="Generic Tunnel Option 28" hidden="yes"/>
1804 <field id="MFF_TUN_METADATA29" title="Generic Tunnel Option 29" hidden="yes"/>
1805 <field id="MFF_TUN_METADATA30" title="Generic Tunnel Option 30" hidden="yes"/>
1806 <field id="MFF_TUN_METADATA31" title="Generic Tunnel Option 31" hidden="yes"/>
1807 <field id="MFF_TUN_METADATA32" title="Generic Tunnel Option 32" hidden="yes"/>
1808 <field id="MFF_TUN_METADATA33" title="Generic Tunnel Option 33" hidden="yes"/>
1809 <field id="MFF_TUN_METADATA34" title="Generic Tunnel Option 34" hidden="yes"/>
1810 <field id="MFF_TUN_METADATA35" title="Generic Tunnel Option 35" hidden="yes"/>
1811 <field id="MFF_TUN_METADATA36" title="Generic Tunnel Option 36" hidden="yes"/>
1812 <field id="MFF_TUN_METADATA37" title="Generic Tunnel Option 37" hidden="yes"/>
1813 <field id="MFF_TUN_METADATA38" title="Generic Tunnel Option 38" hidden="yes"/>
1814 <field id="MFF_TUN_METADATA39" title="Generic Tunnel Option 39" hidden="yes"/>
1815 <field id="MFF_TUN_METADATA40" title="Generic Tunnel Option 40" hidden="yes"/>
1816 <field id="MFF_TUN_METADATA41" title="Generic Tunnel Option 41" hidden="yes"/>
1817 <field id="MFF_TUN_METADATA42" title="Generic Tunnel Option 42" hidden="yes"/>
1818 <field id="MFF_TUN_METADATA43" title="Generic Tunnel Option 43" hidden="yes"/>
1819 <field id="MFF_TUN_METADATA44" title="Generic Tunnel Option 44" hidden="yes"/>
1820 <field id="MFF_TUN_METADATA45" title="Generic Tunnel Option 45" hidden="yes"/>
1821 <field id="MFF_TUN_METADATA46" title="Generic Tunnel Option 46" hidden="yes"/>
1822 <field id="MFF_TUN_METADATA47" title="Generic Tunnel Option 47" hidden="yes"/>
1823 <field id="MFF_TUN_METADATA48" title="Generic Tunnel Option 48" hidden="yes"/>
1824 <field id="MFF_TUN_METADATA49" title="Generic Tunnel Option 49" hidden="yes"/>
1825 <field id="MFF_TUN_METADATA50" title="Generic Tunnel Option 50" hidden="yes"/>
1826 <field id="MFF_TUN_METADATA51" title="Generic Tunnel Option 51" hidden="yes"/>
1827 <field id="MFF_TUN_METADATA52" title="Generic Tunnel Option 52" hidden="yes"/>
1828 <field id="MFF_TUN_METADATA53" title="Generic Tunnel Option 53" hidden="yes"/>
1829 <field id="MFF_TUN_METADATA54" title="Generic Tunnel Option 54" hidden="yes"/>
1830 <field id="MFF_TUN_METADATA55" title="Generic Tunnel Option 55" hidden="yes"/>
1831 <field id="MFF_TUN_METADATA56" title="Generic Tunnel Option 56" hidden="yes"/>
1832 <field id="MFF_TUN_METADATA57" title="Generic Tunnel Option 57" hidden="yes"/>
1833 <field id="MFF_TUN_METADATA58" title="Generic Tunnel Option 58" hidden="yes"/>
1834 <field id="MFF_TUN_METADATA59" title="Generic Tunnel Option 59" hidden="yes"/>
1835 <field id="MFF_TUN_METADATA60" title="Generic Tunnel Option 60" hidden="yes"/>
1836 <field id="MFF_TUN_METADATA61" title="Generic Tunnel Option 61" hidden="yes"/>
1837 <field id="MFF_TUN_METADATA62" title="Generic Tunnel Option 62" hidden="yes"/>
1838 <field id="MFF_TUN_METADATA63" title="Generic Tunnel Option 63" hidden="yes"/>
1839
1840 <field id="MFF_TUN_FLAGS" title="Tunnel Flags">
1841 <p>
1842 Flags indicating various aspects of the tunnel encapsulation.
1843 </p>
1844
1845 <p>
1846 Matches on this field are most conveniently written in terms of
1847 symbolic names (given in the diagram below), each preceded by either
1848 <code>+</code> for a flag that must be set, or <code>-</code> for a
1849 flag that must be unset, without any other delimiters between the
1850 flags. Flags not mentioned are wildcarded. For example,
1851 <code>tun_flags=+oam</code> matches only OAM packets. Matches can also
1852 be written as <code><var>flags</var>/<var>mask</var></code>, where
1853 <var>flags</var> and <var>mask</var> are 16-bit numbers in decimal or
1854 in hexadecimal prefixed by <code>0x</code>.
1855 </p>
1856
1857 <p>
1858 Currently, only one flag is defined:
1859 </p>
1860
1861 <dl>
1862 <dt><code>oam</code></dt>
1863 <dd>
          The tunnel protocol indicated that this is an OAM (Operations,
          Administration, and Maintenance) control packet.
1866 </dd>
1867 </dl>
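
      <p>
        The relationship between the symbolic syntax and the
        <code><var>flags</var>/<var>mask</var></code> form can be sketched in
        Python. This is only an illustration: the function name
        <code>parse_tun_flags</code> is hypothetical, and the bit value
        assigned to <code>oam</code> in <code>ASSUMED_FLAG_BITS</code> is an
        assumption made for the example, since the text here names the flag
        but does not give its bit position.
      </p>

      <pre>
```python
import re

# Assumption for illustration only: the numeric bit of each symbolic flag.
ASSUMED_FLAG_BITS = {"oam": 0x1}


def parse_tun_flags(text):
    """Convert a string such as "+oam" into a (flags, mask) pair."""
    if not re.fullmatch(r"(?:[+-][a-z_]+)+", text):
        raise ValueError("expected a sequence of +name or -name items")
    flags = mask = 0
    for sign, name in re.findall(r"([+-])([a-z_]+)", text):
        bit = ASSUMED_FLAG_BITS[name]  # KeyError for an unknown flag name
        mask |= bit                    # every mentioned flag is matched
        if sign == "+":
            flags |= bit               # "+" requires the flag to be set
    return flags, mask


for spec in ("+oam", "-oam"):
    flags, mask = parse_tun_flags(spec)
    print(f"tun_flags={spec} corresponds to tun_flags={flags:#x}/{mask:#x}")
```
      </pre>

      <p>
        Flags that are not mentioned contribute nothing to the mask, which is
        what makes them wildcarded.
      </p>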
1868
1869 <p>
1870 The switch may reject matches against unknown flags.
1871 </p>
1872
1873 <p>
        Newer versions of Open vSwitch may introduce additional flags with new
        meanings. Exact matches on this field are therefore not recommended:
        a flow should wildcard every flag whose meaning it does not rely on,
        so that flags added later are ignored.
1878 </p>
1879
1880 <p>
1881 For non-tunneled packets, the value is 0.
1882 </p>
1883 </field>
1884
1885 <!-- Open vSwitch uses the following fields internally, but it
1886 does not expose them to the user via OpenFlow, so we do not
1887 document them. -->
1888 <field id="MFF_TUN_TTL" title="Tunnel IPv4 Time-to-Live" internal="yes"/>
1889 <field id="MFF_TUN_TOS" title="Tunnel IPv4 Type of Service" internal="yes"/>
1890 </group>
1891
1892 <group title="Metadata">
1893 <p>
1894 These fields relate to the origin or treatment of a packet, but
1895 they are not extracted from the packet data itself.
1896 </p>
1897
1898 <field id="MFF_IN_PORT" title="Ingress Port">
1899 <p>
1900 The OpenFlow port on which the packet being processed arrived.
1901 This is a 16-bit field that holds an OpenFlow 1.0 port number.
        When a packet is received, the only values that can appear in this
        field are:
1904 </p>
1905
1906 <dl>
1907 <dt>1 through <code>0xfeff</code> (65,279), inclusive.</dt>
1908 <dd>
1909 Conventional OpenFlow port numbers.
1910 </dd>
1911
1912 <dt><code>OFPP_LOCAL</code> (<code>0xfffe</code> or 65,534).</dt>
1913 <dd>
1914 <p>
1915 The ``local'' port, which in Open vSwitch is always named
1916 the same as the bridge itself. This represents a
1917 connection between the switch and the local TCP/IP stack.
1918 This port is where an IP address is most commonly
1919 configured on an Open vSwitch switch.
1920 </p>
1921
1922 <p>
1923 OpenFlow does not require a switch to have a local port,
1924 but all existing versions of Open vSwitch have always
1925 included a local port. <b>Future Directions:</b> Future
1926 versions of Open vSwitch might be able to optionally omit
1927 the local port, if someone submits code to implement such
1928 a feature.
1929 </p>
1930 </dd>
1931
1932 <dt><code>OFPP_NONE</code> (OpenFlow 1.0) or <code>OFPP_ANY</code> (OpenFlow 1.1+) (<code>0xffff</code> or 65,535).</dt>
1933 <dt><code>OFPP_CONTROLLER</code> (<code>0xfffd</code> or 65,533).</dt>
1934 <dd>
1935 <p>
1936 When a controller injects a packet into an OpenFlow switch
1937 with a ``packet-out'' request, it can specify one of these
1938 ingress ports to indicate that the packet was generated
1939 internally rather than having been received on some port.
1940 </p>
1941
1942 <p>
1943 OpenFlow 1.0 specified <code>OFPP_NONE</code> for this
1944 purpose. Despite that, some controllers used
1945 <code>OFPP_CONTROLLER</code>, and some switches only
1946 accepted <code>OFPP_CONTROLLER</code>, so OpenFlow 1.0.2
1947 required support for both ports. OpenFlow 1.1 and later
1948 were more clearly drafted to allow only
1949 <code>OFPP_CONTROLLER</code>. For maximum compatibility,
1950 Open vSwitch allows both ports with all OpenFlow versions.
1951 </p>
1952 </dd>
1953 </dl>
1954
1955 <p>
1956 Values not mentioned above will never appear when receiving a
1957 packet, including the following notable values:
1958 </p>
1959
1960 <dl>
1961 <dt>0</dt>
1962 <dd>
1963 Zero is not a valid OpenFlow port number.
1964 </dd>
1965
1966 <dt><code>OFPP_MAX</code> (<code>0xff00</code> or 65,280).</dt>
1967 <dd>
1968 This value has only been clearly specified as a valid port
1969 number as of OpenFlow 1.3.3. Before that, its status was
1970 unclear, and so Open vSwitch has never allowed
1971 <code>OFPP_MAX</code> to be used as a port number, so
1972 packets will never be received on this port. (Other
1973 OpenFlow switches, of course, might use it.)
1974 </dd>
1975
1976 <dt><code>OFPP_UNSET</code> (<code>0xfff7</code> or 65,527)</dt>
1977 <dt><code>OFPP_IN_PORT</code> (<code>0xfff8</code> or 65,528)</dt>
1978 <dt><code>OFPP_TABLE</code> (<code>0xfff9</code> or 65,529)</dt>
1979 <dt><code>OFPP_NORMAL</code> (<code>0xfffa</code> or 65,530)</dt>
1980 <dt><code>OFPP_FLOOD</code> (<code>0xfffb</code> or 65,531)</dt>
1981 <dt><code>OFPP_ALL</code> (<code>0xfffc</code> or 65,532)</dt>
1982 <dd>
1983 <p>
1984 These port numbers are used only in output actions and never
1985 appear as ingress ports.
1986 </p>
1987
1988 <p>
1989 Most of these port numbers were defined in OpenFlow 1.0, but
1990 <code>OFPP_UNSET</code> was only introduced in OpenFlow 1.5.
1991 </p>
1992 </dd>
1993 </dl>
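
      <p>
        The two lists above amount to a simple classification of 16-bit port
        numbers, which the following Python sketch spells out. The constant
        values are the ones given in the text; the function name
        <code>classify_ingress_port</code> and its return strings are
        hypothetical and serve only to illustrate the rules.
      </p>

      <pre>
```python
OFPP_MAX = 0xFF00
OFPP_CONTROLLER = 0xFFFD
OFPP_LOCAL = 0xFFFE
OFPP_NONE = 0xFFFF  # called OFPP_ANY in OpenFlow 1.1 and later


def classify_ingress_port(port):
    """Describe how a 16-bit OpenFlow 1.0 port number can occur in in_port."""
    if port in range(1, OFPP_MAX):          # 1 through 0xfeff
        return "conventional port"
    if port == OFPP_LOCAL:
        return "local port"
    if port in (OFPP_NONE, OFPP_CONTROLLER):
        return "packet-out from controller"
    # 0, OFPP_MAX, and the output-only reserved ports land here.
    return "never seen on receive"


for port in (1, 0xFEFF, 0, OFPP_MAX, 0xFFF8, OFPP_LOCAL):
    print(hex(port), classify_ingress_port(port))
```
      </pre>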
1994
1995 <p>
        Flows may still match on values that never appear when a packet is
        received. Such flows can be hit in the following circumstances:
1999 </p>
2000
2001 <ul>
2002 <li>
2003 The <code>resubmit</code> Open vSwitch extension action allows a
2004 flow table lookup with an arbitrary ingress port.
2005 </li>
2006
2007 <li>
          An action that modifies the ingress port field (see below),
          such as <code>load</code> or <code>set_field</code>,
2010 followed by an action or instruction that performs another
2011 flow table lookup, such as <code>resubmit</code> or
2012 <code>goto_table</code>.
2013 </li>
2014 </ul>
2015
2016 <p>
2017 This field is heavily used for matching in OpenFlow tables,
2018 but for packet egress, it has only very limited roles:
2019 </p>
2020
2021 <ul>
2022 <li>
2023 <p>
2024 OpenFlow requires suppressing output actions to <ref
2025 field="in_port"/>. That is, the following two flows both drop all
2026 packets that arrive on port 1:
2027 </p>
2028
2029 <pre>
in_port=1,actions=1
in_port=1,actions=drop
2032 </pre>
2033
2034 <p>
2035 (This behavior is occasionally useful for flooding to a
2036 subset of ports. Specifying <code>actions=1,2,3,4</code>,
2037 for example, outputs to ports 1, 2, 3, and 4, omitting the
2038 ingress port.)
2039 </p>
2040 </li>
2041
2042 <li>
2043 OpenFlow has a special port <code>OFPP_IN_PORT</code> (with
2044 value 0xfff8) that outputs to the ingress port. For example,
2045 in a switch that has four ports numbered 1 through 4,
2046 <code>actions=1,2,3,4,in_port</code> outputs to ports 1, 2,
2047 3, and 4, including the ingress port.
2048 </li>
2049 </ul>
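
      <p>
        The two egress rules above can be modeled in a few lines of Python.
        This is a simplified sketch of the behavior described here, not of the
        Open vSwitch implementation; the function name
        <code>egress_ports</code> and the use of the string
        <code>"in_port"</code> to stand for <code>OFPP_IN_PORT</code> are
        conventions invented for the example.
      </p>

      <pre>
```python
def egress_ports(in_port, outputs):
    """Return the ports a packet is actually sent to.

    `outputs` lists port numbers in action order; the string "in_port"
    stands for the special OFPP_IN_PORT port.
    """
    sent = []
    for port in outputs:
        if port == "in_port":
            sent.append(in_port)   # explicit request to send back out
        elif port != in_port:
            sent.append(port)      # ordinary output
        # Otherwise the output to the ingress port is suppressed.
    return sent


print(egress_ports(1, [1]))                      # []
print(egress_ports(2, [1, 2, 3, 4]))             # [1, 3, 4]
print(egress_ports(2, [1, 2, 3, 4, "in_port"]))  # [1, 3, 4, 2]
```
      </pre>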
2050
2051 <p>
2052 Because the ingress port field has so little influence on packet
2053 processing, it does not ordinarily make sense to modify the
2054 ingress port field. The field is writable only to support the
2055 occasional use case where the ingress port's roles in packet
2056 egress, described above, become troublesome. For example,
2057 <code>actions=load:0-&gt;NXM_OF_IN_PORT[],output:123</code>
        will output to port 123 regardless of whether it is the ingress
        port. If the original ingress port is needed afterward, one may save
        and restore it on the stack:
2061 </p>
2062
2063 <pre>
actions=push:NXM_OF_IN_PORT[],load:0->NXM_OF_IN_PORT[],output:123,pop:NXM_OF_IN_PORT[]
2065 </pre>
2066
2067 <p>
2068 or, in Open vSwitch 2.7 or later, use the <code>clone</code> action to
2069 save and restore it:
2070 </p>
2071
2072 <pre>
actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
2074 </pre>
2075
2076 <p>
2077 The ability to modify the ingress port is an Open vSwitch
2078 extension to OpenFlow.
2079 </p>
2080 </field>
2081
2082 <field id="MFF_IN_PORT_OXM" title="OXM Ingress Port">
2083 <p>
2084 OpenFlow 1.1 and later use a 32-bit port number, so this field
2085 supplies a 32-bit view of the ingress port. Current versions of
2086 Open vSwitch support only a 16-bit range of ports:
2087 </p>
2088
2089 <ul>
2090 <li>
2091 OpenFlow 1.0 ports <code>0x0000</code> to
2092 <code>0xfeff</code>, inclusive, map to OpenFlow 1.1
2093 port numbers with the same values.
2094 </li>
2095
2096 <li>
2097 OpenFlow 1.0 ports <code>0xff00</code> to
2098 <code>0xffff</code>, inclusive, map to OpenFlow 1.1 port
2099 numbers <code>0xffffff00</code> to <code>0xffffffff</code>.
2100 </li>
2101
2102 <li>
2103 OpenFlow 1.1 ports <code>0x0000ff00</code> to
2104 <code>0xfffffeff</code> are not mapped and not supported.
2105 </li>
2106 </ul>
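
      <p>
        The mapping in the list above can be written out as a pair of
        conversion functions. The following Python sketch is illustrative
        only; the names <code>ofp10_to_ofp11</code>,
        <code>ofp11_to_ofp10</code>, and <code>OFFSET</code> are hypothetical.
      </p>

      <pre>
```python
OFPP_MAX = 0xFF00      # first reserved OpenFlow 1.0 port number
OFFSET = 0xFFFF0000    # distance between 0xff00 and 0xffffff00


def ofp10_to_ofp11(port):
    """Map a 16-bit OpenFlow 1.0 port number to its 32-bit equivalent."""
    if port not in range(0x10000):
        raise ValueError("not a 16-bit port number")
    return port + OFFSET if port >= OFPP_MAX else port


def ofp11_to_ofp10(port):
    """Map a 32-bit port number back to 16 bits, if it is supported."""
    if port in range(OFPP_MAX):
        return port
    if port in range(OFPP_MAX + OFFSET, 0x100000000):
        return port - OFFSET
    raise ValueError("port number outside the supported 16-bit range")


print(hex(ofp10_to_ofp11(0xFFFE)))      # 0xfffffffe
print(hex(ofp11_to_ofp10(0xFFFFFF00)))  # 0xff00
```
      </pre>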
2107
2108 <p>
2109 <ref field="in_port"/> and <ref field="in_port_oxm"/> are two views of
2110 the same information, so all of the comments on <ref field="in_port"/>
2111 apply to <ref field="in_port_oxm"/> too. Modifying <ref
2112 field="in_port"/> changes <ref field="in_port_oxm"/>, and vice versa.
2113 </p>
2114
2115 <p>
2116 Setting <ref field="in_port_oxm"/> to an unsupported value yields
2117 unspecified behavior.
2118 </p>
2119 </field>
2120
2121 <field id="MFF_SKB_PRIORITY" title="Output Queue">
2122 <p>
2123 <b>Future Directions:</b> Open vSwitch implements the output queue as a
2124 field, but does not currently expose it through OXM or NXM for matching
2125 purposes. If this turns out to be a useful feature, it could be
2126 implemented in future versions. Only the <code>set_queue</code>,
2127 <code>enqueue</code>, and <code>pop_queue</code> actions currently
2128 influence the output queue.
2129 </p>
2130
2131 <p>
2132 This field influences how packets in the flow will be queued,
2133 for quality of service (QoS) purposes, when they egress the
2134 switch. Its range of meaningful values, and their meanings,
2135 varies greatly from one OpenFlow implementation to another.
2136 Even within a single implementation, there is no guarantee
2137 that all OpenFlow ports have the same queues configured or
2138 that all OpenFlow ports in an implementation can be configured
2139 the same way queue-wise.
2140 </p>
2141
2142 <p>
2143 Configuring queues on OpenFlow is not well standardized. On
2144 Linux, Open vSwitch supports queue configuration via OVSDB,
2145 specifically the <code>QoS</code> and <code>Queue</code>
2146 tables (see <code>ovs-vswitchd.conf.db(5)</code> for details).
2147 Ports of Open vSwitch to other platforms might require queue
2148 configuration through some separate protocol (such as a CLI).
2149 Even on Linux, Open vSwitch exposes only a fraction of the
2150 kernel's queuing features through OVSDB, so advanced or
2151 unusual uses might require use of separate utilities
2152 (e.g. <code>tc</code>). OpenFlow switches other than Open
2153 vSwitch might use OF-CONFIG or any of the configuration
2154 methods mentioned above. Finally, some OpenFlow switches have
2155 a fixed number of fixed-function queues (e.g. eight queues
2156 with strictly defined priorities) and others do not support
2157 any control over queuing.
2158 </p>
2159
2160 <p>
2161 The only output queue that all OpenFlow implementations must
2162 support is zero, to identify a default queue, whose properties
2163 are implementation-defined. Outputting a packet to a queue
2164 that does not exist on the output port yields unpredictable
2165 behavior: among the possibilities are that the packet might be
2166 dropped or transmitted with a very high or very low priority.
2167 </p>
2168
2169 <p>
2170 OpenFlow 1.0 only allowed output queues to be specified as part of an
2171 <code>enqueue</code> action that specified both a queue and an output
2172 port. That is, OpenFlow 1.0 treats the queue as an argument to an
2173 action, not as a field.
2174 </p>
2175
2176 <p>
2177 To increase flexibility, OpenFlow 1.1 added an action to set the output
2178 queue. This model was carried forward, without change, through
2179 OpenFlow 1.5.
2180 </p>
2181
2182 <p>
2183 Open vSwitch implements the native queuing model of each
2184 OpenFlow version it supports. Open vSwitch also includes an
2185 extension for setting the output queue as an action in
2186 OpenFlow 1.0.
2187 </p>
2188
2189 <p>
2190 When a packet ingresses into an OpenFlow switch, the output
2191 queue is ordinarily set to 0, indicating the default queue.
2192 However, Open vSwitch supports various ways to forward a
2193 packet from one OpenFlow switch to another within a single
2194 host. In these cases, Open vSwitch maintains the output queue
2195 across the forwarding step. For example:
2196 </p>
2197
2198 <ul>
2199 <li>
2200 A hop across an Open vSwitch ``patch port'' (which does not
2201 actually involve queuing) preserves the output queue.
2202 </li>
2203
2204 <li>
2205 <p>
2206 When a flow sets the output queue then outputs to an
2207 OpenFlow tunnel port, the encapsulation preserves the
2208 output queue. If the kernel TCP/IP stack routes the
2209 encapsulated packet directly to a physical interface, then
2210 that output honors the output queue. Alternatively, if
2211 the kernel routes the encapsulated packet to another Open
2212 vSwitch bridge, then the output queue set previously
2213 becomes the initial output queue on ingress to the second
2214 bridge and will thus be used for further output actions
2215 (unless overridden by a new ``set queue'' action).
2216 </p>
2217
2218 <p>
2219 (This description reflects the current behavior of Open
2220 vSwitch on Linux. This behavior relies on details of the
2221 Linux TCP/IP stack. It could be difficult to make ports
2222 to other operating systems behave the same way.)
2223 </p>
2224 </li>
2225 </ul>
2226 </field>
2227
2228 <field id="MFF_PKT_MARK" title="Packet Mark">
2229 <p>
2230 Packet mark comes to Open vSwitch from the Linux kernel, in
2231 which the <code>sk_buff</code> data structure that represents
2232 a packet contains a 32-bit member named <code>skb_mark</code>.
2233 The value of <code>skb_mark</code> propagates along with the
2234 packet it accompanies wherever the packet goes in the kernel.
2235 It has no predefined semantics but various kernel-user
2236 interfaces can set and match on it, which makes it suitable
2237 for ``marking'' packets at one point in their handling and
2238 then acting on the mark later. With <code>iptables</code>,
2239 for example, one can mark some traffic specially at ingress
2240 and then handle that traffic differently at egress based on
2241 the marked value.
2242 </p>
2243
2244 <p>
2245 Packet mark is an attempt at a generalization of the
2246 <code>skb_mark</code> concept beyond Linux, at least through more
2247 generic naming. Like <ref field="skb_priority"/>, packet mark is
2248 preserved across forwarding steps within a machine. Unlike <ref
2249 field="skb_priority"/>, packet mark has no direct effect on packet
2250 forwarding: the value set in packet mark does not matter unless some
2251 later OpenFlow table or switch matches on packet mark, or unless the
2252 packet passes through some other kernel subsystem that has been
2253 configured to interpret packet mark in specific ways, e.g. through
2254 <code>iptables</code> configuration mentioned above.
2255 </p>
2256
2257 <p>
2258 Preserving packet mark across kernel forwarding steps relies
2259 heavily on kernel support, which ports to non-Linux operating
2260 systems may not have. Regardless of operating system support,
2261 Open vSwitch supports packet mark within a single bridge and
2262 across patch ports.
2263 </p>
2264
2265 <p>
2266 The value of packet mark when a packet ingresses into the
        first Open vSwitch bridge is typically zero, but it could be
2268 nonzero if its value was previously set by some kernel
2269 subsystem.
2270 </p>
2271 </field>
2272
2273 <field id="MFF_ACTSET_OUTPUT" title="Action Set Output Port">
2274 <p>
2275 Holds the output port currently in the OpenFlow action set (i.e. from
2276 an <code>output</code> action within a <code>write_actions</code>
2277 instruction). Its value is an OpenFlow port number. If there is no
2278 output port in the OpenFlow action set, or if the output port will be
2279 ignored (e.g. because there is an output group in the OpenFlow action
2280 set), then the value will be <code>OFPP_UNSET</code>.
2281 </p>
2282
2283 <p>
2284 Open vSwitch allows any table to match this field. OpenFlow, however,
2285 only requires this field to be matchable from within an OpenFlow egress
2286 table (a feature that Open vSwitch does not yet implement).
2287 </p>
2288 </field>
2289
2290 <field id="MFF_DP_HASH" title="Datapath Hash" internal="yes"/>
2291 <field id="MFF_RECIRC_ID" title="Datapath Recirculation ID" internal="yes"/>
2292
2293 <field id="MFF_PACKET_TYPE" title="Packet Type">
2294 <p>
2295 The type of the packet in the format specified in OpenFlow 1.5:
2296 </p>
2297
2298 <diagram>
2299 <header name="Packet type">
2300 <bits name="ns" above="16" width=".75"/>
2301 <bits name="ns_type" above="16" width=".75"/>
2302 </header>
2303 <dots/>
2304 </diagram>
2305
2306 <p>
2307 The upper 16 bits, <var>ns</var>, are a namespace. The meaning of
2308 <var>ns_type</var> depends on the namespace. The packet type field is
2309 specified and displayed in the format
2310 <code>(<var>ns</var>,<var>ns_type</var>)</code>.
2311 </p>
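
      <p>
        As a rough illustration of the layout above, the following Python
        sketch splits a 32-bit packet type into its two halves and renders it
        in the <code>(<var>ns</var>,<var>ns_type</var>)</code> form. The
        function names are hypothetical, and printing <var>ns_type</var> in
        hexadecimal (with a bare <code>0</code> for zero) is an assumption
        modeled on the examples in this section.
      </p>

      <pre>
```python
def split_packet_type(value):
    """Split a 32-bit packet type into its (ns, ns_type) halves."""
    if value not in range(2**32):
        raise ValueError("packet type must fit in 32 bits")
    return divmod(value, 0x10000)  # (upper 16 bits, lower 16 bits)


def format_packet_type(value):
    """Render a packet type as "(ns,ns_type)"."""
    ns, ns_type = split_packet_type(value)
    return "({},{})".format(ns, hex(ns_type) if ns_type else "0")


print(format_packet_type(0x00000000))  # (0,0)
print(format_packet_type(0x00010800))  # (1,0x800)
```
      </pre>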
2312
2313 <p>
2314 Open vSwitch currently supports the following classes of packet types
2315 for matching:
2316 <dl>
2317 <dt><code>(0,0)</code></dt>
2318 <dd>Ethernet.</dd>
2319 <dt><code>(1,<var>ethertype</var>)</code></dt>
2320 <dd>
2321 <p>
2322 The specified <var>ethertype</var>. Open vSwitch can forward
2323 packets with any <var>ethertype</var>, but it can only match on
2324 and process data fields for the following supported packet types:
2325 </p>
2326 <dl>
2327 <dt><code>(1,0x800)</code></dt> <dd>IPv4</dd>
2328 <dt><code>(1,0x806)</code></dt> <dd>ARP</dd>
2329 <dt><code>(1,0x86dd)</code></dt> <dd>IPv6</dd>
2330 <dt><code>(1,0x8847)</code></dt> <dd>MPLS</dd>
2331 <dt><code>(1,0x8848)</code></dt> <dd>MPLS multicast</dd>
2332 <dt><code>(1,0x8035)</code></dt> <dd>RARP</dd>
2333 <dt><code>(1,0x894f)</code></dt> <dd>NSH</dd>
2334 </dl>
2335 </dd>
2336 </dl>
2337 </p>
2338
2339 <p>
2340 Consider the distinction between a packet with <code>packet_type=(0,0),
2341 dl_type=0x800</code> and one with <code>packet_type=(1,0x800)</code>.
2342 The former is an Ethernet frame that contains an IPv4 packet, like
2343 this:
2344 </p>
2345
2346 <diagram>
2347 <header name="Ethernet">
2348 <bits name="dst" above="48" width="0.4"/>
2349 <bits name="src" above="48" width="0.4"/>
2350 <bits name="type" above="16" below="0x800" width="0.4"/>
2351 </header>
2352 <header name="IPv4">
2353 <bits name="..." width="0.4"/>
2354 <bits name="proto" above="8" width="0.4"/>
2355 <bits name="src" above="32" width="0.4"/>
2356 <bits name="dst" above="32" width="0.4"/>
2357 </header>
2358 <dots/>
2359 </diagram>
2360
2361 <p>
2362 The latter is an IPv4 packet not encapsulated inside any outer frame,
2363 like this:
2364 </p>
2365
2366 <diagram>
2367 <header name="IPv4">
2368 <bits name="..." width="0.4"/>
2369 <bits name="proto" above="8" width="0.4"/>
2370 <bits name="src" above="32" width="0.4"/>
2371 <bits name="dst" above="32" width="0.4"/>
2372 </header>
2373 <dots/>
2374 </diagram>
2375
2376 <p>
        Matching on <ref field="packet_type"/> is a prerequisite for matching
2378 on any data field, but for backward compatibility, when a match on a
2379 data field is present without a <ref field="packet_type"/> match, Open
2380 vSwitch acts as though a match on <code>(0,0)</code> (Ethernet) had
2381 been supplied. Similarly, when Open vSwitch sends flow match
2382 information to a controller, e.g. in a reply to a request to dump the
2383 flow table, Open vSwitch omits a match on packet type (0,0) if it would
2384 be implied by a data field match.
2385 </p>
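<p>
The backward-compatibility rule above can be sketched as a normalization
step. This is an illustration only: the dictionary representation of a
match, the helper name <code>add_implied_packet_type</code>, and the small
<code>DATA_FIELDS</code> subset are assumptions made for the example, not
Open vSwitch code.
</p>

```python
ETHERNET = (0, 0)  # packet_type namespace 0, ns_type 0

# Small illustrative subset of data fields (assumption for this sketch).
DATA_FIELDS = {"dl_type", "nw_src", "nw_dst"}


def add_implied_packet_type(match, data_fields=DATA_FIELDS):
    """Return a copy of `match` with packet_type=(0,0) added when a data
    field is matched without an explicit packet_type match."""
    normalized = dict(match)
    has_data_field = any(name in data_fields for name in normalized)
    if "packet_type" not in normalized and has_data_field:
        normalized["packet_type"] = ETHERNET
    return normalized
```

<p>
With this sketch, <code>{"dl_type": 0x800}</code> gains an Ethernet
<code>packet_type</code>, while a match that already names
<code>(1,0x800)</code> is left unchanged.
</p>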
2386 </field>
2387
2388 </group>
2389
2390 <group title="Connection Tracking">
2391 <p>
2392 Open vSwitch 2.5 and later support ``connection tracking,'' which allows
2393 bidirectional streams of packets to be statefully grouped into
2394 connections. Open vSwitch connection tracking, for example, identifies
2395 the patterns of TCP packets that indicate a successfully initiated
2396 connection, as well as those that indicate that a connection has been
2397 torn down. Open vSwitch connection tracking can also identify related
2398 connections, such as FTP data connections spawned from FTP control
2399 connections.
2400 </p>
2401
2402 <p>
2403 An individual packet passing through the pipeline may be in one of two
2404 states, ``untracked'' or ``tracked,'' which may be distinguished via the
2405 ``trk'' flag in <ref field="ct_state"/>. A packet is
2406 <dfn>untracked</dfn> at the beginning of the Open vSwitch pipeline and
2407 continues to be untracked until the pipeline invokes the <code>ct</code>
2408 action. The connection tracking fields are all zeroes in an untracked
2409 packet. When a flow in the Open vSwitch pipeline invokes the
2410 <code>ct</code> action, the action initializes the connection tracking
2411 fields and the packet becomes <dfn>tracked</dfn> for the remainder of its
2412 processing.
2413 </p>
2414
2415 <p>
2416 The connection tracker stores connection state in an internal table, but
2417 it only adds a new entry to this table when the pipeline invokes
2418 <code>ct</code> with the <code>commit</code> parameter for a new
2419 connection. For a given connection, when a pipeline has executed
2420 <code>ct</code>, but not yet with <code>commit</code>, the connection is
2421 said to be <dfn>uncommitted</dfn>. State for an uncommitted connection
2422 is ephemeral and does not persist past the end of the pipeline, so some
2423 features are only available to committed connections. A connection would
2424 typically be left uncommitted as a way to drop its packets.
2425 </p>
2426
2427 <p>
2428 Connection tracking is an Open vSwitch extension to OpenFlow.
2429 </p>
2430
2431 <field id="MFF_CT_STATE" title="Connection Tracking State">
2432 <p>
2433 This field holds several flags that can be used to determine the state
2434 of the connection to which the packet belongs.
2435 </p>
2436
2437 <p>
2438 Matches on this field are most conveniently written in terms of
2439 symbolic names (listed below), each preceded by either <code>+</code>
2440 for a flag that must be set, or <code>-</code> for a flag that must be
2441 unset, without any other delimiters between the flags. Flags not
2442 mentioned are wildcarded. For example,
2443 <code>tcp,ct_state=+trk-new</code> matches TCP packets that have been
2444 run through the connection tracker and do not establish a new
2445 connection. Matches can also be written as
2446 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
2447 and <var>mask</var> are 32-bit numbers in decimal or in hexadecimal
2448 prefixed by <code>0x</code>.
2449 </p>
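<p>
The relationship between the symbolic and numeric forms can be sketched in
a few lines of Python. The flag values are the ones defined for this field;
the function names <code>parse_ct_state</code> and
<code>ct_state_matches</code> are hypothetical and exist only for this
example.
</p>

```python
import re

# Flag bits as defined for the ct_state field.
CT_FLAGS = {"new": 0x01, "est": 0x02, "rel": 0x04, "rpl": 0x08,
            "inv": 0x10, "trk": 0x20, "snat": 0x40, "dnat": 0x80}


def parse_ct_state(text):
    """Convert a symbolic match such as '+trk-new' to (flags, mask)."""
    tokens = re.findall(r"([+-])([a-z]+)", text)
    # Reject input with stray characters or delimiters between flags.
    if "".join(sign + name for sign, name in tokens) != text:
        raise ValueError("malformed ct_state match: %r" % text)
    flags = mask = 0
    for sign, name in tokens:
        bit = CT_FLAGS[name]   # KeyError for an unknown flag name
        mask |= bit            # a mentioned flag is no longer wildcarded
        if sign == "+":
            flags |= bit       # '+' requires the flag to be set
    return flags, mask


def ct_state_matches(ct_state, flags, mask):
    """Apply a flags/mask match to a packet's ct_state value."""
    return (ct_state & mask) == flags
```

<p>
Under these assumptions, <code>+trk-new</code> is equivalent to the numeric
match <code>0x20/0x21</code>.
</p>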
2450
2451 <p>
2452 The following flags are defined:
2453 </p>
2454
2455 <dl>
2456 <dt><code>new</code> (0x01)</dt>
2457 <dd>
2458 A new connection. Set to 1 if this is an uncommitted connection.
2459 </dd>
2460
2461 <dt><code>est</code> (0x02)</dt>
2462 <dd>
2463 Part of an existing connection. Set to 1 if this is a committed
2464 connection.
2465 </dd>
2466
2467 <dt><code>rel</code> (0x04)</dt>
2468 <dd>
2469 <p>
2470 Related to an existing connection, e.g. an ICMP ``destination
2471 unreachable'' message or an FTP data connection. This flag will
2472 only be 1 if the connection to which this one is related is
2473 committed.
2474 </p>
2475
2476 <p>
2477 Connections identified as <code>rel</code> are separate from the
2478 originating connection and must be committed separately. All
2479 packets for a related connection will have the <code>rel</code>
2480 flag set, not just the initial packet.
2481 </p>
2482 </dd>
2483
2484 <dt><code>rpl</code> (0x08)</dt>
2485 <dd>
2486 This packet is in the reply direction, meaning that it is in the
2487 opposite direction from the packet that initiated the connection.
2488 This flag will only be 1 if the connection is committed.
2489 </dd>
2490
2491 <dt><code>inv</code> (0x10)</dt>
2492 <dd>
2493 <p>
2494 The state is invalid, meaning that the connection tracker couldn't
2495 identify the connection. This flag is a catch-all for problems
2496 in the connection or the connection tracker, such as:
2497 </p>
2498
2499 <ul>
2500 <li>
2501 L3/L4 protocol handler is not loaded/unavailable. With the Linux
2502 kernel datapath, this may mean that the
2503 <code>nf_conntrack_ipv4</code> or <code>nf_conntrack_ipv6</code>
2504 modules are not loaded.
2505 </li>
2506
2507 <li>
2508 L3/L4 protocol handler determines that the packet is malformed.
2509 </li>
2510
2511 <li>
2512 The packet has an unexpected length for its protocol.
2513 </li>
2514 </ul>
2515 </dd>
2516
2517 <dt><code>trk</code> (0x20)</dt>
2518 <dd>
2519 This packet is tracked, meaning that it has previously traversed the
2520 connection tracker. If this flag is not set, then no other flags
2521 will be set. If this flag is set, then the packet is tracked and
2522 other flags may also be set.
2523 </dd>
2524
2525 <dt><code>snat</code> (0x40)</dt>
2526 <dd>
2527 This packet was transformed by source address/port translation by a
2528 preceding <code>ct</code> action. Open vSwitch 2.6 added this flag.
2529 </dd>
2530
2531 <dt><code>dnat</code> (0x80)</dt>
2532 <dd>
2533 This packet was transformed by destination address/port translation
2534 by a preceding <code>ct</code> action. Open vSwitch 2.6 added this
2535 flag.
2536 </dd>
2537 </dl>
2538
2539 <p>
2540 There are additional constraints on these flags, listed in decreasing
2541 order of precedence below:
2542 </p>
2543
2544 <ol>
2545 <li>
2546 If <code>trk</code> is unset, no other flags are set.
2547 </li>
2548
2549 <li>
2550 If <code>trk</code> is set, one or more other flags may be set.
2551 </li>
2552
2553 <li>
2554 If <code>inv</code> is set, only the <code>trk</code> flag is also
2555 set.
2556 </li>
2557
2558 <li>
2559 <code>new</code> and <code>est</code> are mutually exclusive.
2560 </li>
2561
2562 <li>
2563 <code>new</code> and <code>rpl</code> are mutually exclusive.
2564 </li>
2565
2566 <li>
2567 <code>rel</code> may be set in conjunction with any other flags.
2568 </li>
2569 </ol>
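<p>
As a rough illustration, the constraints above can be checked in the listed
order of precedence. The function name
<code>ct_state_is_consistent</code> is hypothetical, and the sketch reads
``no other flags'' literally, so it also rejects <code>snat</code> and
<code>dnat</code> on untracked or invalid packets.
</p>

```python
NEW, EST, REL, RPL, INV, TRK = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def ct_state_is_consistent(state):
    """Return True if `state` satisfies the documented flag constraints."""
    if not (state & TRK):
        return state == 0              # untracked: no other flags
    if state & INV:
        return state == (TRK | INV)    # invalid: only trk accompanies inv
    if (state & NEW) and (state & EST):
        return False                   # new and est are mutually exclusive
    if (state & NEW) and (state & RPL):
        return False                   # new and rpl are mutually exclusive
    return True                        # rel may combine with anything else
```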
2570
2571 <p>
2572 Future versions of Open vSwitch may define new flags.
2573 </p>
2574 </field>
2575
2576 <field id="MFF_CT_ZONE" title="Connection Tracking Zone">
2577 A connection tracking zone, the zone value passed to the most recent
2578 <code>ct</code> action. Each zone is an independent connection tracking
2579 context, so tracking the same packet in multiple contexts requires using
2580 the <code>ct</code> action multiple times.
2581 </field>
2582
2583 <field id="MFF_CT_MARK" title="Connection Tracking Mark">
2584 The metadata committed, by an action within the <code>exec</code>
2585 parameter to the <code>ct</code> action, to the connection to which the
2586 current packet belongs.
2587 </field>
2588
2589 <field id="MFF_CT_LABEL" title="Connection Tracking Label">
2590 The label committed, by an action within the <code>exec</code>
2591 parameter to the <code>ct</code> action, to the connection to which the
2592 current packet belongs.
2593 </field>
2594
2595 <p>
2596 Open vSwitch 2.8 introduced support for matching on the connection
2597 tracker's original direction 5-tuple fields.
2598 </p>
2599
2600 <p>
2601 For non-committed non-related connections the conntrack original
2602 direction tuple fields always have the same values as the
2603 corresponding headers in the packet itself. For any other packets of
2604 a committed connection the conntrack original direction tuple fields
2605 reflect the values from that initial non-committed non-related packet,
2606 and thus may be different from the actual packet headers, as the
2607 actual packet headers may be in reverse direction (for reply packets),
2608 transformed by NAT (when the <code>nat</code> option was applied to the
2609 connection), or be of a different protocol (e.g., when an ICMP response
2610 is sent to a UDP packet). In case of related connections, e.g., an
2611 FTP data connection, the original direction tuple contains the
2612 original direction headers from the master connection, e.g., an FTP
2613 control connection.
2614 </p>
2615
2616 <p>
2617 The following fields are populated by the <code>ct</code> action, and
2618 require a match to a valid connection tracking state as a prerequisite,
2619 in addition to the IP or IPv6 ethertype match. Examples of valid
2620 connection tracking state matches include <code>ct_state=+new</code>,
2621 <code>ct_state=+est</code>, <code>ct_state=+rel</code>, and <code>ct_state=+trk-inv</code>.
2622 </p>
2623
2624 <field id="MFF_CT_NW_SRC" title="Connection Tracking Original Direction IPv4 Source Address">
2625 Matches IPv4 conntrack original direction tuple source address.
2626 See the paragraphs above for a general description of the
2627 conntrack original direction tuple. Introduced in Open vSwitch
2628 2.8.
2629 </field>
2630
2631 <field id="MFF_CT_NW_DST" title="Connection Tracking Original Direction IPv4 Destination Address">
2632 Matches IPv4 conntrack original direction tuple destination address.
2633 See the paragraphs above for a general description of the
2634 conntrack original direction tuple. Introduced in Open vSwitch
2635 2.8.
2636 </field>
2637
2638 <field id="MFF_CT_IPV6_SRC" title="Connection Tracking Original Direction IPv6 Source Address">
2639 Matches IPv6 conntrack original direction tuple source address.
2640 See the paragraphs above for a general description of the
2641 conntrack original direction tuple. Introduced in Open vSwitch
2642 2.8.
2643 </field>
2644
2645 <field id="MFF_CT_IPV6_DST" title="Connection Tracking Original Direction IPv6 Destination Address">
2646 Matches IPv6 conntrack original direction tuple destination address.
2647 See the paragraphs above for a general description of the
2648 conntrack original direction tuple. Introduced in Open vSwitch
2649 2.8.
2650 </field>
2651
2652 <field id="MFF_CT_NW_PROTO" title="Connection Tracking Original Direction IP Protocol">
2653 Matches conntrack original direction tuple IP protocol type,
2654 which is specified as a decimal number between 0 and 255,
2655 inclusive (e.g. 1 to match ICMP packets or 6 to match TCP
2656 packets). In case of, for example, an ICMP response to a UDP
2657 packet, this may be different from the IP protocol type of the
2658 packet itself. See the paragraphs above for a general description
2659 of the conntrack original direction tuple. Introduced in Open
2660 vSwitch 2.8.
2661 </field>
2662
2663 <field id="MFF_CT_TP_SRC" title="Connection Tracking Original Direction Transport Layer Source Port">
2664 Bitwise match on the conntrack original direction tuple
2665 transport source port, when
2666 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2667 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2668 ICMP, or 58 for ICMPv6, the lower 8 bits of
2669 <code>MFF_CT_TP_SRC</code> match the conntrack original
2670 direction ICMP type. See the paragraphs above for a general
2671 description of the conntrack original direction
2672 tuple. Introduced in Open vSwitch 2.8.
2673 </field>
2674
2675 <field id="MFF_CT_TP_DST" title="Connection Tracking Original Direction Transport Layer Destination Port">
2676 Bitwise match on the conntrack original direction tuple
2677 transport destination port, when
2678 <code>MFF_CT_NW_PROTO</code> has value 6 for TCP, 17 for UDP, or
2679 132 for SCTP. When <code>MFF_CT_NW_PROTO</code> has value 1 for
2680 ICMP, or 58 for ICMPv6, the lower 8 bits of
2681 <code>MFF_CT_TP_DST</code> match the conntrack original
2682 direction ICMP code. See the paragraphs above for a general
2683 description of the conntrack original direction
2684 tuple. Introduced in Open vSwitch 2.8.
2685 </field>
2686 </group>
2687
2688 <group title="Register">
2689 <p>
2690 These fields give an OpenFlow switch space for temporary storage while
2691 the pipeline is running. Whereas metadata fields can have a meaningful
2692 initial value and can persist across some hops between OpenFlow switches,
2693 registers are always initially 0 and their values never persist across
2694 inter-switch hops (not even across patch ports).
2695 </p>
2696
2697 <field id="MFF_METADATA" title="OpenFlow Metadata">
2698 <p>
2699 This field is the oldest standardized OpenFlow register field,
2700 introduced in OpenFlow 1.1. It was introduced to model the limited
2701 number of user-defined bits that some ASIC-based switches can carry
2702 through their pipelines. Because of hardware limitations, OpenFlow
2703 allows switches to support writing and masking only an
2704 implementation-defined subset of bits, possibly none at all. The Open
2705 vSwitch software switch always supports all 64 bits, but of course an
2706 Open vSwitch port to an ASIC would have the same restriction as the
2707 ASIC itself.
2708 </p>
2709
2710 <p>
2711 This field has an OXM code point, but OpenFlow 1.4 and earlier allow it
2712 to be modified only with a specialized instruction, not with a
2713 ``set-field'' action. OpenFlow 1.5 removes this restriction. Open
2714 vSwitch does not enforce this restriction, regardless of OpenFlow
2715 version.
2716 </p>
2717 </field>
2718
2719 <field id="MFF_REG0" title="Register 0">
2720 This is the first of several Open vSwitch registers, all of which have
2721 the same properties. Open vSwitch 1.1 introduced registers 0, 1, 2, and
2722 3, version 1.3 added register 4, version 1.7 added registers 5, 6, and 7,
2723 and version 2.6 added registers 8 through 15.
2724 </field>
2725 <!-- XXX series -->
2726 <field id="MFF_REG1" title="Register 1" hidden="yes"/>
2727 <field id="MFF_REG2" title="Register 2" hidden="yes"/>
2728 <field id="MFF_REG3" title="Register 3" hidden="yes"/>
2729 <field id="MFF_REG4" title="Register 4" hidden="yes"/>
2730 <field id="MFF_REG5" title="Register 5" hidden="yes"/>
2731 <field id="MFF_REG6" title="Register 6" hidden="yes"/>
2732 <field id="MFF_REG7" title="Register 7" hidden="yes"/>
2733 <field id="MFF_REG8" title="Register 8" hidden="yes"/>
2734 <field id="MFF_REG9" title="Register 9" hidden="yes"/>
2735 <field id="MFF_REG10" title="Register 10" hidden="yes"/>
2736 <field id="MFF_REG11" title="Register 11" hidden="yes"/>
2737 <field id="MFF_REG12" title="Register 12" hidden="yes"/>
2738 <field id="MFF_REG13" title="Register 13" hidden="yes"/>
2739 <field id="MFF_REG14" title="Register 14" hidden="yes"/>
2740 <field id="MFF_REG15" title="Register 15" hidden="yes"/>
2741
2742 <field id="MFF_XREG0" title="Extended Register 0">
2743 <p>
2744 This is the first of the registers introduced in OpenFlow 1.5.
2745 OpenFlow 1.5 calls these fields just the ``packet registers,'' but Open
2746 vSwitch already had 32-bit registers by that name, so Open vSwitch uses
2747 the name ``extended registers'' in an attempt to reduce confusion. The
2748 standard allows for up to 128 registers, each 64 bits wide, but Open
2749 vSwitch only implements 4 (in versions 2.4 and 2.5) or 8 (in version
2750 2.6 and later).
2751 </p>
2752
2753 <p>
2754 Each of the 64-bit extended registers overlays two of the 32-bit
2755 registers: <code>xreg0</code> overlays <code>reg0</code> and
2756 <code>reg1</code>, with <code>reg0</code> supplying the
2757 most-significant bits of <code>xreg0</code> and <code>reg1</code> the
2758 least-significant. Similarly, <code>xreg1</code> overlays
2759 <code>reg2</code> and <code>reg3</code>, and so on.
2760 </p>
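<p>
The overlay arithmetic can be written out as follows. This is a sketch of
the bit layout described above, not Open vSwitch code; the function names
are hypothetical.
</p>

```python
def xreg_from_regs(reg_hi, reg_lo):
    """Combine two 32-bit register values into one 64-bit extended
    register value; `reg_hi` supplies the most-significant bits."""
    return ((reg_hi & 0xffffffff) << 32) | (reg_lo & 0xffffffff)


def xreg_index_to_regs(n):
    """Return the indexes of the two 32-bit registers that xregN
    overlays, most significant first."""
    return 2 * n, 2 * n + 1
```

<p>
For example, if <code>reg0</code> holds <code>0x11223344</code> and
<code>reg1</code> holds <code>0x55667788</code>, then <code>xreg0</code>
reads as <code>0x1122334455667788</code>.
</p>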
2761
2762 <p>
2763 The OpenFlow specification says, ``In most cases, the packet registers
2764 can not be matched in tables, i.e. they usually can not be used in the
2765 flow entry match structure'' [OpenFlow 1.5, section 7.2.3.10], but
2766 there is no reason for a software switch to impose such a restriction,
2767 and Open vSwitch does not.
2768 </p>
2769 </field>
2770
2771 <!-- XXX series -->
2772 <field id="MFF_XREG1" title="Extended Register 1" hidden="yes"/>
2773 <field id="MFF_XREG2" title="Extended Register 2" hidden="yes"/>
2774 <field id="MFF_XREG3" title="Extended Register 3" hidden="yes"/>
2775 <field id="MFF_XREG4" title="Extended Register 4" hidden="yes"/>
2776 <field id="MFF_XREG5" title="Extended Register 5" hidden="yes"/>
2777 <field id="MFF_XREG6" title="Extended Register 6" hidden="yes"/>
2778 <field id="MFF_XREG7" title="Extended Register 7" hidden="yes"/>
2779
2780 <field id="MFF_XXREG0" title="Double-Extended Register 0">
2781 <p>
2782 This is the first of the double-extended registers introduced in Open
2783 vSwitch 2.6. Each of the 128-bit double-extended registers overlays four of
2784 the 32-bit registers: <code>xxreg0</code> overlays <code>reg0</code>
2785 through <code>reg3</code>, with <code>reg0</code> supplying the
2786 most-significant bits of <code>xxreg0</code> and <code>reg3</code> the
2787 least-significant. <code>xxreg1</code> similarly overlays
2788 <code>reg4</code> through <code>reg7</code>, and so on.
2789 </p>
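<p>
Going the other way, a 128-bit value can be split into the four 32-bit
registers it overlays. The sketch below only illustrates the layout; the
function names are hypothetical.
</p>

```python
def xxreg_to_regs(xxreg):
    """Split a 128-bit value into four 32-bit values, ordered from the
    most-significant register to the least-significant one."""
    return [(xxreg >> shift) & 0xffffffff for shift in (96, 64, 32, 0)]


def xxreg_first_reg(n):
    """Return the index of the first 32-bit register that xxregN overlays."""
    return 4 * n
```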
2790 </field>
2791
2792 <!-- XXX series -->
2793 <field id="MFF_XXREG1" title="Double-Extended Register 1" hidden="yes"/>
2794 <field id="MFF_XXREG2" title="Double-Extended Register 2" hidden="yes"/>
2795 <field id="MFF_XXREG3" title="Double-Extended Register 3" hidden="yes"/>
2796 </group>
2797
2798 <group title="Layer 2 (Ethernet)">
2799 <p>
2800 Ethernet is the only layer-2 protocol that Open vSwitch
2801 supports. Like most software, Open vSwitch and OpenFlow
2802 regard an Ethernet frame as beginning with the 14-byte header and
2803 ending with the final byte of the payload; that is, the frame check
2804 sequence is not considered part of the frame.
2805 </p>
2806
2807 <field id="MFF_ETH_SRC" title="Ethernet Source">
2808 <p>
2809 The Ethernet source address:
2810 </p>
2811
2812 <diagram>
2813 <header name="Ethernet">
2814 <bits name="dst" above="48" width=".75"/>
2815 <bits name="src" above="48" width=".75" fill="yes"/>
2816 <bits name="type" above="16" width="0.4"/>
2817 </header>
2818 <dots/>
2819 </diagram>
2820 </field>
2821
2822 <field id="MFF_ETH_DST" title="Ethernet Destination">
2823 <p>
2824 The Ethernet destination address:
2825 </p>
2826
2827 <diagram>
2828 <header name="Ethernet">
2829 <bits name="dst" above="48" width=".75" fill="yes"/>
2830 <bits name="src" above="48" width=".75"/>
2831 <bits name="type" above="16" width="0.4"/>
2832 </header>
2833 <dots/>
2834 </diagram>
2835
2836 <p>
2837 Open vSwitch 1.8 and later support arbitrary masks for source and/or
2838 destination. Earlier versions only support masking the destination
2839 with the following masks:
2840 </p>
2841
2842 <dl>
2843 <dt><code>01:00:00:00:00:00</code></dt>
2844 <dd>
2845 Match only the multicast bit. Thus,
2846 <code>dl_dst=01:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2847 multicast (including broadcast) Ethernet packets, and
2848 <code>dl_dst=00:00:00:00:00:00/01:00:00:00:00:00</code> matches all
2849 unicast Ethernet packets.
2850 </dd>
2851
2852 <dt><code>fe:ff:ff:ff:ff:ff</code></dt>
2853 <dd>
2854 Match all bits except the multicast bit. This is probably not
2855 useful.
2856 </dd>
2857
2858 <dt><code>ff:ff:ff:ff:ff:ff</code></dt>
2859 <dd>
2860 Exact match (equivalent to omitting the mask).
2861 </dd>
2862
2863 <dt><code>00:00:00:00:00:00</code></dt>
2864 <dd>
2865 Wildcard all bits (equivalent to <code>dl_dst=*</code>).
2866 </dd>
2867 </dl>
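<p>
A masked Ethernet address match compares only the bits selected by the
mask. The following sketch shows the comparison; the helper names
<code>mac_to_int</code> and <code>dl_dst_matches</code> are hypothetical.
</p>

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def dl_dst_matches(packet_dst, value, mask="ff:ff:ff:ff:ff:ff"):
    """Return True if `packet_dst` equals `value` on every bit set in
    `mask`; the default mask is an exact match."""
    m = mac_to_int(mask)
    return (mac_to_int(packet_dst) & m) == (mac_to_int(value) & m)
```

<p>
With mask <code>01:00:00:00:00:00</code>, only the multicast bit takes part
in the comparison, so a broadcast destination matches value
<code>01:00:00:00:00:00</code> and a unicast destination matches value
<code>00:00:00:00:00:00</code>.
</p>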
2868 </field>
2869
2870 <field id="MFF_ETH_TYPE" title="Ethernet Type">
2871 <p>
2872 The most commonly seen Ethernet frames today use a format
2873 called ``Ethernet II,'' in which the last two bytes of the
2874 Ethernet header specify the Ethertype. For such a frame, this
2875 field is copied from those bytes of the header, like so:
2876 </p>
2877
2878 <diagram>
2879 <header name="Ethernet">
2880 <bits name="dst" above="48" width=".75"/>
2881 <bits name="src" above="48" width=".75"/>
2882 <bits name="type" above="16" below="\[&gt;=]0x600" width="0.4" fill="yes"/>
2883 </header>
2884 <dots/>
2885 </diagram>
2886
2887 <p>
2888 Every Ethernet type has a value 0x600 (1,536) or greater.
2889 When the last two bytes of the Ethernet header have a value
2890 too small to be an Ethernet type, then the value found there
2891 is the total length of the frame in bytes, excluding the
2892 Ethernet header. An 802.2 LLC header typically follows the
2893 Ethernet header. OpenFlow and Open vSwitch only support LLC
2894 headers with DSAP and SSAP <code>0xaa</code> and control byte
2895 <code>0x03</code>, which indicate that a SNAP header follows
2896 the LLC header. In turn, OpenFlow and Open vSwitch only
2897 support a SNAP header with organization <code>0x000000</code>.
2898 In such a case, this field is copied from the type field in
2899 the SNAP header, like this:
2900 </p>
2901
2902 <diagram>
2903 <header name="Ethernet">
2904 <bits name="dst" above="48" width=".75"/>
2905 <bits name="src" above="48" width=".75"/>
2906 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2907 </header>
2908 <header name="LLC">
2909 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2910 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2911 <bits name="cntl" above="8" below="0x03" width=".4"/>
2912 </header>
2913 <header name="SNAP">
2914 <bits name="org" above="24" below="0x000000" width=".75"/>
2915 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2916 </header>
2917 <dots/>
2918 </diagram>
2919
2920 <p>
2921 When an 802.1Q header is inserted after the Ethernet source
2922 and destination, this field is populated with the encapsulated
2923 Ethertype, not the 802.1Q Ethertype. With an Ethernet II
2924 inner frame, the result looks like this:
2925 </p>
2926
2927 <diagram>
2928 <header name="Ethernet">
2929 <bits name="dst" above="48" width=".75"/>
2930 <bits name="src" above="48" width=".75"/>
2931 </header>
2932 <header name="802.1Q">
2933 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2934 <bits name="TCI" above="16" width=".4"/>
2935 </header>
2936 <header name="Ethertype">
2937 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2938 </header>
2939 <dots/>
2940 </diagram>
2941
2942 <p>
2943 LLC and SNAP encapsulation look like this with an 802.1Q header:
2944 </p>
2945
2946 <diagram>
2947 <header name="Ethernet">
2948 <bits name="dst" above="48" width=".75"/>
2949 <bits name="src" above="48" width=".75"/>
2950 </header>
2951 <header name="802.1Q">
2952 <bits name="TPID" above="16" below="0x8100" width=".4"/>
2953 <bits name="TCI" above="16" width=".4"/>
2954 </header>
2955 <header name="Ethertype">
2956 <bits name="type" above="16" below="&lt;0x600" width="0.4"/>
2957 </header>
2958 <header name="LLC">
2959 <bits name="DSAP" above="8" below="0xaa" width=".4"/>
2960 <bits name="SSAP" above="8" below="0xaa" width=".4"/>
2961 <bits name="cntl" above="8" below="0x03" width=".4"/>
2962 </header>
2963 <header name="SNAP">
2964 <bits name="org" above="24" below="0x000000" width=".75"/>
2965 <bits name="type" above="16" below="\[&gt;=]0x600" width=".4" fill="yes"/>
2966 </header>
2967 <dots/>
2968 </diagram>
2969
2970 <p>
2971 When a packet doesn't match any of the header formats described
2972 above, Open vSwitch and OpenFlow set this field to
2973 <code>0x5ff</code> (<code>OFP_DL_TYPE_NOT_ETH_TYPE</code>).
2974 </p>
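<p>
The header formats described for this field can be summarized by a small
parser. This is a simplified sketch rather than the Open vSwitch
implementation: the function name <code>eth_type_of</code> is hypothetical,
it handles at most one 802.1Q tag, and it assumes the frame is long enough
to hold the Ethernet header (plus the tag, if present).
</p>

```python
import struct

ETH_TYPE_MIN = 0x600     # smallest valid Ethernet type
ETH_TYPE_VLAN = 0x8100   # 802.1Q TPID
NOT_ETH_TYPE = 0x5ff     # OFP_DL_TYPE_NOT_ETH_TYPE
# LLC DSAP 0xaa, SSAP 0xaa, control 0x03, then SNAP organization 0x000000.
LLC_SNAP_PREFIX = bytes([0xaa, 0xaa, 0x03, 0x00, 0x00, 0x00])


def eth_type_of(frame):
    """Return the eth_type field value for a raw Ethernet frame."""
    offset = 12                                    # skip dst and src
    (value,) = struct.unpack_from("!H", frame, offset)
    offset += 2
    if value == ETH_TYPE_VLAN:
        offset += 2                                # skip the TCI
        (value,) = struct.unpack_from("!H", frame, offset)
        offset += 2
    if value >= ETH_TYPE_MIN:
        return value                               # Ethernet II
    # Otherwise `value` is a length: accept only LLC + SNAP with org 0.
    if (frame[offset:offset + 6] == LLC_SNAP_PREFIX
            and len(frame) >= offset + 8):
        (snap_type,) = struct.unpack_from("!H", frame, offset + 6)
        if snap_type >= ETH_TYPE_MIN:
            return snap_type
    return NOT_ETH_TYPE
```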
2975 </field>
2976 </group>
2977
2978 <group title="VLAN">
2979 <p>
2980 The 802.1Q VLAN header causes more trouble than any other 4
2981 bytes in networking. OpenFlow 1.0, 1.1, and 1.2+ all treat VLANs
2982 differently. Open vSwitch extensions add another variant to the mix.
2983 Open vSwitch reconciles all four treatments as best it can.
2984 </p>
2985
2986 <h2>VLAN Header Format</h2>
2987
2988 <p>
2989 An 802.1Q VLAN header consists of two 16-bit fields:
2990 </p>
2991
2992 <diagram>
2993 <header name="TPID">
2994 <bits name="Ethertype" above="16" below="0x8100" width="1.8"/>
2995 </header>
2996 <nospace/>
2997 <header name="TCI">
2998 <bits name="PCP" above="3" width=".6"/>
2999 <bits name="CFI" above="1" below="0" width=".3"/>
3000 <bits name="VID" above="12" width=".9"/>
3001 </header>
3002 </diagram>
3003
3004 <p>
3005 The first 16 bits of the VLAN header, the <dfn>TPID</dfn> (Tag Protocol
3006 IDentifier), is an Ethertype. When the VLAN header is inserted just
3007 after the source and destination MAC addresses in an Ethernet frame, the
3008 TPID serves to identify the presence of the VLAN. The standard TPID, the
3009 only one that Open vSwitch supports, is <code>0x8100</code>. OpenFlow
3010 1.0 explicitly supports only TPID <code>0x8100</code>. OpenFlow 1.1, but
3011 not earlier or later versions, also requires support for TPID
3012 <code>0x88a8</code> (Open vSwitch does not support this). OpenFlow 1.2
3013 through 1.5 do not require support for specific TPIDs (the ``push vlan
3014 header'' action does say that only <code>0x8100</code> and
3015 <code>0x88a8</code> should be pushed). No version of OpenFlow provides a
3016 way to distinguish or match on the TPID.
3017 </p>
3018
3019 <p>
3020 The remaining 16 bits of the VLAN header, the <dfn>TCI</dfn>
3021 (Tag Control Information), is subdivided into three subfields:
3022 </p>
3023
3024 <ul>
3025 <li>
3026 <dfn>PCP</dfn> (Priority Code Point) is a 3-bit 802.1p
3027 <dfn>priority</dfn>. The lowest priority is value 1, the
3028 second-lowest is value 0, and priority increases from 2 up to
3029 highest priority 7.
3030 </li>
3031
3032 <li>
3033 <p>
3034 <dfn>CFI</dfn> (Canonical Format Indicator) is a 1-bit field. On an
3035 Ethernet network, its value is always 0. This led to it later being
3036 repurposed under the name <dfn>DEI</dfn> (Drop Eligibility
3037 Indicator). By either name, OpenFlow and Open vSwitch don't provide
3038 any way to match or set this bit.
3039 </p>
3040 </li>
3041
3042 <li>
3043 <dfn>VID</dfn> (VLAN IDentifier) is a 12-bit VLAN ID. If the
3044 VID is 0, then the frame is not part of a VLAN. In that case,
3045 the VLAN header is called a <dfn>priority tag</dfn> because it
3046 is only meaningful for assigning the frame a priority. VID
3047 <code>0xfff</code> (4,095) is reserved.
3048 </li>
3049 </ul>
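<p>
The TCI subfields can be extracted with shifts and masks, following the
3-bit, 1-bit, and 12-bit layout described above. The function names in this
sketch are hypothetical.
</p>

```python
def split_tci(tci):
    """Split a 16-bit TCI into its (pcp, cfi, vid) subfields."""
    pcp = (tci >> 13) & 0x7    # top 3 bits: priority
    cfi = (tci >> 12) & 0x1    # next bit: CFI (later renamed DEI)
    vid = tci & 0xfff          # low 12 bits: VLAN ID
    return pcp, cfi, vid


def is_priority_tag(tci):
    """A VLAN header with VID 0 only assigns the frame a priority."""
    return (tci & 0xfff) == 0
```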
3050
3051 <p>
3052 See <ref field="eth_type"/> for illustrations of a complete Ethernet
3053 frame with 802.1Q tag included.
3054 </p>
3055
3056 <h2>Multiple VLANs</h2>
3057
3058 <p>
3059 Open vSwitch can match only a single VLAN header. If more than
3060 one VLAN header is present, then <ref field="eth_type"/>
3061 holds the TPID of the inner VLAN header. Open vSwitch stops
3062 parsing the packet after the inner TPID, so matching further
3063 into the packet (e.g. on the inner TCI or L3 fields) is not
3064 possible.
3065 </p>
3066
3067 <p>
3068 OpenFlow only directly supports matching a single VLAN header. In
3069 OpenFlow 1.1 or later, one OpenFlow table can match on the outermost VLAN
3070 header and pop it off, and a later OpenFlow table can match on the next
3071 outermost header. Open vSwitch does not support this.
3072 </p>
3073
3074 <h2>VLAN Field Details</h2>
3075
3076 <p>
3077 The four variants have three different levels of expressiveness: OpenFlow
3078 1.0 and 1.1 VLAN matching are less powerful than OpenFlow 1.2+ VLAN
3079 matching, which is less powerful than Open vSwitch extension VLAN
3080 matching.
3081 </p>
3082
3083 <h2>OpenFlow 1.0 VLAN Fields</h2>
3084
3085 <p>
3086 OpenFlow 1.0 uses two fields, called <code>dl_vlan</code> and
3087 <code>dl_vlan_pcp</code>, each of which can be either exact-matched or
3088 wildcarded, to specify VLAN matches:
3089 </p>
3090
3091 <ul>
3092 <li>
3093 When both <code>dl_vlan</code> and <code>dl_vlan_pcp</code> are
3094 wildcarded, the flow matches packets without an 802.1Q header or
3095 with any 802.1Q header.
3096 </li>
3097
3098 <li>
3099 The match <code>dl_vlan=0xffff</code> causes a flow to match only
3100 packets without an 802.1Q header. Such a flow should also wildcard
3101 <code>dl_vlan_pcp</code>, since a packet without an 802.1Q header does
3102 not have a PCP. OpenFlow does not specify what to do if a match on PCP
3103 is actually present, but Open vSwitch ignores it.
3104 </li>
3105
3106 <li>
3107 <p>
3108 Otherwise, the flow matches only packets with an 802.1Q
3109 header. If <code>dl_vlan</code> is not wildcarded, then the
3110 flow only matches packets with the VLAN ID specified in
3111 <code>dl_vlan</code>'s low 12 bits. If
3112 <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3113 only matches packets with the priority specified in
3114 <code>dl_vlan_pcp</code>'s low 3 bits.
3115 </p>
3116
3117 <p>
3118 OpenFlow does not specify how to interpret the high 4 bits of
3119 <code>dl_vlan</code> or the high 5 bits of <code>dl_vlan_pcp</code>.
3120 Open vSwitch ignores them.
3121 </p>
3122 </li>
3123 </ul>
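<p>
The three cases above can be expressed as a single predicate. In this
sketch, which is illustrative rather than taken from Open vSwitch,
<code>None</code> stands for a wildcarded field or for a packet without an
802.1Q header, and the function name <code>of10_vlan_matches</code> is
hypothetical.
</p>

```python
OFP_VLAN_NONE = 0xffff


def of10_vlan_matches(packet_tci, dl_vlan=None, dl_vlan_pcp=None):
    """Apply OpenFlow 1.0 VLAN match semantics.

    packet_tci:  the packet's 16-bit TCI, or None if it has no 802.1Q header.
    dl_vlan:     value to match, or None if wildcarded.
    dl_vlan_pcp: value to match, or None if wildcarded.
    """
    if dl_vlan is None and dl_vlan_pcp is None:
        return True                      # tagged or untagged, anything goes
    if dl_vlan == OFP_VLAN_NONE:
        return packet_tci is None        # dl_vlan_pcp, if given, is ignored
    if packet_tci is None:
        return False                     # remaining cases need an 802.1Q header
    if dl_vlan is not None and (packet_tci & 0xfff) != (dl_vlan & 0xfff):
        return False                     # only the low 12 bits are compared
    if dl_vlan_pcp is not None and ((packet_tci >> 13) & 0x7) != (dl_vlan_pcp & 0x7):
        return False                     # only the low 3 bits are compared
    return True
```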
3124
3125 <field id="MFF_DL_VLAN" title="OpenFlow 1.0 VLAN ID" hidden="yes"/>
3126 <field id="MFF_DL_VLAN_PCP" title="OpenFlow 1.0 VLAN Priority"
3127 hidden="yes"/>
3128
3129 <h2>OpenFlow 1.1 VLAN Fields</h2>
3130
3131 <p>
3132 VLAN matching in OpenFlow 1.1 is similar to OpenFlow 1.0.
3133 The one refinement is that when <code>dl_vlan</code> matches on
3134 <code>0xfffe</code> (<code>OFPVID_ANY</code>), the flow matches
3135 only packets with an 802.1Q header, with any VLAN ID. If
3136 <code>dl_vlan_pcp</code> is wildcarded, the flow matches any
3137 packet with an 802.1Q header, regardless of VLAN ID or priority.
3138 If <code>dl_vlan_pcp</code> is not wildcarded, then the flow
3139 only matches packets with the priority specified in
3140 <code>dl_vlan_pcp</code>'s low 3 bits.
3141 </p>
3142
3143 <p>
3144 OpenFlow 1.1 uses the name <code>OFPVID_NONE</code>, instead of
3145 <code>OFP_VLAN_NONE</code>, for a <code>dl_vlan</code> of
3146 <code>0xffff</code>, but it has the same meaning.
3147 </p>
3148
3149 <p>
3150 In OpenFlow 1.1, Open vSwitch reports error
3151 <code>OFPBMC_BAD_VALUE</code> for an attempt to match on
3152 <code>dl_vlan</code> between 4,096 and <code>0xfffd</code>,
3153 inclusive, or <code>dl_vlan_pcp</code> greater than 7.
3154 </p>
3155
3156 <h2>OpenFlow 1.2 VLAN Fields</h2>
3157
3158 <field id="MFF_VLAN_VID" title="OpenFlow 1.2+ VLAN ID">
3159 <p>
3160 The OpenFlow standard describes this field as consisting of
3161 ``12+1'' bits. On ingress, its value is 0 if no 802.1Q header
3162 is present, and otherwise it holds the VLAN VID in its least
3163 significant 12 bits, with bit 12 (<code>0x1000</code> aka
3164 <code>OFPVID_PRESENT</code>) also set to 1. The three most
3165 significant bits are always zero:
3166 </p>
3167
3168 <diagram>
3169 <header name="OXM_OF_VLAN_VID">
3170 <bits name="" above="3" below="0" width=".6"/>
3171 <bits name="P" above="1" width=".1"/>
3172 <bits name="VLAN ID" above="12" width=".9"/>
3173 </header>
3174 </diagram>
3175
3176 <p>
3177 As a consequence of this field's format, one may use it to match the
3178 VLAN ID in all of the ways available with the OpenFlow 1.0 and 1.1
3179 formats, and a few new ways:
3180 </p>
3181
3182 <dl>
3183 <dt>Fully wildcarded</dt>
3184 <dd>
3185 Matches any packet, that is, one without an 802.1Q header or
3186 with an 802.1Q header with any TCI value.
3187 </dd>
3188
3189 <dt>
3190 Value <code>0x0000</code> (<code>OFPVID_NONE</code>), mask
3191 <code>0xffff</code> (or no mask)
3192 </dt>
3193 <dd>
3194 Matches only packets without an 802.1Q header.
3195 </dd>
3196
3197 <dt>
3198 Value <code>0x1000</code>, mask <code>0x1000</code>
3199 </dt>
3200 <dd>
3201 Matches any packet with an 802.1Q header, regardless of VLAN
3202 ID.
3203 </dd>
3204
3205 <dt>
3206 Value <code>0x1009</code>, mask <code>0xffff</code> (or no mask)
3207 </dt>
3208 <dd>
3209 Matches only packets with an 802.1Q header with VLAN ID 9.
3210 </dd>
3211
3212 <dt>Value <code>0x1001</code>, mask <code>0x1001</code></dt>
3213 <dd>
3214 Matches only packets that have an 802.1Q header with an
3215 odd-numbered VLAN ID. (This is just an example; one can
3216 match on any desired VLAN ID bit pattern.)
3217 </dd>
3218 </dl>
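<p>
The masked matches listed above can be checked with a short Python sketch.
The helper names <code>vlan_vid_field</code> and <code>matches</code> are
hypothetical and exist only for illustration; they are not part of Open
vSwitch or OpenFlow.
</p>

```python
OFPVID_PRESENT = 0x1000  # bit 12: set when an 802.1Q header is present


def vlan_vid_field(vid=None):
    """Return the vlan_vid value for a packet; vid=None means no 802.1Q header."""
    if vid is None:
        return 0x0000
    return OFPVID_PRESENT | (vid & 0x0FFF)


def matches(field, value, mask=0xFFFF):
    """A masked match compares only the bits that are 1 in the mask."""
    return (field & mask) == (value & mask)


untagged = vlan_vid_field(None)
vlan9 = vlan_vid_field(9)

assert matches(untagged, 0x0000)                # only packets without a header
assert not matches(vlan9, 0x0000)
assert matches(vlan9, 0x1000, 0x1000)           # any tagged packet
assert not matches(untagged, 0x1000, 0x1000)
assert matches(vlan9, 0x1009)                   # exactly VLAN 9
assert matches(vlan9, 0x1001, 0x1001)           # tagged, odd-numbered VLAN ID
assert not matches(vlan_vid_field(8), 0x1001, 0x1001)
```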
3219 </field>
3220
3221 <field id="MFF_VLAN_PCP" title="OpenFlow 1.2+ VLAN Priority">
3222 <p>
3223 The 3 least significant bits may be used to match the PCP bits
3224 in an 802.1Q header. Other bits are always zero:
3225 </p>
3226
3227 <diagram>
3228 <header name="OXM_OF_VLAN_PCP">
3229 <bits name="zero" above="5" below="0" width="1.0"/>
3230 <bits name="PCP" above="3" width=".6"/>
3231 </header>
3232 </diagram>
3233
3234 <p>
3235 This field may only be used when <ref field="vlan_vid"/> is not
3236 wildcarded and does not exact match on 0 (which only matches
3237 when there is no 802.1Q header).
3238 </p>
3239
3240 <p>
3241 See <cite>VLAN Comparison Chart</cite>, below, for some examples.
3242 </p>
3243 </field>
3244
3245 <h2>Open vSwitch Extension VLAN Field</h2>
3246
3247 <p>
3248 The <ref field="vlan_tci"/> extension can describe more kinds of VLAN
3249 matches than the other variants. It is also simpler than the other
3250 variants.
3251 </p>
3252
3253 <field id="MFF_VLAN_TCI" title="VLAN TCI">
3254 <p>
3255 For a packet without an 802.1Q header, this field is zero. For a
3256 packet with an 802.1Q header, this field is the TCI with the bit in
3257 CFI's position (marked <code>P</code> for ``present'' below) forced to
3258 1. Thus, for a packet in VLAN 9 with priority 7, it has the value
3259 <code>0xf009</code>:
3260 </p>
3261
3262 <diagram>
3263 <header name="NXM_VLAN_TCI">
3264 <bits name="PCP" above="3" below="7" width=".6"/>
3265 <bits name="P" above="1" below="1" width=".2"/>
3266 <bits name="VID" above="12" below="9" width=".9"/>
3267 </header>
3268 </diagram>
3269
3270 <p>
3271 Usage examples:
3272 </p>
3273
3274 <dl>
3275 <dt><code>vlan_tci=0</code></dt>
3276 <dd>
3277 Match packets without an 802.1Q header.
3278 </dd>
3279
3280 <dt><code>vlan_tci=0x1000/0x1000</code></dt>
3281 <dd>
3282 Match packets with an 802.1Q header, regardless of VLAN
3283 and priority values.
3284 </dd>
3285
3286 <dt><code>vlan_tci=0xf123</code></dt>
3287 <dd>
3288 Match packets tagged with priority 7 in VLAN 0x123.
3289 </dd>
3290
3291 <dt><code>vlan_tci=0x1123/0x1fff</code></dt>
3292 <dd>
3293 Match packets tagged with VLAN 0x123 (and any priority).
3294 </dd>
3295
3296 <dt><code>vlan_tci=0x5000/0xf000</code></dt>
3297 <dd>
3298 Match packets tagged with priority 2 (in any VLAN).
3299 </dd>
3300
3301 <dt><code>vlan_tci=0/0xfff</code></dt>
3302 <dd>
3303 Match packets with no 802.1Q header or tagged with VLAN 0
3304 (and any priority).
3305 </dd>
3306
3307 <dt><code>vlan_tci=0x5000/0xe000</code></dt>
3308 <dd>
3309 Match packets tagged with priority 2 (in any VLAN). A packet with no 802.1Q header has priority bits of 0, so it does not match.
3310 </dd>
3311
3312 <dt><code>vlan_tci=0/0xefff</code></dt>
3313 <dd>
3314 Match packets with no 802.1Q header or tagged with VLAN 0
3315 and priority 0.
3316 </dd>
3317 </dl>
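<p>
As a rough illustration of the encoding described above, the following
Python sketch builds the field value for a packet and tests it against some
of the value/mask pairs from the list. The helper names
<code>vlan_tci_field</code> and <code>tci_matches</code> are assumptions made
for this example and do not appear in Open vSwitch.
</p>

```python
def vlan_tci_field(vid=None, pcp=0):
    """vlan_tci as described above: 0 if untagged, else PCP, P=1, and VID."""
    if vid is None:
        return 0
    return ((pcp & 0x7) << 13) | 0x1000 | (vid & 0x0FFF)


def tci_matches(field, value, mask=0xFFFF):
    """Compare only the bits selected by the mask."""
    return (field & mask) == (value & mask)


untagged = vlan_tci_field(None)

assert vlan_tci_field(9, 7) == 0xF009                          # VLAN 9, priority 7
assert tci_matches(untagged, 0)                                # vlan_tci=0
assert tci_matches(vlan_tci_field(0x123, 7), 0xF123)           # vlan_tci=0xf123
assert tci_matches(vlan_tci_field(0x123, 3), 0x1123, 0x1FFF)   # any priority
assert tci_matches(vlan_tci_field(5, 2), 0x5000, 0xF000)       # priority 2, any VLAN
assert tci_matches(untagged, 0, 0x0FFF)                        # untagged ...
assert tci_matches(vlan_tci_field(0, 4), 0, 0x0FFF)            # ... or VLAN 0
assert tci_matches(vlan_tci_field(0, 0), 0, 0xEFFF)            # VLAN 0, priority 0
assert not tci_matches(vlan_tci_field(0, 1), 0, 0xEFFF)
```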
3318
3319 <p>
3320 See <cite>VLAN Comparison Chart</cite>, below, for more examples.
3321 </p>
3322 </field>
3323
3324 <h2>VLAN Comparison Chart</h2>
3325
3326 <p>
3327 The following table describes how each of several possible matching
3328 criteria on the 802.1Q header may be expressed with each variation
3329 of the VLAN matching fields:
3330 </p>
3331
3332 <tbl>
3333r r r r r.
3334Criteria OpenFlow 1.0 OpenFlow 1.1 OpenFlow 1.2+ NXM
3335\_ \_ \_ \_ \_
3336[1] \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL????\fR/\fL1\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fL0000\fR,\fL--\fR \fL0000\fR/\fL0000\fR
3337[2] \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fLffff\fR/\fL0\fR,\fL??\fR/\fL?\fR \fL0000\fR/\fLffff\fR,\fL--\fR \fL0000\fR/\fLffff\fR
3338[3] \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL0xxx\fR/\fL0\fR,\fL??\fR/\fL1\fR \fL1xxx\fR/\fLffff\fR,\fL--\fR \fL1xxx\fR/\fL1fff\fR
3339[4] \fL????\fR/\fL1\fR,\fL0y\fR/\fL0\fR \fLfffe\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1000\fR/\fL1000\fR,\fL0y\fR \fLz000\fR/\fLf000\fR
3340[5] \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL0xxx\fR/\fL0\fR,\fL0y\fR/\fL0\fR \fL1xxx\fR/\fLffff\fR,\fL0y\fR \fLzxxx\fR/\fLffff\fR
3341.T&amp;
3342r r c c r.
3343[6] (none) (none) \fL1001\fR/\fL1001\fR,\fL--\fR \fL1001\fR/\fL1001\fR
3344.T&amp;
3345r r c c c.
3346[7] (none) (none) (none) \fL3000\fR/\fL3000\fR
3347[8] (none) (none) (none) \fL0000\fR/\fL0fff\fR
3348[9] (none) (none) (none) \fL0000\fR/\fLe000\fR
3349[10] (none) (none) (none) \fL0000\fR/\fLefff\fR
3350 </tbl>
3351
3352 <p>
3353 All numbers in the table are expressed in hexadecimal. The
3354 columns in the table are interpreted as follows:
3355 </p>
3356
3357 <dl>
3358 <dt>Criteria</dt>
3359 <dd>See the list below.</dd>
3360
3361 <dt>OpenFlow 1.0</dt>
3362 <dt>OpenFlow 1.1</dt>
3363 <dd>
3364 <literal>wwww/x,yy/z</literal> means VLAN ID match value
3365 <literal>wwww</literal> with wildcard bit <literal>x</literal>
3366 and VLAN PCP match value <literal>yy</literal> with wildcard
3367 bit <literal>z</literal>. <literal>?</literal> means that the
3368 given bits are ignored (and conventionally
3369 <literal>0</literal> for <literal>wwww</literal> or
3370 <literal>yy</literal>, conventionally <literal>1</literal> for
3371 <literal>x</literal> or <literal>z</literal>). ``(none)''
3372 means that OpenFlow 1.0 (or 1.1) cannot match with these
3373 criteria.
3374 </dd>
3375
3376 <dt>OpenFlow 1.2+</dt>
3377 <dd>
3378 <literal>xxxx/yyyy,zz</literal> means <ref field="vlan_vid"/> with
3379 value <literal>xxxx</literal> and mask <literal>yyyy</literal>, and
3380 <ref field="vlan_pcp"/> (which is not maskable) with value
3381 <literal>zz</literal>. <literal>--</literal> means that <ref
3382 field="vlan_pcp"/> is omitted. ``(none)'' means that OpenFlow 1.2
3383 cannot match with these criteria.
3384 </dd>
3385
3386 <dt>NXM</dt>
3387 <dd>
3388 <literal>xxxx/yyyy</literal> means <ref field="vlan_tci"/> with value
3389 <literal>xxxx</literal> and mask <literal>yyyy</literal>.
3390 </dd>
3391 </dl>
3392
3393 <p>
3394 The matching criteria described by the table are:
3395 </p>
3396
3397 <dl>
3398 <dt>[1]</dt>
3399 <dd>
3400 Matches any packet, that is, one without an 802.1Q header or
3401 with an 802.1Q header with any TCI value.
3402 </dd>
3403
3404 <dt>[2]</dt>
3405 <dd>
3406 <p>
3407 Matches only packets without an 802.1Q header.
3408 </p>
3409
3410 <p>
3411 OpenFlow 1.0 doesn't define the behavior if <ref field="dl_vlan"/> is
3412 set to <code>0xffff</code> and <ref field="dl_vlan_pcp"/> is not
3413 wildcarded. (Open vSwitch always ignores <ref field="dl_vlan_pcp"/>
3414 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.)
3415 </p>
3416
3417 <p>
3418 OpenFlow 1.1 says explicitly to ignore <ref field="dl_vlan_pcp"/>
3419 when <ref field="dl_vlan"/> is set to <code>0xffff</code>.
3420 </p>
3421
3422 <p>
3423 OpenFlow 1.2 doesn't say how to interpret a match with <ref
3424 field="vlan_vid"/> value 0 and a mask with
3425 <code>OFPVID_PRESENT</code> (<code>0x1000</code>) set to 1 and some
3426 other bits in the mask set to 1 also. Open vSwitch interprets it the
3427 same way as a mask of <code>0x1000</code>.
3428 </p>
3429
3430 <p>
3431 Any NXM match with <ref field="vlan_tci"/> value 0 and the CFI bit
3432 set to 1 in the mask is equivalent to the one listed in the table.
3433 </p>
3434 </dd>
3435
3436 <dt>[3]</dt>
3437 <dd>
3438 Matches only packets that have an 802.1Q header with VID
3439 <literal>xxx</literal> (and any PCP).
3440 </dd>
3441
3442 <dt>[4]</dt>
3443 <dd>
3444 <p>
3445 Matches only packets that have an 802.1Q header with PCP
3446 <literal>y</literal> (and any VID).
3447 </p>
3448
3449 <p>
3450 OpenFlow 1.0 doesn't clearly define the behavior for this
3451 case. Open vSwitch implements it this way.
3452 </p>
3453
3454 <p>
3455 In the NXM value, <literal>z</literal> equals
3456 (<literal>y</literal> &lt;&lt; 1) | 1.
3457 </p>
3458 </dd>
3459
3460 <dt>[5]</dt>
3461 <dd>
3462 <p>
3463 Matches only packets that have an 802.1Q header with VID
3464 <literal>xxx</literal> and PCP <literal>y</literal>.
3465 </p>
3466
3467 <p>
3468 In the NXM value, <literal>z</literal> equals
3469 (<literal>y</literal> &lt;&lt; 1) | 1.
3470 </p>
3471 </dd>
3472
3473 <dt>[6]</dt>
3474 <dd>
3475 Matches only packets that have an 802.1Q header with an
3476 odd-numbered VID (and any PCP). Only possible with OpenFlow
3477 1.2 and NXM. (This is just an example; one can match on any
3478 desired VID bit pattern.)
3479 </dd>
3480
3481 <dt>[7]</dt>
3482 <dd>
3483 Matches only packets that have an 802.1Q header with an
3484 odd-numbered PCP (and any VID). Only possible with NXM.
3485 (This is just an example; one can match on any desired PCP bit
3486 pattern.)
3487 </dd>
3488
3489 <dt>[8]</dt>
3490 <dd>
3491 Matches packets with no 802.1Q header or with an 802.1Q header
3492 with a VID of 0. Only possible with NXM.
3493 </dd>
3494
3495 <dt>[9]</dt>
3496 <dd>
3497 Matches packets with no 802.1Q header or with an 802.1Q header
3498 with a PCP of 0. Only possible with NXM.
3499 </dd>
3500
3501 <dt>[10]</dt>
3502 <dd>
3503 Matches packets with no 802.1Q header or with an 802.1Q header
3504 with both VID and PCP of 0. Only possible with NXM.
3505 </dd>
3506 </dl>
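<p>
The relation <literal>z</literal> = (<literal>y</literal> &lt;&lt; 1) | 1
used in criteria [4] and [5] can be sketched in Python. The function
<code>nxm_vlan_tci</code> is a hypothetical helper written for this example;
it covers only matches on tagged packets and is not an Open vSwitch API.
</p>

```python
def nxm_vlan_tci(vid=None, pcp=None):
    """Return (value, mask) for a tagged-packet match; None wildcards a part."""
    value, mask = 0x1000, 0x1000      # always require the "present" bit
    if vid is not None:
        value |= vid & 0x0FFF
        mask |= 0x0FFF
    if pcp is not None:
        z = ((pcp & 0x7) << 1) | 1    # top nibble: three PCP bits, then P=1
        value |= z << 12
        mask |= 0xF000
    return value, mask


def show(pair):
    """Format a (value, mask) pair the way the chart does."""
    return "%04x/%04x" % pair


assert show(nxm_vlan_tci(vid=0x123)) == "1123/1fff"          # criterion [3]
assert show(nxm_vlan_tci(pcp=2)) == "5000/f000"              # criterion [4]
assert show(nxm_vlan_tci(vid=0x123, pcp=7)) == "f123/ffff"   # criterion [5]
```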
3507 </group>
3508
3509 <group title="Layer 2.5: MPLS">
3510 <p>
3511 One or more MPLS headers (more commonly called <dfn>MPLS
3512 labels</dfn>) follow an Ethernet type field that specifies an
3513 MPLS Ethernet type [RFC 3032]. Ethertype <code>0x8847</code> is
3514 used for all unicast. Multicast MPLS is divided into two
3515 specific classes, one of which uses Ethertype
3516 <code>0x8847</code> and the other <code>0x8848</code> [RFC
3517 5332].
3518 </p>
3519
3520 <p>
3521 The most common overall packet format is Ethernet II, shown
3522 below (SNAP encapsulation may be used but is not ordinarily seen
3523 in Ethernet networks):
3524 </p>
3525
3526 <diagram>
3527 <header name="Ethernet">
3528 <bits name="dst" above="48" width="0.75"/>
3529 <bits name="src" above="48" width="0.75"/>
3530 <bits name="type" above="16" below="0x8847" width="0.4"/>
3531 </header>
3532 <header name="MPLS">
3533 <bits name="label" above="20" width=".6"/>
3534 <bits name="TC" above="3" width=".3"/>
3535 <bits name="S" above="1" width=".1"/>
3536 <bits name="TTL" above="8" width=".4"/>
3537 </header>
3538 <dots/>
3539 </diagram>
3540
3541 <p>
3542 MPLS can be encapsulated inside an 802.1Q header, in which case
3543 the combination looks like this:
3544 </p>
3545
3546 <diagram>
3547 <header name="Ethernet">
3548 <bits name="dst" above="48" width=".75"/>
3549 <bits name="src" above="48" width=".75"/>
3550 </header>
3551 <header name="802.1Q">
3552 <bits name="TPID" above="16" below="0x8100" width=".4"/>
3553 <bits name="TCI" above="16" width=".4"/>
3554 </header>
3555 <header name="Ethertype">
3556 <bits name="type" above="16" below="0x8847" width=".4"/>
3557 </header>
3558 <header name="MPLS">
3559 <bits name="label" above="20" width=".6"/>
3560 <bits name="TC" above="3" width=".3"/>
3561 <bits name="S" above="1" width=".1"/>
3562 <bits name="TTL" above="8" width=".4"/>
3563 </header>
3564 <dots/>
3565 </diagram>
3566
3567 <p>
3568 The fields within an MPLS label are:
3569 </p>
3570
3571 <dl>
3572 <dt>Label, 20 bits.</dt>
3573 <dd>
3574 An identifier.
3575 </dd>
3576
3577 <dt>Traffic class (TC), 3 bits.</dt>
3578 <dd>
3579 Used for quality of service.
3580 </dd>
3581
3582 <dt>Bottom of stack (BOS), 1 bit (labeled just ``S'' above).</dt>
3583 <dd>
3584 <p>
3585 0 indicates that another MPLS label follows this one.
3586 </p>
3587
3588 <p>
3589 1 indicates that this MPLS label is the last one in the
3590 stack, so that some other protocol follows this one.
3591 </p>
3592 </dd>
3593
3594 <dt>Time to live (TTL), 8 bits.</dt>
3595 <dd>
3596 <p>
3597 Each hop across an MPLS network decrements the TTL by 1. If
3598 it reaches 0, the packet is discarded.
3599 </p>
3600
3601 <p>
3602 OpenFlow does not make the MPLS TTL available as a match field, but
3603 actions are available to set and decrement the TTL. Open vSwitch 2.6
3604 and later make the MPLS TTL available as an extension.
3605 </p>
3606 </dd>
3607 </dl>
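<p>
To make the bit layout concrete, the following Python sketch splits one
32-bit MPLS label stack entry into the four fields listed above. The names
<code>MplsLse</code> and <code>parse_mpls_lse</code> are assumptions made for
this example.
</p>

```python
import struct
from collections import namedtuple

MplsLse = namedtuple("MplsLse", "label tc bos ttl")


def parse_mpls_lse(data):
    """Decode the first 4 bytes of data as one MPLS label stack entry."""
    (word,) = struct.unpack("!I", data[:4])   # network byte order
    return MplsLse(
        label=word >> 12,          # top 20 bits
        tc=(word >> 9) & 0x7,      # next 3 bits
        bos=(word >> 8) & 0x1,     # 1 means last label in the stack
        ttl=word & 0xFF,           # low 8 bits
    )


# Label 16, TC 0, bottom of stack, TTL 64.
entry = parse_mpls_lse(bytes([0x00, 0x01, 0x01, 0x40]))
assert entry == MplsLse(label=16, tc=0, bos=1, ttl=64)
```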
3608
3609 <h2>MPLS Label Stacks</h2>
3610
3611 <p>
3612 Unlike the other encapsulations supported by OpenFlow and Open vSwitch,
3613 MPLS labels are routinely used in ``stacks'' two or three deep and
3614 sometimes even deeper. Open vSwitch currently supports up to three
3615 labels.
3616 </p>
3617
3618 <p>
3619 The OpenFlow specification only supports matching on the outermost MPLS
3620 label at any given time. To match on the second label, one must first
3621 ``pop'' the outer label and advance to another OpenFlow table, where the
3622 inner label may be matched. To match on the third label, one must pop
3623 the two outer labels, and so on. The Open Networking Foundation is
3624 considering support for directly matching on multiple MPLS labels for
3625 OpenFlow 1.6.<!-- XXX add EXT-* link -->
3626 </p>
3627
3628 <h2>MPLS Inner Protocol</h2>
3629
3630 <p>
3631 Unlike all other forms of encapsulation that Open vSwitch and
3632 OpenFlow support, an MPLS label does not indicate what inner
3633 protocol it encapsulates. Different deployments determine the
3634 inner protocol in different ways [RFC 3032]:
3635 </p>
3636
3637 <ul>
3638 <li>
3639 A few reserved label values do indicate an inner protocol.
3640 Label 0, the ``IPv4 Explicit NULL Label,'' indicates inner
3641 IPv4. Label 2, the ``IPv6 Explicit NULL Label,'' indicates
3642 inner IPv6.
3643 </li>
3644
3645 <li>
3646 Some deployments use a single inner protocol consistently.
3647 </li>
3648
3649 <li>
3650 In some deployments, the inner protocol must be inferred from
3651 the innermost label.
3652 </li>
3653
3654 <li>
3655 In some deployments, the inner protocol must be inferred from
3656 the innermost label and the encapsulated data, e.g. to
3657 distinguish between inner IPv4 and IPv6 based on whether the
3658 first nibble of the inner protocol data is <code>4</code> or
3659 <code>6</code>. OpenFlow and Open vSwitch do not currently
3660 support these cases.
3661 </li>
3662 </ul>
3663
3664 <p>
3665 Open vSwitch and OpenFlow do not infer the inner protocol, even if
3666 reserved label values are in use. Instead, the flow table must specify
3667 the inner protocol at the time it pops the bottommost MPLS label, using
3668 the Ethertype argument to the <code>pop_mpls</code> action.
3669 </p>
3670
3671 <h2>Field Details</h2>
3672
3673 <field id="MFF_MPLS_LABEL" title="MPLS Label">
3674 <p>
3675 The least significant 20 bits hold the ``label'' field from
3676 the MPLS label. Other bits are zero:
3677 </p>
3678
3679 <diagram>
3680 <header name="OXM_OF_MPLS_LABEL">
3681 <bits name="zero" above="12" below="0" width=".6"/>
3682 <bits name="label" above="20" width="1.0"/>
3683 </header>
3684 </diagram>
3685
3686 <p>
3687 Most label values are available for any use by deployments.
3688 Values under 16 are reserved.
3689 </p>
3690 </field>
3691
3692 <field id="MFF_MPLS_TC" title="MPLS Traffic Class">
3693 <p>
3694 The least significant 3 bits hold the TC field from the MPLS
3695 label. Other bits are zero:
3696 </p>
3697
3698 <diagram>
3699 <header name="OXM_OF_MPLS_TC">
3700 <bits name="zero" above="5" below="0" width="1.0"/>
3701 <bits name="TC" above="3" width=".6"/>
3702 </header>
3703 </diagram>
3704
3705 <p>
3706 This field is intended for use for Quality of Service (QoS)
3707 and Explicit Congestion Notification purposes, but its
3708 particular interpretation is deployment specific.
3709 </p>
3710
3711 <p>
3712 Before 2009, this field was named EXP and reserved for
3713 experimental use [RFC 5462].
3714 </p>
3715 </field>
3716
3717 <field id="MFF_MPLS_BOS" title="MPLS Bottom of Stack">
3718 <p>
3719 The least significant bit holds the BOS field from the MPLS
3720 label. Other bits are zero:
3721 </p>
3722
3723 <diagram>
3724 <header name="OXM_OF_MPLS_BOS">
3725 <bits name="zero" above="7" below="0" width="1.3"/>
3726 <bits name="BOS" above="1" width=".3"/>
3727 </header>
3728 </diagram>
3729
3730 <p>
3731 This field is useful as part of processing a series of incoming MPLS
3732 labels. A flow that includes a <code>pop_mpls</code> action should
3733 generally match on <ref field="mpls_bos"/>:
3734 </p>
3735
3736 <ul>
3737 <li>
3738 When <ref field="mpls_bos"/> is 0, there is another MPLS label
3739 following this one, so the Ethertype passed to <code>pop_mpls</code>
3740 should be an MPLS Ethertype. For example: <code>table=0,
3741 dl_type=0x8847, mpls_bos=0, actions=pop_mpls:0x8847,
3742 goto_table:1</code>
3743 </li>
3744
3745 <li>
3746 When <ref field="mpls_bos"/> is 1, this MPLS label is the last one,
3747 so the Ethertype passed to <code>pop_mpls</code> should be a non-MPLS
3748 Ethertype such as IPv4. For example: <code>table=1, dl_type=0x8847,
3749 mpls_bos=1, actions=pop_mpls:0x0800, goto_table:2</code>
3750 </li>
3751 </ul>
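<p>
A minimal Python sketch of this choice follows. It relies on the definition
of the BOS bit given earlier (0 means another label follows, 1 means this is
the last label) and assumes, purely for illustration, that the inner
protocol is IPv4. The function names are hypothetical.
</p>

```python
MPLS_ETHERTYPE = 0x8847
IPV4_ETHERTYPE = 0x0800


def pop_mpls_ethertype(bos, inner_ethertype=IPV4_ETHERTYPE):
    """Ethertype argument for pop_mpls, given the BOS bit of the label popped."""
    return inner_ethertype if bos else MPLS_ETHERTYPE


def pop_sequence(bos_bits, inner_ethertype=IPV4_ETHERTYPE):
    """Ethertypes to pass when popping a whole stack, outermost label first."""
    return [pop_mpls_ethertype(bos, inner_ethertype) for bos in bos_bits]


# A three-label stack: only the last label has BOS set.
assert pop_sequence([0, 0, 1]) == [0x8847, 0x8847, 0x0800]
```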
3752 </field>
3753
3754 <field id="MFF_MPLS_TTL" title="MPLS Time-to-Live">
3755 <p>
3756 Holds the 8-bit time-to-live field from the MPLS label:
3757 </p>
3758
3759 <diagram>
3760 <header name="NXM_NX_MPLS_TTL">
3761 <bits name="TTL" above="8" width=".4"/>
3762 </header>
3763 </diagram>
3764 </field>
3765 </group>
3766
3767 <group title="Layer 3: IPv4 and IPv6">
3768 <h2>IPv4 Specific Fields</h2>
3769
3770 <p>
3771 These fields are applicable only to IPv4 flows, that is, flows that match
3772 on the IPv4 Ethertype <code>0x0800</code>.
3773 </p>
3774
3775 <field id="MFF_IPV4_SRC" title="IPv4 Source Address">
3776 <p>
3777 The source address from the IPv4 header:
3778 </p>
3779
3780 <diagram>
3781 <header name="Ethernet">
3782 <bits name="dst" above="48" width="0.4"/>
3783 <bits name="src" above="48" width="0.4"/>
3784 <bits name="type" above="16" below="0x800" width="0.4"/>
3785 </header>
3786 <header name="IPv4">
3787 <bits name="..." width="0.4"/>
3788 <bits name="proto" above="8" width="0.4"/>
3789 <bits name="src" above="32" width="0.4" fill="yes"/>
3790 <bits name="dst" above="32" width="0.4"/>
3791 </header>
3792 <dots/>
3793 </diagram>
3794
3795 <p>
3796 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3797 matches on <code>nw_src</code> as actually referring to the ARP SPA.
3798 </p>
3799 </field>
3800
3801 <field id="MFF_IPV4_DST" title="IPv4 Destination Address">
3802 <p>
3803 The destination address from the IPv4 header:
3804 </p>
3805
3806 <diagram>
3807 <header name="Ethernet">
3808 <bits name="dst" above="48" width="0.4"/>
3809 <bits name="src" above="48" width="0.4"/>
3810 <bits name="type" above="16" below="0x800" width="0.4"/>
3811 </header>
3812 <header name="IPv4">
3813 <bits name="..." width="0.4"/>
3814 <bits name="proto" above="8" width="0.4"/>
3815 <bits name="src" above="32" width="0.4"/>
3816 <bits name="dst" above="32" width="0.4" fill="yes"/>
3817 </header>
3818 <dots/>
3819 </diagram>
3820
3821 <p>
3822 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3823 matches on <code>nw_dst</code> as actually referring to the ARP TPA.
3824 </p>
3825 </field>
3826
3827 <h2>IPv6 Specific Fields</h2>
3828
3829 <p>
3830 These fields apply only to IPv6 flows, that is, flows that match
3831 on the IPv6 Ethertype <code>0x86dd</code>.
3832 </p>
3833
3834 <field id="MFF_IPV6_SRC" title="IPv6 Source Address">
3835 <p>
3836 The source address from the IPv6 header:
3837 </p>
3838
3839 <diagram>
3840 <header name="Ethernet">
3841 <bits name="dst" above="48" width="0.4"/>
3842 <bits name="src" above="48" width="0.4"/>
3843 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3844 </header>
3845 <header name="IPv6">
3846 <bits name="..." width="0.4"/>
3847 <bits name="next" above="8" width="0.3"/>
3848 <bits name="src" above="128" width="0.8" fill="yes"/>
3849 <bits name="dst" above="128" width="0.8"/>
3850 </header>
3851 <dots/>
3852 </diagram>
3853
3854 <p>
3855 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3856 supported only CIDR masks.
3857 </p>
3858 </field>
3859 <field id="MFF_IPV6_DST" title="IPv6 Destination Address">
3860 <p>
3861 The destination address from the IPv6 header:
3862 </p>
3863 <diagram>
3864 <header name="Ethernet">
3865 <bits name="dst" above="48" width="0.4"/>
3866 <bits name="src" above="48" width="0.4"/>
3867 <bits name="type" above="16" below="0x86dd" width="0.4"/>
3868 </header>
3869 <header name="IPv6">
3870 <bits name="..." width="0.4"/>
3871 <bits name="next" above="8" width="0.3"/>
3872 <bits name="src" above="128" width="0.8"/>
3873 <bits name="dst" above="128" width="0.8" fill="yes"/>
3874 </header>
3875 <dots/>
3876 </diagram>
3877
3878 <p>
3879 Open vSwitch 1.8 added support for bitwise matching; earlier versions
3880 supported only CIDR masks.
3881 </p>
3882 </field>
3883 <field id="MFF_IPV6_LABEL" title="IPv6 Flow Label">
3884 <p>
3885 The least significant 20 bits hold the flow label field from
3886 the IPv6 header. Other bits are zero:
3887 </p>
3888
3889 <diagram>
3890 <header name="OXM_OF_IPV6_FLABEL">
3891 <bits name="zero" above="12" below="0" width=".6"/>
3892 <bits name="label" above="20" width="1.0"/>
3893 </header>
3894 </diagram>
3895 </field>
3896
3897 <h2>IPv4/IPv6 Fields</h2>
3898
3899 <p>
3900 These fields exist with at least approximately the same meaning in both
3901 IPv4 and IPv6, so they are treated as a single field for matching
3902 purposes. Any flow that matches on the IPv4 Ethertype
3903 <code>0x0800</code> or the IPv6 Ethertype <code>0x86dd</code> may match
3904 on these fields.
3905 </p>
3906
3907 <field id="MFF_IP_PROTO" title="IPv4/v6 Protocol">
3908 <p>
3909 Matches the IPv4 or IPv6 protocol type.
3910 </p>
3911
3912 <p>
3913 For historical reasons, in an ARP or RARP flow, Open vSwitch interprets
3914 matches on <code>nw_proto</code> as actually referring to the ARP
3915 opcode. The ARP opcode is a 16-bit field, so for matching purposes ARP
3916 opcodes greater than 255 are treated as 0; this works adequately
3917 because in practice ARP and RARP only use opcodes 1 through 4.
3918 </p>
3919 </field>
3920
3921 <field id="MFF_IP_TTL" title="IPv4/v6 TTL/Hop Limit">
3922 The main reason to match on the TTL or hop limit field is to detect
3923 whether a <code>dec_ttl</code> action will fail due to a TTL exceeded
3924 error. Another way that a controller can detect TTL exceeded is to
3925 listen for <code>OFPR_INVALID_TTL</code> ``packet-in'' messages via
3926 OpenFlow.
3927 </field>
3928
3929 <field id="MFF_IP_FRAG" title="IPv4/v6 Fragment Bitmask">
3930 <p>
3931 Specifies what kinds of IP fragments or non-fragments to match. The
3932 value for this field is most conveniently specified as one of the
3933 following:
3934 </p>
3935
3936 <dl>
3937 <dt><code>no</code></dt>
3938 <dd>
3939 Match only non-fragmented packets.
3940 </dd>
3941
3942 <dt><code>yes</code></dt>
3943 <dd>
3944 Matches all fragments.
3945 </dd>
3946
3947 <dt><code>first</code></dt>
3948 <dd>
3949 Matches only fragments with offset 0.
3950 </dd>
3951
3952 <dt><code>later</code></dt>
3953 <dd>
3954 Matches only fragments with nonzero offset.
3955 </dd>
3956
3957 <dt><code>not_later</code></dt>
3958 <dd>
3959 Matches non-fragmented packets and fragments with zero offset.
3960 </dd>
3961 </dl>
3962
3963 <p>
3964 The field is internally formatted as 2 bits: bit 0 is 1 for an IP
3965 fragment with any offset (and otherwise 0), and bit 1 is 1 for an IP
3966 fragment with nonzero offset (and otherwise 0), like so:
3967 </p>
3968
3969 <diagram>
3970 <header name="NXM_NX_IP_FRAG">
3971 <bits name="zero" above="6" below="0" width=".9"/>
3972 <bits name="later" above="1" width=".3"/>
3973 <bits name="any" above="1" width=".3"/>
3974 </header>
3975 </diagram>
3976
3977 <p>
3978 Even though 2 bits have 4 possible values, this field only uses 3 of
3979 them:
3980 </p>
3981
3982 <ul>
3983 <li>
3984 A packet that is not an IP fragment has value 0.
3985 </li>
3986
3987 <li>
3988 A packet that is an IP fragment with offset 0 (the first fragment)
3989 has bit 0 set and thus value 1.
3990 </li>
3991
3992 <li>
3993 A packet that is an IP fragment with nonzero offset has bits 0 and 1
3994 set and thus value 3.
3995 </li>
3996 </ul>
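<p>
The three possible values, and how the symbolic names could map onto
value/mask pairs, can be sketched in Python. The pairs in
<code>SYMBOLIC</code> are derived here from the descriptions above and are an
assumption of this example, not a statement of how Open vSwitch encodes the
names internally.
</p>

```python
FRAG_ANY = 0x1     # bit 0: fragment with any offset
FRAG_LATER = 0x2   # bit 1: fragment with nonzero offset


def ip_frag_field(is_fragment, offset=0):
    """Field value for a packet, following the three cases listed above."""
    if not is_fragment:
        return 0
    return FRAG_ANY | (FRAG_LATER if offset != 0 else 0)


# name -> (value, mask), derived from the descriptions of each name.
SYMBOLIC = {
    "no": (0, 3),
    "yes": (1, 1),
    "first": (1, 3),
    "later": (3, 3),
    "not_later": (0, 2),
}


def frag_matches(field, name):
    value, mask = SYMBOLIC[name]
    return (field & mask) == (value & mask)


first = ip_frag_field(True, 0)
later = ip_frag_field(True, 185)
assert (ip_frag_field(False), first, later) == (0, 1, 3)
assert frag_matches(first, "yes") and frag_matches(later, "yes")
assert frag_matches(first, "not_later") and not frag_matches(later, "not_later")
```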
3997
3998 <p>
3999 The switch may reject matches against values that can never appear.
4000 </p>
4001
4002 <p>
4003 It is important to understand how this field interacts with the
4004 OpenFlow fragment handling mode:
4005 </p>
4006
4007 <ul>
4008 <li>
4009 In <code>OFPC_FRAG_DROP</code> mode, the OpenFlow switch drops all IP
4010 fragments before they reach the flow table, so every packet that is
4011 available for matching will have value 0 in this field.
4012 </li>
4013
4014 <li>
4015 Open vSwitch does not implement <code>OFPC_FRAG_REASM</code> mode,
4016 but if it did then IP fragments would be reassembled before they
4017 reached the flow table and again every packet available for matching
4018 would always have value 0.
4019 </li>
4020
4021 <li>
4022 In <code>OFPC_FRAG_NORMAL</code> mode, all three values are possible,
4023 but OpenFlow 1.0 says that fragments' transport ports are always 0,
4024 even for the first fragment, so this does not provide much extra
4025 information.
4026 </li>
4027
4028 <li>
4029 In <code>OFPC_FRAG_NX_MATCH</code> mode, all three values are
4030 possible. For fragments with offset 0, Open vSwitch makes L4 header
4031 information available.
4032 </li>
4033 </ul>
4034
4035 <p>
4036 Thus, this field is likely to be most useful for an Open vSwitch switch
4037 configured in <code>OFPC_FRAG_NX_MATCH</code> mode. See the
4038 description of the <code>set-frags</code> command in
4039 <code>ovs-ofctl</code>(8), for more details.
4040 </p>
4041 </field>
4042
4043 <h3>IPv4/IPv6 TOS Fields</h3>
4044
4045 <p>
4046 IPv4 and IPv6 contain a one-byte ``type of service'' or TOS field that
4047 has the following format:
4048 </p>
4049
4050 <diagram>
4051 <header name="type of service">
4052 <bits name="DSCP" above="6" width=".9"/>
4053 <bits name="ECN" above="2" width=".3"/>
4054 </header>
4055 </diagram>
4056
4057 <field id="MFF_IP_DSCP" title="IPv4/v6 DSCP (Bits 2-7)">
4058 <p>
4059 This field is the TOS byte with the two ECN bits cleared to 0:
4060 </p>
4061
4062 <diagram>
4063 <header name="NXM_OF_IP_TOS">
4064 <bits name="DSCP" above="6" width=".9"/>
4065 <bits name="zero" above="2" below="0" width=".3"/>
4066 </header>
4067 </diagram>
4068 </field>
4069 <field id="MFF_IP_DSCP_SHIFTED" title="IPv4/v6 DSCP (Bits 0-5)">
4070 <p>
4071 This field is the TOS byte shifted right to put the DSCP bits in the
4072 6 least-significant bits:
4073 </p>
4074
4075 <diagram>
4076 <header name="OXM_OF_IP_DSCP">
4077 <bits name="zero" above="2" below="0" width=".3"/>
4078 <bits name="DSCP" above="6" width=".9"/>
4079 </header>
4080 </diagram>
4081 </field>
4082 <field id="MFF_IP_ECN" title="IPv4/v6 ECN">
4083 <p>
4084 This field is the TOS byte with the DSCP bits cleared to 0:
4085 </p>
4086
4087 <diagram>
4088 <header name="OXM_OF_IP_ECN">
4089 <bits name="zero" above="6" below="0" width=".9"/>
4090 <bits name="ECN" above="2" width=".35"/>
4091 </header>
4092 </diagram>
4093 </field>
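<p>
The relationship between the TOS byte and the three fields above can be
sketched in Python. The function <code>tos_fields</code> is a hypothetical
helper; its dictionary keys reuse the field identifiers from this section.
</p>

```python
def tos_fields(tos):
    """Split a one-byte TOS value into the three fields described above."""
    tos &= 0xFF
    return {
        "MFF_IP_DSCP": tos & 0xFC,           # DSCP in bits 2-7, ECN cleared
        "MFF_IP_DSCP_SHIFTED": tos >> 2,     # DSCP in bits 0-5
        "MFF_IP_ECN": tos & 0x03,            # ECN only, DSCP cleared
    }


# TOS byte 0xb9: DSCP 46 with ECN 1.
fields = tos_fields(0xB9)
assert fields == {
    "MFF_IP_DSCP": 0xB8,
    "MFF_IP_DSCP_SHIFTED": 46,
    "MFF_IP_ECN": 1,
}
```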
4094
4095 </group>
4096
4097 <group title="Layer 3: ARP">
4098 <p>
4099 In theory, Address Resolution Protocol, or ARP, is a generic protocol
4100 that can be used to obtain the hardware address that
4101 corresponds to any higher-level protocol address. In contemporary usage,
4102 ARP is used only in Ethernet networks to obtain the Ethernet address for
4103 a given IPv4 address. OpenFlow and Open vSwitch only support this usage
4104 of ARP. For this use case, an ARP packet has the following format, with
4105 the ARP fields exposed as Open vSwitch fields highlighted:
4106 </p>
4107
4108 <diagram>
4109 <header name="Ethernet">
4110 <bits name="dst" above="48" width="0.4"/>
4111 <bits name="src" above="48" width="0.4"/>
4112 <bits name="type" above="16" below="0x806" width="0.4"/>
4113 </header>
4114 <header name="ARP">
4115 <bits name="hrd" above="16" below="1" width=".3"/>
4116 <bits name="pro" above="16" below="0x800" width=".3"/>
4117 <bits name="hln" above="8" below="6" width=".2"/>
4118 <bits name="pln" above="8" below="4" width=".2"/>
4119 <bits name="op" above="16" width=".2" fill="yes"/>
4120 <bits name="sha" above="48" width="0.5" fill="yes"/>
4121 <bits name="spa" above="32" width="0.3" fill="yes"/>
4122 <bits name="tha" above="48" width="0.5" fill="yes"/>
4123 <bits name="tpa" above="32" width="0.3" fill="yes"/>
4124 </header>
4125 </diagram>
4126
4127 <p>
4128 The ARP fields are also used for RARP, the Reverse Address Resolution
4129 Protocol, which shares ARP's wire format.
4130 </p>
4131
4132 <field id="MFF_ARP_OP" title="ARP Opcode">
4133 Even though this is a 16-bit field, Open vSwitch does not support ARP
4134 opcodes greater than 255; it treats them as zero. This works adequately
4135 because in practice ARP and RARP only use opcodes 1 through 4.
4136 </field>
4137
4138 <field id="MFF_ARP_SPA" title="ARP Source IPv4 Address"/>
4139 <field id="MFF_ARP_TPA" title="ARP Target IPv4 Address"/>
4140 <field id="MFF_ARP_SHA" title="ARP Source Ethernet Address"/>
4141 <field id="MFF_ARP_THA" title="ARP Target Ethernet Address"/>
4142 </group>
4143
4144 <group title="Layer 4: TCP, UDP, and SCTP">
4145 <p>
4146 For matching purposes, no distinction is made whether these protocols are
4147 encapsulated within IPv4 or IPv6.
4148 </p>
4149
4150 <h2>TCP</h2>
4151
4152 <p>
4153 The following diagram shows TCP within IPv4. Open vSwitch also supports
4154 TCP in IPv6. Only TCP fields implemented as Open vSwitch fields are
4155 shown:
4156 </p>
4157
4158 <diagram>
4159 <header name="Ethernet">
4160 <bits name="dst" above="48" width="0.4"/>
4161 <bits name="src" above="48" width="0.4"/>
4162 <bits name="type" above="16" below="0x800" width="0.4"/>
4163 </header>
4164 <header name="IPv4">
4165 <bits name="..." width="0.4"/>
4166 <bits name="proto" above="8" below="6" width="0.3"/>
4167 <bits name="src" above="32" width="0.4"/>
4168 <bits name="dst" above="32" width="0.4"/>
4169 </header>
4170 <header name="TCP">
4171 <bits name="src" above="16" width=".2"/>
4172 <bits name="dst" above="16" width=".2"/>
4173 <bits name="..." width=".75"/>
4174 <bits name="flags" above="12" width=".3"/>
4175 <bits name="..." width=".6"/>
4176 </header>
4177 <dots/>
4178 </diagram>
4179 <field id="MFF_TCP_SRC" title="TCP Source Port">
4180 Open vSwitch 1.6 added support for bitwise matching.
4181 </field>
4182 <field id="MFF_TCP_DST" title="TCP Destination Port">
4183 Open vSwitch 1.6 added support for bitwise matching.
4184 </field>
4185 <field id="MFF_TCP_FLAGS" title="TCP Flags">
4186 <p>
4187 This field holds the TCP flags. TCP currently defines 9 flag bits. An
4188 additional 3 bits are reserved. For more information, see [RFC 793],
4189 [RFC 3168], and [RFC 3540].
4190 </p>
4191
4192 <p>
4193 Matches on this field are most conveniently written in terms of
4194 symbolic names (given in the diagram below), each preceded by either
4195 <code>+</code> for a flag that must be set, or <code>-</code> for a
4196 flag that must be unset, without any other delimiters between the
4197 flags. Flags not mentioned are wildcarded. For example,
4198 <code>tcp,tcp_flags=+syn-ack</code> matches TCP SYNs that are not ACKs,
4199 and <code>tcp,tcp_flags=+[200]</code> matches TCP packets with the
4200 reserved [200] flag set. Matches can also be written as
4201 <code><var>flags</var>/<var>mask</var></code>, where <var>flags</var>
4202 and <var>mask</var> are 16-bit numbers in decimal or in hexadecimal
4203 prefixed by <code>0x</code>.
4204 </p>
4205
4206 <p>
4207 The flag bits are:
4208 </p>
4209
4210 <diagram>
4211 <header>
4212 <bits name="zero" above="4" below="0" width=".9"/>
4213 </header>
4214 <nospace/>
4215 <header name="reserved">
4216 <bits name="[800]" above="1" width=".35"/>
4217 <bits name="[400]" above="1" width=".35"/>
4218 <bits name="[200]" above="1" width=".35"/>
4219 </header>
4220 <nospace/>
4221 <header name="later RFCs">
4222 <bits name="NS" above="1" width=".35"/>
4223 <bits name="CWR" above="1" width=".35"/>
4224 <bits name="ECE" above="1" width=".35"/>
4225 </header>
4226 <nospace/>
4227 <header name="RFC 793">
4228 <bits name="URG" above="1" width=".35"/>
4229 <bits name="ACK" above="1" width=".35"/>
4230 <bits name="PSH" above="1" width=".35"/>
4231 <bits name="RST" above="1" width=".35"/>
4232 <bits name="SYN" above="1" width=".35"/>
4233 <bits name="FIN" above="1" width=".35"/>
4234 </header>
4235 </diagram>
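
<p>
  As an illustration only, the two match syntaxes described above can be
  reduced to a <code>flags</code>/<code>mask</code> pair along the
  following lines. This is a minimal sketch, not the Open vSwitch parser:
  the names <code>TCP_FLAG_BITS</code>, <code>parse_tcp_flags</code>, and
  <code>tcp_flags_match</code> are hypothetical, the bit values are read off
  the diagram above (FIN as the least-significant bit), and accepting
  symbolic names in either case is an assumption of the sketch.
</p>

<pre><![CDATA[
```python
import re

# Bit values read off the flag diagram: FIN is the least-significant bit and
# the three reserved bits are named by their hexadecimal value.
TCP_FLAG_BITS = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "psh": 0x008,
    "ack": 0x010, "urg": 0x020, "ece": 0x040, "cwr": 0x080,
    "ns": 0x100, "[200]": 0x200, "[400]": 0x400, "[800]": 0x800,
}

# One symbolic token: a sign followed by a flag name or a bracketed value.
_TOKEN = re.compile(r"([+-])(\[[0-9a-f]+\]|[a-z]+)")


def parse_tcp_flags(text):
    """Return (flags, mask) for a tcp_flags match string (hypothetical helper).

    Accepts the symbolic form (for example "+syn-ack") and the numeric
    flags/mask form (for example "0x02/0x12").
    """
    text = text.strip().lower()  # assumption: names are case-insensitive
    if text.startswith(("+", "-")):
        flags = mask = 0
        pos = 0
        while pos < len(text):
            m = _TOKEN.match(text, pos)
            if m is None or m.group(2) not in TCP_FLAG_BITS:
                raise ValueError("bad tcp_flags token at offset %d" % pos)
            bit = TCP_FLAG_BITS[m.group(2)]
            mask |= bit          # a mentioned flag is no longer wildcarded
            if m.group(1) == "+":
                flags |= bit     # "+" requires the flag to be set
            pos = m.end()
        return flags, mask
    value, sep, mask_text = text.partition("/")
    if not sep:
        raise ValueError("expected flags/mask")
    # Base 0 accepts decimal as well as 0x-prefixed hexadecimal.
    return int(value, 0), int(mask_text, 0)


def tcp_flags_match(value, flags, mask):
    """True if a packet's TCP flags value satisfies the flags/mask pair."""
    return (value & mask) == (flags & mask)
```
]]></pre>

<p>
  Under these assumptions, <code>parse_tcp_flags("+syn-ack")</code> and
  <code>parse_tcp_flags("0x02/0x12")</code> yield the same pair, so both
  spellings describe TCP SYNs that are not ACKs.
</p>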
4236 </field>
4237
4238 <h2>UDP</h2>
4239
4240 <p>
4241 The following diagram shows UDP within IPv4. Open vSwitch also supports
4242 UDP in IPv6. Only UDP fields that Open vSwitch exposes as fields are
4243 shown:
4244 </p>
4245
4246 <diagram>
4247 <header name="Ethernet">
4248 <bits name="dst" above="48" width="0.4"/>
4249 <bits name="src" above="48" width="0.4"/>
4250 <bits name="type" above="16" below="0x800" width="0.4"/>
4251 </header>
4252 <header name="IPv4">
4253 <bits name="..." width="0.4"/>
4254 <bits name="proto" above="8" below="17" width="0.3"/>
4255 <bits name="src" above="32" width="0.4"/>
4256 <bits name="dst" above="32" width="0.4"/>
4257 </header>
4258 <header name="UDP">
4259 <bits name="src" above="16" width=".2"/>
4260 <bits name="dst" above="16" width=".2"/>
4261 <bits name="..." width=".4"/>
4262 </header>
4263 <dots/>
4264 </diagram>
4265 <field id="MFF_UDP_SRC" title="UDP Source Port"/>
4266 <field id="MFF_UDP_DST" title="UDP Destination Port"/>
4267
4268 <h2>SCTP</h2>
4269
4270 <p>
4271 The following diagram shows SCTP within IPv4. Open vSwitch also supports
4272 SCTP in IPv6. Only SCTP fields that Open vSwitch exposes as fields are
4273 shown:
4274 </p>
4275
4276 <diagram>
4277 <header name="Ethernet">
4278 <bits name="dst" above="48" width="0.4"/>
4279 <bits name="src" above="48" width="0.4"/>
4280 <bits name="type" above="16" below="0x800" width="0.4"/>
4281 </header>
4282 <header name="IPv4">
4283 <bits name="..." width="0.4"/>
4284 <bits name="proto" above="8" below="132" width="0.3"/>
4285 <bits name="src" above="32" width="0.4"/>
4286 <bits name="dst" above="32" width="0.4"/>
4287 </header>
4288 <header name="SCTP">
4289 <bits name="src" above="16" width=".2"/>
4290 <bits name="dst" above="16" width=".2"/>
4291 <bits name="..." width=".8"/>
4292 </header>
4293 <dots/>
4294 </diagram>
4295 <field id="MFF_SCTP_SRC" title="SCTP Source Port"/>
4296 <field id="MFF_SCTP_DST" title="SCTP Destination Port"/>
4297 </group>
4298
4299 <group title="Layer 4: ICMPv4 and ICMPv6">
4300 <h2>ICMPv4</h2>
4301 <diagram>
4302 <header name="Ethernet">
4303 <bits name="dst" above="48" width="0.4"/>
4304 <bits name="src" above="48" width="0.4"/>
4305 <bits name="type" above="16" below="0x800" width="0.4"/>
4306 </header>
4307 <header name="IPv4">
4308 <bits name="..." width="0.4"/>
4309 <bits name="proto" above="8" below="1" width="0.3"/>
4310 <bits name="src" above="32" width="0.4"/>
4311 <bits name="dst" above="32" width="0.4"/>
4312 </header>
4313 <header name="ICMPv4">
4314 <bits name="type" above="8" width=".3"/>
4315 <bits name="code" above="8" width=".3"/>
4316 <bits name="..." width=".8"/>
4317 </header>
4318 <dots/>
4319 </diagram>
4320 <field id="MFF_ICMPV4_TYPE" title="ICMPv4 Type">
4321 <p>
4322 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4323 matches on <code>tp_src</code> as actually referring to the ICMP type.
4324 </p>
4325 </field>
4326 <field id="MFF_ICMPV4_CODE" title="ICMPv4 Code">
4327 <p>
4328 For historical reasons, in an ICMPv4 flow, Open vSwitch interprets
4329 matches on <code>tp_dst</code> as actually referring to the ICMP code.
4330 </p>
4331 </field>
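
<p>
  The historical aliasing described for the two ICMPv4 fields can be
  pictured as a simple key rewrite. The sketch below is illustrative only
  and is not how Open vSwitch implements it: the helper
  <code>normalize_icmpv4_match</code> and the dictionary representation of
  a match are hypothetical, and the target names <code>icmp_type</code> and
  <code>icmp_code</code> are assumed here as the names of the ICMPv4 type
  and code fields.
</p>

<pre><![CDATA[
```python
# Legacy transport-port names and the ICMPv4 fields they stand for.
# The right-hand names are assumptions of this sketch.
LEGACY_ICMPV4_ALIASES = {"tp_src": "icmp_type", "tp_dst": "icmp_code"}


def normalize_icmpv4_match(match):
    """Return a copy of an ICMPv4 match with legacy keys renamed.

    `match` is a hypothetical dict that maps field names to values.
    """
    out = {}
    for key, value in match.items():
        new_key = LEGACY_ICMPV4_ALIASES.get(key, key)
        if new_key in out and out[new_key] != value:
            # Both spellings were given with different values.
            raise ValueError("conflicting values for %s" % new_key)
        out[new_key] = value
    return out
```
]]></pre>

<p>
  For example, a match written as <code>{"tp_src": 8, "tp_dst": 0}</code>
  would be read as ICMP type 8 with code 0.
</p>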
4332
4333 <h2>ICMPv6</h2>
4334 <diagram>
4335 <header name="Ethernet">
4336 <bits name="dst" above="48" width="0.4"/>
4337 <bits name="src" above="48" width="0.4"/>
4338 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4339 </header>
4340 <header name="IPv6">
4341 <bits name="..." width="0.2"/>
4342 <bits name="next" above="8" below="58" width="0.3"/>
4343 <bits name="src" above="128" width="0.4"/>
4344 <bits name="dst" above="128" width="0.4"/>
4345 </header>
4346 <header name="ICMPv6">
4347 <bits name="type" above="8" width=".3"/>
4348 <bits name="code" above="8" width=".3"/>
4349 <bits name="..." width=".8"/>
4350 </header>
4351 <dots/>
4352 </diagram>
4353 <field id="MFF_ICMPV6_TYPE" title="ICMPv6 Type"/>
4354 <field id="MFF_ICMPV6_CODE" title="ICMPv6 Code"/>
4355
4356 <h2>ICMPv6 Neighbor Discovery</h2>
4357 <diagram>
4358 <header name="Ethernet">
4359 <bits name="dst" above="48" width="0.4"/>
4360 <bits name="src" above="48" width="0.4"/>
4361 <bits name="type" above="16" below="0x86dd" width="0.4"/>
4362 </header>
4363 <header name="IPv6">
4364 <bits name="..." width="0.2"/>
4365 <bits name="next" above="8" below="58" width="0.3"/>
4366 <bits name="src" above="128" width="0.4"/>
4367 <bits name="dst" above="128" width="0.4"/>
4368 </header>
4369 <header name="ICMPv6">
4370 <bits name="type" above="8" below="135/136" width=".3"/>
4371 <bits name="code" above="8" below="0" width=".3"/>
4372 <bits name="..." width=".8"/>
4373 </header>
4374 <header name="ICMPv6 ND">
4375 <bits name="target" above="128" width=".4"/>
4376 <bits name="option ..." width=".6"/>
4377 </header>
4378 </diagram>
4379 <field id="MFF_ND_TARGET" title="ICMPv6 Neighbor Discovery Target IPv6"/>
4380 <field id="MFF_ND_SLL"
4381 title="ICMPv6 Neighbor Discovery Source Ethernet Address"/>
4382 <field id="MFF_ND_TLL"
4383 title="ICMPv6 Neighbor Discovery Target Ethernet Address"/>
4384 </group>
4385
4386 <h1>References</h1>
4387
4388 <dl>
4389 <dt>Casado</dt>
4390 <dd>
4391 M. Casado, M. J. Freedman, J. Pettit, J. Luo, N. McKeown, and
4392 S. Shenker, ``Ethane: Taking Control of the Enterprise,''
4393 Computer Communication Review, October 2007.
4394 </dd>
4395
4396 <dt>EXT-56</dt>
4397 <dd>
4398 J. Tonsing, ``Permit one of a set of prerequisites to apply, e.g. don't
4399 preclude non-Ethernet media,'' <url
4400 href="https://rs.opennetworking.org/bugs/browse/EXT-56"/> (ONF
4401 members only).
4402 </dd>
4403
4404 <dt>EXT-112</dt>
4405 <dd>
4406 J. Tourrilhes, ``Support non-Ethernet packets throughout the
4407 pipeline,'' <url
4408 href="https://rs.opennetworking.org/bugs/browse/EXT-112"/> (ONF
4409 members only).
4410 </dd>
4411
4412 <dt>EXT-134</dt>
4413 <dd>
4414 J. Tourrilhes, ``Match first nibble of the MPLS payload,'' <url
4415 href="https://rs.opennetworking.org/bugs/browse/EXT-134"/> (ONF
4416 members only).
4417 </dd>
4418
4419 <dt>Geneve</dt>
4420 <dd>
4421 J. Gross, I. Ganga, and T. Sridhar, editors, ``Geneve: Generic Network
4422 Virtualization Encapsulation,'' <url
4423 href="https://datatracker.ietf.org/doc/draft-ietf-nvo3-geneve/"/>.
4424 </dd>
4425
4426 <dt>IEEE OUI</dt>
4427 <dd>
4428 IEEE Standards Association, ``MAC Address Block Large (MA-L),''
4429 <url
4430 href="https://standards.ieee.org/develop/regauth/oui/index.html"/>.
4431 </dd>
4432
4433 <dt>NSH</dt>
4434 <dd>
4435 P. Quinn and U. Elzur, editors, ``Network Service Header,'' <url
4436 href="https://datatracker.ietf.org/doc/draft-ietf-sfc-nsh/"/>.
4437 </dd>
4438
4439 <dt>OpenFlow 1.0.1</dt>
4440 <dd>
4441 Open Networking Foundation, ``OpenFlow Switch Errata, Version
4442 1.0.1,'' June 2012.
4443 </dd>
4444
4445 <dt>OpenFlow 1.1</dt>
4446 <dd>
4447 OpenFlow Consortium, ``OpenFlow Switch Specification Version
4448 1.1.0 Implemented (Wire Protocol 0x02),'' February 2011.
4449 </dd>
4450
4451 <dt>OpenFlow 1.5</dt>
4452 <dd>
4453 Open Networking Foundation, ``OpenFlow Switch Specification Version
4454 1.5.0 (Protocol version 0x06),'' December 2014.
4455 </dd>
4456
4457 <dt>OpenFlow Extensions 1.3.x Package 2</dt>
4458 <dd>
4459 Open Networking Foundation, ``OpenFlow Extensions 1.3.x Package 2,''
4460 December 2013.
4461 </dd>
4462
4463 <dt>TCP Flags Match Field Extension</dt>
4464 <dd>
4465 Open Networking Foundation, ``TCP flags match field Extension,'' December
4466 2014. In [OpenFlow Extensions 1.3.x Package 2].
4467 </dd>
4468
4469 <dt>Pepelnjak</dt>
4470 <dd>
4471 I. Pepelnjak, ``OpenFlow and Fermi Estimates,'' <url
4472 href="http://blog.ipspace.net/2013/09/openflow-and-fermi-estimates.html"/>.
4473 </dd>
4474
4475 <dt>RFC 793</dt>
4476 <dd>
4477 ``Transmission Control Protocol,'' <url
4478 href="http://www.ietf.org/rfc/rfc793.txt"/>.
4479 </dd>
4480
4481 <dt>RFC 3032</dt>
4482 <dd>
4483 E. Rosen, D. Tappan, G. Fedorkow, Y. Rekhter, D. Farinacci,
4484 T. Li, and A. Conta, ``MPLS Label Stack Encoding,'' <url
4485 href="http://www.ietf.org/rfc/rfc3032.txt"/>.
4486 </dd>
4487
4488 <dt>RFC 3168</dt>
4489 <dd>
4490 K. Ramakrishnan, S. Floyd, and D. Black, ``The Addition of Explicit
4491 Congestion Notification (ECN) to IP,'' <url href="https://tools.ietf.org/html/rfc3168"/>.
4492 </dd>
4493
4494 <dt>RFC 3540</dt>
4495 <dd>
4496 N. Spring, D. Wetherall, and D. Ely, ``Robust Explicit Congestion
4497 Notification (ECN) Signaling with Nonces,'' <url
4498 href="https://tools.ietf.org/html/rfc3540"/>.
4499 </dd>
4500
4501 <dt>RFC 4632</dt>
4502 <dd>
4503 V. Fuller and T. Li, ``Classless Inter-domain Routing (CIDR): The
4504 Internet Address Assignment and Aggregation Plan,'' <url
4505 href="https://tools.ietf.org/html/rfc4632"/>.
4506 </dd>
4507
4508 <dt>RFC 5462</dt>
4509 <dd>
4510 L. Andersson and R. Asati, ``Multiprotocol Label Switching
4511 (MPLS) Label Stack Entry: ``EXP'' Field Renamed to ``Traffic
4512 Class'' Field,'' <url
4513 href="http://www.ietf.org/rfc/rfc5462.txt"/>.
4514 </dd>
4515
4516 <dt>RFC 6830</dt>
4517 <dd>
4518 D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, ``The
4519 Locator/ID Separation Protocol (LISP),'' <url
4520 href="http://www.ietf.org/rfc/rfc6830.txt"/>.
4521 </dd>
4522
4523 <dt>RFC 7348</dt>
4524 <dd>
4525 M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger, T. Sridhar,
4526 M. Bursell, and C. Wright, ``Virtual eXtensible Local Area Network
4527 (VXLAN): A Framework for Overlaying Virtualized Layer 2 Networks over
4528 Layer 3 Networks,'' <url href="https://tools.ietf.org/html/rfc7348"/>.
4529 </dd>
4530
4531 <dt>Srinivasan</dt>
4532 <dd>
4533 V. Srinivasan, S. Suri, and G. Varghese, ``Packet
4534 Classification using Tuple Space Search,'' SIGCOMM 1999.
4535 </dd>
4536
4537 <dt>Pagiamtzis</dt>
4538 <dd>
4539 K. Pagiamtzis and A. Sheikholeslami, ``Content-addressable
4540 memory (CAM) circuits and architectures: A tutorial and
4541 survey,'' IEEE Journal of Solid-State Circuits, vol. 41, no. 3,
4542 pp. 712-727, March 2006.
4543 </dd>
4544
4545 <dt>VXLAN Group Policy Option</dt>
4546 <dd>
4547 M. Smith and L. Kreeger, ``VXLAN Group Policy Option.'' Internet-Draft.
4548 <url href="https://tools.ietf.org/html/draft-smith-vxlan-group-policy"/>.
4549 </dd>
4550 </dl>
4551
4552 <h1>Authors</h1>
4553
4554 <p>
4555 Ben Pfaff, with advice from Justin Pettit and Jean Tourrilhes.
4556 </p>
4557
4558</fields>
4559
4560<!--
4561 OXM fields not yet supported Future Directions References/See Also
4562 OXM fields required by various versions and by the "Conformance Test Specification for OpenFlow Switch Specification 1.0.1"
4563-->