1 <?xml version="1.0" encoding="utf-8"?>
2 <database name="ovn-sb" title="OVN Southbound Database">
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
13 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
14 northbound to the Cloud Management System, via <code>ovn-northd</code>.
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
20 The OVN Southbound database contains classes of data with
21 different properties, as described in the sections below.
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 contains all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
38 The <ref table="Chassis"/> table comprises the PN tables.
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
78 </p>
79
80 <h3>Logical-physical bindings</h3>
81
82 <p>
83 These tables link logical and physical components. They show the current
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
87 </p>
88
89 <p>
90 These tables change frequently, at least every time a VM powers up or down
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
104 </p>
105
106 <h3>MAC bindings</h3>
107
108 <p>
109 The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
110 to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
111 and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual
112 machines are statically populated into the <ref table="Port_Binding"/>
113 table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
114 on physical networks.
115 </p>
116
117 <h2>Common Columns</h2>
118
119 <p>
120 Some tables contain a special column named <code>external_ids</code>. This
121 column has the same form and purpose each place that it appears, so we
122 describe it here to save space later.
123 </p>
124
125 <dl>
126 <dt><code>external_ids</code>: map of string-string pairs</dt>
127 <dd>
128 Key-value pairs for use by the software that manages the OVN Southbound
129 database rather than by
130 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
131 particular, <code>ovn-northd</code> can use key-value pairs in this
132 column to relate entities in the southbound database to higher-level
133 entities (such as entities in the OVN Northbound database). Individual
134 key-value pairs in this column may be documented in some cases to aid
135 in understanding and troubleshooting, but the reader should not mistake
136 such documentation as comprehensive.
137 </dd>
138 </dl>
139
140 <table name="SB_Global" title="Southbound configuration">
141 <p>
142 Southbound configuration for an OVN system. This table must have exactly
143 one row.
144 </p>
145
146 <group title="Status">
147 This column allows a client to track the overall configuration state of
148 the system.
149
150 <column name="nb_cfg">
151 Sequence number for the configuration. When a CMS or
152 <code>ovn-nbctl</code> updates the northbound database, it increments
153 the <code>nb_cfg</code> column in the <code>NB_Global</code> table in
154 the northbound database. In turn, when <code>ovn-northd</code> updates
155 the southbound database to bring it up to date with these changes, it
156 updates this column to the same value.
157 </column>
158 </group>
159
160 <group title="Common Columns">
161 <column name="external_ids">
162 See <em>Common Columns</em> at the beginning of this document.
163 </column>
164 </group>
165 <group title="Connection Options">
166 <column name="connections">
167 Database clients to which the Open vSwitch database server should
168 connect or on which it should listen, along with options for how these
169 connections should be configured. See the <ref table="Connection"/>
170 table for more information.
171 </column>
172 </group>
173 </table>
174
175 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
176 <p>
177 Each row in this table represents a hypervisor or gateway (a chassis) in
178 the physical network (PN). Each chassis, via
179 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
180 and updates its own row, and keeps a copy of the remaining rows to
181 determine how to reach other hypervisors.
182 </p>
183
184 <p>
185 When a chassis shuts down gracefully, it should remove its own row.
186 (This is not critical because resources hosted on the chassis are equally
187 unreachable regardless of whether the row is present.) If a chassis
188 shuts down permanently without removing its row, some kind of manual or
189 automatic cleanup is eventually needed; we can devise a process for that
190 as necessary.
191 </p>
192
193 <column name="name">
194 OVN does not prescribe a particular format for chassis names.
195 ovn-controller populates this column using <ref key="system-id"
196 table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/>
197 in the Open_vSwitch database's <ref table="Open_vSwitch"
198 db="Open_vSwitch"/> table. ovn-controller-vtep populates this
199 column with <ref table="Physical_Switch" column="name"
200 db="hardware_vtep"/> in the hardware_vtep database's
201 <ref table="Physical_Switch" db="hardware_vtep"/> table.
202 </column>
203
204 <column name="hostname">
205 The hostname of the chassis, if applicable. ovn-controller will populate
206 this column with the hostname of the host it is running on.
207 ovn-controller-vtep will leave this column empty.
208 </column>
209
210 <column name="nb_cfg">
211 Sequence number for the configuration. When <code>ovn-controller</code>
212 updates the configuration of a chassis from the contents of the
213 southbound database, it copies <ref table="SB_Global" column="nb_cfg"/>
214 from the <ref table="SB_Global"/> table into this column.
215 </column>
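
<p>
The <code>nb_cfg</code> columns form a chain from the northbound database,
through <code>SB_Global</code>, to each <code>Chassis</code> row, so a
client can compare them to judge how far a change has propagated. The
following is a minimal sketch that works on plain integers, assuming the
sequence numbers only increase; the function name
<code>propagation_state</code> is hypothetical and no OVSDB client API is
used.
</p>

<pre><![CDATA[
```python
def propagation_state(nb_cfg, sb_cfg, chassis_cfgs):
    """Classify how far a northbound change has propagated.

    nb_cfg:       NB_Global.nb_cfg in the northbound database
    sb_cfg:       SB_Global.nb_cfg in the southbound database
    chassis_cfgs: mapping of chassis name -> Chassis.nb_cfg
    """
    # ovn-northd has not yet caught up with the northbound database.
    if sb_cfg < nb_cfg:
        return "pending in ovn-northd"
    # Chassis whose ovn-controller has not yet copied the new value.
    lagging = sorted(name for name, cfg in chassis_cfgs.items() if cfg < nb_cfg)
    if lagging:
        return "waiting for chassis: " + ", ".join(lagging)
    return "applied everywhere"
```
]]></pre>

<p>
For example, <code>propagation_state(5, 5, {"hv1": 5, "hv2": 4})</code>
reports that the hypothetical chassis <code>hv2</code> is still behind.
</p>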
216
217 <column name="external_ids" key="ovn-bridge-mappings">
218 <code>ovn-controller</code> populates this key with the set of bridge
219 mappings it has been configured to use. Other applications should treat
220 this key as read-only. See <code>ovn-controller</code>(8) for more
221 information.
222 </column>
223
224 <column name="external_ids" key="datapath-type">
225 <code>ovn-controller</code> populates this key with the datapath type
226 configured in the <ref table="Bridge" column="datapath_type"/> column of
227 the Open_vSwitch database's <ref table="Bridge" db="Open_vSwitch"/>
228 table. Other applications should treat this key as read-only. See
229 <code>ovn-controller</code>(8) for more information.
230 </column>
231
232 <column name="external_ids" key="iface-types">
233 <code>ovn-controller</code> populates this key with the interface types
234 configured in the <ref table="Open_vSwitch" column="iface_types"/> column
235 of the Open_vSwitch database's <ref table="Open_vSwitch"
236 db="Open_vSwitch"/> table. Other applications should treat this key as
237 read-only. See <code>ovn-controller</code>(8) for more information.
238 </column>
239
240 <group title="Common Columns">
241 The overall purpose of these columns is described under <code>Common
242 Columns</code> at the beginning of this document.
243
244 <column name="external_ids"/>
245 </group>
246
247 <group title="Encapsulation Configuration">
248 <p>
249 OVN uses encapsulation to transmit logical dataplane packets
250 between chassis.
251 </p>
252
253 <column name="encaps">
254 Points to supported encapsulation configurations to transmit
255 logical dataplane packets to this chassis. Each entry is a <ref
256 table="Encap"/> record that describes the configuration.
257 </column>
258 </group>
259
260 <group title="Gateway Configuration">
261 <p>
262 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
263 OVN-managed part of a logical network and a physical VLAN, extending a
264 tunnel-based logical network into a physical network. Gateways are
265 typically dedicated nodes that do not host VMs and will be controlled
266 by <code>ovn-controller-vtep</code>.
267 </p>
268
269 <column name="vtep_logical_switches">
270 Stores all VTEP logical switch names connected by this gateway
271 chassis. The <ref table="Port_Binding"/> table entry with
272 <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
273 equal to <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and
274 <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
275 value in <ref table="Chassis"/>
276 <ref column="vtep_logical_switches" table="Chassis"/>, will be
277 associated with this <ref table="Chassis"/>.
278 </column>
279 </group>
280 </table>
281
282 <table name="Encap" title="Encapsulation Types">
283 <p>
284 The <ref column="encaps" table="Chassis"/> column in the <ref
285 table="Chassis"/> table refers to rows in this table to identify
286 how OVN may transmit logical dataplane packets to this chassis.
287 Each chassis, via <code>ovn-controller</code>(8) or
288 <code>ovn-controller-vtep</code>(8), adds and updates its own rows
289 and keeps a copy of the remaining rows to determine how to reach
290 other chassis.
291 </p>
292
293 <column name="type">
294 The encapsulation to use to transmit packets to this chassis.
295 Hypervisors must use either <code>geneve</code> or
296 <code>stt</code>. Gateways may use <code>vxlan</code>,
297 <code>geneve</code>, or <code>stt</code>.
298 </column>
299
300 <column name="options">
301 <p>
302 Options for configuring the encapsulation. Currently, the only
303 option that has been defined is <code>csum</code>.
304 </p>
305
306 <p>
307 <code>csum</code> indicates that encapsulation checksums can be
308 transmitted and received with reasonable performance. It is a hint
309 to senders transmitting data to this chassis that they should use
310 checksums to protect OVN metadata. Set to <code>true</code> to enable
311 or <code>false</code> to disable.
312 </p>
313
314 <p>
315 In terms of performance, this actually significantly increases
316 throughput in most common cases when running on Linux-based hosts
317 without NICs supporting encapsulation hardware offload (around 60% for
318 bulk traffic). The reason is that generally all NICs are capable of
319 offloading transmitted and received TCP/UDP checksums (viewed as
320 ordinary data packets and not as tunnels). The benefit comes on the
321 receive side where the validated outer checksum can be used to
322 additionally validate an inner checksum (such as TCP), which in turn
323 allows aggregation of packets to be more efficiently handled by the
324 rest of the stack.
325 </p>
326
327 <p>
328 Not all devices see such a benefit. The most notable exception is
329 hardware VTEPs. These devices are designed to not buffer entire
330 packets in their switching engines and are therefore unable to
331 efficiently compute or validate full packet checksums. In addition,
332 certain versions of the Linux kernel are not able to fully take
333 advantage of encapsulation NIC offloads in the presence of checksums.
334 (This is actually a pretty narrow corner case though - earlier
335 versions of Linux don't support encapsulation offloads at all and
336 later versions support both offloads and checksums well.)
337 </p>
338
339 <p>
340 <code>csum</code> defaults to <code>false</code> for hardware VTEPs and
341 <code>true</code> for all other cases.
342 </p>
343 </column>
344
345 <column name="ip">
346 The IPv4 address of the encapsulation tunnel endpoint.
347 </column>
348 </table>
349
350 <table name="Address_Set" title="Address Sets">
351 <p>
352 See the documentation for the <ref table="Address_Set"
353 db="OVN_Northbound"/> table in the <ref db="OVN_Northbound"/> database
354 for details.
355 </p>
356
357 <column name="name"/>
358 <column name="addresses"/>
359 </table>
360
361 <table name="Logical_Flow" title="Logical Network Flows">
362 <p>
363 Each row in this table represents one logical flow.
364 <code>ovn-northd</code> populates this table with logical flows
365 that implement the L2 and L3 topologies specified in the
366 <ref db="OVN_Northbound"/> database. Each hypervisor, via
367 <code>ovn-controller</code>, translates the logical flows into
368 OpenFlow flows specific to its hypervisor and installs them into
369 Open vSwitch.
370 </p>
371
372 <p>
373 Logical flows are expressed in an OVN-specific format, described here. A
374 logical datapath flow is much like an OpenFlow flow, except that the
375 flows are written in terms of logical ports and logical datapaths instead
376 of physical ports and physical datapaths. Translation between logical
377 and physical flows helps to ensure isolation between logical datapaths.
378 (The logical flow abstraction also allows the OVN centralized
379 components to do less work, since they do not have to separately
380 compute and push out physical flows to each chassis.)
381 </p>
382
383 <p>
384 The default action when no flow matches is to drop packets.
385 </p>
386
387 <p><em>Architectural Logical Life Cycle of a Packet</em></p>
388
389 <p>
390 The following description focuses on the life cycle of a packet through
391 a logical datapath, ignoring physical details of the implementation.
392 Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
393 <code>ovn-architecture</code>(7) for the physical information.
394 </p>
395
396 <p>
397 The description here is written as if OVN itself executes these steps,
398 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
399 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
400 </p>
401
402 <p>
403 At a high level, OVN passes each packet through the logical datapath's
404 logical ingress pipeline, which may output the packet to one or more
405 logical ports or logical multicast groups. For each such logical output
406 port, OVN passes the packet through the datapath's logical egress
407 pipeline, which may either drop the packet or deliver it to the
408 destination. Between the two pipelines, outputs to logical multicast
409 groups are expanded into logical ports, so that the egress pipeline only
410 processes a single logical output port at a time. Between the two
411 pipelines is also where, when necessary, OVN encapsulates a packet in a
412 tunnel (or tunnels) to transmit to remote hypervisors.
413 </p>
414
415 <p>
416 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
417 table for a row with correct <ref column="logical_datapath"/>, a <ref
418 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
419 of 0, and a <ref column="match"/> that is true for the packet. If none
420 is found, OVN drops the packet. If OVN finds more than one, it chooses
421 the match with the highest <ref column="priority"/>. Then OVN executes
422 each of the actions specified in the row's <ref column="actions"/> column,
423 in the order specified. Some actions, such as those to modify packet
424 headers, require no further details. The <code>next</code> and
425 <code>output</code> actions are special.
426 </p>
427
428 <p>
429 The <code>next</code> action causes the above process to be repeated
430 recursively, except that OVN searches for <ref column="table_id"/> of 1
431 instead of 0. Similarly, any <code>next</code> action in a row found in
432 that table would cause a further search for a <ref column="table_id"/> of
433 2, and so on. When recursive processing completes, flow control returns
434 to the action following <code>next</code>.
435 </p>
436
437 <p>
438 The <code>output</code> action also introduces recursion. Its effect
439 depends on the current value of the <code>outport</code> field. Suppose
440 <code>outport</code> designates a logical port. First, OVN compares
441 <code>inport</code> to <code>outport</code>; if they are equal, it treats
442 the <code>output</code> as a no-op by default. In the common
443 case, where they are different, the packet enters the egress
444 pipeline. This transition to the egress pipeline discards
445 register data, e.g. <code>reg0</code> ... <code>reg9</code> and
446 connection tracking state, to achieve uniform behavior regardless
447 of whether the egress pipeline is on a different hypervisor
448 (because registers aren't preserved across tunnel encapsulation).
449 </p>
450
451 <p>
452 To execute the egress pipeline, OVN again searches the <ref
453 table="Logical_Flow"/> table for a row with correct <ref
454 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
455 column="match"/> that is true for the packet, but now looking for a <ref
456 column="pipeline"/> of <code>egress</code>. If no matching row is found,
457 the output becomes a no-op. Otherwise, OVN executes the actions for the
458 matching flow (which is chosen from multiple, if necessary, as already
459 described).
460 </p>
461
462 <p>
463 In the <code>egress</code> pipeline, the <code>next</code> action acts as
464 already described, except that it, of course, searches for
465 <code>egress</code> flows. The <code>output</code> action, however, now
466 directly outputs the packet to the output port (which is now fixed,
467 because <code>outport</code> is read-only within the egress pipeline).
468 </p>
469
470 <p>
471 The description earlier assumed that <code>outport</code> referred to a
472 logical port. If it instead designates a logical multicast group, then
473 the description above still applies, with the addition of fan-out from
474 the logical multicast group to each logical port in the group. For each
475 member of the group, OVN executes the logical pipeline as described, with
476 the logical output port replaced by the group member.
477 </p>
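
<p>
The lookup procedure described above can be sketched as a toy simulator.
This is only an illustration under simplifying assumptions: it covers a
single logical datapath, represents each match as a Python callable and
each action as a toy value (<code>"next"</code>, <code>"output"</code>, or
a <code>("set", field, value)</code> tuple), and does not model multicast
groups or the discarding of registers. The names <code>Flow</code>,
<code>run_table</code>, and <code>trace_packet</code> are hypothetical and
are not part of OVN.
</p>

<pre><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Flow:
    pipeline: str                    # "ingress" or "egress"
    table_id: int
    priority: int
    match: Callable[[dict], bool]    # stands in for the match expression
    actions: List[object]            # "next", "output", or ("set", field, value)


def run_table(flows, pkt, pipeline, table_id, trace):
    """Search one table and execute the winning row's actions in order."""
    hits = [f for f in flows
            if f.pipeline == pipeline and f.table_id == table_id and f.match(pkt)]
    if not hits:
        # Ingress: the packet is dropped.  Egress: the output is a no-op.
        trace.append(f"{pipeline}/{table_id}: no match")
        return
    best = max(hits, key=lambda f: f.priority)   # highest priority wins
    for action in best.actions:
        if action == "next":
            # Recurse into the next table, then resume after "next".
            run_table(flows, pkt, pipeline, table_id + 1, trace)
        elif action == "output" and pipeline == "ingress":
            if pkt.get("inport") == pkt.get("outport"):
                trace.append("output: no-op (inport == outport)")
            else:
                run_table(flows, pkt, "egress", 0, trace)
        elif action == "output":
            trace.append(f"deliver to {pkt['outport']}")
        else:
            _, field, value = action             # ("set", field, value)
            pkt[field] = value


def trace_packet(flows, pkt):
    """Run a copy of pkt through the ingress pipeline and return the trace."""
    trace = []
    run_table(flows, dict(pkt), "ingress", 0, trace)
    return trace


flows = [
    Flow("ingress", 0, 50, lambda p: True, [("set", "outport", "lp2"), "next"]),
    Flow("ingress", 0, 10, lambda p: True, []),      # loses to priority 50
    Flow("ingress", 1, 0, lambda p: True, ["output"]),
    Flow("egress", 0, 0, lambda p: True, ["output"]),
]
```
]]></pre>

<p>
With these toy flows, a packet entering on the hypothetical port
<code>lp1</code> is delivered to <code>lp2</code>, while a packet entering
on <code>lp2</code> hits the no-op case because <code>inport</code> equals
<code>outport</code>. When two matching rows share a priority, the sketch
picks one arbitrarily.
</p>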
478
479 <p><em>Pipeline Stages</em></p>
480
481 <p>
482 <code>ovn-northd</code> populates the <ref table="Logical_Flow"/> table
483 with the logical flows described in detail in <code>ovn-northd</code>(8).
484 </p>
485
486 <column name="logical_datapath">
487 The logical datapath to which the logical flow belongs.
488 </column>
489
490 <column name="pipeline">
491 <p>
492 The primary flows used for deciding on a packet's destination are the
493 <code>ingress</code> flows. The <code>egress</code> flows implement
494 ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details.
495 </p>
496 </column>
497
498 <column name="table_id">
499 The stage in the logical pipeline, analogous to an OpenFlow table number.
500 </column>
501
502 <column name="priority">
503 The flow's priority. Flows with numerically higher priority take
504 precedence over those with lower. If two logical datapath flows with the
505 same priority both match, then the one actually applied to the packet is
506 undefined.
507 </column>
508
509 <column name="match">
510 <p>
511 A matching expression. OVN provides a superset of OpenFlow matching
512 capabilities, using a syntax similar to Boolean expressions in a
513 programming language.
514 </p>
515
516 <p>
517 The most important components of a match expression are
518 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
519 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
520 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
521 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
522 logical OR operator <code>||</code> can combine comparisons into a
523 larger expression.
524 </p>
525
526 <p>
527 Matching expressions also support parentheses for grouping, the logical
528 NOT prefix operator <code>!</code>, and literals <code>0</code> and
529 <code>1</code> to express ``false'' or ``true,'' respectively. The
530 latter is useful by itself as a catch-all expression that matches every
531 packet.
532 </p>
533
534 <p><em>Symbols</em></p>
535
536 <p>
537 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
538 type. Integer symbols have a <dfn>width</dfn> in bits.
539 </p>
540
541 <p>
542 <em>Kinds</em>. There are three kinds of symbols:
543 </p>
544
545 <ul>
546 <li>
547 <p>
548 <dfn>Fields</dfn>. A field symbol represents a packet header or
549 metadata field. For example, a field
550 named <code>vlan.tci</code> might represent the VLAN TCI field in a
551 packet.
552 </p>
553
554 <p>
555 A field symbol can have integer or string type. Integer fields can
556 be nominal or ordinal (see <em>Level of Measurement</em>,
557 below).
558 </p>
559 </li>
560
561 <li>
562 <p>
563 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
564 a larger field. For example, a field <code>vlan.vid</code> might
565 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
566 are provided for syntactic convenience, because it is always
567 possible to instead refer to a subset of bits from a field
568 directly.
569 </p>
570
571 <p>
572 Only ordinal fields (see <em>Level of Measurement</em>,
573 below) may have subfields. Subfields are always ordinal.
574 </p>
575 </li>
576
577 <li>
578 <p>
579 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
580 expression. Predicates may be used much like 1-bit fields. For
581 example, <code>ip4</code> might expand to <code>eth.type ==
582 0x800</code>. Predicates are provided for syntactic convenience,
583 because it is always possible to instead specify the underlying
584 expression directly.
585 </p>
586
587 <p>
588 A predicate whose expansion refers to any nominal field or
589 predicate (see <em>Level of Measurement</em>, below) is nominal;
590 other predicates have Boolean level of measurement.
591 </p>
592 </li>
593 </ul>
594
595 <p>
596 <em>Level of Measurement</em>. See
597 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
598 concept on which this classification is based. There are three
599 levels:
600 </p>
601
602 <ul>
603 <li>
604 <p>
605 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
606 on a scale. OVN considers a field (or subfield) to be ordinal if
607 its bits can be examined individually. This is true for the
608 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
609 </p>
610
611 <p>
612 Any use of an ordinal field may specify a single bit or a range of
613 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
614 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
615 multicast bit in the Ethernet destination address.
616 </p>
617
618 <p>
619 OVN supports all the usual arithmetic relations (<code>==</code>,
620 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
621 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
622 their subfields, because OVN can implement these in OpenFlow and
623 Open vSwitch as collections of bitwise tests.
624 </p>
625 </li>
626
627 <li>
628 <p>
629 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
630 usefully compared except for equality. OpenFlow port numbers,
631 Ethernet types, and IP protocols are examples: all of
632 these are just identifiers assigned arbitrarily with no deeper
633 meaning. In OpenFlow and Open vSwitch, bits in these fields
634 generally aren't individually addressable.
635 </p>
636
637 <p>
638 OVN only supports arithmetic tests for equality on nominal fields,
639 because OpenFlow and Open vSwitch provide no way for a flow to
640 efficiently implement other comparisons on them. (A test for
641 inequality can be sort of built out of two flows with different
642 priorities, but OVN matching expressions always generate flows with
643 a single priority.)
644 </p>
645
646 <p>
647 String fields are always nominal.
648 </p>
649 </li>
650
651 <li>
652 <p>
653 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
654 and 1, is somewhat exceptional, since it is easy to support both
655 equality and inequality tests on such a field: either one can be
656 implemented as a test for 0 or 1.
657 </p>
658
659 <p>
660 Only predicates (see above) have a Boolean level of measurement.
661 </p>
662
663 <p>
664 This isn't a standard level of measurement.
665 </p>
666 </li>
667 </ul>
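
<p>
The bit-range notation used for ordinal fields can be illustrated with
ordinary integer arithmetic. The sketch below assumes that bit 0 is the
least-significant bit, which is consistent with the
<code>vlan.tci[13..15]</code> and <code>eth.dst[40]</code> examples above;
the helper name <code>subfield</code> and the sample values are
hypothetical.
</p>

<pre><![CDATA[
```python
def subfield(value, low, high=None):
    """Return bits low..high (inclusive, bit 0 least significant) of value."""
    if high is None:
        high = low                    # single-bit form, e.g. eth.dst[40]
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


tci = (5 << 13) | 100                 # sample VLAN TCI: PCP 5, VID 100
eth_dst = 0x01005E000001              # 01:00:5e:00:00:01 as a 48-bit integer

vid = subfield(tci, 0, 11)            # vlan.tci[0..11]
pcp = subfield(tci, 13, 15)           # vlan.tci[13..15]
is_multicast = subfield(eth_dst, 40)  # eth.dst[40]
```
]]></pre>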
668
669 <p>
670 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
671 additional conditions implied by the use of the symbol. For example,
672 the <code>icmp4.type</code> symbol might have prerequisite
673 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
674 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
675 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
676 &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
677 <code>icmp4</code> is a predicate defined as suggested under
678 <em>Kinds</em> above).
679 </p>
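
<p>
Prerequisite and predicate expansion can be sketched as a small recursive
rewrite. The symbol definitions below are hypothetical and are chosen only
to mirror the example in the text; the sketch uses <code>ip.proto</code>,
the protocol symbol listed in this document, and handles only
conjunctions joined by <code>&amp;&amp;</code>.
</p>

<pre><![CDATA[
```python
# Hypothetical symbol definitions, chosen to mirror the examples in the text.
PREDICATES = {
    "ip4": "eth.type == 0x800",
    "icmp4": "ip4 && ip.proto == 1",
}
PREREQUISITES = {
    "icmp4.type": "icmp4",
}


def expand(term):
    """Expand one comparison or predicate into a list of primitive conjuncts."""
    if term in PREDICATES:
        # A bare predicate is replaced by the expansion of its definition.
        parts = PREDICATES[term].split(" && ")
        return [c for part in parts for c in expand(part)]
    conjuncts = [term]
    symbol = term.split()[0]
    if symbol in PREREQUISITES:
        # Using the symbol implies its prerequisite, expanded in turn.
        conjuncts += expand(PREREQUISITES[symbol])
    return conjuncts


expanded = " && ".join(expand("icmp4.type == 0"))
```
]]></pre>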
680
681 <p><em>Relational operators</em></p>
682
683 <p>
684 All of the standard relational operators <code>==</code>,
685 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
686 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
687 fields support only <code>==</code> and <code>!=</code>, and only in a
688 positive sense when outer <code>!</code> are taken into account,
689 e.g. given string field <code>inport</code>, <code>inport ==
690 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
691 not <code>inport != "eth0"</code>.
692 </p>
693
694 <p>
695 The implementation of <code>==</code> (or <code>!=</code> when it is
696 negated) is more efficient than that of the other relational
697 operators.
698 </p>
699
700 <p><em>Constants</em></p>
701
702 <p>
703 Integer constants may be expressed in decimal, hexadecimal prefixed by
704 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
705 their standard forms, or Ethernet addresses as colon-separated hex
706 digits. A constant in any of these forms may be followed by a slash
707 and a second constant (the mask) in the same form, to form a masked
708 constant. IPv4 and IPv6 masks may be given as integers, to express
709 CIDR prefixes.
710 </p>
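
<p>
As an illustration of masked constants, the sketch below assumes the usual
bitwise meaning: only the bits set in the mask are compared. It uses
Python's standard <code>ipaddress</code> module to show that a CIDR prefix
length and the equivalent dotted-quad mask describe the same test. The
function names are hypothetical, and <code>ipaddress</code> accepts only
contiguous masks, which is narrower than what a masked constant can
express.
</p>

<pre><![CDATA[
```python
import ipaddress


def masked_match(value, constant, mask):
    """True if value agrees with constant on every bit set in mask."""
    return (value & mask) == (constant & mask)


def ipv4_masked_match(addr, masked_constant):
    """Test an IPv4 address against 'constant/mask' or 'constant/prefix-length'."""
    net = ipaddress.ip_network(masked_constant, strict=False)
    return masked_match(int(ipaddress.ip_address(addr)),
                        int(net.network_address), int(net.netmask))
```
]]></pre>

<p>
For instance, both <code>192.168.0.0/16</code> and
<code>192.168.0.0/255.255.0.0</code> accept the address
<code>192.168.7.9</code> in this sketch.
</p>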
711
712 <p>
713 String constants have the same syntax as quoted strings in JSON (thus,
714 they are Unicode strings).
715 </p>
716
717 <p>
718 Some operators support sets of constants written inside curly braces
719 <code>{</code> ... <code>}</code>. Commas between elements of a set,
720 and after the last element, are optional. With <code>==</code>,
721 ``<code><var>field</var> == { <var>constant1</var>,
722 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
723 for ``<code><var>field</var> == <var>constant1</var> ||
724 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
725 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
726 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
727 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
728 <var>field</var> != <var>constant2</var> &amp;&amp;
729 </code>...<code></code>''.
730 </p>
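
<p>
The set notation can be illustrated by a small rewrite into explicit
comparisons. This is a sketch of the equivalence stated above, operating
on strings only; the function name <code>expand_set</code> is
hypothetical.
</p>

<pre><![CDATA[
```python
def expand_set(field, op, constants):
    """Rewrite 'field op {c1, c2, ...}' as explicit comparisons."""
    if op == "==":
        joiner = " || "      # equal to any member
    elif op == "!=":
        joiner = " && "      # different from every member
    else:
        raise ValueError("this sketch covers only == and !=")
    return joiner.join(f"{field} {op} {c}" for c in constants)
```
]]></pre>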
731
732 <p>
733 You may refer to a set of IPv4, IPv6, or MAC addresses stored in the
734 <ref table="Address_Set"/> table by its <ref column="name"
735 table="Address_Set"/>. An <ref table="Address_Set"/> with a name
736 of <code>set1</code> can be referred to as
737 <code>$set1</code>.
738 </p>
739
740 <p><em>Miscellaneous</em></p>
741
742 <p>
743 Comparisons may name the symbol or the constant first,
744 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
745 acceptable.
746 </p>
747
748 <p>
749 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
750 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
751 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
752 </p>
753
754 <p>
755 For a one-bit field or predicate, a mention of its name is equivalent
756 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
757 is equivalent to <code>vlan.present == 1</code>. The same is true for
758 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
759 technical limitation to implementing the same for ordinal fields of all
760 widths, but the implementation is expensive enough that the syntax
761 parser requires writing an explicit comparison against zero to make
762 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
763 against 0 is required.
764 </p>
765
766 <p>
767 <em>Operator precedence</em> is as shown below, from highest to lowest.
768 There are two exceptions where parentheses are required even though the
769 table would suggest that they are not: <code>&amp;&amp;</code> and
770 <code>||</code> require parentheses when used together, and
771 <code>!</code> requires parentheses when applied to a relational
772 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
773 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
774 parentheses are mandatory.
775 </p>
776
777 <ul>
778 <li><code>()</code></li>
779 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
780 <li><code>!</code></li>
781 <li><code>&amp;&amp; ||</code></li>
782 </ul>
783
784 <p>
785 <em>Comments</em> may be introduced by <code>//</code>, which extends
786 to the next new-line. Comments within a line may be bracketed by
787 <code>/*</code> and <code>*/</code>. Multiline comments are not
788 supported.
789 </p>
790
791 <p><em>Symbols</em></p>
792
793 <p>
794 Most of the symbols below have integer type. Only <code>inport</code>
795 and <code>outport</code> have string type. <code>inport</code> names a
796 logical port. Thus, its value is a <ref column="logical_port"/> name
797 from the <ref table="Port_Binding"/> table. <code>outport</code> may
798 name a logical port, as <code>inport</code>, or a logical multicast
799 group defined in the <ref table="Multicast_Group"/> table. For both
800 symbols, only names within the flow's logical datapath may be used.
801 </p>
802
803 <p>
804 The <code>reg</code><var>X</var> symbols are 32-bit integers.
805 The <code>xxreg</code><var>X</var> symbols are 128-bit integers,
806 which overlay four of the 32-bit registers: <code>xxreg0</code>
807 overlays <code>reg0</code> through <code>reg3</code>, with
808 <code>reg0</code> supplying the most-significant bits of
809 <code>xxreg0</code> and <code>reg3</code> the least-significant.
810 <code>xxreg1</code> similarly overlays <code>reg4</code> through
811 <code>reg7</code>.
812 </p>
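<p>
As a rough illustration of the overlay described above, the following
Python sketch combines four 32-bit values into one 128-bit value and
splits it again. The helper names <code>pack_xxreg</code> and
<code>unpack_xxreg</code> are hypothetical and exist only for this
example; they are not part of OVN. Multiplication and remainder stand in
for shift-and-mask operators only so that the listing needs no XML
escaping.
</p>

<pre>
```python
WORD = 2 ** 32  # number of distinct values in one 32-bit register


def pack_xxreg(regs):
    """Combine four 32-bit values into one 128-bit integer.

    regs[0] supplies the most-significant 32 bits and regs[3] the
    least-significant, mirroring how reg0..reg3 overlay xxreg0.
    """
    if len(regs) != 4:
        raise ValueError("expected exactly four 32-bit values")
    value = 0
    for reg in regs:
        # Shift the accumulated value up by one word, then append reg.
        value = value * WORD + reg % WORD
    return value


def unpack_xxreg(value):
    """Split a 128-bit integer back into four 32-bit values."""
    return [(value // WORD ** i) % WORD for i in (3, 2, 1, 0)]
```
</pre>

<p>
Under these assumptions, a value stored only in the first of the four
registers occupies the top 32 bits of the combined 128-bit value.
</p>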
813
814 <ul>
815 <li><code>reg0</code>...<code>reg9</code></li>
816 <li><code>xxreg0</code> <code>xxreg1</code></li>
817 <li><code>inport</code> <code>outport</code></li>
818 <li><code>flags.loopback</code></li>
819 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
820 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
821 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
822 <li><code>ip4.src</code> <code>ip4.dst</code></li>
823 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
824 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
825 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
826 <li><code>udp.src</code> <code>udp.dst</code></li>
827 <li><code>sctp.src</code> <code>sctp.dst</code></li>
828 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
829 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
830 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
831 <li><code>ct_mark</code> <code>ct_label</code></li>
832 <li>
833 <p>
834 <code>ct_state</code>, which has the following Boolean subfields:
835 </p>
836 <ul>
837 <li><code>ct.new</code>: True for a new flow</li>
838 <li><code>ct.est</code>: True for an established flow</li>
839 <li><code>ct.rel</code>: True for a related flow</li>
840 <li><code>ct.rpl</code>: True for a reply flow</li>
841 <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
842 </ul>
843 <p>
844 The above subfields of <code>ct_state</code> are initialized by
845 the <code>ct_next</code> action, described later.
846 </p>
847 <ul>
848 <li>
849 <code>ct.dnat</code>: True for a packet whose destination IP
850 address has been changed.
851 </li>
852 <li>
853 <code>ct.snat</code>: True for a packet whose source IP
854 address has been changed.
855 </li>
856 </ul>
857 <p>
858 The above subfields of <code>ct_state</code> are initialized by
859 actions such as <code>ct_dnat</code>, <code>ct_snat</code>, and
860 <code>ct_lb</code>, described later.
861 </p>
862 </li>
863 </ul>
864
865 <p>
866 The following predicates are supported:
867 </p>
868
869 <ul>
870 <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
871 <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
872 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
873 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
874 <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
875 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
876 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
877 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
878 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
879 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
880 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
881 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
882 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
883 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
884 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
885 <li><code>nd_ns</code> expands to <code>icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
886 <li><code>nd_na</code> expands to <code>icmp6.type == 136 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
887 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
888 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
889 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
890 </ul>
891 </column>
892
893 <column name="actions">
894 <p>
895 Logical datapath actions, to be executed when the logical flow
896 represented by this row is the highest-priority match.
897 </p>
898
899 <p>
900 Actions share lexical syntax with the <ref column="match"/> column. An
901 empty set of actions (or one that contains just white space or
902 comments), or a set of actions that consists of just
903 <code>drop;</code>, causes the matched packets to be dropped.
904 Otherwise, the column should contain a sequence of actions, each
905 terminated by a semicolon.
906 </p>
907
908 <p>
909 The following actions are defined:
910 </p>
911
912 <dl>
913 <dt><code>output;</code></dt>
914 <dd>
915 <p>
916 In the ingress pipeline, this action executes the
917 <code>egress</code> pipeline as a subroutine. If
918 <code>outport</code> names a logical port, the egress pipeline
919 executes once; if it is a multicast group, the egress pipeline runs
920 once for each logical port in the group.
921 </p>
922
923 <p>
924 In the egress pipeline, this action performs the actual
925 output to the <code>outport</code> logical port. (In the egress
926 pipeline, <code>outport</code> never names a multicast group.)
927 </p>
928
929 <p>
930 By default, output to the input port is implicitly dropped,
931 that is, <code>output</code> becomes a no-op if
932 <code>outport</code> == <code>inport</code>. Occasionally
933 it may be useful to override this behavior, e.g. to send an
934 ARP reply to an ARP request; to do so, use
935 <code>flags.loopback = 1</code> to allow the packet to
936 "hair-pin" back to the input port.
937 </p>
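<p>
The behavior described above can be sketched in Python as follows. This
is only an illustration: the function name <code>output_targets</code>
and the <code>groups</code> mapping are hypothetical, and applying the
input-port check to each member of a multicast group is an assumption of
the sketch rather than something stated here.
</p>

<pre>
```python
def output_targets(outport, inport, groups, loopback=False):
    """Return the logical ports that would receive the packet.

    groups maps a multicast group name to the list of its logical
    ports (a hypothetical stand-in for the Multicast_Group table).
    """
    # A multicast group expands to its member ports; a plain logical
    # port stands for itself.
    ports = groups.get(outport, [outport])
    # Output back to the input port is dropped unless loopback is set.
    return [port for port in ports if loopback or port != inport]
```
</pre>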
938 </dd>
939
940 <dt><code>next;</code></dt>
941 <dt><code>next(<var>table</var>);</code></dt>
942 <dd>
943 Executes another logical datapath table as a subroutine. By default,
944 the table after the current one is executed. Specify
945 <var>table</var> to jump to a specific table in the same pipeline.
946 </dd>
947
948 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
949 <dd>
950 <p>
951 Sets data or metadata field <var>field</var> to constant value
952 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
953 logical output port. To set only a subset of bits in a field,
954 specify a subfield for <var>field</var> or a masked
955 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
956 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
957 the VLAN PCP.
958 </p>
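<p>
A masked constant such as <code>4/4</code> can be read as "take the bits
selected by the mask from the value and keep all other bits". The Python
sketch below shows that reading for a field of a given width. The name
<code>masked_assign</code> is hypothetical, and plain arithmetic is used
in place of bitwise operators only so that the listing needs no XML
escaping.
</p>

<pre>
```python
def masked_assign(old, value, mask, width):
    """Return old with the bits selected by mask replaced from value."""
    result = 0
    for bit in range(width):
        weight = 2 ** bit
        # Take this bit from value where the mask bit is 1, else keep old.
        source = value if (mask // weight) % 2 else old
        result += ((source // weight) % 2) * weight
    return result
```
</pre>

<p>
For the 3-bit VLAN PCP, <code>masked_assign(pcp, 4, 4, 3)</code> sets
only the most significant bit and leaves the two lower bits as they were.
</p>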
959
960 <p>
961 Assigning to a field with prerequisites implicitly adds those
962 prerequisites to <ref column="match"/>; thus, for example, a flow
963 that sets <code>tcp.dst</code> applies only to TCP flows,
964 regardless of whether its <ref column="match"/> mentions any TCP
965 field.
966 </p>
967
968 <p>
969 Not all fields are modifiable (e.g. <code>eth.type</code> and
970 <code>ip.proto</code> are read-only), and not all modifiable fields
971 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
972 as a whole). The <code>outport</code> field is modifiable in the
973 <code>ingress</code> pipeline but not in the <code>egress</code>
974 pipeline.
975 </p>
976 </dd>
977
978 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
979 <dd>
980 <p>
981 Sets data or metadata field <var>field1</var> to the value of data
982 or metadata field <var>field2</var>, e.g. <code>reg0 =
983 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
984 To modify only a subset of a field's bits, specify a subfield for
985 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
986 = reg0[0..2];</code> copies the least-significant bits of
987 <code>reg0</code> into the VLAN PCP.
988 </p>
989
990 <p>
991 <var>field1</var> and <var>field2</var> must be the same type,
992 either both string or both integer fields. If they are both
993 integer fields, they must have the same width.
994 </p>
995
996 <p>
997 If <var>field1</var> or <var>field2</var> has prerequisites, they
998 are added implicitly to <ref column="match"/>. It is possible to
999 write an assignment with contradictory prerequisites, such as
1000 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
1001 that a logical flow with such an assignment will never be matched.
1002 </p>
1003 </dd>
1004
1005 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
1006 <dd>
1007 <p>
1008 Similar to <code><var>field1</var> = <var>field2</var>;</code>
1009 except that the two values are exchanged instead of copied. Both
1010 <var>field1</var> and <var>field2</var> must be modifiable.
1011 </p>
1012 </dd>
1013
1014 <dt><code>ip.ttl--;</code></dt>
1015 <dd>
1016 <p>
1017 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
1018 or negative, then processing of the packet halts; no further
1019 actions are processed. (To properly handle such cases, a
1020 higher-priority flow should match on
1021 <code>ip.ttl == {0, 1};</code>.)
1022 </p>
1023
1024 <p><b>Prerequisite:</b> <code>ip</code></p>
1025 </dd>
1026
1027 <dt><code>ct_next;</code></dt>
1028 <dd>
1029 <p>
1030 Apply connection tracking to the flow, initializing
1031 <code>ct_state</code> for matching in later tables.
1032 Automatically moves on to the next table, as if followed by
1033 <code>next</code>.
1034 </p>
1035
1036 <p>
1037 As a side effect, IP fragments will be reassembled for matching.
1038 If a fragmented packet is output, then it will be sent with any
1039 overlapping fragments squashed. The connection tracking state is
1040 scoped by the logical port, so overlapping addresses may be used.
1041 To allow traffic related to the matched flow, execute
1042 <code>ct_commit</code>.
1043 </p>
1044
1045 <p>
1046 It is possible to have actions follow <code>ct_next</code>,
1047 but they will not have access to any of its side effects, so doing
1048 so is not generally useful.
1049 </p>
1050 </dd>
1051
1052 <dt><code>ct_commit;</code></dt>
1053 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>);</code></dt>
1054 <dt><code>ct_commit(ct_label=<var>value[/mask]</var>);</code></dt>
1055 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>, ct_label=<var>value[/mask]</var>);</code></dt>
1056 <dd>
1057 <p>
1058 Commit the flow to the connection tracking entry associated with it
1059 by a previous call to <code>ct_next</code>. When
1060 <code>ct_mark=<var>value[/mask]</var></code> and/or
1061 <code>ct_label=<var>value[/mask]</var></code> are supplied,
1062 <code>ct_mark</code> and/or <code>ct_label</code> will be set to the
1063 values indicated by <var>value[/mask]</var> on the connection
1064 tracking entry. <code>ct_mark</code> is a 32-bit field.
1065 <code>ct_label</code> is a 128-bit field. The <var>value[/mask]</var>
1066 should be specified as a hex string if more than 64 bits are to be used.
1067 </p>
1068
1069 <p>
1070 Note that if you want processing to continue in the next table,
1071 you must execute the <code>next</code> action after
1072 <code>ct_commit</code>. You may also leave out <code>next</code>,
1073 which will commit connection tracking state and then drop the
1074 packet. This could be useful for setting <code>ct_mark</code>
1075 on a connection tracking entry before dropping a packet,
1076 for example.
1077 </p>
1078 </dd>
1079
1080 <dt><code>ct_dnat;</code></dt>
1081 <dt><code>ct_dnat(<var>IP</var>);</code></dt>
1082 <dd>
1083 <p>
1084 <code>ct_dnat</code> sends the packet through the DNAT zone in the
1085 connection tracking table to unDNAT any packet that was DNATed in
1086 the opposite direction. The packet is then automatically sent to
1087 the next tables as if followed by a <code>next;</code> action.
1088 The next tables will see the changes in the packet caused by
1089 the connection tracker.
1090 </p>
1091 <p>
1092 <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
1093 DNAT zone to change the destination IP address of the packet to
1094 the one provided inside the parentheses and commits the connection.
1095 The packet is then automatically sent to the next tables as if
1096 followed by a <code>next;</code> action. The next tables will see
1097 the changes in the packet caused by the connection tracker.
1098 </p>
1099 </dd>
1100
1101 <dt><code>ct_snat;</code></dt>
1102 <dt><code>ct_snat(<var>IP</var>);</code></dt>
1103 <dd>
1104 <p>
1105 <code>ct_snat</code> sends the packet through the SNAT zone to
1106 unSNAT any packet that was SNATed in the opposite direction. If
1107 the packet needs to be sent to the next tables, then it should be
1108 followed by a <code>next;</code> action. The next tables will not
1109 see the changes in the packet caused by the connection tracker.
1110 </p>
1111 <p>
1112 <code>ct_snat(<var>IP</var>)</code> sends the packet through the
1113 SNAT zone to change the source IP address of the packet to
1114 the one provided inside the parentheses and commits the connection.
1115 The packet is then automatically sent to the next tables as if
1116 followed by a <code>next;</code> action. The next tables will see the
1117 changes in the packet caused by the connection tracker.
1118 </p>
1119 </dd>
1120
1121 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
1122 <dd>
1123 <p>
1124 Temporarily replaces the IPv4 packet being processed by an ARP
1125 packet and executes each nested <var>action</var> on the ARP
1126 packet. Actions following the <code>arp</code> action, if any, apply
1127 to the original, unmodified packet.
1128 </p>
1129
1130 <p>
1131 The ARP packet that this action operates on is initialized based on
1132 the IPv4 packet being processed, as follows. These are default
1133 values that the nested actions will probably want to change:
1134 </p>
1135
1136 <ul>
1137 <li><code>eth.src</code> unchanged</li>
1138 <li><code>eth.dst</code> unchanged</li>
1139 <li><code>eth.type = 0x0806</code></li>
1140 <li><code>arp.op = 1</code> (ARP request)</li>
1141 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
1142 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
1143 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
1144 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
1145 </ul>
1146
1147 <p>
1148 The ARP packet has the same VLAN header, if any, as the IP packet
1149 it replaces.
1150 </p>
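<p>
The default field values listed above can be written as a small Python
sketch. The function name <code>arp_from_ip4</code> and the use of a
dictionary keyed by field name are assumptions made for illustration
only. VLAN handling is not modeled.
</p>

<pre>
```python
def arp_from_ip4(pkt):
    """Build the default ARP request fields from an IPv4 packet's fields."""
    return {
        "eth.src": pkt["eth.src"],       # unchanged
        "eth.dst": pkt["eth.dst"],       # unchanged
        "eth.type": 0x0806,
        "arp.op": 1,                     # ARP request
        "arp.sha": pkt["eth.src"],
        "arp.spa": pkt["ip4.src"],
        "arp.tha": "00:00:00:00:00:00",
        "arp.tpa": pkt["ip4.dst"],
    }
```
</pre>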
1151
1152 <p><b>Prerequisite:</b> <code>ip4</code></p>
1153 </dd>
1154
1155 <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt>
1156
1157 <dd>
1158 <p>
1159 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1160 IP address field <var>A</var>.
1161 </p>
1162
1163 <p>
1164 Looks up <var>A</var> in <var>P</var>'s mac binding table.
1165 If an entry is found, stores its Ethernet address in
1166 <code>eth.dst</code>, otherwise stores
1167 <code>00:00:00:00:00:00</code> in <code>eth.dst</code>.
1168 </p>
1169
1170 <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p>
1171 </dd>
1172
1173 <dt>
1174 <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code>
1175 </dt>
1176
1177 <dd>
1178 <p>
1179 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1180 IP address field <var>A</var>, 48-bit Ethernet address field
1181 <var>E</var>.
1182 </p>
1183
1184 <p>
1185 Adds or updates the entry for IP address <var>A</var> in
1186 logical port <var>P</var>'s mac binding table, setting its
1187 Ethernet address to <var>E</var>.
1188 </p>
1189
1190 <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p>
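<p>
Taken together, <code>get_arp</code> and <code>put_arp</code> behave
like a lookup table keyed by logical port and IP address. The Python
sketch below models only that behavior; the class name
<code>MacBindings</code> and its methods are hypothetical and do not
correspond to any OVN interface.
</p>

<pre>
```python
UNKNOWN_MAC = "00:00:00:00:00:00"


class MacBindings:
    """Toy model of a MAC binding table keyed by (port, IP address)."""

    def __init__(self):
        self._entries = {}

    def put(self, port, ip, mac):
        # Adds a new entry or updates the existing one.
        self._entries[(port, ip)] = mac

    def get(self, port, ip):
        # A miss yields the all-zeros Ethernet address.
        return self._entries.get((port, ip), UNKNOWN_MAC)
```
</pre>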
1191 </dd>
1192
1193 <dt>
1194 <code>nd_na { <var>action</var>; </code>...<code> };</code>
1195 </dt>
1196
1197 <dd>
1198 <p>
1199 Temporarily replaces the IPv6 neighbor solicitation packet
1200 being processed by an IPv6 neighbor advertisement (NA)
1201 packet and executes each nested <var>action</var> on the NA
1202 packet. Actions following the <code>nd_na</code> action,
1203 if any, apply to the original, unmodified packet.
1204 </p>
1205
1206 <p>
1207 The NA packet that this action operates on is initialized based on
1208 the IPv6 packet being processed, as follows. These are default
1209 values that the nested actions will probably want to change:
1210 </p>
1211
1212 <ul>
1213 <li><code>eth.dst</code> exchanged with <code>eth.src</code></li>
1214 <li><code>eth.type = 0x86dd</code></li>
1215 <li><code>ip6.dst</code> copied from <code>ip6.src</code></li>
1216 <li><code>ip6.src</code> copied from <code>nd.target</code></li>
1217 <li><code>icmp6.type = 136</code> (Neighbor Advertisement)</li>
1218 <li><code>nd.target</code> unchanged</li>
1219 <li><code>nd.sll = 00:00:00:00:00:00</code></li>
1220 <li><code>nd.tll</code> copied from <code>eth.dst</code></li>
1221 </ul>
1222
1223 <p>
1224 The ND packet has the same VLAN header, if any, as the IPv6 packet
1225 it replaces.
1226 </p>
1227
1228 <p>
1229 <b>Prerequisite:</b> <code>nd_ns</code>
1230 </p>
1231 </dd>
1232
1233 <dt><code>get_nd(<var>P</var>, <var>A</var>);</code></dt>
1234
1235 <dd>
1236 <p>
1237 <b>Parameters</b>: logical port string field <var>P</var>, 128-bit
1238 IPv6 address field <var>A</var>.
1239 </p>
1240
1241 <p>
1242 Looks up <var>A</var> in <var>P</var>'s mac binding table.
1243 If an entry is found, stores its Ethernet address in
1244 <code>eth.dst</code>, otherwise stores
1245 <code>00:00:00:00:00:00</code> in <code>eth.dst</code>.
1246 </p>
1247
1248 <p><b>Example:</b> <code>get_nd(outport, ip6.dst);</code></p>
1249 </dd>
1250
1251 <dt>
1252 <code>put_nd(<var>P</var>, <var>A</var>, <var>E</var>);</code>
1253 </dt>
1254
1255 <dd>
1256 <p>
1257 <b>Parameters</b>: logical port string field <var>P</var>,
1258 128-bit IPv6 address field <var>A</var>, 48-bit Ethernet
1259 address field <var>E</var>.
1260 </p>
1261
1262 <p>
1263 Adds or updates the entry for IPv6 address <var>A</var> in
1264 logical port <var>P</var>'s mac binding table, setting its
1265 Ethernet address to <var>E</var>.
1266 </p>
1267
1268 <p><b>Example:</b> <code>put_nd(inport, nd.target, nd.tll);</code></p>
1269 </dd>
1270
1271 <dt>
1272 <code><var>R</var> = put_dhcp_opts(<var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code>
1273 </dt>
1274
1275 <dd>
1276 <p>
1277 <b>Parameters</b>: one or more DHCP option/value pairs, which must
1278 include an <code>offerip</code> option (with code 0).
1279 </p>
1280
1281 <p>
1282 <b>Result</b>: stored to a 1-bit subfield <var>R</var>.
1283 </p>
1284
1285 <p>
1286 Valid only in the ingress pipeline.
1287 </p>
1288
1289 <p>
1290 When this action is applied to a DHCP request packet (DHCPDISCOVER
1291 or DHCPREQUEST), it changes the packet into a DHCP reply (DHCPOFFER
1292 or DHCPACK, respectively), replaces the options by those specified
1293 as parameters, and stores 1 in <var>R</var>.
1294 </p>
1295
1296 <p>
1297 When this action is applied to a non-DHCP packet or a DHCP packet
1298 that is not DHCPDISCOVER or DHCPREQUEST, it leaves the packet
1299 unchanged and stores 0 in <var>R</var>.
1300 </p>
1301
1302 <p>
1303 The contents of the <ref table="DHCP_Option"/> table control the
1304 DHCP option names and values that this action supports.
1305 </p>
1306
1307 <p>
1308 <b>Example:</b>
1309 <code>
1310 reg0[0] = put_dhcp_opts(offerip = 10.0.0.2, router = 10.0.0.1,
1311 netmask = 255.255.255.0, dns_server = {8.8.8.8, 7.7.7.7});
1312 </code>
1313 </p>
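<p>
The request-to-reply rules above can be summarized in a short Python
sketch. The packet representation (a dictionary with
<code>dhcp_type</code> and <code>options</code> keys) is an assumption
made for illustration, and the function models only the documented
behavior, not how <code>ovn-controller</code> builds the reply.
</p>

<pre>
```python
REPLY_FOR = {"DHCPDISCOVER": "DHCPOFFER", "DHCPREQUEST": "DHCPACK"}


def put_dhcp_opts(packet, options):
    """Return (packet, R) following the rules described above."""
    if "offerip" not in options:
        raise ValueError("the offerip option is required")
    reply_type = REPLY_FOR.get(packet.get("dhcp_type"))
    if reply_type is None:
        # Not a DHCPDISCOVER or DHCPREQUEST: leave the packet unchanged.
        return packet, 0
    # Turn the request into a reply and replace its options wholesale.
    reply = dict(packet, dhcp_type=reply_type, options=dict(options))
    return reply, 1
```
</pre>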
1314 </dd>
1315
1316 <dt>
1317 <code><var>R</var> = put_dhcpv6_opts(<var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code>
1318 </dt>
1319
1320 <dd>
1321 <p>
1322 <b>Parameters</b>: one or more DHCPv6 option/value pairs.
1323 </p>
1324
1325 <p>
1326 <b>Result</b>: stored to a 1-bit subfield <var>R</var>.
1327 </p>
1328
1329 <p>
1330 Valid only in the ingress pipeline.
1331 </p>
1332
1333 <p>
1334 When this action is applied to a DHCPv6 request packet, it changes
1335 the packet into a DHCPv6 reply, replaces the options by those
1336 specified as parameters, and stores 1 in <var>R</var>.
1337 </p>
1338
1339 <p>
1340 When this action is applied to a non-DHCPv6 packet or an invalid
1341 DHCPv6 request packet, it leaves the packet unchanged and stores
1342 0 in <var>R</var>.
1343 </p>
1344
1345 <p>
1346 The contents of the <ref table="DHCPv6_Options"/> table control the
1347 DHCPv6 option names and values that this action supports.
1348 </p>
1349
1350 <p>
1351 <b>Example:</b>
1352 <code>
1353 reg0[3] = put_dhcpv6_opts(ia_addr = aef0::4, server_id = 00:00:00:00:10:02,
1354 dns_server={ae70::1,ae70::2});
1355 </code>
1356 </p>
1357 </dd>
1358
1359 <dt>
1360 <code>set_queue(<var>queue_number</var>);</code>
1361 </dt>
1362
1363 <dd>
1364 <p>
1365 <b>Parameters</b>: Queue number <var>queue_number</var>, in the range 0 to 61440.
1366 </p>
1367
1368 <p>
1369 This is a logical equivalent of the OpenFlow <code>set_queue</code>
1370 action. It affects packets that egress a hypervisor through a
1371 physical interface. For nonzero <var>queue_number</var>, it
1372 configures packet queuing to match the settings configured for the
1373 <ref table="Port_Binding"/> with
1374 <code>options:qdisc_queue_id</code> matching
1375 <var>queue_number</var>. When <var>queue_number</var> is zero, it
1376 resets queuing to the default strategy.
1377 </p>
1378
1379 <p><b>Example:</b> <code>set_queue(10);</code></p>
1380 </dd>
1381
1382 <dt><code>ct_lb;</code></dt>
1383 <dt><code>ct_lb(</code><var>ip</var>[<code>:</code><var>port</var>]...<code>);</code></dt>
1384 <dd>
1385 <p>
1386 With one or more arguments, <code>ct_lb</code> commits the packet
1387 to the connection tracking table and DNATs the packet's destination
1388 IP address (and port) to the IP address or addresses (and optional
1389 ports) specified in the string. If multiple comma-separated IP
1390 addresses are specified, each is given equal weight for picking the
1391 DNAT address. Processing automatically moves on to the next table,
1392 as if <code>next;</code> were specified, and later tables act on
1393 the packet as modified by the connection tracker. Connection
1394 tracking state is scoped by the logical port when the action is
1395 used in a flow for a logical switch, so overlapping
1396 addresses may be used. Connection tracking state is scoped by the
1397 logical topology when the action is used in a flow for a router.
1398 </p>
1399 <p>
1400 Without arguments, <code>ct_lb</code> sends the packet to the
1401 connection tracking table to NAT the packets. If the packet is
1402 part of an established connection that was previously committed to
1403 the connection tracker via <code>ct_lb(</code>...<code>)</code>, it
1404 will automatically get DNATed to the same IP address as the first
1405 packet in that connection.
1406 </p>
1407 </dd>
1408 </dl>
1409
1410 <p>
1411 The following actions will likely be useful later, but they have not
1412 been thought out carefully.
1413 </p>
1414
1415 <dl>
1416 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
1417 <dd>
1418 <p>
1419 Temporarily replaces the IPv4 packet being processed by an ICMPv4
1420 packet and executes each nested <var>action</var> on the ICMPv4
1421 packet. Actions following the <code>icmp4</code> action, if any,
1422 apply to the original, unmodified packet.
1423 </p>
1424
1425 <p>
1426 The ICMPv4 packet that this action operates on is initialized based
1427 on the IPv4 packet being processed, as follows. These are default
1428 values that the nested actions will probably want to change.
1429 Ethernet and IPv4 fields not listed here are not changed:
1430 </p>
1431
1432 <ul>
1433 <li><code>ip.proto = 1</code> (ICMPv4)</li>
1434 <li><code>ip.frag = 0</code> (not a fragment)</li>
1435 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
1436 <li><code>icmp4.code = 1</code> (host unreachable)</li>
1437 </ul>
1438
1439 <p>
1440 Details TBD.
1441 </p>
1442
1443 <p><b>Prerequisite:</b> <code>ip4</code></p>
1444 </dd>
1445
1446 <dt><code>tcp_reset;</code></dt>
1447 <dd>
1448 <p>
1449 This action transforms the current TCP packet according to the
1450 following pseudocode:
1451 </p>
1452
1453 <pre>
1454 if (tcp.ack) {
1455 tcp.seq = tcp.ack;
1456 } else {
1457 tcp.ack = tcp.seq + length(tcp.payload);
1458 tcp.seq = 0;
1459 }
1460 tcp.flags = RST;
1461 </pre>
1462
1463 <p>
1464 Then, the action drops all TCP options and payload data, and
1465 updates the TCP checksum.
1466 </p>
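<p>
For readers who prefer something executable, here is a Python rendering
of the pseudocode above. It rests on several assumptions that this
document does not settle: that <code>if (tcp.ack)</code> tests the ACK
flag, that sequence arithmetic wraps modulo 2**32, and that the flag
values are the standard TCP bits. TCP options and the checksum are not
modeled. The names <code>TcpSegment</code> and <code>tcp_reset</code>
are hypothetical.
</p>

<pre>
```python
from dataclasses import dataclass, replace

ACK = 0x10  # TCP ACK flag bit
RST = 0x04  # TCP RST flag bit
SEQ_SPACE = 2 ** 32  # assumed wrap-around for sequence numbers


@dataclass(frozen=True)
class TcpSegment:
    seq: int
    ack: int
    flags: int
    payload: bytes = b""


def tcp_reset(seg):
    """Return the reset segment the pseudocode would produce for seg."""
    # Assumption: "if (tcp.ack)" means "the ACK flag is set".
    ack_flag_set = (seg.flags // ACK) % 2 == 1
    if ack_flag_set:
        seq, ack = seg.ack, seg.ack
    else:
        seq, ack = 0, (seg.seq + len(seg.payload)) % SEQ_SPACE
    # The payload is dropped and only RST remains set.
    return replace(seg, seq=seq, ack=ack, flags=RST, payload=b"")
```
</pre>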
1467
1468 <p>
1469 Details TBD.
1470 </p>
1471
1472 <p><b>Prerequisite:</b> <code>tcp</code></p>
1473 </dd>
1474 </dl>
1475 </column>
1476
1477 <column name="external_ids" key="stage-name">
1478 Human-readable name for this flow's stage in the pipeline.
1479 </column>
1480
1481 <column name="external_ids" key="source">
1482 Source file and line number of the code that added this flow to the
1483 pipeline.
1484 </column>
1485
1486 <group title="Common Columns">
1487 The overall purpose of these columns is described under <code>Common
1488 Columns</code> at the beginning of this document.
1489
1490 <column name="external_ids"/>
1491 </group>
1492 </table>
1493
1494 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1495 <p>
1496 The rows in this table define multicast groups of logical ports.
1497 Multicast groups allow a single packet transmitted over a tunnel to a
1498 hypervisor to be delivered to multiple VMs on that hypervisor, which
1499 uses bandwidth more efficiently.
1500 </p>
1501
1502 <p>
1503 Each row in this table defines a logical multicast group numbered <ref
1504 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1505 ports are listed in the <ref column="ports"/> column.
1506 </p>
1507
1508 <column name="datapath">
1509 The logical datapath in which the multicast group resides.
1510 </column>
1511
1512 <column name="tunnel_key">
1513 The value used to designate this logical egress port in tunnel
1514 encapsulations. An index forces the key to be unique within the <ref
1515 column="datapath"/>. The unusual range ensures that multicast group IDs
1516 do not overlap with logical port IDs.
1517 </column>
1518
1519 <column name="name">
1520 <p>
1521 The logical multicast group's name. An index forces the name to be
1522 unique within the <ref column="datapath"/>. Logical flows in the
1523 ingress pipeline may output to the group just as for individual logical
1524 ports, by assigning the group's name to <code>outport</code> and
1525 executing an <code>output</code> action.
1526 </p>
1527
1528 <p>
1529 Multicast group names and logical port names share a single namespace
1530 and thus should not overlap (but the database schema cannot enforce
1531 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1532 that begin with <code>_MC_</code>.
1533 </p>
1534 </column>
1535
1536 <column name="ports">
1537 The logical ports included in the multicast group. All of these ports
1538 must be in the <ref column="datapath"/> logical datapath (but the
1539 database schema cannot enforce this).
1540 </column>
1541 </table>
1542
1543 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1544 <p>
1545 Each row in this table identifies physical bindings of a logical
1546 datapath. A logical datapath implements a logical pipeline among the
1547 ports in the <ref table="Port_Binding"/> table associated with it. In
1548 practice, the pipeline in a given logical datapath implements either a
1549 logical switch or a logical router.
1550 </p>
1551
1552 <column name="tunnel_key">
1553 The tunnel key value to which the logical datapath is bound.
1554 The <code>Tunnel Encapsulation</code> section in
1555 <code>ovn-architecture</code>(7) describes how tunnel keys are
1556 constructed for each supported encapsulation.
1557 </column>
1558
1559 <group title="OVN_Northbound Relationship">
1560 <p>
1561 Each row in <ref table="Datapath_Binding"/> is associated with some
1562 logical datapath. <code>ovn-northd</code> uses these keys to track the
1563 association of a logical datapath with concepts in the <ref
1564 db="OVN_Northbound"/> database.
1565 </p>
1566
1567 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1568 For a logical datapath that represents a logical switch,
1569 <code>ovn-northd</code> stores in this key the UUID of the
1570 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1571 the <ref db="OVN_Northbound"/> database.
1572 </column>
1573
1574 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1575 For a logical datapath that represents a logical router,
1576 <code>ovn-northd</code> stores in this key the UUID of the
1577 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1578 the <ref db="OVN_Northbound"/> database.
1579 </column>
1580
1581 <column name="external_ids" key="name">
1582 <code>ovn-northd</code> copies this from the <ref
1583 table="Logical_Router" db="OVN_Northbound"/> or <ref
1584 table="Logical_Switch" db="OVN_Northbound"/> table in the <ref
1585 db="OVN_Northbound"/> database, when that column is nonempty.
1586 </column>
1587 </group>
1588
1589 <group title="Common Columns">
1590 The overall purpose of these columns is described under <code>Common
1591 Columns</code> at the beginning of this document.
1592
1593 <column name="external_ids"/>
1594 </group>
1595 </table>
1596
1597 <table name="Port_Binding" title="Physical-Logical Port Bindings">
1598 <p>
1599 Most rows in this table identify the physical location of a logical port.
1600 (The exceptions are logical patch ports, which do not have any physical
1601 location.)
1602 </p>
1603
1604 <p>
1605 For every <code>Logical_Switch_Port</code> record in
1606 <code>OVN_Northbound</code> database, <code>ovn-northd</code>
1607 creates a record in this table. <code>ovn-northd</code> populates
1608 and maintains every column except the <code>chassis</code> column,
1609 which it leaves empty in new records.
1610 </p>
1611
1612 <p>
1613 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1614 populates the <code>chassis</code> column for the records that
1615 identify the logical ports that are located on its hypervisor/gateway,
1616 which <code>ovn-controller</code>/<code>ovn-controller-vtep</code> in
1617 turn finds out by monitoring the local hypervisor's Open_vSwitch
1618 database, which identifies logical ports via the conventions described
1619 in <code>IntegrationGuide.rst</code>. (The exceptions are for
1620 <code>Port_Binding</code> records with <code>type</code> of
1621 <code>l3gateway</code>, whose locations are identified by
1622 <code>ovn-northd</code> via the <code>options:l3gateway-chassis</code>
1623 column in this table. <code>ovn-controller</code> is still responsible
1624 for populating the <code>chassis</code> column.)
1625 </p>
1626
1627 <p>
1628 When a chassis shuts down gracefully, it should clean up the
1629 <code>chassis</code> column that it previously had populated.
1630 (This is not critical because resources hosted on the chassis are equally
1631 unreachable regardless of whether their rows are present.) To handle the
1632 case where a VM is shut down abruptly on one chassis, then brought up
1633 again on a different one,
1634 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1635 overwrite the <code>chassis</code> column with new information.
1636 </p>
1637
1638 <group title="Core Features">
1639 <column name="datapath">
1640 The logical datapath to which the logical port belongs.
1641 </column>
1642
1643 <column name="logical_port">
1644 A logical port, taken from <ref table="Logical_Switch_Port"
1645 column="name" db="OVN_Northbound"/> in the OVN_Northbound
1646 database's <ref table="Logical_Switch_Port" db="OVN_Northbound"/>
1647 table. OVN does not prescribe a particular format for the
1648 logical port ID.
1649 </column>
1650
1651 <column name="chassis">
1652 The meaning of this column depends on the value of the <ref column="type"/>
1653 column. The meaning for each <ref column="type"/> is as follows:
1654
1655 <dl>
1656 <dt>(empty string)</dt>
1657 <dd>
1658 The physical location of the logical port. To successfully identify a
1659 chassis, this column must be a <ref table="Chassis"/> record. This is
1660 populated by <code>ovn-controller</code>.
1661 </dd>
1662
1663 <dt>vtep</dt>
1664 <dd>
1665 The physical location of the hardware_vtep gateway. To successfully
1666 identify a chassis, this column must be a <ref table="Chassis"/> record.
1667 This is populated by <code>ovn-controller-vtep</code>.
1668 </dd>
1669
1670 <dt>localnet</dt>
1671 <dd>
1672 Always empty. A localnet port is realized on every chassis that has
1673 connectivity to the corresponding physical network.
1674 </dd>
1675
1676 <dt>l3gateway</dt>
1677 <dd>
1678 The physical location of the L3 gateway. To successfully identify a
1679 chassis, this column must be a <ref table="Chassis"/> record. This is
1680 populated by <code>ovn-controller</code> based on the value of
1681 the <code>options:l3gateway-chassis</code> column in this table.
1682 </dd>
1683
1684 <dt>l2gateway</dt>
1685 <dd>
1686 The physical location of this L2 gateway. To successfully identify a
1687 chassis, this column must be a <ref table="Chassis"/> record.
1688 This is populated by <code>ovn-controller</code> based on the value
1689 of the <code>options:l2gateway-chassis</code> column in this table.
1690 </dd>
1691 </dl>
1692
1693 </column>
1694
1695 <column name="tunnel_key">
1696 <p>
1697 A number that represents the logical port in the key (e.g. STT key or
1698 Geneve TLV) field carried within tunnel protocol packets.
1699 </p>
1700
1701 <p>
1702 The tunnel ID must be unique within the scope of a logical datapath.
1703 </p>
1704 </column>
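<p>
The uniqueness rule for tunnel keys can be checked mechanically. The
following is a minimal Python sketch, assuming that rows are available as
plain dictionaries with <code>datapath</code> and <code>tunnel_key</code>
entries. The dictionary layout and the function name are illustrative
assumptions, not part of OVN.
</p>

```python
from collections import Counter


def duplicate_tunnel_keys(rows):
    """Return sorted (datapath, tunnel_key) pairs used by more than one row.

    `rows` is a hypothetical list of dicts mirroring Port_Binding rows.
    """
    counts = Counter((row["datapath"], row["tunnel_key"]) for row in rows)
    # Every count is at least 1, so "not 1" means the pair is duplicated.
    return sorted(pair for pair, n in counts.items() if n != 1)
```

<p>
An empty result means that every tunnel key is unique within its logical
datapath. The same key may still appear in different datapaths.
</p>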
1705
1706 <column name="mac">
1707 <p>
1708 The Ethernet address or addresses used as a source address on the
1709 logical port, each in the form
1710 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1711 The string <code>unknown</code> is also allowed to indicate that the
1712 logical port has an unknown set of (additional) source addresses.
1713 </p>
1714
1715 <p>
1716 A VM interface would ordinarily have a single Ethernet address. A
1717 gateway port might initially only have <code>unknown</code>, and then
1718 add MAC addresses to the set as it learns new source addresses.
1719 </p>
1720 </column>
1721
1722 <column name="type">
1723 <p>
1724 A type for this logical port. Logical ports can be used to model other
1725 types of connectivity into an OVN logical switch. The following types
1726 are defined:
1727 </p>
1728
1729 <dl>
1730 <dt>(empty string)</dt>
1731 <dd>VM (or VIF) interface.</dd>
1732
1733 <dt><code>patch</code></dt>
1734 <dd>
1735 One of a pair of logical ports that act as if connected by a patch
1736 cable. Useful for connecting two logical datapaths, e.g. to connect
1737 a logical router to a logical switch or to another logical router.
1738 </dd>
1739
1740 <dt><code>l3gateway</code></dt>
1741 <dd>
1742 One of a pair of logical ports that act as if connected by a patch
1743 cable across multiple chassis. Useful for connecting a logical
1744 switch with a Gateway router (which is only resident on a
1745 particular chassis).
1746 </dd>
1747
1748 <dt><code>localnet</code></dt>
1749 <dd>
1750 A connection to a locally accessible network from each
1751 <code>ovn-controller</code> instance. A logical switch can only
1752 have a single <code>localnet</code> port attached. This is used
1753 to model direct connectivity to an existing network.
1754 </dd>
1755
1756 <dt><code>l2gateway</code></dt>
1757 <dd>
1758 An L2 connection to a physical network. The chassis this
1759 <ref table="Port_Binding"/> is bound to will serve as
1760 an L2 gateway to the network named by
1761 <ref column="options" table="Port_Binding"/>:<code>network_name</code>.
1762 </dd>
1763
1764 <dt><code>vtep</code></dt>
1765 <dd>
1766 A port to a logical switch on a VTEP gateway chassis. In order to
1767 get this port correctly recognized by the OVN controller, the <ref
1768 column="options"
1769 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1770 column="options"
1771 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1772 be defined.
1773 </dd>
1774 </dl>
1775 </column>
1776 </group>
1777
1778 <group title="Patch Options">
1779 <p>
1780 These options apply to logical ports with <ref column="type"/> of
1781 <code>patch</code>.
1782 </p>
1783
1784 <column name="options" key="peer">
1785 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1786 record for the other side of the patch. The named <ref
1787 column="logical_port"/> must specify this <ref column="logical_port"/>
1788 in its own <code>peer</code> option. That is, the two patch logical
1789 ports must have reversed <ref column="logical_port"/> and
1790 <code>peer</code> values.
1791 </column>
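<p>
The reversed-peer requirement can be expressed as a small consistency
check. The following Python sketch assumes that rows are plain dictionaries
with <code>logical_port</code>, <code>type</code>, and <code>options</code>
entries. That layout and the function name are hypothetical and chosen only
for illustration.
</p>

```python
def patch_ports_are_paired(rows):
    """Return True if every patch row names a peer that names it back.

    `rows` is a hypothetical list of dicts mirroring Port_Binding rows.
    """
    by_name = {row["logical_port"]: row for row in rows}
    for row in rows:
        if row.get("type") != "patch":
            continue
        peer = by_name.get(row.get("options", {}).get("peer"))
        if peer is None:
            # The named peer does not exist at all.
            return False
        if peer.get("options", {}).get("peer") != row["logical_port"]:
            # The peer exists but does not point back at this port.
            return False
    return True
```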
1792 </group>
1793
1794 <group title="L3 Gateway Options">
1795 <p>
1796 These options apply to logical ports with <ref column="type"/> of
1797 <code>l3gateway</code>.
1798 </p>
1799
1800 <column name="options" key="peer">
1801 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1802 record for the other side of the <code>l3gateway</code> port. The named <ref
1803 column="logical_port"/> must specify this <ref column="logical_port"/>
1804 in its own <code>peer</code> option. That is, the two <code>l3gateway</code>
1805 logical ports must have reversed <ref column="logical_port"/> and
1806 <code>peer</code> values.
1807 </column>
1808
1809 <column name="options" key="l3gateway-chassis">
1810 The <code>chassis</code> in which the port resides.
1811 </column>
1812
1813 <column name="options" key="nat-addresses">
1814 MAC address of the <code>l3gateway</code> port followed by a list of
1815 SNAT and DNAT IP addresses. This is used to send gratuitous ARPs for
1816 SNAT and DNAT IP addresses via <code>localnet</code> and is valid only
1817 for L3 gateway ports. Example: <code>80:fa:5b:06:72:b7 158.36.44.22
1818 158.36.44.24</code>. This would result in generation of gratuitous
1819 ARPs for IP addresses 158.36.44.22 and 158.36.44.24 with a MAC
1820 address of 80:fa:5b:06:72:b7.
1821 </column>
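<p>
As an illustration of the <code>nat-addresses</code> format, the following
Python sketch splits a value into its MAC address and IP address list. It
assumes whitespace-separated fields, as in the example above. The function
name is hypothetical, and the validation shown here is an assumption rather
than a description of what <code>ovn-controller</code> does.
</p>

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def parse_nat_addresses(value):
    """Split a nat-addresses string into (mac, [ip, ...])."""
    fields = value.split()
    if not fields or not MAC_RE.match(fields[0]):
        raise ValueError("expected a MAC address first: %r" % value)
    # ipaddress.ip_address raises ValueError on malformed addresses.
    ips = [str(ipaddress.ip_address(field)) for field in fields[1:]]
    return fields[0], ips
```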
1822 </group>
1823
1824 <group title="Localnet Options">
1825 <p>
1826 These options apply to logical ports with <ref column="type"/> of
1827 <code>localnet</code>.
1828 </p>
1829
1830 <column name="options" key="network_name">
1831 Required. <code>ovn-controller</code> uses the configuration entry
1832 <code>ovn-bridge-mappings</code> to determine how to connect to this
1833 network. <code>ovn-bridge-mappings</code> is a list of network names
1834 mapped to a local OVS bridge that provides access to that network. An
1835 example of configuring <code>ovn-bridge-mappings</code> would be:
1836
1837 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1838
1839 <p>
1840 When a logical switch has a <code>localnet</code> port attached,
1841 every chassis that may have a local vif attached to that logical
1842 switch must have a bridge mapping configured to reach that
1843 <code>localnet</code>. Traffic that arrives on a
1844 <code>localnet</code> port is never forwarded over a tunnel to
1845 another chassis.
1846 </p>
1847 </column>
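<p>
The <code>ovn-bridge-mappings</code> value is a comma-separated list of
<var>network</var>:<var>bridge</var> pairs. The following Python sketch
shows one way to look up the bridge for a given
<code>network_name</code>. The function names are hypothetical, and the
parsing rules (for example, ignoring empty entries) are assumptions made
for illustration.
</p>

```python
def parse_bridge_mappings(value):
    """Parse 'net1:br1,net2:br2' into {'net1': 'br1', 'net2': 'br2'}."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            raise ValueError("malformed mapping: %r" % entry)
        mappings[network] = bridge
    return mappings


def bridge_for_network(value, network_name):
    """Return the bridge mapped to network_name, or None if unmapped."""
    return parse_bridge_mappings(value).get(network_name)
```

<p>
With the example configuration above, looking up <code>physnet1</code>
yields <code>br-eth0</code>. A result of <code>None</code> corresponds to a
chassis that has no bridge mapping for the network.
</p>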
1848
1849 <column name="tag">
1850 If set, indicates that the port represents a connection to a specific
1851 VLAN on a locally accessible network. The VLAN ID is used to match
1852 incoming traffic and is also added to outgoing traffic.
1853 </column>
1854 </group>
1855
1856 <group title="L2 Gateway Options">
1857 <p>
1858 These options apply to logical ports with <ref column="type"/> of
1859 <code>l2gateway</code>.
1860 </p>
1861
1862 <column name="options" key="network_name">
1863 Required. <code>ovn-controller</code> uses the configuration entry
1864 <code>ovn-bridge-mappings</code> to determine how to connect to this
1865 network. <code>ovn-bridge-mappings</code> is a list of network names
1866 mapped to a local OVS bridge that provides access to that network. An
1867 example of configuring <code>ovn-bridge-mappings</code> would be:
1868
1869 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1870
1871 <p>
1872 When a logical switch has an <code>l2gateway</code> port attached,
1873 the chassis that the <code>l2gateway</code> port is bound to
1874 must have a bridge mapping configured to reach the network
1875 identified by <code>network_name</code>.
1876 </p>
1877 </column>
1878
1879 <column name="options" key="l2gateway-chassis">
1880 Required. The <code>chassis</code> in which the port resides.
1881 </column>
1882
1883 <column name="tag">
1884 If set, indicates that the gateway is connected to a specific
1885 VLAN on the physical network. The VLAN ID is used to match
1886 incoming traffic and is also added to outgoing traffic.
1887 </column>
1888 </group>
1889
1890 <group title="VTEP Options">
1891 <p>
1892 These options apply to logical ports with <ref column="type"/> of
1893 <code>vtep</code>.
1894 </p>
1895
1896 <column name="options" key="vtep-physical-switch">
1897 Required. The name of the VTEP gateway.
1898 </column>
1899
1900 <column name="options" key="vtep-logical-switch">
1901 Required. A logical switch name connected by the VTEP gateway. Must
1902 be set when <ref column="type"/> is <code>vtep</code>.
1903 </column>
1904 </group>
1905
1906 <group title="VMI (or VIF) Options">
1907 <p>
1908 These options apply to logical ports whose <ref column="type"/> is the
1909 empty string.
1910 </p>
1911
1912 <column name="options" key="qos_max_rate">
1913 If set, indicates the maximum rate for data sent from this interface,
1914 in bit/s. The traffic will be shaped according to this limit.
1915 </column>
1916
1917 <column name="options" key="qos_burst">
1918 If set, indicates the maximum burst size for data sent from this
1919 interface, in bits.
1920 </column>
1921
1922 <column name="options" key="qdisc_queue_id"
1923 type='{"type": "integer", "minInteger": 1, "maxInteger": 61440}'>
1924 Indicates the queue number on the physical device. This is the same as the
1925 <code>queue_id</code> used in OpenFlow in <code>struct
1926 ofp_action_enqueue</code>.
1927 </column>
1928 </group>
1929
1930 <group title="Nested Containers">
1931 <p>
1932 These columns support containers nested within a VM. Specifically,
1933 they are used when <ref column="type"/> is empty and <ref
1934 column="logical_port"/> identifies the interface of a container spawned
1935 inside a VM. They are empty for containers or VMs that run directly on
1936 a hypervisor.
1937 </p>
1938
1939 <column name="parent_port">
1940 This is taken from
1941 <ref table="Logical_Switch_Port" column="parent_name"
1942 db="OVN_Northbound"/> in the OVN_Northbound database's
1943 <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table.
1944 </column>
1945
1946 <column name="tag">
1947 <p>
1948 Identifies the VLAN tag in the network traffic associated with that
1949 container's network interface.
1950 </p>
1951
1952 <p>
1953 This column is used for a different purpose when <ref column="type"/>
1954 is <code>localnet</code> (see <code>Localnet Options</code>, above)
1955 or <code>l2gateway</code> (see <code>L2 Gateway Options</code>, above).
1956 </p>
1957 </column>
1958 </group>
1959 </table>
1960
1961 <table name="MAC_Binding" title="IP to MAC bindings">
1962 <p>
1963 Each row in this table specifies a binding from an IP address to an
1964 Ethernet address that has been discovered through ARP (for IPv4) or
1965 neighbor discovery (for IPv6). This table is primarily used to discover
1966 bindings on physical networks, because IP-to-MAC bindings for virtual
1967 machines are usually populated statically into the <ref
1968 table="Port_Binding"/> table.
1969 </p>
1970
1971 <p>
1972 This table expresses a functional relationship: <ref
1973 table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
1974 <ref column="mac"/>.
1975 </p>
1976
1977 <p>
1978 In outline, the lifetime of a logical router's MAC binding looks like
1979 this:
1980 </p>
1981
1982 <ol>
1983 <li>
1984 On hypervisor 1, a logical router determines that a packet should be
1985 forwarded to IP address <var>A</var> on one of its router ports. It
1986 uses its logical flow table to determine that <var>A</var> lacks a
1987 static IP-to-MAC binding and the <code>get_arp</code> action to
1988 determine that it lacks a dynamic IP-to-MAC binding.
1989 </li>
1990
1991 <li>
1992 Using an OVN logical <code>arp</code> action, the logical router
1993 generates and sends a broadcast ARP request to the router port. It
1994 drops the IP packet.
1995 </li>
1996
1997 <li>
1998 The logical switch attached to the router port delivers the ARP request
1999 to all of its ports. (It might make sense to deliver it only to ports
2000 that have no static IP-to-MAC bindings, but this could also be
2001 surprising behavior.)
2002 </li>
2003
2004 <li>
2005 A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
2006 attached to the logical switch owns the IP address in question. It
2007 composes an ARP reply and unicasts it to the logical router port's
2008 Ethernet address.
2009 </li>
2010
2011 <li>
2012 The logical switch delivers the ARP reply to the logical router port.
2013 </li>
2014
2015 <li>
2016 The logical router flow table executes a <code>put_arp</code> action.
2017 To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
2018 to the <ref table="MAC_Binding"/> table.
2019 </li>
2020
2021 <li>
2022 On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
2023 table="MAC_Binding"/> table from the OVN southbound database. The next
2024 packet destined to <var>A</var> through the logical router is sent
2025 directly to the bound Ethernet address.
2026 </li>
2027 </ol>
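<p>
The lifetime above can be modeled as a lookup keyed on
(<code>logical_port</code>, <code>ip</code>). The following Python sketch
is a toy in-memory model, not the OVN implementation. The class, its method
names (borrowed from the <code>get_arp</code> and <code>put_arp</code>
actions), and the returned strings are all illustrative assumptions.
</p>

```python
class MacBindingTable:
    """Toy in-memory stand-in for the MAC_Binding table."""

    def __init__(self):
        self._rows = {}

    def get_arp(self, logical_port, ip):
        """Return the bound MAC, or None when no dynamic binding exists."""
        return self._rows.get((logical_port, ip))

    def put_arp(self, logical_port, ip, mac):
        """Record (or overwrite) the binding learned from an ARP reply."""
        self._rows[(logical_port, ip)] = mac


def forward(table, logical_port, ip):
    """Describe what a router would do with a packet destined to `ip`."""
    mac = table.get_arp(logical_port, ip)
    if mac is None:
        # No binding yet: ask for one and drop this packet.
        return "send ARP request and drop packet"
    return "forward to " + mac
```

<p>
The first packet to an unbound address triggers an ARP request and is
dropped. After <code>put_arp</code> records the reply, later packets are
forwarded to the bound Ethernet address.
</p>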
2028
2029 <column name="logical_port">
2030 The logical port on which the binding was discovered.
2031 </column>
2032
2033 <column name="ip">
2034 The bound IP address.
2035 </column>
2036
2037 <column name="mac">
2038 The Ethernet address to which the IP is bound.
2039 </column>
2040 <column name="datapath">
2041 The logical datapath to which the logical port belongs.
2042 </column>
2043 </table>
2044
2045 <table name="DHCP_Options" title="DHCP Options supported by native OVN DHCP">
2046 <p>
2047 Each row in this table stores the DHCP Options supported by native OVN
2048 DHCP. <code>ovn-northd</code> populates this table with the supported
2049 DHCP options. <code>ovn-controller</code> looks up this table to get the
2050 DHCP codes of the DHCP options defined in the <code>put_dhcp_opts</code> action.
2051 Please refer to RFC 2132 (<code>https://tools.ietf.org/html/rfc2132</code>)
2052 for the list of DHCP options that can be defined here.
2053 </p>
2054
2055 <column name="name">
2056 <p>
2057 Name of the DHCP option.
2058 </p>
2059
2060 <p>
2061 Example. name="router"
2062 </p>
2063 </column>
2064
2065 <column name="code">
2066 <p>
2067 DHCP option code for the DHCP option as defined in RFC 2132.
2068 </p>
2069
2070 <p>
2071 Example. code=3
2072 </p>
2073 </column>
2074
2075 <column name="type">
2076 <p>
2077 Data type of the DHCP option code.
2078 </p>
2079
2080 <dl>
2081 <dt><code>value: bool</code></dt>
2082 <dd>
2083 <p>
2084 This indicates that the value of the DHCP option is a bool.
2085 </p>
2086
2087 <p>
2088 Example. "name=ip_forward_enable", "code=19", "type=bool".
2089 </p>
2090
2091 <p>
2092 put_dhcp_opts(..., ip_forward_enable = 1,...)
2093 </p>
2094 </dd>
2095
2096 <dt><code>value: uint8</code></dt>
2097 <dd>
2098 <p>
2099 This indicates that the value of the DHCP option is an unsigned
2100 int8 (8 bits).
2101 </p>
2102
2103 <p>
2104 Example. "name=default_ttl", "code=23", "type=uint8".
2105 </p>
2106
2107 <p>
2108 put_dhcp_opts(..., default_ttl = 50,...)
2109 </p>
2110 </dd>
2111
2112 <dt><code>value: uint16</code></dt>
2113 <dd>
2114 <p>
2115 This indicates that the value of the DHCP option is an unsigned
2116 int16 (16 bits).
2117 </p>
2118
2119 <p>
2120 Example. "name=mtu", "code=26", "type=uint16".
2121 </p>
2122
2123 <p>
2124 put_dhcp_opts(..., mtu = 1450,...)
2125 </p>
2126 </dd>
2127
2128 <dt><code>value: uint32</code></dt>
2129 <dd>
2130 <p>
2131 This indicates that the value of the DHCP option is an unsigned
2132 int32 (32 bits).
2133 </p>
2134
2135 <p>
2136 Example. "name=lease_time", "code=51", "type=uint32".
2137 </p>
2138
2139 <p>
2140 put_dhcp_opts(..., lease_time = 86400,...)
2141 </p>
2142 </dd>
2143
2144 <dt><code>value: ipv4</code></dt>
2145 <dd>
2146 <p>
2147 This indicates that the value of the DHCP option is an IPv4
2148 address or addresses.
2149 </p>
2150
2151 <p>
2152 Example. "name=router", "code=3", "type=ipv4".
2153 </p>
2154
2155 <p>
2156 put_dhcp_opts(..., router = 10.0.0.1,...)
2157 </p>
2158
2159 <p>
2160 Example. "name=dns_server", "code=6", "type=ipv4".
2161 </p>
2162
2163 <p>
2164 put_dhcp_opts(..., dns_server = {8.8.8.8 7.7.7.7},...)
2165 </p>
2166 </dd>
2167
2168 <dt><code>value: static_routes</code></dt>
2169 <dd>
2170 <p>
2171 This indicates that the value of the DHCP option contains one or more
2172 pairs of IPv4 route and next hop addresses.
2173 </p>
2174
2175 <p>
2176 Example. "name=classless_static_route", "code=121", "type=static_routes".
2177 </p>
2178
2179 <p>
2180 put_dhcp_opts(..., classless_static_route = {30.0.0.0/24,10.0.0.4,0.0.0.0/0,10.0.0.1},...)
2181 </p>
2182 </dd>
2183
2184 <dt><code>value: str</code></dt>
2185 <dd>
2186 <p>
2187 This indicates that the value of the DHCP option is a string.
2188 </p>
2189
2190 <p>
2191 Example. "name=host_name", "code=12", "type=str".
2192 </p>
2193 </dd>
2194 </dl>
2195 </column>
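<p>
To show how the <ref column="type"/> of an option determines its encoding,
the following Python sketch packs a single option in the code, length, data
layout that RFC 2132 defines. It is a simplified illustration and not the
<code>ovn-controller</code> implementation. The function name is
hypothetical, and the <code>static_routes</code> type is deliberately left
out because its encoding is more involved.
</p>

```python
import ipaddress
import struct


def encode_dhcp_option(code, opt_type, value):
    """Encode one DHCP option as code, length, data bytes."""
    if opt_type == "bool":
        data = struct.pack("!B", 1 if value else 0)
    elif opt_type == "uint8":
        data = struct.pack("!B", value)
    elif opt_type == "uint16":
        data = struct.pack("!H", value)
    elif opt_type == "uint32":
        data = struct.pack("!I", value)
    elif opt_type == "ipv4":
        # Accept a single address string or an iterable of addresses.
        addresses = [value] if isinstance(value, str) else list(value)
        data = b"".join(ipaddress.IPv4Address(a).packed for a in addresses)
    elif opt_type == "str":
        data = value.encode("ascii")
    else:
        raise ValueError("unsupported type: %s" % opt_type)
    return struct.pack("!BB", code, len(data)) + data
```

<p>
For example, the <code>mtu</code> option (code 26, type
<code>uint16</code>) with a value of 1450 becomes the four bytes 26, 2,
0x05, 0xaa.
</p>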
2196 </table>
2197
2198 <table name="DHCPv6_Options" title="DHCPv6 Options supported by native OVN DHCPv6">
2199 <p>
2200 Each row in this table stores the DHCPv6 Options supported by native OVN
2201 DHCPv6. <code>ovn-northd</code> populates this table with the supported
2202 DHCPv6 options. <code>ovn-controller</code> looks up this table to get
2203 the DHCPv6 codes of the DHCPv6 options defined in the
2204 <code>put_dhcpv6_opts</code> action. Please refer to RFC 3315 and RFC
2205 3646 for the list of DHCPv6 options that can be defined here.
2206 </p>
2207
2208 <column name="name">
2209 <p>
2210 Name of the DHCPv6 option.
2211 </p>
2212
2213 <p>
2214 Example. name="ia_addr"
2215 </p>
2216 </column>
2217
2218 <column name="code">
2219 <p>
2220 DHCPv6 option code for the DHCPv6 option as defined in the appropriate
2221 RFC.
2222 </p>
2223
2224 <p>
2225 Example. code=5
2226 </p>
2227 </column>
2228
2229 <column name="type">
2230 <p>
2231 Data type of the DHCPv6 option code.
2232 </p>
2233
2234 <dl>
2235 <dt><code>value: ipv6</code></dt>
2236 <dd>
2237 <p>
2238 This indicates that the value of the DHCPv6 option is an IPv6
2239 address or addresses.
2240 </p>
2241
2242 <p>
2243 Example. "name=ia_addr", "code=5", "type=ipv6".
2244 </p>
2245
2246 <p>
2247 put_dhcpv6_opts(..., ia_addr = ae70::4,...)
2248 </p>
2249 </dd>
2250
2251 <dt><code>value: str</code></dt>
2252 <dd>
2253 <p>
2254 This indicates that the value of the DHCPv6 option is a string.
2255 </p>
2256
2257 <p>
2258 Example. "name=domain_search", "code=24", "type=str".
2259 </p>
2260
2261 <p>
2262 put_dhcpv6_opts(..., domain_search = ovn.domain,...)
2263 </p>
2264 </dd>
2265
2266 <dt><code>value: mac</code></dt>
2267 <dd>
2268 <p>
2269 This indicates that the value of the DHCPv6 option is a MAC address.
2270 </p>
2271
2272 <p>
2273 Example. "name=server_id", "code=2", "type=mac".
2274 </p>
2275
2276 <p>
2277 put_dhcpv6_opts(..., server_id = 01:02:03:04:05:06,...)
2278 </p>
2279 </dd>
2280 </dl>
2281 </column>
2282 </table>
2283 <table name="Connection" title="OVSDB client connections.">
2284 <p>
2285 Configuration for a database connection to an Open vSwitch database
2286 (OVSDB) client.
2287 </p>
2288
2289 <p>
2290 This table primarily configures the Open vSwitch database server
2291 (<code>ovsdb-server</code>).
2292 </p>
2293
2294 <p>
2295 The Open vSwitch database server can initiate and maintain active
2296 connections to remote clients. It can also listen for database
2297 connections.
2298 </p>
2299
2300 <group title="Core Features">
2301 <column name="target">
2302 <p>Connection methods for clients.</p>
2303 <p>
2304 The following connection methods are currently supported:
2305 </p>
2306 <dl>
2307 <dt><code>ssl:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
2308 <dd>
2309 <p>
2310 The specified SSL <var>port</var> on the host at the given
2311 <var>ip</var>, which must be expressed as an IP address
2312 (not a DNS name).
2313 </p>
2314 <p>
2315 If <var>port</var> is not specified, it defaults to 6640.
2316 </p>
2317 <p>
2318 SSL support is an optional feature that is not always
2319 built as part of Open vSwitch.
2320 </p>
2321 </dd>
2322
2323 <dt><code>tcp:<var>ip</var></code>[<code>:<var>port</var></code>]</dt>
2324 <dd>
2325 <p>
2326 The specified TCP <var>port</var> on the host at the given
2327 <var>ip</var>, which must be expressed as an IP address (not a
2328 DNS name), where <var>ip</var> can be IPv4 or IPv6 address. If
2329 <var>ip</var> is an IPv6 address, wrap it in square brackets,
2330 e.g. <code>tcp:[::1]:6640</code>.
2331 </p>
2332 <p>
2333 If <var>port</var> is not specified, it defaults to 6640.
2334 </p>
2335 </dd>
2336 <dt><code>pssl:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
2337 <dd>
2338 <p>
2339 Listens for SSL connections on the specified TCP <var>port</var>.
2340 Specify 0 for <var>port</var> to have the kernel automatically
2341 choose an available port. If <var>ip</var>, which must be
2342 expressed as an IP address (not a DNS name), is specified, then
2343 connections are restricted to the specified local IP address
2344 (either IPv4 or IPv6 address). If <var>ip</var> is an IPv6
2345 address, wrap it in square brackets,
2346 e.g. <code>pssl:6640:[::1]</code>. If <var>ip</var> is not
2347 specified then it listens only on IPv4 (but not IPv6) addresses.
2348 </p>
2349 <p>
2350 If <var>port</var> is not specified, it defaults to 6640.
2351 </p>
2352 <p>
2353 SSL support is an optional feature that is not always built as
2354 part of Open vSwitch.
2355 </p>
2356 </dd>
2357 <dt><code>ptcp:</code>[<var>port</var>][<code>:<var>ip</var></code>]</dt>
2358 <dd>
2359 <p>
2360 Listens for connections on the specified TCP <var>port</var>.
2361 Specify 0 for <var>port</var> to have the kernel automatically
2362 choose an available port. If <var>ip</var>, which must be
2363 expressed as an IP address (not a DNS name), is specified, then
2364 connections are restricted to the specified local IP address
2365 (either IPv4 or IPv6 address). If <var>ip</var> is an IPv6
2366 address, wrap it in square brackets,
2367 e.g. <code>ptcp:6640:[::1]</code>. If <var>ip</var> is not
2368 specified then it listens only on IPv4 addresses.
2369 </p>
2370 <p>
2371 If <var>port</var> is not specified, it defaults to 6640.
2372 </p>
2373 </dd>
2374 </dl>
2375 <p>When multiple clients are configured, the <ref column="target"/>
2376 values must be unique. Duplicate <ref column="target"/> values yield
2377 unspecified results.</p>
2378 </column>
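<p>
The four connection methods share a small grammar, which the following
Python sketch parses into a method, an IP address, and a port. It is an
illustration of the syntax described above and not the
<code>ovsdb-server</code> parser. The function names are hypothetical, and
accepting bracketed IPv6 addresses for <code>ssl:</code> as well as
<code>tcp:</code> is an assumption.
</p>

```python
DEFAULT_PORT = 6640
ACTIVE = ("ssl", "tcp")
PASSIVE = ("pssl", "ptcp")


def _strip_brackets(ip):
    """Remove the square brackets that wrap an IPv6 address, if present."""
    if ip.startswith("[") and ip.endswith("]"):
        return ip[1:-1]
    return ip


def parse_target(target):
    """Return (method, ip, port); ip is None when a listener omits it."""
    method, sep, rest = target.partition(":")
    if not sep or method not in ACTIVE + PASSIVE:
        raise ValueError("unsupported target: %r" % target)
    if method in ACTIVE:
        # Active form is ip[:port]; a bracketed IPv6 address contains colons.
        if rest.startswith("["):
            ip, bracket, tail = rest.partition("]")
            if not bracket:
                raise ValueError("unterminated bracket: %r" % target)
            ip = ip[1:]
            port_text = tail[1:] if tail.startswith(":") else ""
        else:
            ip, _, port_text = rest.partition(":")
        if not ip:
            raise ValueError("missing ip: %r" % target)
    else:
        # Passive form is [port][:ip].
        port_text, _, ip = rest.partition(":")
        ip = _strip_brackets(ip) or None
    port = int(port_text) if port_text else DEFAULT_PORT
    return method, ip, port
```

<p>
Under these assumptions, <code>tcp:[::1]:6640</code> and
<code>ptcp:6640:[::1]</code> both yield the address <code>::1</code> and
port 6640, and a target with no port falls back to 6640.
</p>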
2379
2380 <column name="read_only">
2381 <code>true</code> to restrict these connections to read-only
2382 transactions, <code>false</code> to allow them to modify the database.
2383 </column>
2384 </group>
2385
2386 <group title="Client Failure Detection and Handling">
2387 <column name="max_backoff">
2388 Maximum number of milliseconds to wait between connection attempts.
2389 Default is implementation-specific.
2390 </column>
2391
2392 <column name="inactivity_probe">
2393 Maximum number of milliseconds of idle time on connection to the client
2394 before sending an inactivity probe message. If Open vSwitch does not
2395 communicate with the client for the specified number of milliseconds, it
2396 will send a probe. If a response is not received for the same
2397 additional amount of time, Open vSwitch assumes the connection has been
2398 broken and attempts to reconnect. Default is implementation-specific.
2399 A value of 0 disables inactivity probes.
2400 </column>
2401 </group>
2402
2403 <group title="Status">
2404 <p>
2405 The <ref column="is_connected"/> column is always updated. Other
2406 key-value pairs in the status column may be updated, depending
2407 on the <ref column="target"/> type.
2408 </p>
2409
2410 <p>
2411 When <ref column="target"/> specifies a connection method that
2412 listens for inbound connections (e.g. <code>ptcp:</code> or
2413 <code>pssl:</code>), both <ref column="n_connections"/> and
2414 <ref column="is_connected"/> may also be updated while the
2415 remaining key-value pairs are omitted.
2416 </p>
2417
2418 <p>
2419 On the other hand, when <ref column="target"/> specifies an
2420 outbound connection, all key-value pairs may be updated, except
2421 the above-mentioned two key-value pairs associated with inbound
2422 connection targets. They are omitted.
2423 </p>
2424
2425 <column name="is_connected">
2426 <code>true</code> if currently connected to this client,
2427 <code>false</code> otherwise.
2428 </column>
2429
2430 <column name="status" key="last_error">
2431 A human-readable description of the last error on the connection
2432 to the client; i.e. <code>strerror(errno)</code>. This key
2433 will exist only if an error has occurred.
2434 </column>
2435
2436 <column name="status" key="state"
2437 type='{"type": "string", "enum": ["set", ["VOID", "BACKOFF", "CONNECTING", "ACTIVE", "IDLE"]]}'>
2438 <p>
2439 The state of the connection to the client:
2440 </p>
2441 <dl>
2442 <dt><code>VOID</code></dt>
2443 <dd>Connection is disabled.</dd>
2444
2445 <dt><code>BACKOFF</code></dt>
2446 <dd>Attempting to reconnect at an increasing period.</dd>
2447
2448 <dt><code>CONNECTING</code></dt>
2449 <dd>Attempting to connect.</dd>
2450
2451 <dt><code>ACTIVE</code></dt>
2452 <dd>Connected, remote host responsive.</dd>
2453
2454 <dt><code>IDLE</code></dt>
2455 <dd>Connection is idle. Waiting for response to keep-alive.</dd>
2456 </dl>
2457 <p>
2458 These values may change in the future. They are provided only for
2459 human consumption.
2460 </p>
2461 </column>
2462
2463 <column name="status" key="sec_since_connect"
2464 type='{"type": "integer", "minInteger": 0}'>
2465 The amount of time since this client last successfully connected
2466 to the database (in seconds). Value is empty if client has never
2467 successfully been connected.
2468 </column>
2469
2470 <column name="status" key="sec_since_disconnect"
2471 type='{"type": "integer", "minInteger": 0}'>
2472 The amount of time since this client last disconnected from the
2473 database (in seconds). Value is empty if client has never
2474 disconnected.
2475 </column>
2476
2477 <column name="status" key="locks_held">
2478 Space-separated list of the names of OVSDB locks that the connection
2479 holds. Omitted if the connection does not hold any locks.
2480 </column>
2481
2482 <column name="status" key="locks_waiting">
2483 Space-separated list of the names of OVSDB locks that the connection is
2484 currently waiting to acquire. Omitted if the connection is not waiting
2485 for any locks.
2486 </column>
2487
2488 <column name="status" key="locks_lost">
2489 Space-separated list of the names of OVSDB locks that the connection
2490 has had stolen by another OVSDB client. Omitted if no locks have been
2491 stolen from this connection.
2492 </column>
2493
2494 <column name="status" key="n_connections"
2495 type='{"type": "integer", "minInteger": 2}'>
2496 When <ref column="target"/> specifies a connection method that
2497 listens for inbound connections (e.g. <code>ptcp:</code> or
2498 <code>pssl:</code>) and more than one connection is actually active,
2499 the value is the number of active connections. Otherwise, this
2500 key-value pair is omitted.
2501 </column>
2502
2503 <column name="status" key="bound_port" type='{"type": "integer"}'>
2504 When <ref column="target"/> is <code>ptcp:</code> or
2505 <code>pssl:</code>, this is the TCP port on which the OVSDB server is
2506 listening. (This is particularly useful when <ref
2507 column="target"/> specifies a port of 0, allowing the kernel to
2508 choose any available port.)
2509 </column>
2510 </group>
2511
2512 <group title="Common Columns">
2513 The overall purpose of these columns is described under <code>Common
2514 Columns</code> at the beginning of this document.
2515
2516 <column name="external_ids"/>
2517 <column name="other_config"/>
2518 </group>
2519 </table>
2520 </database>