git.proxmox.com Git - ovs.git/blame - ovn/ovn-sb.xml
ovn: Ability to set multiple load balancers.
Commit Line Data
fe36184b 1<?xml version="1.0" encoding="utf-8"?>
ec78987f 2<database name="ovn-sb" title="OVN Southbound Database">
fe36184b
BP
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
ec78987f
JP
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
88058f19
AW
13 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
14 northbound to the Cloud Management System, via <code>ovn-northd</code>.
fe36184b
BP
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
0bac7164 20 The OVN Southbound database contains classes of data with
ec78987f 21 different properties, as described in the sections below.
fe36184b
BP
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 contains all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
62fdd819 38 The <ref table="Chassis"/> table comprises the PN tables.
fe36184b
BP
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
09986f8c
JP
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
fe36184b
BP
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
5868eb24
BP
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
fe36184b
BP
78 </p>
79
0bac7164 80 <h3>Logical-physical bindings</h3>
fe36184b
BP
81
82 <p>
0bac7164 83 These tables link logical and physical components. They show the current
5868eb24
BP
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
fe36184b
BP
87 </p>
88
89 <p>
0bac7164 90 These tables change frequently, at least every time a VM powers up or down
fe36184b
BP
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
5868eb24
BP
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
fe36184b
BP
104 </p>
105
0bac7164
BP
106 <h3>MAC bindings</h3>
107
108 <p>
109 The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
110 to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
111 and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual
112 machines are statically populated into the <ref table="Port_Binding"/>
113 table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
114 on physical networks.
115 </p>
116
5868eb24
BP
117 <h2>Common Columns</h2>
118
119 <p>
120 Some tables contain a special column named <code>external_ids</code>. This
121 column has the same form and purpose each place that it appears, so we
122 describe it here to save space later.
123 </p>
124
125 <dl>
126 <dt><code>external_ids</code>: map of string-string pairs</dt>
127 <dd>
128 Key-value pairs for use by the software that manages the OVN Southbound
88058f19
AW
129 database rather than by
130 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
131 particular, <code>ovn-northd</code> can use key-value pairs in this
132 column to relate entities in the southbound database to higher-level
133 entities (such as entities in the OVN Northbound database). Individual
134 key-value pairs in this column may be documented in some cases to aid
135 in understanding and troubleshooting, but the reader should not mistake
136 such documentation as comprehensive.
5868eb24
BP
137 </dd>
138 </dl>
139
fa183acc
BP
140 <table name="SB_Global" title="Southbound configuration">
141 <p>
142 Southbound configuration for an OVN system. This table must have exactly
143 one row.
144 </p>
145
146 <group title="Status">
147 This column allows a client to track the overall configuration state of
148 the system.
149
150 <column name="nb_cfg">
151 Sequence number for the configuration. When a CMS or
152 <code>ovn-nbctl</code> updates the northbound database, it increments
153 the <code>nb_cfg</code> column in the <code>NB_Global</code> table in
154 the northbound database. In turn, when <code>ovn-northd</code> updates
155 the southbound database to bring it up to date with these changes, it
156 updates this column to the same value.
157 </column>
158 </group>
159
160 <group title="Common Columns">
161 <column name="external_ids">
162 See <em>Common Columns</em> at the beginning of this document.
163 </column>
164 </group>
165 </table>
166
fe36184b
BP
167 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
168 <p>
169 Each row in this table represents a hypervisor or gateway (a chassis) in
170 the physical network (PN). Each chassis, via
88058f19
AW
171 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
172 and updates its own row, and keeps a copy of the remaining rows to
173 determine how to reach other hypervisors.
fe36184b
BP
174 </p>
175
176 <p>
177 When a chassis shuts down gracefully, it should remove its own row.
178 (This is not critical because resources hosted on the chassis are equally
179 unreachable regardless of whether the row is present.) If a chassis
180 shuts down permanently without removing its row, some kind of manual or
181 automatic cleanup is eventually needed; we can devise a process for that
182 as necessary.
183 </p>
184
185 <column name="name">
fc26cf25
RB
186 OVN does not prescribe a particular format for chassis names.
187 ovn-controller populates this column using <ref key="system-id"
188 table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/>
189 in the Open_vSwitch database's <ref table="Open_vSwitch"
190 db="Open_vSwitch"/> table. ovn-controller-vtep populates this
191 column with <ref table="Physical_Switch" column="name"
192 db="hardware_vtep"/> in the hardware_vtep database's
193 <ref table="Physical_Switch" db="hardware_vtep"/> table.
fe36184b
BP
194 </column>
195
2229f3ec
RB
196 <column name="hostname">
197 The hostname of the chassis, if applicable. ovn-controller will populate
198 this column with the hostname of the host it is running on.
199 ovn-controller-vtep will leave this column empty.
200 </column>
201
fa183acc
BP
202 <column name="nb_cfg">
203 Sequence number for the configuration. When <code>ovn-controller</code>
204 updates the configuration of a chassis from the contents of the
205 southbound database, it copies <ref table="SB_Global" column="nb_cfg"/>
206 from the <ref table="SB_Global"/> table into this column.
207 </column>
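<p>
A client could use the sequence numbers described above to tell whether a
northbound change has propagated. The following is only a sketch, assuming
the three <code>nb_cfg</code> values have already been read from the
databases; the function names are hypothetical, and comparing with
<code>&gt;=</code> (ignoring wraparound) is an assumption.
</p>

```python
from typing import Dict, List


def config_is_applied(nb_cfg: int, sb_cfg: int, chassis_cfgs: Dict[str, int]) -> bool:
    """Return True when SB_Global and every chassis have caught up to nb_cfg.

    nb_cfg:       NB_Global.nb_cfg, incremented by the CMS or ovn-nbctl.
    sb_cfg:       SB_Global.nb_cfg, copied by ovn-northd.
    chassis_cfgs: Chassis.nb_cfg per chassis name, copied by ovn-controller.
    """
    return sb_cfg >= nb_cfg and all(c >= nb_cfg for c in chassis_cfgs.values())


def lagging_chassis(nb_cfg: int, chassis_cfgs: Dict[str, int]) -> List[str]:
    """Names of chassis that have not yet applied configuration nb_cfg."""
    return sorted(name for name, cfg in chassis_cfgs.items() if cfg < nb_cfg)
```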
208
4250ee37
RB
209 <column name="external_ids" key="ovn-bridge-mappings">
210 <code>ovn-controller</code> populates this key with the set of bridge
211 mappings it has been configured to use. Other applications should treat
212 this key as read-only. See <code>ovn-controller</code>(8) for more
213 information.
214 </column>
215
5236c73a
NS
216 <column name="external_ids" key="datapath-type">
217 <code>ovn-controller</code> populates this key with the datapath type
218 configured in the <ref table="Bridge" column="datapath_type"/> column of
219 the Open_vSwitch database's <ref table="Bridge" db="Open_vSwitch"/>
220 table. Other applications should treat this key as read-only. See
221 <code>ovn-controller</code>(8) for more information.
222 </column>
223
224 <column name="external_ids" key="iface-types">
225 <code>ovn-controller</code> populates this key with the interface types
226 configured in the <ref table="Open_vSwitch" column="iface_types"/> column
227 of the Open_vSwitch database's <ref table="Open_vSwitch"
228 db="Open_vSwitch"/> table. Other applications should treat this key as
229 read-only. See <code>ovn-controller</code>(8) for more information.
230 </column>
231
1cef5fff
RB
232 <group title="Common Columns">
233 The overall purpose of these columns is described under <em>Common
234 Columns</em> at the beginning of this document.
235
236 <column name="external_ids"/>
237 </group>
238
09db214c 239 <group title="Encapsulation Configuration">
fe36184b 240 <p>
09db214c
JP
241 OVN uses encapsulation to transmit logical dataplane packets
242 between chassis.
fe36184b
BP
243 </p>
244
09db214c
JP
245 <column name="encaps">
246 Points to supported encapsulation configurations to transmit
247 logical dataplane packets to this chassis. Each entry is a <ref
248 table="Encap"/> record that describes the configuration.
fe36184b
BP
249 </column>
250 </group>
251
62fdd819
AW
252 <group title="Gateway Configuration">
253 <p>
254 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
255 OVN-managed part of a logical network and a physical VLAN, extending a
256 tunnel-based logical network into a physical network. Gateways are
88058f19
AW
257 typically dedicated nodes that do not host VMs and will be controlled
258 by <code>ovn-controller-vtep</code>.
fe36184b
BP
259 </p>
260
62fdd819 261 <column name="vtep_logical_switches">
88058f19
AW
262 Stores all VTEP logical switch names connected by this gateway
263 chassis. A <ref table="Port_Binding"/> table entry whose
264 <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
265 equals the <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and whose
266 <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
267 value appears in the <ref table="Chassis"/>
268 <ref column="vtep_logical_switches" table="Chassis"/>, will be
269 associated with this <ref table="Chassis"/>.
fe36184b 270 </column>
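<p>
The association rule above can be read as a simple predicate. This is a
minimal sketch, assuming the relevant column values have already been
fetched into plain Python values; the function name is hypothetical.
</p>

```python
from typing import Dict, Set


def binds_to_chassis(port_options: Dict[str, str],
                     chassis_name: str,
                     vtep_logical_switches: Set[str]) -> bool:
    """True if a Port_Binding row would be associated with the given chassis.

    port_options:          the Port_Binding "options" map.
    chassis_name:          the Chassis "name" column.
    vtep_logical_switches: the Chassis "vtep_logical_switches" column.
    """
    return (port_options.get("vtep-physical-switch") == chassis_name
            and port_options.get("vtep-logical-switch") in vtep_logical_switches)
```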
62fdd819 271 </group>
fe36184b
BP
272 </table>
273
09db214c
JP
274 <table name="Encap" title="Encapsulation Types">
275 <p>
276 The <ref column="encaps" table="Chassis"/> column in the <ref
277 table="Chassis"/> table refers to rows in this table to identify
278 how OVN may transmit logical dataplane packets to this chassis.
88058f19
AW
279 Each chassis, via <code>ovn-controller</code>(8) or
280 <code>ovn-controller-vtep</code>(8), adds and updates its own rows
281 and keeps a copy of the remaining rows to determine how to reach
282 other chassis.
09db214c
JP
283 </p>
284
285 <column name="type">
286 The encapsulation to use to transmit packets to this chassis.
b705f9ea
JP
287 Hypervisors must use either <code>geneve</code> or
288 <code>stt</code>. Gateways may use <code>vxlan</code>,
289 <code>geneve</code>, or <code>stt</code>.
09db214c
JP
290 </column>
291
292 <column name="options">
36283d78
JG
293 <p>
294 Options for configuring the encapsulation. Currently, the only
295 option that has been defined is <code>csum</code>.
296 </p>
297
298 <p>
299 <code>csum</code> indicates that encapsulation checksums can be
300 transmitted and received with reasonable performance. It is a hint
301 to senders transmitting data to this chassis that they should use
302 checksums to protect OVN metadata. Set to <code>true</code> to enable
303 or <code>false</code> to disable.
304 </p>
305
306 <p>
307 In terms of performance, this actually significantly increases
308 throughput in most common cases when running on Linux-based hosts
309 without NICs supporting encapsulation hardware offload (around 60% for
310 bulk traffic). The reason is that generally all NICs are capable of
311 offloading transmitted and received TCP/UDP checksums (viewed as
312 ordinary data packets and not as tunnels). The benefit comes on the
313 receive side where the validated outer checksum can be used to
314 additionally validate an inner checksum (such as TCP), which in turn
315 allows aggregation of packets to be more efficiently handled by the
316 rest of the stack.
317 </p>
318
319 <p>
320 Not all devices see such a benefit. The most notable exception is
321 hardware VTEPs. These devices are designed to not buffer entire
322 packets in their switching engines and are therefore unable to
323 efficiently compute or validate full packet checksums. In addition,
324 certain versions of the Linux kernel are not able to fully take
325 advantage of encapsulation NIC offloads in the presence of checksums.
326 (This is actually a pretty narrow corner case though - earlier
327 versions of Linux don't support encapsulation offloads at all and
328 later versions support both offloads and checksums well.)
329 </p>
330
331 <p>
332 <code>csum</code> defaults to <code>false</code> for hardware VTEPs and
333 <code>true</code> for all other cases.
334 </p>
09db214c
JP
335 </column>
336
337 <column name="ip">
338 The IPv4 address of the encapsulation tunnel endpoint.
339 </column>
340 </table>
341
ea382567
RB
342 <table name="Address_Set" title="Address Sets">
343 <p>
344 See the documentation for the <ref table="Address_Set"
345 db="OVN_Northbound"/> table in the <ref db="OVN_Northbound"/> database
346 for details.
347 </p>
348
349 <column name="name"/>
350 <column name="addresses"/>
351 </table>
352
5868eb24 353 <table name="Logical_Flow" title="Logical Network Flows">
fe36184b 354 <p>
09986f8c
JP
355 Each row in this table represents one logical flow.
356 <code>ovn-northd</code> populates this table with logical flows
357 that implement the L2 and L3 topologies specified in the
358 <ref db="OVN_Northbound"/> database. Each hypervisor, via
359 <code>ovn-controller</code>, translates the logical flows into
360 OpenFlow flows specific to its hypervisor and installs them into
361 Open vSwitch.
fe36184b
BP
362 </p>
363
364 <p>
365 Logical flows are expressed in an OVN-specific format, described here. A
366 logical datapath flow is much like an OpenFlow flow, except that the
367 flows are written in terms of logical ports and logical datapaths instead
368 of physical ports and physical datapaths. Translation between logical
369 and physical flows helps to ensure isolation between logical datapaths.
09986f8c
JP
370 (The logical flow abstraction also allows the OVN centralized
371 components to do less work, since they do not have to separately
372 compute and push out physical flows to each chassis.)
fe36184b
BP
373 </p>
374
375 <p>
376 The default action when no flow matches is to drop packets.
377 </p>
378
69a832cf 379 <p><em>Architectural Logical Life Cycle of a Packet</em></p>
5868eb24
BP
380
381 <p>
382 The following description focuses on the life cycle of a packet through
383 a logical datapath, ignoring physical details of the implementation.
69a832cf 384 Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
5868eb24
BP
385 <code>ovn-architecture</code>(7) for the physical information.
386 </p>
387
388 <p>
389 The description here is written as if OVN itself executes these steps,
390 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
391 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
392 </p>
393
394 <p>
395 At a high level, OVN passes each packet through the logical datapath's
396 logical ingress pipeline, which may output the packet to one or more
397 logical ports or logical multicast groups. For each such logical output
398 port, OVN passes the packet through the datapath's logical egress
399 pipeline, which may either drop the packet or deliver it to the
400 destination. Between the two pipelines, outputs to logical multicast
401 groups are expanded into logical ports, so that the egress pipeline only
402 processes a single logical output port at a time. Between the two
403 pipelines is also where, when necessary, OVN encapsulates a packet in a
404 tunnel (or tunnels) to transmit to remote hypervisors.
405 </p>
406
407 <p>
408 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
409 table for a row with the correct <ref column="logical_datapath"/>, a <ref
410 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
411 of 0, and a <ref column="match"/> that is true for the packet. If none
412 is found, OVN drops the packet. If OVN finds more than one, it chooses
413 the match with the highest <ref column="priority"/>. Then OVN executes
414 each of the actions specified in the row's <ref column="actions"/> column,
415 in the order specified. Some actions, such as those to modify packet
416 headers, require no further details. The <code>next</code> and
417 <code>output</code> actions are special.
418 </p>
419
420 <p>
421 The <code>next</code> action causes the above process to be repeated
422 recursively, except that OVN searches for <ref column="table_id"/> of 1
423 instead of 0. Similarly, any <code>next</code> action in a row found in
424 that table would cause a further search for a <ref column="table_id"/> of
425 2, and so on. When recursive processing completes, flow control returns
426 to the action following <code>next</code>.
427 </p>
428
429 <p>
430 The <code>output</code> action also introduces recursion. Its effect
431 depends on the current value of the <code>outport</code> field. Suppose
432 <code>outport</code> designates a logical port. First, OVN compares
433 <code>inport</code> to <code>outport</code>; if they are equal, it treats
bf143492
JP
434 the <code>output</code> as a no-op by default. In the common
435 case, where they are different, the packet enters the egress
436 pipeline. This transition to the egress pipeline discards
437 register data, e.g. <code>reg0</code> ... <code>reg9</code> and
438 connection tracking state, to achieve uniform behavior regardless
439 of whether the egress pipeline is on a different hypervisor
440 (because registers aren't preserved across tunnel encapsulation).
5868eb24
BP
441 </p>
442
443 <p>
444 To execute the egress pipeline, OVN again searches the <ref
445 table="Logical_Flow"/> table for a row with the correct <ref
446 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
447 column="match"/> that is true for the packet, but now looking for a <ref
448 column="pipeline"/> of <code>egress</code>. If no matching row is found,
449 the output becomes a no-op. Otherwise, OVN executes the actions for the
450 matching flow (which is chosen from multiple, if necessary, as already
451 described).
452 </p>
453
454 <p>
455 In the <code>egress</code> pipeline, the <code>next</code> action acts as
456 already described, except that it, of course, searches for
457 <code>egress</code> flows. The <code>output</code> action, however, now
458 directly outputs the packet to the output port (which is now fixed,
459 because <code>outport</code> is read-only within the egress pipeline).
460 </p>
461
462 <p>
463 The description earlier assumed that <code>outport</code> referred to a
464 logical port. If it instead designates a logical multicast group, then
465 the description above still applies, with the addition of fan-out from
466 the logical multicast group to each logical port in the group. For each
467 member of the group, OVN executes the logical pipeline as described, with
468 the logical output port replaced by the group member.
469 </p>
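<p>
The life cycle described above can be sketched in a few lines of Python.
This is a simplified illustration, not how <code>ovn-controller</code> is
implemented: it assumes a single logical datapath, models matches as Python
callables and actions as plain strings, and omits multicast fan-out. All
names in it are hypothetical.
</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]


@dataclass
class Flow:
    pipeline: str                     # "ingress" or "egress"
    table_id: int
    priority: int
    match: Callable[[Packet], bool]
    actions: List[str]                # "next", "output", or "field=value"


def lookup(flows: List[Flow], pipeline: str, table_id: int, pkt: Packet) -> Optional[Flow]:
    """Pick the highest-priority matching flow in one table, or None."""
    hits = [f for f in flows
            if f.pipeline == pipeline and f.table_id == table_id and f.match(pkt)]
    return max(hits, key=lambda f: f.priority, default=None)


def run_pipeline(flows: List[Flow], pipeline: str, table_id: int,
                 pkt: Packet, delivered: List[object]) -> None:
    flow = lookup(flows, pipeline, table_id, pkt)
    if flow is None:
        return                        # no matching flow: nothing happens
    for action in flow.actions:
        if action == "next":
            # Recurse into the next table, then resume after "next".
            run_pipeline(flows, pipeline, table_id + 1, pkt, delivered)
        elif action == "output" and pipeline == "ingress":
            if pkt["inport"] == pkt["outport"]:
                continue              # output to the input port is a no-op
            # Registers are discarded on the way into the egress pipeline.
            egress_pkt = {k: v for k, v in pkt.items() if not k.startswith("reg")}
            run_pipeline(flows, "egress", 0, egress_pkt, delivered)
        elif action == "output":
            delivered.append(pkt["outport"])   # egress: deliver the packet
        else:
            field, value = action.split("=", 1)
            pkt[field] = value        # simple header or metadata modification
```

<p>
Calling <code>run_pipeline(flows, "ingress", 0, pkt, delivered)</code>
appends to <code>delivered</code> each logical port that the packet reaches.
</p>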
470
8d6e5516
JP
471 <p><em>Pipeline Stages</em></p>
472
473 <p>
398be42b
BP
474 <code>ovn-northd</code> populates the <ref table="Logical_Flow"/> table
475 with the logical flows described in detail in <code>ovn-northd</code>(8).
8d6e5516
JP
476 </p>
477
747b2a45 478 <column name="logical_datapath">
5868eb24
BP
479 The logical datapath to which the logical flow belongs.
480 </column>
481
482 <column name="pipeline">
483 <p>
484 The primary flows used for deciding on a packet's destination are the
485 <code>ingress</code> flows. The <code>egress</code> flows implement
486 ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details.
487 </p>
747b2a45
BP
488 </column>
489
fe36184b
BP
490 <column name="table_id">
491 The stage in the logical pipeline, analogous to an OpenFlow table number.
492 </column>
493
494 <column name="priority">
495 The flow's priority. Flows with numerically higher priority take
496 precedence over those with lower. If two logical datapath flows with the
497 same priority both match, then the one actually applied to the packet is
498 undefined.
499 </column>
500
501 <column name="match">
502 <p>
503 A matching expression. OVN provides a superset of OpenFlow matching
504 capabilities, using a syntax similar to Boolean expressions in a
505 programming language.
506 </p>
507
508 <p>
fa6aeaeb
RB
509 The most important components of a match expression are
510 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
511 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
512 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
513 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
514 logical OR operator <code>||</code> can combine comparisons into a
515 larger expression.
fe36184b
BP
516 </p>
517
fe36184b 518 <p>
e0840f11
BP
519 Matching expressions also support parentheses for grouping, the logical
520 NOT prefix operator <code>!</code>, and literals <code>0</code> and
521 <code>1</code> to express ``false'' or ``true,'' respectively. The
522 latter is useful by itself as a catch-all expression that matches every
523 packet.
fe36184b
BP
524 </p>
525
e0840f11 526 <p><em>Symbols</em></p>
fe36184b
BP
527
528 <p>
fa6aeaeb
RB
529 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
530 type. Integer symbols have a <dfn>width</dfn> in bits.
fe36184b
BP
531 </p>
532
533 <p>
fa6aeaeb 534 <em>Kinds</em>. There are three kinds of symbols:
fe36184b
BP
535 </p>
536
e0840f11 537 <ul>
fa6aeaeb
RB
538 <li>
539 <p>
540 <dfn>Fields</dfn>. A field symbol represents a packet header or
541 metadata field. For example, a field
542 named <code>vlan.tci</code> might represent the VLAN TCI field in a
543 packet.
544 </p>
545
546 <p>
547 A field symbol can have integer or string type. Integer fields can
548 be nominal or ordinal (see <em>Level of Measurement</em>,
549 below).
550 </p>
551 </li>
552
553 <li>
554 <p>
555 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
556 a larger field. For example, a field <code>vlan.vid</code> might
557 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
558 are provided for syntactic convenience, because it is always
559 possible to instead refer to a subset of bits from a field
560 directly.
561 </p>
562
563 <p>
564 Only ordinal fields (see <em>Level of Measurement</em>,
565 below) may have subfields. Subfields are always ordinal.
566 </p>
567 </li>
568
569 <li>
570 <p>
571 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
572 expression. Predicates may be used much like 1-bit fields. For
573 example, <code>ip4</code> might expand to <code>eth.type ==
574 0x800</code>. Predicates are provided for syntactic convenience,
575 because it is always possible to instead specify the underlying
576 expression directly.
577 </p>
578
579 <p>
580 A predicate whose expansion refers to any nominal field or
581 predicate (see <em>Level of Measurement</em>, below) is nominal;
582 other predicates have Boolean level of measurement.
583 </p>
584 </li>
e0840f11
BP
585 </ul>
586
fe36184b 587 <p>
fa6aeaeb
RB
588 <em>Level of Measurement</em>. See
589 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
590 concept on which this classification is based. There are three
591 levels:
fe36184b
BP
592 </p>
593
594 <ul>
fa6aeaeb
RB
595 <li>
596 <p>
597 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
598 on a scale. OVN considers a field (or subfield) to be ordinal if
599 its bits can be examined individually. This is true for the
600 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
601 </p>
602
603 <p>
604 Any use of an ordinal field may specify a single bit or a range of
605 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
606 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
607 multicast bit in the Ethernet destination address.
608 </p>
609
610 <p>
611 OVN supports all the usual arithmetic relations (<code>==</code>,
612 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
613 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
614 their subfields, because OVN can implement these in OpenFlow and
615 Open vSwitch as collections of bitwise tests.
616 </p>
617 </li>
618
619 <li>
620 <p>
621 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
622 usefully compared except for equality. OpenFlow
623 port numbers, Ethernet types, and IP protocols are examples: all of
624 these are just identifiers assigned arbitrarily with no deeper
625 meaning. In OpenFlow and Open vSwitch, bits in these fields
626 generally aren't individually addressable.
627 </p>
628
629 <p>
630 OVN only supports arithmetic tests for equality on nominal fields,
631 because OpenFlow and Open vSwitch provide no way for a flow to
632 efficiently implement other comparisons on them. (A test for
633 inequality can be sort of built out of two flows with different
634 priorities, but OVN matching expressions always generate flows with
635 a single priority.)
636 </p>
637
638 <p>
639 String fields are always nominal.
640 </p>
641 </li>
642
643 <li>
644 <p>
645 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
646 and 1, is somewhat exceptional, since it is easy to support both
647 equality and inequality tests on such a field: either one can be
648 implemented as a test for 0 or 1.
649 </p>
650
651 <p>
652 Only predicates (see above) have a Boolean level of measurement.
653 </p>
654
655 <p>
656 This isn't a standard level of measurement.
657 </p>
658 </li>
fe36184b
BP
659 </ul>
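<p>
The bit-range notation used for ordinal fields can be illustrated with
ordinary integer arithmetic. This is a sketch only; the helper name
<code>subfield</code> is hypothetical, and fields are assumed to be held as
Python integers with bit 0 as the least-significant bit.
</p>

```python
from typing import Optional


def subfield(value: int, lo: int, hi: Optional[int] = None) -> int:
    """Extract bits lo..hi (inclusive) of value, like field[lo..hi].

    With hi omitted, extract the single bit field[lo].
    """
    if hi is None:
        hi = lo
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


tci = 0xA00A                       # example VLAN TCI: PCP 5, VID 10
pcp = subfield(tci, 13, 15)        # vlan.tci[13..15]
vid = subfield(tci, 0, 11)         # vlan.tci[0..11]

eth_dst = 0x01005E000001           # 01:00:5e:00:00:01 as a 48-bit integer
is_multicast = subfield(eth_dst, 40)   # eth.dst[40]
```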
660
661 <p>
fa6aeaeb
RB
662 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
663 additional conditions implied by the use of the symbol. For example,
664 the <code>icmp4.type</code> symbol might have prerequisite
665 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
666 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
667 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
668 &amp;&amp; eth.type == 0x800 &amp;&amp; ip4.proto == 1</code> (assuming
669 <code>icmp4</code> is a predicate defined as suggested under
670 <em>Types</em> above).
fe36184b
BP
671 </p>
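<p>
The expansion in the example above can be mimicked with two lookup tables.
This is a toy sketch operating on strings, not the real expression parser;
the table contents mirror the hypothetical <code>icmp4</code> example, and
the function name is an assumption.
</p>

```python
# Symbol -> prerequisite predicate (hypothetical, following the example above).
PREREQUISITES = {"icmp4.type": "icmp4"}

# Predicate -> expansion (hypothetical, following the example above).
PREDICATES = {"icmp4": "eth.type == 0x800 && ip4.proto == 1"}


def add_prerequisites(comparison: str) -> str:
    """Append the expanded prerequisite of the symbol that starts comparison."""
    symbol = comparison.split()[0]
    prereq = PREREQUISITES.get(symbol)
    if prereq is None:
        return comparison             # symbol has no prerequisite
    return f"{comparison} && {PREDICATES.get(prereq, prereq)}"
```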
672
e0840f11
BP
673 <p><em>Relational operators</em></p>
674
fe36184b 675 <p>
fa6aeaeb
RB
676 All of the standard relational operators <code>==</code>,
677 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
678 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
679 fields support only <code>==</code> and <code>!=</code>, and only in a
680 positive sense when outer <code>!</code> are taken into account,
681 e.g. given string field <code>inport</code>, <code>inport ==
682 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
683 not <code>inport != "eth0"</code>.
fe36184b
BP
684 </p>
685
686 <p>
fa6aeaeb
RB
687 The implementation of <code>==</code> (or <code>!=</code> when it is
688 negated) is more efficient than that of the other relational
689 operators.
fe36184b
BP
690 </p>
691
e0840f11
BP
692 <p><em>Constants</em></p>
693
fe36184b 694 <p>
e0840f11
BP
695 Integer constants may be expressed in decimal, hexadecimal prefixed by
696 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
697 their standard forms, or Ethernet addresses as colon-separated hex
698 digits. A constant in any of these forms may be followed by a slash
699 and a second constant (the mask) in the same form, to form a masked
700 constant. IPv4 and IPv6 masks may be given as integers, to express
701 CIDR prefixes.
702 </p>
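<p>
To see how a CIDR prefix relates to an explicit mask, the standard Python
<code>ipaddress</code> module can be used. This is only an illustration of
the equivalence, not part of OVN; the helper names are hypothetical.
</p>

```python
import ipaddress


def prefix_to_mask(masked: str) -> str:
    """Rewrite an address/prefix-length constant as address/mask."""
    return ipaddress.ip_network(masked).with_netmask


def matches(address: str, masked: str) -> bool:
    """True if address falls within the masked constant."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(masked)
```

<p>
For instance, <code>prefix_to_mask("192.168.0.0/24")</code> yields the
explicit-mask form of the same constant.
</p>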
703
704 <p>
705 String constants have the same syntax as quoted strings in JSON (thus,
5868eb24 706 they are Unicode strings).
fe36184b
BP
707 </p>
708
709 <p>
e0840f11
BP
710 Some operators support sets of constants written inside curly braces
711 <code>{</code> ... <code>}</code>. Commas between elements of a set,
712 and after the last element, are optional. With <code>==</code>,
713 ``<code><var>field</var> == { <var>constant1</var>,
714 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
715 for ``<code><var>field</var> == <var>constant1</var> ||
716 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
717 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
718 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
719 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
fe36184b 720 <var>field</var> != <var>constant2</var> &amp;&amp;
e0840f11 721 </code>...<code></code>''.
fe36184b
BP
722 </p>
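<p>
The set shorthand described above can be expanded mechanically. The
following sketch builds the expanded expression as a string; the function
name is hypothetical and no validation of the field or constants is
attempted.
</p>

```python
from typing import Iterable


def expand_set(field: str, op: str, constants: Iterable[str]) -> str:
    """Expand `field op { c1, c2, ... }` into its long form.

    "==" joins the per-constant comparisons with "||"; "!=" joins with "&&".
    """
    if op not in ("==", "!="):
        raise ValueError("sets are only expanded here for == and !=")
    joiner = " || " if op == "==" else " && "
    return joiner.join(f"{field} {op} {c}" for c in constants)
```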
723
ea382567
RB
724 <p>
725 You may refer to a set of IPv4, IPv6, or MAC addresses stored in the
726 <ref table="Address_Set"/> table by its <ref column="name"
727 table="Address_Set"/>. An <ref table="Address_Set"/> with a name
728 of <code>set1</code> can be referred to as
729 <code>$set1</code>.
730 </p>
731
e0840f11
BP
732 <p><em>Miscellaneous</em></p>
733
fe36184b 734 <p>
fa6aeaeb
RB
735 Comparisons may name the symbol or the constant first,
736 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
737 acceptable.
fe36184b
BP
738 </p>
739
740 <p>
fa6aeaeb
RB
741 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
742 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
743 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
fe36184b
BP
744 </p>
745
746 <p>
fa6aeaeb
RB
747 For a one-bit field or predicate, a mention of its name is equivalent
748 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
749 is equivalent to <code>vlan.present == 1</code>. The same is true for
750 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
751 technical limitation to implementing the same for ordinal fields of all
752 widths, but the implementation is expensive enough that the syntax
753 parser requires writing an explicit comparison against zero to make
754 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
755 against 0 is required.
fe36184b
BP
756 </p>
757
758 <p>
fa6aeaeb
RB
759 <em>Operator precedence</em> is as shown below, from highest to lowest.
760 There are two exceptions where parentheses are required even though the
761 table would suggest that they are not: <code>&amp;&amp;</code> and
762 <code>||</code> require parentheses when used together, and
763 <code>!</code> requires parentheses when applied to a relational
764 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
765 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
766 parentheses are mandatory.
fe36184b
BP
767 </p>
768
e0840f11
BP
769 <ul>
770 <li><code>()</code></li>
771 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
772 <li><code>!</code></li>
773 <li><code>&amp;&amp; ||</code></li>
774 </ul>
775
10b1662b
BP
776 <p>
777 <em>Comments</em> may be introduced by <code>//</code>, which extends
778 to the next new-line. Comments within a line may be bracketed by
779 <code>/*</code> and <code>*/</code>. Multiline comments are not
780 supported.
781 </p>
782
e0840f11
BP
783 <p><em>Symbols</em></p>
784
5868eb24
BP
785 <p>
786 Most of the symbols below have integer type. Only <code>inport</code>
787 and <code>outport</code> have string type. <code>inport</code> names a
788 logical port. Thus, its value is a <ref column="logical_port"/> name
62fdd819
AW
789 from the <ref table="Port_Binding"/> table. <code>outport</code> may
790 name a logical port, as <code>inport</code>, or a logical multicast
791 group defined in the <ref table="Multicast_Group"/> table. For both
792 symbols, only names within the flow's logical datapath may be used.
5868eb24
BP
793 </p>
794
394e883d
JP
795 <p>
796 The <code>reg</code><var>X</var> symbols are 32-bit integers.
797 The <code>xxreg</code><var>X</var> symbols are 128-bit integers,
798 which overlay four of the 32-bit registers: <code>xxreg0</code>
799 overlays <code>reg0</code> through <code>reg3</code>, with
800 <code>reg0</code> supplying the most-significant bits of
801 <code>xxreg0</code> and <code>reg3</code> the least-significant.
802 <code>xxreg1</code> similarly overlays <code>reg4</code> through
803 <code>reg7</code>.
804 </p>
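 <p>
 The overlay can be illustrated with a minimal Python sketch. The
 helper names <code>xxreg_from_regs</code> and
 <code>regs_from_xxreg</code> are hypothetical and exist only for
 illustration; they are not part of OVN.
 </p>

```python
WORD = 2**32  # one 32-bit register holds values in range(WORD)


def xxreg_from_regs(regs):
    """Combine four 32-bit values into one 128-bit value.

    regs[0] supplies the most-significant 32 bits and regs[3] the
    least-significant, mirroring how reg0..reg3 overlay xxreg0.
    """
    if len(regs) != 4:
        raise ValueError("expected exactly four registers")
    value = 0
    for reg in regs:
        if reg not in range(WORD):
            raise ValueError("register value does not fit in 32 bits")
        value = value * WORD + reg
    return value


def regs_from_xxreg(value):
    """Split a 128-bit value back into four 32-bit values, reg0 first."""
    if value not in range(WORD**4):
        raise ValueError("value does not fit in 128 bits")
    return [value // WORD ** (3 - i) % WORD for i in range(4)]


# Writing 1 to reg3 sets the lowest bit of xxreg0; writing 1 to reg0
# sets bit 96 of xxreg0.
assert xxreg_from_regs([0, 0, 0, 1]) == 1
assert xxreg_from_regs([1, 0, 0, 0]) == 2**96
```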
805
e0840f11 806 <ul>
cc5e28d8 807 <li><code>reg0</code>...<code>reg9</code></li>
394e883d 808 <li><code>xxreg0</code> <code>xxreg1</code></li>
5868eb24 809 <li><code>inport</code> <code>outport</code></li>
bf143492 810 <li><code>flags.loopback</code></li>
e0840f11
BP
811 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
812 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
813 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
814 <li><code>ip4.src</code> <code>ip4.dst</code></li>
815 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
816 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
817 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
818 <li><code>udp.src</code> <code>udp.dst</code></li>
819 <li><code>sctp.src</code> <code>sctp.dst</code></li>
820 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
821 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
822 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
e3d81ade 823 <li><code>ct_mark</code> <code>ct_label</code></li>
78aab811
JP
824 <li>
825 <p>
826 <code>ct_state</code>, which has the following Boolean subfields:
827 </p>
828 <ul>
829 <li><code>ct.new</code>: True for a new flow</li>
830 <li><code>ct.est</code>: True for an established flow</li>
831 <li><code>ct.rel</code>: True for a related flow</li>
832 <li><code>ct.rpl</code>: True for a reply flow</li>
833 <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
834 </ul>
835 <p>
836 <code>ct_state</code> and its subfields are initialized by the
837 <code>ct_next</code> action, described below.
838 </p>
839 </li>
e0840f11
BP
840 </ul>
841
25030d47
RB
842 <p>
843 The following predicates are supported:
844 </p>
845
846 <ul>
a2011117
BP
847 <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
848 <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
25030d47
RB
849 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
850 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
a2011117 851 <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
25030d47
RB
852 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
853 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
854 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
855 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
856 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
857 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
858 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
859 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
860 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
acdd9220
JP
861 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
862 <li><code>nd_ns</code> expands to <code>icmp6.type == 135 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
863 <li><code>nd_na</code> expands to <code>icmp6.type == 136 &amp;&amp; icmp6.code == 0 &amp;&amp; ip.ttl == 255</code></li>
25030d47
RB
864 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
865 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
866 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
867 </ul>
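 <p>
 Several of the predicates above test individual bits, with bit 0
 being the least-significant bit of the field. As a rough
 illustration, the Python sketch below evaluates
 <code>eth.mcast</code> and <code>ip4.mcast</code> by hand. The
 helper names are hypothetical and are not part of OVN.
 </p>

```python
def subfield(value, lo, hi=None):
    """Return bits lo..hi (inclusive) of value; bit 0 is least significant."""
    if hi is None:
        hi = lo
    width = hi - lo + 1
    return value // 2**lo % 2**width


def mac_to_int(mac):
    """Parse aa:bb:cc:dd:ee:ff into a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def ip4_to_int(addr):
    """Parse a dotted-quad IPv4 address into a 32-bit integer."""
    return int.from_bytes(bytes(int(part) for part in addr.split(".")), "big")


def eth_mcast(eth_dst):
    # eth.mcast expands to eth.dst[40]
    return subfield(mac_to_int(eth_dst), 40) == 1


def ip4_mcast(ip4_dst):
    # ip4.mcast expands to ip4.dst[28..31] == 0xe
    return subfield(ip4_to_int(ip4_dst), 28, 31) == 0xE
```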
fe36184b
BP
868 </column>
869
870 <column name="actions">
871 <p>
2cd87fce
RB
872 Logical datapath actions, to be executed when the logical flow
873 represented by this row is the highest-priority match.
fe36184b
BP
874 </p>
875
35060cdc 876 <p>
2cd87fce
RB
877 Actions share lexical syntax with the <ref column="match"/> column. An
878 empty set of actions (or one that contains just white space or
879 comments), or a set of actions that consists of just
880 <code>drop;</code>, causes the matched packets to be dropped.
881 Otherwise, the column should contain a sequence of actions, each
882 terminated by a semicolon.
35060cdc 883 </p>
fe36184b 884
35060cdc 885 <p>
eee7a8ed 886 The following actions are defined:
35060cdc 887 </p>
fe36184b 888
35060cdc
BP
889 <dl>
890 <dt><code>output;</code></dt>
891 <dd>
5868eb24 892 <p>
eee7a8ed
JP
893 In the ingress pipeline, this action executes the
894 <code>egress</code> pipeline as a subroutine. If
895 <code>outport</code> names a logical port, the egress pipeline
896 executes once; if it is a multicast group, the egress pipeline runs
897 once for each logical port in the group.
5868eb24
BP
898 </p>
899
900 <p>
901 In the egress pipeline, this action performs the actual
902 output to the <code>outport</code> logical port. (In the egress
903 pipeline, <code>outport</code> never names a multicast group.)
904 </p>
905
906 <p>
bf143492
JP
907 By default, output to the input port is implicitly dropped,
908 that is, <code>output</code> becomes a no-op if
909 <code>outport</code> == <code>inport</code>. Occasionally
910 it may be useful to override this behavior, e.g. to send an
911 ARP reply to an ARP request; to do so, use
912 <code>flags.loopback = 1</code> to allow the packet to
913 "hair-pin" back to the input port.
5868eb24 914 </p>
eee7a8ed 915 </dd>
fe36184b 916
35060cdc 917 <dt><code>next;</code></dt>
558ec83d 918 <dt><code>next(<var>table</var>);</code></dt>
35060cdc 919 <dd>
558ec83d
BP
920 Executes another logical datapath table as a subroutine. By default,
921 the table after the current one is executed. Specify
922 <var>table</var> to jump to a specific table in the same pipeline.
2cd87fce 923 </dd>
fe36184b 924
35060cdc
BP
925 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
926 <dd>
5868eb24 927 <p>
5ee054fb
BP
928 Sets data or metadata field <var>field</var> to constant value
929 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
930 logical output port. To set only a subset of bits in a field,
931 specify a subfield for <var>field</var> or a masked
932 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
933 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
934 the VLAN PCP.
5868eb24
BP
935 </p>
936
937 <p>
938 Assigning to a field with prerequisites implicitly adds those
939 prerequisites to <ref column="match"/>; thus, for example, a flow
940 that sets <code>tcp.dst</code> applies only to TCP flows,
941 regardless of whether its <ref column="match"/> mentions any TCP
942 field.
943 </p>
944
945 <p>
946 Not all fields are modifiable (e.g. <code>eth.type</code> and
947 <code>ip.proto</code> are read-only), and not all modifiable fields
948 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
949 as a whole). The <code>outport</code> field is modifiable in the
950 <code>ingress</code> pipeline but not in the <code>egress</code>
951 pipeline.
952 </p>
eee7a8ed 953 </dd>
5ee054fb
BP
954
955 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
956 <dd>
957 <p>
958 Sets data or metadata field <var>field1</var> to the value of data
959 or metadata field <var>field2</var>, e.g. <code>reg0 =
960 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
961 To modify only a subset of a field's bits, specify a subfield for
962 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
963 = reg0[0..2];</code> copies the least-significant bits of
964 <code>reg0</code> into the VLAN PCP.
965 </p>
966
967 <p>
968 <var>field1</var> and <var>field2</var> must be the same type,
969 either both string or both integer fields. If they are both
970 integer fields, they must have the same width.
971 </p>
972
973 <p>
974 If <var>field1</var> or <var>field2</var> has prerequisites, they
975 are added implicitly to <ref column="match"/>. It is possible to
976 write an assignment with contradictory prerequisites, such as
977 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
978 that a logical flow with such an assignment will never be matched.
979 </p>
980 </dd>
a20c96c6
BP
981
982 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
983 <dd>
984 <p>
985 Similar to <code><var>field1</var> = <var>field2</var>;</code>
986 except that the two values are exchanged instead of copied. Both
987 <var>field1</var> and <var>field2</var> must be modifiable.
988 </p>
989 </dd>
78aab811 990
00ea19e4
BP
991 <dt><code>ip.ttl--;</code></dt>
992 <dd>
993 <p>
994 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
995 or negative, then processing of the packet halts; no further
996 actions are processed. (To properly handle such cases, a
4c20b9f2
JP
997 higher-priority flow should match on
998 <code>ip.ttl == {0, 1};</code>.)
00ea19e4
BP
999 </p>
1000
1001 <p><b>Prerequisite:</b> <code>ip</code></p>
1002 </dd>
1003
78aab811
JP
1004 <dt><code>ct_next;</code></dt>
1005 <dd>
1006 <p>
1007 Apply connection tracking to the flow, initializing
1008 <code>ct_state</code> for matching in later tables.
1009 Automatically moves on to the next table, as if followed by
1010 <code>next</code>.
1011 </p>
1012
1013 <p>
1014 As a side effect, IP fragments will be reassembled for matching.
1015 If a fragmented packet is output, then it will be sent with any
1016 overlapping fragments squashed. The connection tracking state is
1017 scoped by the logical port, so overlapping addresses may be used.
1018 To allow traffic related to the matched flow, execute
1019 <code>ct_commit</code>.
1020 </p>
1021
1022 <p>
1023 It is possible to have actions follow <code>ct_next</code>,
1024 but they will not have access to any of its side effects, so doing
1025 so is not generally useful.
1026 </p>
1027 </dd>
1028
1029 <dt><code>ct_commit;</code></dt>
a9e1b66f
RB
1030 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>);</code></dt>
1031 <dt><code>ct_commit(ct_label=<var>value[/mask]</var>);</code></dt>
1032 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>, ct_label=<var>value[/mask]</var>);</code></dt>
78aab811 1033 <dd>
c4623bb8 1034 <p>
a9e1b66f
RB
1035 Commit the flow to the connection tracking entry associated with it
1036 by a previous call to <code>ct_next</code>. When
1037 <code>ct_mark=<var>value[/mask]</var></code> and/or
1038 <code>ct_label=<var>value[/mask]</var></code> are supplied,
1039 <code>ct_mark</code> and/or <code>ct_label</code> will be set to the
1040 values indicated by <var>value[/mask]</var> on the connection
1041 tracking entry. <code>ct_mark</code> is a 32-bit field.
354b8f27
NS
1042 <code>ct_label</code> is a 128-bit field. The <var>value[/mask]</var>
1043 should be specified as a hex string if more than 64 bits are to be used.
c4623bb8 1044 </p>
a9e1b66f 1045
c4623bb8
RB
1046 <p>
1047 Note that if you want processing to continue in the next table,
1048 you must execute the <code>next</code> action after
a9e1b66f
RB
1049 <code>ct_commit</code>. You may also leave out <code>next</code>
1050 which will commit connection tracking state, and then drop the
1051 packet. This could be useful for setting <code>ct_mark</code>
1052 on a connection tracking entry before dropping a packet,
1053 for example.
c4623bb8 1054 </p>
78aab811 1055 </dd>
fe36184b 1056
de297547
GS
1057 <dt><code>ct_dnat;</code></dt>
1058 <dt><code>ct_dnat(<var>IP</var>);</code></dt>
1059 <dd>
1060 <p>
1061 <code>ct_dnat</code> sends the packet through the DNAT zone in the
1062 connection tracking table to unDNAT any packet that was DNATed in
1063 the opposite direction. The packet is then automatically sent
1064 to the next tables as if followed by a <code>next;</code> action.
1065 The next tables will see the changes in the packet caused by
1066 the connection tracker.
1067 </p>
1068 <p>
1069 <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
1070 DNAT zone to change the destination IP address of the packet to
467085fd 1071 the one provided inside the parentheses and commits the connection.
de297547
GS
1072 The packet is then automatically sent to the next tables as if
1073 followed by a <code>next;</code> action. The next tables will see
1074 the changes in the packet caused by the connection tracker.
1075 </p>
1076 </dd>
1077
1078 <dt><code>ct_snat;</code></dt>
1079 <dt><code>ct_snat(<var>IP</var>);</code></dt>
1080 <dd>
1081 <p>
1082 <code>ct_snat</code> sends the packet through the SNAT zone to
1083 unSNAT any packet that was SNATed in the opposite direction. If
1084 the packet needs to be sent to the next tables, then it should be
1085 followed by a <code>next;</code> action. The next tables will not
1086 see the changes in the packet caused by the connection tracker.
1087 </p>
1088 <p>
1089 <code>ct_snat(<var>IP</var>)</code> sends the packet through the
1090 SNAT zone to change the source IP address of the packet to
1091 the one provided inside the parentheses and commits the connection.
1092 The packet is then automatically sent to the next tables as if
1093 followed by a <code>next;</code> action. The next tables will see the
1094 changes in the packet caused by the connection tracker.
1095 </p>
1096 </dd>
1097
69a832cf
BP
1098 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
1099 <dd>
1100 <p>
1101 Temporarily replaces the IPv4 packet being processed by an ARP
1102 packet and executes each nested <var>action</var> on the ARP
1103 packet. Actions following the <code>arp</code> action, if any, apply
1104 to the original, unmodified packet.
1105 </p>
1106
1107 <p>
1108 The ARP packet that this action operates on is initialized based on
1109 the IPv4 packet being processed, as follows. These are default
1110 values that the nested actions will probably want to change:
1111 </p>
1112
1113 <ul>
1114 <li><code>eth.src</code> unchanged</li>
1115 <li><code>eth.dst</code> unchanged</li>
1116 <li><code>eth.type = 0x0806</code></li>
1117 <li><code>arp.op = 1</code> (ARP request)</li>
1118 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
1119 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
1120 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
1121 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
1122 </ul>
1123
6335d074
BP
1124 <p>
1125 The ARP packet has the same VLAN header, if any, as the IP packet
1126 it replaces.
1127 </p>
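 <p>
 The default initialization listed above can be sketched in Python
 as follows. Representing a packet as a dictionary keyed by field
 name is an assumption made for illustration, and
 <code>arp_from_ip4</code> is a hypothetical helper, not an OVN
 function.
 </p>

```python
def arp_from_ip4(pkt):
    """Build the default ARP packet for an IPv4 packet given as a dict."""
    arp = {
        "eth.src": pkt["eth.src"],       # unchanged
        "eth.dst": pkt["eth.dst"],       # unchanged
        "eth.type": 0x0806,
        "arp.op": 1,                     # ARP request
        "arp.sha": pkt["eth.src"],       # copied from eth.src
        "arp.spa": pkt["ip4.src"],       # copied from ip4.src
        "arp.tha": "00:00:00:00:00:00",
        "arp.tpa": pkt["ip4.dst"],       # copied from ip4.dst
    }
    if "vlan.tci" in pkt:
        # The ARP packet keeps the VLAN header of the IP packet, if any.
        arp["vlan.tci"] = pkt["vlan.tci"]
    return arp
```

 <p>
 Nested actions would then typically overwrite some of these
 defaults before the ARP packet is output.
 </p>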
1128
69a832cf
BP
1129 <p><b>Prerequisite:</b> <code>ip4</code></p>
1130 </dd>
1131
c34a87b6
JP
1132 <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt>
1133
1134 <dd>
1135 <p>
1136 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1137 IP address field <var>A</var>.
1138 </p>
1139
1140 <p>
1141 Looks up <var>A</var> in <var>P</var>'s mac binding table.
1142 If an entry is found, stores its Ethernet address in
1143 <code>eth.dst</code>, otherwise stores
1144 <code>00:00:00:00:00:00</code> in <code>eth.dst</code>.
1145 </p>
1146
1147 <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p>
1148 </dd>
1149
1150 <dt>
1151 <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code>
1152 </dt>
1153
1154 <dd>
1155 <p>
1156 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1157 IP address field <var>A</var>, 48-bit Ethernet address field
1158 <var>E</var>.
1159 </p>
1160
1161 <p>
1162 Adds or updates the entry for IP address <var>A</var> in
1163 logical port <var>P</var>'s mac binding table, setting its
1164 Ethernet address to <var>E</var>.
1165 </p>
1166
1167 <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p>
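 <p>
 The lookup and update semantics of <code>get_arp</code> and
 <code>put_arp</code> can be modeled with a small Python sketch.
 The <code>MacBindings</code> class is hypothetical and only
 mirrors the behavior described here; it does not reflect how OVN
 stores bindings.
 </p>

```python
ZERO_MAC = "00:00:00:00:00:00"


class MacBindings:
    """Toy model of per-port MAC binding tables."""

    def __init__(self):
        self._table = {}

    def put(self, port, addr, mac):
        """Like put_arp(P, A, E): add or update the entry for addr on port."""
        self._table[(port, addr)] = mac

    def get(self, port, addr):
        """Like get_arp(P, A): the bound MAC, or all zeros if none exists."""
        return self._table.get((port, addr), ZERO_MAC)


bindings = MacBindings()
bindings.put("lrp0", "10.0.0.5", "0a:00:00:00:00:05")
eth_dst = bindings.get("lrp0", "10.0.0.5")      # found
unknown = bindings.get("lrp0", "10.0.0.99")     # falls back to ZERO_MAC
```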
1168 </dd>
1169
e75451fe 1170 <dt>
f8a8db39 1171 <code>nd_na { <var>action</var>; </code>...<code> };</code>
e75451fe
ZKL
1172 </dt>
1173
1174 <dd>
1175 <p>
f8a8db39
JP
1176 Temporarily replaces the IPv6 neighbor solicitation packet
1177 being processed by an IPv6 neighbor advertisement (NA)
1178 packet and executes each nested <var>action</var> on the NA
1179 packet. Actions following the <code>nd_na</code> action,
1180 if any, apply to the original, unmodified packet.
e75451fe
ZKL
1181 </p>
1182
1183 <p>
1184 The NA packet that this action operates on is initialized based on
1185 the IPv6 packet being processed, as follows. These are default
1186 values that the nested actions will probably want to change:
1187 </p>
1188
1189 <ul>
1190 <li><code>eth.dst</code> exchanged with <code>eth.src</code></li>
1191 <li><code>eth.type = 0x86dd</code></li>
1192 <li><code>ip6.dst</code> copied from <code>ip6.src</code></li>
1193 <li><code>ip6.src</code> copied from <code>nd.target</code></li>
1194 <li><code>icmp6.type = 136</code> (Neighbor Advertisement)</li>
1195 <li><code>nd.target</code> unchanged</li>
1196 <li><code>nd.sll = 00:00:00:00:00:00</code></li>
1197 <li><code>nd.tll</code> copied from <code>eth.dst</code></li>
1198 </ul>
1199
1200 <p>
1201 The ND packet has the same VLAN header, if any, as the IPv6 packet
1202 it replaces.
1203 </p>
1204
1205 <p>
f8a8db39 1206 <b>Prerequisite:</b> <code>nd_ns</code>
e75451fe
ZKL
1207 </p>
1208 </dd>
1209
c34a87b6 1210 <dt><code>get_nd(<var>P</var>, <var>A</var>);</code></dt>
0bac7164
BP
1211
1212 <dd>
1213 <p>
c34a87b6
JP
1214 <b>Parameters</b>: logical port string field <var>P</var>, 128-bit
1215 IPv6 address field <var>A</var>.
0bac7164
BP
1216 </p>
1217
1218 <p>
c34a87b6
JP
1219 Looks up <var>A</var> in <var>P</var>'s mac binding table.
1220 If an entry is found, stores its Ethernet address in
1221 <code>eth.dst</code>, otherwise stores
1222 <code>00:00:00:00:00:00</code> in <code>eth.dst</code>.
0bac7164
BP
1223 </p>
1224
c34a87b6 1225 <p><b>Example:</b> <code>get_nd(outport, ip6.dst);</code></p>
0bac7164
BP
1226 </dd>
1227
1228 <dt>
c34a87b6 1229 <code>put_nd(<var>P</var>, <var>A</var>, <var>E</var>);</code>
0bac7164
BP
1230 </dt>
1231
1232 <dd>
1233 <p>
c34a87b6
JP
1234 <b>Parameters</b>: logical port string field <var>P</var>,
1235 128-bit IPv6 address field <var>A</var>, 48-bit Ethernet
1236 address field <var>E</var>.
0bac7164
BP
1237 </p>
1238
1239 <p>
c34a87b6
JP
1240 Adds or updates the entry for IPv6 address <var>A</var> in
1241 logical port <var>P</var>'s mac binding table, setting its
1242 Ethernet address to <var>E</var>.
0bac7164
BP
1243 </p>
1244
c34a87b6 1245 <p><b>Example:</b> <code>put_nd(inport, nd.target, nd.tll);</code></p>
0bac7164 1246 </dd>
42814145
NS
1247
1248 <dt>
1249 <code><var>R</var> = put_dhcp_opts(<code>offerip</code> = <var>IP</var>, <var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code>
1250 </dt>
1251
1252 <dd>
1253 <p>
1254 <b>Parameters</b>: one or more DHCP option/value pairs, the first
1255 of which must set a value for the offered IP, <code>offerip</code>.
1256 </p>
1257
1258 <p>
1259 <b>Result</b>: stored to a 1-bit subfield <var>R</var>.
1260 </p>
1261
1262 <p>
1263 Valid only in the ingress pipeline.
1264 </p>
1265
1266 <p>
1267 When this action is applied to a DHCP request packet (DHCPDISCOVER
1268 or DHCPREQUEST), it changes the packet into a DHCP reply (DHCPOFFER
1269 or DHCPACK, respectively), replaces the options by those specified
1270 as parameters, and stores 1 in <var>R</var>.
1271 </p>
1272
1273 <p>
1274 When this action is applied to a non-DHCP packet or a DHCP packet
1275 that is not DHCPDISCOVER or DHCPREQUEST, it leaves the packet
1276 unchanged and stores 0 in <var>R</var>.
1277 </p>
1278
1279 <p>
1280 The contents of the <ref table="DHCP_Option"/> table control the
1281 DHCP option names and values that this action supports.
1282 </p>
1283
1284 <p>
1285 <b>Example:</b>
1286 <code>
1287 reg0[0] = put_dhcp_opts(offerip = 10.0.0.2, router = 10.0.0.1,
1288 netmask = 255.255.255.0, dns_server = {8.8.8.8, 7.7.7.7});
1289 </code>
1290 </p>
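 <p>
 The request-to-reply mapping and the value stored in
 <var>R</var> can be summarized by the following Python sketch.
 The dictionary keys <code>dhcp.type</code> and
 <code>dhcp.options</code> are assumptions made for illustration,
 and the function only models the behavior described above.
 </p>

```python
REPLY_FOR = {"DHCPDISCOVER": "DHCPOFFER", "DHCPREQUEST": "DHCPACK"}


def put_dhcp_opts(pkt, offerip, **options):
    """Return (packet, R) following the rules described above.

    pkt is a dict; a missing "dhcp.type" key stands for a non-DHCP packet.
    """
    reply_type = REPLY_FOR.get(pkt.get("dhcp.type"))
    if reply_type is None:
        return pkt, 0  # packet left unchanged, R = 0
    reply = dict(pkt)
    reply["dhcp.type"] = reply_type
    # The options of the request are replaced, not merged.
    reply["dhcp.options"] = {"offerip": offerip, **options}
    return reply, 1
```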
1291 </dd>
467085fd 1292
01cfdb2f
NS
1293 <dt>
1294 <code><var>R</var> = put_dhcpv6_opts(<var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code>
1295 </dt>
1296
1297 <dd>
1298 <p>
1299 <b>Parameters</b>: one or more DHCPv6 option/value pairs.
1300 </p>
1301
1302 <p>
1303 <b>Result</b>: stored to a 1-bit subfield <var>R</var>.
1304 </p>
1305
1306 <p>
1307 Valid only in the ingress pipeline.
1308 </p>
1309
1310 <p>
1311 When this action is applied to a DHCPv6 request packet, it changes
1312 the packet into a DHCPv6 reply, replaces the options by those
1313 specified as parameters, and stores 1 in <var>R</var>.
1314 </p>
1315
1316 <p>
1317 When this action is applied to a non-DHCPv6 packet or an invalid
1318 DHCPv6 request packet, it leaves the packet unchanged and stores
1319 0 in <var>R</var>.
1320 </p>
1321
1322 <p>
1323 The contents of the <ref table="DHCPv6_Options"/> table control the
1324 DHCPv6 option names and values that this action supports.
1325 </p>
1326
1327 <p>
1328 <b>Example:</b>
1329 <code>
1330 reg0[3] = put_dhcpv6_opts(ia_addr = aef0::4, server_id = 00:00:00:00:10:02,
1331 dns_server={ae70::1,ae70::2});
1332 </code>
1333 </p>
1334 </dd>
1335
467085fd
GS
1336 <dt><code>ct_lb;</code></dt>
1337 <dt><code>ct_lb(</code><var>ip</var>[<code>:</code><var>port</var>]...<code>);</code></dt>
1338 <dd>
1339 <p>
1340 With one or more arguments, <code>ct_lb</code> commits the packet
1341 to the connection tracking table and DNATs the packet's destination
1342 IP address (and port) to the IP address or addresses (and optional
1343 ports) specified in the string. If multiple comma-separated IP
1344 addresses are specified, each is given equal weight for picking the
1345 DNAT address. Processing automatically moves on to the next table,
1346 as if <code>next;</code> were specified, and later tables act on
1347 the packet as modified by the connection tracker. Connection
1348 tracking state is scoped by the logical port, so overlapping
1349 addresses may be used.
1350 </p>
1351 <p>
1352 Without arguments, <code>ct_lb</code> sends the packet to the
1353 connection tracking table to NAT the packets. If the packet is
1354 part of an established connection that was previously committed to
1355 the connection tracker via <code>ct_lb(</code>...<code>)</code>, it
1356 will automatically get DNATed to the same IP address as the first
1357 packet in that connection.
1358 </p>
1359 </dd>
6335d074
BP
1360 </dl>
1361
1362 <p>
1363 The following actions will likely be useful later, but they have not
1364 been thought out carefully.
1365 </p>
1366
1367 <dl>
69a832cf
BP
1368 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
1369 <dd>
1370 <p>
1371 Temporarily replaces the IPv4 packet being processed by an ICMPv4
1372 packet and executes each nested <var>action</var> on the ICMPv4
1373 packet. Actions following the <code>icmp4</code> action, if any,
1374 apply to the original, unmodified packet.
1375 </p>
1376
1377 <p>
1378 The ICMPv4 packet that this action operates on is initialized based
1379 on the IPv4 packet being processed, as follows. These are default
1380 values that the nested actions will probably want to change.
1381 Ethernet and IPv4 fields not listed here are not changed:
1382 </p>
1383
1384 <ul>
1385 <li><code>ip.proto = 1</code> (ICMPv4)</li>
1386 <li><code>ip.frag = 0</code> (not a fragment)</li>
1387 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
1388 <li><code>icmp4.code = 1</code> (host unreachable)</li>
1389 </ul>
1390
1391 <p>
1392 Details TBD.
1393 </p>
fe36184b 1394
69a832cf
BP
1395 <p><b>Prerequisite:</b> <code>ip4</code></p>
1396 </dd>
1397
1398 <dt><code>tcp_reset;</code></dt>
1399 <dd>
1400 <p>
1401 This action transforms the current TCP packet according to the
1402 following pseudocode:
1403 </p>
1404
1405 <pre>
1406if (tcp.ack) {
1407 tcp.seq = tcp.ack;
1408} else {
1409 tcp.ack = tcp.seq + length(tcp.payload);
1410 tcp.seq = 0;
1411}
1412tcp.flags = RST;
1413</pre>
1414
1415 <p>
1416 Then, the action drops all TCP options and payload data, and
1417 updates the TCP checksum.
1418 </p>
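 <p>
 As a rough illustration only, the pseudocode can be rendered in
 Python as below. The sketch assumes that the test on
 <code>tcp.ack</code> means "the ACK flag is set", that sequence
 numbers wrap modulo 2**32, and that a packet is a dictionary with
 the hypothetical keys shown. Checksum recalculation is omitted.
 </p>

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32 bits wide


def tcp_reset(pkt):
    """Return the RST packet derived from pkt, following the pseudocode."""
    out = dict(pkt)
    if pkt["ack_flag"]:
        out["seq"] = pkt["ack"]
    else:
        out["ack"] = (pkt["seq"] + len(pkt["payload"])) % SEQ_SPACE
        out["seq"] = 0
    out["flags"] = "RST"
    # Drop all TCP options and payload data.
    out["options"] = b""
    out["payload"] = b""
    return out
```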
1419
1420 <p>
1421 Details TBD.
1422 </p>
1423
1424 <p><b>Prerequisite:</b> <code>tcp</code></p>
1425 </dd>
fe36184b 1426 </dl>
fe36184b 1427 </column>
091e3af9
JP
1428
1429 <column name="external_ids" key="stage-name">
1430 Human-readable name for this flow's stage in the pipeline.
1431 </column>
1432
1433 <group title="Common Columns">
1434 The overall purpose of these columns is described under <code>Common
1435 Columns</code> at the beginning of this document.
1436
1437 <column name="external_ids"/>
1438 </group>
fe36184b
BP
1439 </table>
1440
5868eb24
BP
1441 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1442 <p>
1443 The rows in this table define multicast groups of logical ports.
1444 Multicast groups allow a single packet transmitted over a tunnel to a
1445 hypervisor to be delivered to multiple VMs on that hypervisor, which
1446 uses bandwidth more efficiently.
1447 </p>
1448
1449 <p>
1450 Each row in this table defines a logical multicast group numbered <ref
1451 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1452 ports are listed in the <ref column="ports"/> column.
1453 </p>
1454
1455 <column name="datapath">
1456 The logical datapath in which the multicast group resides.
1457 </column>
1458
1459 <column name="tunnel_key">
1460 The value used to designate this logical egress port in tunnel
1461 encapsulations. An index forces the key to be unique within the <ref
1462 column="datapath"/>. The unusual range ensures that multicast group IDs
1463 do not overlap with logical port IDs.
1464 </column>
1465
1466 <column name="name">
1467 <p>
1468 The logical multicast group's name. An index forces the name to be
1469 unique within the <ref column="datapath"/>. Logical flows in the
1470 ingress pipeline may output to the group just as for individual logical
1471 ports, by assigning the group's name to <code>outport</code> and
1472 executing an <code>output</code> action.
1473 </p>
1474
1475 <p>
1476 Multicast group names and logical port names share a single namespace
1477 and thus should not overlap (but the database schema cannot enforce
1478 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1479 that begin with <code>_MC_</code>.
1480 </p>
1481 </column>
1482
1483 <column name="ports">
1484 The logical ports included in the multicast group. All of these ports
1485 must be in the <ref column="datapath"/> logical datapath (but the
1486 database schema cannot enforce this).
1487 </column>
1488 </table>
1489
1490 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1491 <p>
1492 Each row in this table identifies physical bindings of a logical
1493 datapath. A logical datapath implements a logical pipeline among the
1494 ports in the <ref table="Port_Binding"/> table associated with it. In
1495 practice, the pipeline in a given logical datapath implements either a
1496 logical switch or a logical router.
1497 </p>
1498
1499 <column name="tunnel_key">
1500 The tunnel key value to which the logical datapath is bound.
1501 The <code>Tunnel Encapsulation</code> section in
1502 <code>ovn-architecture</code>(7) describes how tunnel keys are
1503 constructed for each supported encapsulation.
1504 </column>
1505
9975d7be
BP
1506 <group title="OVN_Northbound Relationship">
1507 <p>
1508 Each row in <ref table="Datapath_Binding"/> is associated with some
1509 logical datapath. <code>ovn-northd</code> uses these keys to track the
1510 association of a logical datapath with concepts in the <ref
1511 db="OVN_Northbound"/> database.
1512 </p>
1513
1514 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1515 For a logical datapath that represents a logical switch,
1516 <code>ovn-northd</code> stores in this key the UUID of the
1517 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1518 the <ref db="OVN_Northbound"/> database.
1519 </column>
1520
1521 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1522 For a logical datapath that represents a logical router,
1523 <code>ovn-northd</code> stores in this key the UUID of the
1524 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1525 the <ref db="OVN_Northbound"/> database.
1526 </column>
1527 </group>
5868eb24
BP
1528
1529 <group title="Common Columns">
1530 The overall purpose of these columns is described under <code>Common
1531 Columns</code> at the beginning of this document.
1532
1533 <column name="external_ids"/>
1534 </group>
1535 </table>
1536
dcda6e0d 1537 <table name="Port_Binding" title="Physical-Logical Port Bindings">
fe36184b 1538 <p>
d387d24d
BP
1539 Most rows in this table identify the physical location of a logical port.
1540 (The exceptions are logical patch ports, which do not have any physical
1541 location.)
fe36184b
BP
1542 </p>
1543
1544 <p>
80f408f4
JP
1545 For every <code>Logical_Switch_Port</code> record in the
1546 <code>OVN_Northbound</code> database, <code>ovn-northd</code>
1547 creates a record in this table. <code>ovn-northd</code> populates
1548 and maintains every column except the <code>chassis</code> column,
1549 which it leaves empty in new records.
9fb4636f
GS
1550 </p>
1551
1552 <p>
88058f19
AW
1553 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1554 populates the <code>chassis</code> column for the records that
1555 identify the logical ports that are located on its hypervisor/gateway,
1556 which <code>ovn-controller</code>/<code>ovn-controller-vtep</code> in
1557 turn finds out by monitoring the local hypervisor's Open_vSwitch
1558 database, which identifies logical ports via the conventions described
c1645003
GS
1559 in <code>IntegrationGuide.md</code>. (The exceptions are for
1560 <code>Port_Binding</code> records with <code>type</code> of
17bac0ff
RB
1561 <code>l3gateway</code>, whose locations are identified by
1562 <code>ovn-northd</code> via the <code>options:l3gateway-chassis</code>
c1645003
GS
1563 column in this table. <code>ovn-controller</code> is still responsible
1564 for populating the <code>chassis</code> column.)
9fb4636f
GS
1565 </p>
1566
1567 <p>
5868eb24 1568 When a chassis shuts down gracefully, it should clean up the
9fb4636f 1569 <code>chassis</code> column that it previously had populated.
fe36184b
BP
1570 (This is not critical because resources hosted on the chassis are equally
1571 unreachable regardless of whether their rows are present.) To handle the
1572 case where a VM is shut down abruptly on one chassis, then brought up
88058f19
AW
1573 again on a different one,
1574 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1575 overwrite the <code>chassis</code> column with new information.
fe36184b
BP
1576 </p>
1577
c96ba502
BP
1578 <group title="Core Features">
1579 <column name="datapath">
1580 The logical datapath to which the logical port belongs.
1581 </column>
1a76c93e 1582
c96ba502 1583 <column name="logical_port">
80f408f4
JP
1584 A logical port, taken from <ref table="Logical_Switch_Port"
1585 column="name" db="OVN_Northbound"/> in the OVN_Northbound
1586 database's <ref table="Logical_Switch_Port" db="OVN_Northbound"/>
1587 table. OVN does not prescribe a particular format for the
1588 logical port ID.
c96ba502 1589 </column>
c0281929 1590
c96ba502 1591 <column name="chassis">
184bc3ca
RB
1592 The meaning of this column depends on the value of the <ref column="type"/>
1593 column. The meaning for each <ref column="type"/> is:
1594
1595 <dl>
1596 <dt>(empty string)</dt>
1597 <dd>
1598 The physical location of the logical port. To successfully identify a
1599 chassis, this column must be a <ref table="Chassis"/> record. This is
1600 populated by <code>ovn-controller</code>.
1601 </dd>
1602
1603 <dt>vtep</dt>
1604 <dd>
1605 The physical location of the hardware_vtep gateway. To successfully
1606 identify a chassis, this column must be a <ref table="Chassis"/> record.
1607 This is populated by <code>ovn-controller-vtep</code>.
1608 </dd>
1609
1610 <dt>localnet</dt>
1611 <dd>
1612 Always empty. A localnet port is realized on every chassis that has
1613 connectivity to the corresponding physical network.
1614 </dd>
1615
17bac0ff 1616 <dt>l3gateway</dt>
184bc3ca
RB
1617 <dd>
1618 The physical location of the L3 gateway. To successfully identify a
1619 chassis, this column must be a <ref table="Chassis"/> record. This is
1620 populated by <code>ovn-controller</code> based on the value of
17bac0ff 1621 the <code>options:l3gateway-chassis</code> column in this table.
184bc3ca
RB
1622 </dd>
1623
1624 <dt>l2gateway</dt>
1625 <dd>
1626 The physical location of this L2 gateway. To successfully identify a
1627 chassis, this column must be a <ref table="Chassis"/> record.
62b87eab
NS
1628 This is populated by <code>ovn-controller</code> based on the value
1629 of the <code>options:l2gateway-chassis</code> column in this table.
184bc3ca
RB
1630 </dd>
1631 </dl>
1632
c96ba502 1633 </column>
c0281929 1634
c96ba502
BP
1635 <column name="tunnel_key">
1636 <p>
1637 A number that represents the logical port in the key (e.g. STT key or
1638 Geneve TLV) field carried within tunnel protocol packets.
1639 </p>
c0281929 1640
c96ba502
BP
1641 <p>
1642 The tunnel ID must be unique within the scope of a logical datapath.
1643 </p>
1644 </column>
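The per-datapath uniqueness rule for tunnel keys can be checked mechanically. The Python sketch below is illustrative only: it assumes the rows have already been read into plain dictionaries, and the helper name duplicate_tunnel_keys is hypothetical, not part of OVN.

```python
def duplicate_tunnel_keys(rows):
    """Return the logical_port of each row whose (datapath, tunnel_key)
    pair was already used by an earlier row."""
    seen = set()
    dups = []
    for row in rows:
        key = (row["datapath"], row["tunnel_key"])
        if key in seen:
            dups.append(row["logical_port"])
        seen.add(key)
    return dups

# Hypothetical rows: the same tunnel_key may be reused on another datapath.
rows = [
    {"logical_port": "vm1", "datapath": "dp1", "tunnel_key": 1},
    {"logical_port": "vm2", "datapath": "dp1", "tunnel_key": 2},
    {"logical_port": "vm3", "datapath": "dp2", "tunnel_key": 1},
    {"logical_port": "vm4", "datapath": "dp1", "tunnel_key": 2},
]
print(duplicate_tunnel_keys(rows))
```

Only vm4 is reported here, because vm3 reuses key 1 on a different datapath, which the rule permits.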
88058f19 1645
c96ba502
BP
1646 <column name="mac">
1647 <p>
1648 The Ethernet address or addresses used as a source address on the
1649 logical port, each in the form
1650 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1651 The string <code>unknown</code> is also allowed to indicate that the
1652 logical port has an unknown set of (additional) source addresses.
1653 </p>
1654
1655 <p>
1656 A VM interface would ordinarily have a single Ethernet address. A
1657 gateway port might initially only have <code>unknown</code>, and then
1658 add MAC addresses to the set as it learns new source addresses.
1659 </p>
1660 </column>
88058f19 1661
c96ba502
BP
1662 <column name="type">
1663 <p>
1664 A type for this logical port. Logical ports can be used to model other
1665 types of connectivity into an OVN logical switch. The following types
1666 are defined:
1667 </p>
1668
1669 <dl>
1670 <dt>(empty string)</dt>
1671 <dd>VM (or VIF) interface.</dd>
d387d24d
BP
1672
1673 <dt><code>patch</code></dt>
1674 <dd>
1675 One of a pair of logical ports that act as if connected by a patch
1676 cable. Useful for connecting two logical datapaths, e.g. to connect
1677 a logical router to a logical switch or to another logical router.
1678 </dd>
1679
17bac0ff 1680 <dt><code>l3gateway</code></dt>
c1645003
GS
1681 <dd>
1682 One of a pair of logical ports that act as if connected by a patch
1683 cable across multiple chassis. Useful for connecting a logical
1684 switch with a Gateway router (which is only resident on a
1685 particular chassis).
1686 </dd>
1687
c96ba502
BP
1688 <dt><code>localnet</code></dt>
1689 <dd>
1690 A connection to a locally accessible network from each
1691 <code>ovn-controller</code> instance. A logical switch can only
6e6c3f91
HZ
1692 have a single <code>localnet</code> port attached. This is used
1693 to model direct connectivity to an existing network.
c96ba502
BP
1694 </dd>
1695
184bc3ca
RB
1696 <dt><code>l2gateway</code></dt>
1697 <dd>
1698 An L2 connection to a physical network. The chassis this
1699 <ref table="Port_Binding"/> is bound to will serve as
1700 an L2 gateway to the network named by
1701 <ref column="options" table="Port_Binding"/>:<code>network_name</code>.
1702 </dd>
1703
c96ba502
BP
1704 <dt><code>vtep</code></dt>
1705 <dd>
1706 A port to a logical switch on a VTEP gateway chassis. In order to
1707 get this port correctly recognized by the OVN controller, the <ref
1708 column="options"
1709 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1710 column="options"
1711 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1712 be defined.
1713 </dd>
1714 </dl>
1715 </column>
1716 </group>
1a76c93e 1717
d387d24d
BP
1718 <group title="Patch Options">
1719 <p>
1720 These options apply to logical ports with <ref column="type"/> of
1721 <code>patch</code>.
1722 </p>
1723
1724 <column name="options" key="peer">
1725 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1726 record for the other side of the patch. The named <ref
1727 column="logical_port"/> must specify this <ref column="logical_port"/>
1728 in its own <code>peer</code> option. That is, the two patch logical
1729 ports must have reversed <ref column="logical_port"/> and
1730 <code>peer</code> values.
1731 </column>
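The reversed-pair rule for patch ports can be expressed as a small check. This is a minimal Python sketch, assuming the two rows are available as dictionaries with logical_port and options keys; the helper name peers_are_reversed and the port names are hypothetical.

```python
def peers_are_reversed(a, b):
    """Return True if rows a and b name each other in their peer option."""
    return (a["options"].get("peer") == b["logical_port"]
            and b["options"].get("peer") == a["logical_port"])

# Hypothetical rows: a switch-side and a router-side patch port.
sw = {"logical_port": "sw0-lr0", "type": "patch",
      "options": {"peer": "lr0-sw0"}}
lr = {"logical_port": "lr0-sw0", "type": "patch",
      "options": {"peer": "sw0-lr0"}}
print(peers_are_reversed(sw, lr))
```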
1732 </group>
1733
184bc3ca 1734 <group title="L3 Gateway Options">
c1645003
GS
1735 <p>
1736 These options apply to logical ports with <ref column="type"/> of
17bac0ff 1737 <code>l3gateway</code>.
c1645003
GS
1738 </p>
1739
1740 <column name="options" key="peer">
1741 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
17bac0ff 1742 record for the other side of the <code>l3gateway</code> port. The named <ref
c1645003 1743 column="logical_port"/> must specify this <ref column="logical_port"/>
17bac0ff 1744 in its own <code>peer</code> option. That is, the two <code>l3gateway</code>
c1645003
GS
1745 logical ports must have reversed <ref column="logical_port"/> and
1746 <code>peer</code> values.
1747 </column>
1748
17bac0ff 1749 <column name="options" key="l3gateway-chassis">
c1645003
GS
1750 The <code>chassis</code> in which the port resides.
1751 </column>
1752 </group>
1753
c96ba502 1754 <group title="Localnet Options">
eb00399e 1755 <p>
c96ba502
BP
1756 These options apply to logical ports with <ref column="type"/> of
1757 <code>localnet</code>.
eb00399e
BP
1758 </p>
1759
c96ba502
BP
1760 <column name="options" key="network_name">
1761 Required. <code>ovn-controller</code> uses the configuration entry
1762 <code>ovn-bridge-mappings</code> to determine how to connect to this
1763 network. <code>ovn-bridge-mappings</code> is a list of network names
1764 mapped to a local OVS bridge that provides access to that network. An
1765 example of configuring <code>ovn-bridge-mappings</code> would be:
1766
1767 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1768
1769 <p>
1770 When a logical switch has a <code>localnet</code> port attached,
1771 every chassis that may have a local vif attached to that logical
1772 switch must have a bridge mapping configured to reach that
1773 <code>localnet</code>. Traffic that arrives on a
1774 <code>localnet</code> port is never forwarded over a tunnel to
1775 another chassis.
1776 </p>
1777 </column>
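The ovn-bridge-mappings value is a comma-separated list of name:bridge pairs. The Python sketch below shows one way such a string could be split into a lookup table; it is not a description of how ovn-controller itself parses the value, and the function name parse_bridge_mappings is hypothetical.

```python
def parse_bridge_mappings(value):
    """Split 'net1:br1,net2:br2' into a {network_name: bridge} dict."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing comma
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            raise ValueError("malformed mapping: %r" % entry)
        mappings[network] = bridge
    return mappings

mappings = parse_bridge_mappings("physnet1:br-eth0,physnet2:br-eth1")
# Bridge that would give access to the network named by network_name.
bridge = mappings.get("physnet1")
print(bridge)
```

A network_name that is absent from the dictionary corresponds to a chassis with no bridge mapping for that network.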
1778
1779 <column name="tag">
1780 If set, indicates that the port represents a connection to a specific
1781 VLAN on a locally accessible network. The VLAN ID is used to match
1782 incoming traffic and is also added to outgoing traffic.
1783 </column>
1784 </group>
1785
184bc3ca
RB
1786 <group title="L2 Gateway Options">
1787 <p>
1788 These options apply to logical ports with <ref column="type"/> of
1789 <code>l2gateway</code>.
1790 </p>
1791
1792 <column name="options" key="network_name">
1793 Required. <code>ovn-controller</code> uses the configuration entry
1794 <code>ovn-bridge-mappings</code> to determine how to connect to this
1795 network. <code>ovn-bridge-mappings</code> is a list of network names
1796 mapped to a local OVS bridge that provides access to that network. An
1797 example of configuring <code>ovn-bridge-mappings</code> would be:
1798
1799 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1800
1801 <p>
1802 When a logical switch has an <code>l2gateway</code> port attached,
1803 the chassis that the <code>l2gateway</code> port is bound to
1804 must have a bridge mapping configured to reach the network
1805 identified by <code>network_name</code>.
1806 </p>
1807 </column>
1808
62b87eab
NS
1809 <column name="options" key="l2gateway-chassis">
1810 Required. The <code>chassis</code> in which the port resides.
1811 </column>
1812
184bc3ca
RB
1813 <column name="tag">
1814 If set, indicates that the gateway is connected to a specific
1815 VLAN on the physical network. The VLAN ID is used to match
1816 incoming traffic and is also added to outgoing traffic.
1817 </column>
1818 </group>
1819
c96ba502 1820 <group title="VTEP Options">
eb00399e 1821 <p>
c96ba502
BP
1822 These options apply to logical ports with <ref column="type"/> of
1823 <code>vtep</code>.
eb00399e 1824 </p>
9fb4636f 1825
c96ba502
BP
1826 <column name="options" key="vtep-physical-switch">
1827 Required. The name of the VTEP gateway.
1828 </column>
fe36184b 1829
c96ba502
BP
1830 <column name="options" key="vtep-logical-switch">
1831 Required. A logical switch name connected by the VTEP gateway. Must
1832 be set when <ref column="type"/> is <code>vtep</code>.
1833 </column>
1834 </group>
fe36184b 1835
aef5f431
BP
1836 <group title="VMI (or VIF) Options">
1837 <p>
1838 These options apply to logical ports whose <ref column="type"/> is
1839 (empty string).
1840 </p>
1841
1842 <column name="options" key="policing_rate">
1843 If set, indicates the maximum rate for data sent from this interface,
1844 in kbps. Data exceeding this rate is dropped.
1845 </column>
1846
1847 <column name="options" key="policing_burst">
1848 If set, indicates the maximum burst size for data sent from this
1849 interface, in kb.
1850 </column>
1851 </group>
1852
c96ba502 1853 <group title="Nested Containers">
fe36184b 1854 <p>
c96ba502
BP
1855 These columns support containers nested within a VM. Specifically,
1856 they are used when <ref column="type"/> is empty and <ref
1857 column="logical_port"/> identifies the interface of a container spawned
1858 inside a VM. They are empty for containers or VMs that run directly on
1859 a hypervisor.
fe36184b
BP
1860 </p>
1861
c96ba502
BP
1862 <column name="parent_port">
1863 This is taken from
80f408f4
JP
1864 <ref table="Logical_Switch_Port" column="parent_name"
1865 db="OVN_Northbound"/> in the OVN_Northbound database's
1866 <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table.
c96ba502
BP
1867 </column>
1868
1869 <column name="tag">
1870 <p>
1871 Identifies the VLAN tag in the network traffic associated with that
1872 container's network interface.
1873 </p>
1874
1875 <p>
1876 This column is used for a different purpose when <ref column="type"/>
184bc3ca
RB
1877 is <code>localnet</code> (see <code>Localnet Options</code>, above)
1878 or <code>l2gateway</code> (see <code>L2 Gateway Options</code>, above).
c96ba502
BP
1879 </p>
1880 </column>
1881 </group>
fe36184b 1882 </table>
0bac7164
BP
1883
1884 <table name="MAC_Binding" title="IP to MAC bindings">
1885 <p>
1886 Each row in this table specifies a binding from an IP address to an
1887 Ethernet address that has been discovered through ARP (for IPv4) or
1888 neighbor discovery (for IPv6). This table is primarily used to discover
1889 bindings on physical networks, because IP-to-MAC bindings for virtual
1890 machines are usually populated statically into the <ref
1891 table="Port_Binding"/> table.
1892 </p>
1893
1894 <p>
1895 This table expresses a functional relationship: <ref
1896 table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
1897 <ref column="mac"/>.
1898 </p>
1899
1900 <p>
1901 In outline, the lifetime of a logical router's MAC binding looks like
1902 this:
1903 </p>
1904
1905 <ol>
1906 <li>
1907 On hypervisor 1, a logical router determines that a packet should be
1908 forwarded to IP address <var>A</var> on one of its router ports. It
1909 uses its logical flow table to determine that <var>A</var> lacks a
1910 static IP-to-MAC binding and the <code>get_arp</code> action to
1911 determine that it lacks a dynamic IP-to-MAC binding.
1912 </li>
1913
1914 <li>
1915 Using an OVN logical <code>arp</code> action, the logical router
1916 generates and sends a broadcast ARP request to the router port. It
1917 drops the IP packet.
1918 </li>
1919
1920 <li>
1921 The logical switch attached to the router port delivers the ARP request
1922 to all of its ports. (It might make sense to deliver it only to ports
1923 that have no static IP-to-MAC bindings, but this could also be
1924 surprising behavior.)
1925 </li>
1926
1927 <li>
1928 A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
1929 attached to the logical switch owns the IP address in question. It
1930 composes an ARP reply and unicasts it to the logical router port's
1931 Ethernet address.
1932 </li>
1933
1934 <li>
1935 The logical switch delivers the ARP reply to the logical router port.
1936 </li>
1937
1938 <li>
1939 The logical router flow table executes a <code>put_arp</code> action.
1940 To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
1941 to the <ref table="MAC_Binding"/> table.
1942 </li>
1943
1944 <li>
1945 On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
1946 table="MAC_Binding"/> table from the OVN southbound database. The next
1947 packet destined to <var>A</var> through the logical router is sent
1948 directly to the bound Ethernet address.
1949 </li>
1950 </ol>
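The functional relationship and the lifetime outlined above can be modeled as a dictionary keyed by (logical_port, ip). This is a simplified Python sketch: the functions put_arp and get_arp are stand-ins named after the logical actions, not the actual implementation, and the port and address values are made up.

```python
# In-memory stand-in for the MAC_Binding table.
mac_binding = {}

def put_arp(logical_port, ip, mac):
    """Record (or overwrite) the binding learned on a logical port."""
    mac_binding[(logical_port, ip)] = mac

def get_arp(logical_port, ip):
    """Return the bound Ethernet address, or None if no binding exists."""
    return mac_binding.get((logical_port, ip))

# Before the ARP reply arrives there is no dynamic binding.
before = get_arp("lr0-public", "192.0.2.10")
# The ARP reply causes a row to be added.
put_arp("lr0-public", "192.0.2.10", "00:00:5e:00:53:01")
# The next packet to the same address finds the binding directly.
after = get_arp("lr0-public", "192.0.2.10")
print(before, after)
```

Because the key is the pair (logical_port, ip), a later reply for the same pair replaces the earlier Ethernet address rather than adding a second row.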
1951
1952 <column name="logical_port">
1953 The logical port on which the binding was discovered.
1954 </column>
1955
1956 <column name="ip">
1957 The bound IP address.
1958 </column>
1959
1960 <column name="mac">
1961 The Ethernet address to which the IP is bound.
1962 </column>
791a7747
LS
1963 <column name="datapath">
1964 The logical datapath to which the logical port belongs.
1965 </column>
0bac7164 1966 </table>
42814145
NS
1967
1968 <table name="DHCP_Options" title="DHCP Options supported by native OVN DHCP">
1969 <p>
1970 Each row in this table stores the DHCP Options supported by native OVN
1971 DHCP. <code>ovn-northd</code> populates this table with the supported
1972 DHCP options. <code>ovn-controller</code> looks up this table to get the
1973 DHCP codes of the DHCP options defined in the <code>put_dhcp_opts</code> action.
1974 Please refer to RFC 2132 (<code>https://tools.ietf.org/html/rfc2132</code>)
1975 for the list of DHCP options that can be defined here.
1976 </p>
1977
1978 <column name="name">
1979 <p>
1980 Name of the DHCP option.
1981 </p>
1982
1983 <p>
1984 Example. name="router"
1985 </p>
1986 </column>
1987
1988 <column name="code">
1989 <p>
1990 DHCP option code for the DHCP option as defined in RFC 2132.
1991 </p>
1992
1993 <p>
1994 Example. code=3
1995 </p>
1996 </column>
1997
1998 <column name="type">
1999 <p>
2000 Data type of the DHCP option code.
2001 </p>
2002
2003 <dl>
2004 <dt><code>value: bool</code></dt>
2005 <dd>
2006 <p>
2007 This indicates that the value of the DHCP option is a bool.
2008 </p>
2009
2010 <p>
2011 Example. "name=ip_forward_enable", "code=19", "type=bool".
2012 </p>
2013
2014 <p>
2015 put_dhcp_opts(..., ip_forward_enable = 1,...)
2016 </p>
2017 </dd>
2018
2019 <dt><code>value: uint8</code></dt>
2020 <dd>
2021 <p>
2022 This indicates that the value of the DHCP option is an unsigned
2023 int8 (8 bits).
2024 </p>
2025
2026 <p>
2027 Example. "name=default_ttl", "code=23", "type=uint8".
2028 </p>
2029
2030 <p>
2031 put_dhcp_opts(..., default_ttl = 50,...)
2032 </p>
2033 </dd>
2034
2035 <dt><code>value: uint16</code></dt>
2036 <dd>
2037 <p>
2038 This indicates that the value of the DHCP option is an unsigned
2039 int16 (16 bits).
2040 </p>
2041
2042 <p>
2043 Example. "name=mtu", "code=26", "type=uint16".
2044 </p>
2045
2046 <p>
2047 put_dhcp_opts(..., mtu = 1450,...)
2048 </p>
2049 </dd>
2050
2051 <dt><code>value: uint32</code></dt>
2052 <dd>
2053 <p>
2054 This indicates that the value of the DHCP option is an unsigned
2055 int32 (32 bits).
2056 </p>
2057
2058 <p>
2059 Example. "name=lease_time", "code=51", "type=uint32".
2060 </p>
2061
2062 <p>
2063 put_dhcp_opts(..., lease_time = 86400,...)
2064 </p>
2065 </dd>
2066
2067 <dt><code>value: ipv4</code></dt>
2068 <dd>
2069 <p>
2070 This indicates that the value of the DHCP option is an IPv4
2071 address or addresses.
2072 </p>
2073
2074 <p>
2075 Example. "name=router", "code=3", "type=ipv4".
2076 </p>
2077
2078 <p>
2079 put_dhcp_opts(..., router = 10.0.0.1,...)
2080 </p>
2081
2082 <p>
2083 Example. "name=dns_server", "code=6", "type=ipv4".
2084 </p>
2085
2086 <p>
2087 put_dhcp_opts(..., dns_server = {8.8.8.8 7.7.7.7},...)
2088 </p>
2089 </dd>
2090
2091 <dt><code>value: static_routes</code></dt>
2092 <dd>
2093 <p>
2094 This indicates that the value of the DHCP option contains one or more
2095 pairs of an IPv4 route and a next-hop address.
2096 </p>
2097
2098 <p>
2099 Example. "name=classless_static_route", "code=121", "type=static_routes".
2100 </p>
2101
2102 <p>
2103 put_dhcp_opts(..., classless_static_route = {30.0.0.0/24,10.0.0.4,0.0.0.0/0,10.0.0.1},...)
2104 </p>
2105 </dd>
2106
2107 <dt><code>value: str</code></dt>
2108 <dd>
2109 <p>
2110 This indicates that the value of the DHCP option is a string.
2111 </p>
2112
2113 <p>
2114 Example. "name=host_name", "code=12", "type=str".
2115 </p>
2116 </dd>
2117 </dl>
2118 </column>
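To illustrate how the type column relates to the bytes carried in a DHCP packet, the Python sketch below encodes a few of the types listed above in the code/length/data layout that RFC 2132 defines for options. It is an assumption-laden illustration, not ovn-controller's code: the function names are hypothetical, and the static_routes type is left out because its encoding is more involved.

```python
import ipaddress
import struct

def encode_value(opt_type, value):
    """Encode a value according to a DHCP_Options type string."""
    if opt_type in ("bool", "uint8"):
        return struct.pack("!B", int(value))
    if opt_type == "uint16":
        return struct.pack("!H", int(value))
    if opt_type == "uint32":
        return struct.pack("!I", int(value))
    if opt_type == "ipv4":
        # Accept a single address or a list of addresses.
        addrs = value if isinstance(value, (list, tuple)) else [value]
        return b"".join(ipaddress.IPv4Address(a).packed for a in addrs)
    if opt_type == "str":
        return value.encode("ascii")
    raise ValueError("unsupported type: %s" % opt_type)

def encode_option(code, opt_type, value):
    """Return one option as code byte, length byte, then data."""
    data = encode_value(opt_type, value)
    return bytes([code, len(data)]) + data

print(encode_option(26, "uint16", 1450).hex())     # mtu
print(encode_option(3, "ipv4", "10.0.0.1").hex())  # router
print(encode_option(51, "uint32", 86400).hex())    # lease_time
```

The name and code values used here are the ones given in the examples for each type above.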
2119 </table>
01cfdb2f
NS
2120
2121 <table name="DHCPv6_Options" title="DHCPv6 Options supported by native OVN DHCPv6">
2122 <p>
2123 Each row in this table stores the DHCPv6 Options supported by native OVN
2124 DHCPv6. <code>ovn-northd</code> populates this table with the supported
2125 DHCPv6 options. <code>ovn-controller</code> looks up this table to get
2126 the DHCPv6 codes of the DHCPv6 options defined in the
2127 <code>put_dhcpv6_opts</code> action. Please refer to RFC 3315 and RFC
2128 3646 for the list of DHCPv6 options that can be defined here.
2129 </p>
2130
2131 <column name="name">
2132 <p>
2133 Name of the DHCPv6 option.
2134 </p>
2135
2136 <p>
2137 Example. name="ia_addr"
2138 </p>
2139 </column>
2140
2141 <column name="code">
2142 <p>
2143 DHCPv6 option code for the DHCPv6 option as defined in the appropriate
2144 RFC.
2145 </p>
2146
2147 <p>
2148 Example. code=5
2149 </p>
2150 </column>
2151
2152 <column name="type">
2153 <p>
2154 Data type of the DHCPv6 option code.
2155 </p>
2156
2157 <dl>
2158 <dt><code>value: ipv6</code></dt>
2159 <dd>
2160 <p>
2161 This indicates that the value of the DHCPv6 option is an IPv6
2162 address or addresses.
2163 </p>
2164
2165 <p>
2166 Example. "name=ia_addr", "code=5", "type=ipv6".
2167 </p>
2168
2169 <p>
2170 put_dhcpv6_opts(..., ia_addr = ae70::4,...)
2171 </p>
2172 </dd>
2173
2174 <dt><code>value: str</code></dt>
2175 <dd>
2176 <p>
2177 This indicates that the value of the DHCPv6 option is a string.
2178 </p>
2179
2180 <p>
2181 Example. "name=domain_search", "code=24", "type=str".
2182 </p>
2183
2184 <p>
2185 put_dhcpv6_opts(..., domain_search = ovn.domain,...)
2186 </p>
2187 </dd>
2188
2189 <dt><code>value: mac</code></dt>
2190 <dd>
2191 <p>
2192 This indicates that the value of the DHCPv6 option is a MAC address.
2193 </p>
2194
2195 <p>
2196 Example. "name=server_id", "code=2", "type=mac".
2197 </p>
2198
2199 <p>
2200 put_dhcpv6_opts(..., server_id = 01:02:03:04:05:06,...)
2201 </p>
2202 </dd>
2203 </dl>
2204 </column>
2205 </table>
fe36184b 2206</database>