1 <?xml version="1.0" encoding="utf-8"?>
2 <database name="ovn-sb" title="OVN Southbound Database">
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
13 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
14 northbound to the Cloud Management System, via <code>ovn-northd</code>.
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
20 The OVN Southbound database contains classes of data with
21 different properties, as described in the sections below.
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 contains all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
38 The <ref table="Chassis"/> table comprises the PN tables.
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
78 </p>
79
80 <h3>Logical-physical bindings</h3>
81
82 <p>
83 These tables link logical and physical components. They show the current
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
87 </p>
88
89 <p>
90 These tables change frequently, at least every time a VM powers up or down
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
104 </p>
105
106 <h3>MAC bindings</h3>
107
108 <p>
109 The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
110 to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
111 and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual
112 machines are statically populated into the <ref table="Port_Binding"/>
113 table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
114 on physical networks.
115 </p>
116
117 <h2>Common Columns</h2>
118
119 <p>
120 Some tables contain a special column named <code>external_ids</code>. This
121 column has the same form and purpose each place that it appears, so we
122 describe it here to save space later.
123 </p>
124
125 <dl>
126 <dt><code>external_ids</code>: map of string-string pairs</dt>
127 <dd>
128 Key-value pairs for use by the software that manages the OVN Southbound
129 database rather than by
130 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
131 particular, <code>ovn-northd</code> can use key-value pairs in this
132 column to relate entities in the southbound database to higher-level
133 entities (such as entities in the OVN Northbound database). Individual
134 key-value pairs in this column may be documented in some cases to aid
135 in understanding and troubleshooting, but the reader should not treat
136 such documentation as comprehensive.
137 </dd>
138 </dl>
139
140 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
141 <p>
142 Each row in this table represents a hypervisor or gateway (a chassis) in
143 the physical network (PN). Each chassis, via
144 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
145 and updates its own row, and keeps a copy of the remaining rows to
146 determine how to reach other hypervisors.
147 </p>
148
149 <p>
150 When a chassis shuts down gracefully, it should remove its own row.
151 (This is not critical because resources hosted on the chassis are equally
152 unreachable regardless of whether the row is present.) If a chassis
153 shuts down permanently without removing its row, some kind of manual or
154 automatic cleanup is eventually needed; we can devise a process for that
155 as necessary.
156 </p>
157
158 <column name="name">
159 A chassis name, taken from <ref key="system-id" table="Open_vSwitch"
160 column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch
161 database's <ref table="Open_vSwitch" db="Open_vSwitch"/> table. OVN does
162 not prescribe a particular format for chassis names.
163 </column>
164
165 <column name="hostname">
166 The hostname of the chassis, if applicable. ovn-controller will populate
167 this column with the hostname of the host it is running on.
168 ovn-controller-vtep will leave this column empty.
169 </column>
170
171 <group title="Encapsulation Configuration">
172 <p>
173 OVN uses encapsulation to transmit logical dataplane packets
174 between chassis.
175 </p>
176
177 <column name="encaps">
178 Points to supported encapsulation configurations to transmit
179 logical dataplane packets to this chassis. Each entry is a <ref
180 table="Encap"/> record that describes the configuration.
181 </column>
182 </group>
183
184 <group title="Gateway Configuration">
185 <p>
186 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
187 OVN-managed part of a logical network and a physical VLAN, extending a
188 tunnel-based logical network into a physical network. Gateways are
189 typically dedicated nodes that do not host VMs and will be controlled
190 by <code>ovn-controller-vtep</code>.
191 </p>
192
193 <column name="vtep_logical_switches">
194 Stores all VTEP logical switch names connected by this gateway
195 chassis. The <ref table="Port_Binding"/> table entry with
196 <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
197 equal to <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and
198 <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
199 value in <ref table="Chassis"/>
200 <ref column="vtep_logical_switches" table="Chassis"/>, will be
201 associated with this <ref table="Chassis"/>.
202 </column>
203 </group>
204 </table>
205
206 <table name="Encap" title="Encapsulation Types">
207 <p>
208 The <ref column="encaps" table="Chassis"/> column in the <ref
209 table="Chassis"/> table refers to rows in this table to identify
210 how OVN may transmit logical dataplane packets to this chassis.
211 Each chassis, via <code>ovn-controller</code>(8) or
212 <code>ovn-controller-vtep</code>(8), adds and updates its own rows
213 and keeps a copy of the remaining rows to determine how to reach
214 other chassis.
215 </p>
216
217 <column name="type">
218 The encapsulation to use to transmit packets to this chassis.
219 Hypervisors must use either <code>geneve</code> or
220 <code>stt</code>. Gateways may use <code>vxlan</code>,
221 <code>geneve</code>, or <code>stt</code>.
222 </column>
223
224 <column name="options">
225 Options for configuring the encapsulation, e.g. IPsec parameters when
226 IPsec support is introduced. No options are currently defined.
227 </column>
228
229 <column name="ip">
230 The IPv4 address of the encapsulation tunnel endpoint.
231 </column>
232 </table>
233
234 <table name="Logical_Flow" title="Logical Network Flows">
235 <p>
236 Each row in this table represents one logical flow.
237 <code>ovn-northd</code> populates this table with logical flows
238 that implement the L2 and L3 topologies specified in the
239 <ref db="OVN_Northbound"/> database. Each hypervisor, via
240 <code>ovn-controller</code>, translates the logical flows into
241 OpenFlow flows specific to its hypervisor and installs them into
242 Open vSwitch.
243 </p>
244
245 <p>
246 Logical flows are expressed in an OVN-specific format, described here. A
247 logical datapath flow is much like an OpenFlow flow, except that the
248 flows are written in terms of logical ports and logical datapaths instead
249 of physical ports and physical datapaths. Translation between logical
250 and physical flows helps to ensure isolation between logical datapaths.
251 (The logical flow abstraction also allows the OVN centralized
252 components to do less work, since they do not have to separately
253 compute and push out physical flows to each chassis.)
254 </p>
255
256 <p>
257 The default action when no flow matches is to drop packets.
258 </p>
259
260 <p><em>Architectural Logical Life Cycle of a Packet</em></p>
261
262 <p>
263 The following description focuses on the life cycle of a packet through
264 a logical datapath, ignoring physical details of the implementation.
265 Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
266 <code>ovn-architecture</code>(7) for the physical information.
267 </p>
268
269 <p>
270 The description here is written as if OVN itself executes these steps,
271 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
272 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
273 </p>
274
275 <p>
276 At a high level, OVN passes each packet through the logical datapath's
277 logical ingress pipeline, which may output the packet to one or more
278 logical ports or logical multicast groups. For each such logical output
279 port, OVN passes the packet through the datapath's logical egress
280 pipeline, which may either drop the packet or deliver it to the
281 destination. Between the two pipelines, outputs to logical multicast
282 groups are expanded into logical ports, so that the egress pipeline only
283 processes a single logical output port at a time. Between the two
284 pipelines is also where, when necessary, OVN encapsulates a packet in a
285 tunnel (or tunnels) to transmit to remote hypervisors.
286 </p>
287
288 <p>
289 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
290 table for a row with correct <ref column="logical_datapath"/>, a <ref
291 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
292 of 0, and a <ref column="match"/> that is true for the packet. If none
293 is found, OVN drops the packet. If OVN finds more than one, it chooses
294 the match with the highest <ref column="priority"/>. Then OVN executes
295 each of the actions specified in the row's <ref column="actions"/> column,
296 in the order specified. Some actions, such as those to modify packet
297 headers, require no further details. The <code>next</code> and
298 <code>output</code> actions are special.
299 </p>
300
301 <p>
302 The <code>next</code> action causes the above process to be repeated
303 recursively, except that OVN searches for <ref column="table_id"/> of 1
304 instead of 0. Similarly, any <code>next</code> action in a row found in
305 that table would cause a further search for a <ref column="table_id"/> of
306 2, and so on. When recursive processing completes, flow control returns
307 to the action following <code>next</code>.
308 </p>
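<p>
The lookup and <code>next</code> recursion described above can be sketched
in Python as follows. This is only an illustration under stated assumptions:
the row layout (dictionaries whose <code>match</code> is a callable and whose
<code>actions</code> are tuples) and the <code>after-next</code> marker are
hypothetical, and this is not how <code>ovn-controller</code> is implemented.
</p>

```python
def run_table(flows, datapath, pipeline, table_id, packet, trace):
    """Run one logical table: pick the best matching row, execute its actions."""
    candidates = [
        row for row in flows
        if row["logical_datapath"] == datapath
        and row["pipeline"] == pipeline
        and row["table_id"] == table_id
        and row["match"](packet)
    ]
    if not candidates:
        trace.append("no-match")      # in ingress table 0 this means a drop
        return
    # Among several matching rows, the highest priority wins.
    best = max(candidates, key=lambda row: row["priority"])
    for action in best["actions"]:
        if action[0] == "set":        # hypothetical form: ("set", field, value)
            packet[action[1]] = action[2]
        elif action[0] == "next":     # recurse into the following table
            run_table(flows, datapath, pipeline, table_id + 1, packet, trace)
        else:
            trace.append(action[0])   # e.g. "output"; details omitted here


# Hypothetical rows for a logical datapath named "sw0".
flows = [
    {"logical_datapath": "sw0", "pipeline": "ingress", "table_id": 0,
     "priority": 50, "match": lambda p: True,
     "actions": [("next",), ("after-next",)]},
    {"logical_datapath": "sw0", "pipeline": "ingress", "table_id": 0,
     "priority": 100, "match": lambda p: p["eth.src"] == "bad",
     "actions": []},
    {"logical_datapath": "sw0", "pipeline": "ingress", "table_id": 1,
     "priority": 50, "match": lambda p: p["eth.dst"] == "00:00:00:00:00:02",
     "actions": [("set", "outport", "vif2"), ("output",)]},
]
packet = {"eth.src": "00:00:00:00:00:01", "eth.dst": "00:00:00:00:00:02"}
trace = []
run_table(flows, "sw0", "ingress", 0, packet, trace)
print(trace, packet["outport"])   # ['output', 'after-next'] vif2
```

<p>
The trace shows that control returns to the action following
<code>next</code> once the nested table has finished.
</p>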
309
310 <p>
311 The <code>output</code> action also introduces recursion. Its effect
312 depends on the current value of the <code>outport</code> field. Suppose
313 <code>outport</code> designates a logical port. First, OVN compares
314 <code>inport</code> to <code>outport</code>; if they are equal, it treats
315 the <code>output</code> as a no-op. In the common case, where they are
316 different, the packet enters the egress pipeline. This transition to the
317 egress pipeline discards register data, e.g. <code>reg0</code> ...
318 <code>reg4</code> and connection tracking state, to achieve
319 uniform behavior regardless of whether the egress pipeline is on a
320 different hypervisor (because registers aren't preserved across
321 tunnel encapsulation).
322 </p>
323
324 <p>
325 To execute the egress pipeline, OVN again searches the <ref
326 table="Logical_Flow"/> table for a row with correct <ref
327 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
328 column="match"/> that is true for the packet, but now looking for a <ref
329 column="pipeline"/> of <code>egress</code>. If no matching row is found,
330 the output becomes a no-op. Otherwise, OVN executes the actions for the
331 matching flow (which is chosen from multiple, if necessary, as already
332 described).
333 </p>
334
335 <p>
336 In the <code>egress</code> pipeline, the <code>next</code> action acts as
337 already described, except that it, of course, searches for
338 <code>egress</code> flows. The <code>output</code> action, however, now
339 directly outputs the packet to the output port (which is now fixed,
340 because <code>outport</code> is read-only within the egress pipeline).
341 </p>
342
343 <p>
344 The description earlier assumed that <code>outport</code> referred to a
345 logical port. If it instead designates a logical multicast group, then
346 the description above still applies, with the addition of fan-out from
347 the logical multicast group to each logical port in the group. For each
348 member of the group, OVN executes the logical pipeline as described, with
349 the logical output port replaced by the group member.
350 </p>
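<p>
As a rough sketch of the <code>output</code> behavior in the ingress
pipeline, including multicast fan-out, consider the Python below. The packet
layout, the <code>multicast_groups</code> mapping, and the
<code>run_egress</code> callback are assumptions made for illustration only.
</p>

```python
def ingress_output(packet, multicast_groups, run_egress):
    """Sketch of 'output' in the ingress pipeline; returns ports delivered to."""
    outport = packet["outport"]
    # A multicast group fans out to each member; a plain port stands for itself.
    members = multicast_groups.get(outport, [outport])
    delivered = []
    for port in members:
        if port == packet["inport"]:
            continue                  # output to the input port is a no-op
        # Registers are deliberately not copied: the transition to the
        # egress pipeline discards them.
        copy = {"inport": packet["inport"], "outport": port,
                "headers": dict(packet["headers"])}
        if run_egress(copy):          # hypothetical egress pipeline callback
            delivered.append(port)
    return delivered


groups = {"flood": ["vif1", "vif2", "vif3"]}
packet = {"inport": "vif1", "outport": "flood",
          "headers": {"eth.dst": "ff:ff:ff:ff:ff:ff"}, "regs": {"reg0": 7}}
print(ingress_output(packet, groups, lambda pkt: True))   # ['vif2', 'vif3']
```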
351
352 <p><em>Pipeline Stages</em></p>
353
354 <p>
355 <code>ovn-northd</code> is responsible for populating the
356 <ref table="Logical_Flow"/> table, so the stages are an
357 implementation detail and subject to change. This section
358 describes the current logical flow table.
359 </p>
360
361 <p>
362 The ingress pipeline consists of the following stages:
363 </p>
364 <ul>
365 <li>
366 Port Security (Table 0): Validates the source address, drops
367 packets with a VLAN tag, and, if configured, verifies that the
368 logical port is allowed to send with the source address.
369 </li>
370
371 <li>
372 L2 Destination Lookup (Table 1): Forwards known unicast
373 addresses to the appropriate logical port. Unicast packets to
374 unknown hosts are forwarded to logical ports configured with the
375 special <code>unknown</code> MAC address. Broadcast and
376 multicast packets are flooded to all ports in the logical switch.
377 </li>
378 </ul>
379
380 <p>
381 The egress pipeline consists of the following stages:
382 </p>
383 <ul>
384 <li>
385 ACL (Table 0): Applies any specified access control lists.
386 </li>
387
388 <li>
389 Port Security (Table 1): If configured, verifies that the
390 logical port is allowed to receive packets with the destination
391 address.
392 </li>
393 </ul>
394
395 <column name="logical_datapath">
396 The logical datapath to which the logical flow belongs.
397 </column>
398
399 <column name="pipeline">
400 <p>
401 The primary flows used for deciding on a packet's destination are the
402 <code>ingress</code> flows. The <code>egress</code> flows implement
403 ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details.
404 </p>
405 </column>
406
407 <column name="table_id">
408 The stage in the logical pipeline, analogous to an OpenFlow table number.
409 </column>
410
411 <column name="priority">
412 The flow's priority. Flows with numerically higher priority take
413 precedence over those with lower. If two logical datapath flows with the
414 same priority both match, then the one actually applied to the packet is
415 undefined.
416 </column>
417
418 <column name="match">
419 <p>
420 A matching expression. OVN provides a superset of OpenFlow matching
421 capabilities, using a syntax similar to Boolean expressions in a
422 programming language.
423 </p>
424
425 <p>
426 The most important components of a match expression are
427 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
428 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
429 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
430 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
431 logical OR operator <code>||</code> can combine comparisons into a
432 larger expression.
433 </p>
434
435 <p>
436 Matching expressions also support parentheses for grouping, the logical
437 NOT prefix operator <code>!</code>, and literals <code>0</code> and
438 <code>1</code> to express ``false'' or ``true,'' respectively. The
439 latter is useful by itself as a catch-all expression that matches every
440 packet.
441 </p>
442
443 <p><em>Symbols</em></p>
444
445 <p>
446 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
447 type. Integer symbols have a <dfn>width</dfn> in bits.
448 </p>
449
450 <p>
451 <em>Kinds</em>. There are three kinds of symbols:
452 </p>
453
454 <ul>
455 <li>
456 <p>
457 <dfn>Fields</dfn>. A field symbol represents a packet header or
458 metadata field. For example, a field
459 named <code>vlan.tci</code> might represent the VLAN TCI field in a
460 packet.
461 </p>
462
463 <p>
464 A field symbol can have integer or string type. Integer fields can
465 be nominal or ordinal (see <em>Level of Measurement</em>,
466 below).
467 </p>
468 </li>
469
470 <li>
471 <p>
472 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
473 a larger field. For example, a field <code>vlan.vid</code> might
474 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
475 are provided for syntactic convenience, because it is always
476 possible to instead refer to a subset of bits from a field
477 directly.
478 </p>
479
480 <p>
481 Only ordinal fields (see <em>Level of Measurement</em>,
482 below) may have subfields. Subfields are always ordinal.
483 </p>
484 </li>
485
486 <li>
487 <p>
488 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
489 expression. Predicates may be used much like 1-bit fields. For
490 example, <code>ip4</code> might expand to <code>eth.type ==
491 0x800</code>. Predicates are provided for syntactic convenience,
492 because it is always possible to instead specify the underlying
493 expression directly.
494 </p>
495
496 <p>
497 A predicate whose expansion refers to any nominal field or
498 predicate (see <em>Level of Measurement</em>, below) is nominal;
499 other predicates have Boolean level of measurement.
500 </p>
501 </li>
502 </ul>
503
504 <p>
505 <em>Level of Measurement</em>. See
506 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
507 concept on which this classification is based. There are three
508 levels:
509 </p>
510
511 <ul>
512 <li>
513 <p>
514 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
515 on a scale. OVN considers a field (or subfield) to be ordinal if
516 its bits can be examined individually. This is true for the
517 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
518 </p>
519
520 <p>
521 Any use of an ordinal field may specify a single bit or a range of
522 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
523 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
524 multicast bit in the Ethernet destination address.
525 </p>
526
527 <p>
528 OVN supports all the usual arithmetic relations (<code>==</code>,
529 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
530 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
531 their subfields, because OVN can implement these in OpenFlow and
532 Open vSwitch as collections of bitwise tests.
533 </p>
534 </li>
535
536 <li>
537 <p>
538 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
539 usefully compared except for equality. This is true of OpenFlow
540 port numbers, Ethernet types, and IP protocols: all of
541 these are just identifiers assigned arbitrarily with no deeper
542 meaning. In OpenFlow and Open vSwitch, bits in these fields
543 generally aren't individually addressable.
544 </p>
545
546 <p>
547 OVN only supports arithmetic tests for equality on nominal fields,
548 because OpenFlow and Open vSwitch provide no way for a flow to
549 efficiently implement other comparisons on them. (A test for
550 inequality can be sort of built out of two flows with different
551 priorities, but OVN matching expressions always generate flows with
552 a single priority.)
553 </p>
554
555 <p>
556 String fields are always nominal.
557 </p>
558 </li>
559
560 <li>
561 <p>
562 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
563 and 1, is somewhat exceptional, since it is easy to support both
564 equality and inequality tests on such a field: either one can be
565 implemented as a test for 0 or 1.
566 </p>
567
568 <p>
569 Only predicates (see above) have a Boolean level of measurement.
570 </p>
571
572 <p>
573 This isn't a standard level of measurement.
574 </p>
575 </li>
576 </ul>
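<p>
The bit-range notation used for ordinal fields can be illustrated with a
short Python sketch. It assumes that bit 0 is the least significant bit,
which is consistent with the <code>vlan.tci</code> and <code>eth.dst</code>
examples above; the helper name <code>subfield</code> and the sample values
are hypothetical.
</p>

```python
def subfield(value, low, high=None):
    """Return bits low..high (inclusive, bit 0 = least significant) of value."""
    if high is None:
        high = low                    # a single bit, as in eth.dst[40]
    width = high - low + 1
    return (value >> low) % (2 ** width)


vlan_tci = 0xB064          # sample TCI: PCP 5, bit 12 set, VID 100
eth_dst = 0x01005E000001   # sample multicast Ethernet address

print(subfield(vlan_tci, 13, 15))   # vlan.tci[13..15], the PCP: 5
print(subfield(vlan_tci, 0, 11))    # vlan.tci[0..11], the VID: 100
print(subfield(eth_dst, 40))        # eth.dst[40], the multicast bit: 1
```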
577
578 <p>
579 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
580 additional conditions implied by the use of the symbol. For example,
581 the <code>icmp4.type</code> symbol might have prerequisite
582 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
583 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
584 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
585 &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
586 <code>icmp4</code> is a predicate defined as suggested under
587 <em>Types</em> above).
588 </p>
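<p>
A minimal Python sketch of prerequisite expansion follows. It handles only
conjunctions, represents an expression as a flat list of comparisons that
must all hold, and uses hypothetical lookup tables filled in from the
examples in this section; it is not the OVN expression parser.
</p>

```python
# Hypothetical tables, populated from the examples in this section.
PREREQUISITES = {"icmp4.type": ["icmp4"]}
PREDICATES = {
    "icmp4": ["ip4", "ip.proto == 1"],
    "ip4": ["eth.type == 0x800"],
}


def expand(term, seen=None):
    """Expand one term into a flat list of comparisons that must all hold."""
    seen = set() if seen is None else seen
    if term in seen:
        return []                     # already expanded; avoid duplicates
    seen.add(term)
    if term in PREDICATES:            # a predicate is replaced by its expansion
        return [c for part in PREDICATES[term] for c in expand(part, seen)]
    symbol = term.split()[0]          # "icmp4.type == 0" names icmp4.type
    result = [term]
    for prereq in PREREQUISITES.get(symbol, []):
        result.extend(expand(prereq, seen))
    return result


print(expand("icmp4.type == 0"))
# ['icmp4.type == 0', 'eth.type == 0x800', 'ip.proto == 1']
```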
589
590 <p><em>Relational operators</em></p>
591
592 <p>
593 All of the standard relational operators <code>==</code>,
594 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
595 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
596 fields support only <code>==</code> and <code>!=</code>, and only in a
597 positive sense when outer <code>!</code> are taken into account,
598 e.g. given string field <code>inport</code>, <code>inport ==
599 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
600 not <code>inport != "eth0"</code>.
601 </p>
602
603 <p>
604 The implementation of <code>==</code> (or <code>!=</code> when it is
605 negated) is more efficient than that of the other relational
606 operators.
607 </p>
608
609 <p><em>Constants</em></p>
610
611 <p>
612 Integer constants may be expressed in decimal, hexadecimal prefixed by
613 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
614 their standard forms, or Ethernet addresses as colon-separated hex
615 digits. A constant in any of these forms may be followed by a slash
616 and a second constant (the mask) in the same form, to form a masked
617 constant. IPv4 and IPv6 masks may be given as integers, to express
618 CIDR prefixes.
619 </p>
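<p>
The constant forms listed above could be parsed roughly as in the Python
sketch below. The function names and the heuristics for telling the forms
apart are assumptions for illustration; the real OVN lexer may differ in
details such as error handling.
</p>

```python
import ipaddress


def parse_plain(text):
    """Return (value, width); width is None for plain decimal or hex integers."""
    if "::" not in text and text.count(":") == 5:      # Ethernet address
        value = 0
        for octet in text.split(":"):
            value = value * 256 + int(octet, 16)
        return value, 48
    if ":" in text:
        return int(ipaddress.IPv6Address(text)), 128
    if text.count(".") == 3:
        return int(ipaddress.IPv4Address(text)), 32
    return int(text, 0), None                          # decimal or 0x-prefixed hex


def parse_constant(text):
    """Return (value, mask); mask is None when no '/mask' part is present."""
    value_text, slash, mask_text = text.partition("/")
    value, width = parse_plain(value_text)
    if not slash:
        return value, None
    if width in (32, 128) and mask_text.isdigit():     # CIDR prefix length
        prefix = int(mask_text)
        return value, (2 ** prefix - 1) * 2 ** (width - prefix)
    return value, parse_plain(mask_text)[0]


print(parse_constant("0x800"))                         # (2048, None)
print([hex(n) for n in parse_constant("192.168.0.0/16")])
# ['0xc0a80000', '0xffff0000']
```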
620
621 <p>
622 String constants have the same syntax as quoted strings in JSON (thus,
623 they are Unicode strings).
624 </p>
625
626 <p>
627 Some operators support sets of constants written inside curly braces
628 <code>{</code> ... <code>}</code>. Commas between elements of a set,
629 and after the last element, are optional. With <code>==</code>,
630 ``<code><var>field</var> == { <var>constant1</var>,
631 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
632 for ``<code><var>field</var> == <var>constant1</var> ||
633 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
634 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
635 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
636 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
637 <var>field</var> != <var>constant2</var> &amp;&amp;
638 </code>...<code></code>''.
639 </p>
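<p>
The meaning of the set notation can be shown by evaluating it against a
concrete field value. The Python helpers below are hypothetical names that
mirror the two expansions given above.
</p>

```python
def set_equal(value, constants):
    """field == {c1, c2, ...}: true when the field equals any constant."""
    return any(value == c for c in constants)


def set_not_equal(value, constants):
    """field != {c1, c2, ...}: true when the field differs from every constant."""
    return all(value != c for c in constants)


tcp_dst = 443
print(set_equal(tcp_dst, [80, 443]))       # True
print(set_not_equal(tcp_dst, [80, 443]))   # False
print(set_not_equal(tcp_dst, [22, 25]))    # True
```

<p>
For any value and any set, the second helper returns the negation of the
first, as one would expect from the two expansions.
</p>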
640
641 <p><em>Miscellaneous</em></p>
642
643 <p>
644 Comparisons may name the symbol or the constant first,
645 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
646 acceptable.
647 </p>
648
649 <p>
650 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
651 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
652 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
653 </p>
654
655 <p>
656 For a one-bit field or predicate, a mention of its name is equivalent
657 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
658 is equivalent to <code>vlan.present == 1</code>. The same is true for
659 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
660 technical limitation to implementing the same for ordinal fields of all
661 widths, but the implementation is expensive enough that the syntax
662 parser requires writing an explicit comparison against zero to make
663 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
664 against 0 is required.
665 </p>
666
667 <p>
668 <em>Operator precedence</em> is as shown below, from highest to lowest.
669 There are two exceptions where parentheses are required even though the
670 table would suggest that they are not: <code>&amp;&amp;</code> and
671 <code>||</code> require parentheses when used together, and
672 <code>!</code> requires parentheses when applied to a relational
673 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
674 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
675 parentheses are mandatory.
676 </p>
677
678 <ul>
679 <li><code>()</code></li>
680 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
681 <li><code>!</code></li>
682 <li><code>&amp;&amp; ||</code></li>
683 </ul>
684
685 <p>
686 <em>Comments</em> may be introduced by <code>//</code>, which extends
687 to the next new-line. Comments within a line may be bracketed by
688 <code>/*</code> and <code>*/</code>. Multiline comments are not
689 supported.
690 </p>
691
692 <p><em>Symbols</em></p>
693
694 <p>
695 Most of the symbols below have integer type. Only <code>inport</code>
696 and <code>outport</code> have string type. <code>inport</code> names a
697 logical port. Thus, its value is a <ref column="logical_port"/> name
698 from the <ref table="Port_Binding"/> table. <code>outport</code> may
699 name a logical port, as <code>inport</code>, or a logical multicast
700 group defined in the <ref table="Multicast_Group"/> table. For both
701 symbols, only names within the flow's logical datapath may be used.
702 </p>
703
704 <ul>
705 <li><code>reg0</code>...<code>reg4</code></li>
706 <li><code>inport</code> <code>outport</code></li>
707 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
708 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
709 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
710 <li><code>ip4.src</code> <code>ip4.dst</code></li>
711 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
712 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
713 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
714 <li><code>udp.src</code> <code>udp.dst</code></li>
715 <li><code>sctp.src</code> <code>sctp.dst</code></li>
716 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
717 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
718 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
719 <li><code>ct_mark</code> <code>ct_label</code></li>
720 <li>
721 <p>
722 <code>ct_state</code>, which has the following Boolean subfields:
723 </p>
724 <ul>
725 <li><code>ct.new</code>: True for a new flow</li>
726 <li><code>ct.est</code>: True for an established flow</li>
727 <li><code>ct.rel</code>: True for a related flow</li>
728 <li><code>ct.rpl</code>: True for a reply flow</li>
729 <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
730 </ul>
731 <p>
732 <code>ct_state</code> and its subfields are initialized by the
733 <code>ct_next</code> action, described below.
734 </p>
735 </li>
736 </ul>
737
738 <p>
739 The following predicates are supported:
740 </p>
741
742 <ul>
743 <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
744 <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
745 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
746 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
747 <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
748 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
749 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
750 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
751 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
752 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
753 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
754 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
755 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
756 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
757 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0</code></li>
758 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
759 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
760 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
761 </ul>
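<p>
To make a few of the expansions listed above concrete, the Python sketch
below evaluates them against a packet represented as a dictionary of integer
field values. The dictionary layout and helper names are assumptions for
illustration, only a subset of the predicates is shown, and prerequisites are
not modeled.
</p>

```python
def bits(value, low, high=None):
    """Return bits low..high (inclusive, bit 0 = least significant) of value."""
    high = low if high is None else high
    return (value >> low) % (2 ** (high - low + 1))


# A subset of the predicates above, written as functions of a packet dict.
PREDICATES = {
    "eth.bcast": lambda p: p["eth.dst"] == 0xFFFFFFFFFFFF,
    "eth.mcast": lambda p: bits(p["eth.dst"], 40) == 1,
    "ip4": lambda p: p["eth.type"] == 0x800,
    "ip6": lambda p: p["eth.type"] == 0x86DD,
    "ip": lambda p: holds("ip4", p) or holds("ip6", p),
    "icmp4": lambda p: holds("ip4", p) and p["ip.proto"] == 1,
    "ip4.mcast": lambda p: bits(p["ip4.dst"], 28, 31) == 0xE,
    "tcp": lambda p: p["ip.proto"] == 6,
}


def holds(name, packet):
    """Evaluate the named predicate against the packet."""
    return PREDICATES[name](packet)


# Sample packet: TCP over IPv4 sent to 224.0.0.1 with a multicast MAC.
packet = {"eth.dst": 0x01005E000001, "eth.type": 0x800,
          "ip.proto": 6, "ip4.dst": 0xE0000001}
print([name for name in PREDICATES if holds(name, packet)])
# ['eth.mcast', 'ip4', 'ip', 'ip4.mcast', 'tcp']
```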
762 </column>
763
764 <column name="actions">
765 <p>
766 Logical datapath actions, to be executed when the logical flow
767 represented by this row is the highest-priority match.
768 </p>
769
770 <p>
771 Actions share lexical syntax with the <ref column="match"/> column. An
772 empty set of actions (or one that contains just white space or
773 comments), or a set of actions that consists of just
774 <code>drop;</code>, causes the matched packets to be dropped.
775 Otherwise, the column should contain a sequence of actions, each
776 terminated by a semicolon.
777 </p>
778
779 <p>
780 The following actions are defined:
781 </p>
782
783 <dl>
784 <dt><code>output;</code></dt>
785 <dd>
786 <p>
787 In the ingress pipeline, this action executes the
788 <code>egress</code> pipeline as a subroutine. If
789 <code>outport</code> names a logical port, the egress pipeline
790 executes once; if it is a multicast group, the egress pipeline runs
791 once for each logical port in the group.
792 </p>
793
794 <p>
795 In the egress pipeline, this action performs the actual
796 output to the <code>outport</code> logical port. (In the egress
797 pipeline, <code>outport</code> never names a multicast group.)
798 </p>
799
800 <p>
801 Output to the input port is implicitly dropped, that is,
802 <code>output</code> becomes a no-op if <code>outport</code> ==
803 <code>inport</code>. Occasionally it may be useful to override
804 this behavior, e.g. to send an ARP reply to an ARP request; to do
805 so, use <code>inport = "";</code> to set the logical input port to
806 an empty string (which should not be used as the name of any
807 logical port).
808 </p>
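<p>
As an illustrative sketch of the ARP case (not a definitive flow), the
following ingress-pipeline actions turn an ARP request into a reply and
send it back out the port on which it arrived. The Ethernet address
<code>00:00:00:00:00:01</code> and the IP address
<code>192.168.0.1</code> are hypothetical placeholders for the
addresses being answered for:
</p>

<pre>
eth.dst = eth.src;
eth.src = 00:00:00:00:00:01;
arp.op = 2; /* ARP reply. */
arp.tha = arp.sha;
arp.sha = 00:00:00:00:00:01;
arp.tpa = arp.spa;
arp.spa = 192.168.0.1;
outport = inport;
inport = ""; /* Allow output to the original input port. */
output;
</pre>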
809 </dd>
810
811 <dt><code>next;</code></dt>
812 <dt><code>next(<var>table</var>);</code></dt>
813 <dd>
814 Executes another logical datapath table as a subroutine. By default,
815 the table after the current one is executed. Specify
816 <var>table</var> to jump to a specific table in the same pipeline.
817 </dd>
818
819 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
820 <dd>
821 <p>
822 Sets data or metadata field <var>field</var> to constant value
823 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
824 logical output port. To set only a subset of bits in a field,
825 specify a subfield for <var>field</var> or a masked
826 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
827 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
828 the VLAN PCP.
829 </p>
830
831 <p>
832 Assigning to a field with prerequisites implicitly adds those
833 prerequisites to <ref column="match"/>; thus, for example, a flow
834 that sets <code>tcp.dst</code> applies only to TCP flows,
835 regardless of whether its <ref column="match"/> mentions any TCP
836 field.
837 </p>
838
839 <p>
840 Not all fields are modifiable (e.g. <code>eth.type</code> and
841 <code>ip.proto</code> are read-only), and not all modifiable fields
842 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
843 as a whole). The <code>outport</code> field is modifiable in the
844 <code>ingress</code> pipeline but not in the <code>egress</code>
845 pipeline.
846 </p>
847 </dd>
848
849 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
850 <dd>
851 <p>
852 Sets data or metadata field <var>field1</var> to the value of data
853 or metadata field <var>field2</var>, e.g. <code>reg0 =
854 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
855 To modify only a subset of a field's bits, specify a subfield for
856 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
857 = reg0[0..2];</code> copies the least-significant bits of
858 <code>reg0</code> into the VLAN PCP.
859 </p>
860
861 <p>
862 <var>field1</var> and <var>field2</var> must be the same type,
863 either both string or both integer fields. If they are both
864 integer fields, they must have the same width.
865 </p>
866
867 <p>
868 If <var>field1</var> or <var>field2</var> has prerequisites, they
869 are added implicitly to <ref column="match"/>. It is possible to
870 write an assignment with contradictory prerequisites, such as
871 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
872 that a logical flow with such an assignment will never be matched.
873 </p>
874 </dd>
875
876 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
877 <dd>
878 <p>
879 Similar to <code><var>field1</var> = <var>field2</var>;</code>
880 except that the two values are exchanged instead of copied. Both
881 <var>field1</var> and <var>field2</var> must be modifiable.
882 </p>
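<p>
As a minimal sketch, assuming an IPv4 packet that should be reflected
back toward its sender, the following actions swap the Ethernet and
IPv4 source and destination addresses. Because <code>ip4.src</code>
and <code>ip4.dst</code> have prerequisites, a flow with these actions
would apply only to IPv4 packets:
</p>

<pre>
eth.src &lt;-&gt; eth.dst;
ip4.src &lt;-&gt; ip4.dst;
</pre>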
883 </dd>
884
885 <dt><code>ip.ttl--;</code></dt>
886 <dd>
887 <p>
888 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
889 or negative, then processing of the packet halts; no further
890 actions are processed. (To properly handle such cases, a
891 higher-priority flow should match on
892 <code>ip.ttl == {0, 1};</code>.)
893 </p>
894
895 <p><b>Prerequisite:</b> <code>ip</code></p>
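<p>
A sketch of the arrangement described above, written as informal
priority/match/actions triples for two flows in the same table. The
priority values are hypothetical, and <code>drop;</code> stands in for
whatever handling an expired TTL should receive:
</p>

<pre>
priority=100  match=(ip.ttl == {0, 1})  actions=(drop;)
priority=50   match=(ip)                actions=(ip.ttl--; next;)
</pre>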
896 </dd>
897
898 <dt><code>ct_next;</code></dt>
899 <dd>
900 <p>
901 Apply connection tracking to the flow, initializing
902 <code>ct_state</code> for matching in later tables.
903 Automatically moves on to the next table, as if followed by
904 <code>next</code>.
905 </p>
906
907 <p>
908 As a side effect, IP fragments will be reassembled for matching.
909 If a fragmented packet is output, then it will be sent with any
910 overlapping fragments squashed. The connection tracking state is
911 scoped by the logical port, so overlapping addresses may be used.
912 To allow traffic related to the matched flow, execute
913 <code>ct_commit</code>.
914 </p>
915
916 <p>
917 It is possible to have actions follow <code>ct_next</code>,
918 but they will not have access to any of its side-effects, so
919 doing this is not generally useful.
920 </p>
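<p>
The following sketch shows one way <code>ct_next</code> and
<code>ct_commit</code> might be combined across two consecutive
tables, written as informal table/match/actions triples. It assumes
that the <code>ct.new</code> and <code>ct.est</code> predicates on
<code>ct_state</code> are available for matching; the table numbers
and the choice of TCP port 80 are hypothetical:
</p>

<pre>
table=N    match=(ip)                               actions=(ct_next;)
table=N+1  match=(ct.new &amp;&amp; tcp.dst == 80)  actions=(ct_commit; next;)
table=N+1  match=(ct.est)                           actions=(next;)
</pre>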
921 </dd>
922
923 <dt><code>ct_commit;</code></dt>
924 <dd>
925 Commit the flow to the connection tracking entry associated
926 with it by a previous call to <code>ct_next</code>.
927 </dd>
928
929 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
930 <dd>
931 <p>
932 Temporarily replaces the IPv4 packet being processed by an ARP
933 packet and executes each nested <var>action</var> on the ARP
934 packet. Actions following the <code>arp</code> action, if any, apply
935 to the original, unmodified packet.
936 </p>
937
938 <p>
939 The ARP packet that this action operates on is initialized based on
940 the IPv4 packet being processed, as follows. These are default
941 values that the nested actions will probably want to change:
942 </p>
943
944 <ul>
945 <li><code>eth.src</code> unchanged</li>
946 <li><code>eth.dst</code> unchanged</li>
947 <li><code>eth.type = 0x0806</code></li>
948 <li><code>arp.op = 1</code> (ARP request)</li>
949 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
950 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
951 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
952 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
953 </ul>
954
955 <p>
956 The ARP packet has the same VLAN header, if any, as the IP packet
957 it replaces.
958 </p>
959
960 <p><b>Prerequisite:</b> <code>ip4</code></p>
961 </dd>
962
963 <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt>
964
965 <dd>
966 <p>
967 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
968 IP address field <var>A</var>.
969 </p>
970
971 <p>
972 Looks up <var>A</var> in <var>P</var>'s ARP table. If an entry is
973 found, stores its Ethernet address in <code>eth.dst</code>,
974 otherwise stores <code>00:00:00:00:00:00</code> in
975 <code>eth.dst</code>.
976 </p>
977
978 <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p>
979 </dd>
980
981 <dt>
982 <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code>
983 </dt>
984
985 <dd>
986 <p>
987 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
988 IP address field <var>A</var>, 48-bit Ethernet address field
989 <var>E</var>.
990 </p>
991
992 <p>
993 Adds or updates the entry for IP address <var>A</var> in logical
994 port <var>P</var>'s ARP table, setting its Ethernet address to
995 <var>E</var>.
996 </p>
997
998 <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p>
999 </dd>
1000 </dl>
1001
1002 <p>
1003 The following actions will likely be useful later, but they have not
1004 been thought out carefully.
1005 </p>
1006
1007 <dl>
1008 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
1009 <dd>
1010 <p>
1011 Temporarily replaces the IPv4 packet being processed by an ICMPv4
1012 packet and executes each nested <var>action</var> on the ICMPv4
1013 packet. Actions following the <code>icmp4</code> action, if any,
1014 apply to the original, unmodified packet.
1015 </p>
1016
1017 <p>
1018 The ICMPv4 packet that this action operates on is initialized based
1019 on the IPv4 packet being processed, as follows. These are default
1020 values that the nested actions will probably want to change.
1021 Ethernet and IPv4 fields not listed here are not changed:
1022 </p>
1023
1024 <ul>
1025 <li><code>ip.proto = 1</code> (ICMPv4)</li>
1026 <li><code>ip.frag = 0</code> (not a fragment)</li>
1027 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
1028 <li><code>icmp4.code = 1</code> (host unreachable)</li>
1029 </ul>
1030
1031 <p>
1032 Details TBD.
1033 </p>
1034
1035 <p><b>Prerequisite:</b> <code>ip4</code></p>
1036 </dd>
1037
1038 <dt><code>tcp_reset;</code></dt>
1039 <dd>
1040 <p>
1041 This action transforms the current TCP packet according to the
1042 following pseudocode:
1043 </p>
1044
1045 <pre>
1046 if (tcp.ack) {
1047 tcp.seq = tcp.ack;
1048 } else {
1049 tcp.ack = tcp.seq + length(tcp.payload);
1050 tcp.seq = 0;
1051 }
1052 tcp.flags = RST;
1053 </pre>
1054
1055 <p>
1056 Then, the action drops all TCP options and payload data, and
1057 updates the TCP checksum.
1058 </p>
1059
1060 <p>
1061 Details TBD.
1062 </p>
1063
1064 <p><b>Prerequisite:</b> <code>tcp</code></p>
1065 </dd>
1066 </dl>
1067 </column>
1068
1069 <column name="external_ids" key="stage-name">
1070 Human-readable name for this flow's stage in the pipeline.
1071 </column>
1072
1073 <group title="Common Columns">
1074 The overall purpose of these columns is described under <code>Common
1075 Columns</code> at the beginning of this document.
1076
1077 <column name="external_ids"/>
1078 </group>
1079 </table>
1080
1081 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1082 <p>
1083 The rows in this table define multicast groups of logical ports.
1084 Multicast groups allow a single packet transmitted over a tunnel to a
1085 hypervisor to be delivered to multiple VMs on that hypervisor, which
1086 uses bandwidth more efficiently.
1087 </p>
1088
1089 <p>
1090 Each row in this table defines a logical multicast group numbered <ref
1091 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1092 ports are listed in the <ref column="ports"/> column.
1093 </p>
1094
1095 <column name="datapath">
1096 The logical datapath in which the multicast group resides.
1097 </column>
1098
1099 <column name="tunnel_key">
1100 The value used to designate this logical egress port in tunnel
1101 encapsulations. An index forces the key to be unique within the <ref
1102 column="datapath"/>. The unusual range ensures that multicast group IDs
1103 do not overlap with logical port IDs.
1104 </column>
1105
1106 <column name="name">
1107 <p>
1108 The logical multicast group's name. An index forces the name to be
1109 unique within the <ref column="datapath"/>. Logical flows in the
1110 ingress pipeline may output to the group just as for individual logical
1111 ports, by assigning the group's name to <code>outport</code> and
1112 executing an <code>output</code> action.
1113 </p>
1114
1115 <p>
1116 Multicast group names and logical port names share a single namespace
1117 and thus should not overlap (but the database schema cannot enforce
1118 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1119 that begin with <code>_MC_</code>.
1120 </p>
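<p>
For illustration, an ingress-pipeline flow could send a packet to
every port in a group with actions like the following. The group name
<code>_MC_flood</code> is an assumed example of a name with the
<code>_MC_</code> prefix, not one that this table guarantees to exist:
</p>

<pre>
outport = "_MC_flood"; output;
</pre>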
1121 </column>
1122
1123 <column name="ports">
1124 The logical ports included in the multicast group. All of these ports
1125 must be in the <ref column="datapath"/> logical datapath (but the
1126 database schema cannot enforce this).
1127 </column>
1128 </table>
1129
1130 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1131 <p>
1132 Each row in this table identifies physical bindings of a logical
1133 datapath. A logical datapath implements a logical pipeline among the
1134 ports in the <ref table="Port_Binding"/> table associated with it. In
1135 practice, the pipeline in a given logical datapath implements either a
1136 logical switch or a logical router.
1137 </p>
1138
1139 <column name="tunnel_key">
1140 The tunnel key value to which the logical datapath is bound.
1141 The <code>Tunnel Encapsulation</code> section in
1142 <code>ovn-architecture</code>(7) describes how tunnel keys are
1143 constructed for each supported encapsulation.
1144 </column>
1145
1146 <group title="OVN_Northbound Relationship">
1147 <p>
1148 Each row in <ref table="Datapath_Binding"/> is associated with some
1149 logical datapath. <code>ovn-northd</code> uses these keys to track the
1150 association of a logical datapath with concepts in the <ref
1151 db="OVN_Northbound"/> database.
1152 </p>
1153
1154 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1155 For a logical datapath that represents a logical switch,
1156 <code>ovn-northd</code> stores in this key the UUID of the
1157 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1158 the <ref db="OVN_Northbound"/> database.
1159 </column>
1160
1161 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1162 For a logical datapath that represents a logical router,
1163 <code>ovn-northd</code> stores in this key the UUID of the
1164 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1165 the <ref db="OVN_Northbound"/> database.
1166 </column>
1167 </group>
1168
1169 <group title="Common Columns">
1170 The overall purpose of these columns is described under <code>Common
1171 Columns</code> at the beginning of this document.
1172
1173 <column name="external_ids"/>
1174 </group>
1175 </table>
1176
1177 <table name="Port_Binding" title="Physical-Logical Port Bindings">
1178 <p>
1179 Most rows in this table identify the physical location of a logical port.
1180 (The exceptions are logical patch ports, which do not have any physical
1181 location.)
1182 </p>
1183
1184 <p>
1185 For every <code>Logical_Port</code> record in the <code>OVN_Northbound</code>
1186 database, <code>ovn-northd</code> creates a record in this table.
1187 <code>ovn-northd</code> populates and maintains every column except
1188 the <code>chassis</code> column, which it leaves empty in new records.
1189 </p>
1190
1191 <p>
1192 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1193 populates the <code>chassis</code> column for the records that
1194 identify the logical ports located on its hypervisor/gateway.
1195 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> finds
1196 these ports by monitoring the local hypervisor's Open_vSwitch
1197 database, which identifies logical ports via the conventions described
1198 in <code>IntegrationGuide.md</code>.
1199 </p>
1200
1201 <p>
1202 When a chassis shuts down gracefully, it should clean up the
1203 <code>chassis</code> column that it previously had populated.
1204 (This is not critical because resources hosted on the chassis are equally
1205 unreachable regardless of whether their rows are present.) To handle the
1206 case where a VM is shut down abruptly on one chassis, then brought up
1207 again on a different one,
1208 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1209 overwrite the <code>chassis</code> column with new information.
1210 </p>
1211
1212 <group title="Core Features">
1213 <column name="datapath">
1214 The logical datapath to which the logical port belongs.
1215 </column>
1216
1217 <column name="logical_port">
1218 A logical port, taken from <ref table="Logical_Port" column="name"
1219 db="OVN_Northbound"/> in the OVN_Northbound database's <ref
1220 table="Logical_Port" db="OVN_Northbound"/> table. OVN does not
1221 prescribe a particular format for the logical port ID.
1222 </column>
1223
1224 <column name="chassis">
1225 The physical location of the logical port. To successfully identify a
1226 chassis, this column must be a <ref table="Chassis"/> record. This is
1227 populated by
1228 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>.
1229 </column>
1230
1231 <column name="tunnel_key">
1232 <p>
1233 A number that represents the logical port in the key (e.g. STT key or
1234 Geneve TLV) field carried within tunnel protocol packets.
1235 </p>
1236
1237 <p>
1238 The tunnel ID must be unique within the scope of a logical datapath.
1239 </p>
1240 </column>
1241
1242 <column name="mac">
1243 <p>
1244 The Ethernet address or addresses used as a source address on the
1245 logical port, each in the form
1246 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1247 The string <code>unknown</code> is also allowed to indicate that the
1248 logical port has an unknown set of (additional) source addresses.
1249 </p>
1250
1251 <p>
1252 A VM interface would ordinarily have a single Ethernet address. A
1253 gateway port might initially only have <code>unknown</code>, and then
1254 add MAC addresses to the set as it learns new source addresses.
1255 </p>
1256 </column>
1257
1258 <column name="type">
1259 <p>
1260 A type for this logical port. Logical ports can be used to model other
1261 types of connectivity into an OVN logical switch. The following types
1262 are defined:
1263 </p>
1264
1265 <dl>
1266 <dt>(empty string)</dt>
1267 <dd>VM (or VIF) interface.</dd>
1268
1269 <dt><code>patch</code></dt>
1270 <dd>
1271 One of a pair of logical ports that act as if connected by a patch
1272 cable. Useful for connecting two logical datapaths, e.g. to connect
1273 a logical router to a logical switch or to another logical router.
1274 </dd>
1275
1276 <dt><code>localnet</code></dt>
1277 <dd>
1278 A connection to a locally accessible network from each
1279 <code>ovn-controller</code> instance. A logical switch can only
1280 have a single <code>localnet</code> port attached. This is used
1281 to model direct connectivity to an existing network.
1282 </dd>
1283
1284 <dt><code>vtep</code></dt>
1285 <dd>
1286 A port to a logical switch on a VTEP gateway chassis. In order to
1287 get this port correctly recognized by the OVN controller, the <ref
1288 column="options"
1289 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1290 column="options"
1291 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1292 be defined.
1293 </dd>
1294 </dl>
1295 </column>
1296 </group>
1297
1298 <group title="Patch Options">
1299 <p>
1300 These options apply to logical ports with <ref column="type"/> of
1301 <code>patch</code>.
1302 </p>
1303
1304 <column name="options" key="peer">
1305 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1306 record for the other side of the patch. The named <ref
1307 column="logical_port"/> must specify this <ref column="logical_port"/>
1308 in its own <code>peer</code> option. That is, the two patch logical
1309 ports must have reversed <ref column="logical_port"/> and
1310 <code>peer</code> values.
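<p>
As a sketch, a correctly paired set of patch ports would appear as two
<ref table="Port_Binding"/> rows like the following. The port names
<code>lrp1</code> and <code>lsp1</code> are hypothetical:
</p>

<pre>
logical_port="lrp1"  type="patch"  options:peer="lsp1"
logical_port="lsp1"  type="patch"  options:peer="lrp1"
</pre>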
1311 </column>
1312 </group>
1313
1314 <group title="Localnet Options">
1315 <p>
1316 These options apply to logical ports with <ref column="type"/> of
1317 <code>localnet</code>.
1318 </p>
1319
1320 <column name="options" key="network_name">
1321 Required. <code>ovn-controller</code> uses the configuration entry
1322 <code>ovn-bridge-mappings</code> to determine how to connect to this
1323 network. <code>ovn-bridge-mappings</code> is a list of network names
1324 mapped to a local OVS bridge that provides access to that network. An
1325 example of configuring <code>ovn-bridge-mappings</code> would be:
1326
1327 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1328
1329 <p>
1330 When a logical switch has a <code>localnet</code> port attached,
1331 every chassis that may have a local vif attached to that logical
1332 switch must have a bridge mapping configured to reach that
1333 <code>localnet</code>. Traffic that arrives on a
1334 <code>localnet</code> port is never forwarded over a tunnel to
1335 another chassis.
1336 </p>
1337 </column>
1338
1339 <column name="tag">
1340 If set, indicates that the port represents a connection to a specific
1341 VLAN on a locally accessible network. The VLAN ID is used to match
1342 incoming traffic and is also added to outgoing traffic.
1343 </column>
1344 </group>
1345
1346 <group title="VTEP Options">
1347 <p>
1348 These options apply to logical ports with <ref column="type"/> of
1349 <code>vtep</code>.
1350 </p>
1351
1352 <column name="options" key="vtep-physical-switch">
1353 Required. The name of the VTEP gateway.
1354 </column>
1355
1356 <column name="options" key="vtep-logical-switch">
1357 Required. A logical switch name connected by the VTEP gateway. Must
1358 be set when <ref column="type"/> is <code>vtep</code>.
1359 </column>
1360 </group>
1361
1362 <group title="VMI (or VIF) Options">
1363 <p>
1364 These options apply to logical ports whose <ref column="type"/> is
1365 the empty string.
1366 </p>
1367
1368 <column name="options" key="policing_rate">
1369 If set, indicates the maximum rate for data sent from this interface,
1370 in kbps. Data exceeding this rate is dropped.
1371 </column>
1372
1373 <column name="options" key="policing_burst">
1374 If set, indicates the maximum burst size for data sent from this
1375 interface, in kb.
1376 </column>
1377 </group>
1378
1379 <group title="Nested Containers">
1380 <p>
1381 These columns support containers nested within a VM. Specifically,
1382 they are used when <ref column="type"/> is empty and <ref
1383 column="logical_port"/> identifies the interface of a container spawned
1384 inside a VM. They are empty for containers or VMs that run directly on
1385 a hypervisor.
1386 </p>
1387
1388 <column name="parent_port">
1389 This is taken from
1390 <ref table="Logical_Port" column="parent_name" db="OVN_Northbound"/>
1391 in the OVN_Northbound database's <ref table="Logical_Port"
1392 db="OVN_Northbound"/> table.
1393 </column>
1394
1395 <column name="tag">
1396 <p>
1397 Identifies the VLAN tag in the network traffic associated with that
1398 container's network interface.
1399 </p>
1400
1401 <p>
1402 This column is used for a different purpose when <ref column="type"/>
1403 is <code>localnet</code> (see <code>Localnet Options</code>, above).
1404 </p>
1405 </column>
1406 </group>
1407 </table>
1408
1409 <table name="MAC_Binding" title="IP to MAC bindings">
1410 <p>
1411 Each row in this table specifies a binding from an IP address to an
1412 Ethernet address that has been discovered through ARP (for IPv4) or
1413 neighbor discovery (for IPv6). This table is primarily used to discover
1414 bindings on physical networks, because IP-to-MAC bindings for virtual
1415 machines are usually populated statically into the <ref
1416 table="Port_Binding"/> table.
1417 </p>
1418
1419 <p>
1420 This table expresses a functional relationship: <ref
1421 table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
1422 <ref column="mac"/>.
1423 </p>
1424
1425 <p>
1426 In outline, the lifetime of a logical router's MAC binding looks like
1427 this:
1428 </p>
1429
1430 <ol>
1431 <li>
1432 On hypervisor 1, a logical router determines that a packet should be
1433 forwarded to IP address <var>A</var> on one of its router ports. It
1434 uses its logical flow table to determine that <var>A</var> lacks a
1435 static IP-to-MAC binding and the <code>get_arp</code> action to
1436 determine that it lacks a dynamic IP-to-MAC binding.
1437 </li>
1438
1439 <li>
1440 Using an OVN logical <code>arp</code> action, the logical router
1441 generates and sends a broadcast ARP request to the router port. It
1442 drops the IP packet.
1443 </li>
1444
1445 <li>
1446 The logical switch attached to the router port delivers the ARP request
1447 to all of its ports. (It might make sense to deliver it only to ports
1448 that have no static IP-to-MAC bindings, but this could also be
1449 surprising behavior.)
1450 </li>
1451
1452 <li>
1453 A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
1454 attached to the logical switch owns the IP address in question. It
1455 composes an ARP reply and unicasts it to the logical router port's
1456 Ethernet address.
1457 </li>
1458
1459 <li>
1460 The logical switch delivers the ARP reply to the logical router port.
1461 </li>
1462
1463 <li>
1464 The logical router flow table executes a <code>put_arp</code> action.
1465 To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
1466 to the <ref table="MAC_Binding"/> table.
1467 </li>
1468
1469 <li>
1470 On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
1471 table="MAC_Binding"/> table from the OVN southbound database. The next
1472 packet destined to <var>A</var> through the logical router is sent
1473 directly to the bound Ethernet address.
1474 </li>
1475 </ol>
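<p>
The logical router's part in the steps above might be sketched with
flows like the following, written as informal match/actions pairs.
This is an assumption-laden outline rather than the flows that
<code>ovn-northd</code> actually installs: it assumes the destination
is directly connected (so that <code>ip4.dst</code> is the address to
resolve), that <code>eth.src</code> already holds the router port's
Ethernet address, and that <code>192.168.0.1</code> is the router
port's (hypothetical) IP address:
</p>

<pre>
/* Step 1: look up a dynamic binding for the destination. */
match=(ip4)
actions=(get_arp(outport, ip4.dst); next;)

/* Step 2: no binding was found, so broadcast an ARP request. */
match=(ip4 &amp;&amp; eth.dst == 00:00:00:00:00:00)
actions=(arp { eth.dst = ff:ff:ff:ff:ff:ff; arp.spa = 192.168.0.1; output; };)

/* Step 6: record the binding carried by the ARP reply. */
match=(arp.op == 2)
actions=(put_arp(inport, arp.spa, arp.sha);)
</pre>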
1476
1477 <column name="logical_port">
1478 The logical port on which the binding was discovered.
1479 </column>
1480
1481 <column name="ip">
1482 The bound IP address.
1483 </column>
1484
1485 <column name="mac">
1486 The Ethernet address to which the IP is bound.
1487 </column>
1488 </table>
1489 </database>