1 <?xml version="1.0" encoding="utf-8"?>
2 <database name="ovn-sb" title="OVN Southbound Database">
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
13 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
14 northbound to the Cloud Management System, via <code>ovn-northd</code>.
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
20 The OVN Southbound database contains classes of data with
21 different properties, as described in the sections below.
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 contains all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
38 The <ref table="Chassis"/> table comprises the PN tables.
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
78 </p>
79
80 <h3>Logical-physical bindings</h3>
81
82 <p>
83 These tables link logical and physical components. They show the current
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
87 </p>
88
89 <p>
90 These tables change frequently, at least every time a VM powers up or down
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
104 </p>
105
106 <h3>MAC bindings</h3>
107
108 <p>
109 The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
110 to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
111 and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual
112 machines are statically populated into the <ref table="Port_Binding"/>
113 table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
114 on physical networks.
115 </p>
116
117 <h2>Common Columns</h2>
118
119 <p>
120 Some tables contain a special column named <code>external_ids</code>. This
121 column has the same form and purpose each place that it appears, so we
122 describe it here to save space later.
123 </p>
124
125 <dl>
126 <dt><code>external_ids</code>: map of string-string pairs</dt>
127 <dd>
128 Key-value pairs for use by the software that manages the OVN Southbound
129 database rather than by
130 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
131 particular, <code>ovn-northd</code> can use key-value pairs in this
132 column to relate entities in the southbound database to higher-level
133 entities (such as entities in the OVN Northbound database). Individual
134 key-value pairs in this column may be documented in some cases to aid
135 in understanding and troubleshooting, but the reader should not mistake
136 such documentation as comprehensive.
137 </dd>
138 </dl>
139
140 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
141 <p>
142 Each row in this table represents a hypervisor or gateway (a chassis) in
143 the physical network (PN). Each chassis, via
144 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
145 and updates its own row, and keeps a copy of the remaining rows to
146 determine how to reach other hypervisors.
147 </p>
148
149 <p>
150 When a chassis shuts down gracefully, it should remove its own row.
151 (This is not critical because resources hosted on the chassis are equally
152 unreachable regardless of whether the row is present.) If a chassis
153 shuts down permanently without removing its row, some kind of manual or
154 automatic cleanup is eventually needed; we can devise a process for that
155 as necessary.
156 </p>
157
158 <column name="name">
159 OVN does not prescribe a particular format for chassis names.
160 ovn-controller populates this column using <ref key="system-id"
161 table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/>
162 in the Open_vSwitch database's <ref table="Open_vSwitch"
163 db="Open_vSwitch"/> table. ovn-controller-vtep populates this
164 column with <ref table="Physical_Switch" column="name"
165 db="hardware_vtep"/> in the hardware_vtep database's
166 <ref table="Physical_Switch" db="hardware_vtep"/> table.
167 </column>
168
169 <column name="hostname">
170 The hostname of the chassis, if applicable. ovn-controller will populate
171 this column with the hostname of the host it is running on.
172 ovn-controller-vtep will leave this column empty.
173 </column>
174
175 <group title="Encapsulation Configuration">
176 <p>
177 OVN uses encapsulation to transmit logical dataplane packets
178 between chassis.
179 </p>
180
181 <column name="encaps">
182 Points to supported encapsulation configurations to transmit
183 logical dataplane packets to this chassis. Each entry is a <ref
184 table="Encap"/> record that describes the configuration.
185 </column>
186 </group>
187
188 <group title="Gateway Configuration">
189 <p>
190 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
191 OVN-managed part of a logical network and a physical VLAN, extending a
192 tunnel-based logical network into a physical network. Gateways are
193 typically dedicated nodes that do not host VMs and will be controlled
194 by <code>ovn-controller-vtep</code>.
195 </p>
196
197 <column name="vtep_logical_switches">
198 Stores all VTEP logical switch names connected by this gateway
199 chassis. A <ref table="Port_Binding"/> table entry whose
200 <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
201 equals <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and whose
202 <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
203 value appears in <ref table="Chassis"/>
204 <ref column="vtep_logical_switches" table="Chassis"/>, is
205 associated with this <ref table="Chassis"/>.
206 </column>
207 </group>
208 </table>
209
210 <table name="Encap" title="Encapsulation Types">
211 <p>
212 The <ref column="encaps" table="Chassis"/> column in the <ref
213 table="Chassis"/> table refers to rows in this table to identify
214 how OVN may transmit logical dataplane packets to this chassis.
215 Each chassis, via <code>ovn-controller</code>(8) or
216 <code>ovn-controller-vtep</code>(8), adds and updates its own rows
217 and keeps a copy of the remaining rows to determine how to reach
218 other chassis.
219 </p>
220
221 <column name="type">
222 The encapsulation to use to transmit packets to this chassis.
223 Hypervisors must use either <code>geneve</code> or
224 <code>stt</code>. Gateways may use <code>vxlan</code>,
225 <code>geneve</code>, or <code>stt</code>.
226 </column>
227
228 <column name="options">
229 Options for configuring the encapsulation, e.g. IPsec parameters when
230 IPsec support is introduced. No options are currently defined.
231 </column>
232
233 <column name="ip">
234 The IPv4 address of the encapsulation tunnel endpoint.
235 </column>
236 </table>
237
238 <table name="Logical_Flow" title="Logical Network Flows">
239 <p>
240 Each row in this table represents one logical flow.
241 <code>ovn-northd</code> populates this table with logical flows
242 that implement the L2 and L3 topologies specified in the
243 <ref db="OVN_Northbound"/> database. Each hypervisor, via
244 <code>ovn-controller</code>, translates the logical flows into
245 OpenFlow flows specific to its hypervisor and installs them into
246 Open vSwitch.
247 </p>
248
249 <p>
250 Logical flows are expressed in an OVN-specific format, described here. A
251 logical datapath flow is much like an OpenFlow flow, except that the
252 flows are written in terms of logical ports and logical datapaths instead
253 of physical ports and physical datapaths. Translation between logical
254 and physical flows helps to ensure isolation between logical datapaths.
255 (The logical flow abstraction also allows the OVN centralized
256 components to do less work, since they do not have to separately
257 compute and push out physical flows to each chassis.)
258 </p>
259
260 <p>
261 The default action when no flow matches is to drop packets.
262 </p>
263
264 <p><em>Architectural Logical Life Cycle of a Packet</em></p>
265
266 <p>
267 The following description focuses on the life cycle of a packet through
268 a logical datapath, ignoring physical details of the implementation.
269 Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
270 <code>ovn-architecture</code>(7) for the physical information.
271 </p>
272
273 <p>
274 The description here is written as if OVN itself executes these steps,
275 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
276 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
277 </p>
278
279 <p>
280 At a high level, OVN passes each packet through the logical datapath's
281 logical ingress pipeline, which may output the packet to one or more
282 logical ports or logical multicast groups. For each such logical output
283 port, OVN passes the packet through the datapath's logical egress
284 pipeline, which may either drop the packet or deliver it to the
285 destination. Between the two pipelines, outputs to logical multicast
286 groups are expanded into logical ports, so that the egress pipeline only
287 processes a single logical output port at a time. Between the two
288 pipelines is also where, when necessary, OVN encapsulates a packet in a
289 tunnel (or tunnels) to transmit to remote hypervisors.
290 </p>
291
292 <p>
293 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
294 table for a row with correct <ref column="logical_datapath"/>, a <ref
295 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
296 of 0, and a <ref column="match"/> that is true for the packet. If none
297 is found, OVN drops the packet. If OVN finds more than one, it chooses
298 the match with the highest <ref column="priority"/>. Then OVN executes
299 each of the actions specified in the row's <ref column="actions"/> column,
300 in the order specified. Some actions, such as those to modify packet
301 headers, require no further details. The <code>next</code> and
302 <code>output</code> actions are special.
303 </p>
304
305 <p>
306 The <code>next</code> action causes the above process to be repeated
307 recursively, except that OVN searches for <ref column="table_id"/> of 1
308 instead of 0. Similarly, any <code>next</code> action in a row found in
309 that table would cause a further search for a <ref column="table_id"/> of
310 2, and so on. When recursive processing completes, flow control returns
311 to the action following <code>next</code>.
312 </p>
313
314 <p>
315 The <code>output</code> action also introduces recursion. Its effect
316 depends on the current value of the <code>outport</code> field. Suppose
317 <code>outport</code> designates a logical port. First, OVN compares
318 <code>inport</code> to <code>outport</code>; if they are equal, it treats
319 the <code>output</code> as a no-op. In the common case, where they are
320 different, the packet enters the egress pipeline. This transition to the
321 egress pipeline discards register data, e.g. <code>reg0</code> ...
322 <code>reg4</code> and connection tracking state, to achieve
323 uniform behavior regardless of whether the egress pipeline is on a
324 different hypervisor (because registers aren't preserved across
325 tunnel encapsulation).
326 </p>
327
328 <p>
329 To execute the egress pipeline, OVN again searches the <ref
330 table="Logical_Flow"/> table for a row with correct <ref
331 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
332 column="match"/> that is true for the packet, but now looking for a <ref
333 column="pipeline"/> of <code>egress</code>. If no matching row is found,
334 the output becomes a no-op. Otherwise, OVN executes the actions for the
335 matching flow (which is chosen from multiple, if necessary, as already
336 described).
337 </p>
338
339 <p>
340 In the <code>egress</code> pipeline, the <code>next</code> action acts as
341 already described, except that it, of course, searches for
342 <code>egress</code> flows. The <code>output</code> action, however, now
343 directly outputs the packet to the output port (which is now fixed,
344 because <code>outport</code> is read-only within the egress pipeline).
345 </p>
346
347 <p>
348 The description earlier assumed that <code>outport</code> referred to a
349 logical port. If it instead designates a logical multicast group, then
350 the description above still applies, with the addition of fan-out from
351 the logical multicast group to each logical port in the group. For each
352 member of the group, OVN executes the logical pipeline as described, with
353 the logical output port replaced by the group member.
354 </p>
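<p>
The lookup and recursion described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not how
<code>ovn-controller</code> is implemented: the <code>LogicalFlow</code>
class, the <code>run_table</code> function, and the toy action strings
(<code>next</code>, <code>output</code>, <code>field=value</code>) are
hypothetical names invented for this sketch, matches are plain Python
callables, and multicast fan-out is omitted.
</p>

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LogicalFlow:
    """Hypothetical stand-in for one Logical_Flow row."""
    pipeline: str                    # "ingress" or "egress"
    table_id: int
    priority: int
    match: Callable[[dict], bool]    # true if the flow matches the packet
    actions: List[str]               # toy encoding: "next", "output", "field=value"


def run_table(flows, pkt, pipeline, table_id, delivered):
    """Search one logical table and run the winning flow's actions in order."""
    hits = [f for f in flows
            if f.pipeline == pipeline and f.table_id == table_id and f.match(pkt)]
    if not hits:
        return  # nothing matched: the packet is dropped or the output is a no-op
    # Highest priority wins; the result for equal priorities is undefined in OVN.
    flow = max(hits, key=lambda f: f.priority)
    for action in flow.actions:
        if action == "next":
            run_table(flows, pkt, pipeline, table_id + 1, delivered)
        elif action == "output":
            if pipeline == "egress":
                delivered.append(pkt["outport"])  # actual delivery
            elif pkt["outport"] != pkt["inport"]:
                # Entering the egress pipeline discards register data.
                egress_pkt = {k: v for k, v in pkt.items()
                              if not k.startswith("reg")}
                run_table(flows, egress_pkt, "egress", 0, delivered)
        else:
            field, value = action.split("=", 1)
            pkt[field] = value


FLOWS = [
    LogicalFlow("ingress", 0, 50, lambda p: True, ["next"]),
    LogicalFlow("ingress", 1, 50,
                lambda p: p["eth.dst"] == "00:00:00:00:00:02",
                ["outport=lp2", "output"]),
    LogicalFlow("ingress", 1, 10, lambda p: True, []),  # empty actions: drop
    LogicalFlow("egress", 0, 0, lambda p: True, ["output"]),
]

delivered = []
packet = {"inport": "lp1", "outport": "", "reg0": "1",
          "eth.dst": "00:00:00:00:00:02"}
run_table(FLOWS, packet, "ingress", 0, delivered)
print(delivered)
```

<p>
In this sketch a packet from <code>lp1</code> to the known address is
delivered to <code>lp2</code>, while a packet whose <code>inport</code>
already equals the chosen <code>outport</code> is not delivered at all,
mirroring the no-op rule for <code>output</code>.
</p>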
355
356 <p><em>Pipeline Stages</em></p>
357
358 <p>
359 <code>ovn-northd</code> is responsible for populating the
360 <ref table="Logical_Flow"/> table, so the stages are an
361 implementation detail and subject to change. This section
362 describes the current logical flow table.
363 </p>
364
365 <p>
366 The ingress pipeline consists of the following stages:
367 </p>
368 <ul>
369 <li>
370 Port Security (Table 0): Validates the source address, drops
371 packets with a VLAN tag, and, if configured, verifies that the
372 logical port is allowed to send with the source address.
373 </li>
374
375 <li>
376 L2 Destination Lookup (Table 1): Forwards known unicast
377 addresses to the appropriate logical port. Unicast packets to
378 unknown hosts are forwarded to logical ports configured with the
379 special <code>unknown</code> MAC address. Broadcast and
380 multicast packets are flooded to all ports in the logical switch.
381 </li>
382 </ul>
383
384 <p>
385 The egress pipeline consists of the following stages:
386 </p>
387 <ul>
388 <li>
389 ACL (Table 0): Applies any specified access control lists.
390 </li>
391
392 <li>
393 Port Security (Table 1): If configured, verifies that the
394 logical port is allowed to receive packets with the destination
395 address.
396 </li>
397 </ul>
398
399 <column name="logical_datapath">
400 The logical datapath to which the logical flow belongs.
401 </column>
402
403 <column name="pipeline">
404 <p>
405 The primary flows used for deciding on a packet's destination are the
406 <code>ingress</code> flows. The <code>egress</code> flows implement
407 ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details.
408 </p>
409 </column>
410
411 <column name="table_id">
412 The stage in the logical pipeline, analogous to an OpenFlow table number.
413 </column>
414
415 <column name="priority">
416 The flow's priority. Flows with numerically higher priority take
417 precedence over those with lower. If two logical datapath flows with the
418 same priority both match, then the one actually applied to the packet is
419 undefined.
420 </column>
421
422 <column name="match">
423 <p>
424 A matching expression. OVN provides a superset of OpenFlow matching
425 capabilities, using a syntax similar to Boolean expressions in a
426 programming language.
427 </p>
428
429 <p>
430 The most important components of a match expression are
431 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
432 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
433 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
434 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
435 logical OR operator <code>||</code> can combine comparisons into a
436 larger expression.
437 </p>
438
439 <p>
440 Matching expressions also support parentheses for grouping, the logical
441 NOT prefix operator <code>!</code>, and literals <code>0</code> and
442 <code>1</code> to express ``false'' or ``true,'' respectively. The
443 latter is useful by itself as a catch-all expression that matches every
444 packet.
445 </p>
446
447 <p><em>Symbols</em></p>
448
449 <p>
450 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
451 type. Integer symbols have a <dfn>width</dfn> in bits.
452 </p>
453
454 <p>
455 <em>Kinds</em>. There are three kinds of symbols:
456 </p>
457
458 <ul>
459 <li>
460 <p>
461 <dfn>Fields</dfn>. A field symbol represents a packet header or
462 metadata field. For example, a field
463 named <code>vlan.tci</code> might represent the VLAN TCI field in a
464 packet.
465 </p>
466
467 <p>
468 A field symbol can have integer or string type. Integer fields can
469 be nominal or ordinal (see <em>Level of Measurement</em>,
470 below).
471 </p>
472 </li>
473
474 <li>
475 <p>
476 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
477 a larger field. For example, a field <code>vlan.vid</code> might
478 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
479 are provided for syntactic convenience, because it is always
480 possible to instead refer to a subset of bits from a field
481 directly.
482 </p>
483
484 <p>
485 Only ordinal fields (see <em>Level of Measurement</em>,
486 below) may have subfields. Subfields are always ordinal.
487 </p>
488 </li>
489
490 <li>
491 <p>
492 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
493 expression. Predicates may be used much like 1-bit fields. For
494 example, <code>ip4</code> might expand to <code>eth.type ==
495 0x800</code>. Predicates are provided for syntactic convenience,
496 because it is always possible to instead specify the underlying
497 expression directly.
498 </p>
499
500 <p>
501 A predicate whose expansion refers to any nominal field or
502 predicate (see <em>Level of Measurement</em>, below) is nominal;
503 other predicates have Boolean level of measurement.
504 </p>
505 </li>
506 </ul>
507
508 <p>
509 <em>Level of Measurement</em>. See
510 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
511 concept on which this classification is based. There are three
512 levels:
513 </p>
514
515 <ul>
516 <li>
517 <p>
518 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
519 on a scale. OVN considers a field (or subfield) to be ordinal if
520 its bits can be examined individually. This is true for the
521 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
522 </p>
523
524 <p>
525 Any use of an ordinal field may specify a single bit or a range of
526 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
527 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
528 multicast bit in the Ethernet destination address.
529 </p>
530
531 <p>
532 OVN supports all the usual arithmetic relations (<code>==</code>,
533 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
534 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
535 their subfields, because OVN can implement these in OpenFlow and
536 Open vSwitch as collections of bitwise tests.
537 </p>
538 </li>
539
540 <li>
541 <p>
542 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
543 usefully compared except for equality. OpenFlow
544 port numbers, Ethernet types, and IP protocols are examples: all of
545 these are just identifiers assigned arbitrarily with no deeper
546 meaning. In OpenFlow and Open vSwitch, bits in these fields
547 generally aren't individually addressable.
548 </p>
549
550 <p>
551 OVN only supports arithmetic tests for equality on nominal fields,
552 because OpenFlow and Open vSwitch provide no way for a flow to
553 efficiently implement other comparisons on them. (A test for
554 inequality can be sort of built out of two flows with different
555 priorities, but OVN matching expressions always generate flows with
556 a single priority.)
557 </p>
558
559 <p>
560 String fields are always nominal.
561 </p>
562 </li>
563
564 <li>
565 <p>
566 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
567 and 1, is somewhat exceptional, since it is easy to support both
568 equality and inequality tests on such a field: either one can be
569 implemented as a test for 0 or 1.
570 </p>
571
572 <p>
573 Only predicates (see above) have a Boolean level of measurement.
574 </p>
575
576 <p>
577 This isn't a standard level of measurement.
578 </p>
579 </li>
580 </ul>
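<p>
The bit-range notation for ordinal fields can be read as plain bit
extraction, with bit 0 as the least significant bit. The Python sketch
below illustrates that reading for <code>vlan.tci[13..15]</code>,
<code>vlan.tci[12]</code>, and <code>eth.dst[40]</code>. The helper names
<code>subfield</code> and <code>mac_to_int</code> and the sample values
are assumptions made for this example only.
</p>

```python
def subfield(value: int, lo: int, hi: int) -> int:
    """Return bits lo..hi (inclusive, bit 0 least significant) of value."""
    width = hi - lo + 1
    return (value >> lo) % (2 ** width)


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated Ethernet address to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


tci = 0xB00A                       # sample VLAN TCI value
pcp = subfield(tci, 13, 15)        # vlan.tci[13..15], the PCP field
present = subfield(tci, 12, 12)    # vlan.tci[12], i.e. vlan.present
vid = subfield(tci, 0, 11)         # vlan.tci[0..11], i.e. vlan.vid

# eth.dst[40] is the multicast bit of the destination address.
mcast = subfield(mac_to_int("01:00:5e:00:00:01"), 40, 40)
ucast = subfield(mac_to_int("00:11:22:33:44:55"), 40, 40)
print(pcp, present, vid, mcast, ucast)
```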
581
582 <p>
583 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
584 additional conditions implied by the use of the symbol. For example,
585 the <code>icmp4.type</code> symbol might have prerequisite
586 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
587 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
588 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
589 &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
590 <code>icmp4</code> is a predicate defined as suggested under
591 <em>Kinds</em> above).
592 </p>
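<p>
As a rough illustration of how prerequisites and predicates unfold, the
Python sketch below expands a single comparison into the list of primitive
comparisons it implies, all of which must hold together. The
<code>PREDICATES</code> and <code>PREREQUISITES</code> tables and the
<code>expand</code> function are hypothetical and deliberately simplified:
they cover only the symbols named in the example, and they treat every
expansion as a conjunction.
</p>

```python
# Hypothetical, simplified tables covering only the example in the text.
PREDICATES = {
    "ip4": ["eth.type == 0x800"],
    "icmp4": ["ip4", "ip.proto == 1"],
}
PREREQUISITES = {
    "icmp4.type": ["icmp4"],
}


def expand(term):
    """Return the primitive comparisons implied by term, as a conjunction."""
    if term in PREDICATES:
        expanded = []
        for part in PREDICATES[term]:
            expanded.extend(expand(part))
        return expanded
    symbol = term.split()[0]       # the symbol is the first token
    expanded = [term]
    for prerequisite in PREREQUISITES.get(symbol, []):
        expanded.extend(expand(prerequisite))
    return expanded


print(expand("icmp4.type == 0"))
```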
593
594 <p><em>Relational operators</em></p>
595
596 <p>
597 All of the standard relational operators <code>==</code>,
598 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
599 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
600 fields support only <code>==</code> and <code>!=</code>, and only in a
601 positive sense when outer <code>!</code> are taken into account,
602 e.g. given string field <code>inport</code>, <code>inport ==
603 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
604 not <code>inport != "eth0"</code>.
605 </p>
606
607 <p>
608 The implementation of <code>==</code> (or <code>!=</code> when it is
609 negated) is more efficient than that of the other relational
610 operators.
611 </p>
612
613 <p><em>Constants</em></p>
614
615 <p>
616 Integer constants may be expressed in decimal, hexadecimal prefixed by
617 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
618 their standard forms, or Ethernet addresses as colon-separated hex
619 digits. A constant in any of these forms may be followed by a slash
620 and a second constant (the mask) in the same form, to form a masked
621 constant. IPv4 and IPv6 masks may be given as integers, to express
622 CIDR prefixes.
623 </p>
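<p>
To illustrate how a masked IPv4 or IPv6 constant behaves in a comparison,
the Python sketch below uses the standard <code>ipaddress</code> module to
check whether an address equals a constant on the bits selected by the
mask. The function name <code>ip_matches</code> is an assumption made for
this example, and the sketch handles only prefix-style (CIDR) masks, given
either as a prefix length or as a dotted-quad mask.
</p>

```python
import ipaddress


def ip_matches(address: str, masked_constant: str) -> bool:
    """True if address equals the constant on every bit selected by the mask."""
    network = ipaddress.ip_network(masked_constant, strict=False)
    return ipaddress.ip_address(address) in network


# The integer form /16 and the dotted-quad mask select the same bits.
print(ip_matches("192.168.7.9", "192.168.0.0/16"))
print(ip_matches("192.168.7.9", "192.168.0.0/255.255.0.0"))
print(ip_matches("10.0.0.1", "192.168.0.0/16"))
print(ip_matches("fe80::1", "fe80::/10"))
```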
624
625 <p>
626 String constants have the same syntax as quoted strings in JSON (thus,
627 they are Unicode strings).
628 </p>
629
630 <p>
631 Some operators support sets of constants written inside curly braces
632 <code>{</code> ... <code>}</code>. Commas between elements of a set,
633 and after the last element, are optional. With <code>==</code>,
634 ``<code><var>field</var> == { <var>constant1</var>,
635 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
636 for ``<code><var>field</var> == <var>constant1</var> ||
637 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
638 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
639 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
640 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
641 <var>field</var> != <var>constant2</var> &amp;&amp;
642 </code>...<code></code>''.
643 </p>
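<p>
The two set forms have different truth conditions: equality against a set
holds if any element matches, while inequality against a set holds only if
every element differs. A minimal Python sketch of that semantics follows.
The function names <code>eq_set</code> and <code>ne_set</code> are
hypothetical, and the sketch evaluates values directly rather than
generating flows.
</p>

```python
def eq_set(field_value, constants):
    """field == { c1, c2, ... }: true if the field equals any constant."""
    return any(field_value == c for c in constants)


def ne_set(field_value, constants):
    """field != { c1, c2, ... }: true if the field differs from every constant."""
    return all(field_value != c for c in constants)


# Neighbor discovery uses ICMPv6 types 135 and 136.
print(eq_set(135, [135, 136]))
print(ne_set(135, [135, 136]))
print(ne_set(1, [135, 136]))
```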
644
645 <p><em>Miscellaneous</em></p>
646
647 <p>
648 Comparisons may name the symbol or the constant first,
649 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
650 acceptable.
651 </p>
652
653 <p>
654 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
655 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
656 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
657 </p>
658
659 <p>
660 For a one-bit field or predicate, a mention of its name is equivalent
661 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
662 is equivalent to <code>vlan.present == 1</code>. The same is true for
663 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
664 technical limitation to implementing the same for ordinal fields of all
665 widths, but the implementation is expensive enough that the syntax
666 parser requires writing an explicit comparison against zero to make
667 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
668 against 0 is required.
669 </p>
670
671 <p>
672 <em>Operator precedence</em> is as shown below, from highest to lowest.
673 There are two exceptions where parentheses are required even though the
674 table would suggest that they are not: <code>&amp;&amp;</code> and
675 <code>||</code> require parentheses when used together, and
676 <code>!</code> requires parentheses when applied to a relational
677 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
678 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
679 parentheses are mandatory.
680 </p>
681
682 <ul>
683 <li><code>()</code></li>
684 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
685 <li><code>!</code></li>
686 <li><code>&amp;&amp; ||</code></li>
687 </ul>
688
689 <p>
690 <em>Comments</em> may be introduced by <code>//</code>, which extends
691 to the next new-line. Comments within a line may be bracketed by
692 <code>/*</code> and <code>*/</code>. Multiline comments are not
693 supported.
694 </p>
695
696 <p><em>Symbols</em></p>
697
698 <p>
699 Most of the symbols below have integer type. Only <code>inport</code>
700 and <code>outport</code> have string type. <code>inport</code> names a
701 logical port. Thus, its value is a <ref column="logical_port"/> name
702 from the <ref table="Port_Binding"/> table. <code>outport</code> may
703 name a logical port, as <code>inport</code>, or a logical multicast
704 group defined in the <ref table="Multicast_Group"/> table. For both
705 symbols, only names within the flow's logical datapath may be used.
706 </p>
707
708 <ul>
709 <li><code>reg0</code>...<code>reg4</code></li>
710 <li><code>inport</code> <code>outport</code></li>
711 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
712 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
713 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
714 <li><code>ip4.src</code> <code>ip4.dst</code></li>
715 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
716 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
717 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
718 <li><code>udp.src</code> <code>udp.dst</code></li>
719 <li><code>sctp.src</code> <code>sctp.dst</code></li>
720 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
721 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
722 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
723 <li><code>ct_mark</code> <code>ct_label</code></li>
724 <li>
725 <p>
726 <code>ct_state</code>, which has the following Boolean subfields:
727 </p>
728 <ul>
729 <li><code>ct.new</code>: True for a new flow</li>
730 <li><code>ct.est</code>: True for an established flow</li>
731 <li><code>ct.rel</code>: True for a related flow</li>
732 <li><code>ct.rpl</code>: True for a reply flow</li>
733 <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
734 </ul>
735 <p>
736 <code>ct_state</code> and its subfields are initialized by the
737 <code>ct_next</code> action, described below.
738 </p>
739 </li>
740 </ul>
741
742 <p>
743 The following predicates are supported:
744 </p>
745
746 <ul>
747 <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
748 <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
749 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
750 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
751 <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
752 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
753 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
754 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
755 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
756 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
757 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
758 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
759 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
760 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
761 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0</code></li>
762 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
763 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
764 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
765 </ul>
766 </column>
767
768 <column name="actions">
769 <p>
770 Logical datapath actions, to be executed when the logical flow
771 represented by this row is the highest-priority match.
772 </p>
773
774 <p>
775 Actions share lexical syntax with the <ref column="match"/> column. An
776 empty set of actions (or one that contains just white space or
777 comments), or a set of actions that consists of just
778 <code>drop;</code>, causes the matched packets to be dropped.
779 Otherwise, the column should contain a sequence of actions, each
780 terminated by a semicolon.
781 </p>
782
783 <p>
784 The following actions are defined:
785 </p>
786
787 <dl>
788 <dt><code>output;</code></dt>
789 <dd>
790 <p>
791 In the ingress pipeline, this action executes the
792 <code>egress</code> pipeline as a subroutine. If
793 <code>outport</code> names a logical port, the egress pipeline
794 executes once; if it is a multicast group, the egress pipeline runs
795 once for each logical port in the group.
796 </p>
797
798 <p>
799 In the egress pipeline, this action performs the actual
800 output to the <code>outport</code> logical port. (In the egress
801 pipeline, <code>outport</code> never names a multicast group.)
802 </p>
803
804 <p>
805 Output to the input port is implicitly dropped, that is,
806 <code>output</code> becomes a no-op if <code>outport</code> ==
807 <code>inport</code>. Occasionally it may be useful to override
808 this behavior, e.g. to send an ARP reply to an ARP request; to do
809 so, use <code>inport = "";</code> to set the logical input port to
810 an empty string (which should not be used as the name of any
811 logical port).
812 </p>
813 </dd>
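<!--
Editorial sketch, not part of the schema: one way to picture the delivery
rules of the output action described above. The function name "deliver" and
the dictionary of multicast groups are hypothetical; the model assumes that
a multicast group fans out to one egress run per member and that any run
whose outport equals inport is a no-op.

```python
def deliver(inport, outport, multicast_groups):
    """Return the logical ports that actually receive the packet."""
    # A multicast group runs the egress pipeline once per member port;
    # a plain logical port runs it exactly once.
    targets = multicast_groups.get(outport, [outport])
    # Output to the input port is implicitly dropped.
    return [port for port in targets if port != inport]


groups = {"_MC_flood": ["vif0", "vif1", "vif2"]}
print(deliver("vif0", "_MC_flood", groups))  # flood, skipping the sender
print(deliver("vif0", "vif0", groups))       # hairpin: nothing delivered
print(deliver("", "vif0", groups))           # inport cleared: delivered
```

The last call mirrors the override in the text: after inport = ""; the
comparison can no longer match a real port name.
-->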
814
815 <dt><code>next;</code></dt>
816 <dt><code>next(<var>table</var>);</code></dt>
817 <dd>
818 Executes another logical datapath table as a subroutine. By default,
819 the table after the current one is executed. Specify
820 <var>table</var> to jump to a specific table in the same pipeline.
821 </dd>
822
823 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
824 <dd>
825 <p>
826 Sets data or metadata field <var>field</var> to constant value
827 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
828 logical output port. To set only a subset of bits in a field,
829 specify a subfield for <var>field</var> or a masked
830 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
831 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
832 the VLAN PCP.
833 </p>
834
835 <p>
836 Assigning to a field with prerequisites implicitly adds those
837 prerequisites to <ref column="match"/>; thus, for example, a flow
838 that sets <code>tcp.dst</code> applies only to TCP flows,
839 regardless of whether its <ref column="match"/> mentions any TCP
840 field.
841 </p>
842
843 <p>
844 Not all fields are modifiable (e.g. <code>eth.type</code> and
845 <code>ip.proto</code> are read-only), and not all modifiable fields
846 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
847 as a whole). The <code>outport</code> field is modifiable in the
848 <code>ingress</code> pipeline but not in the <code>egress</code>
849 pipeline.
850 </p>
851 </dd>
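<!--
Editorial sketch, not part of the schema: the arithmetic behind a masked
constant such as vlan.pcp = 4/4 and a subfield such as vlan.pcp[2] = 1.
The helper names are hypothetical; the 3-bit width of the VLAN PCP is the
only fact taken from outside the text.

```python
def set_masked(old, value, mask, width):
    """Assign only the bits selected by mask, as in 'field = value/mask'."""
    field_mask = (1 << width) - 1
    mask &= field_mask
    # Keep the unselected bits of old, take the selected bits from value.
    return (old & ~mask & field_mask) | (value & mask)


def set_subfield(old, lo, hi, value, width):
    """Assign bits lo..hi (inclusive), as in 'field[lo..hi] = value'."""
    nbits = hi - lo + 1
    mask = ((1 << nbits) - 1) << lo
    return set_masked(old, value << lo, mask, width)


PCP_WIDTH = 3
print(bin(set_masked(0b001, 4, 4, PCP_WIDTH)))       # vlan.pcp = 4/4
print(bin(set_subfield(0b001, 2, 2, 1, PCP_WIDTH)))  # vlan.pcp[2] = 1
```

Both calls set only the most significant PCP bit and leave the other two
bits as they were.
-->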
852
853 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
854 <dd>
855 <p>
856 Sets data or metadata field <var>field1</var> to the value of data
857 or metadata field <var>field2</var>, e.g. <code>reg0 =
858 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
859 To modify only a subset of a field's bits, specify a subfield for
860 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
861 = reg0[0..2];</code> copies the least-significant bits of
862 <code>reg0</code> into the VLAN PCP.
863 </p>
864
865 <p>
866 <var>field1</var> and <var>field2</var> must be the same type,
867 either both string or both integer fields. If they are both
868 integer fields, they must have the same width.
869 </p>
870
871 <p>
872 If <var>field1</var> or <var>field2</var> has prerequisites, they
873 are added implicitly to <ref column="match"/>. It is possible to
874 write an assignment with contradictory prerequisites, such as
875 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
876 that a logical flow with such an assignment will never be matched.
877 </p>
878 </dd>
879
880 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
881 <dd>
882 <p>
883 Similar to <code><var>field1</var> = <var>field2</var>;</code>
884 except that the two values are exchanged instead of copied. Both
885 <var>field1</var> and <var>field2</var> must be modifiable.
886 </p>
887 </dd>
888
889 <dt><code>ip.ttl--;</code></dt>
890 <dd>
891 <p>
892 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
893 or negative, then processing of the packet halts; no further
894 actions are processed. (To properly handle such cases, a
895 higher-priority flow should match on
896 <code>ip.ttl == {0, 1};</code>.)
897 </p>
898
899 <p><b>Prerequisite:</b> <code>ip</code></p>
900 </dd>
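<!--
Editorial sketch, not part of the schema: the halting rule of the TTL
decrement action above. The function name and the dictionary packet model
are hypothetical, and leaving the TTL untouched when processing halts is an
assumption, since the text does not say what happens to the field.

```python
def dec_ttl(packet):
    """Decrement the TTL; return False when processing must halt."""
    if packet["ip.ttl"] <= 1:
        # The decrement would reach zero or below: stop processing here.
        return False
    packet["ip.ttl"] -= 1
    return True


pkt = {"ip.ttl": 2}
print(dec_ttl(pkt), pkt)  # continues, TTL is now 1
print(dec_ttl(pkt), pkt)  # halts, no further actions would run
```
-->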
901
902 <dt><code>ct_next;</code></dt>
903 <dd>
904 <p>
905 Apply connection tracking to the flow, initializing
906 <code>ct_state</code> for matching in later tables.
907 Automatically moves on to the next table, as if followed by
908 <code>next</code>.
909 </p>
910
911 <p>
912 As a side effect, IP fragments will be reassembled for matching.
913 If a fragmented packet is output, then it will be sent with any
914 overlapping fragments squashed. The connection tracking state is
915 scoped by the logical port, so overlapping addresses may be used.
916 To allow traffic related to the matched flow, execute
917 <code>ct_commit</code>.
918 </p>
919
920 <p>
921 It is possible to have actions follow <code>ct_next</code>,
922 but they will not have access to any of its side effects, so
923 doing so is not generally useful.
924 </p>
925 </dd>
926
927 <dt><code>ct_commit;</code></dt>
928 <dd>
929 Commit the flow to the connection tracking entry associated
930 with it by a previous call to <code>ct_next</code>.
931 </dd>
932
933 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
934 <dd>
935 <p>
936 Temporarily replaces the IPv4 packet being processed by an ARP
937 packet and executes each nested <var>action</var> on the ARP
938 packet. Actions following the <code>arp</code> action, if any, apply
939 to the original, unmodified packet.
940 </p>
941
942 <p>
943 The ARP packet that this action operates on is initialized based on
944 the IPv4 packet being processed, as follows. These are default
945 values that the nested actions will probably want to change:
946 </p>
947
948 <ul>
949 <li><code>eth.src</code> unchanged</li>
950 <li><code>eth.dst</code> unchanged</li>
951 <li><code>eth.type = 0x0806</code></li>
952 <li><code>arp.op = 1</code> (ARP request)</li>
953 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
954 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
955 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
956 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
957 </ul>
958
959 <p>
960 The ARP packet has the same VLAN header, if any, as the IP packet
961 it replaces.
962 </p>
963
964 <p><b>Prerequisite:</b> <code>ip4</code></p>
965 </dd>
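<!--
Editorial sketch, not part of the schema: the default ARP fields listed
above, derived from an IPv4 packet. Packets are modelled as plain
dictionaries keyed by field name; that model and the function name are
assumptions made for illustration.

```python
def arp_from_ipv4(pkt):
    """Build the initial ARP packet that the nested actions start from."""
    arp = {
        "eth.src": pkt["eth.src"],       # unchanged
        "eth.dst": pkt["eth.dst"],       # unchanged
        "eth.type": 0x0806,
        "arp.op": 1,                     # ARP request
        "arp.sha": pkt["eth.src"],
        "arp.spa": pkt["ip4.src"],
        "arp.tha": "00:00:00:00:00:00",
        "arp.tpa": pkt["ip4.dst"],
    }
    # The ARP packet keeps the VLAN header of the IP packet, if there is one.
    if "vlan.tci" in pkt:
        arp["vlan.tci"] = pkt["vlan.tci"]
    return arp


ip_pkt = {
    "eth.src": "0a:00:00:00:00:01",
    "eth.dst": "0a:00:00:00:00:02",
    "ip4.src": "192.0.2.1",
    "ip4.dst": "192.0.2.9",
}
print(arp_from_ipv4(ip_pkt))
```

Nested actions would typically overwrite eth.dst with the broadcast address
before sending the request.
-->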
966
967 <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt>
968
969 <dd>
970 <p>
971 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
972 IP address field <var>A</var>.
973 </p>
974
975 <p>
976 Looks up <var>A</var> in <var>P</var>'s ARP table. If an entry is
977 found, stores its Ethernet address in <code>eth.dst</code>,
978 otherwise stores <code>00:00:00:00:00:00</code> in
979 <code>eth.dst</code>.
980 </p>
981
982 <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p>
983 </dd>
984
985 <dt>
986 <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code>
987 </dt>
988
989 <dd>
990 <p>
991 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
992 IP address field <var>A</var>, 48-bit Ethernet address field
993 <var>E</var>.
994 </p>
995
996 <p>
997 Adds or updates the entry for IP address <var>A</var> in logical
998 port <var>P</var>'s ARP table, setting its Ethernet address to
999 <var>E</var>.
1000 </p>
1001
1002 <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p>
1003 </dd>
1004 </dl>
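<!--
Editorial sketch, not part of the schema: get_arp and put_arp behave like a
lookup and an upsert on a table keyed by (logical port, IP address). The
class below is a hypothetical in-memory stand-in that shows the all-zeros
result on a miss; it is not how ovn-controller stores bindings.

```python
ZERO_MAC = "00:00:00:00:00:00"


class ArpTable:
    """Per-port ARP table keyed by (logical port, IP address)."""

    def __init__(self):
        self._bindings = {}

    def put_arp(self, port, ip, mac):
        # Adds a new entry or overwrites the existing one.
        self._bindings[(port, ip)] = mac

    def get_arp(self, port, ip):
        # The returned value is what would be stored in eth.dst.
        return self._bindings.get((port, ip), ZERO_MAC)


table = ArpTable()
print(table.get_arp("lrp0", "192.0.2.9"))   # miss: all-zeros address
table.put_arp("lrp0", "192.0.2.9", "0a:00:00:00:00:09")
print(table.get_arp("lrp0", "192.0.2.9"))   # hit: the stored address
```
-->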
1005
1006 <p>
1007 The following actions will likely be useful later, but they have not
1008 been thought out carefully.
1009 </p>
1010
1011 <dl>
1012 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
1013 <dd>
1014 <p>
1015 Temporarily replaces the IPv4 packet being processed by an ICMPv4
1016 packet and executes each nested <var>action</var> on the ICMPv4
1017 packet. Actions following the <code>icmp4</code> action, if any,
1018 apply to the original, unmodified packet.
1019 </p>
1020
1021 <p>
1022 The ICMPv4 packet that this action operates on is initialized based
1023 on the IPv4 packet being processed, as follows. These are default
1024 values that the nested actions will probably want to change.
1025 Ethernet and IPv4 fields not listed here are not changed:
1026 </p>
1027
1028 <ul>
1029 <li><code>ip.proto = 1</code> (ICMPv4)</li>
1030 <li><code>ip.frag = 0</code> (not a fragment)</li>
1031 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
1032 <li><code>icmp4.code = 1</code> (host unreachable)</li>
1033 </ul>
1034
1035 <p>
1036 Details TBD.
1037 </p>
1038
1039 <p><b>Prerequisite:</b> <code>ip4</code></p>
1040 </dd>
1041
1042 <dt><code>tcp_reset;</code></dt>
1043 <dd>
1044 <p>
1045 This action transforms the current TCP packet according to the
1046 following pseudocode:
1047 </p>
1048
1049 <pre>
1050 if (tcp.ack) {
1051 tcp.seq = tcp.ack;
1052 } else {
1053 tcp.ack = tcp.seq + length(tcp.payload);
1054 tcp.seq = 0;
1055 }
1056 tcp.flags = RST;
1057 </pre>
1058
1059 <p>
1060 Then, the action drops all TCP options and payload data, and
1061 updates the TCP checksum.
1062 </p>
1063
1064 <p>
1065 Details TBD.
1066 </p>
1067
1068 <p><b>Prerequisite:</b> <code>tcp</code></p>
1069 </dd>
1070 </dl>
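<!--
Editorial sketch, not part of the schema: a runnable reading of the
tcp_reset pseudocode above. Two points are assumptions, because the text
marks the details as TBD: the condition "if (tcp.ack)" is read as "the ACK
flag is set", and sequence arithmetic wraps modulo 2**32. The segment is a
plain dictionary and the checksum update is left out.

```python
TCP_RST = 0x04
TCP_ACK = 0x10


def tcp_reset(seg):
    """Return the reset segment derived from seg, following the pseudocode."""
    out = dict(seg)
    if seg["flags"] & TCP_ACK:
        out["seq"] = seg["ack"]
    else:
        out["ack"] = (seg["seq"] + len(seg["payload"])) % (1 << 32)
        out["seq"] = 0
    out["flags"] = TCP_RST
    # All TCP options and payload data are dropped.
    out["options"] = b""
    out["payload"] = b""
    return out


syn = {"seq": 1000, "ack": 0, "flags": 0x02, "options": b"", "payload": b""}
print(tcp_reset(syn))
```
-->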
1071 </column>
1072
1073 <column name="external_ids" key="stage-name">
1074 Human-readable name for this flow's stage in the pipeline.
1075 </column>
1076
1077 <group title="Common Columns">
1078 The overall purpose of these columns is described under <code>Common
1079 Columns</code> at the beginning of this document.
1080
1081 <column name="external_ids"/>
1082 </group>
1083 </table>
1084
1085 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1086 <p>
1087 The rows in this table define multicast groups of logical ports.
1088 Multicast groups allow a single packet transmitted over a tunnel to a
1089 hypervisor to be delivered to multiple VMs on that hypervisor, which
1090 uses bandwidth more efficiently.
1091 </p>
1092
1093 <p>
1094 Each row in this table defines a logical multicast group numbered <ref
1095 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1096 ports are listed in the <ref column="ports"/> column.
1097 </p>
1098
1099 <column name="datapath">
1100 The logical datapath in which the multicast group resides.
1101 </column>
1102
1103 <column name="tunnel_key">
1104 The value used to designate this logical egress port in tunnel
1105 encapsulations. An index forces the key to be unique within the <ref
1106 column="datapath"/>. The unusual range ensures that multicast group IDs
1107 do not overlap with logical port IDs.
1108 </column>
1109
1110 <column name="name">
1111 <p>
1112 The logical multicast group's name. An index forces the name to be
1113 unique within the <ref column="datapath"/>. Logical flows in the
1114 ingress pipeline may output to the group just as for individual logical
1115 ports, by assigning the group's name to <code>outport</code> and
1116 executing an <code>output</code> action.
1117 </p>
1118
1119 <p>
1120 Multicast group names and logical port names share a single namespace
1121 and thus should not overlap (but the database schema cannot enforce
1122 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1123 that begin with <code>_MC_</code>.
1124 </p>
1125 </column>
1126
1127 <column name="ports">
1128 The logical ports included in the multicast group. All of these ports
1129 must be in the <ref column="datapath"/> logical datapath (but the
1130 database schema cannot enforce this).
1131 </column>
1132 </table>
1133
1134 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1135 <p>
1136 Each row in this table identifies physical bindings of a logical
1137 datapath. A logical datapath implements a logical pipeline among the
1138 ports in the <ref table="Port_Binding"/> table associated with it. In
1139 practice, the pipeline in a given logical datapath implements either a
1140 logical switch or a logical router.
1141 </p>
1142
1143 <column name="tunnel_key">
1144 The tunnel key value to which the logical datapath is bound.
1145 The <code>Tunnel Encapsulation</code> section in
1146 <code>ovn-architecture</code>(7) describes how tunnel keys are
1147 constructed for each supported encapsulation.
1148 </column>
1149
1150 <group title="OVN_Northbound Relationship">
1151 <p>
1152 Each row in <ref table="Datapath_Binding"/> is associated with some
1153 logical datapath. <code>ovn-northd</code> uses these keys to track the
1154 association of a logical datapath with concepts in the <ref
1155 db="OVN_Northbound"/> database.
1156 </p>
1157
1158 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1159 For a logical datapath that represents a logical switch,
1160 <code>ovn-northd</code> stores in this key the UUID of the
1161 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1162 the <ref db="OVN_Northbound"/> database.
1163 </column>
1164
1165 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1166 For a logical datapath that represents a logical router,
1167 <code>ovn-northd</code> stores in this key the UUID of the
1168 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1169 the <ref db="OVN_Northbound"/> database.
1170 </column>
1171 </group>
1172
1173 <group title="Common Columns">
1174 The overall purpose of these columns is described under <code>Common
1175 Columns</code> at the beginning of this document.
1176
1177 <column name="external_ids"/>
1178 </group>
1179 </table>
1180
1181 <table name="Port_Binding" title="Physical-Logical Port Bindings">
1182 <p>
1183 Most rows in this table identify the physical location of a logical port.
1184 (The exceptions are logical patch ports, which do not have any physical
1185 location.)
1186 </p>
1187
1188 <p>
1189 For every <code>Logical_Port</code> record in the <code>OVN_Northbound</code>
1190 database, <code>ovn-northd</code> creates a record in this table.
1191 <code>ovn-northd</code> populates and maintains every column except
1192 the <code>chassis</code> column, which it leaves empty in new records.
1193 </p>
1194
1195 <p>
1196 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1197 populates the <code>chassis</code> column for the records that
1198 identify the logical ports that are located on its hypervisor/gateway.
1199 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> finds
1200 these ports by monitoring the local hypervisor's Open_vSwitch
1201 database, which identifies logical ports via the conventions described
1202 in <code>IntegrationGuide.md</code>.
1203 </p>
1204
1205 <p>
1206 When a chassis shuts down gracefully, it should clean up the
1207 <code>chassis</code> column that it previously had populated.
1208 (This is not critical because resources hosted on the chassis are equally
1209 unreachable regardless of whether their rows are present.) To handle the
1210 case where a VM is shut down abruptly on one chassis, then brought up
1211 again on a different one,
1212 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1213 overwrite the <code>chassis</code> column with new information.
1214 </p>
1215
1216 <group title="Core Features">
1217 <column name="datapath">
1218 The logical datapath to which the logical port belongs.
1219 </column>
1220
1221 <column name="logical_port">
1222 A logical port, taken from <ref table="Logical_Port" column="name"
1223 db="OVN_Northbound"/> in the OVN_Northbound database's <ref
1224 table="Logical_Port" db="OVN_Northbound"/> table. OVN does not
1225 prescribe a particular format for the logical port ID.
1226 </column>
1227
1228 <column name="chassis">
1229 The physical location of the logical port. To successfully identify a
1230 chassis, this column must be a <ref table="Chassis"/> record. This is
1231 populated by
1232 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>.
1233 </column>
1234
1235 <column name="tunnel_key">
1236 <p>
1237 A number that represents the logical port in the key (e.g. STT key or
1238 Geneve TLV) field carried within tunnel protocol packets.
1239 </p>
1240
1241 <p>
1242 The tunnel ID must be unique within the scope of a logical datapath.
1243 </p>
1244 </column>
1245
1246 <column name="mac">
1247 <p>
1248 The Ethernet address or addresses used as a source address on the
1249 logical port, each in the form
1250 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1251 The string <code>unknown</code> is also allowed to indicate that the
1252 logical port has an unknown set of (additional) source addresses.
1253 </p>
1254
1255 <p>
1256 A VM interface would ordinarily have a single Ethernet address. A
1257 gateway port might initially only have <code>unknown</code>, and then
1258 add MAC addresses to the set as it learns new source addresses.
1259 </p>
1260 </column>
1261
1262 <column name="type">
1263 <p>
1264 A type for this logical port. Logical ports can be used to model other
1265 types of connectivity into an OVN logical switch. The following types
1266 are defined:
1267 </p>
1268
1269 <dl>
1270 <dt>(empty string)</dt>
1271 <dd>VM (or VIF) interface.</dd>
1272
1273 <dt><code>patch</code></dt>
1274 <dd>
1275 One of a pair of logical ports that act as if connected by a patch
1276 cable. Useful for connecting two logical datapaths, e.g. to connect
1277 a logical router to a logical switch or to another logical router.
1278 </dd>
1279
1280 <dt><code>localnet</code></dt>
1281 <dd>
1282 A connection to a locally accessible network from each
1283 <code>ovn-controller</code> instance. A logical switch can only
1284 have a single <code>localnet</code> port attached. This is used
1285 to model direct connectivity to an existing network.
1286 </dd>
1287
1288 <dt><code>vtep</code></dt>
1289 <dd>
1290 A port to a logical switch on a VTEP gateway chassis. In order to
1291 get this port correctly recognized by the OVN controller, the <ref
1292 column="options"
1293 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1294 column="options"
1295 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1296 be defined.
1297 </dd>
1298 </dl>
1299 </column>
1300 </group>
1301
1302 <group title="Patch Options">
1303 <p>
1304 These options apply to logical ports with <ref column="type"/> of
1305 <code>patch</code>.
1306 </p>
1307
1308 <column name="options" key="peer">
1309 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1310 record for the other side of the patch. The named <ref
1311 column="logical_port"/> must specify this <ref column="logical_port"/>
1312 in its own <code>peer</code> option. That is, the two patch logical
1313 ports must have reversed <ref column="logical_port"/> and
1314 <code>peer</code> values.
1315 </column>
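<!--
Editorial sketch, not part of the schema: a check that patch ports come in
reversed pairs, as the peer option above requires. The input is a
hypothetical dictionary from each patch port's logical_port to its peer
value; nothing here reads a real database.

```python
def unpaired_patch_ports(peers):
    """Return the logical ports whose peer does not point back at them."""
    return sorted(
        name
        for name, peer in peers.items()
        # A port may not be its own peer, and the peer must name it in turn.
        if peer == name or peers.get(peer) != name
    )


peers = {
    "lrp0": "lsp0",
    "lsp0": "lrp0",   # correctly reversed pair
    "lrp1": "lsp1",   # lsp1 has no record, so lrp1 is dangling
}
print(unpaired_patch_ports(peers))
```
-->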
1316 </group>
1317
1318 <group title="Localnet Options">
1319 <p>
1320 These options apply to logical ports with <ref column="type"/> of
1321 <code>localnet</code>.
1322 </p>
1323
1324 <column name="options" key="network_name">
1325 Required. <code>ovn-controller</code> uses the configuration entry
1326 <code>ovn-bridge-mappings</code> to determine how to connect to this
1327 network. <code>ovn-bridge-mappings</code> is a list of network names
1328 mapped to a local OVS bridge that provides access to that network. An
1329 example of configuring <code>ovn-bridge-mappings</code> would be:
1330
1331 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1332
1333 <p>
1334 When a logical switch has a <code>localnet</code> port attached,
1335 every chassis that may have a local vif attached to that logical
1336 switch must have a bridge mapping configured to reach that
1337 <code>localnet</code>. Traffic that arrives on a
1338 <code>localnet</code> port is never forwarded over a tunnel to
1339 another chassis.
1340 </p>
1341 </column>
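<!--
Editorial sketch, not part of the schema: how a value in the format shown
in the ovs-vsctl example above (comma-separated network:bridge pairs) could
be split into a lookup table. The function name and the error handling are
assumptions; ovn-controller has its own parser.

```python
def parse_bridge_mappings(value):
    """Turn 'net1:br1,net2:br2' into {'net1': 'br1', 'net2': 'br2'}."""
    mappings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty value or a trailing comma
        network, sep, bridge = item.partition(":")
        if not sep or not network or not bridge:
            raise ValueError(f"malformed mapping: {item!r}")
        mappings[network] = bridge
    return mappings


mappings = parse_bridge_mappings("physnet1:br-eth0,physnet2:br-eth1")
print(mappings["physnet1"])
```

A localnet port whose network_name is physnet1 would then be reached
through the bridge returned by the lookup.
-->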
1342
1343 <column name="tag">
1344 If set, indicates that the port represents a connection to a specific
1345 VLAN on a locally accessible network. The VLAN ID is used to match
1346 incoming traffic and is also added to outgoing traffic.
1347 </column>
1348 </group>
1349
1350 <group title="VTEP Options">
1351 <p>
1352 These options apply to logical ports with <ref column="type"/> of
1353 <code>vtep</code>.
1354 </p>
1355
1356 <column name="options" key="vtep-physical-switch">
1357 Required. The name of the VTEP gateway.
1358 </column>
1359
1360 <column name="options" key="vtep-logical-switch">
1361 Required. A logical switch name connected by the VTEP gateway. Must
1362 be set when <ref column="type"/> is <code>vtep</code>.
1363 </column>
1364 </group>
1365
1366 <group title="VMI (or VIF) Options">
1367 <p>
1368 These options apply to logical ports whose <ref column="type"/> is
1369 the empty string.
1370 </p>
1371
1372 <column name="options" key="policing_rate">
1373 If set, indicates the maximum rate for data sent from this interface,
1374 in kbps. Data exceeding this rate is dropped.
1375 </column>
1376
1377 <column name="options" key="policing_burst">
1378 If set, indicates the maximum burst size for data sent from this
1379 interface, in kb.
1380 </column>
1381 </group>
1382
1383 <group title="Nested Containers">
1384 <p>
1385 These columns support containers nested within a VM. Specifically,
1386 they are used when <ref column="type"/> is empty and <ref
1387 column="logical_port"/> identifies the interface of a container spawned
1388 inside a VM. They are empty for containers or VMs that run directly on
1389 a hypervisor.
1390 </p>
1391
1392 <column name="parent_port">
1393 This is taken from
1394 <ref table="Logical_Port" column="parent_name" db="OVN_Northbound"/>
1395 in the OVN_Northbound database's <ref table="Logical_Port"
1396 db="OVN_Northbound"/> table.
1397 </column>
1398
1399 <column name="tag">
1400 <p>
1401 Identifies the VLAN tag in the network traffic associated with that
1402 container's network interface.
1403 </p>
1404
1405 <p>
1406 This column is used for a different purpose when <ref column="type"/>
1407 is <code>localnet</code> (see <code>Localnet Options</code>, above).
1408 </p>
1409 </column>
1410 </group>
1411 </table>
1412
1413 <table name="MAC_Binding" title="IP to MAC bindings">
1414 <p>
1415 Each row in this table specifies a binding from an IP address to an
1416 Ethernet address that has been discovered through ARP (for IPv4) or
1417 neighbor discovery (for IPv6). This table is primarily used to discover
1418 bindings on physical networks, because IP-to-MAC bindings for virtual
1419 machines are usually populated statically into the <ref
1420 table="Port_Binding"/> table.
1421 </p>
1422
1423 <p>
1424 This table expresses a functional relationship: <ref
1425 table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
1426 <ref column="mac"/>.
1427 </p>
1428
1429 <p>
1430 In outline, the lifetime of a logical router's MAC binding looks like
1431 this:
1432 </p>
1433
1434 <ol>
1435 <li>
1436 On hypervisor 1, a logical router determines that a packet should be
1437 forwarded to IP address <var>A</var> on one of its router ports. It
1438 uses its logical flow table to determine that <var>A</var> lacks a
1439 static IP-to-MAC binding and the <code>get_arp</code> action to
1440 determine that it lacks a dynamic IP-to-MAC binding.
1441 </li>
1442
1443 <li>
1444 Using an OVN logical <code>arp</code> action, the logical router
1445 generates and sends a broadcast ARP request to the router port. It
1446 drops the IP packet.
1447 </li>
1448
1449 <li>
1450 The logical switch attached to the router port delivers the ARP request
1451 to all of its ports. (It might make sense to deliver it only to ports
1452 that have no static IP-to-MAC bindings, but this could also be
1453 surprising behavior.)
1454 </li>
1455
1456 <li>
1457 A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
1458 attached to the logical switch owns the IP address in question. It
1459 composes an ARP reply and unicasts it to the logical router port's
1460 Ethernet address.
1461 </li>
1462
1463 <li>
1464 The logical switch delivers the ARP reply to the logical router port.
1465 </li>
1466
1467 <li>
1468 The logical router flow table executes a <code>put_arp</code> action.
1469 To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
1470 to the <ref table="MAC_Binding"/> table.
1471 </li>
1472
1473 <li>
1474 On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
1475 table="MAC_Binding"/> table from the OVN southbound database. The next
1476 packet destined to <var>A</var> through the logical router is sent
1477 directly to the bound Ethernet address.
1478 </li>
1479 </ol>
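<!--
Editorial sketch, not part of the schema: the lifetime outlined in the
steps above, reduced to the state changes in the table. A module-level
dictionary stands in for the MAC_Binding rows shared through the southbound
database; the function names and return values are hypothetical.

```python
# Stand-in for the MAC_Binding table: (logical_port, ip) gives mac.
mac_binding = {}


def forward(logical_port, ip):
    """Decide what the logical router does with a packet for ip."""
    mac = mac_binding.get((logical_port, ip))
    if mac is None:
        # Steps 1 and 2: no binding yet, so send an ARP request and
        # drop the IP packet.
        return ("arp_request", ip)
    # Step 7: a binding exists, so send directly to the bound address.
    return ("send", mac)


def on_arp_reply(logical_port, spa, sha):
    """Step 6: record the binding learned from an ARP reply."""
    mac_binding[(logical_port, spa)] = sha


print(forward("lrp0", "192.0.2.9"))
on_arp_reply("lrp0", "192.0.2.9", "0a:00:00:00:00:09")
print(forward("lrp0", "192.0.2.9"))
```

Because the key is the pair (logical_port, ip), a later reply for the same
pair replaces the earlier Ethernet address, which matches the functional
relationship stated for this table.
-->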
1480
1481 <column name="logical_port">
1482 The logical port on which the binding was discovered.
1483 </column>
1484
1485 <column name="ip">
1486 The bound IP address.
1487 </column>
1488
1489 <column name="mac">
1490 The Ethernet address to which the IP is bound.
1491 </column>
1492 </table>
1493 </database>