<?xml version="1.0" encoding="utf-8"?>
<database name="ovn-sb" title="OVN Southbound Database">
  <p>
    This database holds logical and physical configuration and state for the
    Open Virtual Network (OVN) system to support virtual network abstraction.
    For an introduction to OVN, please see <code>ovn-architecture</code>(7).
  </p>

  <p>
    The OVN Southbound database sits at the center of the OVN
    architecture. It is the one component that speaks both southbound
    directly to all the hypervisors and gateways, via
    <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
    northbound to the Cloud Management System, via <code>ovn-northd</code>.
  </p>

  <h2>Database Structure</h2>

  <p>
    The OVN Southbound database contains three classes of data with
    different properties, as described in the sections below.
  </p>

  <h3>Physical Network (PN) data</h3>

  <p>
    PN tables contain information about the chassis nodes in the system. This
    includes all the information necessary to wire the overlay, such as IP
    addresses, supported tunnel types, and security keys.
  </p>

  <p>
    The amount of PN data is small (O(n) in the number of chassis) and it
    changes infrequently, so it can be replicated to every chassis.
  </p>

  <p>
    The PN tables consist of the <ref table="Chassis"/> table.
  </p>

  <h3>Logical Network (LN) data</h3>

  <p>
    LN tables contain the topology of logical switches and routers, ACLs,
    firewall rules, and everything needed to describe how packets traverse a
    logical network, represented as logical datapath flows (see the
    <ref table="Logical_Flow"/> table, below).
  </p>

  <p>
    LN data may be large (O(n) in the number of logical ports, ACL rules,
    etc.). Thus, to improve scaling, each chassis should receive only data
    related to logical networks in which that chassis participates. Past
    experience shows that in the presence of large logical networks, even
    finer-grained partitioning of data, e.g. designing logical flows so that
    only the chassis hosting a logical port needs related flows, pays off
    scale-wise. (This is not necessary initially but it is worth bearing in
    mind in the design.)
  </p>

  <p>
    The LN is controlled entirely by the cloud management system (CMS) running
    northbound of OVN. That CMS determines the entire OVN logical
    configuration and therefore the LN's content at any given time is a
    deterministic function of the CMS's configuration, although that happens
    indirectly via the <ref db="OVN_Northbound"/> database and
    <code>ovn-northd</code>.
  </p>

  <p>
    LN data is likely to change more quickly than PN data. This is especially
    true in a container environment where VMs are created and destroyed (and
    therefore added to and deleted from logical switches) quickly.
  </p>

  <p>
    <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
    data.
  </p>

  <h3>Bindings data</h3>

  <p>
    Bindings data link logical and physical components. They show the current
    placement of logical components (such as VMs and VIFs) onto chassis, and
    map logical entities to the values that represent them in tunnel
    encapsulations.
  </p>

  <p>
    Bindings change frequently, at least every time a VM powers up or down
    or migrates, and especially quickly in a container environment. The
    amount of data per VM (or VIF) is small.
  </p>

  <p>
    Each chassis is authoritative about the VMs and VIFs that it hosts at any
    given time and can efficiently flood that state to a central location, so
    the consistency needs are minimal.
  </p>

  <p>
    The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
    contain binding data.
  </p>

  <h2>Common Columns</h2>

  <p>
    Some tables contain a special column named <code>external_ids</code>. This
    column has the same form and purpose each place that it appears, so we
    describe it here to save space later.
  </p>

  <dl>
    <dt><code>external_ids</code>: map of string-string pairs</dt>
    <dd>
      Key-value pairs for use by the software that manages the OVN Southbound
      database rather than by
      <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
      particular, <code>ovn-northd</code> can use key-value pairs in this
      column to relate entities in the southbound database to higher-level
      entities (such as entities in the OVN Northbound database). Individual
      key-value pairs in this column may be documented in some cases to aid
      in understanding and troubleshooting, but the reader should not mistake
      such documentation as comprehensive.
    </dd>
  </dl>

  <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
    <p>
      Each row in this table represents a hypervisor or gateway (a chassis) in
      the physical network (PN). Each chassis, via
      <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
      and updates its own row, and keeps a copy of the remaining rows to
      determine how to reach other hypervisors.
    </p>

    <p>
      When a chassis shuts down gracefully, it should remove its own row.
      (This is not critical because resources hosted on the chassis are equally
      unreachable regardless of whether the row is present.) If a chassis
      shuts down permanently without removing its row, some kind of manual or
      automatic cleanup is eventually needed; we can devise a process for that
      as necessary.
    </p>

    <column name="name">
      A chassis name, taken from <ref key="system-id" table="Open_vSwitch"
      column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch
      database's <ref table="Open_vSwitch" db="Open_vSwitch"/> table. OVN does
      not prescribe a particular format for chassis names.
    </column>

    <group title="Encapsulation Configuration">
      <p>
        OVN uses encapsulation to transmit logical dataplane packets
        between chassis.
      </p>

      <column name="encaps">
        Points to supported encapsulation configurations to transmit
        logical dataplane packets to this chassis. Each entry is a <ref
        table="Encap"/> record that describes the configuration.
      </column>
    </group>

    <group title="Gateway Configuration">
      <p>
        A <dfn>gateway</dfn> is a chassis that forwards traffic between the
        OVN-managed part of a logical network and a physical VLAN, extending a
        tunnel-based logical network into a physical network. Gateways are
        typically dedicated nodes that do not host VMs and will be controlled
        by <code>ovn-controller-vtep</code>.
      </p>

      <column name="vtep_logical_switches">
        Stores all VTEP logical switch names connected by this gateway
        chassis. The <ref table="Port_Binding"/> table entry with
        <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
        equal to <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and
        <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
        value in <ref table="Chassis"/>
        <ref column="vtep_logical_switches" table="Chassis"/>, will be
        associated with this <ref table="Chassis"/>.
      </column>
    </group>
  </table>

  <table name="Encap" title="Encapsulation Types">
    <p>
      The <ref column="encaps" table="Chassis"/> column in the <ref
      table="Chassis"/> table refers to rows in this table to identify
      how OVN may transmit logical dataplane packets to this chassis.
      Each chassis, via <code>ovn-controller</code>(8) or
      <code>ovn-controller-vtep</code>(8), adds and updates its own rows
      and keeps a copy of the remaining rows to determine how to reach
      other chassis.
    </p>

    <column name="type">
      The encapsulation to use to transmit packets to this chassis.
      Hypervisors must use either <code>geneve</code> or
      <code>stt</code>. Gateways may use <code>vxlan</code>,
      <code>geneve</code>, or <code>stt</code>.
    </column>

    <column name="options">
      Options for configuring the encapsulation, e.g. IPsec parameters when
      IPsec support is introduced. No options are currently defined.
    </column>

    <column name="ip">
      The IPv4 address of the encapsulation tunnel endpoint.
    </column>
  </table>

  <table name="Logical_Flow" title="Logical Network Flows">
    <p>
      Each row in this table represents one logical flow.
      <code>ovn-northd</code> populates this table with logical flows
      that implement the L2 and L3 topologies specified in the
      <ref db="OVN_Northbound"/> database. Each hypervisor, via
      <code>ovn-controller</code>, translates the logical flows into
      OpenFlow flows specific to its hypervisor and installs them into
      Open vSwitch.
    </p>

    <p>
      Logical flows are expressed in an OVN-specific format, described here. A
      logical datapath flow is much like an OpenFlow flow, except that the
      flows are written in terms of logical ports and logical datapaths instead
      of physical ports and physical datapaths. Translation between logical
      and physical flows helps to ensure isolation between logical datapaths.
      (The logical flow abstraction also allows the OVN centralized
      components to do less work, since they do not have to separately
      compute and push out physical flows to each chassis.)
    </p>

    <p>
      The default action when no flow matches is to drop packets.
    </p>

    <p><em>Architectural Logical Life Cycle of a Packet</em></p>

    <p>
      The following description focuses on the life cycle of a packet through
      a logical datapath, ignoring physical details of the implementation.
      Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
      <code>ovn-architecture</code>(7) for the physical information.
    </p>

    <p>
      The description here is written as if OVN itself executes these steps,
      but in fact OVN (that is, <code>ovn-controller</code>) programs Open
      vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
    </p>

    <p>
      At a high level, OVN passes each packet through the logical datapath's
      logical ingress pipeline, which may output the packet to one or more
      logical ports or logical multicast groups. For each such logical output
      port, OVN passes the packet through the datapath's logical egress
      pipeline, which may either drop the packet or deliver it to the
      destination. Between the two pipelines, outputs to logical multicast
      groups are expanded into logical ports, so that the egress pipeline only
      processes a single logical output port at a time. Between the two
      pipelines is also where, when necessary, OVN encapsulates a packet in a
      tunnel (or tunnels) to transmit to remote hypervisors.
    </p>

    <p>
      In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
      table for a row with the correct <ref column="logical_datapath"/>, a <ref
      column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
      of 0, and a <ref column="match"/> that is true for the packet. If none
      is found, OVN drops the packet. If OVN finds more than one, it chooses
      the match with the highest <ref column="priority"/>. Then OVN executes
      each of the actions specified in the row's <ref column="actions"/> column,
      in the order specified. Some actions, such as those to modify packet
      headers, require no further details. The <code>next</code> and
      <code>output</code> actions are special.
    </p>

    <p>
      The <code>next</code> action causes the above process to be repeated
      recursively, except that OVN searches for a <ref column="table_id"/> of 1
      instead of 0. Similarly, any <code>next</code> action in a row found in
      that table would cause a further search for a <ref column="table_id"/> of
      2, and so on. When recursive processing completes, flow control returns
      to the action following <code>next</code>.
    </p>

    <p>
      The <code>output</code> action also introduces recursion. Its effect
      depends on the current value of the <code>outport</code> field. Suppose
      <code>outport</code> designates a logical port. First, OVN compares
      <code>inport</code> to <code>outport</code>; if they are equal, it treats
      the <code>output</code> as a no-op. In the common case, where they are
      different, the packet enters the egress pipeline. This transition to the
      egress pipeline discards register data, e.g. <code>reg0</code> ...
      <code>reg4</code> and connection tracking state, to achieve
      uniform behavior regardless of whether the egress pipeline is on a
      different hypervisor (because registers aren't preserved across
      tunnel encapsulation).
    </p>

    <p>
      To execute the egress pipeline, OVN again searches the <ref
      table="Logical_Flow"/> table for a row with the correct <ref
      column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
      column="match"/> that is true for the packet, but now looking for a <ref
      column="pipeline"/> of <code>egress</code>. If no matching row is found,
      the output becomes a no-op. Otherwise, OVN executes the actions for the
      matching flow (which is chosen from multiple, if necessary, as already
      described).
    </p>

    <p>
      In the <code>egress</code> pipeline, the <code>next</code> action acts as
      already described, except that it, of course, searches for
      <code>egress</code> flows. The <code>output</code> action, however, now
      directly outputs the packet to the output port (which is now fixed,
      because <code>outport</code> is read-only within the egress pipeline).
    </p>

    <p>
      The description earlier assumed that <code>outport</code> referred to a
      logical port. If it instead designates a logical multicast group, then
      the description above still applies, with the addition of fan-out from
      the logical multicast group to each logical port in the group. For each
      member of the group, OVN executes the logical pipeline as described, with
      the logical output port replaced by the group member.
    </p>

    <p><em>Pipeline Stages</em></p>

    <p>
      <code>ovn-northd</code> is responsible for populating the
      <ref table="Logical_Flow"/> table, so the stages are an
      implementation detail and subject to change. This section
      describes the current logical flow table.
    </p>

    <p>
      The ingress pipeline consists of the following stages:
    </p>
    <ul>
      <li>
        Port Security (Table 0): Validates the source address, drops
        packets with a VLAN tag, and, if configured, verifies that the
        logical port is allowed to send with the source address.
      </li>

      <li>
        L2 Destination Lookup (Table 1): Forwards packets for known unicast
        addresses to the appropriate logical port. Unicast packets to
        unknown hosts are forwarded to logical ports configured with the
        special <code>unknown</code> MAC address. Broadcast and
        multicast packets are flooded to all ports in the logical switch.
      </li>
    </ul>

    <p>
      The egress pipeline consists of the following stages:
    </p>
    <ul>
      <li>
        ACL (Table 0): Applies any specified access control lists.
      </li>

      <li>
        Port Security (Table 1): If configured, verifies that the
        logical port is allowed to receive packets with the destination
        address.
      </li>
    </ul>
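
    <p>
      As a rough illustration of how these stages might appear as rows in this
      table, the listing below sketches a few flows for one logical switch.
      The port names <code>lp1</code> and <code>lp2</code>, the MAC addresses,
      the priorities, and the one-line layout are all hypothetical, chosen for
      readability; they are not the flows that <code>ovn-northd</code>
      actually generates.
    </p>

    <pre>
ingress, table 0, priority 100: vlan.present => drop;
ingress, table 0, priority  50: inport == "lp1" &amp;&amp; eth.src == 00:00:00:00:00:01 => next;
ingress, table 1, priority  50: eth.dst == 00:00:00:00:00:02 => outport = "lp2"; output;
egress,  table 0, priority   0: 1 => next;
egress,  table 1, priority  50: outport == "lp2" &amp;&amp; eth.dst == 00:00:00:00:00:02 => output;
    </pre>

    <p>
      Under these assumptions, an untagged packet that arrives on
      <code>lp1</code> from 00:00:00:00:00:01 for 00:00:00:00:00:02 matches the
      priority-50 flow in ingress table 0, continues to ingress table 1 via
      <code>next</code>, has <code>outport</code> set to <code>lp2</code>, and
      then passes through egress tables 0 and 1 before it is output.
    </p>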

    <column name="logical_datapath">
      The logical datapath to which the logical flow belongs.
    </column>

    <column name="pipeline">
      <p>
        The primary flows used for deciding on a packet's destination are the
        <code>ingress</code> flows. The <code>egress</code> flows implement
        ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above,
        for details.
      </p>
    </column>

    <column name="table_id">
      The stage in the logical pipeline, analogous to an OpenFlow table number.
    </column>

    <column name="priority">
      The flow's priority. Flows with numerically higher priority take
      precedence over those with lower. If two logical datapath flows with the
      same priority both match, then the one actually applied to the packet is
      undefined.
    </column>

    <column name="match">
      <p>
        A matching expression. OVN provides a superset of OpenFlow matching
        capabilities, using a syntax similar to Boolean expressions in a
        programming language.
      </p>

      <p>
        The most important components of a match expression are
        <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
        <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
        <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
        0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
        logical OR operator <code>||</code> can combine comparisons into a
        larger expression.
      </p>

      <p>
        Matching expressions also support parentheses for grouping, the logical
        NOT prefix operator <code>!</code>, and literals <code>0</code> and
        <code>1</code> to express ``false'' or ``true,'' respectively. The
        latter is useful by itself as a catch-all expression that matches every
        packet.
      </p>
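
      <p>
        For illustration, the expressions below combine these elements. The
        addresses and protocol number are arbitrary examples, not values that
        OVN requires.
      </p>

      <pre>
ip4.dst == 192.168.0.1 &amp;&amp; ip.proto == 6
(ip4.dst == 192.168.0.1 || ip4.dst == 192.168.0.2) &amp;&amp; ip.proto == 6
!vlan.present
1
      </pre>

      <p>
        The first expression matches packets with IP protocol 6 sent to
        192.168.0.1, the second extends that match to a second destination
        address, the third matches packets for which
        <code>vlan.present</code> is false, and the last matches every packet.
      </p>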

      <p><em>Symbols</em></p>

      <p>
        <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
        type. Integer symbols have a <dfn>width</dfn> in bits.
      </p>

      <p>
        <em>Kinds</em>. There are three kinds of symbols:
      </p>

      <ul>
        <li>
          <p>
            <dfn>Fields</dfn>. A field symbol represents a packet header or
            metadata field. For example, a field
            named <code>vlan.tci</code> might represent the VLAN TCI field in a
            packet.
          </p>

          <p>
            A field symbol can have integer or string type. Integer fields can
            be nominal or ordinal (see <em>Level of Measurement</em>,
            below).
          </p>
        </li>

        <li>
          <p>
            <dfn>Subfields</dfn>. A subfield represents a subset of bits from
            a larger field. For example, a field <code>vlan.vid</code> might
            be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
            are provided for syntactic convenience, because it is always
            possible to instead refer to a subset of bits from a field
            directly.
          </p>

          <p>
            Only ordinal fields (see <em>Level of Measurement</em>,
            below) may have subfields. Subfields are always ordinal.
          </p>
        </li>

        <li>
          <p>
            <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
            expression. Predicates may be used much like 1-bit fields. For
            example, <code>ip4</code> might expand to <code>eth.type ==
            0x800</code>. Predicates are provided for syntactic convenience,
            because it is always possible to instead specify the underlying
            expression directly.
          </p>

          <p>
            A predicate whose expansion refers to any nominal field or
            predicate (see <em>Level of Measurement</em>, below) is nominal;
            other predicates have Boolean level of measurement.
          </p>
        </li>
      </ul>

      <p>
        <em>Level of Measurement</em>. See
        http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
        concept on which this classification is based. There are three
        levels:
      </p>

      <ul>
        <li>
          <p>
            <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
            on a scale. OVN considers a field (or subfield) to be ordinal if
            its bits can be examined individually. This is true for the
            OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
          </p>

          <p>
            Any use of an ordinal field may specify a single bit or a range of
            bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
            within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
            multicast bit in the Ethernet destination address.
          </p>

          <p>
            OVN supports all the usual arithmetic relations (<code>==</code>,
            <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
            <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
            their subfields, because OVN can implement these in OpenFlow and
            Open vSwitch as collections of bitwise tests.
          </p>
        </li>

        <li>
          <p>
            <dfn>Nominal</dfn>. In statistics, nominal values cannot be
            usefully compared except for equality. OpenFlow port numbers,
            Ethernet types, and IP protocols are examples: all of
            these are just identifiers assigned arbitrarily with no deeper
            meaning. In OpenFlow and Open vSwitch, bits in these fields
            generally aren't individually addressable.
          </p>

          <p>
            OVN only supports arithmetic tests for equality on nominal fields,
            because OpenFlow and Open vSwitch provide no way for a flow to
            efficiently implement other comparisons on them. (A test for
            inequality can be sort of built out of two flows with different
            priorities, but OVN matching expressions always generate flows with
            a single priority.)
          </p>

          <p>
            String fields are always nominal.
          </p>
        </li>

        <li>
          <p>
            <dfn>Boolean</dfn>. A nominal field that has only two values, 0
            and 1, is somewhat exceptional, since it is easy to support both
            equality and inequality tests on such a field: either one can be
            implemented as a test for 0 or 1.
          </p>

          <p>
            Only predicates (see above) have a Boolean level of measurement.
          </p>

          <p>
            This isn't a standard level of measurement.
          </p>
        </li>
      </ul>

      <p>
        <em>Prerequisites</em>. Any symbol can have prerequisites, which are
        additional conditions implied by the use of the symbol. For example,
        the <code>icmp4.type</code> symbol might have prerequisite
        <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
        0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
        icmp4</code>, which would in turn expand to <code>icmp4.type == 0
        &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
        <code>icmp4</code> is a predicate defined as suggested under
        <em>Kinds</em> above).
      </p>

      <p><em>Relational operators</em></p>

      <p>
        All of the standard relational operators <code>==</code>,
        <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
        <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
        fields support only <code>==</code> and <code>!=</code>, and only in a
        positive sense when outer <code>!</code> are taken into account,
        e.g. given string field <code>inport</code>, <code>inport ==
        "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
        not <code>inport != "eth0"</code>.
      </p>

      <p>
        The implementation of <code>==</code> (or <code>!=</code> when it is
        negated) is more efficient than that of the other relational
        operators.
      </p>

      <p><em>Constants</em></p>

      <p>
        Integer constants may be expressed in decimal, hexadecimal prefixed by
        <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
        their standard forms, or Ethernet addresses as colon-separated hex
        digits. A constant in any of these forms may be followed by a slash
        and a second constant (the mask) in the same form, to form a masked
        constant. IPv4 and IPv6 masks may be given as integers, to express
        CIDR prefixes.
      </p>
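
      <p>
        As an illustration, each comparison below uses one of the constant
        forms just described. The particular values are arbitrary.
      </p>

      <pre>
tcp.dst == 80
eth.type == 0x800
ip4.src == 10.0.0.1
ip4.src == 10.0.0.0/255.0.0.0
ip4.src == 10.0.0.0/8
ip6.dst == ff02::1
eth.dst == 01:00:00:00:00:00/01:00:00:00:00:00
      </pre>

      <p>
        The fourth and fifth lines express the same masked match, first with
        an explicit dotted-quad mask and then with the equivalent CIDR prefix
        length. The last line tests only the multicast bit of the Ethernet
        destination address.
      </p>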

      <p>
        String constants have the same syntax as quoted strings in JSON (thus,
        they are Unicode strings).
      </p>

      <p>
        Some operators support sets of constants written inside curly braces
        <code>{</code> ... <code>}</code>. Commas between elements of a set,
        and after the last element, are optional. With <code>==</code>,
        ``<code><var>field</var> == { <var>constant1</var>,
        <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
        for ``<code><var>field</var> == <var>constant1</var> ||
        <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
        Similarly, ``<code><var>field</var> != { <var>constant1</var>,
        <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
        ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
        <var>field</var> != <var>constant2</var> &amp;&amp;
        </code>...<code></code>''.
      </p>
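
      <p>
        As a concrete sketch with arbitrary values, the first line of each
        pair below uses a set and the second line is the expansion that the
        rule above gives for it:
      </p>

      <pre>
ip4.src == {10.0.0.1, 10.0.0.2}
ip4.src == 10.0.0.1 || ip4.src == 10.0.0.2

tcp.dst != {80, 443}
tcp.dst != 80 &amp;&amp; tcp.dst != 443
      </pre>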

      <p><em>Miscellaneous</em></p>

      <p>
        Comparisons may name the symbol or the constant first,
        e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
        acceptable.
      </p>

      <p>
        Tests for a range may be expressed using a syntax like <code>1024 &lt;=
        tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
        tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
      </p>

      <p>
        For a one-bit field or predicate, a mention of its name is equivalent
        to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
        is equivalent to <code>vlan.present == 1</code>. The same is true for
        one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
        technical limitation to implementing the same for ordinal fields of all
        widths, but the implementation is expensive enough that the syntax
        parser requires writing an explicit comparison against zero to make
        mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
        against 0 is required.
      </p>

      <p>
        <em>Operator precedence</em> is as shown below, from highest to lowest.
        There are two exceptions where parentheses are required even though the
        table would suggest that they are not: <code>&amp;&amp;</code> and
        <code>||</code> require parentheses when used together, and
        <code>!</code> requires parentheses when applied to a relational
        expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
        &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
        parentheses are mandatory.
      </p>

      <ul>
        <li><code>()</code></li>
        <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
        <li><code>!</code></li>
        <li><code>&amp;&amp; ||</code></li>
      </ul>

      <p>
        <em>Comments</em> may be introduced by <code>//</code> and then extend
        to the next newline. Comments within a line may be bracketed by
        <code>/*</code> and <code>*/</code>. Multiline comments are not
        supported.
      </p>
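
      <p>
        For example, both comment styles might be used as follows. The
        expressions themselves are arbitrary illustrations.
      </p>

      <pre>
ip4 &amp;&amp; tcp.dst == 80  // this comment runs to the end of the line
ip4 /* a comment within the line */ &amp;&amp; udp.dst == 53
      </pre>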

      <p><em>Symbols</em></p>

      <p>
        Most of the symbols below have integer type. Only <code>inport</code>
        and <code>outport</code> have string type. <code>inport</code> names a
        logical port. Thus, its value is a <ref column="logical_port"/> name
        from the <ref table="Port_Binding"/> table. <code>outport</code> may
        name a logical port, like <code>inport</code>, or a logical multicast
        group defined in the <ref table="Multicast_Group"/> table. For both
        symbols, only names within the flow's logical datapath may be used.
      </p>

      <ul>
        <li><code>reg0</code>...<code>reg4</code></li>
        <li><code>inport</code> <code>outport</code></li>
        <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
        <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
        <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
        <li><code>ip4.src</code> <code>ip4.dst</code></li>
        <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
        <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
        <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
        <li><code>udp.src</code> <code>udp.dst</code></li>
        <li><code>sctp.src</code> <code>sctp.dst</code></li>
        <li><code>icmp4.type</code> <code>icmp4.code</code></li>
        <li><code>icmp6.type</code> <code>icmp6.code</code></li>
        <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
        <li><code>ct_mark</code> <code>ct_label</code></li>
        <li>
          <p>
            <code>ct_state</code>, which has the following Boolean subfields:
          </p>
          <ul>
            <li><code>ct.new</code>: True for a new flow</li>
            <li><code>ct.est</code>: True for an established flow</li>
            <li><code>ct.rel</code>: True for a related flow</li>
            <li><code>ct.rpl</code>: True for a reply flow</li>
            <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
          </ul>
          <p>
            <code>ct_state</code> and its subfields are initialized by the
            <code>ct_next</code> action, described below.
          </p>
        </li>
      </ul>

      <p>
        The following predicates are supported:
      </p>

      <ul>
        <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
        <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
        <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
        <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
        <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
        <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
        <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
        <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
        <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
        <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
        <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
        <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
        <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
        <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
        <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0</code></li>
        <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
        <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
        <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
      </ul>
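
      <p>
        Because predicates may refer to other predicates, expansion can take
        several steps. As a sketch that uses only the definitions listed
        above, and ignores any prerequisites that individual symbols might
        add, <code>icmp</code> expands as follows (parentheses added for
        clarity):
      </p>

      <pre>
icmp
icmp4 || icmp6
(ip4 &amp;&amp; ip.proto == 1) || (ip6 &amp;&amp; ip.proto == 58)
(eth.type == 0x800 &amp;&amp; ip.proto == 1) || (eth.type == 0x86dd &amp;&amp; ip.proto == 58)
      </pre>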
    </column>

    <column name="actions">
      <p>
        Logical datapath actions, to be executed when the logical flow
        represented by this row is the highest-priority match.
      </p>

      <p>
        Actions share lexical syntax with the <ref column="match"/> column. An
        empty set of actions (or one that contains just white space or
        comments), or a set of actions that consists of just
        <code>drop;</code>, causes the matched packets to be dropped.
        Otherwise, the column should contain a sequence of actions, each
        terminated by a semicolon.
      </p>
fe36184b 761
35060cdc 762 <p>
eee7a8ed 763 The following actions are defined:
35060cdc 764 </p>
fe36184b 765
35060cdc
BP
766 <dl>
767 <dt><code>output;</code></dt>
768 <dd>
5868eb24 769 <p>
eee7a8ed
JP
770 In the ingress pipeline, this action executes the
771 <code>egress</code> pipeline as a subroutine. If
772 <code>outport</code> names a logical port, the egress pipeline
773 executes once; if it is a multicast group, the egress pipeline runs
774 once for each logical port in the group.
5868eb24
BP
775 </p>
776
777 <p>
778 In the egress pipeline, this action performs the actual
779 output to the <code>outport</code> logical port. (In the egress
780 pipeline, <code>outport</code> never names a multicast group.)
781 </p>
782
783 <p>
784 Output to the input port is implicitly dropped, that is,
785 <code>output</code> becomes a no-op if <code>outport</code> ==
b4970837
BP
786 <code>inport</code>. Occasionally it may be useful to override
787 this behavior, e.g. to send an ARP reply to an ARP request; to do
788 so, use <code>inport = "";</code> to set the logical input port to
789 an empty string (which should not be used as the name of any
790 logical port).
5868eb24 791 </p>
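
 <p>
 A minimal sketch of the override described above, for a flow that
 sends a reply back out of the port on which the request arrived.
 The field rewrites that turn the request into a reply are omitted:
 </p>

 <pre>
outport = inport; inport = ""; output;
</pre>

 <p>
 The copy into <code>outport</code> has to come first, because
 clearing <code>inport</code> discards the original port name.
 </p>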
eee7a8ed 792 </dd>
fe36184b 793
35060cdc 794 <dt><code>next;</code></dt>
558ec83d 795 <dt><code>next(<var>table</var>);</code></dt>
35060cdc 796 <dd>
558ec83d
BP
797 Executes another logical datapath table as a subroutine. By default,
798 the table after the current one is executed. Specify
799 <var>table</var> to jump to a specific table in the same pipeline.
2cd87fce 800 </dd>
fe36184b 801
35060cdc
BP
802 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
803 <dd>
5868eb24 804 <p>
5ee054fb
BP
805 Sets data or metadata field <var>field</var> to constant value
806 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
807 logical output port. To set only a subset of bits in a field,
808 specify a subfield for <var>field</var> or a masked
809 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
810 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
811 the VLAN PCP.
5868eb24
BP
812 </p>
813
814 <p>
815 Assigning to a field with prerequisites implicitly adds those
816 prerequisites to <ref column="match"/>; thus, for example, a flow
817 that sets <code>tcp.dst</code> applies only to TCP flows,
818 regardless of whether its <ref column="match"/> mentions any TCP
819 field.
820 </p>
821
822 <p>
823 Not all fields are modifiable (e.g. <code>eth.type</code> and
824 <code>ip.proto</code> are read-only), and not all modifiable fields
825 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
826 as a whole). The <code>outport</code> field is modifiable in the
827 <code>ingress</code> pipeline but not in the <code>egress</code>
828 pipeline.
829 </p>
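
 <p>
 As an illustration of implicit prerequisites, consider a
 hypothetical flow (the port name and the port number are made up for
 this example):
 </p>

 <pre>
match:   inport == "vif0"
actions: tcp.dst = 8080; next;
</pre>

 <p>
 Because <code>tcp.dst</code> has TCP as a prerequisite, this flow
 behaves as if its match were <code>inport == "vif0" &amp;&amp;
 tcp</code>, so non-TCP packets from <code>vif0</code> do not match
 it.
 </p>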
eee7a8ed 830 </dd>
5ee054fb
BP
831
832 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
833 <dd>
834 <p>
835 Sets data or metadata field <var>field1</var> to the value of data
836 or metadata field <var>field2</var>, e.g. <code>reg0 =
837 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
838 To modify only a subset of a field's bits, specify a subfield for
839 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
840 = reg0[0..2];</code> copies the least-significant bits of
841 <code>reg0</code> into the VLAN PCP.
842 </p>
843
844 <p>
845 <var>field1</var> and <var>field2</var> must be the same type,
846 either both string or both integer fields. If they are both
847 integer fields, they must have the same width.
848 </p>
849
850 <p>
851 If <var>field1</var> or <var>field2</var> has prerequisites, they
852 are added implicitly to <ref column="match"/>. It is possible to
853 write an assignment with contradictory prerequisites, such as
854 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
855 that a logical flow with such an assignment will never be matched.
856 </p>
857 </dd>
a20c96c6
BP
858
859 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
860 <dd>
861 <p>
862 Similar to <code><var>field1</var> = <var>field2</var>;</code>
863 except that the two values are exchanged instead of copied. Both
864 <var>field1</var> and <var>field2</var> must be modifiable.
865 </p>
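
 <p>
 For example, a flow that reflects a packet back toward its sender
 might swap the Ethernet addresses. This sketch assumes that both
 fields are modifiable:
 </p>

 <pre>
eth.src &lt;-&gt; eth.dst;
</pre>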
866 </dd>
78aab811 867
00ea19e4
BP
868 <dt><code>ip.ttl--;</code></dt>
869 <dd>
870 <p>
871 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
872 or negative, then processing of the packet halts; no further
873 actions are processed. (To properly handle such cases, a
4c20b9f2
JP
874 higher-priority flow should match on
875 <code>ip.ttl == {0, 1};</code>.)
00ea19e4
BP
876 </p>
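
 <p>
 A sketch of the pair of flows suggested above, both in the same
 logical table. The priorities are arbitrary illustrative values,
 and dropping expired packets is only one possible policy:
 </p>

 <pre>
priority 100   match: ip &amp;&amp; ip.ttl == {0, 1}   actions: drop;
priority  50   match: ip                       actions: ip.ttl--; next;
</pre>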
877
878 <p><b>Prerequisite:</b> <code>ip</code></p>
879 </dd>
880
78aab811
JP
881 <dt><code>ct_next;</code></dt>
882 <dd>
883 <p>
884 Apply connection tracking to the flow, initializing
885 <code>ct_state</code> for matching in later tables.
886 Automatically moves on to the next table, as if followed by
887 <code>next</code>.
888 </p>
889
890 <p>
891 As a side effect, IP fragments will be reassembled for matching.
892 If a fragmented packet is output, then it will be sent with any
893 overlapping fragments squashed. The connection tracking state is
894 scoped by the logical port, so overlapping addresses may be used.
895 To allow traffic related to the matched flow, execute
896 <code>ct_commit</code>.
897 </p>
898
899 <p>
900 It is possible to have actions follow <code>ct_next</code>,
901 but they will not have access to any of its side effects, so
902 doing this is not generally useful.
903 </p>
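
 <p>
 A sketch of how the two connection tracking actions might be split
 across consecutive tables. The table numbers are hypothetical, and
 <code>ct.new</code> is assumed here to be the <code>ct_state</code>
 subfield that flags a new connection; consult the description of
 <code>ct_state</code> under <ref column="match"/> for the subfields
 actually defined:
 </p>

 <pre>
table 0   match: ip                 actions: ct_next;
table 1   match: ip &amp;&amp; ct.new   actions: ct_commit; next;
</pre>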
904 </dd>
905
906 <dt><code>ct_commit;</code></dt>
907 <dd>
908 Commit the flow to the connection tracking entry associated
909 with it by a previous call to <code>ct_next</code>.
910 </dd>
fe36184b
BP
911 </dl>
912
913 <p>
2cd87fce
RB
914 The following actions will likely be useful later, but they have not
915 been thought out carefully.
fe36184b
BP
916 </p>
917
918 <dl>
fe36184b 919
69a832cf
BP
920 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
921 <dd>
922 <p>
923 Temporarily replaces the IPv4 packet being processed by an ARP
924 packet and executes each nested <var>action</var> on the ARP
925 packet. Actions following the <code>arp</code> action, if any, apply
926 to the original, unmodified packet.
927 </p>
928
929 <p>
930 The ARP packet that this action operates on is initialized based on
931 the IPv4 packet being processed, as follows. These are default
932 values that the nested actions will probably want to change:
933 </p>
934
935 <ul>
936 <li><code>eth.src</code> unchanged</li>
937 <li><code>eth.dst</code> unchanged</li>
938 <li><code>eth.type = 0x0806</code></li>
939 <li><code>arp.op = 1</code> (ARP request)</li>
940 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
941 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
942 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
943 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
944 </ul>
945
946 <p><b>Prerequisite:</b> <code>ip4</code></p>
947 </dd>
948
949 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
950 <dd>
951 <p>
952 Temporarily replaces the IPv4 packet being processed by an ICMPv4
953 packet and executes each nested <var>action</var> on the ICMPv4
954 packet. Actions following the <code>icmp4</code> action, if any,
955 apply to the original, unmodified packet.
956 </p>
957
958 <p>
959 The ICMPv4 packet that this action operates on is initialized based
960 on the IPv4 packet being processed, as follows. These are default
961 values that the nested actions will probably want to change.
962 Ethernet and IPv4 fields not listed here are not changed:
963 </p>
964
965 <ul>
966 <li><code>ip.proto = 1</code> (ICMPv4)</li>
967 <li><code>ip.frag = 0</code> (not a fragment)</li>
968 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
969 <li><code>icmp4.code = 1</code> (host unreachable)</li>
970 </ul>
971
972 <p>
973 Details TBD.
974 </p>
fe36184b 975
69a832cf
BP
976 <p><b>Prerequisite:</b> <code>ip4</code></p>
977 </dd>
978
979 <dt><code>tcp_reset;</code></dt>
980 <dd>
981 <p>
982 This action transforms the current TCP packet according to the
983 following pseudocode:
984 </p>
985
986 <pre>
987if (tcp.ack) {
988 tcp.seq = tcp.ack;
989} else {
990 tcp.ack = tcp.seq + length(tcp.payload);
991 tcp.seq = 0;
992}
993tcp.flags = RST;
994</pre>
995
996 <p>
997 Then, the action drops all TCP options and payload data, and
998 updates the TCP checksum.
999 </p>
1000
1001 <p>
1002 Details TBD.
1003 </p>
1004
1005 <p><b>Prerequisite:</b> <code>tcp</code></p>
1006 </dd>
fe36184b 1007 </dl>
fe36184b 1008 </column>
091e3af9
JP
1009
1010 <column name="external_ids" key="stage-name">
1011 Human-readable name for this flow's stage in the pipeline.
1012 </column>
1013
1014 <group title="Common Columns">
1015 The overall purpose of these columns is described under <code>Common
1016 Columns</code> at the beginning of this document.
1017
1018 <column name="external_ids"/>
1019 </group>
fe36184b
BP
1020 </table>
1021
5868eb24
BP
1022 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1023 <p>
1024 The rows in this table define multicast groups of logical ports.
1025 Multicast groups allow a single packet transmitted over a tunnel to a
1026 hypervisor to be delivered to multiple VMs on that hypervisor, which
1027 uses bandwidth more efficiently.
1028 </p>
1029
1030 <p>
1031 Each row in this table defines a logical multicast group numbered <ref
1032 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1033 ports are listed in the <ref column="ports"/> column.
1034 </p>
1035
1036 <column name="datapath">
1037 The logical datapath in which the multicast group resides.
1038 </column>
1039
1040 <column name="tunnel_key">
1041 The value used to designate this logical egress port in tunnel
1042 encapsulations. An index forces the key to be unique within the <ref
1043 column="datapath"/>. The unusual range ensures that multicast group IDs
1044 do not overlap with logical port IDs.
1045 </column>
1046
1047 <column name="name">
1048 <p>
1049 The logical multicast group's name. An index forces the name to be
1050 unique within the <ref column="datapath"/>. Logical flows in the
1051 ingress pipeline may output to the group just as for individual logical
1052 ports, by assigning the group's name to <code>outport</code> and
1053 executing an <code>output</code> action.
1054 </p>
1055
1056 <p>
1057 Multicast group names and logical port names share a single namespace
1058 and thus should not overlap (but the database schema cannot enforce
1059 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1060 that begin with <code>_MC_</code>.
1061 </p>
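
 <p>
 For example, a flow in the ingress pipeline could deliver a packet to
 every port in a group with actions like the following sketch. The group
 name <code>_MC_flood</code> is used for illustration only; any name
 stored in this column is addressed the same way:
 </p>

 <pre>
outport = "_MC_flood"; output;
</pre>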
1062 </column>
1063
1064 <column name="ports">
1065 The logical ports included in the multicast group. All of these ports
1066 must be in the <ref column="datapath"/> logical datapath (but the
1067 database schema cannot enforce this).
1068 </column>
1069 </table>
1070
1071 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1072 <p>
1073 Each row in this table identifies physical bindings of a logical
1074 datapath. A logical datapath implements a logical pipeline among the
1075 ports in the <ref table="Port_Binding"/> table associated with it. In
1076 practice, the pipeline in a given logical datapath implements either a
1077 logical switch or a logical router.
1078 </p>
1079
1080 <column name="tunnel_key">
1081 The tunnel key value to which the logical datapath is bound.
1082 The <code>Tunnel Encapsulation</code> section in
1083 <code>ovn-architecture</code>(7) describes how tunnel keys are
1084 constructed for each supported encapsulation.
1085 </column>
1086
9975d7be
BP
1087 <group title="OVN_Northbound Relationship">
1088 <p>
1089 Each row in <ref table="Datapath_Binding"/> is associated with some
1090 logical datapath. <code>ovn-northd</code> uses these keys to track the
1091 association of a logical datapath with concepts in the <ref
1092 db="OVN_Northbound"/> database.
1093 </p>
1094
1095 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1096 For a logical datapath that represents a logical switch,
1097 <code>ovn-northd</code> stores in this key the UUID of the
1098 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1099 the <ref db="OVN_Northbound"/> database.
1100 </column>
1101
1102 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1103 For a logical datapath that represents a logical router,
1104 <code>ovn-northd</code> stores in this key the UUID of the
1105 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1106 the <ref db="OVN_Northbound"/> database.
1107 </column>
1108 </group>
5868eb24
BP
1109
1110 <group title="Common Columns">
1111 The overall purpose of these columns is described under <code>Common
1112 Columns</code> at the beginning of this document.
1113
1114 <column name="external_ids"/>
1115 </group>
1116 </table>
1117
dcda6e0d 1118 <table name="Port_Binding" title="Physical-Logical Port Bindings">
fe36184b 1119 <p>
d387d24d
BP
1120 Most rows in this table identify the physical location of a logical port.
1121 (The exceptions are logical patch ports, which do not have any physical
1122 location.)
fe36184b
BP
1123 </p>
1124
1125 <p>
9fb4636f 1126 For every <code>Logical_Port</code> record in the <code>OVN_Northbound</code>
91ae2065
RB
1127 database, <code>ovn-northd</code> creates a record in this table.
1128 <code>ovn-northd</code> populates and maintains every column except
3213e9df 1129 the <code>chassis</code> column, which it leaves empty in new records.
9fb4636f
GS
1130 </p>
1131
1132 <p>
88058f19
AW
1133 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1134 populates the <code>chassis</code> column for the records that
1135 identify the logical ports that are located on its hypervisor/gateway,
1136 which <code>ovn-controller</code>/<code>ovn-controller-vtep</code> in
1137 turn finds out by monitoring the local hypervisor's Open_vSwitch
1138 database, which identifies logical ports via the conventions described
1139 in <code>IntegrationGuide.md</code>.
9fb4636f
GS
1140 </p>
1141
1142 <p>
5868eb24 1143 When a chassis shuts down gracefully, it should clean up the
9fb4636f 1144 <code>chassis</code> column that it previously had populated.
fe36184b
BP
1145 (This is not critical because resources hosted on the chassis are equally
1146 unreachable regardless of whether their rows are present.) To handle the
1147 case where a VM is shut down abruptly on one chassis, then brought up
88058f19
AW
1148 again on a different one,
1149 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1150 overwrite the <code>chassis</code> column with new information.
fe36184b
BP
1151 </p>
1152
c96ba502
BP
1153 <group title="Core Features">
1154 <column name="datapath">
1155 The logical datapath to which the logical port belongs.
1156 </column>
1a76c93e 1157
c96ba502
BP
1158 <column name="logical_port">
1159 A logical port, taken from <ref table="Logical_Port" column="name"
1160 db="OVN_Northbound"/> in the OVN_Northbound database's <ref
1161 table="Logical_Port" db="OVN_Northbound"/> table. OVN does not
1162 prescribe a particular format for the logical port ID.
1163 </column>
c0281929 1164
c96ba502
BP
1165 <column name="chassis">
1166 The physical location of the logical port. To successfully identify a
1167 chassis, this column must be a <ref table="Chassis"/> record. This is
1168 populated by
1169 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>.
1170 </column>
c0281929 1171
c96ba502
BP
1172 <column name="tunnel_key">
1173 <p>
1174 A number that represents the logical port in the key (e.g. STT key or
1175 Geneve TLV) field carried within tunnel protocol packets.
1176 </p>
c0281929 1177
c96ba502
BP
1178 <p>
1179 The tunnel ID must be unique within the scope of a logical datapath.
1180 </p>
1181 </column>
88058f19 1182
c96ba502
BP
1183 <column name="mac">
1184 <p>
1185 The Ethernet address or addresses used as a source address on the
1186 logical port, each in the form
1187 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1188 The string <code>unknown</code> is also allowed to indicate that the
1189 logical port has an unknown set of (additional) source addresses.
1190 </p>
1191
1192 <p>
1193 A VM interface would ordinarily have a single Ethernet address. A
1194 gateway port might initially only have <code>unknown</code>, and then
1195 add MAC addresses to the set as it learns new source addresses.
1196 </p>
1197 </column>
88058f19 1198
c96ba502
BP
1199 <column name="type">
1200 <p>
1201 A type for this logical port. Logical ports can be used to model other
1202 types of connectivity into an OVN logical switch. The following types
1203 are defined:
1204 </p>
1205
1206 <dl>
1207 <dt>(empty string)</dt>
1208 <dd>VM (or VIF) interface.</dd>
d387d24d
BP
1209
1210 <dt><code>patch</code></dt>
1211 <dd>
1212 One of a pair of logical ports that act as if connected by a patch
1213 cable. Useful for connecting two logical datapaths, e.g. to connect
1214 a logical router to a logical switch or to another logical router.
1215 </dd>
1216
c96ba502
BP
1217 <dt><code>localnet</code></dt>
1218 <dd>
1219 A connection to a locally accessible network from each
1220 <code>ovn-controller</code> instance. A logical switch can only
6e6c3f91
HZ
1221 have a single <code>localnet</code> port attached. This is used
1222 to model direct connectivity to an existing network.
c96ba502
BP
1223 </dd>
1224
1225 <dt><code>vtep</code></dt>
1226 <dd>
1227 A port to a logical switch on a VTEP gateway chassis. In order to
1228 get this port correctly recognized by the OVN controller, the <ref
1229 column="options"
1230 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1231 column="options"
1232 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1233 be defined.
1234 </dd>
1235 </dl>
1236 </column>
1237 </group>
1a76c93e 1238
d387d24d
BP
1239 <group title="Patch Options">
1240 <p>
1241 These options apply to logical ports with <ref column="type"/> of
1242 <code>patch</code>.
1243 </p>
1244
1245 <column name="options" key="peer">
1246 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1247 record for the other side of the patch. The named <ref
1248 column="logical_port"/> must specify this <ref column="logical_port"/>
1249 in its own <code>peer</code> option. That is, the two patch logical
1250 ports must have reversed <ref column="logical_port"/> and
1251 <code>peer</code> values.
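
 <p>
 A sketch of a correctly paired set of patch ports, using hypothetical
 port names. Each row's <code>peer</code> names the other row's
 <ref column="logical_port"/>:
 </p>

 <pre>
logical_port    type     options
"sw0-to-lr0"    patch    peer="lr0-to-sw0"
"lr0-to-sw0"    patch    peer="sw0-to-lr0"
</pre>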
1252 </column>
1253 </group>
1254
c96ba502 1255 <group title="Localnet Options">
eb00399e 1256 <p>
c96ba502
BP
1257 These options apply to logical ports with <ref column="type"/> of
1258 <code>localnet</code>.
eb00399e
BP
1259 </p>
1260
c96ba502
BP
1261 <column name="options" key="network_name">
1262 Required. <code>ovn-controller</code> uses the configuration entry
1263 <code>ovn-bridge-mappings</code> to determine how to connect to this
1264 network. <code>ovn-bridge-mappings</code> is a list of network names
1265 mapped to a local OVS bridge that provides access to that network. An
1266 example of configuring <code>ovn-bridge-mappings</code> would be:
1267
1268 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1269
1270 <p>
1271 When a logical switch has a <code>localnet</code> port attached,
1272 every chassis that may have a local vif attached to that logical
1273 switch must have a bridge mapping configured to reach that
1274 <code>localnet</code>. Traffic that arrives on a
1275 <code>localnet</code> port is never forwarded over a tunnel to
1276 another chassis.
1277 </p>
1278 </column>
1279
1280 <column name="tag">
1281 If set, indicates that the port represents a connection to a specific
1282 VLAN on a locally accessible network. The VLAN ID is used to match
1283 incoming traffic and is also added to outgoing traffic.
1284 </column>
1285 </group>
1286
1287 <group title="VTEP Options">
eb00399e 1288 <p>
c96ba502
BP
1289 These options apply to logical ports with <ref column="type"/> of
1290 <code>vtep</code>.
eb00399e 1291 </p>
9fb4636f 1292
c96ba502
BP
1293 <column name="options" key="vtep-physical-switch">
1294 Required. The name of the VTEP gateway.
1295 </column>
fe36184b 1296
c96ba502
BP
1297 <column name="options" key="vtep-logical-switch">
1298 Required. A logical switch name connected by the VTEP gateway. Must
1299 be set when <ref column="type"/> is <code>vtep</code>.
1300 </column>
1301 </group>
fe36184b 1302
aef5f431
BP
1303 <group title="VMI (or VIF) Options">
1304 <p>
1305 These options apply to logical ports with <ref column="type"/> having
1306 (empty string)
1307 </p>
1308
1309 <column name="options" key="policing_rate">
1310 If set, indicates the maximum rate for data sent from this interface,
1311 in kbps. Data exceeding this rate is dropped.
1312 </column>
1313
1314 <column name="options" key="policing_burst">
1315 If set, indicates the maximum burst size for data sent from this
1316 interface, in kb.
1317 </column>
1318 </group>
1319
c96ba502 1320 <group title="Nested Containers">
fe36184b 1321 <p>
c96ba502
BP
1322 These columns support containers nested within a VM. Specifically,
1323 they are used when <ref column="type"/> is empty and <ref
1324 column="logical_port"/> identifies the interface of a container spawned
1325 inside a VM. They are empty for containers or VMs that run directly on
1326 a hypervisor.
fe36184b
BP
1327 </p>
1328
c96ba502
BP
1329 <column name="parent_port">
1330 This is taken from
1331 <ref table="Logical_Port" column="parent_name" db="OVN_Northbound"/>
1332 in the OVN_Northbound database's <ref table="Logical_Port"
1333 db="OVN_Northbound"/> table.
1334 </column>
1335
1336 <column name="tag">
1337 <p>
1338 Identifies the VLAN tag in the network traffic associated with that
1339 container's network interface.
1340 </p>
1341
1342 <p>
1343 This column is used for a different purpose when <ref column="type"/>
1344 is <code>localnet</code> (see <code>Localnet Options</code>, above).
1345 </p>
1346 </column>
1347 </group>
fe36184b
BP
1348 </table>
1349</database>