git.proxmox.com Git - mirror_ovs.git/blame - ovn/ovn-sb.xml
ovn-controller-vtep: Extend vtep module to install Ucast_Macs_Remote.
Commit Line Data
fe36184b 1<?xml version="1.0" encoding="utf-8"?>
ec78987f 2<database name="ovn-sb" title="OVN Southbound Database">
fe36184b
BP
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
ec78987f
JP
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
13 <code>ovn-controller</code>, and northbound to the Cloud Management
91ae2065 14 System, via <code>ovn-northd</code>.
fe36184b
BP
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
ec78987f
JP
20 The OVN Southbound database contains three classes of data with
21 different properties, as described in the sections below.
fe36184b
BP
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 contains all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
62fdd819 38 The <ref table="Chassis"/> table comprises the PN tables.
fe36184b
BP
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system (CMS) running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
09986f8c
JP
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
fe36184b
BP
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
5868eb24
BP
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
fe36184b
BP
78 </p>
79
80 <h3>Bindings data</h3>
81
82 <p>
5868eb24
BP
83 Bindings data link logical and physical components. They show the current
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
fe36184b
BP
87 </p>
88
89 <p>
90 Bindings change frequently, at least every time a VM powers up or down
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
5868eb24
BP
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
fe36184b
BP
104 </p>
105
5868eb24
BP
106 <h2>Common Columns</h2>
107
108 <p>
109 Some tables contain a special column named <code>external_ids</code>. This
110 column has the same form and purpose each place that it appears, so we
111 describe it here to save space later.
112 </p>
113
114 <dl>
115 <dt><code>external_ids</code>: map of string-string pairs</dt>
116 <dd>
117 Key-value pairs for use by the software that manages the OVN Southbound
118 database rather than by <code>ovn-controller</code>. In particular,
119 <code>ovn-northd</code> can use key-value pairs in this column to relate
120 entities in the southbound database to higher-level entities (such as
121 entities in the OVN Northbound database). Individual key-value pairs in
122 this column may be documented in some cases to aid in understanding and
123 troubleshooting, but the reader should not mistake such documentation as
124 comprehensive.
125 </dd>
126 </dl>
127
fe36184b
BP
128 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
129 <p>
130 Each row in this table represents a hypervisor or gateway (a chassis) in
131 the physical network (PN). Each chassis, via
132 <code>ovn-controller</code>, adds and updates its own row, and keeps a
133 copy of the remaining rows to determine how to reach other hypervisors.
134 </p>
135
136 <p>
137 When a chassis shuts down gracefully, it should remove its own row.
138 (This is not critical because resources hosted on the chassis are equally
139 unreachable regardless of whether the row is present.) If a chassis
140 shuts down permanently without removing its row, some kind of manual or
141 automatic cleanup is eventually needed; we can devise a process for that
142 as necessary.
143 </p>
144
145 <column name="name">
146 A chassis name, taken from <ref key="system-id" table="Open_vSwitch"
147 column="external_ids" db="Open_vSwitch"/> in the Open_vSwitch
148 database's <ref table="Open_vSwitch" db="Open_vSwitch"/> table. OVN does
149 not prescribe a particular format for chassis names.
150 </column>
151
09db214c 152 <group title="Encapsulation Configuration">
fe36184b 153 <p>
09db214c
JP
154 OVN uses encapsulation to transmit logical dataplane packets
155 between chassis.
fe36184b
BP
156 </p>
157
09db214c
JP
158 <column name="encaps">
159 Points to supported encapsulation configurations to transmit
160 logical dataplane packets to this chassis. Each entry is a <ref
161 table="Encap"/> record that describes the configuration.
fe36184b
BP
162 </column>
163 </group>
164
62fdd819
AW
165 <group title="Gateway Configuration">
166 <p>
167 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
168 OVN-managed part of a logical network and a physical VLAN, extending a
169 tunnel-based logical network into a physical network. Gateways are
170 typically dedicated nodes that do not host VMs.
fe36184b
BP
171 </p>
172
62fdd819
AW
173 <column name="vtep_logical_switches">
174 Stores all vtep logical switch names connected by this gateway
175 chassis.
fe36184b 176 </column>
62fdd819 177 </group>
fe36184b
BP
178 </table>
179
09db214c
JP
180 <table name="Encap" title="Encapsulation Types">
181 <p>
182 The <ref column="encaps" table="Chassis"/> column in the <ref
183 table="Chassis"/> table refers to rows in this table to identify
184 how OVN may transmit logical dataplane packets to this chassis.
185 Each chassis, via <code>ovn-controller</code>(8), adds and updates
186 its own rows and keeps a copy of the remaining rows to determine
187 how to reach other chassis.
188 </p>
189
190 <column name="type">
191 The encapsulation to use to transmit packets to this chassis.
b705f9ea
JP
192 Hypervisors must use either <code>geneve</code> or
193 <code>stt</code>. Gateways may use <code>vxlan</code>,
194 <code>geneve</code>, or <code>stt</code>.
09db214c
JP
195 </column>
196
197 <column name="options">
198 Options for configuring the encapsulation, e.g. IPsec parameters when
199 IPsec support is introduced. No options are currently defined.
200 </column>
201
202 <column name="ip">
203 The IPv4 address of the encapsulation tunnel endpoint.
204 </column>
205 </table>
206
5868eb24 207 <table name="Logical_Flow" title="Logical Network Flows">
fe36184b 208 <p>
09986f8c
JP
209 Each row in this table represents one logical flow.
210 <code>ovn-northd</code> populates this table with logical flows
211 that implement the L2 and L3 topologies specified in the
212 <ref db="OVN_Northbound"/> database. Each hypervisor, via
213 <code>ovn-controller</code>, translates the logical flows into
214 OpenFlow flows specific to its hypervisor and installs them into
215 Open vSwitch.
fe36184b
BP
216 </p>
217
218 <p>
219 Logical flows are expressed in an OVN-specific format, described here. A
220 logical datapath flow is much like an OpenFlow flow, except that the
221 flows are written in terms of logical ports and logical datapaths instead
222 of physical ports and physical datapaths. Translation between logical
223 and physical flows helps to ensure isolation between logical datapaths.
09986f8c
JP
224 (The logical flow abstraction also allows the OVN centralized
225 components to do less work, since they do not have to separately
226 compute and push out physical flows to each chassis.)
fe36184b
BP
227 </p>
228
229 <p>
230 The default action when no flow matches is to drop packets.
231 </p>
232
5868eb24
BP
233 <p><em>Logical Life Cycle of a Packet</em></p>
234
235 <p>
236 The following description focuses on the life cycle of a packet through
237 a logical datapath, ignoring physical details of the implementation.
238 Please refer to <em>Life Cycle of a Packet</em> in
239 <code>ovn-architecture</code>(7) for the physical information.
240 </p>
241
242 <p>
243 The description here is written as if OVN itself executes these steps,
244 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
245 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
246 </p>
247
248 <p>
249 At a high level, OVN passes each packet through the logical datapath's
250 logical ingress pipeline, which may output the packet to one or more
251 logical ports or logical multicast groups. For each such logical output
252 port, OVN passes the packet through the datapath's logical egress
253 pipeline, which may either drop the packet or deliver it to the
254 destination. Between the two pipelines, outputs to logical multicast
255 groups are expanded into logical ports, so that the egress pipeline only
256 processes a single logical output port at a time. Between the two
257 pipelines is also where, when necessary, OVN encapsulates a packet in a
258 tunnel (or tunnels) to transmit to remote hypervisors.
259 </p>
260
261 <p>
262 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
263 table for a row with correct <ref column="logical_datapath"/>, a <ref
264 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
265 of 0, and a <ref column="match"/> that is true for the packet. If none
266 is found, OVN drops the packet. If OVN finds more than one, it chooses
267 the match with the highest <ref column="priority"/>. Then OVN executes
268 each of the actions specified in the row's <ref column="actions"/> column,
269 in the order specified. Some actions, such as those to modify packet
270 headers, require no further details. The <code>next</code> and
271 <code>output</code> actions are special.
272 </p>
273
274 <p>
275 The <code>next</code> action causes the above process to be repeated
276 recursively, except that OVN searches for <ref column="table_id"/> of 1
277 instead of 0. Similarly, any <code>next</code> action in a row found in
278 that table would cause a further search for a <ref column="table_id"/> of
279 2, and so on. When recursive processing completes, flow control returns
280 to the action following <code>next</code>.
281 </p>
282
283 <p>
284 The <code>output</code> action also introduces recursion. Its effect
285 depends on the current value of the <code>outport</code> field. Suppose
286 <code>outport</code> designates a logical port. First, OVN compares
287 <code>inport</code> to <code>outport</code>; if they are equal, it treats
288 the <code>output</code> as a no-op. In the common case, where they are
289 different, the packet enters the egress pipeline. This transition to the
290 egress pipeline discards register data, e.g. <code>reg0</code>
291 ... <code>reg5</code>, to achieve uniform behavior regardless of whether
292 the egress pipeline is on a different hypervisor (because registers
293 aren't preserved across tunnel encapsulation).
294 </p>
295
296 <p>
297 To execute the egress pipeline, OVN again searches the <ref
298 table="Logical_Flow"/> table for a row with correct <ref
299 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
300 column="match"/> that is true for the packet, but now looking for a <ref
301 column="pipeline"/> of <code>egress</code>. If no matching row is found,
302 the output becomes a no-op. Otherwise, OVN executes the actions for the
303 matching flow (which is chosen from multiple, if necessary, as already
304 described).
305 </p>
306
307 <p>
308 In the <code>egress</code> pipeline, the <code>next</code> action acts as
309 already described, except that it, of course, searches for
310 <code>egress</code> flows. The <code>output</code> action, however, now
311 directly outputs the packet to the output port (which is now fixed,
312 because <code>outport</code> is read-only within the egress pipeline).
313 </p>
314
315 <p>
316 The description earlier assumed that <code>outport</code> referred to a
317 logical port. If it instead designates a logical multicast group, then
318 the description above still applies, with the addition of fan-out from
319 the logical multicast group to each logical port in the group. For each
320 member of the group, OVN executes the logical pipeline as described, with
321 the logical output port replaced by the group member.
322 </p>
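<p>
The life cycle described above can be sketched as a small interpreter. This
is only an illustrative model in Python, not how <code>ovn-controller</code>
implements it: the <code>Flow</code> class, the <code>run_table</code> and
<code>process</code> functions, the string encoding of actions, and the
sample flows are all assumptions made for this example. A single logical
datapath is assumed, so <ref column="logical_datapath"/> is omitted.
</p>

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Flow:
    pipeline: str                    # "ingress" or "egress"
    table_id: int
    priority: int
    match: Callable[[dict], bool]    # stands in for a match expression
    actions: List[str]               # "next", "output", or "field=value"


def run_table(flows, groups, pkt, pipeline, table_id, delivered):
    hits = [f for f in flows
            if f.pipeline == pipeline and f.table_id == table_id and f.match(pkt)]
    if not hits:
        return                       # ingress: drop; egress: output is a no-op
    flow = max(hits, key=lambda f: f.priority)
    for action in flow.actions:
        if action == "next":
            run_table(flows, groups, pkt, pipeline, table_id + 1, delivered)
        elif action == "output" and pipeline == "egress":
            delivered.append(pkt["outport"])
        elif action == "output":
            # Fan out a multicast group; a plain port is a group of one.
            for port in groups.get(pkt["outport"], [pkt["outport"]]):
                if port == pkt["inport"]:
                    continue         # output to the input port is a no-op
                # Registers are discarded on the way into the egress pipeline.
                egress_pkt = {k: v for k, v in pkt.items()
                              if not k.startswith("reg")}
                egress_pkt["outport"] = port
                run_table(flows, groups, egress_pkt, "egress", 0, delivered)
        else:
            field, value = action.split("=", 1)
            pkt[field] = value


def process(flows, groups, pkt):
    """Run one packet through ingress table 0; return ports it reached."""
    delivered = []
    run_table(flows, groups, dict(pkt), "ingress", 0, delivered)
    return delivered


# Hypothetical sample data: three ports and one flood group.
BCAST = "ff:ff:ff:ff:ff:ff"
groups = {"flood": ["lp1", "lp2", "lp3"]}
flows = [
    Flow("ingress", 0, 50, lambda p: True, ["next"]),
    Flow("ingress", 1, 100, lambda p: p["eth.dst"] == BCAST,
         ["outport=flood", "output"]),
    Flow("ingress", 1, 50, lambda p: p["eth.dst"] == "00:00:00:00:00:02",
         ["outport=lp2", "output"]),
    Flow("egress", 0, 0, lambda p: True, ["output"]),
]

print(process(flows, groups, {"inport": "lp1", "eth.dst": BCAST}))
# ['lp2', 'lp3']
```

<p>
In this model a packet with no matching ingress flow simply produces an
empty delivery list, which corresponds to the default drop behavior.
</p>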
323
8d6e5516
JP
324 <p><em>Pipeline Stages</em></p>
325
326 <p>
327 <code>ovn-northd</code> is responsible for populating the
328 <ref table="Logical_Flow"/> table, so the stages are an
329 implementation detail and subject to change. This section
330 describes the current logical flow table.
331 </p>
332
333 <p>
334 The ingress pipeline consists of the following stages:
335 </p>
336 <ul>
337 <li>
338 Port Security (Table 0): Validates the source address, drops
339 packets with a VLAN tag, and, if configured, verifies that the
340 logical port is allowed to send with the source address.
341 </li>
342
343 <li>
344 L2 Destination Lookup (Table 1): Forwards known unicast
345 addresses to the appropriate logical port. Unicast packets to
346 unknown hosts are forwarded to logical ports configured with the
347 special <code>unknown</code> MAC address. Broadcast and
348 multicast packets are flooded to all ports in the logical switch.
349 </li>
350 </ul>
351
352 <p>
353 The egress pipeline consists of the following stages:
354 </p>
355 <ul>
356 <li>
357 ACL (Table 0): Applies any specified access control lists.
358 </li>
359
360 <li>
361 Port Security (Table 1): If configured, verifies that the
362 logical port is allowed to receive packets with the destination
363 address.
364 </li>
365 </ul>
366
747b2a45 367 <column name="logical_datapath">
5868eb24
BP
368 The logical datapath to which the logical flow belongs.
369 </column>
370
371 <column name="pipeline">
372 <p>
373 The primary flows used for deciding on a packet's destination are the
374 <code>ingress</code> flows. The <code>egress</code> flows implement
375 ACLs. See <em>Logical Life Cycle of a Packet</em>, above, for details.
376 </p>
747b2a45
BP
377 </column>
378
fe36184b
BP
379 <column name="table_id">
380 The stage in the logical pipeline, analogous to an OpenFlow table number.
381 </column>
382
383 <column name="priority">
384 The flow's priority. Flows with numerically higher priority take
385 precedence over those with lower. If two logical datapath flows with the
386 same priority both match, then the one actually applied to the packet is
387 undefined.
388 </column>
389
390 <column name="match">
391 <p>
392 A matching expression. OVN provides a superset of OpenFlow matching
393 capabilities, using a syntax similar to Boolean expressions in a
394 programming language.
395 </p>
396
397 <p>
fa6aeaeb
RB
398 The most important components of a match expression are
399 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
400 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
401 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
402 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
403 logical OR operator <code>||</code> can combine comparisons into a
404 larger expression.
fe36184b
BP
405 </p>
406
fe36184b 407 <p>
e0840f11
BP
408 Matching expressions also support parentheses for grouping, the logical
409 NOT prefix operator <code>!</code>, and literals <code>0</code> and
410 <code>1</code> to express ``false'' or ``true,'' respectively. The
411 latter is useful by itself as a catch-all expression that matches every
412 packet.
fe36184b
BP
413 </p>
414
e0840f11 415 <p><em>Symbols</em></p>
fe36184b
BP
416
417 <p>
fa6aeaeb
RB
418 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
419 type. Integer symbols have a <dfn>width</dfn> in bits.
fe36184b
BP
420 </p>
421
422 <p>
fa6aeaeb 423 <em>Kinds</em>. There are three kinds of symbols:
fe36184b
BP
424 </p>
425
e0840f11 426 <ul>
fa6aeaeb
RB
427 <li>
428 <p>
429 <dfn>Fields</dfn>. A field symbol represents a packet header or
430 metadata field. For example, a field
431 named <code>vlan.tci</code> might represent the VLAN TCI field in a
432 packet.
433 </p>
434
435 <p>
436 A field symbol can have integer or string type. Integer fields can
437 be nominal or ordinal (see <em>Level of Measurement</em>,
438 below).
439 </p>
440 </li>
441
442 <li>
443 <p>
444 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
445 a larger field. For example, a field <code>vlan.vid</code> might
446 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
447 are provided for syntactic convenience, because it is always
448 possible to instead refer to a subset of bits from a field
449 directly.
450 </p>
451
452 <p>
453 Only ordinal fields (see <em>Level of Measurement</em>,
454 below) may have subfields. Subfields are always ordinal.
455 </p>
456 </li>
457
458 <li>
459 <p>
460 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
461 expression. Predicates may be used much like 1-bit fields. For
462 example, <code>ip4</code> might expand to <code>eth.type ==
463 0x800</code>. Predicates are provided for syntactic convenience,
464 because it is always possible to instead specify the underlying
465 expression directly.
466 </p>
467
468 <p>
469 A predicate whose expansion refers to any nominal field or
470 predicate (see <em>Level of Measurement</em>, below) is nominal;
471 other predicates have Boolean level of measurement.
472 </p>
473 </li>
e0840f11
BP
474 </ul>
475
fe36184b 476 <p>
fa6aeaeb
RB
477 <em>Level of Measurement</em>. See
478 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
479 concept on which this classification is based. There are three
480 levels:
fe36184b
BP
481 </p>
482
483 <ul>
fa6aeaeb
RB
484 <li>
485 <p>
486 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
487 on a scale. OVN considers a field (or subfield) to be ordinal if
488 its bits can be examined individually. This is true for the
489 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
490 </p>
491
492 <p>
493 Any use of an ordinal field may specify a single bit or a range of
494 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
495 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
496 multicast bit in the Ethernet destination address.
497 </p>
498
499 <p>
500 OVN supports all the usual arithmetic relations (<code>==</code>,
501 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
502 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
503 their subfields, because OVN can implement these in OpenFlow and
504 Open vSwitch as collections of bitwise tests.
505 </p>
506 </li>
507
508 <li>
509 <p>
510 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
511 usefully compared except for equality. OpenFlow
512 port numbers, Ethernet types, and IP protocols are examples: all of
513 these are just identifiers assigned arbitrarily with no deeper
514 meaning. In OpenFlow and Open vSwitch, bits in these fields
515 generally aren't individually addressable.
516 </p>
517
518 <p>
519 OVN only supports arithmetic tests for equality on nominal fields,
520 because OpenFlow and Open vSwitch provide no way for a flow to
521 efficiently implement other comparisons on them. (A test for
522 inequality can be approximated with two flows that have different
523 priorities, but OVN matching expressions always generate flows with
524 a single priority.)
525 </p>
526
527 <p>
528 String fields are always nominal.
529 </p>
530 </li>
531
532 <li>
533 <p>
534 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
535 and 1, is somewhat exceptional, since it is easy to support both
536 equality and inequality tests on such a field: either one can be
537 implemented as a test for 0 or 1.
538 </p>
539
540 <p>
541 Only predicates (see above) have a Boolean level of measurement.
542 </p>
543
544 <p>
545 This isn't a standard level of measurement.
546 </p>
547 </li>
fe36184b
BP
548 </ul>
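<p>
The bit ranges used in the ordinal examples above can be checked with
ordinary integer arithmetic. The following is a minimal Python sketch; the
helper names <code>subfield</code> and <code>mac_to_int</code> are
hypothetical, the sample TCI value is made up, and bit 0 is taken to be the
least significant bit.
</p>

```python
def subfield(value: int, lo: int, hi: int) -> int:
    """Return bits lo..hi of value, inclusive; bit 0 is the least significant."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def mac_to_int(mac: str) -> int:
    """Convert a colon-separated Ethernet address to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


tci = 0xB00A                       # made-up sample: PCP 5, bit 12 set, VID 10
print(subfield(tci, 13, 15))       # 5  (vlan.tci[13..15], the PCP)
print(subfield(tci, 0, 11))        # 10 (vlan.tci[0..11], the VID)
print(subfield(mac_to_int("01:00:5e:00:00:01"), 40, 40))  # 1, multicast
print(subfield(mac_to_int("00:11:22:33:44:55"), 40, 40))  # 0, unicast
```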
549
550 <p>
fa6aeaeb
RB
551 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
552 additional conditions implied by the use of the symbol. For example,
553 the <code>icmp4.type</code> symbol might have prerequisite
554 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
555 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
556 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
557 &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
558 <code>icmp4</code> is a predicate defined as suggested under
559 <em>Types</em> above).
fe36184b
BP
560 </p>
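<p>
As a rough illustration of the first step above, the Python sketch below
appends each symbol's prerequisite to an expression. The
<code>PREREQS</code> table and the <code>add_prerequisites</code> function
are hypothetical and operate on plain strings; they are not part of OVN, and
the real set of prerequisites is not listed in this section.
</p>

```python
import re

# Hypothetical prerequisite table, for illustration only.
PREREQS = {"icmp4.type": "icmp4", "icmp4.code": "icmp4"}

# An identifier made of letters, digits, underscores, and dots.
SYMBOL = re.compile(r"(?<![\w.])[A-Za-z_][\w.]*")


def add_prerequisites(expr: str) -> str:
    """AND each distinct prerequisite of the symbols in expr onto expr."""
    needed = []
    for symbol in SYMBOL.findall(expr):
        prereq = PREREQS.get(symbol)
        if prereq and prereq not in needed:
            needed.append(prereq)
    if not needed:
        return expr
    # Parenthesize when mixing with ||, as the match syntax requires.
    base = f"({expr})" if "||" in expr else expr
    return " && ".join([base] + needed)


print(add_prerequisites("icmp4.type == 0"))
# icmp4.type == 0 && icmp4
```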
561
e0840f11
BP
562 <p><em>Relational operators</em></p>
563
fe36184b 564 <p>
fa6aeaeb
RB
565 All of the standard relational operators <code>==</code>,
566 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
567 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
568 fields support only <code>==</code> and <code>!=</code>, and only in a
569 positive sense when outer <code>!</code> are taken into account,
570 e.g. given string field <code>inport</code>, <code>inport ==
571 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
572 not <code>inport != "eth0"</code>.
fe36184b
BP
573 </p>
574
575 <p>
fa6aeaeb
RB
576 The implementation of <code>==</code> (or <code>!=</code> when it is
577 negated) is more efficient than that of the other relational
578 operators.
fe36184b
BP
579 </p>
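<p>
To see why an ordering comparison can cost more than an equality test,
consider one possible way to express <code>&gt;=</code> as a collection of
masked (bitwise) tests. The Python sketch below is an assumption about how
such a decomposition could work, not a description of OVN's implementation;
the function names are hypothetical.
</p>

```python
def ge_as_masked_matches(c: int, width: int):
    """Return (value, mask) pairs that together match exactly x >= c.

    x > c holds when, at the highest bit where they differ, x has 1 and
    c has 0. Each 0 bit of c therefore yields one masked test.
    """
    full = (1 << width) - 1
    pairs = [(c, full)]                          # the x == c case
    for i in range(width):
        if not (c >> i) & 1:
            high = (full >> (i + 1)) << (i + 1)  # bits above bit i
            pairs.append(((c & high) | (1 << i), high | (1 << i)))
    return pairs


def matches_any(x: int, pairs) -> bool:
    """True if x satisfies at least one masked test."""
    return any((x & mask) == value for value, mask in pairs)


print(ge_as_masked_matches(5, 3))                # [(5, 7), (6, 6)]
print(len(ge_as_masked_matches(1024, 16)))       # 16
```

<p>
Under this scheme an equality test needs a single masked test, while
<code>tcp.src &gt;= 1024</code> on a 16-bit field needs sixteen.
</p>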
580
e0840f11
BP
581 <p><em>Constants</em></p>
582
fe36184b 583 <p>
e0840f11
BP
584 Integer constants may be expressed in decimal, hexadecimal prefixed by
585 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
586 their standard forms, or Ethernet addresses as colon-separated hex
587 digits. A constant in any of these forms may be followed by a slash
588 and a second constant (the mask) in the same form, to form a masked
589 constant. IPv4 and IPv6 masks may be given as integers, to express
590 CIDR prefixes.
591 </p>
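<p>
The two mask forms for IPv4 constants are equivalent, as the following
Python sketch shows using the standard <code>ipaddress</code> module. The
function <code>parse_ipv4_masked</code> is a hypothetical helper written for
this example and covers only IPv4.
</p>

```python
import ipaddress


def parse_ipv4_masked(text: str):
    """Return (value, mask) integers for 'addr', 'addr/N', or 'addr/a.b.c.d'."""
    addr, _, mask = text.partition("/")
    value = int(ipaddress.IPv4Address(addr))
    if not mask:
        mask_int = 0xFFFFFFFF                    # no mask: exact match
    elif "." in mask:
        mask_int = int(ipaddress.IPv4Address(mask))
    else:
        prefix_len = int(mask)                   # CIDR prefix length
        mask_int = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return value, mask_int


a = parse_ipv4_masked("192.168.0.0/24")
b = parse_ipv4_masked("192.168.0.0/255.255.255.0")
print(a == b)                                    # True
print(hex(a[0]), hex(a[1]))                      # 0xc0a80000 0xffffff00
```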
592
593 <p>
594 String constants have the same syntax as quoted strings in JSON (thus,
5868eb24 595 they are Unicode strings).
fe36184b
BP
596 </p>
597
598 <p>
e0840f11
BP
599 Some operators support sets of constants written inside curly braces
600 <code>{</code> ... <code>}</code>. Commas between elements of a set,
601 and after the last element, are optional. With <code>==</code>,
602 ``<code><var>field</var> == { <var>constant1</var>,
603 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
604 for ``<code><var>field</var> == <var>constant1</var> ||
605 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
606 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
607 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
608 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
fe36184b 609 <var>field</var> != <var>constant2</var> &amp;&amp;
e0840f11 610 </code>...<code></code>''.
fe36184b
BP
611 </p>
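<p>
The set shorthand described above amounts to a simple textual expansion. A
minimal Python sketch follows; <code>expand_set</code> is a hypothetical
helper that works on strings and is not part of OVN.
</p>

```python
def expand_set(field: str, op: str, constants) -> str:
    """Expand 'field op {c1, c2, ...}' into explicit comparisons.

    '==' joins the comparisons with ||, '!=' joins them with &&.
    """
    joiner = {"==": " || ", "!=": " && "}[op]
    return joiner.join(f"{field} {op} {c}" for c in constants)


print(expand_set("tcp.dst", "==", [80, 443]))
# tcp.dst == 80 || tcp.dst == 443
print(expand_set("tcp.dst", "!=", [80, 443]))
# tcp.dst != 80 && tcp.dst != 443
```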
612
e0840f11
BP
613 <p><em>Miscellaneous</em></p>
614
fe36184b 615 <p>
fa6aeaeb
RB
616 Comparisons may name the symbol or the constant first,
617 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
618 acceptable.
fe36184b
BP
619 </p>
620
621 <p>
fa6aeaeb
RB
622 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
623 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
624 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
fe36184b
BP
625 </p>
626
627 <p>
fa6aeaeb
RB
628 For a one-bit field or predicate, a mention of its name is equivalent
629 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
630 is equivalent to <code>vlan.present == 1</code>. The same is true for
631 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
632 technical limitation to implementing the same for ordinal fields of all
633 widths, but the implementation is expensive enough that the syntax
634 parser requires writing an explicit comparison against zero to make
635 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
636 against 0 is required.
fe36184b
BP
637 </p>
638
639 <p>
fa6aeaeb
RB
640 <em>Operator precedence</em> is as shown below, from highest to lowest.
641 There are two exceptions where parentheses are required even though the
642 table would suggest that they are not: <code>&amp;&amp;</code> and
643 <code>||</code> require parentheses when used together, and
644 <code>!</code> requires parentheses when applied to a relational
645 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
646 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
647 parentheses are mandatory.
fe36184b
BP
648 </p>
649
e0840f11
BP
650 <ul>
651 <li><code>()</code></li>
652 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
653 <li><code>!</code></li>
654 <li><code>&amp;&amp; ||</code></li>
655 </ul>
656
10b1662b
BP
657 <p>
658 <em>Comments</em> may be introduced by <code>//</code>, which extends
659 to the next new-line. Comments within a line may be bracketed by
660 <code>/*</code> and <code>*/</code>. Multiline comments are not
661 supported.
662 </p>
663
e0840f11
BP
664 <p><em>Symbols</em></p>
665
5868eb24
BP
666 <p>
667 Most of the symbols below have integer type. Only <code>inport</code>
668 and <code>outport</code> have string type. <code>inport</code> names a
669 logical port. Thus, its value is a <ref column="logical_port"/> name
62fdd819
AW
670 from the <ref table="Port_Binding"/> table. <code>outport</code> may
671 name a logical port, as <code>inport</code>, or a logical multicast
672 group defined in the <ref table="Multicast_Group"/> table. For both
673 symbols, only names within the flow's logical datapath may be used.
5868eb24
BP
674 </p>
675
e0840f11 676 <ul>
5868eb24
BP
677 <li><code>reg0</code>...<code>reg5</code></li>
678 <li><code>inport</code> <code>outport</code></li>
e0840f11
BP
679 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
680 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
681 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
682 <li><code>ip4.src</code> <code>ip4.dst</code></li>
683 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
684 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
685 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
686 <li><code>udp.src</code> <code>udp.dst</code></li>
687 <li><code>sctp.src</code> <code>sctp.dst</code></li>
688 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
689 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
690 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
691 </ul>
692
25030d47
RB
693 <p>
694 The following predicates are supported:
695 </p>
696
697 <ul>
698 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
699 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
700 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
701 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
702 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
703 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
704 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
705 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
706 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
707 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
708 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
709 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0</code></li>
710 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
711 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
712 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
713 </ul>
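<p>
Because predicates may refer to other predicates, expanding one is a
recursive substitution. The Python sketch below copies the expansions from
the list above into a dictionary and expands them textually. The
<code>expand</code> function is hypothetical and adds parentheses around
every substitution for safety; it does not handle prerequisites.
</p>

```python
import re

PREDICATES = {
    "vlan.present": "vlan.tci[12]",
    "ip4": "eth.type == 0x800",
    "ip6": "eth.type == 0x86dd",
    "ip": "ip4 || ip6",
    "icmp4": "ip4 && ip.proto == 1",
    "icmp6": "ip6 && ip.proto == 58",
    "icmp": "icmp4 || icmp6",
    "ip.is_frag": "ip.frag[0]",
    "ip.later_frag": "ip.frag[1]",
    "ip.first_frag": "ip.is_frag && !ip.later_frag",
    "arp": "eth.type == 0x806",
    "nd": "icmp6.type == {135, 136} && icmp6.code == 0",
    "tcp": "ip.proto == 6",
    "udp": "ip.proto == 17",
    "sctp": "ip.proto == 132",
}

# A whole dotted identifier, so that "ip.proto" is never read as "ip".
SYMBOL = re.compile(r"(?<![\w.])[A-Za-z_][\w.]*")


def expand(expr: str) -> str:
    """Replace every predicate name in expr by its parenthesized expansion."""
    def substitute(m):
        name = m.group(0)
        if name in PREDICATES:
            return "(" + expand(PREDICATES[name]) + ")"
        return name
    return SYMBOL.sub(substitute, expr)


print(expand("icmp4"))
# ((eth.type == 0x800) && ip.proto == 1)
print(expand("ip.first_frag"))
# ((ip.frag[0]) && !(ip.frag[1]))
```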
fe36184b
BP
714 </column>
715
716 <column name="actions">
717 <p>
2cd87fce
RB
718 Logical datapath actions, to be executed when the logical flow
719 represented by this row is the highest-priority match.
fe36184b
BP
720 </p>
721
35060cdc 722 <p>
2cd87fce
RB
723 Actions share lexical syntax with the <ref column="match"/> column. An
724 empty set of actions (or one that contains just white space or
725 comments), or a set of actions that consists of just
726 <code>drop;</code>, causes the matched packets to be dropped.
727 Otherwise, the column should contain a sequence of actions, each
728 terminated by a semicolon.
35060cdc 729 </p>
fe36184b 730
35060cdc 731 <p>
5868eb24 732 The following actions are defined:
35060cdc 733 </p>
fe36184b 734
35060cdc
BP
735 <dl>
736 <dt><code>output;</code></dt>
737 <dd>
5868eb24
BP
738 <p>
739 In the ingress pipeline, this action executes the
740 <code>egress</code> pipeline as a subroutine. If
741 <code>outport</code> names a logical port, the egress pipeline
742 executes once; if it is a multicast group, the egress pipeline runs
743 once for each logical port in the group.
744 </p>
745
746 <p>
747 In the egress pipeline, this action performs the actual
748 output to the <code>outport</code> logical port. (In the egress
749 pipeline, <code>outport</code> never names a multicast group.)
750 </p>
751
752 <p>
753 Output to the input port is implicitly dropped, that is,
754 <code>output</code> becomes a no-op if <code>outport</code> ==
755 <code>inport</code>.
756 </p>
757 </dd>
fe36184b 758
35060cdc
BP
759 <dt><code>next;</code></dt>
760 <dd>
2cd87fce
RB
761 Executes the next logical datapath table as a subroutine.
762 </dd>
fe36184b 763
35060cdc
BP
764 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
765 <dd>
5868eb24
BP
766 <p>
767 Sets data or metadata field <var>field</var> to constant value
768 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
769 logical output port. To set only a subset of bits in a field,
770 specify a subfield for <var>field</var> or a masked
771 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
772 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
773 the VLAN PCP.
774 </p>
775
776 <p>
777 Assigning to a field with prerequisites implicitly adds those
778 prerequisites to <ref column="match"/>; thus, for example, a flow
779 that sets <code>tcp.dst</code> applies only to TCP flows,
780 regardless of whether its <ref column="match"/> mentions any TCP
781 field.
782 </p>
783
784 <p>
785 Not all fields are modifiable (e.g. <code>eth.type</code> and
786 <code>ip.proto</code> are read-only), and not all modifiable fields
787 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
788 as a whole). The <code>outport</code> field is modifiable in the
789 <code>ingress</code> pipeline but not in the <code>egress</code>
790 pipeline.
791 </p>
792 </dd>
fe36184b
BP
793 </dl>
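<p>
The effect of a masked assignment such as <code>vlan.pcp = 4/4;</code> can be
illustrated with a small Python sketch. The helper name
<code>masked_assign</code> and its parameters are hypothetical and are not part
of OVN; the sketch only models the "set a subset of bits" behavior described
above, one bit at a time for clarity.
</p>

```python
def masked_assign(old, value, mask, width):
    """Return `old` with only the bits selected by `mask` replaced from `value`.

    Hypothetical helper for illustration; not an OVN function.
    """
    result = 0
    for bit in range(width):
        # Take this bit from `value` where the mask bit is 1, else keep `old`.
        source = value if (mask >> bit) % 2 else old
        result += ((source >> bit) % 2) * 2**bit
    return result


# vlan.pcp is a 3-bit field. "4/4" means value 4 under mask 4, so only bit 2
# (the most significant bit) is written; the other two bits keep their value.
pcp_before = 0b001
pcp_after = masked_assign(pcp_before, 4, 4, 3)
print(bin(pcp_after))
```

<p>
Under the same model, <code>vlan.pcp[2] = 1;</code> corresponds to a value and
mask of <code>2**2</code>, which is why the two forms are interchangeable here.
</p>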
794
795 <p>
2cd87fce
RB
796 The following actions will likely be useful later, but they have not
797 been thought out carefully.
fe36184b
BP
798 </p>
799
800 <dl>
35060cdc 801 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
2cd87fce
RB
802 <dd>
803 Extends the assignment action to allow copying between fields.
804 </dd>
35060cdc 805
e0840f11 806 <dt><code>learn</code></dt>
fe36184b 807
e0840f11 808 <dt><code>conntrack</code></dt>
fe36184b 809
35060cdc 810 <dt><code>dec_ttl { <var>action</var>, </code>...<code> } { <var>action</var>; </code>...<code>};</code></dt>
e0840f11
BP
811 <dd>
812 decrement TTL; execute first set of actions if
813 successful, second set if TTL decrement fails
814 </dd>
fe36184b 815
35060cdc 816 <dt><code>icmp_reply { <var>action</var>, </code>...<code> };</code></dt>
e0840f11 817 <dd>generate ICMP reply from packet, execute <var>action</var>s</dd>
fe36184b 818
fa6aeaeb
RB
819 <dt><code>arp { <var>action</var>, </code>...<code> }</code></dt>
820 <dd>generate ARP from packet, execute <var>action</var>s</dd>
fe36184b 821 </dl>
fe36184b 822 </column>
091e3af9
JP
823
824 <column name="external_ids" key="stage-name">
825 Human-readable name for this flow's stage in the pipeline.
826 </column>
827
828 <group title="Common Columns">
829 The overall purpose of these columns is described under <code>Common
830 Columns</code> at the beginning of this document.
831
832 <column name="external_ids"/>
833 </group>
fe36184b
BP
834 </table>
835
5868eb24
BP
836 <table name="Multicast_Group" title="Logical Port Multicast Groups">
837 <p>
838 The rows in this table define multicast groups of logical ports.
839 Multicast groups allow a single packet transmitted over a tunnel to a
840 hypervisor to be delivered to multiple VMs on that hypervisor, which
841 uses bandwidth more efficiently.
842 </p>
843
844 <p>
845 Each row in this table defines a logical multicast group numbered <ref
846 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
847 ports are listed in the <ref column="ports"/> column.
848 </p>
849
850 <column name="datapath">
851 The logical datapath in which the multicast group resides.
852 </column>
853
854 <column name="tunnel_key">
855 The value used to designate this logical egress port in tunnel
856 encapsulations. An index forces the key to be unique within the <ref
857 column="datapath"/>. The unusual range ensures that multicast group IDs
858 do not overlap with logical port IDs.
859 </column>
860
861 <column name="name">
862 <p>
863 The logical multicast group's name. An index forces the name to be
864 unique within the <ref column="datapath"/>. Logical flows in the
865 ingress pipeline may output to the group just as for individual logical
866 ports, by assigning the group's name to <code>outport</code> and
867 executing an <code>output</code> action.
868 </p>
869
870 <p>
871 Multicast group names and logical port names share a single namespace
872 and thus should not overlap (but the database schema cannot enforce
873 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
874 that begin with <code>_MC_</code>.
875 </p>
876 </column>
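<p>
As a rough illustration of how <code>output</code> behaves when
<code>outport</code> names a multicast group, the following Python sketch
expands a group into its member ports and skips the ingress port. The function
<code>egress_ports</code>, the group name, and the port names are assumptions
made for this example; this is a model of the documented semantics, not the
<code>ovn-controller</code> implementation.
</p>

```python
def egress_ports(outport, inport, multicast_groups):
    """List the logical ports that would receive the packet.

    Hypothetical model: `multicast_groups` maps a group name to its ports.
    """
    # A multicast group name expands to its member ports; any other name is
    # treated as a single logical port.
    members = multicast_groups.get(outport, [outport])
    # Output back to the input port is implicitly dropped.
    return [port for port in members if port != inport]


# Hypothetical group and port names, chosen only for this example.
groups = {"_MC_example": ["vif0", "vif1", "vif2"]}
print(egress_ports("_MC_example", "vif0", groups))
print(egress_ports("vif1", "vif0", groups))
```

<p>
Because group names and port names share one namespace, a single lookup by name
is enough in this model, which is also why the names should not overlap.
</p>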
877
878 <column name="ports">
879 The logical ports included in the multicast group. All of these ports
880 must be in the <ref column="datapath"/> logical datapath (but the
881 database schema cannot enforce this).
882 </column>
883 </table>
884
885 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
886 <p>
887 Each row in this table identifies physical bindings of a logical
888 datapath. A logical datapath implements a logical pipeline among the
889 ports in the <ref table="Port_Binding"/> table associated with it. In
890 practice, the pipeline in a given logical datapath implements either a
891 logical switch or a logical router.
892 </p>
893
894 <column name="tunnel_key">
895 The tunnel key value to which the logical datapath is bound.
896 The <code>Tunnel Encapsulation</code> section in
897 <code>ovn-architecture</code>(7) describes how tunnel keys are
898 constructed for each supported encapsulation.
899 </column>
900
901 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
902 Each row in <ref table="Datapath_Binding"/> is associated with some
903 logical datapath. <code>ovn-northd</code> uses this key to store the
904 UUID of the logical datapath <ref table="Logical_Switch"
905 db="OVN_Northbound"/> row in the <ref db="OVN_Northbound"/> database.
906 </column>
907
908 <group title="Common Columns">
909 The overall purpose of these columns is described under <code>Common
910 Columns</code> at the beginning of this document.
911
912 <column name="external_ids"/>
913 </group>
914 </table>
915
dcda6e0d 916 <table name="Port_Binding" title="Physical-Logical Port Bindings">
fe36184b
BP
917 <p>
918 Each row in this table identifies the physical location of a logical
9fb4636f 919 port.
fe36184b
BP
920 </p>
921
922 <p>
9fb4636f 923 For every <code>Logical_Port</code> record in the <code>OVN_Northbound</code>
91ae2065
RB
924 database, <code>ovn-northd</code> creates a record in this table.
925 <code>ovn-northd</code> populates and maintains every column except
3213e9df 926 the <code>chassis</code> column, which it leaves empty in new records.
9fb4636f
GS
927 </p>
928
929 <p>
930 <code>ovn-controller</code> populates the <code>chassis</code> column
931 for the records that identify the logical ports located on its
932 hypervisor. It discovers these ports by monitoring the local
933 hypervisor's Open_vSwitch database, which identifies logical ports
934 via the conventions described in
935 <code>IntegrationGuide.md</code>.
936 </p>
937
938 <p>
5868eb24 939 When a chassis shuts down gracefully, it should clean up the
9fb4636f 940 <code>chassis</code> column that it previously had populated.
fe36184b
BP
941 (This is not critical because resources hosted on the chassis are equally
942 unreachable regardless of whether their rows are present.) To handle the
943 case where a VM is shut down abruptly on one chassis, then brought up
9fb4636f
GS
944 again on a different one, <code>ovn-controller</code> must overwrite the
945 <code>chassis</code> column with new information.
fe36184b
BP
946 </p>
947
5868eb24
BP
948 <column name="datapath">
949 The logical datapath to which the logical port belongs.
747b2a45
BP
950 </column>
951
fe36184b 952 <column name="logical_port">
9fb4636f
GS
953 A logical port, taken from <ref table="Logical_Port" column="name"
954 db="OVN_Northbound"/> in the OVN_Northbound database's
955 <ref table="Logical_Port" db="OVN_Northbound"/> table. OVN does not
956 prescribe a particular format for the logical port ID.
957 </column>
958
1a76c93e
RB
959 <column name="type">
960 <p>
961 A type for this logical port. Logical ports can be used to model
962 other types of connectivity into an OVN logical switch. Leaving this column
963 blank maintains the default logical port behavior.
964 </p>
965
966 <p>
c0281929
RB
967 When this column is set to <code>localnet</code>, this logical port
968 represents a connection to a locally accessible network from each
969 ovn-controller instance. A logical switch can only have a single
970 <code>localnet</code> port attached and at most one regular logical
971 port. This is used to model direct connectivity to an existing
972 network.
1a76c93e
RB
973 </p>
974 </column>
975
976 <column name="options">
c0281929 977 <p>
1a76c93e
RB
978 This column provides key/value settings specific to the logical port
979 <ref column="type"/>.
c0281929
RB
980 </p>
981
982 <p>
983 When <ref column="type"/> is set to <code>localnet</code>, you must set
984 the option <code>network_name</code>. <code>ovn-controller</code> uses
985 the configuration entry <code>ovn-bridge-mappings</code> to determine
986 how to connect to this network. <code>ovn-bridge-mappings</code> is a
987 list of network names mapped to a local OVS bridge that provides access
988 to that network. An example of configuring
989 <code>ovn-bridge-mappings</code> would be:
990 </p>
991
992 <p>
993 <code>$ ovs-vsctl set open
994 . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</code>
995 </p>
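<p>
The structure of the <code>ovn-bridge-mappings</code> value above can be
sketched in Python as a mapping from network name to bridge name. The function
<code>parse_bridge_mappings</code> is hypothetical and its error handling is an
assumption; it only illustrates the comma-separated
<var>network</var>:<var>bridge</var> layout and does not reflect how
<code>ovn-controller</code> parses the setting.
</p>

```python
def parse_bridge_mappings(value):
    """Parse "net1:br1,net2:br2" into a {network_name: bridge} dict.

    Hypothetical helper for illustration; not part of OVN.
    """
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            raise ValueError("malformed bridge mapping: %r" % entry)
        mappings[network] = bridge
    return mappings


mappings = parse_bridge_mappings("physnet1:br-eth0,physnet2:br-eth1")
# A localnet port whose network_name option is "physnet1" would be reached
# through the bridge stored under that key.
print(mappings["physnet1"])
```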
996
997 <p>
998 Also note that when a logical switch has a <code>localnet</code> port
999 attached, every chassis that may have a local vif attached to that
1000 logical switch must have a bridge mapping configured to reach that
1001 <code>localnet</code>. Traffic that arrives on a <code>localnet</code>
1002 port is never forwarded over a tunnel to another chassis.
1003 </p>
1a76c93e
RB
1004 </column>
1005
eb00399e
BP
1006 <column name="tunnel_key">
1007 <p>
5868eb24
BP
1008 A number that represents the logical port in the key (e.g. STT key or
1009 Geneve TLV) field carried within tunnel protocol packets.
eb00399e
BP
1010 </p>
1011
1012 <p>
5868eb24 1013 The tunnel ID must be unique within the scope of a logical datapath.
eb00399e
BP
1014 </p>
1015 </column>
1016
9fb4636f
GS
1017 <column name="parent_port">
1018 For containers created inside a VM, this is taken from
1019 <ref table="Logical_Port" column="parent_name" db="OVN_Northbound"/>
1020 in the OVN_Northbound database's <ref table="Logical_Port"
1021 db="OVN_Northbound"/> table. It is left empty if
1022 <ref column="logical_port"/> belongs to a VM or a container created
1023 in the hypervisor.
1024 </column>
1025
1026 <column name="tag">
1027 When <ref column="logical_port"/> identifies the interface of a container
1028 spawned inside a VM, this column identifies the VLAN tag in
1029 the network traffic associated with that container's network interface.
1030 It is left empty if <ref column="logical_port"/> belongs to a VM or a
1031 container created in the hypervisor.
fe36184b
BP
1032 </column>
1033
1034 <column name="chassis">
1035 The physical location of the logical port. To successfully identify a
71332231 1036 chassis, this column must be a <ref table="Chassis"/> record. This is
9fb4636f 1037 populated by <code>ovn-controller</code>.
fe36184b
BP
1038 </column>
1039
1040 <column name="mac">
1041 <p>
1042 The Ethernet address or addresses used as a source address on the
1043 logical port, each in the form
1044 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1045 The string <code>unknown</code> is also allowed to indicate that the
1046 logical port has an unknown set of (additional) source addresses.
1047 </p>
1048
1049 <p>
1050 A VM interface would ordinarily have a single Ethernet address. A
1051 gateway port might initially only have <code>unknown</code>, and then
1052 add MAC addresses to the set as it learns new source addresses.
1053 </p>
1054 </column>
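<p>
A consumer of the <ref column="mac"/> column has to distinguish concrete
Ethernet addresses from the special string <code>unknown</code>. The Python
sketch below shows one way to do that; the function name
<code>classify_macs</code> and the choice to reject anything else with an
exception are assumptions made for this example.
</p>

```python
import re

# Six colon-separated pairs of hex digits, e.g. "00:11:22:aa:bb:cc".
MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def classify_macs(entries):
    """Split column entries into (known addresses, has_unknown).

    Hypothetical helper for illustration; not part of OVN.
    """
    addresses = []
    has_unknown = False
    for entry in entries:
        if entry == "unknown":
            has_unknown = True
        elif MAC_RE.match(entry):
            addresses.append(entry)
        else:
            raise ValueError("not an Ethernet address: %r" % entry)
    return addresses, has_unknown


# A VM interface would ordinarily carry one address; a gateway port might
# carry "unknown" alongside the addresses it has learned so far.
print(classify_macs(["00:11:22:aa:bb:cc"]))
print(classify_macs(["unknown", "00:11:22:aa:bb:cc"]))
```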
1055 </table>
1056</database>