git.proxmox.com Git - ovs.git/blame - ovn/ovn-sb.xml
travis: Update datapath target kernel list.
Commit Line Data
fe36184b 1<?xml version="1.0" encoding="utf-8"?>
ec78987f 2<database name="ovn-sb" title="OVN Southbound Database">
fe36184b
BP
3 <p>
4 This database holds logical and physical configuration and state for the
5 Open Virtual Network (OVN) system to support virtual network abstraction.
6 For an introduction to OVN, please see <code>ovn-architecture</code>(7).
7 </p>
8
9 <p>
ec78987f
JP
10 The OVN Southbound database sits at the center of the OVN
11 architecture. It is the one component that speaks both southbound
12 directly to all the hypervisors and gateways, via
88058f19
AW
13 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and
14 northbound to the Cloud Management System, via <code>ovn-northd</code>.
fe36184b
BP
15 </p>
16
17 <h2>Database Structure</h2>
18
19 <p>
0bac7164 20 The OVN Southbound database contains classes of data with
ec78987f 21 different properties, as described in the sections below.
fe36184b
BP
22 </p>
23
24 <h3>Physical Network (PN) data</h3>
25
26 <p>
27 PN tables contain information about the chassis nodes in the system. This
28 includes all the information necessary to wire the overlay, such as IP
29 addresses, supported tunnel types, and security keys.
30 </p>
31
32 <p>
33 The amount of PN data is small (O(n) in the number of chassis) and it
34 changes infrequently, so it can be replicated to every chassis.
35 </p>
36
37 <p>
62fdd819 38 The <ref table="Chassis"/> table comprises the PN tables.
fe36184b
BP
39 </p>
40
41 <h3>Logical Network (LN) data</h3>
42
43 <p>
44 LN tables contain the topology of logical switches and routers, ACLs,
45 firewall rules, and everything needed to describe how packets traverse a
46 logical network, represented as logical datapath flows (see Logical
47 Datapath Flows, below).
48 </p>
49
50 <p>
51 LN data may be large (O(n) in the number of logical ports, ACL rules,
52 etc.). Thus, to improve scaling, each chassis should receive only data
53 related to logical networks in which that chassis participates. Past
54 experience shows that in the presence of large logical networks, even
55 finer-grained partitioning of data, e.g. designing logical flows so that
56 only the chassis hosting a logical port needs related flows, pays off
57 scale-wise. (This is not necessary initially but it is worth bearing in
58 mind in the design.)
59 </p>
60
61 <p>
62 The LN is a slave of the cloud management system running northbound of OVN.
63 That CMS determines the entire OVN logical configuration and therefore the
64 LN's content at any given time is a deterministic function of the CMS's
09986f8c
JP
65 configuration, although that happens indirectly via the
66 <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>.
fe36184b
BP
67 </p>
68
69 <p>
70 LN data is likely to change more quickly than PN data. This is especially
71 true in a container environment where VMs are created and destroyed (and
72 therefore added to and deleted from logical switches) quickly.
73 </p>
74
75 <p>
5868eb24
BP
76 <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN
77 data.
fe36184b
BP
78 </p>
79
0bac7164 80 <h3>Logical-physical bindings</h3>
fe36184b
BP
81
82 <p>
0bac7164 83 These tables link logical and physical components. They show the current
5868eb24
BP
84 placement of logical components (such as VMs and VIFs) onto chassis, and
85 map logical entities to the values that represent them in tunnel
86 encapsulations.
fe36184b
BP
87 </p>
88
89 <p>
0bac7164 90 These tables change frequently, at least every time a VM powers up or down
fe36184b
BP
91 or migrates, and especially quickly in a container environment. The
92 amount of data per VM (or VIF) is small.
93 </p>
94
95 <p>
96 Each chassis is authoritative about the VMs and VIFs that it hosts at any
97 given time and can efficiently flood that state to a central location, so
98 the consistency needs are minimal.
99 </p>
100
101 <p>
5868eb24
BP
102 The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables
103 contain binding data.
fe36184b
BP
104 </p>
105
0bac7164
BP
106 <h3>MAC bindings</h3>
107
108 <p>
109 The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses
110 to Ethernet addresses that are dynamically discovered using ARP (for IPv4)
111 and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual
112 machines are statically populated into the <ref table="Port_Binding"/>
113 table, so <ref table="MAC_Binding"/> is primarily used to discover bindings
114 on physical networks.
115 </p>
116
5868eb24
BP
117 <h2>Common Columns</h2>
118
119 <p>
120 Some tables contain a special column named <code>external_ids</code>. This
121 column has the same form and purpose each place that it appears, so we
122 describe it here to save space later.
123 </p>
124
125 <dl>
126 <dt><code>external_ids</code>: map of string-string pairs</dt>
127 <dd>
128 Key-value pairs for use by the software that manages the OVN Southbound
88058f19
AW
129 database rather than by
130 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In
131 particular, <code>ovn-northd</code> can use key-value pairs in this
132 column to relate entities in the southbound database to higher-level
133 entities (such as entities in the OVN Northbound database). Individual
134 key-value pairs in this column may be documented in some cases to aid
135 in understanding and troubleshooting, but the reader should not mistake
136 such documentation as comprehensive.
5868eb24
BP
137 </dd>
138 </dl>
139
fe36184b
BP
140 <table name="Chassis" title="Physical Network Hypervisor and Gateway Information">
141 <p>
142 Each row in this table represents a hypervisor or gateway (a chassis) in
143 the physical network (PN). Each chassis, via
88058f19
AW
144 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds
145 and updates its own row, and keeps a copy of the remaining rows to
146 determine how to reach other hypervisors.
fe36184b
BP
147 </p>
148
149 <p>
150 When a chassis shuts down gracefully, it should remove its own row.
151 (This is not critical because resources hosted on the chassis are equally
152 unreachable regardless of whether the row is present.) If a chassis
153 shuts down permanently without removing its row, some kind of manual or
154 automatic cleanup is eventually needed; we can devise a process for that
155 as necessary.
156 </p>
157
158 <column name="name">
fc26cf25
RB
159 OVN does not prescribe a particular format for chassis names.
160 ovn-controller populates this column using <ref key="system-id"
161 table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/>
162 in the Open_vSwitch database's <ref table="Open_vSwitch"
163 db="Open_vSwitch"/> table. ovn-controller-vtep populates this
164 column with <ref table="Physical_Switch" column="name"
165 db="hardware_vtep"/> in the hardware_vtep database's
166 <ref table="Physical_Switch" db="hardware_vtep"/> table.
fe36184b
BP
167 </column>
168
2229f3ec
RB
169 <column name="hostname">
170 The hostname of the chassis, if applicable. ovn-controller will populate
171 this column with the hostname of the host it is running on.
172 ovn-controller-vtep will leave this column empty.
173 </column>
174
4250ee37
RB
175 <column name="external_ids" key="ovn-bridge-mappings">
176 <code>ovn-controller</code> populates this key with the set of bridge
177 mappings it has been configured to use. Other applications should treat
178 this key as read-only. See <code>ovn-controller</code>(8) for more
179 information.
180 </column>
181
1cef5fff
RB
182 <group title="Common Columns">
183 The overall purpose of these columns is described under <code>Common
184 Columns</code> at the beginning of this document.
185
186 <column name="external_ids"/>
187 </group>
188
09db214c 189 <group title="Encapsulation Configuration">
fe36184b 190 <p>
09db214c
JP
191 OVN uses encapsulation to transmit logical dataplane packets
192 between chassis.
fe36184b
BP
193 </p>
194
09db214c
JP
195 <column name="encaps">
196 Points to supported encapsulation configurations to transmit
197 logical dataplane packets to this chassis. Each entry is a <ref
198 table="Encap"/> record that describes the configuration.
fe36184b
BP
199 </column>
200 </group>
201
62fdd819
AW
202 <group title="Gateway Configuration">
203 <p>
204 A <dfn>gateway</dfn> is a chassis that forwards traffic between the
205 OVN-managed part of a logical network and a physical VLAN, extending a
206 tunnel-based logical network into a physical network. Gateways are
88058f19
AW
207 typically dedicated nodes that do not host VMs and will be controlled
208 by <code>ovn-controller-vtep</code>.
fe36184b
BP
209 </p>
210
62fdd819 211 <column name="vtep_logical_switches">
88058f19
AW
212 Stores all VTEP logical switch names connected by this gateway
213 chassis. The <ref table="Port_Binding"/> table entry with
214 <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code>
215 equal to <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and
216 <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code>
217 value in <ref table="Chassis"/>
218 <ref column="vtep_logical_switches" table="Chassis"/>, will be
219 associated with this <ref table="Chassis"/>.
fe36184b 220 </column>
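<p>
The association rule above can be sketched as a small predicate. This is only
an illustration under stated assumptions: the function name
<code>is_bound_to_chassis</code> and its arguments are hypothetical, and the
options are modeled as a plain Python dictionary rather than an OVSDB map.
</p>

```python
def is_bound_to_chassis(port_options, chassis_name, vtep_logical_switches):
    """Return True if a Port_Binding row's options select this chassis.

    port_options          -- dict standing in for Port_Binding "options"
    chassis_name          -- the Chassis "name" column
    vtep_logical_switches -- the Chassis "vtep_logical_switches" column
    """
    return (
        port_options.get("vtep-physical-switch") == chassis_name
        and port_options.get("vtep-logical-switch") in vtep_logical_switches
    )


# Hypothetical gateway chassis "br-vtep" exposing two VTEP logical switches.
example = is_bound_to_chassis(
    {"vtep-physical-switch": "br-vtep", "vtep-logical-switch": "ls0"},
    "br-vtep",
    {"ls0", "ls1"},
)
```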
62fdd819 221 </group>
fe36184b
BP
222 </table>
223
09db214c
JP
224 <table name="Encap" title="Encapsulation Types">
225 <p>
226 The <ref column="encaps" table="Chassis"/> column in the <ref
227 table="Chassis"/> table refers to rows in this table to identify
228 how OVN may transmit logical dataplane packets to this chassis.
88058f19
AW
229 Each chassis, via <code>ovn-controller</code>(8) or
230 <code>ovn-controller-vtep</code>(8), adds and updates its own rows
231 and keeps a copy of the remaining rows to determine how to reach
232 other chassis.
09db214c
JP
233 </p>
234
235 <column name="type">
236 The encapsulation to use to transmit packets to this chassis.
b705f9ea
JP
237 Hypervisors must use either <code>geneve</code> or
238 <code>stt</code>. Gateways may use <code>vxlan</code>,
239 <code>geneve</code>, or <code>stt</code>.
09db214c
JP
240 </column>
241
242 <column name="options">
243 Options for configuring the encapsulation, e.g. IPsec parameters when
244 IPsec support is introduced. No options are currently defined.
245 </column>
246
247 <column name="ip">
248 The IPv4 address of the encapsulation tunnel endpoint.
249 </column>
250 </table>
251
ea382567
RB
252 <table name="Address_Set" title="Address Sets">
253 <p>
254 See the documentation for the <ref table="Address_Set"
255 db="OVN_Northbound"/> table in the <ref db="OVN_Northbound"/> database
256 for details.
257 </p>
258
259 <column name="name"/>
260 <column name="addresses"/>
261 </table>
262
5868eb24 263 <table name="Logical_Flow" title="Logical Network Flows">
fe36184b 264 <p>
09986f8c
JP
265 Each row in this table represents one logical flow.
266 <code>ovn-northd</code> populates this table with logical flows
267 that implement the L2 and L3 topologies specified in the
268 <ref db="OVN_Northbound"/> database. Each hypervisor, via
269 <code>ovn-controller</code>, translates the logical flows into
270 OpenFlow flows specific to its hypervisor and installs them into
271 Open vSwitch.
fe36184b
BP
272 </p>
273
274 <p>
275 Logical flows are expressed in an OVN-specific format, described here. A
276 logical datapath flow is much like an OpenFlow flow, except that the
277 flows are written in terms of logical ports and logical datapaths instead
278 of physical ports and physical datapaths. Translation between logical
279 and physical flows helps to ensure isolation between logical datapaths.
09986f8c
JP
280 (The logical flow abstraction also allows the OVN centralized
281 components to do less work, since they do not have to separately
282 compute and push out physical flows to each chassis.)
fe36184b
BP
283 </p>
284
285 <p>
286 The default action when no flow matches is to drop packets.
287 </p>
288
69a832cf 289 <p><em>Architectural Logical Life Cycle of a Packet</em></p>
5868eb24
BP
290
291 <p>
292 The following description focuses on the life cycle of a packet through
293 a logical datapath, ignoring physical details of the implementation.
69a832cf 294 Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in
5868eb24
BP
295 <code>ovn-architecture</code>(7) for the physical information.
296 </p>
297
298 <p>
299 The description here is written as if OVN itself executes these steps,
300 but in fact OVN (that is, <code>ovn-controller</code>) programs Open
301 vSwitch, via OpenFlow and OVSDB, to execute them on its behalf.
302 </p>
303
304 <p>
305 At a high level, OVN passes each packet through the logical datapath's
306 logical ingress pipeline, which may output the packet to one or more
307 logical ports or logical multicast groups. For each such logical output
308 port, OVN passes the packet through the datapath's logical egress
309 pipeline, which may either drop the packet or deliver it to the
310 destination. Between the two pipelines, outputs to logical multicast
311 groups are expanded into logical ports, so that the egress pipeline only
312 processes a single logical output port at a time. Between the two
313 pipelines is also where, when necessary, OVN encapsulates a packet in a
314 tunnel (or tunnels) to transmit to remote hypervisors.
315 </p>
316
317 <p>
318 In more detail, to start, OVN searches the <ref table="Logical_Flow"/>
319 table for a row with correct <ref column="logical_datapath"/>, a <ref
320 column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/>
321 of 0, and a <ref column="match"/> that is true for the packet. If none
322 is found, OVN drops the packet. If OVN finds more than one, it chooses
323 the match with the highest <ref column="priority"/>. Then OVN executes
324 each of the actions specified in the row's <ref column="actions"/> column,
325 in the order specified. Some actions, such as those to modify packet
326 headers, require no further details. The <code>next</code> and
327 <code>output</code> actions are special.
328 </p>
329
330 <p>
331 The <code>next</code> action causes the above process to be repeated
332 recursively, except that OVN searches for <ref column="table_id"/> of 1
333 instead of 0. Similarly, any <code>next</code> action in a row found in
334 that table would cause a further search for a <ref column="table_id"/> of
335 2, and so on. When recursive processing completes, flow control returns
336 to the action following <code>next</code>.
337 </p>
338
339 <p>
340 The <code>output</code> action also introduces recursion. Its effect
341 depends on the current value of the <code>outport</code> field. Suppose
342 <code>outport</code> designates a logical port. First, OVN compares
343 <code>inport</code> to <code>outport</code>; if they are equal, it treats
344 the <code>output</code> as a no-op. In the common case, where they are
345 different, the packet enters the egress pipeline. This transition to the
78aab811 346 egress pipeline discards register data, e.g. <code>reg0</code> ...
cc5e28d8 347 <code>reg9</code> and connection tracking state, to achieve
78aab811
JP
348 uniform behavior regardless of whether the egress pipeline is on a
349 different hypervisor (because registers aren't preserved across
350 tunnel encapsulation).
5868eb24
BP
351 </p>
352
353 <p>
354 To execute the egress pipeline, OVN again searches the <ref
355 table="Logical_Flow"/> table for a row with correct <ref
356 column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref
357 column="match"/> that is true for the packet, but now looking for a <ref
358 column="pipeline"/> of <code>egress</code>. If no matching row is found,
359 the output becomes a no-op. Otherwise, OVN executes the actions for the
360 matching flow (which is chosen from multiple, if necessary, as already
361 described).
362 </p>
363
364 <p>
365 In the <code>egress</code> pipeline, the <code>next</code> action acts as
366 already described, except that it, of course, searches for
367 <code>egress</code> flows. The <code>output</code> action, however, now
368 directly outputs the packet to the output port (which is now fixed,
369 because <code>outport</code> is read-only within the egress pipeline).
370 </p>
371
372 <p>
373 The description earlier assumed that <code>outport</code> referred to a
374 logical port. If it instead designates a logical multicast group, then
375 the description above still applies, with the addition of fan-out from
376 the logical multicast group to each logical port in the group. For each
377 member of the group, OVN executes the logical pipeline as described, with
378 the logical output port replaced by the group member.
379 </p>
380
8d6e5516
JP
381 <p><em>Pipeline Stages</em></p>
382
383 <p>
384 <code>ovn-northd</code> is responsible for populating the
385 <ref table="Logical_Flow"/> table, so the stages are an
386 implementation detail and subject to change. This section
387 describes the current logical flow table.
388 </p>
389
390 <p>
391 The ingress pipeline consists of the following stages:
392 </p>
393 <ul>
394 <li>
395 Port Security (Table 0): Validates the source address, drops
396 packets with a VLAN tag, and, if configured, verifies that the
397 logical port is allowed to send with the source address.
398 </li>
399
400 <li>
401 L2 Destination Lookup (Table 1): Forwards known unicast
402 addresses to the appropriate logical port. Unicast packets to
403 unknown hosts are forwarded to logical ports configured with the
404 special <code>unknown</code> MAC address. Broadcast and
405 multicast packets are flooded to all ports in the logical switch.
406 </li>
407 </ul>
408
409 <p>
410 The egress pipeline consists of the following stages:
411 </p>
412 <ul>
413 <li>
414 ACL (Table 0): Applies any specified access control lists.
415 </li>
416
417 <li>
418 Port Security (Table 1): If configured, verifies that the
419 logical port is allowed to receive packets with the destination
420 address.
421 </li>
422 </ul>
423
747b2a45 424 <column name="logical_datapath">
5868eb24
BP
425 The logical datapath to which the logical flow belongs.
426 </column>
427
428 <column name="pipeline">
429 <p>
430 The primary flows used for deciding on a packet's destination are the
431 <code>ingress</code> flows. The <code>egress</code> flows implement
432 ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details.
433 </p>
747b2a45
BP
434 </column>
435
fe36184b
BP
436 <column name="table_id">
437 The stage in the logical pipeline, analogous to an OpenFlow table number.
438 </column>
439
440 <column name="priority">
441 The flow's priority. Flows with numerically higher priority take
442 precedence over those with lower. If two logical datapath flows with the
443 same priority both match, then the one actually applied to the packet is
444 undefined.
445 </column>
446
447 <column name="match">
448 <p>
449 A matching expression. OVN provides a superset of OpenFlow matching
450 capabilities, using a syntax similar to Boolean expressions in a
451 programming language.
452 </p>
453
454 <p>
fa6aeaeb
RB
455 The most important components of a match expression are
456 <dfn>comparisons</dfn> between <dfn>symbols</dfn> and
457 <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>,
458 <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type ==
459 0x800</code>. The logical AND operator <code>&amp;&amp;</code> and
460 logical OR operator <code>||</code> can combine comparisons into a
461 larger expression.
fe36184b
BP
462 </p>
463
fe36184b 464 <p>
e0840f11
BP
465 Matching expressions also support parentheses for grouping, the logical
466 NOT prefix operator <code>!</code>, and literals <code>0</code> and
467 <code>1</code> to express ``false'' or ``true,'' respectively. The
468 latter is useful by itself as a catch-all expression that matches every
469 packet.
fe36184b
BP
470 </p>
471
e0840f11 472 <p><em>Symbols</em></p>
fe36184b
BP
473
474 <p>
fa6aeaeb
RB
475 <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn>
476 type. Integer symbols have a <dfn>width</dfn> in bits.
fe36184b
BP
477 </p>
478
479 <p>
fa6aeaeb 480 <em>Kinds</em>. There are three kinds of symbols:
fe36184b
BP
481 </p>
482
e0840f11 483 <ul>
fa6aeaeb
RB
484 <li>
485 <p>
486 <dfn>Fields</dfn>. A field symbol represents a packet header or
487 metadata field. For example, a field
488 named <code>vlan.tci</code> might represent the VLAN TCI field in a
489 packet.
490 </p>
491
492 <p>
493 A field symbol can have integer or string type. Integer fields can
494 be nominal or ordinal (see <em>Level of Measurement</em>,
495 below).
496 </p>
497 </li>
498
499 <li>
500 <p>
501 <dfn>Subfields</dfn>. A subfield represents a subset of bits from
502 a larger field. For example, a field <code>vlan.vid</code> might
503 be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields
504 are provided for syntactic convenience, because it is always
505 possible to instead refer to a subset of bits from a field
506 directly.
507 </p>
508
509 <p>
510 Only ordinal fields (see <em>Level of Measurement</em>,
511 below) may have subfields. Subfields are always ordinal.
512 </p>
513 </li>
514
515 <li>
516 <p>
517 <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean
518 expression. Predicates may be used much like 1-bit fields. For
519 example, <code>ip4</code> might expand to <code>eth.type ==
520 0x800</code>. Predicates are provided for syntactic convenience,
521 because it is always possible to instead specify the underlying
522 expression directly.
523 </p>
524
525 <p>
526 A predicate whose expansion refers to any nominal field or
527 predicate (see <em>Level of Measurement</em>, below) is nominal;
528 other predicates have Boolean level of measurement.
529 </p>
530 </li>
e0840f11
BP
531 </ul>
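<p>
As an illustration of subfield notation, the sketch below extracts a range of
bits from an integer field, in the way <code>vlan.tci[0..11]</code> selects
the VLAN ID. It assumes that bit 0 is the least-significant bit and that
ranges are inclusive; the helper name <code>subfield</code> and the sample TCI
value are hypothetical.
</p>

```python
from typing import Optional


def subfield(value: int, lo: int, hi: Optional[int] = None) -> int:
    """Return bits lo..hi (inclusive, bit 0 least significant) of value.

    With a single index, e.g. subfield(x, 12), return that one bit.
    """
    if hi is None:
        hi = lo
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


tci = 0xB064                      # sample TCI: PCP 5, bit 12 set, VID 100
vid = subfield(tci, 0, 11)        # like vlan.tci[0..11]
pcp = subfield(tci, 13, 15)       # like vlan.tci[13..15]
bit12 = subfield(tci, 12)         # like vlan.tci[12]
```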
532
fe36184b 533 <p>
fa6aeaeb
RB
534 <em>Level of Measurement</em>. See
535 http://en.wikipedia.org/wiki/Level_of_measurement for the statistical
536 concept on which this classification is based. There are three
537 levels:
fe36184b
BP
538 </p>
539
540 <ul>
fa6aeaeb
RB
541 <li>
542 <p>
543 <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered
544 on a scale. OVN considers a field (or subfield) to be ordinal if
545 its bits can be examined individually. This is true for the
546 OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.''
547 </p>
548
549 <p>
550 Any use of an ordinal field may specify a single bit or a range of
551 bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field
552 within the VLAN TCI, and <code>eth.dst[40]</code> refers to the
553 multicast bit in the Ethernet destination address.
554 </p>
555
556 <p>
557 OVN supports all the usual arithmetic relations (<code>==</code>,
558 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
559 <code>&gt;</code>, and <code>&gt;=</code>) on ordinal fields and
560 their subfields, because OVN can implement these in OpenFlow and
561 Open vSwitch as collections of bitwise tests.
562 </p>
563 </li>
564
565 <li>
566 <p>
567 <dfn>Nominal</dfn>. In statistics, nominal values cannot be
568 usefully compared except for equality. OpenFlow
569 port numbers, Ethernet types, and IP protocols are examples: all of
570 these are just identifiers assigned arbitrarily with no deeper
571 meaning. In OpenFlow and Open vSwitch, bits in these fields
572 generally aren't individually addressable.
573 </p>
574
575 <p>
576 OVN only supports arithmetic tests for equality on nominal fields,
577 because OpenFlow and Open vSwitch provide no way for a flow to
578 efficiently implement other comparisons on them. (A test for
579 inequality can be sort of built out of two flows with different
580 priorities, but OVN matching expressions always generate flows with
581 a single priority.)
582 </p>
583
584 <p>
585 String fields are always nominal.
586 </p>
587 </li>
588
589 <li>
590 <p>
591 <dfn>Boolean</dfn>. A nominal field that has only two values, 0
592 and 1, is somewhat exceptional, since it is easy to support both
593 equality and inequality tests on such a field: either one can be
594 implemented as a test for 0 or 1.
595 </p>
596
597 <p>
598 Only predicates (see above) have a Boolean level of measurement.
599 </p>
600
601 <p>
602 This isn't a standard level of measurement.
603 </p>
604 </li>
fe36184b
BP
605 </ul>
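<p>
To make the phrase ``collections of bitwise tests'' concrete, the sketch below
shows one well-known way to express <code>field &lt; N</code> on an ordinal
field as a set of masked equality matches. It is not claimed to be the
decomposition OVN actually uses; the function names are hypothetical.
</p>

```python
def less_than_as_masked_matches(n: int, width: int):
    """Return (value, mask) pairs such that, for any width-bit x,
    x < n  if and only if  (x & mask) == value for at least one pair.

    For each bit i that is 1 in n, x is smaller when it agrees with n on
    all bits above i and has a 0 in bit i.
    """
    full = (1 << width) - 1
    pairs = []
    for i in range(width):
        if n & (1 << i):
            mask = full & ~((1 << i) - 1)      # bits i..width-1
            value = (n & mask) & ~(1 << i)     # same high bits, bit i cleared
            pairs.append((value, mask))
    return pairs


def matches_any(x: int, pairs) -> bool:
    """Evaluate the disjunction of masked matches against x."""
    return any((x & mask) == value for value, mask in pairs)


# Example: tcp.src < 1024 on a 16-bit field needs a single masked match.
pairs_1024 = less_than_as_masked_matches(1024, 16)
```

<p>
A nominal field offers no per-bit masks to the flow table, which is why the
same trick is unavailable there.
</p>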
606
607 <p>
fa6aeaeb
RB
608 <em>Prerequisites</em>. Any symbol can have prerequisites, which are
609 additional conditions implied by the use of the symbol. For example,
610 the <code>icmp4.type</code> symbol might have prerequisite
611 <code>icmp4</code>, which would cause an expression <code>icmp4.type ==
612 0</code> to be interpreted as <code>icmp4.type == 0 &amp;&amp;
613 icmp4</code>, which would in turn expand to <code>icmp4.type == 0
614 &amp;&amp; eth.type == 0x800 &amp;&amp; ip.proto == 1</code> (assuming
615 <code>icmp4</code> is a predicate defined as suggested under
616 <em>Types</em> above).
fe36184b
BP
617 </p>
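<p>
Prerequisite expansion can be sketched as a recursive rewrite. The tables
<code>PREREQS</code> and <code>PREDICATES</code> below are hypothetical and
hold only the entries needed for the <code>icmp4.type</code> example, with the
predicates defined as in the list of supported predicates; expressions are
modeled as flat lists of conjuncts, which is a simplification.
</p>

```python
# Symbol -> predicate that must also hold when the symbol is used.
PREREQS = {"icmp4.type": "icmp4"}

# Predicate -> conjuncts it stands for.
PREDICATES = {
    "icmp4": ["ip4", "ip.proto == 1"],
    "ip4": ["eth.type == 0x800"],
}


def expand(term):
    """Expand one term into a flat list of conjuncts without predicates."""
    if term in PREDICATES:
        conjuncts = []
        for part in PREDICATES[term]:
            conjuncts.extend(expand(part))
        return conjuncts
    symbol = term.split()[0]          # "icmp4.type == 0" -> "icmp4.type"
    conjuncts = [term]
    if symbol in PREREQS:
        conjuncts.extend(expand(PREREQS[symbol]))
    return conjuncts


expanded = " && ".join(expand("icmp4.type == 0"))
```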
618
e0840f11
BP
619 <p><em>Relational operators</em></p>
620
fe36184b 621 <p>
fa6aeaeb
RB
622 All of the standard relational operators <code>==</code>,
623 <code>!=</code>, <code>&lt;</code>, <code>&lt;=</code>,
624 <code>&gt;</code>, and <code>&gt;=</code> are supported. Nominal
625 fields support only <code>==</code> and <code>!=</code>, and only in a
626 positive sense when outer <code>!</code> are taken into account,
627 e.g. given string field <code>inport</code>, <code>inport ==
628 "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but
629 not <code>inport != "eth0"</code>.
fe36184b
BP
630 </p>
631
632 <p>
fa6aeaeb
RB
633 The implementation of <code>==</code> (or <code>!=</code> when it is
634 negated), is more efficient than that of the other relational
635 operators.
fe36184b
BP
636 </p>
637
e0840f11
BP
638 <p><em>Constants</em></p>
639
fe36184b 640 <p>
e0840f11
BP
641 Integer constants may be expressed in decimal, hexadecimal prefixed by
642 <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in
643 their standard forms, or Ethernet addresses as colon-separated hex
644 digits. A constant in any of these forms may be followed by a slash
645 and a second constant (the mask) in the same form, to form a masked
646 constant. IPv4 and IPv6 masks may be given as integers, to express
647 CIDR prefixes.
648 </p>
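<p>
The two ways of writing an IPv4 mask can be illustrated with Python's standard
<code>ipaddress</code> module. The helper names below are hypothetical, and
the sketch covers only IPv4 constants, not the full constant syntax.
</p>

```python
import ipaddress


def parse_ip4_constant(text):
    """Parse 'ADDR', 'ADDR/PREFIX' or 'ADDR/MASK' into (value, mask) integers."""
    addr, _, mask = text.partition("/")
    value = int(ipaddress.IPv4Address(addr))
    if not mask:
        mask_int = 0xFFFFFFFF                       # no mask: exact match
    elif mask.isdigit():
        prefix = int(mask)                          # CIDR prefix length
        mask_int = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    else:
        mask_int = int(ipaddress.IPv4Address(mask)) # dotted-quad mask
    return value, mask_int


def masked_match(field_value, constant):
    """Compare only the bits selected by the constant's mask."""
    value, mask = constant
    return (field_value & mask) == (value & mask)


cidr_form = parse_ip4_constant("10.0.0.0/8")
mask_form = parse_ip4_constant("10.0.0.0/255.0.0.0")
```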
649
650 <p>
651 String constants have the same syntax as quoted strings in JSON (thus,
5868eb24 652 they are Unicode strings).
fe36184b
BP
653 </p>
654
655 <p>
e0840f11
BP
656 Some operators support sets of constants written inside curly braces
657 <code>{</code> ... <code>}</code>. Commas between elements of a set,
658 and after the last element, are optional. With <code>==</code>,
659 ``<code><var>field</var> == { <var>constant1</var>,
660 <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar
661 for ``<code><var>field</var> == <var>constant1</var> ||
662 <var>field</var> == <var>constant2</var> || </code>...<code></code>''.
663 Similarly, ``<code><var>field</var> != { <var>constant1</var>,
664 <var>constant2</var>, </code>...<code> }</code>'' is equivalent to
665 ``<code><var>field</var> != <var>constant1</var> &amp;&amp;
fe36184b 666 <var>field</var> != <var>constant2</var> &amp;&amp;
e0840f11 667 </code>...<code></code>''.
fe36184b
BP
668 </p>
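<p>
The set shorthand can be sketched as a textual rewrite. The helpers below are
hypothetical and work on strings only; they illustrate the optional commas and
the choice of joining operator, not the real parser.
</p>

```python
import re


def parse_set(text):
    """Split '{ a, b c, }' into its elements; commas are optional."""
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a set in curly braces")
    return [tok for tok in re.split(r"[\s,]+", body[1:-1]) if tok]


def expand_set(field, op, constants):
    """Rewrite 'field op {c1, c2, ...}' without the set notation.

    '==' becomes a disjunction; '!=' becomes a conjunction.
    """
    joiner = {"==": " || ", "!=": " && "}[op]
    return joiner.join(f"{field} {op} {c}" for c in constants)


ports = parse_set("{ 80, 443 8080, }")
any_of = expand_set("tcp.dst", "==", ports)
none_of = expand_set("tcp.dst", "!=", ports)
```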
669
ea382567
RB
670 <p>
671 You may refer to a set of IPv4, IPv6, or MAC addresses stored in the
672 <ref table="Address_Set"/> table by its <ref column="name"
673 table="Address_Set"/>. An <ref table="Address_Set"/> with a name
674 of <code>set1</code> can be referred to as
675 <code>$set1</code>.
676 </p>
677
e0840f11
BP
678 <p><em>Miscellaneous</em></p>
679
fe36184b 680 <p>
fa6aeaeb
RB
681 Comparisons may name the symbol or the constant first,
682 e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both
683 acceptable.
fe36184b
BP
684 </p>
685
686 <p>
fa6aeaeb
RB
687 Tests for a range may be expressed using a syntax like <code>1024 &lt;=
688 tcp.src &lt;= 49151</code>, which is equivalent to <code>1024 &lt;=
689 tcp.src &amp;&amp; tcp.src &lt;= 49151</code>.
fe36184b
BP
690 </p>
691
692 <p>
fa6aeaeb
RB
693 For a one-bit field or predicate, a mention of its name is equivalent
694 to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code>
695 is equivalent to <code>vlan.present == 1</code>. The same is true for
696 one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no
697 technical limitation to implementing the same for ordinal fields of all
698 widths, but the implementation is expensive enough that the syntax
699 parser requires writing an explicit comparison against zero to make
700 mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison
701 against 0 is required.
fe36184b
BP
702 </p>
703
704 <p>
fa6aeaeb
RB
705 <em>Operator precedence</em> is as shown below, from highest to lowest.
706 There are two exceptions where parentheses are required even though the
707 table would suggest that they are not: <code>&amp;&amp;</code> and
708 <code>||</code> require parentheses when used together, and
709 <code>!</code> requires parentheses when applied to a relational
710 expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd)
711 &amp;&amp; ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the
712 parentheses are mandatory.
fe36184b
BP
713 </p>
714
e0840f11
BP
715 <ul>
716 <li><code>()</code></li>
717 <li><code>== != &lt; &lt;= &gt; &gt;=</code></li>
718 <li><code>!</code></li>
719 <li><code>&amp;&amp; ||</code></li>
720 </ul>
721
10b1662b
BP
722 <p>
723 <em>Comments</em> may be introduced by <code>//</code>, which extends
724 to the next new-line. Comments within a line may be bracketed by
725 <code>/*</code> and <code>*/</code>. Multiline comments are not
726 supported.
727 </p>
728
e0840f11
BP
729 <p><em>Symbols</em></p>
730
5868eb24
BP
731 <p>
732 Most of the symbols below have integer type. Only <code>inport</code>
733 and <code>outport</code> have string type. <code>inport</code> names a
734 logical port. Thus, its value is a <ref column="logical_port"/> name
62fdd819
AW
735 from the <ref table="Port_Binding"/> table. <code>outport</code> may
736 name a logical port, as <code>inport</code>, or a logical multicast
737 group defined in the <ref table="Multicast_Group"/> table. For both
738 symbols, only names within the flow's logical datapath may be used.
5868eb24
BP
739 </p>
740
394e883d
JP
741 <p>
742 The <code>reg</code><var>X</var> symbols are 32-bit integers.
743 The <code>xxreg</code><var>X</var> symbols are 128-bit integers,
744 which overlay four of the 32-bit registers: <code>xxreg0</code>
745 overlays <code>reg0</code> through <code>reg3</code>, with
746 <code>reg0</code> supplying the most-significant bits of
747 <code>xxreg0</code> and <code>reg3</code> the least-significant.
748 <code>xxreg1</code> similarly overlays <code>reg4</code> through
749 <code>reg7</code>.
750 </p>
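<p>
The register overlay can be checked with a little integer arithmetic. The
helper names below are hypothetical; the sketch only restates the layout
described above, with the first register supplying the most-significant bits.
</p>

```python
def xxreg_from_regs(regs):
    """Combine four 32-bit registers, most significant first, into 128 bits.

    For xxreg0 pass [reg0, reg1, reg2, reg3]; for xxreg1, [reg4, ..., reg7].
    """
    if len(regs) != 4:
        raise ValueError("an xxreg overlays exactly four 32-bit registers")
    value = 0
    for reg in regs:
        value = (value << 32) | (reg & 0xFFFFFFFF)
    return value


def regs_from_xxreg(value):
    """Split a 128-bit value back into four 32-bit registers."""
    return [(value >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


xxreg0 = xxreg_from_regs([0x00000001, 0, 0, 0x00000002])
```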
751
e0840f11 752 <ul>
cc5e28d8 753 <li><code>reg0</code>...<code>reg9</code></li>
394e883d 754 <li><code>xxreg0</code> <code>xxreg1</code></li>
5868eb24 755 <li><code>inport</code> <code>outport</code></li>
e0840f11
BP
756 <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li>
757 <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li>
758 <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li>
759 <li><code>ip4.src</code> <code>ip4.dst</code></li>
760 <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li>
761 <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li>
762 <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li>
763 <li><code>udp.src</code> <code>udp.dst</code></li>
764 <li><code>sctp.src</code> <code>sctp.dst</code></li>
765 <li><code>icmp4.type</code> <code>icmp4.code</code></li>
766 <li><code>icmp6.type</code> <code>icmp6.code</code></li>
767 <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li>
e3d81ade 768 <li><code>ct_mark</code> <code>ct_label</code></li>
78aab811
JP
769 <li>
770 <p>
771 <code>ct_state</code>, which has the following Boolean subfields:
772 </p>
773 <ul>
774 <li><code>ct.new</code>: True for a new flow</li>
775 <li><code>ct.est</code>: True for an established flow</li>
776 <li><code>ct.rel</code>: True for a related flow</li>
777 <li><code>ct.rpl</code>: True for a reply flow</li>
778 <li><code>ct.inv</code>: True for a connection entry in a bad state</li>
779 </ul>
780 <p>
781 <code>ct_state</code> and its subfields are initialized by the
782 <code>ct_next</code> action, described below.
783 </p>
784 </li>
e0840f11
BP
785 </ul>
786
25030d47
RB
787 <p>
788 The following predicates are supported:
789 </p>
790
791 <ul>
a2011117
BP
792 <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li>
793 <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li>
25030d47
RB
794 <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li>
795 <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li>
a2011117 796 <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li>
25030d47
RB
797 <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li>
798 <li><code>ip</code> expands to <code>ip4 || ip6</code></li>
799 <li><code>icmp4</code> expands to <code>ip4 &amp;&amp; ip.proto == 1</code></li>
800 <li><code>icmp6</code> expands to <code>ip6 &amp;&amp; ip.proto == 58</code></li>
801 <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li>
802 <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li>
803 <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li>
804 <li><code>ip.first_frag</code> expands to <code>ip.is_frag &amp;&amp; !ip.later_frag</code></li>
805 <li><code>arp</code> expands to <code>eth.type == 0x806</code></li>
806 <li><code>nd</code> expands to <code>icmp6.type == {135, 136} &amp;&amp; icmp6.code == 0</code></li>
807 <li><code>tcp</code> expands to <code>ip.proto == 6</code></li>
808 <li><code>udp</code> expands to <code>ip.proto == 17</code></li>
809 <li><code>sctp</code> expands to <code>ip.proto == 132</code></li>
810 </ul>
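<p>
  As an illustration of how the subfield-based predicates read, the Python
  sketch below evaluates <code>eth.mcast</code> and <code>ip4.mcast</code> on
  textual addresses. The helper names are hypothetical, and the sketch assumes
  that bit 0 is the least-significant bit of a field, which is consistent
  with the expansions listed above.
</p>

```python
import ipaddress


def subfield(value, low, high):
    """Return bits low..high (inclusive) of value; bit 0 is least significant."""
    width = high - low + 1
    return (value // 2**low) % 2**width


def eth_mcast(mac):
    """Model of eth.mcast, i.e. eth.dst[40]: the multicast bit of a MAC."""
    value = int(mac.replace(":", ""), 16)
    return subfield(value, 40, 40) == 1


def ip4_mcast(addr):
    """Model of ip4.mcast, i.e. ip4.dst[28..31] == 0xe (224.0.0.0/4)."""
    value = int(ipaddress.IPv4Address(addr))
    return subfield(value, 28, 31) == 0xE
```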
fe36184b
BP
811 </column>
812
813 <column name="actions">
814 <p>
2cd87fce
RB
815 Logical datapath actions, to be executed when the logical flow
816 represented by this row is the highest-priority match.
fe36184b
BP
817 </p>
818
35060cdc 819 <p>
2cd87fce
RB
820 Actions share lexical syntax with the <ref column="match"/> column. An
821 empty set of actions (or one that contains just white space or
822 comments), or a set of actions that consists of just
823 <code>drop;</code>, causes the matched packets to be dropped.
824 Otherwise, the column should contain a sequence of actions, each
825 terminated by a semicolon.
35060cdc 826 </p>
fe36184b 827
35060cdc 828 <p>
eee7a8ed 829 The following actions are defined:
35060cdc 830 </p>
fe36184b 831
35060cdc
BP
832 <dl>
833 <dt><code>output;</code></dt>
834 <dd>
5868eb24 835 <p>
eee7a8ed
JP
836 In the ingress pipeline, this action executes the
837 <code>egress</code> pipeline as a subroutine. If
838 <code>outport</code> names a logical port, the egress pipeline
839 executes once; if it is a multicast group, the egress pipeline runs
840 once for each logical port in the group.
5868eb24
BP
841 </p>
842
843 <p>
844 In the egress pipeline, this action performs the actual
845 output to the <code>outport</code> logical port. (In the egress
846 pipeline, <code>outport</code> never names a multicast group.)
847 </p>
848
849 <p>
850 Output to the input port is implicitly dropped, that is,
851 <code>output</code> becomes a no-op if <code>outport</code> ==
b4970837
BP
852 <code>inport</code>. Occasionally it may be useful to override
853 this behavior, e.g. to send an ARP reply to an ARP request; to do
854 so, use <code>inport = "";</code> to set the logical input port to
855 an empty string (which should not be used as the name of any
856 logical port).
5868eb24 857 </p>
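<p>
  The delivery rules above can be modeled roughly as follows. This Python
  sketch is an assumption-laden illustration, not OVN code: the function name
  <code>egress_targets</code> and the dictionary of multicast groups are
  hypothetical, and the input-port check is applied to each port that the
  egress pipeline would run for.
</p>

```python
def egress_targets(inport, outport, groups):
    """Return the logical ports that would actually receive the packet.

    groups maps a multicast group name to its list of logical ports.
    Any outport that is not a group name is treated as a single port.
    """
    # A multicast group runs the egress pipeline once per member port.
    members = groups.get(outport, [outport])
    # Output back to the input port is implicitly dropped.
    return [port for port in members if port != inport]
```

<p>
  Passing an empty string as <code>inport</code> models the
  <code>inport = "";</code> override: no port name equals the empty string,
  so nothing is filtered out.
</p>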
eee7a8ed 858 </dd>
fe36184b 859
35060cdc 860 <dt><code>next;</code></dt>
558ec83d 861 <dt><code>next(<var>table</var>);</code></dt>
35060cdc 862 <dd>
558ec83d
BP
863 Executes another logical datapath table as a subroutine. By default,
864 the table after the current one is executed. Specify
865 <var>table</var> to jump to a specific table in the same pipeline.
2cd87fce 866 </dd>
fe36184b 867
35060cdc
BP
868 <dt><code><var>field</var> = <var>constant</var>;</code></dt>
869 <dd>
5868eb24 870 <p>
5ee054fb
BP
871 Sets data or metadata field <var>field</var> to constant value
872 <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the
873 logical output port. To set only a subset of bits in a field,
874 specify a subfield for <var>field</var> or a masked
875 <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code>
876 or <code>vlan.pcp = 4/4;</code> to set the most significant bit of
877 the VLAN PCP.
5868eb24
BP
878 </p>
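<p>
  A masked constant of the form <var>value</var>/<var>mask</var> changes only
  the bits that are 1 in the mask. The Python sketch below models that rule
  for a field of a given width; the function name <code>masked_assign</code>
  is hypothetical, and plain arithmetic is used in place of bitwise
  operators.
</p>

```python
def masked_assign(old, value, mask, width):
    """Return old with the bits selected by mask replaced by those of value."""
    new = 0
    for bit in range(width):
        weight = 2**bit
        # Take this bit from value where the mask bit is 1, else keep old.
        source = value if (mask // weight) % 2 else old
        new += ((source // weight) % 2) * weight
    return new
```

<p>
  For the 3-bit VLAN PCP, <code>masked_assign(0b011, 4, 4, 3)</code> returns
  <code>0b111</code>: only bit 2 is set, and the two low bits keep their
  previous values.
</p>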
879
880 <p>
881 Assigning to a field with prerequisites implicitly adds those
882 prerequisites to <ref column="match"/>; thus, for example, a flow
883 that sets <code>tcp.dst</code> applies only to TCP flows,
884 regardless of whether its <ref column="match"/> mentions any TCP
885 field.
886 </p>
887
888 <p>
889 Not all fields are modifiable (e.g. <code>eth.type</code> and
890 <code>ip.proto</code> are read-only), and not all modifiable fields
891 may be partially modified (e.g. <code>ip.ttl</code> must be assigned
892 as a whole). The <code>outport</code> field is modifiable in the
893 <code>ingress</code> pipeline but not in the <code>egress</code>
894 pipeline.
895 </p>
eee7a8ed 896 </dd>
5ee054fb
BP
897
898 <dt><code><var>field1</var> = <var>field2</var>;</code></dt>
899 <dd>
900 <p>
901 Sets data or metadata field <var>field1</var> to the value of data
902 or metadata field <var>field2</var>, e.g. <code>reg0 =
903 ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>.
904 To modify only a subset of a field's bits, specify a subfield for
905 <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp
906 = reg0[0..2];</code> copies the least-significant bits of
907 <code>reg0</code> into the VLAN PCP.
908 </p>
909
910 <p>
911 <var>field1</var> and <var>field2</var> must be the same type,
912 either both string or both integer fields. If they are both
913 integer fields, they must have the same width.
914 </p>
915
916 <p>
917 If <var>field1</var> or <var>field2</var> has prerequisites, they
918 are added implicitly to <ref column="match"/>. It is possible to
919 write an assignment with contradictory prerequisites, such as
920 <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means
921 that a logical flow with such an assignment will never be matched.
922 </p>
923 </dd>
a20c96c6
BP
924
925 <dt><code><var>field1</var> &lt;-&gt; <var>field2</var>;</code></dt>
926 <dd>
927 <p>
928 Similar to <code><var>field1</var> = <var>field2</var>;</code>
929 except that the two values are exchanged instead of copied. Both
930 <var>field1</var> and <var>field2</var> must be modifiable.
931 </p>
932 </dd>
78aab811 933
00ea19e4
BP
934 <dt><code>ip.ttl--;</code></dt>
935 <dd>
936 <p>
937 Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero
938 or negative, then processing of the packet halts; no further
939 actions are processed. (To properly handle such cases, a
4c20b9f2
JP
940 higher-priority flow should match on
941 <code>ip.ttl == {0, 1};</code>.)
00ea19e4
BP
942 </p>
943
944 <p><b>Prerequisite:</b> <code>ip</code></p>
945 </dd>
946
78aab811
JP
947 <dt><code>ct_next;</code></dt>
948 <dd>
949 <p>
950 Apply connection tracking to the flow, initializing
951 <code>ct_state</code> for matching in later tables.
952 Automatically moves on to the next table, as if followed by
953 <code>next</code>.
954 </p>
955
956 <p>
957 As a side effect, IP fragments will be reassembled for matching.
958 If a fragmented packet is output, then it will be sent with any
959 overlapping fragments squashed. The connection tracking state is
960 scoped by the logical port, so overlapping addresses may be used.
961 To allow traffic related to the matched flow, execute
962 <code>ct_commit</code>.
963 </p>
964
965 <p>
966 It is possible to have actions follow <code>ct_next</code>,
967 but they will not have access to any of its side effects, so
968 doing so is not generally useful.
969 </p>
970 </dd>
971
972 <dt><code>ct_commit;</code></dt>
a9e1b66f
RB
973 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>);</code></dt>
974 <dt><code>ct_commit(ct_label=<var>value[/mask]</var>);</code></dt>
975 <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>, ct_label=<var>value[/mask]</var>);</code></dt>
78aab811 976 <dd>
c4623bb8 977 <p>
a9e1b66f
RB
978 Commit the flow to the connection tracking entry associated with it
979 by a previous call to <code>ct_next</code>. When
980 <code>ct_mark=<var>value[/mask]</var></code> and/or
981 <code>ct_label=<var>value[/mask]</var></code> are supplied,
982 <code>ct_mark</code> and/or <code>ct_label</code> will be set to the
983 values indicated by <var>value[/mask]</var> on the connection
984 tracking entry. <code>ct_mark</code> is a 32-bit field.
354b8f27
NS
985 <code>ct_label</code> is a 128-bit field. The <var>value[/mask]</var>
986 should be specified as a hexadecimal string if more than 64 bits are to be used.
c4623bb8 987 </p>
a9e1b66f 988
c4623bb8
RB
989 <p>
990 Note that if you want processing to continue in the next table,
991 you must execute the <code>next</code> action after
a9e1b66f
RB
992 <code>ct_commit</code>. You may also leave out <code>next</code>
993 which will commit connection tracking state, and then drop the
994 packet. This could be useful for setting <code>ct_mark</code>
995 on a connection tracking entry before dropping a packet,
996 for example.
c4623bb8 997 </p>
78aab811 998 </dd>
fe36184b 999
de297547
GS
1000 <dt><code>ct_dnat;</code></dt>
1001 <dt><code>ct_dnat(<var>IP</var>);</code></dt>
1002 <dd>
1003 <p>
1004 <code>ct_dnat</code> sends the packet through the DNAT zone in the
1005 connection tracking table to unDNAT any packet that was DNATed in
1006 the opposite direction. The packet is then automatically sent
1007 to the next tables as if followed by a <code>next;</code> action.
1008 The next tables will see the changes in the packet caused by
1009 the connection tracker.
1010 </p>
1011 <p>
1012 <code>ct_dnat(<var>IP</var>)</code> sends the packet through the
1013 DNAT zone to change the destination IP address of the packet to
467085fd 1014 the one provided inside the parentheses and commits the connection.
de297547
GS
1015 The packet is then automatically sent to the next tables as if
1016 followed by a <code>next;</code> action. The next tables will see
1017 the changes in the packet caused by the connection tracker.
1018 </p>
1019 </dd>
1020
1021 <dt><code>ct_snat;</code></dt>
1022 <dt><code>ct_snat(<var>IP</var>);</code></dt>
1023 <dd>
1024 <p>
1025 <code>ct_snat</code> sends the packet through the SNAT zone to
1026 unSNAT any packet that was SNATed in the opposite direction. If
1027 the packet needs to be sent to the next tables, then it should be
1028 followed by a <code>next;</code> action. The next tables will not
1029 see the changes in the packet caused by the connection tracker.
1030 </p>
1031 <p>
1032 <code>ct_snat(<var>IP</var>)</code> sends the packet through the
1033 SNAT zone to change the source IP address of the packet to
1034 the one provided inside the parentheses and commits the connection.
1035 The packet is then automatically sent to the next tables as if
1036 followed by a <code>next;</code> action. The next tables will see the
1037 changes in the packet caused by the connection tracker.
1038 </p>
1039 </dd>
1040
69a832cf
BP
1041 <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt>
1042 <dd>
1043 <p>
1044 Temporarily replaces the IPv4 packet being processed by an ARP
1045 packet and executes each nested <var>action</var> on the ARP
1046 packet. Actions following the <var>arp</var> action, if any, apply
1047 to the original, unmodified packet.
1048 </p>
1049
1050 <p>
1051 The ARP packet that this action operates on is initialized based on
1052 the IPv4 packet being processed, as follows. These are default
1053 values that the nested actions will probably want to change:
1054 </p>
1055
1056 <ul>
1057 <li><code>eth.src</code> unchanged</li>
1058 <li><code>eth.dst</code> unchanged</li>
1059 <li><code>eth.type = 0x0806</code></li>
1060 <li><code>arp.op = 1</code> (ARP request)</li>
1061 <li><code>arp.sha</code> copied from <code>eth.src</code></li>
1062 <li><code>arp.spa</code> copied from <code>ip4.src</code></li>
1063 <li><code>arp.tha = 00:00:00:00:00:00</code></li>
1064 <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li>
1065 </ul>
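<p>
  The default initialization listed above can be sketched in Python as a
  mapping from the IPv4 packet's fields to the ARP packet's fields.
  Representing a packet as a dictionary keyed by symbol name, and the function
  name <code>arp_defaults</code>, are assumptions made only for this
  illustration.
</p>

```python
def arp_defaults(ip_packet):
    """Build the default ARP request fields from an IPv4 packet's fields."""
    return {
        "eth.src": ip_packet["eth.src"],   # unchanged
        "eth.dst": ip_packet["eth.dst"],   # unchanged
        "eth.type": 0x0806,                # ARP
        "arp.op": 1,                       # ARP request
        "arp.sha": ip_packet["eth.src"],   # copied from eth.src
        "arp.spa": ip_packet["ip4.src"],   # copied from ip4.src
        "arp.tha": "00:00:00:00:00:00",
        "arp.tpa": ip_packet["ip4.dst"],   # copied from ip4.dst
    }
```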
1066
6335d074
BP
1067 <p>
1068 The ARP packet has the same VLAN header, if any, as the IP packet
1069 it replaces.
1070 </p>
1071
69a832cf
BP
1072 <p><b>Prerequisite:</b> <code>ip4</code></p>
1073 </dd>
1074
e75451fe
ZKL
1075 <dt>
1076 <code>na { <var>action</var>; </code>...<code> };</code>
1077 </dt>
1078
1079 <dd>
1080 <p>
1081 Temporarily replaces the IPv6 packet being processed by an IPv6
1082 neighbor advertisement (NA) packet and executes each nested
1083 <var>action</var> on the NA packet. Actions following the
1084 <var>na</var> action, if any, apply to the original, unmodified
1085 packet.
1086 </p>
1087
1088 <p>
1089 The NA packet that this action operates on is initialized based on
1090 the IPv6 packet being processed, as follows. These are default
1091 values that the nested actions will probably want to change:
1092 </p>
1093
1094 <ul>
1095 <li><code>eth.dst</code> exchanged with <code>eth.src</code></li>
1096 <li><code>eth.type = 0x86dd</code></li>
1097 <li><code>ip6.dst</code> copied from <code>ip6.src</code></li>
1098 <li><code>ip6.src</code> copied from <code>nd.target</code></li>
1099 <li><code>icmp6.type = 136</code> (Neighbor Advertisement)</li>
1100 <li><code>nd.target</code> unchanged</li>
1101 <li><code>nd.sll = 00:00:00:00:00:00</code></li>
1102 <li><code>nd.tll</code> copied from <code>eth.dst</code></li>
1103 </ul>
1104
1105 <p>
1106 The ND packet has the same VLAN header, if any, as the IPv6 packet
1107 it replaces.
1108 </p>
1109
1110 <p>
1111 <b>Prerequisite:</b> <code>nd</code>
1112 </p>
1113 </dd>
1114
0bac7164
BP
1115 <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt>
1116
1117 <dd>
1118 <p>
1119 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1120 IP address field <var>A</var>.
1121 </p>
1122
1123 <p>
1124 Looks up <var>A</var> in <var>P</var>'s ARP table. If an entry is
1125 found, stores its Ethernet address in <code>eth.dst</code>,
1126 otherwise stores <code>00:00:00:00:00:00</code> in
1127 <code>eth.dst</code>.
1128 </p>
1129
1130 <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p>
1131 </dd>
1132
1133 <dt>
1134 <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code>
1135 </dt>
1136
1137 <dd>
1138 <p>
1139 <b>Parameters</b>: logical port string field <var>P</var>, 32-bit
1140 IP address field <var>A</var>, 48-bit Ethernet address field
1141 <var>E</var>.
1142 </p>
1143
1144 <p>
1145 Adds or updates the entry for IP address <var>A</var> in logical
1146 port <var>P</var>'s ARP table, setting its Ethernet address to
1147 <var>E</var>.
1148 </p>
1149
1150 <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p>
1151 </dd>
42814145
NS
1152
1153 <dt>
1154 <code><var>R</var> = put_dhcp_opts(<code>offerip</code> = <var>IP</var>, <var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code>
1155 </dt>
1156
1157 <dd>
1158 <p>
1159 <b>Parameters</b>: one or more DHCP option/value pairs, the first
1160 of which must set a value for the offered IP, <code>offerip</code>.
1161 </p>
1162
1163 <p>
1164 <b>Result</b>: stored to a 1-bit subfield <var>R</var>.
1165 </p>
1166
1167 <p>
1168 Valid only in the ingress pipeline.
1169 </p>
1170
1171 <p>
1172 When this action is applied to a DHCP request packet (DHCPDISCOVER
1173 or DHCPREQUEST), it changes the packet into a DHCP reply (DHCPOFFER
1174 or DHCPACK, respectively), replaces the options by those specified
1175 as parameters, and stores 1 in <var>R</var>.
1176 </p>
1177
1178 <p>
1179 When this action is applied to a non-DHCP packet or a DHCP packet
1180 that is not DHCPDISCOVER or DHCPREQUEST, it leaves the packet
1181 unchanged and stores 0 in <var>R</var>.
1182 </p>
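<p>
  The two cases above can be modeled with the following Python sketch. It is
  not the OVN implementation: the packet is represented as a dictionary with
  hypothetical keys such as <code>dhcp.msg_type</code>, and the numeric
  message types are the standard DHCP values (1 for DHCPDISCOVER, 2 for
  DHCPOFFER, 3 for DHCPREQUEST, 5 for DHCPACK).
</p>

```python
DHCPDISCOVER, DHCPOFFER, DHCPREQUEST, DHCPACK = 1, 2, 3, 5

# Which reply type answers which request type.
REPLY_FOR = {DHCPDISCOVER: DHCPOFFER, DHCPREQUEST: DHCPACK}


def put_dhcp_opts(packet, offerip, **options):
    """Return (packet, R), where R is the 1-bit result of the action."""
    msg_type = packet.get("dhcp.msg_type")
    if msg_type not in REPLY_FOR:
        # Non-DHCP packet or other DHCP message: leave it unchanged, R = 0.
        return packet, 0
    reply = dict(packet)
    reply["dhcp.msg_type"] = REPLY_FOR[msg_type]
    reply["dhcp.offerip"] = offerip
    # The supplied options replace whatever options the request carried.
    reply["dhcp.options"] = dict(options)
    return reply, 1
```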
1183
1184 <p>
1185 The contents of the <ref table="DHCP_Option"/> table control the
1186 DHCP option names and values that this action supports.
1187 </p>
1188
1189 <p>
1190 <b>Example:</b>
1191 <code>
1192 reg0[0] = put_dhcp_opts(offerip = 10.0.0.2, router = 10.0.0.1,
1193 netmask = 255.255.255.0, dns_server = {8.8.8.8, 7.7.7.7});
1194 </code>
1195 </p>
1196 </dd>
467085fd
GS
1197
1198 <dt><code>ct_lb;</code></dt>
1199 <dt><code>ct_lb(</code><var>ip</var>[<code>:</code><var>port</var>]...<code>);</code></dt>
1200 <dd>
1201 <p>
1202 With one or more arguments, <code>ct_lb</code> commits the packet
1203 to the connection tracking table and DNATs the packet's destination
1204 IP address (and port) to the IP address or addresses (and optional
1205 ports) specified in the string. If multiple comma-separated IP
1206 addresses are specified, each is given equal weight for picking the
1207 DNAT address. Processing automatically moves on to the next table,
1208 as if <code>next;</code> were specified, and later tables act on
1209 the packet as modified by the connection tracker. Connection
1210 tracking state is scoped by the logical port, so overlapping
1211 addresses may be used.
1212 </p>
1213 <p>
1214 Without arguments, <code>ct_lb</code> sends the packet to the
1215 connection tracking table to NAT the packet. If the packet is
1216 part of an established connection that was previously committed to
1217 the connection tracker via <code>ct_lb(</code>...<code>)</code>, it
1218 will automatically get DNATed to the same IP address as the first
1219 packet in that connection.
1220 </p>
1221 </dd>
6335d074
BP
1222 </dl>
1223
1224 <p>
1225 The following actions will likely be useful later, but they have not
1226 been thought out carefully.
1227 </p>
1228
1229 <dl>
69a832cf
BP
1230 <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt>
1231 <dd>
1232 <p>
1233 Temporarily replaces the IPv4 packet being processed by an ICMPv4
1234 packet and executes each nested <var>action</var> on the ICMPv4
1235 packet. Actions following the <var>icmp4</var> action, if any,
1236 apply to the original, unmodified packet.
1237 </p>
1238
1239 <p>
1240 The ICMPv4 packet that this action operates on is initialized based
1241 on the IPv4 packet being processed, as follows. These are default
1242 values that the nested actions will probably want to change.
1243 Ethernet and IPv4 fields not listed here are not changed:
1244 </p>
1245
1246 <ul>
1247 <li><code>ip.proto = 1</code> (ICMPv4)</li>
1248 <li><code>ip.frag = 0</code> (not a fragment)</li>
1249 <li><code>icmp4.type = 3</code> (destination unreachable)</li>
1250 <li><code>icmp4.code = 1</code> (host unreachable)</li>
1251 </ul>
1252
1253 <p>
1254 Details TBD.
1255 </p>
fe36184b 1256
69a832cf
BP
1257 <p><b>Prerequisite:</b> <code>ip4</code></p>
1258 </dd>
1259
1260 <dt><code>tcp_reset;</code></dt>
1261 <dd>
1262 <p>
1263 This action transforms the current TCP packet according to the
1264 following pseudocode:
1265 </p>
1266
1267 <pre>
1268if (tcp.ack) {
1269 tcp.seq = tcp.ack;
1270} else {
1271 tcp.ack = tcp.seq + length(tcp.payload);
1272 tcp.seq = 0;
1273}
1274tcp.flags = RST;
1275</pre>
1276
1277 <p>
1278 Then, the action drops all TCP options and payload data, and
1279 updates the TCP checksum.
1280 </p>
1281
1282 <p>
1283 Details TBD.
1284 </p>
1285
1286 <p><b>Prerequisite:</b> <code>tcp</code></p>
1287 </dd>
fe36184b 1288 </dl>
fe36184b 1289 </column>
091e3af9
JP
1290
1291 <column name="external_ids" key="stage-name">
1292 Human-readable name for this flow's stage in the pipeline.
1293 </column>
1294
1295 <group title="Common Columns">
1296 The overall purpose of these columns is described under <code>Common
1297 Columns</code> at the beginning of this document.
1298
1299 <column name="external_ids"/>
1300 </group>
fe36184b
BP
1301 </table>
1302
5868eb24
BP
1303 <table name="Multicast_Group" title="Logical Port Multicast Groups">
1304 <p>
1305 The rows in this table define multicast groups of logical ports.
1306 Multicast groups allow a single packet transmitted over a tunnel to a
1307 hypervisor to be delivered to multiple VMs on that hypervisor, which
1308 uses bandwidth more efficiently.
1309 </p>
1310
1311 <p>
1312 Each row in this table defines a logical multicast group numbered <ref
1313 column="tunnel_key"/> within <ref column="datapath"/>, whose logical
1314 ports are listed in the <ref column="ports"/> column.
1315 </p>
1316
1317 <column name="datapath">
1318 The logical datapath in which the multicast group resides.
1319 </column>
1320
1321 <column name="tunnel_key">
1322 The value used to designate this logical egress port in tunnel
1323 encapsulations. An index forces the key to be unique within the <ref
1324 column="datapath"/>. The unusual range ensures that multicast group IDs
1325 do not overlap with logical port IDs.
1326 </column>
1327
1328 <column name="name">
1329 <p>
1330 The logical multicast group's name. An index forces the name to be
1331 unique within the <ref column="datapath"/>. Logical flows in the
1332 ingress pipeline may output to the group just as for individual logical
1333 ports, by assigning the group's name to <code>outport</code> and
1334 executing an <code>output</code> action.
1335 </p>
1336
1337 <p>
1338 Multicast group names and logical port names share a single namespace
1339 and thus should not overlap (but the database schema cannot enforce
1340 this). To try to avoid conflicts, <code>ovn-northd</code> uses names
1341 that begin with <code>_MC_</code>.
1342 </p>
1343 </column>
1344
1345 <column name="ports">
1346 The logical ports included in the multicast group. All of these ports
1347 must be in the <ref column="datapath"/> logical datapath (but the
1348 database schema cannot enforce this).
1349 </column>
1350 </table>
1351
1352 <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings">
1353 <p>
1354 Each row in this table identifies physical bindings of a logical
1355 datapath. A logical datapath implements a logical pipeline among the
1356 ports in the <ref table="Port_Binding"/> table associated with it. In
1357 practice, the pipeline in a given logical datapath implements either a
1358 logical switch or a logical router.
1359 </p>
1360
1361 <column name="tunnel_key">
1362 The tunnel key value to which the logical datapath is bound.
1363 The <code>Tunnel Encapsulation</code> section in
1364 <code>ovn-architecture</code>(7) describes how tunnel keys are
1365 constructed for each supported encapsulation.
1366 </column>
1367
9975d7be
BP
1368 <group title="OVN_Northbound Relationship">
1369 <p>
1370 Each row in <ref table="Datapath_Binding"/> is associated with some
1371 logical datapath. <code>ovn-northd</code> uses these keys to track the
1372 association of a logical datapath with concepts in the <ref
1373 db="OVN_Northbound"/> database.
1374 </p>
1375
1376 <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'>
1377 For a logical datapath that represents a logical switch,
1378 <code>ovn-northd</code> stores in this key the UUID of the
1379 corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in
1380 the <ref db="OVN_Northbound"/> database.
1381 </column>
1382
1383 <column name="external_ids" key="logical-router" type='{"type": "uuid"}'>
1384 For a logical datapath that represents a logical router,
1385 <code>ovn-northd</code> stores in this key the UUID of the
1386 corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in
1387 the <ref db="OVN_Northbound"/> database.
1388 </column>
1389 </group>
5868eb24
BP
1390
1391 <group title="Common Columns">
1392 The overall purpose of these columns is described under <code>Common
1393 Columns</code> at the beginning of this document.
1394
1395 <column name="external_ids"/>
1396 </group>
1397 </table>
1398
dcda6e0d 1399 <table name="Port_Binding" title="Physical-Logical Port Bindings">
fe36184b 1400 <p>
d387d24d
BP
1401 Most rows in this table identify the physical location of a logical port.
1402 (The exceptions are logical patch ports, which do not have any physical
1403 location.)
fe36184b
BP
1404 </p>
1405
1406 <p>
80f408f4
JP
1407 For every <code>Logical_Switch_Port</code> record in the
1408 <code>OVN_Northbound</code> database, <code>ovn-northd</code>
1409 creates a record in this table. <code>ovn-northd</code> populates
1410 and maintains every column except the <code>chassis</code> column,
1411 which it leaves empty in new records.
9fb4636f
GS
1412 </p>
1413
1414 <p>
88058f19
AW
1415 <code>ovn-controller</code>/<code>ovn-controller-vtep</code>
1416 populates the <code>chassis</code> column for the records that
1417 identify the logical ports that are located on its hypervisor/gateway,
1418 which <code>ovn-controller</code>/<code>ovn-controller-vtep</code> in
1419 turn finds out by monitoring the local hypervisor's Open_vSwitch
1420 database, which identifies logical ports via the conventions described
c1645003
GS
1421 in <code>IntegrationGuide.md</code>. (The exceptions are for
1422 <code>Port_Binding</code> records with <code>type</code> of
1423 <code>gateway</code>, whose locations are identified by
1424 <code>ovn-northd</code> via the <code>options:gateway-chassis</code>
1425 column in this table. <code>ovn-controller</code> is still responsible
1426 for populating the <code>chassis</code> column.)
9fb4636f
GS
1427 </p>
1428
1429 <p>
5868eb24 1430 When a chassis shuts down gracefully, it should clean up the
9fb4636f 1431 <code>chassis</code> column that it previously had populated.
fe36184b
BP
1432 (This is not critical because resources hosted on the chassis are equally
1433 unreachable regardless of whether their rows are present.) To handle the
1434 case where a VM is shut down abruptly on one chassis, then brought up
88058f19
AW
1435 again on a different one,
1436 <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must
1437 overwrite the <code>chassis</code> column with new information.
fe36184b
BP
1438 </p>
1439
c96ba502
BP
1440 <group title="Core Features">
1441 <column name="datapath">
1442 The logical datapath to which the logical port belongs.
1443 </column>
1a76c93e 1444
c96ba502 1445 <column name="logical_port">
80f408f4
JP
1446 A logical port, taken from <ref table="Logical_Switch_Port"
1447 column="name" db="OVN_Northbound"/> in the OVN_Northbound
1448 database's <ref table="Logical_Switch_Port" db="OVN_Northbound"/>
1449 table. OVN does not prescribe a particular format for the
1450 logical port ID.
c96ba502 1451 </column>
c0281929 1452
c96ba502 1453 <column name="chassis">
184bc3ca
RB
1454 The meaning of this column depends on the value of the <ref column="type"/>
1455 column. This is the meaning for each <ref column="type"/>:
1456
1457 <dl>
1458 <dt>(empty string)</dt>
1459 <dd>
1460 The physical location of the logical port. To successfully identify a
1461 chassis, this column must be a <ref table="Chassis"/> record. This is
1462 populated by <code>ovn-controller</code>.
1463 </dd>
1464
1465 <dt>vtep</dt>
1466 <dd>
1467 The physical location of the hardware_vtep gateway. To successfully
1468 identify a chassis, this column must be a <ref table="Chassis"/> record.
1469 This is populated by <code>ovn-controller-vtep</code>.
1470 </dd>
1471
1472 <dt>localnet</dt>
1473 <dd>
1474 Always empty. A localnet port is realized on every chassis that has
1475 connectivity to the corresponding physical network.
1476 </dd>
1477
1478 <dt>gateway</dt>
1479 <dd>
1480 The physical location of the L3 gateway. To successfully identify a
1481 chassis, this column must be a <ref table="Chassis"/> record. This is
1482 populated by <code>ovn-controller</code> based on the value of
1483 the <code>options:gateway-chassis</code> column in this table.
1484 </dd>
1485
1486 <dt>l2gateway</dt>
1487 <dd>
1488 The physical location of this L2 gateway. To successfully identify a
1489 chassis, this column must be a <ref table="Chassis"/> record.
62b87eab
NS
1490 This is populated by <code>ovn-controller</code> based on the value
1491 of the <code>options:l2gateway-chassis</code> column in this table.
184bc3ca
RB
1492 </dd>
1493 </dl>
1494
c96ba502 1495 </column>
c0281929 1496
c96ba502
BP
1497 <column name="tunnel_key">
1498 <p>
1499 A number that represents the logical port in the key (e.g. STT key or
1500 Geneve TLV) field carried within tunnel protocol packets.
1501 </p>
c0281929 1502
c96ba502
BP
1503 <p>
1504 The tunnel ID must be unique within the scope of a logical datapath.
1505 </p>
1506 </column>
88058f19 1507
c96ba502
BP
1508 <column name="mac">
1509 <p>
1510 The Ethernet address or addresses used as a source address on the
1511 logical port, each in the form
1512 <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>.
1513 The string <code>unknown</code> is also allowed to indicate that the
1514 logical port has an unknown set of (additional) source addresses.
1515 </p>
1516
1517 <p>
1518 A VM interface would ordinarily have a single Ethernet address. A
1519 gateway port might initially only have <code>unknown</code>, and then
1520 add MAC addresses to the set as it learns new source addresses.
1521 </p>
1522 </column>
88058f19 1523
c96ba502
BP
1524 <column name="type">
1525 <p>
1526 A type for this logical port. Logical ports can be used to model other
1527 types of connectivity into an OVN logical switch. The following types
1528 are defined:
1529 </p>
1530
1531 <dl>
1532 <dt>(empty string)</dt>
1533 <dd>VM (or VIF) interface.</dd>
d387d24d
BP
1534
1535 <dt><code>patch</code></dt>
1536 <dd>
1537 One of a pair of logical ports that act as if connected by a patch
1538 cable. Useful for connecting two logical datapaths, e.g. to connect
1539 a logical router to a logical switch or to another logical router.
1540 </dd>
1541
c1645003
GS
1542 <dt><code>gateway</code></dt>
1543 <dd>
1544 One of a pair of logical ports that act as if connected by a patch
1545 cable across multiple chassis. Useful for connecting a logical
1546 switch with a Gateway router (which is only resident on a
1547 particular chassis).
1548 </dd>
1549
c96ba502
BP
1550 <dt><code>localnet</code></dt>
1551 <dd>
1552 A connection to a locally accessible network from each
1553 <code>ovn-controller</code> instance. A logical switch can only
6e6c3f91
HZ
1554 have a single <code>localnet</code> port attached. This is used
1555 to model direct connectivity to an existing network.
c96ba502
BP
1556 </dd>
1557
184bc3ca
RB
1558 <dt><code>l2gateway</code></dt>
1559 <dd>
1560 An L2 connection to a physical network. The chassis this
1561 <ref table="Port_Binding"/> is bound to will serve as
1562 an L2 gateway to the network named by
1563 <ref column="options" table="Port_Binding"/>:<code>network_name</code>.
1564 </dd>
1565
c96ba502
BP
1566 <dt><code>vtep</code></dt>
1567 <dd>
1568 A port to a logical switch on a VTEP gateway chassis. In order to
1569 get this port correctly recognized by the OVN controller, the <ref
1570 column="options"
1571 table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref
1572 column="options"
1573 table="Port_Binding"/>:<code>vtep-logical-switch</code> must also
1574 be defined.
1575 </dd>
1576 </dl>
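<p>
  As an illustration only, the option keys that this table marks as
  "Required" for each port type can be collected into a small lookup. The
  helper name <code>missing_options</code> and the dictionary shape are
  assumptions made for this sketch; they are not part of OVN.
</p>

```python
# Option keys that the option groups of this table mark as "Required",
# keyed by Port_Binding type. Types not listed have no key marked Required.
REQUIRED_OPTIONS = {
    "localnet": {"network_name"},
    "l2gateway": {"network_name", "l2gateway-chassis"},
    "vtep": {"vtep-physical-switch", "vtep-logical-switch"},
}


def missing_options(port_type, options):
    """Return the sorted list of required option keys absent from `options`.

    `options` is a plain dict standing in for the `options` column
    (hypothetical representation for this sketch).
    """
    required = REQUIRED_OPTIONS.get(port_type, set())
    return sorted(required - set(options))


if __name__ == "__main__":
    # A vtep port that names only the physical switch is incomplete.
    print(missing_options("vtep", {"vtep-physical-switch": "br-vtep"}))
```

<p>
  A check like this could run before a row is written, so that an incomplete
  <code>vtep</code> or <code>l2gateway</code> port is reported early.
</p>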
1577 </column>
1578 </group>
1579
1580 <group title="Patch Options">
1581 <p>
1582 These options apply to logical ports with <ref column="type"/> of
1583 <code>patch</code>.
1584 </p>
1585
1586 <column name="options" key="peer">
1587 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1588 record for the other side of the patch. The named <ref
1589 column="logical_port"/> must specify this <ref column="logical_port"/>
1590 in its own <code>peer</code> option. That is, the two patch logical
1591 ports must have reversed <ref column="logical_port"/> and
1592 <code>peer</code> values.
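<p>
  The reversed-pair rule can be checked mechanically. The following is a
  minimal sketch, assuming the bindings are available as a dictionary from
  <code>logical_port</code> name to its <code>options</code> map; that
  representation and the function name are hypothetical.
</p>

```python
def unpaired_patch_ports(bindings):
    """Return names of patch ports whose peer does not point back at them.

    `bindings` maps a logical_port name to its options dict. This shape is
    an assumption for the sketch, not the database's actual format.
    """
    bad = []
    for name, options in bindings.items():
        peer = options.get("peer")
        # A missing peer row is treated like a row without a peer option.
        peer_options = bindings.get(peer, {})
        if peer is None or peer_options.get("peer") != name:
            bad.append(name)
    return sorted(bad)


if __name__ == "__main__":
    example = {
        "lrp-a": {"peer": "lsp-a"},
        "lsp-a": {"peer": "lrp-a"},  # correctly reversed pair
        "lrp-b": {"peer": "lsp-b"},
        "lsp-b": {"peer": "lrp-a"},  # points at the wrong port
    }
    print(unpaired_patch_ports(example))
```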
1593 </column>
1594 </group>
1595
1596 <group title="L3 Gateway Options">
1597 <p>
1598 These options apply to logical ports with <ref column="type"/> of
1599 <code>gateway</code>.
1600 </p>
1601
1602 <column name="options" key="peer">
1603 The <ref column="logical_port"/> in the <ref table="Port_Binding"/>
1604 record for the other side of the <code>gateway</code> port. The named <ref
1605 column="logical_port"/> must specify this <ref column="logical_port"/>
1606 in its own <code>peer</code> option. That is, the two <code>gateway</code>
1607 logical ports must have reversed <ref column="logical_port"/> and
1608 <code>peer</code> values.
1609 </column>
1610
1611 <column name="options" key="gateway-chassis">
1612 The <code>chassis</code> in which the port resides.
1613 </column>
1614 </group>
1615
1616 <group title="Localnet Options">
1617 <p>
1618 These options apply to logical ports with <ref column="type"/> of
1619 <code>localnet</code>.
1620 </p>
1621
1622 <column name="options" key="network_name">
1623 Required. <code>ovn-controller</code> uses the configuration entry
1624 <code>ovn-bridge-mappings</code> to determine how to connect to this
1625 network. <code>ovn-bridge-mappings</code> is a list of network names
1626 mapped to a local OVS bridge that provides access to that network. An
1627 example of configuring <code>ovn-bridge-mappings</code> would be:
1628
1629 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1630
1631 <p>
1632 When a logical switch has a <code>localnet</code> port attached,
1633 every chassis that may have a local vif attached to that logical
1634 switch must have a bridge mapping configured to reach that
1635 <code>localnet</code>. Traffic that arrives on a
1636 <code>localnet</code> port is never forwarded over a tunnel to
1637 another chassis.
1638 </p>
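<p>
  To make the format of <code>ovn-bridge-mappings</code> concrete, here is a
  small sketch that parses the comma-separated
  <code>network:bridge</code> list shown above into a dictionary. The
  function name is hypothetical, and this is not how
  <code>ovn-controller</code> itself is implemented.
</p>

```python
def parse_bridge_mappings(value):
    """Parse 'net1:br1,net2:br2' into {'net1': 'br1', 'net2': 'br2'}."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty items such as a trailing comma
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            raise ValueError(f"malformed bridge mapping: {entry!r}")
        mappings[network] = bridge
    return mappings


if __name__ == "__main__":
    mappings = parse_bridge_mappings("physnet1:br-eth0,physnet2:br-eth1")
    # A chassis can reach a localnet port only if its network_name is mapped.
    print(mappings.get("physnet1"))
```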
1639 </column>
1640
1641 <column name="tag">
1642 If set, indicates that the port represents a connection to a specific
1643 VLAN on a locally accessible network. The VLAN ID is used to match
1644 incoming traffic and is also added to outgoing traffic.
1645 </column>
1646 </group>
1647
1648 <group title="L2 Gateway Options">
1649 <p>
1650 These options apply to logical ports with <ref column="type"/> of
1651 <code>l2gateway</code>.
1652 </p>
1653
1654 <column name="options" key="network_name">
1655 Required. <code>ovn-controller</code> uses the configuration entry
1656 <code>ovn-bridge-mappings</code> to determine how to connect to this
1657 network. <code>ovn-bridge-mappings</code> is a list of network names
1658 mapped to a local OVS bridge that provides access to that network. An
1659 example of configuring <code>ovn-bridge-mappings</code> would be:
1660
1661 <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre>
1662
1663 <p>
1664 When a logical switch has an <code>l2gateway</code> port attached,
1665 the chassis that the <code>l2gateway</code> port is bound to
1666 must have a bridge mapping configured to reach the network
1667 identified by <code>network_name</code>.
1668 </p>
1669 </column>
1670
1671 <column name="options" key="l2gateway-chassis">
1672 Required. The <code>chassis</code> in which the port resides.
1673 </column>
1674
1675 <column name="tag">
1676 If set, indicates that the gateway is connected to a specific
1677 VLAN on the physical network. The VLAN ID is used to match
1678 incoming traffic and is also added to outgoing traffic.
1679 </column>
1680 </group>
1681
1682 <group title="VTEP Options">
1683 <p>
1684 These options apply to logical ports with <ref column="type"/> of
1685 <code>vtep</code>.
1686 </p>
1687
1688 <column name="options" key="vtep-physical-switch">
1689 Required. The name of the VTEP gateway.
1690 </column>
1691
1692 <column name="options" key="vtep-logical-switch">
1693 Required. A logical switch name connected by the VTEP gateway. Must
1694 be set when <ref column="type"/> is <code>vtep</code>.
1695 </column>
1696 </group>
1697
1698 <group title="VMI (or VIF) Options">
1699 <p>
1700 These options apply to logical ports whose <ref column="type"/> is the
1701 empty string.
1702 </p>
1703
1704 <column name="options" key="policing_rate">
1705 If set, indicates the maximum rate for data sent from this interface,
1706 in kbps. Data exceeding this rate is dropped.
1707 </column>
1708
1709 <column name="options" key="policing_burst">
1710 If set, indicates the maximum burst size for data sent from this
1711 interface, in kb.
1712 </column>
1713 </group>
1714
1715 <group title="Nested Containers">
1716 <p>
1717 These columns support containers nested within a VM. Specifically,
1718 they are used when <ref column="type"/> is empty and <ref
1719 column="logical_port"/> identifies the interface of a container spawned
1720 inside a VM. They are empty for containers or VMs that run directly on
1721 a hypervisor.
1722 </p>
1723
1724 <column name="parent_port">
1725 This is taken from
1726 <ref table="Logical_Switch_Port" column="parent_name"
1727 db="OVN_Northbound"/> in the OVN_Northbound database's
1728 <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table.
1729 </column>
1730
1731 <column name="tag">
1732 <p>
1733 Identifies the VLAN tag in the network traffic associated with that
1734 container's network interface.
1735 </p>
1736
1737 <p>
1738 This column is used for a different purpose when <ref column="type"/>
1739 is <code>localnet</code> (see <code>Localnet Options</code>, above)
1740 or <code>l2gateway</code> (see <code>L2 Gateway Options</code>, above).
1741 </p>
1742 </column>
1743 </group>
1744 </table>
1745
1746 <table name="MAC_Binding" title="IP to MAC bindings">
1747 <p>
1748 Each row in this table specifies a binding from an IP address to an
1749 Ethernet address that has been discovered through ARP (for IPv4) or
1750 neighbor discovery (for IPv6). This table is primarily used to discover
1751 bindings on physical networks, because IP-to-MAC bindings for virtual
1752 machines are usually populated statically into the <ref
1753 table="Port_Binding"/> table.
1754 </p>
1755
1756 <p>
1757 This table expresses a functional relationship: <ref
1758 table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) =
1759 <ref column="mac"/>.
1760 </p>
1761
1762 <p>
1763 In outline, the lifetime of a logical router's MAC binding looks like
1764 this:
1765 </p>
1766
1767 <ol>
1768 <li>
1769 On hypervisor 1, a logical router determines that a packet should be
1770 forwarded to IP address <var>A</var> on one of its router ports. It
1771 uses its logical flow table to determine that <var>A</var> lacks a
1772 static IP-to-MAC binding and the <code>get_arp</code> action to
1773 determine that it lacks a dynamic IP-to-MAC binding.
1774 </li>
1775
1776 <li>
1777 Using an OVN logical <code>arp</code> action, the logical router
1778 generates and sends a broadcast ARP request to the router port. It
1779 drops the IP packet.
1780 </li>
1781
1782 <li>
1783 The logical switch attached to the router port delivers the ARP request
1784 to all of its ports. (It might make sense to deliver it only to ports
1785 that have no static IP-to-MAC bindings, but this could also be
1786 surprising behavior.)
1787 </li>
1788
1789 <li>
1790 A host or VM on hypervisor 2 (which might be the same as hypervisor 1)
1791 attached to the logical switch owns the IP address in question. It
1792 composes an ARP reply and unicasts it to the logical router port's
1793 Ethernet address.
1794 </li>
1795
1796 <li>
1797 The logical switch delivers the ARP reply to the logical router port.
1798 </li>
1799
1800 <li>
1801 The logical router flow table executes a <code>put_arp</code> action.
1802 To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row
1803 to the <ref table="MAC_Binding"/> table.
1804 </li>
1805
1806 <li>
1807 On hypervisor 1, <code>ovn-controller</code> receives the updated <ref
1808 table="MAC_Binding"/> table from the OVN southbound database. The next
1809 packet destined to <var>A</var> through the logical router is sent
1810 directly to the bound Ethernet address.
1811 </li>
1812 </ol>
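<p>
  The lookup order in the steps above (static binding first, then the
  dynamic <ref table="MAC_Binding"/> row, otherwise an ARP request) can be
  modelled as a toy sketch. The class and function names below are
  hypothetical; the methods merely borrow the names of the
  <code>get_arp</code> and <code>put_arp</code> actions, and nothing here
  reflects the real <code>ovn-controller</code> code.
</p>

```python
class MacBindingTable:
    """Toy model of MAC_Binding: (logical_port, ip) maps to exactly one mac."""

    def __init__(self):
        self._rows = {}

    def put_arp(self, logical_port, ip, mac):
        # Functional relationship: a later binding replaces an earlier one.
        self._rows[(logical_port, ip)] = mac

    def get_arp(self, logical_port, ip):
        return self._rows.get((logical_port, ip))


def resolve(static_bindings, dynamic, logical_port, ip):
    """Return the MAC for `ip`, or None when an ARP request is still needed."""
    mac = static_bindings.get((logical_port, ip))
    if mac is None:
        mac = dynamic.get_arp(logical_port, ip)
    return mac


if __name__ == "__main__":
    table = MacBindingTable()
    print(resolve({}, table, "lrp0", "192.0.2.10"))   # no binding yet
    table.put_arp("lrp0", "192.0.2.10", "00:00:5e:00:53:01")
    print(resolve({}, table, "lrp0", "192.0.2.10"))   # binding learned
```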
1813
1814 <column name="logical_port">
1815 The logical port on which the binding was discovered.
1816 </column>
1817
1818 <column name="ip">
1819 The bound IP address.
1820 </column>
1821
1822 <column name="mac">
1823 The Ethernet address to which the IP is bound.
1824 </column>
1825 <column name="datapath">
1826 The logical datapath to which the logical port belongs.
1827 </column>
1828 </table>
1829
1830 <table name="DHCP_Options" title="DHCP Options supported by native OVN DHCP">
1831 <p>
1832 Each row in this table stores the DHCP Options supported by native OVN
1833 DHCP. <code>ovn-northd</code> populates this table with the supported
1834 DHCP options. <code>ovn-controller</code> consults this table to get the
1835 DHCP codes of the DHCP options named in the <code>put_dhcp_opts</code> action.
1836 See RFC 2132 (<code>https://tools.ietf.org/html/rfc2132</code>)
1837 for the list of DHCP options that can be defined here.
1838 </p>
1839
1840 <column name="name">
1841 <p>
1842 Name of the DHCP option.
1843 </p>
1844
1845 <p>
1846 Example. name="router"
1847 </p>
1848 </column>
1849
1850 <column name="code">
1851 <p>
1852 DHCP option code for the DHCP option as defined in RFC 2132.
1853 </p>
1854
1855 <p>
1856 Example. code=3
1857 </p>
1858 </column>
1859
1860 <column name="type">
1861 <p>
1862 Data type of the DHCP option's value.
1863 </p>
1864
1865 <dl>
1866 <dt><code>value: bool</code></dt>
1867 <dd>
1868 <p>
1869 This indicates that the value of the DHCP option is a bool.
1870 </p>
1871
1872 <p>
1873 Example. "name=ip_forward_enable", "code=19", "type=bool".
1874 </p>
1875
1876 <p>
1877 put_dhcp_opts(..., ip_forward_enable = 1,...)
1878 </p>
1879 </dd>
1880
1881 <dt><code>value: uint8</code></dt>
1882 <dd>
1883 <p>
1884 This indicates that the value of the DHCP option is an unsigned
1885 int8 (8 bits).
1886 </p>
1887
1888 <p>
1889 Example. "name=default_ttl", "code=23", "type=uint8".
1890 </p>
1891
1892 <p>
1893 put_dhcp_opts(..., default_ttl = 50,...)
1894 </p>
1895 </dd>
1896
1897 <dt><code>value: uint16</code></dt>
1898 <dd>
1899 <p>
1900 This indicates that the value of the DHCP option is an unsigned
1901 int16 (16 bits).
1902 </p>
1903
1904 <p>
1905 Example. "name=mtu", "code=26", "type=uint16".
1906 </p>
1907
1908 <p>
1909 put_dhcp_opts(..., mtu = 1450,...)
1910 </p>
1911 </dd>
1912
1913 <dt><code>value: uint32</code></dt>
1914 <dd>
1915 <p>
1916 This indicates that the value of the DHCP option is an unsigned
1917 int32 (32 bits).
1918 </p>
1919
1920 <p>
1921 Example. "name=lease_time", "code=51", "type=uint32".
1922 </p>
1923
1924 <p>
1925 put_dhcp_opts(..., lease_time = 86400,...)
1926 </p>
1927 </dd>
1928
1929 <dt><code>value: ipv4</code></dt>
1930 <dd>
1931 <p>
1932 This indicates that the value of the DHCP option is an IPv4
1933 address or addresses.
1934 </p>
1935
1936 <p>
1937 Example. "name=router", "code=3", "type=ipv4".
1938 </p>
1939
1940 <p>
1941 put_dhcp_opts(..., router = 10.0.0.1,...)
1942 </p>
1943
1944 <p>
1945 Example. "name=dns_server", "code=6", "type=ipv4".
1946 </p>
1947
1948 <p>
1949 put_dhcp_opts(..., dns_server = {8.8.8.8 7.7.7.7},...)
1950 </p>
1951 </dd>
1952
1953 <dt><code>value: static_routes</code></dt>
1954 <dd>
1955 <p>
1956 This indicates that the value of the DHCP option contains one or more
1957 pairs of an IPv4 route and its next-hop address.
1958 </p>
1959
1960 <p>
1961 Example. "name=classless_static_route", "code=121", "type=static_routes".
1962 </p>
1963
1964 <p>
1965 put_dhcp_opts(..., classless_static_route = {30.0.0.0/24,10.0.0.4,0.0.0.0/0,10.0.0.1},...)
1966 </p>
1967 </dd>
1968
1969 <dt><code>value: str</code></dt>
1970 <dd>
1971 <p>
1972 This indicates that the value of the DHCP option is a string.
1973 </p>
1974
1975 <p>
1976 Example. "name=host_name", "code=12", "type=str".
1977 </p>
1978 </dd>
1979 </dl>
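<p>
  To show how the <ref column="code"/> and <ref column="type"/> columns fit
  together, the sketch below encodes a value in the generic RFC 2132 option
  layout (code byte, length byte, data). The function names are hypothetical,
  <code>static_routes</code> is deliberately left out, and this is not
  presented as the encoding code used by <code>ovn-controller</code>.
</p>

```python
import ipaddress
import struct


def encode_value(opt_type, value):
    """Encode `value` according to the simple types listed above."""
    if opt_type == "bool":
        return struct.pack("!B", 1 if value else 0)
    if opt_type == "uint8":
        return struct.pack("!B", value)
    if opt_type == "uint16":
        return struct.pack("!H", value)
    if opt_type == "uint32":
        return struct.pack("!I", value)
    if opt_type == "ipv4":
        # Accept either one address or a list of addresses.
        addrs = value if isinstance(value, (list, tuple)) else [value]
        return b"".join(ipaddress.IPv4Address(a).packed for a in addrs)
    if opt_type == "str":
        return value.encode("ascii")
    raise ValueError(f"unsupported type in this sketch: {opt_type!r}")


def encode_option(code, opt_type, value):
    """Return the option as code byte, length byte, then the payload."""
    payload = encode_value(opt_type, value)
    return bytes([code, len(payload)]) + payload


if __name__ == "__main__":
    # Rows from the examples above: mtu (26, uint16) and router (3, ipv4).
    print(encode_option(26, "uint16", 1450).hex())
    print(encode_option(3, "ipv4", "10.0.0.1").hex())
```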
1980 </column>
1981 </table>
1982</database>