<?xml version="1.0" encoding="utf-8"?>
2<manpage program="ovn-northd" section="8" title="ovn-northd">
3 <h1>Name</h1>
4 <p>ovn-northd -- Open Virtual Network central control daemon</p>
5
6 <h1>Synopsis</h1>
7 <p><code>ovn-northd</code> [<var>options</var>]</p>
8
9 <h1>Description</h1>
10 <p>
11 <code>ovn-northd</code> is a centralized daemon responsible for
12 translating the high-level OVN configuration into logical
13 configuration consumable by daemons such as
14 <code>ovn-controller</code>. It translates the logical network
15 configuration in terms of conventional network concepts, taken
16 from the OVN Northbound Database (see <code>ovn-nb</code>(5)),
17 into logical datapath flows in the OVN Southbound Database (see
18 <code>ovn-sb</code>(5)) below it.
19 </p>
20
21 <h1>Options</h1>
22 <dl>
23 <dt><code>--ovnnb-db=<var>database</var></code></dt>
24 <dd>
25 The OVSDB database containing the OVN Northbound Database. If the
26 <env>OVN_NB_DB</env> environment variable is set, its value is used
27 as the default. Otherwise, the default is
28 <code>unix:@RUNDIR@/ovnnb_db.sock</code>.
29 </dd>
30 <dt><code>--ovnsb-db=<var>database</var></code></dt>
31 <dd>
32 The OVSDB database containing the OVN Southbound Database. If the
33 <env>OVN_SB_DB</env> environment variable is set, its value is used
34 as the default. Otherwise, the default is
35 <code>unix:@RUNDIR@/ovnsb_db.sock</code>.
36 </dd>
37 </dl>
1af530bc 38 <p>
39 <var>database</var> in the above options must take one of the following
40 forms:
1af530bc 41 </p>
42 <xi:include href="ovsdb/remote-active.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
43 <xi:include href="ovsdb/remote-passive.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
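    <p>
      As an illustration only, the following invocation points
      <code>ovn-northd</code> at both databases over active TCP connections.
      The address and the port numbers 6641 and 6642 are assumptions made
      for this example; they must match how the database servers are
      actually configured.
    </p>

    <pre>
ovn-northd --ovnnb-db=tcp:127.0.0.1:6641 --ovnsb-db=tcp:127.0.0.1:6642
    </pre>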
44
45 <h2>Daemon Options</h2>
46 <xi:include href="lib/daemon.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
47
48 <h2>Logging Options</h2>
49 <xi:include href="lib/vlog.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
50
51 <h2>PKI Options</h2>
1af530bc 52 <p>
53 PKI configuration is required in order to use SSL for the connections to
54 the Northbound and Southbound databases.
1af530bc 55 </p>
56 <xi:include href="lib/ssl.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
57
58 <h2>Other Options</h2>
59
60 <xi:include href="lib/common.xml" xmlns:xi="http://www.w3.org/2003/XInclude"/>
1af530bc 61
322ec639 62 <h1>Runtime Management Commands</h1>
63 <p>
64 <code>ovs-appctl</code> can send commands to a running
65 <code>ovn-northd</code> process. The currently supported commands
66 are described below.
67 <dl>
68 <dt><code>exit</code></dt>
69 <dd>
70 Causes <code>ovn-northd</code> to gracefully terminate.
71 </dd>
72 </dl>
73 </p>
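
    <p>
      For example, assuming <code>ovn-northd</code> was started as a daemon
      with its pidfile and control socket in the default run directory, a
      graceful shutdown can be requested as follows:
    </p>

    <pre>
ovs-appctl -t ovn-northd exit
    </pre>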
74
75 <h1>Logical Flow Table Structure</h1>
76
77 <p>
78 One of the main purposes of <code>ovn-northd</code> is to populate the
79 <code>Logical_Flow</code> table in the <code>OVN_Southbound</code>
80 database. This section describes how <code>ovn-northd</code> does this
9975d7be 81 for switch and router logical datapaths.
82 </p>
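
    <p>
      The logical flows that <code>ovn-northd</code> has generated can be
      inspected with <code>ovn-sbctl</code>.  In the sketch below,
      <code>sw0</code> is a hypothetical logical switch name; omitting it
      lists the flows of every logical datapath.
    </p>

    <pre>
ovn-sbctl lflow-list sw0
    </pre>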
83
84 <h2>Logical Switch Datapaths</h2>
85
685f4dfe 86 <h3>Ingress Table 0: Admission Control and Ingress Port Security - L2</h3>
87
88 <p>
89 Ingress table 0 contains these logical flows:
90 </p>
91
92 <ul>
93 <li>
94 Priority 100 flows to drop packets with VLAN tags or multicast Ethernet
95 source addresses.
96 </li>
97
98 <li>
99 Priority 50 flows that implement ingress port security for each enabled
100 logical port. For logical ports on which port security is enabled,
101 these match the <code>inport</code> and the valid <code>eth.src</code>
102 address(es) and advance only those packets to the next flow table. For
103 logical ports on which port security is not enabled, these advance all
104 packets that match the <code>inport</code>.
105 </li>
106 </ul>
107
108 <p>
109 There are no flows for disabled logical ports because the default-drop
110 behavior of logical flow tables causes packets that ingress from them to
111 be dropped.
112 </p>
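
    <p>
      As a rough sketch, for a hypothetical logical port <code>vm1</code>
      with port security address <code>00:00:00:00:00:01</code>, the flows
      described above might look as follows.  They are written in the
      priority, match, and action style printed by
      <code>ovn-sbctl lflow-list</code>.  The port name, the address, and the
      exact match expressions are assumptions for illustration, and the
      flows actually generated may differ.
    </p>

    <pre>
priority=100, match=(vlan.present), action=(drop;)
priority=100, match=(eth.src[40]), action=(drop;)
priority=50,  match=(inport == "vm1" &amp;&amp; eth.src == {00:00:00:00:00:01}), action=(next;)
    </pre>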
113
685f4dfe 114 <h3>Ingress Table 1: Ingress Port Security - IP</h3>
115
116 <p>
117 Ingress table 1 contains these logical flows:
118 </p>
119
120 <ul>
121 <li>
122 <p>
123 For each element in the port security set having one or more IPv4 or
124 IPv6 addresses (or both),
125 </p>
126
127 <ul>
128 <li>
129 Priority 90 flow to allow IPv4 traffic if it has IPv4 addresses
130 which match the <code>inport</code>, valid <code>eth.src</code>
131 and valid <code>ip4.src</code> address(es).
132 </li>
133
          <li>
            Priority 90 flow to allow IPv4 DHCP discovery traffic if it has a
            valid <code>eth.src</code>.  This is necessary because DHCP
            discovery messages are sent from the unspecified IPv4 address
            (0.0.0.0) before an IPv4 address has been assigned.
          </li>
140
141 <li>
142 Priority 90 flow to allow IPv6 traffic if it has IPv6 addresses
143 which match the <code>inport</code>, valid <code>eth.src</code> and
144 valid <code>ip6.src</code> address(es).
145 </li>
146
          <li>
            Priority 90 flow to allow IPv6 DAD (Duplicate Address Detection)
            traffic if it has a valid <code>eth.src</code>.  This is
            necessary because DAD requires joining a multicast group and
            sending neighbor solicitations for the newly assigned address.
            Since no address is yet assigned, these are sent from the
            unspecified IPv6 address (::).
          </li>
155
          <li>
            Priority 80 flow to drop IP (both IPv4 and IPv6) traffic that
            matches the <code>inport</code> and valid <code>eth.src</code>.
          </li>
160 </ul>
161 </li>
162
163 <li>
164 One priority-0 fallback flow that matches all packets and advances to
2c36d5a6 165 the next table.
166 </li>
167 </ul>
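
    <p>
      A minimal sketch of the IPv4 part of this table, assuming a
      hypothetical port <code>vm1</code> whose port security element is
      <code>00:00:00:00:00:01 10.0.0.4</code>.  The DHCP discovery and DAD
      exceptions and the IPv6 flow are left out for brevity, and the exact
      match expressions are illustrative rather than authoritative.
    </p>

    <pre>
priority=90, match=(inport == "vm1" &amp;&amp; eth.src == 00:00:00:00:00:01 &amp;&amp; ip4.src == {10.0.0.4}), action=(next;)
priority=80, match=(inport == "vm1" &amp;&amp; eth.src == 00:00:00:00:00:01 &amp;&amp; ip), action=(drop;)
priority=0,  match=(1), action=(next;)
    </pre>

    <p>
      A packet from <code>vm1</code> with the right Ethernet source but any
      other IPv4 source misses the priority-90 flow and is dropped by the
      priority-80 flow.  Non-IP traffic falls through to the priority-0 flow.
    </p>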
168
169 <h3>Ingress Table 2: Ingress Port Security - Neighbor discovery</h3>
170
171 <p>
172 Ingress table 2 contains these logical flows:
173 </p>
174
175 <ul>
176 <li>
177 <p>
178 For each element in the port security set,
179 </p>
180
181 <ul>
182 <li>
183 Priority 90 flow to allow ARP traffic which match the
184 <code>inport</code> and valid <code>eth.src</code> and
185 <code>arp.sha</code>. If the element has one or more
186 IPv4 addresses, then it also matches the valid
187 <code>arp.spa</code>.
188 </li>
189
190 <li>
191 Priority 90 flow to allow IPv6 Neighbor Solicitation and
192 Advertisement traffic which match the <code>inport</code>,
193 valid <code>eth.src</code> and
194 <code>nd.sll</code>/<code>nd.tll</code>.
195 If the element has one or more IPv6 addresses, then it also
196 matches the valid <code>nd.target</code> address(es) for Neighbor
197 Advertisement traffic.
198 </li>
199
200 <li>
201 Priority 80 flow to drop ARP and IPv6 Neighbor Solicitation and
202 Advertisement traffic which match the <code>inport</code> and
203 valid <code>eth.src</code>.
204 </li>
205 </ul>
206 </li>
207
208 <li>
209 One priority-0 fallback flow that matches all packets and advances to
2c36d5a6 210 the next table.
211 </li>
212 </ul>
213
214 <h3>Ingress Table 3: <code>from-lport</code> Pre-ACLs</h3>
215
216 <p>
217 This table prepares flows for possible stateful ACL processing in
218 ingress table <code>ACLs</code>. It contains a priority-0 flow that
219 simply moves traffic to the next table. If stateful ACLs are used in the
220 logical datapath, a priority-100 flow is added that sets a hint
221 (with <code>reg0[0] = 1; next;</code>) for table
222 <code>Pre-stateful</code> to send IP packets to the connection tracker
223 before eventually advancing to ingress table <code>ACLs</code>.
224 </p>
225
226 <h3>Ingress Table 4: Pre-LB</h3>
227
228 <p>
229 This table prepares flows for possible stateful load balancing processing
230 in ingress table <code>LB</code> and <code>Stateful</code>. It contains
231 a priority-0 flow that simply moves traffic to the next table. If load
232 balancing rules with virtual IP addresses (and ports) are configured in
cc4583aa 233 <code>OVN_Northbound</code> database for a logical switch datapath, a
234 priority-100 flow is added for each configured virtual IP address
235 <var>VIP</var> with a match <code>ip &amp;&amp; ip4.dst == <var>VIP</var>
236 </code> that sets an action <code>reg0[0] = 1; next;</code> to act as a
237 hint for table <code>Pre-stateful</code> to send IP packets to the
238 connection tracker for packet de-fragmentation before eventually
239 advancing to ingress table <code>LB</code>.
240 </p>
241
242 <h3>Ingress Table 5: Pre-stateful</h3>
243
244 <p>
245 This table prepares flows for all possible stateful processing
246 in next tables. It contains a priority-0 flow that simply moves
247 traffic to the next table. A priority-100 flow sends the packets to
248 connection tracker based on a hint provided by the previous tables
249 (with a match for <code>reg0[0] == 1</code>) by using the
250 <code>ct_next;</code> action.
251 </p>
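
    <p>
      The hint mechanism shared by the <code>Pre-ACLs</code>,
      <code>Pre-LB</code>, and <code>Pre-stateful</code> tables can be
      sketched as follows.  The table labels on the left and the
      <code>ip</code> match in the first flow are assumptions for
      illustration; only the register hint and the actions are taken from
      the descriptions above.
    </p>

    <pre>
Pre-ACLs:      priority=100, match=(ip), action=(reg0[0] = 1; next;)
Pre-stateful:  priority=100, match=(reg0[0] == 1), action=(ct_next;)
Pre-stateful:  priority=0,   match=(1), action=(next;)
    </pre>

    <p>
      Keeping the decision (setting <code>reg0[0]</code>) separate from the
      action (<code>ct_next;</code>) lets several earlier tables request
      connection tracking while the packet is sent to the connection tracker
      only once.
    </p>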
252
    <h3>Ingress Table 6: <code>from-lport</code> ACLs</h3>
254
255 <p>
256 Logical flows in this table closely reproduce those in the
78aab811 257 <code>ACL</code> table in the <code>OVN_Northbound</code> database
258 for the <code>from-lport</code> direction. The <code>priority</code>
259 values from the <code>ACL</code> table have a limited range and have
260 1000 added to them to leave room for OVN default flows at both
261 higher and lower priorities.
5cff6b99 262 </p>
263 <ul>
      <li>
        <code>allow</code> ACLs translate into logical flows with
        the <code>next;</code> action.  If there are any stateful ACLs
        on this datapath, then <code>allow</code> ACLs translate to
        <code>ct_commit; next;</code> (which acts as a hint for the next tables
        to commit the connection to conntrack).
      </li>
271 <li>
272 <code>allow-related</code> ACLs translate into logical
273 flows with the <code>ct_commit(ct_label=0/1); next;</code> actions
274 for new connections and <code>reg0[1] = 1; next;</code> for existing
275 connections.
276 </li>
277 <li>
278 Other ACLs translate to <code>drop;</code> for new or untracked
279 connections and <code>ct_commit(ct_label=1/1);</code> for known
280 connections. Setting <code>ct_label</code> marks a connection
281 as one that was previously allowed, but should no longer be
282 allowed due to a policy change.
283 </li>
284 </ul>
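
    <p>
      As a worked example of the priority offset, consider an
      <code>allow</code> ACL with priority 1001 on a hypothetical logical
      switch <code>sw0</code> that has no stateful ACLs.  The switch name,
      port name, and match are assumptions for illustration.
    </p>

    <pre>
ovn-nbctl acl-add sw0 from-lport 1001 'inport == "vm1" &amp;&amp; ip4' allow
    </pre>

    <p>
      The corresponding logical flow would be expected to carry priority
      1001 + 1000 = 2001 and the <code>next;</code> action:
    </p>

    <pre>
priority=2001, match=(inport == "vm1" &amp;&amp; ip4), action=(next;)
    </pre>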
285
286 <p>
      This table also contains a priority 0 flow with action
      <code>next;</code>, so that ACLs allow packets by default.  If the
      logical datapath has a stateful ACL, the following flows will
      also be added:
    </p>
292
293 <ul>
294 <li>
295 A priority-1 flow that sets the hint to commit IP traffic to the
296 connection tracker (with action <code>reg0[1] = 1; next;</code>). This
297 is needed for the default allow policy because, while the initiator's
298 direction may not have any stateful rules, the server's may and then
299 its return traffic would not be known and marked as invalid.
300 </li>
301
302 <li>
303 A priority-65535 flow that allows any traffic in the reply
304 direction for a connection that has been committed to the
305 connection tracker (i.e., established flows), as long as
b73db61d 306 the committed flow does not have <code>ct_label.blocked</code> set.
307 We only handle traffic in the reply direction here because
308 we want all packets going in the request direction to still
309 go through the flows that implement the currently defined
310 policy based on ACLs. If a connection is no longer allowed by
b73db61d 311 policy, <code>ct_label.blocked</code> will get set and packets in the
cc58e1f2 312 reply direction will no longer be allowed, either.
313 </li>
314
315 <li>
316 A priority-65535 flow that allows any traffic that is considered
317 related to a committed flow in the connection tracker (e.g., an
cc58e1f2 318 ICMP Port Unreachable from a non-listening UDP port), as long
b73db61d 319 as the committed flow does not have <code>ct_label.blocked</code> set.
320 </li>
321
322 <li>
323 A priority-65535 flow that drops all traffic marked by the
324 connection tracker as invalid.
325 </li>

      <li>
        A priority-65535 flow that drops all traffic in the reply direction
        with <code>ct_label.blocked</code> set, meaning that the connection
        should no longer be allowed due to a policy change.  Packets
        in the request direction are skipped here to let a newly created
        ACL re-allow this connection.
      </li>
334 </ul>
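
    <p>
      The four priority-65535 flows can be sketched roughly as below.  The
      match expressions are simplified assumptions that follow the prose
      above (reply direction expressed with <code>ct.rpl</code>); the
      matches actually generated by <code>ovn-northd</code> may be more
      specific or may combine some of these flows.
    </p>

    <pre>
priority=65535, match=(ct.est &amp;&amp; ct.rpl &amp;&amp; ct_label.blocked == 0), action=(next;)
priority=65535, match=(ct.rel &amp;&amp; ct_label.blocked == 0), action=(next;)
priority=65535, match=(ct.inv), action=(drop;)
priority=65535, match=(ct.est &amp;&amp; ct.rpl &amp;&amp; ct_label.blocked == 1), action=(drop;)
    </pre>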
335
336 <h3>Ingress Table 7: <code>from-lport</code> QoS marking</h3>
337
338 <p>
339 Logical flows in this table closely reproduce those in the
340 <code>QoS</code> table in the <code>OVN_Northbound</code> database
341 for the <code>from-lport</code> direction.
342 </p>
343
344 <ul>
      <li>
        For each QoS rule on each logical switch, a flow is added at the
        priority specified in the <code>QoS</code> table.
      </li>
349
350 <li>
351 One priority-0 fallback flow that matches all packets and advances to
352 the next table.
353 </li>
354 </ul>
355
356 <h3>Ingress Table 8: LB</h3>
357
358 <p>
      This table contains a priority-0 flow that simply moves traffic to the
      next table.  For established connections, a priority-100 flow matches on
361 <code>ct.est &amp;&amp; !ct.rel &amp;&amp; !ct.new &amp;&amp;
362 !ct.inv</code> and sets an action <code>reg0[2] = 1; next;</code> to act
363 as a hint for table <code>Stateful</code> to send packets through
364 connection tracker to NAT the packets. (The packet will automatically
365 get DNATed to the same IP address as the first packet in that
366 connection.)
367 </p>
368
1a03fc7d 369 <h3>Ingress Table 9: Stateful</h3>
370
371 <ul>
372 <li>
cc4583aa 373 For all the configured load balancing rules for a switch in
374 <code>OVN_Northbound</code> database that includes a L4 port
375 <var>PORT</var> of protocol <var>P</var> and IPv4 address
376 <var>VIP</var>, a priority-120 flow that matches on
377 <code>ct.new &amp;&amp; ip &amp;&amp; ip4.dst == <var>VIP
378 </var>&amp;&amp; <var>P</var> &amp;&amp; <var>P</var>.dst == <var>PORT
379 </var></code> with an action of <code>ct_lb(<var>args</var>)</code>,
380 where <var>args</var> contains comma separated IPv4 addresses (and
381 optional port numbers) to load balance to.
382 </li>
383 <li>
cc4583aa 384 For all the configured load balancing rules for a switch in
385 <code>OVN_Northbound</code> database that includes just an IP address
386 <var>VIP</var> to match on, a priority-110 flow that matches on
387 <code>ct.new &amp;&amp; ip &amp;&amp; ip4.dst == <var>VIP</var></code>
388 with an action of <code>ct_lb(<var>args</var>)</code>, where
389 <var>args</var> contains comma separated IPv4 addresses.
390 </li>
391 <li>
392 A priority-100 flow commits packets to connection tracker using
393 <code>ct_commit; next;</code> action based on a hint provided by
394 the previous tables (with a match for <code>reg0[1] == 1</code>).
395 </li>
396 <li>
397 A priority-100 flow sends the packets to connection tracker using
398 <code>ct_lb;</code> as the action based on a hint provided by the
399 previous tables (with a match for <code>reg0[2] == 1</code>).
400 </li>
401 <li>
402 A priority-0 flow that simply moves traffic to the next table.
403 </li>
404 </ul>
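
    <p>
      As a hedged example, suppose a load balancer is configured with the
      commands below.  The names <code>lb0</code> and <code>sw0</code>, the
      virtual IP, and the backend addresses are assumptions for
      illustration.
    </p>

    <pre>
ovn-nbctl lb-add lb0 30.0.0.10:80 10.0.0.4:8080,10.0.0.5:8080 tcp
ovn-nbctl ls-lb-add sw0 lb0
    </pre>

    <p>
      Because the rule includes an L4 port, the first case above applies,
      and the resulting flow would be expected to resemble:
    </p>

    <pre>
priority=120, match=(ct.new &amp;&amp; ip &amp;&amp; ip4.dst == 30.0.0.10 &amp;&amp; tcp &amp;&amp; tcp.dst == 80), action=(ct_lb(10.0.0.4:8080,10.0.0.5:8080);)
    </pre>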
405
1a03fc7d 406 <h3>Ingress Table 10: ARP/ND responder</h3>
407
408 <p>
409 This table implements ARP/ND responder in a logical switch for known
410 IPs. The advantage of the ARP responder flow is to limit ARP
411 broadcasts by locally responding to ARP requests without the need to
412 send to other hypervisors. One common case is when the inport is a
413 logical port associated with a VIF and the broadcast is responded to
414 on the local hypervisor rather than broadcast across the whole
415 network and responded to by the destination VM. This behavior is
416 proxy ARP.
417 </p>
418
419 <p>
420 ARP requests arrive from VMs from a logical switch inport of type
421 default. For this case, the logical switch proxy ARP rules can be
422 for other VMs or logical router ports. Logical switch proxy ARP
423 rules may be programmed both for mac binding of IP addresses on
424 other logical switch VIF ports (which are of the default logical
425 switch port type, representing connectivity to VMs or containers),
426 and for mac binding of IP addresses on logical switch router type
427 ports, representing their logical router port peers. In order to
428 support proxy ARP for logical router ports, an IP address must be
429 configured on the logical switch router type port, with the same
430 value as the peer logical router port. The configured MAC addresses
431 must match as well. When a VM sends an ARP request for a distributed
432 logical router port and if the peer router type port of the attached
433 logical switch does not have an IP address configured, the ARP request
434 will be broadcast on the logical switch. One of the copies of the ARP
435 request will go through the logical switch router type port to the
436 logical router datapath, where the logical router ARP responder will
437 generate a reply. The MAC binding of a distributed logical router,
438 once learned by an associated VM, is used for all that VM's
439 communication needing routing. Hence, the action of a VM re-arping for
440 the mac binding of the logical router port should be rare.
441 </p>
442
443 <p>
444 Logical switch ARP responder proxy ARP rules can also be hit when
445 receiving ARP requests externally on a L2 gateway port. In this case,
446 the hypervisor acting as an L2 gateway, responds to the ARP request on
447 behalf of a destination VM.
448 </p>
449
450 <p>
451 Note that ARP requests received from <code>localnet</code> or
452 <code>vtep</code> logical inports can either go directly to VMs, in
453 which case the VM responds or can hit an ARP responder for a logical
454 router port if the packet is used to resolve a logical router port
455 next hop address. In either case, logical switch ARP responder rules
      will not be hit.  This table contains these logical flows:
457 </p>
458
5cff6b99 459 <ul>
fa128126 460 <li>
        Priority-100 flows to skip the ARP responder if inport is of type
        <code>localnet</code> or <code>vtep</code> and advance directly
        to the next table.  ARP requests sent to <code>localnet</code> or
464 <code>vtep</code> ports can be received by multiple hypervisors.
465 Now, because the same mac binding rules are downloaded to all
466 hypervisors, each of the multiple hypervisors will respond. This
467 will confuse L2 learning on the source of the ARP requests. ARP
468 requests received on an inport of type <code>router</code> are not
469 expected to hit any logical switch ARP responder flows. However,
470 no skip flows are installed for these packets, as there would be
471 some additional flow cost for this and the value appears limited.
472 </li>
473
57d143eb 474 <li>
4c7bf534 475 <p>
6fdb7cd6 476 Priority-50 flows that match ARP requests to each known IP address
22ab299e 477 <var>A</var> of every logical switch port, and respond with ARP
478 replies directly with corresponding Ethernet address <var>E</var>:
479 </p>
480
        <pre>
eth.dst = eth.src;
eth.src = <var>E</var>;
arp.op = 2; /* ARP reply. */
arp.tha = arp.sha;
arp.sha = <var>E</var>;
arp.tpa = arp.spa;
arp.spa = <var>A</var>;
outport = inport;
flags.loopback = 1;
output;
        </pre>
493
494 <p>
495 These flows are omitted for logical ports (other than router ports)
496 that are down.
497 </p>
498 </li>
499
500 <li>
501 <p>
502 Priority-50 flows that match IPv6 ND neighbor solicitations to
503 each known IP address <var>A</var> (and <var>A</var>'s
22ab299e 504 solicited node address) of every logical switch port, and
505 respond with neighbor advertisements directly with
506 corresponding Ethernet address <var>E</var>:
507 </p>
508
        <pre>
nd_na {
    eth.src = <var>E</var>;
    ip6.src = <var>A</var>;
    nd.target = <var>A</var>;
    nd.tll = <var>E</var>;
    outport = inport;
    flags.loopback = 1;
    output;
};
        </pre>
520
521 <p>
522 These flows are omitted for logical ports (other than router ports)
523 that are down.
524 </p>
525 </li>
526
527 <li>
528 <p>
529 Priority-100 flows with match criteria like the ARP and ND flows
530 above, except that they only match packets from the
531 <code>inport</code> that owns the IP addresses in question, with
532 action <code>next;</code>. These flows prevent OVN from replying to,
533 for example, an ARP request emitted by a VM for its own IP address.
534 A VM only makes this kind of request to attempt to detect a duplicate
535 IP address assignment, so sending a reply will prevent the VM from
536 accepting the IP address that it owns.
537 </p>
538
539 <p>
540 In place of <code>next;</code>, it would be reasonable to use
541 <code>drop;</code> for the flows' actions. If everything is working
542 as it is configured, then this would produce equivalent results,
543 since no host should reply to the request. But ARPing for one's own
544 IP address is intended to detect situations where the network is not
545 working as configured, so dropping the request would frustrate that
546 intent.
547 </p>
548 </li>
549
550 <li>
551 One priority-0 fallback flow that matches all packets and advances to
2c36d5a6 552 the next table.
553 </li>
554 </ul>
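
    <p>
      To make the interplay of the priority-100 and priority-50 flows
      concrete, here is a sketch for a hypothetical port <code>vm1</code>
      that owns <code>10.0.0.4</code> with Ethernet address
      <code>00:00:00:00:00:01</code>.  The names, addresses, and the exact
      form of the match expressions are assumptions for illustration.
    </p>

    <pre>
priority=100, match=(arp.tpa == 10.0.0.4 &amp;&amp; arp.op == 1 &amp;&amp; inport == "vm1"), action=(next;)
priority=50,  match=(arp.tpa == 10.0.0.4 &amp;&amp; arp.op == 1), action=(eth.dst = eth.src; eth.src = 00:00:00:00:00:01; arp.op = 2; arp.tha = arp.sha; arp.sha = 00:00:00:00:00:01; arp.tpa = arp.spa; arp.spa = 10.0.0.4; outport = inport; flags.loopback = 1; output;)
priority=0,   match=(1), action=(next;)
    </pre>

    <p>
      An ARP request for <code>10.0.0.4</code> sent by any other port is
      answered locally by the priority-50 flow, while the same request sent
      by <code>vm1</code> itself matches the priority-100 flow first and
      continues unanswered to the next table.
    </p>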
555
1a03fc7d 556 <h3>Ingress Table 11: DHCP option processing</h3>
557
558 <p>
559 This table adds the DHCPv4 options to a DHCPv4 packet from the
560 logical ports configured with IPv4 address(es) and DHCPv4 options,
561 and similarly for DHCPv6 options.
562 </p>
563
564 <ul>
565 <li>
566 <p>
567 A priority-100 logical flow is added for these logical ports
568 which matches the IPv4 packet with <code>udp.src</code> = 68 and
569 <code>udp.dst</code> = 67 and applies the action
570 <code>put_dhcp_opts</code> and advances the packet to the next table.
571 </p>
572
        <pre>
reg0[3] = put_dhcp_opts(offer_ip = <var>ip</var>, <var>options</var>...);
next;
        </pre>
577
578 <p>
579 For DHCPDISCOVER and DHCPREQUEST, this transforms the packet into a
33ac3c83 580 DHCP reply, adds the DHCP offer IP <var>ip</var> and options to the
581 packet, and stores 1 into reg0[3]. For other kinds of packets, it
582 just stores 0 into reg0[3]. Either way, it continues to the next
583 table.
584 </p>
585
586 </li>
587
588 <li>
589 <p>
590 A priority-100 logical flow is added for these logical ports
591 which matches the IPv6 packet with <code>udp.src</code> = 546 and
592 <code>udp.dst</code> = 547 and applies the action
593 <code>put_dhcpv6_opts</code> and advances the packet to the next
594 table.
595 </p>
596
        <pre>
reg0[3] = put_dhcpv6_opts(ia_addr = <var>ip</var>, <var>options</var>...);
next;
        </pre>
601
602 <p>
603 For DHCPv6 Solicit/Request/Confirm packets, this transforms the
604 packet into a DHCPv6 Advertise/Reply, adds the DHCPv6 offer IP
605 <var>ip</var> and options to the packet, and stores 1 into reg0[3].
606 For other kinds of packets, it just stores 0 into reg0[3]. Either
607 way, it continues to the next table.
608 </p>
609 </li>
610
      <li>
        A priority-0 flow that matches all packets and advances to the next
        table.
      </li>
614 </ul>
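
    <p>
      For illustration, with a hypothetical port address of
      <code>10.0.0.4</code> the DHCPv4 action might be filled in as shown
      below.  The particular options and their values are assumptions;
      the options actually emitted come from the DHCPv4 options configured
      for the logical port.
    </p>

    <pre>
reg0[3] = put_dhcp_opts(offer_ip = 10.0.0.4, netmask = 255.255.255.0, router = 10.0.0.1, server_id = 10.0.0.1, lease_time = 3600);
next;
    </pre>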
615
1a03fc7d 616 <h3>Ingress Table 12: DHCP responses</h3>
617
618 <p>
619 This table implements DHCP responder for the DHCP replies generated by
620 the previous table.
621 </p>
622
623 <ul>
624 <li>
625 <p>
626 A priority 100 logical flow is added for the logical ports configured
627 with DHCPv4 options which matches IPv4 packets with <code>udp.src == 68
628 &amp;&amp; udp.dst == 67 &amp;&amp; reg0[3] == 1</code> and
629 responds back to the <code>inport</code> after applying these
630 actions. If <code>reg0[3]</code> is set to 1, it means that the
631 action <code>put_dhcp_opts</code> was successful.
632 </p>
633
        <pre>
eth.dst = eth.src;
eth.src = <var>E</var>;
ip4.dst = <var>A</var>;
ip4.src = <var>S</var>;
udp.src = 67;
udp.dst = 68;
outport = <var>P</var>;
flags.loopback = 1;
output;
        </pre>
645
646 <p>
647 where <var>E</var> is the server MAC address and <var>S</var> is the
33ac3c83 648 server IPv4 address defined in the DHCPv4 options and <var>A</var> is
649 the IPv4 address defined in the logical port's addresses column.
650 </p>
651
652 <p>
653 (This terminates ingress packet processing; the packet does not go
654 to the next ingress table.)
655 </p>
656 </li>
657
658 <li>
659 <p>
660 A priority 100 logical flow is added for the logical ports configured
661 with DHCPv6 options which matches IPv6 packets with <code>udp.src == 546
662 &amp;&amp; udp.dst == 547 &amp;&amp; reg0[3] == 1</code> and
663 responds back to the <code>inport</code> after applying these
664 actions. If <code>reg0[3]</code> is set to 1, it means that the
665 action <code>put_dhcpv6_opts</code> was successful.
666 </p>
667
        <pre>
eth.dst = eth.src;
eth.src = <var>E</var>;
ip6.dst = <var>A</var>;
ip6.src = <var>S</var>;
udp.src = 547;
udp.dst = 546;
outport = <var>P</var>;
flags.loopback = 1;
output;
        </pre>
679
680 <p>
681 where <var>E</var> is the server MAC address and <var>S</var> is the
682 server IPv6 LLA address generated from the <code>server_id</code>
683 defined in the DHCPv6 options and <var>A</var> is
684 the IPv6 address defined in the logical port's addresses column.
685 </p>
686
687 <p>
          (This terminates ingress packet processing; the packet does not go
          to the next ingress table.)
690 </p>
691 </li>
692
      <li>
        A priority-0 flow that matches all packets and advances to the next
        table.
      </li>
696 </ul>
697
    <h3>Ingress Table 13: DNS Lookup</h3>
699
700 <p>
701 This table looks up and resolves the DNS names to the corresponding
702 configured IP address(es).
703 </p>
704
705 <ul>
706 <li>
707 <p>
708 A priority-100 logical flow for each logical switch datapath
709 if it is configured with DNS records, which matches the IPv4 and IPv6
710 packets with <code>udp.dst</code> = 53 and applies the action
711 <code>dns_lookup</code> and advances the packet to the next table.
712 </p>
713
        <pre>
reg0[4] = dns_lookup(); next;
        </pre>
717
718 <p>
719 For valid DNS packets, this transforms the packet into a DNS
720 reply if the DNS name can be resolved, and stores 1 into reg0[4].
721 For failed DNS resolution or other kinds of packets, it just stores
722 0 into reg0[4]. Either way, it continues to the next table.
723 </p>
724 </li>
725 </ul>
726
    <h3>Ingress Table 14: DNS Responses</h3>
728
729 <p>
730 This table implements DNS responder for the DNS replies generated by
731 the previous table.
732 </p>
733
734 <ul>
735 <li>
736 <p>
737 A priority-100 logical flow for each logical switch datapath
738 if it is configured with DNS records, which matches the IPv4 and IPv6
          packets with <code>udp.dst == 53 &amp;&amp; reg0[4] == 1</code>
740 and responds back to the <code>inport</code> after applying these
741 actions. If <code>reg0[4]</code> is set to 1, it means that the
742 action <code>dns_lookup</code> was successful.
743 </p>
744
        <pre>
eth.dst &lt;-&gt; eth.src;
ip4.src &lt;-&gt; ip4.dst;
udp.dst = udp.src;
udp.src = 53;
outport = <var>P</var>;
flags.loopback = 1;
output;
        </pre>
754
755 <p>
756 (This terminates ingress packet processing; the packet does not go
757 to the next ingress table.)
758 </p>
759 </li>
760 </ul>
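
    <p>
      Taken together, the DNS lookup and DNS response tables form a
      two-step pipeline.  The sketch below shows how the flows pair up; the
      table labels on the left are for orientation only, and the actions of
      the second flow are abbreviated to those listed above.
    </p>

    <pre>
DNS Lookup:     priority=100, match=(udp.dst == 53), action=(reg0[4] = dns_lookup(); next;)
DNS Responses:  priority=100, match=(udp.dst == 53 &amp;&amp; reg0[4] == 1), action=(eth.dst &lt;-&gt; eth.src; ip4.src &lt;-&gt; ip4.dst; udp.dst = udp.src; udp.src = 53; outport = <var>P</var>; flags.loopback = 1; output;)
    </pre>

    <p>
      A query that cannot be resolved leaves <code>reg0[4]</code> at 0, so it
      misses the response flow and is handled like any other packet.
    </p>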
761
    <h3>Ingress Table 15: Destination Lookup</h3>
763
764 <p>
765 This table implements switching behavior. It contains these logical
766 flows:
767 </p>
768
769 <ul>
770 <li>
771 A priority-100 flow that outputs all packets with an Ethernet broadcast
772 or multicast <code>eth.dst</code> to the <code>MC_FLOOD</code>
773 multicast group, which <code>ovn-northd</code> populates with all
774 enabled logical ports.
775 </li>
776
777 <li>
778 <p>
779 One priority-50 flow that matches each known Ethernet address against
780 <code>eth.dst</code> and outputs the packet to the single associated
781 output port.
782 </p>
783
784 <p>
785 For the Ethernet address on a logical switch port of type
786 <code>router</code>, when that logical switch port's
787 <ref column="addresses" table="Logical_Switch_Port"
788 db="OVN_Northbound"/> column is set to <code>router</code> and
789 the connected logical router port specifies a
06a26dd2 790 <code>redirect-chassis</code>:
41a15b71 791 </p>
792
793 <ul>
794 <li>
795 The flow for the connected logical router port's Ethernet
796 address is only programmed on the <code>redirect-chassis</code>.
797 </li>
798
799 <li>
800 If the logical router has rules specified in
801 <ref column="nat" table="Logical_Router" db="OVN_Northbound"/> with
802 <ref column="external_mac" table="NAT" db="OVN_Northbound"/>, then
803 those addresses are also used to populate the switch's destination
804 lookup on the chassis where
805 <ref column="logical_port" table="NAT" db="OVN_Northbound"/> is
806 resident.
807 </li>
808 </ul>
809 </li>
810
811 <li>
812 One priority-0 fallback flow that matches all packets and outputs them
813 to the <code>MC_UNKNOWN</code> multicast group, which
814 <code>ovn-northd</code> populates with all enabled logical ports that
815 accept unknown destination packets. As a small optimization, if no
816 logical ports accept unknown destination packets,
817 <code>ovn-northd</code> omits this multicast group and logical flow.
818 </li>
819 </ul>
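
    <p>
      A minimal sketch of this table for a switch with a single hypothetical
      port <code>vm1</code> at <code>00:00:00:00:00:01</code>.  The port name
      and address are assumptions, and the multicast group names are shown
      as they are believed to appear in the southbound database.
    </p>

    <pre>
priority=100, match=(eth.mcast), action=(outport = "_MC_flood"; output;)
priority=50,  match=(eth.dst == 00:00:00:00:00:01), action=(outport = "vm1"; output;)
priority=0,   match=(1), action=(outport = "_MC_unknown"; output;)
    </pre>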
820
821 <h3>Egress Table 0: Pre-LB</h3>
822
823 <p>
824 This table is similar to ingress table <code>Pre-LB</code>. It
825 contains a priority-0 flow that simply moves traffic to the next table.
826 If any load balancing rules exist for the datapath, a priority-100 flow
827 is added with a match of <code>ip</code> and action of <code>reg0[0] = 1;
828 next;</code> to act as a hint for table <code>Pre-stateful</code> to
829 send IP packets to the connection tracker for packet de-fragmentation.
830 </p>
831
832 <h3>Egress Table 1: <code>to-lport</code> Pre-ACLs</h3>
833
834 <p>
835 This is similar to ingress table <code>Pre-ACLs</code> except for
836 <code>to-lport</code> traffic.
837 </p>
838
7a15be69 839 <h3>Egress Table 2: Pre-stateful</h3>
840
841 <p>
842 This is similar to ingress table <code>Pre-stateful</code>.
843 </p>
844
845 <h3>Egress Table 3: LB</h3>
846 <p>
847 This is similar to ingress table <code>LB</code>.
848 </p>
849
850 <h3>Egress Table 4: <code>to-lport</code> ACLs</h3>
851
852 <p>
853 This is similar to ingress table <code>ACLs</code> except for
854 <code>to-lport</code> ACLs.
855 </p>
856
857 <h3>Egress Table 5: <code>to-lport</code> QoS marking</h3>
858
859 <p>
860 This is similar to ingress table <code>QoS marking</code> except for
      <code>to-lport</code> QoS rules.
862 </p>
863
864 <h3>Egress Table 6: Stateful</h3>
865
866 <p>
867 This is similar to ingress table <code>Stateful</code> except that
868 there are no rules added for load balancing new connections.
fa313a8c
GS
869 </p>
870
281977f7 871 <p>
302eda27 872 The following flows are also added.
281977f7 873 </p>
302eda27
NS
874 <ul>
875 <li>
876 A priority-34000 logical flow is added for each logical port that
877 has DHCPv4 options defined, to allow the DHCPv4 reply packet, and for
878 each logical port that has DHCPv6 options defined, to allow the DHCPv6
879 reply packet, from <code>Ingress Table 12: DHCP responses</code>.
880 </li>
881
882 <li>
883 A priority 34000 logical flow is added for each logical switch datapath
884 configured with DNS records with the match <code>udp.dst = 53</code>
885 to allow the DNS reply packet from the
886 <code>Ingress Table 14: DNS responses</code>.
887 </li>
888 </ul>
281977f7 889
1a03fc7d 890 <h3>Egress Table 7: Egress Port Security - IP</h3>
685f4dfe
NS
891
892 <p>
2c36d5a6
GS
893 This is similar to the port security logic in table
894 <code>Ingress Port Security - IP</code> except that <code>outport</code>,
895 <code>eth.dst</code>, <code>ip4.dst</code> and <code>ip6.dst</code>
896 are checked instead of <code>inport</code>, <code>eth.src</code>,
897 <code>ip4.src</code> and <code>ip6.src</code>.
5cff6b99
BP
898 </p>
899
1a03fc7d 900 <h3>Egress Table 8: Egress Port Security - L2</h3>
5cff6b99
BP
901
902 <p>
2c36d5a6
GS
903 This is similar to the ingress port security logic in ingress table
904 <code>Admission Control and Ingress Port Security - L2</code>,
5cff6b99
BP
905 but with important differences. Most obviously, <code>outport</code> and
906 <code>eth.dst</code> are checked instead of <code>inport</code> and
907 <code>eth.src</code>. Second, packets directed to broadcast or multicast
908 <code>eth.dst</code> are always accepted instead of being subject to the
909 port security rules; this is implemented through a priority-100 flow that
9975d7be 910 matches on <code>eth.mcast</code> with action <code>output;</code>.
5cff6b99
BP
911 Finally, to ensure that even broadcast and multicast packets are not
912 delivered to disabled logical ports, a priority-150 flow for each
913 disabled logical <code>outport</code> overrides the priority-100 flow
914 with a <code>drop;</code> action.
915 </p>
9975d7be
BP
916
917 <h2>Logical Router Datapaths</h2>
918
5412db30
J
919 <p>
920 Logical router datapaths will only exist for <ref table="Logical_Router"
921 db="OVN_Northbound"/> rows in the <ref db="OVN_Northbound"/> database
922 that do not have <ref column="enabled" table="Logical_Router"
923 db="OVN_Northbound"/> set to <code>false</code>.
924 </p>
925
9975d7be
BP
926 <h3>Ingress Table 0: L2 Admission Control</h3>
927
928 <p>
929 This table drops packets that the router shouldn't see at all based on
930 their Ethernet headers. It contains the following flows:
931 </p>
932
933 <ul>
934 <li>
935 Priority-100 flows to drop packets with VLAN tags or multicast Ethernet
936 source addresses.
937 </li>
938
939 <li>
41a15b71
MS
940 <p>
941 For each enabled router port <var>P</var> with Ethernet address
942 <var>E</var>, a priority-50 flow that matches <code>inport ==
943 <var>P</var> &amp;&amp; (eth.mcast || eth.dst ==
944 <var>E</var>)</code>, with action <code>next;</code>.
945 </p>
946
947 <p>
948 For the gateway port on a distributed logical router (where
949 one of the logical router ports specifies a
950 <code>redirect-chassis</code>), the above flow matching
951 <code>eth.dst == <var>E</var></code> is only programmed on
952 the gateway port instance on the
953 <code>redirect-chassis</code>.
954 </p>
9975d7be 955 </li>
06a26dd2
MS
956
957 <li>
958 <p>
959 For each <code>dnat_and_snat</code> NAT rule on a distributed
960 router that specifies an external Ethernet address <var>E</var>,
961 a priority-50 flow that matches <code>inport == <var>GW</var>
962 &amp;&amp; eth.dst == <var>E</var></code>, where <var>GW</var>
963 is the logical router gateway port, with action
964 <code>next;</code>.
965 </p>
966
967 <p>
968 This flow is only programmed on the gateway port instance on
969 the chassis where the <code>logical_port</code> specified in
970 the NAT rule resides.
971 </p>
972 </li>
9975d7be
BP
973 </ul>
974
975 <p>
976 Other packets are implicitly dropped.
977 </p>
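<p>
The admission logic above can be sketched in Python as follows. This is
only an illustrative model, not code from <code>ovn-northd</code>: the
function names, the dictionary layout, and the example addresses are
assumptions, and the sketch ignores the <code>redirect-chassis</code> and
<code>dnat_and_snat</code> special cases.
</p>

```python
# Hypothetical model of router ingress table 0; all names are illustrative.


def is_multicast_mac(mac: str) -> bool:
    """True when the least significant bit of the first octet is set."""
    return int(mac.split(":")[0], 16) % 2 == 1


def l2_admission(pkt: dict, ports: dict) -> str:
    """Return "next" or "drop" for a packet.

    pkt has the keys inport, eth_src, eth_dst and vlan_tagged.
    ports maps each enabled router port name to its Ethernet address.
    """
    # Priority 100: drop VLAN-tagged packets and multicast source addresses.
    if pkt["vlan_tagged"] or is_multicast_mac(pkt["eth_src"]):
        return "drop"
    # Priority 50: inport == P and (eth.mcast or eth.dst == E).
    mac = ports.get(pkt["inport"])
    if mac is not None and (
        is_multicast_mac(pkt["eth_dst"])
        or pkt["eth_dst"].lower() == mac.lower()
    ):
        return "next"
    # No flow matched, so the packet is implicitly dropped.
    return "drop"
```

<p>
A broadcast destination such as <code>ff:ff:ff:ff:ff:ff</code> counts as
multicast in this model, so it is admitted on any enabled port.
</p>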
978
979 <h3>Ingress Table 1: IP Input</h3>
980
981 <p>
982 This table is the core of the logical router datapath functionality. It
983 contains the following flows to implement very basic IP host
984 functionality.
985 </p>
986
987 <ul>
988 <li>
989 <p>
990 L3 admission control: A priority-100 flow drops packets that match
991 any of the following:
992 </p>
993
994 <ul>
995 <li>
996 <code>ip4.src[28..31] == 0xe</code> (multicast source)
997 </li>
998 <li>
999 <code>ip4.src == 255.255.255.255</code> (broadcast source)
1000 </li>
1001 <li>
1002 <code>ip4.src == 127.0.0.0/8 || ip4.dst == 127.0.0.0/8</code>
1003 (localhost source or destination)
1004 </li>
1005 <li>
1006 <code>ip4.src == 0.0.0.0/8 || ip4.dst == 0.0.0.0/8</code> (zero
1007 network source or destination)
1008 </li>
1009 <li>
6fdb7cd6 1010 <code>ip4.src</code> or <code>ip6.src</code> is any IP
06a26dd2
MS
1011 address owned by the router, unless the packet was recirculated
1012 due to egress loopback as indicated by
1013 <code>REGBIT_EGRESS_LOOPBACK</code>.
9975d7be
BP
1014 </li>
1015 <li>
1016 <code>ip4.src</code> is the broadcast address of any IP network
1017 known to the router.
1018 </li>
1019 </ul>
1020 </li>
1021
1022 <li>
1023 <p>
1024 ICMP echo reply. These flows reply to ICMP echo requests received
e9bc5de1 1025 for the router's IP address. Let <var>A</var> be an IP address
6fdb7cd6
JP
1026 owned by a router port. Then, for each <var>A</var> that is
1027 an IPv4 address, a priority-90 flow matches on
1028 <code>ip4.dst == <var>A</var></code> and
1029 <code>icmp4.type == 8 &amp;&amp; icmp4.code == 0</code>
1030 (ICMP echo request). For each <var>A</var> that is an IPv6
1031 address, a priority-90 flow matches on
1032 <code>ip6.dst == <var>A</var></code> and
1033 <code>icmp6.type == 128 &amp;&amp; icmp6.code == 0</code>
1034 (ICMPv6 echo request). The port of the router that receives the
1035 echo request does not matter. Also, the <code>ip.ttl</code> of
1036 the echo request packet is not checked, so it complies with
1037 RFC 1812, section 4.2.2.9. Flows for ICMPv4 echo requests use the
1038 following actions:
9975d7be
BP
1039 </p>
1040
1041 <pre>
4685e523 1042ip4.dst &lt;-&gt; ip4.src;
47f3b59b 1043ip.ttl = 255;
9975d7be 1044icmp4.type = 0;
bf143492 1045flags.loopback = 1;
6fdb7cd6
JP
1046next;
1047 </pre>
1048
1049 <p>
1050 Flows for ICMPv6 echo requests use the following actions:
1051 </p>
1052
1053 <pre>
1054ip6.dst &lt;-&gt; ip6.src;
1055ip.ttl = 255;
1056icmp6.type = 129;
bf143492 1057flags.loopback = 1;
9975d7be
BP
1058next;
1059 </pre>
9975d7be
BP
1060 </li>
1061
1062 <li>
1063 <p>
de297547
GS
1064 Reply to ARP requests.
1065 </p>
1066
1067 <p>
1068 These flows reply to ARP requests for the router's own IP address.
1069 For each router port <var>P</var> that owns IP address <var>A</var>
1070 and Ethernet address <var>E</var>, a priority-90 flow matches
1071 <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
1072 arp.tpa == <var>A</var></code> (ARP request) with the following
1073 actions:
1074 </p>
1075
1076 <pre>
1077eth.dst = eth.src;
1078eth.src = <var>E</var>;
1079arp.op = 2; /* ARP reply. */
1080arp.tha = arp.sha;
1081arp.sha = <var>E</var>;
1082arp.tpa = arp.spa;
1083arp.spa = <var>A</var>;
1084outport = <var>P</var>;
bf143492 1085flags.loopback = 1;
de297547
GS
1086output;
1087 </pre>
41a15b71
MS
1088
1089 <p>
1090 For the gateway port on a distributed logical router (where
1091 one of the logical router ports specifies a
1092 <code>redirect-chassis</code>), the above flows are only
1093 programmed on the gateway port instance on the
1094 <code>redirect-chassis</code>. This behavior avoids generation
1095 of multiple ARP responses from different chassis, and allows
1096 upstream MAC learning to point to the
1097 <code>redirect-chassis</code>.
1098 </p>
de297547
GS
1099 </li>
1100
1101 <li>
1102 <p>
1103 These flows reply to ARP requests for the virtual IP addresses
cc4583aa
GS
1104 configured in the router for DNAT or load balancing. For a
1105 configured DNAT IP address or a load balancer VIP <var>A</var>,
1106 for each router port <var>P</var> with Ethernet
de297547
GS
1107 address <var>E</var>, a priority-90 flow matches
1108 <code>inport == <var>P</var> &amp;&amp; arp.op == 1 &amp;&amp;
1109 arp.tpa == <var>A</var></code> (ARP request)
0bac7164 1110 with the following actions:
9975d7be
BP
1111 </p>
1112
1113 <pre>
1114eth.dst = eth.src;
1115eth.src = <var>E</var>;
1116arp.op = 2; /* ARP reply. */
1117arp.tha = arp.sha;
1118arp.sha = <var>E</var>;
1119arp.tpa = arp.spa;
1120arp.spa = <var>A</var>;
1121outport = <var>P</var>;
bf143492 1122flags.loopback = 1;
9975d7be
BP
1123output;
1124 </pre>
06a26dd2
MS
1125
1126 <p>
1127 For the gateway port on a distributed logical router with NAT
1128 (where one of the logical router ports specifies a
1129 <code>redirect-chassis</code>):
1130 </p>
1131
1132 <ul>
1133 <li>
1134 If the corresponding NAT rule cannot be handled in a
1135 distributed manner, then this flow is only programmed on
1136 the gateway port instance on the
1137 <code>redirect-chassis</code>. This behavior avoids
1138 generation of multiple ARP responses from different chassis,
1139 and allows upstream MAC learning to point to the
1140 <code>redirect-chassis</code>.
1141 </li>
1142
1143 <li>
1144 <p>
1145 If the corresponding NAT rule can be handled in a distributed
1146 manner, then this flow is only programmed on the gateway port
1147 instance where the <code>logical_port</code> specified in the
1148 NAT rule resides.
1149 </p>
1150
1151 <p>
1152 Some of the actions are different for this case, using the
1153 <code>external_mac</code> specified in the NAT rule rather
1154 than the gateway port's Ethernet address <var>E</var>:
1155 </p>
1156
1157 <pre>
1158eth.src = <var>external_mac</var>;
1159arp.sha = <var>external_mac</var>;
1160 </pre>
1161
1162 <p>
1163 This behavior avoids generation of multiple ARP responses
1164 from different chassis, and allows upstream MAC learning to
1165 point to the correct chassis.
1166 </p>
1167 </li>
1168 </ul>
9975d7be
BP
1169 </li>
1170
0bac7164 1171 <li>
c34a87b6 1172 ARP reply handling. This flow uses ARP replies to populate the
0bac7164
BP
1173 logical router's ARP table. A priority-90 flow with match <code>arp.op
1174 == 2</code> has actions <code>put_arp(inport, arp.spa,
1175 arp.sha);</code>.
1176 </li>
1177
6fdb7cd6
JP
1178 <li>
1179 <p>
c34a87b6
JP
1180 Reply to IPv6 Neighbor Solicitations. These flows reply to
1181 Neighbor Solicitation requests for the router's own IPv6
1182 address and populate the logical router's MAC binding table.
1183 For each router port <var>P</var> that owns IPv6 address
1184 <var>A</var>, solicited node address <var>S</var>, and
1185 Ethernet address <var>E</var>, a priority-90 flow matches
1186 <code>inport == <var>P</var> &amp;&amp; nd_ns &amp;&amp;
1187 ip6.dst == {<var>A</var>, <var>S</var>} &amp;&amp; nd.target
1188 == <var>A</var></code> with the following actions:
6fdb7cd6
JP
1189 </p>
1190
1191 <pre>
c34a87b6 1192put_nd(inport, ip6.src, nd.sll);
6fdb7cd6
JP
1193nd_na {
1194 eth.src = <var>E</var>;
1195 ip6.src = <var>A</var>;
1196 nd.target = <var>A</var>;
1197 nd.tll = <var>E</var>;
1198 outport = inport;
bf143492 1199 flags.loopback = 1;
6fdb7cd6
JP
1200 output;
1201};
1202 </pre>
41a15b71
MS
1203
1204 <p>
1205 For the gateway port on a distributed logical router (where
1206 one of the logical router ports specifies a
1207 <code>redirect-chassis</code>), the above flows replying to
1208 IPv6 Neighbor Solicitations are only programmed on the
1209 gateway port instance on the <code>redirect-chassis</code>.
1210 This behavior avoids generation of multiple replies from
1211 different chassis, and allows upstream MAC learning to point
1212 to the <code>redirect-chassis</code>.
1213 </p>
6fdb7cd6
JP
1214 </li>
1215
c34a87b6
JP
1216 <li>
1217 IPv6 neighbor advertisement handling. This flow uses neighbor
1218 advertisements to populate the logical router's MAC binding
1219 table. A priority-90 flow with match <code>nd_na</code>
1220 has actions <code>put_nd(inport, nd.target, nd.tll);</code>.
1221 </li>
1222
1223 <li>
1224 IPv6 neighbor solicitation for non-hosted addresses handling.
1225 This flow uses neighbor solicitations to populate the logical
1226 router's MAC binding table (ones that were directed at the
1227 logical router would have matched the priority-90 neighbor
1228 solicitation flow already). A priority-80 flow with match
1229 <code>nd_ns</code> has actions
1230 <code>put_nd(inport, ip6.src, nd.sll);</code>.
1231 </li>
1232
9975d7be
BP
1233 <li>
1234 <p>
1235 UDP port unreachable. Priority-80 flows generate ICMP port
1236 unreachable messages in reply to UDP datagrams directed to the
1237 router's IP address. The logical router doesn't accept any UDP
1238 traffic so it always generates such a reply.
1239 </p>
1240
1241 <p>
1242 These flows should not match IP fragments with nonzero offset.
1243 </p>
1244
1245 <p>
1246 Details TBD. Not yet implemented.
1247 </p>
1248 </li>
1249
1250 <li>
1251 <p>
1252 TCP reset. Priority-80 flows generate TCP reset messages in reply to
1253 TCP datagrams directed to the router's IP address. The logical
1254 router doesn't accept any TCP traffic so it always generates such a
1255 reply.
1256 </p>
1257
1258 <p>
1259 These flows should not match IP fragments with nonzero offset.
1260 </p>
1261
1262 <p>
1263 Details TBD. Not yet implemented.
1264 </p>
1265 </li>
1266
1267 <li>
1268 <p>
1269 Protocol unreachable. Priority-70 flows generate ICMP protocol
1270 unreachable messages in reply to packets directed to the router's IP
1271 address on IP protocols other than UDP, TCP, and ICMP.
1272 </p>
1273
1274 <p>
1275 These flows should not match IP fragments with nonzero offset.
1276 </p>
1277
1278 <p>
1279 Details TBD. Not yet implemented.
1280 </p>
1281 </li>
1282
1283 <li>
1284 Drop other IP traffic to this router. These flows drop any other
1285 traffic destined to an IP address of this router that is not already
1286 handled by one of the flows above, which amounts to ICMP (other than
1287 echo requests) and fragments with nonzero offsets. For each IP address
1288 <var>A</var> owned by the router, a priority-60 flow matches
4ef48e9d
CSV
1289 <code>ip4.dst == <var>A</var></code> and drops the traffic. An
1290 exception is made and the above flow is not added if the router
1291 port's own IP address is used to SNAT packets passing through that
1292 router.
9975d7be
BP
1293 </li>
1294 </ul>
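<p>
As a rough illustration of the L3 admission control checks listed at the
top of this table, the following Python sketch evaluates the IPv4
conditions with the standard <code>ipaddress</code> module. The function
name, its parameters, and the <code>egress_loopback</code> flag (standing
in for <code>REGBIT_EGRESS_LOOPBACK</code>) are assumptions made for the
example; the real checks are expressed as logical flows, not Python.
</p>

```python
import ipaddress

LOOPBACK = ipaddress.ip_network("127.0.0.0/8")
ZERO_NET = ipaddress.ip_network("0.0.0.0/8")
BROADCAST = ipaddress.ip_address("255.255.255.255")


def l3_admission_drops(src, dst, router_ips, networks, egress_loopback=False):
    """Return True if the priority-100 drop conditions match (IPv4 only).

    router_ips: addresses owned by the router, as strings.
    networks:   CIDR strings for the IP networks known to the router.
    """
    s = ipaddress.ip_address(src)
    d = ipaddress.ip_address(dst)
    # ip4.src[28..31] == 0xe (multicast source).
    if int(s) >> 28 == 0xE:
        return True
    # ip4.src == 255.255.255.255 (broadcast source).
    if s == BROADCAST:
        return True
    # Localhost or zero-network source or destination.
    if s in LOOPBACK or d in LOOPBACK or s in ZERO_NET or d in ZERO_NET:
        return True
    # Source owned by the router, unless recirculated by egress loopback.
    owned = {ipaddress.ip_address(a) for a in router_ips}
    if s in owned and not egress_loopback:
        return True
    # Source equal to the broadcast address of a known network.
    for net in networks:
        if s == ipaddress.ip_network(net).broadcast_address:
            return True
    return False
```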
1295
1296 <p>
1297 The flows above handle all of the traffic that might be directed to the
1298 router itself. The following flows (with lower priorities) handle the
1299 remaining traffic, potentially for forwarding:
1300 </p>
1301
1302 <ul>
1303 <li>
1304 Drop Ethernet local broadcast. A priority-50 flow with match
1305 <code>eth.bcast</code> drops traffic destined to the local Ethernet
1306 broadcast address. By definition this traffic should not be forwarded.
1307 </li>
1308
9975d7be
BP
1309 <li>
1310 <p>
1311 ICMP time exceeded. For each router port <var>P</var>, whose IP
1312 address is <var>A</var>, a priority-40 flow with match <code>inport
47f3b59b 1313 == <var>P</var> &amp;&amp; ip.ttl == {0, 1} &amp;&amp;
9975d7be
BP
1314 !ip.later_frag</code> matches packets whose TTL has expired, with the
1315 following actions to send an ICMP time exceeded reply:
1316 </p>
1317
1318 <pre>
1319icmp4 {
1320 icmp4.type = 11; /* Time exceeded. */
1321 icmp4.code = 0; /* TTL exceeded in transit. */
1322 ip4.dst = ip4.src;
1323 ip4.src = <var>A</var>;
47f3b59b 1324 ip.ttl = 255;
9975d7be
BP
1325 next;
1326};
1327 </pre>
1328
1329 <p>
1330 Not yet implemented.
1331 </p>
1332 </li>
1333
1334 <li>
47f3b59b 1335 TTL discard. A priority-30 flow with match <code>ip.ttl == {0,
9975d7be
BP
1336 1}</code> and actions <code>drop;</code> drops other packets whose TTL
1337 has expired, that should not receive an ICMP error reply (i.e. fragments
1338 with nonzero offset).
1339 </li>
1340
1341 <li>
1342 Next table. A priority-0 flow matches all packets that aren't already
cc4583aa
GS
1343 handled and uses actions <code>next;</code> to feed them to the next
1344 table.
9975d7be
BP
1345 </li>
1346 </ul>
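<p>
The priority ordering of these lower-priority flows can be modelled with
a short Python sketch. The function name and return strings are
hypothetical, and the sketch follows the documented intent, including
the ICMP time exceeded flow that is marked above as not yet implemented.
</p>

```python
def forwarding_precheck(eth_bcast: bool, ttl: int, later_frag: bool) -> str:
    """Return the outcome for a packet that was not addressed to the router."""
    # Priority 50: local Ethernet broadcast is never forwarded.
    if eth_bcast:
        return "drop"
    # Priority 40: TTL expired and not a later fragment.
    if ttl in (0, 1) and not later_frag:
        return "icmp_time_exceeded"
    # Priority 30: TTL expired on a fragment with nonzero offset.
    if ttl in (0, 1):
        return "drop"
    # Priority 0: everything else continues to the next table.
    return "next"
```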
1347
cc4583aa
GS
1348 <h3>Ingress Table 2: DEFRAG</h3>
1349
1350 <p>
1351 This table sends packets to the connection tracker for tracking and
1352 defragmentation. It contains a priority-0 flow that simply moves traffic
1353 to the next table. If load balancing rules with virtual IP addresses
1354 (and ports) are configured in the <code>OVN_Northbound</code> database
1355 for a Gateway router, a priority-100 flow is added for each configured
1356 virtual IP address <var>VIP</var> with a match <code>ip &amp;&amp;
1357 ip4.dst == <var>VIP</var></code> and an action
1358 <code>ct_next;</code> that sends IP packets to the connection tracker for
1359 defragmentation and tracking before sending them to the next table.
1360 </p>
1361
1362 <h3>Ingress Table 3: UNSNAT</h3>
de297547
GS
1363
1364 <p>
1365 This table handles reverse traffic for already established connections;
1366 that is, SNAT has already been done in the egress pipeline and now the
1367 packet has entered the ingress pipeline as part of a reply. It is
1368 unSNATted here.
1369 </p>
1370
06a26dd2
MS
1371 <p>Ingress Table 3: UNSNAT on Gateway Routers</p>
1372
de297547
GS
1373 <ul>
1374 <li>
1375 <p>
65d8810c
GS
1376 If the Gateway router has been configured to force SNAT any
1377 previously DNATted packets to <var>B</var>, a priority-110 flow
1378 matches <code>ip &amp;&amp; ip4.dst == <var>B</var></code> with
1379 an action <code>ct_snat; next;</code>.
1380 </p>
1381
1382 <p>
1383 If the Gateway router has been configured to force SNAT any
1384 previously load-balanced packets to <var>B</var>, a priority-100 flow
1385 matches <code>ip &amp;&amp; ip4.dst == <var>B</var></code> with
1386 an action <code>ct_snat; next;</code>.
1387 </p>
1388
1389 <p>
1390 For each NAT configuration in the OVN Northbound database, that asks
de297547 1391 to change the source IP address of a packet from <var>A</var> to
65d8810c 1392 <var>B</var>, a priority-90 flow matches <code>ip &amp;&amp;
de297547
GS
1393 ip4.dst == <var>B</var></code> with an action
1394 <code>ct_snat; next;</code>.
1395 </p>
1396
1397 <p>
1398 A priority-0 logical flow with match <code>1</code> has actions
1399 <code>next;</code>.
1400 </p>
1401 </li>
1402 </ul>
1403
06a26dd2
MS
1404 <p>Ingress Table 3: UNSNAT on Distributed Routers</p>
1405
1406 <ul>
1407 <li>
1408 <p>
1409 For each configuration in the OVN Northbound database, that asks
1410 to change the source IP address of a packet from <var>A</var> to
1411 <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
1412 ip4.dst == <var>B</var> &amp;&amp; inport == <var>GW</var></code>,
1413 where <var>GW</var> is the logical router gateway port, with an
1414 action <code>ct_snat; next;</code>.
1415 </p>
1416
1417 <p>
1418 If the NAT rule cannot be handled in a distributed manner, then
1419 the priority-100 flow above is only programmed on the
1420 <code>redirect-chassis</code>.
1421 </p>
1422
1423 <p>
1424 For each configuration in the OVN Northbound database, that asks
1425 to change the source IP address of a packet from <var>A</var> to
1426 <var>B</var>, a priority-50 flow matches <code>ip &amp;&amp;
1427 ip4.dst == <var>B</var></code> with an action
1428 <code>REGBIT_NAT_REDIRECT = 1; next;</code>. This flow is for
1429 east/west traffic to a NAT destination IPv4 address. By
1430 setting the <code>REGBIT_NAT_REDIRECT</code> flag, in the
1431 ingress table <code>Gateway Redirect</code> this will trigger a
1432 redirect to the instance of the gateway port on the
1433 <code>redirect-chassis</code>.
1434 </p>
1435
1436 <p>
1437 A priority-0 logical flow with match <code>1</code> has actions
1438 <code>next;</code>.
1439 </p>
1440 </li>
1441 </ul>
1442
cc4583aa 1443 <h3>Ingress Table 4: DNAT</h3>
de297547
GS
1444
1445 <p>
1446 Packets enter the pipeline with a destination IP address that needs to
1447 be DNATted from a virtual IP address to a real IP address. Packets
1448 in the reverse direction need to be unDNATted.
1449 </p>
06a26dd2
MS
1450
1451 <p>Ingress Table 4: DNAT on Gateway Routers</p>
1452
de297547
GS
1453 <ul>
1454 <li>
cc4583aa
GS
1455 For all the configured load balancing rules for a Gateway router in
1456 the <code>OVN_Northbound</code> database that include an L4 port
1457 <var>PORT</var> of protocol <var>P</var> and IPv4 address
1458 <var>VIP</var>, a priority-120 flow that matches on
1459 <code>ct.new &amp;&amp; ip &amp;&amp; ip4.dst == <var>VIP</var>
1460 &amp;&amp; <var>P</var> &amp;&amp; <var>P</var>.dst == <var>PORT
1461 </var></code> with an action of <code>ct_lb(<var>args</var>)</code>,
1462 where <var>args</var> contains comma separated IPv4 addresses (and
65d8810c
GS
1463 optional port numbers) to load balance to. If the Gateway router
1464 is configured to force SNAT any load-balanced packets, the above
1465 action will be replaced by <code>flags.force_snat_for_lb = 1;
1466 ct_lb(<var>args</var>);</code>.
1467 </li>
1468
1469 <li>
1470 For all the configured load balancing rules for a Gateway router in
1471 the <code>OVN_Northbound</code> database that include an L4 port
1472 <var>PORT</var> of protocol <var>P</var> and IPv4 address
1473 <var>VIP</var>, a priority-120 flow that matches on
1474 <code>ct.est &amp;&amp; ip &amp;&amp; ip4.dst == <var>VIP</var>
1475 &amp;&amp; <var>P</var> &amp;&amp; <var>P</var>.dst == <var>PORT
1476 </var></code> with an action of <code>ct_dnat;</code>.
1477 If the Gateway router is configured to force SNAT any load-balanced
1478 packets, the above action will be replaced by
1479 <code>flags.force_snat_for_lb = 1; ct_dnat;</code>.
cc4583aa 1480 </li>
de297547 1481
cc4583aa
GS
1482 <li>
1483 For all the configured load balancing rules for a Gateway router in
1484 the <code>OVN_Northbound</code> database that include just an IP address
1485 <var>VIP</var> to match on, a priority-110 flow that matches on
1486 <code>ct.new &amp;&amp; ip &amp;&amp; ip4.dst ==
1487 <var>VIP</var></code> with an action of
1488 <code>ct_lb(<var>args</var>)</code>, where <var>args</var> contains
65d8810c
GS
1489 comma separated IPv4 addresses. If the Gateway router
1490 is configured to force SNAT any load-balanced packets, the above
1491 action will be replaced by <code>flags.force_snat_for_lb = 1;
1492 ct_lb(<var>args</var>);</code>.
1493 </li>
1494
1495 <li>
1496 For all the configured load balancing rules for a Gateway router in
1497 the <code>OVN_Northbound</code> database that include just an IP address
1498 <var>VIP</var> to match on, a priority-110 flow that matches on
1499 <code>ct.est &amp;&amp; ip &amp;&amp; ip4.dst ==
1500 <var>VIP</var></code> with an action of <code>ct_dnat;</code>.
1501 If the Gateway router is configured to force SNAT any load-balanced
1502 packets, the above action will be replaced by
1503 <code>flags.force_snat_for_lb = 1; ct_dnat;</code>.
cc4583aa 1504 </li>
de297547 1505
cc4583aa
GS
1506 <li>
1507 For each configuration in the OVN Northbound database, that asks
1508 to change the destination IP address of a packet from <var>A</var> to
1509 <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
1510 ip4.dst == <var>A</var></code> with an action
65d8810c
GS
1511 <code>flags.loopback = 1; ct_dnat(<var>B</var>);</code>. If the
1512 Gateway router is configured to force SNAT any DNATed packet,
1513 the above action will be replaced by
1514 <code>flags.force_snat_for_dnat = 1; flags.loopback = 1;
1515 ct_dnat(<var>B</var>);</code>.
cc4583aa
GS
1516 </li>
1517
1518 <li>
1519 For all IP packets of a Gateway router, a priority-50 flow with an
1520 action <code>flags.loopback = 1; ct_dnat;</code>.
1521 </li>
1522
1523 <li>
1524 A priority-0 logical flow with match <code>1</code> has actions
1525 <code>next;</code>.
de297547
GS
1526 </li>
1527 </ul>
1528
06a26dd2
MS
1529 <p>Ingress Table 4: DNAT on Distributed Routers</p>
1530
1531 <p>
1532 On distributed routers, the DNAT table only handles packets
1533 with a destination IP address that needs to be DNATted from a
1534 virtual IP address to a real IP address. The unDNAT processing
1535 in the reverse direction is handled in a separate table in the
1536 egress pipeline.
1537 </p>
1538
1539 <ul>
1540 <li>
1541 <p>
1542 For each configuration in the OVN Northbound database, that asks
1543 to change the destination IP address of a packet from <var>A</var> to
1544 <var>B</var>, a priority-100 flow matches <code>ip &amp;&amp;
1545 ip4.dst == <var>A</var> &amp;&amp; inport == <var>GW</var></code>,
1546 where <var>GW</var> is the logical router gateway port, with an
1547 action <code>ct_dnat(<var>B</var>);</code>.
1548 </p>
1549
1550 <p>
1551 If the NAT rule cannot be handled in a distributed manner, then
1552 the priority-100 flow above is only programmed on the
1553 <code>redirect-chassis</code>.
1554 </p>
1555
1556 <p>
1557 For each configuration in the OVN Northbound database, that asks
1558 to change the destination IP address of a packet from <var>A</var> to
1559 <var>B</var>, a priority-50 flow matches <code>ip &amp;&amp;
1560 ip4.dst == <var>A</var></code> with an action
1561 <code>REGBIT_NAT_REDIRECT = 1; next;</code>. This flow is for
1562 east/west traffic to a NAT destination IPv4 address. By
1563 setting the <code>REGBIT_NAT_REDIRECT</code> flag, in the
1564 ingress table <code>Gateway Redirect</code> this will trigger a
1565 redirect to the instance of the gateway port on the
1566 <code>redirect-chassis</code>.
1567 </p>
1568
1569 <p>
1570 A priority-0 logical flow with match <code>1</code> has actions
1571 <code>next;</code>.
1572 </p>
1573 </li>
1574 </ul>
1575
cc4583aa 1576 <h3>Ingress Table 5: IP Routing</h3>
9975d7be
BP
1577
1578 <p>
6fdb7cd6
JP
1579 A packet that arrives at this table is an IP packet that should be
1580 routed to the address in <code>ip4.dst</code> or
1581 <code>ip6.dst</code>. This table implements IP routing, setting
1582 <code>reg0</code> (or <code>xxreg0</code> for IPv6) to the next-hop IP
1583 address (leaving <code>ip4.dst</code> or <code>ip6.dst</code>, the
1584 packet's final destination, unchanged) and advances to the next
1585 table for ARP resolution. It also sets <code>reg1</code> (or
47021598 1586 <code>xxreg1</code>) to the IP address owned by the selected router
06a26dd2
MS
1587 port (ingress table <code>ARP Request</code> will generate an ARP
1588 request, if needed, with <code>reg0</code> as the target protocol
1589 address and <code>reg1</code> as the source protocol address).
9975d7be
BP
1590 </p>
1591
1592 <p>
1593 This table contains the following logical flows:
1594 </p>
1595
1596 <ul>
06a26dd2
MS
1597 <li>
1598 <p>
1599 For distributed logical routers where one of the logical router
1600 ports specifies a <code>redirect-chassis</code>, a priority-300
1601 logical flow with match <code>REGBIT_NAT_REDIRECT == 1</code> has
1602 actions <code>ip.ttl--; next;</code>. The <code>outport</code>
1603 will be set later in the Gateway Redirect table.
1604 </p>
1605 </li>
1606
9975d7be
BP
1607 <li>
1608 <p>
6fdb7cd6 1609 IPv4 routing table. For each route to IPv4 network <var>N</var> with
0bac7164
BP
1610 netmask <var>M</var>, on router port <var>P</var> with IP address
1611 <var>A</var> and Ethernet
1612 address <var>E</var>, a logical flow with match <code>ip4.dst ==
9975d7be
BP
1613 <var>N</var>/<var>M</var></code>, whose priority is the number of
1614 1-bits in <var>M</var>, has the following actions:
1615 </p>
1616
1617 <pre>
47f3b59b 1618ip.ttl--;
9975d7be 1619reg0 = <var>G</var>;
0bac7164
BP
1620reg1 = <var>A</var>;
1621eth.src = <var>E</var>;
1622outport = <var>P</var>;
bf143492 1623flags.loopback = 1;
9975d7be
BP
1624next;
1625 </pre>
1626
1627 <p>
47f3b59b 1628 (Ingress table 1 already verified that <code>ip.ttl--;</code> will
9975d7be
BP
1629 not yield a TTL exceeded error.)
1630 </p>
1631
1632 <p>
28dc3fe9
SR
1633 If the route has a gateway, <var>G</var> is the gateway IP address.
1634 Instead, if the route is from a configured static route, <var>G</var>
1635 is the next hop IP address. Else it is <code>ip4.dst</code>.
9975d7be
BP
1636 </p>
1637 </li>
6fdb7cd6
JP
1638
1639 <li>
1640 <p>
1641 IPv6 routing table. For each route to IPv6 network
1642 <var>N</var> with netmask <var>M</var>, on router port
1643 <var>P</var> with IP address <var>A</var> and Ethernet address
1644 <var>E</var>, a logical flow with match in CIDR notation
1645 <code>ip6.dst == <var>N</var>/<var>M</var></code>,
1646 whose priority is the integer value of <var>M</var>, has the
1647 following actions:
1648 </p>
1649
1650 <pre>
1651ip.ttl--;
1652xxreg0 = <var>G</var>;
1653xxreg1 = <var>A</var>;
1654eth.src = <var>E</var>;
1655outport = <var>P</var>;
bf143492 1656flags.loopback = 1;
6fdb7cd6
JP
1657next;
1658 </pre>
1659
1660 <p>
1661 (Ingress table 1 already verified that <code>ip.ttl--;</code> will
1662 not yield a TTL exceeded error.)
1663 </p>
1664
1665 <p>
1666 If the route has a gateway, <var>G</var> is the gateway IP address.
1667 Instead, if the route is from a configured static route, <var>G</var>
1668 is the next hop IP address. Else it is <code>ip6.dst</code>.
1669 </p>
a63f7235
JP
1670
1671 <p>
1672 If the address <var>A</var> is in the link-local scope, the
1673 route will be limited to sending on the ingress port.
1674 </p>
6fdb7cd6 1675 </li>
9975d7be
BP
1676 </ul>
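<p>
Because the flow priority equals the number of 1-bits in the netmask
(the prefix length), the highest-priority matching flow is the
longest-prefix match. The following Python sketch illustrates that
selection with the standard <code>ipaddress</code> module. The route
dictionaries and the names <code>route_priority</code> and
<code>select_route</code> are assumptions for this example, and the
sketch only distinguishes routes with and without a gateway.
</p>

```python
import ipaddress


def route_priority(prefix: str) -> int:
    """Flow priority for a route: the number of 1-bits in its netmask."""
    net = ipaddress.ip_network(prefix)
    return bin(int(net.netmask)).count("1")


def select_route(dst: str, routes: list):
    """Pick the matching route with the highest priority.

    Each route is a dict with the keys "prefix" and "port", plus an
    optional "gateway". Returns None when no route matches.
    """
    addr = ipaddress.ip_address(dst)
    matches = [r for r in routes if addr in ipaddress.ip_network(r["prefix"])]
    if not matches:
        return None
    best = max(matches, key=lambda r: route_priority(r["prefix"]))
    # With a gateway, the next hop is the gateway; otherwise it is the
    # packet's own destination address.
    next_hop = best.get("gateway") or dst
    return {"outport": best["port"], "next_hop": next_hop}
```

<p>
In this model a packet to <code>10.1.2.3</code> prefers a
<code>10.1.0.0/16</code> route (priority 16) over a
<code>10.0.0.0/8</code> route (priority 8).
</p>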
1677
cc4583aa 1678 <h3>Ingress Table 6: ARP/ND Resolution</h3>
9975d7be
BP
1679
1680 <p>
6fdb7cd6
JP
1681 Any packet that reaches this table is an IP packet whose next-hop
1682 IPv4 address is in <code>reg0</code> or IPv6 address is in
1683 <code>xxreg0</code>. (<code>ip4.dst</code> or
1684 <code>ip6.dst</code> contains the final destination.) This table
1685 resolves the IP address in <code>reg0</code> (or
1686 <code>xxreg0</code>) into an output port in <code>outport</code>
1687 and an Ethernet address in <code>eth.dst</code>, using the
1688 following flows:
9975d7be
BP
1689 </p>
1690
1691 <ul>
06a26dd2
MS
1692 <li>
1693 <p>
1694 For distributed logical routers where one of the logical router
1695 ports specifies a <code>redirect-chassis</code>, a priority-200
1696 logical flow with match <code>REGBIT_NAT_REDIRECT == 1</code> has
1697 actions <code>eth.dst = <var>E</var>; next;</code>, where
1698 <var>E</var> is the Ethernet address of the router's distributed
1699 gateway port.
1700 </p>
1701 </li>
1702
9975d7be
BP
1703 <li>
1704 <p>
0bac7164
BP
1705 Static MAC bindings. MAC bindings can be known statically based on
1706 data in the <code>OVN_Northbound</code> database. For router ports
1707 connected to logical switches, MAC bindings can be known statically
1708 from the <code>addresses</code> column in the
80f408f4
JP
1709 <code>Logical_Switch_Port</code> table. For router ports
1710 connected to other logical routers, MAC bindings can be known
4685e523 1711 statically from the <code>mac</code> and <code>networks</code>
80f408f4 1712 columns in the <code>Logical_Router_Port</code> table.
9975d7be
BP
1713 </p>
1714
0bac7164 1715 <p>
6fdb7cd6
JP
1716 For each IPv4 address <var>A</var> whose host is known to have
1717 Ethernet address <var>E</var> on router port <var>P</var>, a
1718 priority-100 flow with match <code>outport == <var>P</var>
1719 &amp;&amp; reg0 == <var>A</var></code> has actions
1720 <code>eth.dst = <var>E</var>; next;</code>.
1721 </p>
1722
1723 <p>
1724 For each IPv6 address <var>A</var> whose host is known to have
1725 Ethernet address <var>E</var> on router port <var>P</var>, a
1726 priority-100 flow with match <code>outport == <var>P</var>
1727 &amp;&amp; xxreg0 == <var>A</var></code> has actions
1728 <code>eth.dst = <var>E</var>; next;</code>.
1729 </p>
1730
1731 <p>
1732 For each logical router port with an IPv4 address <var>A</var> and
1733 a mac address of <var>E</var> that is reachable via a different
1734 logical router port <var>P</var>, a priority-100 flow with
1735 match <code>outport === <var>P</var> &amp;&amp; reg0 ==
0bac7164
BP
1736 <var>A</var></code> has actions <code>eth.dst = <var>E</var>;
1737 next;</code>.
1738 </p>
509afdc3
GS
1739
1740 <p>
6fdb7cd6 1741 For each logical router port with an IPv6 address <var>A</var> and
509afdc3
GS
1742 a mac address of <var>E</var> that is reachable via a different
1743 logical router port <var>P</var>, a priority-100 flow with
6fdb7cd6 1744 match <code>outport === <var>P</var> &amp;&amp; xxreg0 ==
509afdc3
GS
1745 <var>A</var></code> has actions <code>eth.dst = <var>E</var>;
1746 next;</code>.
1747 </p>
0bac7164
BP
1748 </li>
1749
      <li>
        <p>
          Dynamic MAC bindings. These flows resolve MAC-to-IP bindings
          that have become known dynamically through ARP or neighbor
          discovery. (The ingress table <code>ARP Request</code> will
          issue an ARP or neighbor solicitation request for cases where
          the binding is not yet known.)
        </p>

        <p>
          A priority-0 logical flow with match <code>ip4</code> has actions
          <code>get_arp(outport, reg0); next;</code>.
        </p>

        <p>
          A priority-0 logical flow with match <code>ip6</code> has actions
          <code>get_nd(outport, xxreg0); next;</code>.
        </p>
      </li>
    </ul>
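
    <p>
      As an illustration only, the priority ordering of the flows in this
      table can be sketched in Python. This is a simplified model, not code
      from <code>ovn-northd</code>: the function name, its parameters, and
      the lookup dictionaries are hypothetical, and the all-zero result
      stands for the unresolved Ethernet address that the
      <code>ARP Request</code> table matches on.
    </p>

    <pre>
```python
# Hypothetical model of the ARP/ND resolution flows; not ovn-northd code.
UNKNOWN_MAC = "00:00:00:00:00:00"


def resolve_eth_dst(outport, next_hop, static, dynamic,
                    nat_redirect=False, gw_mac=None):
    """Return the eth.dst chosen for a packet leaving outport toward next_hop.

    static and dynamic map (outport, next-hop IP) pairs to Ethernet
    addresses; both are assumptions made for this sketch.
    """
    # Priority-200 flow: REGBIT_NAT_REDIRECT == 1 selects the Ethernet
    # address of the distributed gateway port.
    if nat_redirect:
        return gw_mac
    key = (outport, next_hop)
    # Priority-100 flows: static MAC bindings.
    if key in static:
        return static[key]
    # Priority-0 flows: dynamic lookup (get_arp/get_nd); an all-zero
    # address means the binding is not yet known.
    return dynamic.get(key, UNKNOWN_MAC)
```
    </pre>

    <p>
      In this sketch a static binding always wins over a dynamic one for the
      same port and address, which mirrors the priority-100 flows
      taking precedence over the priority-0 flows.
    </p>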

    <h3>Ingress Table 7: Gateway Redirect</h3>

    <p>
      For distributed logical routers where one of the logical router
      ports specifies a <code>redirect-chassis</code>, this table redirects
      certain packets to the distributed gateway port instance on the
      <code>redirect-chassis</code>. This table has the following flows:
    </p>

    <ul>
      <li>
        A priority-200 logical flow with match
        <code>REGBIT_NAT_REDIRECT == 1</code> has actions
        <code>outport = <var>CR</var>; next;</code>, where <var>CR</var>
        is the <code>chassisredirect</code> port representing the instance
        of the logical router distributed gateway port on the
        <code>redirect-chassis</code>.
      </li>

      <li>
        A priority-150 logical flow with match
        <code>outport == <var>GW</var> &amp;&amp;
        eth.dst == 00:00:00:00:00:00</code> has actions
        <code>outport = <var>CR</var>; next;</code>, where
        <var>GW</var> is the logical router distributed gateway
        port and <var>CR</var> is the <code>chassisredirect</code>
        port representing the instance of the logical router
        distributed gateway port on the
        <code>redirect-chassis</code>.
      </li>

      <li>
        For each NAT rule in the OVN Northbound database that can
        be handled in a distributed manner, a priority-100 logical
        flow with match <code>ip4.src == <var>B</var> &amp;&amp;
        outport == <var>GW</var></code>, where <var>GW</var> is
        the logical router distributed gateway port, has actions
        <code>next;</code>.
      </li>

      <li>
        A priority-50 logical flow with match
        <code>outport == <var>GW</var></code> has actions
        <code>outport = <var>CR</var>; next;</code>, where
        <var>GW</var> is the logical router distributed gateway
        port and <var>CR</var> is the <code>chassisredirect</code>
        port representing the instance of the logical router
        distributed gateway port on the
        <code>redirect-chassis</code>.
      </li>

      <li>
        A priority-0 logical flow with match <code>1</code> has actions
        <code>next;</code>.
      </li>
    </ul>
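
    <p>
      The way these priorities interact can be sketched in Python. This is
      a rough model under stated assumptions, not the daemon's
      implementation: the packet is a plain dictionary, and the function
      name, the dictionary keys, and the set of distributed NAT source
      addresses are all hypothetical.
    </p>

    <pre>
```python
# Hypothetical model of the Gateway Redirect flows; not ovn-northd code.
UNKNOWN_MAC = "00:00:00:00:00:00"


def gateway_redirect(pkt, gw, cr, distributed_nat_sources):
    """Return the outport of pkt after this table.

    gw and cr name the distributed gateway port and its chassisredirect
    port; distributed_nat_sources holds the ip4.src values of NAT rules
    that can be handled in a distributed manner.
    """
    if pkt.get("nat_redirect"):
        return cr                 # priority 200: REGBIT_NAT_REDIRECT == 1
    if pkt["outport"] != gw:
        return pkt["outport"]     # priority 0: nothing else can match
    if pkt["eth_dst"] == UNKNOWN_MAC:
        return cr                 # priority 150: unresolved eth.dst
    if pkt["ip4_src"] in distributed_nat_sources:
        return gw                 # priority 100: stay on the local chassis
    return cr                     # priority 50: redirect everything else
```
    </pre>

    <p>
      In this model, only traffic that leaves through the gateway port with
      a resolved destination and a distributed NAT source address stays on
      the local chassis.
    </p>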

    <h3>Ingress Table 8: ARP Request</h3>

    <p>
      In the common case where the Ethernet destination has been resolved, this
      table outputs the packet. Otherwise, it composes and sends an ARP
      request. It holds the following flows:
    </p>

    <ul>
      <li>
        <p>
          Unknown MAC address. A priority-100 flow with match <code>eth.dst ==
          00:00:00:00:00:00</code> has the following actions:
        </p>

        <pre>
arp {
    eth.dst = ff:ff:ff:ff:ff:ff;
    arp.spa = reg1;
    arp.tpa = reg0;
    arp.op = 1;  /* ARP request. */
    output;
};
        </pre>

        <p>
          (Ingress table <code>IP Routing</code> initialized <code>reg1</code>
          with the IP address owned by <code>outport</code> and
          <code>reg0</code> with the next-hop IP address.)
        </p>

        <p>
          The IP packet that triggers the ARP request is dropped.
        </p>
      </li>

      <li>
        Known MAC address. A priority-0 flow with match <code>1</code> has
        actions <code>output;</code>.
      </li>
    </ul>
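
    <p>
      As a loose illustration of the two flows above, the following Python
      sketch shows which frame would leave this table. It is a simplified
      model with hypothetical names: the packet is a dictionary, only the
      fields set by the <code>arp</code> action are modeled, and the
      assumption that <code>outport</code> carries over to the ARP request
      is ours.
    </p>

    <pre>
```python
# Hypothetical model of the ARP Request flows; not ovn-northd code.
UNKNOWN_MAC = "00:00:00:00:00:00"
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def arp_request_table(pkt):
    """Return the frame that this table would output for pkt."""
    if pkt["eth_dst"] != UNKNOWN_MAC:
        # Priority-0 flow: the destination is resolved, output as is.
        return pkt
    # Priority-100 flow: the IP packet is dropped and an ARP request
    # is sent in its place.
    return {
        "eth_dst": BROADCAST_MAC,
        "arp_spa": pkt["reg1"],    # IP address owned by outport
        "arp_tpa": pkt["reg0"],    # next-hop IP address
        "arp_op": 1,               # ARP request
        "outport": pkt["outport"],
    }
```
    </pre>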

    <h3>Egress Table 0: UNDNAT</h3>

    <p>
      This table handles reverse traffic for already established
      connections. That is, DNAT has already been done in the ingress
      pipeline and now the packet has entered the egress pipeline as part
      of a reply. For NAT on a distributed router, it is unDNATted here.
      For Gateway routers, the unDNAT processing is carried out in the
      ingress DNAT table.
    </p>

    <ul>
      <li>
        <p>
          For each configuration in the OVN Northbound database that asks
          to change the destination IP address of a packet from an IP
          address of <var>A</var> to <var>B</var>, a priority-100 flow
          matches <code>ip &amp;&amp; ip4.src == <var>B</var>
          &amp;&amp; outport == <var>GW</var></code>, where <var>GW</var>
          is the logical router gateway port, with an action
          <code>ct_dnat;</code>.
        </p>

        <p>
          If the NAT rule cannot be handled in a distributed manner, then
          the priority-100 flow above is only programmed on the
          <code>redirect-chassis</code>.
        </p>

        <p>
          If the NAT rule can be handled in a distributed manner, then
          there is an additional action
          <code>eth.src = <var>EA</var>;</code>, where <var>EA</var>
          is the Ethernet address associated with the IP address
          <var>A</var> in the NAT rule. This allows upstream MAC
          learning to point to the correct chassis.
        </p>
      </li>

      <li>
        A priority-0 logical flow with match <code>1</code> has actions
        <code>next;</code>.
      </li>
    </ul>

    <h3>Egress Table 1: SNAT</h3>

    <p>
      Packets that are configured to be SNATed get their source IP address
      changed based on the configuration in the OVN Northbound database.
    </p>

    <p>Egress Table 1: SNAT on Gateway Routers</p>

    <ul>
      <li>
        <p>
          If the Gateway router in the OVN Northbound database has been
          configured to force SNAT a packet (that has been previously DNATted)
          to <var>B</var>, a priority-100 flow matches
          <code>flags.force_snat_for_dnat == 1 &amp;&amp; ip</code> with an
          action <code>ct_snat(<var>B</var>);</code>.
        </p>
        <p>
          If the Gateway router in the OVN Northbound database has been
          configured to force SNAT a packet (that has been previously
          load-balanced) to <var>B</var>, a priority-100 flow matches
          <code>flags.force_snat_for_lb == 1 &amp;&amp; ip</code> with an
          action <code>ct_snat(<var>B</var>);</code>.
        </p>
        <p>
          For each configuration in the OVN Northbound database that asks
          to change the source IP address of a packet from <var>A</var> to
          <var>B</var>, where <var>A</var> is either a single IP address
          or a network, a flow matches
          <code>ip &amp;&amp; ip4.src == <var>A</var></code> with an action
          <code>ct_snat(<var>B</var>);</code>. The priority of the flow
          is calculated based on the mask of <var>A</var>, with matches
          having larger masks getting higher priorities.
        </p>
        <p>
          A priority-0 logical flow with match <code>1</code> has actions
          <code>next;</code>.
        </p>
      </li>
    </ul>
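
    <p>
      The effect of mask-based priorities can be illustrated with a short
      Python sketch. The exact priority formula is not given here, so the
      sketch simply assumes that a longer prefix stands for a higher
      priority. The function name and the rule list are hypothetical, and
      the force-SNAT flows are not modeled.
    </p>

    <pre>
```python
# Hypothetical model of SNAT rule selection; not ovn-northd code.
import ipaddress


def pick_snat_rule(src_ip, rules):
    """Return the external IP B for src_ip, or None if no rule matches.

    rules is a list of (A, B) string pairs, where A is an IPv4 address
    or network and B is the address to SNAT to.
    """
    addr = ipaddress.ip_address(src_ip)
    matching = [
        (ipaddress.ip_network(a).prefixlen, b)
        for a, b in rules
        if addr in ipaddress.ip_network(a)
    ]
    if not matching:
        return None  # only the priority-0 flow matches: no SNAT
    # Assumption: a larger mask (longer prefix) means a higher priority.
    return max(matching)[1]
```
    </pre>

    <p>
      With rules for both <code>10.0.0.0/24</code> and
      <code>10.0.0.5</code>, this model selects the host rule for a packet
      from <code>10.0.0.5</code> and the network rule for any other source
      in that network.
    </p>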

    <p>Egress Table 1: SNAT on Distributed Routers</p>

    <ul>
      <li>
        <p>
          For each configuration in the OVN Northbound database that asks
          to change the source IP address of a packet from <var>A</var> to
          <var>B</var>, where <var>A</var> is either a single IP address
          or a network, a flow matches
          <code>ip &amp;&amp; ip4.src == <var>A</var> &amp;&amp;
          outport == <var>GW</var></code>, where <var>GW</var> is the
          logical router gateway port, with an action
          <code>ct_snat(<var>B</var>);</code>. The priority of the flow
          is calculated based on the mask of <var>A</var>, with matches
          having larger masks getting higher priorities.
        </p>

        <p>
          If the NAT rule cannot be handled in a distributed manner, then
          the flow above is only programmed on the
          <code>redirect-chassis</code>.
        </p>

        <p>
          If the NAT rule can be handled in a distributed manner, then
          there is an additional action
          <code>eth.src = <var>EA</var>;</code>, where <var>EA</var>
          is the Ethernet address associated with the IP address
          <var>A</var> in the NAT rule. This allows upstream MAC
          learning to point to the correct chassis.
        </p>
      </li>

      <li>
        A priority-0 logical flow with match <code>1</code> has actions
        <code>next;</code>.
      </li>
    </ul>

    <h3>Egress Table 2: Egress Loopback</h3>

    <p>
      This table applies to distributed logical routers where one of the
      logical router ports specifies a <code>redirect-chassis</code>.
    </p>

    <p>
      Earlier in the ingress pipeline, some east-west traffic was
      redirected to the <code>chassisredirect</code> port, based on
      flows in the <code>UNSNAT</code> and <code>DNAT</code> ingress
      tables setting the <code>REGBIT_NAT_REDIRECT</code> flag, which
      then triggered a match to a flow in the
      <code>Gateway Redirect</code> ingress table. The intention was
      not to actually send traffic out the distributed gateway port
      instance on the <code>redirect-chassis</code>. This traffic was
      sent to the distributed gateway port instance in order for DNAT
      and/or SNAT processing to be applied.
    </p>

    <p>
      While UNDNAT and SNAT processing have already occurred by this
      point, this traffic needs to be forced through egress loopback on
      this distributed gateway port instance, in order for UNSNAT and
      DNAT processing to be applied, and also for IP routing and ARP
      resolution after all of the NAT processing, so that the packet can
      be forwarded to the destination.
    </p>

    <p>
      This table has the following flows:
    </p>

    <ul>
      <li>
        <p>
          For each NAT rule in the OVN Northbound database on a
          distributed router, a priority-100 logical flow with match
          <code>ip4.dst == <var>E</var> &amp;&amp;
          outport == <var>GW</var></code>, where <var>E</var> is the
          external IP address specified in the NAT rule, and <var>GW</var>
          is the logical router distributed gateway port, with the
          following actions:
        </p>

        <pre>
clone {
    ct_clear;
    inport = outport;
    outport = "";
    flags = 0;
    flags.loopback = 1;
    reg0 = 0;
    reg1 = 0;
    ...
    reg9 = 0;
    REGBIT_EGRESS_LOOPBACK = 1;
    next(pipeline=ingress, table=0);
};
        </pre>

        <p>
          <code>flags.loopback</code> is set since in_port is unchanged
          and the packet may return back to that port after NAT processing.
          <code>REGBIT_EGRESS_LOOPBACK</code> is set to indicate that
          egress loopback has occurred, in order to skip the source IP
          address check against the router address.
        </p>
      </li>

      <li>
        A priority-0 logical flow with match <code>1</code> has actions
        <code>next;</code>.
      </li>
    </ul>
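
    <p>
      To make the effect of the <code>clone</code> action above more
      concrete, the following Python sketch applies the same steps to a
      packet modeled as a dictionary. It is an illustration under stated
      assumptions only: the dictionary keys and the function name are
      hypothetical, and <code>REGBIT_EGRESS_LOOPBACK</code> is modeled as
      a separate key rather than as a register bit.
    </p>

    <pre>
```python
# Hypothetical model of the egress loopback clone; not ovn-northd code.
def egress_loopback(pkt):
    """Return (pipeline, table, clone) for the looped-back copy of pkt."""
    clone = dict(pkt)                     # the original packet is untouched
    clone["ct"] = None                    # ct_clear;
    clone["inport"] = pkt["outport"]      # inport = outport;
    clone["outport"] = ""                 # outport = "";
    clone["flags"] = {"loopback": 1}      # flags = 0; flags.loopback = 1;
    clone["regs"] = [0] * 10              # reg0 = 0; ... reg9 = 0;
    clone["egress_loopback"] = 1          # REGBIT_EGRESS_LOOPBACK = 1;
    return ("ingress", 0, clone)          # next(pipeline=ingress, table=0);
```
    </pre>

    <p>
      The copy re-enters the ingress pipeline at table 0 with the gateway
      port as its <code>inport</code>, while the original packet continues
      unchanged.
    </p>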

    <h3>Egress Table 3: Delivery</h3>

    <p>
      Packets that reach this table are ready for delivery. It contains
      priority-100 logical flows that match packets on each enabled logical
      router port, with action <code>output;</code>.
    </p>

</manpage>