Commit | Line | Data |
---|---|---|
fe36184b | 1 | <?xml version="1.0" encoding="utf-8"?> |
ec78987f | 2 | <database name="ovn-sb" title="OVN Southbound Database"> |
fe36184b BP |
3 | <p> |
4 | This database holds logical and physical configuration and state for the | |
5 | Open Virtual Network (OVN) system to support virtual network abstraction. | |
6 | For an introduction to OVN, please see <code>ovn-architecture</code>(7). | |
7 | </p> | |
8 | ||
9 | <p> | |
ec78987f JP |
10 | The OVN Southbound database sits at the center of the OVN |
11 | architecture. It is the one component that speaks both southbound | |
12 | directly to all the hypervisors and gateways, via | |
88058f19 AW |
13 | <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, and |
14 | northbound to the Cloud Management System, via <code>ovn-northd</code>. | |
fe36184b BP |
15 | </p> |
16 | ||
17 | <h2>Database Structure</h2> | |
18 | ||
19 | <p> | |
0bac7164 | 20 | The OVN Southbound database contains classes of data with |
ec78987f | 21 | different properties, as described in the sections below. |
fe36184b BP |
22 | </p> |
23 | ||
24 | <h3>Physical Network (PN) data</h3> | |
25 | ||
26 | <p> | |
27 | PN tables contain information about the chassis nodes in the system. This | |
28 | contains all the information necessary to wire the overlay, such as IP | |
29 | addresses, supported tunnel types, and security keys. | |
30 | </p> | |
31 | ||
32 | <p> | |
33 | The amount of PN data is small (O(n) in the number of chassis) and it | |
34 | changes infrequently, so it can be replicated to every chassis. | |
35 | </p> | |
36 | ||
37 | <p> | |
62fdd819 | 38 | The <ref table="Chassis"/> table comprises the PN tables. |
fe36184b BP |
39 | </p> |
40 | ||
41 | <h3>Logical Network (LN) data</h3> | |
42 | ||
43 | <p> | |
44 | LN tables contain the topology of logical switches and routers, ACLs, | |
45 | firewall rules, and everything needed to describe how packets traverse a | |
46 | logical network, represented as logical datapath flows (see Logical | |
47 | Datapath Flows, below). | |
48 | </p> | |
49 | ||
50 | <p> | |
51 | LN data may be large (O(n) in the number of logical ports, ACL rules, | |
52 | etc.). Thus, to improve scaling, each chassis should receive only data | |
53 | related to logical networks in which that chassis participates. Past | |
54 | experience shows that in the presence of large logical networks, even | |
55 | finer-grained partitioning of data, e.g. designing logical flows so that | |
56 | only the chassis hosting a logical port needs related flows, pays off | |
57 | scale-wise. (This is not necessary initially but it is worth bearing in | |
58 | mind in the design.) | |
59 | </p> | |
60 | ||
61 | <p> | |
62 | The LN is a slave of the cloud management system running northbound of OVN. | |
63 | That CMS determines the entire OVN logical configuration and therefore the | |
64 | LN's content at any given time is a deterministic function of the CMS's | |
09986f8c JP |
65 | configuration, although that happens indirectly via the |
66 | <ref db="OVN_Northbound"/> database and <code>ovn-northd</code>. | |
fe36184b BP |
67 | </p> |
68 | ||
69 | <p> | |
70 | LN data is likely to change more quickly than PN data. This is especially | |
71 | true in a container environment where VMs are created and destroyed (and | |
72 | therefore added to and deleted from logical switches) quickly. | |
73 | </p> | |
74 | ||
75 | <p> | |
5868eb24 BP |
76 | <ref table="Logical_Flow"/> and <ref table="Multicast_Group"/> contain LN |
77 | data. | |
fe36184b BP |
78 | </p> |
79 | ||
0bac7164 | 80 | <h3>Logical-physical bindings</h3> |
fe36184b BP |
81 | |
82 | <p> | |
0bac7164 | 83 | These tables link logical and physical components. They show the current |
5868eb24 BP |
84 | placement of logical components (such as VMs and VIFs) onto chassis, and |
85 | map logical entities to the values that represent them in tunnel | |
86 | encapsulations. | |
fe36184b BP |
87 | </p> |
88 | ||
89 | <p> | |
0bac7164 | 90 | These tables change frequently, at least every time a VM powers up or down |
fe36184b BP |
91 | or migrates, and especially quickly in a container environment. The |
92 | amount of data per VM (or VIF) is small. | |
93 | </p> | |
94 | ||
95 | <p> | |
96 | Each chassis is authoritative about the VMs and VIFs that it hosts at any | |
97 | given time and can efficiently flood that state to a central location, so | |
98 | the consistency needs are minimal. | |
99 | </p> | |
100 | ||
101 | <p> | |
5868eb24 BP |
102 | The <ref table="Port_Binding"/> and <ref table="Datapath_Binding"/> tables |
103 | contain binding data. | |
fe36184b BP |
104 | </p> |
105 | ||
0bac7164 BP |
106 | <h3>MAC bindings</h3> |
107 | ||
108 | <p> | |
109 | The <ref table="MAC_Binding"/> table tracks the bindings from IP addresses | |
110 | to Ethernet addresses that are dynamically discovered using ARP (for IPv4) | |
111 | and neighbor discovery (for IPv6). Usually, IP-to-MAC bindings for virtual | |
112 | machines are statically populated into the <ref table="Port_Binding"/> | |
113 | table, so <ref table="MAC_Binding"/> is primarily used to discover bindings | |
114 | on physical networks. | |
115 | </p> | |
116 | ||
5868eb24 BP |
117 | <h2>Common Columns</h2> |
118 | ||
119 | <p> | |
120 | Some tables contain a special column named <code>external_ids</code>. This | |
121 | column has the same form and purpose each place that it appears, so we | |
122 | describe it here to save space later. | |
123 | </p> | |
124 | ||
125 | <dl> | |
126 | <dt><code>external_ids</code>: map of string-string pairs</dt> | |
127 | <dd> | |
128 | Key-value pairs for use by the software that manages the OVN Southbound | |
88058f19 AW |
129 | database rather than by |
130 | <code>ovn-controller</code>/<code>ovn-controller-vtep</code>. In | |
131 | particular, <code>ovn-northd</code> can use key-value pairs in this | |
132 | column to relate entities in the southbound database to higher-level | |
133 | entities (such as entities in the OVN Northbound database). Individual | |
134 | key-value pairs in this column may be documented in some cases to aid | |
135 | in understanding and troubleshooting, but the reader should not mistake | |
136 | such documentation as comprehensive. | |
5868eb24 BP |
137 | </dd> |
138 | </dl> | |
139 | ||
fe36184b BP |
140 | <table name="Chassis" title="Physical Network Hypervisor and Gateway Information"> |
141 | <p> | |
142 | Each row in this table represents a hypervisor or gateway (a chassis) in | |
143 | the physical network (PN). Each chassis, via | |
88058f19 AW |
144 | <code>ovn-controller</code>/<code>ovn-controller-vtep</code>, adds |
145 | and updates its own row, and keeps a copy of the remaining rows to | |
146 | determine how to reach other hypervisors. | |
fe36184b BP |
147 | </p> |
148 | ||
149 | <p> | |
150 | When a chassis shuts down gracefully, it should remove its own row. | |
151 | (This is not critical because resources hosted on the chassis are equally | |
152 | unreachable regardless of whether the row is present.) If a chassis | |
153 | shuts down permanently without removing its row, some kind of manual or | |
154 | automatic cleanup is eventually needed; we can devise a process for that | |
155 | as necessary. | |
156 | </p> | |
157 | ||
158 | <column name="name"> | |
fc26cf25 RB |
159 | OVN does not prescribe a particular format for chassis names. |
160 | ovn-controller populates this column using <ref key="system-id" | |
161 | table="Open_vSwitch" column="external_ids" db="Open_vSwitch"/> | |
162 | in the Open_vSwitch database's <ref table="Open_vSwitch" | |
163 | db="Open_vSwitch"/> table. ovn-controller-vtep populates this | |
164 | column with <ref table="Physical_Switch" column="name" | |
165 | db="hardware_vtep"/> in the hardware_vtep database's | |
166 | <ref table="Physical_Switch" db="hardware_vtep"/> table. | |
fe36184b BP |
167 | </column> |
168 | ||
2229f3ec RB |
169 | <column name="hostname"> |
170 | The hostname of the chassis, if applicable. ovn-controller will populate | |
171 | this column with the hostname of the host it is running on. | |
172 | ovn-controller-vtep will leave this column empty. | |
173 | </column> | |
174 | ||
4250ee37 RB |
175 | <column name="external_ids" key="ovn-bridge-mappings"> |
176 | <code>ovn-controller</code> populates this key with the set of bridge | |
177 | mappings it has been configured to use. Other applications should treat | |
178 | this key as read-only. See <code>ovn-controller</code>(8) for more | |
179 | information. | |
180 | </column> | |
181 | ||
1cef5fff RB |
182 | <group title="Common Columns"> |
183 | The overall purpose of these columns is described under <code>Common | |
184 | Columns</code> at the beginning of this document. | |
185 | ||
186 | <column name="external_ids"/> | |
187 | </group> | |
188 | ||
09db214c | 189 | <group title="Encapsulation Configuration"> |
fe36184b | 190 | <p> |
09db214c JP |
191 | OVN uses encapsulation to transmit logical dataplane packets |
192 | between chassis. | |
fe36184b BP |
193 | </p> |
194 | ||
09db214c JP |
195 | <column name="encaps"> |
196 | Points to supported encapsulation configurations to transmit | |
197 | logical dataplane packets to this chassis. Each entry is a <ref | |
198 | table="Encap"/> record that describes the configuration. | |
fe36184b BP |
199 | </column> |
200 | </group> | |
201 | ||
62fdd819 AW |
202 | <group title="Gateway Configuration"> |
203 | <p> | |
204 | A <dfn>gateway</dfn> is a chassis that forwards traffic between the | |
205 | OVN-managed part of a logical network and a physical VLAN, extending a | |
206 | tunnel-based logical network into a physical network. Gateways are | |
88058f19 AW |
207 | typically dedicated nodes that do not host VMs and will be controlled |
208 | by <code>ovn-controller-vtep</code>. | |
fe36184b BP |
209 | </p> |
210 | ||
62fdd819 | 211 | <column name="vtep_logical_switches"> |
88058f19 AW |
212 | Stores all VTEP logical switch names connected by this gateway |
213 | chassis. The <ref table="Port_Binding"/> table entry with | |
214 | <ref column="options" table="Port_Binding"/>:<code>vtep-physical-switch</code> | |
215 | equal to <ref table="Chassis"/> <ref column="name" table="Chassis"/>, and | |
216 | <ref column="options" table="Port_Binding"/>:<code>vtep-logical-switch</code> | |
217 | value in <ref table="Chassis"/> | |
218 | <ref column="vtep_logical_switches" table="Chassis"/>, will be | |
219 | associated with this <ref table="Chassis"/>. | |
fe36184b | 220 | </column> |
62fdd819 | 221 | </group> |
fe36184b BP |
222 | </table> |
223 | ||
09db214c JP |
224 | <table name="Encap" title="Encapsulation Types"> |
225 | <p> | |
226 | The <ref column="encaps" table="Chassis"/> column in the <ref | |
227 | table="Chassis"/> table refers to rows in this table to identify | |
228 | how OVN may transmit logical dataplane packets to this chassis. | |
88058f19 AW |
229 | Each chassis, via <code>ovn-controller</code>(8) or |
230 | <code>ovn-controller-vtep</code>(8), adds and updates its own rows | |
231 | and keeps a copy of the remaining rows to determine how to reach | |
232 | other chassis. | |
09db214c JP |
233 | </p> |
234 | ||
235 | <column name="type"> | |
236 | The encapsulation to use to transmit packets to this chassis. | |
b705f9ea JP |
237 | Hypervisors must use either <code>geneve</code> or |
238 | <code>stt</code>. Gateways may use <code>vxlan</code>, | |
239 | <code>geneve</code>, or <code>stt</code>. | |
09db214c JP |
240 | </column> |
241 | ||
242 | <column name="options"> | |
243 | Options for configuring the encapsulation, e.g. IPsec parameters when | |
244 | IPsec support is introduced. No options are currently defined. | |
245 | </column> | |
246 | ||
247 | <column name="ip"> | |
248 | The IPv4 address of the encapsulation tunnel endpoint. | |
249 | </column> | |
250 | </table> | |
251 | ||
ea382567 RB |
252 | <table name="Address_Set" title="Address Sets"> |
253 | <p> | |
254 | See the documentation for the <ref table="Address_Set" | |
255 | db="OVN_Northbound"/> table in the <ref db="OVN_Northbound"/> database | |
256 | for details. | |
257 | </p> | |
258 | ||
259 | <column name="name"/> | |
260 | <column name="addresses"/> | |
261 | </table> | |
262 | ||
5868eb24 | 263 | <table name="Logical_Flow" title="Logical Network Flows"> |
fe36184b | 264 | <p> |
09986f8c JP |
265 | Each row in this table represents one logical flow. |
266 | <code>ovn-northd</code> populates this table with logical flows | |
267 | that implement the L2 and L3 topologies specified in the | |
268 | <ref db="OVN_Northbound"/> database. Each hypervisor, via | |
269 | <code>ovn-controller</code>, translates the logical flows into | |
270 | OpenFlow flows specific to its hypervisor and installs them into | |
271 | Open vSwitch. | |
fe36184b BP |
272 | </p> |
273 | ||
274 | <p> | |
275 | Logical flows are expressed in an OVN-specific format, described here. A | |
276 | logical datapath flow is much like an OpenFlow flow, except that the | |
277 | flows are written in terms of logical ports and logical datapaths instead | |
278 | of physical ports and physical datapaths. Translation between logical | |
279 | and physical flows helps to ensure isolation between logical datapaths. | |
09986f8c JP |
280 | (The logical flow abstraction also allows the OVN centralized |
281 | components to do less work, since they do not have to separately | |
282 | compute and push out physical flows to each chassis.) | |
fe36184b BP |
283 | </p> |
284 | ||
285 | <p> | |
286 | The default action when no flow matches is to drop packets. | |
287 | </p> | |
288 | ||
69a832cf | 289 | <p><em>Architectural Logical Life Cycle of a Packet</em></p> |
5868eb24 BP |
290 | |
291 | <p> | |
292 | The following description focuses on the life cycle of a packet through | |
293 | a logical datapath, ignoring physical details of the implementation. | |
69a832cf | 294 | Please refer to <em>Architectural Physical Life Cycle of a Packet</em> in |
5868eb24 BP |
295 | <code>ovn-architecture</code>(7) for the physical information. |
296 | </p> | |
297 | ||
298 | <p> | |
299 | The description here is written as if OVN itself executes these steps, | |
300 | but in fact OVN (that is, <code>ovn-controller</code>) programs Open | |
301 | vSwitch, via OpenFlow and OVSDB, to execute them on its behalf. | |
302 | </p> | |
303 | ||
304 | <p> | |
305 | At a high level, OVN passes each packet through the logical datapath's | |
306 | logical ingress pipeline, which may output the packet to one or more | |
307 | logical ports or logical multicast groups. For each such logical output | |
308 | port, OVN passes the packet through the datapath's logical egress | |
309 | pipeline, which may either drop the packet or deliver it to the | |
310 | destination. Between the two pipelines, outputs to logical multicast | |
311 | groups are expanded into logical ports, so that the egress pipeline only | |
312 | processes a single logical output port at a time. Between the two | |
313 | pipelines is also where, when necessary, OVN encapsulates a packet in a | |
314 | tunnel (or tunnels) to transmit to remote hypervisors. | |
315 | </p> | |
316 | ||
317 | <p> | |
318 | In more detail, to start, OVN searches the <ref table="Logical_Flow"/> | |
319 | table for a row with correct <ref column="logical_datapath"/>, a <ref | |
320 | column="pipeline"/> of <code>ingress</code>, a <ref column="table_id"/> | |
321 | of 0, and a <ref column="match"/> that is true for the packet. If none | |
322 | is found, OVN drops the packet. If OVN finds more than one, it chooses | |
323 | the match with the highest <ref column="priority"/>. Then OVN executes | |
324 | each of the actions specified in the row's <ref column="actions"/> column, | |
325 | in the order specified. Some actions, such as those to modify packet | |
326 | headers, require no further details. The <code>next</code> and | |
327 | <code>output</code> actions are special. | |
328 | </p> | |
329 | ||
330 | <p> | |
331 | The <code>next</code> action causes the above process to be repeated | |
332 | recursively, except that OVN searches for <ref column="table_id"/> of 1 | |
333 | instead of 0. Similarly, any <code>next</code> action in a row found in | |
334 | that table would cause a further search for a <ref column="table_id"/> of | |
335 | 2, and so on. When recursive processing completes, flow control returns | |
336 | to the action following <code>next</code>. | |
337 | </p> | |
338 | ||
339 | <p> | |
340 | The <code>output</code> action also introduces recursion. Its effect | |
341 | depends on the current value of the <code>outport</code> field. Suppose | |
342 | <code>outport</code> designates a logical port. First, OVN compares | |
343 | <code>inport</code> to <code>outport</code>; if they are equal, it treats | |
344 | the <code>output</code> as a no-op. In the common case, where they are | |
345 | different, the packet enters the egress pipeline. This transition to the | |
78aab811 | 346 | egress pipeline discards register data, e.g. <code>reg0</code> ... |
cc5e28d8 | 347 | <code>reg9</code> and connection tracking state, to achieve |
78aab811 JP |
348 | uniform behavior regardless of whether the egress pipeline is on a |
349 | different hypervisor (because registers aren't preserved across | |
350 | tunnel encapsulation). | |
5868eb24 BP |
351 | </p> |
352 | ||
353 | <p> | |
354 | To execute the egress pipeline, OVN again searches the <ref | |
355 | table="Logical_Flow"/> table for a row with correct <ref | |
356 | column="logical_datapath"/>, a <ref column="table_id"/> of 0, a <ref | |
357 | column="match"/> that is true for the packet, but now looking for a <ref | |
358 | column="pipeline"/> of <code>egress</code>. If no matching row is found, | |
359 | the output becomes a no-op. Otherwise, OVN executes the actions for the | |
360 | matching flow (which is chosen from multiple, if necessary, as already | |
361 | described). | |
362 | </p> | |
363 | ||
364 | <p> | |
365 | In the <code>egress</code> pipeline, the <code>next</code> action acts as | |
366 | already described, except that it, of course, searches for | |
367 | <code>egress</code> flows. The <code>output</code> action, however, now | |
368 | directly outputs the packet to the output port (which is now fixed, | |
369 | because <code>outport</code> is read-only within the egress pipeline). | |
370 | </p> | |
371 | ||
372 | <p> | |
373 | The description earlier assumed that <code>outport</code> referred to a | |
374 | logical port. If it instead designates a logical multicast group, then | |
375 | the description above still applies, with the addition of fan-out from | |
376 | the logical multicast group to each logical port in the group. For each | |
377 | member of the group, OVN executes the logical pipeline as described, with | |
378 | the logical output port replaced by the group member. | |
379 | </p> | |
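The lookup-and-recursion behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not OVN's implementation: the `Flow` class, the `run_table` function, and the `set_outport:` action string are hypothetical names, matches are modeled as Python callables rather than OVN expressions, and multicast fan-out is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Flow:
    """Hypothetical stand-in for one Logical_Flow row of a single datapath."""
    pipeline: str                    # "ingress" or "egress"
    table_id: int
    priority: int
    match: Callable[[Dict], bool]    # stands in for the match expression
    actions: List[str]               # "next", "output", or "set_outport:<port>"


def run_table(flows, pkt, pipeline, table_id, delivered):
    """Find the highest-priority matching flow and execute its actions in order."""
    candidates = [f for f in flows
                  if f.pipeline == pipeline
                  and f.table_id == table_id
                  and f.match(pkt)]
    if not candidates:
        return  # ingress: packet dropped; egress: the output becomes a no-op
    flow = max(candidates, key=lambda f: f.priority)
    for action in flow.actions:
        if action == "next":
            # Recurse into the following table; control returns here afterwards.
            run_table(flows, pkt, pipeline, table_id + 1, delivered)
        elif action.startswith("set_outport:"):
            pkt["outport"] = action.split(":", 1)[1]
        elif action == "output":
            if pipeline == "egress":
                delivered.append(pkt["outport"])
            elif pkt.get("outport") != pkt["inport"]:
                # Entering the egress pipeline discards register data.
                egress_pkt = {k: v for k, v in pkt.items()
                              if not k.startswith("reg")}
                run_table(flows, egress_pkt, "egress", 0, delivered)


flows = [
    Flow("ingress", 0, 50, lambda p: True, ["next"]),
    Flow("ingress", 1, 50, lambda p: p["eth.dst"] == "00:00:00:00:00:02",
         ["set_outport:lp2", "output"]),
    Flow("egress", 0, 0, lambda p: True, ["next"]),
    Flow("egress", 1, 0, lambda p: True, ["output"]),
]
delivered = []
run_table(flows, {"inport": "lp1", "eth.dst": "00:00:00:00:00:02", "reg0": 1},
          "ingress", 0, delivered)
```

In this sketch a packet from `lp1` to the known address ends up in `delivered` as `lp2`, while a packet to an unknown address matches nothing in ingress table 1 and is never output.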
380 | ||
8d6e5516 JP |
381 | <p><em>Pipeline Stages</em></p> |
382 | ||
383 | <p> | |
384 | <code>ovn-northd</code> is responsible for populating the | |
385 | <ref table="Logical_Flow"/> table, so the stages are an | |
386 | implementation detail and subject to change. This section | |
387 | describes the current logical flow table. | |
388 | </p> | |
389 | ||
390 | <p> | |
391 | The ingress pipeline consists of the following stages: | |
392 | </p> | |
393 | <ul> | |
394 | <li> | |
395 | Port Security (Table 0): Validates the source address, drops | |
396 | packets with a VLAN tag, and, if configured, verifies that the | |
397 | logical port is allowed to send with the source address. | |
398 | </li> | |
399 | ||
400 | <li> | |
401 | L2 Destination Lookup (Table 1): Forwards known unicast | |
402 | addresses to the appropriate logical port. Unicast packets to | |
403 | unknown hosts are forwarded to logical ports configured with the | |
404 | special <code>unknown</code> MAC address. Broadcast and | |
405 | multicast packets are flooded to all ports in the logical switch. | |
406 | </li> | |
407 | </ul> | |
408 | ||
409 | <p> | |
410 | The egress pipeline consists of the following stages: | |
411 | </p> | |
412 | <ul> | |
413 | <li> | |
414 | ACL (Table 0): Applies any specified access control lists. | |
415 | </li> | |
416 | ||
417 | <li> | |
418 | Port Security (Table 1): If configured, verifies that the | |
419 | logical port is allowed to receive packets with the destination | |
420 | address. | |
421 | </li> | |
422 | </ul> | |
423 | ||
747b2a45 | 424 | <column name="logical_datapath"> |
5868eb24 BP |
425 | The logical datapath to which the logical flow belongs. |
426 | </column> | |
427 | ||
428 | <column name="pipeline"> | |
429 | <p> | |
430 | The primary flows used for deciding on a packet's destination are the | |
431 | <code>ingress</code> flows. The <code>egress</code> flows implement | |
432 | ACLs. See <em>Architectural Logical Life Cycle of a Packet</em>, above, for details. | |
433 | </p> | |
747b2a45 BP |
434 | </column> |
435 | ||
fe36184b BP |
436 | <column name="table_id"> |
437 | The stage in the logical pipeline, analogous to an OpenFlow table number. | |
438 | </column> | |
439 | ||
440 | <column name="priority"> | |
441 | The flow's priority. Flows with numerically higher priority take | |
442 | precedence over those with lower. If two logical datapath flows with the | |
443 | same priority both match, then the one actually applied to the packet is | |
444 | undefined. | |
445 | </column> | |
446 | ||
447 | <column name="match"> | |
448 | <p> | |
449 | A matching expression. OVN provides a superset of OpenFlow matching | |
450 | capabilities, using a syntax similar to Boolean expressions in a | |
451 | programming language. | |
452 | </p> | |
453 | ||
454 | <p> | |
fa6aeaeb RB |
455 | The most important components of a match expression are |
456 | <dfn>comparisons</dfn> between <dfn>symbols</dfn> and | |
457 | <dfn>constants</dfn>, e.g. <code>ip4.dst == 192.168.0.1</code>, | |
458 | <code>ip.proto == 6</code>, <code>arp.op == 1</code>, <code>eth.type == | |
459 | 0x800</code>. The logical AND operator <code>&&</code> and | |
460 | logical OR operator <code>||</code> can combine comparisons into a | |
461 | larger expression. | |
fe36184b BP |
462 | </p> |
463 | ||
fe36184b | 464 | <p> |
e0840f11 BP |
465 | Matching expressions also support parentheses for grouping, the logical |
466 | NOT prefix operator <code>!</code>, and literals <code>0</code> and | |
467 | <code>1</code> to express ``false'' or ``true,'' respectively. The | |
468 | latter is useful by itself as a catch-all expression that matches every | |
469 | packet. | |
fe36184b BP |
470 | </p> |
471 | ||
e0840f11 | 472 | <p><em>Symbols</em></p> |
fe36184b BP |
473 | |
474 | <p> | |
fa6aeaeb RB |
475 | <em>Type</em>. Symbols have <dfn>integer</dfn> or <dfn>string</dfn> |
476 | type. Integer symbols have a <dfn>width</dfn> in bits. | |
fe36184b BP |
477 | </p> |
478 | ||
479 | <p> | |
fa6aeaeb | 480 | <em>Kinds</em>. There are three kinds of symbols: |
fe36184b BP |
481 | </p> |
482 | ||
e0840f11 | 483 | <ul> |
fa6aeaeb RB |
484 | <li> |
485 | <p> | |
486 | <dfn>Fields</dfn>. A field symbol represents a packet header or | |
487 | metadata field. For example, a field | |
488 | named <code>vlan.tci</code> might represent the VLAN TCI field in a | |
489 | packet. | |
490 | </p> | |
491 | ||
492 | <p> | |
493 | A field symbol can have integer or string type. Integer fields can | |
494 | be nominal or ordinal (see <em>Level of Measurement</em>, | |
495 | below). | |
496 | </p> | |
497 | </li> | |
498 | ||
499 | <li> | |
500 | <p> | |
501 | <dfn>Subfields</dfn>. A subfield represents a subset of bits from | |
502 | a larger field. For example, a field <code>vlan.vid</code> might | |
503 | be defined as an alias for <code>vlan.tci[0..11]</code>. Subfields | |
504 | are provided for syntactic convenience, because it is always | |
505 | possible to instead refer to a subset of bits from a field | |
506 | directly. | |
507 | </p> | |
508 | ||
509 | <p> | |
510 | Only ordinal fields (see <em>Level of Measurement</em>, | |
511 | below) may have subfields. Subfields are always ordinal. | |
512 | </p> | |
513 | </li> | |
514 | ||
515 | <li> | |
516 | <p> | |
517 | <dfn>Predicates</dfn>. A predicate is shorthand for a Boolean | |
518 | expression. Predicates may be used much like 1-bit fields. For | |
519 | example, <code>ip4</code> might expand to <code>eth.type == | |
520 | 0x800</code>. Predicates are provided for syntactic convenience, | |
521 | because it is always possible to instead specify the underlying | |
522 | expression directly. | |
523 | </p> | |
524 | ||
525 | <p> | |
526 | A predicate whose expansion refers to any nominal field or | |
527 | predicate (see <em>Level of Measurement</em>, below) is nominal; | |
528 | other predicates have Boolean level of measurement. | |
529 | </p> | |
530 | </li> | |
e0840f11 BP |
531 | </ul> |
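The subfield notation can be read as plain bit extraction. The following Python sketch assumes bit 0 is the least significant bit and that ranges are inclusive; the `subfield` helper and the sample TCI value are hypothetical, introduced only for illustration.

```python
def subfield(value: int, lo: int, hi: int) -> int:
    """Return bits lo..hi (inclusive, bit 0 = least significant) of value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


tci = 0xA064                   # hypothetical VLAN TCI value
vid = subfield(tci, 0, 11)     # corresponds to vlan.tci[0..11]
bit12 = subfield(tci, 12, 12)  # a one-bit subfield, vlan.tci[12]
```

A one-bit subfield is simply the case where `lo` and `hi` are equal.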
532 | ||
fe36184b | 533 | <p> |
fa6aeaeb RB |
534 | <em>Level of Measurement</em>. See |
535 | http://en.wikipedia.org/wiki/Level_of_measurement for the statistical | |
536 | concept on which this classification is based. There are three | |
537 | levels: | |
fe36184b BP |
538 | </p> |
539 | ||
540 | <ul> | |
fa6aeaeb RB |
541 | <li> |
542 | <p> | |
543 | <dfn>Ordinal</dfn>. In statistics, ordinal values can be ordered | |
544 | on a scale. OVN considers a field (or subfield) to be ordinal if | |
545 | its bits can be examined individually. This is true for the | |
546 | OpenFlow fields that OpenFlow or Open vSwitch makes ``maskable.'' | |
547 | </p> | |
548 | ||
549 | <p> | |
550 | Any use of an ordinal field may specify a single bit or a range of | |
551 | bits, e.g. <code>vlan.tci[13..15]</code> refers to the PCP field | |
552 | within the VLAN TCI, and <code>eth.dst[40]</code> refers to the | |
553 | multicast bit in the Ethernet destination address. | |
554 | </p> | |
555 | ||
556 | <p> | |
557 | OVN supports all the usual arithmetic relations (<code>==</code>, | |
558 | <code>!=</code>, <code><</code>, <code><=</code>, | |
559 | <code>></code>, and <code>>=</code>) on ordinal fields and | |
560 | their subfields, because OVN can implement these in OpenFlow and | |
561 | Open vSwitch as collections of bitwise tests. | |
562 | </p> | |
563 | </li> | |
564 | ||
565 | <li> | |
566 | <p> | |
567 | <dfn>Nominal</dfn>. In statistics, nominal values cannot be | |
568 | usefully compared except for equality. OpenFlow | |
569 | port numbers, Ethernet types, and IP protocols are examples: all of | |
570 | these are just identifiers assigned arbitrarily with no deeper | |
571 | meaning. In OpenFlow and Open vSwitch, bits in these fields | |
572 | generally aren't individually addressable. | |
573 | </p> | |
574 | ||
575 | <p> | |
576 | OVN only supports arithmetic tests for equality on nominal fields, | |
577 | because OpenFlow and Open vSwitch provide no way for a flow to | |
578 | efficiently implement other comparisons on them. (A test for | |
579 | inequality can be sort of built out of two flows with different | |
580 | priorities, but OVN matching expressions always generate flows with | |
581 | a single priority.) | |
582 | </p> | |
583 | ||
584 | <p> | |
585 | String fields are always nominal. | |
586 | </p> | |
587 | </li> | |
588 | ||
589 | <li> | |
590 | <p> | |
591 | <dfn>Boolean</dfn>. A nominal field that has only two values, 0 | |
592 | and 1, is somewhat exceptional, since it is easy to support both | |
593 | equality and inequality tests on such a field: either one can be | |
594 | implemented as a test for 0 or 1. | |
595 | </p> | |
596 | ||
597 | <p> | |
598 | Only predicates (see above) have a Boolean level of measurement. | |
599 | </p> | |
600 | ||
601 | <p> | |
602 | This isn't a standard level of measurement. | |
603 | </p> | |
604 | </li> | |
fe36184b BP |
605 | </ul> |
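To illustrate how an ordinal comparison can become a collection of bitwise tests, the Python sketch below decomposes `field < constant` into value/mask pairs. It is one possible decomposition under stated assumptions, not necessarily the one OVN uses; the function names `less_than_masks` and `matches` are hypothetical.

```python
def less_than_masks(constant: int, width: int):
    """Return (value, mask) pairs such that field < constant exactly when
    (field & mask) == value holds for at least one pair."""
    full = (1 << width) - 1
    pairs = []
    for bit in range(width):
        if constant & (1 << bit):
            # Bits above `bit` must equal the constant's bits, `bit` itself
            # must be 0, and all lower bits are wildcarded.
            mask = full & ~((1 << bit) - 1)
            value = constant & mask & ~(1 << bit)
            pairs.append((value, mask))
    return pairs


def matches(field: int, pairs) -> bool:
    """Evaluate the bitwise tests as a masked-match table would."""
    return any((field & mask) == value for value, mask in pairs)
```

The number of pairs equals the number of 1 bits in the constant, which hints at why such comparisons cost more than a single equality test.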
606 | ||
607 | <p> | |
fa6aeaeb RB |
608 | <em>Prerequisites</em>. Any symbol can have prerequisites, which are |
609 | additional conditions implied by the use of the symbol. For example, | |
610 | the <code>icmp4.type</code> symbol might have prerequisite | |
611 | <code>icmp4</code>, which would cause an expression <code>icmp4.type == | |
612 | 0</code> to be interpreted as <code>icmp4.type == 0 && | |
613 | icmp4</code>, which would in turn expand to <code>icmp4.type == 0 | |
614 | && eth.type == 0x800 && ip4.proto == 1</code> (assuming | |
615 | <code>icmp4</code> is a predicate defined as suggested under | |
616 | <em>Kinds</em> above). | |
fe36184b BP |
617 | </p> |
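The prerequisite expansion in the example above can be sketched as a two-step string rewrite. This Python sketch is an assumption-laden illustration: the `PREREQS` and `PREDICATES` tables and the `expand` function are hypothetical, and only simple `symbol == constant` comparisons are handled.

```python
# Hypothetical symbol tables, following the example in the text.
PREREQS = {"icmp4.type": "icmp4"}
PREDICATES = {"icmp4": "eth.type == 0x800 && ip4.proto == 1"}


def expand(comparison: str) -> str:
    """Append the prerequisite of the leading symbol, expanding it if it is
    a predicate. Handles only a single 'symbol op constant' comparison."""
    symbol = comparison.split()[0]
    terms = [comparison]
    prereq = PREREQS.get(symbol)
    if prereq is not None:
        # Replace a predicate prerequisite by its underlying expression.
        terms.append(PREDICATES.get(prereq, prereq))
    return " && ".join(terms)


expanded = expand("icmp4.type == 0")
```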
618 | ||
e0840f11 BP |
619 | <p><em>Relational operators</em></p> |
620 | ||
fe36184b | 621 | <p> |
fa6aeaeb RB |
622 | All of the standard relational operators <code>==</code>, |
623 | <code>!=</code>, <code><</code>, <code><=</code>, | |
624 | <code>></code>, and <code>>=</code> are supported. Nominal | |
625 | fields support only <code>==</code> and <code>!=</code>, and only in a | |
626 | positive sense when outer <code>!</code> are taken into account, | |
627 | e.g. given string field <code>inport</code>, <code>inport == | |
628 | "eth0"</code> and <code>!(inport != "eth0")</code> are acceptable, but | |
629 | not <code>inport != "eth0"</code>. | |
fe36184b BP |
630 | </p> |
631 | ||
632 | <p> | |
fa6aeaeb RB |
633 | The implementation of <code>==</code> (or <code>!=</code> when it is |
634 | negated) is more efficient than that of the other relational | |
635 | operators. | |
fe36184b BP |
636 | </p> |
637 | ||
e0840f11 BP |
638 | <p><em>Constants</em></p> |
639 | ||
fe36184b | 640 | <p> |
e0840f11 BP |
641 | Integer constants may be expressed in decimal, hexadecimal prefixed by |
642 | <code>0x</code>, or as dotted-quad IPv4 addresses, IPv6 addresses in | |
643 | their standard forms, or Ethernet addresses as colon-separated hex | |
644 | digits. A constant in any of these forms may be followed by a slash | |
645 | and a second constant (the mask) in the same form, to form a masked | |
646 | constant. IPv4 and IPv6 masks may be given as integers, to express | |
647 | CIDR prefixes. | |
648 | </p> | |
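As an illustration of masked constants, the Python sketch below converts an IPv4 or IPv6 constant with either a CIDR prefix length or an explicit mask into an integer value/mask pair using the standard `ipaddress` module. The `masked_constant` helper is a hypothetical name; note that `ipaddress` accepts only contiguous netmasks, which is an assumption of this sketch rather than a rule stated in the text.

```python
import ipaddress


def masked_constant(text: str):
    """Return (value, mask) as integers for 'address', 'address/prefix',
    or 'address/netmask'. A bare address gets an all-ones mask."""
    iface = ipaddress.ip_interface(text)
    return int(iface.ip), int(iface.network.netmask)


cidr = masked_constant("192.168.0.0/24")
dotted = masked_constant("192.168.0.0/255.255.255.0")
```

Both spellings yield the same pair, which is the sense in which an integer mask expresses a CIDR prefix.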
649 | ||
650 | <p> | |
651 | String constants have the same syntax as quoted strings in JSON (thus, | |
5868eb24 | 652 | they are Unicode strings). |
fe36184b BP |
653 | </p> |
654 | ||
655 | <p> | |
e0840f11 BP |
656 | Some operators support sets of constants written inside curly braces |
657 | <code>{</code> ... <code>}</code>. Commas between elements of a set, | |
658 | and after the last element, are optional. With <code>==</code>, | |
659 | ``<code><var>field</var> == { <var>constant1</var>, | |
660 | <var>constant2</var>,</code> ... <code>}</code>'' is syntactic sugar | |
661 | for ``<code><var>field</var> == <var>constant1</var> || | |
662 | <var>field</var> == <var>constant2</var> || </code>...<code></code>''. | |
663 | Similarly, ``<code><var>field</var> != { <var>constant1</var>, | |
664 | <var>constant2</var>, </code>...<code> }</code>'' is equivalent to | |
665 | ``<code><var>field</var> != <var>constant1</var> && | |
fe36184b | 666 | <var>field</var> != <var>constant2</var> && |
e0840f11 | 667 | </code>...<code></code>''. |
fe36184b BP |
668 | </p> |
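The set notation is pure syntactic sugar, which a short Python sketch can make concrete. The `expand_set` function is a hypothetical helper that works on strings; it is not part of OVN.

```python
def expand_set(field: str, op: str, constants) -> str:
    """Rewrite 'field op {c1, c2, ...}' as the equivalent chain of simple
    comparisons: '==' joins with ||, '!=' joins with &&."""
    if op == "==":
        joiner = " || "
    elif op == "!=":
        joiner = " && "
    else:
        raise ValueError("sets are only sketched here for == and !=")
    return joiner.join(f"{field} {op} {c}" for c in constants)


any_of = expand_set("tcp.dst", "==", ["80", "443"])
none_of = expand_set("tcp.dst", "!=", ["80", "443"])
```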
669 | ||
ea382567 RB |
670 | <p> |
671 | You may refer to a set of IPv4, IPv6, or MAC addresses stored in the | |
672 | <ref table="Address_Set"/> table by its <ref column="name" | |
673 | table="Address_Set"/>. An <ref table="Address_Set"/> with a name | |
674 | of <code>set1</code> can be referred to as | |
675 | <code>$set1</code>. | |
676 | </p> | |
677 | ||
e0840f11 BP |
678 | <p><em>Miscellaneous</em></p> |
679 | ||
fe36184b | 680 | <p> |
fa6aeaeb RB |
681 | Comparisons may name the symbol or the constant first, |
682 | e.g. <code>tcp.src == 80</code> and <code>80 == tcp.src</code> are both | |
683 | acceptable. | |
fe36184b BP |
684 | </p> |
685 | ||
686 | <p> | |
fa6aeaeb RB |
687 | Tests for a range may be expressed using a syntax like <code>1024 <= |
688 | tcp.src <= 49151</code>, which is equivalent to <code>1024 <= | |
689 | tcp.src && tcp.src <= 49151</code>. | |
fe36184b BP |
690 | </p> |
691 | ||
692 | <p> | |
fa6aeaeb RB |
693 | For a one-bit field or predicate, a mention of its name is equivalent |
694 | to <code><var>symbol</var> == 1</code>, e.g. <code>vlan.present</code> | |
695 | is equivalent to <code>vlan.present == 1</code>. The same is true for | |
696 | one-bit subfields, e.g. <code>vlan.tci[12]</code>. There is no | |
697 | technical limitation to implementing the same for ordinal fields of all | |
698 | widths, but the implementation is expensive enough that the syntax | |
699 | parser requires writing an explicit comparison against zero to make | |
700 | mistakes less likely, e.g. in <code>tcp.src != 0</code> the comparison | |
701 | against 0 is required. | |
fe36184b BP |
702 | </p> |
703 | ||
704 | <p> | |
fa6aeaeb RB |
705 | <em>Operator precedence</em> is as shown below, from highest to lowest. |
706 | There are two exceptions where parentheses are required even though the | |
707 | table would suggest that they are not: <code>&&</code> and | |
708 | <code>||</code> require parentheses when used together, and | |
709 | <code>!</code> requires parentheses when applied to a relational | |
710 | expression. Thus, in <code>(eth.type == 0x800 || eth.type == 0x86dd) | |
711 | && ip.proto == 6</code> or <code>!(arp.op == 1)</code>, the | |
712 | parentheses are mandatory. | |
fe36184b BP |
713 | </p> |
714 | ||
e0840f11 BP |
715 | <ul> |
716 | <li><code>()</code></li> | |
717 | <li><code>== != < <= > >=</code></li> | |
718 | <li><code>!</code></li> | |
719 | <li><code>&& ||</code></li> | |
720 | </ul> | |
721 | ||
10b1662b BP |
722 | <p> |
723 | <em>Comments</em> may be introduced by <code>//</code>, which extends | |
724 | to the next new-line. Comments within a line may be bracketed by | |
725 | <code>/*</code> and <code>*/</code>. Multiline comments are not | |
726 | supported. | |
727 | </p> | |
728 | ||
e0840f11 BP |
729 | <p><em>Symbols</em></p> |
730 | ||
5868eb24 BP |
731 | <p> |
732 | Most of the symbols below have integer type. Only <code>inport</code> | |
733 | and <code>outport</code> have string type. <code>inport</code> names a | |
734 | logical port. Thus, its value is a <ref column="logical_port"/> name | |
62fdd819 AW |
735 | from the <ref table="Port_Binding"/> table. <code>outport</code> may |
736 | name a logical port, as <code>inport</code>, or a logical multicast | |
737 | group defined in the <ref table="Multicast_Group"/> table. For both | |
738 | symbols, only names within the flow's logical datapath may be used. | |
5868eb24 BP |
739 | </p> |
740 | ||
394e883d JP |
741 | <p> |
742 | The <code>reg</code><var>X</var> symbols are 32-bit integers. | |
743 | The <code>xxreg</code><var>X</var> symbols are 128-bit integers, | |
744 | which overlay four of the 32-bit registers: <code>xxreg0</code> | |
745 | overlays <code>reg0</code> through <code>reg3</code>, with | |
746 | <code>reg0</code> supplying the most-significant bits of | |
747 | <code>xxreg0</code> and <code>reg3</code> the least-significant. | |
748 | <code>xxreg1</code> similarly overlays <code>reg4</code> through | |
749 | <code>reg7</code>. | |
750 | </p> | |
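<p>
  As a rough illustration of the overlay just described, the following Python
  sketch combines four 32-bit values into one 128-bit value, with the first
  value supplying the most-significant bits. The function name
  <code>xxreg_from_regs</code> is an assumption made for this example and
  does not correspond to any OVN interface.
</p>

```python
def xxreg_from_regs(regs):
    """Combine four 32-bit register values into one 128-bit value.

    regs[0] supplies the most-significant 32 bits and regs[3] the
    least-significant, mirroring how xxreg0 overlays reg0 through reg3.
    Hypothetical helper for illustration only.
    """
    if len(regs) != 4:
        raise ValueError("expected exactly four 32-bit values")
    value = 0
    for r in regs:
        if not 0 <= r < 2**32:
            raise ValueError("each register must fit in 32 bits")
        value = (value << 32) | r
    return value


# reg0..reg3 -> xxreg0; the same function models reg4..reg7 -> xxreg1.
xxreg0 = xxreg_from_regs([0x11111111, 0x22222222, 0x33333333, 0x44444444])
print(hex(xxreg0))
```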
751 | ||
e0840f11 | 752 | <ul> |
cc5e28d8 | 753 | <li><code>reg0</code>...<code>reg9</code></li> |
394e883d | 754 | <li><code>xxreg0</code> <code>xxreg1</code></li> |
5868eb24 | 755 | <li><code>inport</code> <code>outport</code></li> |
e0840f11 BP |
756 | <li><code>eth.src</code> <code>eth.dst</code> <code>eth.type</code></li> |
757 | <li><code>vlan.tci</code> <code>vlan.vid</code> <code>vlan.pcp</code> <code>vlan.present</code></li> | |
758 | <li><code>ip.proto</code> <code>ip.dscp</code> <code>ip.ecn</code> <code>ip.ttl</code> <code>ip.frag</code></li> | |
759 | <li><code>ip4.src</code> <code>ip4.dst</code></li> | |
760 | <li><code>ip6.src</code> <code>ip6.dst</code> <code>ip6.label</code></li> | |
761 | <li><code>arp.op</code> <code>arp.spa</code> <code>arp.tpa</code> <code>arp.sha</code> <code>arp.tha</code></li> | |
762 | <li><code>tcp.src</code> <code>tcp.dst</code> <code>tcp.flags</code></li> | |
763 | <li><code>udp.src</code> <code>udp.dst</code></li> | |
764 | <li><code>sctp.src</code> <code>sctp.dst</code></li> | |
765 | <li><code>icmp4.type</code> <code>icmp4.code</code></li> | |
766 | <li><code>icmp6.type</code> <code>icmp6.code</code></li> | |
767 | <li><code>nd.target</code> <code>nd.sll</code> <code>nd.tll</code></li> | |
e3d81ade | 768 | <li><code>ct_mark</code> <code>ct_label</code></li> |
78aab811 JP |
769 | <li> |
770 | <p> | |
771 | <code>ct_state</code>, which has the following Boolean subfields: | |
772 | </p> | |
773 | <ul> | |
774 | <li><code>ct.new</code>: True for a new flow</li> | |
775 | <li><code>ct.est</code>: True for an established flow</li> | |
776 | <li><code>ct.rel</code>: True for a related flow</li> | |
777 | <li><code>ct.rpl</code>: True for a reply flow</li> | |
778 | <li><code>ct.inv</code>: True for a connection entry in a bad state</li> | |
779 | </ul> | |
780 | <p> | |
781 | <code>ct_state</code> and its subfields are initialized by the | |
782 | <code>ct_next</code> action, described below. | |
783 | </p> | |
784 | </li> | |
e0840f11 BP |
785 | </ul> |
786 | ||
25030d47 RB |
787 | <p> |
788 | The following predicates are supported: | |
789 | </p> | |
790 | ||
791 | <ul> | |
a2011117 BP |
792 | <li><code>eth.bcast</code> expands to <code>eth.dst == ff:ff:ff:ff:ff:ff</code></li> |
793 | <li><code>eth.mcast</code> expands to <code>eth.dst[40]</code></li> | |
25030d47 RB |
794 | <li><code>vlan.present</code> expands to <code>vlan.tci[12]</code></li> |
795 | <li><code>ip4</code> expands to <code>eth.type == 0x800</code></li> | |
a2011117 | 796 | <li><code>ip4.mcast</code> expands to <code>ip4.dst[28..31] == 0xe</code></li> |
25030d47 RB |
797 | <li><code>ip6</code> expands to <code>eth.type == 0x86dd</code></li> |
798 | <li><code>ip</code> expands to <code>ip4 || ip6</code></li> | |
799 | <li><code>icmp4</code> expands to <code>ip4 && ip.proto == 1</code></li> | |
800 | <li><code>icmp6</code> expands to <code>ip6 && ip.proto == 58</code></li> | |
801 | <li><code>icmp</code> expands to <code>icmp4 || icmp6</code></li> | |
802 | <li><code>ip.is_frag</code> expands to <code>ip.frag[0]</code></li> | |
803 | <li><code>ip.later_frag</code> expands to <code>ip.frag[1]</code></li> | |
804 | <li><code>ip.first_frag</code> expands to <code>ip.is_frag && !ip.later_frag</code></li> | |
805 | <li><code>arp</code> expands to <code>eth.type == 0x806</code></li> | |
806 | <li><code>nd</code> expands to <code>icmp6.type == {135, 136} && icmp6.code == 0</code></li> | |
807 | <li><code>tcp</code> expands to <code>ip.proto == 6</code></li> | |
808 | <li><code>udp</code> expands to <code>ip.proto == 17</code></li> | |
809 | <li><code>sctp</code> expands to <code>ip.proto == 132</code></li> | |
810 | </ul> | |
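<p>
  To make the subfield notation in two of these predicates concrete, the
  Python sketch below evaluates <code>eth.dst[40]</code> and
  <code>ip4.dst[28..31] == 0xe</code> on literal addresses. It assumes that
  bit 0 is the least-significant bit of a field, which is consistent with the
  <code>vlan.pcp[2]</code> example in the <code>actions</code> column. All
  function names here are hypothetical.
</p>

```python
import ipaddress


def bits(value, lo, hi):
    """Return bits lo..hi (inclusive) of value, with bit 0 least significant."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)


def eth_mcast(eth_dst):
    """Model of eth.mcast: true when eth.dst[40] is set."""
    mac = int(eth_dst.replace(":", ""), 16)  # 48-bit integer
    return bits(mac, 40, 40) == 1


def ip4_mcast(ip4_dst):
    """Model of ip4.mcast: true when ip4.dst[28..31] == 0xe."""
    addr = int(ipaddress.IPv4Address(ip4_dst))  # 32-bit integer
    return bits(addr, 28, 31) == 0xE


print(eth_mcast("01:00:5e:00:00:01"), eth_mcast("00:11:22:33:44:55"))
print(ip4_mcast("224.0.0.1"), ip4_mcast("10.0.0.1"))
```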
fe36184b BP |
811 | </column> |
812 | ||
813 | <column name="actions"> | |
814 | <p> | |
2cd87fce RB |
815 | Logical datapath actions, to be executed when the logical flow |
816 | represented by this row is the highest-priority match. | |
fe36184b BP |
817 | </p> |
818 | ||
35060cdc | 819 | <p> |
2cd87fce RB |
820 | Actions share lexical syntax with the <ref column="match"/> column. An |
821 | empty set of actions (or one that contains just white space or | |
822 | comments), or a set of actions that consists of just | |
823 | <code>drop;</code>, causes the matched packets to be dropped. | |
824 | Otherwise, the column should contain a sequence of actions, each | |
825 | terminated by a semicolon. | |
35060cdc | 826 | </p> |
fe36184b | 827 | |
35060cdc | 828 | <p> |
eee7a8ed | 829 | The following actions are defined: |
35060cdc | 830 | </p> |
fe36184b | 831 | |
35060cdc BP |
832 | <dl> |
833 | <dt><code>output;</code></dt> | |
834 | <dd> | |
5868eb24 | 835 | <p> |
eee7a8ed JP |
836 | In the ingress pipeline, this action executes the |
837 | <code>egress</code> pipeline as a subroutine. If | |
838 | <code>outport</code> names a logical port, the egress pipeline | |
839 | executes once; if it is a multicast group, the egress pipeline runs | |
840 | once for each logical port in the group. | |
5868eb24 BP |
841 | </p> |
842 | ||
843 | <p> | |
844 | In the egress pipeline, this action performs the actual | |
845 | output to the <code>outport</code> logical port. (In the egress | |
846 | pipeline, <code>outport</code> never names a multicast group.) | |
847 | </p> | |
848 | ||
849 | <p> | |
850 | Output to the input port is implicitly dropped, that is, | |
851 | <code>output</code> becomes a no-op if <code>outport</code> == | |
b4970837 BP |
852 | <code>inport</code>. Occasionally it may be useful to override |
853 | this behavior, e.g. to send an ARP reply to an ARP request; to do | |
854 | so, use <code>inport = "";</code> to set the logical input port to | |
855 | an empty string (which should not be used as the name of any | |
856 | logical port). | |
5868eb24 | 857 | </p> |
eee7a8ed | 858 | </dd> |
fe36184b | 859 | |
35060cdc | 860 | <dt><code>next;</code></dt> |
558ec83d | 861 | <dt><code>next(<var>table</var>);</code></dt> |
35060cdc | 862 | <dd> |
558ec83d BP |
863 | Executes another logical datapath table as a subroutine. By default, |
864 | the table after the current one is executed. Specify | |
865 | <var>table</var> to jump to a specific table in the same pipeline. | |
2cd87fce | 866 | </dd> |
fe36184b | 867 | |
35060cdc BP |
868 | <dt><code><var>field</var> = <var>constant</var>;</code></dt> |
869 | <dd> | |
5868eb24 | 870 | <p> |
5ee054fb BP |
871 | Sets data or metadata field <var>field</var> to constant value |
872 | <var>constant</var>, e.g. <code>outport = "vif0";</code> to set the | |
873 | logical output port. To set only a subset of bits in a field, | |
874 | specify a subfield for <var>field</var> or a masked | |
875 | <var>constant</var>, e.g. one may use <code>vlan.pcp[2] = 1;</code> | |
876 | or <code>vlan.pcp = 4/4;</code> to set the most significant bit of | |
877 | the VLAN PCP. | |
5868eb24 BP |
878 | </p> |
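<p>
  The two ways of setting a subset of bits can be modelled as follows. This
  is a minimal Python sketch that assumes a masked constant
  <var>value</var>/<var>mask</var> changes only the bits set in
  <var>mask</var>; the helper names are hypothetical and are not part of OVN.
</p>

```python
def assign_masked(old, value, mask):
    """Model of `field = value/mask;`: only bits set in mask are replaced."""
    return (old & ~mask) | (value & mask)


def assign_bit(old, n, bit):
    """Model of `field[n] = bit;` expressed as a one-bit masked assignment."""
    return assign_masked(old, bit << n, 1 << n)


pcp = 0b011
# Both forms set bit 2, the most significant bit of the 3-bit VLAN PCP,
# and leave the two lower bits untouched.
print(bin(assign_masked(pcp, 4, 4)))
print(bin(assign_bit(pcp, 2, 1)))
```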
879 | ||
880 | <p> | |
881 | Assigning to a field with prerequisites implicitly adds those | |
882 | prerequisites to <ref column="match"/>; thus, for example, a flow | |
883 | that sets <code>tcp.dst</code> applies only to TCP flows, | |
884 | regardless of whether its <ref column="match"/> mentions any TCP | |
885 | field. | |
886 | </p> | |
887 | ||
888 | <p> | |
889 | Not all fields are modifiable (e.g. <code>eth.type</code> and | |
890 | <code>ip.proto</code> are read-only), and not all modifiable fields | |
891 | may be partially modified (e.g. <code>ip.ttl</code> must be assigned | |
892 | as a whole). The <code>outport</code> field is modifiable in the | |
893 | <code>ingress</code> pipeline but not in the <code>egress</code> | |
894 | pipeline. | |
895 | </p> | |
eee7a8ed | 896 | </dd> |
5ee054fb BP |
897 | |
898 | <dt><code><var>field1</var> = <var>field2</var>;</code></dt> | |
899 | <dd> | |
900 | <p> | |
901 | Sets data or metadata field <var>field1</var> to the value of data | |
902 | or metadata field <var>field2</var>, e.g. <code>reg0 = | |
903 | ip4.src;</code> copies <code>ip4.src</code> into <code>reg0</code>. | |
904 | To modify only a subset of a field's bits, specify a subfield for | |
905 | <var>field1</var> or <var>field2</var> or both, e.g. <code>vlan.pcp | |
906 | = reg0[0..2];</code> copies the least-significant bits of | |
907 | <code>reg0</code> into the VLAN PCP. | |
908 | </p> | |
909 | ||
910 | <p> | |
911 | <var>field1</var> and <var>field2</var> must be the same type, | |
912 | either both string or both integer fields. If they are both | |
913 | integer fields, they must have the same width. | |
914 | </p> | |
915 | ||
916 | <p> | |
917 | If <var>field1</var> or <var>field2</var> has prerequisites, they | |
918 | are added implicitly to <ref column="match"/>. It is possible to | |
919 | write an assignment with contradictory prerequisites, such as | |
920 | <code>ip4.src = ip6.src[0..31];</code>, but the contradiction means | |
921 | that a logical flow with such an assignment will never be matched. | |
922 | </p> | |
923 | </dd> | |
a20c96c6 BP |
924 | |
925 | <dt><code><var>field1</var> <-> <var>field2</var>;</code></dt> | |
926 | <dd> | |
927 | <p> | |
928 | Similar to <code><var>field1</var> = <var>field2</var>;</code> | |
929 | except that the two values are exchanged instead of copied. Both | |
930 | <var>field1</var> and <var>field2</var> must be modifiable. | |
931 | </p> | |
932 | </dd> | |
78aab811 | 933 | |
00ea19e4 BP |
934 | <dt><code>ip.ttl--;</code></dt> |
935 | <dd> | |
936 | <p> | |
937 | Decrements the IPv4 or IPv6 TTL. If this would make the TTL zero | |
938 | or negative, then processing of the packet halts; no further | |
939 | actions are processed. (To properly handle such cases, a | |
4c20b9f2 JP |
940 | higher-priority flow should match on |
941 | <code>ip.ttl == {0, 1};</code>.) | |
00ea19e4 BP |
942 | </p> |
943 | ||
944 | <p><b>Prerequisite:</b> <code>ip</code></p> | |
945 | </dd> | |
946 | ||
78aab811 JP |
947 | <dt><code>ct_next;</code></dt> |
948 | <dd> | |
949 | <p> | |
950 | Apply connection tracking to the flow, initializing | |
951 | <code>ct_state</code> for matching in later tables. | |
952 | Automatically moves on to the next table, as if followed by | |
953 | <code>next</code>. | |
954 | </p> | |
955 | ||
956 | <p> | |
957 | As a side effect, IP fragments will be reassembled for matching. | |
958 | If a fragmented packet is output, then it will be sent with any | |
959 | overlapping fragments squashed. The connection tracking state is | |
960 | scoped by the logical port, so overlapping addresses may be used. | |
961 | To allow traffic related to the matched flow, execute | |
962 | <code>ct_commit</code>. | |
963 | </p> | |
964 | ||
965 | <p> | |
966 | It is possible to have actions follow <code>ct_next</code>, | |
967 | but they will not have access to any of its side effects, so | |
968 | doing this is not generally useful. | |
969 | </p> | |
970 | </dd> | |
971 | ||
972 | <dt><code>ct_commit;</code></dt> | |
a9e1b66f RB |
973 | <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>);</code></dt> |
974 | <dt><code>ct_commit(ct_label=<var>value[/mask]</var>);</code></dt> | |
975 | <dt><code>ct_commit(ct_mark=<var>value[/mask]</var>, ct_label=<var>value[/mask]</var>);</code></dt> | |
78aab811 | 976 | <dd> |
c4623bb8 | 977 | <p> |
a9e1b66f RB |
978 | Commit the flow to the connection tracking entry associated with it |
979 | by a previous call to <code>ct_next</code>. When | |
980 | <code>ct_mark=<var>value[/mask]</var></code> and/or | |
981 | <code>ct_label=<var>value[/mask]</var></code> are supplied, | |
982 | <code>ct_mark</code> and/or <code>ct_label</code> will be set to the | |
983 | values indicated by <var>value[/mask]</var> on the connection | |
984 | tracking entry. <code>ct_mark</code> is a 32-bit field. | |
354b8f27 NS |
985 | <code>ct_label</code> is a 128-bit field. The <var>value[/mask]</var> |
986 | should be specified as a hex string if more than 64 bits are to be used. | |
c4623bb8 | 987 | </p> |
a9e1b66f | 988 | |
c4623bb8 RB |
989 | <p> |
990 | Note that if you want processing to continue in the next table, | |
991 | you must execute the <code>next</code> action after | |
a9e1b66f RB |
992 | <code>ct_commit</code>. If you leave out <code>next</code>, the | |
993 | connection tracking state is committed and the packet is then | |
994 | dropped. This could be useful for setting <code>ct_mark</code> | |
995 | on a connection tracking entry before dropping a packet, | |
996 | for example. | |
c4623bb8 | 997 | </p> |
78aab811 | 998 | </dd> |
fe36184b | 999 | |
de297547 GS |
1000 | <dt><code>ct_dnat;</code></dt> |
1001 | <dt><code>ct_dnat(<var>IP</var>);</code></dt> | |
1002 | <dd> | |
1003 | <p> | |
1004 | <code>ct_dnat</code> sends the packet through the DNAT zone in the | |
1005 | connection tracking table to unDNAT any packet that was DNATed in | |
1006 | the opposite direction. The packet is then automatically sent | |
1007 | to the next tables as if followed by a <code>next;</code> action. | |
1008 | The next tables will see the changes in the packet caused by | |
1009 | the connection tracker. | |
1010 | </p> | |
1011 | <p> | |
1012 | <code>ct_dnat(<var>IP</var>)</code> sends the packet through the | |
1013 | DNAT zone to change the destination IP address of the packet to | |
467085fd | 1014 | the one provided inside the parentheses and commits the connection. |
de297547 GS |
1015 | The packet is then automatically sent to the next tables as if |
1016 | followed by a <code>next;</code> action. The next tables will see | |
1017 | the changes in the packet caused by the connection tracker. | |
1018 | </p> | |
1019 | </dd> | |
1020 | ||
1021 | <dt><code>ct_snat;</code></dt> | |
1022 | <dt><code>ct_snat(<var>IP</var>);</code></dt> | |
1023 | <dd> | |
1024 | <p> | |
1025 | <code>ct_snat</code> sends the packet through the SNAT zone to | |
1026 | unSNAT any packet that was SNATed in the opposite direction. If | |
1027 | the packet needs to be sent to the next tables, then it should be | |
1028 | followed by a <code>next;</code> action. The next tables will not | |
1029 | see the changes in the packet caused by the connection tracker. | |
1030 | </p> | |
1031 | <p> | |
1032 | <code>ct_snat(<var>IP</var>)</code> sends the packet through the | |
1033 | SNAT zone to change the source IP address of the packet to | |
1034 | the one provided inside the parentheses and commits the connection. | |
1035 | The packet is then automatically sent to the next tables as if | |
1036 | followed by a <code>next;</code> action. The next tables will see the | |
1037 | changes in the packet caused by the connection tracker. | |
1038 | </p> | |
1039 | </dd> | |
1040 | ||
69a832cf BP |
1041 | <dt><code>arp { <var>action</var>; </code>...<code> };</code></dt> |
1042 | <dd> | |
1043 | <p> | |
1044 | Temporarily replaces the IPv4 packet being processed by an ARP | |
1045 | packet and executes each nested <var>action</var> on the ARP | |
1046 | packet. Actions following the <var>arp</var> action, if any, apply | |
1047 | to the original, unmodified packet. | |
1048 | </p> | |
1049 | ||
1050 | <p> | |
1051 | The ARP packet that this action operates on is initialized based on | |
1052 | the IPv4 packet being processed, as follows. These are default | |
1053 | values that the nested actions will probably want to change: | |
1054 | </p> | |
1055 | ||
1056 | <ul> | |
1057 | <li><code>eth.src</code> unchanged</li> | |
1058 | <li><code>eth.dst</code> unchanged</li> | |
1059 | <li><code>eth.type = 0x0806</code></li> | |
1060 | <li><code>arp.op = 1</code> (ARP request)</li> | |
1061 | <li><code>arp.sha</code> copied from <code>eth.src</code></li> | |
1062 | <li><code>arp.spa</code> copied from <code>ip4.src</code></li> | |
1063 | <li><code>arp.tha = 00:00:00:00:00:00</code></li> | |
1064 | <li><code>arp.tpa</code> copied from <code>ip4.dst</code></li> | |
1065 | </ul> | |
1066 | ||
6335d074 BP |
1067 | <p> |
1068 | The ARP packet has the same VLAN header, if any, as the IP packet | |
1069 | it replaces. | |
1070 | </p> | |
1071 | ||
69a832cf BP |
1072 | <p><b>Prerequisite:</b> <code>ip4</code></p> |
1073 | </dd> | |
1074 | ||
e75451fe ZKL |
1075 | <dt> |
1076 | <code>na { <var>action</var>; </code>...<code> };</code> | |
1077 | </dt> | |
1078 | ||
1079 | <dd> | |
1080 | <p> | |
1081 | Temporarily replaces the IPv6 packet being processed by an IPv6 | |
1082 | neighbor advertisement (NA) packet and executes each nested | |
1083 | <var>action</var> on the NA packet. Actions following the | |
1084 | <var>na</var> action, if any, apply to the original, unmodified | |
1085 | packet. | |
1086 | </p> | |
1087 | ||
1088 | <p> | |
1089 | The NA packet that this action operates on is initialized based on | |
1090 | the IPv6 packet being processed, as follows. These are default | |
1091 | values that the nested actions will probably want to change: | |
1092 | </p> | |
1093 | ||
1094 | <ul> | |
1095 | <li><code>eth.dst</code> exchanged with <code>eth.src</code></li> | |
1096 | <li><code>eth.type = 0x86dd</code></li> | |
1097 | <li><code>ip6.dst</code> copied from <code>ip6.src</code></li> | |
1098 | <li><code>ip6.src</code> copied from <code>nd.target</code></li> | |
1099 | <li><code>icmp6.type = 136</code> (Neighbor Advertisement)</li> | |
1100 | <li><code>nd.target</code> unchanged</li> | |
1101 | <li><code>nd.sll = 00:00:00:00:00:00</code></li> | |
1102 | <li><code>nd.tll</code> copied from <code>eth.dst</code></li> | |
1103 | </ul> | |
1104 | ||
1105 | <p> | |
1106 | The NA packet has the same VLAN header, if any, as the IPv6 packet | |
1107 | it replaces. | |
1108 | </p> | |
1109 | ||
1110 | <p> | |
1111 | <b>Prerequisite:</b> <code>nd</code> | |
1112 | </p> | |
1113 | </dd> | |
1114 | ||
0bac7164 BP |
1115 | <dt><code>get_arp(<var>P</var>, <var>A</var>);</code></dt> |
1116 | ||
1117 | <dd> | |
1118 | <p> | |
1119 | <b>Parameters</b>: logical port string field <var>P</var>, 32-bit | |
1120 | IP address field <var>A</var>. | |
1121 | </p> | |
1122 | ||
1123 | <p> | |
1124 | Looks up <var>A</var> in <var>P</var>'s ARP table. If an entry is | |
1125 | found, stores its Ethernet address in <code>eth.dst</code>, | |
1126 | otherwise stores <code>00:00:00:00:00:00</code> in | |
1127 | <code>eth.dst</code>. | |
1128 | </p> | |
1129 | ||
1130 | <p><b>Example:</b> <code>get_arp(outport, ip4.dst);</code></p> | |
1131 | </dd> | |
1132 | ||
1133 | <dt> | |
1134 | <code>put_arp(<var>P</var>, <var>A</var>, <var>E</var>);</code> | |
1135 | </dt> | |
1136 | ||
1137 | <dd> | |
1138 | <p> | |
1139 | <b>Parameters</b>: logical port string field <var>P</var>, 32-bit | |
1140 | IP address field <var>A</var>, 48-bit Ethernet address field | |
1141 | <var>E</var>. | |
1142 | </p> | |
1143 | ||
1144 | <p> | |
1145 | Adds or updates the entry for IP address <var>A</var> in logical | |
1146 | port <var>P</var>'s ARP table, setting its Ethernet address to | |
1147 | <var>E</var>. | |
1148 | </p> | |
1149 | ||
1150 | <p><b>Example:</b> <code>put_arp(inport, arp.spa, arp.sha);</code></p> | |
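<p>
  The lookup and update semantics of <code>get_arp</code> and
  <code>put_arp</code> can be pictured with a small Python model. This is a
  sketch under the assumption that each logical port has its own table keyed
  by IP address; the class <code>ArpTables</code> is hypothetical and says
  nothing about how OVN actually stores these bindings.
</p>

```python
ZERO_MAC = "00:00:00:00:00:00"


class ArpTables:
    """Toy model: one IP-to-MAC table per logical port (illustration only)."""

    def __init__(self):
        self._tables = {}

    def put_arp(self, port, ip, mac):
        # Adds the entry for ip in port's table, or updates it if present.
        self._tables.setdefault(port, {})[ip] = mac

    def get_arp(self, port, ip):
        # Returns the value that would be stored in eth.dst: the learned
        # Ethernet address, or the all-zeros address when nothing is found.
        return self._tables.get(port, {}).get(ip, ZERO_MAC)


tables = ArpTables()
tables.put_arp("lrp0", "10.0.0.5", "0a:00:00:00:00:05")
print(tables.get_arp("lrp0", "10.0.0.5"))
print(tables.get_arp("lrp0", "10.0.0.9"))
```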
1151 | </dd> | |
42814145 NS |
1152 | |
1153 | <dt> | |
1154 | <code><var>R</var> = put_dhcp_opts(<code>offerip</code> = <var>IP</var>, <var>D1</var> = <var>V1</var>, <var>D2</var> = <var>V2</var>, ..., <var>Dn</var> = <var>Vn</var>);</code> | |
1155 | </dt> | |
1156 | ||
1157 | <dd> | |
1158 | <p> | |
1159 | <b>Parameters</b>: one or more DHCP option/value pairs, the first | |
1160 | of which must set a value for the offered IP, <code>offerip</code>. | |
1161 | </p> | |
1162 | ||
1163 | <p> | |
1164 | <b>Result</b>: stored to a 1-bit subfield <var>R</var>. | |
1165 | </p> | |
1166 | ||
1167 | <p> | |
1168 | Valid only in the ingress pipeline. | |
1169 | </p> | |
1170 | ||
1171 | <p> | |
1172 | When this action is applied to a DHCP request packet (DHCPDISCOVER | |
1173 | or DHCPREQUEST), it changes the packet into a DHCP reply (DHCPOFFER | |
1174 | or DHCPACK, respectively), replaces the options by those specified | |
1175 | as parameters, and stores 1 in <var>R</var>. | |
1176 | </p> | |
1177 | ||
1178 | <p> | |
1179 | When this action is applied to a non-DHCP packet or a DHCP packet | |
1180 | that is not DHCPDISCOVER or DHCPREQUEST, it leaves the packet | |
1181 | unchanged and stores 0 in <var>R</var>. | |
1182 | </p> | |
1183 | ||
1184 | <p> | |
1185 | The contents of the <ref table="DHCP_Option"/> table control the | |
1186 | DHCP option names and values that this action supports. | |
1187 | </p> | |
1188 | ||
1189 | <p> | |
1190 | <b>Example:</b> | |
1191 | <code> | |
1192 | reg0[0] = put_dhcp_opts(offerip = 10.0.0.2, router = 10.0.0.1, | |
1193 | netmask = 255.255.255.0, dns_server = {8.8.8.8, 7.7.7.7}); | |
1194 | </code> | |
1195 | </p> | |
1196 | </dd> | |
467085fd GS |
1197 | |
1198 | <dt><code>ct_lb;</code></dt> | |
1199 | <dt><code>ct_lb(</code><var>ip</var>[<code>:</code><var>port</var>]...<code>);</code></dt> | |
1200 | <dd> | |
1201 | <p> | |
1202 | With one or more arguments, <code>ct_lb</code> commits the packet | |
1203 | to the connection tracking table and DNATs the packet's destination | |
1204 | IP address (and port) to the IP address or addresses (and optional | |
1205 | ports) specified in the string. If multiple comma-separated IP | |
1206 | addresses are specified, each is given equal weight for picking the | |
1207 | DNAT address. Processing automatically moves on to the next table, | |
1208 | as if <code>next;</code> were specified, and later tables act on | |
1209 | the packet as modified by the connection tracker. Connection | |
1210 | tracking state is scoped by the logical port, so overlapping | |
1211 | addresses may be used. | |
1212 | </p> | |
1213 | <p> | |
1214 | Without arguments, <code>ct_lb</code> sends the packet to the | |
1215 | connection tracking table to NAT the packets. If the packet is | |
1216 | part of an established connection that was previously committed to | |
1217 | the connection tracker via <code>ct_lb(</code>...<code>)</code>, it | |
1218 | will automatically get DNATed to the same IP address as the first | |
1219 | packet in that connection. | |
1220 | </p> | |
1221 | </dd> | |
6335d074 BP |
1222 | </dl> |
1223 | ||
1224 | <p> | |
1225 | The following actions will likely be useful later, but they have not | |
1226 | been thought out carefully. | |
1227 | </p> | |
1228 | ||
1229 | <dl> | |
69a832cf BP |
1230 | <dt><code>icmp4 { <var>action</var>; </code>...<code> };</code></dt> |
1231 | <dd> | |
1232 | <p> | |
1233 | Temporarily replaces the IPv4 packet being processed by an ICMPv4 | |
1234 | packet and executes each nested <var>action</var> on the ICMPv4 | |
1235 | packet. Actions following the <var>icmp4</var> action, if any, | |
1236 | apply to the original, unmodified packet. | |
1237 | </p> | |
1238 | ||
1239 | <p> | |
1240 | The ICMPv4 packet that this action operates on is initialized based | |
1241 | on the IPv4 packet being processed, as follows. These are default | |
1242 | values that the nested actions will probably want to change. | |
1243 | Ethernet and IPv4 fields not listed here are not changed: | |
1244 | </p> | |
1245 | ||
1246 | <ul> | |
1247 | <li><code>ip.proto = 1</code> (ICMPv4)</li> | |
1248 | <li><code>ip.frag = 0</code> (not a fragment)</li> | |
1249 | <li><code>icmp4.type = 3</code> (destination unreachable)</li> | |
1250 | <li><code>icmp4.code = 1</code> (host unreachable)</li> | |
1251 | </ul> | |
1252 | ||
1253 | <p> | |
1254 | Details TBD. | |
1255 | </p> | |
fe36184b | 1256 | |
69a832cf BP |
1257 | <p><b>Prerequisite:</b> <code>ip4</code></p> |
1258 | </dd> | |
1259 | ||
1260 | <dt><code>tcp_reset;</code></dt> | |
1261 | <dd> | |
1262 | <p> | |
1263 | This action transforms the current TCP packet according to the | |
1264 | following pseudocode: | |
1265 | </p> | |
1266 | ||
1267 | <pre> | |
1268 | if (tcp.ack) { | |
1269 | tcp.seq = tcp.ack; | |
1270 | } else { | |
1271 | tcp.ack = tcp.seq + length(tcp.payload); | |
1272 | tcp.seq = 0; | |
1273 | } | |
1274 | tcp.flags = RST; | |
1275 | </pre> | |
1276 | ||
1277 | <p> | |
1278 | Then, the action drops all TCP options and payload data, and | |
1279 | updates the TCP checksum. | |
1280 | </p> | |
1281 | ||
1282 | <p> | |
1283 | Details TBD. | |
1284 | </p> | |
1285 | ||
1286 | <p><b>Prerequisite:</b> <code>tcp</code></p> | |
1287 | </dd> | |
fe36184b | 1288 | </dl> |
fe36184b | 1289 | </column> |
091e3af9 JP |
1290 | |
1291 | <column name="external_ids" key="stage-name"> | |
1292 | Human-readable name for this flow's stage in the pipeline. | |
1293 | </column> | |
1294 | ||
1295 | <group title="Common Columns"> | |
1296 | The overall purpose of these columns is described under <code>Common | |
1297 | Columns</code> at the beginning of this document. | |
1298 | ||
1299 | <column name="external_ids"/> | |
1300 | </group> | |
fe36184b BP |
1301 | </table> |
1302 | ||
5868eb24 BP |
1303 | <table name="Multicast_Group" title="Logical Port Multicast Groups"> |
1304 | <p> | |
1305 | The rows in this table define multicast groups of logical ports. | |
1306 | Multicast groups allow a single packet transmitted over a tunnel to a | |
1307 | hypervisor to be delivered to multiple VMs on that hypervisor, which | |
1308 | uses bandwidth more efficiently. | |
1309 | </p> | |
1310 | ||
1311 | <p> | |
1312 | Each row in this table defines a logical multicast group numbered <ref | |
1313 | column="tunnel_key"/> within <ref column="datapath"/>, whose logical | |
1314 | ports are listed in the <ref column="ports"/> column. | |
1315 | </p> | |
1316 | ||
1317 | <column name="datapath"> | |
1318 | The logical datapath in which the multicast group resides. | |
1319 | </column> | |
1320 | ||
1321 | <column name="tunnel_key"> | |
1322 | The value used to designate this logical egress port in tunnel | |
1323 | encapsulations. An index forces the key to be unique within the <ref | |
1324 | column="datapath"/>. The unusual range ensures that multicast group IDs | |
1325 | do not overlap with logical port IDs. | |
1326 | </column> | |
1327 | ||
1328 | <column name="name"> | |
1329 | <p> | |
1330 | The logical multicast group's name. An index forces the name to be | |
1331 | unique within the <ref column="datapath"/>. Logical flows in the | |
1332 | ingress pipeline may output to the group just as for individual logical | |
1333 | ports, by assigning the group's name to <code>outport</code> and | |
1334 | executing an <code>output</code> action. | |
1335 | </p> | |
1336 | ||
1337 | <p> | |
1338 | Multicast group names and logical port names share a single namespace | |
1339 | and thus should not overlap (but the database schema cannot enforce | |
1340 | this). To try to avoid conflicts, <code>ovn-northd</code> uses names | |
1341 | that begin with <code>_MC_</code>. | |
1342 | </p> | |
1343 | </column> | |
1344 | ||
1345 | <column name="ports"> | |
1346 | The logical ports included in the multicast group. All of these ports | |
1347 | must be in the <ref column="datapath"/> logical datapath (but the | |
1348 | database schema cannot enforce this). | |
1349 | </column> | |
1350 | </table> | |
1351 | ||
1352 | <table name="Datapath_Binding" title="Physical-Logical Datapath Bindings"> | |
1353 | <p> | |
1354 | Each row in this table identifies physical bindings of a logical | |
1355 | datapath. A logical datapath implements a logical pipeline among the | |
1356 | ports in the <ref table="Port_Binding"/> table associated with it. In | |
1357 | practice, the pipeline in a given logical datapath implements either a | |
1358 | logical switch or a logical router. | |
1359 | </p> | |
1360 | ||
1361 | <column name="tunnel_key"> | |
1362 | The tunnel key value to which the logical datapath is bound. | |
1363 | The <code>Tunnel Encapsulation</code> section in | |
1364 | <code>ovn-architecture</code>(7) describes how tunnel keys are | |
1365 | constructed for each supported encapsulation. | |
1366 | </column> | |
1367 | ||
9975d7be BP |
1368 | <group title="OVN_Northbound Relationship"> |
1369 | <p> | |
1370 | Each row in <ref table="Datapath_Binding"/> is associated with some | |
1371 | logical datapath. <code>ovn-northd</code> uses these keys to track the | |
1372 | association of a logical datapath with concepts in the <ref | |
1373 | db="OVN_Northbound"/> database. | |
1374 | </p> | |
1375 | ||
1376 | <column name="external_ids" key="logical-switch" type='{"type": "uuid"}'> | |
1377 | For a logical datapath that represents a logical switch, | |
1378 | <code>ovn-northd</code> stores in this key the UUID of the | |
1379 | corresponding <ref table="Logical_Switch" db="OVN_Northbound"/> row in | |
1380 | the <ref db="OVN_Northbound"/> database. | |
1381 | </column> | |
1382 | ||
1383 | <column name="external_ids" key="logical-router" type='{"type": "uuid"}'> | |
1384 | For a logical datapath that represents a logical router, | |
1385 | <code>ovn-northd</code> stores in this key the UUID of the | |
1386 | corresponding <ref table="Logical_Router" db="OVN_Northbound"/> row in | |
1387 | the <ref db="OVN_Northbound"/> database. | |
1388 | </column> | |
1389 | </group> | |
5868eb24 BP |
1390 | |
1391 | <group title="Common Columns"> | |
1392 | The overall purpose of these columns is described under <code>Common | |
1393 | Columns</code> at the beginning of this document. | |
1394 | ||
1395 | <column name="external_ids"/> | |
1396 | </group> | |
1397 | </table> | |
1398 | ||
dcda6e0d | 1399 | <table name="Port_Binding" title="Physical-Logical Port Bindings"> |
fe36184b | 1400 | <p> |
d387d24d BP |
1401 | Most rows in this table identify the physical location of a logical port. |
1402 | (The exceptions are logical patch ports, which do not have any physical | |
1403 | location.) | |
fe36184b BP |
1404 | </p> |
1405 | ||
1406 | <p> | |
80f408f4 JP |
1407 | For every <code>Logical_Switch_Port</code> record in |
1408 | the <code>OVN_Northbound</code> database, <code>ovn-northd</code> | |
1409 | creates a record in this table. <code>ovn-northd</code> populates | |
1410 | and maintains every column except the <code>chassis</code> column, | |
1411 | which it leaves empty in new records. | |
9fb4636f GS |
1412 | </p> |
1413 | ||
1414 | <p> | |
88058f19 AW |
1415 | <code>ovn-controller</code>/<code>ovn-controller-vtep</code> |
1416 | populates the <code>chassis</code> column for the records that | |
1417 | identify the logical ports that are located on its hypervisor/gateway, | |
1418 | which <code>ovn-controller</code>/<code>ovn-controller-vtep</code> in | |
1419 | turn finds out by monitoring the local hypervisor's Open_vSwitch | |
1420 | database, which identifies logical ports via the conventions described | |
c1645003 GS |
1421 | in <code>IntegrationGuide.md</code>. (The exceptions are for |
1422 | <code>Port_Binding</code> records with <code>type</code> of | |
1423 | <code>gateway</code>, whose locations are identified by | |
1424 | <code>ovn-northd</code> via the <code>options:gateway-chassis</code> | |
1425 | column in this table. <code>ovn-controller</code> is still responsible | |
1426 | for populating the <code>chassis</code> column.) | |
9fb4636f GS |
1427 | </p> |
1428 | ||
1429 | <p> | |
5868eb24 | 1430 | When a chassis shuts down gracefully, it should clean up the |
9fb4636f | 1431 | <code>chassis</code> column that it previously had populated. |
fe36184b BP |
1432 | (This is not critical because resources hosted on the chassis are equally |
1433 | unreachable regardless of whether their rows are present.) To handle the | |
1434 | case where a VM is shut down abruptly on one chassis, then brought up | |
88058f19 AW |
1435 | again on a different one, |
1436 | <code>ovn-controller</code>/<code>ovn-controller-vtep</code> must | |
1437 | overwrite the <code>chassis</code> column with new information. | |
fe36184b BP |
1438 | </p> |
1439 | ||
c96ba502 BP |
1440 | <group title="Core Features"> |
1441 | <column name="datapath"> | |
1442 | The logical datapath to which the logical port belongs. | |
1443 | </column> | |
1a76c93e | 1444 | |
c96ba502 | 1445 | <column name="logical_port"> |
80f408f4 JP |
1446 | A logical port, taken from <ref table="Logical_Switch_Port" |
1447 | column="name" db="OVN_Northbound"/> in the OVN_Northbound | |
1448 | database's <ref table="Logical_Switch_Port" db="OVN_Northbound"/> | |
1449 | table. OVN does not prescribe a particular format for the | |
1450 | logical port ID. | |
c96ba502 | 1451 | </column> |
c0281929 | 1452 | |
c96ba502 | 1453 | <column name="chassis"> |
184bc3ca RB |
1454 | The meaning of this column depends on the value of the <ref column="type"/> |
1455 | column. The meaning for each <ref column="type"/> is as follows: | |
1456 | ||
1457 | <dl> | |
1458 | <dt>(empty string)</dt> | |
1459 | <dd> | |
1460 | The physical location of the logical port. To successfully identify a | |
1461 | chassis, this column must be a <ref table="Chassis"/> record. This is | |
1462 | populated by <code>ovn-controller</code>. | |
1463 | </dd> | |
1464 | ||
1465 | <dt>vtep</dt> | |
1466 | <dd> | |
1467 | The physical location of the hardware_vtep gateway. To successfully | |
1468 | identify a chassis, this column must be a <ref table="Chassis"/> record. | |
1469 | This is populated by <code>ovn-controller-vtep</code>. | |
1470 | </dd> | |
1471 | ||
1472 | <dt>localnet</dt> | |
1473 | <dd> | |
1474 | Always empty. A localnet port is realized on every chassis that has | |
1475 | connectivity to the corresponding physical network. | |
1476 | </dd> | |
1477 | ||
1478 | <dt>gateway</dt> | |
1479 | <dd> | |
1480 | The physical location of the L3 gateway. To successfully identify a | |
1481 | chassis, this column must be a <ref table="Chassis"/> record. This is | |
1482 | populated by <code>ovn-controller</code> based on the value of | |
1483 | the <code>options:gateway-chassis</code> column in this table. | |
1484 | </dd> | |
1485 | ||
1486 | <dt>l2gateway</dt> | |
1487 | <dd> | |
1488 | The physical location of this L2 gateway. To successfully identify a | |
1489 | chassis, this column must be a <ref table="Chassis"/> record. | |
62b87eab NS |
1490 | This is populated by <code>ovn-controller</code> based on the value |
1491 | of the <code>options:l2gateway-chassis</code> column in this table. | |
184bc3ca RB |
1492 | </dd> |
1493 | </dl> | |
1494 | ||
c96ba502 | 1495 | </column> |
c0281929 | 1496 | |
c96ba502 BP |
1497 | <column name="tunnel_key"> |
1498 | <p> | |
1499 | A number that represents the logical port in the key (e.g. STT key or | |
1500 | Geneve TLV) field carried within tunnel protocol packets. | |
1501 | </p> | |
c0281929 | 1502 | |
c96ba502 BP |
1503 | <p> |
1504 | The tunnel ID must be unique within the scope of a logical datapath. | |
1505 | </p> | |
1506 | </column> | |
88058f19 | 1507 | |
c96ba502 BP |
1508 | <column name="mac"> |
1509 | <p> | |
1510 | The Ethernet address or addresses used as a source address on the | |
1511 | logical port, each in the form | |
1512 | <var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>:<var>xx</var>. | |
1513 | The string <code>unknown</code> is also allowed to indicate that the | |
1514 | logical port has an unknown set of (additional) source addresses. | |
1515 | </p> | |
1516 | ||
1517 | <p> | |
1518 | A VM interface would ordinarily have a single Ethernet address. A | |
1519 | gateway port might initially only have <code>unknown</code>, and then | |
1520 | add MAC addresses to the set as it learns new source addresses. | |
1521 | </p> | |
1522 | </column> | |
88058f19 | 1523 | |
c96ba502 BP |
1524 | <column name="type"> |
1525 | <p> | |
1526 | A type for this logical port. Logical ports can be used to model other | |
1527 | types of connectivity into an OVN logical switch. The following types | |
1528 | are defined: | |
1529 | </p> | |
1530 | ||
1531 | <dl> | |
1532 | <dt>(empty string)</dt> | |
1533 | <dd>VM (or VIF) interface.</dd> | |
d387d24d BP |
1534 | |
1535 | <dt><code>patch</code></dt> | |
1536 | <dd> | |
1537 | One of a pair of logical ports that act as if connected by a patch | |
1538 | cable. Useful for connecting two logical datapaths, e.g. to connect | |
1539 | a logical router to a logical switch or to another logical router. | |
1540 | </dd> | |
1541 | ||
c1645003 GS |
1542 | <dt><code>gateway</code></dt> |
1543 | <dd> | |
1544 | One of a pair of logical ports that act as if connected by a patch | |
1545 | cable across multiple chassis. Useful for connecting a logical | |
1546 | switch with a Gateway router (which is only resident on a | |
1547 | particular chassis). | |
1548 | </dd> | |
1549 | ||
c96ba502 BP |
1550 | <dt><code>localnet</code></dt> |
1551 | <dd> | |
1552 | A connection to a locally accessible network from each | |
1553 | <code>ovn-controller</code> instance. A logical switch can only | |
6e6c3f91 HZ |
1554 | have a single <code>localnet</code> port attached. This is used |
1555 | to model direct connectivity to an existing network. | |
c96ba502 BP |
1556 | </dd> |
1557 | ||
184bc3ca RB |
1558 | <dt><code>l2gateway</code></dt> |
1559 | <dd> | |
1560 | An L2 connection to a physical network. The chassis this | |
1561 | <ref table="Port_Binding"/> is bound to will serve as | |
1562 | an L2 gateway to the network named by | |
1563 | <ref column="options" table="Port_Binding"/>:<code>network_name</code>. | |
1564 | </dd> | |
1565 | ||
c96ba502 BP |
1566 | <dt><code>vtep</code></dt> |
1567 | <dd> | |
1568 | A port to a logical switch on a VTEP gateway chassis. In order to | |
1569 | get this port correctly recognized by the OVN controller, the <ref | |
1570 | column="options" | |
1571 | table="Port_Binding"/>:<code>vtep-physical-switch</code> and <ref | |
1572 | column="options" | |
1573 | table="Port_Binding"/>:<code>vtep-logical-switch</code> must also | |
1574 | be defined. | |
1575 | </dd> | |
1576 | </dl> | |
1577 | </column> | |
1578 | </group> | |
1a76c93e | 1579 | |
d387d24d BP |
1580 | <group title="Patch Options"> |
1581 | <p> | |
1582 | These options apply to logical ports with <ref column="type"/> of | |
1583 | <code>patch</code>. | |
1584 | </p> | |
1585 | ||
1586 | <column name="options" key="peer"> | |
1587 | The <ref column="logical_port"/> in the <ref table="Port_Binding"/> | |
1588 | record for the other side of the patch. The named <ref | |
1589 | column="logical_port"/> must specify this <ref column="logical_port"/> | |
1590 | in its own <code>peer</code> option. That is, the two patch logical | |
1591 | ports must have reversed <ref column="logical_port"/> and | |
1592 | <code>peer</code> values. | |
1593 | </column> | |
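The reversed-pair requirement on <code>peer</code> can be checked mechanically. The following is a minimal Python sketch, assuming Port_Binding rows have already been read into plain dictionaries; the function name and the sample port names (lrp0, lsp0, and so on) are hypothetical and are not part of OVN.

```python
def unpaired_patch_ports(port_bindings):
    """Return the logical_port names of patch ports whose peer does not point back."""
    by_name = {pb["logical_port"]: pb for pb in port_bindings}
    bad = []
    for pb in port_bindings:
        if pb.get("type") != "patch":
            continue
        peer = by_name.get(pb.get("options", {}).get("peer"))
        # The peer must exist, be a patch port itself, and name this port
        # in its own "peer" option.
        if (peer is None
                or peer.get("type") != "patch"
                or peer.get("options", {}).get("peer") != pb["logical_port"]):
            bad.append(pb["logical_port"])
    return sorted(bad)


# Hypothetical rows: lrp0/lsp0 are correctly paired, lrp1/lsp1 are not.
rows = [
    {"logical_port": "lrp0", "type": "patch", "options": {"peer": "lsp0"}},
    {"logical_port": "lsp0", "type": "patch", "options": {"peer": "lrp0"}},
    {"logical_port": "lrp1", "type": "patch", "options": {"peer": "lsp1"}},
    {"logical_port": "lsp1", "type": "patch", "options": {"peer": "lrp0"}},
]
print(unpaired_patch_ports(rows))
```

An empty result means every patch port and its peer name each other, as the column description requires.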
1594 | </group> | |
1595 | ||
184bc3ca | 1596 | <group title="L3 Gateway Options"> |
c1645003 GS |
1597 | <p> |
1598 | These options apply to logical ports with <ref column="type"/> of | |
1599 | <code>gateway</code>. | |
1600 | </p> | |
1601 | ||
1602 | <column name="options" key="peer"> | |
1603 | The <ref column="logical_port"/> in the <ref table="Port_Binding"/> | |
1604 | record for the other side of the <code>gateway</code> port. The named <ref | |
1605 | column="logical_port"/> must specify this <ref column="logical_port"/> | |
1606 | in its own <code>peer</code> option. That is, the two <code>gateway</code> | |
1607 | logical ports must have reversed <ref column="logical_port"/> and | |
1608 | <code>peer</code> values. | |
1609 | </column> | |
1610 | ||
1611 | <column name="options" key="gateway-chassis"> | |
1612 | The <code>chassis</code> in which the port resides. | |
1613 | </column> | |
1614 | </group> | |
1615 | ||
c96ba502 | 1616 | <group title="Localnet Options"> |
eb00399e | 1617 | <p> |
c96ba502 BP |
1618 | These options apply to logical ports with <ref column="type"/> of |
1619 | <code>localnet</code>. | |
eb00399e BP |
1620 | </p> |
1621 | ||
c96ba502 BP |
1622 | <column name="options" key="network_name"> |
1623 | Required. <code>ovn-controller</code> uses the configuration entry | |
1624 | <code>ovn-bridge-mappings</code> to determine how to connect to this | |
1625 | network. <code>ovn-bridge-mappings</code> is a list of network names | |
1626 | mapped to a local OVS bridge that provides access to that network. An | |
1627 | example of configuring <code>ovn-bridge-mappings</code> would be: | |
1628 | ||
1629 | <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre> | |
1630 | ||
1631 | <p> | |
1632 | When a logical switch has a <code>localnet</code> port attached, | |
1633 | every chassis that may have a local vif attached to that logical | |
1634 | switch must have a bridge mapping configured to reach that | |
1635 | <code>localnet</code>. Traffic that arrives on a | |
1636 | <code>localnet</code> port is never forwarded over a tunnel to | |
1637 | another chassis. | |
1638 | </p> | |
1639 | </column> | |
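To illustrate how a <code>network_name</code> is resolved against <code>ovn-bridge-mappings</code>, here is a minimal Python sketch. It assumes the comma-separated network:bridge format shown in the example above; the helper names are hypothetical, and this is not how <code>ovn-controller</code> itself is implemented.

```python
def parse_bridge_mappings(value):
    """Parse 'net1:br1,net2:br2' into a {network_name: bridge} dictionary."""
    mappings = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        network, sep, bridge = entry.partition(":")
        if not sep or not network or not bridge:
            raise ValueError(f"malformed bridge mapping: {entry!r}")
        mappings[network] = bridge
    return mappings


def bridge_for(port_binding, mappings):
    """Return the local bridge for a port's network_name, or None if unmapped."""
    return mappings.get(port_binding["options"]["network_name"])


mappings = parse_bridge_mappings("physnet1:br-eth0,physnet2:br-eth1")
localnet_port = {"type": "localnet", "options": {"network_name": "physnet2"}}
print(bridge_for(localnet_port, mappings))
```

A result of None corresponds to a chassis that has no bridge mapping for the network and so cannot reach that <code>localnet</code>.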
1640 | ||
1641 | <column name="tag"> | |
1642 | If set, indicates that the port represents a connection to a specific | |
1643 | VLAN on a locally accessible network. The VLAN ID is used to match | |
1644 | incoming traffic and is also added to outgoing traffic. | |
1645 | </column> | |
1646 | </group> | |
1647 | ||
184bc3ca RB |
1648 | <group title="L2 Gateway Options"> |
1649 | <p> | |
1650 | These options apply to logical ports with <ref column="type"/> of | |
1651 | <code>l2gateway</code>. | |
1652 | </p> | |
1653 | ||
1654 | <column name="options" key="network_name"> | |
1655 | Required. <code>ovn-controller</code> uses the configuration entry | |
1656 | <code>ovn-bridge-mappings</code> to determine how to connect to this | |
1657 | network. <code>ovn-bridge-mappings</code> is a list of network names | |
1658 | mapped to a local OVS bridge that provides access to that network. An | |
1659 | example of configuring <code>ovn-bridge-mappings</code> would be: | |
1660 | ||
1661 | <pre>$ ovs-vsctl set open . external-ids:ovn-bridge-mappings=physnet1:br-eth0,physnet2:br-eth1</pre> | |
1662 | ||
1663 | <p> | |
1664 | When a logical switch has an <code>l2gateway</code> port attached, | |
1665 | the chassis that the <code>l2gateway</code> port is bound to | |
1666 | must have a bridge mapping configured to reach the network | |
1667 | identified by <code>network_name</code>. | |
1668 | </p> | |
1669 | </column> | |
1670 | ||
62b87eab NS |
1671 | <column name="options" key="l2gateway-chassis"> |
1672 | Required. The <code>chassis</code> in which the port resides. | |
1673 | </column> | |
1674 | ||
184bc3ca RB |
1675 | <column name="tag"> |
1676 | If set, indicates that the gateway is connected to a specific | |
1677 | VLAN on the physical network. The VLAN ID is used to match | |
1678 | incoming traffic and is also added to outgoing traffic. | |
1679 | </column> | |
1680 | </group> | |
1681 | ||
c96ba502 | 1682 | <group title="VTEP Options"> |
eb00399e | 1683 | <p> |
c96ba502 BP |
1684 | These options apply to logical ports with <ref column="type"/> of |
1685 | <code>vtep</code>. | |
eb00399e | 1686 | </p> |
9fb4636f | 1687 | |
c96ba502 BP |
1688 | <column name="options" key="vtep-physical-switch"> |
1689 | Required. The name of the VTEP gateway. | |
1690 | </column> | |
fe36184b | 1691 | |
c96ba502 BP |
1692 | <column name="options" key="vtep-logical-switch"> |
1693 | Required. A logical switch name connected by the VTEP gateway. Must | |
1694 | be set when <ref column="type"/> is <code>vtep</code>. | |
1695 | </column> | |
1696 | </group> | |
fe36184b | 1697 | |
aef5f431 BP |
1698 | <group title="VMI (or VIF) Options"> |
1699 | <p> | |
1700 | These options apply to logical ports with <ref column="type"/> of | |
1701 | (empty string). | |
1702 | </p> | |
1703 | ||
1704 | <column name="options" key="policing_rate"> | |
1705 | If set, indicates the maximum rate for data sent from this interface, | |
1706 | in kbps. Data exceeding this rate is dropped. | |
1707 | </column> | |
1708 | ||
1709 | <column name="options" key="policing_burst"> | |
1710 | If set, indicates the maximum burst size for data sent from this | |
1711 | interface, in kb. | |
1712 | </column> | |
1713 | </group> | |
1714 | ||
c96ba502 | 1715 | <group title="Nested Containers"> |
fe36184b | 1716 | <p> |
c96ba502 BP |
1717 | These columns support containers nested within a VM. Specifically, |
1718 | they are used when <ref column="type"/> is empty and <ref | |
1719 | column="logical_port"/> identifies the interface of a container spawned | |
1720 | inside a VM. They are empty for containers or VMs that run directly on | |
1721 | a hypervisor. | |
fe36184b BP |
1722 | </p> |
1723 | ||
c96ba502 BP |
1724 | <column name="parent_port"> |
1725 | This is taken from | |
80f408f4 JP |
1726 | <ref table="Logical_Switch_Port" column="parent_name" |
1727 | db="OVN_Northbound"/> in the OVN_Northbound database's | |
1728 | <ref table="Logical_Switch_Port" db="OVN_Northbound"/> table. | |
c96ba502 BP |
1729 | </column> |
1730 | ||
1731 | <column name="tag"> | |
1732 | <p> | |
1733 | Identifies the VLAN tag in the network traffic associated with that | |
1734 | container's network interface. | |
1735 | </p> | |
1736 | ||
1737 | <p> | |
1738 | This column is used for a different purpose when <ref column="type"/> | |
184bc3ca RB |
1739 | is <code>localnet</code> (see <code>Localnet Options</code>, above) |
1740 | or <code>l2gateway</code> (see <code>L2 Gateway Options</code>, above). | |
c96ba502 BP |
1741 | </p> |
1742 | </column> | |
1743 | </group> | |
fe36184b | 1744 | </table> |
0bac7164 BP |
1745 | |
1746 | <table name="MAC_Binding" title="IP to MAC bindings"> | |
1747 | <p> | |
1748 | Each row in this table specifies a binding from an IP address to an | |
1749 | Ethernet address that has been discovered through ARP (for IPv4) or | |
1750 | neighbor discovery (for IPv6). This table is primarily used to discover | |
1751 | bindings on physical networks, because IP-to-MAC bindings for virtual | |
1752 | machines are usually populated statically into the <ref | |
1753 | table="Port_Binding"/> table. | |
1754 | </p> | |
1755 | ||
1756 | <p> | |
1757 | This table expresses a functional relationship: <ref | |
1758 | table="MAC_Binding"/>(<ref column="logical_port"/>, <ref column="ip"/>) = | |
1759 | <ref column="mac"/>. | |
1760 | </p> | |
1761 | ||
1762 | <p> | |
1763 | In outline, the lifetime of a logical router's MAC binding looks like | |
1764 | this: | |
1765 | </p> | |
1766 | ||
1767 | <ol> | |
1768 | <li> | |
1769 | On hypervisor 1, a logical router determines that a packet should be | |
1770 | forwarded to IP address <var>A</var> on one of its router ports. It | |
1771 | uses its logical flow table to determine that <var>A</var> lacks a | |
1772 | static IP-to-MAC binding and the <code>get_arp</code> action to | |
1773 | determine that it lacks a dynamic IP-to-MAC binding. | |
1774 | </li> | |
1775 | ||
1776 | <li> | |
1777 | Using an OVN logical <code>arp</code> action, the logical router | |
1778 | generates and sends a broadcast ARP request to the router port. It | |
1779 | drops the IP packet. | |
1780 | </li> | |
1781 | ||
1782 | <li> | |
1783 | The logical switch attached to the router port delivers the ARP request | |
1784 | to all of its ports. (It might make sense to deliver it only to ports | |
1785 | that have no static IP-to-MAC bindings, but this could also be | |
1786 | surprising behavior.) | |
1787 | </li> | |
1788 | ||
1789 | <li> | |
1790 | A host or VM on hypervisor 2 (which might be the same as hypervisor 1) | |
1791 | attached to the logical switch owns the IP address in question. It | |
1792 | composes an ARP reply and unicasts it to the logical router port's | |
1793 | Ethernet address. | |
1794 | </li> | |
1795 | ||
1796 | <li> | |
1797 | The logical switch delivers the ARP reply to the logical router port. | |
1798 | </li> | |
1799 | ||
1800 | <li> | |
1801 | The logical router flow table executes a <code>put_arp</code> action. | |
1802 | To record the IP-to-MAC binding, <code>ovn-controller</code> adds a row | |
1803 | to the <ref table="MAC_Binding"/> table. | |
1804 | </li> | |
1805 | ||
1806 | <li> | |
1807 | On hypervisor 1, <code>ovn-controller</code> receives the updated <ref | |
1808 | table="MAC_Binding"/> table from the OVN southbound database. The next | |
1809 | packet destined to <var>A</var> through the logical router is sent | |
1810 | directly to the bound Ethernet address. | |
1811 | </li> | |
1812 | </ol> | |
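The lifecycle above can be modeled as a lookup table keyed by (logical_port, ip), which is the functional relationship this table expresses. The Python sketch below is a simplified illustration only: the class and function names are hypothetical, the methods merely borrow the names of the <code>get_arp</code> and <code>put_arp</code> actions, and the sample port name and addresses are made up.

```python
class MacBindingTable:
    """Toy model of MAC_Binding: (logical_port, ip) maps to exactly one mac."""

    def __init__(self):
        self._rows = {}

    def get_arp(self, logical_port, ip):
        # Returns None when no dynamic binding has been learned yet.
        return self._rows.get((logical_port, ip))

    def put_arp(self, logical_port, ip, mac):
        # A later reply for the same (logical_port, ip) replaces the old mac.
        self._rows[(logical_port, ip)] = mac


def forward(table, logical_port, ip):
    """Describe what the logical router does with a packet destined to ip."""
    mac = table.get_arp(logical_port, ip)
    if mac is None:
        return "broadcast ARP request and drop packet"
    return f"send to {mac}"


table = MacBindingTable()
first = forward(table, "lrp0", "192.0.2.10")          # no binding yet
table.put_arp("lrp0", "192.0.2.10", "00:00:5e:00:53:01")  # ARP reply arrives
second = forward(table, "lrp0", "192.0.2.10")         # binding is now known
print(first)
print(second)
```

The first packet triggers an ARP request and is dropped; once the reply has been recorded, later packets go directly to the bound Ethernet address.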
1813 | ||
1814 | <column name="logical_port"> | |
1815 | The logical port on which the binding was discovered. | |
1816 | </column> | |
1817 | ||
1818 | <column name="ip"> | |
1819 | The bound IP address. | |
1820 | </column> | |
1821 | ||
1822 | <column name="mac"> | |
1823 | The Ethernet address to which the IP is bound. | |
1824 | </column> | |
791a7747 LS |
1825 | <column name="datapath"> |
1826 | The logical datapath to which the logical port belongs. | |
1827 | </column> | |
0bac7164 | 1828 | </table> |
42814145 NS |
1829 | |
1830 | <table name="DHCP_Options" title="DHCP Options supported by native OVN DHCP"> | |
1831 | <p> | |
1832 | Each row in this table stores the DHCP Options supported by native OVN | |
1833 | DHCP. <code>ovn-northd</code> populates this table with the supported | |
1834 | DHCP options. <code>ovn-controller</code> looks up this table to get the | |
1835 | DHCP codes of the DHCP options defined in the <code>put_dhcp_opts</code> action. | |
1836 | Please refer to RFC 2132 (<code>https://tools.ietf.org/html/rfc2132</code>) | |
1837 | for the list of DHCP options that can be defined here. | |
1838 | </p> | |
1839 | ||
1840 | <column name="name"> | |
1841 | <p> | |
1842 | Name of the DHCP option. | |
1843 | </p> | |
1844 | ||
1845 | <p> | |
1846 | Example. name="router" | |
1847 | </p> | |
1848 | </column> | |
1849 | ||
1850 | <column name="code"> | |
1851 | <p> | |
1852 | DHCP option code for the DHCP option as defined in RFC 2132. | |
1853 | </p> | |
1854 | ||
1855 | <p> | |
1856 | Example. code=3 | |
1857 | </p> | |
1858 | </column> | |
1859 | ||
1860 | <column name="type"> | |
1861 | <p> | |
1862 | Data type of the DHCP option code. | |
1863 | </p> | |
1864 | ||
1865 | <dl> | |
1866 | <dt><code>value: bool</code></dt> | |
1867 | <dd> | |
1868 | <p> | |
1869 | This indicates that the value of the DHCP option is a bool. | |
1870 | </p> | |
1871 | ||
1872 | <p> | |
1873 | Example. "name=ip_forward_enable", "code=19", "type=bool". | |
1874 | </p> | |
1875 | ||
1876 | <p> | |
1877 | put_dhcp_opts(..., ip_forward_enable = 1,...) | |
1878 | </p> | |
1879 | </dd> | |
1880 | ||
1881 | <dt><code>value: uint8</code></dt> | |
1882 | <dd> | |
1883 | <p> | |
1884 | This indicates that the value of the DHCP option is an unsigned | |
1885 | int8 (8 bits). | |
1886 | </p> | |
1887 | ||
1888 | <p> | |
1889 | Example. "name=default_ttl", "code=23", "type=uint8". | |
1890 | </p> | |
1891 | ||
1892 | <p> | |
1893 | put_dhcp_opts(..., default_ttl = 50,...) | |
1894 | </p> | |
1895 | </dd> | |
1896 | ||
1897 | <dt><code>value: uint16</code></dt> | |
1898 | <dd> | |
1899 | <p> | |
1900 | This indicates that the value of the DHCP option is an unsigned | |
1901 | int16 (16 bits). | |
1902 | </p> | |
1903 | ||
1904 | <p> | |
1905 | Example. "name=mtu", "code=26", "type=uint16". | |
1906 | </p> | |
1907 | ||
1908 | <p> | |
1909 | put_dhcp_opts(..., mtu = 1450,...) | |
1910 | </p> | |
1911 | </dd> | |
1912 | ||
1913 | <dt><code>value: uint32</code></dt> | |
1914 | <dd> | |
1915 | <p> | |
1916 | This indicates that the value of the DHCP option is an unsigned | |
1917 | int32 (32 bits). | |
1918 | </p> | |
1919 | ||
1920 | <p> | |
1921 | Example. "name=lease_time", "code=51", "type=uint32". | |
1922 | </p> | |
1923 | ||
1924 | <p> | |
1925 | put_dhcp_opts(..., lease_time = 86400,...) | |
1926 | </p> | |
1927 | </dd> | |
1928 | ||
1929 | <dt><code>value: ipv4</code></dt> | |
1930 | <dd> | |
1931 | <p> | |
1932 | This indicates that the value of the DHCP option is an IPv4 | |
1933 | address or addresses. | |
1934 | </p> | |
1935 | ||
1936 | <p> | |
1937 | Example. "name=router", "code=3", "type=ipv4". | |
1938 | </p> | |
1939 | ||
1940 | <p> | |
1941 | put_dhcp_opts(..., router = 10.0.0.1,...) | |
1942 | </p> | |
1943 | ||
1944 | <p> | |
1945 | Example. "name=dns_server", "code=6", "type=ipv4". | |
1946 | </p> | |
1947 | ||
1948 | <p> | |
1949 | put_dhcp_opts(..., dns_server = {8.8.8.8 7.7.7.7},...) | |
1950 | </p> | |
1951 | </dd> | |
1952 | ||
1953 | <dt><code>value: static_routes</code></dt> | |
1954 | <dd> | |
1955 | <p> | |
1956 | This indicates that the value of the DHCP option contains a pair of | |
1957 | IPv4 route and next hop addresses. | |
1958 | </p> | |
1959 | ||
1960 | <p> | |
1961 | Example. "name=classless_static_route", "code=121", "type=static_routes". | |
1962 | </p> | |
1963 | ||
1964 | <p> | |
1965 | put_dhcp_opts(..., classless_static_route = {30.0.0.0/24,10.0.0.4,0.0.0.0/0,10.0.0.1},...) | |
1966 | </p> | |
1967 | </dd> | |
1968 | ||
1969 | <dt><code>value: str</code></dt> | |
1970 | <dd> | |
1971 | <p> | |
1972 | This indicates that the value of the DHCP option is a string. | |
1973 | </p> | |
1974 | ||
1975 | <p> | |
1976 | Example. "name=host_name", "code=12", "type=str". | |
1977 | </p> | |
1978 | </dd> | |
1979 | </dl> | |
1980 | </column> | |
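To show how the <code>code</code> and <code>type</code> columns together determine what goes on the wire, the Python sketch below encodes a value in the RFC 2132 option layout (one byte of code, one byte of length, then the data). It is an illustration under stated assumptions, not the encoder used by <code>ovn-controller</code>: the function name is hypothetical, and the <code>static_routes</code> type is deliberately left out.

```python
import ipaddress
import struct


def encode_dhcp_option(code, opt_type, value):
    """Encode one DHCP option as code, length, data (RFC 2132 layout)."""
    if opt_type == "bool":
        payload = struct.pack("!B", 1 if value else 0)
    elif opt_type == "uint8":
        payload = struct.pack("!B", value)
    elif opt_type == "uint16":
        payload = struct.pack("!H", value)  # network byte order
    elif opt_type == "uint32":
        payload = struct.pack("!I", value)
    elif opt_type == "ipv4":
        # Accept a single address or a list of addresses.
        addrs = value if isinstance(value, (list, tuple)) else [value]
        payload = b"".join(ipaddress.IPv4Address(a).packed for a in addrs)
    elif opt_type == "str":
        payload = value.encode("ascii")
    else:
        raise ValueError(f"unsupported option type in this sketch: {opt_type}")
    return bytes([code, len(payload)]) + payload


# Rows taken from the examples in this column's description.
print(encode_dhcp_option(26, "uint16", 1450).hex())      # mtu
print(encode_dhcp_option(3, "ipv4", "10.0.0.1").hex())   # router
print(encode_dhcp_option(51, "uint32", 86400).hex())     # lease_time
```

Each printed string is the hexadecimal form of one option: the option code, the payload length, and the payload in network byte order.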
1981 | </table> | |
fe36184b | 1982 | </database> |