..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

================================
Design Decisions In Open vSwitch
================================

This document describes design decisions that went into implementing Open
vSwitch. While we believe these to be reasonable decisions, it is impossible
to predict how Open vSwitch will be used in all environments. Understanding
assumptions made by Open vSwitch is critical to a successful deployment. The
end of this document contains contact information that can be used to let us
know how we can make Open vSwitch more generally useful.

Asynchronous Messages
---------------------

Over time, Open vSwitch has added many knobs that control whether a given
controller receives OpenFlow asynchronous messages. This section describes how
all of these features interact.

First, a service controller never receives any asynchronous messages unless it
changes its ``miss_send_len`` from the service controller default of zero in
one of the following ways:

- Sending an ``OFPT_SET_CONFIG`` message with nonzero ``miss_send_len``.

- Sending any ``NXT_SET_ASYNC_CONFIG`` message: as a side effect, this message
  changes the ``miss_send_len`` to ``OFP_DEFAULT_MISS_SEND_LEN`` (128) for
  service controllers.

Second, ``OFPT_FLOW_REMOVED`` and ``NXT_FLOW_REMOVED`` messages are generated
only if the flow that was removed had the ``OFPFF_SEND_FLOW_REM`` flag set.

Third, ``OFPT_PACKET_IN`` and ``NXT_PACKET_IN`` messages are sent only to
OpenFlow controller connections that have the correct connection ID (see
``struct nx_controller_id`` and ``struct nx_action_controller``):

- For packet-in messages generated by an ``NXAST_CONTROLLER`` action, the
  controller ID specified in the action.

- For other packet-in messages, controller ID zero. (This is the default ID
  when an OpenFlow controller does not configure one.)

Finally, Open vSwitch consults a per-connection table indexed by the message
type, reason code, and current role. The following tables show how this table
is initialized by default when an OpenFlow connection is made. An entry
labeled ``yes`` means that the message is sent; an entry labeled ``---`` means
that the message is suppressed.

.. table:: ``OFPT_PACKET_IN`` / ``NXT_PACKET_IN``

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPR_NO_MATCH``                           yes     ---
    ``OFPR_ACTION``                             yes     ---
    ``OFPR_INVALID_TTL``                        ---     ---
    ``OFPR_ACTION_SET`` (OF1.4+)                yes     ---
    ``OFPR_GROUP`` (OF1.4+)                     yes     ---
    =========================================== ======= =====

.. table:: ``OFPT_FLOW_REMOVED`` / ``NXT_FLOW_REMOVED``

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPRR_IDLE_TIMEOUT``                      yes     ---
    ``OFPRR_HARD_TIMEOUT``                      yes     ---
    ``OFPRR_DELETE``                            yes     ---
    ``OFPRR_GROUP_DELETE`` (OF1.4+)             yes     ---
    ``OFPRR_METER_DELETE`` (OF1.4+)             yes     ---
    ``OFPRR_EVICTION`` (OF1.4+)                 yes     ---
    =========================================== ======= =====

.. table:: ``OFPT_PORT_STATUS``

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPPR_ADD``                               yes     yes
    ``OFPPR_DELETE``                            yes     yes
    ``OFPPR_MODIFY``                            yes     yes
    =========================================== ======= =====

.. table:: ``OFPT_ROLE_REQUEST`` / ``OFPT_ROLE_REPLY`` (OF1.4+)

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPCRR_MASTER_REQUEST``                   ---     ---
    ``OFPCRR_CONFIG``                           ---     ---
    ``OFPCRR_EXPERIMENTER``                     ---     ---
    =========================================== ======= =====

.. table:: ``OFPT_TABLE_STATUS`` (OF1.4+)

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPTR_VACANCY_DOWN``                      ---     ---
    ``OFPTR_VACANCY_UP``                        ---     ---
    =========================================== ======= =====

.. table:: ``OFPT_REQUESTFORWARD`` (OF1.4+)

    =========================================== ======= =====
                                                master/
    message and reason code                     other   slave
    =========================================== ======= =====
    ``OFPRFR_GROUP_MOD``                        ---     ---
    ``OFPRFR_METER_MOD``                        ---     ---
    =========================================== ======= =====

The ``NXT_SET_ASYNC_CONFIG`` message directly sets all of the values in this
table for the current connection. The ``OFPC_INVALID_TTL_TO_CONTROLLER`` bit
in the ``OFPT_SET_CONFIG`` message controls the setting for
``OFPR_INVALID_TTL`` for the "master" role.
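
The per-connection lookup described in this section can be pictured as a small
array of reason bitmaps. The following is a minimal sketch and not the actual
Open vSwitch data structure: the names ``async_cfg``,
``async_cfg_init_default``, and ``async_should_send`` are hypothetical, and
only the packet-in, flow-removed, and port-status defaults for the pre-OF1.4
reason codes are modeled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only; not the OVS definitions. */
enum async_msg_type { AM_PACKET_IN, AM_FLOW_REMOVED, AM_PORT_STATUS, AM_N_TYPES };
enum async_role { ROLE_MASTER_OR_OTHER, ROLE_SLAVE, ROLE_N };

/* Reason codes with their OpenFlow wire values. */
enum { OFPR_NO_MATCH = 0, OFPR_ACTION = 1, OFPR_INVALID_TTL = 2 };
enum { OFPRR_IDLE_TIMEOUT = 0, OFPRR_HARD_TIMEOUT = 1, OFPRR_DELETE = 2 };
enum { OFPPR_ADD = 0, OFPPR_DELETE = 1, OFPPR_MODIFY = 2 };

struct async_cfg {
    /* Bit N set means that reason code N is sent for that role and type. */
    uint32_t reasons[ROLE_N][AM_N_TYPES];
};

/* Fills in the defaults shown in the tables for a new connection. */
void async_cfg_init_default(struct async_cfg *cfg)
{
    memset(cfg, 0, sizeof *cfg);
    cfg->reasons[ROLE_MASTER_OR_OTHER][AM_PACKET_IN] =
        (1u << OFPR_NO_MATCH) | (1u << OFPR_ACTION);
    cfg->reasons[ROLE_MASTER_OR_OTHER][AM_FLOW_REMOVED] =
        (1u << OFPRR_IDLE_TIMEOUT) | (1u << OFPRR_HARD_TIMEOUT)
        | (1u << OFPRR_DELETE);
    cfg->reasons[ROLE_MASTER_OR_OTHER][AM_PORT_STATUS] =
        (1u << OFPPR_ADD) | (1u << OFPPR_DELETE) | (1u << OFPPR_MODIFY);
    /* Slaves receive only port status messages by default. */
    cfg->reasons[ROLE_SLAVE][AM_PORT_STATUS] =
        (1u << OFPPR_ADD) | (1u << OFPPR_DELETE) | (1u << OFPPR_MODIFY);
}

/* Returns true if a message of 'type' with 'reason' passes the table. */
bool async_should_send(const struct async_cfg *cfg, enum async_role role,
                       enum async_msg_type type, unsigned int reason)
{
    return reason < 32 && ((cfg->reasons[role][type] >> reason) & 1u);
}
```

In this model, an ``NXT_SET_ASYNC_CONFIG`` message would overwrite the bitmaps
directly. The earlier checks in this section (``miss_send_len``,
``OFPFF_SEND_FLOW_REM``, and the controller ID) still apply before this table
is consulted.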

``OFPAT_ENQUEUE``
-----------------

The OpenFlow 1.0 specification requires the output port of the
``OFPAT_ENQUEUE`` action to "refer to a valid physical port (i.e. <
``OFPP_MAX``) or ``OFPP_IN_PORT``". Although ``OFPP_LOCAL`` is not less than
``OFPP_MAX``, it is an 'internal' port which can have QoS applied to it in
Linux. Since we allow the ``OFPAT_ENQUEUE`` to apply to 'internal' ports whose
port numbers are less than ``OFPP_MAX``, we interpret ``OFPP_LOCAL`` as a
physical port and support ``OFPAT_ENQUEUE`` on it as well.
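
The resulting rule can be written as a single predicate. This is a minimal
sketch: the helper name ``enqueue_port_is_valid`` is hypothetical, and the
port constants are the OpenFlow 1.0 reserved port numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* OpenFlow 1.0 reserved port numbers. */
#define OFPP_MAX     0xff00
#define OFPP_IN_PORT 0xfff8
#define OFPP_LOCAL   0xfffe

/* Hypothetical helper: accepts the ports that the OF1.0 specification
 * allows for OFPAT_ENQUEUE, plus OFPP_LOCAL as described above. */
bool enqueue_port_is_valid(uint16_t port)
{
    return port < OFPP_MAX || port == OFPP_IN_PORT || port == OFPP_LOCAL;
}
```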

``OFPT_FLOW_MOD``
-----------------

The OpenFlow specification for the behavior of ``OFPT_FLOW_MOD`` is confusing.
The following tables summarize the Open vSwitch implementation of its behavior
in the following categories:

"match on priority"
  Whether the ``flow_mod`` acts only on flows whose priority matches that
  included in the ``flow_mod`` message.

"match on out_port"
  Whether the ``flow_mod`` acts only on flows that output to the ``out_port``
  included in the ``flow_mod`` message (if ``out_port`` is not ``OFPP_NONE``).
  OpenFlow 1.1 and later have a similar feature (not listed separately here)
  for ``out_group``.

"match on flow_cookie"
  Whether the ``flow_mod`` acts only on flows whose ``flow_cookie`` matches an
  optional controller-specified value and mask.

"updates flow_cookie"
  Whether the ``flow_mod`` changes the ``flow_cookie`` of the flow or flows
  that it matches to the ``flow_cookie`` included in the ``flow_mod`` message.

"updates ``OFPFF_SEND_FLOW_REM``"
  Whether the ``flow_mod`` changes the ``OFPFF_SEND_FLOW_REM`` flag of the
  flow or flows that it matches to the setting included in the flags of the
  ``flow_mod`` message.

"honors ``OFPFF_CHECK_OVERLAP``"
  Whether the ``OFPFF_CHECK_OVERLAP`` flag in the ``flow_mod`` is significant.

"updates ``idle_timeout``" and "updates ``hard_timeout``"
  Whether the ``idle_timeout`` and ``hard_timeout`` in the ``flow_mod``,
  respectively, have an effect on the flow or flows matched by the
  ``flow_mod``.

"resets idle timer"
  Whether the ``flow_mod`` resets the per-flow timer that measures how long a
  flow has been idle.

"resets hard timer"
  Whether the ``flow_mod`` resets the per-flow timer that measures how long it
  has been since a flow was modified.

"zeros counters"
  Whether the ``flow_mod`` resets per-flow packet and byte counters to zero.

"may add a new flow"
  Whether the ``flow_mod`` may add a new flow to the flow table. (Obviously
  this is always true for "add" commands but in some OpenFlow versions "modify"
  and "modify-strict" can also add new flows.)

"sends ``flow_removed`` message"
  Whether the ``flow_mod`` generates a ``flow_removed`` message for the flow
  or flows that it affects.

An entry labeled ``yes`` means that the flow mod type does have the indicated
behavior, ``---`` means that it does not, an empty cell means that the property
is not applicable, and other values are explained below the table.

OpenFlow 1.0
~~~~~~~~~~~~

================================ === ====== ====== ====== ======
                                            MODIFY        DELETE
RULE                             ADD MODIFY STRICT DELETE STRICT
================================ === ====== ====== ====== ======
match on ``priority``            yes ---    yes    ---    yes
match on ``out_port``            --- ---    ---    yes    yes
match on ``flow_cookie``         --- ---    ---    ---    ---
match on ``table_id``            --- ---    ---    ---    ---
controller chooses ``table_id``  --- ---    ---
updates ``flow_cookie``          yes yes    yes
updates ``OFPFF_SEND_FLOW_REM``  yes +      +
honors ``OFPFF_CHECK_OVERLAP``   yes +      +
updates ``idle_timeout``         yes +      +
updates ``hard_timeout``         yes +      +
resets idle timer                yes +      +
resets hard timer                yes yes    yes
zeros counters                   yes +      +
may add a new flow               yes yes    yes
sends ``flow_removed`` message   --- ---    ---    %      %
================================ === ====== ====== ====== ======

where:

``+``
  "modify" and "modify-strict" only take these actions when they create a new
  flow, not when they update an existing flow.

``%``
  "delete" and "delete_strict" generate a ``flow_removed`` message if the
  deleted flow or flows have the ``OFPFF_SEND_FLOW_REM`` flag set. (Each
  controller can separately control whether it wants to receive the generated
  messages.)

OpenFlow 1.1
~~~~~~~~~~~~

OpenFlow 1.1 makes these changes:

- The controller must now specify the ``table_id`` of the table that is
  searched for matching flows and into which a flow may be inserted. Behavior
  for a ``table_id`` of 255 is undefined.

- A ``flow_mod``, except an "add", can now match on the ``flow_cookie``.

- When a ``flow_mod`` matches on the ``flow_cookie``, "modify" and
  "modify-strict" never insert a new flow.

================================ === ====== ====== ====== ======
                                            MODIFY        DELETE
RULE                             ADD MODIFY STRICT DELETE STRICT
================================ === ====== ====== ====== ======
match on ``priority``            yes ---    yes    ---    yes
match on ``out_port``            --- ---    ---    yes    yes
match on ``flow_cookie``         --- yes    yes    yes    yes
match on ``table_id``            yes yes    yes    yes    yes
controller chooses ``table_id``  yes yes    yes
updates ``flow_cookie``          yes ---    ---
updates ``OFPFF_SEND_FLOW_REM``  yes +      +
honors ``OFPFF_CHECK_OVERLAP``   yes +      +
updates ``idle_timeout``         yes +      +
updates ``hard_timeout``         yes +      +
resets idle timer                yes +      +
resets hard timer                yes yes    yes
zeros counters                   yes +      +
may add a new flow               yes #      #
sends ``flow_removed`` message   --- ---    ---    %      %
================================ === ====== ====== ====== ======

where:

``+``
  "modify" and "modify-strict" only take these actions when they create a new
  flow, not when they update an existing flow.

``%``
  "delete" and "delete_strict" generate a ``flow_removed`` message if the
  deleted flow or flows have the ``OFPFF_SEND_FLOW_REM`` flag set. (Each
  controller can separately control whether it wants to receive the generated
  messages.)

``#``
  "modify" and "modify-strict" only add a new flow if the ``flow_mod`` does
  not match on any bits of the flow cookie.

OpenFlow 1.2
~~~~~~~~~~~~

OpenFlow 1.2 makes these changes:

- Only "add" commands ever add flows; "modify" and "modify-strict" never do.

- A new flag ``OFPFF_RESET_COUNTS`` now controls whether "modify" and
  "modify-strict" reset counters, whereas previously they never reset counters
  (except when they inserted a new flow).

================================ === ====== ====== ====== ======
                                            MODIFY        DELETE
RULE                             ADD MODIFY STRICT DELETE STRICT
================================ === ====== ====== ====== ======
match on ``priority``            yes ---    yes    ---    yes
match on ``out_port``            --- ---    ---    yes    yes
match on ``flow_cookie``         --- yes    yes    yes    yes
match on ``table_id``            yes yes    yes    yes    yes
controller chooses ``table_id``  yes yes    yes
updates ``flow_cookie``          yes ---    ---
updates ``OFPFF_SEND_FLOW_REM``  yes ---    ---
honors ``OFPFF_CHECK_OVERLAP``   yes ---    ---
updates ``idle_timeout``         yes ---    ---
updates ``hard_timeout``         yes ---    ---
resets idle timer                yes ---    ---
resets hard timer                yes yes    yes
zeros counters                   yes &      &
may add a new flow               yes ---    ---
sends ``flow_removed`` message   --- ---    ---    %      %
================================ === ====== ====== ====== ======

where:

``%``
  "delete" and "delete_strict" generate a ``flow_removed`` message if the
  deleted flow or flows have the ``OFPFF_SEND_FLOW_REM`` flag set. (Each
  controller can separately control whether it wants to receive the generated
  messages.)

``&``
  "modify" and "modify-strict" reset counters if the ``OFPFF_RESET_COUNTS``
  flag is specified.

OpenFlow 1.3
~~~~~~~~~~~~

OpenFlow 1.3 makes these changes:

- Behavior for a ``table_id`` of 255 is now defined, for "delete" and
  "delete-strict" commands, as meaning to delete from all tables. A
  ``table_id`` of 255 is now explicitly invalid for other commands.

- New flags ``OFPFF_NO_PKT_COUNTS`` and ``OFPFF_NO_BYT_COUNTS`` for "add"
  operations.

The table for 1.3 is the same as the one shown above for 1.2.

OpenFlow 1.4
~~~~~~~~~~~~

OpenFlow 1.4 makes these changes:

- Adds the "importance" field to ``flow_mods``, but it does not explicitly
  specify which kinds of ``flow_mods`` set the importance. For consistency,
  Open vSwitch uses the same rule for importance as for ``idle_timeout`` and
  ``hard_timeout``, that is, only an "ADD" ``flow_mod`` sets the importance.
  (This issue has been filed with the ONF as EXT-496.)

.. TODO(stephenfin) Link to EXT-496

- Adds an eviction mechanism that automatically deletes entries of lower
  importance to make space for newer entries.

OpenFlow 1.4 Bundles
--------------------

Open vSwitch makes all flow table modifications atomically, i.e., any datapath
packet only sees flow table configurations either before or after any change
made by any ``flow_mod``. For example, if a controller removes all flows with
a single OpenFlow ``flow_mod``, no packet sees an intermediate version of the
OpenFlow pipeline where only some of the flows have been deleted.

It should be noted that Open vSwitch caches datapath flows, and that the cached
flows are *NOT* flushed immediately when a flow table changes. Instead, the
datapath flows are revalidated against the new flow table as soon as possible,
and usually within one second of the modification. This design amortizes the
cost of datapath cache flushing across multiple flow table changes, and has a
significant performance effect during simultaneous heavy flow table churn and
high traffic load. This means that different cached datapath flows may have
been computed based on different flow table configurations, but each of the
datapath flows is guaranteed to have been computed over a coherent view of the
flow tables, as described above.

With OpenFlow 1.4 bundles this atomicity can be extended across an arbitrary
set of ``flow_mod`` messages. Bundles are supported for ``flow_mod`` and
``port_mod`` messages only. For ``flow_mod``, both ``atomic`` and ``ordered``
bundle flags are trivially supported, as all bundled messages are executed in
the order they were added and all flow table modifications are now atomic to
the datapath. Port mods may not appear in atomic bundles, as port status
modifications are not atomic.

To support bundles, ovs-ofctl has a ``--bundle`` option that makes the
flow mod commands (``add-flow``, ``add-flows``, ``mod-flows``, ``del-flows``,
and ``replace-flows``) use an OpenFlow 1.4 bundle to apply the
modifications as a single atomic transaction. If any of the flow mods
in a transaction fail, none of them are executed. All flow mods in a
bundle appear to datapath lookups simultaneously.

Furthermore, ovs-ofctl ``add-flow`` and ``add-flows`` commands now accept
arbitrary flow mods as an input by allowing the flow specification to
start with an explicit ``add``, ``modify``, ``modify_strict``, ``delete``, or
``delete_strict`` keyword. A missing keyword is treated as ``add``, so
this is fully backwards compatible. With the new ``--bundle`` option
all the flow mods are executed as a single atomic transaction using an
OpenFlow 1.4 bundle. Without the ``--bundle`` option the flow mods are
executed in order up to the first failing ``flow_mod``, and in case of an
error the earlier successful ``flow_mod`` calls are not rolled back.

``OFPT_PACKET_IN``
------------------

The OpenFlow 1.1 specification for ``OFPT_PACKET_IN`` is confusing. The
definition in OF1.1 ``openflow.h`` is[*]:

::

    /* Packet received on port (datapath -> controller). */
    struct ofp_packet_in {
        struct ofp_header header;
        uint32_t buffer_id;     /* ID assigned by datapath. */
        uint32_t in_port;       /* Port on which frame was received. */
        uint32_t in_phy_port;   /* Physical Port on which frame was received. */
        uint16_t total_len;     /* Full length of frame. */
        uint8_t reason;         /* Reason packet is being sent (one of OFPR_*) */
        uint8_t table_id;       /* ID of the table that was looked up */
        uint8_t data[0];        /* Ethernet frame, halfway through 32-bit word,
                                   so the IP header is 32-bit aligned. The
                                   amount of data is inferred from the length
                                   field in the header. Because of padding,
                                   offsetof(struct ofp_packet_in, data) ==
                                   sizeof(struct ofp_packet_in) - 2. */
    };
    OFP_ASSERT(sizeof(struct ofp_packet_in) == 24);

The confusing part is the comment on the ``data[]`` member. This comment is a
leftover from OF1.0 ``openflow.h``, in which the comment was correct:
``sizeof(struct ofp_packet_in)`` is 20 in OF1.0 and ``offsetof(struct
ofp_packet_in, data)`` is 18. When OF1.1 was written, the structure members
were changed but the comment was carelessly not updated, and the comment became
wrong: ``sizeof(struct ofp_packet_in)`` and ``offsetof(struct ofp_packet_in,
data)`` are both 24 in OF1.1.
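
These sizes can be checked mechanically. The sketch below declares both
layouts side by side. The struct names ``ofp10_packet_in`` and
``ofp11_packet_in`` are chosen here for illustration, a C99 flexible array
member stands in for ``data[0]``, and the asserted values assume a typical ABI
in which ``uint32_t`` is 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

struct ofp_header {
    uint8_t version;
    uint8_t type;
    uint16_t length;
    uint32_t xid;
};

/* OF1.0 layout: 18 bytes of members, padded to 20 by 32-bit alignment. */
struct ofp10_packet_in {
    struct ofp_header header;
    uint32_t buffer_id;
    uint16_t total_len;
    uint16_t in_port;
    uint8_t reason;
    uint8_t pad;
    uint8_t data[];
};

/* OF1.1 layout: the members already end on a 32-bit boundary. */
struct ofp11_packet_in {
    struct ofp_header header;
    uint32_t buffer_id;
    uint32_t in_port;
    uint32_t in_phy_port;
    uint16_t total_len;
    uint8_t reason;
    uint8_t table_id;
    uint8_t data[];
};

/* The stale comment holds for OF1.0: offsetof(data) == sizeof - 2. */
_Static_assert(sizeof(struct ofp10_packet_in) == 20, "OF1.0 size");
_Static_assert(offsetof(struct ofp10_packet_in, data) == 18, "OF1.0 offset");

/* It does not hold for OF1.1: both values are 24. */
_Static_assert(sizeof(struct ofp11_packet_in) == 24, "OF1.1 size");
_Static_assert(offsetof(struct ofp11_packet_in, data) == 24, "OF1.1 offset");
```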

That leaves the question of how to implement ``ofp_packet_in`` in OF1.1. The
OpenFlow reference implementation for OF1.1 does not include any padding, that
is, the first byte of the encapsulated frame immediately follows the
``table_id`` member without a gap. Open vSwitch therefore implements it the
same way for compatibility.

For an earlier discussion, please see the thread archived at:
https://mailman.stanford.edu/pipermail/openflow-discuss/2011-August/002604.html

[*] The quoted definition is directly from OF1.1. Definitions used inside OVS
omit the 8-byte ``ofp_header`` members, so the sizes in this discussion are
8 bytes larger than those declared in OVS header files.

VLAN Matching
-------------

The 802.1Q VLAN header causes more trouble than any other 4 bytes in
networking. More specifically, three versions of OpenFlow and Open vSwitch
have among them four different ways to match the contents and presence of the
VLAN header. The following table describes how each version works.

======== ============= =============== =============== ================
Match    NXM           OF1.0           OF1.1           OF1.2
======== ============= =============== =============== ================
``[1]``  ``0000/0000`` ``????/1,??/?`` ``????/1,??/?`` ``0000/0000,--``
``[2]``  ``0000/ffff`` ``ffff/0,??/?`` ``ffff/0,??/?`` ``0000/ffff,--``
``[3]``  ``1xxx/1fff`` ``0xxx/0,??/1`` ``0xxx/0,??/1`` ``1xxx/ffff,--``
``[4]``  ``z000/f000`` ``????/1,0y/0`` ``fffe/0,0y/0`` ``1000/1000,0y``
``[5]``  ``zxxx/ffff`` ``0xxx/0,0y/0`` ``0xxx/0,0y/0`` ``1xxx/ffff,0y``
``[6]``  ``0000/0fff`` ``<none>``      ``<none>``      ``<none>``
``[7]``  ``0000/f000`` ``<none>``      ``<none>``      ``<none>``
``[8]``  ``0000/efff`` ``<none>``      ``<none>``      ``<none>``
``[9]``  ``1001/1001`` ``<none>``      ``<none>``      ``1001/1001,--``
``[10]`` ``3000/3000`` ``<none>``      ``<none>``      ``<none>``
``[11]`` ``1000/1000`` ``<none>``      ``fffe/0,??/1`` ``1000/1000,--``
======== ============= =============== =============== ================

where:

Match:
  See the list below.

NXM:
  ``xxxx/yyyy`` means ``NXM_OF_VLAN_TCI_W`` with value ``xxxx`` and mask
  ``yyyy``. A mask of ``0000`` is equivalent to omitting
  ``NXM_OF_VLAN_TCI(_W)``, a mask of ``ffff`` is equivalent to
  ``NXM_OF_VLAN_TCI``.

OF1.0, OF1.1:
  ``wwww/x,yy/z`` means ``dl_vlan`` ``wwww``, ``OFPFW_DL_VLAN`` ``x``,
  ``dl_vlan_pcp`` ``yy``, and ``OFPFW_DL_VLAN_PCP`` ``z``. If
  ``OFPFW_DL_VLAN`` or ``OFPFW_DL_VLAN_PCP`` is 1, the corresponding field
  value is wildcarded, otherwise it is matched. ``?`` means that the given
  bits are ignored (their conventional values are ``0000/x,00/0`` in OF1.0,
  ``0000/x,00/1`` in OF1.1; ``x`` is never ignored). ``<none>`` means that the
  given match is not supported.

OF1.2:
  ``xxxx/yyyy,zz`` means ``OXM_OF_VLAN_VID_W`` with value ``xxxx`` and mask
  ``yyyy``, and ``OXM_OF_VLAN_PCP`` (which is not maskable) with value ``zz``.
  A mask of ``0000`` is equivalent to omitting ``OXM_OF_VLAN_VID(_W)``, a mask
  of ``ffff`` is equivalent to ``OXM_OF_VLAN_VID``. ``--`` means that
  ``OXM_OF_VLAN_PCP`` is omitted. ``<none>`` means that the given match is not
  supported.

The matches are:

``[1]``
  Matches any packet, that is, one without an 802.1Q header or with an 802.1Q
  header with any TCI value.

``[2]``
  Matches only packets without an 802.1Q header.

  NXM:
    Any match with ``vlan_tci == 0`` and ``(vlan_tci_mask & 0x1000) != 0`` is
    equivalent to the one listed in the table.

  OF1.0:
    The spec doesn't define behavior if ``dl_vlan`` is set to ``0xffff`` and
    ``OFPFW_DL_VLAN_PCP`` is not set.

  OF1.1:
    The spec says explicitly to ignore ``dl_vlan_pcp`` when ``dl_vlan`` is set
    to ``0xffff``.

  OF1.2:
    The spec doesn't say what should happen if ``vlan_vid == 0`` and
    ``(vlan_vid_mask & 0x1000) != 0`` but ``vlan_vid_mask != 0x1000``, but it
    would be straightforward to also interpret as ``[2]``.

``[3]``
  Matches only packets that have an 802.1Q header with VID ``xxx`` (and any
  PCP).

``[4]``
  Matches only packets that have an 802.1Q header with PCP ``y`` (and any VID).

  NXM:
    ``z`` is ``(y << 1) | 1``.

  OF1.0:
    The spec isn't very clear, but OVS implements it this way.

  OF1.2:
    Presumably other masks such that ``(vlan_vid_mask & 0x1fff) == 0x1000``
    would also work, but the spec doesn't define their behavior.

``[5]``
  Matches only packets that have an 802.1Q header with VID ``xxx`` and PCP
  ``y``.

  NXM:
    ``z`` is ``((y << 1) | 1)``.

  OF1.2:
    Presumably other masks such that ``(vlan_vid_mask & 0x1fff) == 0x1fff``
    would also work.

``[6]``
  Matches packets with no 802.1Q header or with an 802.1Q header with a VID of
  0. Only possible with NXM.

``[7]``
  Matches packets with no 802.1Q header or with an 802.1Q header with a PCP of
  0. Only possible with NXM.

``[8]``
  Matches packets with no 802.1Q header or with an 802.1Q header with both VID
  and PCP of 0. Only possible with NXM.

``[9]``
  Matches only packets that have an 802.1Q header with an odd-numbered VID (and
  any PCP). Only possible with NXM and OF1.2. (This is just an example; one
  can match on any desired VID bit pattern.)

``[10]``
  Matches only packets that have an 802.1Q header with an odd-numbered PCP (and
  any VID). Only possible with NXM. (This is just an example; one can match
  on any desired PCP bit pattern.)

``[11]``
  Matches any packet with an 802.1Q header, regardless of VID or PCP.

Additional notes:

OF1.2:
  The top three bits of ``OXM_OF_VLAN_VID`` are fixed to zero, so bits 13, 14,
  and 15 in the masks listed in the table may be set to arbitrary values, as
  long as the corresponding value bits are also zero. The suggested ``ffff``
  mask for [2], [3], and [5] allows a shorter OXM representation (the mask is
  omitted) than the minimal ``1fff`` mask.
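
As a worked example of the NXM column, the value/mask pairs for matches
``[3]``, ``[4]``, and ``[5]`` can be computed from a VID and a PCP. This is a
minimal sketch: ``struct tci_match`` and the ``nxm_match_*`` helpers are
hypothetical names, and ``VLAN_CFI`` denotes bit 12 of the TCI, which NXM uses
to indicate that an 802.1Q header is present.

```c
#include <stdint.h>

#define VLAN_CFI 0x1000   /* Bit 12: set when an 802.1Q header is present. */

/* Hypothetical container for an NXM_OF_VLAN_TCI(_W) value and mask. */
struct tci_match {
    uint16_t value;
    uint16_t mask;
};

/* Match [3]: 802.1Q header with the given VID, any PCP (1xxx/1fff). */
struct tci_match nxm_match_vid(uint16_t vid)
{
    struct tci_match m = { (uint16_t) (VLAN_CFI | (vid & 0x0fff)), 0x1fff };
    return m;
}

/* Match [4]: 802.1Q header with the given PCP, any VID (z000/f000).
 * The top nibble z equals (pcp << 1) | 1. */
struct tci_match nxm_match_pcp(uint8_t pcp)
{
    struct tci_match m = { (uint16_t) (((pcp & 7) << 13) | VLAN_CFI), 0xf000 };
    return m;
}

/* Match [5]: 802.1Q header with the given VID and PCP (zxxx/ffff). */
struct tci_match nxm_match_vid_pcp(uint16_t vid, uint8_t pcp)
{
    struct tci_match m = {
        (uint16_t) (((pcp & 7) << 13) | VLAN_CFI | (vid & 0x0fff)), 0xffff
    };
    return m;
}
```

For example, VID ``0x123`` with PCP 5 gives ``z = (5 << 1) | 1 = 0xb``, so
match ``[5]`` is ``b123/ffff``.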

Flow Cookies
------------

OpenFlow 1.0 and later versions have the concept of a "flow cookie", which is a
64-bit integer value attached to each flow. The treatment of the flow cookie
has varied greatly across OpenFlow versions, however.

In OpenFlow 1.0:

- ``OFPFC_ADD`` set the cookie in the flow that it added.

- ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` updated the cookie for the flow
  or flows that it modified.

- ``OFPST_FLOW`` messages included the flow cookie.

- ``OFPT_FLOW_REMOVED`` messages reported the cookie of the flow that was
  removed.

OpenFlow 1.1 made the following changes:

- Flow mod operations ``OFPFC_MODIFY``, ``OFPFC_MODIFY_STRICT``,
  ``OFPFC_DELETE``, and ``OFPFC_DELETE_STRICT``, plus flow stats requests and
  aggregate stats requests, gained the ability to match on flow cookies with an
  arbitrary mask.

- ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` were changed to add a new flow,
  in the case of no match, only if the flow table modification operation did
  not match on the cookie field. (In OpenFlow 1.0, modify operations always
  added a new flow when there was no match.)

- ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` no longer updated flow cookies.

OpenFlow 1.2 made the following changes:

- ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` were changed to never add a new
  flow, regardless of whether the flow cookie was used for matching.

Open vSwitch support for OpenFlow 1.0 implements the OpenFlow 1.0 behavior with
the following extensions:

- An NXM extension field ``NXM_NX_COOKIE(_W)`` allows the NXM versions of
  ``OFPFC_MODIFY``, ``OFPFC_MODIFY_STRICT``, ``OFPFC_DELETE``, and
  ``OFPFC_DELETE_STRICT`` ``flow_mod`` calls, plus flow stats requests and
  aggregate stats requests, to match on flow cookies with arbitrary masks.
  This is much like the equivalent OpenFlow 1.1 feature.

- Like OpenFlow 1.1, ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` add a new
  flow if there is no match and the mask is zero (or not given).

- The ``cookie`` field in ``OFPT_FLOW_MOD`` and ``NXT_FLOW_MOD`` messages is
  used as the cookie value for ``OFPFC_ADD`` commands, as described in OpenFlow
  1.0. For ``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` commands, the
  ``cookie`` field is used as a new cookie for flows that match unless it is
  ``UINT64_MAX``, in which case the flow's cookie is not updated.

- ``NXT_PACKET_IN`` (the Nicira extended version of ``OFPT_PACKET_IN``) reports
  the cookie of the rule that generated the packet, or all-1-bits if no rule
  generated the packet. (Older versions of OVS used all-0-bits instead of
  all-1-bits.)

The following table shows the handling of different protocols when receiving
``OFPFC_MODIFY`` and ``OFPFC_MODIFY_STRICT`` messages. A mask of 0 indicates
either an explicit mask of zero or an implicit one by not specifying the
``NXM_NX_COOKIE(_W)`` field.

============== ====== ====== ============= =============
               Match  Update Add on miss   Add on miss
               cookie cookie mask!=0       mask==0
============== ====== ====== ============= =============
OpenFlow 1.0   no     yes    (add on miss) (add on miss)
OpenFlow 1.1   yes    no     no            yes
OpenFlow 1.2   yes    no     no            no
NXM            yes    yes\*  no            yes
============== ====== ====== ============= =============

\* Updates the flow's cookie unless the ``cookie`` field is ``UINT64_MAX``.
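
The table can also be read as two small decision functions. This is a sketch
that mirrors the table rows only. The ``flow_proto`` enum and both function
names are hypothetical, and the OpenFlow 1.0 row is encoded as written in the
table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol selector, one value per table row. */
enum flow_proto { PROTO_OF10, PROTO_OF11, PROTO_OF12, PROTO_NXM };

/* Whether a "modify" or "modify-strict" that matches no flow adds one. */
bool modify_adds_on_miss(enum flow_proto proto, uint64_t cookie_mask)
{
    switch (proto) {
    case PROTO_OF10:
        return true;              /* Cannot match on cookie; always adds. */
    case PROTO_OF11:
    case PROTO_NXM:
        return cookie_mask == 0;  /* Adds only when not matching on cookie. */
    case PROTO_OF12:
        return false;             /* Only "add" commands add flows. */
    }
    return false;
}

/* Whether a "modify" or "modify-strict" rewrites the cookie of a flow
 * that it matches. */
bool modify_updates_cookie(enum flow_proto proto, uint64_t msg_cookie)
{
    switch (proto) {
    case PROTO_OF10:
        return true;
    case PROTO_NXM:
        return msg_cookie != UINT64_MAX;  /* UINT64_MAX: leave unchanged. */
    case PROTO_OF11:
    case PROTO_OF12:
        return false;
    }
    return false;
}
```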

Multiple Table Support
----------------------

OpenFlow 1.0 has only rudimentary support for multiple flow tables. Notably,
OpenFlow 1.0 does not allow the controller to specify the flow table to which a
flow is to be added. Open vSwitch adds an extension for this purpose, which is
enabled on a per-OpenFlow connection basis using the ``NXT_FLOW_MOD_TABLE_ID``
message. When the extension is enabled, the upper 8 bits of the ``command``
member in an ``OFPT_FLOW_MOD`` or ``NXT_FLOW_MOD`` message designate the table
to which a flow is to be added.
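
The packing of the table ID into ``command`` can be sketched as follows. The
helper names are hypothetical, and the values are shown in host byte order; on
the wire the 16-bit ``command`` field is big-endian like other OpenFlow
fields.

```c
#include <stdint.h>

/* OpenFlow 1.0 flow_mod commands. */
enum {
    OFPFC_ADD = 0,
    OFPFC_MODIFY = 1,
    OFPFC_MODIFY_STRICT = 2,
    OFPFC_DELETE = 3,
    OFPFC_DELETE_STRICT = 4
};

/* Hypothetical helper: places 'table_id' in the upper 8 bits of the
 * 16-bit command field and the command itself in the lower 8 bits. */
uint16_t flow_mod_pack_command(uint8_t table_id, uint8_t command)
{
    return (uint16_t) ((table_id << 8) | command);
}

/* Hypothetical helpers that split the field again. */
uint8_t flow_mod_table_id(uint16_t field)
{
    return (uint8_t) (field >> 8);
}

uint8_t flow_mod_command(uint16_t field)
{
    return (uint8_t) (field & 0xff);
}
```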

The Open vSwitch software switch implementation offers 255 flow tables. On
packet ingress, only the first flow table (table 0) is searched, and the
contents of the remaining tables are not considered in any way. Tables other
than table 0 only come into play when an ``NXAST_RESUBMIT_TABLE`` action
specifies another table to search.

Tables 128 and above are reserved for use by the switch itself. Controllers
should use only tables 0 through 127.

``OFPTC_*`` Table Configuration
-------------------------------

This section covers the history of the ``OFPTC_*`` table configuration bits
across OpenFlow versions.

OpenFlow 1.0 flow tables had fixed configurations.

OpenFlow 1.1 enabled controllers to configure behavior upon flow table miss and
added the ``OFPTC_MISS_*`` constants for that purpose. ``OFPTC_*`` did not
control anything else but it was nevertheless conceptualized as a set of
bit-fields instead of an enum. OF1.1 added the ``OFPT_TABLE_MOD`` message to
set ``OFPTC_MISS_*`` for a flow table and added the ``config`` field to the
``OFPST_TABLE`` reply to report the current setting.

OpenFlow 1.2 did not change anything in this regard.

OpenFlow 1.3 switched to another means of changing flow table miss behavior and
deprecated ``OFPTC_MISS_*`` without adding any more ``OFPTC_*`` constants.
This meant that ``OFPT_TABLE_MOD`` now had no purpose at all, but OF1.3 kept it
around "for backward compatibility with older and newer versions of the
specification." At the same time, OF1.3 introduced a new message
``OFPMP_TABLE_FEATURES`` that included a field ``config`` documented as
reporting the ``OFPTC_*`` values set with ``OFPT_TABLE_MOD``; of course this
served no real purpose because no ``OFPTC_*`` values are defined. OF1.3 did
remove the ``OFPTC_*`` field from ``OFPMP_TABLE`` (previously named
``OFPST_TABLE``).

OpenFlow 1.4 defined two new ``OFPTC_*`` constants, ``OFPTC_EVICTION`` and
``OFPTC_VACANCY_EVENTS``, using bits that did not overlap with ``OFPTC_MISS_*``
even though those bits had not been defined since OF1.2. ``OFPT_TABLE_MOD``
still controlled these settings. The field for ``OFPTC_*`` values in
``OFPMP_TABLE_FEATURES`` was renamed from ``config`` to ``capabilities`` and
documented as reporting the flags that are supported in an ``OFPT_TABLE_MOD``
message. The ``OFPMP_TABLE_DESC`` message newly added in OF1.4 reported the
``OFPTC_*`` setting.

OpenFlow 1.5 did not change anything in this regard.
742
.. list-table:: Revisions
   :header-rows: 1

   * - OpenFlow
     - ``OFPTC_*`` flags
     - ``TABLE_MOD``
     - Statistics
     - ``TABLE_FEATURES``
     - ``TABLE_DESC``
   * - OF1.0
     - none
     - no (\*)(+)
     - no (\*)
     - nothing (\*)(+)
     - no (\*)(+)
   * - OF1.1/1.2
     - ``MISS_*``
     - yes
     - yes
     - nothing (+)
     - no (+)
   * - OF1.3
     - none
     - yes (\*)
     - no (\*)
     - config (\*)
     - no (\*)(+)
   * - OF1.4/1.5
     - ``EVICTION``/``VACANCY_EVENTS``
     - yes
     - no
     - capabilities
     - yes

where:

OpenFlow:
  The OpenFlow version(s).

``OFPTC_*`` flags:
  The ``OFPTC_*`` flags defined in those versions.

``TABLE_MOD``:
  Whether ``OFPT_TABLE_MOD`` can modify ``OFPTC_*`` flags.

Statistics:
  Whether ``OFPST_TABLE``/``OFPMP_TABLE`` reports the ``OFPTC_*`` flags.

``TABLE_FEATURES``:
  What ``OFPMP_TABLE_FEATURES`` reports (if it exists): either the current
  configuration or the switch's capabilities.

``TABLE_DESC``:
  Whether ``OFPMP_TABLE_DESC`` reports the current configuration.

(\*): Nothing to report/change anyway.

(+): No such message.
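
For controller authors who need to branch on the negotiated version, the
revision table can be encoded as data. The following is only a sketch: the
``OfptcSupport`` type, the ``REVISIONS`` mapping, and
``config_report_message()`` are hypothetical names invented for illustration
and are not part of Open vSwitch.

```python
from typing import NamedTuple, Optional, Tuple


class OfptcSupport(NamedTuple):
    """One row of the "Revisions" table (field names are illustrative)."""
    flags: Tuple[str, ...]         # OFPTC_* flags defined in this version
    table_mod: bool                # OFPT_TABLE_MOD exists and can set flags
    table_stats: bool              # OFPST_TABLE/OFPMP_TABLE reports flags
    table_features: Optional[str]  # what OFPMP_TABLE_FEATURES reports, if any
    table_desc: bool               # OFPMP_TABLE_DESC reports current config


_OF11 = OfptcSupport(("MISS_*",), True, True, None, False)
_OF14 = OfptcSupport(("EVICTION", "VACANCY_EVENTS"), True, False,
                     "capabilities", True)

REVISIONS = {
    "1.0": OfptcSupport((), False, False, None, False),
    "1.1": _OF11,
    "1.2": _OF11,
    "1.3": OfptcSupport((), True, False, "config", False),
    "1.4": _OF14,
    "1.5": _OF14,
}


def config_report_message(version: str) -> Optional[str]:
    """Name the message that reports the current OFPTC_* setting, if any."""
    row = REVISIONS[version]
    if row.table_desc:
        return "OFPMP_TABLE_DESC"
    if row.table_features == "config":
        return "OFPMP_TABLE_FEATURES"
    if row.table_stats:
        return "OFPST_TABLE"
    return None
```

Note that for OF1.3 the sketch names ``OFPMP_TABLE_FEATURES`` even though, as
described above, no ``OFPTC_*`` values are defined in that version, so there is
nothing meaningful to read back.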
801
IPv6
----

Open vSwitch supports stateless handling of IPv6 packets. Flows can be written
to support matching TCP, UDP, and ICMPv6 headers within an IPv6 packet. Deeper
matching of some Neighbor Discovery messages is also supported.

IPv6 was not designed to interact well with middle-boxes. This, combined with
Open vSwitch's stateless nature, has affected the processing of IPv6 traffic,
as detailed below.
812
Extension Headers
~~~~~~~~~~~~~~~~~
815
816The base IPv6 header is incredibly simple with the intention of only containing
817information relevant for routing packets between two endpoints. IPv6 relies
818heavily on the use of extension headers to provide any other functionality.
819Unfortunately, the extension headers were designed in such a way that it is
820impossible to move to the next header (including the layer-4 payload) unless
821the current header is understood.
822
823Open vSwitch will process the following extension headers and continue to the
824next header:
825
826- Fragment (see the next section)
827- AH (Authentication Header)
828- Hop-by-Hop Options
829- Routing
830- Destination Options
831
832When a header is encountered that is not in that list, it is considered
833"terminal". A terminal header's IPv6 protocol value is stored in ``nw_proto``
834for matching purposes. If a terminal header is TCP, UDP, or ICMPv6, the packet
835will be further processed in an attempt to extract layer-4 information.
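
The header walk described above can be sketched as follows. This is a
simplified illustration, not the Open vSwitch parser:
``find_terminal_header()`` is a hypothetical helper, the protocol numbers are
the standard IANA values for the listed headers, and the sketch deliberately
ignores the fragment offset (later fragments are treated separately, as
described in the next section).

```python
# IANA protocol numbers for the extension headers that are skipped over.
HOPOPTS, ROUTING, FRAGMENT, AH, DSTOPTS = 0, 43, 44, 51, 60
SKIPPABLE = {HOPOPTS, ROUTING, FRAGMENT, AH, DSTOPTS}


def find_terminal_header(next_header, payload):
    """Walk the extension header chain that follows the base IPv6 header.

    next_header is the Next Header value from the base header and payload
    holds the bytes after the base header.  Returns (nw_proto, offset) for
    the terminal header, or None if the chain runs past the available bytes.
    """
    offset = 0
    while next_header in SKIPPABLE:
        # Every extension header occupies at least 8 octets.
        if offset + 8 > len(payload):
            return None
        following = payload[offset]
        if next_header == FRAGMENT:
            length = 8                                # fixed size
        elif next_header == AH:
            length = (payload[offset + 1] + 2) * 4    # 4-octet units, minus 2
        else:
            length = (payload[offset + 1] + 1) * 8    # 8-octet units, minus 1
        next_header = following
        offset += length
        if offset > len(payload):
            return None
    return next_header, offset
```

For example, a Hop-by-Hop Options header followed by a Destination Options
header and then TCP yields ``nw_proto`` 6, whereas an unrecognized header such
as ESP (50) is itself terminal and its value is returned unchanged.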
836
Fragments
~~~~~~~~~
839
840IPv6 requires that every link in the internet have an MTU of 1280 octets or
841greater (RFC 2460). As such, a terminal header (as described above in
842"Extension Headers") in the first fragment should generally be reachable. In
843this case, the terminal header's IPv6 protocol type is stored in the
844``nw_proto`` field for matching purposes. If a terminal header cannot be found
845in the first fragment (one with a fragment offset of zero), the ``nw_proto``
846field is set to 0. Subsequent fragments (those with a non-zero fragment
847offset) have the ``nw_proto`` field set to the IPv6 protocol type for fragments
848(44).
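
The three cases above reduce to a small decision function. The sketch below
assumes a hypothetical helper, ``nw_proto_for_fragment()``, that is given the
fragment offset and the terminal header's protocol (or ``None`` if no terminal
header could be found in the fragment); it is not taken from the Open vSwitch
sources.

```python
IPPROTO_FRAGMENT = 44  # IPv6 protocol type for the Fragment header


def nw_proto_for_fragment(frag_offset, terminal_proto):
    """Return the value stored in nw_proto for a fragmented IPv6 packet.

    frag_offset is the fragment offset from the Fragment header.
    terminal_proto is the protocol of the terminal header found in this
    fragment, or None if it could not be reached.
    """
    if frag_offset != 0:
        # Later fragments: the terminal header is not present at all.
        return IPPROTO_FRAGMENT
    if terminal_proto is None:
        # First fragment, but the terminal header was not reachable.
        return 0
    return terminal_proto
```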
849
Jumbograms
~~~~~~~~~~

An IPv6 jumbogram (RFC 2675) is a packet containing a payload longer than
65,535 octets. Jumbograms are relevant only in subnets with a link MTU greater
than 65,575 octets (65,535 plus the 40-octet IPv6 header), and they need not be
supported on nodes that do not connect to links with such large MTUs.
Currently, Open vSwitch doesn't process jumbograms.
858
In-Band Control
---------------

Motivation
~~~~~~~~~~
864
865An OpenFlow switch must establish and maintain a TCP network connection to its
866controller. There are two basic ways to categorize the network that this
867connection traverses: either it is completely separate from the one that the
868switch is otherwise controlling, or its path may overlap the network that the
869switch controls. We call the former case "out-of-band control", the latter
870case "in-band control".
871
872Out-of-band control has the following benefits:
873
874- Simplicity: Out-of-band control slightly simplifies the switch
875 implementation.
876
877- Reliability: Excessive switch traffic volume cannot interfere with control
878 traffic.
879
880- Integrity: Machines not on the control network cannot impersonate a switch or
881 a controller.
882
883- Confidentiality: Machines not on the control network cannot snoop on control
884 traffic.
885
886In-band control, on the other hand, has the following advantages:
887
888- No dedicated port: There is no need to dedicate a physical switch port to
889 control, which is important on switches that have few ports (e.g. wireless
890 routers, low-end embedded platforms).
891
892- No dedicated network: There is no need to build and maintain a separate
893 control network. This is important in many environments because it reduces
894 proliferation of switches and wiring.
895
896Open vSwitch supports both out-of-band and in-band control. This section
897describes the principles behind in-band control. See the description of the
898Controller table in ovs-vswitchd.conf.db(5) to configure OVS for in-band
899control.
900
Principles
~~~~~~~~~~
903
904The fundamental principle of in-band control is that an OpenFlow switch must
905recognize and switch control traffic without involving the OpenFlow controller.
906All the details of implementing in-band control are special cases of this
907principle.
908
909The rationale for this principle is simple. If the switch does not handle
910in-band control traffic itself, then it will be caught in a contradiction: it
911must contact the controller, but it cannot, because only the controller can set
912up the flows that are needed to contact the controller.
913
914The following points describe important special cases of this principle.
915
916- In-band control must be implemented regardless of whether the switch is
917 connected.
918
919 It is tempting to implement the in-band control rules only when the switch is
920 not connected to the controller, using the reasoning that the controller
921 should have complete control once it has established a connection with the
922 switch.
923
924 This does not work in practice. Consider the case where the switch is
925 connected to the controller. Occasionally it can happen that the controller
926 forgets or otherwise needs to obtain the MAC address of the switch. To do
927 so, the controller sends a broadcast ARP request. A switch that implements
928 the in-band control rules only when it is disconnected will then send an
929 ``OFPT_PACKET_IN`` message up to the controller. The controller will be
930 unable to respond, because it does not know the MAC address of the switch.
931 This is a deadlock situation that can only be resolved by the switch noticing
932 that its connection to the controller has hung and reconnecting.
933
934- In-band control must override flows set up by the controller.
935
936 It is reasonable to assume that flows set up by the OpenFlow controller
937 should take precedence over in-band control, on the basis that the controller
938 should be in charge of the switch.
939
940 Again, this does not work in practice. Reasonable controller implementations
941 may set up a "last resort" fallback rule that wildcards every field and,
942 e.g., sends it up to the controller or discards it. If a controller does
943 that, then it will isolate itself from the switch.
944
945- The switch must recognize all control traffic.
946
947 The fundamental principle of in-band control states, in part, that a switch
948 must recognize control traffic without involving the OpenFlow controller.
949 More specifically, the switch must recognize *all* control traffic. "False
950 negatives", that is, packets that constitute control traffic but that the
951 switch does not recognize as control traffic, lead to control traffic storms.
952
  Consider an OpenFlow switch that only recognizes control packets sent to or
  from that switch. Now suppose that two switches of this type, named A and B,
  are connected to ports on an Ethernet hub (not a switch) and that an OpenFlow
  controller is connected to a third hub port. In this setup, control traffic
  sent by switch A will be seen by switch B, which will send it to the
  controller as part of an ``OFPT_PACKET_IN`` message. Switch A will then see
  the ``OFPT_PACKET_IN`` message's packet, re-encapsulate it in another
  ``OFPT_PACKET_IN``, and send it to the controller. Switch B will then see
  that ``OFPT_PACKET_IN``, and so on in an infinite loop.
962
963 Incidentally, the consequences of "false positives", where packets that are
964 not control traffic are nevertheless recognized as control traffic, are much
965 less severe. The controller will not be able to control their behavior, but
966 the network will remain in working order. False positives do constitute a
967 security problem.
968
969- The switch should use echo-requests to detect disconnection.
970
971 TCP will notice that a connection has hung, but this can take a considerable
972 amount of time. For example, with default settings the Linux kernel TCP
973 implementation will retransmit for between 13 and 30 minutes, depending on
974 the connection's retransmission timeout, according to kernel documentation.
975 This is far too long for a switch to be disconnected, so an OpenFlow switch
976 should implement its own connection timeout. OpenFlow ``OFPT_ECHO_REQUEST``
977 messages are the best way to do this, since they test the OpenFlow connection
978 itself.
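
One way to picture the echo-request principle is as a small inactivity
timer. The sketch below is an illustration under stated assumptions, not the
Open vSwitch implementation: the ``InactivityProbe`` class, its method names,
and the single ``probe_interval`` parameter are all hypothetical. The caller
passes in the current time explicitly so that the logic is easy to test.

```python
class InactivityProbe:
    """Decide when to send OFPT_ECHO_REQUEST and when to give up."""

    IDLE, PROBING, DISCONNECTED = "idle", "probing", "disconnected"

    def __init__(self, probe_interval, now):
        self.probe_interval = probe_interval
        self.state = self.IDLE
        self.last_activity = now
        self.probe_sent = None

    def on_receive(self, now):
        """Call when any message, including an echo reply, arrives."""
        if self.state != self.DISCONNECTED:
            self.last_activity = now
            self.state = self.IDLE

    def poll(self, now):
        """Return "send_echo_request", "disconnect", or None."""
        if (self.state == self.IDLE
                and now - self.last_activity >= self.probe_interval):
            # The connection has been silent: probe it.
            self.state = self.PROBING
            self.probe_sent = now
            return "send_echo_request"
        if (self.state == self.PROBING
                and now - self.probe_sent >= self.probe_interval):
            # No reply to the probe: treat the connection as hung.
            self.state = self.DISCONNECTED
            return "disconnect"
        return None
```

With a 5-second interval, a hung connection is detected after at most about 10
seconds of silence, rather than the many minutes that TCP retransmission alone
may take.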
979
Implementation
~~~~~~~~~~~~~~

This section describes how Open vSwitch implements in-band control. Correctly
implementing in-band control has proven difficult due to its many subtleties,
and the implementation has thus gone through many iterations. Please read
through and understand the reasoning behind the chosen rules before making
modifications.

Open vSwitch implements in-band control as "hidden" flows, that is, flows that
are not visible through OpenFlow and that have a higher priority than any
wildcarded flow that can be set up through OpenFlow. This is done so that the
OpenFlow controller cannot interfere with them and possibly break connectivity
with its switches. It is possible to see all flows, including in-band ones,
with the ``bridge/dump-flows`` command of ``ovs-appctl``.
994
995The Open vSwitch implementation of in-band control can hide traffic to
996arbitrary "remotes", where each remote is one TCP port on one IP address.
997Currently the remotes are automatically configured as the in-band OpenFlow
998controllers plus the OVSDB managers, if any. (The latter is a requirement
999because OVSDB managers are responsible for configuring OpenFlow controllers, so
1000if the manager cannot be reached then OpenFlow cannot be reconfigured.)
1001
The following rules (with the ``OFPP_NORMAL`` action) are set up on any bridge
that has any remotes:
1004
1005(a)
1006 DHCP requests sent from the local port.
1007(b)
1008 ARP replies to the local port's MAC address.
1009(c)
1010 ARP requests from the local port's MAC address.
1011
1012In-band also sets up the following rules for each unique next-hop MAC address
1013for the remotes' IPs (the "next hop" is either the remote itself, if it is on a
1014local subnet, or the gateway to reach the remote):
1015
1016(d)
1017 ARP replies to the next hop's MAC address.
1018(e)
1019 ARP requests from the next hop's MAC address.
1020
1021In-band also sets up the following rules for each unique remote IP address:
1022
1023(f)
1024 ARP replies containing the remote's IP address as a target.
1025(g)
1026 ARP requests containing the remote's IP address as a source.
1027
1028In-band also sets up the following rules for each unique remote (IP,port) pair:
1029
1030(h)
1031 TCP traffic to the remote's IP and port.
1032(i)
1033 TCP traffic from the remote's IP and port.
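
To make the per-remote multiplication of rules concrete, the sketch below
generates rules (a) through (i) for a set of remotes. It is illustrative only:
``in_band_rules()`` and the match key names are hypothetical and do not
correspond to Open vSwitch field names, and the sketch assumes that DHCP
requests are recognized as UDP from port 68 to port 67 and that ARP requests
and replies carry opcodes 1 and 2.

```python
ARP_REQUEST, ARP_REPLY = 1, 2


def in_band_rules(local_mac, remotes):
    """Return a list of (letter, match) pairs for rules (a) through (i).

    remotes is an iterable of (ip, tcp_port, next_hop_mac) tuples.
    """
    remotes = list(remotes)
    rules = [
        ("a", {"in_port": "LOCAL", "udp_src": 68, "udp_dst": 67}),
        ("b", {"arp_op": ARP_REPLY, "eth_dst": local_mac}),
        ("c", {"arp_op": ARP_REQUEST, "eth_src": local_mac}),
    ]
    # One pair of rules per unique next-hop MAC address.
    for mac in dict.fromkeys(hop for _, _, hop in remotes):
        rules.append(("d", {"arp_op": ARP_REPLY, "eth_dst": mac}))
        rules.append(("e", {"arp_op": ARP_REQUEST, "eth_src": mac}))
    # One pair of rules per unique remote IP address.
    for ip in dict.fromkeys(ip for ip, _, _ in remotes):
        rules.append(("f", {"arp_op": ARP_REPLY, "arp_target_ip": ip}))
        rules.append(("g", {"arp_op": ARP_REQUEST, "arp_source_ip": ip}))
    # One pair of rules per unique remote (IP, port) pair.
    for ip, port in dict.fromkeys((ip, port) for ip, port, _ in remotes):
        rules.append(("h", {"ip_dst": ip, "tcp_dst": port}))
        rules.append(("i", {"ip_src": ip, "tcp_src": port}))
    return rules
```

For instance, an OpenFlow controller and an OVSDB manager that share one IP
address and one next hop but listen on different TCP ports produce a single
(d)/(e) pair, a single (f)/(g) pair, and two (h)/(i) pairs.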
1034
1035The goal of these rules is to be as narrow as possible to allow a switch to
1036join a network and be able to communicate with the remotes. As mentioned
1037earlier, these rules have higher priority than the controller's rules, so if
1038they are too broad, they may prevent the controller from implementing its
1039policy. As such, in-band actively monitors some aspects of flow and packet
1040processing so that the rules can be made more precise.
1041
1042In-band control monitors attempts to add flows into the datapath that could
1043interfere with its duties. The datapath only allows exact match entries, so
1044in-band control is able to be very precise about the flows it prevents. Flows
1045that miss in the datapath are sent to userspace to be processed, so preventing
1046these flows from being cached in the "fast path" does not affect correctness.
1047The only type of flow that is currently prevented is one that would prevent
1048DHCP replies from being seen by the local port. For example, a rule that
1049forwarded all DHCP traffic to the controller would not be allowed, but one that
1050forwarded to all ports (including the local port) would.
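
The datapath check in the example above amounts to a simple predicate. The
following sketch assumes a hypothetical ``may_install_in_datapath()`` helper
and a simplified flow representation (a dict of match fields plus a
collection of output ports); DHCP replies are assumed to be UDP from port 67
to port 68.

```python
LOCAL = "LOCAL"  # stands in for the bridge's local port


def is_dhcp_reply(match):
    """True if the exact-match flow describes DHCP server-to-client traffic."""
    return match.get("udp_src") == 67 and match.get("udp_dst") == 68


def may_install_in_datapath(match, output_ports):
    """Refuse flows that would hide DHCP replies from the local port."""
    if not is_dhcp_reply(match):
        return True
    return LOCAL in output_ports
```

A refused flow is simply not cached; its packets keep missing in the datapath
and are handled in userspace, so forwarding remains correct.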
1051
1052As mentioned earlier, packets that miss in the datapath are sent to the
1053userspace for processing. The userspace has its own flow table, the
1054"classifier", so in-band checks whether any special processing is needed before
1055the classifier is consulted. If a packet is a DHCP response to a request from
1056the local port, the packet is forwarded to the local port, regardless of the
1057flow table. Note that this requires L7 processing of DHCP replies to determine
1058whether the 'chaddr' field matches the MAC address of the local port.
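
The 'chaddr' comparison can be sketched as below. This is an illustration, not
the Open vSwitch code: ``is_dhcp_reply_for()`` is a hypothetical helper that
receives the UDP payload of a packet already known to be DHCP. It relies only
on the fixed BOOTP layout, in which ``op`` is the first octet (2 for a reply),
``hlen`` is the third octet, and ``chaddr`` is a 16-octet field at offset 28.

```python
BOOTREPLY = 2
CHADDR_OFFSET = 28   # op, htype, hlen, hops, xid, secs, flags, 4 addresses
CHADDR_LEN = 16


def is_dhcp_reply_for(bootp, local_mac):
    """True if bootp is a reply whose chaddr equals local_mac (bytes)."""
    if len(bootp) < CHADDR_OFFSET + CHADDR_LEN:
        return False            # too short to contain chaddr
    op, hlen = bootp[0], bootp[2]
    if op != BOOTREPLY or hlen != len(local_mac):
        return False
    return bootp[CHADDR_OFFSET:CHADDR_OFFSET + hlen] == local_mac
```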
1059
It is interesting to note that for an L3-based in-band control mechanism, the
majority of rules are devoted to ARP traffic. At first glance, some of these
rules appear redundant. However, each serves an important role. First, in
order to determine the MAC address of the remote side (controller or gateway)
for other ARP rules, we must allow ARP traffic for our local port with rules
(b) and (c). If we are between a switch and its connection to the remote, we
have to allow the other switch's ARP traffic through. This is done with
rules (d) and (e), since we do not know the addresses of the other switches a
priori, but do know the remote's or gateway's. Finally, if the remote is
running in a local guest VM that is not reached through the local port, the
switch that is connected to the VM must allow ARP traffic based on the remote's
IP address, since it will not know the MAC address of the local port that is
sending the traffic or the MAC address of the remote in the guest VM.
1073
1074With a few notable exceptions below, in-band should work in most network
1075setups. The following are considered "supported" in the current
1076implementation:
1077
1078- Locally Connected. The switch and remote are on the same subnet. This uses
1079 rules (a), (b), (c), (h), and (i).
1080
1081- Reached through Gateway. The switch and remote are on different subnets and
1082 must go through a gateway. This uses rules (a), (b), (c), (h), and (i).
1083
1084- Between Switch and Remote. This switch is between another switch and the
1085 remote, and we want to allow the other switch's traffic through. This uses
1086 rules (d), (e), (h), and (i). It uses (b) and (c) indirectly in order to
1087 know the MAC address for rules (d) and (e). Note that DHCP for the other
1088 switch will not work unless an OpenFlow controller explicitly lets this
1089 switch pass the traffic.
1090
1091- Between Switch and Gateway. This switch is between another switch and the
1092 gateway, and we want to allow the other switch's traffic through. This uses
1093 the same rules and logic as the "Between Switch and Remote" configuration
1094 described earlier.
1095
1096- Remote on Local VM. The remote is a guest VM on the system running in-band
1097 control. This uses rules (a), (b), (c), (h), and (i).
1098
1099- Remote on Local VM with Different Networks. The remote is a guest VM on the
1100 system running in-band control, but the local port is not used to connect to
1101 the remote. For example, an IP address is configured on eth0 of the switch.
1102 The remote's VM is connected through eth1 of the switch, but an IP address
1103 has not been configured for that port on the switch. As such, the switch
1104 will use eth0 to connect to the remote, and eth1's rules about the local port
1105 will not work. In the example, the switch attached to eth0 would use rules
1106 (a), (b), (c), (h), and (i) on eth0. The switch attached to eth1 would use
1107 rules (f), (g), (h), and (i).
1108
1109The following are explicitly *not* supported by in-band control:
1110
- Specify Remote by Name. Currently, the remote must be identified by IP
  address. A naive approach would be to permit all DNS traffic.
  Unfortunately, this would prevent the controller from defining any policy
  over DNS. Since switches that are located behind us need to connect to the
  remote, in-band cannot simply add a rule that allows DNS traffic from the
  local port. The "correct" way to support this is to parse DNS requests to
  allow all traffic related to a request for the remote's name through. Due to
  the potential security problems and amount of processing, we decided to hold
  off for the time being.
1120
1121- Differing Remotes for Switches. All switches must know the L3 addresses for
1122 all the remotes that other switches may use, since rules need to be set up to
1123 allow traffic related to those remotes through. See rules (f), (g), (h), and
1124 (i).
1125
1126- Differing Routes for Switches. In order for the switch to allow other
1127 switches to connect to a remote through a gateway, it allows the gateway's
1128 traffic through with rules (d) and (e). If the routes to the remote differ
1129 for the two switches, we will not know the MAC address of the alternate
1130 gateway.
1131
Action Reproduction
-------------------

It seems likely that many controllers, at least at startup, use the OpenFlow
"flow statistics" request to obtain existing flows, then compare the flows'
actions against the actions that they expect to find. Before version 1.8.0,
Open vSwitch always returned exact, byte-for-byte copies of the actions that
had been added to the flow table. The current version of Open vSwitch deviates
from this in a few exceptional cases. This section lists the exceptions that
controller authors must keep in mind if they compare actual actions against
desired actions in a bytewise fashion:
1143
1144- Open vSwitch zeros padding bytes in action structures, regardless of their
1145 values when the flows were added.
1146
1147- Open vSwitch "normalizes" the instructions in OpenFlow 1.1 (and later) in the
1148 following way:
1149
1150 * OVS sorts the instructions into the following order: Apply-Actions,
1151 Clear-Actions, Write-Actions, Write-Metadata, Goto-Table.
1152
1153 * OVS drops Apply-Actions instructions that have empty action lists.
1154
1155 * OVS drops Write-Actions instructions that have empty action sets.
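
A controller that wants to compare flows reliably can apply the same
normalization to its expected instructions before comparing. The sketch below
models instructions as ``(name, payload)`` pairs; the ``normalize()`` helper
and this representation are hypothetical and chosen only to illustrate the
ordering and dropping rules listed above. It does not model the zeroing of
padding bytes.

```python
INSTRUCTION_ORDER = [
    "Apply-Actions",
    "Clear-Actions",
    "Write-Actions",
    "Write-Metadata",
    "Goto-Table",
]


def normalize(instructions):
    """Return instructions in the order and form that OVS reports them.

    instructions is a list of (name, payload) pairs.  For Apply-Actions and
    Write-Actions the payload is a list of actions.
    """
    kept = [
        (name, payload)
        for name, payload in instructions
        # Empty Apply-Actions and Write-Actions instructions are dropped.
        if not (name in ("Apply-Actions", "Write-Actions") and not payload)
    ]
    return sorted(kept, key=lambda item: INSTRUCTION_ORDER.index(item[0]))
```

Comparing ``normalize(expected)`` with ``normalize(actual)`` avoids false
mismatches caused solely by instruction order or by empty action lists.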
1156
1157Please report other discrepancies, if you notice any, so that we can fix or
1158document them.
1159
Suggestions
-----------

Suggestions to improve Open vSwitch are welcome at discuss@openvswitch.org.