1..
2 Licensed under the Apache License, Version 2.0 (the "License"); you may
3 not use this file except in compliance with the License. You may obtain
4 a copy of the License at
5
6 http://www.apache.org/licenses/LICENSE-2.0
7
8 Unless required by applicable law or agreed to in writing, software
9 distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
10 WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
11 License for the specific language governing permissions and limitations
12 under the License.
13
14 Convention for heading levels in Open vSwitch documentation:
15
16 ======= Heading 0 (reserved for the title in a document)
17 ------- Heading 1
18 ~~~~~~~ Heading 2
19 +++++++ Heading 3
20 ''''''' Heading 4
21
22 Avoid deeper levels because they do not render well.
23
24===================
25OVS Faucet Tutorial
26===================
27
28This tutorial demonstrates how Open vSwitch works with a general-purpose
29OpenFlow controller, using the Faucet controller as a simple way to get
30started. It was tested with the "master" branch of Open vSwitch and version
311.6.15 of Faucet. It does not use advanced or recently added features in OVS
32or Faucet, so other versions of both pieces of software are likely to work
33equally well.
34
35The goal of the tutorial is to demonstrate Open vSwitch and Faucet in an
36end-to-end way, that is, to show how it works from the Faucet controller
37configuration at the top, through the OpenFlow flow table, to the datapath
38processing. Along the way, in addition to helping to understand the
39architecture at each level, we discuss performance and troubleshooting issues.
40We hope that this demonstration makes it easier for users and potential users
41to understand how Open vSwitch works and how to debug and troubleshoot it.
42
The tutorial provides enough detail that you should be able to follow along
step by step.
45
46Setting Up OVS
47--------------
48
49This section explains how to set up Open vSwitch for the purpose of using it
50with Faucet for the tutorial.
51
52You might already have Open vSwitch installed on one or more computers or VMs,
53perhaps set up to control a set of VMs or a physical network. This is
54admirable, but we will be using Open vSwitch in a different way to set up a
55simulation environment called the OVS "sandbox". The sandbox does not use
56virtual machines or containers, which makes it more limited, but on the other
57hand it is (in this writer's opinion) easier to set up.
58
59There are two ways to start a sandbox: one that uses the Open vSwitch that is
60already installed on a system, and another that uses a copy of Open vSwitch
61that has been built but not yet installed. The latter is more often used and
62thus better tested, but both should work. The instructions below explain both
63approaches:
64
651. Get a copy of the Open vSwitch source repository using Git, then ``cd`` into
66 the new directory::
67
68 $ git clone https://github.com/openvswitch/ovs.git
69 $ cd ovs
70
71 The default checkout is the master branch. You can check out a tag
72 (such as v2.8.0) or a branch (such as origin/branch-2.8), if you
73 prefer.
74
752. If you do not already have an installed copy of Open vSwitch on your system,
76 or if you do not want to use it for the sandbox (the sandbox will not
77 disturb the functionality of any existing switches), then proceed to step 3.
78 If you do have an installed copy and you want to use it for the sandbox, try
79 to start the sandbox by running::
80
81 $ tutorial/ovs-sandbox
82
83 If it is successful, you will find yourself in a subshell environment, which
84 is the sandbox (you can exit with ``exit`` or Control+D). If so, you're
85 finished and do not need to complete the rest of the steps. If it fails,
86 you can proceed to step 3 to build Open vSwitch anyway.
87
883. Before you build, you might want to check that your system meets the build
89 requirements. Read :doc:`/intro/install/general` to find out. For this
90 tutorial, there is no need to compile the Linux kernel module, or to use any
91 of the optional libraries such as OpenSSL, DPDK, or libcap-ng.
92
934. Configure and build Open vSwitch::
94
95 $ ./boot.sh
96 $ ./configure
97 $ make -j4
98
995. Try out the sandbox by running::
100
101 $ make sandbox
102
103 You can exit the sandbox with ``exit`` or Control+D.
104
105Setting up Faucet
106-----------------
107
108This section explains how to get a copy of Faucet and set it up
109appropriately for the tutorial. There are many other ways to install
110Faucet, but this simple approach worked well for me. It has the
111advantage that it does not require modifying any system-level files or
112directories on your machine. It does, on the other hand, require
113Docker, so make sure you have it installed and working.
114
115It will be a little easier to go through the rest of the tutorial if
116you run these instructions in a separate terminal from the one that
117you're using for Open vSwitch, because it's often necessary to switch
118between one and the other.
119
1201. Get a copy of the Faucet source repository using Git, then ``cd``
121 into the new directory::
122
123 $ git clone https://github.com/faucetsdn/faucet.git
124 $ cd faucet
125
126 At this point I checked out the latest tag::
127
128 $ latest_tag=$(git describe --tags $(git rev-list --tags --max-count=1))
129 $ git checkout $latest_tag
130
1312. Build a docker container image::
132
133 $ docker build -t faucet/faucet .
134
135 This will take a few minutes.
136
1373. Create an installation directory under the ``faucet`` directory for
138 the docker image to use::
139
140 $ mkdir inst
141
   The Faucet configuration will go in ``inst/faucet.yaml`` and its
   main log will appear in ``inst/faucet.log``. (The official Faucet
   installation instructions call for putting these in ``/etc/ryu/faucet``
   and ``/var/log/ryu/faucet``, respectively, but we avoid modifying
   these system directories.)
147
1484. Create a container and start Faucet::
149
       $ docker run -d --name faucet --restart=always -v $(pwd)/inst/:/etc/faucet/ -v $(pwd)/inst/:/var/log/faucet/ -p 6653:6653 -p 9302:9302 faucet/faucet
151
1525. Look in ``inst/faucet.log`` to verify that Faucet started. It will
153 probably start with an exception and traceback because we have not
154 yet created ``inst/faucet.yaml``.
155
1566. Later on, to make a new or updated Faucet configuration take
157 effect quickly, you can run::
158
159 $ docker exec faucet pkill -HUP -f faucet.faucet
160
161 Another way is to stop and start the Faucet container::
162
163 $ docker restart faucet
164
165 You can also stop and delete the container; after this, to start it
166 again, you need to rerun the ``docker run`` command::
167
168 $ docker stop faucet
169 $ docker rm faucet
170
171Overview
172--------
173
174Now that Open vSwitch and Faucet are ready, here's an overview of what
175we're going to do for the remainder of the tutorial:
176
1771. Switching: Set up an L2 network with Faucet.
178
1792. Routing: Route between multiple L3 networks with Faucet.
180
1813. ACLs: Add and modify access control rules.
182
183At each step, we will take a look at how the features in question work
184from Faucet at the top to the data plane layer at the bottom. From
185the highest to lowest level, these layers and the software components
186that connect them are:
187
188Faucet.
189 As the top level in the system, this is the authoritative source of the
190 network configuration.
191
192 Faucet connects to a variety of monitoring and performance tools,
193 but we won't use them in this tutorial. Our main insights into the
194 system will be through ``faucet.yaml`` for configuration and
195 ``faucet.log`` to observe state, such as MAC learning and ARP
196 resolution, and to tell when we've screwed up configuration syntax
197 or semantics.
198
199The OpenFlow subsystem in Open vSwitch.
200 OpenFlow is the protocol, standardized by the Open Networking Foundation,
201 that controllers like Faucet use to control how Open vSwitch and other
202 switches treat packets in the network.
203
204 We will use ``ovs-ofctl``, a utility that comes with Open vSwitch,
205 to observe and occasionally modify Open vSwitch's OpenFlow behavior.
206 We will also use ``ovs-appctl``, a utility for communicating with
207 ``ovs-vswitchd`` and other Open vSwitch daemons, to ask "what-if?"
208 type questions.
209
210 In addition, the OVS sandbox by default raises the Open vSwitch
211 logging level for OpenFlow high enough that we can learn a great
212 deal about OpenFlow behavior simply by reading its log file.
213
214Open vSwitch datapath.
215 This is essentially a cache designed to accelerate packet processing. Open
216 vSwitch includes a few different datapaths, such as one based on the Linux
217 kernel and a userspace-only datapath (sometimes called the "DPDK" datapath).
218 The OVS sandbox uses the latter, but the principles behind it apply equally
219 well to other datapaths.
220
221At each step, we discuss how the design of each layer influences
222performance. We demonstrate how Open vSwitch features can be used to
223debug, troubleshoot, and understand the system as a whole.
224
225Switching
226---------
227
228Layer-2 (L2) switching is the basis of modern networking. It's also
229very simple and a good place to start, so let's set up a switch with
230some VLANs in Faucet and see how it works at each layer. Begin by
231putting the following into ``inst/faucet.yaml``::
232
233 dps:
234 switch-1:
235 dp_id: 0x1
236 timeout: 3600
237 arp_neighbor_timeout: 3600
238 interfaces:
239 1:
240 native_vlan: 100
241 2:
242 native_vlan: 100
243 3:
244 native_vlan: 100
245 4:
246 native_vlan: 200
247 5:
248 native_vlan: 200
249 vlans:
250 100:
251 200:
252
253This configuration file defines a single switch ("datapath" or "dp")
254named ``switch-1``. The switch has five ports, numbered 1 through 5.
Ports 1, 2, and 3 are in VLAN 100, and ports 4 and 5 are in VLAN 200.
256Faucet can identify the switch from its datapath ID, which is defined
257to be 0x1.
258
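To make the port-to-VLAN grouping concrete, here is a minimal Python sketch. It assumes the ``interfaces`` section of ``inst/faucet.yaml`` has been copied by hand into a dictionary; the dictionary layout and the ``ports_by_vlan`` helper are hypothetical and are not part of Faucet.

```python
from collections import defaultdict

# Hand-written mirror of the "interfaces" section of inst/faucet.yaml above.
# This layout is an assumption for illustration; Faucet parses the YAML itself.
interfaces = {
    1: {"native_vlan": 100},
    2: {"native_vlan": 100},
    3: {"native_vlan": 100},
    4: {"native_vlan": 200},
    5: {"native_vlan": 200},
}


def ports_by_vlan(interfaces):
    """Group port numbers by their native (untagged) VLAN."""
    members = defaultdict(list)
    for port in sorted(interfaces):
        members[interfaces[port]["native_vlan"]].append(port)
    return dict(members)


print(ports_by_vlan(interfaces))  # {100: [1, 2, 3], 200: [4, 5]}
```

The same grouping shows up later in the Faucet log and in the flooding flows that Faucet installs.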
259.. note::
260
261 This also sets high MAC learning and ARP timeouts. The defaults are
262 5 minutes and about 8 minutes, which are fine in production but
263 sometimes too fast for manual experimentation. (Don't use a timeout
264 bigger than about 65000 seconds because it will crash Faucet.)
265
266Now restart Faucet so that the configuration takes effect, e.g.::
267
268 $ docker restart faucet
269
270Assuming that the configuration update is successful, you should now
271see a new line at the end of ``inst/faucet.log``::
272
    Jan 06 15:14:35 faucet INFO Add new datapath DPID 1 (0x1)
274
275Faucet is now waiting for a switch with datapath ID 0x1 to connect to
276it over OpenFlow, so our next step is to create a switch with OVS and
277make it connect to Faucet. To do that, switch to the terminal where
278you checked out OVS and start a sandbox with ``make sandbox`` or
``tutorial/ovs-sandbox`` (as explained earlier under `Setting Up
OVS`_). You should see something like this toward the end of the
281output::
282
283 ----------------------------------------------------------------------
284 You are running in a dummy Open vSwitch environment. You can use
285 ovs-vsctl, ovs-ofctl, ovs-appctl, and other tools to work with the
286 dummy switch.
287
288 Log files, pidfiles, and the configuration database are in the
289 "sandbox" subdirectory.
290
291 Exit the shell to kill the running daemons.
    blp@sigabrt:~/nicira/ovs/tutorial(0)$

Inside the sandbox, create a switch ("bridge") named ``br0``, set its
295datapath ID to 0x1, add simulated ports to it named ``p1`` through
296``p5``, and tell it to connect to the Faucet controller. To make it
297easier to understand, we request for port ``p1`` to be assigned
298OpenFlow port 1, ``p2`` port 2, and so on. As a final touch,
299configure the controller to be "out-of-band" (this is mainly to avoid
300some annoying messages in the ``ovs-vswitchd`` logs; for more
301information, run ``man ovs-vswitchd.conf.db`` and search for
302``connection_mode``)::
303
304 $ ovs-vsctl add-br br0 \
305 -- set bridge br0 other-config:datapath-id=0000000000000001 \
306 -- add-port br0 p1 -- set interface p1 ofport_request=1 \
307 -- add-port br0 p2 -- set interface p2 ofport_request=2 \
308 -- add-port br0 p3 -- set interface p3 ofport_request=3 \
309 -- add-port br0 p4 -- set interface p4 ofport_request=4 \
310 -- add-port br0 p5 -- set interface p5 ofport_request=5 \
311 -- set-controller br0 tcp:127.0.0.1:6653 \
312 -- set controller br0 connection-mode=out-of-band
313
314.. note::
315
316 You don't have to run all of these as a single ``ovs-vsctl``
317 invocation. It is a little more efficient, though, and since it
318 updates the OVS configuration in a single database transaction it
319 means that, for example, there is never a time when the controller
320 is set but it has not yet been configured as out-of-band.
321
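If you want to experiment with a different number of ports, typing the long ``ovs-vsctl`` command by hand gets tedious. The following Python sketch builds the same argument list as the command above. The ``build_vsctl_args`` helper is hypothetical; it only assembles the arguments and does not run anything. Note that the datapath ID is formatted as 16 hexadecimal digits, as in the command above.

```python
import shlex


def build_vsctl_args(bridge, dpid, n_ports, controller):
    """Assemble one ovs-vsctl invocation (a single database transaction)."""
    args = ["ovs-vsctl", "add-br", bridge,
            "--", "set", "bridge", bridge,
            f"other-config:datapath-id={dpid:016x}"]
    for n in range(1, n_ports + 1):
        # Request that port pN gets OpenFlow port number N.
        args += ["--", "add-port", bridge, f"p{n}",
                 "--", "set", "interface", f"p{n}", f"ofport_request={n}"]
    args += ["--", "set-controller", bridge, controller,
             "--", "set", "controller", bridge, "connection-mode=out-of-band"]
    return args


args = build_vsctl_args("br0", 0x1, 5, "tcp:127.0.0.1:6653")
print(shlex.join(args))
```

Inside the sandbox, the resulting list could be passed to ``subprocess.run(args, check=True)`` to apply the configuration.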
322Now, if you look at ``inst/faucet.log`` again, you should see that
323Faucet recognized and configured the new switch and its ports::
324
325 Jan 06 15:17:10 faucet INFO DPID 1 (0x1) connected
326 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Cold start configuring DP
327 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 100 vid:100 ports:Port 1,Port 2,Port 3
328 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Configuring VLAN 200 vid:200 ports:Port 4,Port 5
329 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 1 up, configuring
330 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 2 up, configuring
331 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 3 up, configuring
332 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 4 up, configuring
333 Jan 06 15:17:10 faucet.valve INFO DPID 1 (0x1) Port 5 up, configuring
334
335Over on the Open vSwitch side, you can see a lot of related activity
336if you take a look in ``sandbox/ovs-vswitchd.log``. For example, here
337is the basic OpenFlow session setup and Faucet's probe of the switch's
338ports and capabilities::
339
340 rconn|INFO|br0<->tcp:127.0.0.1:6653: connecting...
341 vconn|DBG|tcp:127.0.0.1:6653: sent (Success): OFPT_HELLO (OF1.4) (xid=0x1):
342 version bitmap: 0x01, 0x02, 0x03, 0x04, 0x05
343 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_HELLO (OF1.3) (xid=0x2f24810a):
344 version bitmap: 0x01, 0x02, 0x03, 0x04
345 vconn|DBG|tcp:127.0.0.1:6653: negotiated OpenFlow version 0x04 (we support version 0x05 and earlier, peer supports version 0x04 and earlier)
346 rconn|INFO|br0<->tcp:127.0.0.1:6653: connected
347 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_ECHO_REQUEST (OF1.3) (xid=0x2f24810b): 0 bytes of payload
348 vconn|DBG|tcp:127.0.0.1:6653: sent (Success): OFPT_ECHO_REPLY (OF1.3) (xid=0x2f24810b): 0 bytes of payload
349 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_FEATURES_REQUEST (OF1.3) (xid=0x2f24810c):
350 vconn|DBG|tcp:127.0.0.1:6653: sent (Success): OFPT_FEATURES_REPLY (OF1.3) (xid=0x2f24810c): dpid:0000000000000001
351 n_tables:254, n_buffers:0
352 capabilities: FLOW_STATS TABLE_STATS PORT_STATS GROUP_STATS QUEUE_STATS
353 vconn|DBG|tcp:127.0.0.1:6653: received: OFPST_PORT_DESC request (OF1.3) (xid=0x2f24810d): port=ANY
354 vconn|DBG|tcp:127.0.0.1:6653: sent (Success): OFPST_PORT_DESC reply (OF1.3) (xid=0x2f24810d):
355 1(p1): addr:aa:55:aa:55:00:14
356 config: PORT_DOWN
357 state: LINK_DOWN
358 speed: 0 Mbps now, 0 Mbps max
359 2(p2): addr:aa:55:aa:55:00:15
360 config: PORT_DOWN
361 state: LINK_DOWN
362 speed: 0 Mbps now, 0 Mbps max
363 3(p3): addr:aa:55:aa:55:00:16
364 config: PORT_DOWN
365 state: LINK_DOWN
366 speed: 0 Mbps now, 0 Mbps max
367 4(p4): addr:aa:55:aa:55:00:17
368 config: PORT_DOWN
369 state: LINK_DOWN
370 speed: 0 Mbps now, 0 Mbps max
371 5(p5): addr:aa:55:aa:55:00:18
372 config: PORT_DOWN
373 state: LINK_DOWN
374 speed: 0 Mbps now, 0 Mbps max
375 LOCAL(br0): addr:c6:64:ff:59:48:41
376 config: PORT_DOWN
377 state: LINK_DOWN
378 speed: 0 Mbps now, 0 Mbps max
379
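The version negotiation at the top of this log is easy to reproduce. In the sketch below, each side advertises the OpenFlow wire versions it supports and the highest common one wins. This is a simplified model that covers only the case where both ``OFPT_HELLO`` messages carry a version bitmap, as they do here; the ``negotiate`` helper is hypothetical.

```python
# OpenFlow wire version numbers and the protocol versions they stand for.
VERSION_NAMES = {
    0x01: "OpenFlow 1.0",
    0x02: "OpenFlow 1.1",
    0x03: "OpenFlow 1.2",
    0x04: "OpenFlow 1.3",
    0x05: "OpenFlow 1.4",
}


def negotiate(ours, theirs):
    """Return the highest wire version both peers support, or None."""
    common = set(ours) & set(theirs)
    return max(common) if common else None


ovs_versions = [0x01, 0x02, 0x03, 0x04, 0x05]   # bitmap sent by OVS
faucet_versions = [0x01, 0x02, 0x03, 0x04]      # bitmap received from Faucet

negotiated = negotiate(ovs_versions, faucet_versions)
print(VERSION_NAMES[negotiated])  # OpenFlow 1.3
```

This matches the ``negotiated OpenFlow version 0x04`` line in the log.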
380After that, you can see Faucet delete all existing flows and then
381start adding new ones::
382
383 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x2f24810e): DEL table:255 priority=0 actions=drop
384 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_BARRIER_REQUEST (OF1.3) (xid=0x2f24810f):
385 vconn|DBG|tcp:127.0.0.1:6653: sent (Success): OFPT_BARRIER_REPLY (OF1.3) (xid=0x2f24810f):
386 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x2f248110): ADD priority=0 cookie:0x5adc15c0 out_port:0 actions=drop
387 vconn|DBG|tcp:127.0.0.1:6653: received: OFPT_FLOW_MOD (OF1.3) (xid=0x2f248111): ADD table:1 priority=0 cookie:0x5adc15c0 out_port:0 actions=drop
388 ...
389
390OpenFlow Layer
391~~~~~~~~~~~~~~
392
393Let's take a look at the OpenFlow tables that Faucet set up. Before
394we do that, it's helpful to take a look at ``docs/architecture.rst``
395in the Faucet documentation to learn how Faucet structures its flow
396tables. In summary, this document says:
397
398Table 0
399 Port-based ACLs
400
401Table 1
402 Ingress VLAN processing
403
404Table 2
405 VLAN-based ACLs
406
407Table 3
408 Ingress L2 processing, MAC learning
409
410Table 4
411 L3 forwarding for IPv4
412
413Table 5
414 L3 forwarding for IPv6
415
416Table 6
417 Virtual IP processing, e.g. for router IP addresses implemented by Faucet
418
419Table 7
420 Egress L2 processing
421
422Table 8
  Flooding

With that in mind, let's dump the flow tables. The simplest way is to
426just run plain ``ovs-ofctl dump-flows``::
427
428 $ ovs-ofctl dump-flows br0
429
430If you run that bare command, it produces a lot of extra junk that
431makes the output harder to read, like statistics and "cookie" values
432that are all the same. In addition, for historical reasons
433``ovs-ofctl`` always defaults to using OpenFlow 1.0 even though Faucet
434and most modern controllers use OpenFlow 1.3, so it's best to force it
435to use OpenFlow 1.3. We could throw in a lot of options to fix these,
436but we'll want to do this more than once, so let's start by defining a
437shell function for ourselves::
438
439 $ dump-flows () {
440 ovs-ofctl -OOpenFlow13 --names --no-stat dump-flows "$@" \
441 | sed 's/cookie=0x5adc15c0, //'
442 }
443
444Let's also define ``save-flows`` and ``diff-flows`` functions for
445later use::
446
447 $ save-flows () {
448 ovs-ofctl -OOpenFlow13 --no-names --sort dump-flows "$@"
449 }
450 $ diff-flows () {
451 ovs-ofctl -OOpenFlow13 diff-flows "$@" | sed 's/cookie=0x5adc15c0 //'
452 }
453
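If you prefer to post-process saved flow dumps in a script rather than with ``sed``, the cookie filtering can be sketched in Python as follows. The ``strip_cookie`` helper is hypothetical, and the sample line is assembled from the cookie value and one of the flows shown in this tutorial rather than copied from real output. Unlike the ``sed`` expressions above, the pattern accepts any hexadecimal cookie value.

```python
import re

# Matches "cookie=0x..., " (dump-flows style) or "cookie=0x... " (diff-flows style).
COOKIE_RE = re.compile(r"cookie=0x[0-9a-f]+,? ")


def strip_cookie(line):
    """Remove the first cookie field from one line of flow output."""
    return COOKIE_RE.sub("", line, count=1)


sample = "cookie=0x5adc15c0, table=1, priority=9099,dl_type=0x88cc actions=drop"
print(strip_cookie(sample))
```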
454Now let's take a look at the flows we've got and what they mean, like
455this::
456
457 $ dump-flows br0
458
First, table 0 has a flow that just jumps to table 1 for each
configured port, and drops other unrecognized packets. Presumably it
would do more if we had configured port-based ACLs::
462
463 priority=9099,in_port=p1 actions=goto_table:1
464 priority=9099,in_port=p2 actions=goto_table:1
465 priority=9099,in_port=p3 actions=goto_table:1
466 priority=9099,in_port=p4 actions=goto_table:1
467 priority=9099,in_port=p5 actions=goto_table:1
468 priority=0 actions=drop
469
470Table 1, for ingress VLAN processing, has a bunch of flows that drop
inappropriate packets, such as LLDP and STP::

    table=1, priority=9099,dl_dst=01:80:c2:00:00:00 actions=drop
474 table=1, priority=9099,dl_dst=01:00:0c:cc:cc:cd actions=drop
475 table=1, priority=9099,dl_type=0x88cc actions=drop
476
477Table 1 also has some more interesting flows that recognize packets
478without a VLAN header on each of our ports
479(``vlan_tci=0x0000/0x1fff``), push on the VLAN configured for the
480port, and proceed to table 3. Presumably these skip table 2 because
481we did not configure any VLAN-based ACLs. There is also a fallback
482flow to drop other packets, which in practice means that if any
483received packet already has a VLAN header then it will be dropped::
484
485 table=1, priority=9000,in_port=p1,vlan_tci=0x0000/0x1fff actions=push_vlan:0x8100,set_field:4196->vlan_vid,goto_table:3
486 table=1, priority=9000,in_port=p2,vlan_tci=0x0000/0x1fff actions=push_vlan:0x8100,set_field:4196->vlan_vid,goto_table:3
487 table=1, priority=9000,in_port=p3,vlan_tci=0x0000/0x1fff actions=push_vlan:0x8100,set_field:4196->vlan_vid,goto_table:3
488 table=1, priority=9000,in_port=p4,vlan_tci=0x0000/0x1fff actions=push_vlan:0x8100,set_field:4296->vlan_vid,goto_table:3
489 table=1, priority=9000,in_port=p5,vlan_tci=0x0000/0x1fff actions=push_vlan:0x8100,set_field:4296->vlan_vid,goto_table:3
490 table=1, priority=0 actions=drop
491
492.. note::
493
494 The syntax ``set_field:4196->vlan_vid`` is curious and somewhat
495 misleading. OpenFlow 1.3 defines the ``vlan_vid`` field as a 13-bit
496 field where bit 12 is set to 1 if the VLAN header is present. Thus,
497 since 4196 is 0x1064, this action sets VLAN value 0x64, which in
498 decimal is 100.
499
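The arithmetic in the note above can be checked with a few lines of Python. ``OFPVID_PRESENT`` is the OpenFlow 1.3 name for bit 12; the ``decode_vlan_vid`` helper is hypothetical.

```python
OFPVID_PRESENT = 0x1000  # bit 12: set when a VLAN header is present
VID_MASK = 0x0FFF        # low 12 bits: the actual VLAN ID


def decode_vlan_vid(value):
    """Split an OpenFlow 1.3 vlan_vid value into (present, vlan_id)."""
    return bool(value & OFPVID_PRESENT), value & VID_MASK


print(hex(4196), decode_vlan_vid(4196))  # 0x1064 (True, 100)
print(hex(4296), decode_vlan_vid(4296))  # 0x10c8 (True, 200)
```

So ``set_field:4196->vlan_vid`` selects VLAN 100 and ``set_field:4296->vlan_vid`` selects VLAN 200, which agrees with the VLANs configured in ``faucet.yaml``.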
500Table 2 isn't used because there are no VLAN-based ACLs. It just has
501a drop flow::
502
503 table=2, priority=0 actions=drop
504
Table 3 is used for MAC learning, but the controller hasn't learned any
MACs yet. It also drops some inappropriate packets such as those that claim
to be from a broadcast source address (why not from all multicast source
addresses, though?). We'll come back here later::

    table=3, priority=9099,dl_src=ff:ff:ff:ff:ff:ff actions=drop
    table=3, priority=9001,dl_src=0e:00:00:00:00:01 actions=drop
    table=3, priority=0 actions=drop
    table=3, priority=9000 actions=CONTROLLER:96,goto_table:7
514
515Tables 4, 5, and 6 aren't used because we haven't configured any
516routing::
517
518 table=4, priority=0 actions=drop
519 table=5, priority=0 actions=drop
520 table=6, priority=0 actions=drop
521
522Table 7 is used to direct packets to learned MACs but Faucet hasn't
523learned any MACs yet, so it just sends all the packets along to table
5248::
525
526 table=7, priority=0 actions=drop
527 table=7, priority=9000 actions=goto_table:8
528
529Table 8 implements flooding, broadcast, and multicast. The flows for
530broadcast and flood are easy to understand: if the packet came in on a
531given port and needs to be flooded or broadcast, output it to all the
532other ports in the same VLAN::
533
534 table=8, priority=9008,in_port=p1,dl_vlan=100,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p2,output:p3
535 table=8, priority=9008,in_port=p2,dl_vlan=100,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p1,output:p3
536 table=8, priority=9008,in_port=p3,dl_vlan=100,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p1,output:p2
537 table=8, priority=9008,in_port=p4,dl_vlan=200,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p5
538 table=8, priority=9008,in_port=p5,dl_vlan=200,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p4
539 table=8, priority=9000,in_port=p1,dl_vlan=100 actions=pop_vlan,output:p2,output:p3
540 table=8, priority=9000,in_port=p2,dl_vlan=100 actions=pop_vlan,output:p1,output:p3
541 table=8, priority=9000,in_port=p3,dl_vlan=100 actions=pop_vlan,output:p1,output:p2
542 table=8, priority=9000,in_port=p4,dl_vlan=200 actions=pop_vlan,output:p5
543 table=8, priority=9000,in_port=p5,dl_vlan=200 actions=pop_vlan,output:p4
544
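The regularity of these flows suggests that they are generated from the VLAN membership. The following Python sketch reproduces the text of the priority-9000 flood flows from a port-to-VLAN mapping. It illustrates the pattern only and is not how Faucet is implemented; the ``flood_flows`` helper and its arguments are hypothetical.

```python
def flood_flows(vlan_members, priority=9000, table=8):
    """Build one flood flow per port: output to every other port in the VLAN."""
    flows = []
    for vlan, ports in sorted(vlan_members.items()):
        for in_port in ports:
            outputs = ",".join(f"output:p{p}" for p in ports if p != in_port)
            flows.append(
                f"table={table}, priority={priority},in_port=p{in_port},"
                f"dl_vlan={vlan} actions=pop_vlan,{outputs}"
            )
    return flows


vlan_members = {100: [1, 2, 3], 200: [4, 5]}
for flow in flood_flows(vlan_members):
    print(flow)
```

The five lines printed should match the last five flows in the listing above.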
545.. note::
546
547 These flows could apparently be simpler because OpenFlow says that
548 ``output:<port>`` is ignored if ``<port>`` is the input port. That
549 means that the first three flows above could apparently be collapsed
550 into just::
551
552 table=8, priority=9008,dl_vlan=100,dl_dst=ff:ff:ff:ff:ff:ff actions=pop_vlan,output:p1,output:p2,output:p3
553
554 There might be some reason why this won't work or isn't practical,
555 but that isn't obvious from looking at the flow table.
556
557There are also some flows for handling some standard forms of
558multicast, and a fallback drop flow::
559
560 table=8, priority=9006,in_port=p1,dl_vlan=100,dl_dst=33:33:00:00:00:00/ff:ff:00:00:00:00 actions=pop_vlan,output:p2,output:p3
561 table=8, priority=9006,in_port=p2,dl_vlan=100,dl_dst=33:33:00:00:00:00/ff:ff:00:00:00:00 actions=pop_vlan,output:p1,output:p3
562 table=8, priority=9006,in_port=p3,dl_vlan=100,dl_dst=33:33:00:00:00:00/ff:ff:00:00:00:00 actions=pop_vlan,output:p1,output:p2
563 table=8, priority=9006,in_port=p4,dl_vlan=200,dl_dst=33:33:00:00:00:00/ff:ff:00:00:00:00 actions=pop_vlan,output:p5
564 table=8, priority=9006,in_port=p5,dl_vlan=200,dl_dst=33:33:00:00:00:00/ff:ff:00:00:00:00 actions=pop_vlan,output:p4
565 table=8, priority=9002,in_port=p1,dl_vlan=100,dl_dst=01:80:c2:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p2,output:p3
566 table=8, priority=9002,in_port=p2,dl_vlan=100,dl_dst=01:80:c2:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p1,output:p3
567 table=8, priority=9002,in_port=p3,dl_vlan=100,dl_dst=01:80:c2:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p1,output:p2
568 table=8, priority=9004,in_port=p1,dl_vlan=100,dl_dst=01:00:5e:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p2,output:p3
569 table=8, priority=9004,in_port=p2,dl_vlan=100,dl_dst=01:00:5e:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p1,output:p3
570 table=8, priority=9004,in_port=p3,dl_vlan=100,dl_dst=01:00:5e:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p1,output:p2
571 table=8, priority=9002,in_port=p4,dl_vlan=200,dl_dst=01:80:c2:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p5
572 table=8, priority=9002,in_port=p5,dl_vlan=200,dl_dst=01:80:c2:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p4
573 table=8, priority=9004,in_port=p4,dl_vlan=200,dl_dst=01:00:5e:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p5
574 table=8, priority=9004,in_port=p5,dl_vlan=200,dl_dst=01:00:5e:00:00:00/ff:ff:ff:00:00:00 actions=pop_vlan,output:p4
575 table=8, priority=0 actions=drop
576
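The ``dl_dst=<value>/<mask>`` syntax in these flows is a bitwise masked match: only the bits that are set in the mask are compared. A small Python sketch of that comparison follows; the helper names are hypothetical.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def masked_match(addr, value, mask):
    """True if addr equals value on every bit that is set in mask."""
    m = mac_to_int(mask)
    return (mac_to_int(addr) & m) == (mac_to_int(value) & m)


# Any destination starting with 33:33 matches the priority-9006 flows.
print(masked_match("33:33:00:00:00:01", "33:33:00:00:00:00", "ff:ff:00:00:00:00"))
# A unicast address such as the one used later in this tutorial does not.
print(masked_match("00:11:11:00:00:00", "33:33:00:00:00:00", "ff:ff:00:00:00:00"))
```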
577Tracing
578~~~~~~~
579
580Let's go a level deeper. So far, everything we've done has been
581fairly general. We can also look at something more specific: the path
that a particular packet would take through Open vSwitch. We can use
the ``ofproto/trace`` command to play "what-if?" games. This command
is one that we send directly to ``ovs-vswitchd``, using the
``ovs-appctl`` utility.
586
587.. note::
588
589 ``ovs-appctl`` is actually a very simple-minded JSON-RPC client, so you could
590 also use some other utility that speaks JSON-RPC, or access it from a program
591 as an API.
592
593The ``ovs-vswitchd``\(8) manpage has a lot of detail on how to use
594``ofproto/trace``, but let's just start by building up from a simple
595example. You can start with a command that just specifies the
596datapath (e.g. ``br0``), an input port, and nothing else; unspecified
597fields default to all-zeros. Let's look at the full output for this
598trivial example::
599
600 $ ovs-appctl ofproto/trace br0 in_port=p1
601 Flow: in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000
602
603 bridge("br0")
604 -------------
605 0. in_port=1, priority 9099, cookie 0x5adc15c0
606 goto_table:1
607 1. in_port=1,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
608 push_vlan:0x8100
609 set_field:4196->vlan_vid
610 goto_table:3
611 3. priority 9000, cookie 0x5adc15c0
612 CONTROLLER:96
613 goto_table:7
614 7. priority 9000, cookie 0x5adc15c0
615 goto_table:8
616 8. in_port=1,dl_vlan=100, priority 9000, cookie 0x5adc15c0
617 pop_vlan
618 output:2
619 output:3
620
621 Final flow: unchanged
622 Megaflow: recirc_id=0,eth,in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000
    Datapath actions: push_vlan(vid=100,pcp=0),userspace(pid=0,controller(reason=1,flags=1,recirc_id=1,rule_cookie=0x5adc15c0,controller_id=0,max_len=96)),pop_vlan,2,3
624
625The first line of output, beginning with ``Flow:``, just repeats our
626request in a more verbose form, including the L2 fields that were
627zeroed.
628
629Each of the numbered items under ``bridge("br0")`` shows what would
630happen to our hypothetical packet in the table with the given number.
For example, we see in table 1 that the packet matches a flow that
pushes on a VLAN header, sets the VLAN ID to 100, and goes on to further
processing in table 3. In table 3, the packet gets sent to the
controller to allow MAC learning to take place, and then table 8
floods the packet to the other ports in the same VLAN.
636
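To see why the trace visits tables 0, 1, 3, 7, and 8 in that order, it can help to model the pipeline in a few lines of Python. The sketch below is a heavily simplified toy: it contains only the flows relevant to a packet arriving on port 1, it treats ``goto_table`` as just another action, and it ignores the controller output. All names in it are hypothetical, and it is not how ``ovs-vswitchd`` is implemented.

```python
# Each flow is (priority, match, actions). An empty action list means "drop".
TABLES = {
    0: [(9099, {"in_port": 1}, [("goto", 1)]),
        (0, {}, [])],
    1: [(9000, {"in_port": 1, "vlan": None}, [("push_vlan", 100), ("goto", 3)]),
        (0, {}, [])],
    3: [(9000, {}, [("controller",), ("goto", 7)]),
        (0, {}, [])],
    7: [(9000, {}, [("goto", 8)]),
        (0, {}, [])],
    8: [(9000, {"in_port": 1, "vlan": 100},
         [("pop_vlan",), ("output", 2), ("output", 3)]),
        (0, {}, [])],
}


def trace(packet, tables):
    """Return (tables visited, output ports) for a hypothetical packet."""
    packet = dict(packet)
    visited, outputs = [], []
    table_id = 0
    while table_id is not None:
        visited.append(table_id)
        # Candidate flows are those whose match fields all equal the packet's.
        candidates = [flow for flow in tables[table_id]
                      if all(packet.get(k) == v for k, v in flow[1].items())]
        if not candidates:
            break
        # The highest-priority matching flow wins.
        _, _, actions = max(candidates, key=lambda flow: flow[0])
        table_id = None
        for action in actions:
            if action[0] == "push_vlan":
                packet["vlan"] = action[1]
            elif action[0] == "pop_vlan":
                packet["vlan"] = None
            elif action[0] == "output":
                outputs.append(action[1])
            elif action[0] == "goto":
                table_id = action[1]
            # "controller" does not affect forwarding in this toy model.
    return visited, outputs


print(trace({"in_port": 1, "vlan": None}, TABLES))
```

For an untagged packet on port 1, the model visits the same tables as the trace above and outputs to ports 2 and 3.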
637Summary information follows the numbered tables. The packet hasn't
638been changed (overall, even though a VLAN was pushed and then popped
639back off) since ingress, hence ``Final flow: unchanged``. We'll look
640at the ``Megaflow`` information later. The ``Datapath actions``
summarize what would actually happen to such a packet.
642
643Triggering MAC Learning
644~~~~~~~~~~~~~~~~~~~~~~~
645
646We just saw how a packet gets sent to the controller to trigger MAC
647learning. Let's actually send the packet and see what happens. But
648before we do that, let's save a copy of the current flow tables for
649later comparison::
650
651 $ save-flows br0 > flows1
652
653Now use ``ofproto/trace``, as before, with a few new twists: we
654specify the source and destination Ethernet addresses and append the
655``-generate`` option so that side effects like sending a packet to the
656controller actually happen::
657
658 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:11:11:00:00:00,dl_dst=00:22:22:00:00:00 -generate
659
660The output is almost identical to that before, so it is not repeated
661here. But, take a look at ``inst/faucet.log`` now. It should now
662include a line at the end that says that it learned about our MAC
66300:11:11:00:00:00, like this::
664
    Jan 06 15:56:02 faucet.valve INFO DPID 1 (0x1) L2 learned 00:11:11:00:00:00 (L2 type 0x0000, L3 src None) on Port 1 on VLAN 100 (1 hosts total)
666
667Now compare the flow tables that we saved to the current ones::
668
    $ diff-flows flows1 br0
670
The result should look like this, showing new flows for the learned
MAC::
673
674 +table=3 priority=9098,in_port=1,dl_vlan=100,dl_src=00:11:11:00:00:00 hard_timeout=3601 actions=goto_table:7
675 +table=7 priority=9099,dl_vlan=100,dl_dst=00:11:11:00:00:00 idle_timeout=3601 actions=pop_vlan,output:1
676
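The idea behind comparing a saved dump against the current flows can be approximated with set operations. Keep in mind that this Python sketch compares flows as plain text, which is only a rough stand-in for ``ovs-ofctl diff-flows``; the ``diff_flows`` helper is hypothetical and the two input lists are abbreviated samples built from flows shown in this tutorial.

```python
def diff_flows(before, after):
    """Return '-' lines for flows that vanished and '+' lines for new flows."""
    before_set, after_set = set(before), set(after)
    removed = sorted(before_set - after_set)
    added = sorted(after_set - before_set)
    return [f"-{flow}" for flow in removed] + [f"+{flow}" for flow in added]


before = [
    "table=3 priority=0 actions=drop",
    "table=7 priority=0 actions=drop",
]
after = before + [
    "table=3 priority=9098,in_port=1,dl_vlan=100,dl_src=00:11:11:00:00:00 hard_timeout=3601 actions=goto_table:7",
    "table=7 priority=9099,dl_vlan=100,dl_dst=00:11:11:00:00:00 idle_timeout=3601 actions=pop_vlan,output:1",
]

for line in diff_flows(before, after):
    print(line)
```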
677To demonstrate the usefulness of the learned MAC, try tracing (with
678side effects) a packet arriving on ``p2`` (or ``p3``) and destined to
679the address learned on ``p1``, like this::
680
681 $ ovs-appctl ofproto/trace br0 in_port=p2,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00 -generate
682
683The first time you run this command, you will notice that it sends the
684packet to the controller, to learn ``p2``'s 00:22:22:00:00:00 source
685address::
686
687 bridge("br0")
688 -------------
689 0. in_port=2, priority 9099, cookie 0x5adc15c0
690 goto_table:1
691 1. in_port=2,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
692 push_vlan:0x8100
693 set_field:4196->vlan_vid
694 goto_table:3
695 3. priority 9000, cookie 0x5adc15c0
696 CONTROLLER:96
697 goto_table:7
698 7. dl_vlan=100,dl_dst=00:11:11:00:00:00, priority 9099, cookie 0x5adc15c0
699 pop_vlan
700 output:1
701
702If you check ``inst/faucet.log``, you can see that ``p2``'s MAC has
703been learned too::
704
    Jan 06 15:58:09 faucet.valve INFO DPID 1 (0x1) L2 learned 00:22:22:00:00:00 (L2 type 0x0000, L3 src None) on Port 2 on VLAN 100 (2 hosts total)
706
707Similarly for ``diff-flows``::
708
709 $ diff-flows flows1 br0
710 +table=3 priority=9098,in_port=1,dl_vlan=100,dl_src=00:11:11:00:00:00 hard_timeout=3601 actions=goto_table:7
711 +table=3 priority=9098,in_port=2,dl_vlan=100,dl_src=00:22:22:00:00:00 hard_timeout=3604 actions=goto_table:7
712 +table=7 priority=9099,dl_vlan=100,dl_dst=00:11:11:00:00:00 idle_timeout=3601 actions=pop_vlan,output:1
713 +table=7 priority=9099,dl_vlan=100,dl_dst=00:22:22:00:00:00 idle_timeout=3604 actions=pop_vlan,output:2
714
715Then, if you re-run either of the ``ofproto/trace`` commands (with or
716without ``-generate``), you can see that the packets go back and forth
717without any further MAC learning, e.g.::
718
719 $ ovs-appctl ofproto/trace br0 in_port=p2,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00 -generate
720 Flow: in_port=2,vlan_tci=0x0000,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00,dl_type=0x0000
721
722 bridge("br0")
723 -------------
724 0. in_port=2, priority 9099, cookie 0x5adc15c0
725 goto_table:1
726 1. in_port=2,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
727 push_vlan:0x8100
728 set_field:4196->vlan_vid
729 goto_table:3
730 3. in_port=2,dl_vlan=100,dl_src=00:22:22:00:00:00, priority 9098, cookie 0x5adc15c0
731 goto_table:7
732 7. dl_vlan=100,dl_dst=00:11:11:00:00:00, priority 9099, cookie 0x5adc15c0
733 pop_vlan
734 output:1
735
736 Final flow: unchanged
737 Megaflow: recirc_id=0,eth,in_port=2,vlan_tci=0x0000/0x1fff,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00,dl_type=0x0000
    Datapath actions: 1
739
740Performance
741~~~~~~~~~~~
742
743Open vSwitch has a concept of a "fast path" and a "slow path"; ideally
744all packets stay in the fast path. This distinction between slow path
745and fast path is the key to making sure that Open vSwitch performs as
746fast as possible.
747
Some factors can force a flow or a packet to take the slow path. As one
example, all CFM, BFD, LACP, STP, and LLDP processing takes place in the
slow path, in the cases where Open vSwitch processes these protocols
itself instead of delegating to controller-written flows. As a second
example, any flow that modifies ARP fields is processed in the slow
path. These are corner cases that are unlikely to cause performance
problems in practice because these protocols send packets at a
relatively slow rate, and users and controller authors do not normally
need to be concerned about them.
757
758To understand what cases users and controller authors should consider,
759we need to talk about how Open vSwitch optimizes for performance. The
760Open vSwitch code is divided into two major components which, as
761already mentioned, are called the "slow path" and "fast path" (aka
762"datapath"). The slow path is embedded in the ``ovs-vswitchd``
763userspace program. It is the part of the Open vSwitch packet
764processing logic that understands OpenFlow. Its job is to take a
765packet and run it through the OpenFlow tables to determine what should
766happen to it. It outputs a list of actions in a form similar to
767OpenFlow actions but simpler, called "ODP actions" or "datapath
768actions". It then passes the ODP actions to the datapath, which
769applies them to the packet.
770
771.. note::
772
773 Open vSwitch contains a single slow path and multiple fast paths.
774 The difference between using Open vSwitch with the Linux kernel
775 versus with DPDK is the datapath.
776
777If every packet passed through the slow path and the fast path in this
778way, performance would be terrible. The key to getting high
779performance from this architecture is caching. Open vSwitch includes
780a multi-level cache. It works like this:
781
7821. A packet initially arrives at the datapath. Some datapaths (such
783 as DPDK and the in-tree version of the OVS kernel module) have a
784 first-level cache called the "microflow cache". The microflow
785 cache is the key to performance for relatively long-lived, high
786 packet rate flows. If the datapath has a microflow cache, then it
787 consults it and, if there is a cache hit, the datapath executes the
788 associated actions. Otherwise, it proceeds to step 2.
789
7902. The datapath consults its second-level cache, called the "megaflow
791 cache". The megaflow cache is the key to performance for shorter
792 or low packet rate flows. If there is a megaflow cache hit, the
793 datapath executes the associated actions. Otherwise, it proceeds
794 to step 3.
795
7963. The datapath passes the packet to the slow path, which runs it
797 through the OpenFlow table to yield ODP actions, a process that is
798 often called "flow translation". It then passes the packet back to
799 the datapath to execute the actions and to, if possible, install a
800 megaflow cache entry so that subsequent similar packets can be
801 handled directly by the fast path. (We already described above
802 most of the cases where a cache entry cannot be installed.)
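
The lookup order in the three steps above can be sketched as a toy model.
This is only an illustration under simplifying assumptions, not how OVS is
implemented: the ``Datapath`` class, ``slow_path_translate``, and the choice
of fields are all hypothetical, and packets are plain dictionaries.

```python
# Toy model of the lookup order described above; this is not OVS code.
# All names here (Datapath, slow_path_translate, ...) are hypothetical.


def slow_path_translate(packet):
    """Stand-in for flow translation: returns (fields_examined, actions)."""
    # Assume the OpenFlow tables only looked at in_port and dl_dst.
    fields = ("in_port", "dl_dst")
    if packet["dl_dst"] == "00:11:11:00:00:00":
        actions = ["output:1"]
    else:
        actions = ["drop"]
    return fields, actions


class Datapath:
    def __init__(self):
        self.microflows = {}  # exact match on every header field
        self.megaflows = []   # (fields, key, actions) wildcarded entries
        self.upcalls = 0      # how many times the slow path was consulted

    def receive(self, packet):
        exact_key = tuple(sorted(packet.items()))
        # Step 1: microflow (exact-match) cache.
        if exact_key in self.microflows:
            return self.microflows[exact_key]
        # Step 2: megaflow cache, comparing only the fields in each entry.
        for fields, key, actions in self.megaflows:
            if tuple(packet[f] for f in fields) == key:
                self.microflows[exact_key] = actions
                return actions
        # Step 3: cache miss; consult the slow path and install a megaflow.
        self.upcalls += 1
        fields, actions = slow_path_translate(packet)
        key = tuple(packet[f] for f in fields)
        self.megaflows.append((fields, key, actions))
        self.microflows[exact_key] = actions
        return actions


dp = Datapath()
a = {"in_port": 2, "dl_dst": "00:11:11:00:00:00", "tp_src": 1000}
b = {"in_port": 2, "dl_dst": "00:11:11:00:00:00", "tp_src": 2000}
dp.receive(a)      # miss in both caches: goes to the slow path
dp.receive(a)      # microflow hit
dp.receive(b)      # megaflow hit, because tp_src was never examined
print(dp.upcalls)  # 1
```

Only the first packet reaches the slow path in this model; the second
connection reuses the megaflow entry because it differs only in a field
that the (pretend) flow tables never looked at.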
803
804The megaflow cache is the key cache to consider for performance
805tuning. Open vSwitch provides tools for understanding and optimizing
806its behavior. The ``ofproto/trace`` command that we have already been
807using is the most common tool for this use. Let's take another look
808at the most recent ``ofproto/trace`` output::
809
810 $ ovs-appctl ofproto/trace br0 in_port=p2,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00 -generate
811 Flow: in_port=2,vlan_tci=0x0000,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00,dl_type=0x0000
812
813 bridge("br0")
814 -------------
815 0. in_port=2, priority 9099, cookie 0x5adc15c0
816 goto_table:1
817 1. in_port=2,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
818 push_vlan:0x8100
819 set_field:4196->vlan_vid
820 goto_table:3
821 3. in_port=2,dl_vlan=100,dl_src=00:22:22:00:00:00, priority 9098, cookie 0x5adc15c0
822 goto_table:7
823 7. dl_vlan=100,dl_dst=00:11:11:00:00:00, priority 9099, cookie 0x5adc15c0
824 pop_vlan
825 output:1
826
827 Final flow: unchanged
828 Megaflow: recirc_id=0,eth,in_port=2,vlan_tci=0x0000/0x1fff,dl_src=00:22:22:00:00:00,dl_dst=00:11:11:00:00:00,dl_type=0x0000
    Datapath actions: 1
830
This time, it's the ``Megaflow`` line, second from the end, that we're
interested in. This line shows the entry that Open vSwitch would insert
into the megaflow cache given the particular packet with the current flow
tables. The megaflow entry includes:
835
836* ``recirc_id``. This is an implementation detail that users don't
837 normally need to understand.
838
839* ``eth``. This just indicates that the cache entry matches only
840 Ethernet packets; Open vSwitch also supports other types of packets,
841 such as IP packets not encapsulated in Ethernet.
842
* All of the fields matched by any of the flows that the packet
  visited:

  ``in_port``
    In tables 0, 1, and 3.

  ``vlan_tci``
    In tables 1, 3, and 7 (``vlan_tci`` includes the VLAN ID and PCP
    fields and ``dl_vlan`` is just the VLAN ID).

  ``dl_src``
    In table 3.

  ``dl_dst``
    In table 7.
858
859* All of the fields matched by flows that had to be ruled out to
860 ensure that the ones that actually matched were the highest priority
861 matching rules.
862
The last one is important. Notice how the megaflow matches on
``dl_type=0x0000``, even though none of the tables matched on
``dl_type`` (the Ethernet type). One reason is this flow in OpenFlow
table 1 (which shows up in ``dump-flows`` output)::
867
868 table=1, priority=9099,dl_type=0x88cc actions=drop
869
870This flow has higher priority than the flow in table 1 that actually
871matched. This means that, to put it in the megaflow cache,
872``ovs-vswitchd`` has to add a match on ``dl_type`` to ensure that the
873cache entry doesn't match LLDP packets (with Ethertype 0x88cc).
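
A minimal sketch of this principle, assuming a table is just a list of
``(priority, match, actions)`` tuples searched in priority order. The
``lookup`` helper and the exact-value matching are simplifications made up
for illustration (real OVS matches ``vlan_tci`` under a mask and uses a far
more efficient classifier).

```python
# Simplified model: visiting a flow "un-wildcards" every field that the
# flow tests, whether or not the flow ends up matching the packet.


def lookup(table, packet, used_fields):
    """Return the actions of the highest-priority matching flow."""
    for priority, match, actions in sorted(table, key=lambda f: -f[0]):
        used_fields.update(match)  # record the fields this flow tests
        if all(packet.get(k) == v for k, v in match.items()):
            return actions
    return None


table1 = [
    (9099, {"dl_type": 0x88CC}, "drop"),  # the LLDP flow shown above
    (9000, {"in_port": 2, "vlan_tci": 0x0000}, "goto_table:3"),
]
packet = {"in_port": 2, "vlan_tci": 0x0000, "dl_type": 0x0000}

used = set()
result = lookup(table1, packet, used)
print(result, sorted(used))
# goto_table:3 ['dl_type', 'in_port', 'vlan_tci']
```

Even though the LLDP flow did not match, ``dl_type`` ends up in the set of
fields that the cache entry must match, because the lookup had to examine
it to rule that flow out.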
874
875.. note::
876
877 In fact, in some cases ``ovs-vswitchd`` matches on fields that
878 aren't strictly required according to this description. ``dl_type``
879 is actually one of those, so deleting the LLDP flow probably would
880 not have any effect on the megaflow. But the principle here is
881 sound.
882
So why does any of this matter? The more specific a megaflow is, that
is, the more fields or bits within fields that it matches, the less
valuable it is from a caching viewpoint. A very specific megaflow might
match on L2 and L3 addresses and L4 port numbers. When that happens,
only packets in one (half-)connection match the megaflow. If that
connection has only a few packets, as many connections do, then the
high cost of the slow path translation is amortized over only a few
packets, so the average cost of forwarding those packets is high. On
the other hand, if a megaflow matches only a relatively small number of
L2 and L3 fields, then the cache entry can potentially be used by many
individual connections, and the average cost is low.
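
The amortization argument can be made concrete with a little arithmetic.
The cost figures below are arbitrary illustrative units, not measurements
of Open vSwitch, and ``average_cost`` is a hypothetical helper.

```python
def average_cost(packets, slow_cost, fast_cost):
    """Average per-packet cost when only the first packet is translated.

    The first packet pays `slow_cost`; every later packet that hits the
    megaflow cache pays `fast_cost`.
    """
    return (slow_cost + (packets - 1) * fast_cost) / packets


# Assumed costs: slow path 100 units, fast path 1 unit.
short_lived = average_cost(3, slow_cost=100, fast_cost=1)
shared = average_cost(1000, slow_cost=100, fast_cost=1)
print(short_lived)       # 34.0
print(round(shared, 3))  # 1.099
```

A megaflow that serves only a three-packet connection costs about 34 units
per packet under these assumptions, while one shared by a thousand packets
from many connections costs close to the fast-path cost alone.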
895
896For more information on how Open vSwitch constructs megaflows,
897including about ways that it can make megaflow entries less specific
898than one would infer from the discussion here, please refer to the
8992015 NSDI paper, "The Design and Implementation of Open vSwitch",
900which focuses on this algorithm.
901
902Routing
903-------
904
905We've looked at how Faucet implements switching in OpenFlow, and how
906Open vSwitch implements OpenFlow through its datapath architecture.
907Now let's start over, adding L3 routing into the picture.
908
It's remarkably easy to enable routing. We just change our ``vlans``
section in ``inst/faucet.yaml`` to specify a router IP address for
each VLAN and define a router between them. The ``dps`` section is unchanged::
912
    dps:
        switch-1:
            dp_id: 0x1
            timeout: 3600
            arp_neighbor_timeout: 3600
            interfaces:
                1:
                    native_vlan: 100
                2:
                    native_vlan: 100
                3:
                    native_vlan: 100
                4:
                    native_vlan: 200
                5:
                    native_vlan: 200
    vlans:
        100:
            faucet_vips: ["10.100.0.254/24"]
        200:
            faucet_vips: ["10.200.0.254/24"]
    routers:
        router-1:
            vlans: [100, 200]
937
938Then we restart Faucet::
939
940 $ docker restart faucet
941
942.. note::
943
944 One should be able to tell Faucet to re-read its configuration file
945 without restarting it. I sometimes saw anomalous behavior when I
946 did this, although I didn't characterize it well enough to make a
947 quality bug report. I found restarting the container to be
948 reliable.
949
950OpenFlow Layer
951~~~~~~~~~~~~~~
952
953Back in the OVS sandbox, let's see how the flow table has changed, with::
954
955 $ diff-flows flows1 br0
956
957First, table 3 has new flows to direct ARP packets to table 6 (the
958virtual IP processing table), presumably to handle ARP for the router
959IPs. New flows also send IP packets destined to a particular Ethernet
960address to table 4 (the L3 forwarding table); we can make the educated
961guess that the Ethernet address is the one used by the Faucet router::
962
963 +table=3 priority=9131,arp,dl_vlan=100 actions=goto_table:6
964 +table=3 priority=9131,arp,dl_vlan=200 actions=goto_table:6
965 +table=3 priority=9099,ip,dl_vlan=100,dl_dst=0e:00:00:00:00:01 actions=goto_table:4
966 +table=3 priority=9099,ip,dl_vlan=200,dl_dst=0e:00:00:00:00:01 actions=goto_table:4
967
968The new flows in table 4 appear to be verifying that the packets are
969indeed addressed to a network or IP address that Faucet knows how to
970route::
971
972 +table=4 priority=9131,ip,dl_vlan=100,nw_dst=10.100.0.254 actions=goto_table:6
973 +table=4 priority=9131,ip,dl_vlan=200,nw_dst=10.200.0.254 actions=goto_table:6
974 +table=4 priority=9123,ip,dl_vlan=100,nw_dst=10.100.0.0/24 actions=goto_table:6
975 +table=4 priority=9123,ip,dl_vlan=200,nw_dst=10.100.0.0/24 actions=goto_table:6
976 +table=4 priority=9123,ip,dl_vlan=100,nw_dst=10.200.0.0/24 actions=goto_table:6
977 +table=4 priority=9123,ip,dl_vlan=200,nw_dst=10.200.0.0/24 actions=goto_table:6
978
979Table 6 has a few different things going on. It sends ARP requests
980for the router IPs to the controller; presumably the controller will
981generate replies and send them back to the requester. It switches
982other ARP packets, either broadcasting them if they have a broadcast
983destination or attempting to unicast them otherwise. It sends all
984other IP packets to the controller::
985
986 +table=6 priority=9133,arp,arp_tpa=10.100.0.254 actions=CONTROLLER:128
987 +table=6 priority=9133,arp,arp_tpa=10.200.0.254 actions=CONTROLLER:128
988 +table=6 priority=9132,arp,dl_dst=ff:ff:ff:ff:ff:ff actions=goto_table:8
989 +table=6 priority=9131,arp actions=goto_table:7
990 +table=6 priority=9130,ip actions=CONTROLLER:128
991
992Performance is clearly going to be poor if every packet that needs to
993be routed has to go to the controller, but it's unlikely that's the
994full story. In the next section, we'll take a closer look.
995
996Tracing
997~~~~~~~
998
As in our switching example, we can play some "what-if?" games to
figure out how this works. Let's suppose that a machine with IP
10.100.0.1, on port ``p1``, wants to send an IP packet to a machine
with IP 10.200.0.1 on port ``p4``. Assuming that these hosts have not
been in communication recently, the steps to accomplish this are
normally the following:
1005
10061. Host 10.100.0.1 sends an ARP request to router 10.100.0.254.
1007
10082. The router sends an ARP reply to the host.
1009
10103. Host 10.100.0.1 sends an IP packet to 10.200.0.1, via the router's
1011 Ethernet address.
1012
10134. The router broadcasts an ARP request to ``p4`` and ``p5``, the
1014 ports that carry the 10.200.0.<x> network.
1015
10165. Host 10.200.0.1 sends an ARP reply to the router.
1017
10186. Either the router sends the IP packet (which it buffered) to
1019 10.200.0.1, or eventually 10.100.0.1 times out and resends it.
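
The six steps above can be sketched as a toy model of a router with a
neighbor (ARP) table. The ``ToyRouter`` class is hypothetical, and it
assumes the "resend" variant of step 6, in which the router does not
buffer the packet that triggered ARP resolution. The Ethernet addresses
are simply example values.

```python
class ToyRouter:
    """Hypothetical router that resolves neighbors before forwarding."""

    def __init__(self):
        self.neighbors = {}  # IP address -> Ethernet address
        self.sent = []       # log of what the router transmitted

    def learn(self, ip, mac):
        """Record a neighbor, as happens on receiving an ARP packet."""
        self.neighbors[ip] = mac

    def forward(self, dst_ip):
        """Try to route a packet; return True if it was delivered."""
        mac = self.neighbors.get(dst_ip)
        if mac is None:
            # Unknown next hop: ask for it and give up on this packet.
            self.sent.append(f"ARP who-has {dst_ip}")
            return False
        self.sent.append(f"IP to {dst_ip} via {mac}")
        return True


router = ToyRouter()
router.learn("10.100.0.1", "00:01:02:03:04:05")  # steps 1 and 2
first = router.forward("10.200.0.1")             # steps 3 and 4
router.learn("10.200.0.1", "00:10:20:30:40:50")  # step 5
second = router.forward("10.200.0.1")            # step 6 (host resends)
print(first, second)  # False True
print(router.sent)
```

The first attempt only produces an ARP request; the packet is delivered
once the destination's Ethernet address is known and the host sends again.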
1020
1021Let's use ``ofproto/trace`` to see whether Faucet and OVS follow this
1022procedure.
1023
1024Before we start, save a new snapshot of the flow tables for later
1025comparison::
1026
1027 $ save-flows br0 > flows2
1028
1029Step 1: Host ARP for Router
1030+++++++++++++++++++++++++++
1031
1032Let's simulate the ARP from 10.100.0.1 to its gateway router
103310.100.0.254. This requires more detail than any of the packets we've
1034simulated previously::
1035
1036 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:01:02:03:04:05,dl_dst=ff:ff:ff:ff:ff:ff,dl_type=0x806,arp_spa=10.100.0.1,arp_tpa=10.100.0.254,arp_sha=00:01:02:03:04:05,arp_tha=ff:ff:ff:ff:ff:ff,arp_op=1 -generate
1037
1038The important part of the output is where it shows that the packet was
1039recognized as an ARP request destined to the router gateway and
1040therefore sent to the controller::
1041
1042 6. arp,arp_tpa=10.100.0.254, priority 9133, cookie 0x5adc15c0
            CONTROLLER:128
1044
1045The Faucet log shows that Faucet learned the host's MAC address,
1046its MAC-to-IP mapping, and responded to the ARP request::
1047
1048 Jan 06 16:12:23 faucet.valve INFO DPID 1 (0x1) Adding new route 10.100.0.1/32 via 10.100.0.1 (00:01:02:03:04:05) on VLAN 100
1049 Jan 06 16:12:23 faucet.valve INFO DPID 1 (0x1) Responded to ARP request for 10.100.0.254 from 10.100.0.1 (00:01:02:03:04:05) on VLAN 100
1050 Jan 06 16:12:23 faucet.valve INFO DPID 1 (0x1) L2 learned 00:01:02:03:04:05 (L2 type 0x0806, L3 src 10.100.0.1) on Port 1 on VLAN 100 (1 hosts total)
1051
1052We can also look at the changes to the flow tables::
1053
1054 $ diff-flows flows2 br0
1055 +table=3 priority=9098,in_port=1,dl_vlan=100,dl_src=00:01:02:03:04:05 hard_timeout=3600 actions=goto_table:7
1056 +table=4 priority=9131,ip,dl_vlan=100,nw_dst=10.100.0.1 actions=set_field:4196->vlan_vid,set_field:0e:00:00:00:00:01->eth_src,set_field:00:01:02:03:04:05->eth_dst,dec_ttl,goto_table:7
1057 +table=4 priority=9131,ip,dl_vlan=200,nw_dst=10.100.0.1 actions=set_field:4196->vlan_vid,set_field:0e:00:00:00:00:01->eth_src,set_field:00:01:02:03:04:05->eth_dst,dec_ttl,goto_table:7
1058 +table=7 priority=9099,dl_vlan=100,dl_dst=00:01:02:03:04:05 idle_timeout=3600 actions=pop_vlan,output:1
1059
The new flows include one in table 3 and one in table 7 for the
learned MAC, which have the same forms we saw before. The new flows
in table 4 are different. They match packets directed to 10.100.0.1
(in two VLANs) and forward them to the host by updating the Ethernet
source and destination addresses appropriately, decrementing the TTL,
and skipping ahead to unicast output in table 7. This means that
packets sent **to** 10.100.0.1 should now get to their destination.
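
As a rough illustration of what the table 4 actions do to a packet, here
is a sketch that applies them to a dictionary of header fields. The
``route_to_host`` function and the dictionary representation are
hypothetical. The value 4196 in ``set_field:4196->vlan_vid`` is VLAN 100
with the OpenFlow "VLAN present" bit (0x1000) set.

```python
ROUTER_MAC = "0e:00:00:00:00:01"
VLAN_PRESENT = 0x1000  # OpenFlow's OFPVID_PRESENT bit


def route_to_host(packet, vlan_vid, host_mac):
    """Apply the table 4 routing actions to a copy of `packet`."""
    routed = dict(packet)
    routed["vlan_vid"] = vlan_vid            # set_field:...->vlan_vid
    routed["eth_src"] = ROUTER_MAC           # set_field:...->eth_src
    routed["eth_dst"] = host_mac             # set_field:...->eth_dst
    routed["nw_ttl"] = packet["nw_ttl"] - 1  # dec_ttl
    return routed


incoming = {
    "vlan_vid": VLAN_PRESENT | 200,  # arrived on VLAN 200
    "eth_src": "00:10:20:30:40:50",
    "eth_dst": ROUTER_MAC,
    "nw_dst": "10.100.0.1",
    "nw_ttl": 64,
}
outgoing = route_to_host(incoming, 4196, "00:01:02:03:04:05")
print(4196 == VLAN_PRESENT | 100)              # True
print(outgoing["eth_dst"], outgoing["nw_ttl"])  # 00:01:02:03:04:05 63
```

After these rewrites the packet looks as if the router had sent it on
VLAN 100, so the ordinary unicast flow in table 7 can deliver it.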
1067
1068Step 2: Router Sends ARP Reply
1069++++++++++++++++++++++++++++++
1070
1071``inst/faucet.log`` said that the router sent an ARP reply. How can
1072we see it? Simulated packets just get dropped by default. One way is
1073to configure the dummy ports to write the packets they receive to a
1074file. Let's try that. First configure the port::
1075
1076 $ ovs-vsctl set interface p1 options:pcap=p1.pcap
1077
1078Then re-run the "trace" command::
1079
1080 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:01:02:03:04:05,dl_dst=ff:ff:ff:ff:ff:ff,dl_type=0x806,arp_spa=10.100.0.1,arp_tpa=10.100.0.254,arp_sha=00:01:02:03:04:05,arp_tha=ff:ff:ff:ff:ff:ff,arp_op=1 -generate
1081
1082And dump the reply packet::
1083
1084 $ /usr/sbin/tcpdump -evvvr sandbox/p1.pcap
1085 reading from file sandbox/p1.pcap, link-type EN10MB (Ethernet)
1086 16:14:47.670727 0e:00:00:00:00:01 (oui Unknown) > 00:01:02:03:04:05 (oui Unknown), ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Reply 10.100.0.254 is-at 0e:00:00:00:00:01 (oui Unknown), length 46
1087
We clearly see the ARP reply, which tells us that the Faucet router's
Ethernet address is 0e:00:00:00:00:01 (as we guessed before from the
flow table).
1091
1092Let's configure the rest of our ports to log their packets, too::
1093
1094 $ for i in 2 3 4 5; do ovs-vsctl set interface p$i options:pcap=p$i.pcap; done
1095
1096Step 3: Host Sends IP Packet
1097++++++++++++++++++++++++++++
1098
1099Now that host 10.100.0.1 has the MAC address for its router, it can
1100send an IP packet to 10.200.0.1 via the router's MAC address, like
1101this::
1102
1103 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,udp,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_ttl=64 -generate
    Flow: udp,in_port=1,vlan_tci=0x0000,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=0
1105
1106 bridge("br0")
1107 -------------
1108 0. in_port=1, priority 9099, cookie 0x5adc15c0
1109 goto_table:1
1110 1. in_port=1,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
1111 push_vlan:0x8100
1112 set_field:4196->vlan_vid
1113 goto_table:3
1114 3. ip,dl_vlan=100,dl_dst=0e:00:00:00:00:01, priority 9099, cookie 0x5adc15c0
1115 goto_table:4
1116 4. ip,dl_vlan=100,nw_dst=10.200.0.0/24, priority 9123, cookie 0x5adc15c0
1117 goto_table:6
     6. ip, priority 9130, cookie 0x5adc15c0
            CONTROLLER:128

    Final flow: udp,in_port=1,dl_vlan=100,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=0
    Megaflow: recirc_id=0,eth,ip,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_dst=10.200.0.0/25,nw_frag=no
    Datapath actions: push_vlan(vid=100,pcp=0),userspace(pid=0,controller(reason=1,flags=0,recirc_id=6,rule_cookie=0x5adc15c0,controller_id=0,max_len=128))
1124
1125Observe that the packet gets recognized as destined to the router, in
1126table 3, and then as properly destined to the 10.200.0.0/24 network,
1127in table 4. In table 6, however, it gets sent to the controller.
1128Presumably, this is because Faucet has not yet resolved an Ethernet
1129address for the destination host 10.200.0.1. It probably sent out an
1130ARP request. Let's take a look in the next step.
1131
1132Step 4: Router Broadcasts ARP Request
1133+++++++++++++++++++++++++++++++++++++
1134
1135The router needs to know the Ethernet address of 10.200.0.1. It knows
1136that, if this machine exists, it's on port ``p4`` or ``p5``, since we
1137configured those ports as VLAN 200.
1138
1139Let's make sure::
1140
1141 $ /usr/sbin/tcpdump -evvvr sandbox/p4.pcap
1142 reading from file sandbox/p4.pcap, link-type EN10MB (Ethernet)
    16:17:43.174006 0e:00:00:00:00:01 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.200.0.1 tell 10.200.0.254, length 46
1144
1145and::
1146
1147 $ /usr/sbin/tcpdump -evvvr sandbox/p5.pcap
1148 reading from file sandbox/p5.pcap, link-type EN10MB (Ethernet)
    16:17:43.174268 0e:00:00:00:00:01 (oui Unknown) > Broadcast, ethertype ARP (0x0806), length 60: Ethernet (len 6), IPv4 (len 4), Request who-has 10.200.0.1 tell 10.200.0.254, length 46
1150
1151For good measure, let's make sure that it wasn't sent to ``p3``::
1152
1153 $ /usr/sbin/tcpdump -evvvr sandbox/p3.pcap
1154 reading from file sandbox/p3.pcap, link-type EN10MB (Ethernet)
1155
1156Step 5: Host 2 Sends ARP Reply
1157++++++++++++++++++++++++++++++
1158
1159The Faucet controller sent an ARP request, so we can send an ARP
1160reply::
1161
1162 $ ovs-appctl ofproto/trace br0 in_port=p4,dl_src=00:10:20:30:40:50,dl_dst=0e:00:00:00:00:01,dl_type=0x806,arp_spa=10.200.0.1,arp_tpa=10.200.0.254,arp_sha=00:10:20:30:40:50,arp_tha=0e:00:00:00:00:01,arp_op=2 -generate
1163 Flow: arp,in_port=4,vlan_tci=0x0000,dl_src=00:10:20:30:40:50,dl_dst=0e:00:00:00:00:01,arp_spa=10.200.0.1,arp_tpa=10.200.0.254,arp_op=2,arp_sha=00:10:20:30:40:50,arp_tha=0e:00:00:00:00:01
1164
1165 bridge("br0")
1166 -------------
1167 0. in_port=4, priority 9099, cookie 0x5adc15c0
1168 goto_table:1
1169 1. in_port=4,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
1170 push_vlan:0x8100
1171 set_field:4296->vlan_vid
1172 goto_table:3
1173 3. arp,dl_vlan=200, priority 9131, cookie 0x5adc15c0
1174 goto_table:6
1175 6. arp,arp_tpa=10.200.0.254, priority 9133, cookie 0x5adc15c0
            CONTROLLER:128
1177
1178 Final flow: arp,in_port=4,dl_vlan=200,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:10:20:30:40:50,dl_dst=0e:00:00:00:00:01,arp_spa=10.200.0.1,arp_tpa=10.200.0.254,arp_op=2,arp_sha=00:10:20:30:40:50,arp_tha=0e:00:00:00:00:01
    Megaflow: recirc_id=0,eth,arp,in_port=4,vlan_tci=0x0000/0x1fff,dl_dst=0e:00:00:00:00:01,arp_tpa=10.200.0.254
    Datapath actions: push_vlan(vid=200,pcp=0),userspace(pid=0,controller(reason=1,flags=0,recirc_id=7,rule_cookie=0x5adc15c0,controller_id=0,max_len=128))
1181
1182It shows up in ``inst/faucet.log``::
1183
1184 Jan 06 03:20:11 faucet.valve INFO DPID 1 (0x1) Adding new route 10.200.0.1/32 via 10.200.0.1 (00:10:20:30:40:50) on VLAN 200
1185 Jan 06 03:20:11 faucet.valve INFO DPID 1 (0x1) ARP response 10.200.0.1 (00:10:20:30:40:50) on VLAN 200
1186 Jan 06 03:20:11 faucet.valve INFO DPID 1 (0x1) L2 learned 00:10:20:30:40:50 (L2 type 0x0806, L3 src 10.200.0.1) on Port 4 on VLAN 200 (1 hosts total)
1187
1188and in the OVS flow tables::
1189
1190 $ diff-flows flows2 br0
    +table=3 priority=9098,in_port=4,dl_vlan=200,dl_src=00:10:20:30:40:50 hard_timeout=3601 actions=goto_table:7
1192 ...
1193 +table=4 priority=9131,ip,dl_vlan=200,nw_dst=10.200.0.1 actions=set_field:4296->vlan_vid,set_field:0e:00:00:00:00:01->eth_src,set_field:00:10:20:30:40:50->eth_dst,dec_ttl,goto_table:7
1194 +table=4 priority=9131,ip,dl_vlan=100,nw_dst=10.200.0.1 actions=set_field:4296->vlan_vid,set_field:0e:00:00:00:00:01->eth_src,set_field:00:10:20:30:40:50->eth_dst,dec_ttl,goto_table:7
1195 ...
1196 +table=4 priority=9123,ip,dl_vlan=100,nw_dst=10.200.0.0/24 actions=goto_table:6
    +table=7 priority=9099,dl_vlan=200,dl_dst=00:10:20:30:40:50 idle_timeout=3601 actions=pop_vlan,output:4
1198
1199Step 6: IP Packet Delivery
1200++++++++++++++++++++++++++
1201
Now both the host and the router have everything they need to deliver
the packet. There are two ways it might happen. If Faucet's router
is smart enough to buffer the packet that triggered ARP resolution, then
it might have delivered it already. If so, then it should show up in
``p4.pcap``. Let's take a look::
1207
    $ /usr/sbin/tcpdump -evvvr sandbox/p4.pcap ip
1209 reading from file sandbox/p4.pcap, link-type EN10MB (Ethernet)
1210
1211Nope. That leaves the other possibility, which is that Faucet waits
1212for the original sending host to re-send the packet. We can do that
1213by re-running the trace::
1214
1215 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,udp,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_ttl=64 -generate
1216 Flow: udp,in_port=1,vlan_tci=0x0000,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=0
1217
1218 bridge("br0")
1219 -------------
1220 0. in_port=1, priority 9099, cookie 0x5adc15c0
1221 goto_table:1
1222 1. in_port=1,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
1223 push_vlan:0x8100
1224 set_field:4196->vlan_vid
1225 goto_table:3
1226 3. ip,dl_vlan=100,dl_dst=0e:00:00:00:00:01, priority 9099, cookie 0x5adc15c0
1227 goto_table:4
1228 4. ip,dl_vlan=100,nw_dst=10.200.0.1, priority 9131, cookie 0x5adc15c0
1229 set_field:4296->vlan_vid
1230 set_field:0e:00:00:00:00:01->eth_src
1231 set_field:00:10:20:30:40:50->eth_dst
1232 dec_ttl
1233 goto_table:7
1234 7. dl_vlan=200,dl_dst=00:10:20:30:40:50, priority 9099, cookie 0x5adc15c0
1235 pop_vlan
1236 output:4
1237
1238 Final flow: udp,in_port=1,vlan_tci=0x0000,dl_src=0e:00:00:00:00:01,dl_dst=00:10:20:30:40:50,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=63,tp_src=0,tp_dst=0
1239 Megaflow: recirc_id=0,eth,ip,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_dst=10.200.0.1,nw_ttl=64,nw_frag=no
1240 Datapath actions: set(eth(src=0e:00:00:00:00:01,dst=00:10:20:30:40:50)),set(ipv4(dst=10.200.0.1,ttl=63)),4
1241
1242Finally, we have working IP packet forwarding!
1243
1244Performance
1245~~~~~~~~~~~
1246
1247Take another look at the megaflow line above::
1248
1249 Megaflow: recirc_id=0,eth,ip,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_dst=10.200.0.1,nw_ttl=64,nw_frag=no
1250
1251This means that (almost) any packet between these Ethernet source and
1252destination hosts, destined to the given IP host, will be handled by
1253this single megaflow cache entry. So regardless of the number of UDP
1254packets or TCP connections that these hosts exchange, Open vSwitch
1255packet processing won't need to fall back to the slow path. It is
1256quite efficient.
1257
1258.. note::
1259
1260 The exceptions are packets with a TTL other than 64, and fragmented
1261 packets. Most hosts use a constant TTL for outgoing packets, and
1262 fragments are rare. If either of those did change, then that would
1263 simply result in a new megaflow cache entry.
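
To see which packets would share this cache entry, here is a small sketch
that treats the megaflow as a dictionary of required field values. It is a
simplification for illustration: the ``hits`` helper is hypothetical, and
the ``recirc_id`` and masked ``vlan_tci`` parts of the real entry are left
out.

```python
MEGAFLOW = {
    "in_port": 1,
    "dl_src": "00:01:02:03:04:05",
    "dl_dst": "0e:00:00:00:00:01",
    "dl_type": 0x0800,  # "ip" in the megaflow line
    "nw_dst": "10.200.0.1",
    "nw_ttl": 64,
    "nw_frag": "no",
}


def hits(packet, megaflow=MEGAFLOW):
    """True if every field the megaflow cares about has the right value."""
    return all(packet.get(f) == v for f, v in megaflow.items())


base = dict(MEGAFLOW, nw_src="10.100.0.1")
udp = dict(base, nw_proto=17, tp_src=5000, tp_dst=53)
tcp = dict(base, nw_proto=6, tp_src=40000, tp_dst=443)
other_ttl = dict(udp, nw_ttl=128)

print(hits(udp), hits(tcp), hits(other_ttl))  # True True False
```

The protocol and port numbers are wildcarded, so both the UDP and the TCP
packet hit the entry; only the packet with a different TTL misses.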
1264
1265The datapath actions might also be worth a look::
1266
1267 Datapath actions: set(eth(src=0e:00:00:00:00:01,dst=00:10:20:30:40:50)),set(ipv4(dst=10.200.0.1,ttl=63)),4
1268
1269This just means that, to process these packets, the datapath changes
1270the Ethernet source and destination addresses and the IP TTL, and then
1271transmits the packet to port ``p4`` (also numbered 4). Notice in
1272particular that, despite the OpenFlow actions that pushed, modified,
1273and popped back off a VLAN, there is nothing in the datapath actions
1274about VLANs. This is because the OVS flow translation code "optimizes
1275out" redundant or unneeded actions, which saves time when the cache
1276entry is executed later.
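
One way to picture this optimization, as a loose model only: apply the
OpenFlow actions to a working copy of the packet's fields and, at output
time, emit datapath actions only for fields whose final value differs from
what the packet already carries. The ``translate`` function below is a
hypothetical sketch of that idea; the real translation code in
``ovs-vswitchd`` is considerably more involved.

```python
def translate(base, openflow_actions):
    """Return datapath actions for a packet whose fields are in `base`."""
    flow = dict(base)       # working copy, edited by OpenFlow actions
    committed = dict(base)  # what the actual packet looks like so far
    datapath_actions = []
    for name, arg in openflow_actions:
        if name == "set":
            field, value = arg
            flow[field] = value
        elif name == "output":
            # Emit only the changes that are still visible at output time.
            for field in sorted(flow):
                if flow[field] != committed[field]:
                    datapath_actions.append(f"set({field}={flow[field]})")
                    committed[field] = flow[field]
            datapath_actions.append(str(arg))
    return datapath_actions


packet = {
    "vlan": None,  # untagged on arrival
    "eth_src": "00:01:02:03:04:05",
    "eth_dst": "0e:00:00:00:00:01",
    "nw_ttl": 64,
}
actions = [
    ("set", ("vlan", 100)),  # push_vlan + set_field:4196->vlan_vid
    ("set", ("vlan", 200)),  # set_field:4296->vlan_vid
    ("set", ("eth_src", "0e:00:00:00:00:01")),
    ("set", ("eth_dst", "00:10:20:30:40:50")),
    ("set", ("nw_ttl", 63)),  # dec_ttl
    ("set", ("vlan", None)),  # pop_vlan
    ("output", 4),
]
result = translate(packet, actions)
print(result)
```

In this model the VLAN tag ends up back where it started, so no VLAN
action is emitted; only the Ethernet address and TTL changes survive,
followed by the output to port 4.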
1277
1278.. note::
1279
1280 It's not clear why the actions also re-set the IP destination
1281 address to its original value. Perhaps this is a minor performance
1282 bug.
1283
1284ACLs
1285----
1286
1287Let's try out some ACLs, since they do a good job illustrating some of
1288the ways that OVS tries to optimize megaflows. Update
1289``inst/faucet.yaml`` to the following::
1290
    dps:
        switch-1:
            dp_id: 0x1
            timeout: 3600
            arp_neighbor_timeout: 3600
            interfaces:
                1:
                    native_vlan: 100
                    acl_in: 1
                2:
                    native_vlan: 100
                3:
                    native_vlan: 100
                4:
                    native_vlan: 200
                5:
                    native_vlan: 200
    vlans:
        100:
            faucet_vips: ["10.100.0.254/24"]
        200:
            faucet_vips: ["10.200.0.254/24"]
    routers:
        router-1:
            vlans: [100, 200]
    acls:
        1:
            - rule:
                dl_type: 0x800
                nw_proto: 6
                tcp_dst: 8080
                actions:
                    allow: 0
            - rule:
                actions:
                    allow: 1
1327
1328Then restart Faucet::
1329
1330 $ docker restart faucet
1331
1332On port 1, this new configuration blocks all traffic to TCP port 8080
1333and allows all other traffic. The resulting change in the flow table
1334shows this clearly too::
1335
1336 $ diff-flows flows2 br0
1337 -priority=9099,in_port=1 actions=goto_table:1
1338 +priority=9098,in_port=1 actions=goto_table:1
1339 +priority=9099,tcp,in_port=1,tp_dst=8080 actions=drop
1340
The most interesting question here is performance. If you recall the
earlier discussion, when a packet passing through the flow table
encounters a match on a given field, the resulting megaflow has to match
on that field, even if the flow didn't actually match. This is expensive.
1345
1346In particular, here you can see that any TCP packet is going to
1347encounter the ACL flow, even if it is directed to a port other than
13488080. If that means that every megaflow for a TCP packet is going to
1349have to match on the TCP destination, that's going to be bad for
1350caching performance because there will be a need for a separate
1351megaflow for every TCP destination port that actually appears in
1352traffic, which means a lot more megaflows than otherwise. (Really, in
1353practice, if such a simple ACL blew up performance, OVS wouldn't be a
1354very good switch!)
1355
1356Let's see what happens, by sending a packet to port 80 (instead of
13578080)::
1358
1359 $ ovs-appctl ofproto/trace br0 in_port=p1,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,tcp,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_ttl=64,tp_dst=80 -generate
    Flow: tcp,in_port=1,vlan_tci=0x0000,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
1361
1362 bridge("br0")
1363 -------------
1364 0. in_port=1, priority 9098, cookie 0x5adc15c0
1365 goto_table:1
1366 1. in_port=1,vlan_tci=0x0000/0x1fff, priority 9000, cookie 0x5adc15c0
1367 push_vlan:0x8100
1368 set_field:4196->vlan_vid
1369 goto_table:3
1370 3. ip,dl_vlan=100,dl_dst=0e:00:00:00:00:01, priority 9099, cookie 0x5adc15c0
1371 goto_table:4
1372 4. ip,dl_vlan=100,nw_dst=10.200.0.0/24, priority 9123, cookie 0x5adc15c0
1373 goto_table:6
     6. ip, priority 9130, cookie 0x5adc15c0
            CONTROLLER:128
1376
1377 Final flow: tcp,in_port=1,dl_vlan=100,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_src=10.100.0.1,nw_dst=10.200.0.1,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=0,tp_dst=80,tcp_flags=0
1378 Megaflow: recirc_id=0,eth,tcp,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=00:01:02:03:04:05,dl_dst=0e:00:00:00:00:01,nw_dst=10.200.0.1,nw_frag=no,tp_dst=0x0/0xf000
1379 Datapath actions: push_vlan(vid=100,pcp=0)

Take a look at the Megaflow line and in particular the match on
``tp_dst``, which says ``tp_dst=0x0/0xf000``. What this means is that
the megaflow matches on only the top 4 bits of the TCP destination
port. That works because::

      80 (base 10) == 0000,0000,0101,0000 (base 2)
    8080 (base 10) == 0001,1111,1001,0000 (base 2)

and so by matching on only the top 4 bits, rather than all 16, the OVS
fast path can distinguish port 80 from port 8080. This allows this
megaflow to match one-sixteenth of the TCP destination port address
space, rather than just 1/65536th of it.
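
The arithmetic above can be checked with a few lines of Python. This
is only an illustrative sketch of a masked match, not code from OVS;
the helper name ``megaflow_matches`` is made up for this example.

.. code-block:: python

    # Illustrative only: models the masked match tp_dst=0x0/0xf000 from
    # the Megaflow line.  The function name is hypothetical, not an OVS
    # API.
    VALUE, MASK = 0x0000, 0xF000


    def megaflow_matches(tp_dst, value=VALUE, mask=MASK):
        """Return True if tp_dst agrees with value on every masked bit."""
        return (tp_dst & mask) == (value & mask)


    print(megaflow_matches(80))    # port 80 falls inside this megaflow
    print(megaflow_matches(8080))  # port 8080 needs a different megaflow

    # Count how much of the 16-bit port space this one megaflow covers.
    covered = sum(megaflow_matches(port) for port in range(1 << 16))
    print(covered, covered / (1 << 16))

Running it shows that port 80 matches, port 8080 does not, and 4096
of the 65536 possible ports (one-sixteenth) share the megaflow.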

.. note::

   The algorithm OVS uses for this purpose isn't perfect. In this
   case, a single-bit match would work (e.g. ``tp_dst=0x0/0x1000``),
   and would be superior since it would match half the port address
   space instead of only one-sixteenth.

For details of this algorithm, please refer to ``lib/classifier.c`` in
the Open vSwitch source tree, or our 2015 NSDI paper "The Design and
Implementation of Open vSwitch".
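
The mask seen in the trace is consistent with a prefix-style rule:
keep every bit from the most significant one down to the first bit
where the two ports differ. The Python sketch below illustrates that
idea and compares it with the single-bit mask from the note. It is an
assumption made for illustration, not a transcription of
``lib/classifier.c``, and all of the function names are hypothetical.

.. code-block:: python

    WIDTH = 16  # TCP ports are 16 bits wide


    def first_diff_bit(a, b):
        """Index of the most significant bit in which a and b differ."""
        diff = a ^ b
        if diff == 0:
            raise ValueError("ports are identical; nothing to distinguish")
        return diff.bit_length() - 1


    def prefix_mask(a, b, width=WIDTH):
        """Mask covering the top bits through the first differing bit."""
        bit = first_diff_bit(a, b)
        full = (1 << width) - 1
        return full & ~((1 << bit) - 1)


    def single_bit_mask(a, b):
        """Mask covering only the most significant differing bit."""
        return 1 << first_diff_bit(a, b)


    def coverage(mask):
        """Fraction of the port space matched by one megaflow."""
        return 1 / (1 << bin(mask).count("1"))


    pm = prefix_mask(80, 8080)
    sm = single_bit_mask(80, 8080)
    print(hex(pm), coverage(pm))
    print(hex(sm), coverage(sm))

For ports 80 and 8080 the prefix mask comes out as ``0xf000``
(one-sixteenth of the port space), while the single-bit mask
``0x1000`` would cover half of it.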

Finishing Up
------------

When you're done, you probably want to exit the sandbox session, with
Control+D or ``exit``, and stop the Faucet controller with
``docker stop faucet; docker rm faucet``.

Further Directions
------------------

We've looked a fair bit at how Faucet interacts with Open vSwitch. If
you still have some interest, you might want to explore some of these
directions:

* Adding more than one switch. Faucet can control multiple switches
  but we've only been simulating one of them. It's easy enough to
  make a single OVS instance act as multiple switches (just
  ``ovs-vsctl add-br`` another bridge), or you could use genuinely
  separate OVS instances.

* Additional features. Faucet has more features than we've
  demonstrated, such as IPv6 routing and port mirroring. These should
  also interact gracefully with Open vSwitch.

* Real performance testing. We've looked at how flows and traces
  **should** demonstrate good performance, but of course there's no
  proof until it actually works in practice. We've also only tested
  with trivial configurations. Open vSwitch can scale to millions of
  OpenFlow flows, but the scaling in practice depends on the
  particular flow tables and traffic patterns, so it's valuable to
  test with large configurations, either in the way we've done it or
  with real traffic.