Commit | Line | Data |
---|---|---|
eeecce05 BP |
1 | Open vSwitch Advanced Features Tutorial |
2 | ======================================= | |
3 | ||
4 | Many tutorials cover the basics of OpenFlow. This is not such a | |
5 | tutorial. Rather, a knowledge of the basics of OpenFlow is a | |
6 | prerequisite. If you do not already understand how an OpenFlow flow | |
7 | table works, please go read a basic tutorial and then continue reading | |
8 | here afterward. | |
9 | ||
10 | It is also important to understand the basics of Open vSwitch before | |
a0ddac4b TG |
11 | you begin. If you have never used `ovs-vsctl` or `ovs-ofctl` before, |
12 | you should learn a little about them before proceeding. | |
eeecce05 BP |
13 | |
14 | Most of the features covered in this tutorial are Open vSwitch | |
15 | extensions to OpenFlow. Also, most of the features in this tutorial | |
16 | are specific to the software Open vSwitch implementation. If you are | |
17 | using an Open vSwitch port to an ASIC-based hardware switch, this | |
18 | tutorial will not help you. | |
19 | ||
20 | This tutorial does not cover every aspect of the features that it | |
21 | mentions. You can find the details elsewhere in the Open vSwitch | |
a0ddac4b TG |
22 | documentation, especially `ovs-ofctl(8)` and the comments in the |
23 | `include/openflow/nicira-ext.h` header file. | |
eeecce05 | 24 | |
a0ddac4b TG |
25 | > In this tutorial, paragraphs set off like this designate notes |
26 | > with additional information that readers may wish to skip on a | |
27 | > first read. | |
eeecce05 BP |
28 | |
29 | Getting Started | |
542cc9bb | 30 | --------------- |
eeecce05 BP |
31 | |
32 | This is a hands-on tutorial. To get the most out of it, you will need | |
33 | Open vSwitch binaries. You do not, on the other hand, need any | |
34 | physical networking hardware or even supervisor privilege on your | |
a0ddac4b | 35 | system. Instead, we will use a script called `ovs-sandbox`, which |
eeecce05 BP |
36 | accompanies the tutorial, that constructs a software-simulated network |
37 | environment based on Open vSwitch. | |
38 | ||
542cc9bb | 39 | You can use `ovs-sandbox` three ways: |
eeecce05 | 40 | |
542cc9bb TG |
41 | * If you have already installed Open vSwitch on your system, then |
42 | you should be able to just run `ovs-sandbox` from this directory | |
43 | without any options. | |
eeecce05 | 44 | |
542cc9bb TG |
45 | * If you have not installed Open vSwitch (and you do not want to |
46 | install it), then you can build Open vSwitch according to the | |
a0ddac4b | 47 | instructions in [INSTALL.md], without installing it. Then run |
542cc9bb | 48 | `./ovs-sandbox -b DIRECTORY` from this directory, substituting |
a0ddac4b | 49 | the Open vSwitch build directory for `DIRECTORY`. |
eeecce05 | 50 | |
542cc9bb TG |
51 | * As a slight variant on the latter, you can run `make sandbox` |
52 | from an Open vSwitch build directory. | |
eeecce05 | 53 | |
a0ddac4b | 54 | When you run `ovs-sandbox`, it does the following: |
eeecce05 | 55 | |
a0ddac4b | 56 | 1. **CAUTION:** Deletes any subdirectory of the current directory |
542cc9bb | 57 | named "sandbox" and any files in that directory. |
eeecce05 | 58 | |
542cc9bb | 59 | 2. Creates a new directory "sandbox" in the current directory. |
eeecce05 | 60 | |
542cc9bb TG |
61 | 3. Sets up special environment variables that ensure that Open |
62 | vSwitch programs will look inside the "sandbox" directory | |
63 | instead of in the Open vSwitch installation directory. | |
eeecce05 | 64 | |
542cc9bb TG |
65 | 4. If you are using a built but not installed Open vSwitch, |
66 | installs the Open vSwitch manpages in a subdirectory of | |
a0ddac4b | 67 | "sandbox" and adjusts the `MANPATH` environment variable to point |
542cc9bb | 68 | to this directory. This means that you can use, for example, |
a0ddac4b | 69 | `man ovs-vsctl` to see a manpage for the `ovs-vsctl` program that |
542cc9bb | 70 | you built. |
eeecce05 | 71 | |
542cc9bb TG |
72 | 5. Creates an empty Open vSwitch configuration database under |
73 | "sandbox". | |
eeecce05 | 74 | |
a0ddac4b | 75 | 6. Starts `ovsdb-server` running under "sandbox". |
eeecce05 | 76 | |
a0ddac4b | 77 | 7. Starts `ovs-vswitchd` running under "sandbox", passing special |
542cc9bb | 78 | options that enable a special "dummy" mode for testing. |
eeecce05 | 79 | |
542cc9bb | 80 | 8. Starts a nested interactive shell inside "sandbox". |
eeecce05 BP |
81 | |
82 | At this point, you can run all the usual Open vSwitch utilities from | |
a0ddac4b TG |
83 | the nested shell environment. You can, for example, use `ovs-vsctl` |
84 | to create a bridge: | |
eeecce05 BP |
85 | |
86 | ovs-vsctl add-br br0 | |
87 | ||
88 | From Open vSwitch's perspective, the bridge that you create this way | |
89 | is as real as any other. You can, for example, connect it to an | |
a0ddac4b | 90 | OpenFlow controller or use `ovs-ofctl` to examine and modify it and |
eeecce05 | 91 | its OpenFlow flow table. On the other hand, the bridge is not visible |
a0ddac4b TG |
92 | to the operating system's network stack, so `ifconfig` or `ip` cannot |
93 | see it or affect it, which means that utilities like `ping` and | |
94 | `tcpdump` will not work either. (That has its good side, too: you | |
eeecce05 BP |
95 | can't screw up your computer's network stack by manipulating a |
96 | sandboxed OVS.) | |
97 | ||
98 | When you're done using OVS from the sandbox, exit the nested shell (by | |
99 | entering the "exit" shell command or pressing Control+D). This will | |
a0ddac4b | 100 | kill the daemons that `ovs-sandbox` started, but it leaves the "sandbox" |
eeecce05 BP |
101 | directory and its contents in place. |
102 | ||
103 | The sandbox directory contains log files for the Open vSwitch daemons. |
104 | You can examine them while you're running in the sandboxed environment | |
105 | or after you exit. | |
106 | ||
8da7cd8c AZ |
107 | Using GDB |
108 | --------- | |
109 | ||
110 | GDB support is not required to go through the tutorial. It is provided in case |
111 | you want to explore the internals of OVS programs. | |
112 | ||
113 | GDB can already be used to debug any running process, with the usual | |
114 | 'gdb <program> <process-id>' command. | |
115 | ||
116 | 'ovs-sandbox' also has a '-g' option for launching ovs-vswitchd under GDB. | |
117 | This option can be handy for setting break points before ovs-vswitchd runs, | |
4b814d41 AZ |
118 | or for catching early segfaults. Similarly, a '-d' option can be used to |
119 | run ovsdb-server under GDB. Both options can be specified at the same time. | |
8da7cd8c | 120 | |
60ceeb6c AZ |
121 | In addition, a '-e' option also launches ovs-vswitchd under GDB. However, |
122 | instead of displaying a '(gdb)' prompt and waiting for user input, ovs-vswitchd | |
123 | will start to execute immediately. The '-r' option is the corresponding option | |
124 | for running ovsdb-server under GDB with immediate execution. | |
125 | ||
8da7cd8c AZ |
126 | To keep GDB from interfering with the sandbox subshell's terminal, 'ovs-sandbox' |
127 | starts a new xterm to run each GDB session. On systems that do not support | |
128 | X windows, GDB support is effectively disabled. | |
129 | ||
130 | When launching the sandbox through the build tree's makefile, the '-g' option | |
131 | can be passed via the 'SANDBOXFLAGS' environment variable. | |
132 | 'make sandbox SANDBOXFLAGS=-g' will start the sandbox with ovs-vswitchd | |
133 | running under GDB in its own xterm if X is available. | |
eeecce05 BP |
134 | |
135 | Motivation | |
542cc9bb | 136 | ---------- |
eeecce05 BP |
137 | |
138 | The goal of this tutorial is to demonstrate the power of Open vSwitch | |
139 | flow tables. The tutorial works through the implementation of a | |
140 | MAC-learning switch with VLAN trunk and access ports. Outside of the | |
141 | Open vSwitch features that we will discuss, OpenFlow provides at least | |
142 | two ways to implement such a switch: | |
143 | ||
542cc9bb TG |
144 | 1. An OpenFlow controller to implement MAC learning in a |
145 | "reactive" fashion. Whenever a new MAC appears on the switch, | |
146 | or a MAC moves from one switch port to another, the controller | |
147 | adjusts the OpenFlow flow table to match. | |
eeecce05 | 148 | |
542cc9bb TG |
149 | 2. The "normal" action. OpenFlow defines this action to submit a |
150 | packet to "the traditional non-OpenFlow pipeline of the | |
151 | switch". That is, if a flow uses this action, then the packets | |
152 | in the flow go through the switch in the same way that they | |
153 | would if OpenFlow was not configured on the switch. | |
eeecce05 BP |
154 | |
155 | Each of these approaches has unfortunate pitfalls. The first | |
156 | approach, using an OpenFlow controller to implement MAC learning, has | |
157 | a significant cost in terms of network bandwidth and latency. It also | |
158 | makes the controller more difficult to scale to large numbers of | |
159 | switches, which is especially important in environments with thousands | |
160 | of hypervisors (each of which contains a virtual OpenFlow switch). | |
161 | MAC learning at an OpenFlow controller also behaves poorly if the | |
162 | OpenFlow controller fails, slows down, or becomes unavailable due to | |
163 | network problems. | |
164 | ||
165 | The second approach, using the "normal" action, has different | |
166 | problems. First, little about the "normal" action is standardized, so | |
167 | it behaves differently on switches from different vendors, and the | |
168 | available features and how those features are configured (usually not | |
169 | through OpenFlow) vary widely. Second, "normal" does not work well | |
170 | with other OpenFlow actions. It is "all-or-nothing", with little | |
171 | potential to adjust its behavior slightly or to compose it with other | |
172 | features. | |
173 | ||
174 | ||
175 | Scenario | |
542cc9bb | 176 | -------- |
eeecce05 BP |
177 | |
178 | We will construct Open vSwitch flow tables for a VLAN-capable, | |
179 | MAC-learning switch that has four ports: | |
180 | ||
542cc9bb | 181 | * p1, a trunk port that carries all VLANs, on OpenFlow port 1. |
eeecce05 | 182 | |
542cc9bb | 183 | * p2, an access port for VLAN 20, on OpenFlow port 2. |
eeecce05 | 184 | |
542cc9bb TG |
185 | * p3 and p4, both access ports for VLAN 30, on OpenFlow ports 3 |
186 | and 4, respectively. | |
eeecce05 | 187 | |
a0ddac4b TG |
188 | > The ports' names are not significant. You could call them eth1 |
189 | > through eth4, or any other names you like. | |
eeecce05 | 190 | |
a0ddac4b TG |
191 | > An OpenFlow switch always has a "local" port as well. This |
192 | > scenario won't use the local port. | |
eeecce05 BP |
193 | |
194 | Our switch design will consist of five main flow tables, each of which | |
195 | implements one stage in the switch pipeline: | |
196 | ||
542cc9bb | 197 | Table 0: Admission control. |
eeecce05 | 198 | |
542cc9bb | 199 | Table 1: VLAN input processing. |
eeecce05 | 200 | |
542cc9bb | 201 | Table 2: Learn source MAC and VLAN for ingress port. |
eeecce05 | 202 | |
542cc9bb | 203 | Table 3: Look up learned port for destination MAC and VLAN. |
eeecce05 | 204 | |
542cc9bb | 205 | Table 4: Output processing. |
eeecce05 BP |
206 | |
207 | The section below describes how to set up the scenario, followed by a | |
208 | section for each OpenFlow table. | |
209 | ||
a0ddac4b TG |
210 | You can cut and paste the `ovs-vsctl` and `ovs-ofctl` commands in each |
211 | of the sections below into your `ovs-sandbox` shell. They are also | |
212 | available as shell scripts in this directory, named `t-setup`, `t-stage0`, | |
213 | `t-stage1`, ..., `t-stage4`. The `ovs-appctl` test commands are intended | |
eeecce05 BP |
214 | for cutting and pasting and are not supplied separately. |
215 | ||
216 | ||
217 | Setup | |
542cc9bb | 218 | ----- |
eeecce05 | 219 | |
a0ddac4b | 220 | To get started, start `ovs-sandbox`. Inside the interactive shell |
eeecce05 BP |
221 | that it starts, run this command: |
222 | ||
223 | ovs-vsctl add-br br0 -- set Bridge br0 fail-mode=secure | |
224 | ||
225 | This command creates a new bridge "br0" and puts "br0" into so-called | |
226 | "fail-secure" mode. For our purpose, this just means that the | |
227 | OpenFlow flow table starts out empty. | |
228 | ||
a0ddac4b TG |
229 | > If we did not do this, then the flow table would start out with a |
230 | > single flow that executes the "normal" action. We could use that | |
231 | > feature to yield a switch that behaves the same as the switch we | |
232 | > are currently building, but with the caveats described under | |
233 | > "Motivation" above. | |
eeecce05 BP |
234 | |
235 | The new bridge has only one port on it so far, the "local port" br0. | |
236 | We need to add p1, p2, p3, and p4. A shell "for" loop is one way to | |
237 | do it: | |
238 | ||
239 | for i in 1 2 3 4; do | |
240 | ovs-vsctl add-port br0 p$i -- set Interface p$i ofport_request=$i | |
542cc9bb | 241 | ovs-ofctl mod-port br0 p$i up |
eeecce05 BP |
242 | done |
243 | ||
a0ddac4b | 244 | In addition to adding a port, the `ovs-vsctl` command above sets its |
eeecce05 BP |
245 | "ofport_request" column to ensure that port p1 is assigned OpenFlow |
246 | port 1, p2 is assigned OpenFlow port 2, and so on. | |
247 | ||
a0ddac4b TG |
248 | > We could omit setting the ofport_request and let Open vSwitch |
249 | > choose port numbers for us, but setting it is convenient for the purposes | |
250 | > of this tutorial because we can talk about OpenFlow port 1 and | |
251 | > know that it corresponds to p1. | |
eeecce05 | 252 | |
a0ddac4b | 253 | The `ovs-ofctl` command above brings up the simulated interfaces, which |
eeecce05 | 254 | are down initially, using an OpenFlow request. The effect is similar |
a0ddac4b TG |
255 | to `ifconfig up`, but the sandbox's interfaces are not visible to the |
256 | operating system and therefore `ifconfig` would not affect them. | |
eeecce05 BP |
257 | |
258 | We have not configured anything related to VLANs or MAC learning. | |
259 | That's because we're going to implement those features in the flow | |
260 | table. | |
261 | ||
262 | To see what we've done so far to set up the scenario, you can run a | |
a0ddac4b | 263 | command like `ovs-vsctl show` or `ovs-ofctl show br0`. |
eeecce05 BP |
264 | |
265 | ||
266 | Implementing Table 0: Admission control | |
542cc9bb | 267 | --------------------------------------- |
eeecce05 BP |
268 | |
269 | Table 0 is where packets enter the switch. We use this stage to | |
270 | discard packets that for one reason or another are invalid. For | |
271 | example, packets with a multicast source address are not valid, so we | |
272 | can add a flow to drop them at ingress to the switch with: | |
273 | ||
274 | ovs-ofctl add-flow br0 \ | |
275 | "table=0, dl_src=01:00:00:00:00:00/01:00:00:00:00:00, actions=drop" | |
276 | ||
277 | A switch should also not forward IEEE 802.1D Spanning Tree Protocol | |
278 | (STP) packets, so we can also add a flow to drop those and other | |
279 | packets with reserved multicast protocols: | |
280 | ||
281 | ovs-ofctl add-flow br0 \ | |
f0ac9da9 | 282 | "table=0, dl_dst=01:80:c2:00:00:00/ff:ff:ff:ff:ff:f0, actions=drop" |
eeecce05 BP |
283 | |
284 | We could add flows to drop other protocols, but these demonstrate the | |
285 | pattern. | |
286 | ||
287 | We need one more flow, with a priority lower than the default, so that | |
288 | packets that don't match either of the "drop" flows we added above go on | |
289 | to pipeline stage 1 in OpenFlow table 1: | |
290 | ||
291 | ovs-ofctl add-flow br0 "table=0, priority=0, actions=resubmit(,1)" | |
292 | ||
293 | (The "resubmit" action is an Open vSwitch extension to OpenFlow.) | |
294 | ||
295 | ||
542cc9bb | 296 | ### Testing Table 0 |
eeecce05 BP |
297 | |
298 | If we were using Open vSwitch to set up a physical or a virtual | |
299 | switch, then we would naturally test it by sending packets through it | |
300 | one way or another, perhaps with common network testing tools like | |
a0ddac4b | 301 | `ping` and `tcpdump` or more specialized tools like Scapy. That's |
eeecce05 BP |
302 | difficult with our simulated switch, since it's not visible to the |
303 | operating system. | |
304 | ||
4ff8998c | 305 | But our simulated switch has a few specialized testing tools. The |
a0ddac4b TG |
306 | most powerful of these tools is `ofproto/trace`. Given a switch and |
307 | the specification of a flow, `ofproto/trace` shows, step-by-step, how | |
eeecce05 BP |
308 | such a flow would be treated as it goes through the switch. |
309 | ||
310 | ||
542cc9bb | 311 | ### EXAMPLE 1 |
eeecce05 BP |
312 | |
313 | Try this command: | |
314 | ||
f0ac9da9 | 315 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=01:80:c2:00:00:05 |
eeecce05 BP |
316 | |
317 | The output should look something like this: | |
318 | ||
f0ac9da9 BP |
319 | Flow: metadata=0,in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=01:80:c2:00:00:05,dl_type=0x0000 |
320 | Rule: table=0 cookie=0 dl_dst=01:80:c2:00:00:00/ff:ff:ff:ff:ff:f0 | |
eeecce05 BP |
321 | OpenFlow actions=drop |
322 | ||
323 | Final flow: unchanged | |
324 | Datapath actions: drop | |
325 | ||
326 | The first block of lines describes an OpenFlow table lookup. The | |
327 | first line shows the fields used for the table lookup (which is mostly | |
328 | zeros because that's the default if we don't specify everything). The | |
329 | second line gives the OpenFlow flow that the fields matched (called a | |
330 | "rule" because that is the name used inside Open vSwitch for an | |
331 | OpenFlow flow). In this case, we see that this packet that has a | |
332 | reserved multicast destination address matches the rule that drops | |
333 | those packets. The third line gives the rule's OpenFlow actions. | |
334 | ||
335 | The second block of lines summarizes the results, which are not very | |
336 | interesting here. | |
337 | ||
338 | ||
542cc9bb | 339 | ### EXAMPLE 2 |
eeecce05 BP |
340 | |
341 | Try another command: | |
342 | ||
f0ac9da9 | 343 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=01:80:c2:00:00:10 |
eeecce05 BP |
344 | |
345 | The output should be: | |
346 | ||
f0ac9da9 | 347 | Flow: metadata=0,in_port=1,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=01:80:c2:00:00:10,dl_type=0x0000 |
eeecce05 BP |
348 | Rule: table=0 cookie=0 priority=0 |
349 | OpenFlow actions=resubmit(,1) | |
350 | ||
351 | Resubmitted flow: unchanged | |
352 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
353 | Resubmitted odp: drop | |
354 | No match | |
355 | ||
356 | Final flow: unchanged | |
357 | Datapath actions: drop | |
358 | ||
a0ddac4b | 359 | This time the flow we handed to `ofproto/trace` doesn't match any of |
eeecce05 BP |
360 | our "drop" rules, so it falls through to the low-priority "resubmit" |
361 | rule, which we see in the rule and the actions selected in the first | |
362 | block. The "resubmit" causes a second lookup in OpenFlow table 1, | |
363 | described by the additional block of indented text in the output. We | |
364 | haven't yet added any flows to OpenFlow table 1, so no flow actually | |
365 | matches in the second lookup. Therefore, the packet is still actually | |
366 | dropped, which means that the externally observable results would be | |
367 | identical to our first example. | |
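
To see why the two traces above behave differently, it can help to work the masked match by hand. The following shell sketch is not part of Open vSwitch: `mac_to_int` and `mac_matches` are hypothetical helper names, and the sketch assumes a shell with 64-bit arithmetic (true of common builds of bash and dash). It applies the usual rule for a value/mask match, which succeeds when the packet's field and the flow's value agree on every bit that is set in the mask.

```shell
# Hypothetical helpers for illustration only; they do not talk to OVS.

# Print MAC address $1 as a single integer.
mac_to_int() {
    # Strip the colons and read the 12 hex digits as one 48-bit number.
    hex=$(printf '%s' "$1" | tr -d ':')
    echo $((0x$hex))
}

# Succeed if MAC $1 matches VALUE $2 under MASK $3:
# (mac & mask) == (value & mask).
mac_matches() {
    mac=$(mac_to_int "$1")
    value=$(mac_to_int "$2")
    mask=$(mac_to_int "$3")
    [ $((mac & mask)) -eq $((value & mask)) ]
}

# The reserved-multicast match from table 0, applied to both trace examples.
for dst in 01:80:c2:00:00:05 01:80:c2:00:00:10; do
    if mac_matches "$dst" 01:80:c2:00:00:00 ff:ff:ff:ff:ff:f0; then
        echo "$dst: matches the reserved-multicast drop flow"
    else
        echo "$dst: does not match, falls through to priority=0"
    fi
done
```

The mask `ff:ff:ff:ff:ff:f0` ignores only the low four bits, so `...:05` matches while `...:10` does not. The multicast-source flow uses the same rule with a mask that selects a single bit.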
368 | ||
369 | ||
370 | Implementing Table 1: VLAN Input Processing | |
542cc9bb | 371 | ------------------------------------------- |
eeecce05 BP |
372 | |
373 | A packet that enters table 1 has already passed basic validation in | |
374 | table 0. The purpose of table 1 is to validate the packet's VLAN, based | |
375 | on the VLAN configuration of the switch port through which the packet | |
376 | entered the switch. We will also use it to attach a VLAN header to | |
377 | packets that arrive on an access port, which allows later processing | |
378 | stages to rely on the packet's VLAN always being part of the VLAN | |
379 | header, reducing special cases. | |
380 | ||
381 | Let's start by adding a low-priority flow that drops all packets, | |
382 | before we add flows that pass through acceptable packets. You can | |
383 | think of this as a "default drop" rule: | |
384 | ||
385 | ovs-ofctl add-flow br0 "table=1, priority=0, actions=drop" | |
386 | ||
387 | Our trunk port p1, on OpenFlow port 1, is an easy case. p1 accepts | |
388 | any packet regardless of whether it has a VLAN header or what the VLAN | |
389 | was, so we can add a flow that resubmits everything on input port 1 to | |
390 | the next table: | |
391 | ||
392 | ovs-ofctl add-flow br0 \ | |
393 | "table=1, priority=99, in_port=1, actions=resubmit(,2)" | |
394 | ||
395 | On the access ports, we want to accept any packet that has no VLAN | |
396 | header, tag it with the access port's VLAN number, and then pass it | |
397 | along to the next stage: | |
398 | ||
399 | ovs-ofctl add-flows br0 - <<'EOF' | |
542cc9bb TG |
400 | table=1, priority=99, in_port=2, vlan_tci=0, actions=mod_vlan_vid:20, resubmit(,2) |
401 | table=1, priority=99, in_port=3, vlan_tci=0, actions=mod_vlan_vid:30, resubmit(,2) | |
402 | table=1, priority=99, in_port=4, vlan_tci=0, actions=mod_vlan_vid:30, resubmit(,2) | |
403 | EOF | |
eeecce05 BP |
404 | |
405 | We don't write any rules that match packets with an 802.1Q header that enter | |
406 | this stage on any of the access ports, so the "default drop" rule we | |
407 | added earlier causes them to be dropped, which is ordinarily what we | |
408 | want for access ports. | |
409 | ||
a0ddac4b TG |
410 | > Another variation of access ports allows ingress of packets tagged |
411 | > with VLAN 0 (aka 802.1p priority tagged packets). To allow such | |
412 | > packets, replace "vlan_tci=0" by "vlan_tci=0/0xfff" above. | |
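
As a rough mental model of the table 1 flows above, the decision can be written as a small shell function. This is only a sketch for reasoning about the rules, not something OVS runs: `table1` is a hypothetical name, and it models the exact `vlan_tci=0` match rather than the priority-tagged variation from the note.

```shell
# Toy model of table 1 (illustration only).
# $1 is the OpenFlow input port, $2 is the packet's vlan_tci.
# Prints the VLAN ID that later stages would see, or "drop".
table1() {
    in_port=$1
    tci=$(($2))
    case "$in_port" in
        1)   # Trunk port: accept anything and keep the packet's own VLAN ID.
            echo $((tci & 0xfff)) ;;
        2)   # Access port for VLAN 20: only packets with vlan_tci=0 pass.
            if [ "$tci" -eq 0 ]; then echo 20; else echo drop; fi ;;
        3|4) # Access ports for VLAN 30.
            if [ "$tci" -eq 0 ]; then echo 30; else echo drop; fi ;;
        *)   # Anything else hits the priority=0 "default drop" flow.
            echo drop ;;
    esac
}

table1 1 5    # trunk, tagged with VLAN 5
table1 2 0    # access port, untagged
table1 2 5    # access port, tagged
```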
eeecce05 BP |
413 | |
414 | ||
542cc9bb | 415 | ### Testing Table 1 |
eeecce05 | 416 | |
a0ddac4b | 417 | `ofproto/trace` allows us to test the ingress VLAN rules that we added |
eeecce05 BP |
418 | above. |
419 | ||
420 | ||
542cc9bb | 421 | ### EXAMPLE 1: Packet on Trunk Port |
eeecce05 BP |
422 | |
423 | Here's a test of a packet coming in on the trunk port: | |
424 | ||
425 | ovs-appctl ofproto/trace br0 in_port=1,vlan_tci=5 | |
426 | ||
427 | The output shows the lookup in table 0, the resubmit to table 1, and | |
428 | the resubmit to table 2 (which does nothing because we haven't put | |
429 | anything there yet): | |
430 | ||
431 | Flow: metadata=0,in_port=1,vlan_tci=0x0005,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000 | |
432 | Rule: table=0 cookie=0 priority=0 | |
433 | OpenFlow actions=resubmit(,1) | |
434 | ||
435 | Resubmitted flow: unchanged | |
436 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
437 | Resubmitted odp: drop | |
438 | Rule: table=1 cookie=0 priority=99,in_port=1 | |
439 | OpenFlow actions=resubmit(,2) | |
440 | ||
441 | Resubmitted flow: unchanged | |
442 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
443 | Resubmitted odp: drop | |
444 | No match | |
445 | ||
446 | Final flow: unchanged | |
447 | Datapath actions: drop | |
448 | ||
449 | ||
542cc9bb | 450 | ### EXAMPLE 2: Valid Packet on Access Port |
eeecce05 BP |
451 | |
452 | Here's a test of a valid packet (a packet without an 802.1Q header) | |
453 | coming in on access port p2: | |
454 | ||
455 | ovs-appctl ofproto/trace br0 in_port=2 | |
456 | ||
457 | The output is similar to that for the previous case, except that it | |
458 | additionally tags the packet with p2's VLAN 20 before it passes it | |
459 | along to table 2: | |
460 | ||
461 | Flow: metadata=0,in_port=2,vlan_tci=0x0000,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000 | |
462 | Rule: table=0 cookie=0 priority=0 | |
463 | OpenFlow actions=resubmit(,1) | |
464 | ||
465 | Resubmitted flow: unchanged | |
466 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
467 | Resubmitted odp: drop | |
468 | Rule: table=1 cookie=0 priority=99,in_port=2,vlan_tci=0x0000 | |
469 | OpenFlow actions=mod_vlan_vid:20,resubmit(,2) | |
470 | ||
471 | Resubmitted flow: metadata=0,in_port=2,dl_vlan=20,dl_vlan_pcp=0,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000 | |
472 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
473 | Resubmitted odp: drop | |
474 | No match | |
475 | ||
476 | Final flow: unchanged | |
477 | Datapath actions: drop | |
478 | ||
479 | ||
542cc9bb | 480 | ### EXAMPLE 3: Invalid Packet on Access Port |
eeecce05 BP |
481 | |
482 | This tests an invalid packet (one that includes an 802.1Q header) | |
483 | coming in on access port p2: | |
484 | ||
485 | ovs-appctl ofproto/trace br0 in_port=2,vlan_tci=5 | |
486 | ||
487 | The output shows the packet matching the default drop rule: | |
488 | ||
489 | Flow: metadata=0,in_port=2,vlan_tci=0x0005,dl_src=00:00:00:00:00:00,dl_dst=00:00:00:00:00:00,dl_type=0x0000 | |
490 | Rule: table=0 cookie=0 priority=0 | |
491 | OpenFlow actions=resubmit(,1) | |
492 | ||
493 | Resubmitted flow: unchanged | |
494 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
495 | Resubmitted odp: drop | |
496 | Rule: table=1 cookie=0 priority=0 | |
497 | OpenFlow actions=drop | |
498 | ||
499 | Final flow: unchanged | |
500 | Datapath actions: drop | |
501 | ||
502 | ||
503 | Implementing Table 2: MAC+VLAN Learning for Ingress Port | |
542cc9bb | 504 | -------------------------------------------------------- |
eeecce05 BP |
505 | |
506 | This table allows the switch we're implementing to learn that the | |
507 | packet's source MAC is located on the packet's ingress port in the | |
508 | packet's VLAN. | |
509 | ||
a0ddac4b TG |
510 | > This table is a good example of why table 1 added a VLAN tag to |
511 | > packets that entered the switch through an access port. We want | |
512 | > to associate a MAC+VLAN with a port regardless of whether the VLAN | |
513 | > in question was originally part of the packet or whether it was an | |
514 | > assumed VLAN associated with an access port. | |
eeecce05 BP |
515 | |
516 | It only takes a single flow to do this. The following command adds | |
517 | it: | |
518 | ||
519 | ovs-ofctl add-flow br0 \ | |
520 | "table=2 actions=learn(table=10, NXM_OF_VLAN_TCI[0..11], \ | |
521 | NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], \ | |
522 | load:NXM_OF_IN_PORT[]->NXM_NX_REG0[0..15]), \ | |
523 | resubmit(,3)" | |
524 | ||
525 | The "learn" action (an Open vSwitch extension to OpenFlow) modifies a | |
526 | flow table based on the content of the flow currently being processed. | |
527 | Here's how you can interpret each part of the "learn" action above: | |
528 | ||
529 | table=10 | |
530 | ||
531 | Modify flow table 10. This will be the MAC learning table. | |
532 | ||
533 | NXM_OF_VLAN_TCI[0..11] | |
534 | ||
535 | Make the flow that we add to flow table 10 match the same VLAN | |
536 | ID that the packet we're currently processing contains. This | |
537 | effectively scopes the MAC learning entry to a single VLAN, | |
4ff8998c | 538 | which is the ordinary behavior for a VLAN-aware switch. |
eeecce05 BP |
539 | |
540 | NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[] | |
541 | ||
542 | Make the flow that we add to flow table 10 match, as Ethernet | |
543 | destination, the Ethernet source address of the packet we're | |
544 | currently processing. | |
545 | ||
546 | load:NXM_OF_IN_PORT[]->NXM_NX_REG0[0..15] | |
547 | ||
548 | Whereas the preceding parts specify fields for the new flow to | |
549 | match, this specifies an action for the flow to take when it | |
550 | matches. The action is for the flow to load the ingress port | |
551 | number of the current packet into register 0 (a special field | |
552 | that is an Open vSwitch extension to OpenFlow). | |
553 | ||
a0ddac4b TG |
554 | > A real use of "learn" for MAC learning would probably involve two |
555 | > additional elements. First, the "learn" action would specify a | |
556 | > hard_timeout for the new flow, to enable a learned MAC to | |
557 | > eventually expire if no new packets were seen from a given source | |
558 | > within a reasonable interval. Second, one would usually want to | |
559 | > limit resource consumption by using the Flow_Table table in the | |
560 | > Open vSwitch configuration database to specify a maximum number of | |
561 | > flows in table 10. | |
eeecce05 BP |
562 | |
563 | This definitely calls for examples. | |
564 | ||
565 | ||
542cc9bb | 566 | ### Testing Table 2 |
eeecce05 | 567 | |
542cc9bb | 568 | ### EXAMPLE 1 |
eeecce05 BP |
569 | |
570 | Try the following test command: | |
571 | ||
572 | ovs-appctl ofproto/trace br0 in_port=1,vlan_tci=20,dl_src=50:00:00:00:00:01 -generate | |
573 | ||
574 | The output shows that "learn" was executed, but it isn't otherwise | |
575 | informative, so we won't include it here. | |
576 | ||
a0ddac4b | 577 | The `-generate` keyword is new. Ordinarily, `ofproto/trace` has no |
eeecce05 BP |
578 | side effects: "output" actions do not actually output packets, "learn" |
579 | actions do not actually modify the flow table, and so on. With | |
a0ddac4b | 580 | `-generate`, though, `ofproto/trace` does execute "learn" actions. |
eeecce05 BP |
581 | That's important now, because we want to see the effect of the "learn" |
582 | action on table 10. You can see that by running: | |
583 | ||
584 | ovs-ofctl dump-flows br0 table=10 | |
585 | ||
586 | which (omitting the "duration" and "idle_age" fields, which will vary | |
587 | based on how soon you ran this command after the previous one, as well | |
588 | as some other uninteresting fields) prints something like: | |
589 | ||
590 | NXST_FLOW reply (xid=0x4): | |
591 | table=10, vlan_tci=0x0014/0x0fff,dl_dst=50:00:00:00:00:01 actions=load:0x1->NXM_NX_REG0[0..15] | |
592 | ||
593 | You can see that the packet coming in on VLAN 20 with source MAC | |
594 | 50:00:00:00:00:01 became a flow that matches VLAN 20 (written in | |
595 | hexadecimal) and destination MAC 50:00:00:00:00:01. The flow loads | |
596 | port number 1, the input port for the flow we tested, into register 0. | |
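
If the hexadecimal `vlan_tci=0x0014/0x0fff` is hard to read at a glance, shell arithmetic can decode it. The sketch below assumes the standard 802.1Q layout of the 16-bit TCI (3 bits of priority, 1 CFI bit, 12 bits of VLAN ID); `decode_tci` is a hypothetical helper name.

```shell
# Split a 16-bit VLAN TCI into its 802.1Q fields (illustration only).
decode_tci() {
    tci=$(($1))
    echo "pcp=$((tci >> 13)) cfi=$(( (tci >> 12) & 1 )) vid=$((tci & 0xfff))"
}

decode_tci 0x0014        # the value from the learned flow

# Going the other way: VLAN 20 written as a 4-digit hexadecimal TCI.
printf '0x%04x\n' 20
```

The mask `0x0fff` in the learned flow covers exactly the 12 VLAN ID bits, which is what `NXM_OF_VLAN_TCI[0..11]` asked for.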
597 | ||
598 | ||
542cc9bb | 599 | ### EXAMPLE 2 |
eeecce05 BP |
600 | |
601 | Here's a second test command: | |
602 | ||
603 | ovs-appctl ofproto/trace br0 in_port=2,dl_src=50:00:00:00:00:01 -generate | |
604 | ||
605 | The flow that this command tests has the same source MAC and VLAN as | |
606 | example 1, although the VLAN comes from an access port VLAN rather | |
607 | than an 802.1Q header. If we again dump the flows for table 10 with: | |
608 | ||
609 | ovs-ofctl dump-flows br0 table=10 | |
610 | ||
611 | then we see that the flow we saw previously has changed to indicate | |
612 | that the learned port is port 2, as we would expect: | |
613 | ||
614 | NXST_FLOW reply (xid=0x4): | |
615 | table=10, vlan_tci=0x0014/0x0fff,dl_dst=50:00:00:00:00:01 actions=load:0x2->NXM_NX_REG0[0..15] | |
616 | ||
617 | ||
618 | Implementing Table 3: Look Up Destination Port | |
542cc9bb | 619 | ---------------------------------------------- |
eeecce05 BP |
620 | |
621 | This table figures out what port we should send the packet to based on | |
622 | the destination MAC and VLAN. That is, if we've learned the location | |
623 | of the destination (from table 2 processing some previous packet with | |
624 | that destination as its source), then we want to send the packet | |
625 | there. | |
626 | ||
627 | We need only one flow to do the lookup: | |
628 | ||
629 | ovs-ofctl add-flow br0 \ | |
630 | "table=3 priority=50 actions=resubmit(,10), resubmit(,4)" | |
631 | ||
632 | The flow's first action resubmits to table 10, the table that the | |
633 | "learn" action modifies. As you saw previously, the learned flows in | |
634 | this table write the learned port into register 0. If the destination | |
635 | for our packet hasn't been learned, then there will be no matching | |
636 | flow, and so the "resubmit" turns into a no-op. Because registers are | |
637 | initialized to 0, we can use a register 0 value of 0 in our next | |
638 | pipeline stage as a signal to flood the packet. | |
639 | ||
640 | The second action resubmits to table 4, continuing to the next | |
641 | pipeline stage. | |
642 | ||
643 | We can add another flow to skip the learning table lookup for | |
644 | multicast and broadcast packets, since those should always be flooded: | |
645 | ||
646 | ovs-ofctl add-flow br0 \ | |
647 | "table=3 priority=99 dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 \ | |
648 | actions=resubmit(,4)" | |
649 | ||
a0ddac4b TG |
650 | > We don't strictly need to add this flow, because multicast |
651 | > addresses will never show up in our learning table. (In turn, | |
652 | > that's because we put a flow into table 0 to drop packets that | |
653 | > have a multicast source address.) | |
eeecce05 BP |
654 | |
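The mask in the multicast flow above selects a single bit: the least significant bit of the first octet of the destination MAC, which is set for multicast and broadcast addresses and clear for unicast ones. The same test can be sketched in shell. `is_multicast` is a hypothetical helper used only for illustration, not an Open vSwitch command.

```shell
# Hypothetical helper: succeed (exit status 0) if the MAC address given
# as $1 has the multicast bit set, which is the same bit selected by
# dl_dst=01:00:00:00:00:00/01:00:00:00:00:00.
is_multicast() {
    first_octet=${1%%:*}                  # text before the first colon
    [ $(( 0x$first_octet & 1 )) -eq 1 ]   # test the least significant bit
}

for mac in ff:ff:ff:ff:ff:ff 01:00:00:00:00:00 f0:00:00:00:00:01; do
    if is_multicast "$mac"; then
        echo "$mac: multicast or broadcast"
    else
        echo "$mac: unicast"
    fi
done
```

The first two addresses have the bit set and would match the priority-99 flow. The third is unicast and falls through to the priority-50 flow that consults the learning table.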
655 | ||
542cc9bb | 656 | ### Testing Table 3 |
eeecce05 | 657 | |
542cc9bb | 658 | ### EXAMPLE |
eeecce05 BP |
659 | |
660 | Here's a command that should cause OVS to learn that f0:00:00:00:00:01 | |
661 | is on p1 in VLAN 20: | |
662 | ||
663 | ovs-appctl ofproto/trace br0 in_port=1,dl_vlan=20,dl_src=f0:00:00:00:00:01,dl_dst=90:00:00:00:00:01 -generate | |
664 | ||
665 | Here's an excerpt from the output. The "No match" that results from | |
666 | the resubmit to table 10 shows that the flow's destination was | |
667 | unknown: | |
668 | ||
669 | Resubmitted flow: unchanged | |
670 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
671 | Resubmitted odp: drop | |
672 | Rule: table=3 cookie=0 priority=50 | |
673 | OpenFlow actions=resubmit(,10),resubmit(,4) | |
674 | ||
675 | Resubmitted flow: unchanged | |
676 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
677 | Resubmitted odp: drop | |
678 | No match | |
679 | ||
680 | You can verify that the packet's source was learned in two ways. The | |
681 | most direct way is to dump the learning table with: | |
682 | ||
683 | ovs-ofctl dump-flows br0 table=10 | |
684 | ||
685 | which ought to show roughly the following, with extraneous details | |
686 | removed: | |
687 | ||
688 | table=10, vlan_tci=0x0014/0x0fff,dl_dst=f0:00:00:00:00:01 actions=load:0x1->NXM_NX_REG0[0..15] | |
689 | ||
a0ddac4b TG |
690 | > If you tried the examples for the previous step, or if you did |
691 | > some of your own experiments, then you might see additional flows | |
692 | > there. These additional flows are harmless. If they bother you, | |
693 | > then you can remove them with `ovs-ofctl del-flows br0 table=10`. | |
eeecce05 BP |
694 | |
695 | The other way is to inject a packet to take advantage of the learning | |
696 | entry. For example, we can inject a packet on p2 whose destination is | |
697 | the MAC address that we just learned on p1: | |
698 | ||
699 | ovs-appctl ofproto/trace br0 in_port=2,dl_src=90:00:00:00:00:01,dl_dst=f0:00:00:00:00:01 -generate | |
700 | ||
701 | Here's an interesting excerpt from that command's output. This group | |
702 | of lines traces the "resubmit(,10)", showing that the packet matched | |
703 | the learned flow for the first MAC we used, loading the OpenFlow port | |
704 | number for the learned port p1 into register 0: | |
705 | ||
706 | Resubmitted flow: unchanged | |
707 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
708 | Resubmitted odp: drop | |
709 | Rule: table=10 cookie=0 vlan_tci=0x0014/0x0fff,dl_dst=f0:00:00:00:00:01 | |
710 | OpenFlow actions=load:0x1->NXM_NX_REG0[0..15] | |
711 | ||
712 | ||
713 | If you read the commands above carefully, then you might have noticed | |
714 | that they simply have the Ethernet source and destination addresses | |
a0ddac4b | 715 | exchanged. That means that if we now rerun the first `ovs-appctl` |
eeecce05 BP |
716 | command above, i.e.: |
717 | ||
718 | ovs-appctl ofproto/trace br0 in_port=1,dl_vlan=20,dl_src=f0:00:00:00:00:01,dl_dst=90:00:00:00:00:01 -generate | |
719 | ||
720 | then we see in the output that the destination has now been learned: | |
721 | ||
722 | Resubmitted flow: unchanged | |
723 | Resubmitted regs: reg0=0x0 reg1=0x0 reg2=0x0 reg3=0x0 reg4=0x0 reg5=0x0 reg6=0x0 reg7=0x0 | |
724 | Resubmitted odp: drop | |
725 | Rule: table=10 cookie=0 vlan_tci=0x0014/0x0fff,dl_dst=90:00:00:00:00:01 | |
726 | OpenFlow actions=load:0x2->NXM_NX_REG0[0..15] | |
727 | ||
728 | ||
729 | Implementing Table 4: Output Processing | |
542cc9bb | 730 | --------------------------------------- |
eeecce05 BP |
731 | |
732 | At entry to stage 4, we know that register 0 contains either the | |
733 | desired output port or zero if the packet should be flooded. We | |
734 | also know that the packet's VLAN is in its 802.1Q header, even if the | |
735 | VLAN was implicit because the packet came in on an access port. | |
736 | ||
737 | The job of the final pipeline stage is to actually output packets. | |
738 | The job is trivial for output to our trunk port p1: | |
739 | ||
740 | ovs-ofctl add-flow br0 "table=4 reg0=1 actions=1" | |
741 | ||
742 | For output to the access ports, we just have to strip the VLAN header | |
743 | before outputting the packet: | |
744 | ||
745 | ovs-ofctl add-flows br0 - <<'EOF' | |
542cc9bb TG |
746 | table=4 reg0=2 actions=strip_vlan,2 |
747 | table=4 reg0=3 actions=strip_vlan,3 | |
748 | table=4 reg0=4 actions=strip_vlan,4 | |
749 | EOF | |
eeecce05 BP |
750 | |
751 | The only slightly tricky part is flooding multicast and broadcast | |
752 | packets and unicast packets with unlearned destinations. For those, | |
753 | we need to make sure that we only output the packets to the ports that | |
754 | carry our packet's VLAN, and that we include the 802.1Q header in the | |
755 | copy output to the trunk port but not in copies output to access | |
756 | ports: | |
757 | ||
758 | ovs-ofctl add-flows br0 - <<'EOF' | |
542cc9bb TG |
759 | table=4 reg0=0 priority=99 dl_vlan=20 actions=1,strip_vlan,2 |
760 | table=4 reg0=0 priority=99 dl_vlan=30 actions=1,strip_vlan,3,4 | |
761 | table=4 reg0=0 priority=50 actions=1 | |
762 | EOF | |
eeecce05 | 763 | |
a0ddac4b TG |
764 | > Our rules rely on the standard OpenFlow behavior that an output |
765 | > action will not forward a packet back out the port it came in on. | |
766 | > That is, if a packet comes in on p1, and we've learned that the | |
767 | > packet's destination MAC is also on p1, so that we end up with | |
768 | > "actions=1" as our actions, the switch will not forward the packet | |
769 | > back out its input port. The multicast/broadcast/unknown | |
770 | > destination cases above also rely on this behavior. | |
eeecce05 BP |
771 | |
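The flooding flows, combined with the rule that an output action never sends a packet back out its input port, can be modeled roughly as follows. This is only an illustrative sketch, not how Open vSwitch is implemented. `flood_ports` is a hypothetical helper, and its first argument is assumed to be the packet's VLAN at entry to table 4 (that is, after an access port's implicit VLAN has been applied).

```shell
# Hypothetical model of the three reg0=0 flows in table 4.
# Usage: flood_ports VLAN IN_PORT
# Prints the ports that would receive a copy of a flooded packet.
flood_ports() {
    vlan=$1
    in_port=$2
    case $vlan in
        20) ports="1 2" ;;     # trunk p1 plus access port p2
        30) ports="1 3 4" ;;   # trunk p1 plus access ports p3 and p4
        *)  ports="1" ;;       # any other VLAN exists only on the trunk
    esac
    out=""
    for p in $ports; do
        # An output action does not send the packet back out its input port.
        [ "$p" = "$in_port" ] || out="${out:+$out }$p"
    done
    printf '%s\n' "$out"
}

flood_ports 30 1   # prints "3 4"
flood_ports 30 3   # prints "1 4"
flood_ports 55 1   # prints an empty line: the packet is dropped
```

The model deliberately ignores VLAN tagging; it shows only which ports receive a copy.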
772 | ||
542cc9bb | 773 | ### Testing Table 4 |
eeecce05 | 774 | |
542cc9bb | 775 | ### EXAMPLE 1: Broadcast, Multicast, and Unknown Destination |
eeecce05 BP |
776 | |
777 | Try tracing a broadcast packet arriving on p1 in VLAN 30: | |
778 | ||
779 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=ff:ff:ff:ff:ff:ff,dl_vlan=30 | |
780 | ||
781 | The interesting part of the output is the final line, which shows that | |
782 | the switch would remove the 802.1Q header and then output the packet to | |
783 | p3 and p4, which are access ports for VLAN 30: | |
784 | ||
785 | Datapath actions: pop_vlan,3,4 | |
786 | ||
787 | Similarly, if we trace a broadcast packet arriving on p3: | |
788 | ||
789 | ovs-appctl ofproto/trace br0 in_port=3,dl_dst=ff:ff:ff:ff:ff:ff | |
790 | ||
791 | then we see that it is output to p1 with an 802.1Q tag and then to p4 | |
792 | without one: | |
793 | ||
794 | Datapath actions: push_vlan(vid=30,pcp=0),1,pop_vlan,4 | |
795 | ||
a0ddac4b TG |
796 | > Open vSwitch could simplify the datapath actions here to just |
797 | > "4,push_vlan(vid=30,pcp=0),1" but it is not smart enough to do so. | |
eeecce05 BP |
798 | |
799 | The following are also broadcasts, but the result is to drop the | |
800 | packets because the VLAN belongs only to the input port: | |
801 | ||
802 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=ff:ff:ff:ff:ff:ff | |
803 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=ff:ff:ff:ff:ff:ff,dl_vlan=55 | |
804 | ||
805 | Try some other broadcast cases on your own: | |
806 | ||
807 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=ff:ff:ff:ff:ff:ff,dl_vlan=20 | |
808 | ovs-appctl ofproto/trace br0 in_port=2,dl_dst=ff:ff:ff:ff:ff:ff | |
809 | ovs-appctl ofproto/trace br0 in_port=4,dl_dst=ff:ff:ff:ff:ff:ff | |
810 | ||
811 | You can see the same behavior with multicast packets and with unicast | |
812 | packets whose destination has not been learned, e.g.: | |
813 | ||
814 | ovs-appctl ofproto/trace br0 in_port=4,dl_dst=01:00:00:00:00:00 | |
815 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=90:12:34:56:78:90,dl_vlan=20 | |
816 | ovs-appctl ofproto/trace br0 in_port=1,dl_dst=90:12:34:56:78:90,dl_vlan=30 | |
817 | ||
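Because each trace in this example is judged by its final `Datapath actions` line, it can be convenient to filter the trace output down to just that line. The following is a minimal sketch, assuming the trace output format shown in this tutorial. `datapath_actions` is a hypothetical helper name, not an Open vSwitch command.

```shell
# Hypothetical helper: read ofproto/trace output on stdin and print only
# the action list from the "Datapath actions:" line.
datapath_actions() {
    sed -n 's/^[[:space:]]*Datapath actions: //p'
}

# Intended use inside the sandbox (requires a running Open vSwitch):
#   ovs-appctl ofproto/trace br0 in_port=3,dl_dst=ff:ff:ff:ff:ff:ff | datapath_actions

# Demonstration on a line copied from the output shown earlier:
printf 'Datapath actions: pop_vlan,3,4\n' | datapath_actions   # prints pop_vlan,3,4
```

Piping each of the broadcast, multicast, and unknown-destination traces through the helper makes it easy to compare their results side by side.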
818 | ||
542cc9bb | 819 | ### EXAMPLE 2: MAC Learning |
eeecce05 BP |
820 | |
821 | Let's follow the same pattern as we did for table 3. First learn a | |
822 | MAC on port p1 in VLAN 30: | |
823 | ||
824 | ovs-appctl ofproto/trace br0 in_port=1,dl_vlan=30,dl_src=10:00:00:00:00:01,dl_dst=20:00:00:00:00:01 -generate | |
825 | ||
826 | You can see from the last line of output that the packet's destination | |
827 | is unknown, so it gets flooded to both p3 and p4, the other ports in | |
828 | VLAN 30: | |
829 | ||
830 | Datapath actions: pop_vlan,3,4 | |
831 | ||
832 | Then reverse the MACs and learn the first flow's destination on port | |
833 | p4: | |
834 | ||
835 | ovs-appctl ofproto/trace br0 in_port=4,dl_src=20:00:00:00:00:01,dl_dst=10:00:00:00:00:01 -generate | |
836 | ||
837 | The last line of output shows that this packet's destination is | |
838 | known to be p1, as learned from our previous command: | |
839 | ||
840 | Datapath actions: push_vlan(vid=30,pcp=0),1 | |
841 | ||
842 | Now, if we rerun our first command: | |
843 | ||
844 | ovs-appctl ofproto/trace br0 in_port=1,dl_vlan=30,dl_src=10:00:00:00:00:01,dl_dst=20:00:00:00:00:01 -generate | |
845 | ||
846 | we can see that the packet is no longer flooded but is instead output | |
847 | only to the learned destination port p4: | |
848 | ||
849 | Datapath actions: pop_vlan,4 | |
850 | ||
851 | ||
852 | Contact | |
853 | ======= | |
854 | ||
855 | bugs@openvswitch.org | |
856 | http://openvswitch.org/ | |
a0ddac4b TG |
857 | |
858 | [INSTALL.md]:../INSTALL.md |