Commit | Line | Data |
---|---|---|
11fdf7f2 TL |
1 | .. SPDX-License-Identifier: BSD-3-Clause |
2 | Copyright(c) 2018 6WIND S.A. | |
3 | ||
4 | .. _switch_representation: | |
5 | ||
6 | Switch Representation within DPDK Applications | |
7 | ============================================== | |
8 | ||
9 | .. contents:: :local: | |
10 | ||
11 | Introduction | |
12 | ------------ | |
13 | ||
14 | Network adapters with multiple physical ports and/or SR-IOV capabilities | |
15 | usually support the offload of traffic steering rules between their virtual | |
16 | functions (VFs), physical functions (PFs) and ports. | |
17 | ||
18 | As with standard Ethernet switches, this involves a combination of | |
19 | automatic MAC learning and manual configuration. For most purposes it is | |
20 | managed by the host system and fully transparent to users and applications. | |
21 | ||
22 | On the other hand, applications typically found on hypervisors that process | |
23 | layer 2 (L2) traffic (such as OVS) need to steer traffic themselves | |
24 | according to their own criteria. | |
25 | ||
26 | Without a standard software interface to manage traffic steering rules | |
27 | between VFs, PFs and the various physical ports of a given device, | |
28 | applications cannot take advantage of these offloads; software processing is | |
29 | mandatory even for traffic which ends up re-injected into the device it | |
30 | originates from. | |
31 | ||
32 | This document describes how such steering rules can be configured through | |
33 | the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case | |
34 | (PF/VF steering) using a single physical port for clarity. However, the same | |
35 | logic applies to any number of ports without necessarily involving SR-IOV. | |
36 | ||
37 | Port Representors | |
38 | ----------------- | |
39 | ||
40 | In many cases, traffic steering rules cannot be determined in advance; | |
41 | applications usually have to process a bit of traffic in software before | |
42 | thinking about offloading specific flows to hardware. | |
43 | ||
44 | Applications therefore need the ability to receive and inject traffic to | |
45 | various device endpoints (other VFs, PFs or physical ports) before | |
46 | connecting them together. Device drivers must provide means to hook the | |
47 | "other end" of these endpoints and to refer to them when configuring flow | |
48 | rules. | |
49 | ||
50 | This role is left to so-called "port representors" (also known as "VF | |
51 | representors" in the specific context of VFs), which are to DPDK what the | |
52 | Ethernet switch device driver model (**switchdev**) [1]_ is to Linux, and | |
53 | which can be thought of as a software "patch panel" front-end for applications. | |
54 | ||
55 | - DPDK port representors are implemented as additional virtual Ethernet | |
56 | device (**ethdev**) instances, spawned on an as-needed basis through | |
57 | configuration parameters passed to the driver of the underlying | |
58 | device using devargs. | |
59 | ||
60 | :: | |
61 | ||
62 | -w pci:dbdf,representor=0 | |
63 | -w pci:dbdf,representor=[0-3] | |
64 | -w pci:dbdf,representor=[0,5-11] | |
65 | ||
66 | - As virtual devices, they may be more limited than their physical | |
67 | counterparts, for instance by exposing only a subset of device | |
68 | configuration callbacks and/or by not necessarily having Rx/Tx capability. | |
69 | ||
70 | - Among other things, they can be used to assign MAC addresses to the | |
71 | resource they represent. | |
72 | ||
73 | - Applications can tell port representors apart from other physical or virtual | |
74 | ports by checking the ``dev_flags`` field within their device information | |
75 | structure for the ``RTE_ETH_DEV_REPRESENTOR`` bit-field. | |
76 | ||
77 | .. code-block:: c | |
78 | ||
79 | struct rte_eth_dev_info { | |
80 | ... | |
81 | uint32_t dev_flags; /**< Device flags */ | |
82 | ... | |
83 | }; | |
84 | ||
85 | - The device or group relationship of ports can be discovered using the | |
86 | switch ``domain_id`` field within the device's switch information structure. | |
87 | By default the switch ``domain_id`` of a port is | |
88 | ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``, indicating that the port does not | |
89 | support the concept of a switch domain. Ports which do support the concept | |
90 | are allocated a unique switch ``domain_id``, and ports within the same switch | |
91 | domain share the same ``domain_id``. The switch ``port_id`` identifies the | |
92 | port in terms of the switch, so in the case of SR-IOV devices | |
93 | the switch ``port_id`` would represent the virtual function identifier of the | |
94 | port. | |
95 | ||
96 | .. code-block:: c | |
97 | ||
98 | /** | |
99 | * Ethernet device associated switch information | |
100 | */ | |
101 | struct rte_eth_switch_info { | |
102 | const char *name; /**< switch name */ | |
103 | uint16_t domain_id; /**< switch domain id */ | |
104 | uint16_t port_id; /**< switch port id */ | |
105 | }; | |
106 | ||
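The two checks described above (representor flag and shared switch domain) can be sketched in C as follows. This is a minimal, self-contained model rather than DPDK code: the ``sketch_*`` types and the two ``SKETCH_*`` constants are hypothetical stand-ins for ``struct rte_eth_dev_info``, ``struct rte_eth_switch_info``, ``RTE_ETH_DEV_REPRESENTOR`` and ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``, and their numeric values are assumptions. A real application would read the fields from the device information structure filled in by the ethdev layer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the DPDK definitions named in the text; values are assumed. */
#define SKETCH_DEV_REPRESENTOR 0x0010u
#define SKETCH_SWITCH_DOMAIN_ID_INVALID UINT16_MAX

/* Mirrors the layout of the switch information excerpt shown above. */
struct sketch_switch_info {
	const char *name;   /* switch name */
	uint16_t domain_id; /* switch domain id */
	uint16_t port_id;   /* switch port id */
};

/* Only the fields this sketch needs from the device information. */
struct sketch_dev_info {
	uint32_t dev_flags;
	struct sketch_switch_info switch_info;
};

/* True when the representor bit is set in dev_flags. */
bool sketch_is_representor(const struct sketch_dev_info *info)
{
	return (info->dev_flags & SKETCH_DEV_REPRESENTOR) != 0;
}

/* True when both ports report the same, valid switch domain. */
bool sketch_same_switch(const struct sketch_dev_info *a,
			const struct sketch_dev_info *b)
{
	if (a->switch_info.domain_id == SKETCH_SWITCH_DOMAIN_ID_INVALID)
		return false;
	return a->switch_info.domain_id == b->switch_info.domain_id;
}
```

An application could run such checks over every probed port to pair each representor with the PF port that shares its switch domain.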
107 | ||
108 | .. [1] `Ethernet switch device driver model (switchdev) | |
109 | <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_ | |
110 | ||
111 | Basic SR-IOV | |
112 | ------------ | |
113 | ||
114 | "Basic" in the sense that it is not managed by applications, which | |
115 | nonetheless expect traffic to flow between the various endpoints and the | |
116 | outside as if everything was linked by an Ethernet hub. | |
117 | ||
118 | The following diagram pictures a setup involving a device with one PF, two | |
119 | VFs and one shared physical port: | |
120 | ||
121 | :: | |
122 | ||
123 | .-------------. .-------------. .-------------. | |
124 | | hypervisor | | VM 1 | | VM 2 | | |
125 | | application | | application | | application | | |
126 | `--+----------' `----------+--' `--+----------' | |
127 | | | | | |
128 | .-----+-----. | | | |
129 | | port_id 3 | | | | |
130 | `-----+-----' | | | |
131 | | | | | |
132 | .-+--. .---+--. .--+---. | |
133 | | PF | | VF 1 | | VF 2 | | |
134 | `-+--' `---+--' `--+---' | |
135 | | | | | |
136 | `---------. .-----------------------' | | |
137 | | | .-------------------------' | |
138 | | | | | |
139 | .--+-----+-----+--. | |
140 | | interconnection | | |
141 | `--------+--------' | |
142 | | | |
143 | .----+-----. | |
144 | | physical | | |
145 | | port 0 | | |
146 | `----------' | |
147 | ||
148 | - A DPDK application running on the hypervisor owns the PF device, which is | |
149 | arbitrarily assigned port index 3. | |
150 | ||
151 | - Both VFs are assigned to VMs and used by unknown applications; they may be | |
152 | DPDK-based or anything else. | |
153 | ||
154 | - Interconnection is not necessarily done through a true Ethernet switch and | |
155 | may not even exist as a separate entity. The role of this block is to show | |
156 | that something brings PF, VFs and physical ports together and enables | |
157 | communication between them, with a number of built-in restrictions. | |
158 | ||
159 | Subsequent sections in this document describe means for DPDK applications | |
160 | running on the hypervisor to freely assign specific flows between PF, VFs | |
161 | and physical ports based on traffic properties, by managing this | |
162 | interconnection. | |
163 | ||
164 | Controlled SR-IOV | |
165 | ----------------- | |
166 | ||
167 | Initialization | |
168 | ~~~~~~~~~~~~~~ | |
169 | ||
170 | When a DPDK application gets assigned a PF device and is deliberately not | |
171 | started in `basic SR-IOV`_ mode, any traffic coming from physical ports is | |
172 | received by PF according to default rules, while VFs remain isolated. | |
173 | ||
174 | :: | |
175 | ||
176 | .-------------. .-------------. .-------------. | |
177 | | hypervisor | | VM 1 | | VM 2 | | |
178 | | application | | application | | application | | |
179 | `--+----------' `----------+--' `--+----------' | |
180 | | | | | |
181 | .-----+-----. | | | |
182 | | port_id 3 | | | | |
183 | `-----+-----' | | | |
184 | | | | | |
185 | .-+--. .---+--. .--+---. | |
186 | | PF | | VF 1 | | VF 2 | | |
187 | `-+--' `------' `------' | |
188 | | | |
189 | `-----. | |
190 | | | |
191 | .--+----------------------. | |
192 | | managed interconnection | | |
193 | `------------+------------' | |
194 | | | |
195 | .----+-----. | |
196 | | physical | | |
197 | | port 0 | | |
198 | `----------' | |
199 | ||
200 | In this mode, interconnection must be configured by the application to | |
201 | enable VF communication, for instance by explicitly directing traffic with a | |
202 | given destination MAC address to VF 1 and allowing that with the same source | |
203 | MAC address to come out of it. | |
204 | ||
205 | For this to work, hypervisor applications need a way to refer to either VF 1 | |
206 | or VF 2 in addition to the PF. This is addressed by `VF representors`_. | |
207 | ||
208 | VF Representors | |
209 | ~~~~~~~~~~~~~~~ | |
210 | ||
211 | VF representors are virtual but standard DPDK network devices (albeit with | |
212 | limited capabilities) created by PMDs when managing a PF device. | |
213 | ||
214 | Since they represent VF instances used by other applications, configuring | |
215 | them (e.g. assigning a MAC address or setting up promiscuous mode) affects | |
216 | interconnection accordingly. If supported, they may also be used as two-way | |
217 | communication ports with VFs (assuming **switchdev** topology). | |
218 | ||
219 | ||
220 | :: | |
221 | ||
222 | .-------------. .-------------. .-------------. | |
223 | | hypervisor | | VM 1 | | VM 2 | | |
224 | | application | | application | | application | | |
225 | `--+---+---+--' `----------+--' `--+----------' | |
226 | | | | | | | |
227 | | | `-------------------. | | | |
228 | | `---------. | | | | |
229 | | | | | | | |
230 | .-----+-----. .-----+-----. .-----+-----. | | | |
231 | | port_id 3 | | port_id 4 | | port_id 5 | | | | |
232 | `-----+-----' `-----+-----' `-----+-----' | | | |
233 | | | | | | | |
234 | .-+--. .-----+-----. .-----+-----. .---+--. .--+---. | |
235 | | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | | |
236 | `-+--' `-----+-----' `-----+-----' `---+--' `--+---' | |
237 | | | | | | | |
238 | | | .---------' | | | |
239 | `-----. | | .-----------------' | | |
240 | | | | | .---------------------' | |
241 | | | | | | | |
242 | .--+-------+---+---+---+--. | |
243 | | managed interconnection | | |
244 | `------------+------------' | |
245 | | | |
246 | .----+-----. | |
247 | | physical | | |
248 | | port 0 | | |
249 | `----------' | |
250 | ||
251 | - VF representors are assigned arbitrary port indices 4 and 5 in the | |
252 | hypervisor application and are respectively associated with VF 1 and VF 2. | |
253 | ||
254 | - They can't be dissociated; even if VF 1 and VF 2 were not connected, | |
255 | representors could still be used for configuration. | |
256 | ||
257 | - In this context, port index 3 can be thought of as a representor for physical | |
258 | port 0. | |
259 | ||
260 | As previously described, the "interconnection" block represents a logical | |
261 | concept. Interconnection occurs when hardware configuration enables traffic | |
262 | flows from one place to another (e.g. physical port 0 to VF 1) according to | |
263 | some criteria. | |
264 | ||
265 | This is discussed in more detail in `traffic steering`_. | |
266 | ||
267 | Traffic Steering | |
268 | ~~~~~~~~~~~~~~~~ | |
269 | ||
270 | In the following diagram, each meaningful traffic origin or endpoint as seen | |
271 | by the hypervisor application is tagged with a unique letter from A to F. | |
272 | ||
273 | :: | |
274 | ||
275 | .-------------. .-------------. .-------------. | |
276 | | hypervisor | | VM 1 | | VM 2 | | |
277 | | application | | application | | application | | |
278 | `--+---+---+--' `----------+--' `--+----------' | |
279 | | | | | | | |
280 | | | `-------------------. | | | |
281 | | `---------. | | | | |
282 | | | | | | | |
283 | .----(A)----. .----(B)----. .----(C)----. | | | |
284 | | port_id 3 | | port_id 4 | | port_id 5 | | | | |
285 | `-----+-----' `-----+-----' `-----+-----' | | | |
286 | | | | | | | |
287 | .-+--. .-----+-----. .-----+-----. .---+--. .--+---. | |
288 | | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | | |
289 | `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' | |
290 | | | | | | | |
291 | | | .---------' | | | |
292 | `-----. | | .-----------------' | | |
293 | | | | | .---------------------' | |
294 | | | | | | | |
295 | .--+-------+---+---+---+--. | |
296 | | managed interconnection | | |
297 | `------------+------------' | |
298 | | | |
299 | .---(F)----. | |
300 | | physical | | |
301 | | port 0 | | |
302 | `----------' | |
303 | ||
304 | - **A**: PF device. | |
305 | - **B**: port representor for VF 1. | |
306 | - **C**: port representor for VF 2. | |
307 | - **D**: VF 1 proper. | |
308 | - **E**: VF 2 proper. | |
309 | - **F**: physical port. | |
310 | ||
311 | Although uncommon, some devices do not enforce a one to one mapping between | |
312 | PF and physical ports. For instance, by default all ports of **mlx4** | |
313 | adapters are available to all their PF/VF instances, in which case | |
314 | additional ports appear next to **F** in the above diagram. | |
315 | ||
316 | Assuming no interconnection is provided by default in this mode, setting up | |
317 | a `basic SR-IOV`_ configuration involving physical port 0 could be broken | |
318 | down as: | |
319 | ||
320 | PF: | |
321 | ||
322 | - **A to F**: let everything through. | |
323 | - **F to A**: PF MAC as destination. | |
324 | ||
325 | VF 1: | |
326 | ||
327 | - **A to D**, **E to D** and **F to D**: VF 1 MAC as destination. | |
328 | - **D to A**: VF 1 MAC as source and PF MAC as destination. | |
329 | - **D to E**: VF 1 MAC as source and VF 2 MAC as destination. | |
330 | - **D to F**: VF 1 MAC as source. | |
331 | ||
332 | VF 2: | |
333 | ||
334 | - **A to E**, **D to E** and **F to E**: VF 2 MAC as destination. | |
335 | - **E to A**: VF 2 MAC as source and PF MAC as destination. | |
336 | - **E to D**: VF 2 MAC as source and VF 1 MAC as destination. | |
337 | - **E to F**: VF 2 MAC as source. | |
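The breakdown above can be read as a predicate over an origin, a destination and the source and destination MAC addresses of a frame. The following C sketch is only a software model of that list, written to make the rules easy to check; it uses no DPDK API, and every name and MAC address in it is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Endpoints from the diagram: A (PF), D (VF 1), E (VF 2), F (physical port). */
enum sketch_endpoint { SKETCH_PF, SKETCH_VF1, SKETCH_VF2, SKETCH_PHYS };

struct sketch_mac { uint8_t bytes[6]; };

/* Hypothetical addresses, for illustration only. */
const struct sketch_mac sketch_pf_mac  = { { 0x02, 0, 0, 0, 0, 0x0a } };
const struct sketch_mac sketch_vf1_mac = { { 0x02, 0, 0, 0, 0, 0x0d } };
const struct sketch_mac sketch_vf2_mac = { { 0x02, 0, 0, 0, 0, 0x0e } };

/* MAC address owned by an endpoint, or NULL for the physical port. */
const struct sketch_mac *sketch_owner_mac(enum sketch_endpoint ep)
{
	switch (ep) {
	case SKETCH_PF:  return &sketch_pf_mac;
	case SKETCH_VF1: return &sketch_vf1_mac;
	case SKETCH_VF2: return &sketch_vf2_mac;
	default:         return NULL;
	}
}

bool sketch_mac_eq(const struct sketch_mac *a, const struct sketch_mac *b)
{
	return memcmp(a->bytes, b->bytes, sizeof(a->bytes)) == 0;
}

/* Returns true when the listed rules let a frame go from "from" to "to". */
bool sketch_basic_sriov_allows(enum sketch_endpoint from,
			       enum sketch_endpoint to,
			       const struct sketch_mac *src,
			       const struct sketch_mac *dst)
{
	if (from == to)
		return false;
	/* D and E rules: a VF only sends frames carrying its own source MAC. */
	if ((from == SKETCH_VF1 || from == SKETCH_VF2) &&
	    !sketch_mac_eq(src, sketch_owner_mac(from)))
		return false;
	/* "A to F", "D to F", "E to F": no destination MAC check. */
	if (to == SKETCH_PHYS)
		return true;
	/* Towards PF or a VF: destination MAC must belong to the target. */
	return sketch_mac_eq(dst, sketch_owner_mac(to));
}
```

Expressing the rules as a predicate sidesteps the question of precedence between overlapping rules (for instance **A to F** and **A to D**), which the list above leaves open.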
338 | ||
339 | Devices may additionally support advanced matching criteria such as | |
340 | IPv4/IPv6 addresses or TCP/UDP ports. | |
341 | ||
342 | The combination of matching criteria with target endpoints fits well with | |
343 | **rte_flow** [6]_, which expresses flow rules as combinations of patterns | |
344 | and actions. | |
345 | ||
346 | Enhancing **rte_flow** with the ability to make flow rules match and target | |
347 | these endpoints provides a standard interface to manage their | |
348 | interconnection without introducing new concepts and a whole new API to | |
349 | implement them. This is described in `flow API (rte_flow)`_. | |
350 | ||
f67539c2 | 351 | .. [6] :doc:`Generic flow API (rte_flow) <rte_flow>` |
11fdf7f2 TL |
352 | |
353 | Flow API (rte_flow) | |
354 | ------------------- | |
355 | ||
356 | Extensions | |
357 | ~~~~~~~~~~ | |
358 | ||
359 | Compared to creating a brand new dedicated interface, **rte_flow** was | |
360 | deemed flexible enough to manage representor traffic with only minor | |
361 | extensions: | |
362 | ||
363 | - Using physical ports, PF, VF or port representors as targets. | |
364 | ||
365 | - Affecting traffic that is not necessarily addressed to the DPDK port ID a | |
366 | flow rule is associated with (e.g. forcing VF traffic redirection to PF). | |
367 | ||
368 | For advanced uses: | |
369 | ||
370 | - Rule-based packet counters. | |
371 | ||
372 | - The ability to combine several identical actions for traffic duplication | |
373 | (e.g. VF representor in addition to a physical port). | |
374 | ||
375 | - Dedicated actions for traffic encapsulation / decapsulation before | |
376 | reaching an endpoint. | |
377 | ||
378 | Traffic Direction | |
379 | ~~~~~~~~~~~~~~~~~ | |
380 | ||
381 | From an application standpoint, "ingress" and "egress" flow rule attributes | |
382 | apply to the DPDK port ID they are associated with. They select a traffic | |
383 | direction for matching patterns, but have no impact on actions. | |
384 | ||
385 | When matching traffic coming from or going to a different place than the | |
386 | immediate port ID a flow rule is associated with, these attributes keep | |
387 | their meaning while applying to the chosen origin, as highlighted by the | |
388 | following diagram: | |
389 | ||
390 | :: | |
391 | ||
392 | .-------------. .-------------. .-------------. | |
393 | | hypervisor | | VM 1 | | VM 2 | | |
394 | | application | | application | | application | | |
395 | `--+---+---+--' `----------+--' `--+----------' | |
396 | | | | | | | |
397 | | | `-------------------. | | | |
398 | | `---------. | | | | |
399 | | ^ | ^ | ^ | | | |
400 | | | ingress | | ingress | | ingress | | | |
401 | | | egress | | egress | | egress | | | |
402 | | v | v | v | | | |
403 | .----(A)----. .----(B)----. .----(C)----. | | | |
404 | | port_id 3 | | port_id 4 | | port_id 5 | | | | |
405 | `-----+-----' `-----+-----' `-----+-----' | | | |
406 | | | | | | | |
407 | .-+--. .-----+-----. .-----+-----. .---+--. .--+---. | |
408 | | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | | |
409 | `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' | |
410 | | | | ^ | | ^ | |
411 | | | | egress | | | | egress | |
412 | | | | ingress | | | | ingress | |
413 | | | .---------' v | | v | |
414 | `-----. | | .-----------------' | | |
415 | | | | | .---------------------' | |
416 | | | | | | | |
417 | .--+-------+---+---+---+--. | |
418 | | managed interconnection | | |
419 | `------------+------------' | |
420 | ^ | | |
421 | ingress | | | |
422 | egress | | | |
423 | v | | |
424 | .---(F)----. | |
425 | | physical | | |
426 | | port 0 | | |
427 | `----------' | |
428 | ||
429 | Ingress and egress are defined as relative to the application creating the | |
430 | flow rule. | |
431 | ||
432 | For instance, matching traffic sent by VM 2 would be done through an ingress | |
433 | flow rule on VF 2 (**E**). Likewise for incoming traffic on physical port | |
434 | (**F**). This also applies to **C** and **A** respectively. | |
435 | ||
436 | Transferring Traffic | |
437 | ~~~~~~~~~~~~~~~~~~~~ | |
438 | ||
439 | Without Port Representors | |
440 | ^^^^^^^^^^^^^^^^^^^^^^^^^ | |
441 | ||
442 | `Traffic direction`_ describes how an application could match traffic coming | |
443 | from or going to a specific place reachable from a DPDK port ID. This makes | |
444 | sense when the traffic in question is normally seen (i.e. sent or received) | |
445 | by the application creating the flow rule (e.g. as in "redirect all traffic | |
446 | coming from VF 1 to local queue 6"). | |
447 | ||
448 | However this does not force such traffic to take a specific route. Creating | |
449 | a flow rule on **A** matching traffic coming from **D** is only meaningful | |
450 | if it can be received by **A** in the first place, otherwise doing so simply | |
451 | has no effect. | |
452 | ||
453 | A new flow rule attribute named "transfer" is necessary for that. Combining | |
454 | it with "ingress" or "egress" and a specific origin requests a flow rule to | |
455 | be applied at the lowest level: | |
456 | ||
457 | :: | |
458 | ||
459 | ingress only : ingress + transfer | |
460 | : | |
461 | .-------------. .-------------. : .-------------. .-------------. | |
462 | | hypervisor | | VM 1 | : | hypervisor | | VM 1 | | |
463 | | application | | application | : | application | | application | | |
464 | `------+------' `--+----------' : `------+------' `--+----------' | |
465 | | | | traffic : | | | traffic | |
466 | .----(A)----. | v : .----(A)----. | v | |
467 | | port_id 3 | | : | port_id 3 | | | |
468 | `-----+-----' | : `-----+-----' | | |
469 | | | : | ^ | | |
470 | | | : | | traffic | | |
471 | .-+--. .---+--. : .-+--. .---+--. | |
472 | | PF | | VF 1 | : | PF | | VF 1 | | |
473 | `-+--' `--(D)-' : `-+--' `--(D)-' | |
474 | | | | traffic : | ^ | | traffic | |
475 | | | v : | | traffic | v | |
476 | .--+-----------+--. : .--+-----------+--. | |
477 | | interconnection | : | interconnection | | |
478 | `--------+--------' : `--------+--------' | |
479 | | | traffic : | | |
480 | | v : | | |
481 | .---(F)----. : .---(F)----. | |
482 | | physical | : | physical | | |
483 | | port 0 | : | port 0 | | |
484 | `----------' : `----------' | |
485 | ||
486 | With "ingress" only, traffic is matched on **A** and thus still goes to physical | |
487 | port **F** by default: | |
488 | ||
489 | ||
490 | :: | |
491 | ||
492 | testpmd> flow create 3 ingress pattern vf id is 1 / end | |
493 | actions queue index 6 / end | |
494 | ||
495 | With "ingress + transfer", traffic is matched on **D** and is therefore | |
496 | successfully assigned to queue 6 on **A**: | |
497 | ||
498 | ||
499 | :: | |
500 | ||
501 | testpmd> flow create 3 ingress transfer pattern vf id is 1 / end | |
502 | actions queue index 6 / end | |
503 | ||
504 | ||
505 | With Port Representors | |
506 | ^^^^^^^^^^^^^^^^^^^^^^ | |
507 | ||
508 | When port representors exist, implicit flow rules with the "transfer" | |
509 | attribute (described in `without port representors`_) are assumed to | |
510 | exist between them and their represented resources. These may be immutable. | |
511 | ||
512 | In this case, traffic is received by default through the representor and | |
513 | neither the "transfer" attribute nor traffic origin in flow rule patterns | |
514 | are necessary. They simply have to be created on the representor port | |
515 | directly and may target a different representor as described in `PORT_ID | |
516 | action`_. | |
517 | ||
518 | Implicit traffic flow with port representor: | |
519 | ||
520 | :: | |
521 | ||
522 | .-------------. .-------------. | |
523 | | hypervisor | | VM 1 | | |
524 | | application | | application | | |
525 | `--+-------+--' `----------+--' | |
526 | | | ^ | | traffic | |
527 | | | | traffic | v | |
528 | | `-----. | | |
529 | | | | | |
530 | .----(A)----. .----(B)----. | | |
531 | | port_id 3 | | port_id 4 | | | |
532 | `-----+-----' `-----+-----' | | |
533 | | | | | |
534 | .-+--. .-----+-----. .---+--. | |
535 | | PF | | VF 1 rep. | | VF 1 | | |
536 | `-+--' `-----+-----' `--(D)-' | |
537 | | | | | |
538 | .--|-------------|-----------|--. | |
539 | | | | | | | |
540 | | | `-----------' | | |
541 | | | <-- traffic | | |
542 | `--|----------------------------' | |
543 | | | |
544 | .---(F)----. | |
545 | | physical | | |
546 | | port 0 | | |
547 | `----------' | |
548 | ||
549 | Pattern Items And Actions | |
550 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | |
551 | ||
552 | PORT Pattern Item | |
553 | ^^^^^^^^^^^^^^^^^ | |
554 | ||
555 | Matches traffic originating from (ingress) or going to (egress) a physical | |
556 | port of the underlying device. | |
557 | ||
558 | Using this pattern item without specifying a port index matches the physical | |
559 | port associated with the current DPDK port ID by default. As described in | |
560 | `traffic steering`_, specifying it should rarely be needed. | |
561 | ||
562 | - Matches **F** in `traffic steering`_. | |
563 | ||
564 | PORT Action | |
565 | ^^^^^^^^^^^ | |
566 | ||
567 | Directs matching traffic to a given physical port index. | |
568 | ||
569 | - Targets **F** in `traffic steering`_. | |
570 | ||
571 | PORT_ID Pattern Item | |
572 | ^^^^^^^^^^^^^^^^^^^^ | |
573 | ||
574 | Matches traffic originating from (ingress) or going to (egress) a given DPDK | |
575 | port ID. | |
576 | ||
577 | Normally only supported if the port ID in question is known by the | |
578 | underlying PMD and related to the device the flow rule is created against. | |
579 | ||
580 | This must not be confused with the `PORT pattern item`_ which refers to the | |
581 | physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev`` | |
582 | object on the application side (also known as "port representor" depending | |
583 | on the kind of underlying device). | |
584 | ||
585 | - Matches **A**, **B** or **C** in `traffic steering`_. | |
586 | ||
587 | PORT_ID Action | |
588 | ^^^^^^^^^^^^^^ | |
589 | ||
590 | Directs matching traffic to a given DPDK port ID. | |
591 | ||
592 | Same restrictions as `PORT_ID pattern item`_. | |
593 | ||
594 | - Targets **A**, **B** or **C** in `traffic steering`_. | |
595 | ||
596 | PF Pattern Item | |
597 | ^^^^^^^^^^^^^^^ | |
598 | ||
599 | Matches traffic originating from (ingress) or going to (egress) the physical | |
600 | function of the current device. | |
601 | ||
602 | If supported, should work even if the physical function is not managed by | |
603 | the application and thus not associated with a DPDK port ID. Its behavior is | |
604 | otherwise similar to `PORT_ID pattern item`_ using PF port ID. | |
605 | ||
606 | - Matches **A** in `traffic steering`_. | |
607 | ||
608 | PF Action | |
609 | ^^^^^^^^^ | |
610 | ||
611 | Directs matching traffic to the physical function of the current device. | |
612 | ||
613 | Same restrictions as `PF pattern item`_. | |
614 | ||
615 | - Targets **A** in `traffic steering`_. | |
616 | ||
617 | VF Pattern Item | |
618 | ^^^^^^^^^^^^^^^ | |
619 | ||
620 | Matches traffic originating from (ingress) or going to (egress) a given | |
621 | virtual function of the current device. | |
622 | ||
623 | If supported, should work even if the virtual function is not managed by | |
624 | the application and thus not associated with a DPDK port ID. Its behavior is | |
625 | otherwise similar to `PORT_ID pattern item`_ using VF port ID. | |
626 | ||
627 | Note this pattern item does not match VF representor traffic which, as | |
628 | separate entities, should be addressed through their own port IDs. | |
629 | ||
630 | - Matches **D** or **E** in `traffic steering`_. | |
631 | ||
632 | VF Action | |
633 | ^^^^^^^^^ | |
634 | ||
635 | Directs matching traffic to a given virtual function of the current device. | |
636 | ||
637 | Same restrictions as `VF pattern item`_. | |
638 | ||
639 | - Targets **D** or **E** in `traffic steering`_. | |
640 | ||
641 | \*_ENCAP actions | |
642 | ^^^^^^^^^^^^^^^^ | |
643 | ||
644 | These actions are named according to the protocol they encapsulate traffic | |
645 | with (e.g. ``VXLAN_ENCAP``) and using specific parameters (e.g. VNI for | |
646 | VXLAN). | |
647 | ||
648 | While they modify traffic and can be used multiple times (order matters), | |
649 | unlike `PORT_ID action`_ and friends, they have no impact on steering. | |
650 | ||
651 | As described in `actions order and repetition`_ this means they are useless | |
652 | if used alone in an action list: the resulting traffic gets dropped unless | |
653 | combined with either ``PASSTHRU`` or other endpoint-targeting actions. | |
654 | ||
655 | \*_DECAP actions | |
656 | ^^^^^^^^^^^^^^^^ | |
657 | ||
658 | They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers | |
659 | from traffic instead of pushing them. They can be used multiple times as | |
660 | well. | |
661 | ||
662 | Note that using these actions on non-matching traffic results in undefined | |
663 | behavior. It is recommended to match the protocol headers to decapsulate on | |
664 | the pattern side of a flow rule in order to use these actions or otherwise | |
665 | make sure only matching traffic goes through. | |
666 | ||
667 | Actions Order and Repetition | |
668 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
669 | ||
670 | Flow rules are currently restricted to at most a single action of each | |
671 | supported type, performed in an unpredictable order (or all at once). To | |
672 | repeat actions in a predictable fashion, applications have to make rules | |
673 | pass-through and use priority levels. | |
674 | ||
675 | It's now clear that PMD support for chaining multiple non-terminating flow | |
676 | rules of varying priority levels is prohibitively difficult to implement | |
677 | compared to simply allowing multiple identical actions performed in a | |
678 | defined order by a single flow rule. | |
679 | ||
680 | - This change is required to support protocol encapsulation offloads and the | |
681 | ability to perform them multiple times (e.g. VLAN then VXLAN). | |
682 | ||
683 | - It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can | |
684 | be combined for duplication. | |
685 | ||
686 | - The (non-)terminating property of actions must be discarded. Instead, flow | |
687 | rules themselves must be considered terminating by default (i.e. dropping | |
688 | traffic if there is no specific target) unless a ``PASSTHRU`` action is | |
689 | also specified. | |
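The semantics listed above (actions applied in list order, repetition allowed, rules terminating unless ``PASSTHRU`` is present) can be modelled with a short C sketch. It is a toy interpreter written for this section, not the **rte_flow** implementation; the ``sketch_*`` names are hypothetical, and the encapsulation actions only adjust a depth counter instead of modifying packets.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_TARGETS 8

enum sketch_action_type {
	SKETCH_ACT_END,
	SKETCH_ACT_PORT_ID,
	SKETCH_ACT_VXLAN_ENCAP,
	SKETCH_ACT_VXLAN_DECAP,
	SKETCH_ACT_PASSTHRU,
};

struct sketch_action {
	enum sketch_action_type type;
	uint32_t arg; /* port ID or VNI, depending on the action */
};

struct sketch_outcome {
	uint32_t targets[SKETCH_MAX_TARGETS]; /* port IDs, in list order */
	size_t n_targets;
	int encap_depth; /* +1 per encap, -1 per decap */
	bool passthru;
	bool dropped;
};

/* Walk an END-terminated action list in order and summarize its effect. */
void sketch_run_actions(const struct sketch_action *act,
			struct sketch_outcome *out)
{
	out->n_targets = 0;
	out->encap_depth = 0;
	out->passthru = false;
	for (; act->type != SKETCH_ACT_END; act++) {
		switch (act->type) {
		case SKETCH_ACT_PORT_ID:
			/* Repeating a target action duplicates traffic. */
			if (out->n_targets < SKETCH_MAX_TARGETS)
				out->targets[out->n_targets++] = act->arg;
			break;
		case SKETCH_ACT_VXLAN_ENCAP:
			out->encap_depth++;
			break;
		case SKETCH_ACT_VXLAN_DECAP:
			out->encap_depth--;
			break;
		case SKETCH_ACT_PASSTHRU:
			out->passthru = true;
			break;
		default:
			break;
		}
	}
	/* Terminating by default: no target and no PASSTHRU means drop. */
	out->dropped = out->n_targets == 0 && !out->passthru;
}
```

In this model, a list holding only an encapsulation action yields ``dropped == true``, which is the reason the text above calls such a rule useless on its own.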
690 | ||
691 | Switching Examples | |
692 | ------------------ | |
693 | ||
694 | This section provides practical examples based on the established testpmd | |
695 | flow command syntax [2]_, in the context described in `traffic steering`_. | |
696 | ||
697 | :: | |
698 | ||
699 | .-------------. .-------------. .-------------. | |
700 | | hypervisor | | VM 1 | | VM 2 | | |
701 | | application | | application | | application | | |
702 | `--+---+---+--' `----------+--' `--+----------' | |
703 | | | | | | | |
704 | | | `-------------------. | | | |
705 | | `---------. | | | | |
706 | | | | | | | |
707 | .----(A)----. .----(B)----. .----(C)----. | | | |
708 | | port_id 3 | | port_id 4 | | port_id 5 | | | | |
709 | `-----+-----' `-----+-----' `-----+-----' | | | |
710 | | | | | | | |
711 | .-+--. .-----+-----. .-----+-----. .---+--. .--+---. | |
712 | | PF | | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 | | |
713 | `-+--' `-----+-----' `-----+-----' `--(D)-' `-(E)--' | |
714 | | | | | | | |
715 | | | .---------' | | | |
716 | `-----. | | .-----------------' | | |
717 | | | | | .---------------------' | |
718 | | | | | | | |
719 | .--|-------|---|---|---|--. | |
720 | | | | `---|---' | | |
721 | | | `-------' | | |
722 | | `---------. | | |
723 | `------------|------------' | |
724 | | | |
725 | .---(F)----. | |
726 | | physical | | |
727 | | port 0 | | |
728 | `----------' | |
729 | ||
730 | By default, PF (**A**) can communicate with the physical port it is | |
731 | associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated | |
732 | and restricted to communicate with the hypervisor application through their | |
733 | respective representors (**B** and **C**) if supported. | |
734 | ||
735 | Examples in subsequent sections apply to hypervisor applications only and | |
736 | are based on port representors **A**, **B** and **C**. | |
737 | ||
f67539c2 | 738 | .. [2] :ref:`Flow syntax <testpmd_rte_flow>` |
11fdf7f2 TL |
739 | |
740 | Associating VF 1 with Physical Port 0 | |
741 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
742 | ||
743 | Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through | |
744 | their representors: | |
745 | ||
746 | :: | |
747 | ||
748 | flow create 3 ingress pattern / end actions port_id id 4 / end | |
749 | flow create 4 ingress pattern / end actions port_id id 3 / end | |
750 | ||
751 | More practical example with MAC address restrictions | |
752 | ||
753 | :: | |
754 | ||
755 | flow create 3 ingress | |
756 | pattern eth dst is {VF 1 MAC} / end | |
757 | actions port_id id 4 / end | |
758 | ||
759 | :: | |
760 | ||
761 | flow create 4 ingress | |
762 | pattern eth src is {VF 1 MAC} / end | |
763 | actions port_id id 3 / end | |
764 | ||
765 | ||
766 | Sharing Broadcasts | |
767 | ~~~~~~~~~~~~~~~~~~ | |
768 | ||
769 | From outside to PF and VFs | |
770 | ||
771 | :: | |
772 | ||
773 | flow create 3 ingress | |
774 | pattern eth dst is ff:ff:ff:ff:ff:ff / end | |
775 | actions port_id id 3 / port_id id 4 / port_id id 5 / end | |
776 | ||
777 | Note ``port_id id 3`` is necessary otherwise only VFs would receive matching | |
778 | traffic. | |
779 | ||
780 | From PF to outside and VFs | |
781 | ||
782 | :: | |
783 | ||
784 | flow create 3 egress | |
785 | pattern eth dst is ff:ff:ff:ff:ff:ff / end | |
786 | actions port / port_id id 4 / port_id id 5 / end | |
787 | ||
788 | From VFs to outside and PF | |
789 | ||
790 | :: | |
791 | ||
792 | flow create 4 ingress | |
793 | pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end | |
794 | actions port_id id 3 / port_id id 5 / end | |
795 | ||
796 | flow create 5 ingress | |
797 | pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end | |
798 | actions port_id id 3 / port_id id 4 / end | |
799 | ||
800 | Similar ``33:33:*`` rules based on known MAC addresses should be added for | |
801 | IPv6 traffic. | |
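As an illustration of which destination addresses the rules above are concerned with, the following C sketch classifies a destination MAC as Ethernet broadcast or as IPv6 multicast (the ``33:33`` prefix mentioned above). It is a stand-alone helper with hypothetical names, not part of DPDK or testpmd, and it does not cover other multicast ranges.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for the all-ones Ethernet broadcast address ff:ff:ff:ff:ff:ff. */
bool sketch_is_broadcast(const uint8_t mac[6])
{
	size_t i;

	for (i = 0; i < 6; i++)
		if (mac[i] != 0xff)
			return false;
	return true;
}

/* True for addresses starting with 33:33, used for IPv6 multicast. */
bool sketch_is_ipv6_mcast(const uint8_t mac[6])
{
	return mac[0] == 0x33 && mac[1] == 0x33;
}

/* True when a frame should be shared between PF, VFs and the outside. */
bool sketch_needs_sharing(const uint8_t mac[6])
{
	return sketch_is_broadcast(mac) || sketch_is_ipv6_mcast(mac);
}
```

A software fallback path in the hypervisor application could use such a check to decide whether a frame it received needs to be replicated to the other endpoints.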
802 | ||
803 | Encapsulating VF 2 Traffic in VXLAN | |
804 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
805 | ||
806 | Assuming pass-through flow rules are supported | |
807 | ||
808 | :: | |
809 | ||
810 | flow create 5 ingress | |
811 | pattern eth / end | |
812 | actions vxlan_encap vni 42 / passthru / end | |
813 | ||
814 | :: | |
815 | ||
816 | flow create 5 egress | |
817 | pattern vxlan vni is 42 / end | |
818 | actions vxlan_decap / passthru / end | |
819 | ||
820 | Here ``passthru`` is needed since, as described in `actions order and | |
821 | repetition`_, flow rules are otherwise terminating; if supported, a rule | |
822 | without a target endpoint will drop traffic. | |
823 | ||
824 | Without pass-through support, ingress encapsulation on the destination | |
825 | endpoint might not be supported and the action list must provide one: | |
826 | ||
827 | :: | |
828 | ||
829 | flow create 5 ingress | |
830 | pattern eth src is {VF 2 MAC} / end | |
831 | actions vxlan_encap vni 42 / port_id id 3 / end | |
832 | ||
833 | flow create 3 ingress | |
834 | pattern vxlan vni is 42 / end | |
835 | actions vxlan_decap / port_id id 5 / end |