..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2016 Intel Corporation.

Flow Bifurcation How-to Guide
=============================

Flow Bifurcation is a mechanism which uses hardware-capable Ethernet devices
to split traffic between Linux user space and kernel space. Since it is a
hardware-assisted feature, this approach can provide line rate processing
capability. Unlike :ref:`KNI <kni>`, the software is only required to
enable the device configuration; there is no need to take care of the packet
movement during the traffic split. This can yield better performance with
less CPU overhead.

Flow Bifurcation splits the incoming data traffic between user space
applications (such as DPDK applications) and/or kernel space programs (such as
the Linux kernel stack). It can direct some traffic, for example data plane
traffic, to DPDK, while directing some other traffic, for example control
plane traffic, to the traditional Linux networking stack.

There are a number of technical options to achieve this. A typical example is
to combine SR-IOV with packet classification filtering.

SR-IOV is a PCI standard that allows the same physical adapter to be split into
multiple virtual functions. Each virtual function (VF) has queues that are
separate from those of the physical function (PF). The network adapter directs
traffic to a virtual function with a matching destination MAC address. In a
sense, SR-IOV provides the capability for queue division.

Packet classification filtering is a hardware capability available on most
network adapters. Filters can be configured so that the hardware directs
specific flows to a given receive queue. Different NICs may have different
filter types to direct flows to a Virtual Function or to a queue that belongs
to it.

In this way the Linux networking stack can receive specific traffic through
the kernel driver while a DPDK application can receive specific traffic
bypassing the Linux kernel by using drivers like VFIO or the DPDK ``igb_uio``
module.

.. _figure_flow_bifurcation_overview:

.. figure:: img/flow_bifurcation_overview.*

   Flow Bifurcation Overview


Using Flow Bifurcation on Mellanox ConnectX
-------------------------------------------

The Mellanox devices are :ref:`natively bifurcated <bifurcated_driver>`,
so there is no need to split into SR-IOV PF/VF
in order to get the flow bifurcation mechanism.
The full device is already shared with the kernel driver.

The DPDK application can set up some flow steering rules,
and let the rest go to the kernel stack.
In order to define the filters strictly with flow rules,
the :ref:`flow_isolated_mode` can be configured.

There are no specific instructions to follow.
The recommended reading is the :doc:`../prog_guide/rte_flow` guide.
Below is an example of testpmd commands
for receiving VXLAN traffic with VNI 42 in 4 queues of the DPDK port 0,
while all other packets go to the kernel:

.. code-block:: console

   testpmd> flow isolate 0 true
   testpmd> flow create 0 ingress pattern eth / ipv4 / udp / vxlan vni is 42 / end \
            actions rss queues 0 1 2 3 end / end


Using Flow Bifurcation on IXGBE in Linux
----------------------------------------

On Intel 82599 10 Gigabit Ethernet Controller series NICs, Flow Bifurcation can
be achieved by SR-IOV and Intel Flow Director technologies. Traffic can be
directed to queues by the Flow Director capability, typically by matching
the 5-tuple of UDP/TCP packets.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Create Virtual Functions:

   .. code-block:: console

       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Enable and set flow filters:

   .. code-block:: console

       ethtool -K eth1 ntuple on
       ethtool -N eth1 flow-type udp4 src-ip 192.0.2.2 dst-ip 198.51.100.2 \
               action $queue_index_in_VF0
       ethtool -N eth1 flow-type udp4 src-ip 198.51.100.2 dst-ip 192.0.2.2 \
               action $queue_index_in_VF1

   Where:

   * ``$queue_index_in_VFn``: Bits 39:32 of the variable define the VF id + 1;
     the lower 32 bits indicate the queue index of the VF. Thus:

     * ``$queue_index_in_VF0`` = ``((0x1 & 0xFF) << 32) + [queue index]``.

     * ``$queue_index_in_VF1`` = ``((0x2 & 0xFF) << 32) + [queue index]``.

   .. _figure_ixgbe_bifu_queue_idx:

   .. figure:: img/ixgbe_bifu_queue_idx.*

#. Compile the DPDK application and insert the ``igb_uio`` kernel module or
   probe the ``vfio-pci`` kernel module as normal.

#. Bind the virtual functions:

   .. code-block:: console

       modprobe vfio-pci
       dpdk-devbind.py -b vfio-pci 01:10.0
       dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs:

   .. code-block:: console

       testpmd -l 0-7 -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac

In this example, traffic matching the filter rules goes to the VFs.
All other traffic, not matching the rules, goes to the default queue
or is spread across the PF queues by scaling. That is to say, UDP packets
with the specified IP source and destination addresses go through the
DPDK application. All other traffic, with different hosts or different
protocols, goes through the Linux networking stack.

.. note::

   * The above steps work on the Linux kernel v4.2.

   * Flow Bifurcation is implemented in the Linux kernel and the ixgbe kernel
     driver using the following patches:

     * `ethtool: Add helper routines to pass vf to rx_flow_spec <https://patchwork.ozlabs.org/patch/476511/>`_

     * `ixgbe: Allow flow director to use entire queue space <https://patchwork.ozlabs.org/patch/476516/>`_

   * The Ethtool version used in this example is 3.18.


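The ``$queue_index_in_VFn`` values used in the IXGBE filter step can be
computed with shell arithmetic. The sketch below assumes a 64-bit shell
integer type (the usual case on Linux); the function name ``ixgbe_vf_action``
is hypothetical and only illustrates the bit layout described above, with
bits 39:32 carrying the VF id + 1 and the lower 32 bits carrying the queue
index.

```shell
#!/bin/sh
# Hypothetical helper: build the ethtool "action" value for an ixgbe VF queue.
# Bits 39:32 carry (VF id + 1); the lower 32 bits carry the queue index.
ixgbe_vf_action() {
    vf="$1"      # VF id, starting at 0
    queue="$2"   # queue index inside that VF
    echo $(( ((($vf + 1) & 0xFF) << 32) + $queue ))
}

queue_index_in_VF0=$(ixgbe_vf_action 0 0)   # 4294967296 (0x100000000)
queue_index_in_VF1=$(ixgbe_vf_action 1 0)   # 8589934592 (0x200000000)
printf 'VF0 action: %s\nVF1 action: %s\n' \
    "$queue_index_in_VF0" "$queue_index_in_VF1"
```

The resulting variables can then be substituted into the ``ethtool -N``
commands of the filter step.
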
Using Flow Bifurcation on I40E in Linux
---------------------------------------

On Intel X710/XL710 series Ethernet Controllers, Flow Bifurcation can be
achieved by SR-IOV, Cloud Filter and L3 VEB switch. The traffic can be
directed to queues by the matching rules of the Cloud Filter and L3 VEB switch.

* L3 VEB filters work for non-tunneled packets. They can direct a packet to a
  queue in a VF based on the destination IP address alone.

* Cloud filters work for the following types of tunneled packets:

  * Inner mac.

  * Inner mac + VNI.

  * Outer mac + Inner mac + VNI.

  * Inner mac + Inner vlan + VNI.

  * Inner mac + Inner vlan.

The typical procedure to achieve this is as follows:

#. Boot the system without iommu, or with ``iommu=pt``.

#. Build and insert the ``i40e.ko`` module.

#. Create Virtual Functions:

   .. code-block:: console

       echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs

#. Add UDP port offload to the NIC if using a cloud filter:

   .. code-block:: console

       ip li add vxlan0 type vxlan id 42 group 239.1.1.1 local 10.16.43.214 dev <name>
       ifconfig vxlan0 up
       ip -d li show vxlan0

   .. note::

      Output such as ``add vxlan port 8472, index 0 success`` should be
      found in the system log.

#. Examples of enabling and setting flow filters:

   * L3 VEB filter, directing packets whose destination IP is 192.168.50.108
     to queue 2 of VF 0.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ip4 dst-ip 192.168.50.108 \
                 user-def 0xffffffff00000000 action 2 loc 8

   * Inner mac, directing packets whose inner destination mac is 0:0:0:0:9:0
     to queue 6 of the PF.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
                 user-def 0xffffffff00000003 action 6 loc 1

   * Inner mac + VNI, directing packets whose inner destination mac is
     0:0:0:0:9:0 and whose VNI is 8 to queue 4 of the PF.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:00:00 \
                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:09:00 m 00:00:00:00:00:00 \
                 user-def 0x800000003 action 4 loc 4

   * Outer mac + Inner mac + VNI, directing packets whose outer mac is
     68:05:ca:24:03:8b, inner destination mac is c2:1a:e1:53:bc:57, and VNI
     is 8 to queue 2 of the PF.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ether dst 68:05:ca:24:03:8b \
                 m 00:00:00:00:00:00 src c2:1a:e1:53:bc:57 m 00:00:00:00:00:00 \
                 user-def 0x800000003 action 2 loc 2

   * Inner mac + Inner vlan + VNI, directing packets whose inner destination
     mac is 00:00:00:00:20:00, inner vlan is 10, and VNI is 8 to queue 1 of
     VF 0.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
                 vlan 10 user-def 0x800000000 action 1 loc 5

   * Inner mac + Inner vlan, directing packets whose inner destination mac is
     00:00:00:00:20:00 and inner vlan is 10 to queue 1 of VF 0.

     .. code-block:: console

         ethtool -N <dev_name> flow-type ether dst 00:00:00:00:01:00 \
                 m ff:ff:ff:ff:ff:ff src 00:00:00:00:20:00 m 00:00:00:00:00:00 \
                 vlan 10 user-def 0xffffffff00000000 action 1 loc 5

   .. note::

      * If the upper 32 bits of ``user-def`` are ``0xffffffff``, then the
        filter can be used for programming an L3 VEB filter; otherwise the
        upper 32 bits of ``user-def`` can carry the tenant ID/VNI if
        specified/required.

      * Cloud filters can be defined with inner mac, outer mac, inner ip,
        inner vlan and VNI as part of the cloud tuple. These filters always
        use the destination (not source) mac/ip. In all of these examples
        the ``dst`` and ``src`` mac address fields are overloaded:
        ``dst`` == outer, ``src`` == inner.

      * The filter directs a packet matching the rule to the VF id
        specified in the lower 32 bits of ``user-def``, and to the queue
        specified by ``action``.

      * If the VF id specified by the lower 32 bits of ``user-def`` is greater
        than or equal to ``max_vfs``, then the filter is for the PF queues.

#. Compile the DPDK application and insert the ``igb_uio`` kernel module or
   probe the ``vfio-pci`` kernel module as normal.

#. Bind the virtual functions:

   .. code-block:: console

       modprobe vfio-pci
       dpdk-devbind.py -b vfio-pci 01:10.0
       dpdk-devbind.py -b vfio-pci 01:10.1

#. Run a DPDK application on the VFs:

   .. code-block:: console

       testpmd -l 0-7 -n 4 -w 01:10.0 -w 01:10.1 -- -i --forward-mode=mac

.. note::

   * The above steps work on the i40e Linux kernel driver v1.5.16.

   * The Ethtool version used in this example is 3.18. The mask ``ff`` means
     'not involved', while ``00`` or no mask means 'involved'.

   * For more details of the configuration, refer to the
     `cloud filter test plan <http://git.dpdk.org/tools/dts/tree/test_plans/cloud_filter_test_plan.rst>`_

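The ``user-def`` values in the I40E filter examples follow the layout given
in the note of the filter step: the upper 32 bits hold either ``0xffffffff``
(L3 VEB filter) or the tenant ID/VNI, and the lower 32 bits hold the VF id.
The sketch below composes such a value; the function name ``i40e_user_def``
and the ``l3veb`` keyword are hypothetical and used for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: build the 64-bit ethtool "user-def" value for i40e.
# Upper 32 bits: tenant ID/VNI, or 0xffffffff for an L3 VEB filter.
# Lower 32 bits: VF id (a value >= max_vfs selects the PF queues).
i40e_user_def() {
    upper="$1"   # VNI, or the literal keyword "l3veb"
    vf="$2"      # VF id
    if [ "$upper" = "l3veb" ]; then
        upper=$((0xffffffff))
    fi
    # Print the two halves separately to stay within signed 64-bit arithmetic.
    printf '0x%x%08x\n' "$upper" "$vf"
}

i40e_user_def l3veb 0   # 0xffffffff00000000
i40e_user_def 8 3       # 0x800000003
i40e_user_def 8 0       # 0x800000000
```

With two VFs created as in the procedure above, a VF id of 3 in the second
call is not below ``max_vfs``, so that value selects the PF queues.
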