..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

L3 Forwarding Sample Application
================================

The L3 Forwarding application is a simple example of packet processing using the DPDK.
The application performs L3 forwarding.

Overview
--------

The application demonstrates the use of the hash and LPM libraries in the DPDK to implement packet forwarding.
The initialization and run-time paths are very similar to those of the :doc:`l2_forward_real_virtual`.
The main difference from the L2 Forwarding sample application is that the forwarding decision
is made based on information read from the input packet.

The lookup method is either hash-based or LPM-based and is selected at run time. When the selected lookup method is hash-based,
a hash object is used to emulate the flow classification stage.
The hash object is used in conjunction with a flow table to map each input packet to its flow at run time.

The hash lookup key is represented by a DiffServ 5-tuple composed of the following fields read from the input packet:
Source IP Address, Destination IP Address, Protocol, Source Port and Destination Port.
The ID of the output interface for the input packet is read from the identified flow table entry.
The set of flows used by the application is statically configured and loaded into the hash at initialization time.

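The hash-based path described above (5-tuple key, flow table entry, output interface) can be sketched in plain C. This is a minimal illustration, not the application's code: the ``flow_key`` and ``flow_entry`` types, the table contents and ``flow_lookup()`` are hypothetical names, and a linear scan stands in for the DPDK hash library so that the example stays self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 5-tuple key; field names are illustrative, not DPDK's. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;
};

/* One flow table entry: the key and the output interface it maps to. */
struct flow_entry {
    struct flow_key key;
    uint16_t out_port;
};

/* Statically configured flows, fixed at initialization time (made-up values). */
static const struct flow_entry flow_table[] = {
    { { 0x0A000001u, 0x0A000101u, 1000, 2000, 6 }, 0 },
    { { 0x0A000002u, 0x0A000102u, 1001, 2001, 17 }, 1 },
};

/* Field-wise comparison avoids depending on struct padding bytes. */
static int
flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/*
 * Exact-match lookup. A linear scan replaces the hash table here
 * (assumption made for brevity). On a miss, the caller-supplied
 * fallback port is returned.
 */
static uint16_t
flow_lookup(const struct flow_key *key, uint16_t fallback_port)
{
    size_t i;

    for (i = 0; i < sizeof(flow_table) / sizeof(flow_table[0]); i++) {
        if (flow_key_equal(&flow_table[i].key, key))
            return flow_table[i].out_port;
    }
    return fallback_port;
}
```

Calling ``flow_lookup()`` with a key that matches the first entry yields output port 0, while a key that differs in any of the five fields falls through to the fallback port. The exact-match nature of the lookup is the point being illustrated.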

When the selected lookup method is LPM-based, an LPM object is used to emulate the forwarding stage for IPv4 packets.
The LPM object is used as the routing table to identify the next hop for each input packet at run time.

The LPM lookup key is represented by the Destination IP Address field read from the input packet.
The ID of the output interface for the input packet is the next hop returned by the LPM lookup.
The set of LPM rules used by the application is statically configured and loaded into the LPM object at initialization time.

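To make the longest-prefix-match idea concrete, the following sketch looks up a destination address in a small static route table and returns the next hop of the most specific matching prefix. It is an illustration under stated assumptions: the ``route`` type, the route values and ``lpm_lookup_sketch()`` are hypothetical, and the linear scan is not how the DPDK LPM library stores or searches its rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry: prefix in host byte order, depth in bits. */
struct route {
    uint32_t prefix;
    uint8_t  depth;     /* 1..32 */
    uint16_t next_hop;  /* used directly as the output port ID */
};

/* Made-up static rules, fixed at initialization time. */
static const struct route routes[] = {
    { 0x0A000000u,  8, 0 },  /* 10.0.0.0/8  -> port 0 */
    { 0x0A010000u, 16, 1 },  /* 10.1.0.0/16 -> port 1 */
    { 0x0A010100u, 24, 2 },  /* 10.1.1.0/24 -> port 2 */
};

/* Network mask for a prefix depth of 1..32 bits. */
static uint32_t
depth_mask(uint8_t depth)
{
    return 0xFFFFFFFFu << (32 - depth);
}

/*
 * Return 0 and store the next hop of the longest matching prefix,
 * or return -1 if no rule covers dst_ip (next_hop is left untouched).
 */
static int
lpm_lookup_sketch(uint32_t dst_ip, uint16_t *next_hop)
{
    int best_depth = -1;
    uint16_t best_hop = 0;
    size_t i;

    for (i = 0; i < sizeof(routes) / sizeof(routes[0]); i++) {
        uint32_t mask = depth_mask(routes[i].depth);

        if ((dst_ip & mask) == routes[i].prefix &&
            (int)routes[i].depth > best_depth) {
            best_depth = routes[i].depth;
            best_hop = routes[i].next_hop;
        }
    }
    if (best_depth < 0)
        return -1;
    *next_hop = best_hop;
    return 0;
}
```

With these rules, 10.1.1.5 matches all three prefixes and resolves to port 2 (the /24 wins), 10.1.127.1 resolves to port 1, and 192.168.0.1 misses.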
In the sample application, hash-based forwarding supports IPv4 and IPv6. LPM-based forwarding supports IPv4 only.

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``l3fwd`` sub-directory.

Running the Application
-----------------------

The application has a number of command line options::

    ./l3fwd [EAL options] -- -p PORTMASK
                             [-P]
                             [-E]
                             [-L]
                             --config (port,queue,lcore)[,(port,queue,lcore)]
                             [--eth-dest=X,MM:MM:MM:MM:MM:MM]
                             [--enable-jumbo [--max-pkt-len PKTLEN]]
                             [--no-numa]
                             [--hash-entry-num]
                             [--ipv6]
                             [--parse-ptype]
                             [--per-port-pool]

Where,

* ``-p PORTMASK:`` Hexadecimal bitmask of ports to configure.

* ``-P:`` Optional, sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address.
  Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.

* ``-E:`` Optional, enables exact match.

* ``-L:`` Optional, enables longest prefix match.

* ``--config (port,queue,lcore)[,(port,queue,lcore)]:`` Determines which queues from which ports are mapped to which cores.

* ``--eth-dest=X,MM:MM:MM:MM:MM:MM:`` Optional, Ethernet destination address for port X.

* ``--enable-jumbo:`` Optional, enables jumbo frames.

* ``--max-pkt-len:`` Optional, valid only when jumbo frames are enabled; maximum packet length in decimal (64-9600).

* ``--no-numa:`` Optional, disables NUMA awareness.

* ``--hash-entry-num:`` Optional, specifies the number of hash entries to set up, in hexadecimal.

* ``--ipv6:`` Optional, set when forwarding IPv6 packets.

* ``--parse-ptype:`` Optional, set to use software to analyze the packet type. Without this option, hardware checks the packet type.

* ``--per-port-pool:`` Optional, set to use independent buffer pools per port. Without this option, a single buffer pool is used for all ports.

For example, consider a dual processor socket platform with 8 physical cores per socket, where logical cores 0-7 and 16-23 appear on socket 0,
while cores 8-15 and 24-31 appear on socket 1.

To enable L3 forwarding between two ports, assuming that both ports are on the same socket, using two cores, cores 1 and 2
(which are on the same socket too), use the following command:

.. code-block:: console

    ./build/l3fwd -l 1,2 -n 4 -- -p 0x3 --config="(0,0,1),(1,0,2)"

In this command:

*   The -l option enables cores 1 and 2

*   The -p option enables ports 0 and 1

*   The --config option enables one queue on each port and maps each (port,queue) pair to a specific core.
    The following table shows the mapping in this example:

+----------+-----------+-----------+-------------------------------------+
| **Port** | **Queue** | **lcore** | **Description**                     |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+
| 0        | 0         | 1         | Map queue 0 from port 0 to lcore 1. |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+
| 1        | 0         | 2         | Map queue 0 from port 1 to lcore 2. |
|          |           |           |                                     |
+----------+-----------+-----------+-------------------------------------+

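As a rough illustration of how the ``-p`` and ``--config`` arguments in the example map to ports, queues and lcores, the sketch below parses both strings with the C standard library. The ``lcore_params_sketch`` type and the two function names are hypothetical, and the parsing is deliberately simplified; the application's own option parser may differ in validation and limits.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical holder for one (port,queue,lcore) tuple. */
struct lcore_params_sketch {
    unsigned int port;
    unsigned int queue;
    unsigned int lcore;
};

/* Parse a hexadecimal port mask such as "0x3"; returns 0 on error. */
static uint32_t
parse_portmask_sketch(const char *arg)
{
    char *end = NULL;
    unsigned long mask = strtoul(arg, &end, 16);

    if (arg[0] == '\0' || end == NULL || *end != '\0')
        return 0;
    return (uint32_t)mask;
}

/*
 * Parse "(p,q,c)[,(p,q,c)]..." into out[].
 * Returns the number of tuples, or -1 on malformed input or overflow.
 */
static int
parse_config_sketch(const char *arg, struct lcore_params_sketch *out, int max)
{
    const char *p = arg;
    int n = 0;

    while (*p != '\0') {
        unsigned int port, queue, lcore;
        int used = 0;

        /* %n is only stored if the closing ')' was matched. */
        if (n == max ||
            sscanf(p, "(%u,%u,%u)%n", &port, &queue, &lcore, &used) != 3 ||
            used == 0)
            return -1;

        out[n].port = port;
        out[n].queue = queue;
        out[n].lcore = lcore;
        n++;
        p += used;

        if (*p == ',') {
            p++;
            if (*p == '\0')
                return -1;  /* trailing comma */
        } else if (*p != '\0') {
            return -1;      /* unexpected character between tuples */
        }
    }
    return n;
}
```

For the command shown above, ``parse_portmask_sketch("0x3")`` yields a mask with bits 0 and 1 set, and ``parse_config_sketch("(0,0,1),(1,0,2)", ...)`` yields two tuples that match the rows of the table. Testing ``(1u << port) & mask`` for each tuple is one way to confirm that a configured port is also enabled.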
Refer to the *DPDK Getting Started Guide* for general information on running applications and
the Environment Abstraction Layer (EAL) options.

.. _l3_fwd_explanation:

Explanation
-----------

The following sections provide some explanation of the sample application code. As mentioned in the overview section,
the initialization and run-time paths are very similar to those of the :doc:`l2_forward_real_virtual`.
The following sections describe aspects that are specific to the L3 Forwarding sample application.

Hash Initialization
~~~~~~~~~~~~~~~~~~~

The hash object is created and loaded with the pre-configured entries read from a global array.
For hash performance tests on 4M/8M/16M flows, the application instead generates the expected 5-tuple keys
so that they are consistent with those of real flows.

.. note::

    The hash initialization sets up both the IPv4 and IPv6 hash tables
    and populates one of them, depending on the value of the variable ipv6.
    To support the hash performance test with up to 8M single-direction flows/16M bidirectional flows,
    the populate_ipv4_many_flow_into_table() function populates the hash table with the specified number of hash table entries (default 4M).

.. note::

    The value of the global variable ipv6 can be specified with --ipv6 on the command line.
    The value of the global variable hash_entry_number,
    which specifies the total number of hash entries for all used ports in the hash performance test,
    can be specified with --hash-entry-num VALUE on the command line; its default value is 4.

.. code-block:: c

    #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)

    static void
    setup_hash(int socketid)
    {
        // ...

        if (hash_entry_number != HASH_ENTRY_NUMBER_DEFAULT) {
            if (ipv6 == 0) {
                /* populate the ipv4 hash */
                populate_ipv4_many_flow_into_table(ipv4_l3fwd_lookup_struct[socketid], hash_entry_number);
            } else {
                /* populate the ipv6 hash */
                populate_ipv6_many_flow_into_table(ipv6_l3fwd_lookup_struct[socketid], hash_entry_number);
            }
        } else {
            if (ipv6 == 0) {
                /* populate the ipv4 hash */
                populate_ipv4_few_flow_into_table(ipv4_l3fwd_lookup_struct[socketid]);
            } else {
                /* populate the ipv6 hash */
                populate_ipv6_few_flow_into_table(ipv6_l3fwd_lookup_struct[socketid]);
            }
        }
    }
    #endif

LPM Initialization
~~~~~~~~~~~~~~~~~~

The LPM object is created and loaded with the pre-configured entries read from a global array.

.. code-block:: c

    #if (APP_LOOKUP_METHOD == APP_LOOKUP_LPM)

    static void
    setup_lpm(int socketid)
    {
        unsigned i;
        int ret;
        char s[64];

        /* create the LPM table */

        snprintf(s, sizeof(s), "IPV4_L3FWD_LPM_%d", socketid);

        ipv4_l3fwd_lookup_struct[socketid] = rte_lpm_create(s, socketid, IPV4_L3FWD_LPM_MAX_RULES, 0);

        if (ipv4_l3fwd_lookup_struct[socketid] == NULL)
            rte_exit(EXIT_FAILURE, "Unable to create the l3fwd LPM table"
                " on socket %d\n", socketid);

        /* populate the LPM table */

        for (i = 0; i < IPV4_L3FWD_NUM_ROUTES; i++) {
            /* skip unused ports */

            if (((1 << ipv4_l3fwd_route_array[i].if_out) & enabled_port_mask) == 0)
                continue;

            ret = rte_lpm_add(ipv4_l3fwd_lookup_struct[socketid], ipv4_l3fwd_route_array[i].ip,
                        ipv4_l3fwd_route_array[i].depth, ipv4_l3fwd_route_array[i].if_out);

            if (ret < 0) {
                rte_exit(EXIT_FAILURE, "Unable to add entry %u to the "
                    "l3fwd LPM table on socket %d\n", i, socketid);
            }

            printf("LPM: Adding route 0x%08x / %d (%d)\n",
                (unsigned)ipv4_l3fwd_route_array[i].ip, ipv4_l3fwd_route_array[i].depth, ipv4_l3fwd_route_array[i].if_out);
        }
    }
    #endif

Packet Forwarding for Hash-based Lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward()
or simple_ipv4_fwd_4pkts() function for IPv4 packets or the simple_ipv6_fwd_4pkts() function for IPv6 packets.
The l3fwd_simple_forward() function provides the basic functionality for both IPv4 and IPv6 packet forwarding
for any number of burst packets received,
and the packet forwarding decision (that is, the identification of the output interface for the packet)
for hash-based lookups is done by the get_ipv4_dst_port() or get_ipv6_dst_port() function.
The get_ipv4_dst_port() function is shown below:

.. code-block:: c

    static inline uint8_t
    get_ipv4_dst_port(void *ipv4_hdr, uint16_t portid, lookup_struct_t *ipv4_l3fwd_lookup_struct)
    {
        int ret = 0;
        union ipv4_5tuple_host key;

        ipv4_hdr = (uint8_t *)ipv4_hdr + offsetof(struct ipv4_hdr, time_to_live);

        __m128i data = _mm_loadu_si128((__m128i *)(ipv4_hdr));

        /* Get 5 tuple: dst port, src port, dst IP address, src IP address and protocol */

        key.xmm = _mm_and_si128(data, mask0);

        /* Find destination port */

        ret = rte_hash_lookup(ipv4_l3fwd_lookup_struct, (const void *)&key);

        return (uint8_t)((ret < 0) ? portid : ipv4_l3fwd_out_if[ret]);
    }

The get_ipv6_dst_port() function is similar to the get_ipv4_dst_port() function.

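The load-and-mask step in get_ipv4_dst_port() can be easier to follow without SIMD intrinsics. The sketch below copies the 16 bytes that start at the TTL field and clears the bytes that are not part of the 5-tuple. It assumes a 20-byte IPv4 header with no options, followed directly by a TCP or UDP header, so that the ports sit immediately after the destination address. The mask contents are written out from that layout rather than copied from the application's ``mask0``, and ``build_key_sketch()`` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_TTL_OFFSET 8   /* byte offset of time_to_live in the IPv4 header */
#define KEY_LEN         16  /* TTL .. first four bytes of the L4 header */

/* Byte-wise mask over the 16 bytes starting at the TTL field. */
static const uint8_t key_mask[KEY_LEN] = {
    0x00,                    /* TTL: changes per hop, not part of the key */
    0xFF,                    /* protocol */
    0x00, 0x00,              /* header checksum: not part of the key */
    0xFF, 0xFF, 0xFF, 0xFF,  /* source IP address */
    0xFF, 0xFF, 0xFF, 0xFF,  /* destination IP address */
    0xFF, 0xFF,              /* source port (assumes no IP options) */
    0xFF, 0xFF,              /* destination port */
};

/* Build the masked lookup key from a pointer to the start of the IPv4 header. */
static void
build_key_sketch(const uint8_t *ipv4_hdr, uint8_t key[KEY_LEN])
{
    size_t i;

    memcpy(key, ipv4_hdr + IPV4_TTL_OFFSET, KEY_LEN);
    for (i = 0; i < KEY_LEN; i++)
        key[i] &= key_mask[i];
}
```

Two packets of the same flow that differ only in TTL and checksum produce identical keys, which is what lets the key be used for an exact-match lookup. If the header carries options, or the payload is not TCP or UDP, the last four key bytes no longer hold the ports, which is consistent with the known issue noted for this mode.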

The simple_ipv4_fwd_4pkts() and simple_ipv6_fwd_4pkts() functions are optimized for four consecutive valid IPv4 and IPv6 packets, respectively.
They leverage the multiple buffer optimization to boost the performance of forwarding packets with exact match on the hash table.
The key code snippet of simple_ipv4_fwd_4pkts() is shown below:

.. code-block:: c

    static inline void
    simple_ipv4_fwd_4pkts(struct rte_mbuf* m[4], uint16_t portid, struct lcore_conf *qconf)
    {
        // ...

        data[0] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[0], unsigned char *) +
                      sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
        data[1] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[1], unsigned char *) +
                      sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
        data[2] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[2], unsigned char *) +
                      sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));
        data[3] = _mm_loadu_si128((__m128i *)(rte_pktmbuf_mtod(m[3], unsigned char *) +
                      sizeof(struct ether_hdr) + offsetof(struct ipv4_hdr, time_to_live)));

        key[0].xmm = _mm_and_si128(data[0], mask0);
        key[1].xmm = _mm_and_si128(data[1], mask0);
        key[2].xmm = _mm_and_si128(data[2], mask0);
        key[3].xmm = _mm_and_si128(data[3], mask0);

        const void *key_array[4] = {&key[0], &key[1], &key[2], &key[3]};

        rte_hash_lookup_bulk(qconf->ipv4_lookup_struct, &key_array[0], 4, ret);

        dst_port[0] = (ret[0] < 0) ? portid : ipv4_l3fwd_out_if[ret[0]];
        dst_port[1] = (ret[1] < 0) ? portid : ipv4_l3fwd_out_if[ret[1]];
        dst_port[2] = (ret[2] < 0) ? portid : ipv4_l3fwd_out_if[ret[2]];
        dst_port[3] = (ret[3] < 0) ? portid : ipv4_l3fwd_out_if[ret[3]];

        // ...
    }

The simple_ipv6_fwd_4pkts() function is similar to the simple_ipv4_fwd_4pkts() function.

Known issue: IP packets with extensions, and IP packets that are not TCP/UDP, are not handled correctly in this mode.

Packet Forwarding for LPM-based Lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward() function,
but the packet forwarding decision (that is, the identification of the output interface for the packet)
for LPM-based lookups is done by the get_ipv4_dst_port() function below:

.. code-block:: c

    static inline uint16_t
    get_ipv4_dst_port(struct ipv4_hdr *ipv4_hdr, uint16_t portid, lookup_struct_t *ipv4_l3fwd_lookup_struct)
    {
        /* rte_lpm_lookup() takes a uint32_t pointer for the next hop in DPDK 16.04 and later */
        uint32_t next_hop;

        return (uint16_t)((rte_lpm_lookup(ipv4_l3fwd_lookup_struct,
                rte_be_to_cpu_32(ipv4_hdr->dst_addr), &next_hop) == 0) ? next_hop : portid);
    }

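The destination address is stored in the packet in network (big-endian) byte order, so it is converted to host order before it is used as the LPM key. As a portable illustration of that conversion, the hypothetical helper below assembles the host-order value from the four address bytes. It is a sketch of the idea, not the implementation of rte_be_to_cpu_32().

```c
#include <stdint.h>

/*
 * Convert a 4-byte big-endian (network order) field to a host-order
 * integer. Works regardless of the host's endianness because it
 * operates on individual bytes.
 */
static uint32_t
be32_bytes_to_host(const uint8_t bytes[4])
{
    return ((uint32_t)bytes[0] << 24) |
           ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |
            (uint32_t)bytes[3];
}
```

For the address 10.1.1.5, whose on-wire bytes are 10, 1, 1, 5, the helper returns 0x0A010105. Routes written in host order can be compared against that value directly.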