..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

IP Reassembly Sample Application
================================

The IP Reassembly sample application is a simple example of packet processing using the DPDK.
The application performs L3 forwarding with reassembly for fragmented IPv4 and IPv6 packets.

Overview
--------

The application demonstrates the use of the DPDK libraries to implement packet forwarding
with reassembly for IPv4 and IPv6 fragmented packets.
The initialization and run-time paths are very similar to those of the :doc:`l2_forward_real_virtual`.
The main difference from the L2 Forwarding sample application is that
it reassembles fragmented IPv4 and IPv6 packets before forwarding.
The maximum allowed size of a reassembled packet is 9.5 KB.

There are two further differences from the L2 Forwarding sample application:

*   The forwarding decision is based on information read from the input packet's IP header.

*   The application differentiates between IP and non-IP traffic by means of offload flags.

The Longest Prefix Match (LPM for IPv4, LPM6 for IPv6) table is used to store and look up the outgoing port number
associated with the packet's destination IP address. Any unmatched packets are forwarded to the originating port.

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``ip_reassembly`` sub-directory.

Running the Application
-----------------------

The application has a number of command line options:

.. code-block:: console

    ./build/ip_reassembly [EAL options] -- -p PORTMASK [-q NQ] [--maxflows=FLOWS] [--flowttl=TTL[(s|ms)]]

where:

*   -p PORTMASK: Hexadecimal bitmask of ports to configure

*   -q NQ: Number of RX queues per lcore

*   --maxflows=FLOWS: Maximum number of active fragmented flows (1-65535). Default value: 4096.

*   --flowttl=TTL[(s|ms)]: Maximum Time To Live for a fragmented packet.
    If not all fragments of the packet arrive within the given time-out,
    the fragments already received are considered invalid and are dropped.
    Valid range is 1ms - 3600s. Default value: 1s.
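
To make the ``-p PORTMASK`` option concrete, the following is a minimal standalone sketch of how a hexadecimal port mask could be decoded.
It is not the application's own parser: the helper names ``parse_portmask_sketch()`` and ``port_enabled_sketch()`` are hypothetical,
and treating a mask of 0 as an error is an assumption made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hexadecimal port mask such as "5".
 * Returns 0 on any parse error (assumption: an empty mask is not useful). */
static uint32_t parse_portmask_sketch(const char *arg)
{
    char *end = NULL;
    unsigned long mask;

    errno = 0;
    mask = strtoul(arg, &end, 16);
    /* Reject overflow, empty input, trailing garbage and masks wider than 32 bits. */
    if (errno != 0 || end == arg || *end != '\0' || mask > UINT32_MAX)
        return 0;
    return (uint32_t)mask;
}

/* Hypothetical helper: non-zero if bit `port` is set in `mask`. */
static int port_enabled_sketch(uint32_t mask, unsigned int port)
{
    return port < 32 && ((mask >> port) & 1u) != 0;
}
```

With a mask of ``5`` (binary 101), bits 0 and 2 are set, so ports 0 and 2 are selected and ports 1 and 3 are skipped.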

To run the example in a Linux environment with 2 lcores (2,4) over 2 ports (0,2) with 1 RX queue per lcore:

.. code-block:: console

    ./build/ip_reassembly -l 2,4 -n 3 -- -p 5
    EAL: coremask set to 14
    EAL: Detected lcore 0 on socket 0
    EAL: Detected lcore 1 on socket 1
    EAL: Detected lcore 2 on socket 0
    EAL: Detected lcore 3 on socket 1
    EAL: Detected lcore 4 on socket 0
    ...

    Initializing port 0 on lcore 2... Address:00:1B:21:76:FA:2C, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 1
    Initializing port 2 on lcore 4... Address:00:1B:21:5C:FF:54, rxq=0 txq=2,0 txq=4,1
    done: Link Up - speed 10000 Mbps - full-duplex
    Skipping disabled port 3
    IP_FRAG: Socket 0: adding route 100.10.0.0/16 (port 0)
    IP_RSMBL: Socket 0: adding route 100.20.0.0/16 (port 1)
    ...

    IP_RSMBL: Socket 0: adding route 0101:0101:0101:0101:0101:0101:0101:0101/48 (port 0)
    IP_RSMBL: Socket 0: adding route 0201:0101:0101:0101:0101:0101:0101:0101/48 (port 1)
    ...

    IP_RSMBL: entering main loop on lcore 4
    IP_RSMBL: -- lcoreid=4 portid=2
    IP_RSMBL: entering main loop on lcore 2
    IP_RSMBL: -- lcoreid=2 portid=0

To run the example in a Linux environment with 1 lcore (4) over 2 ports (0,2) with 2 RX queues per lcore:

.. code-block:: console

    ./build/ip_reassembly -l 4 -n 3 -- -p 5 -q 2

To test the application, flows should be set up in the flow generator that match the values in the
l3fwd_ipv4_route_array and/or l3fwd_ipv6_route_array table.

Please note that in order to test this application,
the traffic generator should be generating valid fragmented IP packets.
For IPv6, the only supported case is when no extension headers other than
the fragment extension header are present in the packet.

The default l3fwd_ipv4_route_array table is:

.. code-block:: c

    struct l3fwd_ipv4_route l3fwd_ipv4_route_array[] = {
        {IPv4(100, 10, 0, 0), 16, 0},
        {IPv4(100, 20, 0, 0), 16, 1},
        {IPv4(100, 30, 0, 0), 16, 2},
        {IPv4(100, 40, 0, 0), 16, 3},
        {IPv4(100, 50, 0, 0), 16, 4},
        {IPv4(100, 60, 0, 0), 16, 5},
        {IPv4(100, 70, 0, 0), 16, 6},
        {IPv4(100, 80, 0, 0), 16, 7},
    };

The default l3fwd_ipv6_route_array table is:

.. code-block:: c

    struct l3fwd_ipv6_route l3fwd_ipv6_route_array[] = {
        {{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 0},
        {{2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 1},
        {{3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 2},
        {{4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 3},
        {{5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 4},
        {{6, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 5},
        {{7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 6},
        {{8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}, 48, 7},
    };

For example, for a fragmented input IPv4 packet with destination address 100.10.1.1,
a reassembled IPv4 packet is sent out from port #0 to the destination address 100.10.1.1
once all the fragments are collected.
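
The route selection in this example can be illustrated with a small standalone sketch.
It uses a plain linear scan rather than the DPDK LPM library, and every name in it
(``route_sketch``, ``IPV4_SKETCH``, ``lookup_port_sketch()``) is hypothetical;
only the first three IPv4 routes of the default table are reproduced.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative route entry: {network address, prefix length, outgoing port}. */
struct route_sketch {
    uint32_t ip;     /* network address in host byte order */
    uint8_t  depth;  /* prefix length in bits, 0-32 */
    uint8_t  port;   /* outgoing port */
};

/* Hypothetical stand-in for the IPv4() macro used in the table above. */
#define IPV4_SKETCH(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

static const struct route_sketch routes_sketch[] = {
    {IPV4_SKETCH(100, 10, 0, 0), 16, 0},
    {IPV4_SKETCH(100, 20, 0, 0), 16, 1},
    {IPV4_SKETCH(100, 30, 0, 0), 16, 2},
};

/* Return the port of the longest matching prefix,
 * or fallback_port (e.g. the originating port) if no route matches. */
static uint8_t lookup_port_sketch(uint32_t dst_ip, uint8_t fallback_port)
{
    int best_depth = -1;
    uint8_t port = fallback_port;
    size_t i;

    for (i = 0; i < sizeof(routes_sketch) / sizeof(routes_sketch[0]); i++) {
        const struct route_sketch *r = &routes_sketch[i];
        /* A depth of 0 matches everything; handle it separately to
         * avoid an undefined 32-bit shift. */
        uint32_t mask = (r->depth == 0) ? 0u
                                        : (~(uint32_t)0 << (32 - r->depth));

        if ((dst_ip & mask) == (r->ip & mask) && (int)r->depth > best_depth) {
            best_depth = r->depth;
            port = r->port;
        }
    }
    return port;
}
```

Here 100.10.1.1 falls inside 100.10.0.0/16 and therefore maps to port 0, while an address outside every listed prefix falls back to the port passed by the caller.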

Explanation
-----------

The following sections provide some explanation of the sample application code.
As mentioned in the overview section, the initialization and run-time paths are very similar to those of the :doc:`l2_forward_real_virtual`.
The following sections describe aspects that are specific to the IP Reassembly sample application.

IPv4 Fragment Table Initialization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This application uses the rte_ip_frag library. Please refer to the Programmer's Guide for a more detailed explanation of how to use this library.
The Fragment Table maintains information about already received fragments of a packet.
Each IP packet is uniquely identified by the triple <Source IP address>, <Destination IP address>, <ID>.
To avoid lock contention, each RX queue has its own Fragment Table.
As a consequence, the application cannot handle the situation where different fragments of the same packet arrive through different RX queues.
Each table entry can hold information about a packet consisting of up to RTE_LIBRTE_IP_FRAG_MAX_FRAGS fragments.

.. code-block:: c

    /* Convert the flow TTL (in milliseconds) into TSC cycles, rounding up. */
    frag_cycles = (rte_get_tsc_hz() + MS_PER_S - 1) / MS_PER_S * max_flow_ttl;

    /* Create the fragment table for this RX queue. */
    qconf->frag_tbl[queue] = rte_ip_frag_table_create(max_flow_num,
            IPV4_FRAG_TBL_BUCKET_ENTRIES, max_flow_num, frag_cycles, socket);
    if (qconf->frag_tbl[queue] == NULL) {
        RTE_LOG(ERR, IP_RSMBL, "ip_frag_tbl_create(%u) on "
                "lcore: %u for queue: %u failed\n", max_flow_num, lcore, queue);
        return -1;
    }
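
As a worked example of the ``frag_cycles`` arithmetic above, the sketch below repeats the same computation with the TSC frequency passed in as a parameter.
The names ``MS_PER_S_SKETCH`` and ``ttl_ms_to_cycles_sketch()`` are hypothetical, and ``tsc_hz`` merely stands in for the value returned by ``rte_get_tsc_hz()``.

```c
#include <stdint.h>

#define MS_PER_S_SKETCH 1000u

/* Convert a flow TTL in milliseconds into TSC cycles.
 * tsc_hz stands in for rte_get_tsc_hz(); the cycles-per-millisecond
 * value is rounded up before multiplying, as in the snippet above. */
static uint64_t ttl_ms_to_cycles_sketch(uint64_t tsc_hz, uint32_t ttl_ms)
{
    uint64_t cycles_per_ms = (tsc_hz + MS_PER_S_SKETCH - 1) / MS_PER_S_SKETCH;

    return cycles_per_ms * ttl_ms;
}
```

For an assumed 2 GHz TSC and the default TTL of 1 s (1000 ms), this gives 2,000,000 cycles per millisecond and 2,000,000,000 cycles in total.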

Mempools Initialization
~~~~~~~~~~~~~~~~~~~~~~~

The reassembly application requires a large number of mbufs to be allocated.
At any given time, up to (2 \* max_flow_num \* RTE_LIBRTE_IP_FRAG_MAX_FRAGS \* <maximum number of mbufs per packet>)
mbufs can be stored inside the Fragment Table waiting for the remaining fragments.
To keep the mempool size under reasonable limits and to avoid a situation where one RX queue can starve other queues,
each RX queue uses its own mempool.

.. code-block:: c

    nb_mbuf = RTE_MAX(max_flow_num, 2UL * MAX_PKT_BURST) * RTE_LIBRTE_IP_FRAG_MAX_FRAGS;
    nb_mbuf *= (port_conf.rxmode.max_rx_pkt_len + BUF_SIZE - 1) / BUF_SIZE;
    nb_mbuf *= 2; /* ipv4 and ipv6 */
    nb_mbuf += RTE_TEST_RX_DESC_DEFAULT + RTE_TEST_TX_DESC_DEFAULT;
    nb_mbuf = RTE_MAX(nb_mbuf, (uint32_t)NB_MBUF);

    snprintf(buf, sizeof(buf), "mbuf_pool_%u_%u", lcore, queue);

    if ((rxq->pool = rte_mempool_create(buf, nb_mbuf, MBUF_SIZE, 0, sizeof(struct rte_pktmbuf_pool_private), rte_pktmbuf_pool_init, NULL,
            rte_pktmbuf_init, NULL, socket, MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET)) == NULL) {

        RTE_LOG(ERR, IP_RSMBL, "mempool_create(%s) failed", buf);
        return -1;
    }
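
The sizing formula above can be checked with a standalone sketch that takes all inputs as parameters.
The structure and function names are hypothetical, and the numbers used in the note below are assumed sample values, not the constants of any particular DPDK build.

```c
#include <stdint.h>

/* All fields are illustrative inputs; the real values come from the
 * application's command line and from DPDK/application constants. */
struct pool_params_sketch {
    uint32_t max_flow_num;    /* --maxflows */
    uint32_t max_pkt_burst;   /* stands in for MAX_PKT_BURST */
    uint32_t max_frags;       /* stands in for RTE_LIBRTE_IP_FRAG_MAX_FRAGS */
    uint32_t max_rx_pkt_len;  /* largest packet accepted on RX */
    uint32_t buf_size;        /* data room of a single mbuf */
    uint32_t rx_desc;         /* RX descriptors per queue */
    uint32_t tx_desc;         /* TX descriptors per queue */
    uint32_t min_mbufs;       /* stands in for NB_MBUF (lower bound) */
};

static uint32_t max_u32_sketch(uint32_t a, uint32_t b)
{
    return a > b ? a : b;
}

/* Number of mbufs for one RX queue, following the same steps as above. */
static uint32_t mbufs_per_queue_sketch(const struct pool_params_sketch *p)
{
    uint32_t n;

    n = max_u32_sketch(p->max_flow_num, 2 * p->max_pkt_burst) * p->max_frags;
    n *= (p->max_rx_pkt_len + p->buf_size - 1) / p->buf_size; /* mbufs per packet, rounded up */
    n *= 2;                                                   /* IPv4 and IPv6 */
    n += p->rx_desc + p->tx_desc;                             /* mbufs held by the rings */
    return max_u32_sketch(n, p->min_mbufs);
}
```

With assumed inputs of 4096 flows, a burst of 32, 4 fragments per packet, 9728-byte packets, 2048-byte buffers, 128 RX and 512 TX descriptors, and a lower bound of 8192,
the steps give 4096 \* 4 = 16384, then \* 5 = 81920, then \* 2 = 163840, then + 640 = 164480 mbufs.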

Packet Reassembly and Forwarding
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For each input packet, the packet forwarding operation is done by the l3fwd_simple_forward() function.
If the packet is an IPv4 or IPv6 fragment, then it calls rte_ipv4_frag_reassemble_packet() for IPv4 packets,
or rte_ipv6_frag_reassemble_packet() for IPv6 packets.
These functions return either a pointer to a valid mbuf that contains the reassembled packet,
or NULL (if the packet can't be reassembled for some reason).
Then l3fwd_simple_forward() continues with the code for the packet forwarding decision
(that is, the identification of the output interface for the packet) and
the actual transmission of the packet.

The rte_ipv4_frag_reassemble_packet() and rte_ipv6_frag_reassemble_packet() functions are responsible for:

#.  Searching the Fragment Table for an entry with the packet's <IP Source Address, IP Destination Address, Packet ID>.

#.  If the entry is found, checking whether that entry has already timed out.
    If yes, all previously received fragments are freed
    and the information about them is removed from the entry.

#.  If no entry with such a key is found, trying to create a new one in one of two ways:

    #.  Use an empty entry.

    #.  Delete a timed-out entry, free the mbufs associated with it and store a new entry with the specified key in it.

#.  Updating the entry with the new fragment information and checking
    whether the packet can be reassembled (the packet's entry contains all fragments).

    #.  If yes, reassemble the packet, mark the table's entry as empty and return the reassembled mbuf to the caller.

    #.  If no, return NULL to the caller.

If at any stage of packet processing a reassembly function encounters an error
(it can't insert a new entry into the Fragment Table, or the fragment is invalid or timed out),
then it frees all fragments associated with the packet,
marks the table entry as invalid and returns NULL to the caller.
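
The steps above can be sketched with a toy fragment table. This is not the rte_ip_frag implementation:
it tracks byte counts instead of mbufs, ignores duplicate and overlapping fragments, and uses a tiny linear table.
All names (``frag_tbl_sketch``, ``frag_add_sketch()`` and so on) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define TBL_ENTRIES_SKETCH 4  /* toy table size */
#define MAX_FRAGS_SKETCH   4  /* stands in for RTE_LIBRTE_IP_FRAG_MAX_FRAGS */

struct frag_key_sketch { uint32_t src, dst; uint16_t id; };

struct frag_entry_sketch {
    int in_use;
    struct frag_key_sketch key;
    uint64_t start;      /* time stamp of the first fragment */
    uint32_t total_len;  /* known once the last fragment arrives, else 0 */
    uint32_t recv_len;   /* payload bytes collected so far */
    uint32_t nb_frags;
};

struct frag_tbl_sketch {
    uint64_t max_cycles; /* flow TTL in time-stamp units */
    struct frag_entry_sketch e[TBL_ENTRIES_SKETCH];
};

enum frag_result_sketch { FRAG_PENDING, FRAG_COMPLETE, FRAG_ERROR };

static enum frag_result_sketch
frag_add_sketch(struct frag_tbl_sketch *t, struct frag_key_sketch key,
                uint32_t ofs, uint32_t len, int more_frags, uint64_t now)
{
    struct frag_entry_sketch *hit = NULL, *empty = NULL, *stale = NULL;
    int i;

    /* Step 1: search for the key, remembering reusable slots. */
    for (i = 0; i < TBL_ENTRIES_SKETCH; i++) {
        struct frag_entry_sketch *e = &t->e[i];

        if (!e->in_use) {
            if (empty == NULL)
                empty = e;
        } else if (e->key.src == key.src && e->key.dst == key.dst &&
                   e->key.id == key.id) {
            hit = e;
        } else if (stale == NULL && now - e->start > t->max_cycles) {
            stale = e;
        }
    }

    /* Step 2: a matching entry that timed out drops its old fragments. */
    if (hit != NULL && now - hit->start > t->max_cycles) {
        memset(hit, 0, sizeof(*hit));
        empty = hit;
        hit = NULL;
    }

    /* Step 3: no usable entry, so take an empty slot or evict a stale one. */
    if (hit == NULL) {
        hit = (empty != NULL) ? empty : stale;
        if (hit == NULL)
            return FRAG_ERROR; /* table full */
        memset(hit, 0, sizeof(*hit));
        hit->in_use = 1;
        hit->key = key;
        hit->start = now;
    }

    /* Step 4: record the fragment and check for completeness. */
    hit->recv_len += len;
    hit->nb_frags++;
    if (!more_frags)
        hit->total_len = ofs + len;
    if (hit->nb_frags > MAX_FRAGS_SKETCH) {
        memset(hit, 0, sizeof(*hit)); /* error: invalidate the entry */
        return FRAG_ERROR;
    }
    if (hit->total_len != 0 && hit->recv_len == hit->total_len) {
        memset(hit, 0, sizeof(*hit)); /* reassembled: mark entry empty */
        return FRAG_COMPLETE;
    }
    return FRAG_PENDING;
}
```

In this sketch ``FRAG_COMPLETE`` plays the role of a returned reassembled mbuf, while ``FRAG_PENDING`` and ``FRAG_ERROR`` both correspond to the NULL return described above.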

Debug logging and Statistics Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The RTE_LIBRTE_IP_FRAG_TBL_STAT macro controls statistics collection for the Fragment Table.
This macro is disabled by default.
To make ip_reassembly print the statistics to the standard output,
the user must send either a USR1, INT or TERM signal to the process.
For all of these signals, the ip_reassembly process prints Fragment Table statistics for each RX queue;
in addition, INT and TERM cause process termination as usual.
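
The signal behaviour described above can be sketched in plain POSIX C.
This is an illustrative pattern, not the application's handler: it only sets flags in the handler and leaves the printing to the main loop,
which is a design choice made here for async-signal safety. All names ending in ``_sketch`` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Flags set from the handler and polled from the main loop. */
static volatile sig_atomic_t stats_requested_sketch;
static volatile sig_atomic_t quit_requested_sketch;

/* USR1, INT and TERM all request a statistics dump;
 * INT and TERM additionally request termination. */
static void on_signal_sketch(int signum)
{
    stats_requested_sketch = 1;
    if (signum != SIGUSR1)
        quit_requested_sketch = 1;
}

/* Install the same handler for the three signals; returns 0 on success. */
static int install_handlers_sketch(void)
{
    if (signal(SIGUSR1, on_signal_sketch) == SIG_ERR ||
        signal(SIGINT, on_signal_sketch) == SIG_ERR ||
        signal(SIGTERM, on_signal_sketch) == SIG_ERR)
        return -1;
    return 0;
}

/* Returns 1 once per pending request so the caller can print the
 * per-queue statistics outside of signal context. */
static int consume_stats_request_sketch(void)
{
    if (!stats_requested_sketch)
        return 0;
    stats_requested_sketch = 0;
    return 1;
}
```

A main loop built on this sketch would call ``consume_stats_request_sketch()`` on each iteration, print the statistics when it returns 1, and leave the loop once ``quit_requested_sketch`` is set.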