..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

Exception Path Sample Application
=================================

The Exception Path sample application is a simple example that demonstrates the use of the DPDK
to set up an exception path for packets to go through the Linux* kernel.
This is done by using virtual TAP network interfaces.
These can be read from and written to by the DPDK application and
appear to the kernel as a standard network interface.

Overview
--------

The application creates two threads for each NIC port being used.
One thread reads from the port and writes the data unmodified to a thread-specific TAP interface.
The second thread reads from a TAP interface and writes the data unmodified to the NIC port.

The packet flow through the exception path application is shown in the following figure.

.. _figure_exception_path_example:

.. figure:: img/exception_path_example.*

   Packet Flow


To make throughput measurements, kernel bridges must be set up to forward data between the TAP interfaces.

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``exception_path`` sub-directory.

Running the Application
-----------------------

The application requires a number of command line options:

.. code-block:: console

    ./build/exception_path [EAL options] -- -p PORTMASK -i IN_CORES -o OUT_CORES

where:

*   ``-p PORTMASK``: A hexadecimal bitmask of the ports to use

*   ``-i IN_CORES``: A hexadecimal bitmask of the cores that read from the NIC

*   ``-o OUT_CORES``: A hexadecimal bitmask of the cores that write to the NIC

Refer to the *DPDK Getting Started Guide* for general information on running applications
and the Environment Abstraction Layer (EAL) options.

The number of bits set in each bitmask must be the same.
The coremask ``-c`` or the corelist ``-l`` parameter of the EAL options should include IN_CORES and OUT_CORES.
The same bit must not be set in both IN_CORES and OUT_CORES.
The affinities between ports and cores are set beginning with the least significant bit of each mask, that is,
the port represented by the lowest bit in PORTMASK is read from by the core represented by the lowest bit in IN_CORES,
and written to by the core represented by the lowest bit in OUT_CORES.

For example, to run the application with two ports and four cores:

.. code-block:: console

    ./build/exception_path -l 0-3 -n 4 -- -p 3 -i 3 -o c

Here ``-p 3`` selects ports 0 and 1, ``-i 3`` assigns cores 0 and 1 to read from the NIC,
and ``-o c`` assigns cores 2 and 3 to write to it.

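The rules for the three bitmasks can be sketched as a small stand-alone check.
This is only an illustration under stated assumptions: ``count_bits()`` and ``masks_are_consistent()``
are hypothetical helpers and are not part of the sample application, which may validate its arguments differently.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Count the bits set in a mask (portable, no compiler builtins). */
    static unsigned count_bits(uint64_t mask)
    {
        unsigned n = 0;

        /* Each iteration clears the lowest set bit. */
        for (; mask != 0; mask &= mask - 1)
            n++;
        return n;
    }

    /*
     * Hypothetical helper, not taken from the application:
     * returns true when the three masks satisfy the rules listed above.
     */
    static bool masks_are_consistent(uint64_t port_mask, uint64_t in_cores,
                                     uint64_t out_cores)
    {
        /* A core may read from the NIC or write to it, but not both. */
        if ((in_cores & out_cores) != 0)
            return false;

        /* Every enabled port needs exactly one reader and one writer core. */
        return count_bits(port_mask) == count_bits(in_cores) &&
               count_bits(port_mask) == count_bits(out_cores);
    }

With the masks ``-p 3 -i 3 -o c`` the check passes, whereas ``-p 3 -i 3 -o 6`` would be rejected
because core 1 appears in both core masks.
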
Getting Statistics
~~~~~~~~~~~~~~~~~~

While the application is running, statistics on packets sent and
received can be displayed by sending the SIGUSR1 signal to the application from another terminal:

.. code-block:: console

    killall -USR1 exception_path

The statistics can be reset by sending a SIGUSR2 signal in a similar way.

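This guide does not show how the application reacts to these signals.
The following is a minimal sketch of one way to do it, not the application's actual handler:
the handler only records which signal arrived, and the printing or resetting happens later
in ordinary (non-signal) context.
``MAX_LCORES``, ``struct stats``, ``install_stats_handlers()`` and ``service_stats_requests()``
are names assumed for this illustration; only ``lcore_stats`` and its ``rx``, ``tx`` and ``dropped`` fields
appear in the code excerpts of this guide.

.. code-block:: c

    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define MAX_LCORES 4 /* illustrative size, not the application's value */

    /* Per-core counters, named after the fields used elsewhere in this guide. */
    struct stats {
        uint64_t rx;
        uint64_t tx;
        uint64_t dropped;
    };

    static struct stats lcore_stats[MAX_LCORES];

    /* Set by the handler, consumed by service_stats_requests(). */
    static volatile sig_atomic_t print_requested;
    static volatile sig_atomic_t reset_requested;

    /* Only record which signal arrived; the real work is done elsewhere. */
    static void on_signal(int signum)
    {
        if (signum == SIGUSR1)
            print_requested = 1;
        else if (signum == SIGUSR2)
            reset_requested = 1;
    }

    /* Install the handler for both signals. Returns 0 on success, -1 on error. */
    static int install_stats_handlers(void)
    {
        if (signal(SIGUSR1, on_signal) == SIG_ERR)
            return -1;
        if (signal(SIGUSR2, on_signal) == SIG_ERR)
            return -1;
        return 0;
    }

    /* Call periodically from a non-signal context. */
    static void service_stats_requests(void)
    {
        unsigned i;

        if (print_requested) {
            print_requested = 0;
            for (i = 0; i < MAX_LCORES; i++)
                printf("lcore %u: rx=%llu tx=%llu dropped=%llu\n", i,
                       (unsigned long long)lcore_stats[i].rx,
                       (unsigned long long)lcore_stats[i].tx,
                       (unsigned long long)lcore_stats[i].dropped);
        }
        if (reset_requested) {
            reset_requested = 0;
            memset(lcore_stats, 0, sizeof(lcore_stats));
        }
    }

Deferring the work keeps ``printf()``, which is not async-signal-safe, out of the signal handler.
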
Explanation
-----------

The following sections provide some explanation of the code.

Initialization
~~~~~~~~~~~~~~

Setup of the mbuf pool, driver and queues is similar to the setup done in the :ref:`l2_fwd_app_real_and_virtual`.
In addition, the TAP interfaces must also be created.
A TAP interface is created for each lcore that is being used.
The code for creating the TAP interface is as follows:

.. code-block:: c

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/if.h>
    #include <linux/if_tun.h>

    /*
     * Create a tap network interface, or use existing one with same name.
     * If name[0]='\0' then a name is automatically assigned and returned in name.
     */

    static int tap_create(char *name)
    {
        struct ifreq ifr;
        int fd, ret;

        fd = open("/dev/net/tun", O_RDWR);
        if (fd < 0)
            return fd;

        memset(&ifr, 0, sizeof(ifr));

        /* TAP device without packet information */

        ifr.ifr_flags = IFF_TAP | IFF_NO_PI;
        if (name && *name)
            snprintf(ifr.ifr_name, IFNAMSIZ, "%s", name);

        ret = ioctl(fd, TUNSETIFF, (void *) &ifr);

        if (ret < 0) {
            close(fd);
            return ret;
        }

        /* Report the name that was actually assigned back to the caller */
        if (name)
            snprintf(name, IFNAMSIZ, "%s", ifr.ifr_name);

        return fd;
    }

The other step in the initialization process that is unique to this sample application
is the association of each port with two cores:

*   One core to read from the port and write to a TAP interface

*   A second core to read from a TAP interface and write to the port

This is done using an array called ``port_ids[]``, which is indexed by the lcore IDs.
The population of this array is shown below:

.. code-block:: c

    tx_port = 0;
    rx_port = 0;

    RTE_LCORE_FOREACH(i) {
        if (input_cores_mask & (1ULL << i)) {
            /* Skip ports that are not enabled */
            while ((ports_mask & (1ULL << rx_port)) == 0) {
                rx_port++;
                /* Stop before shifting past the width of ports_mask */
                if (rx_port >= (sizeof(ports_mask) * 8))
                    goto fail; /* not enough ports */
            }
            port_ids[i] = rx_port++;
        } else if (output_cores_mask & (1ULL << i)) {
            /* Skip ports that are not enabled */
            while ((ports_mask & (1ULL << tx_port)) == 0) {
                tx_port++;
                /* Stop before shifting past the width of ports_mask */
                if (tx_port >= (sizeof(ports_mask) * 8))
                    goto fail; /* not enough ports */
            }
            port_ids[i] = tx_port++;
        }
    }

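To make the resulting mapping concrete, the loop can be restated as a stand-alone function that runs without the DPDK.
This is a sketch under assumptions rather than the application's code: ``assign_ports()``, ``MAX_LCORES`` and ``NO_PORT``
are hypothetical names, all masks are taken to be 64 bits wide, and a plain ``for`` loop over every lcore index
stands in for ``RTE_LCORE_FOREACH``.

.. code-block:: c

    #include <stdint.h>

    #define MAX_LCORES 64   /* illustrative bound for this sketch */
    #define NO_PORT    0xff /* marks lcores that are not assigned a port */

    /*
     * Hypothetical stand-alone version of the loop above.
     * Fills port_ids[] and returns 0, or returns -1 when a core mask
     * selects more cores than there are enabled ports.
     */
    static int assign_ports(uint64_t ports_mask, uint64_t input_cores_mask,
                            uint64_t output_cores_mask,
                            uint8_t port_ids[MAX_LCORES])
    {
        unsigned rx_port = 0, tx_port = 0;
        unsigned i;

        for (i = 0; i < MAX_LCORES; i++) {
            unsigned *next;

            port_ids[i] = NO_PORT;
            if (input_cores_mask & (1ULL << i))
                next = &rx_port;
            else if (output_cores_mask & (1ULL << i))
                next = &tx_port;
            else
                continue;

            /* Skip ports that are not enabled, checking the bound first. */
            while (*next < 64 && (ports_mask & (1ULL << *next)) == 0)
                (*next)++;
            if (*next >= 64)
                return -1; /* not enough ports */

            port_ids[i] = (uint8_t)(*next)++;
        }
        return 0;
    }

For the masks ``-p 3 -i 3 -o c``, this assigns ports 0 and 1 to cores 0 and 1 for reading,
and the same two ports to cores 2 and 3 for writing.
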
Packet Forwarding
~~~~~~~~~~~~~~~~~

After the initialization steps are complete, the ``main_loop()`` function is run on each lcore.
This function first checks the lcore_id against the user-provided input_cores_mask and output_cores_mask to see
whether this core reads from or writes to a TAP interface.

For a core that reads from a NIC port, the packet reception is the same as in the L2 Forwarding sample application
(see :ref:`l2_fwd_app_rx_tx_packets`).
The packet transmission is done by calling ``write()`` with the file descriptor of the appropriate TAP interface
and then explicitly freeing the mbuf back to the pool.

.. code-block:: c

    /* Loop forever reading from NIC and writing to tap */

    for (;;) {
        struct rte_mbuf *pkts_burst[PKT_BURST_SZ];
        unsigned i;

        const unsigned nb_rx = rte_eth_rx_burst(port_ids[lcore_id], 0, pkts_burst, PKT_BURST_SZ);

        lcore_stats[lcore_id].rx += nb_rx;

        for (i = 0; likely(i < nb_rx); i++) {
            struct rte_mbuf *m = pkts_burst[i];
            int ret = write(tap_fd, rte_pktmbuf_mtod(m, void*),
                            rte_pktmbuf_data_len(m));

            /* The mbuf is freed whether or not the write succeeded */
            rte_pktmbuf_free(m);
            if (unlikely(ret < 0))
                lcore_stats[lcore_id].dropped++;
            else
                lcore_stats[lcore_id].tx++;
        }
    }

For a core that reads from a TAP interface and writes to a NIC port,
packets are retrieved by calling ``read()`` on the file descriptor of the appropriate TAP interface.
This copies the packet data into the mbuf; the remaining fields are then set manually.
The packet can then be transmitted as normal.

.. code-block:: c

    /* Loop forever reading from tap and writing to NIC */

    for (;;) {
        int ret;
        struct rte_mbuf *m = rte_pktmbuf_alloc(pktmbuf_pool);

        if (m == NULL)
            continue;

        ret = read(tap_fd, rte_pktmbuf_mtod(m, void *), MAX_PACKET_SZ);
        lcore_stats[lcore_id].rx++;
        if (unlikely(ret < 0)) {
            FATAL_ERROR("Reading from %s interface failed", tap_name);
        }

        /* Describe the single-segment packet that read() just filled in */
        m->nb_segs = 1;
        m->next = NULL;
        m->pkt_len = (uint16_t)ret;
        m->data_len = (uint16_t)ret;

        ret = rte_eth_tx_burst(port_ids[lcore_id], 0, &m, 1);
        if (unlikely(ret < 1)) {
            /* The packet was not queued for transmission, so free it here */
            rte_pktmbuf_free(m);
            lcore_stats[lcore_id].dropped++;
        }
        else {
            lcore_stats[lcore_id].tx++;
        }
    }

To set up loops for measuring throughput, TAP interfaces can be connected using bridging.
The steps to do this are described in the section that follows.

Managing TAP Interfaces and Bridges
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Exception Path sample application creates TAP interfaces with names of the format ``tap_dpdk_nn``,
where ``nn`` is the lcore ID. These TAP interfaces need to be configured for use:

.. code-block:: console

    ifconfig tap_dpdk_00 up

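The naming scheme can be sketched as follows.
The exact format string is an assumption, since this guide only gives the ``tap_dpdk_nn`` pattern,
and ``tap_name_for_lcore()`` and ``TAP_NAME_SIZE`` are hypothetical names introduced for the illustration.

.. code-block:: c

    #include <stdio.h>

    #define TAP_NAME_SIZE 16 /* same value as IFNAMSIZ on Linux */

    /*
     * Hypothetical helper: format the interface name for a given lcore ID.
     * Returns the value returned by snprintf().
     */
    static int tap_name_for_lcore(char name[TAP_NAME_SIZE], unsigned lcore_id)
    {
        /* "%.2u" pads the ID with leading zeros to at least two digits. */
        return snprintf(name, TAP_NAME_SIZE, "tap_dpdk_%.2u", lcore_id);
    }

Under this assumption, lcore 0 maps to ``tap_dpdk_00`` and lcore 3 to ``tap_dpdk_03``,
which are the interface names used in the commands in this section.
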
To set up a bridge between two interfaces so that packets sent to one interface can be read from another,
use the brctl tool:

.. code-block:: console

    brctl addbr "br0"
    brctl addif br0 tap_dpdk_00
    brctl addif br0 tap_dpdk_03
    ifconfig br0 up

The TAP interfaces created by this application exist only when the application is running,
so the steps above need to be repeated each time the application is run.
To avoid this, persistent TAP interfaces can be created using openvpn:

.. code-block:: console

    openvpn --mktun --dev tap_dpdk_00

If this method is used, then the steps above have to be done only once and
the same TAP interfaces can be reused each time the application is run.
To remove bridges and persistent TAP interfaces, the following commands are used:

.. code-block:: console

    ifconfig br0 down
    brctl delbr br0
    openvpn --rmtun --dev tap_dpdk_00
