--------------------------------------------------------------------------------
+ ABSTRACT
--------------------------------------------------------------------------------

This file documents the mmap() facility available with the PACKET
socket interface on 2.4 and 2.6 kernels. This type of socket is used to
capture network traffic with utilities like tcpdump or any other that needs
raw access to a network interface.

You can find the latest version of this document at:
    http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap

Howto can be found at:
    http://wiki.gnu-log.net (packet_mmap)

Please send your comments to
    Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
    Johann Baudy <johann.baudy@gnu-log.net>

--------------------------------------------------------------------------------
+ Why use PACKET_MMAP
--------------------------------------------------------------------------------

In Linux 2.4/2.6, if PACKET_MMAP is not enabled, the capture process is very
inefficient. It uses very limited buffers and requires one system call to
capture each packet, or two if you want to get the packet's timestamp
(like libpcap always does).

On the other hand, PACKET_MMAP is very efficient. PACKET_MMAP provides a
size-configurable circular buffer mapped in user space that can be used to
either send or receive packets. This way, reading packets just needs to wait
for them; most of the time there is no need to issue a single system call.
For transmission, multiple packets can be sent through one system call to get
the highest bandwidth. Using a shared buffer between the kernel and the user
also has the benefit of minimizing packet copies.

It's fine to use PACKET_MMAP to improve the performance of the capture and
transmission process, but it isn't everything. At least, if you are capturing
at high speeds (this is relative to the CPU speed), you should check if the
device driver of your network interface card supports some sort of interrupt
load mitigation or (even better) if it supports NAPI, and make sure it is
enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
supported by the devices of your network.

--------------------------------------------------------------------------------
+ How to use mmap() to improve capture process
--------------------------------------------------------------------------------

From the user standpoint, you should use the higher level libpcap library, which
is a de facto standard, portable across nearly all operating systems
including Win32.

That said, at the time of this writing, official libpcap 0.8.1 is out and
doesn't include support for PACKET_MMAP, and the libpcap included in your
distribution probably doesn't either.

I'm aware of two implementations of PACKET_MMAP in libpcap:

    http://wiki.ipxwarzone.com/  (by Simon Patarin, based on libpcap 0.6.2)
    http://public.lanl.gov/cpw/  (by Phil Wood, based on latest libpcap)

The rest of this document is intended for people who want to understand
the low level details or want to improve libpcap by including PACKET_MMAP
support.

--------------------------------------------------------------------------------
+ How to use mmap() directly to improve capture process
--------------------------------------------------------------------------------

From the system call standpoint, the use of PACKET_MMAP involves
the following process:


[setup]     socket() -------> creation of the capture socket
            setsockopt() ---> allocation of the circular buffer (ring)
                              option: PACKET_RX_RING
            mmap() ---------> mapping of the allocated buffer to the
                              user process

[capture]   poll() ---------> to wait for incoming packets

[shutdown]  close() --------> destruction of the capture socket and
                              deallocation of all associated
                              resources.


Socket creation and destruction is straightforward, and is done
the same way with or without PACKET_MMAP:

    int fd;

    fd = socket(PF_PACKET, mode, htons(ETH_P_ALL));

where mode is SOCK_RAW for the raw interface, where link level
information can be captured, or SOCK_DGRAM for the cooked
interface, where link level information capture is not
supported and a link level pseudo-header is provided
by the kernel.

The destruction of the socket and all associated resources
is done by a simple call to close(fd).

Next I will describe the PACKET_MMAP settings and their constraints,
as well as the mapping of the circular buffer in the user process and
the use of this buffer.

--------------------------------------------------------------------------------
+ How to use mmap() directly to improve transmission process
--------------------------------------------------------------------------------
The transmission process is similar to capture, as shown below.

[setup]          socket() -------> creation of the transmission socket
                 setsockopt() ---> allocation of the circular buffer (ring)
                                   option: PACKET_TX_RING
                 bind() ---------> bind transmission socket with a network interface
                 mmap() ---------> mapping of the allocated buffer to the
                                   user process

[transmission]   poll() ---------> wait for free packets (optional)
                 send() ---------> send all packets that are set as ready in
                                   the ring
                                   The flag MSG_DONTWAIT can be used to return
                                   before end of transfer.

[shutdown]       close() --------> destruction of the transmission socket and
                                   deallocation of all associated resources.

Binding the socket to your network interface is mandatory (with zero copy) to
know the header size of frames used in the circular buffer.

As with capture, each frame contains two parts:

 --------------------
| struct tpacket_hdr | Header. It contains the status of
|                    | this frame
|--------------------|
| data buffer        |
.                    .  Data that will be sent over the network interface.
.                    .
 --------------------

bind() associates the socket with your network interface through the
sll_ifindex field of struct sockaddr_ll.

Initialization example:

    struct sockaddr_ll my_addr;
    struct ifreq s_ifr;
    ...

    /* clear both structures so that unused fields are zero */
    memset(&my_addr, 0, sizeof(my_addr));
    memset(&s_ifr, 0, sizeof(s_ifr));
    strncpy(s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name) - 1);

    /* get interface index of eth0 */
    ioctl(fd, SIOCGIFINDEX, &s_ifr);

    /* fill sockaddr_ll struct to prepare binding */
    my_addr.sll_family = AF_PACKET;
    my_addr.sll_protocol = htons(ETH_P_ALL);
    my_addr.sll_ifindex = s_ifr.ifr_ifindex;

    /* bind socket to eth0 */
    bind(fd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));

A complete tutorial is available at: http://wiki.gnu-log.net/

By default, the user should put data at:
    frame base + TPACKET_HDRLEN - sizeof(struct sockaddr_ll)

So, whatever you choose for the socket mode (SOCK_DGRAM or SOCK_RAW),
the beginning of the user data will be at:
    frame base + TPACKET_ALIGN(sizeof(struct tpacket_hdr))

If you wish to put user data at a custom offset from the beginning of
the frame (for payload alignment with SOCK_RAW mode, for instance) you
can set tp_net (with SOCK_DGRAM) or tp_mac (with SOCK_RAW). For this to
work, it must be enabled beforehand with setsockopt() and the
PACKET_TX_HAS_OFF option.

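The two expressions for the default data position given above are the same
offset, because TPACKET_HDRLEN is defined in linux/if_packet.h as
TPACKET_ALIGN(sizeof(struct tpacket_hdr)) + sizeof(struct sockaddr_ll).
A minimal sketch of a helper that returns this position follows. The name
tx_frame_data is an assumption made for illustration and is not part of any
kernel or libc API.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/if_packet.h>

/*
 * Hypothetical helper (illustration only): returns the default location
 * of the user data inside a TX ring frame that starts at frame_base.
 * It does not handle a custom offset set through PACKET_TX_HAS_OFF.
 */
static inline uint8_t *tx_frame_data(void *frame_base)
{
    return (uint8_t *)frame_base
           + TPACKET_HDRLEN - sizeof(struct sockaddr_ll);
}
```

The returned pointer is a multiple of TPACKET_ALIGNMENT bytes away from the
frame base, so the payload starts on an aligned boundary.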
--------------------------------------------------------------------------------
+ PACKET_MMAP settings
--------------------------------------------------------------------------------


Setting up PACKET_MMAP from user level code is done with a call like

 - Capture process
     setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req))
 - Transmission process
     setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req))

The most significant argument in the previous call is the req parameter,
which must have the following structure:

    struct tpacket_req
    {
        unsigned int    tp_block_size;  /* Minimal size of contiguous block */
        unsigned int    tp_block_nr;    /* Number of blocks */
        unsigned int    tp_frame_size;  /* Size of frame */
        unsigned int    tp_frame_nr;    /* Total number of frames */
    };

This structure is defined in /usr/include/linux/if_packet.h and establishes a
circular buffer (ring) of unswappable memory. Because the ring is mapped into
the capture process, captured frames and related meta-information like
timestamps can be read without requiring a system call.

Frames are grouped in blocks. Each block is a physically contiguous
region of memory and holds tp_block_size/tp_frame_size frames. The total number
of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because

    frames_per_block = tp_block_size/tp_frame_size

indeed, packet_set_ring checks that the following condition is true

    frames_per_block * tp_block_nr == tp_frame_nr


Let's see an example, with the following values:

     tp_block_size= 4096
     tp_frame_size= 2048
     tp_block_nr  = 4
     tp_frame_nr  = 8

we will get the following buffer structure:

        block #1                 block #2
+---------+---------+    +---------+---------+
| frame 1 | frame 2 |    | frame 3 | frame 4 |
+---------+---------+    +---------+---------+

        block #3                 block #4
+---------+---------+    +---------+---------+
| frame 5 | frame 6 |    | frame 7 | frame 8 |
+---------+---------+    +---------+---------+

A frame can be of any size, with the only condition that it fits in a block.
A block can only hold an integer number of frames, or in other words, a frame
cannot span two blocks, so there are some details you have to take into
account when choosing the frame size. See "Mapping and use of the circular
buffer (ring)".


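Since tp_frame_nr is redundant, it can be derived from the other three
fields instead of being typed by hand. The following is a minimal sketch of
that derivation; the helper name ring_fill_frame_nr is hypothetical and only
struct tpacket_req comes from linux/if_packet.h.

```c
#include <linux/if_packet.h>

/*
 * Hypothetical helper (illustration only): computes tp_frame_nr from
 * tp_block_size, tp_frame_size and tp_block_nr, so that the equality
 * frames_per_block * tp_block_nr == tp_frame_nr holds by construction.
 * Returns 0 on success, -1 if not even one frame fits in a block.
 */
static int ring_fill_frame_nr(struct tpacket_req *req)
{
    unsigned int frames_per_block;

    if (req->tp_frame_size == 0 || req->tp_frame_size > req->tp_block_size)
        return -1;

    frames_per_block = req->tp_block_size / req->tp_frame_size;
    req->tp_frame_nr = frames_per_block * req->tp_block_nr;
    return 0;
}
```

With the example values above (4096 byte blocks, 2048 byte frames, 4 blocks)
the helper sets tp_frame_nr to 8.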
--------------------------------------------------------------------------------
+ PACKET_MMAP setting constraints
--------------------------------------------------------------------------------

In kernel versions prior to 2.4.26 (for the 2.4 branch) and 2.6.5 (2.6 branch),
the PACKET_MMAP buffer could hold only 32768 frames in a 32 bit architecture or
16384 in a 64 bit architecture. For information on these kernel versions
see http://pusa.uv.es/~ulisses/packet_mmap/packet_mmap.pre-2.4.26_2.6.5.txt

 Block size limit
------------------

As stated earlier, each block is a contiguous physical region of memory. These
memory regions are allocated with calls to the __get_free_pages() function. As
the name indicates, this function allocates pages of memory, and the second
argument is "order", a power-of-two number of pages, that is
(for PAGE_SIZE == 4096) order=0 ==> 4096 bytes, order=1 ==> 8192 bytes,
order=2 ==> 16384 bytes, etc. The maximum size of a
region allocated by __get_free_pages is determined by the MAX_ORDER macro. More
precisely the limit can be calculated as:

   PAGE_SIZE << MAX_ORDER

   In a i386 architecture PAGE_SIZE is 4096 bytes
   In a 2.4/i386 kernel MAX_ORDER is 10
   In a 2.6/i386 kernel MAX_ORDER is 11

So __get_free_pages can allocate as much as 4MB or 8MB in a 2.4 or 2.6 kernel
respectively, with an i386 architecture.

User space programs can include /usr/include/sys/user.h and
/usr/include/linux/mmzone.h to get the PAGE_SIZE and MAX_ORDER declarations.

The page size can also be determined dynamically with the getpagesize (2)
system call.


 Block number limit
--------------------

To understand the constraints of PACKET_MMAP, we have to see the structure
used to hold the pointers to each block.

Currently, this structure is a vector called pg_vec, dynamically allocated
with kmalloc; its size limits the number of blocks that can be allocated.

    +---+---+---+---+
    | x | x | x | x |
    +---+---+---+---+
      |   |   |   |
      |   |   |   v
      |   |   v  block #4
      |   v  block #3
      v  block #2
     block #1


kmalloc allocates any number of bytes of physically contiguous memory from
a pool of pre-determined sizes. This pool of memory is maintained by the slab
allocator, which is ultimately responsible for doing the allocation and
hence imposes the maximum memory that kmalloc can allocate.

In a 2.4/2.6 kernel and the i386 architecture, the limit is 131072 bytes. The
predetermined sizes that kmalloc uses can be checked in the "size-<bytes>"
entries of /proc/slabinfo

In a 32 bit architecture, pointers are 4 bytes long, so the total number of
pointers to blocks is

     131072/4 = 32768 blocks


 PACKET_MMAP buffer size calculator
------------------------------------

Definitions:

<size-max>    : is the maximum size that can be allocated with kmalloc
                (see /proc/slabinfo)
<pointer size>: depends on the architecture -- sizeof(void *)
<page size>   : depends on the architecture -- PAGE_SIZE or getpagesize (2)
<max-order>   : is the value defined with MAX_ORDER
<frame size>  : it's an upper bound of frame's capture size (more on this later)

from these definitions we will derive

	<block number> = <size-max>/<pointer size>
	<block size> = <page size> << <max-order>

so, the max buffer size is

	<block number> * <block size>

and the number of frames is

	<block number> * <block size> / <frame size>

Suppose the following parameters, which apply for a 2.6 kernel and an
i386 architecture:

	<size-max> = 131072 bytes
	<pointer size> = 4 bytes
	<page size> = 4096 bytes
	<max-order> = 11

and a value for <frame size> of 2048 bytes. These parameters will yield

	<block number> = 131072/4 = 32768 blocks
	<block size> = 4096 << 11 = 8 MiB.

and hence the buffer will have a 262144 MiB size. So it can hold
262144 MiB / 2048 bytes = 134217728 frames


Actually, this buffer size is not possible with an i386 architecture.
Remember that the memory is allocated in kernel space, and on i386 the
kernel's memory size is limited to 1GiB.

None of the memory allocations are freed until the socket is closed. The memory
allocations are done with GFP_KERNEL priority, which basically means that
the allocation can wait and swap other processes' memory in order to allocate
the necessary memory, so normally the limits can be reached.

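The calculator above is plain integer arithmetic, so it can be written down
directly. The sketch below only restates the formulas of this section; the
struct and function names (ring_limits, ring_limits_calc) are hypothetical,
and the inputs must be supplied by the caller since this code does not read
them from the running kernel.

```c
#include <stdint.h>

/* Results of the calculator, all sizes in bytes (hypothetical type). */
struct ring_limits {
    uint64_t block_number;  /* <size-max> / <pointer size> */
    uint64_t block_size;    /* <page size> << <max-order> */
    uint64_t buffer_size;   /* <block number> * <block size> */
    uint64_t frame_count;   /* <buffer size> / <frame size> */
};

/*
 * Applies the formulas from the text. 64-bit arithmetic is used because
 * the theoretical buffer size does not fit in 32 bits.
 * pointer_size and frame_size must be non-zero.
 */
static struct ring_limits ring_limits_calc(uint64_t size_max,
                                           uint64_t pointer_size,
                                           uint64_t page_size,
                                           unsigned int max_order,
                                           uint64_t frame_size)
{
    struct ring_limits l;

    l.block_number = size_max / pointer_size;
    l.block_size   = page_size << max_order;
    l.buffer_size  = l.block_number * l.block_size;
    l.frame_count  = l.buffer_size / frame_size;
    return l;
}
```

Calling ring_limits_calc(131072, 4, 4096, 11, 2048) reproduces the 2.6/i386
figures above: 32768 blocks of 8 MiB each, a 262144 MiB buffer and 134217728
frames. These are upper bounds of the formulas only and, as noted, are not
reachable on i386.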
 Other constraints
-------------------

If you check the source code you will see that what I draw here as a frame
is not only the link level frame. At the beginning of each frame there is a
header called struct tpacket_hdr, used in PACKET_MMAP to hold the link level
frame's meta information like the timestamp. So what is drawn here as a frame
is really the following (from include/linux/if_packet.h):

/*
   Frame structure:

   - Start. Frame must be aligned to TPACKET_ALIGNMENT=16
   - struct tpacket_hdr
   - pad to TPACKET_ALIGNMENT=16
   - struct sockaddr_ll
   - Gap, chosen so that packet data (Start+tp_net) aligns to
     TPACKET_ALIGNMENT=16
   - Start+tp_mac: [ Optional MAC header ]
   - Start+tp_net: Packet data, aligned to TPACKET_ALIGNMENT=16.
   - Pad to align to TPACKET_ALIGNMENT=16
 */


The following are conditions that are checked in packet_set_ring

   tp_block_size must be a multiple of PAGE_SIZE (1)
   tp_frame_size must be greater than TPACKET_HDRLEN (obvious)
   tp_frame_size must be a multiple of TPACKET_ALIGNMENT
   tp_frame_nr   must be exactly frames_per_block*tp_block_nr

Note that tp_block_size should be chosen to be a power of two or there will
be a waste of memory.

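The four conditions listed above can be checked in user space before calling
setsockopt(), which gives a clearer diagnostic than a bare error return. The
following is a minimal sketch, assuming the page size reported by
sysconf(_SC_PAGESIZE) matches the kernel's PAGE_SIZE. The function name
ring_req_is_valid is hypothetical, and the kernel remains the final
authority: it may apply further checks that are not listed here.

```c
#include <unistd.h>
#include <linux/if_packet.h>

/*
 * Hypothetical pre-check (illustration only) of the conditions listed
 * in the text. Returns 1 if req satisfies all of them, 0 otherwise.
 */
static int ring_req_is_valid(const struct tpacket_req *req)
{
    long page_size = sysconf(_SC_PAGESIZE);
    unsigned int frames_per_block;

    if (page_size <= 0)
        return 0;

    /* tp_block_size must be a multiple of PAGE_SIZE */
    if (req->tp_block_size == 0 ||
        req->tp_block_size % (unsigned long)page_size != 0)
        return 0;

    /* tp_frame_size must be greater than TPACKET_HDRLEN */
    if (req->tp_frame_size <= TPACKET_HDRLEN)
        return 0;

    /* tp_frame_size must be a multiple of TPACKET_ALIGNMENT */
    if (req->tp_frame_size % TPACKET_ALIGNMENT != 0)
        return 0;

    /* tp_frame_nr must be exactly frames_per_block * tp_block_nr */
    frames_per_block = req->tp_block_size / req->tp_frame_size;
    if (frames_per_block == 0)
        return 0;
    return (unsigned long long)frames_per_block * req->tp_block_nr
           == req->tp_frame_nr;
}
```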
--------------------------------------------------------------------------------
+ Mapping and use of the circular buffer (ring)
--------------------------------------------------------------------------------

The mapping of the buffer in the user process is done with the conventional
mmap function. Even though the circular buffer is composed of several
physically discontiguous blocks of memory, they are contiguous in user space,
hence just one call to mmap is needed:

    mmap(0, size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

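Putting the setup steps together (socket(), setsockopt() with PACKET_RX_RING,
then mmap()), a capture ring could be opened roughly as sketched below. This
is an illustration under assumptions, not a reference implementation: the
function name rx_ring_open is hypothetical, the mapping size is taken to be
tp_block_size * tp_block_nr, and creating a PF_PACKET socket normally requires
root or CAP_NET_RAW, so the function reports failure instead of aborting.

```c
#include <stddef.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

/*
 * Hypothetical helper (illustration only). On success returns the
 * mapped ring and stores the socket in *fd_out. On any failure returns
 * NULL, stores -1 in *fd_out and leaves no descriptor open.
 */
static void *rx_ring_open(struct tpacket_req *req, int *fd_out)
{
    size_t size = (size_t)req->tp_block_size * req->tp_block_nr;
    void *ring;
    int fd;

    *fd_out = -1;

    fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return NULL;            /* typically EPERM without CAP_NET_RAW */

    /* ask the kernel to allocate the ring described by req */
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req)) < 0) {
        close(fd);
        return NULL;
    }

    /* one mmap() call covers all blocks */
    ring = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return ring;
}
```

When done, the caller would munmap() the ring with the same size and close()
the socket, which releases the kernel side of the buffer.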
If tp_frame_size is a divisor of tp_block_size, frames will be
contiguously spaced by tp_frame_size bytes. If not, after every
tp_block_size/tp_frame_size frames there will be a gap between
the frames. This is because a frame cannot span two
blocks.

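Because of that possible gap, the position of frame number i in the mapping
is best computed from its block and its slot inside the block, rather than as
i * tp_frame_size. A minimal sketch follows, with frames counted from 0 and a
hypothetical function name (ring_frame_offset). It assumes req already
satisfies the constraints described earlier.

```c
#include <stddef.h>
#include <linux/if_packet.h>

/*
 * Hypothetical helper (illustration only): byte offset of frame `index`
 * (counted from 0) from the start of the mmap()ed ring.
 * Assumes 0 < tp_frame_size <= tp_block_size.
 */
static size_t ring_frame_offset(const struct tpacket_req *req,
                                unsigned int index)
{
    unsigned int frames_per_block = req->tp_block_size / req->tp_frame_size;
    unsigned int block = index / frames_per_block;
    unsigned int slot  = index % frames_per_block;

    return (size_t)block * req->tp_block_size
         + (size_t)slot * req->tp_frame_size;
}
```

For example, with 4096 byte blocks and 1536 byte frames, two frames fit in a
block, so frame 1 is at offset 1536 and frame 2 is at offset 4096, leaving a
1024 byte gap at the end of the first block.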
At the beginning of each frame there is a status field (see
struct tpacket_hdr). If this field is 0, the frame is ready
to be used by the kernel. If not, there is a frame the user can read
and the following flags apply:

+++ Capture process:
     from include/linux/if_packet.h

     #define TP_STATUS_COPY          2
     #define TP_STATUS_LOSING        4
     #define TP_STATUS_CSUMNOTREADY  8


TP_STATUS_COPY        : This flag indicates that the frame (and associated
                        meta information) has been truncated because it's
                        larger than tp_frame_size. This packet can be
                        read entirely with recvfrom().

                        For this to work, it must be enabled beforehand
                        with setsockopt() and the PACKET_COPY_THRESH
                        option.

                        The number of frames that can be buffered to
                        be read with recvfrom is limited like a normal socket.
                        See the SO_RCVBUF option in the socket (7) man page.

TP_STATUS_LOSING      : indicates there were packet drops since the last time
                        statistics were checked with getsockopt() and
                        the PACKET_STATISTICS option.

TP_STATUS_CSUMNOTREADY: currently it's used for outgoing IP packets whose
                        checksum will be done in hardware. So while
                        reading the packet we should not try to check the
                        checksum.

for convenience there are also the following defines:

     #define TP_STATUS_KERNEL        0
     #define TP_STATUS_USER          1

The kernel initializes all frames to TP_STATUS_KERNEL. When the kernel
receives a packet, it puts it in the buffer and updates the status with
at least the TP_STATUS_USER flag. Then the user can read the packet.
Once the packet is read, the user must zero the status field so the kernel
can use that frame buffer again.

The user can use poll (any other variant should apply too) to check if new
packets are in the ring:

    struct pollfd pfd;

    pfd.fd = fd;
    pfd.revents = 0;
    pfd.events = POLLIN|POLLRDNORM|POLLERR;

    if (status == TP_STATUS_KERNEL)
        retval = poll(&pfd, 1, timeout);

Checking the status value first and then polling for frames does not cause
a race condition.

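The status handshake described above (the kernel sets TP_STATUS_USER, the
user reads the frame and writes the status back to zero) can be sketched as a
loop over the ring. Everything below except the linux/if_packet.h definitions
is an assumption made for illustration: the names rx_ring_drain and
frame_handler are hypothetical, the payload is located through tp_mac as for
a SOCK_RAW socket, and __sync_synchronize() is a GCC/Clang builtin used here
as a conservative memory barrier.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/if_packet.h>

/* Hypothetical callback invoked for every frame handed to user space. */
typedef void (*frame_handler)(const uint8_t *data, unsigned int len,
                              void *ctx);

/*
 * Hypothetical helper (illustration only): consumes frames starting at
 * *next until one still owned by the kernel is found. Returns the
 * number of frames consumed and advances *next around the ring.
 */
static unsigned int rx_ring_drain(uint8_t *ring,
                                  const struct tpacket_req *req,
                                  unsigned int *next,
                                  frame_handler handle, void *ctx)
{
    unsigned int per_block = req->tp_block_size / req->tp_frame_size;
    unsigned int handled = 0;

    while (handled < req->tp_frame_nr) {
        size_t off = (size_t)(*next / per_block) * req->tp_block_size
                   + (size_t)(*next % per_block) * req->tp_frame_size;
        volatile struct tpacket_hdr *hdr =
            (volatile struct tpacket_hdr *)(ring + off);

        if (!(hdr->tp_status & TP_STATUS_USER))
            break;                      /* still owned by the kernel */

        __sync_synchronize();           /* read status before data */
        if (handle)
            handle(ring + off + hdr->tp_mac, hdr->tp_snaplen, ctx);

        __sync_synchronize();           /* finish reads before release */
        hdr->tp_status = TP_STATUS_KERNEL;
        *next = (*next + 1) % req->tp_frame_nr;
        handled++;
    }
    return handled;
}
```

A capture loop would call this function and, when it returns 0, fall back to
poll() as shown above. The handler could also inspect TP_STATUS_COPY or
TP_STATUS_LOSING if it were given access to the header; that is left out to
keep the sketch short.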

+++ Transmission process:
These defines are also used for transmission:

     #define TP_STATUS_AVAILABLE        0 // Frame is available
     #define TP_STATUS_SEND_REQUEST     1 // Frame will be sent on next send()
     #define TP_STATUS_SENDING          2 // Frame is currently in transmission
     #define TP_STATUS_WRONG_FORMAT     4 // Frame format is not correct

First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a
packet, the user fills the data buffer of an available frame, sets tp_len to
the current data buffer size and sets its status field to
TP_STATUS_SEND_REQUEST. This can be done on multiple frames. Once the user is
ready to transmit, it calls send(). Then all buffers with status equal to
TP_STATUS_SEND_REQUEST are forwarded to the network device. The kernel sets
the status of each frame being sent to TP_STATUS_SENDING until the end of
the transfer.
At the end of each transfer, the buffer status returns to TP_STATUS_AVAILABLE.

    header->tp_len = in_i_size;
    header->tp_status = TP_STATUS_SEND_REQUEST;
    retval = send(fd, NULL, 0, 0);

The user can also use poll() to check if a buffer is available:
(status == TP_STATUS_SENDING)

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.revents = 0;
    pfd.events = POLLOUT;
    retval = poll(&pfd, 1, timeout);

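The user-side half of the transmission steps (fill the data buffer, set
tp_len, then set TP_STATUS_SEND_REQUEST) can be sketched as follows. The name
tx_frame_queue is hypothetical, the data is placed at the default offset
TPACKET_ALIGN(sizeof(struct tpacket_hdr)) without PACKET_TX_HAS_OFF, and
__sync_synchronize() is a GCC/Clang builtin used as a conservative barrier so
the data is written before the status changes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/if_packet.h>

/*
 * Hypothetical helper (illustration only): copies len bytes into the TX
 * frame at frame_base and marks it for the next send().
 * Returns 0 on success, -1 if the frame is not available or the data
 * does not fit in a frame of frame_size bytes.
 */
static int tx_frame_queue(void *frame_base, unsigned int frame_size,
                          const void *data, unsigned int len)
{
    struct tpacket_hdr *hdr = frame_base;
    size_t data_off = TPACKET_ALIGN(sizeof(struct tpacket_hdr));

    if (hdr->tp_status != TP_STATUS_AVAILABLE)
        return -1;              /* queued, sending or wrong format */
    if (frame_size < data_off || len > frame_size - data_off)
        return -1;

    memcpy((uint8_t *)frame_base + data_off, data, len);
    hdr->tp_len = len;

    __sync_synchronize();       /* publish data before the status */
    hdr->tp_status = TP_STATUS_SEND_REQUEST;
    return 0;
}
```

After queueing one or more frames this way, a single send(fd, NULL, 0, 0)
call hands all of them to the kernel, as described above.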
--------------------------------------------------------------------------------
+ PACKET_TIMESTAMP
--------------------------------------------------------------------------------

The PACKET_TIMESTAMP setting determines the source of the timestamp in
the packet meta information. If your NIC is capable of timestamping
packets in hardware, you can request those hardware timestamps to be used.
Note: you may need to enable the generation of hardware timestamps with
SIOCSHWTSTAMP.

PACKET_TIMESTAMP accepts the same integer bit field as
SO_TIMESTAMPING. However, only the SOF_TIMESTAMPING_SYS_HARDWARE
and SOF_TIMESTAMPING_RAW_HARDWARE values are recognized by
PACKET_TIMESTAMP. SOF_TIMESTAMPING_SYS_HARDWARE takes precedence over
SOF_TIMESTAMPING_RAW_HARDWARE if both bits are set.

    int req = 0;
    req |= SOF_TIMESTAMPING_SYS_HARDWARE;
    setsockopt(fd, SOL_PACKET, PACKET_TIMESTAMP, (void *) &req, sizeof(req));

If PACKET_TIMESTAMP is not set, a software timestamp generated inside
the networking stack is used (the behavior before this setting was added).

See include/linux/net_tstamp.h and Documentation/networking/timestamping
for more information on hardware timestamps.

--------------------------------------------------------------------------------
+ THANKS
--------------------------------------------------------------------------------

   Jesse Brandeburg, for fixing my grammatical/spelling errors
