]>
Commit | Line | Data |
---|---|---|
d3f45345 QY |
1 | .. _tracing: |
2 | ||
3 | Tracing | |
4 | ======= | |
5 | ||
6 | FRR has a small but growing number of static tracepoints available for use with | |
7 | various tracing systems. These tracepoints can assist with debugging, | |
8 | performance analysis and to help understand program flow. They can also be used | |
9 | for monitoring. | |
10 | ||
11 | Developers are encouraged to write new static tracepoints where sensible. They | |
12 | are not compiled in by default, and even when they are, they have no overhead | |
13 | unless enabled by a tracer, so it is okay to be liberal with them. | |
14 | ||
15 | ||
16 | Supported tracers | |
17 | ----------------- | |
18 | ||
19 | Presently two types of tracepoints are supported: | |
20 | ||
21 | - `LTTng tracepoints <https://lttng.org/>`_ | |
22 | - `USDT probes <http://dtrace.org/guide/chp-usdt.html>`_ | |
23 | ||
24 | LTTng is a tracing framework for Linux only. It offers extremely low overhead | |
25 | and very rich tracing capabilities. FRR supports LTTng-UST, which is the | |
26 | userspace implementation. LTTng tracepoints are very rich in detail. No kernel | |
27 | modules are needed. Besides only being available for Linux, the primary | |
28 | downside of LTTng is the need to link to ``lttng-ust``. | |
29 | ||
30 | USDT probes originate from Solaris, where they were invented for use with | |
31 | dtrace. They are a kernel feature. At least Linux and FreeBSD support them. No | |
32 | library is needed; support is compiled in via a system header | |
33 | (``<sys/sdt.h>``). USDT probes are much slower than LTTng tracepoints and offer | |
34 | less flexibility in what information can be gleaned from them. | |
35 | ||
36 | LTTng is capable of tracing USDT probes but has limited support for them. | |
37 | SystemTap and dtrace both work only with USDT probes. | |
38 | ||
39 | ||
40 | Usage | |
41 | ----- | |
42 | ||
43 | To compile with tracepoints, use one of the following configure flags: | |
44 | ||
45 | .. program:: configure.ac | |
46 | ||
47 | .. option:: --enable-lttng=yes | |
48 | ||
49 | Generate LTTng tracepoints | |
50 | ||
51 | .. option:: --enable-usdt=yes | |
52 | ||
53 | Generate USDT probes | |
54 | ||
55 | To trace with LTTng, compile with either one (prefer :option:`--enable-lttng` | |
56 | run the target in non-forking mode (no ``-d``) and use LTTng as usual (refer to | |
57 | LTTng user manual). When using USDT probes with LTTng, follow the example in | |
58 | `this article | |
59 | <https://lttng.org/blog/2019/10/15/new-dynamic-user-space-tracing-in-lttng/>`_. | |
0a9720bd | 60 | To trace with dtrace or SystemTap, compile with `--enable-usdt=yes` and |
d3f45345 QY |
61 | use your tracer as usual. |
62 | ||
63 | To see available USDT probes:: | |
64 | ||
7f4d2e5a | 65 | readelf -n /usr/lib/frr/bgpd |
d3f45345 QY |
66 | |
67 | Example:: | |
68 | ||
7f4d2e5a QY |
69 | root@host ~> readelf -n /usr/lib/frr/bgpd |
70 | ||
71 | Displaying notes found in: .note.ABI-tag | |
72 | Owner Data size Description | |
73 | GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag) | |
74 | OS: Linux, ABI: 3.2.0 | |
75 | ||
76 | Displaying notes found in: .note.gnu.build-id | |
77 | Owner Data size Description | |
78 | GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) | |
79 | Build ID: 4f42933a69dcb42a519bc459b2105177c8adf55d | |
80 | ||
81 | Displaying notes found in: .note.stapsdt | |
82 | Owner Data size Description | |
83 | stapsdt 0x00000045 NT_STAPSDT (SystemTap probe descriptors) | |
84 | Provider: frr_bgp | |
85 | Name: packet_read | |
86 | Location: 0x000000000045ee48, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
87 | Arguments: 8@-96(%rbp) 8@-104(%rbp) | |
88 | stapsdt 0x00000047 NT_STAPSDT (SystemTap probe descriptors) | |
89 | Provider: frr_bgp | |
90 | Name: open_process | |
91 | Location: 0x000000000047c43b, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
92 | Arguments: 8@-224(%rbp) 2@-226(%rbp) | |
93 | stapsdt 0x00000049 NT_STAPSDT (SystemTap probe descriptors) | |
94 | Provider: frr_bgp | |
95 | Name: update_process | |
96 | Location: 0x000000000047c4bf, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
97 | Arguments: 8@-208(%rbp) 2@-210(%rbp) | |
98 | stapsdt 0x0000004f NT_STAPSDT (SystemTap probe descriptors) | |
99 | Provider: frr_bgp | |
100 | Name: notification_process | |
101 | Location: 0x000000000047c557, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
102 | Arguments: 8@-192(%rbp) 2@-194(%rbp) | |
103 | stapsdt 0x0000004c NT_STAPSDT (SystemTap probe descriptors) | |
104 | Provider: frr_bgp | |
105 | Name: keepalive_process | |
106 | Location: 0x000000000047c5db, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
107 | Arguments: 8@-176(%rbp) 2@-178(%rbp) | |
108 | stapsdt 0x0000004a NT_STAPSDT (SystemTap probe descriptors) | |
109 | Provider: frr_bgp | |
110 | Name: refresh_process | |
111 | Location: 0x000000000047c673, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
112 | Arguments: 8@-160(%rbp) 2@-162(%rbp) | |
113 | stapsdt 0x0000004d NT_STAPSDT (SystemTap probe descriptors) | |
114 | Provider: frr_bgp | |
115 | Name: capability_process | |
116 | Location: 0x000000000047c6f7, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
117 | Arguments: 8@-144(%rbp) 2@-146(%rbp) | |
118 | stapsdt 0x0000006f NT_STAPSDT (SystemTap probe descriptors) | |
119 | Provider: frr_bgp | |
120 | Name: output_filter | |
121 | Location: 0x000000000048e33a, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
122 | Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) | |
123 | stapsdt 0x0000007d NT_STAPSDT (SystemTap probe descriptors) | |
124 | Provider: frr_bgp | |
125 | Name: process_update | |
126 | Location: 0x0000000000491f10, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
127 | Arguments: 8@-800(%rbp) 8@-808(%rbp) 4@-812(%rbp) 4@-816(%rbp) 4@-820(%rbp) 8@-832(%rbp) | |
128 | stapsdt 0x0000006e NT_STAPSDT (SystemTap probe descriptors) | |
129 | Provider: frr_bgp | |
130 | Name: input_filter | |
131 | Location: 0x00000000004940ed, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
132 | Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) | |
d3f45345 QY |
133 | |
134 | ||
135 | To see available LTTng probes, run the target, create a session and then:: | |
136 | ||
7f4d2e5a | 137 | lttng list --userspace | grep frr |
d3f45345 QY |
138 | |
139 | Example:: | |
140 | ||
7f4d2e5a QY |
141 | root@host ~> lttng list --userspace | grep frr |
142 | PID: 11157 - Name: /usr/lib/frr/bgpd | |
143 | frr_libfrr:route_node_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
144 | frr_libfrr:list_sort (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
145 | frr_libfrr:list_delete_node (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
146 | frr_libfrr:list_remove (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
147 | frr_libfrr:list_add (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
148 | frr_libfrr:memfree (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
149 | frr_libfrr:memalloc (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
150 | frr_libfrr:frr_pthread_stop (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
151 | frr_libfrr:frr_pthread_run (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
152 | frr_libfrr:thread_call (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
332beb64 DS |
153 | frr_libfrr:event_cancel_async (loglevel: TRACE_INFO (6)) (type: tracepoint) |
154 | frr_libfrr:event_cancel (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
7f4d2e5a QY |
155 | frr_libfrr:schedule_write (loglevel: TRACE_INFO (6)) (type: tracepoint) |
156 | frr_libfrr:schedule_read (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
157 | frr_libfrr:schedule_event (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
158 | frr_libfrr:schedule_timer (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
159 | frr_libfrr:hash_release (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
160 | frr_libfrr:hash_insert (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
161 | frr_libfrr:hash_get (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
162 | frr_bgp:output_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
163 | frr_bgp:input_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
164 | frr_bgp:process_update (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
165 | frr_bgp:packet_read (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
166 | frr_bgp:refresh_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
167 | frr_bgp:capability_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
168 | frr_bgp:notification_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
169 | frr_bgp:update_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
170 | frr_bgp:keepalive_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
171 | frr_bgp:open_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
d3f45345 QY |
172 | |
173 | When using LTTng, you can also get zlogs as trace events by enabling | |
174 | the ``lttng_ust_tracelog:*`` event class. | |
175 | ||
c952f184 DA |
176 | To see available SystemTap USDT probes, run:: |
177 | ||
ccc1af39 | 178 | stap -L 'process("/usr/lib/frr/bgpd").mark("*")' |
c952f184 DA |
179 | |
180 | Example:: | |
ccc1af39 DA |
181 | |
182 | root@host ~> stap -L 'process("/usr/lib/frr/bgpd").mark("*")' | |
183 | process("/usr/lib/frr/bgpd").mark("capability_process") $arg1:long $arg2:long | |
184 | process("/usr/lib/frr/bgpd").mark("input_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long | |
185 | process("/usr/lib/frr/bgpd").mark("keepalive_process") $arg1:long $arg2:long | |
186 | process("/usr/lib/frr/bgpd").mark("notification_process") $arg1:long $arg2:long | |
187 | process("/usr/lib/frr/bgpd").mark("open_process") $arg1:long $arg2:long | |
188 | process("/usr/lib/frr/bgpd").mark("output_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long | |
189 | process("/usr/lib/frr/bgpd").mark("packet_read") $arg1:long $arg2:long | |
190 | process("/usr/lib/frr/bgpd").mark("process_update") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long $arg6:long | |
191 | process("/usr/lib/frr/bgpd").mark("refresh_process") $arg1:long $arg2:long | |
192 | process("/usr/lib/frr/bgpd").mark("update_process") $arg1:long $arg2:long | |
c952f184 DA |
193 | |
194 | When using SystemTap, you can also easily attach to an existing function:: | |
195 | ||
ccc1af39 | 196 | stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' |
c952f184 | 197 | |
ccc1af39 | 198 | Example:: |
c952f184 | 199 | |
ccc1af39 DA |
200 | root@host ~> stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' |
201 | process("/usr/lib/frr/bgpd").function("bgp_update_receive@bgpd/bgp_packet.c:1531") $peer:struct peer* $size:bgp_size_t $attr:struct attr $restart:_Bool $nlris:struct bgp_nlri[] $__func__:char const[] const | |
c952f184 DA |
202 | |
203 | Complete ``bgp.stp`` example using SystemTap to show BGP peer, prefix and aspath | |
204 | using ``process_update`` USDT:: | |
205 | ||
ccc1af39 DA |
206 | global pkt_size; |
207 | probe begin | |
208 | { | |
209 | ansi_clear_screen(); | |
210 | println("Starting..."); | |
211 | } | |
212 | probe process("/usr/lib/frr/bgpd").function("bgp_update_receive") | |
213 | { | |
214 | pkt_size <<< $size; | |
215 | } | |
216 | probe process("/usr/lib/frr/bgpd").mark("process_update") | |
217 | { | |
218 | aspath = @cast($arg6, "attr")->aspath; | |
219 | printf("> %s via %s (%s)\n", | |
220 | user_string($arg2), | |
221 | user_string(@cast($arg1, "peer")->host), | |
222 | user_string(@cast(aspath, "aspath")->str)); | |
223 | } | |
224 | probe end | |
225 | { | |
226 | if (@count(pkt_size)) | |
227 | print(@hist_linear(pkt_size, 0, 20, 2)); | |
228 | } | |
c952f184 DA |
229 | |
230 | Output:: | |
231 | ||
ccc1af39 DA |
232 | Starting... |
233 | > 192.168.0.0/24 via 192.168.0.1 (65534) | |
234 | > 192.168.100.1/32 via 192.168.0.1 (65534) | |
235 | > 172.16.16.1/32 via 192.168.0.1 (65534 65030) | |
236 | ^Cvalue |-------------------------------------------------- count | |
237 | 0 | 0 | |
238 | 2 | 0 | |
239 | 4 |@ 1 | |
240 | 6 | 0 | |
241 | 8 | 0 | |
242 | ~ | |
243 | 18 | 0 | |
244 | 20 | 0 | |
245 | >20 |@@@@@ 5 | |
c952f184 DA |
246 | |
247 | ||
d3f45345 QY |
248 | Concepts |
249 | -------- | |
250 | ||
251 | Tracepoints are statically defined points in code where a developer has | |
252 | determined that outside observers might gain something from knowing what is | |
253 | going on at that point. It's like logging but with the ability to dump large | |
254 | amounts of internal data with much higher performance. LTTng has a good summary | |
255 | `here <https://lttng.org/docs/#doc-what-is-tracing>`_. | |
256 | ||
257 | Each tracepoint has a "provider" and name. The provider is basically a | |
258 | namespace; for example, ``bgpd`` uses the provider name ``frr_bgp``. The name | |
259 | is arbitrary, but because providers share a global namespace on the user's | |
260 | system, all providers from FRR should be prefixed by ``frr_``. The tracepoint | |
261 | name is just the name of the event. Events are globally named by their provider | |
262 | and name. For example, the event when BGP reads a packet from a peer is | |
263 | ``frr_bgp:packet_read``. | |
264 | ||
265 | To do tracing, the tracing tool of choice is told which events to listen to. | |
266 | For example, to listen to all events from FRR's BGP implementation, you would | |
267 | enable the events ``frr_bgp:*``. In the same tracing session you could also | |
268 | choose to record all memory allocations by enabling the ``malloc`` tracepoints | |
269 | in ``libc`` as well as all kernel skb operations using the various in-kernel | |
270 | tracepoints. This allows you to build as complete a view as desired of what the | |
271 | system is doing during the tracing window (subject to what tracepoints are | |
272 | available). | |
273 | ||
274 | Of particular use are the tracepoints for FRR's internal event scheduler; | |
275 | tracing these allows you to see all events executed by all event loops for the | |
276 | target(s) in question. Here's a couple events selected from a trace of BGP | |
277 | during startup:: | |
278 | ||
279 | ... | |
280 | ||
281 | [18:41:35.750131763] (+0.000048901) host frr_libfrr:thread_call: { cpu_id = | |
282 | 1 }, { threadmaster_name = "default", function_name = "zclient_connect", | |
283 | scheduled_from = "lib/zclient.c", scheduled_on_line = 3877, thread_addr = | |
284 | 0x0, file_descriptor = 0, event_value = 0, argument_ptr = 0xA37F70, timer = | |
285 | 0 } | |
286 | ||
287 | [18:41:35.750175124] (+0.000020001) host frr_libfrr:thread_call: { cpu_id = | |
288 | 1 }, { threadmaster_name = "default", function_name = "frr_config_read_in", | |
289 | scheduled_from = "lib/libfrr.c", scheduled_on_line = 934, thread_addr = 0x0, | |
290 | file_descriptor = 0, event_value = 0, argument_ptr = 0x0, timer = 0 } | |
291 | ||
292 | [18:41:35.753341264] (+0.000010532) host frr_libfrr:thread_call: { cpu_id = | |
293 | 1 }, { threadmaster_name = "default", function_name = "bgp_event", | |
294 | scheduled_from = "bgpd/bgpd.c", scheduled_on_line = 142, thread_addr = 0x0, | |
295 | file_descriptor = 2, event_value = 2, argument_ptr = 0xE4D780, timer = 2 } | |
296 | ||
297 | [18:41:35.753404186] (+0.000004910) host frr_libfrr:thread_call: { cpu_id = | |
298 | 1 }, { threadmaster_name = "default", function_name = "zclient_read", | |
299 | scheduled_from = "lib/zclient.c", scheduled_on_line = 3891, thread_addr = | |
300 | 0x0, file_descriptor = 40, event_value = 40, argument_ptr = 0xA37F70, timer | |
301 | = 40 } | |
302 | ||
303 | ... | |
304 | ||
305 | ||
306 | Very useful for getting a time-ordered look into what the process is doing. | |
307 | ||
308 | ||
309 | Adding Tracepoints | |
310 | ------------------ | |
311 | ||
312 | Adding new tracepoints is a two step process: | |
313 | ||
314 | 1. Define the tracepoint | |
315 | 2. Use the tracepoint | |
316 | ||
317 | Tracepoint definitions state the "provider" and name of the tracepoint, along | |
318 | with any values it will produce, and how to format them. This is done with | |
319 | macros provided by LTTng. USDT probes do not use definitions and are inserted | |
320 | at the trace site with a single macro. However, to maintain support for both | |
321 | platforms, you must define an LTTng tracepoint when adding a new one. | |
322 | ``frrtrace()`` will expand to the appropriate ``DTRACE_PROBEn`` macro when USDT | |
323 | is in use. | |
324 | ||
325 | If you are adding new tracepoints to a daemon that has no tracepoints, that | |
326 | daemon's ``subdir.am`` must be updated to conditionally link ``lttng-ust``. | |
327 | Look at ``bgpd/subdir.am`` for an example of how to do this; grep for | |
328 | ``UST_LIBS``. Create new files named ``<daemon>_trace.[ch]``. Use | |
329 | ``bgpd/bgp_trace.[h]`` as boilerplate. If you are adding tracepoints to a | |
330 | daemon that already has them, look for the ``<daemon>_trace.h`` file; | |
331 | tracepoints are written here. | |
332 | ||
333 | Refer to the `LTTng developer docs | |
334 | <https://lttng.org/docs/#doc-c-application>`_ for details on how to define | |
335 | tracepoints. | |
336 | ||
337 | To use them, simply add a call to ``frrtrace()`` at the point you'd like the | |
338 | event to be emitted, like so: | |
339 | ||
340 | .. code-block:: c | |
341 | ||
342 | ... | |
343 | ||
344 | switch (type) { | |
345 | case BGP_MSG_OPEN: | |
346 | frrtrace(2, frr_bgp, open_process, peer, size); /* tracepoint */ | |
347 | atomic_fetch_add_explicit(&peer->open_in, 1, | |
348 | memory_order_relaxed); | |
349 | mprc = bgp_open_receive(peer, size); | |
350 | ||
351 | ... | |
352 | ||
353 | After recompiling this tracepoint will now be available, either as a USDT probe | |
354 | or LTTng tracepoint, depending on your compilation choice. | |
355 | ||
356 | ||
357 | trace.h | |
358 | ^^^^^^^ | |
359 | ||
360 | Because FRR supports multiple types of tracepoints, the code for creating them | |
361 | abstracts away the underlying system being used. This abstraction code is in | |
362 | ``lib/trace.h``. There are 2 function-like macros that are used for working | |
363 | with tracepoints. | |
364 | ||
365 | - ``frrtrace()`` defines tracepoints | |
366 | - ``frrtrace_enabled()`` checks whether a tracepoint is enabled | |
367 | ||
368 | There is also ``frrtracelog()``, which is used in zlog core code to make zlog | |
369 | messages available as trace events to LTTng. This should not be used elsewhere. | |
370 | ||
371 | There is additional documentation in the header. The key thing to note is that | |
372 | you should never include ``trace.h`` in source where you plan to put | |
373 | tracepoints; include the tracepoint definition header instead (e.g. | |
374 | :file:`bgp_trace.h`). | |
375 | ||
376 | ||
377 | Limitations | |
378 | ----------- | |
379 | ||
380 | Tracers do not like ``fork()`` or ``dlopen()``. LTTng has some workarounds for | |
381 | this involving interceptor libraries using ``LD_PRELOAD``. | |
382 | ||
90294611 QY |
383 | If you're running FRR in a typical daemonizing way (``-d`` to the daemons) |
384 | you'll need to run the daemons like so: | |
385 | ||
386 | .. code-block:: shell | |
387 | ||
388 | LD_PRELOAD=liblttng-ust-fork.so <daemon> | |
389 | ||
390 | ||
391 | If you're using systemd this you can accomplish this for all daemons by | |
392 | modifying ``frr.service`` like so: | |
393 | ||
394 | .. code-block:: diff | |
395 | ||
396 | --- a/frr.service | |
397 | +++ b/frr.service | |
398 | @@ -7,6 +7,7 @@ Before=network.target | |
489faf5e | 399 | OnFailure=heartbeat-failed@%n |
90294611 QY |
400 | |
401 | [Service] | |
402 | +Environment="LD_PRELOAD=liblttng-ust-fork.so" | |
403 | Nice=-5 | |
404 | Type=forking | |
405 | NotifyAccess=all | |
406 | ||
407 | ||
d3f45345 QY |
408 | USDT tracepoints are relatively high overhead and probably shouldn't be used |
409 | for "flight recorder" functionality, i.e. enabling and passively recording all | |
410 | events for monitoring purposes. It's generally okay to use LTTng like this, | |
411 | though. |