]>
Commit | Line | Data |
---|---|---|
d3f45345 QY |
1 | .. _tracing: |
2 | ||
3 | Tracing | |
4 | ======= | |
5 | ||
6 | FRR has a small but growing number of static tracepoints available for use with | |
7 | various tracing systems. These tracepoints can assist with debugging, | |
8 | performance analysis and to help understand program flow. They can also be used | |
9 | for monitoring. | |
10 | ||
11 | Developers are encouraged to write new static tracepoints where sensible. They | |
12 | are not compiled in by default, and even when they are, they have no overhead | |
13 | unless enabled by a tracer, so it is okay to be liberal with them. | |
14 | ||
15 | ||
16 | Supported tracers | |
17 | ----------------- | |
18 | ||
19 | Presently two types of tracepoints are supported: | |
20 | ||
21 | - `LTTng tracepoints <https://lttng.org/>`_ | |
22 | - `USDT probes <http://dtrace.org/guide/chp-usdt.html>`_ | |
23 | ||
24 | LTTng is a tracing framework for Linux only. It offers extremely low overhead | |
25 | and very rich tracing capabilities. FRR supports LTTng-UST, which is the | |
26 | userspace implementation. LTTng tracepoints are very rich in detail. No kernel | |
27 | modules are needed. Besides only being available for Linux, the primary | |
28 | downside of LTTng is the need to link to ``lttng-ust``. | |
29 | ||
30 | USDT probes originate from Solaris, where they were invented for use with | |
31 | dtrace. They are a kernel feature. At least Linux and FreeBSD support them. No | |
32 | library is needed; support is compiled in via a system header | |
33 | (``<sys/sdt.h>``). USDT probes are much slower than LTTng tracepoints and offer | |
34 | less flexibility in what information can be gleaned from them. | |
35 | ||
36 | LTTng is capable of tracing USDT probes but has limited support for them. | |
37 | SystemTap and dtrace both work only with USDT probes. | |
38 | ||
39 | ||
40 | Usage | |
41 | ----- | |
42 | ||
43 | To compile with tracepoints, use one of the following configure flags: | |
44 | ||
45 | .. program:: configure.ac | |
46 | ||
47 | .. option:: --enable-lttng=yes | |
48 | ||
49 | Generate LTTng tracepoints | |
50 | ||
51 | .. option:: --enable-usdt=yes | |
52 | ||
53 | Generate USDT probes | |
54 | ||
55 | To trace with LTTng, compile with either one (prefer :option:`--enable-lttng` | |
56 | run the target in non-forking mode (no ``-d``) and use LTTng as usual (refer to | |
57 | LTTng user manual). When using USDT probes with LTTng, follow the example in | |
58 | `this article | |
59 | <https://lttng.org/blog/2019/10/15/new-dynamic-user-space-tracing-in-lttng/>`_. | |
0a9720bd | 60 | To trace with dtrace or SystemTap, compile with `--enable-usdt=yes` and |
d3f45345 QY |
61 | use your tracer as usual. |
62 | ||
63 | To see available USDT probes:: | |
64 | ||
65 | readelf -n /usr/lib/frr/bgpd | |
66 | ||
67 | Example:: | |
68 | ||
69 | root@host ~> readelf -n /usr/lib/frr/bgpd | |
70 | ||
71 | Displaying notes found in: .note.ABI-tag | |
72 | Owner Data size Description | |
73 | GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag) | |
74 | OS: Linux, ABI: 3.2.0 | |
75 | ||
76 | Displaying notes found in: .note.gnu.build-id | |
77 | Owner Data size Description | |
78 | GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) | |
79 | Build ID: 4f42933a69dcb42a519bc459b2105177c8adf55d | |
80 | ||
81 | Displaying notes found in: .note.stapsdt | |
82 | Owner Data size Description | |
83 | stapsdt 0x00000045 NT_STAPSDT (SystemTap probe descriptors) | |
84 | Provider: frr_bgp | |
85 | Name: packet_read | |
86 | Location: 0x000000000045ee48, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
87 | Arguments: 8@-96(%rbp) 8@-104(%rbp) | |
88 | stapsdt 0x00000047 NT_STAPSDT (SystemTap probe descriptors) | |
89 | Provider: frr_bgp | |
90 | Name: open_process | |
91 | Location: 0x000000000047c43b, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
92 | Arguments: 8@-224(%rbp) 2@-226(%rbp) | |
93 | stapsdt 0x00000049 NT_STAPSDT (SystemTap probe descriptors) | |
94 | Provider: frr_bgp | |
95 | Name: update_process | |
96 | Location: 0x000000000047c4bf, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
97 | Arguments: 8@-208(%rbp) 2@-210(%rbp) | |
98 | stapsdt 0x0000004f NT_STAPSDT (SystemTap probe descriptors) | |
99 | Provider: frr_bgp | |
100 | Name: notification_process | |
101 | Location: 0x000000000047c557, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
102 | Arguments: 8@-192(%rbp) 2@-194(%rbp) | |
103 | stapsdt 0x0000004c NT_STAPSDT (SystemTap probe descriptors) | |
104 | Provider: frr_bgp | |
105 | Name: keepalive_process | |
106 | Location: 0x000000000047c5db, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
107 | Arguments: 8@-176(%rbp) 2@-178(%rbp) | |
108 | stapsdt 0x0000004a NT_STAPSDT (SystemTap probe descriptors) | |
109 | Provider: frr_bgp | |
110 | Name: refresh_process | |
111 | Location: 0x000000000047c673, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
112 | Arguments: 8@-160(%rbp) 2@-162(%rbp) | |
113 | stapsdt 0x0000004d NT_STAPSDT (SystemTap probe descriptors) | |
114 | Provider: frr_bgp | |
115 | Name: capability_process | |
116 | Location: 0x000000000047c6f7, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
117 | Arguments: 8@-144(%rbp) 2@-146(%rbp) | |
118 | stapsdt 0x0000006f NT_STAPSDT (SystemTap probe descriptors) | |
119 | Provider: frr_bgp | |
120 | Name: output_filter | |
121 | Location: 0x000000000048e33a, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
122 | Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) | |
123 | stapsdt 0x0000007d NT_STAPSDT (SystemTap probe descriptors) | |
124 | Provider: frr_bgp | |
125 | Name: process_update | |
126 | Location: 0x0000000000491f10, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
127 | Arguments: 8@-800(%rbp) 8@-808(%rbp) 4@-812(%rbp) 4@-816(%rbp) 4@-820(%rbp) 8@-832(%rbp) | |
128 | stapsdt 0x0000006e NT_STAPSDT (SystemTap probe descriptors) | |
129 | Provider: frr_bgp | |
130 | Name: input_filter | |
131 | Location: 0x00000000004940ed, Base: 0x00000000005a09d2, Semaphore: 0x0000000000000000 | |
132 | Arguments: 8@-144(%rbp) 8@-152(%rbp) 4@-156(%rbp) 4@-160(%rbp) 8@-168(%rbp) | |
133 | ||
134 | ||
135 | To see available LTTng probes, run the target, create a session and then:: | |
136 | ||
137 | lttng list --userspace | grep frr | |
138 | ||
139 | Example:: | |
140 | ||
141 | root@host ~> lttng list --userspace | grep frr | |
142 | PID: 11157 - Name: /usr/lib/frr/bgpd | |
143 | frr_libfrr:route_node_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
144 | frr_libfrr:list_sort (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
145 | frr_libfrr:list_delete_node (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
146 | frr_libfrr:list_remove (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
147 | frr_libfrr:list_add (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
148 | frr_libfrr:memfree (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
149 | frr_libfrr:memalloc (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
150 | frr_libfrr:frr_pthread_stop (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
151 | frr_libfrr:frr_pthread_run (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint) | |
152 | frr_libfrr:thread_call (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
153 | frr_libfrr:thread_cancel_async (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
154 | frr_libfrr:thread_cancel (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
155 | frr_libfrr:schedule_write (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
156 | frr_libfrr:schedule_read (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
157 | frr_libfrr:schedule_event (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
158 | frr_libfrr:schedule_timer (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
159 | frr_libfrr:hash_release (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
160 | frr_libfrr:hash_insert (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
161 | frr_libfrr:hash_get (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
162 | frr_bgp:output_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
163 | frr_bgp:input_filter (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
164 | frr_bgp:process_update (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
165 | frr_bgp:packet_read (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
166 | frr_bgp:refresh_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
167 | frr_bgp:capability_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
168 | frr_bgp:notification_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
169 | frr_bgp:update_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
170 | frr_bgp:keepalive_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
171 | frr_bgp:open_process (loglevel: TRACE_INFO (6)) (type: tracepoint) | |
172 | ||
173 | When using LTTng, you can also get zlogs as trace events by enabling | |
174 | the ``lttng_ust_tracelog:*`` event class. | |
175 | ||
c952f184 DA |
176 | To see available SystemTap USDT probes, run:: |
177 | ||
178 | stap -L 'process("/usr/lib/frr/bgpd").mark("*")' | |
179 | ||
180 | Example:: | |
181 | root@host ~> stap -L 'process("/usr/lib/frr/bgpd").mark("*")' | |
182 | process("/usr/lib/frr/bgpd").mark("capability_process") $arg1:long $arg2:long | |
183 | process("/usr/lib/frr/bgpd").mark("input_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long | |
184 | process("/usr/lib/frr/bgpd").mark("keepalive_process") $arg1:long $arg2:long | |
185 | process("/usr/lib/frr/bgpd").mark("notification_process") $arg1:long $arg2:long | |
186 | process("/usr/lib/frr/bgpd").mark("open_process") $arg1:long $arg2:long | |
187 | process("/usr/lib/frr/bgpd").mark("output_filter") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long | |
188 | process("/usr/lib/frr/bgpd").mark("packet_read") $arg1:long $arg2:long | |
189 | process("/usr/lib/frr/bgpd").mark("process_update") $arg1:long $arg2:long $arg3:long $arg4:long $arg5:long $arg6:long | |
190 | process("/usr/lib/frr/bgpd").mark("refresh_process") $arg1:long $arg2:long | |
191 | process("/usr/lib/frr/bgpd").mark("update_process") $arg1:long $arg2:long | |
192 | ||
193 | When using SystemTap, you can also easily attach to an existing function:: | |
194 | ||
195 | stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' | |
196 | ||
197 | Example: | |
198 | ||
199 | root@host ~> stap -L 'process("/usr/lib/frr/bgpd").function("bgp_update_receive")' | |
200 | process("/usr/lib/frr/bgpd").function("bgp_update_receive@bgpd/bgp_packet.c:1531") $peer:struct peer* $size:bgp_size_t $attr:struct attr $restart:_Bool $nlris:struct bgp_nlri[] $__func__:char const[] const | |
201 | ||
202 | Complete ``bgp.stp`` example using SystemTap to show BGP peer, prefix and aspath | |
203 | using ``process_update`` USDT:: | |
204 | ||
205 | global pkt_size; | |
206 | probe begin | |
207 | { | |
208 | ansi_clear_screen(); | |
209 | println("Starting..."); | |
210 | } | |
211 | probe process("/usr/lib/frr/bgpd").function("bgp_update_receive") | |
212 | { | |
213 | pkt_size <<< $size; | |
214 | } | |
215 | probe process("/usr/lib/frr/bgpd").mark("process_update") | |
216 | { | |
217 | aspath = @cast($arg6, "attr")->aspath; | |
218 | printf("> %s via %s (%s)\n", | |
219 | user_string($arg2), | |
220 | user_string(@cast($arg1, "peer")->host), | |
221 | user_string(@cast(aspath, "aspath")->str)); | |
222 | } | |
223 | probe end | |
224 | { | |
225 | if (@count(pkt_size)) | |
226 | print(@hist_linear(pkt_size, 0, 20, 2)); | |
227 | } | |
228 | ||
229 | Output:: | |
230 | ||
231 | Starting... | |
232 | > 192.168.0.0/24 via 192.168.0.1 (65534) | |
233 | > 192.168.100.1/32 via 192.168.0.1 (65534) | |
234 | > 172.16.16.1/32 via 192.168.0.1 (65534 65030) | |
235 | ^Cvalue |-------------------------------------------------- count | |
236 | 0 | 0 | |
237 | 2 | 0 | |
238 | 4 |@ 1 | |
239 | 6 | 0 | |
240 | 8 | 0 | |
241 | ~ | |
242 | 18 | 0 | |
243 | 20 | 0 | |
244 | >20 |@@@@@ 5 | |
245 | ||
246 | ||
d3f45345 QY |
247 | Concepts |
248 | -------- | |
249 | ||
250 | Tracepoints are statically defined points in code where a developer has | |
251 | determined that outside observers might gain something from knowing what is | |
252 | going on at that point. It's like logging but with the ability to dump large | |
253 | amounts of internal data with much higher performance. LTTng has a good summary | |
254 | `here <https://lttng.org/docs/#doc-what-is-tracing>`_. | |
255 | ||
256 | Each tracepoint has a "provider" and name. The provider is basically a | |
257 | namespace; for example, ``bgpd`` uses the provider name ``frr_bgp``. The name | |
258 | is arbitrary, but because providers share a global namespace on the user's | |
259 | system, all providers from FRR should be prefixed by ``frr_``. The tracepoint | |
260 | name is just the name of the event. Events are globally named by their provider | |
261 | and name. For example, the event when BGP reads a packet from a peer is | |
262 | ``frr_bgp:packet_read``. | |
263 | ||
264 | To do tracing, the tracing tool of choice is told which events to listen to. | |
265 | For example, to listen to all events from FRR's BGP implementation, you would | |
266 | enable the events ``frr_bgp:*``. In the same tracing session you could also | |
267 | choose to record all memory allocations by enabling the ``malloc`` tracepoints | |
268 | in ``libc`` as well as all kernel skb operations using the various in-kernel | |
269 | tracepoints. This allows you to build as complete a view as desired of what the | |
270 | system is doing during the tracing window (subject to what tracepoints are | |
271 | available). | |
272 | ||
273 | Of particular use are the tracepoints for FRR's internal event scheduler; | |
274 | tracing these allows you to see all events executed by all event loops for the | |
275 | target(s) in question. Here's a couple events selected from a trace of BGP | |
276 | during startup:: | |
277 | ||
278 | ... | |
279 | ||
280 | [18:41:35.750131763] (+0.000048901) host frr_libfrr:thread_call: { cpu_id = | |
281 | 1 }, { threadmaster_name = "default", function_name = "zclient_connect", | |
282 | scheduled_from = "lib/zclient.c", scheduled_on_line = 3877, thread_addr = | |
283 | 0x0, file_descriptor = 0, event_value = 0, argument_ptr = 0xA37F70, timer = | |
284 | 0 } | |
285 | ||
286 | [18:41:35.750175124] (+0.000020001) host frr_libfrr:thread_call: { cpu_id = | |
287 | 1 }, { threadmaster_name = "default", function_name = "frr_config_read_in", | |
288 | scheduled_from = "lib/libfrr.c", scheduled_on_line = 934, thread_addr = 0x0, | |
289 | file_descriptor = 0, event_value = 0, argument_ptr = 0x0, timer = 0 } | |
290 | ||
291 | [18:41:35.753341264] (+0.000010532) host frr_libfrr:thread_call: { cpu_id = | |
292 | 1 }, { threadmaster_name = "default", function_name = "bgp_event", | |
293 | scheduled_from = "bgpd/bgpd.c", scheduled_on_line = 142, thread_addr = 0x0, | |
294 | file_descriptor = 2, event_value = 2, argument_ptr = 0xE4D780, timer = 2 } | |
295 | ||
296 | [18:41:35.753404186] (+0.000004910) host frr_libfrr:thread_call: { cpu_id = | |
297 | 1 }, { threadmaster_name = "default", function_name = "zclient_read", | |
298 | scheduled_from = "lib/zclient.c", scheduled_on_line = 3891, thread_addr = | |
299 | 0x0, file_descriptor = 40, event_value = 40, argument_ptr = 0xA37F70, timer | |
300 | = 40 } | |
301 | ||
302 | ... | |
303 | ||
304 | ||
305 | Very useful for getting a time-ordered look into what the process is doing. | |
306 | ||
307 | ||
308 | Adding Tracepoints | |
309 | ------------------ | |
310 | ||
311 | Adding new tracepoints is a two step process: | |
312 | ||
313 | 1. Define the tracepoint | |
314 | 2. Use the tracepoint | |
315 | ||
316 | Tracepoint definitions state the "provider" and name of the tracepoint, along | |
317 | with any values it will produce, and how to format them. This is done with | |
318 | macros provided by LTTng. USDT probes do not use definitions and are inserted | |
319 | at the trace site with a single macro. However, to maintain support for both | |
320 | platforms, you must define an LTTng tracepoint when adding a new one. | |
321 | ``frrtrace()`` will expand to the appropriate ``DTRACE_PROBEn`` macro when USDT | |
322 | is in use. | |
323 | ||
324 | If you are adding new tracepoints to a daemon that has no tracepoints, that | |
325 | daemon's ``subdir.am`` must be updated to conditionally link ``lttng-ust``. | |
326 | Look at ``bgpd/subdir.am`` for an example of how to do this; grep for | |
327 | ``UST_LIBS``. Create new files named ``<daemon>_trace.[ch]``. Use | |
328 | ``bgpd/bgp_trace.[h]`` as boilerplate. If you are adding tracepoints to a | |
329 | daemon that already has them, look for the ``<daemon>_trace.h`` file; | |
330 | tracepoints are written here. | |
331 | ||
332 | Refer to the `LTTng developer docs | |
333 | <https://lttng.org/docs/#doc-c-application>`_ for details on how to define | |
334 | tracepoints. | |
335 | ||
336 | To use them, simply add a call to ``frrtrace()`` at the point you'd like the | |
337 | event to be emitted, like so: | |
338 | ||
339 | .. code-block:: c | |
340 | ||
341 | ... | |
342 | ||
343 | switch (type) { | |
344 | case BGP_MSG_OPEN: | |
345 | frrtrace(2, frr_bgp, open_process, peer, size); /* tracepoint */ | |
346 | atomic_fetch_add_explicit(&peer->open_in, 1, | |
347 | memory_order_relaxed); | |
348 | mprc = bgp_open_receive(peer, size); | |
349 | ||
350 | ... | |
351 | ||
352 | After recompiling this tracepoint will now be available, either as a USDT probe | |
353 | or LTTng tracepoint, depending on your compilation choice. | |
354 | ||
355 | ||
356 | trace.h | |
357 | ^^^^^^^ | |
358 | ||
359 | Because FRR supports multiple types of tracepoints, the code for creating them | |
360 | abstracts away the underlying system being used. This abstraction code is in | |
361 | ``lib/trace.h``. There are 2 function-like macros that are used for working | |
362 | with tracepoints. | |
363 | ||
364 | - ``frrtrace()`` defines tracepoints | |
365 | - ``frrtrace_enabled()`` checks whether a tracepoint is enabled | |
366 | ||
367 | There is also ``frrtracelog()``, which is used in zlog core code to make zlog | |
368 | messages available as trace events to LTTng. This should not be used elsewhere. | |
369 | ||
370 | There is additional documentation in the header. The key thing to note is that | |
371 | you should never include ``trace.h`` in source where you plan to put | |
372 | tracepoints; include the tracepoint definition header instead (e.g. | |
373 | :file:`bgp_trace.h`). | |
374 | ||
375 | ||
376 | Limitations | |
377 | ----------- | |
378 | ||
379 | Tracers do not like ``fork()`` or ``dlopen()``. LTTng has some workarounds for | |
380 | this involving interceptor libraries using ``LD_PRELOAD``. | |
381 | ||
90294611 QY |
382 | If you're running FRR in a typical daemonizing way (``-d`` to the daemons) |
383 | you'll need to run the daemons like so: | |
384 | ||
385 | .. code-block:: shell | |
386 | ||
387 | LD_PRELOAD=liblttng-ust-fork.so <daemon> | |
388 | ||
389 | ||
390 | If you're using systemd this you can accomplish this for all daemons by | |
391 | modifying ``frr.service`` like so: | |
392 | ||
393 | .. code-block:: diff | |
394 | ||
395 | --- a/frr.service | |
396 | +++ b/frr.service | |
397 | @@ -7,6 +7,7 @@ Before=network.target | |
398 | OnFailure=heartbeat-failed@%n.service | |
399 | ||
400 | [Service] | |
401 | +Environment="LD_PRELOAD=liblttng-ust-fork.so" | |
402 | Nice=-5 | |
403 | Type=forking | |
404 | NotifyAccess=all | |
405 | ||
406 | ||
d3f45345 QY |
407 | USDT tracepoints are relatively high overhead and probably shouldn't be used |
408 | for "flight recorder" functionality, i.e. enabling and passively recording all | |
409 | events for monitoring purposes. It's generally okay to use LTTng like this, | |
410 | though. |