=========================
 Tracing Ceph With LTTng
=========================

Configuring Ceph with LTTng
===========================

Build Ceph with the ``-DWITH_LTTNG`` option (default: ON)::

    ./do_cmake.sh -DWITH_LTTNG=ON

The config option for tracing must be set to true in ceph.conf.
The following options are currently available::

    event_tracing (-DWITH_EVENTTRACE)
    osd_function_tracing (-DWITH_OSD_INSTRUMENT_FUNCTIONS)
    osd_objectstore_tracing (actually filestore tracing)

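As a rough illustration of what "set to true in ceph.conf" looks like, the
sketch below prints a config fragment for a chosen set of options. The helper
name ``tracing_conf_fragment`` is hypothetical and not part of Ceph, and placing
the options under ``[global]`` is an assumption; a daemon-specific section may
suit your cluster better.

```shell
# Sketch: print a ceph.conf fragment that sets the given tracing options to true.
# tracing_conf_fragment is a hypothetical helper, not part of Ceph.
# Assumption: the options are placed in the [global] section.
tracing_conf_fragment() {
    printf '[global]\n'
    for opt in "$@"; do
        printf '%s = true\n' "$opt"
    done
}

# Example: review the output, then merge it into your ceph.conf by hand.
tracing_conf_fragment event_tracing osd_function_tracing
```

The function only prints text, so nothing is modified until you copy the
fragment into your configuration.
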
Start the LTTng session daemon::

    lttng-sessiond --daemonize

Run a vstart cluster with the trace options enabled::

    ../src/vstart.sh -d -n -l -e -o "osd_tracing = true"

List the available tracepoints::

    lttng list --userspace

You will get something like::

    PID: 100859 - Name: /path/to/ceph-osd
          pg:queue_op (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
          osd:do_osd_op_post (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
          osd:do_osd_op_pre_unknown (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
          osd:do_osd_op_pre_copy_from (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
          osd:do_osd_op_pre_copy_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)

Create a tracing session, enable the tracepoints, and start the trace::

    lttng create trace-test
    lttng enable-event --userspace 'osd:*'
    lttng start

Perform some Ceph operation::

    rados bench -p ec 5 write

Stop tracing and view the result::

    lttng stop
    lttng view

Destroy the tracing session::

    lttng destroy

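The session workflow in this section can be wrapped in a small shell function
so that a workload is always traced between a start and a stop. This is only a
sketch: ``trace_workload`` and the ``LTTNG`` override variable are hypothetical
names introduced here, while ``create``, ``enable-event``, ``start`` and
``stop`` are standard ``lttng`` subcommands.

```shell
#!/bin/sh
# Sketch: record one workload inside an LTTng session.
# LTTNG is a hypothetical override for dry runs (for example LTTNG=echo).
LTTNG="${LTTNG:-lttng}"

trace_workload() {
    session="$1"   # session name, e.g. trace-test
    events="$2"    # tracepoint pattern, e.g. 'osd:*'
    shift 2        # the remaining arguments are the workload command

    "$LTTNG" create "$session" || return 1
    "$LTTNG" enable-event --userspace "$events" || return 1
    "$LTTNG" start || return 1

    "$@"           # run the workload while tracing is active
    status=$?

    "$LTTNG" stop  # stop even if the workload failed
    return "$status"
}

# Example (requires a running lttng-sessiond and a Ceph cluster):
#   trace_workload trace-test 'osd:*' rados bench -p ec 5 write
#   lttng view
#   lttng destroy trace-test
```

Setting ``LTTNG=echo`` before calling the function prints the commands instead
of running them, which may help when checking the sequence.
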

=========================
 Tracing Ceph With Blkin
=========================

Ceph can use Blkin, a library created by Marios Kogias and others,
which enables tracking a specific request from the time it enters
the system at higher levels until it is finally served by RADOS.

In general, Blkin implements the Dapper_ tracing semantics
in order to show the causal relationships between the different
processing phases that an IO request may trigger. The goal is an
end-to-end visualisation of the request's route in the system,
accompanied by information concerning latencies in each processing
phase. Thanks to LTTng, this can happen with minimal overhead and
in real time. The LTTng traces can then be visualized with Twitter's
Zipkin_.

.. _Dapper: http://static.googleusercontent.com/media/research.google.com/el//pubs/archive/36356.pdf
.. _Zipkin: https://zipkin.io/


Configuring Ceph with Blkin
===========================

Build Ceph with the ``-DWITH_BLKIN`` option (which requires ``-DWITH_LTTNG``)::

    ./do_cmake.sh -DWITH_LTTNG=ON -DWITH_BLKIN=ON

The config option for Blkin must be set to true in ceph.conf.
The option used in the examples in this document is::

    rbd_blkin_trace_all

It's easy to test Ceph's Blkin tracing. Let's assume you don't have
Ceph already running, and you compiled Ceph with Blkin support but
you didn't install it. Then launch Ceph with the ``vstart.sh`` script
in Ceph's src directory so you can see the possible tracepoints::

    OSD=3 MON=3 RGW=1 ../src/vstart.sh -n -o "rbd_blkin_trace_all"
    lttng list --userspace

You'll see something like the following::

    PID: 8987 - Name: ./ceph-osd
          zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)

    PID: 8407 - Name: ./ceph-mon
          zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
          lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)

Next, stop Ceph so that the tracepoints can be enabled::

    ../src/stop.sh

Start up an LTTng session and enable the tracepoints::

    lttng create blkin-test
    lttng enable-event --userspace zipkin:timestamp
    lttng enable-event --userspace zipkin:keyval_integer
    lttng enable-event --userspace zipkin:keyval_string
    lttng start

Then start up Ceph again::

    OSD=3 MON=3 RGW=1 ../src/vstart.sh -n -o "rbd_blkin_trace_all"

You may want to check that Ceph is up::

    ceph status

Now put an object in using rados, check that it arrived, get it back, and remove it::

    ceph osd pool create test-blkin
    rados put test-object-1 ../src/vstart.sh --pool=test-blkin
    rados -p test-blkin ls
    ceph osd map test-blkin test-object-1
    rados get test-object-1 ./vstart-copy.sh --pool=test-blkin
    rados rm test-object-1 --pool=test-blkin

You could also use the example in ``examples/librados/`` or ``rados bench``.

Then stop the LTTng session and see what was collected::

    lttng stop
    lttng view

You'll see something like::

    [15:33:08.884275486] (+0.000225472) ubuntu zipkin:timestamp: { cpu_id = 53 }, { trace_name = "op", service_name = "Objecter", port_no = 0, ip = "0.0.0.0", trace_id = 5485970765435202833, span_id = 5485970765435202833, parent_span_id = 0, event = "osd op reply" }
    [15:33:08.884614135] (+0.000002839) ubuntu zipkin:keyval_integer: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "tid", val = 2 }
    [15:33:08.884616431] (+0.000002296) ubuntu zipkin:keyval_string: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "entity type", val = "client" }

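To get a quick overview of a long trace, the text output can be summarised per
tracepoint. The following is a minimal sketch: ``count_events`` is a
hypothetical helper, and it assumes that the event name is the fourth
whitespace-separated field, as in the sample output above.

```shell
# Sketch: count events per tracepoint in text trace output read from stdin.
# count_events is a hypothetical helper; it assumes the event name is the
# fourth whitespace-separated field and ends with a colon.
count_events() {
    awk 'NF >= 4 {
        name = $4
        sub(/:$/, "", name)      # drop the trailing colon after the event name
        count[name]++
    }
    END {
        for (name in count) printf "%d %s\n", count[name], name
    }' | sort -k2
}

# Example (requires an LTTng session with recorded events):
#   lttng view | count_events
```

Each output line holds a count followed by the tracepoint name, sorted by name.
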
One of the points of using Blkin is so that you can look at the traces
using Zipkin. Run Zipkin both as a tracepoint collector and as a web
service. The executable jar runs a collector on port 9410 and
the web interface on port 9411.

Download the Zipkin package::

    git clone https://github.com/openzipkin/zipkin && cd zipkin
    wget -O zipkin.jar 'https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec'

Or launch the Docker image::

    docker run -d -p 9411:9411 openzipkin/zipkin


Show Ceph's Blkin Traces in Zipkin-web
======================================

Download the babeltrace-zipkin project. This project takes the traces
generated with Blkin and sends them to a Zipkin collector using scribe::

    git clone https://github.com/vears91/babeltrace-zipkin

Send the LTTng data to Zipkin::

    python3 babeltrace_zipkin.py ${lttng-traces-dir}/${blkin-test}/ust/uid/0/64-bit/ -p ${zipkin-collector-port(9410 by default)} -s ${zipkin-collector-ip}

Example::

    python3 babeltrace_zipkin.py ~/lttng-traces-dir/blkin-test-20150225-160222/ust/uid/0/64-bit/ -p 9410 -s 127.0.0.1

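Because the session directory name carries a timestamp, it can be convenient to
look up the newest one instead of typing it. The sketch below assumes the
traces live under a base directory as ``<session>-<date>-<time>``, as in the
example above; ``latest_trace_dir`` is a hypothetical helper name.

```shell
# Sketch: print the newest trace directory for a session.
# latest_trace_dir is a hypothetical helper. It assumes directories are named
# <session>-<date>-<time>, so that name order matches chronological order.
latest_trace_dir() {
    base="$1"      # e.g. ~/lttng-traces-dir
    session="$2"   # e.g. blkin-test
    newest=""
    # The shell expands the glob in sorted order, so the last match is the newest.
    for dir in "$base"/"$session"-*/; do
        [ -d "$dir" ] && newest="${dir%/}"
    done
    [ -n "$newest" ] || return 1   # no matching directory found
    printf '%s\n' "$newest"
}

# Example:
#   trace="$(latest_trace_dir ~/lttng-traces-dir blkin-test)"
#   python3 babeltrace_zipkin.py "$trace/ust/uid/0/64-bit/" -p 9410 -s 127.0.0.1
```

The function returns a non-zero status when no directory matches, so a calling
script can stop before invoking ``babeltrace_zipkin.py`` with an empty path.
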
Check the Ceph traces on the web page::

    Browse http://${zipkin-collector-ip}:9411