=========================
 Tracing Ceph With LTTng
=========================

Configuring Ceph with LTTng
===========================

Use the ``-DWITH_LTTNG`` option (default: ON)::

  ./do_cmake -DWITH_LTTNG=ON

The config option for each tracing provider must be set to true in ceph.conf.
The following options are currently available::

  bluestore_tracing
  event_tracing (-DWITH_EVENTTRACE)
  osd_function_tracing (-DWITH_OSD_INSTRUMENT_FUNCTIONS)
  osd_objectstore_tracing (actually filestore tracing)
  rbd_tracing
  osd_tracing
  rados_tracing
  rgw_op_tracing
  rgw_rados_tracing

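As a minimal sketch of what "set to true in ceph.conf" can look like, the snippet below writes a fragment that enables two of the options listed above. The ``[global]`` section and the scratch file are assumptions made for illustration; merge the lines into your real ceph.conf by hand.

```shell
# Scratch file for the fragment (hypothetical; override with CONF=/some/path).
conf=${CONF:-$(mktemp)}

# Assumption: [global] is used so the options apply to every daemon and client.
cat > "$conf" <<'EOF'
[global]
osd_tracing = true
rados_tracing = true
EOF

# Show the fragment so it can be reviewed before merging.
cat "$conf"
```

With vstart the same effect is achieved by passing each line through ``-o``, as the vstart example in this document does.
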
Testing Trace
=============

Start the LTTng session daemon::

  lttng-sessiond --daemonize

Run a vstart cluster with the trace options enabled::

  ../src/vstart.sh -d -n -l -e -o "osd_tracing = true"

List the available tracepoints::

  lttng list --userspace

You will get something like::

  UST events:
  -------------
  PID: 100859 - Name: /path/to/ceph-osd
        pg:queue_op (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
        osd:do_osd_op_post (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
        osd:do_osd_op_pre_unknown (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
        osd:do_osd_op_pre_copy_from (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
        osd:do_osd_op_pre_copy_get (loglevel: TRACE_DEBUG_LINE (13)) (type: tracepoint)
        ...

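The listing repeats the same tracepoints for every running process. A small filter, sketched here with the hypothetical helper name ``tracepoint_names``, reduces it to a sorted list of unique event names. It assumes the text layout shown above, where each event line ends in ``(type: tracepoint)``.

```shell
# Print the unique tracepoint names found in `lttng list --userspace` output.
# Reads the listing on stdin; relies on the "(type: tracepoint)" suffix.
tracepoint_names() {
    awk '/type: tracepoint/ { print $1 }' | sort -u
}
```

Typical use would be ``lttng list --userspace | tracepoint_names``.
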
Create a tracing session, enable the tracepoints, and start tracing::

  lttng create trace-test
  lttng enable-event --userspace 'osd:*'
  lttng start

Perform some Ceph operation::

  rados bench -p ec 5 write

Stop tracing and view the result::

  lttng stop
  lttng view

Destroy the tracing session::

  lttng destroy

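The create, enable, start, stop, view, destroy sequence above can be wrapped around a single workload. The function below is a sketch, not part of Ceph or LTTng: ``trace_run`` and the ``LTTNG`` override variable are hypothetical names, and setting ``LTTNG=echo`` only prints the commands instead of running them.

```shell
# Record userspace events matching PATTERN into SESSION while COMMAND runs.
# Usage: trace_run SESSION PATTERN COMMAND [ARG...]
trace_run() {
    session=$1
    pattern=$2
    shift 2
    lttng_bin=${LTTNG:-lttng}   # hypothetical override, e.g. LTTNG=echo for a dry run

    "$lttng_bin" create "$session" || return 1
    # If enabling or starting fails, drop the half-configured session.
    "$lttng_bin" enable-event --userspace "$pattern" \
        || { "$lttng_bin" destroy "$session"; return 1; }
    "$lttng_bin" start \
        || { "$lttng_bin" destroy "$session"; return 1; }

    "$@"                        # the workload to trace
    status=$?

    "$lttng_bin" stop
    "$lttng_bin" view
    "$lttng_bin" destroy "$session"
    return "$status"            # report the workload's exit status
}
```

For the example in this section the call would be ``trace_run trace-test 'osd:*' rados bench -p ec 5 write``.
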
=========================
 Tracing Ceph With Blkin
=========================

Ceph can use Blkin, a library created by Marios Kogias and others,
which enables tracking a specific request from the time it enters
the system at higher levels until it is finally served by RADOS.

In general, Blkin implements the Dapper_ tracing semantics
in order to show the causal relationships between the different
processing phases that an IO request may trigger. The goal is an
end-to-end visualisation of the request's route in the system,
accompanied by information concerning latencies in each processing
phase. Thanks to LTTng this can happen with minimal overhead and
in real time. The LTTng traces can then be visualized with Twitter's
Zipkin_.

.. _Dapper: http://static.googleusercontent.com/media/research.google.com/el//pubs/archive/36356.pdf
.. _Zipkin: https://zipkin.io/


Configuring Ceph with Blkin
===========================

Use the ``-DWITH_BLKIN`` option (which requires ``-DWITH_LTTNG``)::

  ./do_cmake -DWITH_LTTNG=ON -DWITH_BLKIN=ON

The config option for Blkin must be set to true in ceph.conf.
The following options are currently available::

  rbd_blkin_trace_all
  osd_blkin_trace_all
  osdc_blkin_trace_all

Testing Blkin
=============

It's easy to test Ceph's Blkin tracing. Let's assume you don't have
Ceph already running, and you compiled Ceph with Blkin support but
you didn't install it. Then launch Ceph with the ``vstart.sh`` script
in Ceph's src directory so you can see the possible tracepoints::

  OSD=3 MON=3 RGW=1 ../src/vstart.sh -n -o "rbd_blkin_trace_all = true"
  lttng list --userspace

You'll see something like the following::

  UST events:
  -------------
  PID: 8987 - Name: ./ceph-osd
        zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)

  PID: 8407 - Name: ./ceph-mon
        zipkin:timestamp (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_integer (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        zipkin:keyval_string (loglevel: TRACE_WARNING (4)) (type: tracepoint)
        lttng_ust_tracelog:TRACE_DEBUG (loglevel: TRACE_DEBUG (14)) (type: tracepoint)

  ...

Next, stop Ceph so that the tracepoints can be enabled::

  ../src/stop.sh

Start up an LTTng session and enable the tracepoints::

  lttng create blkin-test
  lttng enable-event --userspace zipkin:timestamp
  lttng enable-event --userspace zipkin:keyval_integer
  lttng enable-event --userspace zipkin:keyval_string
  lttng start

Then start up Ceph again::

  OSD=3 MON=3 RGW=1 ../src/vstart.sh -n -o "rbd_blkin_trace_all = true"

You may want to check that Ceph is up::

  ceph status

Now put something in using rados, check that it made it, get it back, and remove it::

  ceph osd pool create test-blkin
  rados put test-object-1 ../src/vstart.sh --pool=test-blkin
  rados -p test-blkin ls
  ceph osd map test-blkin test-object-1
  rados get test-object-1 ./vstart-copy.sh --pool=test-blkin
  md5sum ../src/vstart.sh ./vstart-copy.sh
  rados rm test-object-1 --pool=test-blkin

You could also use the example in ``examples/librados/`` or ``rados bench``.

Then stop the LTTng session and see what was collected::

  lttng stop
  lttng view

You'll see something like::

  [15:33:08.884275486] (+0.000225472) ubuntu zipkin:timestamp: { cpu_id = 53 }, { trace_name = "op", service_name = "Objecter", port_no = 0, ip = "0.0.0.0", trace_id = 5485970765435202833, span_id = 5485970765435202833, parent_span_id = 0, event = "osd op reply" }
  [15:33:08.884614135] (+0.000002839) ubuntu zipkin:keyval_integer: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "tid", val = 2 }
  [15:33:08.884616431] (+0.000002296) ubuntu zipkin:keyval_string: { cpu_id = 10 }, { trace_name = "", service_name = "Messenger", port_no = 6805, ip = "0.0.0.0", trace_id = 7381732770245808782, span_id = 7387710183742669839, parent_span_id = 1205040135881905799, key = "entity type", val = "client" }


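Events that belong to the same request share a ``trace_id``, so a quick way to follow one request in the text output is to filter on that field. The helper name ``events_for_trace`` below is hypothetical, and the sketch assumes the ``trace_id = <number>,`` formatting shown in the sample above.

```shell
# Print only the events of one trace; expects `lttng view` text on stdin.
# The trailing comma keeps a shorter id from matching a longer one.
events_for_trace() {
    grep -F "trace_id = $1,"
}
```

For the sample above, ``lttng view | events_for_trace 7381732770245808782`` would keep the two ``Messenger`` events and drop the ``Objecter`` one.
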
Install Zipkin
===============

One of the points of using Blkin is so that you can look at the traces
using Zipkin. Run Zipkin both as a trace collector and as a web
service. The executable jar runs a collector on port 9410 and
the web interface on port 9411.

Download the Zipkin package::

  git clone https://github.com/openzipkin/zipkin && cd zipkin
  wget -O zipkin.jar 'https://search.maven.org/remote_content?g=io.zipkin.java&a=zipkin-server&v=LATEST&c=exec'
  java -jar zipkin.jar

Or, launch the Docker image::

  docker run -d -p 9411:9411 openzipkin/zipkin

Show Ceph's Blkin Traces in Zipkin-web
======================================

Download the babeltrace-zipkin project. This project takes the traces
generated with Blkin and sends them to a Zipkin collector using scribe::

  git clone https://github.com/vears91/babeltrace-zipkin
  cd babeltrace-zipkin

Send the LTTng data to Zipkin::

  python3 babeltrace_zipkin.py ${lttng-traces-dir}/${blkin-test}/ust/uid/0/64-bit/ -p ${zipkin-collector-port(9410 by default)} -s ${zipkin-collector-ip}

Example::

  python3 babeltrace_zipkin.py ~/lttng-traces-dir/blkin-test-20150225-160222/ust/uid/0/64-bit/ -p 9410 -s 127.0.0.1

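The trace path in the example contains a timestamped session directory and the UID of the traced processes (``0`` above, that is, root). The sketch below builds that path for the newest matching session and the current user. ``latest_ust_dir`` is a hypothetical helper, and the ``ust/uid/<uid>/64-bit`` layout is assumed from the example above; it may differ on other setups.

```shell
# Print the userspace trace directory of the newest session directory that
# starts with NAME under ROOT, using the current user's UID.
# Usage: latest_ust_dir ROOT NAME
latest_ust_dir() {
    root=$1
    name=$2
    # Newest first by modification time; session directory names carry a timestamp suffix.
    newest=$(ls -dt "$root/$name"-* 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s/ust/uid/%s/64-bit\n' "$newest" "$(id -u)"
}
```

The result can then be passed as the first argument to ``babeltrace_zipkin.py``, for example ``python3 babeltrace_zipkin.py "$(latest_ust_dir ~/lttng-traces-dir blkin-test)" -p 9410 -s 127.0.0.1``.
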
Check the Ceph traces in the web interface::

  Browse http://${zipkin-collector-ip}:9411
  Click "Find traces"