=======================
 Logging and Debugging
=======================

Typically, when you add debugging to your Ceph configuration, you do so at
runtime. You can also add Ceph debug logging to your Ceph configuration file if
you are encountering issues when starting your cluster. You may view Ceph log
files under ``/var/log/ceph`` (the default location).

.. tip:: When debug output slows down your system, the latency can hide
   race conditions.

Logging is resource intensive. If you are encountering a problem in a specific
area of your cluster, enable logging for that area of the cluster. For example,
if your OSDs are running fine, but your metadata servers are not, you should
start by enabling debug logging for the specific metadata server instance(s)
giving you trouble. Enable logging for each subsystem as needed.

.. important:: Verbose logging can generate over 1GB of data per hour. If your
   OS disk reaches its capacity, the node will stop working.

If you enable or increase the rate of Ceph logging, ensure that you have
sufficient disk space on your OS disk. See `Accelerating Log Rotation`_ for
details on rotating log files. When your system is running well, remove
unnecessary debugging settings to ensure your cluster runs optimally. Logging
debug output messages is relatively slow, and a waste of resources when
operating your cluster.

See `Subsystem, Log and Debug Settings`_ for details on available settings.

Runtime
=======

If you would like to see the configuration settings at runtime, you must log
in to a host with a running daemon and execute the following::

    ceph daemon {daemon-name} config show | less

For example::

    ceph daemon osd.0 config show | less

To activate Ceph's debugging output (*i.e.*, ``dout()``) at runtime, use the
``ceph tell`` command to inject arguments into the runtime configuration::

    ceph tell {daemon-type}.{daemon id or *} config set {name} {value}

Replace ``{daemon-type}`` with one of ``osd``, ``mon`` or ``mds``. You may apply
the runtime setting to all daemons of a particular type with ``*``, or specify
a specific daemon's ID. For example, to increase
debug logging for a ``ceph-osd`` daemon named ``osd.0``, execute the following::

    ceph tell osd.0 config set debug_osd 0/5

The ``ceph tell`` command goes through the monitors. If you cannot bind to the
monitor, you can still make the change by logging in to the host of the daemon
whose configuration you want to change and using ``ceph daemon``.
For example::

    sudo ceph daemon osd.0 config set debug_osd 0/5

See `Subsystem, Log and Debug Settings`_ for details on available settings.

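Because verbose logging is costly, it can help to script the "raise the level,
reproduce the problem, restore the level" cycle. The following is a minimal
sketch rather than an official tool: ``run`` and ``set_debug`` are hypothetical
helper names, and ``DRY_RUN`` is an assumed environment variable that makes the
script print each ``ceph tell`` command instead of executing it.

.. code-block:: bash

    #!/usr/bin/env bash
    # Sketch only: run() and set_debug() are hypothetical helpers.
    # With DRY_RUN=1 the command is printed instead of executed.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$*"
        else
            "$@"
        fi
    }

    # Usage: set_debug DAEMON OPTION LEVEL
    set_debug() {
        run ceph tell "$1" config set "$2" "$3"
    }

    # Preview the commands without touching the cluster.
    DRY_RUN=1 set_debug osd.0 debug_osd 20/20   # verbose while reproducing
    DRY_RUN=1 set_debug osd.0 debug_osd 1/5     # back to the documented default

Running the script as shown only prints the two commands. Drop ``DRY_RUN=1``
on a host that can reach the monitors to apply them.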

Boot Time
=========

To activate Ceph's debugging output (*i.e.*, ``dout()``) at boot time, you must
add settings to your Ceph configuration file. Subsystems common to each daemon
may be set under ``[global]`` in your configuration file. Subsystems for
particular daemons are set under the daemon section in your configuration file
(*e.g.*, ``[mon]``, ``[osd]``, ``[mds]``). For example::

    [global]
        debug ms = 1/5

    [mon]
        debug mon = 20
        debug paxos = 1/5
        debug auth = 2

    [osd]
        debug osd = 1/5
        debug filestore = 1/5
        debug journal = 1
        debug monc = 5/20

    [mds]
        debug mds = 1
        debug mds balancer = 1


See `Subsystem, Log and Debug Settings`_ for details.


Accelerating Log Rotation
=========================

If your OS disk is relatively full, you can accelerate log rotation by modifying
the Ceph log rotation file at ``/etc/logrotate.d/ceph``. Add a size setting
after the rotation frequency to accelerate log rotation (via cronjob) if your
logs exceed the size setting. For example, the default setting looks like
this::

    rotate 7
    weekly
    compress
    sharedscripts

Modify it by adding a ``size`` setting. ::

    rotate 7
    weekly
    size 500M
    compress
    sharedscripts

Then, start the crontab editor for your user space. ::

    crontab -e

Finally, add an entry to check the ``/etc/logrotate.d/ceph`` file. ::

    30 * * * * /usr/sbin/logrotate /etc/logrotate.d/ceph >/dev/null 2>&1

The preceding example checks the ``/etc/logrotate.d/ceph`` file at 30 minutes
past every hour, that is, once per hour. To check every 30 minutes instead,
use ``*/30`` in the minute field.


Valgrind
========

Debugging may also require you to track down memory and threading issues.
You can run a single daemon, a type of daemon, or the whole cluster with
Valgrind. You should only use Valgrind when developing or debugging Ceph.
Valgrind is computationally expensive, and will slow down your system otherwise.
Valgrind messages are logged to ``stderr``.


Subsystem, Log and Debug Settings
=================================

In most cases, you will enable debug logging output via subsystems.

Ceph Subsystems
---------------

Each subsystem has a logging level for its output logs, and for its logs
in-memory. You may set different values for each of these subsystems by setting
a log file level and a memory level for debug logging. Ceph's logging levels
operate on a scale of ``1`` to ``20``, where ``1`` is terse and ``20`` is
verbose [#]_ . In general, the logs in-memory are not sent to the output log unless:

- a fatal signal is raised,
- an ``assert`` in source code is triggered, or
- you request it. For details, see the `documentation on the admin socket <http://docs.ceph.com/docs/master/man/8/ceph/#daemon>`_.

A debug logging setting can take a single value for the log level and the
memory level, which sets them both to the same value. For example, if you
specify ``debug ms = 5``, Ceph will treat it as a log level and a memory level
of ``5``. You may also specify them separately. The first setting is the log
level, and the second setting is the memory level. You must separate them with
a forward slash (/). For example, if you want to set the ``ms`` subsystem's
debug logging level to ``1`` and its memory level to ``5``, you would specify it
as ``debug ms = 1/5``. The general form is:



.. code-block:: ini

    debug {subsystem} = {log-level}/{memory-level}
    #for example
    debug mds balancer = 1/20

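The single-value and ``log/memory`` forms described above can be illustrated
with a few lines of shell. This is a minimal sketch of the splitting rule only;
``parse_debug_level`` is a hypothetical helper name and is not part of Ceph.

.. code-block:: bash

    #!/usr/bin/env bash
    # Split a debug value such as "1/5" or "5" into its log and memory levels.
    # parse_debug_level is a hypothetical helper used for illustration.
    parse_debug_level() {
        local value=$1
        local log_level=${value%%/*}   # text before the first slash
        local mem_level=${value#*/}    # text after the slash, or the whole
                                       # value when there is no slash
        echo "log=${log_level} memory=${mem_level}"
    }

    parse_debug_level 1/5   # prints: log=1 memory=5
    parse_debug_level 5     # prints: log=5 memory=5

A single value yields the same number for both levels, which matches the
behavior described for ``debug ms = 5``.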

The following table provides a list of Ceph subsystems and their default log and
memory levels. Once you complete your logging efforts, restore the subsystems
to their default level or to a level suitable for normal operations.


+--------------------+-----------+--------------+
| Subsystem          | Log Level | Memory Level |
+====================+===========+==============+
| ``default``        | 0         | 5            |
+--------------------+-----------+--------------+
| ``lockdep``        | 0         | 1            |
+--------------------+-----------+--------------+
| ``context``        | 0         | 1            |
+--------------------+-----------+--------------+
| ``crush``          | 1         | 1            |
+--------------------+-----------+--------------+
| ``mds``            | 1         | 5            |
+--------------------+-----------+--------------+
| ``mds balancer``   | 1         | 5            |
+--------------------+-----------+--------------+
| ``mds locker``     | 1         | 5            |
+--------------------+-----------+--------------+
| ``mds log``        | 1         | 5            |
+--------------------+-----------+--------------+
| ``mds log expire`` | 1         | 5            |
+--------------------+-----------+--------------+
| ``mds migrator``   | 1         | 5            |
+--------------------+-----------+--------------+
| ``buffer``         | 0         | 1            |
+--------------------+-----------+--------------+
| ``timer``          | 0         | 1            |
+--------------------+-----------+--------------+
| ``filer``          | 0         | 1            |
+--------------------+-----------+--------------+
| ``striper``        | 0         | 1            |
+--------------------+-----------+--------------+
| ``objecter``       | 0         | 1            |
+--------------------+-----------+--------------+
| ``rados``          | 0         | 5            |
+--------------------+-----------+--------------+
| ``rbd``            | 0         | 5            |
+--------------------+-----------+--------------+
| ``rbd mirror``     | 0         | 5            |
+--------------------+-----------+--------------+
| ``rbd replay``     | 0         | 5            |
+--------------------+-----------+--------------+
| ``journaler``      | 0         | 5            |
+--------------------+-----------+--------------+
| ``objectcacher``   | 0         | 5            |
+--------------------+-----------+--------------+
| ``client``         | 0         | 5            |
+--------------------+-----------+--------------+
| ``osd``            | 1         | 5            |
+--------------------+-----------+--------------+
| ``optracker``      | 0         | 5            |
+--------------------+-----------+--------------+
| ``objclass``       | 0         | 5            |
+--------------------+-----------+--------------+
| ``filestore``      | 1         | 3            |
+--------------------+-----------+--------------+
| ``journal``        | 1         | 3            |
+--------------------+-----------+--------------+
| ``ms``             | 0         | 5            |
+--------------------+-----------+--------------+
| ``mon``            | 1         | 5            |
+--------------------+-----------+--------------+
| ``monc``           | 0         | 10           |
+--------------------+-----------+--------------+
| ``paxos``          | 1         | 5            |
+--------------------+-----------+--------------+
| ``tp``             | 0         | 5            |
+--------------------+-----------+--------------+
| ``auth``           | 1         | 5            |
+--------------------+-----------+--------------+
| ``crypto``         | 1         | 5            |
+--------------------+-----------+--------------+
| ``finisher``       | 1         | 1            |
+--------------------+-----------+--------------+
| ``reserver``       | 1         | 1            |
+--------------------+-----------+--------------+
| ``heartbeatmap``   | 1         | 5            |
+--------------------+-----------+--------------+
| ``perfcounter``    | 1         | 5            |
+--------------------+-----------+--------------+
| ``rgw``            | 1         | 5            |
+--------------------+-----------+--------------+
| ``rgw sync``       | 1         | 5            |
+--------------------+-----------+--------------+
| ``civetweb``       | 1         | 10           |
+--------------------+-----------+--------------+
| ``javaclient``     | 1         | 5            |
+--------------------+-----------+--------------+
| ``asok``           | 1         | 5            |
+--------------------+-----------+--------------+
| ``throttle``       | 1         | 1            |
+--------------------+-----------+--------------+
| ``refs``           | 0         | 0            |
+--------------------+-----------+--------------+
| ``compressor``     | 1         | 5            |
+--------------------+-----------+--------------+
| ``bluestore``      | 1         | 5            |
+--------------------+-----------+--------------+
| ``bluefs``         | 1         | 5            |
+--------------------+-----------+--------------+
| ``bdev``           | 1         | 3            |
+--------------------+-----------+--------------+
| ``kstore``         | 1         | 5            |
+--------------------+-----------+--------------+
| ``rocksdb``        | 4         | 5            |
+--------------------+-----------+--------------+
| ``leveldb``        | 4         | 5            |
+--------------------+-----------+--------------+
| ``memdb``          | 4         | 5            |
+--------------------+-----------+--------------+
| ``fuse``           | 1         | 5            |
+--------------------+-----------+--------------+
| ``mgr``            | 1         | 5            |
+--------------------+-----------+--------------+
| ``mgrc``           | 1         | 5            |
+--------------------+-----------+--------------+
| ``dpdk``           | 1         | 5            |
+--------------------+-----------+--------------+
| ``eventtrace``     | 1         | 5            |
+--------------------+-----------+--------------+


Logging Settings
----------------

Logging and debugging settings are not required in a Ceph configuration file,
but you may override default settings as needed. Ceph supports the following
settings:


``log file``

:Description: The location of the logging file for your cluster.
:Type: String
:Required: No
:Default: ``/var/log/ceph/$cluster-$name.log``


``log max new``

:Description: The maximum number of new log files.
:Type: Integer
:Required: No
:Default: ``1000``


``log max recent``

:Description: The maximum number of recent events to include in a log file.
:Type: Integer
:Required: No
:Default: ``10000``


``log to stderr``

:Description: Determines if logging messages should appear in ``stderr``.
:Type: Boolean
:Required: No
:Default: ``true``


``err to stderr``

:Description: Determines if error messages should appear in ``stderr``.
:Type: Boolean
:Required: No
:Default: ``true``


``log to syslog``

:Description: Determines if logging messages should appear in ``syslog``.
:Type: Boolean
:Required: No
:Default: ``false``


``err to syslog``

:Description: Determines if error messages should appear in ``syslog``.
:Type: Boolean
:Required: No
:Default: ``false``


``log flush on exit``

:Description: Determines if Ceph should flush the log files after exit.
:Type: Boolean
:Required: No
:Default: ``true``


``clog to monitors``

:Description: Determines if ``clog`` messages should be sent to monitors.
:Type: Boolean
:Required: No
:Default: ``true``


``clog to syslog``

:Description: Determines if ``clog`` messages should be sent to syslog.
:Type: Boolean
:Required: No
:Default: ``false``


``mon cluster log to syslog``

:Description: Determines if the cluster log should be output to the syslog.
:Type: Boolean
:Required: No
:Default: ``false``


``mon cluster log file``

:Description: The locations of the cluster's log files. There are two channels in
              Ceph: ``cluster`` and ``audit``. This option is a mapping from
              channels to the log files that receive each channel's log
              entries. The ``default`` entry is a fallback mapping for
              channels not explicitly specified. So, the following default
              setting sends the cluster log to ``$cluster.log`` and the audit
              log to ``$cluster.audit.log``, where ``$cluster`` is replaced
              with the actual cluster name.
:Type: String
:Required: No
:Default: ``default=/var/log/ceph/$cluster.$channel.log,cluster=/var/log/ceph/$cluster.log``

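The channel mapping used by ``mon cluster log file`` can be traced by hand with
a short script. This is an illustrative sketch of the lookup rule described
above (an explicit channel entry wins, otherwise ``default`` applies), not
Ceph's own implementation. ``resolve_cluster_log`` is a hypothetical helper
name, and the cluster name ``ceph`` is assumed when none is passed.

.. code-block:: bash

    #!/usr/bin/env bash
    # Resolve the log file for a channel from a "channel=path,..." mapping.
    # resolve_cluster_log is a hypothetical helper used for illustration.
    resolve_cluster_log() {
        local mapping=$1 channel=$2 cluster=${3:-ceph}
        local entry key path fallback="" result=""
        local IFS=,                      # split the mapping on commas
        for entry in $mapping; do
            key=${entry%%=*}             # text before the first "="
            path=${entry#*=}             # text after the first "="
            if [ "$key" = "$channel" ]; then
                result=$path
            elif [ "$key" = "default" ]; then
                fallback=$path
            fi
        done
        result=${result:-$fallback}      # fall back to the "default" entry
        # Expand the $cluster and $channel metavariables.
        result=${result//\$cluster/$cluster}
        result=${result//\$channel/$channel}
        echo "$result"
    }

    mapping='default=/var/log/ceph/$cluster.$channel.log,cluster=/var/log/ceph/$cluster.log'
    resolve_cluster_log "$mapping" cluster   # prints: /var/log/ceph/ceph.log
    resolve_cluster_log "$mapping" audit     # prints: /var/log/ceph/ceph.audit.log

The ``audit`` channel has no entry of its own in the default mapping, so it
resolves through the ``default`` entry.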


OSD
---


``osd debug drop ping probability``

:Description: ?
:Type: Double
:Required: No
:Default: 0


``osd debug drop ping duration``

:Description:
:Type: Integer
:Required: No
:Default: 0

``osd debug drop pg create probability``

:Description:
:Type: Integer
:Required: No
:Default: 0

``osd debug drop pg create duration``

:Description: ?
:Type: Double
:Required: No
:Default: 1


``osd min pg log entries``

:Description: The minimum number of log entries for placement groups.
:Type: 32-bit Unsigned Integer
:Required: No
:Default: 250


``osd op log threshold``

:Description: The number of op log messages to show in one pass.
:Type: Integer
:Required: No
:Default: 5



Filestore
---------

``filestore debug omap check``

:Description: Debugging check on synchronization. This is an expensive operation.
:Type: Boolean
:Required: No
:Default: ``false``


MDS
---


``mds debug scatterstat``

:Description: Ceph will assert that various recursive stat invariants are true
              (for developers only).

:Type: Boolean
:Required: No
:Default: ``false``


``mds debug frag``

:Description: Ceph will verify directory fragmentation invariants when
              convenient (developers only).

:Type: Boolean
:Required: No
:Default: ``false``


``mds debug auth pins``

:Description: The debug auth pin invariants (for developers only).
:Type: Boolean
:Required: No
:Default: ``false``


``mds debug subtrees``

:Description: The debug subtree invariants (for developers only).
:Type: Boolean
:Required: No
:Default: ``false``



RADOS Gateway
-------------


``rgw log nonexistent bucket``

:Description: Determines whether to log requests for non-existent buckets.
:Type: Boolean
:Required: No
:Default: ``false``


``rgw log object name``

:Description: The format of the log object name. See the ``date`` man page for
              the format codes (a subset is supported).
:Type: String
:Required: No
:Default: ``%Y-%m-%d-%H-%i-%n``


``rgw log object name utc``

:Description: Determines whether the log object name uses UTC time.
:Type: Boolean
:Required: No
:Default: ``false``


``rgw enable ops log``

:Description: Enables logging of every RGW operation.
:Type: Boolean
:Required: No
:Default: ``true``


``rgw enable usage log``

:Description: Enable logging of RGW's bandwidth usage.
:Type: Boolean
:Required: No
:Default: ``false``


``rgw usage log flush threshold``

:Description: Threshold to flush pending log data.
:Type: Integer
:Required: No
:Default: ``1024``


``rgw usage log tick interval``

:Description: Flush pending log data every ``s`` seconds.
:Type: Integer
:Required: No
:Default: 30


``rgw intent log object name``

:Description:
:Type: String
:Required: No
:Default: ``%Y-%m-%d-%i-%n``


``rgw intent log object name utc``

:Description: Include a UTC timestamp in the intent log object name.
:Type: Boolean
:Required: No
:Default: ``false``

.. [#] In rare cases there are levels above 20; they are extremely verbose.