============================
Repairing PG Inconsistencies
============================
Sometimes a Placement Group (PG) might become ``inconsistent``. To return the PG
to an ``active+clean`` state, you must first determine which of the PGs has become
inconsistent and then run the ``pg repair`` command on it. This page contains
commands for diagnosing PGs and the command for repairing PGs that have become
inconsistent.

.. highlight:: console

Commands for Diagnosing PG Problems
===================================
The commands in this section provide various ways of diagnosing broken PGs.

To see a high-level (low-detail) overview of Ceph cluster health, run the
following command:

.. prompt:: bash #

   ceph health detail

To see more detail on the status of the PGs, run the following command:

.. prompt:: bash #

   ceph pg dump --format=json-pretty

To see a list of inconsistent PGs, run the following command:

.. prompt:: bash #

   rados list-inconsistent-pg {pool}

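Because ``rados list-inconsistent-pg`` takes a single pool, checking a whole
cluster means repeating it for every pool. The following is a minimal sketch
of such a loop, not an official tool. It assumes that ``ceph osd pool ls``
prints one pool name per line and that ``rados list-inconsistent-pg`` prints a
JSON array of PG IDs. The function name ``list_all_inconsistent_pgs`` is
hypothetical.

```shell
# Hypothetical helper: print every inconsistent PG ID in the cluster,
# one per line. Assumes `ceph osd pool ls` prints one pool name per line.
list_all_inconsistent_pgs() {
    ceph osd pool ls | while IFS= read -r pool; do
        # </dev/null keeps rados from reading the pool list on stdin.
        # grep -o prints each PG ID (pool number, dot, hex suffix) found in
        # the JSON array on its own line; `|| true` tolerates an empty array.
        rados list-inconsistent-pg "$pool" </dev/null |
            grep -oE '[0-9]+\.[0-9a-f]+' || true
    done
}
```

After the function has been defined in a shell, running
``list_all_inconsistent_pgs`` with no arguments prints the PG IDs that can then
be passed to the other commands on this page.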
To see a list of inconsistent RADOS objects, run the following command:

.. prompt:: bash #

   rados list-inconsistent-obj {pgid}

To see a list of inconsistent snapsets in a specific PG, run the following
command:

.. prompt:: bash #

   rados list-inconsistent-snapset {pgid}


Commands for Repairing PGs
==========================
The form of the command to repair a broken PG is as follows:

.. prompt:: bash #

   ceph pg repair {pgid}

Here ``{pgid}`` represents the id of the affected PG.

For example:

.. prompt:: bash #

   ceph pg repair 1.4

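When several PGs are affected, it can help to review the repair commands
before running any of them. The following is a minimal sketch under that
assumption: it reads PG IDs from standard input and only prints the matching
``ceph pg repair`` commands. The function name ``print_repair_commands`` is
hypothetical, and the pattern used to check the PG ID is an assumption about
its form (a pool number, a dot, and a hexadecimal suffix).

```shell
# Hypothetical helper: read PG IDs from standard input, one per line, and
# print (not run) the repair command for each well-formed ID.
print_repair_commands() {
    while IFS= read -r pgid; do
        # Assumed PG ID form: decimal pool number, dot, hexadecimal suffix.
        if printf '%s\n' "$pgid" | grep -Eq '^[0-9]+\.[0-9a-f]+$'; then
            printf 'ceph pg repair %s\n' "$pgid"
        else
            printf 'skipping malformed PG ID: %s\n' "$pgid" >&2
        fi
    done
}
```

Printing the commands instead of executing them leaves the decision to repair
with the operator.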
.. note:: PG IDs have the form ``N.xxxxx``, where ``N`` is the number of the
   pool that contains the PG. The command ``ceph osd lspools`` and the
   command ``ceph osd dump | grep pool`` return a list of pool numbers.

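Because the pool number is the part of the PG ID before the dot, it can be
extracted with shell parameter expansion. This is a minimal sketch, and the
function name ``pg_pool_id`` is hypothetical.

```shell
# Hypothetical helper: print the pool number encoded in a PG ID such as 1.4.
pg_pool_id() {
    case "$1" in
        # Accept only strings that start with a digit and contain a dot.
        [0-9]*.*) printf '%s\n' "${1%%.*}" ;;
        *) printf 'not a PG ID: %s\n' "$1" >&2; return 1 ;;
    esac
}
```

The printed number can be matched against the pool numbers that the commands
in the note return.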
More Information on PG Repair
=============================
Ceph stores and updates the checksums of objects stored in the cluster. When a
scrub is performed on a PG, the OSD attempts to choose an authoritative copy
from among its replicas. Only one of the possible cases is consistent. During
a deep scrub, Ceph calculates the checksum of each object that it reads from
disk and compares it to the checksum that was previously recorded. If the
current checksum and the previously recorded checksum do not match, that
mismatch is considered to be an inconsistency. In the case of replicated pools,
any mismatch between the checksum of any replica of an object and the checksum
of the authoritative copy means that there is an inconsistency. The discovery
of these inconsistencies causes a PG's state to be set to ``inconsistent``.

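The comparison against an authoritative copy can be illustrated outside Ceph
with ordinary files. The following is only a sketch of the idea, not of how an
OSD implements scrubbing: the POSIX ``cksum`` utility stands in for the
checksums that Ceph maintains, and the function name
``find_inconsistent_copies`` and its file arguments are hypothetical.

```shell
# Hypothetical illustration: compare each replica file with a copy that is
# treated as authoritative, and report the replicas whose checksum differs.
find_inconsistent_copies() {
    authoritative="$1"
    shift
    # Reading from stdin makes cksum print only the CRC and the byte count.
    want="$(cksum < "$authoritative")"
    status=0
    for replica in "$@"; do
        have="$(cksum < "$replica")"
        if [ "$have" != "$want" ]; then
            printf 'inconsistent: %s\n' "$replica"
            status=1
        fi
    done
    # A nonzero status means that at least one replica differs.
    return "$status"
}
```

The first argument is the copy treated as authoritative, and every remaining
argument is a replica to check against it.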
The ``pg repair`` command attempts to fix inconsistencies of various kinds. If
``pg repair`` finds an inconsistent PG, it attempts to overwrite the digest of
the inconsistent copy with the digest of the authoritative copy. If
``pg repair`` finds an inconsistent copy in a replicated pool, it marks that
copy as missing. In the case of replicated pools, recovery is beyond the scope
of ``pg repair``.

In the case of erasure-coded and BlueStore pools, Ceph will automatically
perform repairs if ``osd_scrub_auto_repair`` (default ``false``) is set to
``true`` and if no more than ``osd_scrub_auto_repair_num_errors`` (default
``5``) errors are found.

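Both options can be changed at run time with ``ceph config set``. The following
is a minimal sketch that applies them to all OSDs. The function name
``enable_scrub_auto_repair`` is hypothetical, and whether to apply the settings
to every OSD is a choice left to the operator.

```shell
# Hypothetical wrapper: enable scrub auto-repair for all OSDs and set the
# largest number of errors that an automatic repair is allowed to fix.
enable_scrub_auto_repair() {
    max_errors="${1:-5}"   # 5 is the documented default
    ceph config set osd osd_scrub_auto_repair true &&
        ceph config set osd osd_scrub_auto_repair_num_errors "$max_errors"
}
```

Calling ``enable_scrub_auto_repair`` without an argument keeps the default
error limit, and passing a number such as ``10`` changes it.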
The ``pg repair`` command will not solve every problem. By default, Ceph does
not automatically repair PGs when they are found to contain inconsistencies.

The checksum of a RADOS object or an omap is not always available. Checksums
are calculated incrementally. If a replicated object is updated
non-sequentially, the write operation involved in the update changes the object
and invalidates its checksum. The whole object is not read while the checksum
is recalculated. The ``pg repair`` command is able to make repairs even when
checksums are not available to it, as in the case of Filestore. Users working
with replicated Filestore pools might prefer manual repair to
``ceph pg repair``.

This material is relevant for Filestore, but not for BlueStore, which has its
own internal checksums. The matched-record checksum and the calculated checksum
cannot prove that any specific copy is in fact authoritative. If there is no
checksum available, ``pg repair`` favors the data on the primary, but this
might not be the uncorrupted replica. Because of this uncertainty, human
intervention is necessary when an inconsistency is discovered. This
intervention sometimes involves use of ``ceph-objectstore-tool``.

External Links
==============
https://ceph.io/geen-categorie/ceph-manually-repair-object/ - This page
contains a walkthrough of the repair of a PG. It is recommended reading if you
want to repair a PG but have never done so.