==================================
Orphan List and Associated Tooling
==================================

.. versionadded:: Luminous

.. contents::

Orphans are RADOS objects that are left behind after their associated
RGW objects are removed. Normally these RADOS objects are removed
automatically, either immediately or through a process known as
"garbage collection". Over the history of RGW, however, there may have
been bugs that prevented these RADOS objects from being deleted, and
these RADOS objects may be consuming space on the Ceph cluster without
being of any use. From the perspective of RGW, we call such RADOS
objects "orphans".

Orphans Find -- DEPRECATED
--------------------------

The ``radosgw-admin`` tool has three subcommands to help manage
orphans, but these subcommands are deprecated. They are:

::

    # radosgw-admin orphans find ...
    # radosgw-admin orphans finish ...
    # radosgw-admin orphans list-jobs ...

There are two key problems with these subcommands. First, they have
not been actively maintained and therefore have not kept pace with RGW
as its features have evolved. As a result, confidence that they can
accurately identify true orphans is presently low.

Second, these subcommands store intermediate results on the cluster
itself. This can be problematic when cluster administrators are short
of storage space and want to remove orphans as a means of addressing
the issue. The intermediate results could strain the existing cluster
storage capacity even further.

For these reasons "orphans find" has been deprecated.

Orphan List
-----------

Because "orphans find" has been deprecated, RGW now includes an
additional tool, 'rgw-orphan-list'. When run, it lists the available
pools and prompts the user to enter the name of the data pool. The
tool then produces, perhaps after an extended period of time, a local
file containing the RADOS objects from the designated pool that appear
to be orphans. The administrator is free to examine this file and then
decide on a course of action, perhaps removing those RADOS objects
from the designated pool.

All intermediate results are stored on the local file system rather
than the Ceph cluster, so running the 'rgw-orphan-list' tool should
have no appreciable impact on the amount of cluster storage consumed.

WARNING: Experimental Status
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The 'rgw-orphan-list' tool is new and therefore currently considered
experimental. The list of orphans produced should be "sanity checked"
before being used for a large delete operation.
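
One simple way to sanity check a long list is to pull a small random
sample and inspect those objects by hand before deleting anything. The
sketch below is only an illustration: it assumes the list is a plain
text file with one RADOS object name per line, and the file and its
contents are hypothetical sample data, not output of the tool.

::

    # Hypothetical orphan list: one RADOS object name per line.
    list=$(mktemp)
    printf 'marker.2_stale-%d.bin\n' 1 2 3 4 5 6 7 8 9 10 > "$list"

    # Count the entries in the list.
    total=$(wc -l < "$list" | tr -d ' ')

    # Draw three entries at random for manual inspection.
    sample=$(shuf -n 3 "$list")
    sample_count=$(printf '%s\n' "$sample" | wc -l | tr -d ' ')

    echo "inspecting $sample_count of $total entries:"
    echo "$sample"
    rm -f "$list"

Each sampled name can then be checked against what is known about the
buckets in the pool before the full list is acted on.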

WARNING: Specifying a Data Pool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a pool other than an RGW data pool is specified, the results of the
tool will be erroneous. All RADOS objects found on such a pool will
falsely be designated as orphans.

WARNING: Unindexed Buckets
~~~~~~~~~~~~~~~~~~~~~~~~~~

RGW allows for unindexed buckets, that is, buckets that do not
maintain an index of their contents. This is not a typical
configuration, but it is supported. Because the 'rgw-orphan-list' tool
uses the bucket indices to determine what RADOS objects should exist,
objects in unindexed buckets will falsely be listed as orphans.


RADOS List
----------

One of the sub-steps in computing a list of orphans is to map each RGW
object into its corresponding set of RADOS objects. This is done using
a subcommand of 'radosgw-admin'.

::

    # radosgw-admin bucket radoslist [--bucket={bucket-name}]

The subcommand will produce a list of the RADOS objects that back all
of the RGW objects. If a bucket is specified then the subcommand will
only produce a list of the RADOS objects that correspond to the RGW
objects in the specified bucket.
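
The role of this list in finding orphans can be pictured as a set
difference: every RADOS object in the data pool, minus the RADOS
objects that the RGW objects account for. The following is a minimal
sketch of that idea only, not the implementation of
'rgw-orphan-list'. The object names and file names are hypothetical
sample data standing in for a real pool listing and real ``radoslist``
output.

::

    # Hypothetical sample data written to a scratch directory.
    workdir=$(mktemp -d)

    # Every RADOS object found in the pool (sample).
    printf '%s\n' \
        'marker.1_photo.jpg' \
        'marker.1__shadow_.abc_1' \
        'marker.2_stale.bin' > "$workdir/pool-objects.txt"

    # RADOS objects that back known RGW objects (sample).
    printf '%s\n' \
        'marker.1__shadow_.abc_1' \
        'marker.1_photo.jpg' > "$workdir/expected-objects.txt"

    # comm(1) needs sorted input; use one locale so both sorts agree.
    LC_ALL=C sort -u "$workdir/pool-objects.txt" > "$workdir/pool.sorted"
    LC_ALL=C sort -u "$workdir/expected-objects.txt" > "$workdir/expected.sorted"

    # Keep only the lines unique to the pool listing: the orphan candidates.
    orphans=$(LC_ALL=C comm -23 "$workdir/pool.sorted" "$workdir/expected.sorted")
    echo "$orphans"
    rm -rf "$workdir"

In this sample only ``marker.2_stale.bin`` has no RGW object behind
it, so it is the single candidate reported.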

Note: Shared Bucket Markers
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some administrators will be aware of the coding schemes used to name
the RADOS objects that correspond to RGW objects, which include a
"marker" unique to a given bucket.

RADOS objects that correspond to the contents of one RGW bucket,
however, may contain a marker that specifies a different bucket. This
behavior is a consequence of the "shallow copy" optimization used by
RGW. When larger objects are copied from bucket to bucket, only the
"head" objects are actually copied, and the tail objects are
shared. Those shared objects will contain the marker of the original
bucket.
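
To see how shared markers show up in a listing, the RADOS object names
can be grouped by their leading marker. This is a sketch under an
assumption not stated above, namely that the marker is the part of the
name before the first underscore. The names and markers below are
hypothetical sample data.

::

    # Hypothetical radoslist output for a bucket whose own marker is
    # 'zone.100.1'. The tail object carries 'zone.200.7', the marker of
    # the bucket the large object was originally copied from.
    names='zone.100.1_report.pdf
    zone.100.1_big.iso
    zone.200.7__shadow_.xyz_1'

    # Assumption: the marker is everything before the first underscore.
    # Strip any leading indentation, then count objects per marker.
    marker_counts=$(printf '%s\n' "$names" \
        | sed 's/^ *//' \
        | cut -d_ -f1 \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2 "=" $1}')
    echo "$marker_counts"

A marker that differs from the bucket's own, such as ``zone.200.7``
here, is what a shared tail object would look like and is not by
itself a sign of an orphan.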

.. _Data Layout in RADOS : ../layout
.. _Pool Placement and Storage Classes : ../placement