==================================
Orphan List and Associated Tooling
==================================

.. versionadded:: Luminous

.. contents::

Orphans are RADOS objects that are left behind after their associated
RGW objects are removed. Normally these RADOS objects are removed
automatically, either immediately or through a process known as
"garbage collection". Over the history of RGW, however, there may have
been bugs that prevented these RADOS objects from being deleted, and
these RADOS objects may be consuming space on the Ceph cluster without
being of any use. From the perspective of RGW, we call such RADOS
objects "orphans".

Orphans Find -- DEPRECATED
--------------------------

The ``radosgw-admin`` tool has three subcommands to help manage
orphans, but these subcommands are deprecated. They are:

.. prompt:: bash #

   radosgw-admin orphans find ...
   radosgw-admin orphans finish ...
   radosgw-admin orphans list-jobs ...

These subcommands have two key problems. First, they have not been
actively maintained and therefore have not tracked RGW as its features
have evolved. As a result, confidence that they can accurately
identify true orphans is low.

Second, these subcommands store intermediate results on the cluster
itself. This can be problematic when cluster administrators are
running short of storage space and want to remove orphans to reclaim
some. The intermediate results could strain the existing cluster
storage capacity even further.

For these reasons "orphans find" has been deprecated.

Orphan List
-----------

Because "orphans find" has been deprecated, RGW now includes an
additional tool, ``rgw-orphan-list``. When run, it lists the
available pools and prompts the user to enter the name of the data
pool. The tool then produces, perhaps after an extended period of
time, a local file containing the RADOS objects from the designated
pool that appear to be orphans. The administrator is free to examine
this file and then decide on a course of action, perhaps removing
those RADOS objects from the designated pool.

All intermediate results are stored on the local file system rather
than on the Ceph cluster, so running ``rgw-orphan-list`` should have
no appreciable impact on the amount of cluster storage consumed.
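
The comparison behind such a list can be thought of as a set
difference between two listings. The following is a minimal sketch of
that idea, not the tool's actual implementation. The function name
``list_orphan_candidates`` and the file names are hypothetical. It
assumes one file holds every RADOS object name in the data pool and
the other holds the RADOS object names that RGW expects to exist, one
name per line in both cases.

.. code-block:: bash

   # Hypothetical helper: print names that appear in the pool listing
   # but not in the list of RADOS objects RGW expects to exist.
   list_orphan_candidates() {
       pool_listing="$1"   # one RADOS object name per line
       rgw_expected="$2"   # one RADOS object name per line
       tmpdir=$(mktemp -d) || return 1
       # comm needs sorted input; LC_ALL=C keeps sort and comm in agreement.
       LC_ALL=C sort -u "$pool_listing" > "$tmpdir/pool.sorted"
       LC_ALL=C sort -u "$rgw_expected" > "$tmpdir/rgw.sorted"
       # -23 drops lines unique to the second file and lines common to both,
       # leaving only names found solely in the pool listing.
       LC_ALL=C comm -23 "$tmpdir/pool.sorted" "$tmpdir/rgw.sorted"
       rm -rf "$tmpdir"
   }

   # Possible usage on a live cluster (assumed workflow, not run here):
   #   rados -p {pool-name} ls > pool.txt
   #   radosgw-admin bucket radoslist > rgw.txt
   #   list_orphan_candidates pool.txt rgw.txt > orphan-candidates.txt

All temporary files in this sketch live on the local file system, in
keeping with the approach described above.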

WARNING: Experimental Status
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``rgw-orphan-list`` tool is new and therefore currently considered
experimental. The list of orphans produced should be "sanity checked"
before being used for a large delete operation.
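
One possible way to sanity check the list is to inspect a random
sample of its entries before deleting anything. The following is a
minimal sketch, assuming the list is a plain text file with one RADOS
object name per line. The function name ``sample_orphan_candidates``
and the file name are hypothetical, and ``shuf`` is assumed to be
available (it is part of GNU coreutils).

.. code-block:: bash

   # Hypothetical helper: print a random sample of entries from the list.
   sample_orphan_candidates() {
       orphan_file="$1"     # one RADOS object name per line
       count="${2:-10}"     # sample size, defaults to 10
       shuf -n "$count" "$orphan_file"
   }

   # Each sampled name could then be examined by hand, for example:
   #   rados -p {pool-name} stat {object-name}

A sample cannot prove that the whole list is correct, but entries that
turn out to belong to live RGW objects are a strong signal that the
list should not be used as is.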

WARNING: Specifying a Data Pool
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a pool other than an RGW data pool is specified, the results of the
tool will be erroneous. All RADOS objects found on such a pool will
falsely be designated as orphans.

WARNING: Unindexed Buckets
~~~~~~~~~~~~~~~~~~~~~~~~~~

RGW allows for unindexed buckets, that is, buckets that do not maintain
an index of their contents. This is not a typical configuration, but
it is supported. Because the ``rgw-orphan-list`` tool uses the bucket
indices to determine what RADOS objects should exist, objects in
unindexed buckets will falsely be listed as orphans.
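
If a cluster is known to contain unindexed buckets, their entries can
be set aside before the list is acted upon. The following is a minimal
sketch, assuming that RADOS object names begin with the owning
bucket's marker followed by an underscore, and that the marker of each
unindexed bucket has been looked up beforehand (for example from the
``marker`` field printed by ``radosgw-admin bucket stats
--bucket={bucket-name}``). The function name ``exclude_marker`` and the
file names are hypothetical.

.. code-block:: bash

   # Hypothetical helper: drop every entry whose name starts with
   # "<marker>_" from a list of orphan candidates.
   exclude_marker() {
       marker="$1"          # bucket marker of an unindexed bucket
       orphan_file="$2"     # one RADOS object name per line
       # index() does a fixed-string match, so dots in the marker are
       # not treated as regular-expression wildcards.
       awk -v p="${marker}_" 'index($0, p) != 1' "$orphan_file"
   }

   # Possible usage (assumed file names):
   #   exclude_marker {bucket-marker} orphan-candidates.txt > filtered.txt

Because of shared tail objects, a prefix filter like this is only an
approximation, so the filtered list still deserves a manual check.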


RADOS List
----------

One of the sub-steps in computing a list of orphans is to map each RGW
object into its corresponding set of RADOS objects. This is done using
a subcommand of ``radosgw-admin``.

.. prompt:: bash #

   radosgw-admin bucket radoslist [--bucket={bucket-name}]

The subcommand produces a list of the RADOS objects that back all of
the RGW objects. If a bucket is specified, the subcommand lists only
the RADOS objects that correspond to the RGW objects in that bucket.

Note: Shared Bucket Markers
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some administrators will be aware of the coding schemes used to name
the RADOS objects that correspond to RGW objects, which include a
"marker" unique to a given bucket.

RADOS objects that correspond with the contents of one RGW bucket,
however, may contain a marker that specifies a different bucket. This
behavior is a consequence of the "shallow copy" optimization used by
RGW. When larger objects are copied from bucket to bucket, only the
"head" objects are actually copied, and the tail objects are
shared. Those shared objects will contain the marker of the original
bucket.
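
The effect of shallow copies can be observed by tallying the markers
that appear in the saved output of ``radosgw-admin bucket radoslist
--bucket={bucket-name}``. The following is a minimal sketch, assuming
each line is a RADOS object name of the form ``<marker>_<rest>`` and
that markers themselves contain no underscores. The function name
``count_markers`` and the file name are hypothetical.

.. code-block:: bash

   # Hypothetical helper: count how many RADOS object names carry each
   # marker, taking the text before the first underscore as the marker.
   count_markers() {
       radoslist_file="$1"  # one RADOS object name per line
       cut -d_ -f1 "$radoslist_file" | LC_ALL=C sort | uniq -c
   }

   # Possible usage (assumed file name):
   #   radosgw-admin bucket radoslist --bucket={bucket-name} > radoslist.txt
   #   count_markers radoslist.txt

If more than one marker shows up for a single bucket, the extra
markers most likely belong to shared tail objects that were created in
another bucket.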

.. _Data Layout in RADOS : ../layout
.. _Pool Placement and Storage Classes : ../placement