=================
 Storage Devices
=================

There are several Ceph daemons in a storage cluster:

* **Ceph OSDs** (Object Storage Daemons) store most of the data
  in Ceph. Usually each OSD is backed by a single storage device.
  This can be a traditional hard disk (HDD) or a solid-state drive
  (SSD). OSDs can also be backed by a combination of devices: for
  example, an HDD for most data and an SSD (or partition of an
  SSD) for some metadata. The number of OSDs in a cluster is
  usually a function of the amount of data to be stored, the size
  of each storage device, and the level and type of redundancy
  specified (replication or erasure coding).
* **Ceph Monitor** daemons manage critical cluster state,
  including cluster membership and authentication information.
  Small clusters require only a few gigabytes of storage to hold
  the monitor database. In large clusters, however, the monitor
  database can reach tens to hundreds of gigabytes in size.
* **Ceph Manager** daemons run alongside monitor daemons,
  providing additional monitoring and interfaces to external
  monitoring and management systems.

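A quick way to see which of these daemons a cluster contains is the ``ceph``
command-line tool. The commands below are only an illustrative sketch and
assume they are run from a node with an admin keyring::

    # overall cluster status, including counts of mon, mgr, and osd daemons
    ceph -s

    # OSDs arranged by the CRUSH hierarchy, with their up/down status
    ceph osd tree

    # monitor quorum membership
    ceph mon stat
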
OSD Backends
============

There are two backends that OSDs can use to manage the data they store.
As of the Luminous 12.2.z release, the default (and recommended) backend is
*BlueStore*. Prior to the Luminous release, the default (and only) backend was
*FileStore*.

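To check which backend an existing OSD is using, query its metadata. A minimal
example, assuming an OSD with id ``0`` (substitute the id of one of your own
OSDs)::

    # the osd_objectstore field reports either "bluestore" or "filestore"
    ceph osd metadata 0 | grep osd_objectstore
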
BlueStore
---------

BlueStore is a special-purpose storage backend designed specifically for
managing data on disk for Ceph OSD workloads. Its design is informed by a
decade of experience supporting and managing FileStore OSDs.

Key BlueStore features include:

* Direct management of storage devices. BlueStore consumes raw block devices
  or partitions. This avoids intervening layers of abstraction (such as local
  file systems like XFS) that can limit performance or add complexity.
* Metadata management with RocksDB. BlueStore embeds the RocksDB key/value
  database in order to manage internal metadata, including the mapping of
  object names to block locations on disk.
* Full data and metadata checksumming. By default, all data and
  metadata written to BlueStore is protected by one or more
  checksums. No data or metadata is read from disk or returned
  to the user without being verified.
* Inline compression. Data can optionally be compressed before being written
  to disk (see the sketch after this list).
* Multi-device metadata tiering. BlueStore allows its internal
  journal (write-ahead log) to be written to a separate, high-speed
  device (such as an SSD, NVMe device, or NVDIMM) for increased performance.
  If a significant amount of faster storage is available, internal
  metadata can also be stored on the faster device.
* Efficient copy-on-write. RBD and CephFS snapshots rely on a
  copy-on-write *clone* mechanism that is implemented efficiently in
  BlueStore. This results in efficient I/O both for regular snapshots
  and for erasure-coded pools (which rely on cloning to implement
  efficient two-phase commits).

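As a rough sketch of how the compression and metadata-tiering features above
are used in practice, the commands below enable inline compression for OSDs and
provision a BlueStore OSD whose RocksDB metadata (and write-ahead log) live on
a faster device. The device paths are placeholders, and the centralized
``ceph config`` commands assume a recent release; see
:doc:`bluestore-config-ref` for the authoritative list of options::

    # enable inline compression on OSDs; mode can be none, passive,
    # aggressive, or force; the algorithm can be snappy, zlib, lz4, or zstd
    ceph config set osd bluestore_compression_mode aggressive
    ceph config set osd bluestore_compression_algorithm snappy

    # provision an OSD with object data on an HDD and RocksDB metadata on a
    # faster NVMe partition; the write-ahead log follows the DB device unless
    # a separate --block.wal device is given (paths are examples only)
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
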
For more information, see :doc:`bluestore-config-ref` and :doc:`/rados/operations/bluestore-migration`.

FileStore
---------

FileStore is the legacy approach to storing objects in Ceph. It
relies on a standard file system (normally XFS) in combination with a
key/value database (traditionally LevelDB, now RocksDB) for some
metadata.

FileStore is well-tested and widely used in production. However, it
suffers from many performance deficiencies due to its overall design
and its reliance on a traditional file system for object data storage.

Although FileStore is capable of functioning on most POSIX-compatible
file systems (including btrfs and ext4), we recommend that only the
XFS file system be used with Ceph. Both btrfs and ext4 have known bugs and
deficiencies, and their use with Ceph may lead to data loss. By default, all
Ceph provisioning tools use XFS.

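For reference, a FileStore OSD pairs an XFS-backed data device with a journal,
which is often placed on a faster device. A minimal provisioning sketch, with
placeholder device paths::

    # object data goes to an XFS file system created on /dev/sdb;
    # the FileStore journal goes to a separate (often SSD-backed) partition
    ceph-volume lvm create --filestore --data /dev/sdb --journal /dev/sdc1
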
For more information, see :doc:`filestore-config-ref`.