=================
 Storage Devices
=================

There are several Ceph daemons in a storage cluster:

* **Ceph OSDs** (Object Storage Daemons) store most of the data
  in Ceph. Usually each OSD is backed by a single storage device.
  This can be a traditional hard disk drive (HDD) or a solid state
  drive (SSD). OSDs can also be backed by a combination of devices:
  for example, an HDD for most data and an SSD (or partition of an
  SSD) for some metadata, as shown in the sketch after this list.
  The number of OSDs in a cluster is usually a function of the
  amount of data to be stored, the size of each storage device, and
  the level and type of redundancy specified (replication or
  erasure coding).
* **Ceph Monitor** daemons manage critical cluster state. This
  includes cluster membership and authentication information.
  Small clusters require only a few gigabytes of storage to hold
  the monitor database. In large clusters, however, the monitor
  database can reach sizes of tens to hundreds of gigabytes.
* **Ceph Manager** daemons run alongside monitor daemons, providing
  additional monitoring and interfaces to external monitoring and
  management systems.
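
A BlueStore OSD that keeps bulk data on an HDD while placing internal
metadata on a faster device can be created with ``ceph-volume``. The
following is a minimal sketch: the device paths ``/dev/sdb`` and
``/dev/nvme0n1p1`` are placeholders for your own hardware::

   # Object data on a slow HDD; RocksDB metadata (block.db) on a
   # fast NVMe partition. Device names are examples only.
   ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1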


OSD Backends
============

There are two ways that OSDs manage the data they store. As of the
Luminous 12.2.z release, the default (and recommended) backend is
*BlueStore*. Prior to the Luminous release, the default (and only
option) was *FileStore*.

BlueStore
---------

BlueStore is a special-purpose storage backend designed specifically
for managing data on disk for Ceph OSD workloads. BlueStore's design
is based on a decade of experience supporting and managing FileStore
OSDs.

Key BlueStore features include:

* Direct management of storage devices. BlueStore consumes raw block
  devices or partitions. This avoids intervening layers of
  abstraction (such as local file systems like XFS) that can limit
  performance or add complexity.
* Metadata management with RocksDB. RocksDB's key/value database is
  embedded in order to manage internal metadata, including the
  mapping of object names to block locations on disk.
* Full data and metadata checksumming. By default, all data and
  metadata written to BlueStore is protected by one or more
  checksums. No data or metadata is read from disk or returned
  to the user without being verified.
* Inline compression. Data can be optionally compressed before being
  written to disk (see the configuration sketch after this list).
* Multi-device metadata tiering. BlueStore allows its internal
  journal (write-ahead log) to be written to a separate, high-speed
  device (such as an SSD, NVMe, or NVDIMM) for increased
  performance. If a significant amount of faster storage is
  available, internal metadata can be stored on the faster device.
* Efficient copy-on-write. RBD and CephFS snapshots rely on a
  copy-on-write *clone* mechanism that is implemented efficiently in
  BlueStore. This results in efficient I/O both for regular
  snapshots and for erasure-coded pools (which rely on cloning to
  implement efficient two-phase commits).
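
As an illustration, inline compression can be enabled with the
``bluestore_compression_mode`` and ``bluestore_compression_algorithm``
options. The sketch below shows one reasonable configuration, not a
recommendation for every workload::

   # Compress newly written data whenever possible, using snappy.
   # "aggressive" compresses unless a client hints against it;
   # "passive" would compress only when a client hints for it.
   ceph config set osd bluestore_compression_mode aggressive
   ceph config set osd bluestore_compression_algorithm snappy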

For more information, see :doc:`bluestore-config-ref` and
:doc:`/rados/operations/bluestore-migration`.

FileStore
---------

FileStore is the legacy approach to storing objects in Ceph. It
relies on a standard file system (normally XFS) in combination with a
key/value database (traditionally LevelDB, now RocksDB) for some
metadata.

FileStore is well-tested and widely used in production. However, it
suffers from many performance deficiencies due to its overall design
and its reliance on a traditional file system for object data
storage.

Although FileStore is capable of functioning on most POSIX-compatible
file systems (including btrfs and ext4), we recommend that only the
XFS file system be used with Ceph. Both btrfs and ext4 have known
bugs and deficiencies, and their use may lead to data loss. By
default, all Ceph provisioning tools use XFS.
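
For reference, a legacy FileStore OSD pairs an XFS-backed data device
with a separate journal. The following is a hedged sketch: the device
paths are placeholders, and new deployments should use BlueStore
instead::

   # Legacy FileStore OSD: object data on an XFS-formatted device,
   # journal on a separate (ideally faster) partition.
   ceph-volume lvm create --filestore --data /dev/sdb --journal /dev/sdc1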

For more information, see :doc:`filestore-config-ref`.