.. _rbd-exclusive-locks:

====================
 RBD Exclusive Locks
====================

.. index:: Ceph Block Device; RBD exclusive locks; exclusive-lock

Exclusive locks are mechanisms designed to prevent multiple processes from
accessing the same Rados Block Device (RBD) in an uncoordinated fashion.
Exclusive locks are used heavily in virtualization (where they prevent VMs from
clobbering each other's writes) and in `RBD mirroring`_ (where they are a
prerequisite for journaling in journal-based mirroring and fast generation of
incremental diffs in snapshot-based mirroring).

The ``exclusive-lock`` feature is enabled on newly created images. This default
can be overridden via the ``rbd_default_features`` configuration option or the
``--image-feature`` and ``--image-shared`` options for the ``rbd create``
command.

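For example, a minimal sketch of these options, assuming a hypothetical pool
and image named ``mypool`` and ``myimage``::

    # Create an image with the default feature set; exclusive-lock is
    # enabled unless the defaults have been changed.
    $ rbd create --size 10G mypool/myimage

    # Create an image intended for concurrent shared access; --image-shared
    # disables exclusive-lock and the features that depend on it.
    $ rbd create --size 10G --image-shared mypool/myimage-shared

    # Inspect the features an existing image actually has.
    $ rbd info mypool/myimage
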
.. note::
   Many image features, including ``object-map`` and ``fast-diff``, depend upon
   exclusive locking. Disabling the ``exclusive-lock`` feature will negatively
   affect the performance of some operations.

To maintain multi-client access, the ``exclusive-lock`` feature implements
automatic cooperative lock transitions between clients. It ensures that only
a single client can write to an RBD image at any given time and thus protects
internal image structures such as the object map, the journal or the `PWL
cache`_ from concurrent modification.

Exclusive locking is mostly transparent to the user:

* Whenever a client (a ``librbd`` process or, in the case of a ``krbd`` client,
  a client node's kernel) needs to handle a write to an RBD image on which
  exclusive locking has been enabled, it first acquires an exclusive lock on
  the image. If the lock is already held by some other client, that client is
  requested to release it.

* Whenever a client that holds an exclusive lock on an RBD image gets
  a request to release the lock, it stops handling writes, flushes its caches,
  and releases the lock.

* Whenever a client that holds an exclusive lock on an RBD image terminates
  gracefully, the lock is also released gracefully.

* A graceful release of an exclusive lock on an RBD image (whether by request
  or due to client termination) enables another, subsequent client to acquire
  the lock and start handling writes.

.. warning::
   By default, the ``exclusive-lock`` feature does not prevent two or more
   concurrently running clients from opening the same RBD image and writing to
   it in turns (whether on the same node or not). In effect, their writes just
   get linearized as the lock is automatically transitioned back and forth in
   a cooperative fashion.

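Which client currently holds the lock can be checked from any node; a minimal
sketch, again assuming the hypothetical image ``mypool/myimage``::

    # List locks on the image; with exclusive-lock enabled, the single
    # entry identifies the current lock owner.
    $ rbd lock ls mypool/myimage

    # Show the image's watchers (clients that currently have it open).
    $ rbd status mypool/myimage
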
.. note::
   To disable automatic lock transitions between clients, the
   ``RBD_LOCK_MODE_EXCLUSIVE`` flag may be specified when acquiring the
   exclusive lock. This is exposed via the ``--exclusive`` option of the
   ``rbd device map`` command.

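A sketch of mapping an image in this non-cooperative mode via ``krbd`` (the
pool, image, and device names are hypothetical)::

    # Map the image while disabling automatic lock transitions: the lock is
    # not handed over to other clients on request.
    $ rbd device map --exclusive mypool/myimage
    /dev/rbd0

    # Unmapping the device releases the exclusive lock again.
    $ rbd device unmap /dev/rbd0
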
Blocklisting
============

Sometimes a client that previously held an exclusive lock on an RBD image does
not terminate gracefully, but dies abruptly. This may be because the client
process received a ``KILL`` or ``ABRT`` signal, or because the client node
underwent a hard reboot or suffered a power failure. In cases like this, the
lock is never gracefully released. This means that any new client that comes up
and attempts to write to the image must break the previously held exclusive
lock.

However, a process (or kernel thread) may hang or merely lose network
connectivity to the Ceph cluster for some amount of time. In that case,
breaking the lock would be potentially catastrophic: the hung process or
connectivity issue could resolve itself and the original process might then
compete with one that started in the interim, thus accessing RBD data in an
uncoordinated and destructive manner.

In the event that a lock cannot be acquired in the standard graceful manner,
the overtaking process not only breaks the lock but also blocklists the
previous lock holder. This is negotiated between the new client process and the
Ceph Monitor:

* Upon receiving the blocklist request, the monitor instructs the relevant OSDs
  to no longer serve requests from the old client process;
* after the associated OSD map update is complete, the new client can break the
  previously held lock;
* after the new client has acquired the lock, it can commence writing
  to the image.

Blocklisting is thus a form of storage-level resource `fencing`_.

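The resulting blocklist entries can be inspected, and removed early if
necessary, with standard ``ceph`` commands; a minimal sketch (the client
address shown is hypothetical)::

    # List currently blocklisted client addresses and their expiry times.
    $ ceph osd blocklist ls

    # Remove an entry manually; normally entries simply expire on their own.
    $ ceph osd blocklist rm 192.168.122.10:0/3710147553
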
.. note::
   In order for blocklisting to work, the client must have the ``osd
   blocklist`` capability. This capability is included in the ``profile
   rbd`` capability profile, which should generally be set on all Ceph
   :ref:`client identities <user-management>` that use RBD.

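A sketch of creating such an identity with the ``ceph auth`` command; the
client name and pool are hypothetical, and the ``profile rbd`` capability
profile already includes the ``osd blocklist`` permission::

    $ ceph auth get-or-create client.myrbduser \
          mon 'profile rbd' \
          osd 'profile rbd pool=mypool' \
          mgr 'profile rbd pool=mypool'
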
.. _RBD mirroring: ../rbd-mirroring
.. _PWL cache: ../rbd-persistent-write-log-cache
.. _fencing: https://en.wikipedia.org/wiki/Fencing_(computing)