====================
Recovery Reservation
====================

Recovery reservation extends and subsumes backfill reservation. The
reservation system from backfill recovery is used for local and remote
reservations.

When a PG goes active, first it determines what type of recovery is
necessary, if any. It may need log-based recovery, backfill recovery,
both, or neither.

In log-based recovery, the primary first acquires a local reservation
from the OSDService's local_reserver. Then a MRemoteReservationRequest
message is sent to each replica in order of OSD number. These requests
will always be granted (i.e., cannot be rejected), but they may take
some time to be granted if the remotes have already granted all their
remote reservation slots.
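
The "always granted, but possibly delayed" behaviour can be pictured as a
reserver with a fixed number of slots and a wait queue. The following is a
minimal sketch only: ``SlotReserver`` and its methods are hypothetical names,
not the OSD's actual reserver, and serving waiters in arrival order is an
assumption of the sketch.

```python
from collections import deque


class SlotReserver:
    """Hypothetical stand-in for a reserver with a fixed number of slots."""

    def __init__(self, max_slots):
        self.max_slots = max_slots
        self.granted = set()    # PG ids currently holding a slot
        self.waiting = deque()  # PG ids queued in arrival order (assumption)

    def request(self, pgid):
        """Never rejects: grants now if a slot is free, otherwise queues."""
        if len(self.granted) < self.max_slots:
            self.granted.add(pgid)
            return True
        self.waiting.append(pgid)
        return False

    def release(self, pgid):
        """Frees a slot and hands it to the longest-waiting request, if any."""
        self.granted.discard(pgid)
        if self.waiting and len(self.granted) < self.max_slots:
            self.granted.add(self.waiting.popleft())


remote = SlotReserver(max_slots=1)
first = remote.request("pg 1.0")   # slot free: granted immediately
second = remote.request("pg 1.1")  # slot busy: queued rather than rejected
remote.release("pg 1.0")           # the queued request now receives the slot
```

In this sketch the second request is never refused. It simply waits until the
first holder releases its slot.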

After all reservations are acquired, log-based recovery proceeds as it
would without the reservation system.

After log-based recovery completes, the primary releases all remote
reservations. The local reservation remains held. The primary then
determines whether backfill is necessary. If it is not necessary, the
primary releases its local reservation and waits in the Recovered state
for all OSDs to indicate that they are clean.

If backfill recovery occurs after log-based recovery, the local
reservation does not need to be reacquired since it is still held from
before. If it occurs immediately after activation (log-based recovery
not possible/necessary), the local reservation is acquired according to
the typical process.

Once the primary has its local reservation, it requests a remote
reservation from the backfill target. This reservation CAN be rejected,
for instance if the OSD is too full (backfillfull_ratio osd setting).
If the reservation is rejected, the primary drops its local
reservation, waits (osd_backfill_retry_interval), and then retries. It
will retry indefinitely.
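
The reject-and-retry cycle described above can be sketched as a loop. This is
an illustration under stated assumptions, not the OSD code:
``reserve_backfill`` and ``target_is_too_full`` are hypothetical names, and
the interval of 30.0 is just an example value standing in for
osd_backfill_retry_interval.

```python
import time


def reserve_backfill(target_is_too_full, retry_interval, sleep=time.sleep):
    """Return the sequence of reservation events until backfill may start.

    target_is_too_full: hypothetical callable, True while the backfill
    target would reject the remote reservation.
    retry_interval: seconds to wait between attempts.
    """
    events = []
    while True:  # the primary retries indefinitely
        events.append("local acquired")  # local reservation always comes first
        if not target_is_too_full():
            events.append("remote granted")
            return events
        # Rejected: the local reservation is dropped before waiting.
        events.append("remote rejected, local dropped")
        sleep(retry_interval)


# The target is too full twice, then has room on the third attempt.
responses = iter([True, True, False])
waits = []  # records the waits instead of actually sleeping
events = reserve_backfill(lambda: next(responses), 30.0, sleep=waits.append)
```

Dropping the local reservation before waiting means a blocked backfill does
not hold a local slot while it is idle.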

Once the primary has the local and remote reservations, backfill
proceeds as usual. After backfill completes, the remote reservation is
dropped.

Finally, after backfill (or log-based recovery if backfill was not
necessary), the primary drops the local reservation and enters the
Recovered state. Once all the OSDs have reported they are clean, the
primary enters the Clean state and marks itself active+clean.
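
The overall ordering of acquire and release steps can be summarised in a
short sketch. ``reservation_steps`` is a hypothetical helper written for
illustration. The step strings are descriptive labels rather than state
names from the code, and the behaviour when neither kind of recovery is
needed (no reservations taken) is an assumption.

```python
def reservation_steps(needs_log_recovery, needs_backfill):
    """List the reservation-related steps in the order the text describes."""
    steps = []
    if needs_log_recovery or needs_backfill:
        steps.append("acquire local")  # taken once and held across both phases
    if needs_log_recovery:
        steps += [
            "acquire remotes (replicas, by OSD number)",
            "log-based recovery",
            "release remotes",
        ]
    if needs_backfill:
        steps += [
            "acquire remote (backfill target)",
            "backfill",
            "release remote",
        ]
    if steps:
        steps.append("release local")  # dropped only after the last phase
    # Assumption: with nothing to recover, no reservations are taken at all.
    steps += ["Recovered", "Clean"]
    return steps


log_only = reservation_steps(True, False)
both = reservation_steps(True, True)
```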

-----------------
Dump Reservations
-----------------

An OSD daemon command dumps total local and remote reservations::

  ceph daemon osd.<id> dump_recovery_reservations

--------------
Things to Note
--------------

We always grab the local reservation first, to prevent a circular
dependency. We grab remote reservations in order of OSD number for the
same reason.
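
The ordering rule resembles the familiar lock-ordering technique: if every
primary requests remote reservations in the same global order, two primaries
cannot each hold one remote while waiting for a remote the other holds. A
minimal sketch, where ``reservation_order`` is a hypothetical helper and the
OSD numbers are made up:

```python
def reservation_order(primary, replicas):
    """Local reservation first, then remotes by ascending OSD number."""
    return [("local", primary)] + [("remote", osd) for osd in sorted(replicas)]


# Two primaries that share replicas 1 and 7.
order_a = reservation_order(primary=3, replicas=[7, 1, 5])
order_b = reservation_order(primary=5, replicas=[7, 1])

# Both request the shared remotes in the same relative order: 1 before 7.
shared_a = [osd for kind, osd in order_a if kind == "remote" and osd in (1, 7)]
shared_b = [osd for kind, osd in order_b if kind == "remote" and osd in (1, 7)]
```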

The recovery reservation state chart controls the PG state as reported
to the monitor. The state chart can set:

- recovery_wait: waiting for local/remote reservations
- recovering: recovering
- recovery_toofull: recovery stopped, OSD(s) above full ratio
- backfill_wait: waiting for remote backfill reservations
- backfilling: backfilling
- backfill_toofull: backfill stopped, OSD(s) above backfillfull ratio


--------
See Also
--------

The Active substate of the automatically generated OSD state diagram.