[[chapter_pvesr]]
ifdef::manvolnum[]
pvesr(1)
========
:pve-toplevel:

NAME
----

pvesr - Proxmox VE Storage Replication

SYNOPSIS
--------

include::pvesr.1-synopsis.adoc[]

DESCRIPTION
-----------
endif::manvolnum[]

ifndef::manvolnum[]
Storage Replication
===================
:pve-toplevel:
endif::manvolnum[]

The `pvesr` command line tool manages the {PVE} storage replication
framework. Storage replication brings redundancy for guests using
local storage and reduces migration time.

It replicates guest volumes to another node so that all data is available
without using shared storage. Replication uses snapshots to minimize traffic
sent over the network. Therefore, new data is sent only incrementally after
an initial full sync. In the case of a node failure, your guest data is
still available on the replicated node.

The replication is done automatically at configurable intervals.
The minimum replication interval is one minute, and the maximal interval
is once a week. The format used to specify those intervals is a subset of
`systemd` calendar events, see the
xref:pvesr_schedule_time_format[Schedule Format] section.

Every guest can be replicated to multiple target nodes, but a guest cannot
get replicated twice to the same target node.

Each replication's bandwidth can be limited, to avoid overloading a storage
or server.

Virtual guests with active replication cannot currently use online
migration. Offline migration is supported in general. If you migrate to a
node where the guest's data is already replicated, only the changes since
the last synchronization (the so-called `delta`) must be sent, which
reduces the required time significantly. In this case, the replication
direction also switches nodes automatically after the migration has
finished.

For example: VM100 is currently on `nodeA` and gets replicated to `nodeB`.
You migrate it to `nodeB`, so now it gets automatically replicated back
from `nodeB` to `nodeA`.

If you migrate to a node where the guest is not replicated, the whole disk
data must be sent over. After the migration, the replication job continues
to replicate this guest to the configured nodes.
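
For instance, the offline migration of the VM from the example above could
be started like this (a sketch; it assumes that VM 100 is shut down):

----
# qm migrate 100 nodeB
----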

[IMPORTANT]
====
High-Availability is allowed in combination with storage replication, but it
has the following implications:

* redistributing services after a more preferred node comes online will lead
 to errors.

* recovery works, but there may be some data loss between the last synced
 time and the time a node failed.
====

Supported Storage Types
-----------------------

.Storage Types
[width="100%",options="header"]
|============================================
|Description |PVE type |Snapshots |Stable
|ZFS (local) |zfspool |yes |yes
|============================================

[[pvesr_schedule_time_format]]
Schedule Format
---------------

{pve} has a very flexible replication scheduler. It is based on the systemd
time calendar event format.footnote:[see `man 7 systemd.time` for more information]
Calendar events may be used to refer to one or more points in time in a
single expression.

Such a calendar event uses the following format:

----
[day(s)] [[start-time(s)][/repetition-time(s)]]
----

This allows you to configure a set of days on which the job should run.
You can also set one or more start times; these tell the replication
scheduler the moments in time when a job should start.
With this information, we could create a job which runs every workday at
10 PM: `'mon,tue,wed,thu,fri 22'`, which could be abbreviated to
`'mon..fri 22'`. Most reasonable schedules can be written quite
intuitively this way.

NOTE: Hours are set in 24-hour format.

To allow easier and shorter configuration, one or more repetition times can
be set. They indicate that replications are done on the start-time(s)
itself and on the start-time(s) plus all multiples of the repetition value.
If you want to start replication at 8 AM and repeat it every 15 minutes
until 9 AM, you would use: `'8:00/15'`

Here you can also see that, if no hour separator (`:`) is used, the value
gets interpreted as minutes. If such a separator is used, the value on the
left denotes the hour(s), and the value on the right denotes the minute(s).
Further, you can use `*` to match all possible values.

To get additional ideas, look at the
xref:pvesr_schedule_format_examples[Examples below].
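
Such a schedule can be applied to an existing replication job with `pvesr
update`; a short sketch, where the job ID `100-0` is only an assumed
example:

----
# pvesr update 100-0 --schedule 'mon..fri 22'
----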

Detailed Specification
~~~~~~~~~~~~~~~~~~~~~~

days:: Days are specified with an abbreviated English version: `sun, mon,
tue, wed, thu, fri and sat`. You may use multiple days as a comma-separated
list. A range of days can also be set by specifying the start and end day
separated by ``..'', for example `mon..fri`. These formats can also be
mixed. If omitted, `'*'` is assumed.

time-format:: A time format consists of hours and minutes interval lists.
Hours and minutes are separated by `':'`. Both hours and minutes can be
lists and ranges of values, using the same format as days.
First come hours, then minutes. Hours can be omitted if not needed; in this
case, `'*'` is assumed for the value of hours.
The valid range for values is `0-23` for hours and `0-59` for minutes.

[[pvesr_schedule_format_examples]]
Examples:
~~~~~~~~~

.Schedule Examples
[width="100%",options="header"]
|==============================================================================
|Schedule String |Alternative |Meaning
|mon,tue,wed,thu,fri |mon..fri |Every working day at 0:00
|sat,sun |sat..sun |Only on weekends at 0:00
|mon,wed,fri |-- |Only on Monday, Wednesday and Friday at 0:00
|12:05 |12:05 |Every day at 12:05 PM
|*/5 |0/5 |Every five minutes
|mon..wed 30/10 |mon,tue,wed 30/10 |Monday, Tuesday, Wednesday 30, 40 and 50 minutes after every full hour
|mon..fri 8..17,22:0/15 |-- |Every working day every 15 minutes between 8 AM and 6 PM and between 10 PM and 11 PM
|fri 12..13:5/20 |fri 12,13:5/20 |Friday at 12:05, 12:25, 12:45, 13:05, 13:25 and 13:45
|12,14,16,18,20,22:5 |12/2:5 |Every day starting at 12:05 until 22:05, every 2 hours
|* |*/1 |Every minute (minimum interval)
|==============================================================================

Error Handling
--------------

If a replication job encounters problems, it is placed in an error state.
In this state, the configured replication intervals get suspended
temporarily. The failed replication is then retried at a 30-minute
interval; once this succeeds, the original schedule gets activated again.
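
You can check the current state of all replication jobs on a node with the
`pvesr status` command:

----
# pvesr status
----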

Possible issues
~~~~~~~~~~~~~~~

The following list covers only the most common issues; depending on your
setup, there may also be another cause:

* Network is not working.

* No free space left on the replication target storage.

* No storage with the same storage ID is available on the target node.

NOTE: You can always use the replication log to get hints about a
problem's cause.
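
For instance, you could inspect the log of a job directly on the source
node; a sketch, assuming the job ID `100-0` and the default log location
under `/var/log/pve/replicate/`:

----
# cat /var/log/pve/replicate/100-0
----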

Migrating a guest in case of Error
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// FIXME: move this to better fitting chapter (sysadmin ?) and only link to
// it here

In the case of a grave error, a virtual guest may get stuck on a failed
node. You then need to move it manually to a working node again.

Example
~~~~~~~

Let's assume that you have two guests (VM 100 and CT 200) running on node A
and replicating to node B.
Node A failed and cannot come back online. Now you have to migrate the
guests to node B manually.

- connect to node B over ssh or open its shell via the WebUI

- check that the cluster is quorate
+
----
# pvecm status
----

- If you have no quorum, we strongly advise to fix this first and to make
 the node operable again. Only if this is not possible at the moment, you
 may use the following command to enforce quorum on the current node:
+
----
# pvecm expected 1
----

WARNING: If expected votes are set, avoid changes which affect the cluster
(for example adding/removing nodes, storages, virtual guests) at all costs.
Only use it to get vital guests up and running again or to resolve the
quorum issue itself.

- move both guest configuration files from the origin node A to node B:
+
----
# mv /etc/pve/nodes/A/qemu-server/100.conf /etc/pve/nodes/B/qemu-server/100.conf
# mv /etc/pve/nodes/A/lxc/200.conf /etc/pve/nodes/B/lxc/200.conf
----

- Now you can start the guests again:
+
----
# qm start 100
# pct start 200
----

Remember to replace the VMIDs and node names with your respective values.

Managing Jobs
-------------

[thumbnail="screenshot/gui-qemu-add-replication-job.png"]

You can use the web GUI to create, modify, and remove replication jobs
easily. Additionally, the command line interface (CLI) tool `pvesr` can be
used to do this.

You can find the replication panel on all levels (datacenter, node,
virtual guest) in the web GUI. They differ in which jobs get shown:
all, node-specific, or guest-specific jobs.

When adding a new job, you need to specify the virtual guest (if not
already selected) and the target node. The replication
xref:pvesr_schedule_time_format[schedule] can be set, if the default of
`all 15 minutes` is not desired. You may also impose a rate limit on a
replication job; this can help to keep the load on the storage acceptable.

A replication job is identified by a cluster-wide unique ID. This ID is
composed of the VMID and a job number.
This ID must only be specified manually if the CLI tool is used.
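
For example, you can list the existing replication jobs and their IDs on
the CLI with:

----
# pvesr list
----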

Command Line Interface Examples
-------------------------------

Create a replication job which will run every 5 minutes, with a bandwidth
limit of 10 MB/s (megabytes per second), for the guest with ID 100:

----
# pvesr create-local-job 100-0 pve1 --schedule "*/5" --rate 10
----

Disable an active job with ID `100-0`:

----
# pvesr disable 100-0
----

Enable a deactivated job with ID `100-0`:

----
# pvesr enable 100-0
----

Change the schedule interval of the job with ID `100-0` to once per hour:

----
# pvesr update 100-0 --schedule '*/00'
----

ifdef::manvolnum[]
include::pve-copyright.adoc[]
endif::manvolnum[]