git.proxmox.com Git - mirror_ovs.git/blame - Documentation/ref/ovsdb.7.rst
ovsdb: Use column diffs for ovsdb and raft log entries.
1..
2 Copyright (c) 2017 Nicira, Inc.
3
4 Licensed under the Apache License, Version 2.0 (the "License"); you may
5 not use this file except in compliance with the License. You may obtain
6 a copy of the License at
7
8 http://www.apache.org/licenses/LICENSE-2.0
9
10 Unless required by applicable law or agreed to in writing, software
11 distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
12 WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
13 License for the specific language governing permissions and limitations
14 under the License.
15
16 Convention for heading levels in Open vSwitch documentation:
17
18 ======= Heading 0 (reserved for the title in a document)
19 ------- Heading 1
20 ~~~~~~~ Heading 2
21 +++++++ Heading 3
22 ''''''' Heading 4
23
24 Avoid deeper levels because they do not render well.
25
26=====
27ovsdb
28=====
29
30Description
31===========
32
33OVSDB, the Open vSwitch Database, is a network-accessible database system.
34Schemas in OVSDB specify the tables in a database and their columns' types and
35can include data, uniqueness, and referential integrity constraints. OVSDB
36offers atomic, consistent, isolated, durable transactions. RFC 7047 specifies
37the JSON-RPC based protocol that OVSDB clients and servers use to communicate.
38
39The OVSDB protocol is well suited for state synchronization because it
40allows each client to monitor the contents of a whole database or a subset
41of it. Whenever a monitored portion of the database changes, the server
42tells the client what rows were added or modified (including the new
43contents) or deleted. Thus, OVSDB clients can easily keep track of the
44newest contents of any part of the database.
45
46While OVSDB is general-purpose and not particularly specialized for use with
47Open vSwitch, Open vSwitch does use it for multiple purposes. The leading use
48of OVSDB is for configuring and monitoring ``ovs-vswitchd(8)``, the Open
49vSwitch switch daemon, using the schema documented in
50``ovs-vswitchd.conf.db(5)``. The Open Virtual Network (OVN) project uses two
51OVSDB schemas, documented as part of that project. Finally, Open vSwitch
52includes the "VTEP" schema, documented in ``vtep(5)``, that many third-party
53hardware switches support for configuring VXLAN, although OVS itself does not
54directly use this schema.
55
56The OVSDB protocol specification allows independent, interoperable
57implementations of OVSDB to be developed. Open vSwitch includes an OVSDB
58server implementation named ``ovsdb-server(1)``, which supports several
59protocol extensions documented in its manpage, and a basic command-line OVSDB
60client named ``ovsdb-client(1)``, as well as OVSDB client libraries for C and
61for Python. Open vSwitch documentation often speaks of these OVSDB
62implementations in Open vSwitch as simply "OVSDB," even though that is distinct
63from the OVSDB protocol; we make the distinction explicit only when it might
64otherwise be unclear from the context.
65
66In addition to these generic OVSDB server and client tools, Open vSwitch
67includes tools for working with databases that have specific schemas:
68``ovs-vsctl`` works with the ``ovs-vswitchd`` configuration database and
69``vtep-ctl`` works with the VTEP database.
70
71RFC 7047 specifies the OVSDB protocol but it does not specify an on-disk
72storage format. Open vSwitch includes ``ovsdb-tool(1)`` for working with its
73own on-disk database formats. The most notable feature of these formats is that
74``ovsdb-tool(1)`` makes it easy for users to print the transactions that have
75changed a database since the last time it was compacted. This feature is often
76useful for troubleshooting.
77
78Schemas
79=======
80
81Schemas in OVSDB have a JSON format that is specified in RFC 7047. They
82are often stored in files with an extension ``.ovsschema``. An
83on-disk database in OVSDB includes a schema and data, embedding both into a
84single file. The Open vSwitch utility ``ovsdb-tool`` has commands
85that work with schema files and with the schemas embedded in database
86files.
87
88An Open vSwitch schema has three important identifiers. The first is its
89name, which is also the name used in JSON-RPC calls to identify a database
90based on that schema. For example, the schema used to configure Open
91vSwitch has the name ``Open_vSwitch``. Schema names begin with a
92letter or an underscore, followed by any number of letters, underscores, or
93digits. The ``ovsdb-tool`` commands ``schema-name`` and
94``db-name`` extract the schema name from a schema or database
95file, respectively.
96
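The naming rule above can be checked mechanically. The following is a minimal sketch, not part of Open vSwitch: the function name is hypothetical, and it assumes that "letters" means ASCII letters.

```python
import re

# Rule from the text: a letter or underscore, followed by any number of
# letters, underscores, or digits. Restricting to ASCII is an assumption.
_SCHEMA_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_schema_name(name: str) -> bool:
    """Return True if `name` follows the schema naming rule described above."""
    return _SCHEMA_NAME_RE.fullmatch(name) is not None


print(is_valid_schema_name("Open_vSwitch"))
print(is_valid_schema_name("9lives"))
```

Using ``fullmatch`` rather than ``match`` ensures that trailing characters such as ``-`` are rejected instead of silently ignored.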
97An OVSDB schema also has a version of the form ``x.y.z``, e.g. ``1.2.3``.
98Schemas managed within the Open vSwitch project manage version numbering in the
99following way (but OVSDB does not mandate this approach). Whenever we change
100the database schema in a non-backward compatible way (e.g. when we delete a
101column or a table), we increment <x> and set <y> and <z> to 0. When we change
102the database schema in a backward compatible way (e.g. when we add a new
103column), we increment <y> and set <z> to 0. When we change the database schema
104cosmetically (e.g. we reindent its syntax), we increment <z>. The
105``ovsdb-tool`` commands ``schema-version`` and ``db-version`` extract the
106schema version from a schema or database file, respectively.
107
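The version numbering policy described above can be written down as a small function. This is an illustrative sketch only: the function name and the three change labels are hypothetical, and OVSDB itself does not enforce this policy.

```python
def bump_schema_version(version: str, change: str) -> str:
    """Apply the x.y.z policy described above to a schema version string.

    `change` is one of the hypothetical labels "incompatible",
    "compatible", or "cosmetic".
    """
    x, y, z = (int(part) for part in version.split("."))
    if change == "incompatible":
        # Non-backward-compatible change: increment x, reset y and z.
        return f"{x + 1}.0.0"
    if change == "compatible":
        # Backward-compatible change: increment y, reset z.
        return f"{x}.{y + 1}.0"
    if change == "cosmetic":
        # Cosmetic change: increment z only.
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(bump_schema_version("1.2.3", "compatible"))
```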
108Very old OVSDB schemas do not have a version, but RFC 7047 mandates it.
109
110An OVSDB schema optionally has a "checksum." RFC 7047 does not specify the use
111of the checksum and recommends that clients ignore it. Open vSwitch uses the
112checksum to remind developers to update the version: at build time, if the
113checksum embedded in the schema does not match the schema's content (ignoring
114the checksum field itself), the build fails with a recommendation to update
115the version and the checksum. Thus, a developer who changes the schema, but
116does not update the version, receives an automatic reminder. In practice this
117has been an effective way to ensure compliance with the version number policy.
118The ``ovsdb-tool`` commands ``schema-cksum`` and ``db-cksum`` extract the
119schema checksum from a schema or database file, respectively.
120
121Service Models
122==============
123
124OVSDB supports three service models for databases: **standalone**,
125**active-backup**, and **clustered**. The service models provide different
126compromises among consistency, availability, and partition tolerance. They
127also differ in the number of servers required and in terms of performance. The
128standalone and active-backup database service models share one on-disk format,
129and clustered databases use a different format, but the OVSDB programs work
130with both formats. ``ovsdb(5)`` documents these file formats.
131
132RFC 7047, which specifies the OVSDB protocol, does not mandate or specify
133any particular service model.
134
135The following sections describe the individual service models.
136
137Standalone Database Service Model
138---------------------------------
139
140A **standalone** database runs a single server. If the server stops running,
141the database becomes inaccessible, and if the server's storage is lost or
142corrupted, the database's content is lost. This service model is appropriate
143when the database controls a process or activity to which it is linked via
144"fate-sharing." For example, an OVSDB instance that controls an Open vSwitch
145virtual switch daemon, ``ovs-vswitchd``, is a standalone database because a
146server failure would take out both the database and the virtual switch.
147
148To set up a standalone database, use ``ovsdb-tool create`` to
149create a database file, then run ``ovsdb-server`` to start the
150database service.
151
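As a rough sketch of the two setup steps above, the following builds the command lines without running them. The file names are placeholders, the helper function is hypothetical, and the use of ``--remote`` with a ``punix:`` connection method is one possible way to make the server reachable; consult ``ovsdb-tool(1)`` and ``ovsdb-server(1)`` for the authoritative syntax.

```python
import shlex


def standalone_setup_commands(db_file, schema_file, socket_file):
    """Return argv lists for creating and then serving a standalone database."""
    return [
        # Step 1: create the database file from a schema file.
        ["ovsdb-tool", "create", db_file, schema_file],
        # Step 2: start the database service, listening on a Unix socket.
        ["ovsdb-server", db_file, f"--remote=punix:{socket_file}"],
    ]


# Placeholder paths, chosen only for illustration.
for argv in standalone_setup_commands("conf.db", "vswitch.ovsschema", "db.sock"):
    print(shlex.join(argv))
```

Returning argument lists instead of executing them keeps the sketch side-effect free; each list could be passed to ``subprocess.run`` on a host where the tools are installed.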
152To configure a client, such as ``ovs-vswitchd`` or ``ovs-vsctl``, to use a
153standalone database, configure the server to listen on a "connection method"
154that the client can reach, then point the client to that connection method.
155See `Connection Methods`_ below for information about connection methods.
156
157Active-Backup Database Service Model
158------------------------------------
159
160An **active-backup** database runs two servers (on different hosts). At any
161given time, one of the servers is designated with the **active** role and the
162other the **backup** role. An active server behaves just like a standalone
163server. A backup server makes an OVSDB connection to the active server and
164uses it to continuously replicate its content as it changes in real time.
165OVSDB clients can connect to either server but only the active server allows
166data modification or lock transactions.
167
168Setup for an active-backup database starts from a working standalone database
169service, which is initially the active server. On another node, to set up a
170backup server, create a database file with the same schema as the active
171server. The initial contents of the database file do not matter, as long as
172the schema is correct, so ``ovsdb-tool create`` will work, as will copying the
173database file from the active server. Then use
174``ovsdb-server --sync-from=<active>`` to start the backup server, where
175<active> is an OVSDB connection method (see `Connection Methods`_ below) that
176connects to the active server. At that point, the backup server will fetch a
177copy of the active database and keep it up-to-date until it is killed.
178
179When the active server in an active-backup server pair fails, an administrator
180can switch the backup server to an active role with the ``ovs-appctl`` command
181``ovsdb-server/disconnect-active-ovsdb-server``. Clients then have read/write
182access to the now-active server. Of course, administrators are slow to respond
183compared to software, so in practice external management software detects the
184active server's failure and changes the backup server's role. For example, the
185"Integration Guide for Centralized Control" in the OVN documentation describes
186how to use Pacemaker for this purpose in OVN.
187
188Suppose an active server fails and its backup is promoted to active. If the
189failed server is revived, it must be started as a backup server. Otherwise, if
190both servers are active, then they may start out of sync, if the database
191changed while the server was down, and they will continue to diverge over time.
192This also happens if the software managing the database servers cannot reach
193the active server and therefore switches the backup to active, but other hosts
194can reach both servers. These "split-brain" problems are unsolvable in general
195for server pairs.
196
197Compared to a standalone server, the active-backup service model
198somewhat increases availability, at a risk of split-brain. It adds
199generally insignificant performance overhead. On the other hand, the
200clustered service model, discussed below, requires at least 3 servers
201and has greater performance overhead, but it avoids the need for
202external management software and eliminates the possibility of
203split-brain.
204
205Open vSwitch 2.6 introduced support for the active-backup service model.
206
207.. important::
208
209 The database file format changed in version 2.15. To upgrade or downgrade
210 ``ovsdb-server`` processes across this version, follow the instructions
211 described under
212 `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
213 `Downgrading from version 2.15 and later to 2.14 and earlier`_.
214
215Clustered Database Service Model
216--------------------------------
217
218A **clustered** database runs across 3 or 5 or more database servers (the
219**cluster**) on different hosts. Servers in a cluster automatically
220synchronize writes within the cluster. A 3-server cluster can remain available
221in the face of at most 1 server failure; a 5-server cluster tolerates up to 2
222failures. Clusters larger than 5 servers will also work, with every 2 added
223servers allowing the cluster to tolerate 1 more failure, but write performance
224decreases. The number of servers should be odd: a 4- or 6-server cluster
225cannot tolerate more failures than a 3- or 5-server cluster, respectively.
226
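The failure-tolerance figures above follow from the requirement that a majority of servers remain available. A minimal sketch of the arithmetic (the function name is hypothetical):

```python
def tolerated_failures(servers: int) -> int:
    """Return how many server failures a cluster of this size can survive.

    A majority must stay up, so the cluster tolerates (servers - 1) // 2
    failures.
    """
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return (servers - 1) // 2


for size in (3, 4, 5, 6, 7):
    print(size, tolerated_failures(size))
```

The integer division is why an even-sized cluster gains nothing over the next smaller odd size: 4 servers tolerate 1 failure, the same as 3.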
227To set up a clustered database, first initialize it on a single node by running
228``ovsdb-tool create-cluster``, then start ``ovsdb-server``. Depending on its
229arguments, the ``create-cluster`` command can create an empty database or copy
230a standalone database's contents into the new database.
231
232To configure a client to use a clustered database, first configure all of the
233servers to listen on a connection method that the client can reach, then point
234the client to all of the servers' connection methods, comma-separated. See
235`Connection Methods`_, below, for more detail.
236
237Open vSwitch 2.9 introduced support for the clustered service model.
238
239How to Maintain a Clustered Database
240~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
241
242To add a server to a cluster, run ``ovsdb-tool join-cluster`` on the new server
243and start ``ovsdb-server``. To remove a running server from a cluster, use
244``ovs-appctl`` to invoke the ``cluster/leave`` command. When a server fails
245and cannot be recovered, e.g. because its hard disk crashed, or to otherwise
246remove a server that is down from a cluster, use ``ovs-appctl`` to invoke
247``cluster/kick`` to make the remaining servers kick it out of the cluster.
248
249The above methods for adding and removing servers only work for healthy
250clusters, that is, for clusters with no more failures than their maximum
251tolerance. For example, in a 3-server cluster, the failure of 2 servers
252prevents servers joining or leaving the cluster (as well as database access).
253To prevent data loss or inconsistency, the preferred solution to this problem
254is to bring up enough of the failed servers to make the cluster healthy again,
255then if necessary remove any remaining failed servers and add new ones. If
256this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
257--force`` on a running server. This command forces the server to which it is
258directed to leave its cluster and form a new single-node cluster that contains
259only itself. The data in the new cluster may be inconsistent with the former
260cluster: transactions not yet replicated to the server will be lost, and
261transactions not yet applied to the cluster may be committed. Afterward, any
262servers in its former cluster will regard the server as having failed.
263
264Once a server leaves a cluster, it may never rejoin it. Instead, create a new
265server and join it to the cluster.
266
267The servers in a cluster synchronize data over a cluster management protocol
268that is specific to Open vSwitch; it is not the same as the OVSDB protocol
269specified in RFC 7047. For this purpose, a server in a cluster is tied to a
270particular IP address and TCP port, which is specified in the ``ovsdb-tool``
271command that creates or joins the cluster. The TCP port used for clustering
272must be different from that used for OVSDB clients. To change the port or
273address of a server in a cluster, first remove it from the cluster, then add it
274back with the new address.
275
276To upgrade the ``ovsdb-server`` processes in a cluster from one version of Open
277vSwitch to another, upgrading them one at a time will keep the cluster healthy
278during the upgrade process. (This is different from upgrading a database
279schema, which is covered later under `Upgrading or Downgrading a Database`_.)
280
281.. important::
282
283 The database file format changed in version 2.15. To upgrade or downgrade
284 ``ovsdb-server`` processes across this version, follow the instructions
285 described under
286 `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
287 `Downgrading from version 2.15 and later to 2.14 and earlier`_.
288
289Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
290``ovsdb-tool`` and ``ovsdb-client`` change ephemeral columns into persistent
291ones when they work with schemas for clustered databases. Future versions of
292OVSDB might add support for this feature.
293
294Upgrading from version 2.14 and earlier to 2.15 and later
295~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
296
297The database file format changed in version 2.15, so older versions of
298``ovsdb-server`` cannot read a database file that has been modified by
299``ovsdb-server`` version 2.15 or later. This also affects runtime
300communication between servers in the **active-backup** and **clustered** service
301models. To upgrade ``ovsdb-server`` processes from Open vSwitch 2.14 or
302earlier to 2.15 or later, follow the instructions below. (This is
303different from upgrading a database schema, which is
304covered later under `Upgrading or Downgrading a Database`_.)
305
306The **standalone** service model requires no special handling during the
307upgrade.
308
309For the **active-backup** service model, upgrade the backup
310``ovsdb-server`` first and the active one after that, or shut down both servers
311and upgrade them at the same time.
312
313For the **clustered** service model, the recommended upgrade strategy is:
314
3151. Upgrade processes one at a time. After its upgrade, start each
316 ``ovsdb-server`` process with the ``--disable-file-column-diff`` command-line
317 argument.
318
3192. When all ``ovsdb-server`` processes have been upgraded, use ``ovs-appctl``
320 to invoke the ``ovsdb/file/column-diff-enable`` command on each of them, or
321 restart all ``ovsdb-server`` processes one at a time without the
322 ``--disable-file-column-diff`` command-line option.
323
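The ordering constraint in the two steps above (every server is upgraded with column diffs disabled before any server enables them) can be sketched as a plan generator. This is an illustration only: the function and the server labels are hypothetical, and the plan is a list of human-readable steps, not commands to execute verbatim.

```python
def cluster_upgrade_plan(servers):
    """Return ordered steps for upgrading a cluster across version 2.15."""
    plan = []
    # Phase 1: upgrade one server at a time, keeping column diffs disabled
    # so that servers not yet upgraded can still read what is written.
    for server in servers:
        plan.append(f"{server}: upgrade, then start ovsdb-server "
                    "with --disable-file-column-diff")
    # Phase 2: only after every server is upgraded, enable column diffs.
    for server in servers:
        plan.append(f"{server}: use ovs-appctl to invoke "
                    "ovsdb/file/column-diff-enable")
    return plan


for step in cluster_upgrade_plan(["server-1", "server-2", "server-3"]):
    print(step)
```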
324Downgrading from version 2.15 and later to 2.14 and earlier
325~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
326
327Similar to upgrading covered under `Upgrading from version 2.14 and earlier to
3282.15 and later`_, downgrading from the ``ovsdb-server`` version 2.15 and later
329to 2.14 and earlier requires additional steps. (This is different from
330upgrading a database schema, which is covered later under
331`Upgrading or Downgrading a Database`_.)
332
333For all service models it's required to:
334
3351. Stop all ``ovsdb-server`` processes (a single process for the **standalone**
336 service model, all involved processes for the **active-backup** and
337 **clustered** service models).
338
3392. Compact all database files with the ``ovsdb-tool compact`` command.
340
3413. Downgrade and restart ``ovsdb-server`` processes.
342
343Understanding Cluster Consistency
344~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
345
346To ensure consistency, clustered OVSDB uses the Raft algorithm described in
347Diego Ongaro's Ph.D. thesis, "Consensus: Bridging Theory and Practice". In an
348operational Raft cluster, at any given time a single server is the "leader" and
349the other nodes are "followers". Only the leader processes transactions, but a
350transaction is only committed when a majority of the servers confirm to the
351leader that they have written it to persistent storage.
352
353In most database systems, read and write access to the database happens through
354transactions. In such a system, Raft allows a cluster to present a strongly
355consistent transactional interface. OVSDB uses conventional transactions for
356writes, but clients often effectively do reads a different way, by asking the
357server to "monitor" a database or a subset of one on the client's behalf.
358Whenever monitored data changes, the server automatically tells the client what
359changed, which allows the client to maintain an accurate snapshot of the
360database in its memory. Of course, at any given time, the snapshot may be
361somewhat dated since some of it could have changed without the change
362notification yet being received and processed by the client.
363
364Given this unconventional usage model, OVSDB also adopts an unconventional
365clustering model. Each server in a cluster acts independently for the purpose
366of monitors and read-only transactions, without verifying that data is
367up-to-date with the leader. Servers forward transactions that write to the
368database to the leader for execution, ensuring consistency. This has the
369following consequences:
370
371* Transactions that involve writes, against any server in the cluster, are
372 linearizable if clients take care to use correct prerequisites, which is the
373 same condition required for linearizability in a standalone OVSDB.
374 (Actually, "at-least-once" consistency, because OVSDB does not have a session
375 mechanism to drop duplicate transactions if a connection drops after the
376 server commits it but before the client receives the result.)
377
378* Read-only transactions can yield results based on a stale version of the
379 database, if they are executed against a follower. Transactions on the
380 leader always yield fresh results. (With monitors, as explained above, a
381 client can always see stale data even without clustering, so clustering does
382 not change the consistency model for monitors.)
383
384* Monitor-based (or read-heavy) workloads scale well across a cluster, because
385 clustering OVSDB adds no additional work or communication for reads and
386 monitors.
387
388* A write-heavy client should connect to the leader, to avoid the overhead of
389 followers forwarding transactions to the leader.
390
391* When a client conducts a mix of read and write transactions across more than
392 one server in a cluster, it can see inconsistent results because a read
393 transaction might read stale data whose updates have not yet propagated from
394 the leader. By default, utilities such as ``ovn-sbctl`` (in OVN) connect to
395 the cluster leader to avoid this issue.
396
397 The same might occur for transactions against a single follower except that
398 the OVSDB server ensures that the results of a write forwarded to the leader
399 by a given server are visible at that server before it replies to the
400 requesting client.
401
402* If a client uses a database on one server in a cluster, then another server
403 in the cluster (perhaps because the first server failed), the client could
404 observe stale data. Clustered OVSDB clients, however, can use a column in
405 the ``_Server`` database to detect that data on a server is older than data
406 that the client previously read. The OVSDB client library in Open vSwitch
407 uses this feature to avoid servers with stale data.
408
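The last consequence above, detecting that a server's data is older than data the client has already read, can be sketched as follows. This is not the logic of the Open vSwitch client library: the class is hypothetical, and it assumes the relevant ``_Server`` column can be read as a monotonically increasing integer, which the text above does not specify.

```python
class StalenessGuard:
    """Track the newest data position a client has seen across servers."""

    def __init__(self):
        # None until the client has read from any server.
        self.newest_seen = None

    def accept(self, server_position: int) -> bool:
        """Return False if a server is behind data the client already read."""
        if self.newest_seen is not None and server_position < self.newest_seen:
            return False
        self.newest_seen = server_position
        return True


guard = StalenessGuard()
print(guard.accept(10))  # first server, nothing to compare against
print(guard.accept(8))   # second server is behind what was already read
```

A client that gets a ``False`` result would typically try another server in the cluster instead of using the stale one.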
409Database Replication
410====================
411
412OVSDB can layer **replication** on top of any of its service models.
413Replication, in this context, means to make, and keep up-to-date, a read-only
414copy of the contents of a database (the ``replica``). One use of replication
415is to keep an up-to-date backup of a database. A replica used solely for
416backup would not need to support clients of its own. A set of replicas that do
417serve clients could be used to scale out read access to the primary database.
418
419A database replica is set up in the same way as a backup server in an
420active-backup pair, with the difference that the replica is never promoted to
421an active role.
422
423A database can have multiple replicas.
424
425Open vSwitch 2.6 introduced support for database replication.
426
427Connection Methods
428==================
429
430An OVSDB **connection method** is a string that specifies how to make a
431JSON-RPC connection between an OVSDB client and server. Connection methods are
432part of the Open vSwitch implementation of OVSDB and not specified by RFC 7047.
433``ovsdb-server`` uses connection methods to specify how it should listen for
434connections from clients and ``ovsdb-client`` uses them to specify how it
435should connect to a server. Connections in the opposite direction, where
436``ovsdb-server`` connects to a client that is configured to listen for an
437incoming connection, are also possible.
438
439Connection methods are classified as **active** or **passive**. An active
440connection method makes an outgoing connection to a remote host; a passive
441connection method listens for connections from remote hosts. The most common
442arrangement is to configure an OVSDB server with passive connection methods and
443clients with active ones, but the OVSDB implementation in Open vSwitch supports
444the opposite arrangement as well.
445
446OVSDB supports the following active connection methods:
447
448ssl:<host>:<port>
449 The specified SSL or TLS <port> on the given <host>.
450
451tcp:<host>:<port>
452 The specified TCP <port> on the given <host>.
453
454unix:<file>
455 On Unix-like systems, connect to the Unix domain server socket named
456 <file>.
457
458 On Windows, connect to a local named pipe that is represented by a file
459 created in the path <file> to mimic the behavior of a Unix domain socket.
460
461<method1>,<method2>,...,<methodN>
462 For a clustered database service to be highly available, a client must be
463 able to connect to any of the servers in the cluster. To do so, specify
464 connection methods for each of the servers separated by commas (and
465 optional spaces).
466
467 In theory, if machines go up and down and IP addresses change in the right
468 way, a client could talk to the wrong instance of a database. To avoid
469 this possibility, add ``cid:<uuid>`` to the list of methods, where <uuid>
470 is the cluster ID of the desired database cluster, as printed by
471 ``ovsdb-tool db-cid``. This feature is optional.
472
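A comma-separated connection string for a clustered database, as described above, can be assembled like this. The helper is hypothetical, the addresses and UUID are placeholders, and placing ``cid:<uuid>`` last is an arbitrary choice; the text above says only that it is added to the list.

```python
def cluster_remote(methods, cluster_id=None):
    """Join per-server connection methods into one comma-separated string."""
    parts = list(methods)
    if cluster_id is not None:
        # Optional guard against talking to the wrong database cluster.
        parts.append(f"cid:{cluster_id}")
    return ",".join(parts)


# Placeholder addresses from the documentation range 192.0.2.0/24.
servers = ["tcp:192.0.2.1:6640", "tcp:192.0.2.2:6640", "tcp:192.0.2.3:6640"]
print(cluster_remote(servers))
print(cluster_remote(servers, "00000000-0000-0000-0000-000000000000"))
```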
473OVSDB supports the following passive connection methods:
474
475pssl:<port>[:<ip>]
476 Listen on the given TCP <port> for SSL or TLS connections. By default,
477 connections are not bound to a particular local IP address. Specifying
478 <ip> limits connections to those arriving at the given local IP address.
479
480ptcp:<port>[:<ip>]
481 Listen on the given TCP <port>. By default, connections are not bound to a
482 particular local IP address. Specifying <ip> limits connections to those
483 arriving at the given local IP address.
484
485punix:<file>
486 On Unix-like systems, listens for connections on the Unix domain socket
487 named <file>.
488
489 On Windows, listens on a local named pipe, creating a named pipe
490 <file> to mimic the behavior of a Unix domain socket. The ACLs of the named
491 pipe include LocalSystem, Administrators, and Creator Owner.
492
493All IP-based connection methods accept IPv4 and IPv6 addresses. To specify an
494IPv6 address, wrap it in square brackets, e.g. ``ssl:[::1]:6640``. Passive
495IP-based connection methods by default listen for IPv4 connections only; use
496``[::]`` as the address to accept both IPv4 and IPv6 connections,
497e.g. ``pssl:6640:[::]``. DNS names are also accepted if Open vSwitch is built
498with the unbound library. On Linux, use ``%<device>`` to designate a scope for
499IPv6 link-local addresses, e.g. ``ssl:[fe80::1234%eth0]:6653``.
500
501The <port> may be omitted from connection methods that use a port number. The
502default <port> for TCP-based connection methods is 6640, e.g. ``pssl:`` is
503equivalent to ``pssl:6640``. In Open vSwitch prior to version 2.4.0, the
504default port was 6632. To avoid incompatibility between older and newer
505versions, we encourage users to specify a port number.
506
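The bracketing and default-port rules above can be illustrated with two small formatters. These helpers are hypothetical and do no address validation; they treat any host containing a colon as an IPv6 address, which is an assumption made for brevity.

```python
DEFAULT_PORT = 6640  # default for TCP-based methods, as stated above


def _bracket(host: str) -> str:
    """Wrap IPv6 addresses (anything containing ':') in square brackets."""
    return f"[{host}]" if ":" in host else host


def active_method(protocol: str, host: str, port: int = DEFAULT_PORT) -> str:
    """Format a tcp: or ssl: connection method with an explicit port."""
    if protocol not in ("tcp", "ssl"):
        raise ValueError(f"not an active IP-based method: {protocol!r}")
    return f"{protocol}:{_bracket(host)}:{port}"


def passive_method(protocol: str, port: int = DEFAULT_PORT, ip=None) -> str:
    """Format a ptcp: or pssl: connection method, optionally bound to an IP."""
    if protocol not in ("ptcp", "pssl"):
        raise ValueError(f"not a passive IP-based method: {protocol!r}")
    method = f"{protocol}:{port}"
    if ip is not None:
        method += f":{_bracket(ip)}"
    return method


print(active_method("ssl", "::1"))
print(passive_method("pssl", ip="::"))
```

Always emitting the port, even when it equals the default, follows the recommendation above to avoid incompatibility between older and newer versions.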
507The ``ssl`` and ``pssl`` connection methods require additional configuration
508through ``--private-key``, ``--certificate``, and ``--ca-cert`` command line
509options. Open vSwitch can be built without SSL support, in which case these
510connection methods are not supported.
511
512Database Life Cycle
513===================
514
515This section describes how to handle various events in the life cycle of
516a database using the Open vSwitch implementation of OVSDB.
517
518Creating a Database
519-------------------
520
521Creating and starting up the service for a new database was covered
522separately for each database service model in the `Service
523Models`_ section, above.
524
525Backing Up and Restoring a Database
526-----------------------------------
527
528OVSDB is often used in contexts where the database contents are not
529particularly valuable. For example, in many systems, the database for
530configuring ``ovs-vswitchd`` is essentially rebuilt from scratch
531at boot time. It is not worthwhile to back up these databases.
532
533When OVSDB is used for valuable data, a backup strategy is worth
534considering. One way is to use database replication, discussed above in
535`Database Replication`_, which keeps an online, up-to-date
536copy of a database, possibly on a remote system. This works with all OVSDB
537service models.
538
539A more common backup strategy is to periodically take and store a snapshot.
540For the standalone and active-backup service models, making a copy of the
541database file, e.g. using ``cp``, effectively makes a snapshot, and because
542OVSDB database files are append-only, it works even if the database is being
543modified when the snapshot takes place. This approach does not work for
544clustered databases.
545
546Another way to make a backup, which works with all OVSDB service models, is to
547use ``ovsdb-client backup``, which connects to a running database server and
548outputs an atomic snapshot of its schema and content, in the same format used
549for standalone and active-backup databases.
550
551Multiple options are also available when the time comes to restore a database
552from a backup. For the standalone and active-backup service models, one option
553is to stop the database server or servers, overwrite the database file with the
554backup (e.g. with ``cp``), and then restart the servers. Another way, which
555works with any service model, is to use ``ovsdb-client restore``, which
556connects to a running database server and replaces the data in one of its
557databases with a provided snapshot. The advantage of ``ovsdb-client restore`` is
558that it causes zero downtime for the database and its server. It has the
559downside that UUIDs of rows in the restored database will differ from those in
560the snapshot, because the OVSDB protocol does not allow clients to specify row
561UUIDs.
562
563None of these approaches saves and restores data in columns that the schema
564designates as ephemeral. This is by design: the designer of a schema only
565marks a column as ephemeral if it is acceptable for its data to be lost
566when a database server restarts.
567
568Clustering and backup serve different purposes. Clustering increases
569availability, but it does not protect against data loss if, for example, a
570malicious or malfunctioning OVSDB client deletes or tampers with data.
571
572Changing Database Service Model
573-------------------------------
574
575Use ``ovsdb-tool create-cluster`` to create a clustered database from the
576contents of a standalone database. Use ``ovsdb-client backup`` to create a
577standalone database from the contents of a running clustered database.
578When the cluster is down and cannot be revived, ``ovsdb-client backup`` will
579not work.
580
581Use ``ovsdb-tool cluster-to-standalone`` to convert clustered database to
582standalone database when the cluster is down and cannot be revived.
583
12b84d50
BP
584Upgrading or Downgrading a Database
585-----------------------------------
586
587The evolution of a piece of software can require changes to the schemas of the
588databases that it uses. For example, new features might require new tables or
589new columns in existing tables, or conceptual changes might require a database
590to be reorganized in other ways. In some cases, the easiest way to deal with a
591change in a database schema is to delete the existing database and start fresh
592with the new schema, especially if the data in the database is easy to
593reconstruct. But in many other cases, it is better to convert the database
594from one schema to another.
595
596The OVSDB implementation in Open vSwitch has built-in support for some simple
597cases of converting a database from one schema to another. This support can
598handle changes that add or remove database columns or tables or that eliminate
599constraints (for example, changing a column that must have exactly one value
600into one that has one or more values). It can also handle changes that add
601constraints or make them stricter, but only if the existing data in the
602database satisfies the new constraints (for example, changing a column that has
603one or more values into a column with exactly one value, if every row in the
604column has exactly one value). The built-in conversion can cause data loss in
605obvious ways, for example if the new schema removes tables or columns, or
606indirectly, for example by deleting unreferenced rows in tables that the new
607schema marks for garbage collection.
608
609Converting a database can lose data, so it is wise to make a backup beforehand.
610
611To use OVSDB's built-in support for schema conversion with a standalone or
612active-backup database, first stop the database server or servers, then use
613``ovsdb-tool convert`` to convert it to the new schema, and then restart the
614database server.
615
1b1d2e6d
BP
616OVSDB also supports online database schema conversion for any of its database
617service models. To convert a database online, use ``ovsdb-client convert``.
53178986
BP
618The conversion is atomic, consistent, isolated, and durable. ``ovsdb-server``
619disconnects any clients connected when the conversion takes place (except
620clients that use the ``set_db_change_aware`` Open vSwitch extension RPC). Upon
621reconnection, clients will discover that the schema has changed.
622
12b84d50
BP
623Schema versions and checksums (see Schemas_ above) can give hints about whether
624a database needs to be converted to a new schema. If there is any question,
53178986
BP
625though, the ``needs-conversion`` command on ``ovsdb-tool`` and ``ovsdb-client``
626can provide a definitive answer.
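
The check-then-convert flow for an online database might be scripted roughly
as follows. This is a sketch under assumptions: it relies on ``ovsdb-client
needs-conversion`` printing ``yes`` or ``no`` on standard output, and the
function names, server address, and schema path are placeholders chosen for
illustration.

```python
import subprocess


def parse_needs_conversion(output):
    """Interpret the assumed "yes"/"no" answer printed by needs-conversion."""
    answer = output.strip()
    if answer not in ("yes", "no"):
        raise ValueError(f"unexpected needs-conversion output: {output!r}")
    return answer == "yes"


def convert_if_needed(server, schema_path):
    """Convert the database on `server` online, but only if the schema differs.

    Returns True if a conversion was performed, False otherwise.
    """
    check = subprocess.run(
        ["ovsdb-client", "needs-conversion", server, schema_path],
        capture_output=True, text=True, check=True)
    if not parse_needs_conversion(check.stdout):
        return False
    subprocess.run(["ovsdb-client", "convert", server, schema_path], check=True)
    return True
```

Because conversion can lose data, a script like this would normally take a
backup before calling ``convert_if_needed``.
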

Working with Database History
-----------------------------

Both on-disk database formats that OVSDB supports are organized as a stream of
transaction records. Each record describes a change to the database as a list
of rows that were inserted or deleted or modified, along with the details.
Therefore, in normal operation, a database file only grows, as each change
causes another record to be appended at the end. Usually, a user has no need
to understand this file structure. This section covers some exceptions.

Compacting Databases
--------------------

If OVSDB database files were truly append-only, then over time they would grow
without bound. To avoid this problem, OVSDB can **compact** a database file,
that is, replace it with a new version that contains only the current database
contents, as if it had been inserted by a single transaction. From time to
time, ``ovsdb-server`` automatically compacts a database that grows much larger
than its minimum size.

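
The idea behind compaction can be shown with a toy model. The sketch below
is not the ``ovsdb-server`` implementation and does not use the real record
format: it assumes, purely for illustration, that each transaction record is a
dictionary mapping a table name to row UUIDs, where a row maps to its changed
columns or to ``None`` when the row is deleted.

```python
def replay(records):
    """Apply simplified transaction records in order; return current contents."""
    db = {}
    for record in records:
        for table, rows in record.items():
            contents = db.setdefault(table, {})
            for uuid, columns in rows.items():
                if columns is None:
                    # Row deleted by this transaction.
                    contents.pop(uuid, None)
                else:
                    # Row inserted, or some of its columns modified.
                    contents.setdefault(uuid, {}).update(columns)
    return db


def compact(records):
    """Replace a log with one record that inserts the current contents."""
    return [replay(records)]


log = [
    {"Bridge": {"u1": {"name": "br0"}}},
    {"Bridge": {"u1": {"name": "br1"}, "u2": {"name": "br2"}}},
    {"Bridge": {"u2": None}},
]
compacted = compact(log)
```

Replaying ``compacted`` yields the same contents as replaying ``log``, but the
compacted log holds a single record instead of three.
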

Because ``ovsdb-server`` automatically compacts databases, it is usually not
necessary to compact them manually, but OVSDB still offers a few ways to do it.
First, ``ovsdb-tool compact`` can compact a standalone or active-backup
database that is not currently being served by ``ovsdb-server`` (or otherwise
locked for writing by another process). To compact any database that is
currently being served by ``ovsdb-server``, use ``ovs-appctl`` to send the
``ovsdb-server/compact`` command. Each server in an active-backup or clustered
database maintains its database file independently, so to compact all of them,
issue this command separately on each server.

Viewing History
---------------

The ``ovsdb-tool`` utility's ``show-log`` command displays the transaction
records in an OVSDB database file in a human-readable format. By default, it
shows minimal detail, but adding the option ``-m`` once or twice increases the
level of detail. In addition to the transaction data, it shows the time and
date of each transaction and any "comment" added to the transaction by the
client. The comments can be helpful for quickly understanding a transaction;
for example, ``ovs-vsctl`` adds its command line to the transactions that it
makes.

The ``show-log`` command works with both OVSDB file formats, but the details of
the output format differ. For active-backup and clustered databases, the
sequence of transactions in each server's log will differ, even at points when
they reflect the same data.

Truncating History
------------------

It may occasionally be useful to "roll back" a database file to an earlier
point. Because of the organization of OVSDB records, this is easy to do.
Start by noting the record number <i> of the first record to delete in
``ovsdb-tool show-log`` output. Each record is two lines of plain text, so
trimming the log is as simple as running ``head -n <j>``, where <j> = 2 * <i>.

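
The same arithmetic can be written out in Python. This sketch mirrors
``head -n <j>`` with <j> = 2 * <i>; the function names are made up for the
example, and the guard that refuses to drop the first record is an addition of
the sketch, not something the tools enforce. It writes to a new file so that
the original remains available, and it should only be used on a database file
that no server currently has open.

```python
def keep_before(lines, first_record_to_delete):
    """Return the lines of all records before `first_record_to_delete`.

    Each record occupies two lines, so keeping <i> records keeps 2 * <i> lines.
    """
    if first_record_to_delete < 1:
        raise ValueError("refusing to delete the first record of the file")
    return lines[:2 * first_record_to_delete]


def truncate_log(src_path, dst_path, first_record_to_delete):
    """Copy `src_path` to `dst_path`, dropping the unwanted trailing records."""
    with open(src_path, "rb") as src:
        lines = src.readlines()
    with open(dst_path, "wb") as dst:
        dst.writelines(keep_before(lines, first_record_to_delete))
```

For example, if ``show-log`` shows that record 5 is the first one to discard,
``truncate_log("conf.db", "conf.db.rolled-back", 5)`` keeps the first 10 lines.
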

Corruption
----------

When ``ovsdb-server`` opens an OVSDB database file, of any kind, it reads as
many transaction records as it can from the file until it reaches the end of
the file or it encounters a corrupted record. At that point it stops reading
and regards the data that it has read to this point as the full contents of the
database file, effectively rolling the database back to an earlier point.

Each transaction record contains an embedded SHA-1 checksum, which the server
verifies as it reads a database file. It detects corruption when a checksum
fails to verify. Even though SHA-1 is no longer considered secure for use in
cryptography, it is acceptable for this purpose because it is not used to
defend against malicious attackers.

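
The read-until-corruption behavior can be sketched in a few lines of Python.
The record layout used here is an assumption made for the example (a header
line holding a magic string, the payload length, and a hexadecimal SHA-1
digest, followed by the payload); consult ``ovsdb(5)`` for the actual file
format. The function names are likewise hypothetical.

```python
import hashlib

MAGIC = b"OVSDB JSON"  # assumed header prefix for this sketch


def compose_record(payload):
    """Build one record: a header line, then the payload plus a newline."""
    data = payload + b"\n"
    digest = hashlib.sha1(data).hexdigest().encode()
    return MAGIC + b" " + str(len(data)).encode() + b" " + digest + b"\n" + data


def read_good_records(blob):
    """Return (records, offset) for the longest verifiable prefix of `blob`.

    Reading stops at end of input or at the first record whose header is
    malformed or whose checksum fails to verify.  `offset` is the position
    just after the last good record.
    """
    records = []
    offset = 0
    while offset < len(blob):
        eol = blob.find(b"\n", offset)
        if eol < 0:
            break
        header = blob[offset:eol]
        if not header.startswith(MAGIC + b" "):
            break
        fields = header[len(MAGIC) + 1:].split(b" ")
        if len(fields) != 2 or not fields[0].isdigit():
            break
        length, digest = int(fields[0]), fields[1]
        data = blob[eol + 1:eol + 1 + length]
        if len(data) != length:
            break  # record cut short
        if hashlib.sha1(data).hexdigest().encode() != digest:
            break  # checksum mismatch: treat as corruption
        records.append(data)
        offset = eol + 1 + length
    return records, offset
```

Everything after the returned offset is ignored by this reader, which is the
"rolling back" effect described above.
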

The first record in a standalone or active-backup database file specifies the
schema. ``ovsdb-server`` will refuse to work with a database where this record
is corrupted, or with a clustered database file with corruption in the first
few records. Delete and recreate such a database, or restore it from a backup.

When ``ovsdb-server`` adds records to a database file in which it detected
corruption, it first truncates the file just after the last good record.

See Also
========

RFC 7047, "The Open vSwitch Database Management Protocol."

Open vSwitch implementations of generic OVSDB functionality:
``ovsdb-server(1)``, ``ovsdb-client(1)``, ``ovsdb-tool(1)``.

Tools for working with databases that have specific OVSDB schemas:
``ovs-vsctl(8)``, ``vtep-ctl(8)``, and (in OVN) ``ovn-nbctl(8)``,
``ovn-sbctl(8)``.

OVSDB schemas for Open vSwitch and related functionality:
``ovs-vswitchd.conf.db(5)``, ``vtep(5)``, and (in OVN) ``ovn-nb(5)``,
``ovn-sb(5)``.