..
      Copyright (c) 2017 Nicira, Inc.

      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

=====
ovsdb
=====

Description
===========

OVSDB, the Open vSwitch Database, is a network-accessible database system.
Schemas in OVSDB specify the tables in a database and their columns' types and
can include data, uniqueness, and referential integrity constraints.  OVSDB
offers atomic, consistent, isolated, durable transactions.  RFC 7047 specifies
the JSON-RPC based protocol that OVSDB clients and servers use to communicate.

The OVSDB protocol is well suited for state synchronization because it
allows each client to monitor the contents of a whole database or a subset
of it.  Whenever a monitored portion of the database changes, the server
tells the client what rows were added or modified (including the new
contents) or deleted.  Thus, OVSDB clients can easily keep track of the
newest contents of any part of the database.
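
As a rough illustration of what such a monitor request looks like on the wire,
the following Python sketch builds the JSON-RPC ``monitor`` message that
RFC 7047 defines.  The ``Open_vSwitch`` database with its ``Bridge`` table and
``name`` column is used only as an example; the helper name
``build_monitor_request``, the monitor identifier, and the request ``id`` are
arbitrary choices made for this sketch.

```python
import json


def build_monitor_request(db, monitor_id, tables, request_id=0):
    """Build an RFC 7047 "monitor" request as a Python dict.

    `tables` maps each table name to the list of columns to watch.
    """
    monitor_requests = {
        table: {"columns": list(columns)} for table, columns in tables.items()
    }
    return {
        "method": "monitor",
        "params": [db, monitor_id, monitor_requests],
        "id": request_id,
    }


# Example values only: watch the "name" column of the "Bridge" table.
request = build_monitor_request(
    "Open_vSwitch", "example-monitor", {"Bridge": ["name"]}
)
wire_message = json.dumps(request)
print(wire_message)
```

A real client would send ``wire_message`` over one of the connection methods
described later in this document and then process the update notifications
that the server sends back.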

While OVSDB is general-purpose and not particularly specialized for use with
Open vSwitch, Open vSwitch does use it for multiple purposes.  The leading use
of OVSDB is for configuring and monitoring ``ovs-vswitchd(8)``, the Open
vSwitch switch daemon, using the schema documented in
``ovs-vswitchd.conf.db(5)``.  The Open Virtual Network (OVN) sub-project of OVS
uses two OVSDB schemas, documented in ``ovn-nb(5)`` and ``ovn-sb(5)``.
Finally, Open vSwitch includes the "VTEP" schema, documented in
``vtep(5)``, which many third-party hardware switches support for
configuring VXLAN, although OVS itself does not directly use this schema.

The OVSDB protocol specification allows independent, interoperable
implementations of OVSDB to be developed.  Open vSwitch includes an OVSDB
server implementation named ``ovsdb-server(1)``, which supports several
protocol extensions documented in its manpage, and a basic command-line OVSDB
client named ``ovsdb-client(1)``, as well as OVSDB client libraries for C and
for Python.  Open vSwitch documentation often speaks of these OVSDB
implementations in Open vSwitch as simply "OVSDB," even though that is distinct
from the OVSDB protocol; we make the distinction explicit only when it might
otherwise be unclear from the context.

In addition to these generic OVSDB server and client tools, Open vSwitch
includes tools for working with databases that have specific schemas:
``ovs-vsctl`` works with the ``ovs-vswitchd`` configuration database,
``vtep-ctl`` works with the VTEP database, ``ovn-nbctl`` works with
the OVN Northbound database, and so on.

RFC 7047 specifies the OVSDB protocol, but it does not specify an on-disk
storage format.  Open vSwitch includes ``ovsdb-tool(1)`` for working with its
own on-disk database formats.  The most notable feature of these formats is
that ``ovsdb-tool(1)`` makes it easy for users to print the transactions that
have changed a database since the last time it was compacted.  This feature is
often useful for troubleshooting.

Schemas
=======

Schemas in OVSDB have a JSON format that is specified in RFC 7047.  They
are often stored in files with an extension ``.ovsschema``.  An
on-disk database in OVSDB includes a schema and data, embedding both into a
single file.  The Open vSwitch utility ``ovsdb-tool`` has commands
that work with schema files and with the schemas embedded in database
files.

An Open vSwitch schema has three important identifiers.  The first is its
name, which is also the name used in JSON-RPC calls to identify a database
based on that schema.  For example, the schema used to configure Open
vSwitch has the name ``Open_vSwitch``.  Schema names begin with a
letter or an underscore, followed by any number of letters, underscores, or
digits.  The ``ovsdb-tool`` commands ``schema-name`` and
``db-name`` extract the schema name from a schema or database
file, respectively.
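
The naming rule above can be expressed as a regular expression.  The following
Python sketch is only an illustration of that rule; the names
``SCHEMA_NAME_RE`` and ``is_valid_schema_name`` are made up for this example
and are not part of any Open vSwitch library.

```python
import re

# A letter or underscore, followed by any number of letters, underscores,
# or digits.
SCHEMA_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_schema_name(name):
    """Return True if `name` follows the schema naming rule."""
    return SCHEMA_NAME_RE.fullmatch(name) is not None


checks = {
    name: is_valid_schema_name(name)
    for name in ("Open_vSwitch", "_Server", "9lives", "has-dash", "")
}
print(checks)
```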

An OVSDB schema also has a version of the form ``<x>.<y>.<z>``, e.g.
``1.2.3``.  The Open vSwitch project manages version numbers for its own
schemas in the following way (but OVSDB does not mandate this approach).
Whenever we change the database schema in a non-backward compatible way (e.g.
when we delete a column or a table), we increment <x> and set <y> and <z> to 0.
When we change the database schema in a backward compatible way (e.g. when we
add a new column), we increment <y> and set <z> to 0.  When we change the
database schema cosmetically (e.g. when we reindent its syntax), we increment
<z>.  The ``ovsdb-tool`` commands ``schema-version`` and ``db-version`` extract
the schema version from a schema or database file, respectively.
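
The versioning policy can be summarized in a few lines of code.  This Python
sketch is purely illustrative: the function name ``bump_version`` and the
change labels ``"incompatible"``, ``"compatible"``, and ``"cosmetic"`` are
assumptions made for the example, not terms used by any OVSDB tool.

```python
def bump_version(version, change):
    """Return the next schema version under the x.y.z policy.

    `change` is "incompatible", "compatible", or "cosmetic".
    """
    x, y, z = (int(part) for part in version.split("."))
    if change == "incompatible":   # e.g. a column or table was deleted
        return f"{x + 1}.0.0"
    if change == "compatible":     # e.g. a new column was added
        return f"{x}.{y + 1}.0"
    if change == "cosmetic":       # e.g. the schema was reindented
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"unknown kind of change: {change!r}")


examples = [bump_version("1.2.3", kind)
            for kind in ("incompatible", "compatible", "cosmetic")]
print(examples)
```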

Very old OVSDB schemas do not have a version, but RFC 7047 mandates it.

An OVSDB schema optionally has a "checksum."  RFC 7047 does not specify the use
of the checksum and recommends that clients ignore it.  Open vSwitch uses the
checksum to remind developers to update the version: at build time, if the
schema's embedded checksum does not match the schema's content (computed
ignoring the checksum field itself), the build fails with a recommendation to
update the version and the checksum.  Thus, a developer who changes the schema,
but does not update the version, receives an automatic reminder.  In practice
this has been an effective way to ensure compliance with the version number
policy.  The ``ovsdb-tool`` commands ``schema-cksum`` and ``db-cksum`` extract
the schema checksum from a schema or database file, respectively.

Service Models
==============

OVSDB supports three service models for databases: **standalone**,
**active-backup**, and **clustered**.  The service models provide different
compromises among consistency, availability, and partition tolerance.  They
also differ in the number of servers required and in terms of performance.  The
standalone and active-backup database service models share one on-disk format,
and clustered databases use a different format, but the OVSDB programs work
with both formats.  ``ovsdb(5)`` documents these file formats.

RFC 7047, which specifies the OVSDB protocol, does not mandate or specify
any particular service model.

The following sections describe the individual service models.

Standalone Database Service Model
---------------------------------

A **standalone** database runs a single server.  If the server stops running,
the database becomes inaccessible, and if the server's storage is lost or
corrupted, the database's content is lost.  This service model is appropriate
when the database controls a process or activity to which it is linked via
"fate-sharing."  For example, an OVSDB instance that controls an Open vSwitch
virtual switch daemon, ``ovs-vswitchd``, is a standalone database because a
server failure would take out both the database and the virtual switch.

To set up a standalone database, use ``ovsdb-tool create`` to
create a database file, then run ``ovsdb-server`` to start the
database service.

To configure a client, such as ``ovs-vswitchd`` or ``ovs-vsctl``, to use a
standalone database, configure the server to listen on a "connection method"
that the client can reach, then point the client to that connection method.
See `Connection Methods`_ below for information about connection methods.

Active-Backup Database Service Model
------------------------------------

An **active-backup** database runs two servers (on different hosts).  At any
given time, one of the servers is designated with the **active** role and the
other the **backup** role.  An active server behaves just like a standalone
server.  A backup server makes an OVSDB connection to the active server and
uses it to continuously replicate its content as it changes in real time.
OVSDB clients can connect to either server but only the active server allows
data modification or lock transactions.

Setup for an active-backup database starts from a working standalone database
service, which is initially the active server.  On another node, to set up a
backup server, create a database file with the same schema as the active
server.  The initial contents of the database file do not matter, as long as
the schema is correct, so ``ovsdb-tool create`` will work, as will copying the
database file from the active server.  Then use
``ovsdb-server --sync-from=<active>`` to start the backup server, where
<active> is an OVSDB connection method (see `Connection Methods`_ below) that
connects to the active server.  At that point, the backup server will fetch a
copy of the active database and keep it up-to-date until it is killed.
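
The backup-server setup just described amounts to two commands.  The Python
sketch below only assembles and prints them; it does not run anything.  The
file paths, the address of the active server, and the helper name
``backup_server_commands`` are hypothetical, and the ``--remote`` option is
included on the assumption that the backup should also accept client
connections.

```python
import shlex

# Hypothetical paths and addresses, chosen only for this example.
DB_FILE = "/var/lib/example/backup.db"
SCHEMA_FILE = "/usr/share/example/example.ovsschema"
ACTIVE = "tcp:192.0.2.10:6640"


def backup_server_commands(db_file, schema_file, active, listen="ptcp:6640"):
    """Return the argv lists that create and start a backup server."""
    # Step 1: create a database file with the same schema as the active server.
    create = ["ovsdb-tool", "create", db_file, schema_file]
    # Step 2: serve the file while replicating from the active server.
    serve = [
        "ovsdb-server",
        db_file,
        f"--remote={listen}",
        f"--sync-from={active}",
    ]
    return [create, serve]


commands = backup_server_commands(DB_FILE, SCHEMA_FILE, ACTIVE)
for argv in commands:
    print(shlex.join(argv))
```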

When the active server in an active-backup server pair fails, an administrator
can switch the backup server to an active role with the ``ovs-appctl`` command
``ovsdb-server/disconnect-active-ovsdb-server``.  Clients then have read/write
access to the now-active server.  Of course, administrators are slow to respond
compared to software, so in practice external management software detects the
active server's failure and changes the backup server's role.  For example, the
"Integration Guide for Centralized Control" in the Open vSwitch documentation
describes how to use Pacemaker for this purpose in OVN.

Suppose an active server fails and its backup is promoted to active.  If the
failed server is revived, it must be started as a backup server.  Otherwise, if
both servers are active, then they may start out of sync, if the database
changed while the server was down, and they will continue to diverge over time.
This also happens if the software managing the database servers cannot reach
the active server and therefore switches the backup to active, but other hosts
can reach both servers.  These "split-brain" problems are unsolvable in general
for server pairs.

Compared to a standalone server, the active-backup service model
somewhat increases availability, at a risk of split-brain.  It adds
generally insignificant performance overhead.  On the other hand, the
clustered service model, discussed below, requires at least 3 servers
and has greater performance overhead, but it avoids the need for
external management software and eliminates the possibility of
split-brain.

Open vSwitch 2.6 introduced support for the active-backup service model.

Clustered Database Service Model
--------------------------------

A **clustered** database runs across 3 or 5 or more database servers (the
**cluster**) on different hosts.  Servers in a cluster automatically
synchronize writes within the cluster.  A 3-server cluster can remain available
in the face of at most 1 server failure; a 5-server cluster tolerates up to 2
failures.  Clusters larger than 5 servers will also work, with every 2 added
servers allowing the cluster to tolerate 1 more failure, but write performance
decreases.  The number of servers should be odd: a 4- or 6-server cluster
cannot tolerate more failures than a 3- or 5-server cluster, respectively.
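
The failure-tolerance figures above follow from simple arithmetic, on the
assumption that the cluster stays available only while a majority of its
servers can still communicate (see `Understanding Cluster Consistency`_
below).  The function names in this Python sketch are chosen for illustration
only.

```python
def majority(servers):
    """Smallest number of servers that forms a majority of the cluster."""
    return servers // 2 + 1


def tolerated_failures(servers):
    """Most servers that can fail while a majority remains."""
    return servers - majority(servers)


tolerance = {n: tolerated_failures(n) for n in range(3, 8)}
print(tolerance)
```

Under this assumption, a 4-server cluster tolerates the same single failure as
a 3-server cluster, and a 6-server cluster the same two failures as a 5-server
cluster, which is why odd sizes are recommended.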

To set up a clustered database, first initialize it on a single node by running
``ovsdb-tool create-cluster``, then start ``ovsdb-server``.  Depending on its
arguments, the ``create-cluster`` command can create an empty database or copy
a standalone database's contents into the new database.

To configure a client, such as ``ovn-controller`` or ``ovn-sbctl``, to use a
clustered database, first configure all of the servers to listen on a
connection method that the client can reach, then point the client to all of
the servers' connection methods, comma-separated.  See `Connection Methods`_,
below, for more detail.

Open vSwitch 2.9 introduced support for the clustered service model.

How to Maintain a Clustered Database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add a server to a cluster, run ``ovsdb-tool join-cluster`` on the new server
and start ``ovsdb-server``.  To remove a running server from a cluster, use
``ovs-appctl`` to invoke the ``cluster/leave`` command.  When a server fails
and cannot be recovered, e.g. because its hard disk crashed, or to otherwise
remove a server that is down from a cluster, use ``ovs-appctl`` to invoke
``cluster/kick`` to make the remaining servers kick it out of the cluster.

The above methods for adding and removing servers only work for healthy
clusters, that is, for clusters with no more failures than their maximum
tolerance.  For example, in a 3-server cluster, the failure of 2 servers
prevents servers joining or leaving the cluster (as well as database access).
To prevent data loss or inconsistency, the preferred solution to this problem
is to bring up enough of the failed servers to make the cluster healthy again,
then if necessary remove any remaining failed servers and add new ones.  If
this cannot be done, though, use ``ovs-appctl`` to invoke ``cluster/leave
--force`` on a running server.  This command forces the server to which it is
directed to leave its cluster and form a new single-node cluster that contains
only itself.  The data in the new cluster may be inconsistent with the former
cluster: transactions not yet replicated to the server will be lost, and
transactions not yet applied to the cluster may be committed.  Afterward, any
servers in its former cluster will regard the server as having failed.

Once a server leaves a cluster, it may never rejoin it.  Instead, create a new
server and join it to the cluster.

The servers in a cluster synchronize data over a cluster management protocol
that is specific to Open vSwitch; it is not the same as the OVSDB protocol
specified in RFC 7047.  For this purpose, a server in a cluster is tied to a
particular IP address and TCP port, which is specified in the ``ovsdb-tool``
command that creates or joins the cluster.  The TCP port used for clustering
must be different from that used for OVSDB clients.  To change the port or
address of a server in a cluster, first remove it from the cluster, then add it
back with the new address.

To upgrade the ``ovsdb-server`` processes in a cluster from one version of Open
vSwitch to another, upgrading them one at a time will keep the cluster healthy
during the upgrade process.  (This is different from upgrading a database
schema, which is covered later under `Upgrading or Downgrading a Database`_.)

Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
``ovsdb-tool`` and ``ovsdb-client`` change ephemeral columns into persistent
ones when they work with schemas for clustered databases.  Future versions of
OVSDB might add support for this feature.

Understanding Cluster Consistency
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To ensure consistency, clustered OVSDB uses the Raft algorithm described in
Diego Ongaro's Ph.D. thesis, "Consensus: Bridging Theory and Practice".  In an
operational Raft cluster, at any given time a single server is the "leader" and
the other nodes are "followers".  Only the leader processes transactions, but a
transaction is only committed when a majority of the servers confirm to the
leader that they have written it to persistent storage.

In most database systems, read and write access to the database happens through
transactions.  In such a system, Raft allows a cluster to present a strongly
consistent transactional interface.  OVSDB uses conventional transactions for
writes, but clients often effectively do reads a different way, by asking the
server to "monitor" a database or a subset of one on the client's behalf.
Whenever monitored data changes, the server automatically tells the client what
changed, which allows the client to maintain an accurate snapshot of the
database in its memory.  Of course, at any given time, the snapshot may be
somewhat dated since some of it could have changed without the change
notification yet being received and processed by the client.

Given this unconventional usage model, OVSDB also adopts an unconventional
clustering model.  Each server in a cluster acts independently for the purpose
of monitors and read-only transactions, without verifying that data is
up-to-date with the leader.  Servers forward transactions that write to the
database to the leader for execution, ensuring consistency.  This has the
following consequences:

* Transactions that involve writes, against any server in the cluster, are
  linearizable if clients take care to use correct prerequisites, which is the
  same condition required for linearizability in a standalone OVSDB.
  (Actually, "at-least-once" consistency, because OVSDB does not have a session
  mechanism to drop duplicate transactions if a connection drops after the
  server commits it but before the client receives the result.)

* Read-only transactions can yield results based on a stale version of the
  database, if they are executed against a follower.  Transactions on the
  leader always yield fresh results.  (With monitors, as explained above, a
  client can always see stale data even without clustering, so clustering does
  not change the consistency model for monitors.)

* Monitor-based (or read-heavy) workloads scale well across a cluster, because
  clustering OVSDB adds no additional work or communication for reads and
  monitors.

* A write-heavy client should connect to the leader, to avoid the overhead of
  followers forwarding transactions to the leader.

* When a client conducts a mix of read and write transactions across more than
  one server in a cluster, it can see inconsistent results because a read
  transaction might read stale data whose updates have not yet propagated from
  the leader.  By default, ``ovn-sbctl`` and similar utilities connect to the
  cluster leader to avoid this issue.

  The same might occur for transactions against a single follower except that
  the OVSDB server ensures that the results of a write forwarded to the leader
  by a given server are visible at that server before it replies to the
  requesting client.

* If a client uses a database on one server in a cluster, then another server
  in the cluster (perhaps because the first server failed), the client could
  observe stale data.  Clustered OVSDB clients, however, can use a column in
  the ``_Server`` database to detect that data on a server is older than data
  that the client previously read.  The OVSDB client library in Open vSwitch
  uses this feature to avoid servers with stale data.
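
The stale-data check in the last point can be pictured with a small Python
sketch.  It assumes that each server reports a number that only grows as the
database advances, which the client compares with the highest value it has
already seen.  The class ``StalenessGuard`` and its method names are
hypothetical; this is not the actual logic of the Open vSwitch client library.

```python
class StalenessGuard:
    """Remember the newest data seen so far and reject older servers."""

    def __init__(self):
        self.highest_seen = None

    def accept(self, server_index):
        """Return True if a server reporting `server_index` is safe to use."""
        if self.highest_seen is not None and server_index < self.highest_seen:
            return False  # this server is behind data the client already read
        self.highest_seen = server_index
        return True


guard = StalenessGuard()
# The client reads from a server at index 10, fails over to one that is
# still at index 7, and then tries another one at index 12.
results = [guard.accept(index) for index in (10, 7, 12)]
print(results)
```

A client following this approach would skip the second server and keep trying
others until it finds one that is at least as current as what it last read.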

Database Replication
====================

OVSDB can layer **replication** on top of any of its service models.
Replication, in this context, means to make, and keep up-to-date, a read-only
copy of the contents of a database (the ``replica``).  One use of replication
is to keep an up-to-date backup of a database.  A replica used solely for
backup would not need to support clients of its own.  A set of replicas that do
serve clients could be used to scale out read access to the primary database.

A database replica is set up in the same way as a backup server in an
active-backup pair, with the difference that the replica is never promoted to
an active role.

A database can have multiple replicas.

Open vSwitch 2.6 introduced support for database replication.

Connection Methods
==================

An OVSDB **connection method** is a string that specifies how to make a
JSON-RPC connection between an OVSDB client and server.  Connection methods are
part of the Open vSwitch implementation of OVSDB and not specified by RFC 7047.
``ovsdb-server`` uses connection methods to specify how it should listen for
connections from clients and ``ovsdb-client`` uses them to specify how it
should connect to a server.  Connections in the opposite direction, where
``ovsdb-server`` connects to a client that is configured to listen for an
incoming connection, are also possible.

Connection methods are classified as **active** or **passive**.  An active
connection method makes an outgoing connection to a remote host; a passive
connection method listens for connections from remote hosts.  The most common
arrangement is to configure an OVSDB server with passive connection methods and
clients with active ones, but the OVSDB implementation in Open vSwitch supports
the opposite arrangement as well.

OVSDB supports the following active connection methods:

ssl:<host>:<port>
    The specified SSL or TLS <port> on the given <host>.

tcp:<host>:<port>
    The specified TCP <port> on the given <host>.

unix:<file>
    On Unix-like systems, connect to the Unix domain server socket named
    <file>.

    On Windows, connect to a local named pipe that is represented by a file
    created in the path <file> to mimic the behavior of a Unix domain socket.

<method1>,<method2>,...,<methodN>
    For a clustered database service to be highly available, a client must be
    able to connect to any of the servers in the cluster.  To do so, specify
    connection methods for each of the servers separated by commas (and
    optional spaces).

    In theory, if machines go up and down and IP addresses change in the right
    way, a client could talk to the wrong instance of a database.  To avoid
    this possibility, add ``cid:<uuid>`` to the list of methods, where <uuid>
    is the cluster ID of the desired database cluster, as printed by
    ``ovsdb-tool db-cid``.  This feature is optional.

OVSDB supports the following passive connection methods:

pssl:<port>[:<ip>]
    Listen on the given TCP <port> for SSL or TLS connections.  By default,
    connections are not bound to a particular local IP address.  Specifying
    <ip> binds the listener to that local IP address, so that it accepts only
    connections addressed to it.

ptcp:<port>[:<ip>]
    Listen on the given TCP <port>.  By default, connections are not bound to a
    particular local IP address.  Specifying <ip> binds the listener to that
    local IP address, so that it accepts only connections addressed to it.

punix:<file>
    On Unix-like systems, listens for connections on the Unix domain socket
    named <file>.

    On Windows, listens on a local named pipe, creating a named pipe
    <file> to mimic the behavior of a Unix domain socket.

All IP-based connection methods accept IPv4 and IPv6 addresses.  To specify an
IPv6 address, wrap it in square brackets, e.g.  ``ssl:[::1]:6640``.  Passive
IP-based connection methods by default listen for IPv4 connections only; use
``[::]`` as the address to accept both IPv4 and IPv6 connections,
e.g. ``pssl:6640:[::]``.  DNS names are also accepted if Open vSwitch is built
with the unbound library.  On Linux, use ``%<device>`` to designate a scope for
IPv6 link-local addresses, e.g. ``ssl:[fe80::1234%eth0]:6653``.

The <port> may be omitted from connection methods that use a port number.  The
default <port> for TCP-based connection methods is 6640, e.g. ``pssl:`` is
equivalent to ``pssl:6640``.  In Open vSwitch prior to version 2.4.0, the
default port was 6632.  To avoid incompatibility between older and newer
versions, we encourage users to specify a port number.
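
The formatting rules for active connection methods can be gathered into a
short helper.  The Python sketch below is an illustration only: the function
names ``active_method`` and ``cluster_remote`` are invented for this example,
the addresses are placeholders, and detecting an IPv6 literal by looking for a
colon is a simplifying assumption.

```python
DEFAULT_PORT = 6640


def active_method(scheme, host, port=DEFAULT_PORT):
    """Format a "tcp" or "ssl" active connection method."""
    if scheme not in ("tcp", "ssl"):
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # Assumption: a bare host containing ":" is an IPv6 literal, which must
    # be wrapped in square brackets.
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    # Always spell out the port, as the text above recommends.
    return f"{scheme}:{host}:{port}"


def cluster_remote(methods, cluster_id=None):
    """Join per-server methods with commas, optionally adding cid:<uuid>."""
    parts = list(methods)
    if cluster_id is not None:
        parts.append(f"cid:{cluster_id}")
    return ",".join(parts)


loopback = active_method("ssl", "::1")
scoped = active_method("ssl", "fe80::1234%eth0", 6653)
remote = cluster_remote(
    active_method("tcp", addr) for addr in ("10.0.0.1", "10.0.0.2", "10.0.0.3")
)
print(loopback, scoped, remote, sep="\n")
```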

The ``ssl`` and ``pssl`` connection methods require additional configuration
through ``--private-key``, ``--certificate``, and ``--ca-cert`` command line
options.  Open vSwitch can be built without SSL support, in which case these
connection methods are not supported.

Database Life Cycle
===================

This section describes how to handle various events in the life cycle of
a database using the Open vSwitch implementation of OVSDB.

Creating a Database
-------------------

Creating and starting up the service for a new database was covered
separately for each database service model in the `Service
Models`_ section, above.

Backing Up and Restoring a Database
-----------------------------------

OVSDB is often used in contexts where the database contents are not
particularly valuable.  For example, in many systems, the database for
configuring ``ovs-vswitchd`` is essentially rebuilt from scratch
at boot time.  It is not worthwhile to back up these databases.

When OVSDB is used for valuable data, a backup strategy is worth
considering.  One way is to use database replication, discussed above in
`Database Replication`_, which keeps an online, up-to-date
copy of a database, possibly on a remote system.  This works with all OVSDB
service models.

A more common backup strategy is to periodically take and store a snapshot.
For the standalone and active-backup service models, making a copy of the
database file, e.g. using ``cp``, effectively makes a snapshot, and because
OVSDB database files are append-only, it works even if the database is being
modified when the snapshot takes place.  This approach does not work for
clustered databases.

Another way to make a backup, which works with all OVSDB service models, is to
use ``ovsdb-client backup``, which connects to a running database server and
outputs an atomic snapshot of its schema and content, in the same format used
for standalone and active-backup databases.

Multiple options are also available when the time comes to restore a database
from a backup.  For the standalone and active-backup service models, one option
is to stop the database server or servers, overwrite the database file with the
backup (e.g. with ``cp``), and then restart the servers.  Another way, which
works with any service model, is to use ``ovsdb-client restore``, which
connects to a running database server and replaces the data in one of its
databases by a provided snapshot.  The advantage of ``ovsdb-client restore`` is
that it causes zero downtime for the database and its server.  It has the
downside that UUIDs of rows in the restored database will differ from those in
the snapshot, because the OVSDB protocol does not allow clients to specify row
UUIDs.

None of these approaches saves and restores data in columns that the schema
designates as ephemeral.  This is by design: the designer of a schema only
marks a column as ephemeral if it is acceptable for its data to be lost
when a database server restarts.

Clustering and backup serve different purposes.  Clustering increases
availability, but it does not protect against data loss if, for example, a
malicious or malfunctioning OVSDB client deletes or tampers with data.

Changing Database Service Model
-------------------------------

Use ``ovsdb-tool create-cluster`` to create a clustered database from the
contents of a standalone database.  Use ``ovsdb-client backup`` to create a
standalone database from the contents of a running clustered database.
When the cluster is down and cannot be revived, ``ovsdb-client backup`` will
not work.

Use ``ovsdb-tool cluster-to-standalone`` to convert a clustered database to a
standalone database when the cluster is down and cannot be revived.

Upgrading or Downgrading a Database
-----------------------------------

The evolution of a piece of software can require changes to the schemas of the
databases that it uses.  For example, new features might require new tables or
new columns in existing tables, or conceptual changes might require a database
to be reorganized in other ways.  In some cases, the easiest way to deal with a
change in a database schema is to delete the existing database and start fresh
with the new schema, especially if the data in the database is easy to
reconstruct.  But in many other cases, it is better to convert the database
from one schema to another.

The OVSDB implementation in Open vSwitch has built-in support for some simple
cases of converting a database from one schema to another.  This support can
handle changes that add or remove database columns or tables or that eliminate
constraints (for example, changing a column that must have exactly one value
into one that has one or more values).  It can also handle changes that add
constraints or make them stricter, but only if the existing data in the
database satisfies the new constraints (for example, changing a column that has
one or more values into a column with exactly one value, if every row in the
column has exactly one value).  The built-in conversion can cause data loss in
obvious ways, for example if the new schema removes tables or columns, or
indirectly, for example by deleting unreferenced rows in tables that the new
schema marks for garbage collection.
 544
 545 Converting a database can lose data, so it is wise to make a backup beforehand.
 546
 547 To use OVSDB's built-in support for schema conversion with a standalone or
 548 active-backup database, first stop the database server or servers, then use
 549 ``ovsdb-tool convert`` to convert it to the new schema, and then restart the
 550 database server.
 551
 552 OVSDB also supports online database schema conversion for any of its database
 553 service models.  To convert a database online, use ``ovsdb-client convert``.
 554 The conversion is atomic, consistent, isolated, and durable.  ``ovsdb-server``
 555 disconnects any clients connected when the conversion takes place (except
 556 clients that use the ``set_db_change_aware`` Open vSwitch extension RPC).  Upon
 557 reconnection, clients will discover that the schema has changed.
 558
 559 Schema versions and checksums (see Schemas_ above) can give hints about whether
 560 a database needs to be converted to a new schema.  If there is any question,
 561 though, the ``needs-conversion`` command on ``ovsdb-tool`` and ``ovsdb-client``
 562 can provide a definitive answer.
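
The check and the online conversion can be combined, as in the following
sketch.  It assumes that ``ovsdb-client needs-conversion`` prints ``yes`` or
``no`` on standard output, as described in ``ovsdb-client(1)``.  The helper
name ``convert_online_if_needed`` and the injectable ``run`` parameter are
hypothetical.

```python
import subprocess


def convert_online_if_needed(schema_path, server=None, run=subprocess.run):
    """Convert the served database to schema_path only if it differs.

    Returns True if a conversion was requested, False otherwise.  When
    server is None, ovsdb-client uses its default connection method.
    """
    target = [server] if server else []
    check = run(["ovsdb-client", "needs-conversion"] + target + [schema_path],
                check=True, capture_output=True, text=True)
    if check.stdout.strip() != "yes":
        return False                     # schemas already match
    run(["ovsdb-client", "convert"] + target + [schema_path], check=True)
    return True
```

Because clients are disconnected when the conversion takes place, skipping an
unnecessary conversion avoids needless reconnections.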
 563
 564 Working with Database History
 565 -----------------------------
 566
 567 Both on-disk database formats that OVSDB supports are organized as a stream of
 568 transaction records.  Each record describes a change to the database as a list
 569 of rows that were inserted or deleted or modified, along with the details.
 570 Therefore, in normal operation, a database file only grows, as each change
 571 causes another record to be appended at the end.  Usually, a user has no need
 572 to understand this file structure.  This section covers some exceptions.
 573
 574 Compacting Databases
 575 --------------------
 576
 577 If OVSDB database files were truly append-only, then over time they would grow
 578 without bound.  To avoid this problem, OVSDB can **compact** a database file,
 579 that is, replace it by a new version that contains only the current database
 580 contents, as if it had been inserted by a single transaction.  From time to
 581 time, ``ovsdb-server`` automatically compacts a database that grows much larger
 582 than its minimum size.
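
The relationship between the transaction log and compaction can be sketched in
a few lines of Python.  The record layout used here (table name, then row
identifier, then changed columns, with ``None`` marking a deletion) is a
simplified, hypothetical stand-in for the real on-disk formats.

```python
def replay(records):
    """Rebuild the current contents from an ordered list of change records.

    Each record maps table name -> {row id -> changes}.  A value of None
    deletes the row; a dict inserts the row or updates the listed columns.
    """
    tables = {}
    for record in records:
        for table, rows in record.items():
            current = tables.setdefault(table, {})
            for row_id, changes in rows.items():
                if changes is None:
                    current.pop(row_id, None)
                else:
                    current.setdefault(row_id, {}).update(changes)
    return tables


def compact(records):
    """Return a one-record log that inserts only the current contents."""
    return [replay(records)]


log = [
    {"Bridge": {"r1": {"name": "br0"}}},       # insert row r1
    {"Bridge": {"r1": {"stp_enable": True}}},  # modify row r1
    {"Bridge": {"r2": {"name": "br1"}}},       # insert row r2
    {"Bridge": {"r2": None}},                  # delete row r2
]
compacted = compact(log)
```

Replaying ``compacted`` yields the same contents as replaying ``log``, but the
history of how the database reached that state is gone.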
 583
 584 Because ``ovsdb-server`` automatically compacts databases, it is usually not
 585 necessary to compact them manually, but OVSDB still offers a few ways to do it.
 586 First, ``ovsdb-tool compact`` can compact a standalone or active-backup
 587 database that is not currently being served by ``ovsdb-server`` (or otherwise
 588 locked for writing by another process).  To compact any database that is
 589 currently being served by ``ovsdb-server``, use ``ovs-appctl`` to send the
 590 ``ovsdb-server/compact`` command.  Each server in an active-backup or clustered
 591 database maintains its database file independently, so to compact all of them,
 592 issue this command separately on each server.
 593
 594 Viewing History
 595 ---------------
 596
 597 The ``ovsdb-tool`` utility's ``show-log`` command displays the transaction
 598 records in an OVSDB database file in a human-readable format.  By default, it
 599 shows minimal detail, but adding the option ``-m`` once or twice increases the
 600 level of detail.  In addition to the transaction data, it shows the time and
 601 date of each transaction and any "comment" added to the transaction by the
 602 client.  The comments can be helpful for quickly understanding a transaction;
 603 for example, ``ovs-vsctl`` adds its command line to the transactions that it
 604 makes.
 605
 606 The ``show-log`` command works with both OVSDB file formats, but the details of
 607 the output format differ.  For active-backup and clustered databases, the
 608 sequence of transactions in each server's log will differ, even at points when
 609 they reflect the same data.
 610
 611 Truncating History
 612 ------------------
 613
 614 It may occasionally be useful to "roll back" a database file to an earlier
 615 point.  Because of the organization of OVSDB records, this is easy to do.
 616 Start by noting the record number <i> of the first record to delete in
 617 ``ovsdb-tool show-log`` output.  Each record is two lines of plain text, so
 618 trimming the log is as simple as running ``head -n <j>``, where <j> = 2 * <i>.
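
The arithmetic can be checked with a short Python sketch that mirrors
``head -n <j>``.  It relies on two assumptions: each record occupies exactly
two lines, as stated above, and ``show-log`` numbers records starting from 0.
The function name ``truncate_log`` and the sample lines are hypothetical.

```python
def truncate_log(lines, first_record_to_delete):
    """Keep only the records before first_record_to_delete.

    lines is the database file split into text lines, two per record.
    """
    keep = 2 * first_record_to_delete   # <j> = 2 * <i>
    return lines[:keep]


# Four fake records (numbered 0 to 3), each a header line and a body line.
lines = []
for k in range(4):
    lines.append("header %d" % k)
    lines.append("body %d" % k)

# Deleting record 2 and everything after it keeps records 0 and 1.
kept = truncate_log(lines, 2)
```

As with ``head``, the result is a new sequence of lines rather than an in-place
change, so in practice the output would be written to a new file.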
 619
 620 Corruption
 621 ----------
 622
 623 When ``ovsdb-server`` opens an OVSDB database file, of any kind, it reads as
 624 many transaction records as it can from the file until it reaches the end of
 625 the file or it encounters a corrupted record.  At that point it stops reading
 626 and regards the data that it has read to this point as the full contents of the
 627 database file, effectively rolling the database back to an earlier point.
 628
 629 Each transaction record contains an embedded SHA-1 checksum, which the server
 630 verifies as it reads a database file.  It detects corruption when a checksum
 631 fails to verify.  Even though SHA-1 is no longer considered secure for use in
 632 cryptography, it is acceptable for this purpose because it is not used to
 633 defend against malicious attackers.
 634
 635 The first record in a standalone or active-backup database file specifies the
 636 schema.  ``ovsdb-server`` will refuse to work with a database where this record
 637 is corrupted, or with a clustered database file with corruption in the first
 638 few records.  Delete and recreate such a database, or restore it from a backup.
 639
 640 When ``ovsdb-server`` adds records to a database file in which it detected
 641 corruption, it first truncates the file just after the last good record.
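
The reading and truncation behavior described in this section can be sketched
as follows.  This is not the ``ovsdb-server`` implementation.  It works on an
in-memory byte string rather than a file, and the record layout (a header line
carrying a length and a SHA-1 digest, followed by the payload and a newline)
is an assumption made for illustration.

```python
import hashlib

MAGIC = b"OVSDB JSON"   # header prefix assumed for this sketch


def append_record(buf, payload):
    """Return buf plus one record: header with length and SHA-1, then payload."""
    digest = hashlib.sha1(payload).hexdigest().encode()
    header = MAGIC + b" " + str(len(payload)).encode() + b" " + digest + b"\n"
    return buf + header + payload + b"\n"


def read_good_records(buf):
    """Read records until end of data or the first corrupted record.

    Returns (payloads, offset), where offset is the byte position just
    after the last good record.
    """
    payloads, offset = [], 0
    while offset < len(buf):
        eol = buf.find(b"\n", offset)
        if eol < 0:
            break
        fields = buf[offset:eol].rsplit(b" ", 2)
        if len(fields) != 3 or fields[0] != MAGIC or not fields[1].isdigit():
            break
        length = int(fields[1])
        payload = buf[eol + 1:eol + 1 + length]
        end = eol + 1 + length + 1       # payload plus trailing newline
        if len(payload) != length or end > len(buf):
            break                        # record cut short
        if hashlib.sha1(payload).hexdigest().encode() != fields[2]:
            break                        # checksum mismatch: stop reading
        payloads.append(payload)
        offset = end
    return payloads, offset


def append_after_truncating(buf, payload):
    """Drop anything after the last good record, then append a new record."""
    _, good_end = read_good_records(buf)
    return append_record(buf[:good_end], payload)


log = b""
for text in (b'{"a":1}', b'{"b":2}', b'{"c":3}'):
    log = append_record(log, text)

# Overwrite one byte in the second record's payload to simulate corruption.
pos = log.index(b'{"b":2}')
damaged = log[:pos] + b"X" + log[pos + 1:]

good, good_end = read_good_records(damaged)
repaired = append_after_truncating(damaged, b'{"d":4}')
```

In this example the third record is intact, but it is still discarded because
it follows the corrupted one, which is the rollback effect described above.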
 642
 643 See Also
 644 ========
 645
 646 RFC 7047, "The Open vSwitch Database Management Protocol."
 647
 648 Open vSwitch implementations of generic OVSDB functionality:
 649 ``ovsdb-server(1)``, ``ovsdb-client(1)``, ``ovsdb-tool(1)``.
 650
 651 Tools for working with databases that have specific OVSDB schemas:
 652 ``ovs-vsctl(8)``, ``vtep-ctl(8)``, ``ovn-nbctl(8)``, ``ovn-sbctl(8)``.
 653
 654 OVSDB schemas for Open vSwitch and related functionality:
 655 ``ovs-vswitchd.conf.db(5)``, ``vtep(5)``, ``ovn-nb(5)``, ``ovn-sb(5)``.