Disk image file formats
~~~~~~~~~~~~~~~~~~~~~~~

QEMU supports many image file formats that can be used with VMs as well as with
any of the tools (like ``qemu-img``). This includes the preferred formats
raw and qcow2 as well as formats that are supported for compatibility with
older QEMU versions or other hypervisors.

Depending on the image format, different options can be passed to
``qemu-img create`` and ``qemu-img convert`` using the ``-o`` option.
This section describes each format and the options that are supported for it.

.. program:: image-formats
.. option:: raw

   Raw disk image format. This format has the advantage of
   being simple and easily exportable to all other emulators. If your
   file system supports *holes* (for example in ext2 or ext3 on
   Linux or NTFS on Windows), then only the written sectors will reserve
   space. Use ``qemu-img info`` to find out the real size used by the
   image, or ``ls -ls`` on Unix/Linux.

   Supported options:

   .. program:: raw
   .. option:: preallocation

      Preallocation mode (allowed values: ``off``, ``falloc``,
      ``full``). ``falloc`` mode preallocates space for the image by
      calling ``posix_fallocate()``. ``full`` mode preallocates space
      for the image by writing data to the underlying storage. This data may or
      may not be zero, depending on the storage location.
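
As a minimal sketch of the raw format behaviour described above, the following
creates one sparse and one fully preallocated raw image and compares the
guest-visible size with the space used on the host. The file names and the
64M size are illustrative assumptions, not values from this document::

  sparse=sparse.raw
  full=full.raw

  # Sparse image: on a file system with holes, only written sectors use space.
  qemu-img create -f raw -o preallocation=off "$sparse" 64M
  # Fully preallocated image: the whole 64M is written at creation time.
  qemu-img create -f raw -o preallocation=full "$full" 64M

  # "virtual size" is what the guest sees, "disk size" is the host usage.
  qemu-img info "$sparse"
  qemu-img info "$full"
  # The first column of "ls -ls" shows the blocks actually allocated.
  ls -ls "$sparse" "$full"

On a file system that supports holes, the sparse image would be expected to
report a much smaller disk size than the preallocated one.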

.. program:: image-formats
.. option:: qcow2

   QEMU image format, the most versatile format. Use it to have smaller
   images (useful if your filesystem does not support holes, for example
   on Windows), zlib-based compression and support for multiple VM
   snapshots.

   Supported options:

   .. program:: qcow2
   .. option:: compat

      Determines the qcow2 version to use. ``compat=0.10`` uses the
      traditional image format that can be read by any QEMU since 0.10.
      ``compat=1.1`` enables image format extensions that only QEMU 1.1 and
      newer understand (this is the default). Amongst others, this includes
      zero clusters, which allow efficient copy-on-read for sparse images.

   .. option:: backing_file

      File name of a base image (see ``create`` subcommand).

   .. option:: backing_fmt

      Image format of the base image.

   .. option:: encryption

      This option is deprecated and equivalent to ``encrypt.format=aes``.

   .. option:: encrypt.format

      If this is set to ``luks``, it requests that the qcow2 payload (not
      qcow2 header) be encrypted using the LUKS format. The passphrase to
      use to unlock the LUKS key slot is given by the ``encrypt.key-secret``
      parameter. LUKS encryption parameters can be tuned with the other
      ``encrypt.*`` parameters.

      If this is set to ``aes``, the image is encrypted with 128-bit AES-CBC.
      The encryption key is given by the ``encrypt.key-secret`` parameter.
      This encryption format is considered to be flawed by modern cryptography
      standards, suffering from a number of design problems:

      - The AES-CBC cipher is used with predictable initialization vectors based
        on the sector number. This makes it vulnerable to chosen plaintext attacks
        which can reveal the existence of encrypted data.
      - The user passphrase is directly used as the encryption key. A poorly
        chosen or short passphrase will compromise the security of the encryption.
      - In the event of the passphrase being compromised there is no way to
        change the passphrase to protect data in any qcow images. The files must
        be cloned, using a different encryption passphrase in the new file. The
        original file must then be securely erased using a program like shred,
        though even this is ineffective with many modern storage technologies.

      The use of this is no longer supported in system emulators. Support only
      remains in the command line utilities, for the purposes of data liberation
      and interoperability with old versions of QEMU. The ``luks`` format
      should be used instead.

   .. option:: encrypt.key-secret

      Provides the ID of a ``secret`` object that contains the passphrase
      (``encrypt.format=luks``) or encryption key (``encrypt.format=aes``).

   .. option:: encrypt.cipher-alg

      Name of the cipher algorithm and key length. Currently defaults
      to ``aes-256``. Only used when ``encrypt.format=luks``.

   .. option:: encrypt.cipher-mode

      Name of the encryption mode to use. Currently defaults to ``xts``.
      Only used when ``encrypt.format=luks``.

   .. option:: encrypt.ivgen-alg

      Name of the initialization vector generator algorithm. Currently defaults
      to ``plain64``. Only used when ``encrypt.format=luks``.

   .. option:: encrypt.ivgen-hash-alg

      Name of the hash algorithm to use with the initialization vector generator
      (if required). Defaults to ``sha256``. Only used when ``encrypt.format=luks``.

   .. option:: encrypt.hash-alg

      Name of the hash algorithm to use for the PBKDF algorithm.
      Defaults to ``sha256``. Only used when ``encrypt.format=luks``.

   .. option:: encrypt.iter-time

      Amount of time, in milliseconds, to use for the PBKDF algorithm per key slot.
      Defaults to ``2000``. Only used when ``encrypt.format=luks``.

   .. option:: cluster_size

      Changes the qcow2 cluster size (must be between 512 and 2M). Smaller cluster
      sizes can improve the image file size whereas larger cluster sizes generally
      provide better performance.

   .. option:: preallocation

      Preallocation mode (allowed values: ``off``, ``metadata``, ``falloc``,
      ``full``). An image with preallocated metadata is initially larger but can
      improve performance when the image needs to grow. ``falloc`` and ``full``
      preallocation behave like the same options of the ``raw`` format, but also
      set up metadata.

   .. option:: lazy_refcounts

      If this option is set to ``on``, reference count updates are postponed with
      the goal of avoiding metadata I/O and improving performance. This is
      particularly interesting with :option:`cache=writethrough` which doesn't batch
      metadata updates. The tradeoff is that after a host crash, the reference count
      tables must be rebuilt, i.e. on the next open an (automatic) ``qemu-img
      check -r all`` is required, which may take some time.

      This option can only be enabled if ``compat=1.1`` is specified.

   .. option:: nocow

      If this option is set to ``on``, it will turn off COW of the file. It is only
      valid on btrfs and has no effect on other file systems.

      Btrfs has low performance when hosting a VM image file, even more so
      when the guest on the VM also uses btrfs as its file system. Turning off
      COW is a way to mitigate this bad performance. Generally there are two
      ways to turn off COW on btrfs:

      - Disable it by mounting with nodatacow, then all newly created files
        will be NOCOW.
      - For an empty file, add the NOCOW file attribute. That's what this
        option does.

      Note: this option is only valid for new or empty files. If there is
      an existing file which is COW and has data blocks already, it cannot
      be changed to NOCOW by setting ``nocow=on``. One can issue ``lsattr
      filename`` to check if the NOCOW flag is set or not (capital 'C' is the
      NOCOW flag).
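
A minimal sketch of how several qcow2 options combine: create a base image
with a non-default cluster size and lazy refcounts, then an overlay that uses
it as its backing file. The file names, the 64M size and the 128K cluster size
are illustrative assumptions::

  base=base.qcow2
  overlay=overlay.qcow2

  # Base image; lazy_refcounts=on requires compat=1.1.
  qemu-img create -f qcow2 \
      -o compat=1.1,cluster_size=128K,lazy_refcounts=on "$base" 64M

  # Overlay on top of the base image; its size is taken from the backing file.
  qemu-img create -f qcow2 \
      -o backing_file="$base",backing_fmt=qcow2 "$overlay"

  # Show the overlay together with every image in its backing chain.
  qemu-img info --backing-chain "$overlay"

Stating ``backing_fmt`` explicitly avoids relying on format probing of the
backing file.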

.. program:: image-formats
.. option:: qed

   Old QEMU image format with support for backing files and compact image files
   (when your filesystem or transport medium does not support holes).

   When converting QED images to qcow2, you might want to consider using the
   ``lazy_refcounts=on`` option to get a more QED-like behaviour.

   Supported options:

   .. program:: qed
   .. option:: backing_file

      File name of a base image (see ``create`` subcommand).

   .. option:: backing_fmt

      Image file format of backing file (optional). Useful if the format cannot be
      autodetected because it has no header, like some vhd/vpc files.

   .. option:: cluster_size

      Changes the cluster size (must be power-of-2 between 4K and 64K). Smaller
      cluster sizes can improve the image file size whereas larger cluster sizes
      generally provide better performance.

   .. option:: table_size

      Changes the number of clusters per L1/L2 table (must be
      power-of-2 between 1 and 16). There is normally no need to
      change this value but this option can be used for
      performance benchmarking.

.. program:: image-formats
.. option:: qcow

   Old QEMU image format with support for backing files, compact image files,
   encryption and compression.

   Supported options:

   .. program:: qcow
   .. option:: backing_file

      File name of a base image (see ``create`` subcommand).

   .. option:: encryption

      This option is deprecated and equivalent to ``encrypt.format=aes``.

   .. option:: encrypt.format

      If this is set to ``aes``, the image is encrypted with 128-bit AES-CBC.
      The encryption key is given by the ``encrypt.key-secret`` parameter.
      This encryption format is considered to be flawed by modern cryptography
      standards, suffering from a number of design problems enumerated previously
      against the ``qcow2`` image format.

      The use of this is no longer supported in system emulators. Support only
      remains in the command line utilities, for the purposes of data liberation
      and interoperability with old versions of QEMU.

      Users requiring native encryption should use the ``qcow2`` format
      instead with ``encrypt.format=luks``.

   .. option:: encrypt.key-secret

      Provides the ID of a ``secret`` object that contains the encryption
      key (``encrypt.format=aes``).

.. program:: image-formats
.. option:: luks

   LUKS v1 encryption format, compatible with Linux dm-crypt/cryptsetup.

   Supported options:

   .. program:: luks
   .. option:: key-secret

      Provides the ID of a ``secret`` object that contains the passphrase.

   .. option:: cipher-alg

      Name of the cipher algorithm and key length. Currently defaults
      to ``aes-256``.

   .. option:: cipher-mode

      Name of the encryption mode to use. Currently defaults to ``xts``.

   .. option:: ivgen-alg

      Name of the initialization vector generator algorithm. Currently defaults
      to ``plain64``.

   .. option:: ivgen-hash-alg

      Name of the hash algorithm to use with the initialization vector generator
      (if required). Defaults to ``sha256``.

   .. option:: hash-alg

      Name of the hash algorithm to use for the PBKDF algorithm.
      Defaults to ``sha256``.

   .. option:: iter-time

      Amount of time, in milliseconds, to use for the PBKDF algorithm per key slot.
      Defaults to ``2000``.
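
The following sketch shows one way the ``key-secret`` option might be wired up
with ``qemu-img``: a ``secret`` object is defined with ``--object`` and then
referenced by its ID, both for a standalone ``luks`` image and for a ``qcow2``
image with ``encrypt.format=luks``. The secret ID ``sec0``, the file names,
the passphrase and the 64M size are illustrative assumptions::

  secret=passphrase.txt
  # Keep the passphrase in a file so it does not show up in the process list.
  printf '%s' 'example-passphrase' > "$secret"

  # Standalone LUKS v1 image.
  qemu-img create --object secret,id=sec0,file="$secret" \
      -f luks -o key-secret=sec0 encrypted.luks 64M

  # qcow2 image whose payload is LUKS-encrypted.
  qemu-img create --object secret,id=sec0,file="$secret" \
      -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 \
      encrypted.qcow2 64M

With the default ``iter-time`` of 2000 milliseconds, each creation spends
roughly that long deriving the key slot.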

.. program:: image-formats
.. option:: vdi

   VirtualBox 1.1 compatible image format.

   Supported options:

   .. program:: vdi
   .. option:: static

      If this option is set to ``on``, the image is created with metadata
      preallocation.

.. program:: image-formats
.. option:: vmdk

   VMware 3 and 4 compatible image format.

   Supported options:

   .. program:: vmdk
   .. option:: backing_file

      File name of a base image (see ``create`` subcommand).

   .. option:: compat6

      Create a VMDK version 6 image (instead of version 4).

   .. option:: hwversion

      Specify vmdk virtual hardware version. Compat6 flag cannot be enabled
      if hwversion is specified.

   .. option:: subformat

      Specifies which VMDK subformat to use. Valid options are
      ``monolithicSparse`` (default),
      ``monolithicFlat``,
      ``twoGbMaxExtentSparse``,
      ``twoGbMaxExtentFlat`` and
      ``streamOptimized``.
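
As a minimal sketch of passing a format option to ``qemu-img convert``, the
following converts a qcow2 image to a ``streamOptimized`` VMDK. The file names
and the 64M size are illustrative assumptions::

  src=disk.qcow2
  dst=disk.vmdk

  # Source image used only for this example.
  qemu-img create -f qcow2 "$src" 64M

  # -O selects the output format, -o passes the vmdk-specific options.
  qemu-img convert -f qcow2 -O vmdk -o subformat=streamOptimized "$src" "$dst"

  # Inspect the result, including its format-specific information.
  qemu-img info "$dst"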

.. program:: image-formats
.. option:: vpc

   VirtualPC compatible image format (VHD).

   Supported options:

   .. program:: vpc
   .. option:: subformat

      Specifies which VHD subformat to use. Valid options are
      ``dynamic`` (default) and ``fixed``.

.. program:: image-formats
.. option:: VHDX

   Hyper-V compatible image format (VHDX).

   Supported options:

   .. program:: VHDX
   .. option:: subformat

      Specifies which VHDX subformat to use. Valid options are
      ``dynamic`` (default) and ``fixed``.

   .. option:: block_state_zero

      Force use of payload blocks of type 'ZERO'. Can be set to ``on`` (default)
      or ``off``. When set to ``off``, new blocks will be created as
      ``PAYLOAD_BLOCK_NOT_PRESENT``, which means parsers are free to return
      arbitrary data for those blocks. Do not set to ``off`` when using
      ``qemu-img convert`` with ``subformat=dynamic``.

   .. option:: block_size

      Block size; min 1 MB, max 256 MB. 0 means auto-calculate based on
      image size.

   .. option:: log_size

      Log size; min 1 MB.

Read-only formats
~~~~~~~~~~~~~~~~~

More disk image file formats are supported in a read-only mode.

.. program:: image-formats
.. option:: bochs

   Bochs images of ``growing`` type.

.. program:: image-formats
.. option:: cloop

   Linux Compressed Loop image, useful only to reuse directly compressed
   CD-ROM images present for example in the Knoppix CD-ROMs.

.. program:: image-formats
.. option:: dmg

   Apple disk image.

.. program:: image-formats
.. option:: parallels

   Parallels disk image format.

Using host drives
~~~~~~~~~~~~~~~~~

In addition to disk image files, QEMU can directly access host
devices. We describe here the usage for QEMU version >= 0.8.3.

Linux
^^^^^

On Linux, you can directly use the host device filename instead of a
disk image filename provided you have enough privileges to access
it. For example, use ``/dev/cdrom`` to access the CDROM.

CD
  You can specify a CDROM device even if no CDROM is loaded. QEMU has
  specific code to detect CDROM insertion or removal. CDROM ejection by
  the guest OS is supported. Currently only data CDs are supported.

Floppy
  You can specify a floppy device even if no floppy is loaded. Floppy
  removal is currently not detected accurately (if you change floppy
  without doing floppy access while the floppy is not loaded, the guest
  OS will think that the same floppy is loaded).
  Use of the host's floppy device is deprecated, and support for it will
  be removed in a future release.

Hard disks
  Hard disks can be used. Normally you must specify the whole disk
  (``/dev/hdb`` instead of ``/dev/hdb1``) so that the guest OS can
  see it as a partitioned disk. WARNING: unless you know what you are
  doing, it is better to only make READ-ONLY accesses to the hard disk,
  otherwise you may corrupt your host data (use the ``-snapshot`` command
  line option or modify the device permissions accordingly).
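
As a sketch of the read-only advice above, either of the following keeps the
guest from modifying the host disk. The device name ``/dev/sdb`` is an
illustrative assumption, and ``readonly=on`` is a ``-drive`` option rather
than something described in this section. The first form presents the disk
as read-only to the guest; the second redirects guest writes to a temporary
file via ``-snapshot``:

.. parsed-literal::

  |qemu_system| -drive file=/dev/sdb,format=raw,readonly=on
  |qemu_system| -snapshot -drive file=/dev/sdb,format=raw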

Windows
^^^^^^^

CD
  The preferred syntax is the drive letter (e.g. ``d:``). The
  alternate syntax ``\\.\d:`` is supported. ``/dev/cdrom`` is
  supported as an alias to the first CDROM drive.

  Currently there is no specific code to handle removable media, so it
  is better to use the ``change`` or ``eject`` monitor commands to
  change or eject media.

Hard disks
  Hard disks can be used with the syntax: ``\\.\PhysicalDriveN``
  where *N* is the drive number (0 is the first hard disk).

  WARNING: unless you know what you are doing, it is better to only make
  READ-ONLY accesses to the hard disk, otherwise you may corrupt your
  host data (use the ``-snapshot`` command line option so that the
  modifications are written to a temporary file).

Mac OS X
^^^^^^^^

``/dev/cdrom`` is an alias to the first CDROM.

Currently there is no specific code to handle removable media, so it
is better to use the ``change`` or ``eject`` monitor commands to
change or eject media.

Virtual FAT disk images
~~~~~~~~~~~~~~~~~~~~~~~

QEMU can automatically create a virtual FAT disk image from a
directory tree. In order to use it, just type:

.. parsed-literal::

  |qemu_system| linux.img -hdb fat:/my_directory

Then you can access all the files in the ``/my_directory``
directory without having to copy them into a disk image or to export
them via SAMBA or NFS. The default access is *read-only*.

Floppies can be emulated with the ``:floppy:`` option:

.. parsed-literal::

  |qemu_system| linux.img -fda fat:floppy:/my_directory

Read/write support is available for testing (beta stage) with the
``:rw:`` option:

.. parsed-literal::

  |qemu_system| linux.img -fda fat:floppy:rw:/my_directory

What you should *never* do:

- use non-ASCII filenames
- use "-snapshot" together with ":rw:"
- expect it to work when loadvm'ing
- write to the FAT directory on the host system while accessing it with the guest system

NBD access
~~~~~~~~~~

QEMU can directly access block devices exported using the Network Block Device
protocol.

.. parsed-literal::

  |qemu_system| linux.img -hdb nbd://my_nbd_server.mydomain.org:1024/

If the NBD server is located on the same host, you can use a Unix socket instead
of an inet socket:

.. parsed-literal::

  |qemu_system| linux.img -hdb nbd+unix://?socket=/tmp/my_socket

In this case, the block device must be exported using qemu-nbd:

.. parsed-literal::

  qemu-nbd --socket=/tmp/my_socket my_disk.qcow2

The use of qemu-nbd allows sharing of a disk between several guests:

.. parsed-literal::

  qemu-nbd --socket=/tmp/my_socket --share=2 my_disk.qcow2

and then you can use it with two guests:

.. parsed-literal::

  |qemu_system| linux1.img -hdb nbd+unix://?socket=/tmp/my_socket
  |qemu_system| linux2.img -hdb nbd+unix://?socket=/tmp/my_socket

If the nbd-server uses named exports (supported since NBD 2.9.18, or with QEMU's
own embedded NBD server), you must specify an export name in the URI:

.. parsed-literal::

  |qemu_system| -cdrom nbd://localhost/debian-500-ppc-netinst
  |qemu_system| -cdrom nbd://localhost/openSUSE-11.1-ppc-netinst

The URI syntax for NBD is supported since QEMU 1.3. An alternative syntax is
also available. Here are some examples of the older syntax:

.. parsed-literal::

  |qemu_system| linux.img -hdb nbd:my_nbd_server.mydomain.org:1024
  |qemu_system| linux2.img -hdb nbd:unix:/tmp/my_socket
  |qemu_system| -cdrom nbd:localhost:10809:exportname=debian-500-ppc-netinst
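
Named exports and Unix sockets can also be combined. The following is a
sketch only: the export name ``disk0`` is an illustrative assumption, and
``--export-name`` and ``--read-only`` are ``qemu-nbd`` options that are not
otherwise covered in this section. The server exports the image under a name:

.. parsed-literal::

  qemu-nbd --socket=/tmp/my_socket --export-name=disk0 --read-only my_disk.qcow2

and the client puts that name in the path component of the URI:

.. parsed-literal::

  |qemu_system| linux.img -hdb nbd+unix:///disk0?socket=/tmp/my_socket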

Sheepdog disk images
~~~~~~~~~~~~~~~~~~~~

Sheepdog is a distributed storage system for QEMU. It provides highly
available block level storage volumes that can be attached to
QEMU-based virtual machines.

You can create a Sheepdog disk image with the command:

.. parsed-literal::

  qemu-img create sheepdog:///IMAGE SIZE

where *IMAGE* is the Sheepdog image name and *SIZE* is its
size.

To import the existing *FILENAME* to Sheepdog, you can use a
convert command.

.. parsed-literal::

  qemu-img convert FILENAME sheepdog:///IMAGE

You can boot from the Sheepdog disk image with the command:

.. parsed-literal::

  |qemu_system| sheepdog:///IMAGE

You can also create a snapshot of the Sheepdog image, as with qcow2.

.. parsed-literal::

  qemu-img snapshot -c TAG sheepdog:///IMAGE

where *TAG* is a tag name of the newly created snapshot.

To boot from the Sheepdog snapshot, specify the tag name of the
snapshot.

.. parsed-literal::

  |qemu_system| sheepdog:///IMAGE#TAG

You can create a cloned image from the existing snapshot.

.. parsed-literal::

  qemu-img create -b sheepdog:///BASE#TAG sheepdog:///IMAGE

where *BASE* is an image name of the source snapshot and *TAG*
is its tag name.

You can use a Unix socket instead of an inet socket:

.. parsed-literal::

  |qemu_system| sheepdog+unix:///IMAGE?socket=PATH

If the Sheepdog daemon doesn't run on the local host, you need to
specify one of the Sheepdog servers to connect to.

.. parsed-literal::

  qemu-img create sheepdog://HOSTNAME:PORT/IMAGE SIZE
  |qemu_system| sheepdog://HOSTNAME:PORT/IMAGE

iSCSI LUNs
~~~~~~~~~~

iSCSI is a popular protocol used to access SCSI devices across a computer
network.

There are two different ways iSCSI devices can be used by QEMU.

The first method is to mount the iSCSI LUN on the host, and make it appear as
any other ordinary SCSI device on the host and then to access this device as a
/dev/sd device from QEMU. How to do this differs between host OSes.

The second method involves using the iSCSI initiator that is built into
QEMU. This provides a mechanism that works the same way regardless of which
host OS you are running QEMU on. This section will describe this second method
of using iSCSI together with QEMU.

In QEMU, iSCSI devices are described using special iSCSI URLs. URL syntax:

::

  iscsi://[<username>[%<password>]@]<host>[:<port>]/<target-iqn-name>/<lun>

Username and password are optional and only used if your target is set up
using CHAP authentication for access control.
Alternatively, the username and password can be set via environment
variables so that they do not show up in the process list:

::

  export LIBISCSI_CHAP_USERNAME=<username>
  export LIBISCSI_CHAP_PASSWORD=<password>
  iscsi://<host>/<target-iqn-name>/<lun>

Various session related parameters can be set via special options, either
in a configuration file provided via '-readconfig' or directly on the
command line.

If the initiator-name is not specified, QEMU will use a default name
of 'iqn.2008-11.org.linux-kvm[:<uuid>]' where <uuid> is the UUID of the
virtual machine. If the UUID is not specified, QEMU will use
'iqn.2008-11.org.linux-kvm[:<name>]' where <name> is the name of the
virtual machine.

Setting a specific initiator name to use when logging in to the target:

::

  -iscsi initiator-name=iqn.qemu.test:my-initiator

Controlling which type of header digest to negotiate with the target:

::

  -iscsi header-digest=CRC32C|CRC32C-NONE|NONE-CRC32C|NONE

These can also be set via a configuration file:

::

  [iscsi]
    user = "CHAP username"
    password = "CHAP password"
    initiator-name = "iqn.qemu.test:my-initiator"
    # header digest is one of CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
    header-digest = "CRC32C"

Setting the target name allows different options for different targets:

::

  [iscsi "iqn.target.name"]
    user = "CHAP username"
    password = "CHAP password"
    initiator-name = "iqn.qemu.test:my-initiator"
    # header digest is one of CRC32C|CRC32C-NONE|NONE-CRC32C|NONE
    header-digest = "CRC32C"

How to use a configuration file to set iSCSI configuration options:

.. parsed-literal::

  cat >iscsi.conf <<EOF
  [iscsi]
    user = "me"
    password = "my password"
    initiator-name = "iqn.qemu.test:my-initiator"
    header-digest = "CRC32C"
  EOF

  |qemu_system| -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \\
      -readconfig iscsi.conf

How to set up a simple iSCSI target on loopback and access it via QEMU:
this example shows how to set up an iSCSI target with one CDROM and one DISK
using the Linux STGT software target. This target is available on Red Hat based
systems as the package 'scsi-target-utils'.

.. parsed-literal::

  tgtd --iscsi portal=127.0.0.1:3260
  tgtadm --lld iscsi --op new --mode target --tid 1 -T iqn.qemu.test
  tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 \\
      -b /IMAGES/disk.img --device-type=disk
  tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 2 \\
      -b /IMAGES/cd.iso --device-type=cd
  tgtadm --lld iscsi --op bind --mode target --tid 1 -I ALL

  |qemu_system| -iscsi initiator-name=iqn.qemu.test:my-initiator \\
      -boot d -drive file=iscsi://127.0.0.1/iqn.qemu.test/1 \\
      -cdrom iscsi://127.0.0.1/iqn.qemu.test/2

GlusterFS disk images
~~~~~~~~~~~~~~~~~~~~~

GlusterFS is a user space distributed file system.

You can boot from the GlusterFS disk image with the command:

URI:

.. parsed-literal::

  |qemu_system| -drive file=gluster[+TYPE]://[HOST[:PORT]]/VOLUME/PATH
      [?socket=...][,file.debug=9][,file.logfile=...]

JSON:

.. parsed-literal::

  |qemu_system| 'json:{"driver":"qcow2",
      "file":{"driver":"gluster",
      "volume":"testvol","path":"a.img","debug":9,"logfile":"...",
      "server":[{"type":"tcp","host":"...","port":"..."},
      {"type":"unix","socket":"..."}]}}'

*gluster* is the protocol.

*TYPE* specifies the transport type used to connect to the gluster
management daemon (glusterd). Valid transport types are
tcp and unix. In the URI form, if a transport type isn't specified,
then tcp type is assumed.

*HOST* specifies the server where the volume file specification for
the given volume resides. This can be either a hostname or an IPv4 address.
If the transport type is unix, then the *HOST* field should not be specified.
Instead the *socket* field needs to be populated with the path to the Unix
domain socket.

*PORT* is the port number on which glusterd is listening. This is optional
and if not specified, it defaults to port 24007. If the transport type is unix,
then *PORT* should not be specified.

*VOLUME* is the name of the gluster volume which contains the disk image.

*PATH* is the path to the actual disk image that resides on the gluster volume.

*debug* is the logging level of the gluster protocol driver. Debug levels
are 0-9, with 9 being the most verbose, and 0 representing no debugging output.
The default level is 4. The current logging levels defined in the gluster source
are 0 - None, 1 - Emergency, 2 - Alert, 3 - Critical, 4 - Error, 5 - Warning,
6 - Notice, 7 - Info, 8 - Debug, 9 - Trace.

*logfile* is the path of a log file. When it is set, the gfapi logs are written
to that file and so persist after QEMU exits. The default is stderr.

You can create a GlusterFS disk image with the command:

.. parsed-literal::

  qemu-img create gluster://HOST/VOLUME/PATH SIZE

Examples

.. parsed-literal::

  |qemu_system| -drive file=gluster://1.2.3.4/testvol/a.img
  |qemu_system| -drive file=gluster+tcp://1.2.3.4/testvol/a.img
  |qemu_system| -drive file=gluster+tcp://1.2.3.4:24007/testvol/dir/a.img
  |qemu_system| -drive file=gluster+tcp://[1:2:3:4:5:6:7:8]/testvol/dir/a.img
  |qemu_system| -drive file=gluster+tcp://[1:2:3:4:5:6:7:8]:24007/testvol/dir/a.img
  |qemu_system| -drive file=gluster+tcp://server.domain.com:24007/testvol/dir/a.img
  |qemu_system| -drive file=gluster+unix:///testvol/dir/a.img?socket=/tmp/glusterd.socket
  |qemu_system| -drive file=gluster+rdma://1.2.3.4:24007/testvol/a.img
  |qemu_system| -drive file=gluster://1.2.3.4/testvol/a.img,file.debug=9,file.logfile=/var/log/qemu-gluster.log
  |qemu_system| 'json:{"driver":"qcow2",
      "file":{"driver":"gluster",
      "volume":"testvol","path":"a.img",
      "debug":9,"logfile":"/var/log/qemu-gluster.log",
      "server":[{"type":"tcp","host":"1.2.3.4","port":24007},
      {"type":"unix","socket":"/var/run/glusterd.socket"}]}}'
  |qemu_system| -drive driver=qcow2,file.driver=gluster,file.volume=testvol,file.path=/path/a.img,
      file.debug=9,file.logfile=/var/log/qemu-gluster.log,
      file.server.0.type=tcp,file.server.0.host=1.2.3.4,file.server.0.port=24007,
      file.server.1.type=unix,file.server.1.socket=/var/run/glusterd.socket

Secure Shell (ssh) disk images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can access disk images located on a remote ssh server
by using the ssh protocol:

.. parsed-literal::

  |qemu_system| -drive file=ssh://[USER@]SERVER[:PORT]/PATH[?host_key_check=HOST_KEY_CHECK]

Alternative syntax using properties:

.. parsed-literal::

  |qemu_system| -drive file.driver=ssh[,file.user=USER],file.host=SERVER[,file.port=PORT],file.path=PATH[,file.host_key_check=HOST_KEY_CHECK]

*ssh* is the protocol.

*USER* is the remote user. If not specified, then the local
username is tried.

*SERVER* specifies the remote ssh server. Any ssh server can be
used, but it must implement the sftp-server protocol. Most Unix/Linux
systems should work without requiring any extra configuration.

*PORT* is the port number on which sshd is listening. By default
the standard ssh port (22) is used.

*PATH* is the path to the disk image.

The optional *HOST_KEY_CHECK* parameter controls how the remote
host's key is checked. The default is ``yes`` which means to use
the local ``.ssh/known_hosts`` file. Setting this to ``no``
turns off known-hosts checking. Or you can check that the host key
matches a specific fingerprint:
``host_key_check=md5:78:45:8e:14:57:4f:d5:45:83:0a:0e:f3:49:82:c9:c8``
(``sha1:`` can also be used as a prefix, but note that OpenSSH
tools only use MD5 to print fingerprints).

Currently authentication must be done using ssh-agent. Other
authentication methods may be supported in the future.

Note: Many ssh servers do not support an ``fsync``-style operation.
The ssh driver cannot guarantee that disk flush requests are
obeyed, and this causes a risk of disk corruption if the remote
server or network goes down during writes. The driver will
print a warning when ``fsync`` is not supported:

::

  warning: ssh server ssh.example.com:22 does not support fsync

With sufficiently new versions of libssh and OpenSSH, ``fsync`` is
supported.
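
As a sketch of the syntax with the placeholders filled in, assuming a
hypothetical user ``user``, server ``ssh.example.com`` and image path
``/srv/images/disk.img``, and assuming the key has already been loaded into a
running ssh-agent (for instance with ``ssh-add``):

.. parsed-literal::

  |qemu_system| -drive file=ssh://user@ssh.example.com/srv/images/disk.img
  |qemu_system| -drive file.driver=ssh,file.user=user,file.host=ssh.example.com,file.port=22,file.path=/srv/images/disk.img

Both lines describe the same image; the second spells out each URI component
as a separate property.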

NVMe disk images
~~~~~~~~~~~~~~~~

NVM Express (NVMe) storage controllers can be accessed directly by a userspace
driver in QEMU. This bypasses the host kernel file system and block layers
while retaining QEMU block layer functionalities, such as block jobs, I/O
throttling, image formats, etc. Disk I/O performance is typically higher than
with ``-drive file=/dev/sda`` using either thread pool or linux-aio.

The controller will be exclusively used by the QEMU process once started. To be
able to share storage between multiple VMs and other applications on the host,
please use the file based protocols.

Before starting QEMU, bind the host NVMe controller to the host vfio-pci
driver. For example:

.. parsed-literal::

  # modprobe vfio-pci
  # lspci -n -s 0000:06:0d.0
  06:0d.0 0401: 1102:0002 (rev 08)
  # echo 0000:06:0d.0 > /sys/bus/pci/devices/0000:06:0d.0/driver/unbind
  # echo 1102 0002 > /sys/bus/pci/drivers/vfio-pci/new_id

  # |qemu_system| -drive file=nvme://HOST:BUS:SLOT.FUNC/NAMESPACE

Alternative syntax using properties:

.. parsed-literal::

  |qemu_system| -drive file.driver=nvme,file.device=HOST:BUS:SLOT.FUNC,file.namespace=NAMESPACE

*HOST*:*BUS*:*SLOT*.\ *FUNC* is the NVMe controller's PCI device
address on the host.

*NAMESPACE* is the NVMe namespace number, starting from 1.
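
For illustration, with the controller at PCI address ``0000:06:0d.0`` from the
``lspci`` output above and assuming its first namespace is the one to use, the
two forms would read:

.. parsed-literal::

  |qemu_system| -drive file=nvme://0000:06:0d.0/1
  |qemu_system| -drive file.driver=nvme,file.device=0000:06:0d.0,file.namespace=1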

Disk image file locking
~~~~~~~~~~~~~~~~~~~~~~~

By default, QEMU tries to protect image files from unexpected concurrent
access, as long as it's supported by the block protocol driver and host
operating system. If multiple QEMU processes (including QEMU emulators and
utilities) try to open the same image with conflicting access modes, all but
the first one will get an error.

This feature is currently supported by the file protocol on Linux with the Open
File Descriptor (OFD) locking API, and can be configured to fall back to POSIX
locking if the POSIX host doesn't support Linux OFD locking.

To explicitly enable image locking, specify "locking=on" in the file protocol
driver options. If OFD locking is not possible, a warning will be printed and
the POSIX locking API will be used. In this case there is a risk that the lock
will get silently lost when doing hot plugging and block jobs, due to the
shortcomings of the POSIX locking API.

QEMU transparently handles lock handover during shared storage migration. For
virtual disk images shared between multiple VMs, the "share-rw" device option
should be used.

By default, the guest has exclusive write access to its disk image. If the
guest can safely share the disk image with other writers, the
``-device ...,share-rw=on`` parameter can be used. This is only safe if
the guest is running software, such as a cluster file system, that
coordinates disk accesses to avoid corruption.

Note that share-rw=on only declares the guest's ability to share the disk.
Some QEMU features, such as image file formats, require exclusive write access
to the disk image and this is unaffected by the share-rw=on option.

Alternatively, locking can be fully disabled with the "locking=off" block device
option. On the command line, the option is usually in the form of
"file.locking=off" as the protocol driver is normally placed as a "file" child
under a format driver. For example:

::

  -blockdev driver=qcow2,file.filename=/path/to/image,file.locking=off,file.driver=file

To check if image locking is active, check the output of the "lslocks" command
on the host and see if there are locks held by the QEMU process on the image file.
More than one byte could be locked by the QEMU instance, each byte of which
reflects a particular permission that is acquired or protected by the running
block driver.
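
A minimal sketch of such a check, assuming the image lives at the placeholder
path ``/path/to/image`` used above. The ``--force-share`` option of
``qemu-img info`` is not described in this section; it is shown here as one
way to inspect an image that a running QEMU process has locked::

  image=/path/to/image

  # List host locks without truncating paths and keep the lines for the image.
  lslocks --notruncate | grep "$image"

  # Read the image metadata without taking the image locks.
  qemu-img info --force-share "$image"

If ``lslocks`` prints nothing for the image while QEMU is running, locking is
probably disabled or not supported for that protocol driver.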

Filter drivers
~~~~~~~~~~~~~~

QEMU supports several filter drivers, which don't store any data, but perform
some additional tasks by hooking I/O requests.

.. program:: filter-drivers
.. option:: preallocate

   The preallocate filter driver is intended to be inserted between format
   and protocol nodes and preallocates some additional space
   (expanding the protocol file) when writing past the file’s end. This can be
   useful for file systems with slow allocation.

   Supported options:

   .. program:: preallocate
   .. option:: prealloc-align

      On preallocation, align the file length to this value (in bytes), default 1M.

   .. program:: preallocate
   .. option:: prealloc-size

      How much to preallocate (in bytes), default 128M.
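
A sketch of how the filter might be placed between a qcow2 format node and a
file protocol node with ``-blockdev``. The node name ``disk0``, the image
path and the chosen sizes are illustrative assumptions. Both options are
given in bytes, so the values are computed first::

  align=$((1024 * 1024))         # 1M, same as the default
  size=$((256 * 1024 * 1024))    # 256M instead of the default 128M

  # qcow2 (format) -> preallocate (filter) -> file (protocol)
  opts="driver=qcow2,node-name=disk0"
  opts="$opts,file.driver=preallocate"
  opts="$opts,file.prealloc-align=$align,file.prealloc-size=$size"
  opts="$opts,file.file.driver=file,file.file.filename=/path/to/image.qcow2"

  # Print the argument to pass to the system emulator.
  echo "-blockdev $opts"

The printed string is meant to be appended to a QEMU command line; the script
itself does not start an emulator.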