QEMU Virtual NVDIMM
===================

This document explains how to use the virtual NVDIMM (vNVDIMM) feature,
which has been available since QEMU v2.6.0.

QEMU currently implements only the persistent memory mode of the
vNVDIMM device, not the block window mode.

Basic Usage
-----------

The storage of a vNVDIMM device in QEMU is provided by a memory
backend (i.e. memory-backend-file or memory-backend-ram). A simple
way to create a vNVDIMM device at startup time is to use the
following command line options:

 -machine pc,nvdimm
 -m $RAM_SIZE,slots=$N,maxmem=$MAX_SIZE
 -object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE
 -device nvdimm,id=nvdimm1,memdev=mem1

Where,

 - the "nvdimm" machine option enables the vNVDIMM feature.

 - "slots=$N" should be equal to or larger than the total number of
   normal RAM devices and vNVDIMM devices, e.g. $N should be >= 2 here.

 - "maxmem=$MAX_SIZE" should be equal to or larger than the total size
   of normal RAM devices and vNVDIMM devices, e.g. $MAX_SIZE should be
   >= $RAM_SIZE + $NVDIMM_SIZE here.

 - "object memory-backend-file,id=mem1,share=on,mem-path=$PATH,size=$NVDIMM_SIZE"
   creates a backend storage of size $NVDIMM_SIZE on the file $PATH. All
   accesses to the virtual NVDIMM device go to the file $PATH.

   "share=on/off" controls the visibility of guest writes. If
   "share=on", guest writes are applied to the backend file. If
   another guest uses the same backend file with "share=on", those
   writes are visible to it as well. If "share=off", guest writes
   are not applied to the backend file and are therefore invisible
   to other guests.

 - "device nvdimm,id=nvdimm1,memdev=mem1" creates a virtual NVDIMM
   device whose storage is provided by the memory backend above.

Multiple vNVDIMM devices can be created by providing multiple pairs of
"-object" and "-device" options.

With the command line options above, a guest OS that has a proper
NVDIMM driver should be able to detect an NVDIMM device that is in
persistent memory mode and whose size is $NVDIMM_SIZE.
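
As a rough illustration, the four options above can be assembled by a
small shell helper. This is only a sketch: the function name
"nvdimm_basic_args" and the convention of passing sizes as whole MiB
values are assumptions made for this example and are not part of QEMU.

```shell
# Print the vNVDIMM-related QEMU options for one RAM device plus one
# vNVDIMM device, one option per line.
# Arguments: RAM size in MiB, vNVDIMM size in MiB, backend file path.
# Passing sizes as plain MiB integers is an assumption of this sketch;
# it lets the shell add them up to derive maxmem.
nvdimm_basic_args() {
    ram_mb=$1
    nvdimm_mb=$2
    path=$3
    # One slot for the RAM device and one for the vNVDIMM device.
    slots=2
    # maxmem must cover the RAM plus the vNVDIMM.
    maxmem_mb=$(( ram_mb + nvdimm_mb ))
    printf '%s\n' \
        "-machine pc,nvdimm" \
        "-m ${ram_mb}M,slots=${slots},maxmem=${maxmem_mb}M" \
        "-object memory-backend-file,id=mem1,share=on,mem-path=${path},size=${nvdimm_mb}M" \
        "-device nvdimm,id=nvdimm1,memdev=mem1"
}
```

For example, "nvdimm_basic_args 4096 2048 /tmp/nvdimm.img" prints
"-m 4096M,slots=2,maxmem=6144M" as its second line. The helper uses the
smallest values that satisfy the rules above; larger "slots" and
"maxmem" values are needed if more devices will be added later.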

Note:

1. Prior to QEMU v2.8.0, if memory-backend-file is used and the actual
   backend file size is not equal to the size given by the "size"
   option, QEMU truncates the backend file with ftruncate(2), which
   corrupts the existing data in the backend file, especially in the
   shrink case.

   QEMU v2.8.0 and later check the backend file size against the
   "size" option. If they do not match, QEMU reports an error and
   aborts in order to avoid data corruption.

2. QEMU v2.6.0 only puts a basic alignment requirement on the "size"
   option of memory-backend-file, e.g. 4KB alignment on x86. However,
   QEMU v2.7.0 puts an additional alignment requirement, which may
   require a larger value than the basic one, e.g. 2MB on x86. This
   change breaks uses of memory-backend-file that only satisfy the
   basic alignment.

   QEMU v2.8.0 and later remove the additional alignment requirement
   on non-s390x architectures, so the affected memory-backend-file
   configurations work again.
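
Both notes can be checked before QEMU is started. The following is a
minimal sketch for a backend that is a regular file; the function
names are hypothetical, and all sizes are assumed to be given in bytes.

```shell
# Succeed only if FILE is a regular file whose size in bytes equals
# EXPECTED_BYTES, i.e. the value that will be passed as "size".
backend_size_matches() {
    file=$1
    expected_bytes=$2
    [ -f "$file" ] || return 1
    # wc -c prints the byte count; $(( )) strips any padding around it.
    actual_bytes=$(( $(wc -c < "$file") ))
    [ "$actual_bytes" -eq "$expected_bytes" ]
}

# Succeed only if SIZE_BYTES is a multiple of ALIGN_BYTES, e.g. 4096
# for the basic x86 requirement or 2097152 for the additional one.
size_is_aligned() {
    size_bytes=$1
    align_bytes=$2
    [ $(( size_bytes % align_bytes )) -eq 0 ]
}
```

A launcher script could call, say, "backend_size_matches nvdimm.img
4294967296" and refuse to start QEMU on failure. With QEMU versions
before v2.8.0 this avoids the silent truncation described in note 1.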

Label
-----

QEMU v2.7.0 and later implement label support for vNVDIMM devices.
To enable labels on a vNVDIMM device, add the "label-size=$SZ" option
to "-device nvdimm", e.g.

 -device nvdimm,id=nvdimm1,memdev=mem1,label-size=128K

Note:

1. The minimal label size is 128KB.

2. QEMU v2.7.0 and later store labels at the end of the backend
   storage. If a memory backend file that was previously used as the
   backend of a vNVDIMM device without labels is now used for a
   vNVDIMM device with labels, the data in the label area at the end
   of the file becomes inaccessible to the guest. If any useful data
   (e.g. file system metadata) was stored there, this may result in
   guest data corruption (e.g. a broken guest file system).
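
Because the label area occupies the end of the backend storage, the
byte range that becomes inaccessible to the guest follows directly
from the backend size and the label size. A small sketch of that
arithmetic, using a hypothetical helper name and sizes in bytes:

```shell
# Print "START END" (inclusive byte offsets) of the label area for a
# backend of SIZE_BYTES when LABEL_BYTES are reserved at its end.
# Fails if the label area would be larger than the backend itself.
label_area_range() {
    size_bytes=$1
    label_bytes=$2
    [ "$label_bytes" -le "$size_bytes" ] || return 1
    echo "$(( size_bytes - label_bytes )) $(( size_bytes - 1 ))"
}
```

For a 4GB backend with "label-size=128K", the call
"label_area_range 4294967296 131072" prints "4294836224 4294967295".
Checking whether an existing image holds data in that range is a
reasonable precaution before enabling labels on it.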

Hotplug
-------

QEMU v2.8.0 and later implement hotplug support for vNVDIMM devices.
Similarly to RAM hotplug, vNVDIMM hotplug is accomplished by the two
monitor commands "object_add" and "device_add".

For example, the following commands add another 4GB vNVDIMM device to
the guest:

 (qemu) object_add memory-backend-file,id=mem2,share=on,mem-path=new_nvdimm.img,size=4G
 (qemu) device_add nvdimm,id=nvdimm2,memdev=mem2

Note:

1. Each hotplugged vNVDIMM device consumes one memory slot. Users
   should always ensure that the memory option "-m ...,slots=N"
   specifies enough slots, i.e.
     N >= number of RAM devices +
          number of statically plugged vNVDIMM devices +
          number of hotplugged vNVDIMM devices

2. A similar requirement applies to the memory option
   "-m ...,maxmem=M", i.e.
     M >= size of RAM devices +
          size of statically plugged vNVDIMM devices +
          size of hotplugged vNVDIMM devices
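
The two inequalities above are simple sums, so they can be evaluated
when the guest is planned. The sketch below uses hypothetical helper
names and assumes that all sizes are given as whole MiB values.

```shell
# Minimum "slots" value: one slot per RAM device and per vNVDIMM
# device, whether plugged at startup or hotplugged later.
min_slots() {
    ram_devices=$1
    static_nvdimms=$2
    hotplug_nvdimms=$3
    echo $(( ram_devices + static_nvdimms + hotplug_nvdimms ))
}

# Minimum "maxmem" value in MiB: the sum of every size passed as an
# argument (RAM, static vNVDIMMs and hotplugged vNVDIMMs, in MiB).
min_maxmem_mb() {
    total_mb=0
    for size_mb in "$@"; do
        total_mb=$(( total_mb + size_mb ))
    done
    echo "$total_mb"
}
```

For one RAM device, one static vNVDIMM device and the hotplugged
device from the example above, "min_slots 1 1 1" prints 3. If all
three are 4096 MiB, "min_maxmem_mb 4096 4096 4096" prints 12288.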

Alignment
---------

QEMU uses mmap(2) to map vNVDIMM backends and aligns the mapping
address to the page size (getpagesize(2)) by default. However, some
types of backends may require an alignment different from the page
size. For that case, QEMU v2.12.0 and later provide the 'align' option
of memory-backend-file so that users can specify the proper alignment.

For example, device DAX requires 2 MB alignment, so the following
QEMU command line options can be used to make it (/dev/dax0.0) the
backend of a vNVDIMM:

 -object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M
 -device nvdimm,id=nvdimm1,memdev=mem1

Guest Data Persistence
----------------------

Though QEMU supports multiple types of vNVDIMM backends on Linux,
currently the only one that can guarantee the persistence of guest
writes is device DAX on a real NVDIMM device (e.g., /dev/dax0.0),
because guest accesses to it do not involve any host-side kernel cache.

When using other types of backends, it is suggested to set the
'unarmed' option of '-device nvdimm' to 'on', which sets the unarmed
flag of the guest NVDIMM region mapping structure. This flag indicates
to guest software that the vNVDIMM device contains a region that
cannot accept persistent writes. As a result, the guest Linux NVDIMM
driver, for example, marks such a vNVDIMM device as read-only.
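
A possible way to generate the options for such a backend is sketched
below. The helper name "unarmed_nvdimm_args" is hypothetical; the
options themselves are the ones used earlier in this document, with
"unarmed=on" appended to the device.

```shell
# Print the -object/-device options for a vNVDIMM whose backend cannot
# guarantee persistence, so that the device is marked as unarmed.
# Arguments: backend path, backend size (any value QEMU accepts for
# "size", e.g. 4G).
unarmed_nvdimm_args() {
    path=$1
    size=$2
    printf '%s\n' \
        "-object memory-backend-file,id=mem1,share=on,mem-path=${path},size=${size}" \
        "-device nvdimm,id=nvdimm1,memdev=mem1,unarmed=on"
}
```

Deciding whether a given backend is device DAX on a real NVDIMM device
is left to the caller; this sketch does not try to detect it.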

NVDIMM Persistence
------------------

ACPI 6.2 Errata A added support for a new Platform Capabilities Structure
which allows the platform to communicate what features it supports related to
NVDIMM data persistence. Users can provide a persistence value to a guest via
the optional "nvdimm-persistence" machine command line option:

 -machine pc,accel=kvm,nvdimm,nvdimm-persistence=cpu

There are currently two valid values for this option:

"mem-ctrl" - The platform supports flushing dirty data from the memory
             controller to the NVDIMMs in the event of power loss.

"cpu"      - The platform supports flushing dirty data from the CPU cache to
             the NVDIMMs in the event of power loss. This implies that the
             platform also supports flushing dirty data through the memory
             controller on power loss.

If the vNVDIMM backend is in host persistent memory that can be accessed
according to the SNIA NVM Programming Model [1] (e.g., Intel NVDIMM), it is
suggested to set the 'pmem' option of memory-backend-file to 'on'. When 'pmem'
is 'on' and QEMU is built with libpmem [2] support (configured with
--enable-libpmem), QEMU performs the operations necessary to guarantee the
persistence of its own writes to the vNVDIMM backend (e.g., in vNVDIMM label
emulation and live migration). If 'pmem' is 'on' while there is no libpmem
support, QEMU exits and reports a "lack of libpmem support" message, so that
it never runs without the requested persistence guarantee.
For example, to ensure persistence for some backend file, use the QEMU
command line option:

 -object memory-backend-file,id=nv_mem,mem-path=/XXX/yyy,size=4G,pmem=on
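
References
----------

[1] NVM Programming Model (NPM)
    Version 1.2
    https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
[2] Persistent Memory Development Kit (PMDK), formerly known as NVML project, home page:
    http://pmem.io/pmdk/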
195 https://www.snia.org/sites/default/files/technical_work/final/NVMProgrammingModel_v1.2.pdf
196 [2] Persistent Memory Development Kit (PMDK), formerly known as NVML project, home page:
197 http://pmem.io/pmdk/