Commit | Line | Data |
---|---|---|
027e3332 AB |
1 | .. |
2 | Copyright (C) 2017, Emilio G. Cota <cota@braap.org> | |
3 | Copyright (c) 2019, Linaro Limited | |
4 | Written by Emilio Cota and Alex Bennée | |
5 | ||
6 | ================ | |
7 | QEMU TCG Plugins | |
8 | ================ | |
9 | ||
10 | QEMU TCG plugins provide a way for users to run experiments taking | |
11 | advantage of the total system control emulation can have over a guest. | |
12 | The plugin API provides a mechanism for plugins to subscribe to events | |
13 | during translation and execution and optionally call back into the | |
14 | plugin during these events. TCG plugins cannot change the system state; | |
15 | they can only monitor it passively. However, they can do this down to an | |
16 | individual instruction granularity including potentially subscribing | |
17 | to all load and store operations. | |
18 | ||
19 | API Stability | |
20 | ============= | |
21 | ||
22 | This is a new feature for QEMU and it does allow people to develop | |
23 | out-of-tree plugins that can be dynamically linked into a running QEMU | |
24 | process. However the project reserves the right to change or break the | |
25 | API should it need to do so. The best way to avoid breakage is to submit | |
26 | your plugin upstream so it can be updated if/when the API changes. | |
27 | ||
5c6ecbdc AB |
28 | API versioning |
29 | -------------- | |
30 | ||
31 | All plugins need to declare a symbol which exports the plugin API | |
32 | version they were built against. This can be done simply by:: | |
33 | ||
34 | QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION; | |
35 | ||
36 | The core code will refuse to load a plugin that doesn't export a | |
37 | `qemu_plugin_version` symbol or whose version is outside QEMU's | |
38 | supported range of API versions. | |
39 | ||
40 | Additionally the `qemu_info_t` structure which is passed to the | |
41 | `qemu_plugin_install` method of a plugin will detail the minimum and | |
42 | current API versions supported by QEMU. The API version will be | |
43 | incremented if new APIs are added. The minimum API version will be | |
44 | incremented if existing APIs are changed or removed. | |
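
The range check described above can be sketched as follows. This is only an illustration: the struct and function names are hypothetical, and the real fields and loader logic live in QEMU itself (see `include/qemu/qemu-plugin.h`).

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the minimum/current version pair that QEMU
 * reports to a plugin. The real layout is defined by qemu-plugin.h.
 */
struct api_version_range {
    int min;  /* oldest plugin API version still supported */
    int cur;  /* newest plugin API version implemented */
};

/* Returns true when a plugin built against plugin_version may be loaded. */
static bool plugin_version_ok(const struct api_version_range *r,
                              int plugin_version)
{
    return plugin_version >= r->min && plugin_version <= r->cur;
}
```

Under this model, adding a new API raises only `cur`, so older plugins keep loading, while changing or removing an API raises `min` and rejects plugins built against the old behaviour.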
027e3332 AB |
45 | |
46 | Exposure of QEMU internals | |
47 | -------------------------- | |
48 | ||
49 | The plugin architecture actively avoids leaking implementation details | |
50 | about how QEMU's translation works to the plugins. While there are | |
51 | concepts such as translation time and translation blocks, the | |
52 | details are opaque to plugins. The plugin is able to query select | |
53 | details of instructions and system configuration only through the | |
9675a9c6 AB |
54 | exported *qemu_plugin* functions. |
55 | ||
56 | Query Handle Lifetime | |
57 | --------------------- | |
58 | ||
59 | Each callback provides an opaque anonymous information handle which | |
60 | can usually be further queried to find out information about a | |
61 | translation, instruction or operation. The handles themselves are only | |
62 | valid during the lifetime of the callback so it is important that any | |
63 | information that is needed is extracted during the callback and saved | |
64 | by the plugin. | |
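
A minimal sketch of the "extract and save" pattern, using a hypothetical `demo_handle` type in place of a real opaque handle (the actual handles can only be queried through the exported plugin functions). The point is that the plugin keeps its own copy and never stores the handle or pointers derived from it.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a handle that is only valid inside a callback. */
struct demo_handle {
    uint64_t vaddr;
    const char *disas;  /* storage owned by the caller, not the plugin */
};

/* Plugin-owned copy that outlives the callback. */
struct demo_saved {
    uint64_t vaddr;
    char *disas;
};

/* Copies everything the plugin needs; returns NULL on allocation failure. */
static struct demo_saved *demo_save(const struct demo_handle *h)
{
    struct demo_saved *s = malloc(sizeof(*s));
    if (!s) {
        return NULL;
    }
    size_t len = strlen(h->disas) + 1;
    s->disas = malloc(len);
    if (!s->disas) {
        free(s);
        return NULL;
    }
    memcpy(s->disas, h->disas, len);
    s->vaddr = h->vaddr;
    return s;
}

/* Releases a copy made by demo_save(). */
static void demo_saved_free(struct demo_saved *s)
{
    if (s) {
        free(s->disas);
        free(s);
    }
}
```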
027e3332 | 65 | |
ca955bd7 AB |
66 | API |
67 | === | |
68 | ||
69 | .. kernel-doc:: include/qemu/qemu-plugin.h | |
70 | ||
027e3332 AB |
71 | Usage |
72 | ===== | |
73 | ||
5c6ecbdc | 74 | The QEMU binary needs to be compiled for plugin support:: |
027e3332 | 75 | |
5c6ecbdc | 76 | configure --enable-plugins |
027e3332 AB |
77 | |
78 | Once built, a program can be run with multiple plugins loaded, each with | |
5c6ecbdc | 79 | their own arguments:: |
027e3332 | 80 | |
5c6ecbdc | 81 | $QEMU $OTHER_QEMU_ARGS \ |
027e3332 AB |
82 | -plugin tests/plugin/libhowvec.so,arg=inline,arg=hint \ |
83 | -plugin tests/plugin/libhotblocks.so | |
84 | ||
85 | Arguments are plugin specific and can be used to modify their | |
86 | behaviour. In this case the howvec plugin is being asked to use inline | |
87 | ops to count and break down the hint instructions by type. | |
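
As a rough sketch of how a plugin might interpret such arguments, the helper below scans an `argc`/`argv` pair of the kind handed to the plugin's install function. It assumes each `arg=VALUE` on the command line arrives as the string `VALUE`; the option struct and function name are hypothetical and not taken from the howvec source.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical option set; a real plugin defines whatever it needs. */
struct demo_opts {
    bool use_inline;   /* "inline" was requested */
    bool count_hints;  /* "hint" was requested */
    int unknown;       /* number of unrecognised arguments */
};

/* Assumption: each arg=VALUE is delivered to the plugin as "VALUE". */
static void demo_parse_args(int argc, char **argv, struct demo_opts *opts)
{
    memset(opts, 0, sizeof(*opts));
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "inline") == 0) {
            opts->use_inline = true;
        } else if (strcmp(argv[i], "hint") == 0) {
            opts->count_hints = true;
        } else {
            opts->unknown++;
        }
    }
}
```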
88 | ||
89 | Plugin Life cycle | |
90 | ================= | |
91 | ||
92 | First the plugin is loaded and the public qemu_plugin_install function | |
93 | is called. The plugin will then register callbacks for various plugin | |
94 | events. Generally plugins will register a handler for the *atexit* event | |
95 | if they want to dump a summary of collected information once the | |
96 | program/system has finished running. | |
97 | ||
98 | When a registered event occurs the plugin callback is invoked. The | |
99 | callbacks may provide additional information. In the case of a | |
100 | translation event the plugin has an option to enumerate the | |
101 | instructions in a block of instructions and optionally register | |
102 | callbacks to some or all instructions when they are executed. | |
103 | ||
104 | There is also a facility to add an inline event where code to | |
105 | increment a counter can be directly inlined with the translation. | |
106 | Currently only a simple increment is supported. This is not atomic, so | |
107 | counts can be missed. If you want absolute precision you should use a | |
108 | callback which can then ensure atomicity itself. | |
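
One way a callback could keep an exact count is with a C11 atomic counter. This is a sketch, not code from QEMU: `demo_on_exec` is a hypothetical callback body, and in a real plugin it would be reached through a callback registered via the plugin API and could run concurrently on several vCPU threads.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared by all threads; static storage starts at zero. */
static atomic_uint_fast64_t exec_count;

/* Hypothetical callback body: one atomic increment per executed event. */
static void demo_on_exec(void)
{
    atomic_fetch_add_explicit(&exec_count, 1, memory_order_relaxed);
}

/* Reads the running total, e.g. from an atexit handler. */
static uint64_t demo_exec_total(void)
{
    return atomic_load_explicit(&exec_count, memory_order_relaxed);
}
```

Relaxed ordering is enough here because only the final total matters, not its ordering relative to other memory accesses.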
109 | ||
110 | Finally when QEMU exits all the registered *atexit* callbacks are | |
111 | invoked. | |
112 | ||
113 | Internals | |
114 | ========= | |
115 | ||
116 | Locking | |
117 | ------- | |
118 | ||
119 | We have to ensure we cannot deadlock, particularly under MTTCG. For | |
120 | this we acquire a lock when called from plugin code. We also keep the | |
121 | list of callbacks under RCU so that we do not have to hold the lock | |
122 | when calling the callbacks. This is also for performance, since some | |
123 | callbacks (e.g. memory access callbacks) might be called very | |
124 | frequently. | |
125 | ||
126 | * A consequence of this is that we keep our own list of CPUs, so that | |
127 | we do not have to worry about locking order wrt cpu_list_lock. | |
128 | * We use a recursive lock, since we can get registration calls from | |
129 | callbacks. | |
130 | ||
131 | As a result registering/unregistering callbacks is "slow", since it | |
132 | takes a lock. But this is very infrequent; we want performance when | |
133 | calling (or not calling) callbacks, not when registering them. Using | |
134 | RCU is great for this. | |
135 | ||
136 | We support the uninstallation of a plugin at any time (e.g. from | |
137 | plugin callbacks). This allows plugins to remove themselves if they no | |
138 | longer want to instrument the code. This operation is asynchronous | |
139 | which means callbacks may still occur after the uninstall operation is | |
140 | requested. The plugin isn't completely uninstalled until the safe work | |
141 | has executed while all vCPUs are quiescent. | |
c17a386b AB |
142 | |
143 | Example Plugins | |
144 | =============== | |
145 | ||
146 | There are a number of plugins included with QEMU and you are | |
147 | encouraged to contribute your own plugins upstream. There is a | |
148 | `contrib/plugins` directory where they can go. | |
149 | ||
150 | - tests/plugin | |
151 | ||
152 | These are some basic plugins that are used to test and exercise the | |
153 | API during the `make check-tcg` target. | |
154 | ||
155 | - contrib/plugins/hotblocks.c | |
156 | ||
157 | The hotblocks plugin allows you to examine where the hot paths of | |
158 | execution are in your program. Once the program has finished you will | |
159 | get a sorted list of blocks reporting the starting PC, translation | |
160 | count, number of instructions and execution count. This will work best | |
161 | with linux-user execution as system emulation tends to generate | |
162 | re-translations as blocks from different programs get swapped in and | |
163 | out of system memory. | |
164 | ||
165 | If your program is single-threaded you can use the `inline` option for | |
166 | slightly faster (but not thread safe) counters. | |
167 | ||
168 | Example:: | |
169 | ||
170 | ./aarch64-linux-user/qemu-aarch64 \ | |
171 | -plugin contrib/plugins/libhotblocks.so -d plugin \ | |
172 | ./tests/tcg/aarch64-linux-user/sha1 | |
173 | SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6 | |
174 | collected 903 entries in the hash table | |
175 | pc, tcount, icount, ecount | |
176 | 0x0000000041ed10, 1, 5, 66087 | |
177 | 0x000000004002b0, 1, 4, 66087 | |
178 | ... | |
179 | ||
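
The report above is a sorted list of per-block records. A minimal sketch of that final step, assuming the list is ordered by execution count (highest first); the struct and function names are hypothetical and simply mirror the report columns.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-block record mirroring the columns of the report. */
struct block_stat {
    uint64_t pc;      /* starting PC */
    uint64_t tcount;  /* translation count */
    uint64_t icount;  /* number of instructions */
    uint64_t ecount;  /* execution count */
};

/* qsort comparator: highest execution count first. */
static int by_ecount_desc(const void *a, const void *b)
{
    const struct block_stat *x = a;
    const struct block_stat *y = b;
    return (x->ecount < y->ecount) - (x->ecount > y->ecount);
}

/* Orders the records in place so the hottest blocks come first. */
static void sort_hot_blocks(struct block_stat *blocks, size_t n)
{
    qsort(blocks, n, sizeof(blocks[0]), by_ecount_desc);
}
```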
180 | - contrib/plugins/hotpages.c | |
181 | ||
182 | Similar to hotblocks but this time tracks memory accesses:: | |
183 | ||
184 | ./aarch64-linux-user/qemu-aarch64 \ | |
185 | -plugin contrib/plugins/libhotpages.so -d plugin \ | |
186 | ./tests/tcg/aarch64-linux-user/sha1 | |
187 | SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6 | |
188 | Addr, RCPUs, Reads, WCPUs, Writes | |
189 | 0x000055007fe000, 0x0001, 31747952, 0x0001, 8835161 | |
190 | 0x000055007ff000, 0x0001, 29001054, 0x0001, 8780625 | |
191 | 0x00005500800000, 0x0001, 687465, 0x0001, 335857 | |
192 | 0x0000000048b000, 0x0001, 130594, 0x0001, 355 | |
193 | 0x0000000048a000, 0x0001, 1826, 0x0001, 11 | |
194 | ||
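
The addresses in the first column are page-aligned. Grouping accesses by page can be sketched as masking off the low bits of each accessed address. The 4 KiB page size here is an assumption for illustration, and the names are hypothetical.

```c
#include <stdint.h>

/* Assumed page size for this sketch; must be a power of two. */
#define DEMO_PAGE_SIZE 4096u

/* Returns the base address of the page containing vaddr. */
static uint64_t demo_page_base(uint64_t vaddr)
{
    return vaddr & ~(uint64_t)(DEMO_PAGE_SIZE - 1);
}
```

Every access whose address maps to the same base would then update the same read or write counter.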
195 | - contrib/plugins/howvec.c | |
196 | ||
197 | This is an instruction classifier so can be used to count different | |
198 | types of instructions. It has a number of options to refine which get | |
199 | counted. You can give an argument for a class of instructions to break | |
200 | it down fully, so for example to see all the system register | |
201 | accesses:: | |
202 | ||
203 | ./aarch64-softmmu/qemu-system-aarch64 $(QEMU_ARGS) \ | |
204 | -append "root=/dev/sda2 systemd.unit=benchmark.service" \ | |
205 | -smp 4 -plugin ./contrib/plugins/libhowvec.so,arg=sreg -d plugin | |
206 | ||
207 | which will lead to a sorted list after the class breakdown:: | |
208 | ||
209 | Instruction Classes: | |
210 | Class: UDEF not counted | |
211 | Class: SVE (68 hits) | |
212 | Class: PCrel addr (47789483 hits) | |
213 | Class: Add/Sub (imm) (192817388 hits) | |
214 | Class: Logical (imm) (93852565 hits) | |
215 | Class: Move Wide (imm) (76398116 hits) | |
216 | Class: Bitfield (44706084 hits) | |
217 | Class: Extract (5499257 hits) | |
218 | Class: Cond Branch (imm) (147202932 hits) | |
219 | Class: Exception Gen (193581 hits) | |
220 | Class: NOP not counted | |
221 | Class: Hints (6652291 hits) | |
222 | Class: Barriers (8001661 hits) | |
223 | Class: PSTATE (1801695 hits) | |
224 | Class: System Insn (6385349 hits) | |
225 | Class: System Reg counted individually | |
226 | Class: Branch (reg) (69497127 hits) | |
227 | Class: Branch (imm) (84393665 hits) | |
228 | Class: Cmp & Branch (110929659 hits) | |
229 | Class: Tst & Branch (44681442 hits) | |
230 | Class: AdvSimd ldstmult (736 hits) | |
231 | Class: ldst excl (9098783 hits) | |
232 | Class: Load Reg (lit) (87189424 hits) | |
233 | Class: ldst noalloc pair (3264433 hits) | |
234 | Class: ldst pair (412526434 hits) | |
235 | Class: ldst reg (imm) (314734576 hits) | |
236 | Class: Loads & Stores (2117774 hits) | |
237 | Class: Data Proc Reg (223519077 hits) | |
238 | Class: Scalar FP (31657954 hits) | |
239 | Individual Instructions: | |
240 | Instr: mrs x0, sp_el0 (2682661 hits) (op=0xd5384100/ System Reg) | |
241 | Instr: mrs x1, tpidr_el2 (1789339 hits) (op=0xd53cd041/ System Reg) | |
242 | Instr: mrs x2, tpidr_el2 (1513494 hits) (op=0xd53cd042/ System Reg) | |
243 | Instr: mrs x0, tpidr_el2 (1490823 hits) (op=0xd53cd040/ System Reg) | |
244 | Instr: mrs x1, sp_el0 (933793 hits) (op=0xd5384101/ System Reg) | |
245 | Instr: mrs x2, sp_el0 (699516 hits) (op=0xd5384102/ System Reg) | |
246 | Instr: mrs x4, tpidr_el2 (528437 hits) (op=0xd53cd044/ System Reg) | |
247 | Instr: mrs x30, ttbr1_el1 (480776 hits) (op=0xd538203e/ System Reg) | |
248 | Instr: msr ttbr1_el1, x30 (480713 hits) (op=0xd518203e/ System Reg) | |
249 | Instr: msr vbar_el1, x30 (480671 hits) (op=0xd518c01e/ System Reg) | |
250 | ... | |
251 | ||
252 | At the moment, to find the argument shorthand for a class you need to | |
253 | examine the source code of the plugin, specifically the `*opt` | |
254 | argument in the InsnClassExecCount tables. | |
255 | ||
256 | - contrib/plugins/lockstep.c | |
257 | ||
258 | This is a debugging tool for developers who want to find out when and | |
259 | where execution diverges after a subtle change to TCG code generation. | |
260 | It is not an exact science and results are likely to be mixed once | |
261 | asynchronous events are introduced. While the use of -icount can | |
262 | introduce determinism to the execution flow, it doesn't always follow that | |
263 | the translation sequence will be exactly the same. Typically this is | |
264 | caused by a timer firing to service the GUI causing a block to end | |
265 | early. However in some cases it has proved to be useful in pointing | |
266 | people at roughly where execution diverges. The only argument you need | |
267 | for the plugin is a path for the socket the two instances will | |
268 | communicate over:: | |
269 | ||
270 | ||
271 | ./sparc-softmmu/qemu-system-sparc -monitor none -parallel none \ | |
272 | -net none -M SS-20 -m 256 -kernel day11/zImage.elf \ | |
273 | -plugin ./contrib/plugins/liblockstep.so,arg=lockstep-sparc.sock \ | |
274 | -d plugin,nochain | |
275 | ||
276 | which will eventually report:: | |
277 | ||
278 | qemu-system-sparc: warning: nic lance.0 has no peer | |
279 | @ 0x000000ffd06678 vs 0x000000ffd001e0 (2/1 since last) | |
280 | @ 0x000000ffd07d9c vs 0x000000ffd06678 (3/1 since last) | |
281 | Δ insn_count @ 0x000000ffd07d9c (809900609) vs 0x000000ffd06678 (809900612) | |
282 | previously @ 0x000000ffd06678/10 (809900609 insns) | |
283 | previously @ 0x000000ffd001e0/4 (809900599 insns) | |
284 | previously @ 0x000000ffd080ac/2 (809900595 insns) | |
285 | previously @ 0x000000ffd08098/5 (809900593 insns) | |
286 | previously @ 0x000000ffd080c0/1 (809900588 insns) | |
287 | ||
a622d64e AB |
288 | - contrib/plugins/hwprofile.c |
289 | ||
290 | The hwprofile tool can only be used with system emulation and allows | |
291 | the user to see which hardware is accessed and how often. It has a number of options: | |
292 | ||
293 | * arg=read or arg=write | |
294 | ||
295 | By default the plugin tracks both reads and writes. You can use one | |
296 | of these options to limit the tracking to just one class of accesses. | |
297 | ||
298 | * arg=source | |
299 | ||
300 | Includes a detailed breakdown of the guest PCs that made each | |
301 | access. Not compatible with arg=pattern. Example output:: | |
302 | ||
303 | cirrus-low-memory @ 0xfffffd00000a0000 | |
304 | pc:fffffc0000005cdc, 1, 256 | |
305 | pc:fffffc0000005ce8, 1, 256 | |
306 | pc:fffffc0000005cec, 1, 256 | |
307 | ||
308 | * arg=pattern | |
309 | ||
310 | Instead break down the accesses based on the offset into the HW | |
311 | region. This can be useful for seeing the most used registers of a | |
312 | device. Example output:: | |
313 | ||
314 | pci0-conf @ 0xfffffd01fe000000 | |
315 | off:00000004, 1, 1 | |
316 | off:00000010, 1, 3 | |
317 | off:00000014, 1, 3 | |
318 | off:00000018, 1, 2 | |
319 | off:0000001c, 1, 2 | |
320 | off:00000020, 1, 2 | |
321 | ... |