Commit | Line | Data |
---|---|---|
4d7fe02b AB |
1 | .. |
2 | Copyright (c) 2020, Linaro Limited | |
3 | Written by Alex Bennée | |
4 | ||
5 | ||
6 | ======================== | |
7 | TCG Instruction Counting | |
8 | ======================== | |
9 | ||
10 | TCG has long supported a feature known as icount which allows for | |
11 | instruction counting during execution. This should not be confused | |
12 | with cycle-accurate emulation - QEMU does not attempt to emulate how | |
13 | long an instruction would take on real hardware. That is a job for | |
14 | other more detailed (and slower) tools that simulate the rest of a | |
15 | micro-architecture. | |
16 | ||
17 | This feature is only available for system emulation and is | |
18 | incompatible with multi-threaded TCG. It can be used to better align | |
19 | execution time with wall-clock time so a "slow" device doesn't run too | |
20 | fast on modern hardware. It can also provide a degree of | |
21 | deterministic execution and is an essential part of the record/replay | |
22 | support in QEMU. | |
23 | ||
24 | Core Concepts | |
25 | ============= | |
26 | ||
27 | At its heart icount is simply a count of executed instructions which | |
28 | is stored in the TimersState of QEMU's timer sub-system. The number of | |
29 | executed instructions can then be used to calculate QEMU_CLOCK_VIRTUAL | |
30 | which represents the amount of elapsed time in the system since | |
31 | execution started. Depending on the icount mode this may either be a | |
32 | fixed number of ns per instruction or adjusted as execution continues | |
33 | to keep wall clock time and virtual time in sync. | |
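The fixed-rate case can be illustrated with a minimal sketch. It assumes the per-instruction cost is a power of two, 2^shift ns, in the style of the ``shift`` parameter of the ``-icount`` option; the helper name ``icount_to_ns_sketch`` is hypothetical and is not QEMU's actual implementation. The adaptive mode, where the rate is adjusted during execution, is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): convert a count of executed
 * instructions into elapsed virtual nanoseconds, assuming every
 * instruction costs a fixed 2^shift ns.
 */
static int64_t icount_to_ns_sketch(int64_t icount, unsigned shift)
{
    /* Keep the toy model well away from overflow. */
    assert(icount >= 0 && shift < 32);
    return icount * ((int64_t)1 << shift);
}
```

For example, under this assumption ``icount_to_ns_sketch(1000, 3)`` treats 1000 executed instructions as 8000 ns of virtual time.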
34 | ||
35 | To be able to calculate the number of executed instructions the | |
36 | translator starts by allocating a budget of instructions to be | |
37 | executed. The budget of instructions is limited by how long it will be | |
38 | until the next timer will expire. We store this budget as part of a | |
39 | vCPU icount_decr field which is shared with the machinery for handling | |
40 | cpu_exit(). The whole field is checked at the start of every | |
41 | translated block and, if an exit was requested, causes a return to | |
42 | the outer loop to deal with whatever caused the exit. | |
43 | ||
44 | In the case of icount, before the flag is checked we subtract the | |
45 | number of instructions the translation block would execute. If this | |
46 | would cause the instruction budget to go negative we exit the main | |
47 | loop and generate a new translation block with exactly the right | |
48 | number of instructions to take the budget to 0, meaning whatever timer | |
49 | was due to expire will expire exactly when we exit the main run loop. | |
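The budget accounting described above can be modelled with a small toy sketch. Everything here is an assumption made for illustration: ``ToyVCPU`` and ``toy_run_block`` are hypothetical names, the budget is a plain integer rather than the real icount_decr field, and "regenerating" a block is reduced to returning a shorter instruction count.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a vCPU's instruction budget; not QEMU's real layout. */
typedef struct {
    int32_t budget;   /* instructions left before the next timer expires */
} ToyVCPU;

/*
 * Model the check made on entry to a translated block of tb_insns
 * instructions. Returns how many instructions are actually run: the
 * whole block if the budget covers it, otherwise a shorter block sized
 * to take the budget exactly to zero.
 */
static int32_t toy_run_block(ToyVCPU *cpu, int32_t tb_insns)
{
    assert(cpu->budget >= 0 && tb_insns > 0);
    if (cpu->budget >= tb_insns) {
        cpu->budget -= tb_insns;
        return tb_insns;
    }
    /* Budget would go negative: run a block of exactly 'budget' insns. */
    int32_t shortened = cpu->budget;
    cpu->budget = 0;
    return shortened;
}
```

Starting from a budget of 10 and repeatedly running 4-instruction blocks, this model executes 4, 4 and then 2 instructions, leaving the budget at exactly 0 at the point where the timer is due.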
50 | ||
51 | Dealing with MMIO | |
52 | ----------------- | |
53 | ||
54 | While we can adjust the instruction budget for known events like timer | |
55 | expiry we cannot do the same for MMIO. Every load/store we execute | |
56 | might trigger an I/O event, at which point we will need an | |
57 | up-to-date and accurate reading of the icount number. | |
58 | ||
59 | To deal with this case, when an I/O access is made we: | |
60 | ||
61 | - restore un-executed instructions to the icount budget | |
62 | - re-compile a single [1]_ instruction block for the current PC | |
63 | - exit the cpu loop and execute the re-compiled block | |
64 | ||
65 | The new block is created with the CF_LAST_IO compile flag which | |
66 | ensures the final instruction translation starts with a call to | |
67 | gen_io_start() so we don't enter a perpetual loop constantly | |
68 | recompiling a single instruction block. For translators using the | |
69 | common translator_loop this is done automatically. | |
70 | ||
71 | .. [1] sometimes two instructions if dealing with delay slots | |
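The bookkeeping behind the three steps listed above can be sketched as a toy model. It assumes the whole block was charged to the budget on entry and that the I/O instruction sits at a known zero-based index within the block; ``ToyIoVCPU``, ``toy_restore_unexecuted`` and ``toy_run_single_insn_block`` are hypothetical names, not QEMU functions, and the delay-slot case from the footnote is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a vCPU's instruction budget; not QEMU's real layout. */
typedef struct {
    int32_t budget;
} ToyIoVCPU;

/*
 * The budget was already charged for all tb_insns instructions when the
 * block was entered. If the instruction at zero-based index io_index
 * performs I/O, give back every instruction that has not completed:
 * the I/O instruction itself and everything after it.
 */
static void toy_restore_unexecuted(ToyIoVCPU *cpu, int32_t tb_insns,
                                   int32_t io_index)
{
    assert(io_index >= 0 && io_index < tb_insns);
    cpu->budget += tb_insns - io_index;
}

/* Running the re-compiled single-instruction block charges exactly one. */
static void toy_run_single_insn_block(ToyIoVCPU *cpu)
{
    assert(cpu->budget >= 1);
    cpu->budget -= 1;
}
```

With a budget of 100, an 8-instruction block and an I/O access at index 5, the model leaves the budget at 95 after the restore (only the 5 completed instructions are charged), so the count is accurate when the single-instruction block performs the access, and at 94 once that block has run.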
72 | ||
73 | Other I/O operations | |
74 | -------------------- | |
75 | ||
76 | MMIO isn't the only type of operation for which we might need a | |
77 | correct and accurate clock. I/O port instructions and accesses to | |
78 | system registers are the common examples here. These instructions have | |
79 | to be handled by the individual translators which have the knowledge | |
80 | of which operations are I/O operations. | |
81 | ||
82 | When the translator is handling an instruction of this kind: | |
83 | ||
84 | * it must call gen_io_start() if icount is enabled, at some | |
85 | point before the generation of the code which actually does | |
86 | the I/O, using a code fragment similar to: | |
87 | ||
88 | .. code:: c | |
89 | ||
90 |     if (tb_cflags(s->base.tb) & CF_USE_ICOUNT) { | |
91 |         gen_io_start(); | |
92 |     } | |
93 | ||
94 | * it must end the TB immediately after this instruction |