]> git.proxmox.com Git - qemu.git/blame - tcg/README
suppressed unused macro handling
[qemu.git] / tcg / README
CommitLineData
c896fe29
FB
1Tiny Code Generator - Fabrice Bellard.
2
31) Introduction
4
5TCG (Tiny Code Generator) began as a generic backend for a C
6compiler. It was simplified to be used in QEMU. It also has its roots
7in the QOP code generator written by Paul Brook.
8
92) Definitions
10
11The TCG "target" is the architecture for which we generate the
12code. It is of course not the same as the "target" of QEMU which is
13the emulated architecture. As TCG started as a generic C backend used
14for cross compiling, it is assumed that the TCG target is different
15from the host, although it is never the case for QEMU.
16
17A TCG "function" corresponds to a QEMU Translated Block (TB).
18
19A TCG "temporary" is a variable only live in a given
9804c8e2 20function. Temporaries are allocated explicitly in each function.
c896fe29
FB
21
22A TCG "global" is a variable which is live in all the functions. They
23are defined before the functions defined. A TCG global can be a memory
24location (e.g. a QEMU CPU register), a fixed host register (e.g. the
25QEMU CPU state pointer) or a memory location which is stored in a
26register outside QEMU TBs (not implemented yet).
27
28A TCG "basic block" corresponds to a list of instructions terminated
29by a branch instruction.
30
313) Intermediate representation
32
333.1) Introduction
34
35TCG instructions operate on variables which are temporaries or
36globals. TCG instructions and variables are strongly typed. Two types
37are supported: 32 bit integers and 64 bit integers. Pointers are
38defined as an alias to 32 bit or 64 bit integers depending on the TCG
39target word size.
40
41Each instruction has a fixed number of output variable operands, input
42variable operands and always constant operands.
43
44The notable exception is the call instruction which has a variable
45number of outputs and inputs.
46
47In the textual form, output operands come first, followed by input
48operands, followed by constant operands. The output type is included
49in the instruction name. Constants are prefixed with a '$'.
50
51add_i32 t0, t1, t2 (t0 <- t1 + t2)
52
53sub_i64 t2, t3, $4 (t2 <- t3 - 4)
54
553.2) Assumptions
56
57* Basic blocks
58
59- Basic blocks end after branches (e.g. brcond_i32 instruction),
60 goto_tb and exit_tb instructions.
61- Basic blocks end before legacy dyngen operations.
62- Basic blocks start after the end of a previous basic block, at a
63 set_label instruction or after a legacy dyngen operation.
64
65After the end of a basic block, temporaries at destroyed and globals
66are stored at their initial storage (register or memory place
67depending on their declarations).
68
69* Floating point types are not supported yet
70
71* Pointers: depending on the TCG target, pointer size is 32 bit or 64
72 bit. The type TCG_TYPE_PTR is an alias to TCG_TYPE_I32 or
73 TCG_TYPE_I64.
74
75* Helpers:
76
77Using the tcg_gen_helper_x_y it is possible to call any function
811d4cf4 78taking i32, i64 or pointer types. Before calling an helper, all
c896fe29
FB
79globals are stored at their canonical location and it is assumed that
80the function can modify them. In the future, function modifiers will
81be allowed to tell that the helper does not read or write some globals.
82
83On some TCG targets (e.g. x86), several calling conventions are
84supported.
85
86* Branches:
87
88Use the instruction 'br' to jump to a label. Use 'jmp' to jump to an
89explicit address. Conditional branches can only jump to labels.
90
913.3) Code Optimizations
92
93When generating instructions, you can count on at least the following
94optimizations:
95
96- Single instructions are simplified, e.g.
97
98 and_i32 t0, t0, $0xffffffff
99
100 is suppressed.
101
102- A liveness analysis is done at the basic block level. The
103 information is used to suppress moves from a dead temporary to
104 another one. It is also used to remove instructions which compute
105 dead results. The later is especially useful for condition code
9804c8e2 106 optimization in QEMU.
c896fe29
FB
107
108 In the following example:
109
110 add_i32 t0, t1, t2
111 add_i32 t0, t0, $1
112 mov_i32 t0, $1
113
114 only the last instruction is kept.
115
116- A macro system is supported (may get closer to function inlining
117 some day). It is useful if the liveness analysis is likely to prove
118 that some results of a computation are indeed not useful. With the
119 macro system, the user can provide several alternative
120 implementations which are used depending on the used results. It is
9804c8e2 121 especially useful for condition code optimization in QEMU.
c896fe29
FB
122
123 Here is an example:
124
125 macro_2 t0, t1, $1
126 mov_i32 t0, $0x1234
127
128 The macro identified by the ID "$1" normally returns the values t0
129 and t1. Suppose its implementation is:
130
131 macro_start
132 brcond_i32 t2, $0, $TCG_COND_EQ, $1
133 mov_i32 t0, $2
134 br $2
135 set_label $1
136 mov_i32 t0, $3
137 set_label $2
138 add_i32 t1, t3, t4
139 macro_end
140
141 If t0 is not used after the macro, the user can provide a simpler
142 implementation:
143
144 macro_start
145 add_i32 t1, t2, t4
146 macro_end
147
148 TCG automatically chooses the right implementation depending on
149 which macro outputs are used after it.
150
151 Note that if TCG did more expensive optimizations, macros would be
152 less useful. In the previous example a macro is useful because the
153 liveness analysis is done on each basic block separately. Hence TCG
154 cannot remove the code computing 't0' even if it is not used after
155 the first macro implementation.
156
1573.4) Instruction Reference
158
159********* Function call
160
161* call <ret> <params> ptr
162
163call function 'ptr' (pointer type)
164
165<ret> optional 32 bit or 64 bit return value
166<params> optional 32 bit or 64 bit parameters
167
168********* Jumps/Labels
169
170* jmp t0
171
172Absolute jump to address t0 (pointer type).
173
174* set_label $label
175
176Define label 'label' at the current program point.
177
178* br $label
179
180Jump to label.
181
182* brcond_i32/i64 cond, t0, t1, label
183
184Conditional jump if t0 cond t1 is true. cond can be:
185 TCG_COND_EQ
186 TCG_COND_NE
187 TCG_COND_LT /* signed */
188 TCG_COND_GE /* signed */
189 TCG_COND_LE /* signed */
190 TCG_COND_GT /* signed */
191 TCG_COND_LTU /* unsigned */
192 TCG_COND_GEU /* unsigned */
193 TCG_COND_LEU /* unsigned */
194 TCG_COND_GTU /* unsigned */
195
196********* Arithmetic
197
198* add_i32/i64 t0, t1, t2
199
200t0=t1+t2
201
202* sub_i32/i64 t0, t1, t2
203
204t0=t1-t2
205
390efc54
PB
206* neg_i32/i64 t0, t1
207
208t0=-t1 (two's complement)
209
c896fe29
FB
210* mul_i32/i64 t0, t1, t2
211
212t0=t1*t2
213
214* div_i32/i64 t0, t1, t2
215
216t0=t1/t2 (signed). Undefined behavior if division by zero or overflow.
217
218* divu_i32/i64 t0, t1, t2
219
220t0=t1/t2 (unsigned). Undefined behavior if division by zero.
221
222* rem_i32/i64 t0, t1, t2
223
224t0=t1%t2 (signed). Undefined behavior if division by zero or overflow.
225
226* remu_i32/i64 t0, t1, t2
227
228t0=t1%t2 (unsigned). Undefined behavior if division by zero.
229
c896fe29
FB
230********* Logical
231
5e85404a
AJ
232* and_i32/i64 t0, t1, t2
233
c896fe29
FB
234t0=t1&t2
235
236* or_i32/i64 t0, t1, t2
237
238t0=t1|t2
239
240* xor_i32/i64 t0, t1, t2
241
242t0=t1^t2
243
c896fe29
FB
244********* Shifts
245
246* shl_i32/i64 t0, t1, t2
247
248t0=t1 << t2. Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
249
250* shr_i32/i64 t0, t1, t2
251
252t0=t1 >> t2 (unsigned). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
253
254* sar_i32/i64 t0, t1, t2
255
256t0=t1 >> t2 (signed). Undefined behavior if t2 < 0 or t2 >= 32 (resp 64)
257
258********* Misc
259
260* mov_i32/i64 t0, t1
261
262t0 = t1
263
264Move t1 to t0 (both operands must have the same type).
265
266* ext8s_i32/i64 t0, t1
86831435 267ext8u_i32/i64 t0, t1
c896fe29 268ext16s_i32/i64 t0, t1
86831435 269ext16u_i32/i64 t0, t1
c896fe29 270ext32s_i64 t0, t1
86831435 271ext32u_i64 t0, t1
c896fe29 272
86831435 2738, 16 or 32 bit sign/zero extension (both operands must have the same type)
c896fe29
FB
274
275* bswap16_i32 t0, t1
276
27716 bit byte swap on a 32 bit value. The two high order bytes must be set
278to zero.
279
280* bswap_i32 t0, t1
281
28232 bit byte swap
283
284* bswap_i64 t0, t1
285
28664 bit byte swap
287
5ff9d6a4
FB
288* discard_i32/i64 t0
289
290Indicate that the value of t0 won't be used later. It is useful to
291force dead code elimination.
292
c896fe29
FB
293********* Type conversions
294
295* ext_i32_i64 t0, t1
296Convert t1 (32 bit) to t0 (64 bit) and does sign extension
297
298* extu_i32_i64 t0, t1
299Convert t1 (32 bit) to t0 (64 bit) and does zero extension
300
301* trunc_i64_i32 t0, t1
302Truncate t1 (64 bit) to t0 (32 bit)
303
304********* Load/Store
305
306* ld_i32/i64 t0, t1, offset
307ld8s_i32/i64 t0, t1, offset
308ld8u_i32/i64 t0, t1, offset
309ld16s_i32/i64 t0, t1, offset
310ld16u_i32/i64 t0, t1, offset
311ld32s_i64 t0, t1, offset
312ld32u_i64 t0, t1, offset
313
314t0 = read(t1 + offset)
315Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
316offset must be a constant.
317
318* st_i32/i64 t0, t1, offset
319st8_i32/i64 t0, t1, offset
320st16_i32/i64 t0, t1, offset
321st32_i64 t0, t1, offset
322
323write(t0, t1 + offset)
324Write 8, 16, 32 or 64 bits to host memory.
325
326********* QEMU specific operations
327
328* tb_exit t0
329
330Exit the current TB and return the value t0 (word type).
331
332* goto_tb index
333
334Exit the current TB and jump to the TB index 'index' (constant) if the
335current TB was linked to this TB. Otherwise execute the next
336instructions.
337
338* qemu_ld_i32/i64 t0, t1, flags
339qemu_ld8u_i32/i64 t0, t1, flags
340qemu_ld8s_i32/i64 t0, t1, flags
341qemu_ld16u_i32/i64 t0, t1, flags
342qemu_ld16s_i32/i64 t0, t1, flags
343qemu_ld32u_i64 t0, t1, flags
344qemu_ld32s_i64 t0, t1, flags
345
346Load data at the QEMU CPU address t1 into t0. t1 has the QEMU CPU
347address type. 'flags' contains the QEMU memory index (selects user or
348kernel access) for example.
349
350* qemu_st_i32/i64 t0, t1, flags
351qemu_st8_i32/i64 t0, t1, flags
352qemu_st16_i32/i64 t0, t1, flags
353qemu_st32_i64 t0, t1, flags
354
355Store the data t0 at the QEMU CPU Address t1. t1 has the QEMU CPU
356address type. 'flags' contains the QEMU memory index (selects user or
357kernel access) for example.
358
359Note 1: Some shortcuts are defined when the last operand is known to be
360a constant (e.g. addi for add, movi for mov).
361
362Note 2: When using TCG, the opcodes must never be generated directly
363as some of them may not be available as "real" opcodes. Always use the
364function tcg_gen_xxx(args).
365
3664) Backend
367
368tcg-target.h contains the target specific definitions. tcg-target.c
369contains the target specific code.
370
3714.1) Assumptions
372
373The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
37464 bit. It is expected that the pointer has the same size as the word.
375
376On a 32 bit target, all 64 bit operations are converted to 32 bits. A
377few specific operations must be implemented to allow it (see add2_i32,
378sub2_i32, brcond2_i32).
379
380Floating point operations are not supported in this version. A
381previous incarnation of the code generator had full support of them,
382but it is better to concentrate on integer operations first.
383
384On a 64 bit target, no assumption is made in TCG about the storage of
385the 32 bit values in 64 bit registers.
386
3874.2) Constraints
388
389GCC like constraints are used to define the constraints of every
390instruction. Memory constraints are not supported in this
391version. Aliases are specified in the input operands as for GCC.
392
393A target can define specific register or constant constraints. If an
394operation uses a constant input constraint which does not allow all
395constants, it must also accept registers in order to have a fallback.
396
397The movi_i32 and movi_i64 operations must accept any constants.
398
399The mov_i32 and mov_i64 operations must accept any registers of the
400same type.
401
402The ld/st instructions must accept signed 32 bit constant offsets. It
403can be implemented by reserving a specific register to compute the
404address if the offset is too big.
405
406The ld/st instructions must accept any destination (ld) or source (st)
407register.
408
4094.3) Function call assumptions
410
411- The only supported types for parameters and return value are: 32 and
412 64 bit integers and pointer.
413- The stack grows downwards.
414- The first N parameters are passed in registers.
415- The next parameters are passed on the stack by storing them as words.
416- Some registers are clobbered during the call.
417- The function can return 0 or 1 value in registers. On a 32 bit
418 target, functions must be able to return 2 values in registers for
419 64 bit return type.
420
4215) Migration from dyngen to TCG
422
423TCG is backward compatible with QEMU "dyngen" operations. It means
424that TCG instructions can be freely mixed with dyngen operations. It
425is expected that QEMU targets will be progressively fully converted to
9804c8e2 426TCG. Once a target is fully converted to TCG, it will be possible
c896fe29
FB
427to apply more optimizations because more registers will be free for
428the generated code.
429
430The exception model is the same as the dyngen one.