Tiny Code Generator - Fabrice Bellard.

1) Introduction

TCG (Tiny Code Generator) began as a generic backend for a C
compiler. It was simplified to be used in QEMU. It also has its roots
in the QOP code generator written by Paul Brook.

2) Definitions

The TCG "target" is the architecture for which we generate the
code. It is of course not the same as the "target" of QEMU which is
the emulated architecture. As TCG started as a generic C backend used
for cross compiling, it is assumed that the TCG target is different
from the host, although it is never the case for QEMU.

A TCG "function" corresponds to a QEMU Translated Block (TB).

A TCG "temporary" is a variable only live in a given
function. Temporaries are allocated explicitly in each function.

A TCG "global" is a variable which is live in all the functions. Globals
are defined before the functions are defined. A TCG global can be a memory
location (e.g. a QEMU CPU register), a fixed host register (e.g. the
QEMU CPU state pointer) or a memory location which is stored in a
register outside QEMU TBs (not implemented yet).

A TCG "basic block" corresponds to a list of instructions terminated
by a branch instruction.

3) Intermediate representation

3.1) Introduction

TCG instructions operate on variables which are temporaries or
globals. TCG instructions and variables are strongly typed. Two types
are supported: 32 bit integers and 64 bit integers. Pointers are
defined as an alias to 32 bit or 64 bit integers depending on the TCG
target word size.

Each instruction has a fixed number of output variable operands, input
variable operands and always-constant operands.

The notable exception is the call instruction which has a variable
number of outputs and inputs.

In the textual form, output operands come first, followed by input
operands, followed by constant operands. The output type is included
in the instruction name. Constants are prefixed with a '$'.

add_i32 t0, t1, t2  (t0 <- t1 + t2)

sub_i64 t2, t3, $4  (t2 <- t3 - 4)
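
The operand ordering of the textual form can be illustrated with a small
standalone C sketch. It is not TCG code: the operand struct and the
format_insn() helper are hypothetical names invented for this example, and
the caller is assumed to pass operands already ordered as outputs, inputs,
then constants.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical operand kinds, for this illustration only. */
enum opnd_kind { OPND_VAR, OPND_CONST };

struct opnd {
    enum opnd_kind kind;
    long value;     /* temporary index for OPND_VAR, literal for OPND_CONST */
};

/*
 * Write "name op0, op1, ..." into buf and return buf. Variables are
 * printed as tN and constants as $N, matching the textual form above.
 */
static inline char *format_insn(char *buf, size_t size, const char *name,
                                const struct opnd *ops, int nb_ops)
{
    size_t len = (size_t)snprintf(buf, size, "%s", name);

    for (int i = 0; i < nb_ops && len < size; i++) {
        const char *sep = (i == 0) ? " " : ", ";
        const char *prefix = (ops[i].kind == OPND_VAR) ? "t" : "$";

        len += (size_t)snprintf(buf + len, size - len, "%s%s%ld",
                                sep, prefix, ops[i].value);
    }
    return buf;
}
```

Formatting the operands {t2, t3, $4} under the name "sub_i64" reproduces
the second example line above.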

3.2) Assumptions

* Basic blocks

- Basic blocks end after branches (e.g. brcond_i32 instruction),
  goto_tb and exit_tb instructions.
- Basic blocks end before legacy dyngen operations.
- Basic blocks start after the end of a previous basic block, at a
  set_label instruction or after a legacy dyngen operation.

After the end of a basic block, temporaries are destroyed and globals
are stored at their initial storage (register or memory place
depending on their declarations).

* Floating point types are not supported yet

* Pointers: depending on the TCG target, pointer size is 32 bit or 64
  bit. The type TCG_TYPE_PTR is an alias to TCG_TYPE_I32 or
  TCG_TYPE_I64.

* Helpers:

Using the tcg_gen_helper_x_y functions, it is possible to call any
function taking i32, i64 or pointer types. Before calling a helper, all
globals are stored at their canonical location and it is assumed that
the function can modify them. In the future, function modifiers will
be allowed to tell that the helper does not read or write some globals.

On some TCG targets (e.g. x86), several calling conventions are
supported.

* Branches:

Use the instruction 'br' to jump to a label. Use 'jmp' to jump to an
explicit address. Conditional branches can only jump to labels.

3.3) Code Optimizations

When generating instructions, you can count on at least the following
optimizations:

- Single instructions are simplified, e.g.

    and_i32 t0, t0, $0xffffffff

  is suppressed.

- A liveness analysis is done at the basic block level. The
  information is used to suppress moves from a dead temporary to
  another one. It is also used to remove instructions which compute
  dead results. The latter is especially useful for condition code
  optimization in QEMU.

  In the following example:

    add_i32 t0, t1, t2
    add_i32 t0, t0, $1
    mov_i32 t0, $1

  only the last instruction is kept.

- A macro system is supported (may get closer to function inlining
  some day). It is useful if the liveness analysis is likely to prove
  that some results of a computation are indeed not useful. With the
  macro system, the user can provide several alternative
  implementations which are selected depending on which results are
  used. It is especially useful for condition code optimization in QEMU.

  Here is an example:

    macro_2 t0, t1, $1
    mov_i32 t0, $0x1234

  The macro identified by the ID "$1" normally returns the values t0
  and t1. Suppose its implementation is:

    macro_start
    brcond_i32 t2, $0, $TCG_COND_EQ, $1
    mov_i32 t0, $2
    br $2
    set_label $1
    mov_i32 t0, $3
    set_label $2
    add_i32 t1, t3, t4
    macro_end

  If t0 is not used after the macro, the user can provide a simpler
  implementation:

    macro_start
    add_i32 t1, t3, t4
    macro_end

  TCG automatically chooses the right implementation depending on
  which macro outputs are used after it.

  Note that if TCG did more expensive optimizations, macros would be
  less useful. In the previous example a macro is useful because the
  liveness analysis is done on each basic block separately. Hence TCG
  cannot remove the code computing 't0' even if it is not used after
  the first macro implementation.
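
The liveness-based removal of dead results described in this section can be
sketched as a backward pass over one basic block. The code below is a toy
model and not the TCG implementation: the toy_insn struct, the NB_TEMPS
limit and the mark_dead_insns() function are names assumed for this
example, and each instruction is restricted to one output and at most two
variable inputs.

```c
#include <stdbool.h>
#include <stddef.h>

#define NB_TEMPS 8      /* hypothetical limit for this toy example */
#define NO_TEMP  (-1)   /* marks an input slot not holding a variable */

/* Toy instruction: one output temporary, up to two input temporaries. */
struct toy_insn {
    const char *name;
    int out;
    int in[2];
    bool dead;          /* set by the pass when the result is never used */
};

/*
 * Backward liveness pass over one basic block. live_out[t] tells whether
 * variable t is still needed after the block. Returns the number of
 * instructions marked dead.
 */
static inline int mark_dead_insns(struct toy_insn *insns, size_t n,
                                  const bool live_out[NB_TEMPS])
{
    bool live[NB_TEMPS];
    int removed = 0;

    for (int t = 0; t < NB_TEMPS; t++) {
        live[t] = live_out[t];
    }
    for (size_t i = n; i-- > 0; ) {
        struct toy_insn *insn = &insns[i];

        if (!live[insn->out]) {
            insn->dead = true;          /* result overwritten or never read */
            removed++;
            continue;
        }
        insn->dead = false;
        live[insn->out] = false;        /* the value is defined here */
        for (int k = 0; k < 2; k++) {
            if (insn->in[k] != NO_TEMP) {
                live[insn->in[k]] = true;   /* inputs are needed before it */
            }
        }
    }
    return removed;
}
```

Running the pass on the three-instruction example above (two add_i32
followed by mov_i32 t0, $1), with t0 live at the end of the block, marks
both add_i32 instructions dead and keeps only the final mov_i32.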

3.4) Instruction Reference

********* Function call

* call <ret> <params> ptr

call function 'ptr' (pointer type)

<ret> optional 32 bit or 64 bit return value
<params> optional 32 bit or 64 bit parameters

********* Jumps/Labels

* jmp t0

Absolute jump to address t0 (pointer type).

* set_label $label

Define label 'label' at the current program point.

* br $label

Jump to label.

* brcond_i32/i64 cond, t0, t1, label

Conditional jump if t0 cond t1 is true. cond can be:
    TCG_COND_EQ
    TCG_COND_NE
    TCG_COND_LT /* signed */
    TCG_COND_GE /* signed */
    TCG_COND_LE /* signed */
    TCG_COND_GT /* signed */
    TCG_COND_LTU /* unsigned */
    TCG_COND_GEU /* unsigned */
    TCG_COND_LEU /* unsigned */
    TCG_COND_GTU /* unsigned */
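
The difference between the signed and unsigned conditions can be shown
with a standalone C sketch for the 32 bit case. The toy_cond enum and the
toy_cond_i32() function are hypothetical names for this illustration and
are not part of TCG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enum mirroring the condition names listed above. */
enum toy_cond {
    TOY_COND_EQ, TOY_COND_NE,
    TOY_COND_LT, TOY_COND_GE, TOY_COND_LE, TOY_COND_GT,     /* signed */
    TOY_COND_LTU, TOY_COND_GEU, TOY_COND_LEU, TOY_COND_GTU  /* unsigned */
};

/* Return true when "a cond b" holds for two 32 bit operands. */
static inline bool toy_cond_i32(enum toy_cond cond, uint32_t a, uint32_t b)
{
    /* Flipping the sign bit maps signed order onto unsigned order. */
    uint32_t sa = a ^ 0x80000000u;
    uint32_t sb = b ^ 0x80000000u;

    switch (cond) {
    case TOY_COND_EQ:  return a == b;
    case TOY_COND_NE:  return a != b;
    case TOY_COND_LT:  return sa < sb;
    case TOY_COND_GE:  return sa >= sb;
    case TOY_COND_LE:  return sa <= sb;
    case TOY_COND_GT:  return sa > sb;
    case TOY_COND_LTU: return a < b;
    case TOY_COND_GEU: return a >= b;
    case TOY_COND_LEU: return a <= b;
    case TOY_COND_GTU: return a > b;
    }
    return false;
}
```

With a = 0xffffffff and b = 1, the signed condition LT holds (a is read
as -1) while the unsigned condition LTU does not.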

********* Arithmetic

* add_i32/i64 t0, t1, t2

t0=t1+t2

* sub_i32/i64 t0, t1, t2

t0=t1-t2

* mul_i32/i64 t0, t1, t2

t0=t1*t2

* div_i32/i64 t0, t1, t2

t0=t1/t2 (signed). Undefined behavior if division by zero or overflow.

* divu_i32/i64 t0, t1, t2

t0=t1/t2 (unsigned). Undefined behavior if division by zero.

* rem_i32/i64 t0, t1, t2

t0=t1%t2 (signed). Undefined behavior if division by zero or overflow.

* remu_i32/i64 t0, t1, t2

t0=t1%t2 (unsigned). Undefined behavior if division by zero.
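
Code that emits these operations has to rule out the undefined cases
itself. As an illustration only, the following C sketch spells out which
32 bit operand pairs are excluded; toy_div_i32() and toy_divu_i32() are
hypothetical helpers, not TCG functions. For the signed case, the only
overflow is the most negative value divided by -1.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed 32 bit division and remainder. Returns false, leaving *q and *r
 * untouched, for the operand pairs the text above calls undefined.
 */
static inline bool toy_div_i32(int32_t a, int32_t b, int32_t *q, int32_t *r)
{
    if (b == 0 || (a == INT32_MIN && b == -1)) {
        return false;   /* division by zero, or quotient overflow */
    }
    *q = a / b;
    *r = a % b;
    return true;
}

/* Unsigned variant: only division by zero is excluded. */
static inline bool toy_divu_i32(uint32_t a, uint32_t b,
                                uint32_t *q, uint32_t *r)
{
    if (b == 0) {
        return false;
    }
    *q = a / b;
    *r = a % b;
    return true;
}
```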

********* Logical

* and_i32/i64 t0, t1, t2

t0=t1&t2

* or_i32/i64 t0, t1, t2

t0=t1|t2

* xor_i32/i64 t0, t1, t2

t0=t1^t2

********* Shifts

* shl_i32/i64 t0, t1, t2

t0=t1 << t2. Undefined behavior if t2 < 0 or t2 >= 32 (resp. 64)

* shr_i32/i64 t0, t1, t2

t0=t1 >> t2 (unsigned). Undefined behavior if t2 < 0 or t2 >= 32 (resp. 64)

* sar_i32/i64 t0, t1, t2

t0=t1 >> t2 (signed). Undefined behavior if t2 < 0 or t2 >= 32 (resp. 64)
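
The difference between shr (unsigned) and sar (signed) can be sketched in
portable C for the 32 bit case. The toy_* names below are hypothetical and
the functions assume the caller has already checked the shift count with
toy_shift_count_ok_i32(), mirroring the undefined cases listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the shift count is in the defined range for 32 bit shifts. */
static inline bool toy_shift_count_ok_i32(int32_t count)
{
    return count >= 0 && count < 32;
}

/* Logical right shift: vacated high bits are filled with zeros. */
static inline uint32_t toy_shr_i32(uint32_t v, unsigned n)
{
    return v >> n;
}

/* Arithmetic right shift: vacated high bits copy the sign bit. */
static inline uint32_t toy_sar_i32(uint32_t v, unsigned n)
{
    uint32_t result = v >> n;

    if ((v & 0x80000000u) && n > 0) {
        result |= ~(0xffffffffu >> n);  /* set the n vacated high bits */
    }
    return result;
}
```

The arithmetic shift is written with unsigned operations because right
shifting a negative signed value is implementation-defined in C.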

********* Misc

* mov_i32/i64 t0, t1

t0 = t1

Move t1 to t0 (both operands must have the same type).

* ext8s_i32/i64 t0, t1
ext8u_i32/i64 t0, t1
ext16s_i32/i64 t0, t1
ext16u_i32/i64 t0, t1
ext32s_i64 t0, t1
ext32u_i64 t0, t1

8, 16 or 32 bit sign/zero extension (both operands must have the same type)
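
As an illustration of what sign and zero extension compute, here is a
portable C sketch. The toy_* functions are hypothetical stand-ins and not
TCG code. Sign extension uses the xor-and-subtract idiom on unsigned
values, which avoids implementation-defined signed conversions.

```c
#include <stdint.h>

/* Zero extension: keep the low bits, clear everything above them. */
static inline uint32_t toy_ext8u_i32(uint32_t v)  { return v & 0xffu; }
static inline uint32_t toy_ext16u_i32(uint32_t v) { return v & 0xffffu; }
static inline uint64_t toy_ext32u_i64(uint64_t v) { return v & 0xffffffffull; }

/* Sign extension: copy the top bit of the narrow value into the bits above. */
static inline uint32_t toy_ext8s_i32(uint32_t v)
{
    return (uint32_t)(((v & 0xffu) ^ 0x80u) - 0x80u);
}

static inline uint32_t toy_ext16s_i32(uint32_t v)
{
    return (uint32_t)(((v & 0xffffu) ^ 0x8000u) - 0x8000u);
}

static inline uint64_t toy_ext32s_i64(uint64_t v)
{
    return ((v & 0xffffffffull) ^ 0x80000000ull) - 0x80000000ull;
}
```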

* bswap16_i32 t0, t1

16 bit byte swap on a 32 bit value. The two high order bytes of t1 must
be zero.

* bswap_i32 t0, t1

32 bit byte swap

* bswap_i64 t0, t1

64 bit byte swap
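
The byte swaps can be sketched in plain C as follows. The toy_* functions
are hypothetical illustrations, not TCG code. Like the operation described
above, toy_bswap16_i32() is only meaningful when the two high order bytes
of its input are zero.

```c
#include <stdint.h>

/* Swap the two low bytes; the input's two high bytes are assumed zero. */
static inline uint32_t toy_bswap16_i32(uint32_t v)
{
    return ((v & 0xffu) << 8) | ((v >> 8) & 0xffu);
}

/* Reverse the four bytes of a 32 bit value. */
static inline uint32_t toy_bswap_i32(uint32_t v)
{
    return (v << 24) | ((v & 0xff00u) << 8) |
           ((v >> 8) & 0xff00u) | (v >> 24);
}

/* Reverse the eight bytes: swap each half, then exchange the halves. */
static inline uint64_t toy_bswap_i64(uint64_t v)
{
    uint64_t swapped_low = toy_bswap_i32((uint32_t)v);
    uint64_t swapped_high = toy_bswap_i32((uint32_t)(v >> 32));

    return (swapped_low << 32) | swapped_high;
}
```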

* discard_i32/i64 t0

Indicate that the value of t0 won't be used later. It is useful to
force dead code elimination.

********* Type conversions

* ext_i32_i64 t0, t1
Convert t1 (32 bit) to t0 (64 bit) with sign extension

* extu_i32_i64 t0, t1
Convert t1 (32 bit) to t0 (64 bit) with zero extension

* trunc_i64_i32 t0, t1
Truncate t1 (64 bit) to t0 (32 bit)

********* Load/Store

* ld_i32/i64 t0, t1, offset
ld8s_i32/i64 t0, t1, offset
ld8u_i32/i64 t0, t1, offset
ld16s_i32/i64 t0, t1, offset
ld16u_i32/i64 t0, t1, offset
ld32s_i64 t0, t1, offset
ld32u_i64 t0, t1, offset

t0 = read(t1 + offset)
Load 8, 16, 32 or 64 bits with or without sign extension from host memory.
offset must be a constant.

* st_i32/i64 t0, t1, offset
st8_i32/i64 t0, t1, offset
st16_i32/i64 t0, t1, offset
st32_i64 t0, t1, offset

write(t0, t1 + offset)
Write 8, 16, 32 or 64 bits to host memory.
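
The read(t1 + offset) and write(t0, t1 + offset) notation can be modelled
in C for the 16 bit case. This is a sketch of the semantics only, with
hypothetical toy_* names; it uses memcpy so that unaligned addresses are
handled portably, and it reads and writes in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* t0 = read(t1 + offset): 16 bit load, zero extended to 32 bits. */
static inline uint32_t toy_ld16u_i32(const void *base, long offset)
{
    uint16_t raw;

    memcpy(&raw, (const unsigned char *)base + offset, sizeof(raw));
    return raw;
}

/* Same load, but sign extended to 32 bits. */
static inline uint32_t toy_ld16s_i32(const void *base, long offset)
{
    uint32_t raw = toy_ld16u_i32(base, offset);

    return (uint32_t)((raw ^ 0x8000u) - 0x8000u);
}

/* write(t0, t1 + offset): store the low 16 bits of value. */
static inline void toy_st16_i32(uint32_t value, void *base, long offset)
{
    uint16_t raw = (uint16_t)value;

    memcpy((unsigned char *)base + offset, &raw, sizeof(raw));
}
```

Storing 0x1234fffe with toy_st16_i32() and loading it back yields 0xfffe
with the unsigned load and 0xfffffffe with the signed one, and bytes
outside the two written are left unchanged.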

********* QEMU specific operations

* exit_tb t0

Exit the current TB and return the value t0 (word type).

* goto_tb index

Exit the current TB and jump to the TB index 'index' (constant) if the
current TB was linked to this TB. Otherwise execute the next
instructions.

* qemu_ld_i32/i64 t0, t1, flags
qemu_ld8u_i32/i64 t0, t1, flags
qemu_ld8s_i32/i64 t0, t1, flags
qemu_ld16u_i32/i64 t0, t1, flags
qemu_ld16s_i32/i64 t0, t1, flags
qemu_ld32u_i64 t0, t1, flags
qemu_ld32s_i64 t0, t1, flags

Load data at the QEMU CPU address t1 into t0. t1 has the QEMU CPU
address type. 'flags' contains, for example, the QEMU memory index
(selects user or kernel access).

* qemu_st_i32/i64 t0, t1, flags
qemu_st8_i32/i64 t0, t1, flags
qemu_st16_i32/i64 t0, t1, flags
qemu_st32_i64 t0, t1, flags

Store the data t0 at the QEMU CPU address t1. t1 has the QEMU CPU
address type. 'flags' contains, for example, the QEMU memory index
(selects user or kernel access).

Note 1: Some shortcuts are defined when the last operand is known to be
a constant (e.g. addi for add, movi for mov).

Note 2: When using TCG, the opcodes must never be generated directly
as some of them may not be available as "real" opcodes. Always use the
function tcg_gen_xxx(args).

4) Backend

tcg-target.h contains the target specific definitions. tcg-target.c
contains the target specific code.

4.1) Assumptions

The target word size (TCG_TARGET_REG_BITS) is expected to be 32 bit or
64 bit. It is expected that the pointer has the same size as the word.

On a 32 bit target, all 64 bit operations are converted to 32 bit
operations. A few specific operations must be implemented to allow it
(see add2_i32, sub2_i32, brcond2_i32).
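
To show why dedicated double-word operations are needed, the C sketch
below models a 64 bit add, subtract and unsigned comparison on two 32 bit
halves. It is an illustration under assumptions: the toy_pair struct and
the toy_* functions are hypothetical, and the operand layout of the real
add2_i32, sub2_i32 and brcond2_i32 operations is not described in this
document.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 64 bit value held as two 32 bit halves, as on a 32 bit target. */
struct toy_pair {
    uint32_t lo;
    uint32_t hi;
};

/* 64 bit add: the carry out of the low half is added to the high half. */
static inline struct toy_pair toy_add2_i32(struct toy_pair a, struct toy_pair b)
{
    struct toy_pair r;

    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);
    return r;
}

/* 64 bit subtract: a borrow from the low half is taken from the high half. */
static inline struct toy_pair toy_sub2_i32(struct toy_pair a, struct toy_pair b)
{
    struct toy_pair r;

    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo);
    return r;
}

/* 64 bit unsigned less-than: compare high halves first, then low halves. */
static inline bool toy_ltu2_i32(struct toy_pair a, struct toy_pair b)
{
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
}
```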

Floating point operations are not supported in this version. A
previous incarnation of the code generator had full support of them,
but it is better to concentrate on integer operations first.

On a 64 bit target, no assumption is made in TCG about the storage of
the 32 bit values in 64 bit registers.

4.2) Constraints

GCC-like constraints are used to define the constraints of every
instruction. Memory constraints are not supported in this
version. Aliases are specified in the input operands as for GCC.

A target can define specific register or constant constraints. If an
operation uses a constant input constraint which does not allow all
constants, it must also accept registers in order to have a fallback.

The movi_i32 and movi_i64 operations must accept any constants.

The mov_i32 and mov_i64 operations must accept any registers of the
same type.

The ld/st instructions must accept signed 32 bit constant offsets. This
can be implemented by reserving a specific register to compute the
address if the offset is too big.

The ld/st instructions must accept any destination (ld) or source (st)
register.

4.3) Function call assumptions

- The only supported types for parameters and return value are: 32 and
  64 bit integers and pointers.
- The stack grows downwards.
- The first N parameters are passed in registers.
- The next parameters are passed on the stack by storing them as words.
- Some registers are clobbered during the call.
- The function can return 0 or 1 value in registers. On a 32 bit
  target, functions must be able to return 2 values in registers for
  a 64 bit return type.

5) Migration from dyngen to TCG

TCG is backward compatible with QEMU "dyngen" operations. It means
that TCG instructions can be freely mixed with dyngen operations. It
is expected that QEMU targets will be progressively fully converted to
TCG. Once a target is fully converted to TCG, it will be possible
to apply more optimizations because more registers will be free for
the generated code.

The exception model is the same as the dyngen one.