========================
Decodetree Specification
========================

A *decodetree* is built from instruction *patterns*. A pattern may
represent a single architectural instruction or a group of such
instructions, depending on what is convenient for further processing.

Each pattern has both *fixedbits* and *fixedmask*, the combination of which
describes the condition under which the pattern is matched::

  (insn & fixedmask) == fixedbits

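As a minimal sketch of this condition in C, assuming 32-bit instructions
(the ``Pattern`` struct and ``pattern_matches`` are hypothetical names used
only for illustration, not part of any generated code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two per-pattern constants. */
typedef struct {
    uint32_t fixedbits;  /* required values of the bits selected by fixedmask */
    uint32_t fixedmask;  /* 1 for every bit whose value the pattern constrains */
} Pattern;

/* A pattern matches when every constrained bit has the required value. */
static bool pattern_matches(const Pattern *p, uint32_t insn)
{
    return (insn & p->fixedmask) == p->fixedbits;
}
```

Bits that are clear in ``fixedmask`` never influence the result, which is
what leaves room for fields in the remaining bits.
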
Each pattern may have *fields*, which are extracted from the insn and
passed along to the translator. Examples of such are registers,
immediates, and sub-opcodes.

In support of patterns, one may declare *fields*, *argument sets*, and
*formats*, each of which may be re-used to simplify further definitions.

Fields
======

Syntax::

  field_def     := '%' identifier ( unnamed_field )* ( !function=identifier )?
  unnamed_field := number ':' ( 's' )? number

For *unnamed_field*, the first number is the least-significant bit position
of the field and the second number is the length of the field. If the 's' is
present, the field is considered signed. If multiple ``unnamed_fields`` are
present, they are concatenated. In this way one can define disjoint fields.

If ``!function`` is specified, the concatenated result is passed through the
named function, taking and returning an integral value.

One may use ``!function`` with zero ``unnamed_fields``. This case is called
a *parameter*, and the named function is only passed the ``DisasContext``
and returns an integral value extracted from there.

A field with no ``unnamed_fields`` and no ``!function`` is an error.

Field examples:

+---------------------------+---------------------------------------------+
| Input                     | Generated code                              |
+===========================+=============================================+
| %disp   0:s16             | sextract(i, 0, 16)                          |
+---------------------------+---------------------------------------------+
| %imm9   16:6 10:3         | extract(i, 16, 6) << 3 | extract(i, 10, 3)  |
+---------------------------+---------------------------------------------+
| %disp12 0:s1 1:1 2:10     | sextract(i, 0, 1) << 11 |                   |
|                           |    extract(i, 1, 1) << 10 |                 |
|                           |    extract(i, 2, 10)                        |
+---------------------------+---------------------------------------------+
| %shimm8 5:s8 13:1         | expand_shimm8(sextract(i, 5, 8) << 1 |      |
|   !function=expand_shimm8 |               extract(i, 13, 1))            |
+---------------------------+---------------------------------------------+

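The generated expressions in the table can be read as ordinary bit-field
arithmetic. The following is a minimal sketch in C, assuming 32-bit
instructions. ``extract`` and ``sextract`` here are local stand-ins written
for this example (they borrow the names used in the table), and the
``field_*`` wrappers are hypothetical names:

```c
#include <stdint.h>

/* Unsigned field: 'len' bits of 'insn' starting at bit 'pos' (1 <= len <= 31). */
static uint32_t extract(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((UINT32_C(1) << len) - 1);
}

/* Signed field: as extract(), then sign-extended from bit len-1. */
static int32_t sextract(uint32_t insn, int pos, int len)
{
    uint32_t sign = UINT32_C(1) << (len - 1);
    /* (x ^ sign) - sign propagates the top bit of the field upward. */
    return (int32_t)((extract(insn, pos, len) ^ sign) - sign);
}

/* %disp 0:s16 */
static int32_t field_disp(uint32_t insn)
{
    return sextract(insn, 0, 16);
}

/* %imm9 16:6 10:3 -- the first unnamed_field ends up most significant. */
static uint32_t field_imm9(uint32_t insn)
{
    return extract(insn, 16, 6) << 3 | extract(insn, 10, 3);
}

/* %disp12 0:s1 1:1 2:10 -- only the first (topmost) piece carries the sign. */
static int32_t field_disp12(uint32_t insn)
{
    /* Shift the sign-extended piece as unsigned, because left-shifting a
     * negative int is undefined behaviour in C. */
    uint32_t top = (uint32_t)sextract(insn, 0, 1) << 11;
    return (int32_t)(top | extract(insn, 1, 1) << 10 | extract(insn, 2, 10));
}
```

The casts from ``uint32_t`` to ``int32_t`` rely on the usual two's-complement
conversion, which is an assumption about the compiler rather than something
this specification states.
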
Argument Sets
=============

Syntax::

  args_def := '&' identifier ( args_elt )+ ( !extern )?
  args_elt := identifier (':' identifier)?

Each *args_elt* defines an argument within the argument set.
If the *args_elt* contains a colon, the first identifier is the
argument name and the second identifier is the argument type.
If the colon is missing, the argument type will be ``int``.

Each argument set will be rendered as a C structure "arg_$name"
with one member for each argument.

If ``!extern`` is specified, the backing structure is assumed
to have been already declared, typically via a second decoder.

Argument sets are useful when one wants to define helper functions
for the translator functions that can perform operations on a common
set of arguments. This can ensure, for instance, that the ``AND``
pattern and the ``OR`` pattern put their operands into the same named
structure, so that a common ``gen_logic_insn`` may be able to handle
the operations common between the two.

Argument set examples::

  &reg3       ra rb rc
  &loadstore  reg base offset
  &longldst   reg base offset:int64_t

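Following the rendering rule described in this section, the three example
sets might come out roughly as the C declarations below. This is a sketch
derived from the description (one member per argument, ``int`` unless a type
is given), not verbatim generator output. The ``gen_logic_insn`` shown here
is a hypothetical stand-in that works on a plain register array, and treating
``rc`` as the destination is an assumption made for the example:

```c
#include <stdint.h>

/* &reg3 ra rb rc */
typedef struct {
    int ra;
    int rb;
    int rc;
} arg_reg3;

/* &loadstore reg base offset */
typedef struct {
    int reg;
    int base;
    int offset;
} arg_loadstore;

/* &longldst reg base offset:int64_t */
typedef struct {
    int reg;
    int base;
    int64_t offset;
} arg_longldst;

/* Hypothetical shared helper: every pattern that decodes into arg_reg3
 * can reuse it, passing only the operation that differs. */
typedef uint32_t LogicOp(uint32_t, uint32_t);

static uint32_t op_and(uint32_t x, uint32_t y) { return x & y; }
static uint32_t op_or(uint32_t x, uint32_t y)  { return x | y; }

static void gen_logic_insn(uint32_t *regs, const arg_reg3 *a, LogicOp *op)
{
    regs[a->rc] = op(regs[a->ra], regs[a->rb]);
}
```

Because ``AND`` and ``OR`` would both fill in an ``arg_reg3``, their
translator functions can each be a one-line call to the shared helper.
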
Formats
=======

Syntax::

  fmt_def      := '@' identifier ( fmt_elt )+
  fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
  fixedbit_elt := [01.-]+
  field_elt    := identifier ':' 's'? number
  field_ref    := '%' identifier | identifier '=' '%' identifier
  args_ref     := '&' identifier

Defining a format is a handy way to avoid replicating groups of fields
across many instruction patterns.

A *fixedbit_elt* describes a contiguous sequence of bits that must
be 1, 0, or don't care. The difference between '.' and '-'
is that '.' means that the bit will be covered with a field or a
final 0 or 1 from the pattern, and '-' means that the bit is really
ignored by the CPU and will not be specified.

A *field_elt* describes a simple field given only a width; the position of
the field is implied by its position with respect to other *fixedbit_elt*
and *field_elt*.

If any *fixedbit_elt* or *field_elt* appears, then all bits must be defined.
Padding with a *fixedbit_elt* of all '.' is an easy way to accomplish that.

A *field_ref* incorporates a field by reference. This is the only way to
add a complex field to a format. A field may be renamed in the process
via assignment to another identifier. This is intended to allow the
same argument set to be used with disjoint named fields.

A single *args_ref* may specify an argument set to use for the format.
The set of fields in the format must be a subset of the arguments in
the argument set. If an argument set is not specified, one will be
inferred from the set of fields.

It is recommended, but not required, that all *field_ref* and *args_ref*
appear at the end of the line, not interleaving with *fixedbit_elt* or
*field_elt*.

Format examples::

  @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
  @opi    ...... ra:5 lit:8    1 ....... rc:5

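Reading the two formats from left to right, starting at bit 31, fixes the
position of every field. The sketch below spells that out in C for a 32-bit
instruction. The struct, macro, and function names are hypothetical, chosen
for this example; the generator's real naming may differ:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { int ra, rb, rc; } arg_opr;   /* hypothetical name */
typedef struct { int ra, lit, rc; } arg_opi;  /* hypothetical name */

/* The only fixed bit either format contributes is bit 12. */
#define FMT_FIXEDMASK  UINT32_C(0x00001000)
#define OPR_FIXEDBITS  UINT32_C(0x00000000)
#define OPI_FIXEDBITS  UINT32_C(0x00001000)

/* @opr  ...... ra:5 rb:5 ... 0 ....... rc:5 */
static void extract_opr(arg_opr *a, uint32_t insn)
{
    a->ra = (int)((insn >> 21) & 0x1f);   /* bits 25..21 */
    a->rb = (int)((insn >> 16) & 0x1f);   /* bits 20..16 */
    a->rc = (int)(insn & 0x1f);           /* bits  4..0  */
}

/* @opi  ...... ra:5 lit:8 1 ....... rc:5 */
static void extract_opi(arg_opi *a, uint32_t insn)
{
    a->ra  = (int)((insn >> 21) & 0x1f);  /* bits 25..21 */
    a->lit = (int)((insn >> 13) & 0xff);  /* bits 20..13 */
    a->rc  = (int)(insn & 0x1f);          /* bits  4..0  */
}

/* Bit 12 distinguishes the literal form from the register form. */
static bool is_opi(uint32_t insn)
{
    return (insn & FMT_FIXEDMASK) == OPI_FIXEDBITS;
}
```

The bits written as '.' (31..26 and 11..5) are left for the patterns that
use these formats to fill in.
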
Patterns
========

Syntax::

  pat_def      := identifier ( pat_elt )+
  pat_elt      := fixedbit_elt | field_elt | field_ref | args_ref | fmt_ref | const_elt
  fmt_ref      := '@' identifier
  const_elt    := identifier '=' number

The *fixedbit_elt* and *field_elt* specifiers are unchanged from formats.
A pattern that does not specify a named format will have one inferred
from a referenced argument set (if present) and the set of fields.

A *const_elt* allows an argument to be set to a constant value. This may
come in handy when fields overlap between patterns and one has to
include the values in the *fixedbit_elt* instead.

The decoder will call a translator function for each pattern matched.

Pattern examples::

  addl_r   010000 ..... ..... .... 0000000 ..... @opr
  addl_i   010000 ..... ..... .... 0000000 ..... @opi

which will, in part, invoke::

  trans_addl_r(ctx, &arg_opr)

and::

  trans_addl_i(ctx, &arg_opi)

Pattern Groups
==============

Syntax::

  group            := overlap_group | no_overlap_group
  overlap_group    := '{' ( pat_def | group )+ '}'
  no_overlap_group := '[' ( pat_def | group )+ ']'

A *group* begins with a lone open-brace or open-bracket, with all
subsequent lines indented two spaces, and ends with a lone
close-brace or close-bracket. Groups may be nested, increasing the
required indentation of the lines within the nested group by two
spaces per nesting level.

Patterns within overlap groups are allowed to overlap. Conflicts are
resolved by trying the patterns in order. If all of the fixedbits
for a pattern match, its translate function will be called. If the
translate function returns false, then the subsequent patterns within
the group will be tried in turn.

Patterns within no-overlap groups are not allowed to overlap, just
the same as ungrouped patterns. Thus no-overlap groups are intended
to be nested inside overlap groups.

The following example from PA-RISC shows specialization of the *or*
instruction::

  {
    {
      nop   000010 ----- ----- 0000 001001 0 00000
      copy  000010 00000 r1:5  0000 001001 0 rt:5
    }
    or    000010 rt2:5 r1:5  cf:4 001001 0 rt:5
  }

When the *cf* field is zero, the instruction has no side effects,
and may be specialized. When the *rt* field is zero, the output
is discarded and so the instruction has no effect. When the *rt2*
field is zero, the operation is ``reg[r1] | 0`` and so encodes
the canonical register copy operation.

The output from the generator might look like::

  switch (insn & 0xfc000fe0) {
  case 0x08000240:
    /* 000010.. ........ ....0010 010..... */
    if ((insn & 0x0000f000) == 0x00000000) {
        /* 000010.. ........ 00000010 010..... */
        if ((insn & 0x0000001f) == 0x00000000) {
            /* 000010.. ........ 00000010 01000000 */
            extract_decode_Fmt_0(&u.f_decode0, insn);
            if (trans_nop(ctx, &u.f_decode0)) return true;
        }
        if ((insn & 0x03e00000) == 0x00000000) {
            /* 00001000 000..... 00000010 010..... */
            extract_decode_Fmt_1(&u.f_decode1, insn);
            if (trans_copy(ctx, &u.f_decode1)) return true;
        }
    }
    extract_decode_Fmt_2(&u.f_decode2, insn);
    if (trans_or(ctx, &u.f_decode2)) return true;
    return false;
  }

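
To see the ordering rule in action, the same sequence of tests can be wrapped
in a small self-contained C sketch. Everything on the translator side here is
a hypothetical stand-in: the ``DisasContext`` contents, the ``arg_*``
structs, the ``bits`` helper, and a ``trans_nop`` that takes no argument
struct are simplifications for this example, not what the generator or a real
target provides:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the translator side; only control flow matters. */
typedef enum { HANDLED_NONE, HANDLED_NOP, HANDLED_COPY, HANDLED_OR } Handled;

typedef struct {
    Handled handled;     /* which translator accepted the instruction */
    bool copy_declines;  /* force trans_copy() to return false */
} DisasContext;

typedef struct { int r1, rt; } arg_copy;
typedef struct { int rt2, r1, cf, rt; } arg_or;

static bool trans_nop(DisasContext *ctx)
{
    ctx->handled = HANDLED_NOP;
    return true;
}

static bool trans_copy(DisasContext *ctx, const arg_copy *a)
{
    (void)a;
    if (ctx->copy_declines) {
        return false;    /* the decoder then tries the next pattern */
    }
    ctx->handled = HANDLED_COPY;
    return true;
}

static bool trans_or(DisasContext *ctx, const arg_or *a)
{
    (void)a;
    ctx->handled = HANDLED_OR;
    return true;
}

/* 'len' bits of 'insn' starting at bit 'pos'. */
static int bits(uint32_t insn, int pos, int len)
{
    return (int)((insn >> pos) & ((UINT32_C(1) << len) - 1));
}

/* Same tests, in the same order, as the generated switch shown above. */
static bool decode(DisasContext *ctx, uint32_t insn)
{
    if ((insn & 0xfc000fe0u) != 0x08000240u) {
        return false;                       /* no pattern matches */
    }
    if ((insn & 0x0000f000u) == 0) {        /* cf == 0: inner group */
        if ((insn & 0x0000001fu) == 0 && trans_nop(ctx)) {  /* rt == 0 */
            return true;
        }
        if ((insn & 0x03e00000u) == 0) {    /* rt2 == 0 */
            arg_copy c = { .r1 = bits(insn, 16, 5), .rt = bits(insn, 0, 5) };
            if (trans_copy(ctx, &c)) {
                return true;
            }
        }
    }
    arg_or o = {
        .rt2 = bits(insn, 21, 5), .r1 = bits(insn, 16, 5),
        .cf = bits(insn, 12, 4), .rt = bits(insn, 0, 5),
    };
    return trans_or(ctx, &o);
}
```

With ``copy_declines`` set, an encoding that would otherwise be handled as
``copy`` falls through to ``or``, which is the behaviour described for
overlap groups.
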