qemu-doc.texi

   1 \input texinfo @c -*- texinfo -*-
   2
   3 @settitle QEMU CPU Emulator Reference Documentation
   4 @titlepage
   5 @sp 7
   6 @center @titlefont{QEMU CPU Emulator Reference Documentation}
   7 @sp 3
   8 @end titlepage
   9
  10 @chapter Introduction
  11
  12 @section Features
  13
QEMU is a fast processor emulator. By using dynamic translation, it
achieves reasonable speed while remaining easy to port to new host
CPUs.
  17
  18 QEMU has two operating modes:
  19 @itemize
  20 @item User mode emulation. In this mode, QEMU can launch Linux processes
  21 compiled for one CPU on another CPU. Linux system calls are converted
  22 because of endianness and 32/64 bit mismatches. The Wine Windows API
  23 emulator (@url{http://www.winehq.org}) and the DOSEMU DOS emulator
  24 (@url{www.dosemu.org}) are the main targets for QEMU.
  25
  26 @item Full system emulation. In this mode, QEMU emulates a full
system, including a processor and various peripherals. Currently, it
  28 is only used to launch an x86 Linux kernel on an x86 Linux system. It
  29 enables easier testing and debugging of system code. It can also be
  30 used to provide virtual hosting of several virtual PCs on a single
  31 server.
  32
  33 @end itemize
  34
  35 As QEMU requires no host kernel patches to run, it is very safe and
  36 easy to use.
  37
  38 QEMU generic features:
  39
  40 @itemize
  41
  42 @item User space only or full system emulation.
  43
@item Using dynamic translation to native code for reasonable speed.
  45
  46 @item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390.
  47
  48 @item Self-modifying code support.
  49
  50 @item Precise exceptions support.
  51
  52 @item The virtual CPU is a library (@code{libqemu}) which can be used
  53 in other projects.
  54
  55 @end itemize
  56
  57 QEMU user mode emulation features:
  58 @itemize
  59 @item Generic Linux system call converter, including most ioctls.
  60
@item clone() emulation using the native CPU clone() so that the Linux scheduler is used for threads.
  62
  63 @item Accurate signal handling by remapping host signals to target signals.
  64 @end itemize
  66
  67 QEMU full system emulation features:
  68 @itemize
  69 @item Using mmap() system calls to simulate the MMU
  70 @end itemize
  71
  72 @section x86 emulation
  73
  74 QEMU x86 target features:
  75
  76 @itemize
  77
  78 @item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
  79 LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU.
  80
  81 @item Support of host page sizes bigger than 4KB in user mode emulation.
  82
  83 @item QEMU can emulate itself on x86.
  84
@item An extensive Linux x86 CPU test program is included (@file{tests/test-i386}).
  86 It can be used to test other x86 virtual CPUs.
  87
  88 @end itemize
  89
  90 Current QEMU limitations:
  91
  92 @itemize
  93
  94 @item No SSE/MMX support (yet).
  95
  96 @item No x86-64 support.
  97
  98 @item IPC syscalls are missing.
  99
 100 @item The x86 segment limits and access rights are not tested at every
 101 memory access.
 102
@item On non-x86 host CPUs, @code{double}s are used instead of the nonstandard
10-byte @code{long double}s of x86 for floating point emulation, to get
maximum performance.
 106
 107 @item Full system emulation only works if no data are mapped above the virtual address
 108 0xc0000000 (yet).
 109
@item Some privileged instructions or behaviors are missing. Only the ones
 111 needed for proper Linux kernel operation are emulated.
 112
 113 @item No memory separation between the kernel and the user processes is done.
 114 It will be implemented very soon.
 115
 116 @end itemize
 117
 118 @section ARM emulation
 119
 120 @itemize
 121
 122 @item ARM emulation can currently launch small programs while using the
 123 generic dynamic code generation architecture of QEMU.
 124
 125 @item No FPU support (yet).
 126
 127 @item No automatic regression testing (yet).
 128
 129 @end itemize
 130
 131 @chapter QEMU User space emulator invocation
 132
 133 @section Quick Start
 134
 135 If you need to compile QEMU, please read the @file{README} which gives
 136 the related information.
 137
 138 In order to launch a Linux process, QEMU needs the process executable
 139 itself and all the target (x86) dynamic libraries used by it.
 140
 141 @itemize
 142
 143 @item On x86, you can just try to launch any process by using the native
 144 libraries:
 145
 146 @example
 147 qemu -L / /bin/ls
 148 @end example
 149
@code{-L /} tells QEMU to search for the x86 dynamic linker under the
@file{/} prefix.
 152
@item Since QEMU is also a Linux process, you can launch QEMU with QEMU:
 154
 155 @example
 156 qemu -L / qemu -L / /bin/ls
 157 @end example
 158
@item On non-x86 CPUs, you first need to download at least an x86 glibc
 160 (@file{qemu-XXX-i386-glibc21.tar.gz} on the QEMU web page). Ensure that
 161 @code{LD_LIBRARY_PATH} is not set:
 162
 163 @example
 164 unset LD_LIBRARY_PATH
 165 @end example
 166
 167 Then you can launch the precompiled @file{ls} x86 executable:
 168
 169 @example
 170 qemu /usr/local/qemu-i386/bin/ls-i386
 171 @end example
You can look at @file{/usr/local/qemu-i386/bin/qemu-conf.sh} to see how
to have QEMU launched automatically by the Linux kernel when you try to
launch x86 executables. It requires the @code{binfmt_misc} module in the
Linux kernel.
 176
 177 @item The x86 version of QEMU is also included. You can try weird things such as:
 178 @example
 179 qemu /usr/local/qemu-i386/bin/qemu-i386 /usr/local/qemu-i386/bin/ls-i386
 180 @end example
 181
 182 @end itemize
 183
 184 @section Wine launch
 185
 186 @itemize
 187
@item Ensure that you have a working QEMU with the x86 glibc
distribution (see previous section). To verify it, check that the
following command works:
 191
 192 @example
 193 qemu /usr/local/qemu-i386/bin/ls-i386
 194 @end example
 195
 196 @item Download the binary x86 Wine install
 197 (@file{qemu-XXX-i386-wine.tar.gz} on the QEMU web page).
 198
 199 @item Configure Wine on your account. Look at the provided script
 200 @file{/usr/local/qemu-i386/bin/wine-conf.sh}. Your previous
 201 @code{$@{HOME@}/.wine} directory is saved to @code{$@{HOME@}/.wine.org}.
 202
 203 @item Then you can try the example @file{putty.exe}:
 204
 205 @example
 206 qemu /usr/local/qemu-i386/wine/bin/wine /usr/local/qemu-i386/wine/c/Program\ Files/putty.exe
 207 @end example
 208
 209 @end itemize
 210
 211 @section Command line options
 212
 213 @example
 214 usage: qemu [-h] [-d] [-L path] [-s size] program [arguments...]
 215 @end example
 216
 217 @table @option
 218 @item -h
 219 Print the help
 220 @item -L path
Set the x86 ELF interpreter prefix (default=/usr/local/qemu-i386)
 222 @item -s size
 223 Set the x86 stack size in bytes (default=524288)
 224 @end table
 225
 226 Debug options:
 227
 228 @table @option
 229 @item -d
 230 Activate log (logfile=/tmp/qemu.log)
 231 @item -p pagesize
 232 Act as if the host page size was 'pagesize' bytes
 233 @end table
 234
 235 @chapter QEMU System emulator invocation
 236
 237 @section Quick Start
 238
 239 This section explains how to launch a Linux kernel inside QEMU.
 240
 241 @enumerate
 242 @item
 243 Download the archive @file{vl-test-xxx.tar.gz} containing a Linux kernel
and an initrd (initial RAM disk). The archive also contains a
 245 precompiled version of @file{vl}, the QEMU System emulator.
 246
 247 @item Optional: If you want network support (for example to launch X11 examples), you
must copy the script @file{vl-ifup} to @file{/etc} and configure
@code{sudo} properly so that the command @code{ifconfig} contained in
 250 @file{vl-ifup} can be executed as root. You must verify that your host
 251 kernel supports the TUN/TAP network interfaces: the device
 252 @file{/dev/net/tun} must be present.
 253
When the network is enabled, there is a virtual network connection between
 255 the host kernel and the emulated kernel. The emulated kernel is seen
 256 from the host kernel at IP address 172.20.0.2 and the host kernel is
 257 seen from the emulated kernel at IP address 172.20.0.1.
 258
 259 @item Launch @code{vl.sh}. You should have the following output:
 260
 261 @example
 262 > ./vl.sh
 263 connected to host network interface: tun0
 264 Uncompressing Linux... Ok, booting the kernel.
 265 Linux version 2.4.20 (bellard@voyager) (gcc version 2.95.2 20000220 (Debian GNU/Linux)) #42 Wed Jun 25 14:16:12 CEST 2003
 266 BIOS-provided physical RAM map:
 267  BIOS-88: 0000000000000000 - 000000000009f000 (usable)
 268  BIOS-88: 0000000000100000 - 0000000002000000 (usable)
 269 32MB LOWMEM available.
 270 On node 0 totalpages: 8192
 271 zone(0): 4096 pages.
 272 zone(1): 4096 pages.
 273 zone(2): 0 pages.
 274 Kernel command line: root=/dev/ram ramdisk_size=6144
 275 Initializing CPU#0
 276 Detected 501.785 MHz processor.
 277 Calibrating delay loop... 973.20 BogoMIPS
 278 Memory: 24776k/32768k available (725k kernel code, 7604k reserved, 151k data, 48k init, 0k highmem)
 279 Dentry cache hash table entries: 4096 (order: 3, 32768 bytes)
 280 Inode cache hash table entries: 2048 (order: 2, 16384 bytes)
 281 Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
 282 Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes)
 283 Page-cache hash table entries: 8192 (order: 3, 32768 bytes)
 284 CPU: Intel Pentium Pro stepping 03
 285 Checking 'hlt' instruction... OK.
 286 POSIX conformance testing by UNIFIX
 287 Linux NET4.0 for Linux 2.4
 288 Based upon Swansea University Computer Society NET3.039
 289 Initializing RT netlink socket
 290 apm: BIOS not found.
 291 Starting kswapd
 292 pty: 256 Unix98 ptys configured
 293 Serial driver version 5.05c (2001-07-08) with no serial options enabled
 294 ttyS00 at 0x03f8 (irq = 4) is a 16450
 295 ne.c:v1.10 9/23/94 Donald Becker (becker@scyld.com)
 296 Last modified Nov 1, 2000 by Paul Gortmaker
 297 NE*000 ethercard probe at 0x300: 52 54 00 12 34 56
 298 eth0: NE2000 found at 0x300, using IRQ 9.
 299 RAMDISK driver initialized: 16 RAM disks of 6144K size 1024 blocksize
 300 NET4: Linux TCP/IP 1.0 for NET4.0
 301 IP Protocols: ICMP, UDP, TCP, IGMP
 302 IP: routing cache hash table of 512 buckets, 4Kbytes
 303 TCP: Hash tables configured (established 2048 bind 2048)
 304 NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
 305 RAMDISK: ext2 filesystem found at block 0
 306 RAMDISK: Loading 6144 blocks [1 disk] into ram disk... done.
 307 Freeing initrd memory: 6144k freed
 308 VFS: Mounted root (ext2 filesystem).
 309 Freeing unused kernel memory: 48k freed
 310 sh: can't access tty; job control turned off
 311 #
 312 @end example
 313
 314 @item
 315 Then you can play with the kernel inside the virtual serial console. You
can launch @code{ls} for example. Type @key{Ctrl-a h} to get help
on the keys you can type inside the virtual serial console. In
 318 particular, use @key{Ctrl-a x} to exit QEMU and use @key{Ctrl-a b} as
 319 the Magic SysRq key.
 320
 321 @item
 322 If the network is enabled, launch the script @file{/etc/linuxrc} in the
 323 emulator (don't forget the leading dot):
 324 @example
 325 . /etc/linuxrc
 326 @end example
 327
 328 Then enable X11 connections on your PC from the emulated Linux:
 329 @example
 330 xhost +172.20.0.2
 331 @end example
 332
 333 You can now launch @file{xterm} or @file{xlogo} and verify that you have
a real virtual Linux system!
 335
 336 @end enumerate
 337
 338 NOTES:
 339 @enumerate
 340 @item
 341 A 2.5.66 kernel is also included in the vl-test archive. Just
 342 replace the bzImage in vl.sh to try it.
 343
 344 @item
 345 vl creates a temporary file in @var{$VLTMPDIR} (@file{/tmp} is the
 346 default) containing all the simulated PC memory. If possible, try to use
 347 a temporary directory using the tmpfs filesystem to avoid too many
 348 unnecessary disk accesses.
 349
 350 @item
 351 The example initrd is a modified version of the one made by Kevin
 352 Lawton for the plex86 Project (@url{www.plex86.org}).
 353
 354 @end enumerate
 355
 356 @section Invocation
 357
 358 @example
 359 usage: vl [options] bzImage [kernel parameters...]
 360 @end example
 361
 362 @file{bzImage} is a Linux kernel image.
 363
 364 General options:
 365 @table @option
 366 @item -initrd file
 367 Use 'file' as initial ram disk.
 368
 369 @item -hda file
 370 @item -hdb file
Use 'file' as hard disk 0 or 1 image (@pxref{disk_images}).
 372
 373 @item -snapshot
 374
 375 Write to temporary files instead of disk image files. In this case,
 376 the raw disk image you use is not written back. You can however force
the write back by pressing @key{C-a s} (@pxref{disk_images}).
 378
 379 @item -m megs
 380 Set virtual RAM size to @var{megs} megabytes.
 381
 382 @item -n script
 383 Set network init script [default=/etc/vl-ifup]. This script is
 384 launched to configure the host network interface (usually tun0)
 385 corresponding to the virtual NE2000 card.
 386 @end table
 387
 388 Debug options:
 389 @table @option
 390 @item -s
Wait for a gdb connection on port 1234.
 392 @item -p port
 393 Change gdb connection port.
 394 @item -d
 395 Output log in /tmp/vl.log
 396 @end table
 397
 398 During emulation, use @key{C-a h} to get terminal commands:
 399
 400 @table @key
 401 @item C-a h
 402 Print this help
 403 @item C-a x
Exit emulator
 405 @item C-a s
 406 Save disk data back to file (if -snapshot)
 407 @item C-a b
 408 Send break (magic sysrq)
 409 @item C-a C-a
 410 Send C-a
 411 @end table
 412
 413 @node disk_images
 414 @section Disk Images
 415
 416 @subsection Raw disk images
 417
 418 The disk images can simply be raw images of the hard disk. You can
 419 create them with the command:
 420 @example
 421 dd if=/dev/zero of=myimage bs=1024 count=mysize
 422 @end example
 423 where @var{myimage} is the image filename and @var{mysize} is its size
 424 in kilobytes.
 425
 426 @subsection Snapshot mode
 427
 428 If you use the option @option{-snapshot}, all disk images are
considered read-only. When sectors are written, they are written to
a temporary file created in @file{/tmp}. You can however force the
 431 write back to the raw disk images by pressing @key{C-a s}.
 432
 433 NOTE: The snapshot mode only works with raw disk images.
 434
 435 @subsection Copy On Write disk images
 436
 437 QEMU also supports user mode Linux
 438 (@url{http://user-mode-linux.sourceforge.net/}) Copy On Write (COW)
 439 disk images. The COW disk images are much smaller than normal images
 440 as they store only modified sectors. They also permit the use of the
 441 same disk image template for many users.
 442
To create a COW disk image, use the command:
 444
 445 @example
 446 vlmkcow -f myrawimage.bin mycowimage.cow
 447 @end example
 448
 449 @file{myrawimage.bin} is a raw image you want to use as original disk
 450 image. It will never be written to.
 451
 452 @file{mycowimage.cow} is the COW disk image which is created by
 453 @code{vlmkcow}. You can use it directly with the @option{-hdx}
 454 options. You must not modify the original raw disk image if you use
 455 COW images, as COW images only store the modified sectors from the raw
 456 disk image. QEMU stores the original raw disk image name and its
 457 modified time in the COW disk image so that chances of mistakes are
 458 reduced.
 459
If the raw disk image is not read-only, by pressing @key{C-a s} you can
 461 flush the COW disk image back into the raw disk image, as in snapshot
 462 mode.
 463
 464 COW disk images can also be created without a corresponding raw disk
 465 image. It is useful to have a big initial virtual disk image without
 466 using much disk space. Use:
 467
 468 @example
 469 vlmkcow mycowimage.cow 1024
 470 @end example
 471
 472 to create a 1 gigabyte empty COW disk image.
 473
 474 NOTES:
 475 @enumerate
 476 @item
 477 COW disk images must be created on file systems supporting
 478 @emph{holes} such as ext2 or ext3.
 479 @item
Since holes are used, the displayed size of the COW disk image is not
the real one. To see the space actually used, use the @code{ls -ls} command.
 482 @end enumerate
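
The copy-on-write idea described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the class
@code{CowImage}, the 512-byte sector size and the in-memory dictionary are
hypothetical, and the real on-disk COW format is not modelled.

@verbatim
```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


class CowImage:
    """Toy copy-on-write overlay: the base image is never written to."""

    def __init__(self, base: bytes):
        self.base = base      # stands in for the read-only raw image
        self.modified = {}    # sector number -> bytes written by the guest

    def read_sector(self, n: int) -> bytes:
        # Prefer the overlay; fall back to the untouched base image.
        if n in self.modified:
            return self.modified[n]
        start = n * SECTOR_SIZE
        return self.base[start:start + SECTOR_SIZE]

    def write_sector(self, n: int, data: bytes) -> None:
        if len(data) != SECTOR_SIZE:
            raise ValueError("a sector write must be exactly SECTOR_SIZE bytes")
        self.modified[n] = data

    def flush(self) -> bytes:
        """Return the base image with every modified sector merged in."""
        merged = bytearray(self.base)
        for n, data in self.modified.items():
            merged[n * SECTOR_SIZE:(n + 1) * SECTOR_SIZE] = data
        return bytes(merged)
```
@end verbatim

Reads fall through to the base image until a sector has been written,
which is why the original raw image must not change underneath a COW image.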
 483
 484 @section Kernel Compilation
 485
 486 You can use any Linux kernel within QEMU provided it is mapped at
 487 address 0x90000000 (the default is 0xc0000000). You must modify only two
 488 lines in the kernel source:
 489
 490 In asm/page.h, replace
 491 @example
 492 #define __PAGE_OFFSET           (0xc0000000)
 493 @end example
 494 by
 495 @example
 496 #define __PAGE_OFFSET           (0x90000000)
 497 @end example
 498
 499 And in arch/i386/vmlinux.lds, replace
 500 @example
 501   . = 0xc0000000 + 0x100000;
 502 @end example
 503 by
 504 @example
 505   . = 0x90000000 + 0x100000;
 506 @end example
 507
 508 The file config-2.4.20 gives the configuration of the example kernel.
 509
 510 Just type
 511 @example
 512 make bzImage
 513 @end example
 514
This is the same command you would use to build a real kernel. You can
then use with QEMU exactly the same kernel as you would boot on your PC
(in @file{arch/i386/boot/bzImage}).
 518
If the host kernel is older than 2.5 but the target kernel is a 2.5
kernel, you must also ensure that the 'HZ' define is set to 100
(1000 is the default), as QEMU cannot currently emulate timers at
frequencies greater than 100 Hz on host Linux systems < 2.5. In
asm/param.h, replace:
 524
 525 @example
 526 # define HZ             1000            /* Internal kernel timer frequency */
 527 @end example
 528 by
 529 @example
 530 # define HZ             100             /* Internal kernel timer frequency */
 531 @end example
 532
 533 If you have problems running your kernel, verify that neither the SMP nor
 534 HIGHMEM configuration options are activated.
 535
 536 @section PC Emulation
 537
QEMU emulates the following PC peripherals:
 539
 540 @itemize
 541 @item
PIC (interrupt controller)
 543 @item
 544 PIT (timers)
 545 @item
 546 CMOS memory
 547 @item
 548 Dumb VGA (to print the @code{Uncompressing Linux} message)
 549 @item
 550 Serial port (port=0x3f8, irq=4)
 551 @item
 552 NE2000 network adapter (port=0x300, irq=9)
 553 @item
 554 IDE disk interface (port=0x1f0, irq=14)
 555 @end itemize
 556
 557 @section GDB usage
 558
QEMU has primitive support for working with gdb, so that you can press
'Ctrl-C' while the kernel is running and inspect its state.
 561
 562 In order to use gdb, launch vl with the '-s' option. It will wait for a
 563 gdb connection:
 564 @example
 565 > vl -s arch/i386/boot/bzImage initrd-2.4.20.img root=/dev/ram0 ramdisk_size=6144
 566 Connected to host network interface: tun0
 567 Waiting gdb connection on port 1234
 568 @end example
 569
 570 Then launch gdb on the 'vmlinux' executable:
 571 @example
 572 > gdb vmlinux
 573 @end example
 574
 575 In gdb, connect to QEMU:
 576 @example
(gdb) target remote localhost:1234
 578 @end example
 579
 580 Then you can use gdb normally. For example, type 'c' to launch the kernel:
 581 @example
 582 (gdb) c
 583 @end example
 584
 585 WARNING: breakpoints and single stepping are not yet supported.
 586
 587 @chapter QEMU Internals
 588
 589 @section QEMU compared to other emulators
 590
 591 Like bochs [3], QEMU emulates an x86 CPU. But QEMU is much faster than
 592 bochs as it uses dynamic compilation and because it uses the host MMU to
 593 simulate the x86 MMU. The downside is that currently the emulation is
 594 not as accurate as bochs (for example, you cannot currently run Windows
 595 inside QEMU).
 596
Like Valgrind [2], QEMU does user space emulation and dynamic
translation. Valgrind is mainly a memory debugger, while QEMU has no
support for memory debugging (QEMU could be used to detect out-of-bounds
memory accesses as Valgrind does, but it has no support for tracking
uninitialised data). The Valgrind dynamic translator generates better code
than QEMU (in particular it does register allocation) but it is closely
tied to an x86 host and target and has no support for precise exceptions
and system emulation.
 605
 606 EM86 [4] is the closest project to user space QEMU (and QEMU still uses
 607 some of its code, in particular the ELF file loader). EM86 was limited
 608 to an alpha host and used a proprietary and slow interpreter (the
 609 interpreter part of the FX!32 Digital Win32 code translator [5]).
 610
 611 TWIN [6] is a Windows API emulator like Wine. It is less accurate than
 612 Wine but includes a protected mode x86 interpreter to launch x86 Windows
executables. Such an approach has greater potential because most of the
 614 Windows API is executed natively but it is far more difficult to develop
 615 because all the data structures and function parameters exchanged
 616 between the API and the x86 code must be converted.
 617
 618 User mode Linux [7] was the only solution before QEMU to launch a Linux
 619 kernel as a process while not needing any host kernel patches. However,
 620 user mode Linux requires heavy kernel patches while QEMU accepts
 621 unpatched Linux kernels. It would be interesting to compare the
 622 performance of the two approaches.
 623
 624 The new Plex86 [8] PC virtualizer is done in the same spirit as the QEMU
 625 system emulator. It requires a patched Linux kernel to work (you cannot
 626 launch the same kernel on your PC), but the patches are really small. As
it is a PC virtualizer (no emulation is done except for some privileged
 628 instructions), it has the potential of being faster than QEMU. The
 629 downside is that a complicated (and potentially unsafe) host kernel
 630 patch is needed.
 631
 632 @section Portable dynamic translation
 633
 634 QEMU is a dynamic translator. When it first encounters a piece of code,
 635 it converts it to the host instruction set. Usually dynamic translators
 636 are very complicated and highly CPU dependent. QEMU uses some tricks
which make it relatively easy to port and simple while achieving good
performance.
 639
The basic idea is to split every x86 instruction into a few simpler
 641 instructions. Each simple instruction is implemented by a piece of C
 642 code (see @file{op-i386.c}). Then a compile time tool (@file{dyngen})
 643 takes the corresponding object file (@file{op-i386.o}) to generate a
 644 dynamic code generator which concatenates the simple instructions to
 645 build a function (see @file{op-i386.h:dyngen_code()}).
 646
 647 In essence, the process is similar to [1], but more work is done at
 648 compile time.
 649
A key idea to get optimal performance is that constant parameters can
 651 be passed to the simple operations. For that purpose, dummy ELF
 652 relocations are generated with gcc for each constant parameter. Then,
 653 the tool (@file{dyngen}) can locate the relocations and generate the
appropriate C code to resolve them when building the dynamic code.
 655
 656 That way, QEMU is no more difficult to port than a dynamic linker.
 657
 658 To go even faster, GCC static register variables are used to keep the
 659 state of the virtual CPU.
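
As a rough illustration of the scheme above, the following Python sketch
splits toy guest instructions into micro operations and binds a constant
parameter to each of them, much as @file{dyngen} resolves the dummy
relocations. The operation names, the two-instruction guest language and
the dictionary holding the CPU state are all invented for this example;
the real translator concatenates native code instead of calling functions.

@verbatim
```python
MASK32 = 0xFFFFFFFF


# Each micro operation stands for one small C function; `param` plays the
# role of the constant that would be patched in through a relocation.
def op_movl_T0_im(cpu, param):
    cpu["T0"] = param & MASK32


def op_movl_T0_EAX(cpu, param):
    cpu["T0"] = cpu["EAX"]


def op_addl_T0_im(cpu, param):
    cpu["T0"] = (cpu["T0"] + param) & MASK32


def op_movl_EAX_T0(cpu, param):
    cpu["EAX"] = cpu["T0"]


def translate(guest_insns):
    """Turn toy guest instructions into a list of (micro-op, constant)."""
    code = []
    for name, imm in guest_insns:
        if name == "mov_eax_imm":
            code += [(op_movl_T0_im, imm), (op_movl_EAX_T0, None)]
        elif name == "add_eax_imm":
            code += [(op_movl_T0_EAX, None), (op_addl_T0_im, imm),
                     (op_movl_EAX_T0, None)]
        else:
            raise ValueError("unknown guest instruction: " + name)
    return code


def run(code, cpu):
    """Execute the translated micro-ops in order."""
    for op, param in code:
        op(cpu, param)
    return cpu
```
@end verbatim

The translation step runs once per block; afterwards only the short list
of simple operations is executed.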
 660
 661 @section Register allocation
 662
 663 Since QEMU uses fixed simple instructions, no efficient register
allocation can be done. However, because RISC CPUs have a lot of
registers, most of the virtual CPU state can be put in registers without
 666 doing complicated register allocation.
 667
 668 @section Condition code optimisations
 669
Good emulation of the CPU condition codes (@code{EFLAGS} register on x86) is
critical to getting good performance. QEMU uses lazy condition code
 672 evaluation: instead of computing the condition codes after each x86
 673 instruction, it just stores one operand (called @code{CC_SRC}), the
 674 result (called @code{CC_DST}) and the type of operation (called
 675 @code{CC_OP}).
 676
@code{CC_OP} is almost never explicitly set in the generated code
 678 because it is known at translation time.
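
Lazy evaluation can be pictured with the following Python sketch. It
keeps only @code{CC_SRC}, @code{CC_DST} and @code{CC_OP} after an
operation and derives a flag when it is actually requested. The class
name, the two operation tags and the choice of the zero and carry flags
are assumptions made for the example, not a description of the actual code.

@verbatim
```python
MASK32 = 0xFFFFFFFF
CC_OP_ADDL, CC_OP_SUBL = "addl", "subl"   # hypothetical operation tags


class LazyFlags:
    def __init__(self):
        self.cc_src = 0
        self.cc_dst = 0
        self.cc_op = CC_OP_ADDL

    def add(self, a, b):
        result = (a + b) & MASK32
        # Record just enough to rebuild the flags later.
        self.cc_src, self.cc_dst, self.cc_op = b, result, CC_OP_ADDL
        return result

    def sub(self, a, b):
        result = (a - b) & MASK32
        self.cc_src, self.cc_dst, self.cc_op = b, result, CC_OP_SUBL
        return result

    def zero_flag(self):
        return self.cc_dst == 0

    def carry_flag(self):
        if self.cc_op == CC_OP_ADDL:
            # Unsigned overflow: the truncated result is below an operand.
            return self.cc_dst < self.cc_src
        # For a - b, recover a as dst + src; a borrow occurred if a < b.
        a = (self.cc_dst + self.cc_src) & MASK32
        return a < self.cc_src
```
@end verbatim

No flag is computed by @code{add()} or @code{sub()} themselves; the cost
is paid only by code that actually reads a flag.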
 679
In order to increase performance, a backward pass is performed on the
 681 generated simple instructions (see
 682 @code{translate-i386.c:optimize_flags()}). When it can be proved that
 683 the condition codes are not needed by the next instructions, no
 684 condition codes are computed at all.
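
A backward pass of this kind can be sketched as follows. The
@code{MicroOp} record and its fields are hypothetical, and the sketch
makes two simplifying assumptions: every flag-writing operation
overwrites all condition codes, and the flags are considered live at the
end of the block because the next block may read them.

@verbatim
```python
from dataclasses import dataclass


@dataclass
class MicroOp:
    name: str
    writes_cc: bool = False   # operation would update the condition codes
    reads_cc: bool = False    # operation consumes the condition codes
    compute_cc: bool = True   # decided by the backward pass


def optimize_flags(ops):
    """Walk the block backwards and drop unneeded flag computations."""
    live = True  # conservative: flags may be needed after the block
    for op in reversed(ops):
        op.compute_cc = op.writes_cc and live
        if op.writes_cc:
            live = False      # earlier results are overwritten here
        if op.reads_cc:
            live = True       # earlier results are needed by this op
    return ops
```
@end verbatim

For the sequence add, sub, conditional jump, add, only the @code{sub} and
the final @code{add} keep their flag computation.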
 685
 686 @section CPU state optimisations
 687
 688 The x86 CPU has many internal states which change the way it evaluates
instructions. In order to achieve good speed, the translation phase
assumes that some state information of the virtual x86 CPU cannot
change within the translated code. For example, if the SS, DS and ES segments have a zero
 692 base, then the translator does not even generate an addition for the
 693 segment base.
 694
 695 [The FPU stack pointer register is not handled that way yet].
 696
 697 @section Translation cache
 698
 699 A 2MByte cache holds the most recently used translations. For
 700 simplicity, it is completely flushed when it is full. A translation unit
 701 contains just a single basic block (a block of x86 instructions
 702 terminated by a jump or by a virtual CPU state change which the
 703 translator cannot deduce statically).
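
The flush-when-full policy is simple enough to sketch in Python. The
class below is hypothetical; in particular, keying the cache on the
program counter together with the CS base and a flags value is an
assumption drawn from the surrounding sections, and translated code is
represented by a plain byte string.

@verbatim
```python
class TranslationCache:
    def __init__(self, capacity_bytes=2 * 1024 * 1024):
        self.capacity = capacity_bytes
        self.used = 0
        self.blocks = {}       # (pc, cs_base, flags) -> translated code
        self.flush_count = 0

    def lookup(self, pc, cs_base, flags, translate):
        """Return the translation for this state, translating on a miss."""
        key = (pc, cs_base, flags)
        code = self.blocks.get(key)
        if code is None:
            code = translate(pc)
            if self.used + len(code) > self.capacity:
                self.flush()   # no eviction policy: drop everything
            self.blocks[key] = code
            self.used += len(code)
        return code

    def flush(self):
        self.blocks.clear()
        self.used = 0
        self.flush_count += 1
```
@end verbatim

Flushing everything avoids any bookkeeping about which block is least
recently used, at the cost of retranslating hot blocks after a flush.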
 704
 705 @section Direct block chaining
 706
 707 After each translated basic block is executed, QEMU uses the simulated
Program Counter (PC) and other CPU state information (such as the CS
 709 segment base value) to find the next basic block.
 710
 711 In order to accelerate the most common cases where the new simulated PC
 712 is known, QEMU can patch a basic block so that it jumps directly to the
 713 next one.
 714
 715 The most portable code uses an indirect jump. An indirect jump makes it
 716 easier to make the jump target modification atomic. On some
 717 architectures (such as PowerPC), the @code{JUMP} opcode is directly
 718 patched so that the block chaining has no overhead.
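
The effect of chaining can be shown with a small Python model. The
@code{Block} class and the @code{run()} loop are hypothetical; a patched
jump is represented by a direct object reference, and a dictionary lookup
by PC stands for the slower search for the next block.

@verbatim
```python
class Block:
    def __init__(self, pc, next_pc):
        self.pc = pc
        self.next_pc = next_pc   # statically known successor PC, or None
        self.chained = None      # patched "jump" to the successor block


def run(blocks, start_pc, steps):
    """Execute `steps` blocks; return (trace of PCs, number of lookups)."""
    lookups = 1
    block = blocks[start_pc]
    trace = []
    for _ in range(steps):
        trace.append(block.pc)
        if block.chained is not None:
            block = block.chained        # fast path: follow the patch
            continue
        if block.next_pc is None:
            break
        nxt = blocks[block.next_pc]      # slow path: find block by PC
        lookups += 1
        block.chained = nxt              # patch the block for next time
        block = nxt
    return trace, lookups
```
@end verbatim

In a two-block loop, only the first pass through each block pays for a
lookup; every later transition follows the patched link.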
 719
 720 @section Self-modifying code and translated code invalidation
 721
 722 Self-modifying code is a special challenge in x86 emulation because no
 723 instruction cache invalidation is signaled by the application when code
 724 is modified.
 725
 726 When translated code is generated for a basic block, the corresponding
 727 host page is write protected if it is not already read-only (with the
 728 system call @code{mprotect()}). Then, if a write access is done to the
 729 page, Linux raises a SEGV signal. QEMU then invalidates all the
 730 translated code in the page and enables write accesses to the page.
 731
 732 Correct translated code invalidation is done efficiently by maintaining
 733 a linked list of every translated block contained in a given page. Other
 734 linked lists are also maintained to undo direct block chaining.
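
The bookkeeping described above can be sketched as follows. The sketch is
hypothetical: a Python set stands in for the pages protected with
@code{mprotect()}, a method call stands in for the SEGV handler, and
blocks are assumed not to span two pages.

@verbatim
```python
PAGE_SIZE = 4096


class CodePages:
    def __init__(self):
        self.blocks_by_page = {}   # page number -> list of block start PCs
        self.protected = set()     # pages currently write protected

    def add_block(self, pc):
        """Register a translated block and write protect its page."""
        page = pc // PAGE_SIZE
        self.blocks_by_page.setdefault(page, []).append(pc)
        self.protected.add(page)   # real code would call mprotect() here

    def on_write_fault(self, addr):
        """Model the fault handler: drop the page's blocks, allow writes."""
        page = addr // PAGE_SIZE
        invalidated = self.blocks_by_page.pop(page, [])
        self.protected.discard(page)
        return invalidated
```
@end verbatim

Keeping one list per page means a write fault only has to visit the
blocks that can actually be affected by the write.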
 735
Although the overhead of doing @code{mprotect()} calls is significant,
most MSDOS programs can be emulated at reasonable speed with QEMU and
DOSEMU.
 739
 740 Note that QEMU also invalidates pages of translated code when it detects
 741 that memory mappings are modified with @code{mmap()} or @code{munmap()}.
 742
 743 @section Exception support
 744
 745 longjmp() is used when an exception such as division by zero is
 746 encountered.
 747
 748 The host SIGSEGV and SIGBUS signal handlers are used to get invalid
 749 memory accesses. The exact CPU state can be retrieved because all the
 750 x86 registers are stored in fixed host registers. The simulated program
 751 counter is found by retranslating the corresponding basic block and by
 752 looking where the host program counter was at the exception point.
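
Recovering the simulated program counter can be pictured as a lookup in
a table rebuilt during retranslation. In this hypothetical Python sketch
a block is given as a list of pairs (guest PC, length of the host code
generated for that instruction), and the host program counter is
expressed as an offset from the start of the translated block.

@verbatim
```python
import bisect


def find_guest_pc(block, host_fault_offset):
    """block: list of (guest_pc, host_code_length) in execution order."""
    starts, pcs, offset = [], [], 0
    for guest_pc, host_len in block:   # stands in for the retranslation
        starts.append(offset)          # host offset where this insn begins
        pcs.append(guest_pc)
        offset += host_len
    if not 0 <= host_fault_offset < offset:
        raise ValueError("host PC is outside this translated block")
    # Last instruction whose host code starts at or before the fault.
    return pcs[bisect.bisect_right(starts, host_fault_offset) - 1]
```
@end verbatim

The table is only built when an exception occurs, so normal execution
does not pay for keeping it.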
 753
 754 The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
 755 in some cases it is not computed because of condition code
 756 optimisations. It is not a big concern because the emulated code can
still be restarted in any case.
 758
 759 @section Linux system call translation
 760
 761 QEMU includes a generic system call translator for Linux. It means that
 762 the parameters of the system calls can be converted to fix the
 763 endianness and 32/64 bit issues. The IOCTLs are converted with a generic
 764 type description system (see @file{ioctls.h} and @file{thunk.c}).
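
As an illustration of what such a conversion involves, the Python sketch
below converts a structure holding two time fields between an assumed
target layout (little-endian, two 32-bit fields) and the native layout of
the host. The function names and format strings are hypothetical and are
not taken from @file{thunk.c}.

@verbatim
```python
import struct

TARGET_TIMEVAL = "<ii"   # assumed target layout: little-endian, 2 x 32 bit
HOST_TIMEVAL = "@ll"     # host layout: native byte order, size, alignment


def target_to_host_timeval(data: bytes) -> bytes:
    """Convert a target structure before passing it to a host call."""
    sec, usec = struct.unpack(TARGET_TIMEVAL, data)
    return struct.pack(HOST_TIMEVAL, sec, usec)


def host_to_target_timeval(data: bytes) -> bytes:
    """Convert a host result back into the layout the target expects."""
    sec, usec = struct.unpack(HOST_TIMEVAL, data)
    return struct.pack(TARGET_TIMEVAL, sec, usec)
```
@end verbatim

On a 64-bit big-endian host the two layouts differ in both field width
and byte order, which is exactly the mismatch the translator has to hide.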
 765
 766 QEMU supports host CPUs which have pages bigger than 4KB. It records all
the mappings the process does and tries to emulate the @code{mmap()}
 768 system calls in cases where the host @code{mmap()} call would fail
 769 because of bad page alignment.
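
The alignment problem can be made concrete with a short Python sketch.
Both helper functions are hypothetical: they only decide whether a
request made with 4KB granularity fits the host page size, and do not
show how the emulation itself is carried out.

@verbatim
```python
TARGET_PAGE_SIZE = 4096


def host_range(addr, length, host_page_size):
    """Smallest host-page-aligned range covering [addr, addr + length)."""
    start = addr & ~(host_page_size - 1)
    end = (addr + length + host_page_size - 1) & ~(host_page_size - 1)
    return start, end


def needs_emulation(addr, length, host_page_size):
    """True when the request is not aligned on host page boundaries."""
    return host_range(addr, length, host_page_size) != (addr, addr + length)
```
@end verbatim

With an 8KB host page, a 4KB mapping at address 0x1000 shares its host
page with the neighbouring 4KB area and cannot be passed through unchanged.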
 770
 771 @section Linux signals
 772
 773 Normal and real-time signals are queued along with their information
 774 (@code{siginfo_t}) as it is done in the Linux kernel. Then an interrupt
 775 request is done to the virtual CPU. When it is interrupted, one queued
 776 signal is handled by generating a stack frame in the virtual CPU as the
 777 Linux kernel does. The @code{sigreturn()} system call is emulated to return
 778 from the virtual signal handler.
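
The queueing scheme can be modelled as follows. The class is a
hypothetical sketch: signal numbers are plain integers, the list
@code{frames} stands for signal frames built on the virtual CPU stack,
and delivery in first-in first-out order is an assumption.

@verbatim
```python
from collections import deque


class VirtualCpuSignals:
    def __init__(self):
        self.pending = deque()          # queued (signo, siginfo) pairs
        self.interrupt_request = False
        self.frames = []                # stands in for guest stack frames

    def queue_signal(self, signo, siginfo=None):
        self.pending.append((signo, siginfo))
        self.interrupt_request = True   # ask the virtual CPU to stop

    def handle_interrupt(self):
        """Deliver exactly one queued signal, if there is one."""
        if not self.pending:
            self.interrupt_request = False
            return None
        signo, siginfo = self.pending.popleft()
        self.frames.append((signo, siginfo))
        self.interrupt_request = bool(self.pending)
        return signo

    def sigreturn(self):
        """Leave the innermost handler by discarding its frame."""
        return self.frames.pop()
```
@end verbatim

Because only one signal is handled per interrupt, the request flag stays
set while further signals are waiting in the queue.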
 779
 780 Some signals (such as SIGALRM) directly come from the host. Other
signals are synthesized from the virtual CPU exceptions such as SIGFPE
 782 when a division by zero is done (see @code{main.c:cpu_loop()}).
 783
 784 The blocked signal mask is still handled by the host Linux kernel so
 785 that most signal system calls can be redirected directly to the host
 786 Linux kernel. Only the @code{sigaction()} and @code{sigreturn()} system
 787 calls need to be fully emulated (see @file{signal.c}).
 788
 789 @section clone() system call and threads
 790
 791 The Linux clone() system call is usually used to create a thread. QEMU
 792 uses the host clone() system call so that real host threads are created
 793 for each emulated thread. One virtual CPU instance is created for each
 794 thread.
 795
 796 The virtual x86 CPU atomic operations are emulated with a global lock so
that their semantics are preserved.
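
A global lock of this kind can be sketched with Python threads. The
function names and the dictionary used as guest memory are hypothetical;
the point is only that every emulated read-modify-write takes the same
lock, whichever virtual CPU performs it.

@verbatim
```python
import threading

_global_lock = threading.Lock()   # one lock shared by every virtual CPU


def locked_add(memory, addr, value):
    """Emulate an atomic 32-bit read-modify-write on guest memory."""
    with _global_lock:
        memory[addr] = (memory[addr] + value) & 0xFFFFFFFF
        return memory[addr]


def run_threads(memory, addr, n_threads, n_iters):
    """Let several threads increment the same guest location."""
    def worker():
        for _ in range(n_iters):
            locked_add(memory, addr, 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return memory[addr]
```
@end verbatim

A single lock is the simplest way to keep the semantics, but it
serializes all atomic operations, even those on unrelated addresses.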
 798
 799 Note that currently there are still some locking issues in QEMU. In
 800 particular, the translated cache flush is not protected yet against
 801 reentrancy.
 802
 803 @section Self-virtualization
 804
QEMU was conceived so that ultimately it can emulate itself. Although
 806 it is not very useful, it is an important test to show the power of the
 807 emulator.
 808
 809 Achieving self-virtualization is not easy because there may be address
 810 space conflicts. QEMU solves this problem by being an executable ELF
shared object like the ld-linux.so ELF interpreter. That way, it can be
 812 relocated at load time.
 813
 814 @section MMU emulation
 815
 816 For system emulation, QEMU uses the mmap() system call to emulate the
target CPU MMU. It works as long as the emulated OS does not use an area
 818 reserved by the host OS (such as the area above 0xc0000000 on x86
 819 Linux).
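
The restriction amounts to a simple range check, sketched here in
Python. The function name is hypothetical and the boundary is the x86
Linux value quoted above; other hosts reserve other areas.

@verbatim
```python
HOST_RESERVED_START = 0xC0000000   # area kept by an x86 Linux host


def can_map_directly(guest_addr, length):
    """True if the guest range lies entirely below the reserved area."""
    return guest_addr >= 0 and guest_addr + length <= HOST_RESERVED_START
```
@end verbatim

A guest kernel linked at 0x90000000 passes this check, while one linked
at the default 0xc0000000 does not.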
 820
 821 It is planned to add a slower but more precise MMU emulation
 822 with a software MMU.
 823
 824 @section Bibliography
 825
 826 @table @asis
 827
 828 @item [1]
 829 @url{http://citeseer.nj.nec.com/piumarta98optimizing.html}, Optimizing
 830 direct threaded code by selective inlining (1998) by Ian Piumarta, Fabio
 831 Riccardi.
 832
 833 @item [2]
 834 @url{http://developer.kde.org/~sewardj/}, Valgrind, an open-source
 835 memory debugger for x86-GNU/Linux, by Julian Seward.
 836
 837 @item [3]
 838 @url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project,
 839 by Kevin Lawton et al.
 840
 841 @item [4]
 842 @url{http://www.cs.rose-hulman.edu/~donaldlf/em86/index.html}, the EM86
 843 x86 emulator on Alpha-Linux.
 844
 845 @item [5]
 846 @url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/full_papers/chernoff/chernoff.pdf},
 847 DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
 848 Chernoff and Ray Hookway.
 849
 850 @item [6]
 851 @url{http://www.willows.com/}, Windows API library emulation from
 852 Willows Software.
 853
 854 @item [7]
 855 @url{http://user-mode-linux.sourceforge.net/},
 856 The User-mode Linux Kernel.
 857
 858 @item [8]
 859 @url{http://www.plex86.org/},
 860 The new Plex86 project.
 861
 862 @end table
 863
 864 @chapter Regression Tests
 865
 866 In the directory @file{tests/}, various interesting testing programs
are available. They are used for regression testing.
 868
 869 @section @file{hello-i386}
 870
 871 Very simple statically linked x86 program, just to test QEMU during a
 872 port to a new host CPU.
 873
 874 @section @file{hello-arm}
 875
 876 Very simple statically linked ARM program, just to test QEMU during a
 877 port to a new host CPU.
 878
 879 @section @file{test-i386}
 880
 881 This program executes most of the 16 bit and 32 bit x86 instructions and
 882 generates a text output. It can be compared with the output obtained with
 883 a real CPU or another emulator. The target @code{make test} runs this
 884 program and a @code{diff} on the generated output.
 885
 886 The Linux system call @code{modify_ldt()} is used to create x86 selectors
 887 to test some 16 bit addressing and 32 bit with segmentation cases.
 888
 889 The Linux system call @code{vm86()} is used to test vm86 emulation.
 890
 891 Various exceptions are raised to test most of the x86 user space
 892 exception reporting.
 893
 894 @section @file{sha1}
 895
 896 It is a simple benchmark. Care must be taken to interpret the results
 897 because it mostly tests the ability of the virtual CPU to optimize the
 898 @code{rol} x86 instruction and the condition code computations.
 899