qemu-doc.texi

   1 \input texinfo @c -*- texinfo -*-
   2
   3 @settitle QEMU CPU Emulator Reference Documentation
   4 @titlepage
   5 @sp 7
   6 @center @titlefont{QEMU CPU Emulator Reference Documentation}
   7 @sp 3
   8 @end titlepage
   9
  10 @chapter Introduction
  11
  12 @section Features
  13
  14 QEMU is a FAST! processor emulator. By using dynamic translation it
   15 achieves a reasonable speed while being easy to port to new host
  16 CPUs.
  17
  18 QEMU has two operating modes:
  19 @itemize
  20 @item User mode emulation. In this mode, QEMU can launch Linux processes
  21 compiled for one CPU on another CPU. Linux system calls are converted
  22 because of endianness and 32/64 bit mismatches. The Wine Windows API
  23 emulator (@url{http://www.winehq.org}) and the DOSEMU DOS emulator
  24 (@url{www.dosemu.org}) are the main targets for QEMU.
  25
  26 @item Full system emulation. In this mode, QEMU emulates a full
   27 system, including a processor and various peripherals. Currently, it
  28 is only used to launch an x86 Linux kernel on an x86 Linux system. It
  29 enables easier testing and debugging of system code. It can also be
  30 used to provide virtual hosting of several virtual PCs on a single
  31 server.
  32
  33 @end itemize
  34
  35 As QEMU requires no host kernel patches to run, it is very safe and
  36 easy to use.
  37
  38 QEMU generic features:
  39
  40 @itemize
  41
  42 @item User space only or full system emulation.
  43
   44 @item Using dynamic translation to native code for reasonable speed.
  45
  46 @item Working on x86 and PowerPC hosts. Being tested on ARM, Sparc32, Alpha and S390.
  47
  48 @item Self-modifying code support.
  49
  50 @item Precise exceptions support.
  51
  52 @item The virtual CPU is a library (@code{libqemu}) which can be used
  53 in other projects.
  54
  55 @end itemize
  56
  57 QEMU user mode emulation features:
  58 @itemize
  59 @item Generic Linux system call converter, including most ioctls.
  60
  61 @item clone() emulation using native CPU clone() to use Linux scheduler for threads.
  62
  63 @item Accurate signal handling by remapping host signals to target signals.
  64 @end itemize
  66
  67 QEMU full system emulation features:
  68 @itemize
  69 @item Using mmap() system calls to simulate the MMU
  70 @end itemize
  71
  72 @section x86 emulation
  73
  74 QEMU x86 target features:
  75
  76 @itemize
  77
  78 @item The virtual x86 CPU supports 16 bit and 32 bit addressing with segmentation.
  79 LDT/GDT and IDT are emulated. VM86 mode is also supported to run DOSEMU.
  80
  81 @item Support of host page sizes bigger than 4KB in user mode emulation.
  82
  83 @item QEMU can emulate itself on x86.
  84
   85 @item An extensive Linux x86 CPU test program is included (@file{tests/test-i386}).
  86 It can be used to test other x86 virtual CPUs.
  87
  88 @end itemize
  89
  90 Current QEMU limitations:
  91
  92 @itemize
  93
  94 @item No SSE/MMX support (yet).
  95
  96 @item No x86-64 support.
  97
  98 @item IPC syscalls are missing.
  99
 100 @item The x86 segment limits and access rights are not tested at every
 101 memory access.
 102
  103 @item On non-x86 host CPUs, @code{double}s are used instead of the nonstandard
  104 10-byte @code{long double}s of x86 for floating point emulation to get
  105 maximum performance.
 106
 107 @item Full system emulation only works if no data are mapped above the virtual address
 108 0xc0000000 (yet).
 109
  110 @item Some privileged instructions or behaviors are missing. Only the ones
 111 needed for proper Linux kernel operation are emulated.
 112
 113 @item No memory separation between the kernel and the user processes is done.
 114 It will be implemented very soon.
 115
 116 @end itemize
 117
 118 @section ARM emulation
 119
 120 @itemize
 121
 122 @item ARM emulation can currently launch small programs while using the
 123 generic dynamic code generation architecture of QEMU.
 124
 125 @item No FPU support (yet).
 126
 127 @item No automatic regression testing (yet).
 128
 129 @end itemize
 130
 131 @chapter QEMU User space emulator invocation
 132
 133 @section Quick Start
 134
 135 If you need to compile QEMU, please read the @file{README} which gives
 136 the related information.
 137
 138 In order to launch a Linux process, QEMU needs the process executable
 139 itself and all the target (x86) dynamic libraries used by it.
 140
 141 @itemize
 142
 143 @item On x86, you can just try to launch any process by using the native
 144 libraries:
 145
 146 @example
 147 qemu -L / /bin/ls
 148 @end example
 149
  150 @code{-L /} tells QEMU to search for the x86 dynamic linker with a
  151 @file{/} prefix.
 152
  153 @item Since QEMU is also a Linux process, you can launch qemu with qemu:
 154
 155 @example
 156 qemu -L / qemu -L / /bin/ls
 157 @end example
 158
  159 @item On non-x86 CPUs, you first need to download at least an x86 glibc
 160 (@file{qemu-XXX-i386-glibc21.tar.gz} on the QEMU web page). Ensure that
 161 @code{LD_LIBRARY_PATH} is not set:
 162
 163 @example
 164 unset LD_LIBRARY_PATH
 165 @end example
 166
 167 Then you can launch the precompiled @file{ls} x86 executable:
 168
 169 @example
 170 qemu /usr/local/qemu-i386/bin/ls-i386
 171 @end example
  172 You can look at @file{/usr/local/qemu-i386/bin/qemu-conf.sh} to see how
  173 to have the Linux kernel launch QEMU automatically when you try to
  174 launch x86 executables. It requires the @code{binfmt_misc} module in the
  175 Linux kernel.
 176
 177 @item The x86 version of QEMU is also included. You can try weird things such as:
 178 @example
 179 qemu /usr/local/qemu-i386/bin/qemu-i386 /usr/local/qemu-i386/bin/ls-i386
 180 @end example
 181
 182 @end itemize
 183
 184 @section Wine launch
 185
 186 @itemize
 187
 188 @item Ensure that you have a working QEMU with the x86 glibc
 189 distribution (see previous section). In order to verify it, you must be
 190 able to do:
 191
 192 @example
 193 qemu /usr/local/qemu-i386/bin/ls-i386
 194 @end example
 195
 196 @item Download the binary x86 Wine install
 197 (@file{qemu-XXX-i386-wine.tar.gz} on the QEMU web page).
 198
 199 @item Configure Wine on your account. Look at the provided script
 200 @file{/usr/local/qemu-i386/bin/wine-conf.sh}. Your previous
 201 @code{$@{HOME@}/.wine} directory is saved to @code{$@{HOME@}/.wine.org}.
 202
 203 @item Then you can try the example @file{putty.exe}:
 204
 205 @example
 206 qemu /usr/local/qemu-i386/wine/bin/wine /usr/local/qemu-i386/wine/c/Program\ Files/putty.exe
 207 @end example
 208
 209 @end itemize
 210
 211 @section Command line options
 212
 213 @example
 214 usage: qemu [-h] [-d] [-L path] [-s size] program [arguments...]
 215 @end example
 216
 217 @table @option
 218 @item -h
 219 Print the help
 220 @item -L path
  221 Set the x86 ELF interpreter prefix (default=/usr/local/qemu-i386)
 222 @item -s size
 223 Set the x86 stack size in bytes (default=524288)
 224 @end table
 225
 226 Debug options:
 227
 228 @table @option
 229 @item -d
 230 Activate log (logfile=/tmp/qemu.log)
 231 @item -p pagesize
 232 Act as if the host page size was 'pagesize' bytes
 233 @end table
 234
 235 @chapter QEMU System emulator invocation
 236
 237 @section Quick Start
 238
 239 This section explains how to launch a Linux kernel inside QEMU.
 240
 241 @enumerate
 242 @item
 243 Download the archive @file{vl-test-xxx.tar.gz} containing a Linux
 244 kernel and a disk image. The archive also contains a precompiled
 245 version of @file{vl}, the QEMU System emulator.
 246
 247 @item Optional: If you want network support (for example to launch X11 examples), you
  248 must copy the script @file{vl-ifup} to @file{/etc} and configure
  249 @code{sudo} properly so that the command @code{ifconfig} contained in
 250 @file{vl-ifup} can be executed as root. You must verify that your host
 251 kernel supports the TUN/TAP network interfaces: the device
 252 @file{/dev/net/tun} must be present.
 253
  254 When the network is enabled, there is a virtual network connection between
 255 the host kernel and the emulated kernel. The emulated kernel is seen
 256 from the host kernel at IP address 172.20.0.2 and the host kernel is
 257 seen from the emulated kernel at IP address 172.20.0.1.
 258
 259 @item Launch @code{vl.sh}. You should have the following output:
 260
 261 @example
 262 > ./vl.sh
 263 connected to host network interface: tun0
 264 Uncompressing Linux... Ok, booting the kernel.
 265 Linux version 2.4.20 (fabrice@localhost.localdomain) (gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-110)) #22 lun jui 7 13:37:41 CEST 2003
 266 BIOS-provided physical RAM map:
 267  BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
 268  BIOS-e801: 0000000000100000 - 0000000002000000 (usable)
 269 32MB LOWMEM available.
 270 On node 0 totalpages: 8192
 271 zone(0): 4096 pages.
 272 zone(1): 4096 pages.
 273 zone(2): 0 pages.
 274 Kernel command line: root=/dev/hda ide1=noprobe ide2=noprobe ide3=noprobe ide4=noprobe ide5=noprobe
 275 ide_setup: ide1=noprobe
 276 ide_setup: ide2=noprobe
 277 ide_setup: ide3=noprobe
 278 ide_setup: ide4=noprobe
 279 ide_setup: ide5=noprobe
 280 Initializing CPU#0
 281 Detected 501.285 MHz processor.
 282 Calibrating delay loop... 989.59 BogoMIPS
 283 Memory: 29268k/32768k available (907k kernel code, 3112k reserved, 212k data, 52k init, 0k highmem)
 284 Dentry cache hash table entries: 4096 (order: 3, 32768 bytes)
 285 Inode cache hash table entries: 2048 (order: 2, 16384 bytes)
 286 Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
 287 Buffer-cache hash table entries: 1024 (order: 0, 4096 bytes)
 288 Page-cache hash table entries: 8192 (order: 3, 32768 bytes)
 289 CPU: Intel Pentium Pro stepping 03
 290 Checking 'hlt' instruction... OK.
 291 POSIX conformance testing by UNIFIX
 292 Linux NET4.0 for Linux 2.4
 293 Based upon Swansea University Computer Society NET3.039
 294 Initializing RT netlink socket
 295 apm: BIOS not found.
 296 Starting kswapd
 297 Journalled Block Device driver loaded
 298 pty: 256 Unix98 ptys configured
 299 Serial driver version 5.05c (2001-07-08) with no serial options enabled
 300 ttyS00 at 0x03f8 (irq = 4) is a 16450
 301 Uniform Multi-Platform E-IDE driver Revision: 6.31
 302 ide: Assuming 50MHz system bus speed for PIO modes; override with idebus=xx
 303 hda: QEMU HARDDISK, ATA DISK drive
 304 ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
 305 hda: 12288 sectors (6 MB) w/256KiB Cache, CHS=12/16/63
 306 Partition check:
 307  hda: unknown partition table
 308 ne.c:v1.10 9/23/94 Donald Becker (becker@scyld.com)
 309 Last modified Nov 1, 2000 by Paul Gortmaker
 310 NE*000 ethercard probe at 0x300: 52 54 00 12 34 56
 311 eth0: NE2000 found at 0x300, using IRQ 9.
 312 RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
 313 NET4: Linux TCP/IP 1.0 for NET4.0
 314 IP Protocols: ICMP, UDP, TCP, IGMP
 315 IP: routing cache hash table of 512 buckets, 4Kbytes
 316 TCP: Hash tables configured (established 2048 bind 4096)
 317 NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
 318 EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
 319 VFS: Mounted root (ext2 filesystem).
 320 Freeing unused kernel memory: 52k freed
 321 sh: can't access tty; job control turned off
 322 #
 323 @end example
 324
 325 @item
 326 Then you can play with the kernel inside the virtual serial console. You
  327 can launch @code{ls} for example. Type @key{Ctrl-a h} to get help
  328 on the keys you can type inside the virtual serial console. In
 329 particular, use @key{Ctrl-a x} to exit QEMU and use @key{Ctrl-a b} as
 330 the Magic SysRq key.
 331
 332 @item
 333 If the network is enabled, launch the script @file{/etc/linuxrc} in the
 334 emulator (don't forget the leading dot):
 335 @example
 336 . /etc/linuxrc
 337 @end example
 338
 339 Then enable X11 connections on your PC from the emulated Linux:
 340 @example
 341 xhost +172.20.0.2
 342 @end example
 343
 344 You can now launch @file{xterm} or @file{xlogo} and verify that you have
  345 a real Virtual Linux system!
 346
 347 @end enumerate
 348
 349 NOTES:
 350 @enumerate
 351 @item
 352 A 2.5.74 kernel is also included in the vl-test archive. Just
 353 replace the bzImage in vl.sh to try it.
 354
 355 @item
 356 vl creates a temporary file in @var{$VLTMPDIR} (@file{/tmp} is the
 357 default) containing all the simulated PC memory. If possible, try to use
 358 a temporary directory using the tmpfs filesystem to avoid too many
 359 unnecessary disk accesses.
 360
 361 @item
  362 To exit vl cleanly, you can do a @emph{shutdown} inside
 363 vl. vl will automatically exit when the Linux shutdown is done.
 364
 365 @item
 366 You can boot slightly faster by disabling the probe of non present IDE
 367 interfaces. To do so, add the following options on the kernel command
 368 line:
 369 @example
 370 ide1=noprobe ide2=noprobe ide3=noprobe ide4=noprobe ide5=noprobe
 371 @end example
 372
 373 @item
 374 The example disk image is a modified version of the one made by Kevin
 375 Lawton for the plex86 Project (@url{www.plex86.org}).
 376
 377 @end enumerate
 378
 379 @section Invocation
 380
 381 @example
 382 usage: vl [options] bzImage [kernel parameters...]
 383 @end example
 384
 385 @file{bzImage} is a Linux kernel image.
 386
 387 General options:
 388 @table @option
 389 @item -hda file
 390 @item -hdb file
  391 Use 'file' as hard disk 0 or 1 image (@pxref{disk_images}).
 392
 393 @item -snapshot
 394
 395 Write to temporary files instead of disk image files. In this case,
 396 the raw disk image you use is not written back. You can however force
  397 the write back by pressing @key{C-a s} (@pxref{disk_images}).
 398
 399 @item -m megs
 400 Set virtual RAM size to @var{megs} megabytes.
 401
 402 @item -n script
 403 Set network init script [default=/etc/vl-ifup]. This script is
 404 launched to configure the host network interface (usually tun0)
 405 corresponding to the virtual NE2000 card.
 406
 407 @item -initrd file
 408 Use 'file' as initial ram disk.
 409 @end table
 410
 411 Debug options:
 412 @table @option
 413 @item -s
  414 Wait for a gdb connection on port 1234.
 415 @item -p port
 416 Change gdb connection port.
 417 @item -d
 418 Output log in /tmp/vl.log
 419 @end table
 420
 421 During emulation, use @key{C-a h} to get terminal commands:
 422
 423 @table @key
 424 @item C-a h
 425 Print this help
 426 @item C-a x
  427 Exit emulator
 428 @item C-a s
 429 Save disk data back to file (if -snapshot)
 430 @item C-a b
 431 Send break (magic sysrq)
 432 @item C-a C-a
 433 Send C-a
 434 @end table
 435
 436 @node disk_images
 437 @section Disk Images
 438
 439 @subsection Raw disk images
 440
 441 The disk images can simply be raw images of the hard disk. You can
 442 create them with the command:
 443 @example
 444 dd if=/dev/zero of=myimage bs=1024 count=mysize
 445 @end example
 446 where @var{myimage} is the image filename and @var{mysize} is its size
 447 in kilobytes.
 448
 449 @subsection Snapshot mode
 450
 451 If you use the option @option{-snapshot}, all disk images are
  452 considered as read only. When sectors are written, they are written to
 453 a temporary file created in @file{/tmp}. You can however force the
 454 write back to the raw disk images by pressing @key{C-a s}.
 455
 456 NOTE: The snapshot mode only works with raw disk images.
 457
 458 @subsection Copy On Write disk images
 459
 460 QEMU also supports user mode Linux
 461 (@url{http://user-mode-linux.sourceforge.net/}) Copy On Write (COW)
 462 disk images. The COW disk images are much smaller than normal images
 463 as they store only modified sectors. They also permit the use of the
 464 same disk image template for many users.
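
The copy-on-write idea can be sketched in C as follows. This is only an
in-memory illustration under stated assumptions: the type @code{CowDisk},
the functions @code{cow_init}, @code{cow_read} and @code{cow_write}, and
the tiny disk geometry are hypothetical and do not describe the actual
user mode Linux COW file layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define NUM_SECTORS 8   /* tiny disk, for illustration only */

/* Hypothetical in-memory model; the real COW file layout is not shown. */
typedef struct {
    const uint8_t *base;                     /* read-only raw image */
    uint8_t data[NUM_SECTORS][SECTOR_SIZE];  /* private copies of modified sectors */
    uint8_t modified[NUM_SECTORS];           /* 1 if the sector lives in data[] */
} CowDisk;

static void cow_init(CowDisk *d, const uint8_t *base)
{
    memset(d, 0, sizeof(*d));
    d->base = base;
}

/* Writes never touch the base image: they go to the private copy. */
static void cow_write(CowDisk *d, int sector, const uint8_t *buf)
{
    memcpy(d->data[sector], buf, SECTOR_SIZE);
    d->modified[sector] = 1;
}

/* Reads come from the private copy if one exists, else from the base. */
static void cow_read(const CowDisk *d, int sector, uint8_t *buf)
{
    const uint8_t *src = d->modified[sector]
        ? d->data[sector]
        : d->base + (size_t)sector * SECTOR_SIZE;
    memcpy(buf, src, SECTOR_SIZE);
}
```

Because the base image is only ever read, several @code{CowDisk}
instances could share one template, which is the multi-user case
mentioned above.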
 465
  466 To create a COW disk image, use the command:
 467
 468 @example
 469 vlmkcow -f myrawimage.bin mycowimage.cow
 470 @end example
 471
 472 @file{myrawimage.bin} is a raw image you want to use as original disk
 473 image. It will never be written to.
 474
 475 @file{mycowimage.cow} is the COW disk image which is created by
 476 @code{vlmkcow}. You can use it directly with the @option{-hdx}
 477 options. You must not modify the original raw disk image if you use
 478 COW images, as COW images only store the modified sectors from the raw
 479 disk image. QEMU stores the original raw disk image name and its
 480 modified time in the COW disk image so that chances of mistakes are
 481 reduced.
 482
  483 If the raw disk image is not read-only, by pressing @key{C-a s} you can
 484 flush the COW disk image back into the raw disk image, as in snapshot
 485 mode.
 486
 487 COW disk images can also be created without a corresponding raw disk
 488 image. It is useful to have a big initial virtual disk image without
 489 using much disk space. Use:
 490
 491 @example
 492 vlmkcow mycowimage.cow 1024
 493 @end example
 494
 495 to create a 1 gigabyte empty COW disk image.
 496
 497 NOTES:
 498 @enumerate
 499 @item
 500 COW disk images must be created on file systems supporting
 501 @emph{holes} such as ext2 or ext3.
 502 @item
 503 Since holes are used, the displayed size of the COW disk image is not
  504 the real one. To see the real size, use the @code{ls -ls} command.
 505 @end enumerate
 506
 507 @section Linux Kernel Compilation
 508
 509 You should be able to use any kernel with QEMU provided you make the
 510 following changes (only 2.4.x and 2.5.x were tested):
 511
 512 @enumerate
 513 @item
 514 The kernel must be mapped at 0x90000000 (the default is
 515 0xc0000000). You must modify only two lines in the kernel source:
 516
 517 In @file{include/asm/page.h}, replace
 518 @example
 519 #define __PAGE_OFFSET           (0xc0000000)
 520 @end example
 521 by
 522 @example
 523 #define __PAGE_OFFSET           (0x90000000)
 524 @end example
 525
 526 And in @file{arch/i386/vmlinux.lds}, replace
 527 @example
 528   . = 0xc0000000 + 0x100000;
 529 @end example
 530 by
 531 @example
 532   . = 0x90000000 + 0x100000;
 533 @end example
 534
 535 @item
 536 If you want to enable SMP (Symmetric Multi-Processing) support, you
 537 must make the following change in @file{include/asm/fixmap.h}. Replace
 538 @example
 539 #define FIXADDR_TOP     (0xffffX000UL)
 540 @end example
 541 by
 542 @example
 543 #define FIXADDR_TOP     (0xa7ffX000UL)
 544 @end example
 545 (X is 'e' or 'f' depending on the kernel version). Although you can
 546 use an SMP kernel with QEMU, it only supports one CPU.
 547
 548 @item
 549 If you are not using a 2.5 kernel as host kernel but if you use a target
 550 2.5 kernel, you must also ensure that the 'HZ' define is set to 100
 551 (1000 is the default) as QEMU cannot currently emulate timers at
 552 frequencies greater than 100 Hz on host Linux systems < 2.5. In
 553 @file{include/asm/param.h}, replace:
 554
 555 @example
 556 # define HZ             1000            /* Internal kernel timer frequency */
 557 @end example
 558 by
 559 @example
 560 # define HZ             100             /* Internal kernel timer frequency */
 561 @end example
 562
 563 @end enumerate
 564
 565 The file config-2.x.x gives the configuration of the example kernels.
 566
 567 Just type
 568 @example
 569 make bzImage
 570 @end example
 571
  572 as you would do to make a real kernel. Then you can use with QEMU
 573 exactly the same kernel as you would boot on your PC (in
 574 @file{arch/i386/boot/bzImage}).
 575
 576 @section PC Emulation
 577
  578 QEMU emulates the following PC peripherals:
 579
 580 @itemize
 581 @item
  582 PIC (interrupt controller)
 583 @item
 584 PIT (timers)
 585 @item
 586 CMOS memory
 587 @item
 588 Dumb VGA (to print the @code{Uncompressing Linux} message)
 589 @item
 590 Serial port (port=0x3f8, irq=4)
 591 @item
 592 NE2000 network adapter (port=0x300, irq=9)
 593 @item
 594 IDE disk interface (port=0x1f0, irq=14)
 595 @end itemize
 596
 597 @section GDB usage
 598
  599 QEMU has primitive support for gdb, so that you can do
 600 'Ctrl-C' while the kernel is running and inspect its state.
 601
 602 In order to use gdb, launch vl with the '-s' option. It will wait for a
 603 gdb connection:
 604 @example
 605 > vl -s arch/i386/boot/bzImage initrd-2.4.20.img root=/dev/ram0 ramdisk_size=6144
 606 Connected to host network interface: tun0
 607 Waiting gdb connection on port 1234
 608 @end example
 609
 610 Then launch gdb on the 'vmlinux' executable:
 611 @example
 612 > gdb vmlinux
 613 @end example
 614
 615 In gdb, connect to QEMU:
 616 @example
  617 (gdb) target remote localhost:1234
 618 @end example
 619
 620 Then you can use gdb normally. For example, type 'c' to launch the kernel:
 621 @example
 622 (gdb) c
 623 @end example
 624
 625 WARNING: breakpoints and single stepping are not yet supported.
 626
 627 @chapter QEMU Internals
 628
 629 @section QEMU compared to other emulators
 630
 631 Like bochs [3], QEMU emulates an x86 CPU. But QEMU is much faster than
 632 bochs as it uses dynamic compilation and because it uses the host MMU to
 633 simulate the x86 MMU. The downside is that currently the emulation is
 634 not as accurate as bochs (for example, you cannot currently run Windows
 635 inside QEMU).
 636
 637 Like Valgrind [2], QEMU does user space emulation and dynamic
 638 translation. Valgrind is mainly a memory debugger while QEMU has no
 639 support for it (QEMU could be used to detect out of bound memory
 640 accesses as Valgrind, but it has no support to track uninitialised data
 641 as Valgrind does). The Valgrind dynamic translator generates better code
 642 than QEMU (in particular it does register allocation) but it is closely
 643 tied to an x86 host and target and has no support for precise exceptions
 644 and system emulation.
 645
 646 EM86 [4] is the closest project to user space QEMU (and QEMU still uses
 647 some of its code, in particular the ELF file loader). EM86 was limited
 648 to an alpha host and used a proprietary and slow interpreter (the
 649 interpreter part of the FX!32 Digital Win32 code translator [5]).
 650
 651 TWIN [6] is a Windows API emulator like Wine. It is less accurate than
 652 Wine but includes a protected mode x86 interpreter to launch x86 Windows
  653 executables. Such an approach has greater potential because most of the
 654 Windows API is executed natively but it is far more difficult to develop
 655 because all the data structures and function parameters exchanged
 656 between the API and the x86 code must be converted.
 657
 658 User mode Linux [7] was the only solution before QEMU to launch a Linux
 659 kernel as a process while not needing any host kernel patches. However,
  660 user mode Linux requires heavy patches to the kernel it runs, while QEMU accepts
 661 unpatched Linux kernels. It would be interesting to compare the
 662 performance of the two approaches.
 663
 664 The new Plex86 [8] PC virtualizer is done in the same spirit as the QEMU
 665 system emulator. It requires a patched Linux kernel to work (you cannot
 666 launch the same kernel on your PC), but the patches are really small. As
  667 it is a PC virtualizer (no emulation is done except for some privileged
 668 instructions), it has the potential of being faster than QEMU. The
 669 downside is that a complicated (and potentially unsafe) host kernel
 670 patch is needed.
 671
 672 @section Portable dynamic translation
 673
 674 QEMU is a dynamic translator. When it first encounters a piece of code,
 675 it converts it to the host instruction set. Usually dynamic translators
 676 are very complicated and highly CPU dependent. QEMU uses some tricks
 677 which make it relatively easily portable and simple while achieving good
  678 performance.
 679
  680 The basic idea is to split every x86 instruction into a few simpler
 681 instructions. Each simple instruction is implemented by a piece of C
 682 code (see @file{op-i386.c}). Then a compile time tool (@file{dyngen})
 683 takes the corresponding object file (@file{op-i386.o}) to generate a
 684 dynamic code generator which concatenates the simple instructions to
 685 build a function (see @file{op-i386.h:dyngen_code()}).
 686
 687 In essence, the process is similar to [1], but more work is done at
 688 compile time.
 689
  690 A key idea to get optimal performance is that constant parameters can
 691 be passed to the simple operations. For that purpose, dummy ELF
 692 relocations are generated with gcc for each constant parameter. Then,
 693 the tool (@file{dyngen}) can locate the relocations and generate the
  694 appropriate C code to resolve them when building the dynamic code.
 695
 696 That way, QEMU is no more difficult to port than a dynamic linker.
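
The copy-and-patch step described in this section can be sketched as
follows. This is a minimal illustration, not the code produced by
@file{dyngen}: the @code{OpTemplate} type and the @code{emit_op} function
are hypothetical, and the template bytes used with them are placeholders
rather than real machine code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one precompiled simple operation. */
typedef struct {
    const uint8_t *code;   /* body copied verbatim into the output */
    size_t size;           /* body length in bytes */
    int reloc_offset;      /* offset of a 32 bit constant slot, or -1 if none */
} OpTemplate;

/* Append one operation to the output buffer and resolve its relocation
   with the constant parameter known at translation time. Returns the
   position where the next operation should be emitted. */
static uint8_t *emit_op(uint8_t *out, const OpTemplate *op, uint32_t param)
{
    memcpy(out, op->code, op->size);
    if (op->reloc_offset >= 0)
        memcpy(out + op->reloc_offset, &param, sizeof(param));
    return out + op->size;
}
```

Calling @code{emit_op} once per simple instruction concatenates the
bodies into one buffer, which is the same kind of work a dynamic linker
does when it applies relocations.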
 697
 698 To go even faster, GCC static register variables are used to keep the
 699 state of the virtual CPU.
 700
 701 @section Register allocation
 702
 703 Since QEMU uses fixed simple instructions, no efficient register
 704 allocation can be done. However, because RISC CPUs have a lot of
  705 registers, most of the virtual CPU state can be put in registers without
 706 doing complicated register allocation.
 707
 708 @section Condition code optimisations
 709
  710 Good emulation of the CPU condition codes (@code{EFLAGS} register on x86) is
  711 critical to good performance. QEMU uses lazy condition code
 712 evaluation: instead of computing the condition codes after each x86
 713 instruction, it just stores one operand (called @code{CC_SRC}), the
 714 result (called @code{CC_DST}) and the type of operation (called
 715 @code{CC_OP}).
 716
  717 @code{CC_OP} is almost never explicitly set in the generated code
 718 because it is known at translation time.
 719
  720 In order to increase performance, a backward pass is performed on the
 721 generated simple instructions (see
 722 @code{translate-i386.c:optimize_flags()}). When it can be proved that
 723 the condition codes are not needed by the next instructions, no
 724 condition codes are computed at all.
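
Lazy condition code evaluation can be sketched as follows. The names
@code{CC_SRC}, @code{CC_DST} and @code{CC_OP} come from the text; the
structure layout, the two operations and the @code{compute_flags} helper
are illustrative assumptions, and only the carry and zero flags are
handled.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative operation types and EFLAGS bit masks. */
enum { CC_OP_ADDL, CC_OP_SUBL };
#define FLAG_C 0x0001u   /* carry flag, bit 0 of EFLAGS */
#define FLAG_Z 0x0040u   /* zero flag, bit 6 of EFLAGS */

typedef struct {
    uint32_t cc_src;  /* one operand of the last flag-setting operation */
    uint32_t cc_dst;  /* its result */
    int cc_op;        /* which operation it was */
} CPUState;

/* The operations only record what is needed to rebuild the flags later. */
static uint32_t op_addl(CPUState *s, uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    s->cc_src = b; s->cc_dst = res; s->cc_op = CC_OP_ADDL;
    return res;
}

static uint32_t op_subl(CPUState *s, uint32_t a, uint32_t b)
{
    uint32_t res = a - b;
    s->cc_src = b; s->cc_dst = res; s->cc_op = CC_OP_SUBL;
    return res;
}

/* Called only when some later instruction really reads the flags. */
static uint32_t compute_flags(const CPUState *s)
{
    uint32_t flags = 0;
    if (s->cc_dst == 0)
        flags |= FLAG_Z;
    switch (s->cc_op) {
    case CC_OP_ADDL:
        /* carry out if the 32 bit sum wrapped around */
        if (s->cc_dst < s->cc_src)
            flags |= FLAG_C;
        break;
    case CC_OP_SUBL:
        /* borrow if the first operand (dst + src) was below the second */
        if ((uint32_t)(s->cc_dst + s->cc_src) < s->cc_src)
            flags |= FLAG_C;
        break;
    }
    return flags;
}
```

When no later instruction reads the flags, @code{compute_flags} is never
called, so the only cost left is the three stores in each operation.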
 725
 726 @section CPU state optimisations
 727
 728 The x86 CPU has many internal states which change the way it evaluates
 729 instructions. In order to achieve a good speed, the translation phase
 730 considers that some state information of the virtual x86 CPU cannot
 731 change in it. For example, if the SS, DS and ES segments have a zero
 732 base, then the translator does not even generate an addition for the
 733 segment base.
 734
 735 [The FPU stack pointer register is not handled that way yet].
 736
 737 @section Translation cache
 738
 739 A 2MByte cache holds the most recently used translations. For
 740 simplicity, it is completely flushed when it is full. A translation unit
 741 contains just a single basic block (a block of x86 instructions
 742 terminated by a jump or by a virtual CPU state change which the
 743 translator cannot deduce statically).
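
A cache that is simply flushed when full can be sketched as a bump
allocator. The 2 MB size comes from the text; the names
@code{cache_alloc}, @code{cache_flush} and @code{flush_count} are
hypothetical and chosen for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CODE_CACHE_SIZE (2 * 1024 * 1024)   /* 2 MB, as in the text */

static uint8_t code_cache[CODE_CACHE_SIZE];
static size_t code_cache_used;
static unsigned flush_count;   /* illustrative: number of full flushes so far */

/* Drop every translation at once. A real emulator would also have to
   reset its PC lookup tables here. */
static void cache_flush(void)
{
    code_cache_used = 0;
    flush_count++;
}

/* Reserve room for one translated block; flush everything when full. */
static uint8_t *cache_alloc(size_t size)
{
    uint8_t *p;

    if (size > CODE_CACHE_SIZE)
        return NULL;                        /* can never fit */
    if (code_cache_used + size > CODE_CACHE_SIZE)
        cache_flush();
    p = code_cache + code_cache_used;
    code_cache_used += size;
    return p;
}
```

The design trades some retranslation after a flush for the absence of
any per-block eviction bookkeeping.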
 744
 745 @section Direct block chaining
 746
 747 After each translated basic block is executed, QEMU uses the simulated
  748 Program Counter (PC) and other CPU state information (such as the CS
 749 segment base value) to find the next basic block.
 750
 751 In order to accelerate the most common cases where the new simulated PC
 752 is known, QEMU can patch a basic block so that it jumps directly to the
 753 next one.
 754
 755 The most portable code uses an indirect jump. An indirect jump makes it
 756 easier to make the jump target modification atomic. On some
 757 architectures (such as PowerPC), the @code{JUMP} opcode is directly
 758 patched so that the block chaining has no overhead.
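
The portable, indirect-jump form of block chaining can be sketched with
a patched pointer. The @code{TranslatedBlock} type, the linear search in
@code{tb_find} and the @code{lookups} counter are assumptions made for
this illustration; the pointer field plays the role of the indirect jump
target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translated block record. */
typedef struct TranslatedBlock {
    uint32_t pc;                    /* simulated PC where the block starts */
    uint32_t next_pc;               /* simulated PC reached when it ends */
    struct TranslatedBlock *next;   /* patched link, NULL while unchained */
} TranslatedBlock;

static unsigned lookups;   /* counts slow path searches, for illustration */

/* Slow path: search for the block starting at the given simulated PC. */
static TranslatedBlock *tb_find(TranslatedBlock *tbs, size_t n, uint32_t pc)
{
    size_t i;

    lookups++;
    for (i = 0; i < n; i++)
        if (tbs[i].pc == pc)
            return &tbs[i];
    return NULL;
}

/* Return the successor of tb, patching the link on first use so that
   later executions skip the search. */
static TranslatedBlock *tb_next(TranslatedBlock *tbs, size_t n,
                                TranslatedBlock *tb)
{
    if (tb->next == NULL)
        tb->next = tb_find(tbs, n, tb->next_pc);
    return tb->next;
}
```

Updating a single pointer is one store, which is why the indirect form
makes it easier to change the jump target atomically.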
 759
 760 @section Self-modifying code and translated code invalidation
 761
 762 Self-modifying code is a special challenge in x86 emulation because no
 763 instruction cache invalidation is signaled by the application when code
 764 is modified.
 765
 766 When translated code is generated for a basic block, the corresponding
 767 host page is write protected if it is not already read-only (with the
 768 system call @code{mprotect()}). Then, if a write access is done to the
 769 page, Linux raises a SEGV signal. QEMU then invalidates all the
 770 translated code in the page and enables write accesses to the page.
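
The write-protect and invalidate cycle can be sketched on a Linux host
as follows. It is a simplified illustration with a single page and a
flag standing in for the real invalidation; @code{guard_setup},
@code{guard_protect} and @code{segv_handler} are hypothetical names, and
calling @code{mprotect()} from a signal handler is assumed to be
acceptable on the host, as it is in practice on Linux.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *guarded_page;              /* page holding translated source code */
static size_t page_size;
static volatile sig_atomic_t invalidated;  /* set when translations were dropped */

/* On a write fault inside the guarded page: drop the translations (here
   only a flag) and make the page writable so the faulting store restarts. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    uint8_t *addr = (uint8_t *)info->si_addr;

    (void)sig;
    (void)ctx;
    if (addr < guarded_page || addr >= guarded_page + page_size)
        _exit(1);                          /* a genuine crash, not our page */
    invalidated = 1;
    mprotect(guarded_page, page_size, PROT_READ | PROT_WRITE);
}

/* Map one anonymous page and install the handler. Returns 0 on success. */
static int guard_setup(void)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    guarded_page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guarded_page == MAP_FAILED)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Called after code from the page has been translated: write protect it. */
static int guard_protect(void)
{
    invalidated = 0;
    return mprotect(guarded_page, page_size, PROT_READ);
}
```

After the handler returns, the kernel restarts the faulting store, which
now succeeds because the page is writable again.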
 771
 772 Correct translated code invalidation is done efficiently by maintaining
 773 a linked list of every translated block contained in a given page. Other
 774 linked lists are also maintained to undo direct block chaining.
 775
  776 Although the overhead of doing @code{mprotect()} calls is significant,
  777 most MSDOS programs can be emulated at reasonable speed with QEMU and
 778 DOSEMU.
 779
 780 Note that QEMU also invalidates pages of translated code when it detects
 781 that memory mappings are modified with @code{mmap()} or @code{munmap()}.
 782
 783 @section Exception support
 784
 785 longjmp() is used when an exception such as division by zero is
 786 encountered.
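
The use of @code{longjmp()} for such exceptions can be sketched as
follows. The names @code{raise_exception}, @code{op_divl},
@code{cpu_exec_div} and @code{EXCP_DIVZ} are hypothetical; the sketch
only shows how an operation can abandon execution and return control to
the main loop.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>

#define EXCP_DIVZ 1   /* illustrative exception number */

static jmp_buf cpu_jmp_env;      /* re-entry point of the main loop */
static int pending_exception;    /* exception number passed across longjmp() */

/* Called from deep inside an operation: unwind to the main loop. */
static void raise_exception(int excp)
{
    pending_exception = excp;
    longjmp(cpu_jmp_env, 1);
}

static uint32_t op_divl(uint32_t num, uint32_t den)
{
    if (den == 0)
        raise_exception(EXCP_DIVZ);
    return num / den;
}

/* Run one division under the exception guard. Returns 0 and stores the
   quotient on success, or returns the exception number. */
static int cpu_exec_div(uint32_t num, uint32_t den, uint32_t *quot)
{
    if (setjmp(cpu_jmp_env) != 0)
        return pending_exception;   /* arrived here through longjmp() */
    *quot = op_divl(num, den);
    return 0;
}
```

The caller can then turn the returned exception number into whatever
the emulated program expects, for example a signal in user mode.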
 787
 788 The host SIGSEGV and SIGBUS signal handlers are used to get invalid
 789 memory accesses. The exact CPU state can be retrieved because all the
 790 x86 registers are stored in fixed host registers. The simulated program
 791 counter is found by retranslating the corresponding basic block and by
 792 looking where the host program counter was at the exception point.
 793
 794 The virtual CPU cannot retrieve the exact @code{EFLAGS} register because
 795 in some cases it is not computed because of condition code
 796 optimisations. It is not a big concern because the emulated code can
  797 still be restarted in any case.
 798
 799 @section Linux system call translation
 800
 801 QEMU includes a generic system call translator for Linux. It means that
 802 the parameters of the system calls can be converted to fix the
 803 endianness and 32/64 bit issues. The IOCTLs are converted with a generic
 804 type description system (see @file{ioctls.h} and @file{thunk.c}).
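
The kind of conversion involved can be sketched for one structure. This
is not the generic mechanism of @file{thunk.c}: the structure names, the
@code{swap32} helper and the @code{need_swap} parameter are assumptions
made for this illustration, which converts a target structure with 32
bit fields into a host structure with 64 bit fields.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32 bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical target-side layout: 32 bit fields in target byte order. */
struct target_timeval {
    uint32_t tv_sec;
    uint32_t tv_usec;
};

/* Hypothetical host-side layout with 64 bit fields. */
struct host_timeval64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Convert target to host: swap bytes when the endianness differs, then
   sign-extend each field from 32 to 64 bits. */
static void target_to_host_timeval(struct host_timeval64 *h,
                                   const struct target_timeval *t,
                                   int need_swap)
{
    uint32_t sec = need_swap ? swap32(t->tv_sec) : t->tv_sec;
    uint32_t usec = need_swap ? swap32(t->tv_usec) : t->tv_usec;

    h->tv_sec = (int32_t)sec;
    h->tv_usec = (int32_t)usec;
}
```

A generic type description system avoids writing one such function by
hand for every structure passed through a system call or ioctl.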
 805
 806 QEMU supports host CPUs which have pages bigger than 4KB. It records all
  807 the mappings the process does and tries to emulate the @code{mmap()}
 808 system calls in cases where the host @code{mmap()} call would fail
 809 because of bad page alignment.
 810
 811 @section Linux signals
 812
 813 Normal and real-time signals are queued along with their information
 814 (@code{siginfo_t}) as it is done in the Linux kernel. Then an interrupt
 815 request is done to the virtual CPU. When it is interrupted, one queued
 816 signal is handled by generating a stack frame in the virtual CPU as the
 817 Linux kernel does. The @code{sigreturn()} system call is emulated to return
 818 from the virtual signal handler.
 819
 820 Some signals (such as SIGALRM) directly come from the host. Other
  821 signals are synthesized from the virtual CPU exceptions such as SIGFPE
 822 when a division by zero is done (see @code{main.c:cpu_loop()}).
 823
 824 The blocked signal mask is still handled by the host Linux kernel so
 825 that most signal system calls can be redirected directly to the host
 826 Linux kernel. Only the @code{sigaction()} and @code{sigreturn()} system
 827 calls need to be fully emulated (see @file{signal.c}).
 828
 829 @section clone() system call and threads
 830
 831 The Linux clone() system call is usually used to create a thread. QEMU
 832 uses the host clone() system call so that real host threads are created
 833 for each emulated thread. One virtual CPU instance is created for each
 834 thread.
 835
 836 The virtual x86 CPU atomic operations are emulated with a global lock so
  837 that their semantics are preserved.
 838
 839 Note that currently there are still some locking issues in QEMU. In
  840 particular, the translation cache flush is not protected yet against
 841 reentrancy.
 842
 843 @section Self-virtualization
 844
 845 QEMU was conceived so that ultimately it can emulate itself. Although
 846 it is not very useful, it is an important test to show the power of the
 847 emulator.
 848
 849 Achieving self-virtualization is not easy because there may be address
 850 space conflicts. QEMU solves this problem by being an executable ELF
  851 shared object like the ld-linux.so ELF interpreter. That way, it can be
 852 relocated at load time.
 853
 854 @section MMU emulation
 855
 856 For system emulation, QEMU uses the mmap() system call to emulate the
  857 target CPU MMU. It works as long as the emulated OS does not use an area
 858 reserved by the host OS (such as the area above 0xc0000000 on x86
 859 Linux).
 860
 861 It is planned to add a slower but more precise MMU emulation
 862 with a software MMU.
 863
 864 @section Bibliography
 865
 866 @table @asis
 867
 868 @item [1]
 869 @url{http://citeseer.nj.nec.com/piumarta98optimizing.html}, Optimizing
 870 direct threaded code by selective inlining (1998) by Ian Piumarta, Fabio
 871 Riccardi.
 872
 873 @item [2]
 874 @url{http://developer.kde.org/~sewardj/}, Valgrind, an open-source
 875 memory debugger for x86-GNU/Linux, by Julian Seward.
 876
 877 @item [3]
 878 @url{http://bochs.sourceforge.net/}, the Bochs IA-32 Emulator Project,
 879 by Kevin Lawton et al.
 880
 881 @item [4]
 882 @url{http://www.cs.rose-hulman.edu/~donaldlf/em86/index.html}, the EM86
 883 x86 emulator on Alpha-Linux.
 884
 885 @item [5]
 886 @url{http://www.usenix.org/publications/library/proceedings/usenix-nt97/full_papers/chernoff/chernoff.pdf},
 887 DIGITAL FX!32: Running 32-Bit x86 Applications on Alpha NT, by Anton
 888 Chernoff and Ray Hookway.
 889
 890 @item [6]
 891 @url{http://www.willows.com/}, Windows API library emulation from
 892 Willows Software.
 893
 894 @item [7]
 895 @url{http://user-mode-linux.sourceforge.net/},
 896 The User-mode Linux Kernel.
 897
 898 @item [8]
 899 @url{http://www.plex86.org/},
 900 The new Plex86 project.
 901
 902 @end table
 903
 904 @chapter Regression Tests
 905
 906 In the directory @file{tests/}, various interesting testing programs
  907 are available. They are used for regression testing.
 908
 909 @section @file{hello-i386}
 910
 911 Very simple statically linked x86 program, just to test QEMU during a
 912 port to a new host CPU.
 913
 914 @section @file{hello-arm}
 915
 916 Very simple statically linked ARM program, just to test QEMU during a
 917 port to a new host CPU.
 918
 919 @section @file{test-i386}
 920
 921 This program executes most of the 16 bit and 32 bit x86 instructions and
 922 generates a text output. It can be compared with the output obtained with
 923 a real CPU or another emulator. The target @code{make test} runs this
 924 program and a @code{diff} on the generated output.
 925
 926 The Linux system call @code{modify_ldt()} is used to create x86 selectors
 927 to test some 16 bit addressing and 32 bit with segmentation cases.
 928
 929 The Linux system call @code{vm86()} is used to test vm86 emulation.
 930
 931 Various exceptions are raised to test most of the x86 user space
 932 exception reporting.
 933
 934 @section @file{sha1}
 935
 936 It is a simple benchmark. Care must be taken to interpret the results
 937 because it mostly tests the ability of the virtual CPU to optimize the
 938 @code{rol} x86 instruction and the condition code computations.
 939