Booting from real channel-attached devices on s390x
===================================================

s390 hardware IPL
-----------------

The s390 hardware IPL process consists of the following steps.

1. A READ IPL ccw is constructed in memory location ``0x0``.
   This ccw, by definition, reads the IPL1 record which is located on the disk
   at cylinder 0 track 0 record 1. Note that the chain flag is on in this ccw
   so when it is complete another ccw will be fetched and executed from memory
   location ``0x08``.

2. Execute the Read IPL ccw at ``0x00``, thereby reading IPL1 data into ``0x00``.
   IPL1 data is 24 bytes in length and consists of the following pieces of
   information: ``[psw][read ccw][tic ccw]``. When the machine executes the Read
   IPL ccw, it reads the 24 bytes of IPL1 into memory starting at location
   ``0x0``. Then the ccw program at ``0x08``, which consists of a read ccw and
   a tic ccw, is automatically executed because of the chain flag from the
   original READ IPL ccw. The read ccw will read the IPL2 data into memory
   and the TIC (Transfer In Channel) will transfer control to the channel
   program contained in the IPL2 data. The TIC channel command is the
   equivalent of a branch/jump/goto instruction for channel programs.

   NOTE: The ccws in IPL1 are defined by the architecture to be format 0.

3. Execute IPL2.
   The TIC ccw instruction at the end of the IPL1 channel program will begin
   the execution of the IPL2 channel program. IPL2 is stage 2 of the boot
   process and will contain a larger channel program than IPL1. The point of
   IPL2 is to find and load either the operating system or a small program that
   loads the operating system from disk. At the end of this step all or some of
   the real operating system is loaded into memory and we are ready to hand
   control over to the guest operating system. At this point the guest
   operating system is entirely responsible for loading any more data it might
   need to function.

   NOTE: The IPL2 channel program might read data into memory
   location ``0x0``, thereby overwriting the IPL1 psw and channel program. This
   is fine as long as the data placed in location ``0x0`` contains a psw whose
   instruction address points to the guest operating system code to execute at
   the end of the IPL/boot process.

   NOTE: The ccws in IPL2 are defined by the architecture to be format 0.

4. Start executing the guest operating system.
   The psw that was loaded into memory location ``0x0`` as part of the IPL process
   should contain the needed flags for the operating system we have loaded. The
   psw's instruction address will point to the location in memory where we want
   to start executing the operating system. This psw is loaded (via LPSW
   instruction) causing control to be passed to the operating system code.

In a non-virtualized environment this process, handled entirely by the hardware,
is kicked off by the user initiating a "Load" procedure from the hardware
management console. This "Load" procedure crafts a special "Read IPL" ccw in
memory location ``0x0`` that reads IPL1. It then executes this ccw, thereby
kicking off the reading of IPL1 data. Since the channel program from IPL1 will
be written immediately after the special "Read IPL" ccw, the IPL1 channel
program will be executed immediately (the special read ccw has the chaining bit
turned on). The TIC at the end of the IPL1 channel program will cause the IPL2
channel program to be executed automatically. After this sequence completes, the
"Load" procedure then loads the psw from ``0x0``.
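
The 24-byte IPL1 layout and the initial "Read IPL" ccw described in this section
can be sketched in C as follows. This is an illustrative sketch only, not code
taken from QEMU: the constant and function names are assumptions made for this
example, ``mem`` is an ordinary buffer standing in for memory location ``0x0``,
and only the chain flag is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the names are assumptions made for this sketch. */
enum {
    CCW_SIZE         = 8,    /* every ccw is 8 bytes long */
    IPL1_SIZE        = 24,   /* [psw][read ccw][tic ccw], 8 bytes each */
    CCW_CMD_READ_IPL = 0x02, /* Read IPL command code */
    CCW_CMD_TIC      = 0x08, /* Transfer In Channel command code */
    CCW_FLAG_CC      = 0x40  /* command-chaining flag */
};

/* Offsets of the three pieces inside the IPL1 record. */
enum { IPL1_PSW_OFF = 0, IPL1_READ_CCW_OFF = 8, IPL1_TIC_CCW_OFF = 16 };

/*
 * Encode a format-0 ccw into 8 bytes: command code, 24-bit data address,
 * flags, one reserved byte, 16-bit count. Multi-byte fields are big-endian.
 */
static void ccw0_encode(uint8_t out[CCW_SIZE], uint8_t cmd, uint32_t addr,
                        uint8_t flags, uint16_t count)
{
    assert(addr <= 0xFFFFFF); /* format 0 only has room for 24 address bits */
    out[0] = cmd;
    out[1] = (uint8_t)(addr >> 16);
    out[2] = (uint8_t)(addr >> 8);
    out[3] = (uint8_t)addr;
    out[4] = flags;
    out[5] = 0;
    out[6] = (uint8_t)(count >> 8);
    out[7] = (uint8_t)count;
}

/*
 * Build a "Read IPL" ccw at the start of mem that reads the 24 bytes of IPL1
 * to address 0x0. With chain != 0 the channel would go on to fetch the next
 * ccw from 0x08 once the read completes.
 */
static void make_read_ipl(uint8_t *mem, int chain)
{
    ccw0_encode(mem, CCW_CMD_READ_IPL, 0x0, chain ? CCW_FLAG_CC : 0, IPL1_SIZE);
}
```

Calling ``make_read_ipl(mem, 1)`` models the hardware "Load" case, where the
chain flag makes the IPL1 channel program at ``0x08`` run right after the read.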

How this all pertains to QEMU (and the kernel)
----------------------------------------------

In theory we should merely have to do the following to IPL/boot a guest
operating system from a DASD device:

1. Place a "Read IPL" ccw into memory location ``0x0`` with chaining bit on.
2. Execute channel program at ``0x0``.
3. LPSW ``0x0``.

However, our emulation of the machine's channel program logic within the kernel
is missing one key feature that is required for this process to work:
non-prefetch of ccw data.

When we start a channel program we pass the channel subsystem parameters via an
ORB (Operation Request Block). One of those parameters is a prefetch bit. If the
bit is on then the vfio-ccw kernel driver is allowed to read the entire channel
program from guest memory before it starts executing it. This means that any
channel commands that read additional channel commands will not work as expected
because the newly read commands will only exist in guest memory and NOT within
the kernel's channel subsystem memory. The kernel vfio-ccw driver currently
requires this bit to be on for all channel programs. This is a problem because
the IPL process consists of transferring control from the "Read IPL" ccw
immediately to the IPL1 channel program that was read by "Read IPL".
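
The effect of prefetch can be illustrated with a deliberately simplified toy
model in C. It does not reflect how the vfio-ccw driver actually translates
channel programs: the buffers, the function name and the ``TOY_`` constants are
all assumptions made for this sketch. It only shows that a ccw written by a
read is visible in guest memory but not in a copy of the program taken before
execution started.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative values; the names are assumptions made for this sketch. */
enum {
    TOY_MEM_SIZE     = 64,
    TOY_IPL1_SIZE    = 24,
    TOY_CMD_READ_IPL = 0x02,
    TOY_CMD_READ     = 0x06, /* stands in for the read ccw inside IPL1 */
    TOY_CMD_TIC      = 0x08,
    TOY_FLAG_CC      = 0x40
};

/*
 * Run a chained "Read IPL" at 0x0 and return the command code of the ccw
 * that the channel fetches next, from 0x08.
 */
static uint8_t next_cmd_after_read_ipl(int prefetch)
{
    uint8_t guest[TOY_MEM_SIZE] = {0};
    uint8_t snapshot[TOY_MEM_SIZE];

    /* Toy stand-in for the IPL1 record on disk: psw, read ccw, tic ccw. */
    uint8_t record[TOY_IPL1_SIZE] = {0};
    record[8]  = TOY_CMD_READ; /* the first byte of a ccw is its command code */
    record[16] = TOY_CMD_TIC;

    /* Chained "Read IPL" ccw at 0x0: read 24 bytes to address 0x0. */
    guest[0] = TOY_CMD_READ_IPL;
    guest[4] = TOY_FLAG_CC;
    guest[7] = TOY_IPL1_SIZE;

    /* With prefetch, the program is copied before execution starts. */
    memcpy(snapshot, guest, sizeof(guest));

    /* Executing the read places the device data in guest memory only. */
    memcpy(guest, record, sizeof(record));

    /* The chain flag makes the channel continue with the ccw at 0x08. */
    const uint8_t *program = prefetch ? snapshot : guest;
    return program[8];
}
```

In this model the non-prefetch case finds the read ccw that IPL1 just placed at
``0x08``, while the prefetch case finds only the stale, empty copy.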

Not being able to turn off prefetch will also prevent the TIC at the end of the
IPL1 channel program from transferring control to the IPL2 channel program.

Lastly, in some cases (the zipl bootloader for example) the IPL2 program also
transfers control to another channel program segment immediately after reading
it from the disk. So we need to be able to handle this case.

What QEMU does
--------------

Since we are forced to live with prefetch we cannot use the very simple IPL
procedure we defined in the preceding section. So we compensate by doing the
following.

1. Place "Read IPL" ccw into memory location ``0x0``, but turn off chaining bit.
2. Execute "Read IPL" at ``0x0``.

   So now IPL1's psw is at ``0x0`` and IPL1's channel program is at ``0x08``.

3. Write a custom channel program that will seek to the IPL2 record and then
   execute the READ and TIC ccws from IPL1. Normally the seek is not required
   because after reading the IPL1 record the disk is automatically positioned
   to read the very next record which will be IPL2. But since we are not reading
   both IPL1 and IPL2 as part of the same channel program we must manually set
   the position.

4. Grab the target address of the TIC instruction from the IPL1 channel program.
   This address is where the IPL2 channel program starts.

   Now IPL2 is loaded into memory somewhere, and we know the address.
120
cc3d15a5 1215. Execute the IPL2 channel program at the address obtained in step #4.
efa47d36
JH
122
123 Because this channel program can be dynamic, we must use a special algorithm
124 that detects a READ immediately followed by a TIC and breaks the ccw chain
125 by turning off the chain bit in the READ ccw. When control is returned from
126 the kernel/hardware to the QEMU bios code we immediately issue another start
127 subchannel to execute the remaining TIC instruction. This causes the entire
128 channel program (starting from the TIC) and all needed data to be refetched
129 thereby stepping around the limitation that would otherwise prevent this
130 channel program from executing properly.
131
132 Now the operating system code is loaded somewhere in guest memory and the psw
cc3d15a5 133 in memory location ``0x0`` will point to entry code for the guest operating
efa47d36
JH
134 system.

6. LPSW ``0x0``

   LPSW transfers control to the guest operating system and we're done.
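
The ccw handling in steps 4 and 5 can be sketched in C as follows. This is a
minimal sketch under stated assumptions, not the QEMU bios implementation: the
helper names and constants are hypothetical, the channel program is assumed to
be a contiguous array of format-0 ccws, and only one read command code is
checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values; the names are assumptions made for this sketch. */
enum {
    CCW_SIZE    = 8,
    CMD_READ    = 0x06, /* a read command code, assumed for this sketch */
    CMD_TIC     = 0x08, /* Transfer In Channel */
    CCW_FLAG_CC = 0x40  /* command-chaining flag */
};

/*
 * Return the 24-bit data address of a format-0 ccw. For a TIC ccw this is
 * the address of the channel program it transfers to (step 4).
 */
static uint32_t ccw0_addr(const uint8_t *ccw)
{
    return ((uint32_t)ccw[1] << 16) | ((uint32_t)ccw[2] << 8) | ccw[3];
}

/*
 * Scan n consecutive ccws starting at prog. Wherever a chained READ is
 * immediately followed by a TIC, clear the READ's chain flag so that
 * execution stops after the READ (step 5). Returns the number of chains
 * broken; the caller would then start the TIC as a separate program.
 */
static size_t break_read_tic_chains(uint8_t *prog, size_t n)
{
    size_t broken = 0;

    for (size_t i = 0; i + 1 < n; i++) {
        uint8_t *ccw = prog + i * CCW_SIZE;
        const uint8_t *next = ccw + CCW_SIZE;

        if (ccw[0] == CMD_READ && (ccw[4] & CCW_FLAG_CC) && next[0] == CMD_TIC) {
            ccw[4] &= (uint8_t)~CCW_FLAG_CC;
            broken++;
        }
    }
    return broken;
}
```

Breaking the chain at the READ means the data it loads is already in guest
memory when the following TIC is started as a new channel program, so the
prefetched copy of that program includes the freshly read ccws.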