=================================================
Linux API for read access to z/VM Monitor Records
=================================================

Date  : 2004-Nov-26

Author: Gerald Schaefer (geraldsc@de.ibm.com)


Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the `*MONITOR` System Service of z/VM.


User Requirements
=================
The z/VM guest on which you want to access this API needs to be configured in
order to allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item will use the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.

There are two options for being able to load the monitor DCSS (examples assume
that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
(the values BEGPAG and ENDPAG are given in units of 4K pages).
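As a quick check of the units, page numbers can be converted to addresses by multiplying by 4096. The sketch below assumes that BEGPAG and ENDPAG are copied from the Q NSS MAP output as hexadecimal page numbers and that ENDPAG names the last page belonging to the segment; the function name is only illustrative.

```python
PAGE_SIZE = 4096          # BEGPAG/ENDPAG are given in units of 4K pages
MB = 1024 * 1024


def dcss_range_mb(begpag, endpag):
    """Return (start_mb, end_mb) of a segment from its page numbers.

    Assumption: begpag and endpag are hexadecimal strings, and endpag is the
    last page that belongs to the segment.
    """
    start = int(begpag, 16) * PAGE_SIZE
    end = (int(endpag, 16) + 1) * PAGE_SIZE   # first byte after the segment
    return start / MB, end / MB


# Page numbers matching the 144 MB .. 152 MB example used in this document.
print(dcss_range_mb("9000", "97FF"))   # (144.0, 152.0)
```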

See also "CP Command and Utility Reference" (SC24-6081-00) for more information
on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
and Administration" (SC24-6116-00) for more information on DCSSes.

1st option:
-----------
You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
guest virtual storage around the address range of the DCSS.

Example::

    DEF STOR CONFIG 0.140M 200M.200M

This defines two blocks of storage: the first is 140MB in size and begins at
address 0MB, the second is 200MB in size and begins at address 200MB,
resulting in a total storage of 340MB. Note that the first block should
always start at 0 and be at least 64MB in size.
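The arithmetic of this example can be verified with a small sketch. It assumes that every value in the configuration is a whole number of megabytes written as `<start>.<size>`; the helper names are hypothetical and not part of any z/VM or Linux tool.

```python
def parse_stor_config(spec):
    """Parse pairs such as '0.140M 200M.200M' into (start_mb, size_mb) tuples.

    Assumption: all values are whole megabytes; a bare '0' has no suffix.
    """
    def to_mb(text):
        return int(text.rstrip("M"))

    blocks = []
    for pair in spec.split():
        start, size = pair.split(".")
        blocks.append((to_mb(start), to_mb(size)))
    return blocks


def overlaps_dcss(blocks, dcss_start_mb, dcss_end_mb):
    """True if any storage block intersects the DCSS address range."""
    return any(start < dcss_end_mb and dcss_start_mb < start + size
               for start, size in blocks)


blocks = parse_stor_config("0.140M 200M.200M")
total_mb = sum(size for _, size in blocks)     # 340
hole_ok = not overlaps_dcss(blocks, 144, 152)  # DCSS lies inside the hole
```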

2nd option:
-----------
Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.

Example::

    DEF STOR 140M

This defines a storage size of 140MB for your guest; the parameter "mem=160M"
is added to the parmfile.
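The two conditions of this option can be written down as a short check. This is only a sketch with a hypothetical function name; it treats a guest storage of N MB as covering the addresses below N MB.

```python
def second_option_ok(guest_storage_mb, mem_mb, dcss_start_mb, dcss_end_mb):
    """Check the two conditions of the 2nd option (all values in MB)."""
    # Guest storage must end at or before the first byte of the DCSS.
    storage_below_dcss = guest_storage_mb <= dcss_start_mb
    # The "mem=" value must be greater than the ending address of the DCSS.
    mem_covers_dcss = mem_mb > dcss_end_mb
    return storage_below_dcss and mem_covers_dcss


# Values from the example: DEF STOR 140M, mem=160M, DCSS at 144 MB .. 152 MB.
print(second_option_ok(140, 160, 144, 152))
```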


User Interface
==============
The char device is implemented as a kernel module named "monreader",
which can be loaded via the modprobe command, or it can be compiled into the
kernel instead. There is one optional module (or kernel) parameter, "mondcss",
to specify the name of the monitor DCSS. If the module is compiled into the
kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.

The default name for the DCSS is "MONDCSS" if none is specified. If other
users are already connected to the `*MONITOR` service (e.g. Performance
Toolkit), the monitor DCSS is already defined and you have to use the same
DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.

Example:
--------

::

    modprobe monreader mondcss=MYDCSS

This loads the module and sets the DCSS name to "MYDCSS".

NOTE:
-----
This API provides no interface to control the `*MONITOR` service, e.g. to
specify which data should be collected. This can be done with the CP command
MONITOR (Class E privileged), see "CP Command and Utility Reference".

Device nodes with udev:
-----------------------
After loading the module, a char device will be created along with the device
node /<udev directory>/monreader.

Device nodes without udev:
--------------------------
If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
To do so, you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.

Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command: enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
to be created.

Example:
--------

::

    # modprobe monreader
    # cat /sys/class/misc/monreader/dev
    10:63
    # mknod /dev/monreader c 10 63

This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.
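The same steps can be scripted. The following is a minimal sketch using Python's standard `os` module; the function names and the default mode of 0600 are assumptions made for illustration, and creating the node requires root privileges.

```python
import os
import stat


def parse_dev(text):
    """Split '<major>:<minor>' as read from /sys/class/misc/monreader/dev."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def create_node(path, major, minor, mode=0o600):
    """Create a character device node, like 'mknod <path> c <major> <minor>'."""
    os.mknod(path, mode | stat.S_IFCHR, os.makedev(major, minor))
```

With the module loaded, calling `create_node("/dev/monreader", *parse_dev(open("/sys/class/misc/monreader/dev").read()))` would correspond to the manual example above.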

File operations:
----------------
The following file operations are supported: open, release, read, poll.
There are two alternative methods for reading: either non-blocking read in
conjunction with polling, or blocking read without polling. IOCTLs are not
supported.
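For the non-blocking method, an application typically waits with poll before each read. A minimal sketch of that wait, using Python's `select.poll` (the helper name is hypothetical):

```python
import os
import select


def wait_readable(fd, timeout_ms):
    """Return True if fd becomes readable within timeout_ms milliseconds."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    return bool(poller.poll(timeout_ms))
```

A caller would open the device with `os.open("/dev/monreader", os.O_RDONLY | os.O_NONBLOCK)` and call `os.read` only after `wait_readable` returns True. For the blocking method, the `O_NONBLOCK` flag and the poll step are simply omitted.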

Read:
-----
Reading from the device provides a 12 Byte monitor control element (MCE),
followed by a set of one or more contiguous monitor records (similar to the
output of the CMS utility MONWRITE without the 4K control blocks). The MCE
contains information on the type of the following record set (sample/event
data), the monitor domains contained within it and the start and end address
of the record set in the monitor DCSS. The start and end address can be used
to determine the size of the record set; the end address is the address of the
last byte of data. The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.
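The size and frame offset calculations described above can be sketched as follows. The field layout used here is an assumption and should be verified against Appendix A of "z/VM Performance": four bytes of type and domain information, followed by the start and end address as big-endian 32-bit values.

```python
import struct

MCE_SIZE = 12      # the MCE is 12 bytes long
PAGE_SIZE = 4096   # size of one frame


def parse_mce(mce):
    """Decode start/end address of the record set from one MCE.

    Assumed layout (not confirmed by this document): 4 bytes of type and
    domain information, then two big-endian 32-bit addresses.
    """
    if len(mce) != MCE_SIZE:
        raise ValueError("an MCE is exactly 12 bytes long")
    info, start, end = struct.unpack(">4sII", mce)
    return {
        "info": info,
        "start": start,
        "end": end,
        "size": end - start + 1,            # end is the address of the last byte
        "page_offset": start % PAGE_SIZE,   # record start offset within its frame
    }
```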

See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): http://www.vm.ibm.com/pubs/mon510/index.html

The layout of the data stream provided by the monreader device is as follows::

    ...
    <0 byte read>
    <first MCE>              \
    <first set of records>    |
    ...                       |- data set
    <last MCE>                |
    <last set of records>    /
    <0 byte read>
    ...

There may be more than one combination of MCE and corresponding record set
within one data set and the end of each data set is indicated by a successful
read with a return value of 0 (0 byte read).
Any received data must be considered invalid until a complete set was
read successfully, including the closing 0 byte read. Therefore you should
always read the complete set into a buffer before processing the data.
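Buffering a complete data set could look like the sketch below, which assumes a file descriptor opened for blocking reads. The function name and chunk size are illustrative choices. If a read fails, the exception propagates and the partial data is dropped, in line with the rule that data is only valid after the closing 0 byte read.

```python
import os


def read_data_set(fd, chunk_size=65536):
    """Read until a 0 byte read closes the data set and return all bytes."""
    parts = []
    while True:
        chunk = os.read(fd, chunk_size)
        if not chunk:                 # 0 byte read: data set is complete
            return b"".join(parts)
        parts.append(chunk)
```

The returned buffer still contains every MCE followed by its record set, so it has to be split using the addresses in each MCE before the records are processed.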

The maximum size of a data set can be as large as the size of the
monitor DCSS, so size the buffer accordingly or use dynamic memory allocation.
The size of the monitor DCSS will be printed into syslog after loading the
module. You can also use the (Class E privileged) CP command Q NSS MAP to
list all available segments and information about them.

As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:

EIO:
     reply failed, read data is invalid and the application
     should discard the data read since the last successful read with 0 size.
EFAULT:
     copy_to_user failed, read data is invalid and the application should
     discard the data read since the last successful read with 0 size.
EAGAIN:
     occurs on a non-blocking read if there is no data available at the
     moment. There is no data missing or corrupted, just try again or rather
     use polling for non-blocking reads.
EOVERFLOW:
     message limit reached, the data read since the last successful
     read with 0 size is valid but subsequent records may be missing.

In the last case (EOVERFLOW) there may be missing data, in the first two cases
(EIO, EFAULT) there will be missing data. It's up to the application whether
to continue reading subsequent data or to exit.
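In Python, a failed read surfaces as an `OSError` whose `errno` attribute carries these codes. The sketch below maps each code to the reaction described above; the function name and the returned labels are hypothetical.

```python
import errno


def classify_read_error(exc):
    """Map an OSError raised by a read on the device to a reaction."""
    if exc.errno in (errno.EIO, errno.EFAULT):
        return "discard"   # data since the last 0 byte read is invalid
    if exc.errno == errno.EAGAIN:
        return "retry"     # nothing lost: poll, then read again
    if exc.errno == errno.EOVERFLOW:
        return "keep"      # data so far is valid, later records may be missing
    raise exc              # not one of the conditions listed above
```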

Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.
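An application can turn these two documented failures into clearer messages. This is a sketch only; the default path `/dev/monreader` and the function name are assumptions, and any other error is passed through unchanged.

```python
import errno
import os


def open_monreader(path="/dev/monreader", nonblocking=False):
    """Open the device read-only and explain the documented open failures."""
    flags = os.O_RDONLY | (os.O_NONBLOCK if nonblocking else 0)
    try:
        return os.open(path, flags)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            raise RuntimeError("device is already opened by another user") from exc
        if exc.errno == errno.EIO:
            raise RuntimeError("no IUCV connection to *MONITOR, "
                               "see syslog for the IPUSER SEVER code") from exc
        raise
```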

NOTE:
-----
As soon as the device is opened, incoming messages will be accepted and they
will count against the message limit, i.e. opening the device without reading
from it will eventually provoke the "message limit reached" error (EOVERFLOW
error code).