Commit | Line | Data |
---|---|---|
1 | =============================================== | |
2 | The Linux WatchDog Timer Driver Core kernel API | |
3 | =============================================== | |
4 | ||
5 | Last reviewed: 12-Feb-2013 | |
6 | ||
7 | Wim Van Sebroeck <wim@iguana.be> | |
8 | ||
9 | Introduction | |
10 | ------------ | |
11 | This document does not describe what a WatchDog Timer (WDT) Driver or Device is. | |
12 | It also does not describe the API which can be used by user space to communicate | |
13 | with a WatchDog Timer. For that information, please read the following | |
14 | file: Documentation/watchdog/watchdog-api.rst. | |
15 | ||
16 | So what does this document describe? It describes the API that can be used by | |
17 | WatchDog Timer Drivers that want to use the WatchDog Timer Driver Core | |
18 | Framework. This framework provides all interfacing towards user space so that | |
19 | the same code does not have to be reproduced each time. This also means that | |
20 | a watchdog timer driver then only needs to provide the different routines | |
21 | (operations) that control the watchdog timer (WDT). | |
22 | ||
23 | The API | |
24 | ------- | |
25 | Each watchdog timer driver that wants to use the WatchDog Timer Driver Core | |
26 | must #include <linux/watchdog.h> (you would have to do this anyway when | |
27 | writing a watchdog device driver). This include file contains the following | |
28 | register/unregister routines:: | |
29 | ||
30 | extern int watchdog_register_device(struct watchdog_device *); | |
31 | extern void watchdog_unregister_device(struct watchdog_device *); | |
32 | ||
33 | The watchdog_register_device routine registers a watchdog timer device. | |
34 | The parameter of this routine is a pointer to a watchdog_device structure. | |
35 | This routine returns zero on success and a negative errno code for failure. | |
36 | ||
37 | The watchdog_unregister_device routine deregisters a registered watchdog timer | |
38 | device. The parameter of this routine is the pointer to the registered | |
39 | watchdog_device structure. | |
40 | ||
41 | The watchdog subsystem includes a registration deferral mechanism, | |
42 | which allows you to register a watchdog as early as you wish during | |
43 | the boot process. | |
44 | ||
45 | The watchdog device structure looks like this:: | |
46 | ||
47 | struct watchdog_device { | |
48 | int id; | |
49 | struct device *parent; | |
50 | const struct attribute_group **groups; | |
51 | const struct watchdog_info *info; | |
52 | const struct watchdog_ops *ops; | |
53 | const struct watchdog_governor *gov; | |
54 | unsigned int bootstatus; | |
55 | unsigned int timeout; | |
56 | unsigned int pretimeout; | |
57 | unsigned int min_timeout; | |
58 | unsigned int max_timeout; | |
59 | unsigned int min_hw_heartbeat_ms; | |
60 | unsigned int max_hw_heartbeat_ms; | |
61 | struct notifier_block reboot_nb; | |
62 | struct notifier_block restart_nb; | |
63 | void *driver_data; | |
64 | struct watchdog_core_data *wd_data; | |
65 | unsigned long status; | |
66 | struct list_head deferred; | |
67 | }; | |
68 | ||
69 | It contains the following fields: | |
70 | ||
71 | * id: set by watchdog_register_device; id 0 is special. It has both a | |
72 | /dev/watchdog0 cdev (dynamic major, minor 0) as well as the old | |
73 | /dev/watchdog miscdev. The id is set automatically when calling | |
74 | watchdog_register_device. | |
75 | * parent: set this to the parent device (or NULL) before calling | |
76 | watchdog_register_device. | |
77 | * groups: List of sysfs attribute groups to create when creating the watchdog | |
78 | device. | |
79 | * info: a pointer to a watchdog_info structure. This structure gives some | |
80 | additional information about the watchdog timer itself (like its unique name). | |
81 | * ops: a pointer to the list of watchdog operations that the watchdog supports. | |
82 | * gov: a pointer to the assigned watchdog device pretimeout governor or NULL. | |
83 | * timeout: the watchdog timer's timeout value (in seconds). | |
84 | This is the time after which the system will reboot if WDOG_ACTIVE is set | |
85 | and user space does not send a heartbeat request. | |
86 | * pretimeout: the watchdog timer's pretimeout value (in seconds). | |
87 | * min_timeout: the watchdog timer's minimum timeout value (in seconds). | |
88 | If set, the minimum configurable value for 'timeout'. | |
89 | * max_timeout: the watchdog timer's maximum timeout value (in seconds), | |
90 | as seen from userspace. If set, the maximum configurable value for | |
91 | 'timeout'. Not used if max_hw_heartbeat_ms is non-zero. | |
92 | * min_hw_heartbeat_ms: Hardware limit for minimum time between heartbeats, | |
93 | in milliseconds. This value is normally 0; it should only be provided | |
94 | if the hardware cannot tolerate lower intervals between heartbeats. | |
95 | * max_hw_heartbeat_ms: Maximum hardware heartbeat, in milliseconds. | |
96 | If set, the infrastructure will send heartbeats to the watchdog driver | |
97 | if 'timeout' is larger than max_hw_heartbeat_ms, unless WDOG_ACTIVE | |
98 | is set and userspace failed to send a heartbeat for at least 'timeout' | |
99 | seconds. max_hw_heartbeat_ms must be set if a driver does not implement | |
100 | the stop function. | |
101 | * reboot_nb: notifier block that is registered for reboot notifications, for | |
102 | internal use only. If the driver calls watchdog_stop_on_reboot, watchdog core | |
103 | will stop the watchdog on such notifications. | |
104 | * restart_nb: notifier block that is registered for machine restart, for | |
105 | internal use only. If a watchdog is capable of restarting the machine, it | |
106 | should define ops->restart. Priority can be changed through | |
107 | watchdog_set_restart_priority. | |
108 | * bootstatus: status of the device after booting (reported with watchdog | |
109 | WDIOF_* status bits). | |
110 | * driver_data: a pointer to the driver's private data of a watchdog device. | |
111 | This data should only be accessed via the watchdog_set_drvdata and | |
112 | watchdog_get_drvdata routines. | |
113 | * wd_data: a pointer to watchdog core internal data. | |
114 | * status: this field contains a number of status bits that give extra | |
115 | information about the status of the device (for example, whether the watchdog | |
116 | timer is running/active, or whether the nowayout bit is set). | |
117 | * deferred: entry in wtd_deferred_reg_list which is used to | |
118 | register early initialized watchdogs. | |
119 | ||
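The way min_timeout, max_timeout and max_hw_heartbeat_ms interact in the field list above can be sketched in plain user-space C. This is only an illustration of the description, not the kernel's code: struct wdt_limits and timeout_in_range() are hypothetical names, and the sketch assumes that a value of 0 means "not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the limit fields discussed above. */
struct wdt_limits {
	unsigned int min_timeout;          /* seconds, 0 = not set */
	unsigned int max_timeout;          /* seconds, 0 = not set */
	unsigned int max_hw_heartbeat_ms;  /* milliseconds, 0 = not set */
};

/* Return true if a requested timeout (in seconds) is acceptable. */
bool timeout_in_range(const struct wdt_limits *lim, unsigned int t)
{
	assert(lim != NULL);

	/* min_timeout, if set, is the lowest configurable timeout. */
	if (lim->min_timeout && t < lim->min_timeout)
		return false;

	/* max_timeout is not used when max_hw_heartbeat_ms is non-zero. */
	if (!lim->max_hw_heartbeat_ms && lim->max_timeout &&
	    t > lim->max_timeout)
		return false;

	return true;
}
```

With min_timeout = 2 and max_timeout = 60, a request for 61 seconds is rejected unless max_hw_heartbeat_ms is also set, in which case the upper limit no longer applies.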
120 | The list of watchdog operations is defined as:: | |
121 | ||
122 | struct watchdog_ops { | |
123 | struct module *owner; | |
124 | /* mandatory operations */ | |
125 | int (*start)(struct watchdog_device *); | |
126 | /* optional operations */ | |
127 | int (*stop)(struct watchdog_device *); | |
128 | int (*ping)(struct watchdog_device *); | |
129 | unsigned int (*status)(struct watchdog_device *); | |
130 | int (*set_timeout)(struct watchdog_device *, unsigned int); | |
131 | int (*set_pretimeout)(struct watchdog_device *, unsigned int); | |
132 | unsigned int (*get_timeleft)(struct watchdog_device *); | |
133 | int (*restart)(struct watchdog_device *); | |
134 | long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long); | |
135 | }; | |
136 | ||
137 | It is important that you first define the module owner of the watchdog timer | |
138 | driver's operations. This module owner will be used to lock the module when | |
139 | the watchdog is active. (This is to avoid a system crash when you unload the | |
140 | module and /dev/watchdog is still open). | |
141 | ||
142 | Some operations are mandatory and some are optional. The mandatory operations | |
143 | are: | |
144 | ||
145 | * start: this is a pointer to the routine that starts the watchdog timer | |
146 | device. | |
147 | The routine needs a pointer to the watchdog timer device structure as a | |
148 | parameter. It returns zero on success or a negative errno code for failure. | |
149 | ||
150 | Not all watchdog timer hardware supports the same functionality. That's why | |
151 | all other routines/operations are optional. They only need to be provided if | |
152 | they are supported. These optional routines/operations are: | |
153 | ||
154 | * stop: this routine stops the watchdog timer device. | |
155 | ||
156 | The routine needs a pointer to the watchdog timer device structure as a | |
157 | parameter. It returns zero on success or a negative errno code for failure. | |
158 | Some watchdog timer hardware can only be started and not be stopped. A | |
159 | driver supporting such hardware does not have to implement the stop routine. | |
160 | ||
161 | If a driver has no stop function, the watchdog core will set WDOG_HW_RUNNING | |
162 | and start calling the driver's keepalive ping function after the watchdog | |
163 | device is closed. | |
164 | ||
165 | If a watchdog driver does not implement the stop function, it must set | |
166 | max_hw_heartbeat_ms. | |
167 | * ping: this is the routine that sends a keepalive ping to the watchdog timer | |
168 | hardware. | |
169 | ||
170 | The routine needs a pointer to the watchdog timer device structure as a | |
171 | parameter. It returns zero on success or a negative errno code for failure. | |
172 | ||
173 | Most hardware that does not support this as a separate function uses the | |
174 | start function to restart the watchdog timer hardware. And that's also what | |
175 | the watchdog timer driver core does: to send a keepalive ping to the watchdog | |
176 | timer hardware it will either use the ping operation (when available) or the | |
177 | start operation (when the ping operation is not available). | |
178 | ||
179 | (Note: the WDIOC_KEEPALIVE ioctl call will only be active when the | |
180 | WDIOF_KEEPALIVEPING bit has been set in the options field of the watchdog's | |
181 | info structure). | |
182 | * status: this routine checks the status of the watchdog timer device. The | |
183 | status of the device is reported with watchdog WDIOF_* status flags/bits. | |
184 | ||
185 | WDIOF_MAGICCLOSE and WDIOF_KEEPALIVEPING are reported by the watchdog core; | |
186 | it is not necessary to report those bits from the driver. Also, if no status | |
187 | function is provided by the driver, the watchdog core reports the status bits | |
188 | provided in the bootstatus variable of struct watchdog_device. | |
189 | ||
190 | * set_timeout: this routine checks and changes the timeout of the watchdog | |
191 | timer device. It returns 0 on success, -EINVAL for "parameter out of range" | |
192 | and -EIO for "could not write value to the watchdog". On success this | |
193 | routine should set the timeout value of the watchdog_device to the | |
194 | achieved timeout value (which may be different from the requested one | |
195 | because the watchdog does not necessarily have a 1 second resolution). | |
196 | ||
197 | Drivers implementing max_hw_heartbeat_ms set the hardware watchdog heartbeat | |
198 | to the minimum of timeout and max_hw_heartbeat_ms. Those drivers set the | |
199 | timeout value of the watchdog_device either to the requested timeout value | |
200 | (if it is larger than max_hw_heartbeat_ms), or to the achieved timeout value. | |
201 | (Note: the WDIOF_SETTIMEOUT bit needs to be set in the options field of the | |
202 | watchdog's info structure). | |
203 | ||
204 | If the watchdog driver does not have to perform any action but setting the | |
205 | watchdog_device.timeout, this callback can be omitted. | |
206 | ||
207 | If set_timeout is not provided but WDIOF_SETTIMEOUT is set, the watchdog | |
208 | infrastructure updates the timeout value of the watchdog_device internally | |
209 | to the requested value. | |
210 | ||
211 | If the pretimeout feature is used (WDIOF_PRETIMEOUT), then set_timeout must | |
212 | also take care of checking if pretimeout is still valid and set up the timer | |
213 | accordingly. This can't be done in the core without races, so it is the | |
214 | duty of the driver. | |
215 | * set_pretimeout: this routine checks and changes the pretimeout value of | |
216 | the watchdog. It is optional because not all watchdogs support pretimeout | |
217 | notification. The pretimeout value is not an absolute time, but the number of | |
218 | seconds before the actual timeout would happen. It returns 0 on success, | |
219 | -EINVAL for "parameter out of range" and -EIO for "could not write value to | |
220 | the watchdog". A value of 0 disables pretimeout notification. | |
221 | ||
222 | (Note: the WDIOF_PRETIMEOUT bit needs to be set in the options field of the | |
223 | watchdog's info structure). | |
224 | ||
225 | If the watchdog driver does not have to perform any action but setting the | |
226 | watchdog_device.pretimeout, this callback can be omitted. That means if | |
227 | set_pretimeout is not provided but WDIOF_PRETIMEOUT is set, the watchdog | |
228 | infrastructure updates the pretimeout value of the watchdog_device internally | |
229 | to the requested value. | |
230 | ||
231 | * get_timeleft: this routine returns the time that's left before a reset. | |
232 | * restart: this routine restarts the machine. It returns 0 on success or a | |
233 | negative errno code for failure. | |
234 | * ioctl: if this routine is present then it will be called first before we do | |
235 | our own internal ioctl call handling. This routine should return -ENOIOCTLCMD | |
236 | if a command is not supported. The parameters that are passed to the ioctl | |
237 | call are: watchdog_device, cmd and arg. | |
238 | ||
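The set_timeout rule for drivers that implement max_hw_heartbeat_ms can be sketched as follows. This is a user-space mock under stated assumptions: struct wdt_dev, demo_set_timeout() and the hw_heartbeat_ms variable (standing in for a hardware register) are hypothetical, the mock hardware is assumed to have millisecond resolution, and max_hw_heartbeat_ms is assumed to be non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two watchdog_device fields used here. */
struct wdt_dev {
	unsigned int timeout;              /* seconds, as seen by user space */
	unsigned int max_hw_heartbeat_ms;  /* hardware limit, milliseconds */
};

/* Hypothetical "register": the heartbeat programmed into the hardware. */
unsigned int hw_heartbeat_ms;

/* set_timeout-style callback: returns 0 on success. */
int demo_set_timeout(struct wdt_dev *wdd, unsigned int requested_s)
{
	unsigned long long requested_ms;

	assert(wdd != NULL);
	requested_ms = (unsigned long long)requested_s * 1000;

	/* Program the hardware with min(timeout, max_hw_heartbeat_ms). */
	if (requested_ms > wdd->max_hw_heartbeat_ms)
		hw_heartbeat_ms = wdd->max_hw_heartbeat_ms;
	else
		hw_heartbeat_ms = (unsigned int)requested_ms;

	/*
	 * Above the hardware limit the requested value is stored as is.
	 * Below it, the achieved value is stored; with this mock's
	 * millisecond resolution that equals the requested value.
	 */
	wdd->timeout = requested_s;
	return 0;
}
```

For a device with max_hw_heartbeat_ms = 8000, requesting 60 seconds programs an 8000 ms hardware heartbeat while timeout stays at 60; the gap is what the watchdog infrastructure covers with its own heartbeats.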
239 | The status bits should (preferably) be set with set_bit, clear_bit and similar | |
240 | bit operations. The status bits that are defined are: | |
241 | ||
242 | * WDOG_ACTIVE: this status bit indicates whether a watchdog timer device | |
243 | is active from the user's perspective. User space is expected to send | |
244 | heartbeat requests to the driver while this flag is set. | |
245 | * WDOG_NO_WAY_OUT: this bit stores the nowayout setting for the watchdog. | |
246 | If this bit is set then the watchdog timer will not be able to stop. | |
247 | * WDOG_HW_RUNNING: Set by the watchdog driver if the hardware watchdog is | |
248 | running. The bit must be set if the watchdog timer hardware cannot be | |
249 | stopped. The bit may also be set if the watchdog timer is running after | |
250 | booting, before the watchdog device is opened. If set, the watchdog | |
251 | infrastructure will send keepalives to the watchdog hardware while | |
252 | WDOG_ACTIVE is not set. | |
253 | Note: when you register the watchdog timer device with this bit set, | |
254 | then opening /dev/watchdog will skip the start operation but send a keepalive | |
255 | request instead. | |
256 | ||
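The note on WDOG_HW_RUNNING, combined with the ping/start fallback described for the ping operation, can be sketched in user-space C. All names here (struct wdt_ops, struct wdt_dev, demo_keepalive(), demo_open() and the counting callbacks) are hypothetical, and a plain int field stands in for the WDOG_HW_RUNNING status bit; the real core is more involved.

```c
#include <assert.h>
#include <stddef.h>

struct wdt_dev;

/* Hypothetical, reduced operations table: only start and ping. */
struct wdt_ops {
	int (*start)(struct wdt_dev *);  /* mandatory */
	int (*ping)(struct wdt_dev *);   /* optional, may be NULL */
};

struct wdt_dev {
	const struct wdt_ops *ops;
	int hw_running;  /* stands in for the WDOG_HW_RUNNING status bit */
};

/* Example callbacks that only count how often they are called. */
int start_calls, ping_calls;
int demo_start(struct wdt_dev *wdd) { (void)wdd; start_calls++; return 0; }
int demo_ping(struct wdt_dev *wdd) { (void)wdd; ping_calls++; return 0; }

/* Keepalive: use ping when available, otherwise fall back to start. */
int demo_keepalive(struct wdt_dev *wdd)
{
	assert(wdd != NULL && wdd->ops != NULL && wdd->ops->start != NULL);
	if (wdd->ops->ping)
		return wdd->ops->ping(wdd);
	return wdd->ops->start(wdd);
}

/* Open: skip start and send a keepalive if the hardware already runs. */
int demo_open(struct wdt_dev *wdd)
{
	assert(wdd != NULL && wdd->ops != NULL && wdd->ops->start != NULL);
	if (wdd->hw_running)
		return demo_keepalive(wdd);
	return wdd->ops->start(wdd);
}
```

A device registered with hw_running set and a ping callback therefore sees one ping and no start call on open; a device without a ping callback gets its start routine called for every keepalive.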
257 | To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog | |
258 | timer device) you can either: | |
259 | ||
260 | * set it statically in your watchdog_device struct with | |
261 | ||
262 | .status = WATCHDOG_NOWAYOUT_INIT_STATUS, | |
263 | ||
264 | (this will set the value the same as CONFIG_WATCHDOG_NOWAYOUT) or | |
265 | * use the following helper function:: | |
266 | ||
267 | static inline void watchdog_set_nowayout(struct watchdog_device *wdd, | |
268 | int nowayout) | |
269 | ||
270 | Note: | |
271 | The WatchDog Timer Driver Core supports the magic close feature and | |
272 | the nowayout feature. To use the magic close feature you must set the | |
273 | WDIOF_MAGICCLOSE bit in the options field of the watchdog's info structure. | |
274 | ||
275 | The nowayout feature will overrule the magic close feature. | |
276 | ||
277 | To get or set driver specific data the following two helper functions should be | |
278 | used:: | |
279 | ||
280 | static inline void watchdog_set_drvdata(struct watchdog_device *wdd, | |
281 | void *data) | |
282 | static inline void *watchdog_get_drvdata(struct watchdog_device *wdd) | |
283 | ||
284 | The watchdog_set_drvdata function allows you to add driver specific data. The | |
285 | arguments of this function are the watchdog device to which you want to add | |
286 | the driver specific data and a pointer to the data itself. | |
287 | ||
288 | The watchdog_get_drvdata function allows you to retrieve driver specific data. | |
289 | The argument of this function is the watchdog device from which you want to | |
290 | retrieve data. The function returns the pointer to the driver specific data. | |
291 | ||
292 | To initialize the timeout field, the following function can be used:: | |
293 | ||
294 | extern int watchdog_init_timeout(struct watchdog_device *wdd, | |
295 | unsigned int timeout_parm, | |
296 | struct device *dev); | |
297 | ||
298 | The watchdog_init_timeout function allows you to initialize the timeout field | |
299 | using the module timeout parameter or by retrieving the timeout-sec property from | |
300 | the device tree (if the module timeout parameter is invalid). Best practice is | |
301 | to set the default timeout value as the timeout value in the watchdog_device and | |
302 | then use this function to set the user "preferred" timeout value. | |
303 | This routine returns zero on success and a negative errno code for failure. | |
304 | ||
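The selection order described for watchdog_init_timeout can be sketched as a user-space mock. This is a simplification under stated assumptions, not the kernel function: struct wdt_dev, demo_timeout_valid() and demo_init_timeout() are hypothetical, the device tree lookup is replaced by a plain dt_s argument in which 0 stands for "property absent", and the validity check is reduced to a min/max range test.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the watchdog_device fields used here. */
struct wdt_dev {
	unsigned int timeout;      /* seconds; preset to the driver default */
	unsigned int min_timeout;  /* seconds */
	unsigned int max_timeout;  /* seconds */
};

/* Hypothetical validity check: within [min_timeout, max_timeout]. */
static int demo_timeout_valid(const struct wdt_dev *wdd, unsigned int t)
{
	return t >= wdd->min_timeout && t <= wdd->max_timeout;
}

/*
 * param_s: value of the module timeout parameter (0 = not given).
 * dt_s:    value of the timeout-sec property (0 = absent in this mock).
 * Returns 0 on success or a negative errno code.
 */
int demo_init_timeout(struct wdt_dev *wdd, unsigned int param_s,
		      unsigned int dt_s)
{
	int ret = 0;

	assert(wdd != NULL);

	/* The module parameter is tried first. */
	if (param_s) {
		if (demo_timeout_valid(wdd, param_s)) {
			wdd->timeout = param_s;
			return 0;
		}
		ret = -EINVAL;
	}

	/* Otherwise fall back to the device tree value. */
	if (dt_s) {
		if (demo_timeout_valid(wdd, dt_s)) {
			wdd->timeout = dt_s;
			return 0;
		}
		ret = -EINVAL;
	}

	/* Neither source was usable: the preset default stays in place. */
	return ret;
}
```

Because the default is written to timeout before the call, a failed call leaves the device with a usable value, which is the reason for the best practice stated above.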
305 | To disable the watchdog on reboot, the user must call the following helper:: | |
306 | ||
307 | static inline void watchdog_stop_on_reboot(struct watchdog_device *wdd); | |
308 | ||
309 | To disable the watchdog when unregistering the watchdog, the user must call | |
310 | the following helper. Note that this will only stop the watchdog if the | |
311 | nowayout flag is not set. | |
312 | ||
313 | :: | |
314 | ||
315 | static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd); | |
316 | ||
317 | To change the priority of the restart handler the following helper should be | |
318 | used:: | |
319 | ||
320 | void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority); | |
321 | ||
322 | Users should follow these guidelines for setting the priority: | |
323 | ||
324 | * 0: should be called as a last resort, has limited restart capabilities | |
325 | * 128: default restart handler, use if no other handler is expected to be | |
326 | available, and/or if restart is sufficient to restart the entire system | |
327 | * 255: highest priority, will preempt all other restart handlers | |
328 | ||
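The effect of the priority values can be illustrated with a small user-space sketch that picks the handler that would be tried first. struct restart_handler and pick_restart_handler() are hypothetical names used only for this illustration; the kernel's actual restart handler mechanism is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one registered restart handler. */
struct restart_handler {
	const char *name;
	int priority;  /* 0 = last resort, 128 = default, 255 = highest */
};

/* Return the handler with the highest priority, or NULL if n is 0. */
const struct restart_handler *
pick_restart_handler(const struct restart_handler *h, size_t n)
{
	const struct restart_handler *best = NULL;
	size_t i;

	assert(h != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!best || h[i].priority > best->priority)
			best = &h[i];
	return best;
}
```

Given handlers with priorities 0, 128 and 255, the one registered with 255 is selected, which is why that value should be reserved for a handler that is meant to preempt all others.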
329 | To raise a pretimeout notification, the following function should be used:: | |
330 | ||
331 | void watchdog_notify_pretimeout(struct watchdog_device *wdd) | |
332 | ||
333 | The function can be called in interrupt context. If the watchdog pretimeout | |
334 | governor framework (kbuild CONFIG_WATCHDOG_PRETIMEOUT_GOV symbol) is enabled, | |
335 | an action is taken by a preconfigured pretimeout governor preassigned to | |
336 | the watchdog device. If the watchdog pretimeout governor framework is not | |
337 | enabled, watchdog_notify_pretimeout() prints a notification message to | |
338 | the kernel log buffer. | |
339 | ||
340 | To set the last known HW keepalive time for a watchdog, the following function | |
341 | should be used:: | |
342 | ||
343 | int watchdog_set_last_hw_keepalive(struct watchdog_device *wdd, | |
344 | unsigned int last_ping_ms) | |
345 | ||
346 | This function must be called immediately after watchdog registration. It | |
347 | sets the last known hardware heartbeat to have happened last_ping_ms before | |
348 | the current time. Calling this is only needed if the watchdog is already running | |
349 | when probe is called, and the watchdog can only be pinged after the | |
350 | min_hw_heartbeat_ms time has passed from the last ping. |
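
The arithmetic behind last_ping_ms can be sketched as follows: if the last hardware ping happened last_ping_ms ago and the hardware needs at least min_hw_heartbeat_ms between pings, the remaining wait is the difference, floored at zero. ms_until_next_ping() is a hypothetical helper written for this illustration and is not part of the watchdog API.

```c
#include <assert.h>

/* Milliseconds that must still pass before the next ping is allowed. */
unsigned int ms_until_next_ping(unsigned int min_hw_heartbeat_ms,
				unsigned int last_ping_ms)
{
	unsigned int wait = 0;

	/* Only wait if the last ping is more recent than the minimum gap. */
	if (last_ping_ms < min_hw_heartbeat_ms)
		wait = min_hw_heartbeat_ms - last_ping_ms;

	/* The wait can never exceed the minimum heartbeat interval. */
	assert(wait <= min_hw_heartbeat_ms);
	return wait;
}
```

For example, with min_hw_heartbeat_ms = 500 and a last ping 200 ms ago, the next ping must be delayed by another 300 ms; if the last ping was 700 ms ago, no delay is needed.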