Commit | Line | Data |
---|---|---|
7b0df9c5 | 1 | .\" This file was originally generated by help2man 1.36. |
9473e340 | 2 | .TH WATCHFRR 8 "July 2010" |
7b0df9c5 | 3 | .SH NAME |
9473e340 | 4 | watchfrr \- a program to monitor the status of frr daemons |
7b0df9c5 | 5 | .SH SYNOPSIS |
9473e340 | 6 | .B watchfrr |
7b0df9c5 DW |
7 | .RI [ option ...] |
8 | .IR daemon ... | |
9 | .br | |
9473e340 | 10 | .B watchfrr |
7b0df9c5 DW |
11 | .BR \-h " | " \-v |
12 | .SH DESCRIPTION | |
9473e340 DS |
13 | .B watchfrr |
14 | is a watchdog program that monitors the status of supplied frr | |
7b0df9c5 DW |
15 | .IR daemon s |
16 | and tries to restart them if they become unresponsive or shut down. | |
17 | .PP | |
18 | To determine whether a daemon is running, it tries to connect to the | |
19 | daemon's VTY UNIX stream socket, and send echo commands to ensure the | |
20 | daemon responds. When the daemon crashes, EOF is received from the socket, | |
9473e340 | 21 | so that watchfrr can react immediately. |
7b0df9c5 DW |
22 | .PP |
23 | This program can run in one of the following 5 modes: | |
24 | .TP | |
25 | .B Mode 0: monitor | |
26 | In this mode, the program serves as a monitor and reports status changes. | |
27 | .IP | |
9473e340 | 28 | Example usage: watchfrr \-d zebra ospfd bgpd |
7b0df9c5 DW |
29 | .TP |
30 | .B Mode 1: global restart | |
31 | In this mode, whenever a daemon hangs or crashes, the given command is used | |
32 | to restart all watched daemons. | |
33 | .IP | |
9473e340 | 34 | Example usage: watchfrr \-dz \e |
7b0df9c5 DW |
35 | .br |
36 | \-R '/sbin/service zebra restart; /sbin/service ospfd restart' \e | |
37 | .br | |
38 | zebra ospfd | |
39 | .TP | |
40 | .B Mode 2: individual daemon restart | |
41 | In this mode, whenever a single daemon hangs or crashes, the given command | |
42 | is used to restart this daemon only. | |
43 | .IP | |
9473e340 | 44 | Example usage: watchfrr \-dz \-r '/sbin/service %s restart' \e |
7b0df9c5 DW |
45 | .br |
46 | zebra ospfd bgpd | |
47 | .TP | |
48 | .B Mode 3: phased zebra restart | |
49 | In this mode, whenever a single daemon hangs or crashes, the given command | |
50 | is used to restart this daemon only. The only exception is the zebra | |
51 | daemon; in this case, the following steps are taken: (1) all other daemons | |
52 | are stopped, (2) zebra is restarted, and (3) other daemons are started | |
53 | again. | |
54 | .IP | |
9473e340 | 55 | Example usage: watchfrr \-adz \-r '/sbin/service %s restart' \e |
7b0df9c5 DW |
56 | .br |
57 | \-s '/sbin/service %s start' \e | |
58 | .br | |
59 | \-k '/sbin/service %s stop' zebra ospfd bgpd | |
60 | .TP | |
61 | .B Mode 4: phased global restart for any failure | |
62 | In this mode, whenever any daemon hangs or crashes, the following | |
63 | steps are taken: (1) all other daemons are stopped, (2) zebra is restarted, | |
64 | and (3) other daemons are started again. | |
65 | .IP | |
9473e340 | 66 | Example usage: watchfrr \-Adz \-r '/sbin/service %s restart' \e |
7b0df9c5 DW |
67 | .br |
68 | \-s '/sbin/service %s start' \e | |
69 | .br | |
70 | \-k '/sbin/service %s stop' zebra ospfd bgpd | |
71 | .PP | |
72 | Important: It is believed that mode 2 (individual daemon restart) is not | |
73 | safe, and mode 3 (phased zebra restart) may not be safe with certain | |
74 | routing daemons. | |
75 | .PP | |
76 | In order to avoid restarting the daemons in quick succession, you can | |
77 | supply the | |
78 | .B \-m | |
79 | and | |
80 | .B \-M | |
81 | options to set the minimum and maximum delay between the restart commands. | |
82 | The minimum restart delay is recalculated each time a restart is attempted. | |
83 | If the time since the last restart attempt exceeds twice the value of | |
84 | .BR \-M , | |
85 | the restart delay is set to the value of | |
86 | .BR \-m ; | |
87 | otherwise, the interval is doubled (but capped at the value of | |
88 | .BR \-M ). | |
89 | .SH OPTIONS | |
90 | .TP | |
91 | .BR \-d ", " \-\-daemon | |
92 | Run in daemon mode. When supplied, error messages are sent to syslog | |
93 | instead of standard output (stdout). | |
94 | .TP | |
95 | .BI \-S " directory" "\fR, \fB\-\-statedir " directory | |
96 | Set the VTY socket | |
97 | .I directory | |
9473e340 | 98 | (the default value is "/var/run/frr"). |
7b0df9c5 DW |
99 | .TP |
100 | .BR \-e ", " \-\-no\-echo | |
101 | Do not ping the daemons to test whether they respond. This option is | |
102 | necessary if one or more daemons do not support the echo command. | |
103 | .TP | |
104 | .BI \-l " level" "\fR, \fB\-\-loglevel " level | |
105 | Set the logging | |
106 | .I level | |
107 | (the default value is "6"). The value should range from 0 (LOG_EMERG) to 7 | |
108 | (LOG_DEBUG), but a higher number can be supplied if extra debugging messages | |
109 | are required. | |
110 | .TP | |
111 | .BI \-m " number" "\fR, \fB\-\-min\-restart\-interval " number | |
112 | Set the minimum | |
113 | .I number | |
114 | of seconds to wait between invocations of the daemon restart commands (the | |
115 | default value is "60"). | |
116 | .TP | |
117 | .BI \-M " number" "\fR, \fB\-\-max\-restart\-interval " number | |
118 | Set the maximum | |
119 | .I number | |
120 | of seconds to wait between invocations of the daemon restart commands (the | |
121 | default value is "600"). | |
122 | .TP | |
123 | .BI \-i " number" "\fR, \fB\-\-interval " number | |
124 | Set the status polling interval in seconds (the default value is "5"). | |
125 | .TP | |
126 | .BI \-t " number" "\fR, \fB\-\-timeout " number | |
127 | Set the unresponsiveness timeout in seconds (the default value is "10"). | |
128 | .TP | |
129 | .BI \-T " number" "\fR, \fB\-\-restart\-timeout " number | |
130 | Set the restart (kill) timeout in seconds (the default value is "20"). If | |
131 | any background jobs are still running after this period has elapsed, they | |
132 | will be killed. | |
133 | .TP | |
134 | .BI \-r " command" "\fR, \fB\-\-restart " command | |
135 | Supply a Bourne shell | |
136 | .I command | |
137 | to restart a single daemon. The command string should contain the '%s' | |
138 | placeholder to be substituted with the daemon name. | |
139 | .IP | |
140 | Note that the | |
141 | .B \-r | |
142 | and | |
143 | .B \-R | |
144 | options are mutually exclusive. | |
145 | .TP | |
146 | .BI \-s " command" "\fR, \fB\-\-start\-command " command | |
147 | Supply a Bourne shell | |
148 | .I command | |
149 | to start a single daemon. The command string should contain the '%s' | |
150 | placeholder to be substituted with the daemon name. | |
151 | .TP | |
152 | .BI \-k " command" "\fR, \fB\-\-kill\-command " command | |
153 | Supply a Bourne shell | |
154 | .I command | |
155 | to stop a single daemon. The command string should contain the '%s' | |
156 | placeholder to be substituted with the daemon name. | |
157 | .TP | |
158 | .BI \-R " command" "\fR, \fB\-\-restart\-all " command | |
159 | When one or more daemons are shut down, try to restart them using the | |
160 | Bourne shell command supplied on the command line. | |
161 | .IP | |
162 | Note that the | |
163 | .B \-r | |
164 | and | |
165 | .B \-R | |
166 | options are mutually exclusive. | |
167 | .TP | |
168 | .BR \-z ", " \-\-unresponsive\-restart | |
169 | When a daemon is in an unresponsive state, treat it as being shut down for | |
170 | restart purposes. | |
171 | .TP | |
172 | .BR \-a ", " \-\-all\-restart | |
173 | When zebra hangs or crashes, restart all daemons by taking the following | |
174 | steps: (1) stop all other daemons, (2) restart zebra, and (3) start other | |
175 | daemons again. | |
176 | .IP | |
177 | Note that this option also requires the | |
178 | .BR \-r , | |
179 | .BR \-s , | |
180 | and | |
181 | .B \-k | |
182 | options to be specified. | |
183 | .TP | |
184 | .BR \-A ", " \-\-always\-all\-restart | |
185 | When any daemon (i.e., not just zebra) hangs or crashes, restart all | |
186 | daemons by taking the following steps: (1) stop all other daemons, (2) restart | |
187 | zebra, and (3) start other daemons again. | |
188 | .IP | |
189 | Note that this option also requires the | |
190 | .BR \-r , | |
191 | .BR \-s , | |
192 | and | |
193 | .B \-k | |
194 | options to be specified. | |
195 | .TP | |
196 | .BI \-p " filename" "\fR, \fB\-\-pid\-file " filename | |
197 | Set the process identifier | |
198 | .I filename | |
9473e340 | 199 | (the default value is "/var/run/frr/watchfrr.pid"). |
7b0df9c5 DW |
200 | .TP |
201 | .BI \-b " string" "\fR, \fB\-\-blank\-string " string | |
202 | When the supplied | |
203 | .I string | |
204 | is found in any of the command line option arguments (i.e., | |
205 | .BR \-r , | |
206 | .BR \-s , | |
207 | .BR \-k , | |
208 | or | |
209 | .BR \-R ), | |
210 | replace it with a space. | |
211 | .IP | |
212 | This is an ugly hack to circumvent problems with passing command line | |
213 | arguments that contain embedded spaces. | |
214 | .TP | |
215 | .BR \-v ", " \-\-version | |
216 | Display the version information and exit. | |
217 | .TP | |
218 | .BR \-h ", " \-\-help | |
219 | Display the usage information and exit. | |
220 | .SH SEE ALSO | |
221 | .BR zebra (8), | |
222 | .BR bgpd (8), | |
223 | .BR isisd (8), | |
224 | .BR ospfd (8), | |
225 | .BR ospf6d (8), | |
226 | .BR ripd (8), | |
227 | .BR ripngd (8) | |
228 | .PP | |
a07169b1 | 229 | See the project homepage at <@PACKAGE_URL@>. |
7b0df9c5 DW |
230 | .SH AUTHORS |
231 | Copyright 2004 Andrew J. Schorr |
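
The restart back-off described in the DESCRIPTION section (options `-m` and `-M`) can be sketched as follows. This is a minimal illustration in Python, not watchfrr's actual implementation: the function name `next_restart_interval` and its parameter names are hypothetical, and the defaults simply mirror the documented values of 60 and 600 seconds.

```python
def next_restart_interval(current, since_last, min_interval=60, max_interval=600):
    """Return the delay (in seconds) to enforce before the next restart.

    current      -- delay used for the previous restart attempt (hypothetical name)
    since_last   -- seconds elapsed since the previous restart attempt
    min_interval -- value of -m (documented default: 60)
    max_interval -- value of -M (documented default: 600)
    """
    # A long quiet period (more than twice -M) resets the delay to -m.
    if since_last > 2 * max_interval:
        return min_interval
    # Otherwise the delay doubles, but never exceeds -M.
    return min(current * 2, max_interval)


if __name__ == "__main__":
    delay = 60
    # Simulate repeated failures arriving 30 seconds apart.
    for attempt in range(1, 6):
        delay = next_restart_interval(delay, since_last=30)
        print(f"attempt {attempt}: wait at least {delay} s")
```

Under these assumptions, a daemon that keeps failing sees its restart delay grow until it reaches the `-M` cap, and the delay drops back to `-m` only after a quiet period longer than twice `-M`.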