git.proxmox.com Git - mirror_frr.git/commitdiff
tests: Fix Daemon Killing to actually notice when a daemon dies
author Donald Sharp <sharpd@nvidia.com>
Tue, 30 Nov 2021 00:33:48 +0000 (19:33 -0500)
committer Donald Sharp <sharpd@nvidia.com>
Tue, 30 Nov 2021 01:55:30 +0000 (20:55 -0500)
Lots of the GR topotests kill daemons in order to test code
that deals with crashing daemons.  Under heavy system load
it was noticed that a kill command was sent and, if told to
wait, we would sleep 2 seconds, send another kill command and
call it good.  This was causing issues when subsequent
json commands would get errors like `lost connection to daemon`
as the daemon finally shut down after some time due to load.

Modify the kill-the-daemon function to notice that the daemon
was not actually killed and, if we need to wait, wait some
more time for it to happen.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
tests/topotests/lib/topotest.py

index 4e613ce8ac7567f9f4eb832cc376496ae44c3cfe..6be644ac00a5999598e7a667f3f908ef1c06d475 100644 (file)
@@ -1859,7 +1859,7 @@ class Router(Node):
                             self.cmd("kill -9 %s" % daemonpid)
                             if pid_exists(int(daemonpid)):
                                 numRunning += 1
-                        if wait and numRunning > 0:
+                        while wait and numRunning > 0:
                             sleep(
                                 2,
                                 "{}: waiting for {} daemon to be stopped".format(
@@ -1883,7 +1883,11 @@ class Router(Node):
                                             )
                                         )
                                         self.cmd("kill -9 %s" % daemonpid)
-                                    self.cmd("rm -- {}".format(d.rstrip()))
+                                    if daemonpid.isdigit() and not pid_exists(
+                                        int(daemonpid)
+                                    ):
+                                        numRunning -= 1
+                        self.cmd("rm -- {}".format(d.rstrip()))
                     if wait:
                         errors = self.checkRouterCores(reportOnce=True)
                         if self.checkRouterVersion("<", minErrorVersion):
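
The change from a single retry to a loop can be illustrated outside the topotest framework. The following is a minimal standalone sketch, not the code in `topotest.py`: the names `kill_and_wait`, `posix_is_alive`, `posix_send_kill`, and the `max_rounds` cap are all assumptions made for this example. It keeps re-sending the kill and re-checking liveness until no tracked PID remains, which is the behavior the `while wait and numRunning > 0` loop aims for.

```python
import os
import signal
import time


def posix_is_alive(pid):
    """Best-effort liveness check (POSIX): signal 0 probes without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def posix_send_kill(pid):
    """Send SIGKILL, ignoring a process that has already exited."""
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass


def kill_and_wait(pids, send_kill=posix_send_kill, is_alive=posix_is_alive,
                  pause=lambda: time.sleep(2), max_rounds=30):
    """Kill every PID, then loop until none is left or max_rounds is hit.

    Returns the set of PIDs still alive (empty on success).
    max_rounds is a hypothetical safety cap added for this sketch.
    """
    for pid in pids:
        send_kill(pid)
    running = {pid for pid in pids if is_alive(pid)}

    rounds = 0
    while running and rounds < max_rounds:
        pause()  # give a loaded system time to tear the process down
        for pid in list(running):
            if is_alive(pid):
                send_kill(pid)  # still there: kill it again
            if not is_alive(pid):
                running.discard(pid)  # confirmed gone: stop tracking it
        rounds += 1
    return running
```

The liveness check and kill function are passed in as parameters so the loop can be exercised with fakes instead of real processes. The `max_rounds` bound is an addition of this sketch; the loop in the hunk shown above has no such cap and relies on the daemon eventually exiting.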