ceph/monitoring/ceph-mixin/tests_dashboards/features/radosgw_overview.feature

Feature: RGW Overview Dashboard

Scenario: "Test Average GET Latencies"
  Given the following series:
    | metrics | values |
    | ceph_rgw_get_initial_lat_sum{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 10 50 100 |
    | ceph_rgw_get_initial_lat_count{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 20 60 80 |
    | ceph_rgw_metadata{ceph_daemon="rgw.foo", hostname="localhost", instance="127.0.0.1", instance_id="58892247", job="ceph"} | 1 1 1 |
  When interval is `30s`
  Then Grafana panel `Average GET/PUT Latencies by RGW Instance` with legend `GET {{rgw_host}}` shows:
    | metrics | values |
    | {ceph_daemon="rgw.foo", instance="127.0.0.1", instance_id="58892247", job="ceph", rgw_host="foo"} | 1.5 |

Scenario: "Test Average PUT Latencies"
  Given the following series:
    | metrics | values |
    | ceph_rgw_put_initial_lat_sum{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 15 35 55 |
    | ceph_rgw_put_initial_lat_count{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 10 30 50 |
    | ceph_rgw_metadata{ceph_daemon="rgw.foo", hostname="localhost", instance="127.0.0.1", instance_id="58892247", job="ceph"} | 1 1 1 |
  When interval is `30s`
  Then Grafana panel `Average GET/PUT Latencies by RGW Instance` with legend `PUT {{rgw_host}}` shows:
    | metrics | values |
    | {ceph_daemon="rgw.foo", instance="127.0.0.1", instance_id="58892247", job="ceph", rgw_host="foo"} | 1 |

Scenario: "Test Total Requests/sec by RGW Instance"
  Given the following series:
    | metrics | values |
    | ceph_rgw_req{instance="127.0.0.1", instance_id="92806566", job="ceph"} | 10 50 100 |
    | ceph_rgw_metadata{ceph_daemon="rgw.1", hostname="localhost", instance="127.0.0.1", instance_id="92806566", job="ceph"} | 1 1 1 |
  When interval is `30s`
  Then Grafana panel `Total Requests/sec by RGW Instance` with legend `{{rgw_host}}` shows:
    | metrics | values |
    | {rgw_host="1"} | 1.5 |

Scenario: "Test GET Latencies by RGW Instance"
  Given the following series:
    | metrics | values |
    | ceph_rgw_get_initial_lat_sum{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 10 50 100 |
    | ceph_rgw_get_initial_lat_count{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 20 60 80 |
    | ceph_rgw_metadata{ceph_daemon="rgw.foo", hostname="localhost", instance="127.0.0.1", instance_id="58892247", job="ceph"} | 1 1 1 |
  When interval is `30s`
  Then Grafana panel `GET Latencies by RGW Instance` with legend `{{rgw_host}}` shows:
    | metrics | values |
    | {ceph_daemon="rgw.foo", instance="127.0.0.1", instance_id="58892247", job="ceph", rgw_host="foo"} | 1.5 |

Scenario: "Test Bandwidth Consumed by Type- GET"
  Given the following series:
    | metrics | values |
    | ceph_rgw_get_b{instance="127.0.0.1", instance_id="92806566", job="ceph"} | 10 50 100 |
  When evaluation time is `1m`
  And interval is `30s`
  Then Grafana panel `Bandwidth Consumed by Type` with legend `GETs` shows:
    | metrics | values |
    | {} | 1.5 |

Scenario: "Test Bandwidth Consumed by Type- PUT"
  Given the following series:
    | metrics | values |
    | ceph_rgw_put_b{instance="127.0.0.1", instance_id="92806566", job="ceph"} | 5 20 50 |
  When evaluation time is `1m`
  And interval is `30s`
  Then Grafana panel `Bandwidth Consumed by Type` with legend `PUTs` shows:
    | metrics | values |
    | {} | 7.5E-01 |

Scenario: "Test Bandwidth by RGW Instance"
  Given the following series:
    | metrics | values |
    | ceph_rgw_get_b{instance="127.0.0.1", instance_id="92806566", job="ceph"} | 10 50 100 |
    | ceph_rgw_put_b{instance="127.0.0.1", instance_id="92806566", job="ceph"} | 5 20 50 |
    | ceph_rgw_metadata{ceph_daemon="rgw.1", hostname="localhost", instance="127.0.0.1", instance_id="92806566", job="ceph"} | 1 1 1 |
  When evaluation time is `1m`
  And interval is `30s`
  Then Grafana panel `Bandwidth by RGW Instance` with legend `{{rgw_host}}` shows:
    | metrics | values |
    | {ceph_daemon="rgw.1", instance_id="92806566", rgw_host="1"} | 2.25 |

Scenario: "Test PUT Latencies by RGW Instance"
  Given the following series:
    | metrics | values |
    | ceph_rgw_put_initial_lat_sum{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 15 35 55 |
    | ceph_rgw_put_initial_lat_count{instance="127.0.0.1", instance_id="58892247", job="ceph"} | 10 30 50 |
    | ceph_rgw_metadata{ceph_daemon="rgw.foo", hostname="localhost", instance="127.0.0.1", instance_id="58892247", job="ceph"} | 1 1 1 |
  When evaluation time is `1m`
  And interval is `30s`
  Then Grafana panel `PUT Latencies by RGW Instance` with legend `{{rgw_host}}` shows:
    | metrics | values |
    | {ceph_daemon="rgw.foo", instance="127.0.0.1", instance_id="58892247", job="ceph", rgw_host="foo"} | 1 |

Scenario: "Test Total backend responses by HTTP code"
  Given the following series:
    | metrics | values |
    | haproxy_backend_http_responses_total{job="haproxy",code="200",instance="ingress.rgw.1",proxy="backend"} | 10 100 |
    | haproxy_backend_http_responses_total{job="haproxy",code="404",instance="ingress.rgw.1",proxy="backend"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  When variable `code` is `200`
  Then Grafana panel `Total responses by HTTP code` with legend `Backend {{ code }}` shows:
    | metrics | values |
    | {code="200"} | 1.5 |

Scenario: "Test Total frontend responses by HTTP code"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_http_responses_total{job="haproxy",code="200",instance="ingress.rgw.1",proxy="frontend"} | 10 100 |
    | haproxy_frontend_http_responses_total{job="haproxy",code="404",instance="ingress.rgw.1",proxy="frontend"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  When variable `code` is `200`
  Then Grafana panel `Total responses by HTTP code` with legend `Frontend {{ code }}` shows:
    | metrics | values |
    | {code="200"} | 1.5 |

Scenario: "Test Total http frontend requests by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_http_requests_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_http_requests_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Requests` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend response errors by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_response_errors_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_response_errors_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Response errors` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total frontend requests errors by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_request_errors_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_request_errors_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Requests errors` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend redispatch warnings by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_redispatch_warnings_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_redispatch_warnings_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Backend redispatch` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend retry warnings by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_retry_warnings_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_retry_warnings_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Backend retry` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total frontend requests denied by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_requests_denied_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_requests_denied_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Request denied` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend current queue by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_current_queue{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_current_queue{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total requests / responses` with legend `Backend Queued` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 200 |

Scenario: "Test Total frontend connections by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_connections_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_connections_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total number of connections` with legend `Front` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend connections attempts by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_connection_attempts_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_connection_attempts_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total number of connections` with legend `Back` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total backend connections error by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_connection_errors_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_connection_errors_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Total number of connections` with legend `Back errors` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 3 |

Scenario: "Test Total frontend bytes incoming by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_bytes_in_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_bytes_in_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Current total of incoming / outgoing bytes` with legend `IN Front` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 24 |

Scenario: "Test Total frontend bytes outgoing by instance"
  Given the following series:
    | metrics | values |
    | haproxy_frontend_bytes_out_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_frontend_bytes_out_total{job="haproxy",proxy="frontend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Current total of incoming / outgoing bytes` with legend `OUT Front` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 24 |

Scenario: "Test Total backend bytes incoming by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_bytes_in_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_bytes_in_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Current total of incoming / outgoing bytes` with legend `IN Back` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 24 |

Scenario: "Test Total backend bytes outgoing by instance"
  Given the following series:
    | metrics | values |
    | haproxy_backend_bytes_out_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 10 100 |
    | haproxy_backend_bytes_out_total{job="haproxy",proxy="backend",instance="ingress.rgw.1"} | 20 200 |
  When variable `ingress_service` is `ingress.rgw.1`
  Then Grafana panel `Current total of incoming / outgoing bytes` with legend `OUT Back` shows:
    | metrics | values |
    | {instance="ingress.rgw.1"} | 24 |