Set up uptime monitoring
What you’ll accomplish
Monitor HTTPS endpoint availability using Blackbox Exporter probes. Probes are configured through the UDS Package CR’s `uptime` block: the operator automatically creates Prometheus Probe resources and configures Blackbox Exporter. You can monitor simple health checks, custom paths, and even Authservice-protected applications without additional setup.
Prerequisites
- UDS CLI installed
- Access to a Kubernetes cluster with UDS Core deployed
- An application exposed via the Package CR `expose` block
Before you begin
- Add uptime checks to a Package CR

  Add `uptime.checks.paths` to an `expose` entry in your Package CR. This creates a Prometheus Probe that issues HTTP GET requests at a regular interval and checks for a successful (2xx) response.

  ```yaml
  # package.yaml
  apiVersion: uds.dev/v1alpha1
  kind: Package
  metadata:
    name: my-app
    namespace: my-app
  spec:
    network:
      expose:
        # monitors: https://myapp.uds.dev/
        - service: my-app
          host: myapp
          gateway: tenant
          port: 8080
          uptime:
            checks:
              paths:
                - /
  ```
- Optional: Monitor custom health endpoints

  Specify multiple paths to monitor specific health endpoints on a single service.

  ```yaml
  # package.yaml
  spec:
    network:
      expose:
        # monitors: https://myapp.uds.dev/health and https://myapp.uds.dev/ready
        - service: my-app
          host: myapp
          gateway: tenant
          port: 8080
          uptime:
            checks:
              paths:
                - /health
                - /ready
  ```
- Optional: Monitor multiple services

  Add uptime checks to multiple expose entries within a single Package CR to monitor several services at once.

  ```yaml
  # package.yaml
  spec:
    network:
      expose:
        # monitors: https://app.uds.dev/healthz, https://api.uds.dev/health,
        #           https://api.uds.dev/ready, https://app.admin.uds.dev/
        - service: frontend
          host: app
          gateway: tenant
          port: 3000
          uptime:
            checks:
              paths:
                - /healthz
        - service: api
          host: api
          gateway: tenant
          port: 8080
          uptime:
            checks:
              paths:
                - /health
                - /ready
        - service: admin
          host: app
          gateway: admin
          port: 8080
          uptime:
            checks:
              paths:
                - /
  ```
- Optional: Monitor Authservice-protected applications

  For applications protected by Authservice, add `uptime.checks` to the expose entry as normal. The UDS Operator detects the `enableAuthserviceSelector` on the matching SSO entry and automatically:

  - Creates a Keycloak service account client (`<clientId>-probe`) with an audience mapper scoped to the application’s SSO client
  - Configures the Blackbox Exporter with an OAuth2 module that obtains a token via client credentials before probing

  No additional configuration is required beyond adding `uptime.checks.paths`:

  ```yaml
  # package.yaml
  apiVersion: uds.dev/v1alpha1
  kind: Package
  metadata:
    name: my-app
    namespace: my-app
  spec:
    sso:
      - name: My App
        clientId: uds-my-app
        redirectUris:
          - "https://myapp.uds.dev/login"
        enableAuthserviceSelector:
          app: my-app
    network:
      expose:
        - service: my-app
          host: myapp
          gateway: tenant
          port: 8080
          uptime:
            checks:
              paths:
                - /healthz
  ```

  The operator matches the expose entry to the SSO entry via the redirect URI origin (`https://myapp.uds.dev`) and configures the probe to authenticate transparently through Authservice.
- Deploy your Package

  (Recommended) Include the Package CR in your Zarf package and create/deploy. See Packaging applications for general packaging guidance.

  ```bash
  uds zarf package create --confirm
  uds zarf package deploy zarf-package-*.tar.zst --confirm
  ```

  Or apply the Package CR directly for quick testing:

  ```bash
  uds zarf tools kubectl apply -f package.yaml
  ```
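Because the operator creates Prometheus Probe resources from the `uptime` block, one quick post-deploy check is to list them. The sketch below assumes the Probe CRD is registered as `probes.monitoring.coreos.com` (the Prometheus Operator resource name) and searches every namespace, since this guide does not state where the Probe is created. `list_uptime_probes` is a hypothetical helper name.

```shell
# Hypothetical helper: list Prometheus Operator Probe resources in every namespace.
# Assumption: the Probe CRD is registered as probes.monitoring.coreos.com.
list_uptime_probes() {
  uds zarf tools kubectl get probes.monitoring.coreos.com --all-namespaces
}

# Usage (requires cluster access):
#   list_uptime_probes
```

If nothing is listed shortly after the Package is reconciled, the Package CR status is the next place to look.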
Verification
Confirm uptime monitoring is working:
- Open Grafana and navigate to Dashboards then UDS / Monitoring / Probe Uptime to see the uptime dashboard
- The dashboard displays uptime status timeline, percentage uptime, and TLS certificate expiration dates
- Query `probe_success` in Grafana Explore to check individual probe status
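To narrow the query to one endpoint, the `instance` label can be matched against the full probed URL, as in the example queries in this guide. The helper below is a small sketch that assembles that selector from an expose entry's host, the gateway domain, and a path. `probe_query` is a hypothetical name, and the label is assumed to be the plain `https://<host>.<domain><path>` URL.

```shell
# Hypothetical helper: build a probe_success selector for one monitored URL.
# Arguments: host (e.g. myapp), domain (e.g. uds.dev), path (e.g. /health)
# Assumption: the instance label is the full probed URL.
probe_query() {
  printf 'probe_success{instance="https://%s.%s%s"}\n' "$1" "$2" "$3"
}

probe_query myapp uds.dev /health
# prints: probe_success{instance="https://myapp.uds.dev/health"}
```

The printed expression can be pasted into Grafana Explore. For an entry on the admin gateway, pass the admin domain (for example `admin.uds.dev`) as the second argument.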
Available metrics
Blackbox Exporter provides the following key metrics for alerting and dashboarding:
| Metric | Description |
|---|---|
| `probe_success` | Whether the probe succeeded (1) or failed (0) |
| `probe_duration_seconds` | Total probe duration |
| `probe_http_status_code` | HTTP response status code |
| `probe_ssl_earliest_cert_expiry` | SSL certificate expiration timestamp |
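`probe_ssl_earliest_cert_expiry` is reported as a Unix timestamp in seconds, so converting it to days remaining is a subtraction followed by a division by 86400. The sketch below shows that arithmetic with a hypothetical `days_until_expiry` helper. In PromQL the same calculation is commonly written as `(probe_ssl_earliest_cert_expiry - time()) / 86400`.

```shell
# Hypothetical helper: whole days between "now" and a certificate expiry timestamp.
# Both arguments are Unix timestamps in seconds; "now" defaults to the current time.
days_until_expiry() {
  expiry="$1"
  now="${2:-$(date +%s)}"
  echo $(( (expiry - now) / 86400 ))
}

# Example: expiry at 2026-01-01T00:00:00Z, evaluated 30 days earlier
days_until_expiry 1767225600 1764633600   # prints 30
```

Shell integer division truncates toward zero, and a negative result means the certificate has already expired.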
Example PromQL queries:

```promql
# Check all probes and their success status
probe_success

# Check if a specific endpoint is up
probe_success{instance="https://myapp.uds.dev/health"}
```

Troubleshooting
Problem: Probe showing as failed
Symptom: The uptime dashboard shows a probe in a failed state.
Solution: Verify the endpoint is reachable from within the cluster. Check application health and any network policies that might block the probe.
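Since the probe counts only 2xx responses as success, a manual request that reports the status code can help separate an application problem from a network one. This is a minimal sketch, assuming `curl` is available wherever you run it (ideally from a pod inside the cluster, so the request takes a path similar to the probe's). `check_endpoint` is a hypothetical helper.

```shell
# Hypothetical helper: request a URL and report whether the status code is 2xx.
# Assumption: curl is installed in the environment where this runs.
check_endpoint() {
  url="$1"
  # -o /dev/null discards the body; -w prints only the HTTP status code.
  code="$(curl -sS -o /dev/null -w '%{http_code}' --max-time 10 "$url")" || code="000"
  case "$code" in
    2??) echo "OK $code $url" ;;
    *)   echo "FAIL $code $url"; return 1 ;;
  esac
}

# Usage:
#   check_endpoint https://myapp.uds.dev/health
```

A `FAIL 000` result means no HTTP response was received at all, which points to DNS, connectivity, or network policy rather than the application.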
Problem: Probe not appearing
Symptom: No probe data shows up in Grafana after applying the Package CR.
Solution: Verify `uptime.checks.paths` is set in the expose entry. Check Package CR status:

```bash
uds zarf tools kubectl describe package <name> -n <namespace>
```

Problem: Authservice-protected probe failing
Symptom: Probe returns authentication errors for an SSO-protected application.
Solution: Check that the probe Keycloak client was created by reviewing operator logs. Verify the SSO entry’s redirect URI origin matches the expose entry’s FQDN.
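The origin comparison can be checked by hand before digging into logs. The sketch below extracts the origin from a redirect URI and compares it with the FQDN implied by an expose entry. Both helper names are hypothetical, and the `https://<host>.<domain>` form is assumed from the examples in this guide (host `myapp` on domain `uds.dev`).

```shell
# Hypothetical helper: print the origin (scheme://host[:port]) of a URI.
uri_origin() {
  printf '%s\n' "$1" | sed -E 's#^(https?://[^/]+).*#\1#'
}

# Hypothetical helper: succeed when the redirect URI's origin equals https://<host>.<domain>.
# Arguments: redirect URI, expose host, gateway domain
redirect_matches_expose() {
  [ "$(uri_origin "$1")" = "https://$2.$3" ]
}

redirect_matches_expose "https://myapp.uds.dev/login" myapp uds.dev && echo "origins match"
# prints: origins match
```

If the helper fails for your values, the operator has no SSO entry to pair with the expose entry, which would explain an unauthenticated probe.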
Related Documentation
- Prometheus: Blackbox Exporter (upstream project documentation)
- Prometheus Operator: Probe API (Probe CRD field reference)
Next steps
These guides and concepts may be useful to explore next: