
Create log-based alerting and recording rules

Define alerting conditions based on log patterns using Loki Ruler, and optionally derive Prometheus metrics from logs using recording rules. Loki alerting rules send alerts to Alertmanager; recording rules create metrics stored in Prometheus.

  • UDS CLI installed
  • Access to a Kubernetes cluster with UDS Core deployed
  • Familiarity with LogQL

Loki Ruler provides two complementary capabilities:

  1. Loki alerting rules detect log patterns and send alerts directly to Alertmanager. Use these when you want to be notified about specific log events like error spikes or missing logs.
  2. Loki recording rules create Prometheus metrics from log queries. These are useful for building dashboards and for enabling metric-based alerting on log data.

Rules are deployed via ConfigMaps labeled loki_rule: "1". The Loki sidecar watches for these ConfigMaps and loads them automatically — no restart required.

  1. Create Loki alerting rules

    Define a ConfigMap containing your alerting rules. The loki_rule: "1" label is required for the Loki sidecar to discover it.

    loki-alerting-rules.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-app-alert-rules
      namespace: my-app-namespace
      labels:
        loki_rule: "1"
    data:
      rules.yaml: |
        groups:
          - name: my-app-alerts
            rules:
              - alert: ApplicationErrors
                expr: |
                  sum(rate({namespace="my-app-namespace"} |= "ERROR" [5m])) > 0.05
                for: 2m
                labels:
                  severity: warning
                  service: my-app
                annotations:
                  summary: "High error rate for my-app"
                  runbook_url: "https://wiki.company.com/runbooks/my-app-errors"
              - alert: ApplicationLogsDown
                expr: |
                  absent_over_time({namespace="my-app-namespace",app="my-app"}[5m])
                for: 1m
                labels:
                  severity: critical
                  service: my-app
                annotations:
                  summary: "Application is not producing logs"
                  description: "No logs received from application for 5 minutes"

    Key fields in each alerting rule:

    • expr — A LogQL expression that defines the alert condition. rate() counts log lines per second matching a filter; absent_over_time() fires when no logs match within the window.
    • for — How long the condition must be true before the alert fires. This prevents transient spikes from triggering notifications.
    • labels — Attached to the alert and used by Alertmanager for routing and grouping (e.g., severity, service).
    • annotations — Human-readable metadata like summary and runbook_url that appear in alert notifications.
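
    Before deploying, you can sanity-check the alert expression against live data through Loki's HTTP query API. This is a sketch: the gateway service name, namespace, and port (`svc/loki-gateway` on port 80 in the `loki` namespace) are assumptions; adjust them to match your deployment.

    ```shell
    # Port-forward to the Loki gateway (service name/port are assumptions)
    uds zarf tools kubectl port-forward -n loki svc/loki-gateway 3100:80 &

    # Evaluate the rule's expression as an instant metric query
    curl -sG http://localhost:3100/loki/api/v1/query \
      --data-urlencode 'query=sum(rate({namespace="my-app-namespace"} |= "ERROR" [5m]))'
    ```

    A non-empty result vector means the selector matches live logs; the returned value also helps you pick a sensible threshold for the `> 0.05` comparison.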
  2. Optional: Create recording rules

    Recording rules evaluate LogQL queries on a schedule and store the results as Prometheus metrics. This is useful when you want to build dashboards from log data or create metric-based alerts that are more efficient than repeated log queries.

    loki-recording-rules.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-app-recording-rules
      namespace: my-app-namespace
      labels:
        loki_rule: "1"
    data:
      recording-rules.yaml: |
        groups:
          - name: my-app-metrics
            interval: 30s
            rules:
              - record: my_app:request_rate
                expr: |
                  sum(rate({namespace="my-app-namespace",app="my-app"} |= "REQUEST" [1m]))
              - record: my_app:error_rate
                expr: |
                  sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
              - record: my_app:error_percentage
                expr: |
                  (
                    sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
                    /
                    sum(rate({namespace="my-app-namespace",app="my-app"}[1m]))
                  ) * 100

    Each record entry defines a Prometheus metric name (e.g., my_app:error_rate) and a LogQL expression that produces its value. The interval field controls how often the rules are evaluated — 30s is a good starting point.

  3. Optional: Alert on recorded metrics

    Once recording rules produce Prometheus metrics, you can create standard Prometheus alerting rules against them using a PrometheusRule CR. This combines log-derived data with the full power of PromQL alerting.

    prometheus-rule-from-logs.yaml
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-app-prometheus-alerts
      namespace: my-app-namespace
      labels:
        prometheus: kube-prometheus-stack-prometheus
    spec:
      groups:
        - name: my-app-prometheus-alerts
          rules:
            - alert: HighErrorPercentage
              expr: my_app:error_percentage > 5
              for: 5m
              labels:
                severity: warning
                service: my-app
              annotations:
                description: "High error rate on my-app"
                runbook_url: "https://wiki.company.com/runbooks/my-app-high-errors"
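
    The severity and service labels attached to these alerts are what Alertmanager uses for routing. As a sketch, a route like the following would send critical my-app alerts to a dedicated receiver; the receiver names here are hypothetical, and your Alertmanager configuration is managed by your deployment:

    ```yaml
    # Alertmanager routing sketch -- receiver names are hypothetical
    route:
      receiver: default
      routes:
        - matchers:
            - service = "my-app"
            - severity = "critical"
          receiver: my-app-pager
    ```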
  4. Deploy your rules

    (Recommended) Include your rule ConfigMaps and any PrometheusRule CRs in your Zarf package, then create and deploy the package. See Packaging applications for general packaging guidance.

    Terminal window
    uds zarf package create --confirm
    uds zarf package deploy zarf-package-*.tar.zst --confirm

    Or apply the manifests directly for quick testing:

    Terminal window
    uds zarf tools kubectl apply -f loki-alerting-rules.yaml
    uds zarf tools kubectl apply -f loki-recording-rules.yaml # if using recording rules
    uds zarf tools kubectl apply -f prometheus-rule-from-logs.yaml # if alerting on recorded metrics

Confirm your rules are active:

  • Alerting rules: Open Grafana and navigate to Alerting > Alert rules. Filter by the Loki datasource. Your alerting rules (e.g., ApplicationErrors, ApplicationLogsDown) should appear in the list.
  • Recording rules: Open Grafana Explore, select the Prometheus datasource, and query a recorded metric name (e.g., my_app:error_rate). If the metric returns data, the recording rule is working.
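
You can also verify recorded metrics from the command line by querying the Prometheus HTTP API directly. The service name and namespace below follow kube-prometheus-stack defaults and are assumptions; adjust them to match your deployment.

```shell
# Port-forward to Prometheus (service name/namespace are assumptions)
uds zarf tools kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090 &

# Query a recorded metric; a non-empty "result" array means the rule is evaluating
curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=my_app:error_rate'
```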
Terminal window
# Verify the ConfigMaps were created with the correct label
uds zarf tools kubectl get configmap -A -l loki_rule=1

Symptom: Rules do not appear in Grafana Alerting, or recorded metrics are not available in Prometheus.

Solution: Verify the ConfigMap has the loki_rule: "1" label and that the YAML under the data key is valid.

Terminal window
# Check that labeled ConfigMaps exist
uds zarf tools kubectl get configmap -A -l loki_rule=1
# Inspect a specific ConfigMap for YAML errors
uds zarf tools kubectl get configmap my-app-alert-rules -n my-app-namespace -o yaml

If the ConfigMap exists but rules still aren’t loading, check the Loki sidecar logs for parsing errors:

Terminal window
uds zarf tools kubectl logs -n loki -l app.kubernetes.io/name=loki -c loki-sc-rules --tail=50 # rules sidecar container

Symptom: The alerting rule appears in Grafana but stays in the Normal or Pending state.

Solution: Verify the LogQL expression returns results. Open Grafana Explore, select the Loki datasource, and run the expr from your rule. If it returns no data, check that logs are actually being ingested for the target namespace and application. Also confirm that the for duration has elapsed — the condition must be true continuously for the specified period.
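
To confirm logs are being ingested at all for the target namespace, you can ask Loki which label values it has seen. As above, the gateway service name and port are assumptions; adjust them to your deployment.

```shell
# Port-forward to the Loki gateway (service name/port are assumptions)
uds zarf tools kubectl port-forward -n loki svc/loki-gateway 3100:80 &

# List namespaces Loki has received logs for; yours should appear in the list
curl -sG http://localhost:3100/loki/api/v1/label/namespace/values
```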

These guides may be useful to explore next: