
Create log-based alerting and recording rules

Define alerting conditions based on log patterns using Loki Ruler, and optionally derive Prometheus metrics from logs using recording rules. Loki alerting rules send alerts to Alertmanager; recording rules create metrics stored in Prometheus.

  • UDS CLI installed
  • Access to a Kubernetes cluster with UDS Core deployed
  • Familiarity with LogQL

Loki Ruler provides two complementary capabilities:

  1. Loki alerting rules detect log patterns and send alerts directly to Alertmanager. Use these when you want to be notified about specific log events like error spikes or missing logs.
  2. Loki recording rules create Prometheus metrics from log queries. These are useful for building dashboards and for enabling metric-based alerting on log data.

Rules are deployed via ConfigMaps labeled loki_rule: "1". The Loki sidecar watches for these ConfigMaps and loads them automatically — no restart required.

  1. Create Loki alerting rules

    Define a ConfigMap containing your alerting rules. The loki_rule: "1" label is required for the Loki sidecar to discover it.

    loki-alerting-rules.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-app-alert-rules
      namespace: my-app-namespace
      labels:
        loki_rule: "1"
    data:
      rules.yaml: |
        groups:
          - name: my-app-alerts
            rules:
              - alert: ApplicationErrors
                expr: |
                  sum(rate({namespace="my-app-namespace"} |= "ERROR" [5m])) > 0.05
                for: 2m
                labels:
                  severity: warning
                  service: my-app
                annotations:
                  summary: "High error rate for my-app"
                  runbook_url: "https://wiki.company.com/runbooks/my-app-errors"
              - alert: ApplicationLogsDown
                expr: |
                  absent_over_time({namespace="my-app-namespace",app="my-app"}[5m])
                for: 1m
                labels:
                  severity: critical
                  service: my-app
                annotations:
                  summary: "Application is not producing logs"
                  description: "No logs received from application for 5 minutes"

    Key fields in each alerting rule:

    • expr — A LogQL expression that defines the alert condition. rate() counts log lines per second matching a filter; absent_over_time() fires when no logs match within the window.
    • for — How long the condition must be true before the alert fires. This prevents transient spikes from triggering notifications.
    • labels — Attached to the alert and used by Alertmanager for routing and grouping (e.g., severity, service).
    • annotations — Human-readable metadata like summary and runbook_url that appear in alert notifications.
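
    Before deploying, you can sanity-check the alert expression against live data through Loki's HTTP query API. This is a sketch: the gateway service name, namespace, and port (`svc/loki-gateway` on port 80 in the `loki` namespace) are assumptions; adjust them to match your deployment.

    ```shell
    # Port-forward to the Loki gateway (service name/port are assumptions)
    uds zarf tools kubectl port-forward -n loki svc/loki-gateway 3100:80 &

    # Evaluate the rule's expression as an instant metric query
    curl -sG http://localhost:3100/loki/api/v1/query \
      --data-urlencode 'query=sum(rate({namespace="my-app-namespace"} |= "ERROR" [5m]))'
    ```

    A non-empty result vector means the selector matches live logs; the returned value also helps you pick a sensible threshold for the `> 0.05` comparison.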
  2. Optional: Create recording rules

    Recording rules evaluate LogQL queries on a schedule and store the results as Prometheus metrics. This is useful when you want to build dashboards from log data or create metric-based alerts that are more efficient than repeated log queries.

    loki-recording-rules.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-app-recording-rules
      namespace: my-app-namespace
      labels:
        loki_rule: "1"
    data:
      recording-rules.yaml: |
        groups:
          - name: my-app-metrics
            interval: 30s
            rules:
              - record: my_app:request_rate
                expr: |
                  sum(rate({namespace="my-app-namespace",app="my-app"} |= "REQUEST" [1m]))
              - record: my_app:error_rate
                expr: |
                  sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
              - record: my_app:error_percentage
                expr: |
                  (
                    sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
                    /
                    sum(rate({namespace="my-app-namespace",app="my-app"}[1m]))
                  ) * 100

    Each record entry defines a Prometheus metric name (e.g., my_app:error_rate) and a LogQL expression that produces its value. The interval field controls how often the rules are evaluated — 30s is a good starting point.

  3. Optional: Alert on recorded metrics

    Once recording rules produce Prometheus metrics, you can create standard Prometheus alerting rules against them using a PrometheusRule CR. This combines log-derived data with the full power of PromQL alerting.

    prometheus-rule-from-logs.yaml
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-app-prometheus-alerts
      namespace: my-app-namespace
      labels:
        prometheus: kube-prometheus-stack-prometheus
    spec:
      groups:
        - name: my-app-prometheus-alerts
          rules:
            - alert: HighErrorPercentage
              expr: my_app:error_percentage > 5
              for: 5m
              labels:
                severity: warning
                service: my-app
              annotations:
                description: "High error rate on my-app"
                runbook_url: "https://wiki.company.com/runbooks/my-app-high-errors"
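
    The severity and service labels attached to these alerts are what Alertmanager uses for routing. As a sketch, a route like the following would send critical my-app alerts to a dedicated receiver; the receiver names here are hypothetical, and your Alertmanager configuration is managed by your deployment:

    ```yaml
    # Alertmanager routing sketch -- receiver names are hypothetical
    route:
      receiver: default
      routes:
        - matchers:
            - service = "my-app"
            - severity = "critical"
          receiver: my-app-pager
    ```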
  4. Deploy your rules

    (Recommended) Include your rule ConfigMaps and any PrometheusRule CRs in your Zarf package, then create and deploy the package. See Packaging applications for general packaging guidance.

    Terminal window
    uds zarf package create --confirm
    uds zarf package deploy zarf-package-*.tar.zst --confirm

    Or apply the manifests directly for quick testing:

    Terminal window
    uds zarf tools kubectl apply -f loki-alerting-rules.yaml
    uds zarf tools kubectl apply -f loki-recording-rules.yaml # if using recording rules
    uds zarf tools kubectl apply -f prometheus-rule-from-logs.yaml # if alerting on recorded metrics

Confirm your rules are active:

  • Alerting rules: Open Grafana and navigate to Alerting > Alert rules. Filter by the Loki datasource. Your alerting rules (e.g., ApplicationErrors, ApplicationLogsDown) should appear in the list.
  • Recording rules: Open Grafana Explore, select the Prometheus datasource, and query a recorded metric name (e.g., my_app:error_rate). If the metric returns data, the recording rule is working.
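
You can also verify recorded metrics from the command line by querying the Prometheus HTTP API directly. The service name and namespace below follow kube-prometheus-stack defaults and are assumptions; adjust them to match your deployment.

```shell
# Port-forward to Prometheus (service name/namespace are assumptions)
uds zarf tools kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090 &

# Query a recorded metric; a non-empty "result" array means the rule is evaluating
curl -sG http://localhost:9090/api/v1/query --data-urlencode 'query=my_app:error_rate'
```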
Terminal window
# Verify the ConfigMaps were created with the correct label
uds zarf tools kubectl get configmap -A -l loki_rule=1

Symptom: Rules do not appear in Grafana Alerting, or recorded metrics are not available in Prometheus.

Solution: Verify the ConfigMap has the loki_rule: "1" label and that the YAML under the data key is valid.

Terminal window
# Check that labeled ConfigMaps exist
uds zarf tools kubectl get configmap -A -l loki_rule=1
# Inspect a specific ConfigMap for YAML errors
uds zarf tools kubectl get configmap my-app-alert-rules -n my-app-namespace -o yaml

If the ConfigMap exists but rules still aren’t loading, check the Loki sidecar logs for parsing errors:

Terminal window
uds zarf tools kubectl logs -n loki -l app.kubernetes.io/name=loki -c loki-sc-rules --tail=50 # rules sidecar container

Symptom: The alerting rule appears in Grafana but stays in the Normal or Pending state.

Solution: Verify the LogQL expression returns results. Open Grafana Explore, select the Loki datasource, and run the expr from your rule. If it returns no data, check that logs are actually being ingested for the target namespace and application. Also confirm that the for duration has elapsed — the condition must be true continuously for the specified period.
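
To confirm logs are being ingested at all for the target namespace, you can ask Loki which label values it has seen. As above, the gateway service name and port are assumptions; adjust them to your deployment.

```shell
# Port-forward to the Loki gateway (service name/port are assumptions)
uds zarf tools kubectl port-forward -n loki svc/loki-gateway 3100:80 &

# List namespaces Loki has received logs for; yours should appear in the list
curl -sG http://localhost:3100/loki/api/v1/label/namespace/values
```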

These guides may be useful to explore next: