Create log-based alerting and recording rules
What you’ll accomplish
Define alerting conditions based on log patterns using Loki Ruler, and optionally derive Prometheus metrics from logs using recording rules. Loki alerting rules send alerts to Alertmanager; recording rules create metrics stored in Prometheus.
Prerequisites
Before you begin
Loki Ruler provides two complementary capabilities:
- Loki alerting rules detect log patterns and send alerts directly to Alertmanager. Use these when you want to be notified about specific log events like error spikes or missing logs.
- Loki recording rules create Prometheus metrics from log queries. These are useful for building dashboards and for enabling metric-based alerting on log data.
Rules are deployed via ConfigMaps labeled `loki_rule: "1"`. The Loki sidecar watches for these ConfigMaps and loads them automatically — no restart required.
Create Loki alerting rules
Define a ConfigMap containing your alerting rules. The `loki_rule: "1"` label is required for the Loki sidecar to discover it.

loki-alerting-rules.yaml

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-alert-rules
  namespace: my-app-namespace
  labels:
    loki_rule: "1"
data:
  rules.yaml: |
    groups:
      - name: my-app-alerts
        rules:
          - alert: ApplicationErrors
            expr: |
              sum(rate({namespace="my-app-namespace"} |= "ERROR" [5m])) > 0.05
            for: 2m
            labels:
              severity: warning
              service: my-app
            annotations:
              summary: "High error rate for my-app"
              runbook_url: "https://wiki.company.com/runbooks/my-app-errors"
          - alert: ApplicationLogsDown
            expr: |
              absent_over_time({namespace="my-app-namespace",app="my-app"}[5m])
            for: 1m
            labels:
              severity: critical
              service: my-app
            annotations:
              summary: "Application is not producing logs"
              description: "No logs received from application for 5 minutes"
```

Key fields in each alerting rule:

- `expr` — A LogQL expression that defines the alert condition. `rate()` counts log lines per second matching a filter; `absent_over_time()` fires when no logs match within the window.
- `for` — How long the condition must be true before the alert fires. This prevents transient spikes from triggering notifications.
- `labels` — Attached to the alert and used by Alertmanager for routing and grouping (e.g., `severity`, `service`).
- `annotations` — Human-readable metadata like `summary` and `runbook_url` that appear in alert notifications.
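To make the `0.05` threshold concrete: `rate()` returns log lines per second averaged over the window, so the `ApplicationErrors` condition corresponds to roughly 15 ERROR lines per 5 minutes. A quick Python sketch of that arithmetic (the helper function is illustrative, not part of Loki):

```python
# rate(...[5m]) in LogQL is per-second, averaged over the window, so a
# threshold of 0.05 lines/sec over 5 minutes equals 0.05 * 300 = 15 lines.
WINDOW_SECONDS = 5 * 60
THRESHOLD_PER_SECOND = 0.05

def breaches_threshold(error_lines: int) -> bool:
    """True if this many ERROR lines in the window exceeds the alert threshold."""
    return error_lines / WINDOW_SECONDS > THRESHOLD_PER_SECOND

print(THRESHOLD_PER_SECOND * WINDOW_SECONDS)  # 15.0 errors per window at the threshold
print(breaches_threshold(20))  # True
print(breaches_threshold(10))  # False
```

Working backward from "how many errors should page us?" to a per-second rate is usually easier than picking the rate directly.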
Optional: Create recording rules
Recording rules evaluate LogQL queries on a schedule and store the results as Prometheus metrics. This is useful when you want to build dashboards from log data or create metric-based alerts that are more efficient than repeated log queries.
loki-recording-rules.yaml

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-recording-rules
  namespace: my-app-namespace
  labels:
    loki_rule: "1"
data:
  recording-rules.yaml: |
    groups:
      - name: my-app-metrics
        interval: 30s
        rules:
          - record: my_app:request_rate
            expr: |
              sum(rate({namespace="my-app-namespace",app="my-app"} |= "REQUEST" [1m]))
          - record: my_app:error_rate
            expr: |
              sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
          - record: my_app:error_percentage
            expr: |
              (
                sum(rate({namespace="my-app-namespace",app="my-app"} |= "ERROR" [1m]))
                /
                sum(rate({namespace="my-app-namespace",app="my-app"} [1m]))
              ) * 100
```

Each `record` entry defines a Prometheus metric name (e.g., `my_app:error_rate`) and a LogQL expression that produces its value. The `interval` field controls how often the rules are evaluated — `30s` is a good starting point.
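The `my_app:error_percentage` rule is just a ratio of two rates. This Python sketch mirrors that arithmetic with made-up rate values (note: when the denominator is zero, LogQL emits no sample rather than 0 — the guard below exists only to keep the sketch total):

```python
def error_percentage(error_rate: float, total_rate: float) -> float:
    """Mirror of my_app:error_percentage: (error rate / total rate) * 100."""
    if total_rate == 0:
        return 0.0  # LogQL would produce no sample here; guard is sketch-only
    return (error_rate / total_rate) * 100

# Hypothetical per-second rates: 0.5 ERROR lines/sec out of 10 total lines/sec
print(error_percentage(0.5, 10.0))  # 5.0
```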
Optional: Alert on recorded metrics
Once recording rules produce Prometheus metrics, you can create standard Prometheus alerting rules against them using a `PrometheusRule` CR. This combines log-derived data with the full power of PromQL alerting.

prometheus-rule-from-logs.yaml

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-prometheus-alerts
  namespace: my-app-namespace
  labels:
    prometheus: kube-prometheus-stack-prometheus
spec:
  groups:
    - name: my-app-prometheus-alerts
      rules:
        - alert: HighErrorPercentage
          expr: my_app:error_percentage > 5
          for: 5m
          labels:
            severity: warning
            service: my-app
          annotations:
            description: "High error rate on my-app"
            runbook_url: "https://wiki.company.com/runbooks/my-app-high-errors"
```
Deploy your rules
(Recommended) Include your rule ConfigMaps and any PrometheusRule CRs in your Zarf package and create/deploy. See Packaging applications for general packaging guidance.
```bash
uds zarf package create --confirm
uds zarf package deploy zarf-package-*.tar.zst --confirm
```

Or apply the manifests directly for quick testing:

```bash
uds zarf tools kubectl apply -f loki-alerting-rules.yaml
uds zarf tools kubectl apply -f loki-recording-rules.yaml      # if using recording rules
uds zarf tools kubectl apply -f prometheus-rule-from-logs.yaml # if alerting on recorded metrics
```
Verification
Confirm your rules are active:
- Alerting rules: Open Grafana and navigate to Alerting > Alert rules. Filter by the Loki datasource. Your alerting rules (e.g., `ApplicationErrors`, `ApplicationLogsDown`) should appear in the list.
- Recording rules: Open Grafana Explore, select the Prometheus datasource, and query a recorded metric name (e.g., `my_app:error_rate`). If the metric returns data, the recording rule is working.
```bash
# Verify the ConfigMaps were created with the correct label
uds zarf tools kubectl get configmap -A -l loki_rule=1
```

Troubleshooting
Problem: Rules not loading in Loki
Symptom: Rules do not appear in Grafana Alerting, or recorded metrics are not available in Prometheus.
Solution: Verify the ConfigMap has the `loki_rule: "1"` label and that the YAML under the `data` key is valid.
```bash
# Check that labeled ConfigMaps exist
uds zarf tools kubectl get configmap -A -l loki_rule=1

# Inspect a specific ConfigMap for YAML errors
uds zarf tools kubectl get configmap my-app-alert-rules -n my-app-namespace -o yaml
```

If the ConfigMap exists but rules still aren’t loading, check the Loki sidecar logs for parsing errors:

```bash
# loki-sc-rules is the rules sidecar container
uds zarf tools kubectl logs -n loki -l app.kubernetes.io/name=loki -c loki-sc-rules --tail=50
```

Problem: Alert not firing
Symptom: The alerting rule appears in Grafana but stays in the Normal or Pending state.
Solution: Verify the LogQL expression returns results. Open Grafana Explore, select the Loki datasource, and run the `expr` from your rule. If it returns no data, check that logs are actually being ingested for the target namespace and application. Also confirm that the `for` duration has elapsed — the condition must be true continuously for the specified period.
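The interaction between the alert condition and the `for` duration can be sketched as a streak counter over evaluation cycles (a simplified model for intuition, not the ruler's actual implementation):

```python
def alert_fires(samples, threshold=5.0, for_evals=5):
    """Fire only after the condition holds for `for_evals` consecutive
    evaluations, mirroring how `for:` keeps an alert Pending until the
    condition is continuously true; any dip below the threshold resets it."""
    streak = 0
    for value in samples:
        streak = streak + 1 if value > threshold else 0
        if streak >= for_evals:
            return True
    return False

print(alert_fires([6, 6, 6, 6, 6]))        # True: above threshold for all 5 evaluations
print(alert_fires([6, 6, 4, 6, 6, 6, 6]))  # False: the dip resets the streak
```

This is why a condition that is intermittently true can sit in Pending indefinitely without ever firing.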
Related Documentation
- Grafana Loki: Alerting and recording rules — Loki ruler configuration reference
- Grafana Loki: LogQL — query language documentation
Next steps
These guides may be useful to explore next: