Service Mesh
What you’ll accomplish
You’ll configure Istio’s control plane (istiod) and ingress gateways for production high availability by increasing minimum replica counts, tuning resource allocation, and verifying that pod anti-affinity is spreading replicas across nodes.
Istio’s control plane manages service discovery, certificate rotation, and configuration distribution for the entire mesh. If istiod becomes unavailable, new connections cannot be established and configuration changes stop propagating. The ingress gateways are the entry point for all external traffic — if a gateway goes down, traffic to the applications it serves is interrupted.
Prerequisites
- UDS CLI installed
- Access to a Kubernetes cluster (multi-node, multi-AZ recommended)
Before you begin
UDS Core configures istiod with two HA mechanisms out of the box:
- Horizontal Pod Autoscaler (HPA): enabled by default, scaling between 1 and 5 replicas based on CPU utilization
- Pod anti-affinity: preferredDuringSchedulingIgnoredDuringExecution anti-affinity, which tells Kubernetes to prefer scheduling istiod replicas on different nodes
With the default autoscaleMin: 1, the HPA may scale istiod down to a single replica during low-traffic periods — creating a temporary single point of failure.
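To see whether an existing deployment is exposed to this, you can list each HPA's replica bounds and flag any that permit a single replica. The following is a minimal sketch, not part of UDS Core: the function name `flag_single_replica_hpas` is hypothetical, and it assumes its input has a header row followed by `NAME MIN MAX` columns, as produced by the `custom-columns` query in the usage comment.

```shell
# Hypothetical helper: reads "NAME MIN MAX" rows (with a header line) on stdin
# and prints every HPA whose minimum replica count is below 2.
# Exits with status 1 if at least one such HPA was found, 0 otherwise.
flag_single_replica_hpas() {
  awk 'NR > 1 && $2 < 2 {
         print $1 " allows a single replica (minReplicas=" $2 ")"
         found = 1
       }
       END { exit found + 0 }'
}

# Usage (requires cluster access):
#   uds zarf tools kubectl get hpa -n istio-system \
#     -o custom-columns=NAME:.metadata.name,MIN:.spec.minReplicas,MAX:.spec.maxReplicas \
#     | flag_single_replica_hpas
```

The same pipeline can be pointed at the `istio-admin-gateway` and `istio-tenant-gateway` namespaces to check the gateway HPAs.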
- Increase the minimum replica count for HA

  Set autoscaleMin to 2 (or higher) to ensure at least two istiod replicas are always running:

  uds-bundle.yaml

      packages:
        - name: core
          repository: registry.defenseunicorns.com/public/core
          ref: x.x.x-upstream
          overrides:
            istio-controlplane:
              istiod:
                values:
                  # Minimum istiod replicas (default: 1)
                  - path: autoscaleMin
                    value: 2
                  # Maximum istiod replicas (default: 5)
                  - path: autoscaleMax
                    value: 5
- Tune istiod resources

  The default istiod resource allocation (500m CPU, 2Gi memory) is sized for moderate clusters. For larger clusters with many services or high configuration complexity, increase the allocation:

  uds-bundle.yaml

      packages:
        - name: core
          repository: registry.defenseunicorns.com/public/core
          ref: x.x.x-upstream
          overrides:
            istio-controlplane:
              istiod:
                values:
                  # istiod resources (adjust for your environment)
                  - path: resources
                    value:
                      requests:
                        cpu: 500m
                        memory: 2Gi
                      limits:
                        cpu: 1000m
                        memory: 4Gi
- Scale the admin and tenant ingress gateways

  UDS Core deploys separate ingress gateways for admin and tenant traffic. Both use the upstream Istio gateway chart with HPA enabled by default (min 1, max 5). For production, increase the minimum replicas and tune resources for both gateways:

  uds-bundle.yaml

      packages:
        - name: core
          repository: registry.defenseunicorns.com/public/core
          ref: x.x.x-upstream
          overrides:
            istio-admin-gateway:
              gateway:
                values:
                  # Admin gateway minimum replicas (default: 1)
                  - path: autoscaling.minReplicas
                    value: 2
                  # Admin gateway maximum replicas (default: 5)
                  - path: autoscaling.maxReplicas
                    value: 8
                  # Admin gateway resources (adjust for your environment)
                  - path: resources.requests.cpu
                    value: 750m
                  - path: resources.requests.memory
                    value: 1024Mi
                  - path: resources.limits.cpu
                    value: 2000m
                  - path: resources.limits.memory
                    value: 4Gi
                  # Scale based on CPU and memory request utilization
                  - path: autoscaling.targetCPUUtilizationPercentage
                    value: 100
                  - path: autoscaling.targetMemoryUtilizationPercentage
                    value: 100
            istio-tenant-gateway:
              gateway:
                values:
                  # Tenant gateway minimum replicas (default: 1)
                  - path: autoscaling.minReplicas
                    value: 2
                  # Tenant gateway maximum replicas (default: 5)
                  - path: autoscaling.maxReplicas
                    value: 8
                  # Tenant gateway resources (adjust for your environment)
                  - path: resources.requests.cpu
                    value: 750m
                  - path: resources.requests.memory
                    value: 1024Mi
                  - path: resources.limits.cpu
                    value: 2000m
                  - path: resources.limits.memory
                    value: 4Gi
                  # Scale based on CPU and memory request utilization
                  - path: autoscaling.targetCPUUtilizationPercentage
                    value: 100
                  - path: autoscaling.targetMemoryUtilizationPercentage
                    value: 100
                  # Optional: customize scaling behavior
                  - path: autoscaling.autoscaleBehavior
                    value:
                      scaleUp:
                        stabilizationWindowSeconds: 30
                        policies:
                          - type: Percent
                            value: 50
                            periodSeconds: 15
                      scaleDown:
                        stabilizationWindowSeconds: 300
                        policies:
                          - type: Percent
                            value: 20
                            periodSeconds: 60
- Create and deploy your bundle

      uds create <path-to-bundle-dir>
      uds deploy uds-bundle-<name>-<arch>-<version>.tar.zst
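To build intuition for the gateway autoscaling values used in the steps above, it can help to work through the replica calculation the Kubernetes HPA documents: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the min/max range, where utilization is measured against the pod's resource request. The sketch below is a simplified illustration only. The function name `hpa_desired_replicas` is hypothetical, and it ignores the HPA's tolerance band, stabilization windows, scaling policies, and the fact that with several metrics the HPA takes the largest result.

```shell
# Hypothetical helper: approximates the HPA replica calculation for one CPU metric.
#   $1 current replicas
#   $2 average CPU usage per pod, in millicores
#   $3 CPU request per pod, in millicores
#   $4 target utilization, in percent of the request
#   $5 minimum replicas
#   $6 maximum replicas
hpa_desired_replicas() {
  # Utilization is usage as a percentage of the request (integer division).
  hpa_utilization=$(( $2 * 100 / $3 ))
  # Integer ceiling of current * utilization / target.
  hpa_desired=$(( ($1 * hpa_utilization + $4 - 1) / $4 ))
  # Clamp to the configured replica range.
  if [ "$hpa_desired" -lt "$5" ]; then hpa_desired=$5; fi
  if [ "$hpa_desired" -gt "$6" ]; then hpa_desired=$6; fi
  echo "$hpa_desired"
}

# Two gateway pods averaging 1125m CPU against a 750m request, target 100%, range 2-8:
hpa_desired_replicas 2 1125 750 100 2 8
```

With a 750m request and a 100% target, scale-up begins once average usage exceeds the request itself, which is why raising the request also raises the load level at which new gateway replicas appear.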
Verification
    # Confirm istiod pods are on different nodes
    uds zarf tools kubectl get pods -n istio-system -l app=istiod -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,STATUS:.status.phase

    # Check istiod HPA status
    uds zarf tools kubectl get hpa -n istio-system

    # Check admin gateway HPA and pods
    uds zarf tools kubectl get hpa -n istio-admin-gateway
    uds zarf tools kubectl get pods -n istio-admin-gateway -o wide

    # Check tenant gateway HPA and pods
    uds zarf tools kubectl get hpa -n istio-tenant-gateway
    uds zarf tools kubectl get pods -n istio-tenant-gateway -o wide

Success criteria:

- istiod has at least 2 replicas Running, distributed across different nodes (on 3+ node clusters)
- Admin and tenant gateways each have at least 2 replicas Running
- All HPAs show the expected min/max replica range
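The first success criterion can be checked mechanically rather than by eye. The following is a minimal sketch under stated assumptions: `check_istiod_spread` is a hypothetical helper name, and it expects the `NAME NODE STATUS` layout (with a header row) produced by the first verification command above.

```shell
# Hypothetical helper: reads "NAME NODE STATUS" rows (with a header line) on stdin,
# counts Running pods and the distinct nodes they run on, and prints a summary.
# Exits 0 only if at least 2 pods are Running on at least 2 different nodes.
check_istiod_spread() {
  awk 'NR > 1 && $3 == "Running" { running++; nodes[$2] = 1 }
       END {
         n = 0
         for (k in nodes) n++
         printf "running=%d nodes=%d\n", running, n
         exit !(running >= 2 && n >= 2)
       }'
}

# Usage (requires cluster access):
#   uds zarf tools kubectl get pods -n istio-system -l app=istiod \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName,STATUS:.status.phase \
#     | check_istiod_spread
```

A non-zero exit status makes the helper usable as a gate in a deployment script, though a failure on a small cluster may simply reflect the soft anti-affinity preference rather than a misconfiguration.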
Troubleshooting
Problem: istiod pods scheduled on the same node

Symptoms: All istiod replicas are on a single node, creating a single point of failure.

Solution: The anti-affinity is a soft preference, so Kubernetes will co-locate pods when it has no better option. Verify you have at least 3 schedulable nodes:

    uds zarf tools kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

If nodes have taints preventing istiod scheduling, add appropriate tolerations via bundle overrides for the istiod chart under the istio-controlplane component.
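On clusters with many nodes, counting the schedulable ones by reading raw taint output is error-prone. The sketch below is one possible way to count them, with assumptions called out: `count_schedulable_nodes` is a hypothetical helper, it expects a header row followed by `NAME EFFECTS` columns where `EFFECTS` is a comma-separated list of taint effects (or `<none>`), and it treats any `NoSchedule` or `NoExecute` taint as blocking, which is pessimistic if istiod already tolerates that taint.

```shell
# Hypothetical helper: reads "NAME EFFECTS" rows (with a header line) on stdin
# and prints how many nodes carry no NoSchedule or NoExecute taint.
# PreferNoSchedule is a soft taint, so it does not count as blocking.
count_schedulable_nodes() {
  awk 'NR > 1 {
         blocked = 0
         m = split($2, effects, ",")
         for (i = 1; i <= m; i++)
           if (effects[i] == "NoSchedule" || effects[i] == "NoExecute") blocked = 1
         if (!blocked) n++
       }
       END { print n + 0 }'
}

# Usage (requires cluster access; the EFFECTS column path is an assumption
# about how you choose to query taints, not a command from this guide):
#   uds zarf tools kubectl get nodes \
#     -o 'custom-columns=NAME:.metadata.name,EFFECTS:.spec.taints[*].effect' \
#     | count_schedulable_nodes
```

If the count is lower than expected, inspect the full taint objects with the command above to decide which tolerations istiod would need.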
Problem: HPA not scaling istiod

Symptoms: HPA shows <unknown> for current metrics or replicas stay at minimum.

Solution: Ensure the metrics-server is running and healthy:

    uds zarf tools kubectl get pods -n kube-system -l k8s-app=metrics-server

Related Documentation
- Istio istiod Helm Chart — full list of istiod helm values
- Istio Gateway Helm Chart — full list of gateway helm values
- Istio: Deployment Best Practices — control plane resilience and scaling guidance
- Istio: Performance and Scalability — benchmarks and tuning for large clusters
- Kubernetes: Horizontal Pod Autoscaling — HPA configuration and scaling behavior
- Kubernetes: Assigning Pods to Nodes — affinity, anti-affinity, and topology spread constraints
Next steps
These guides and concepts may be useful to explore next: