Describe the bug
When running NTH in Queue Processor mode with WORKERS >= 50, the Kubernetes API client is severely throttled by the hardcoded client-go default rate limits (QPS=5, Burst=10). All workers share a single clientset built from rest.InClusterConfig(), so they all draw from the same rate limit.
With WORKERS=50 dispatching simultaneously during a correlated spot interruption, ~500 API calls compete for the 5 QPS rate limit. Workers are blocked for 8-10 seconds per API call due to client-side throttling:
INF Waited for 7.994525579s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
INF Waited for 9.994951525s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
INF Waited for 9.971182606s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
Steps to reproduce
- Deploy NTH in Queue Processor mode with WORKERS=50 (or higher)
- Trigger 50+ concurrent spot interruptions (e.g., via AWS FIS aws:ec2:send-spot-instance-interruptions)
- Observe the NTH logs: continuous client-side throttling messages with 8-10 second waits, plus "all workers busy, waiting" repeated every second
Expected outcome
NTH should allow configuring the Kubernetes API client QPS and burst rate limits via environment variables or CLI flags so that high-concurrency deployments can avoid client-side throttling.
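A minimal sketch of what such an override could look like, assuming two hypothetical environment variables, KUBE_API_QPS and KUBE_API_BURST (NTH does not define these today). To stay self-contained it uses a local `rateLimits` struct as a stand-in for the `QPS` (float32) and `Burst` (int) fields of client-go's `rest.Config`. In NTH the parsed values would presumably be copied onto the config returned by rest.InClusterConfig() before the clientset is built.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"strconv"
)

// rateLimits is a stand-in for the two rest.Config fields in client-go
// that control client-side throttling (QPS float32, Burst int).
type rateLimits struct {
	QPS   float32
	Burst int
}

// Hypothetical variable names used for illustration only.
const (
	envQPS   = "KUBE_API_QPS"
	envBurst = "KUBE_API_BURST"
)

// loadRateLimits reads optional overrides through getenv and falls back to
// def for any variable that is unset or empty. Invalid values are rejected
// rather than silently ignored.
func loadRateLimits(getenv func(string) string, def rateLimits) (rateLimits, error) {
	out := def
	if v := getenv(envQPS); v != "" {
		q, err := strconv.ParseFloat(v, 32)
		if err != nil || !(q > 0) || math.IsInf(q, 0) {
			return def, fmt.Errorf("invalid %s %q: must be a positive number", envQPS, v)
		}
		out.QPS = float32(q)
	}
	if v := getenv(envBurst); v != "" {
		b, err := strconv.Atoi(v)
		if err != nil || b <= 0 {
			return def, fmt.Errorf("invalid %s %q: must be a positive integer", envBurst, v)
		}
		out.Burst = b
	}
	return out, nil
}

func main() {
	// Defaults mirror the client-go values cited in this issue.
	limits, err := loadRateLimits(os.Getenv, rateLimits{QPS: 5, Burst: 10})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// In NTH, this is where limits.QPS and limits.Burst would be assigned
	// to the rest.Config before creating the shared clientset.
	fmt.Printf("qps=%v burst=%d\n", limits.QPS, limits.Burst)
}
```

Keeping the current values as the fallback would leave existing deployments unchanged, while a WORKERS=50 deployment could raise both limits. CLI flags could feed the same function.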
Environment
- NTH Version: latest main branch (also reproduced on OpenShift fork)
- NTH Mode: Queue Processor (SQS)
- Kubernetes version: v1.35.5 (OpenShift 4.22)
- Platform: ROSA HCP on AWS, us-east-2