Kubernetes API client QPS/Burst rate limits are hardcoded, causing throttling at high WORKERS concurrency #1280

**Describe the bug**
When running NTH in Queue Processor mode with WORKERS >= 50, the Kubernetes API client is severely throttled by the hardcoded client-go default rate limits (QPS=5, Burst=10). All workers share a single clientset built from the config returned by [rest.InClusterConfig()](https://github.com/aws/aws-node-termination-handler/blob/main/cmd/node-termination-handler.go#L113).

With WORKERS=50 dispatching simultaneously during a correlated spot interruption, ~500 API calls compete for the 5 QPS rate limit. Workers are blocked for 8-10 seconds per API call due to client-side throttling:

```
INF Waited for 7.994525579s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
INF Waited for 9.994951525s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
INF Waited for 9.971182606s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver:6443/api/v1/nodes?labelSelector=kubernetes.io%2Fhostname
```

**Steps to reproduce**
1. Deploy NTH in Queue Processor mode with WORKERS=50 (or higher)
2. Trigger 50+ concurrent spot interruptions (e.g., via AWS FIS aws:ec2:send-spot-instance-interruptions)
3. Observe NTH logs: continuous client-side throttling messages with 8-10 second waits, and "all workers busy, waiting" repeated every second

**Expected outcome**
NTH should allow configuring the Kubernetes API client QPS and burst rate limits via environment variables or CLI flags so that high-concurrency deployments can avoid client-side throttling.
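A minimal sketch of what this could look like, using only the standard library so it runs on its own. The environment variable names `KUBE_API_QPS` and `KUBE_API_BURST`, the `clientLimits` type, and `limitsFromEnv` are all hypothetical; NTH may prefer different names or its existing flag-parsing code. The `QPS` (float32) and `Burst` (int) fields mirror the ones on client-go's `rest.Config`.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// clientLimits mirrors the two rate-limit fields on client-go's rest.Config.
type clientLimits struct {
	QPS   float32
	Burst int
}

// limitsFromEnv reads the hypothetical KUBE_API_QPS and KUBE_API_BURST
// variables through lookup, falling back to the client-go defaults
// (QPS=5, Burst=10) when a variable is unset.
func limitsFromEnv(lookup func(string) (string, bool)) (clientLimits, error) {
	limits := clientLimits{QPS: 5, Burst: 10}
	if v, ok := lookup("KUBE_API_QPS"); ok {
		q, err := strconv.ParseFloat(v, 32)
		if err != nil || q <= 0 {
			return clientLimits{}, fmt.Errorf("invalid KUBE_API_QPS %q", v)
		}
		limits.QPS = float32(q)
	}
	if v, ok := lookup("KUBE_API_BURST"); ok {
		b, err := strconv.Atoi(v)
		if err != nil || b <= 0 {
			return clientLimits{}, fmt.Errorf("invalid KUBE_API_BURST %q", v)
		}
		limits.Burst = b
	}
	return limits, nil
}

func main() {
	limits, err := limitsFromEnv(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// With client-go, the values would be applied before building the clientset:
	//   cfg, err := rest.InClusterConfig()
	//   cfg.QPS, cfg.Burst = limits.QPS, limits.Burst
	//   clientset, err := kubernetes.NewForConfig(cfg)
	fmt.Printf("QPS=%v Burst=%d\n", limits.QPS, limits.Burst)
}
```

Keeping the defaults at 5 and 10 would leave existing deployments unchanged unless they opt in.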

**Environment**

  - NTH Version: latest main branch (also reproduced on OpenShift fork)
  - NTH Mode: Queue Processor (SQS)
  - Kubernetes version: v1.35.5 (OpenShift 4.22)
  - Platform: ROSA HCP on AWS, us-east-2
