
[kube-state-metrics] Use scrapeConfig to have HA #5470

Open · wants to merge 7 commits into base: main

Conversation
@jenciso jenciso commented Mar 22, 2025

What this PR does / why we need it

This PR enables scrapeConfig to be used as an alternative to serviceMonitor for kube-state-metrics.

We want a way to run kube-state-metrics in High Availability. The simple way to do it is to increase the number of replicas to 2. However, that duplicates the kube-state-metrics metrics.

With a scrapeConfig resource instead of a serviceMonitor, you can scrape kube-state-metrics as a traditional Kubernetes Service, where each replica exposes its endpoint port, giving high availability without duplicated series.
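As a rough sketch of the idea, assuming the Prometheus Operator's `ScrapeConfig` CRD (`monitoring.coreos.com/v1alpha1`) is installed, and with illustrative namespace, Service name, and port (not taken from this PR), such a resource could look like:

```yaml
# Illustrative only: namespace, Service name, and port are assumptions.
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: kube-state-metrics
  namespace: monitoring
spec:
  metricsPath: /metrics
  staticConfigs:
    - targets:
        # Scraping the Service (not each pod endpoint) yields a single
        # set of series, while kube-proxy balances requests across the
        # healthy replicas.
        - kube-state-metrics.monitoring.svc.cluster.local:8080
```

The design choice here is that Prometheus sees one target, so metrics are not duplicated, yet any healthy replica behind the Service can answer the scrape.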

Which issue this PR fixes

Special notes for your reviewer

Checklist

  • DCO signed
  • Chart Version bumped
  • Title of the PR starts with chart name (e.g. [prometheus-couchdb-exporter])

@jenciso jenciso changed the title Add scrapeconfig files Use scrapeConfig to give HA for kube-state-metrics Mar 22, 2025
@jenciso jenciso marked this pull request as ready for review March 22, 2025 17:15
@jenciso jenciso changed the title Use scrapeConfig to give HA for kube-state-metrics [kube-state-metrics] Use scrapeConfig to have HA Mar 22, 2025
@jenciso jenciso force-pushed the main branch 3 times, most recently from 15a9f0b to 9a15708 Compare March 22, 2025 20:54
@jkroepke
Member

What's the issue with double metrics? You could drop the instance/pod labels to equalize them.

@jenciso
Author

jenciso commented Mar 23, 2025

> What's the issue with double metrics? You could drop the instance/pod labels to equalize them.

Hi @jkroepke, the issue with double metrics is cost (storage) and memory efficiency (cardinality): why keep two copies of every metric when one is enough?
Another point: why do extra work to deduplicate, when you can handle this service differently, as an alternative to serviceMonitor? The approach works, and it is being adopted (see here an example).

@nicolastakashi
Contributor

I’m trying to think about possible issues that require the usage of KSM with multiple replicas.

Resource Constraints:

KSM is using more resources (especially memory) than the defined limits and it’s crash looping due to OOM. I might be wrong, but multiple replicas won’t improve reliability in that case, since both instances will be collecting the same data and using the same resource limits.

Rollout of new versions:

If you’re rolling out a new version and for some reason it has a startup problem, it’s true that multiple replicas might help here, because the second replica won’t be rolled out until the first one is ready and healthy.

But in my opinion, we can achieve the same behavior by properly configuring the deployment strategy, setting configs like minReadySeconds and others.

Reschedules:

Multiple replicas might also help with pod reschedules, but we can achieve stability using something like PDBs to avoid ending up with zero replicas.
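For reference, a minimal PodDisruptionBudget along those lines might look like the following sketch (namespace and label selector are assumptions, matching common kube-state-metrics chart defaults):

```yaml
# Illustrative PDB: keeps at least one KSM pod during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: kube-state-metrics
  namespace: monitoring
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
```

Note that a PDB only protects against voluntary disruptions (drains, rollouts), not against sudden node failures.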

If we really want to increase the availability of KSM, maybe we can consider the sharding feature, so you can have multiple KSM instances, each one collecting metrics from specific APIs.
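As an illustration of that sharding feature, kube-state-metrics supports `--shard` and `--total-shards` flags; a sketch with the shard index set statically (the shard can also be derived from a StatefulSet pod name) could be:

```yaml
# Illustrative container args: each instance serves a deterministic
# subset of the cluster's objects, hashed across total-shards.
args:
  - --shard=0         # this instance's 0-based shard index
  - --total-shards=2  # total number of KSM instances
```
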

@jkroepke
Member

@nicolastakashi I still see value in @jenciso's request. It's a low-budget version of HA for clusters that aren't big enough to need the sharding mode, but where you don't want to lose metrics when nodes are rescheduled.

For scrape requests against blackbox_exporter, the same principle is used.

@jenciso
Author

jenciso commented Mar 23, 2025

> I’m trying to think about possible issues that require the usage of KSM with multiple replicas.
>
> Resource Constraints:
>
> KSM is using more resources (especially memory) than the defined limits and it’s crash looping due to OOM. I might be wrong, but multiple replicas won’t improve reliability in that case, since both instances will be collecting the same data and using the same resource limits.

Maybe I'm wrong, but we could say the same about the Prometheus HA strategy, where each Prometheus server scrapes and stores the same content and shares the same memory limits.

> Rollout of new versions:
>
> If you’re rolling out a new version and for some reason it has a startup problem, it’s true that multiple replicas might help here, because the second replica won’t be rolled out until the first one is ready and healthy.
>
> But in my opinion, we can achieve the same behavior by properly configuring the deployment strategy, setting configs like minReadySeconds and others.

IMO, the main idea of having replicas of KSM is to avoid losing metrics during node downtime (a common use case). We could also recommend using topology spread constraints or a podAntiAffinity (by hostname) together with this to achieve a better solution.
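A sketch of such a podAntiAffinity, assuming the chart's usual `app.kubernetes.io/name` label (not taken from this PR), could be:

```yaml
# Pod template fragment: forces the replicas onto different nodes,
# so a single node failure cannot take down both KSM pods.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: kube-state-metrics
        topologyKey: kubernetes.io/hostname
```

Using `preferredDuringSchedulingIgnoredDuringExecution` instead would keep the pods schedulable on small clusters at the cost of a weaker guarantee.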

> Reschedules:
>
> Multiple replicas might also help with pod reschedules, but we can achieve stability using something like PDBs to avoid ending up with zero replicas.

Yeah, but the problem is basically the duplicated metrics. This PR deals with that by offering an alternative to serviceMonitor, proposing the use of scrapeConfig so that KSM is treated as a simple Kubernetes Service.

> If we really want to increase the availability of KSM, maybe we can consider the sharding feature, so you can have multiple KSM instances, each one collecting metrics from specific APIs.

The sharding feature is great; I use it in big clusters when I want to distribute the load across multiple KSM instances. But if you lose the node running a specific shard, you lose its metrics, and rescheduling is usually a headache when the StatefulSet's node is still in a Terminating/Unknown state. The same already happens with Prometheus shards.

For small/medium clusters, this PR solves an HA problem in a simple way. KSM is an important service for alerts and SLO metrics, and we can get a minimum level of availability by treating it as a classic Kubernetes Service.

Signed-off-by: Juan Enciso <[email protected]>