Consider optimizing pods call to Kubernetes API Server so that it becomes efficient and scalable #6676

Closed
JohnRusk opened this issue Jan 12, 2023 · 7 comments
@JohnRusk

Is your feature request related to a problem? Please describe.
There are known issues with Fluent Bit on large Kubernetes clusters. The workaround is to enable the use_kubelet setting. But that's not enabled by default, and so users can run into scaling problems without understanding why.

Describe the solution you'd like
Optimization of the default behaviour (in which Fluent Bit calls the Kube API Server) so that the workaround becomes less necessary, and maybe unnecessary.

This might be easy, depending on how fresh the listed data must be. Right now, Fluent Bit appears to use the default query string parameters, so the query is served from etcd, which is expensive. If Fluent Bit instead indicated that it would accept data from the API server's cache, performance would be much better.

In particular, Fluent Bit already appears to set fieldSelector=spec.nodeName=.... in the query string when listing pods. Kubernetes has a special optimization that makes this query highly performant, but only when the data is served from the API server's cache. That optimized cache read path is exactly how all the kubelets in a large cluster can read pods efficiently, so it would be good if Fluent Bit were changed to use that path too. The only change necessary is to signal to Kubernetes that cached data will be accepted.
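
To make the proposed change concrete, here is a minimal sketch in C of how the list request path could be built. In the Kubernetes API, setting resourceVersion=0 on a list request means any version is acceptable, which allows the API server to answer from its cache; the result may be somewhat stale. The helper name build_pod_list_path is hypothetical and is not taken from the Fluent Bit source. The sketch also assumes the node name is a plain DNS-style name that needs no further URL encoding.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build the request path for listing the pods scheduled on one node.
 * resourceVersion=0 signals that cached (possibly stale) data is acceptable.
 * Hypothetical helper for illustration; not a Fluent Bit function.
 * Returns 0 on success, -1 on invalid input or if the buffer is too small.
 */
int build_pod_list_path(char *buf, size_t size, const char *node_name)
{
    int n;

    if (buf == NULL || size == 0 || node_name == NULL || node_name[0] == '\0') {
        return -1;
    }

    /* The '=' inside the selector value is percent-encoded as %3D. */
    n = snprintf(buf, size,
                 "/api/v1/pods?fieldSelector=spec.nodeName%%3D%s&resourceVersion=0",
                 node_name);
    if (n < 0 || (size_t) n >= size) {
        return -1; /* output error or truncated path */
    }
    return 0;
}
```

Calling build_pod_list_path(buf, sizeof(buf), "node-1") fills buf with a path that carries both the node field selector and resourceVersion=0. Whether stale data is acceptable for pod metadata lookups remains a judgment call for the maintainers.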

As for whether it would be appropriate to use cached data, that's something only the Fluent Bit team can judge.

For more info, please see this new section of the K8s FAQ: https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/faq.md#how-should-we-code-client-applications-to-improve-scalability. Points 4, 6, and 7 are the most relevant to Fluent Bit.

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Apr 12, 2023
@JohnRusk
Author

Hey bot, please don't close this one.

@github-actions github-actions bot removed the Stale label Apr 13, 2023
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Jul 12, 2023
@github-actions
Contributor

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as not planned Jul 17, 2023
@shuaich
Contributor

shuaich commented Apr 5, 2024

Thanks, John, for proposing this. +1 on optimizing how pod labels are queried from the API server.

Querying the kubelet's '/pods' endpoint via the secure port 10250 is not always feasible due to security concerns: granting permission to the nodes/proxy resource imposes security risks.

One optimization direction is to use the list-watch mechanism. Unfortunately, there is no shared informer in the C client library. We can use an HTTP client or the Kubernetes C client with a callback to handle incremental pod events.
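
As a rough illustration of what such a callback might do, here is a minimal sketch in C of applying watch events to a local cache. Kubernetes watch streams deliver events whose type is ADDED, MODIFIED, DELETED, BOOKMARK, or ERROR. Everything else here is an assumption made for the example: the pod_cache struct, the fixed sizes, and the apply_watch_event function are hypothetical and are not part of Fluent Bit or the Kubernetes C client. JSON parsing and the HTTP stream itself are left out.

```c
#include <string.h>
#include <stddef.h>

#define MAX_PODS 256
#define UID_LEN  64

/* Hypothetical local cache holding the UIDs of pods on this node. */
struct pod_cache {
    char uids[MAX_PODS][UID_LEN];
    size_t count;
};

/* Return the index of uid in the cache, or -1 if it is not tracked. */
static int find_pod(const struct pod_cache *c, const char *uid)
{
    size_t i;

    for (i = 0; i < c->count; i++) {
        if (strcmp(c->uids[i], uid) == 0) {
            return (int) i;
        }
    }
    return -1;
}

/*
 * Apply one watch event to the cache.
 * Returns 0 if the event was handled, -1 on an unknown or ERROR type,
 * invalid input, an oversized UID, or a full cache.
 */
int apply_watch_event(struct pod_cache *c, const char *type, const char *uid)
{
    int idx;

    if (c == NULL || type == NULL || uid == NULL || strlen(uid) >= UID_LEN) {
        return -1;
    }
    idx = find_pod(c, uid);

    if (strcmp(type, "ADDED") == 0 || strcmp(type, "MODIFIED") == 0) {
        if (idx >= 0) {
            return 0; /* already tracked; a real cache would refresh labels here */
        }
        if (c->count >= MAX_PODS) {
            return -1;
        }
        strcpy(c->uids[c->count], uid);
        c->count++;
        return 0;
    }
    if (strcmp(type, "DELETED") == 0) {
        if (idx >= 0) {
            /* Fill the freed slot with the last entry to keep the array dense. */
            c->count--;
            if ((size_t) idx != c->count) {
                strcpy(c->uids[idx], c->uids[c->count]);
            }
        }
        return 0;
    }
    if (strcmp(type, "BOOKMARK") == 0) {
        return 0; /* carries only a resourceVersion; no pod change */
    }
    return -1;
}
```

With a cache like this, an initial list populates the entries and each later event costs the API server only the delta, instead of a full list per lookup. A real implementation would also store labels and annotations per pod and restart the watch from the last seen resourceVersion after a disconnect.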

Happy to chat and collaborate more on this proposal. @JohnRusk

@shuaich
Contributor

shuaich commented Apr 5, 2024

This is a good example of using the Kubernetes C client to query pod information with list-watch: https://github.com/kubernetes-client/c/blob/07648eda6118449de94354d9deb6611cdd19d4e6/examples/watch_list_pod/main.c

I am looking into whether it is feasible to do the same with a plain HTTP client.

@JohnRusk
Author

JohnRusk commented Apr 7, 2024

@shuaich Oh, I did not realise that there was no shared informer in the C client library. That's inconvenient. Yes, I agree that finding a way to call list-watch sounds like a good idea.

Thanks for the suggestion that we collaborate. My C skills are almost non-existent, so I'm sorry that I can't offer to help with code. But I'm happy to chat here about ideas if needed.
