Consider optimizing pods call to Kubernetes API Server so that it becomes efficient and scalable #6676
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
Hey bot, please don't close this one.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Thanks John for proposing this. +1 on optimizing querying pod labels from the api-server. Querying the '/pods' endpoint via the secure 10255 port is not always feasible due to security concerns, and granting permission to the resource is also a concern. One optimization direction is to use the list-watch mechanism. Unfortunately, there is no shared informer in the C client library. We can use an http client or the k8s C client with a callback to handle incremental pod events. Happy to chat and collaborate more on this proposal. @JohnRusk
This is a good example of using the k8s C client to query pod information with list-watch: https://github.com/kubernetes-client/c/blob/07648eda6118449de94354d9deb6611cdd19d4e6/examples/watch_list_pod/main.c I am looking into whether it is feasible to use an http client to do the same.
@shuaich Oh, I did not realise that there was no shared informer in the C client library. That's inconvenient. Yes, I agree that finding a way to call list-watch sounds like a good idea. Thanks for the suggestion that we collaborate. My C skills are almost non-existent, so I can't offer to help with code, I'm sorry. But I'm happy to chat here about ideas if needed.
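The list-watch idea discussed above can be sketched in plain C. A watch stream (e.g. `GET /api/v1/pods?watch=1`, fetched with any HTTP client such as libcurl) delivers one JSON object per line, of the form `{"type":"ADDED","object":{...}}`. The sketch below only shows the per-line dispatch to a callback; the names `pod_event_cb` and `dispatch_watch_line` are illustrative and not part of Fluent Bit or the k8s C client.

```c
#include <string.h>

/* Callback invoked for each incremental pod event.
 * type is ADDED / MODIFIED / DELETED; json is the raw event line. */
typedef void (*pod_event_cb)(const char *type, const char *json);

/* Each line of a Kubernetes watch stream is a JSON object such as:
 *   {"type":"ADDED","object":{...pod...}}
 * Extract the value of "type" and hand the whole line to the callback.
 * Returns 0 on success, -1 if the line does not look like a watch event.
 * (A real implementation would use a JSON parser, not strstr.) */
static int dispatch_watch_line(const char *line, pod_event_cb cb)
{
    const char *key = strstr(line, "\"type\":\"");
    if (!key)
        return -1;
    key += strlen("\"type\":\"");

    const char *end = strchr(key, '"');
    if (!end)
        return -1;

    char type[16];
    size_t n = (size_t)(end - key);
    if (n >= sizeof(type))
        return -1;
    memcpy(type, key, n);
    type[n] = '\0';

    cb(type, line);
    return 0;
}
```

In a real client, the HTTP layer would feed each chunked-transfer line into `dispatch_watch_line`, and the callback would update an in-memory pod/label cache instead of re-listing all pods.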
Is your feature request related to a problem? Please describe.
There are known issues with Fluent Bit on large Kubernetes clusters. The workaround is to enable the use_kubelet setting. But that's not enabled by default, and so users can run into scaling problems without understanding why.
Describe the solution you'd like
Optimization of the default behaviour (in which Fluent Bit calls the Kube API Server) so that the workaround becomes less necessary, and maybe unnecessary.
This might be easy. It depends on how fresh the listed data must be. Right now, it appears that Fluent Bit is using default query string parameters, and therefore the query is being served from etcd. This is expensive in terms of performance. However, if Fluent Bit instead indicated that it would accept data from the API server's cache, performance would be much better.
In particular, Fluent Bit already appears to be setting
fieldSelector=spec.nodeName=....
in the query string when listing pods. There is a special optimization which makes that highly performant in Kubernetes... but only when the data is served from the API server's cache. That optimized cache read path is exactly how all the kubelets in a large cluster can efficiently read pods. So it would be good if Fluent Bit were changed to also use that path. The only change necessary is to signal to Kubernetes that cached data will be accepted. As for whether it would be appropriate to use cached data, that's something only the Fluent Bit team can judge.
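The signal described above is `resourceVersion=0` on the list request: Kubernetes documents that a list with `resourceVersion=0` may be served from the API server's watch cache rather than etcd. A minimal sketch of building such a request path follows; `build_pod_list_path` is an illustrative helper, not an existing Fluent Bit function, and the node name is a placeholder.

```c
#include <stdio.h>

/* Build the path for listing the pods on one node, opting in to the
 * API server's cache by adding resourceVersion=0. Combined with the
 * spec.nodeName fieldSelector, this uses the optimized cache read path
 * mentioned above. Returns the number of characters written (as snprintf
 * does), so the caller can detect truncation against len. */
static int build_pod_list_path(char *buf, size_t len, const char *node)
{
    /* The '=' inside the fieldSelector value is percent-encoded as %3D. */
    return snprintf(buf, len,
                    "/api/v1/pods?fieldSelector=spec.nodeName%%3D%s"
                    "&resourceVersion=0",
                    node);
}
```

For example, `build_pod_list_path(buf, sizeof(buf), "node-1")` yields `/api/v1/pods?fieldSelector=spec.nodeName%3Dnode-1&resourceVersion=0`. The trade-off, as noted above, is freshness: the cached list may lag etcd slightly, which the Fluent Bit team would need to judge acceptable.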
For more info, please see this new section of the K8s FAQ: https://github.com/kubernetes/community/blob/master/sig-scalability/configs-and-limits/faq.md#how-should-we-code-client-applications-to-improve-scalability. Points 4, 6, and 7 are the most relevant to Fluent Bit.