[filter_kube] Rely on /pods endpoint of kubelet #1948

Closed
loburm opened this issue Feb 13, 2020 · 17 comments

@loburm
Contributor

loburm commented Feb 13, 2020

Is your feature request related to a problem? Please describe.
We have not tested this in Fluent Bit, but we previously hit this problem with FluentD in very large clusters.

Instead of querying kube-apiserver to retrieve a pod's labels and annotations, it would be great to add a way to rely on the /pods endpoint of the kubelet. This endpoint is effectively a local cache of the pods running on the node, so it makes sense to use it when Fluent Bit is deployed as a DaemonSet. Disadvantage: it doesn't support watches, which means Fluent Bit would have to scan the endpoint periodically. I assume that's not an issue, because we are not using watches anyway, if I read the code correctly.

Describe the solution you'd like
A parameter for the kubernetes filter (for example, Use_Kubelet) that would remove the dependency on kube-apiserver. Additionally, we may need a way to specify the kubelet port in case someone is using a non-standard one (Kubelet_Port). When a log entry from a new pod arrives, the filter would scrape the /pods endpoint of the kubelet to retrieve metadata for all pods on that node.
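
To make the lookup flow above concrete, here is a minimal sketch in Python (not Fluent Bit's C code). It assumes /pods returns a standard PodList JSON document; `index_pod_list`, `KubeletMetadata`, the injected `fetch_pods` callable, and the sample data are all hypothetical names made up for illustration. The HTTP call itself is left out so the example runs without a cluster.

```python
from typing import Callable, Dict, Optional, Tuple

PodKey = Tuple[str, str]  # (namespace, pod name)


def index_pod_list(pod_list: dict) -> Dict[PodKey, dict]:
    """Map (namespace, name) to the uid, labels and annotations of each pod."""
    index: Dict[PodKey, dict] = {}
    for item in pod_list.get("items") or []:
        meta = item.get("metadata", {})
        key = (meta.get("namespace", ""), meta.get("name", ""))
        index[key] = {
            "uid": meta.get("uid"),
            "labels": meta.get("labels") or {},
            "annotations": meta.get("annotations") or {},
        }
    return index


class KubeletMetadata:
    """Hypothetical helper: re-scrapes the whole node on a lookup miss."""

    def __init__(self, fetch_pods: Callable[[], dict]):
        # fetch_pods is an assumption: any callable returning the parsed
        # PodList, e.g. a function that GETs /pods from the local kubelet.
        self._fetch_pods = fetch_pods
        self._index: Dict[PodKey, dict] = {}
        self.fetch_count = 0

    def lookup(self, namespace: str, name: str) -> Optional[dict]:
        key = (namespace, name)
        if key not in self._index:
            # Unknown pod: fetch /pods once and re-index every pod on the node.
            self._index = index_pod_list(self._fetch_pods())
            self.fetch_count += 1
        return self._index.get(key)


# Made-up sample shaped like a PodList response.
SAMPLE_POD_LIST = {
    "kind": "PodList",
    "items": [
        {"metadata": {"name": "web-0", "namespace": "default", "uid": "u-1",
                      "labels": {"app": "web"}}},
        {"metadata": {"name": "dns-1", "namespace": "kube-system", "uid": "u-2",
                      "annotations": {"team": "infra"}}},
    ],
}
```

With this shape, the first log line from an unseen pod triggers one scrape that fills in metadata for every pod on the node, so later lookups for neighbouring pods need no further requests.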

Additionally, I haven't found any option to configure the time to live of cache entries in kube_filter. This might be a problem, because nothing prevents pod labels or annotations from being changed without restarting the pod.
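
As an illustration of the time-to-live idea, here is a small Python sketch of a cache whose entries expire after a fixed interval. `TTLCache` and its parameters are hypothetical and do not reflect how the filter's cache is implemented; the clock is injectable only so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical metadata cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached metadata)
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, key: str, value: dict) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[dict]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Stale entry: drop it so the caller fetches fresh metadata.
            del self._entries[key]
            return None
        return value
```

A caller would treat a `None` result as a miss and query the metadata source again, which is how a label change on a running pod would eventually be picked up.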

@PettitWesley
Contributor

+1 for this

@lnalex

lnalex commented Feb 16, 2020

Not Fluent Bit, but we've observed issues with FluentD (specifically with https://github.com/fluent/fluentd-kubernetes-daemonset) in large (5000-6000+ pods) Kubernetes clusters, where the sheer volume of API calls made by all the FluentD pods would cause the API server to fall over and become unresponsive at times.

@loburm
Contributor Author

loburm commented Mar 9, 2020

The worst case is when FluentD is configured incorrectly on a big cluster: it starts crashlooping and sends tons of queries to kube-apiserver, which kills it.

@edsiper
Member

edsiper commented Jun 11, 2020

@loburm do you know where I can get the Kubelet API docs?

@loburm
Contributor Author

loburm commented Jun 15, 2020

The Kubelet API is not officially documented. The source code is probably the best documentation:
https://github.com/kubernetes/kubernetes/blob/15e95e48964d4e800f7210a249f9122ee7d79fcc/pkg/kubelet/server/server.go#L661

/pods should return a PodList. But before implementing this, it might be worth syncing with the sig-node or sig-security team to make sure that this endpoint is not going to be removed soon.

@jaypipes

I would definitely advise against relying on a private, unpublished Kubelet API endpoint like /pods. AFAIK, there are no immediate plans to get rid of it, but the general rule of thumb is that everything should query for Kubernetes-specific information using kube-apiserver, so that there is a single place to control access, to apply rate-limiting and congestion control, and to scale for increased concurrency for both reads and writes. I know it's tempting to just "use the kubelet since we're already on the node", but this certainly breaks the documented design principle in Kubernetes that everything should communicate only with the Kubernetes API server, not directly between Nodes or between agents on worker nodes.

@derekwaynecarr and @dashpole, please feel free to correct me if I'm wrong on anything!

@PettitWesley
Contributor

@derekwaynecarr and @dashpole one thing to mention is that this would be an opt-in feature. The default would be to call the API server; using the Kubelet would be something the user has to choose. So if there are caveats we need to consider, as long as they aren't too complicated, they could be called out in the documentation for the option.

@dashpole

The /pods endpoint isn't going away, but we generally only recommend using it for debugging. IIRC, static pod status is still not updated correctly and may not match the pod in the API server (we call the pod on the API server the "mirror" pod, and it is different from the "static" pod reported on the /pods endpoint, IIRC).
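
For anyone who wants to treat static pods specially given the caveat above, here is a hedged Python sketch that flags them in a PodList. It assumes the kubelet sets the `kubernetes.io/config.source` annotation on the pods it reports, with the value `api` for pods that came from the API server; that assumption should be checked against the kubelet version in use. The function names and pod data are hypothetical.

```python
from typing import List

# Assumed annotation key; verify it against the kubelet version in use.
CONFIG_SOURCE = "kubernetes.io/config.source"


def is_static_pod(pod: dict) -> bool:
    """True if the pod's config source is present and not the API server."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    source = annotations.get(CONFIG_SOURCE)
    return source is not None and source != "api"


def static_pod_names(pod_list: dict) -> List[str]:
    """Return 'namespace/name' for each static pod in a PodList."""
    names = []
    for pod in pod_list.get("items") or []:
        if is_static_pod(pod):
            meta = pod.get("metadata", {})
            names.append(f"{meta.get('namespace', '')}/{meta.get('name', '')}")
    return names
```

A consumer could use such a check to log a warning or fall back to another metadata source for the flagged pods, since their kubelet-side view may differ from the mirror pod.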

What pieces of information do you use from pods? Just UUID, or other things too, like labels?

The podresources grpc endpoint could be a reasonable alternative, if it has the information you are after.

@PettitWesley
Contributor

@dashpole Thanks, I will read up on that. Is podresources scalable enough that 2000+ Fluentd or Fluent Bit agents could read from it simultaneously in a large cluster?

Adding this link, for the benefit of others: https://docs.google.com/document/d/1NYnqw-HDQ6Y3L_mk85Q3wkxDtGNWTxpsedsgw4NgWpg/edit

@dashpole

It is a kubelet endpoint, so I would hope there is only a single fluentd/bit accessing it at once on each node.

@dashpole

I completely forgot I had written that doc. :)

@PettitWesley
Contributor

@dashpole Unfortunately, Fluent Bit currently lacks support for gRPC. Could we contribute changes to the Kubelet to support a basic HTTP version of the pod resources endpoint?

This is just getting information, right? It could be covered by a simple GET request. Or is there some bidirectional communication going on, and that's why gRPC was chosen?

@edsiper Or, how hard would it be to add gRPC support in Fluent Bit? What would the timeline be?

@dashpole

We chose gRPC to follow other node plugins (cri, csi, device plugin). It generally has good language support, with C being an apparent exception. I'm not an active maintainer of the kubelet anymore, but I suspect they would prefer you using the /pods endpoint to adding a new endpoint.

PettitWesley self-assigned this Oct 15, 2020

@PettitWesley
Contributor

Assigning this to myself, since someone on my team tentatively plans to work on it, for release in Fluent Bit 1.7.

@PettitWesley
Contributor

@DrewZhang13 @loburm This has been released in 1.7.2. Can we close this issue?

@loburm
Contributor Author

loburm commented Mar 11, 2021

Yes I think so.

@JohnRusk

JohnRusk commented Jan 12, 2023

See also #6676
