filter: kubernetes: add documentation for use_kubelet(#1948) #458

DrewZhang13 · 2021-02-09T01:51:02Z

This adds the documentation for fluent/fluent-bit#3025.
Signed-off-by: Drew Zhang [email protected]

PettitWesley · 2021-02-13T00:19:28Z

pipeline/filters/kubernetes.md

@@ -46,6 +46,8 @@ The plugin supports the following configuration parameters:
 | Dummy\_Meta | If set, use dummy-meta data \(for test/dev purposes\) | Off |
 | DNS\_Retries | DNS lookup retries N times until the network start working | 6 |
 | DNS\_Wait\_Time | DNS lookup interval between network status checks | 30 |
+| Use\_Kubelet | this is an optional feature flag to get metadata information from kubelet instead of calling Kube Server API to enhance the log. This could mitigate the Kube API heavy traffic issue for large cluster. This feature default to be disabled and user could enable it by set this flag to true | False |


The option supports On or Off as values as well, right? (It should; boolean options in Fluent Bit config maps support that.) I think it's preferable in the docs to use Off and On instead of true and false, even though both work. You can see this for other options.

Also, let's change the wording here. This is my suggestion:

This is an optional feature flag to get metadata information from kubelet instead of calling Kube Server API to obtain kubernetes metadata. At large scale, issues have been reported with Fluent Bit making calls to the API Server, this option may mitigate those problems.

We can link to the issue instead of saying something vague; folks won't know what the "Kube API heavy traffic issue" means without more information.

Yeah it supports On and Off. Changed the default value to Off and updated the description.

PettitWesley · 2021-02-13T00:58:30Z

pipeline/filters/kubernetes.md

@@ -46,6 +46,8 @@ The plugin supports the following configuration parameters:
 | Dummy\_Meta | If set, use dummy-meta data \(for test/dev purposes\) | Off |
 | DNS\_Retries | DNS lookup retries N times until the network start working | 6 |
 | DNS\_Wait\_Time | DNS lookup interval between network status checks | 30 |
+| Use\_Kubelet | this is an optional feature flag to get metadata information from kubelet instead of calling Kube Server API to enhance the log. This could mitigate the Kube API heavy traffic issue for large cluster. This feature default to be disabled and user could enable it by set this flag to true | False |
+| Kubelet\_Port | kubelet port using for HTTP request, this only works when `Use_Kubelet`  set to true. The default value is `10250` which is the same with offical website and could be setup based on kubeletconfig about the kubelet port.| 10250 |


The default value is 10250 which is the same with offical website and could be setup based on kubeletconfig about the kubelet port.

I think all of this is unnecessary, since you note the default value in the next column. Also "which is the same with the official website" I am guessing you mean the official k8s docs? I am not sure what this sentence means. I think probably we can just remove it.

Removed this sentence.

PettitWesley · 2021-02-13T00:59:17Z

pipeline/filters/kubernetes.md

@@ -204,3 +206,149 @@ Under certain and not common conditions, a user would want to alter that hard-co

 So at this point the filter is able to gather the values of _pod\_name_ and _namespace_, with that information it will check in the local cache \(internal hash table\) if some metadata for that key pair exists, if so, it will enrich the record with the metadata value, otherwise it will connect to the Kubernetes Master/API Server and retrieve that information.

+## Optional Feature: Using Kubelet to Get Metadata


The explanation here is good. Maybe ignore my other comment on the config option and instead have a link in the description to this section.

Added the link inside this section.

PettitWesley · 2021-02-13T01:00:32Z

pipeline/filters/kubernetes.md

+## Optional Feature: Using Kubelet to Get Metadata
+
+There is an [issue](https://github.com/fluent/fluent-bit/issues/1948) reported about kube-apiserver fall over and become unresponsive when cluster is too large and too many requests are sent to it.
+We consider fluent bit would expected the same issue so provide this feature as an option to use. For this feature, fluent bit Kubernetes filter will send the request to kubelet /pods endpoint instead of kube-apiserver to retrieve the pods information and use it to enrich the log. Since Kubelet is running locally in nodes, the request would be responded faster and each node would only get one request one time. This could save kube-apiserver power to handle other requests.


We consider fluent bit would expected the same issue so provide this feature as an option to use.

I know what you mean here, you are talking about Fluentd. But a reader will not unless they clicked on the issue. Always keep in mind what your reader might be thinking.

I think you can remove this sentence entirely. It doesn't add value or help the user.

PettitWesley · 2021-02-13T01:02:01Z

pipeline/filters/kubernetes.md

+
+There is an [issue](https://github.com/fluent/fluent-bit/issues/1948) reported about kube-apiserver fall over and become unresponsive when cluster is too large and too many requests are sent to it.
+We consider fluent bit would expected the same issue so provide this feature as an option to use. For this feature, fluent bit Kubernetes filter will send the request to kubelet /pods endpoint instead of kube-apiserver to retrieve the pods information and use it to enrich the log. Since Kubelet is running locally in nodes, the request would be responded faster and each node would only get one request one time. This could save kube-apiserver power to handle other requests.
+By enabling this feature, you should see no difference on enriching log part, but the Kube-apiserver bottleneck should be avoided when cluster is large.


I prefer to just say:

When this feature is enabled, you should see no difference in the kubernetes metadata added to logs.

the other bit is already explained by the previous sentences.

PettitWesley · 2021-02-13T01:04:08Z

pipeline/filters/kubernetes.md

+
+Now you are good to use this new feature!
+
+### Verify the New Feature Working


"Verify the New Feature Working" is a very vague header. You want to make sure someone skimming this doc would be able to understand what the header is about. The title should be clear.

Maybe: "Verify that the Use_Kubelet option is working"

PettitWesley · 2021-02-15T20:13:43Z

pipeline/filters/kubernetes.md

@@ -46,6 +46,8 @@ The plugin supports the following configuration parameters:
 | Dummy\_Meta | If set, use dummy-meta data \(for test/dev purposes\) | Off |
 | DNS\_Retries | DNS lookup retries N times until the network start working | 6 |
 | DNS\_Wait\_Time | DNS lookup interval between network status checks | 30 |
+| Use\_Kubelet | this is an optional feature flag to get metadata information from kubelet instead of calling Kube Server API to enhance the log. This could mitigate the Kube API heavy traffic issue for large cluster. | Off |


"the Kube API heavy traffic issue for large cluster" this should be a link to the standalone section you have down below on this feature ("## Optional Feature: Using Kubelet to Get Metadata"), otherwise to a new user this phrase won't make any sense.

You can do anchor links in github Markdown as shown here: https://gist.github.com/rachelhyman/b1f109155c9dafffe618

updated with anchor

PettitWesley · 2021-02-20T02:09:51Z

pipeline/filters/kubernetes.md

+### Verify that the Use_Kubelet option is working
+Basically you should see no difference about your experience for enriching your log files with Kubernetes metadata. 
+
+By Checking if it's working in kubelet way, you can check fluent bit logs and there should be a log like this:


Nit: "To check if Fluent Bit is using the kubelet, you can check..."

PettitWesley · 2021-02-20T02:10:14Z

pipeline/filters/kubernetes.md

+        Use_Kubelet         true
+        Kubelet_Port        10250
+```
+So for fluent bit configuration, we need to set the `Use_Kubelet` to true to enable this feature.


nit: you not we

changed to you
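
For reference, a minimal sketch of how the settings discussed in this thread might look once deployed, using `On` rather than `true` as suggested earlier in the review. The ConfigMap name, the `filter-kubernetes.conf` key, and the `kube.*` match pattern are assumptions for illustration and do not come from this PR; only `Use_Kubelet`, `Kubelet_Port`, and the `fluentbit-system` namespace appear in the diff.

```yaml
# Hypothetical ConfigMap carrying the Fluent Bit filter configuration.
# metadata.name, the data key, and the Match pattern are illustrative assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: fluentbit-system
data:
  filter-kubernetes.conf: |
    [FILTER]
        Name            kubernetes
        Match           kube.*
        Use_Kubelet     On
        Kubelet_Port    10250
```

`Kubelet_Port` is shown only for completeness; 10250 is already the documented default, so the line could be omitted.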

PettitWesley · 2021-02-20T02:10:34Z

pipeline/filters/kubernetes.md

+    name: fluentbitds
+    namespace: fluentbit-system
+```
+The difference is that kubelet need a special permission for resource `nodes/proxy` to get HTTP request in. When creating the `role` or `clusterRole`, we need to add `nodes/proxy` into the rule for resource.


"you" is more typical in docs than "we", I think

changed to you
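
As a rough illustration of the permission change described in the diff above, a ClusterRole that includes `nodes/proxy` could be sketched as follows. The role name and the exact set of other resources and verbs are assumptions here; the only requirement stated in this PR is that `nodes/proxy` be added to the rule's resources.

```yaml
# Hypothetical ClusterRole; the name and the resources other than
# nodes/proxy are assumptions chosen for illustration.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentbit-read
rules:
  - apiGroups: [""]
    resources:
      - namespaces
      - pods
      - nodes
      - nodes/proxy   # needed so the kubelet accepts the HTTP request
    verbs:
      - get
      - list
      - watch
```

The role still has to be bound to the service account that Fluent Bit runs under for the permission to take effect.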

Signed-off-by: Drew Zhang <[email protected]>

DrewZhang13 mentioned this pull request Feb 9, 2021

filter_kubernetes: option for get meta information from kubelet /pods… fluent/fluent-bit#3025

Merged

PettitWesley reviewed Feb 13, 2021

DrewZhang13 force-pushed the eks-scale branch 2 times, most recently from fcec1f6 to bb6feef Compare February 15, 2021 09:14

PettitWesley reviewed Feb 15, 2021

DrewZhang13 force-pushed the eks-scale branch 2 times, most recently from 7da7774 to e8cbc04 Compare February 16, 2021 09:30

edsiper force-pushed the master branch from 0d9b567 to 51ef8ec Compare February 16, 2021 16:53

DrewZhang13 force-pushed the eks-scale branch from e8cbc04 to 93d6f07 Compare February 20, 2021 02:07

PettitWesley reviewed Feb 20, 2021

PettitWesley approved these changes Feb 20, 2021

filter: kubernetes: add documentation for use_kubelet(#1948)

ec11d51

Signed-off-by: Drew Zhang <[email protected]>

DrewZhang13 force-pushed the eks-scale branch from 93d6f07 to ec11d51 Compare February 20, 2021 02:18

PettitWesley merged commit 7a6fa23 into fluent:master Mar 4, 2021
