Skip to content

tags: initial implementation of tags #149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 2, 2021

Conversation

nicolehanjing
Copy link
Member

@nicolehanjing nicolehanjing commented Oct 28, 2020

first pass of tags implementation

What type of PR is this?
/kind feature

What this PR does / why we need it:
We have two formats of the cluster tags in the legacy implementation:

// TagNameKubernetesClusterPrefix is the tag name we use to differentiate multiple
// logically independent clusters running in the same AZ.
// The tag key = TagNameKubernetesClusterPrefix + clusterID
// The tag value is an ownership value
const TagNameKubernetesClusterPrefix = "kubernetes.io/cluster/"

// TagNameKubernetesClusterLegacy is the legacy tag name we use to differentiate multiple
// logically independent clusters running in the same AZ.  The problem with it was that it
// did not allow shared resources.
const TagNameKubernetesClusterLegacy = "KubernetesCluster"

As @nckturner mentioned before, the current format ("kubernetes.io/cluster/<clusterID>": <ownership>") causes issues for customers trying to use tags to organize billing.

After discussion, we'd like to see tags with formats:

1) kubernetes.io/cluster/ = <clusterID>
2) kubernetes.io/cluster/<clusterID> = ""

the second format is only used for shared resources e.g. subnets

and once we have other workarounds, will make corresponding updates.
Which issue(s) this PR fixes:

Part of #125

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 28, 2020
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 28, 2020
@nicolehanjing nicolehanjing force-pushed the nicoleh-tags branch 6 times, most recently from 8139e29 to 5b1e59d Compare November 4, 2020 19:01
@nicolehanjing nicolehanjing force-pushed the nicoleh-tags branch 3 times, most recently from 0f2fe6d to 891e3e0 Compare November 12, 2020 00:41
@nicolehanjing nicolehanjing changed the title [WIP] tags: initial implementation of tags tags: initial implementation of tags Nov 12, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 12, 2020
@nckturner
Copy link
Contributor

So just to summarize our conversation from this morning, we need to keep the kubernetes.io/cluster/<clusterID> = shared|owned for discovery, but we will allow passing in additionalTags (or something similar) via the config to aid in billing and other tag use cases, like where users automatically delete all resources from an account that aren't tagged a certain way. It also makes sense to use "tag-on-create" where possible because we believe it is better supported (although I am verifying that). There have been some changes to the APIs since V1 was written, so we should take advantage of those.

@nckturner
Copy link
Contributor

So actually as of now I think the resource tagging API is fully supported in all regions, so I don't think we need to prefer tag-on-create for that reason. Tag-on-create might be nice for other reasons, like being eaiser to implement and fewer total number of api calls.

@nicolehanjing
Copy link
Member Author

nicolehanjing commented Nov 13, 2020

So just to summarize our conversation from this morning, we need to keep the kubernetes.io/cluster/<clusterID> = shared|owned for discovery, but we will allow passing in additionalTags (or something similar) via the config to aid in billing and other tag use cases, like where users automatically delete all resources from an account that aren't tagged a certain way. It also makes sense to use "tag-on-create" where possible because we believe it is better supported (although I am verifying that). There have been some changes to the APIs since V1 was written, so we should take advantage of those.

Thanks @nckturner for the summary
for my current implementation, I only used the first format which is kubernetes.io/cluster/ = <clusterID>, but as we are keeping kubernetes.io/cluster/<clusterID> = shared|owned, I will need to keep both of them in this initial implementation, right?

Copy link
Member

@andrewsykim andrewsykim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review @nicolehanjing

@@ -61,6 +77,30 @@ type EC2Metadata interface {
GetMetadata(path string) (string, error)
}

func readAWSCloudConfig(config io.Reader) (*v1alpha1.AWSCloudConfig, error) {
if config == nil {
return nil, fmt.Errorf("no AWS cloud provider config file given")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use errors.New instead since there's no args

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gotcha!

return nil, err
}
if cfg.Kind != "AWSCloudConfig" {
return nil, fmt.Errorf("invalid cloud configuration object %q", cfg.Kind)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"invalid Kind for cloud config: %q"

@@ -82,7 +122,7 @@ func azToRegion(az string) (string, error) {
}

// newCloud creates a new instance of AWSCloud.
func newCloud() (cloudprovider.Interface, error) {
func newCloud(cfg v1alpha1.AWSCloudConfig) (cloudprovider.Interface, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pass in the pointer

return clusterName, nil
}

func (t *awsTagging) init(clusterName string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I tend to avoid naming functions init because init is a reserved function name in Go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend creating a constructor function for awsTagging like NewAWSTags and have it receive clusterName as one of the parameters

@@ -53,6 +61,14 @@ type cloud struct {
region string
ec2 EC2
metadata EC2Metadata
tagging awsTagging
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just name this tags?

@nicolehanjing
Copy link
Member Author

@andrewsykim updated! PTAL :)

}

func (t *awsTagging) hasClusterTag(tags []*ec2.Tag) bool {
// if the clusterName is not configured -- we consider all instances.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: wondering if this logic can be refactored slightly. IMO returning true from a function called hasClusterTag when there isn't a tag to look for is bit confusing (for the sole purpose of causing some side effect behavior in the caller), wondering if its worth separating return values to have distinct values for "is cluster tag configured" and "does cluster tag exist". We could return a specific error (or bool value) that the caller can check for if cluster tag is not configured.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, personally I prefer to return a specific error with some info for the caller to check, will update!

// we got an error where the decode wasn't related to a missing type
return nil, err
}
if cfg.Kind != "AWSCloudConfig" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any other required fields we have to verify here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I think for this PR we don't technically have to validate anything since ClusterName is optional (I think) but there will be a need to validate field values at some point. Here's one example of how to do it:

https://github.com/kubernetes/kubernetes/blob/5344afd4fbfc3cb2ed6785d89b6f760886f100b5/pkg/credentialprovider/plugin/config.go#L71-L128


// no clusterName is a sign of misconfigured cluster, but we can't be tagging the resources with empty
// strings
if len(t.ClusterName) == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a sign of misconfigured cluster, we should log a warning here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean clusterName is actually required? If so we should validate it when we read the config and return an error if it's empty. I thought clusterName was optional and it can be discovered through instances but I might be mis-remembering the details.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think requiring clusterName is reasonable, it's an explicit step from the user requiring them to understand how/why the cluster name is needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a way to infer the cluster name, I think that'd be ideal. "We will add the cluster name automatically, but you can override it via config" sounds reasonable to me. Although I'm not sure if there's a case where the user would want to override it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree, will figure out if we can discover the cluster name through instances
for now I leave a TODO note here and a warning log in https://github.com/kubernetes/cloud-provider-aws/pull/149/files#diff-78811b2c178b96d8cf3e546ccbc24defbe15e8ef228da31b36d17ecfe29174ceR183

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 9, 2020
for _, testcase := range testcases {
t.Run(testcase.name, func(t *testing.T) {
var buffer bytes.Buffer
_, err := buffer.WriteString(testcase.configData)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplieifed to:

bytes.NewBufferString(testcase.configData)

Copy link
Member

@andrewsykim andrewsykim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to update the Instances implementation to use the clusterName tags for API calls.

return nil, fmt.Errorf("failed to read AWS cloud provider config file: %v", err)
}

errs := validateAWSCloudConfig(cfg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewsykim updated! PTAL :)

  • added validation function & unit tests
  • update the Instances implementation to use the clusterName tags for API calls

cloudprovider "k8s.io/cloud-provider"
"k8s.io/cloud-provider-aws/pkg/apis/config/v1alpha1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import alias this to awsconfigv1alpha1

allErrs = append(allErrs, field.Invalid(field.NewPath("Kind"), "invalid Kind for cloud config: %q", config.Kind))
}

if config.APIVersion != "config.aws.io/v1alpha1" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use awsconfigv1alpha1.SchemeGroupVersion.String() instead of config.aws.io/v1alpha1

allErrs := field.ErrorList{}

if config.Kind != "AWSCloudConfig" {
allErrs = append(allErrs, field.Invalid(field.NewPath("Kind"), "invalid Kind for cloud config: %q", config.Kind))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

field value should be the json serialized version field.NewPath("kind")

}

if config.APIVersion != "config.aws.io/v1alpha1" {
allErrs = append(allErrs, field.Invalid(field.NewPath("APIVersion"), "invalid APIVersion for cloud config: %q", config.APIVersion))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APIVersion -> apiVersion

allErrs = append(allErrs, field.Invalid(field.NewPath("APIVersion"), "invalid APIVersion for cloud config: %q", config.APIVersion))
}

fieldPath := field.NewPath("Config")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config -> config

}

fieldPath := field.NewPath("Config")
if config.Config == (v1alpha1.AWSConfig{}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check is required, it would fail in field-specific validation anyways

}

if len(config.Config.ClusterName) == 0 {
allErrs = append(allErrs, field.Required(fieldPath.Child("ClusterName"), "cluster name cannot be empty"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClusterName -> clusterName

@nicolehanjing
Copy link
Member Author

@andrewsykim @nckturner @ayberk PTAL :)

Copy link
Member

@andrewsykim andrewsykim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm

There's a few TODOs in the comments and the code but I think we can address those with #156

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, nicolehanjing

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2021
@k8s-ci-robot k8s-ci-robot merged commit d8f0022 into kubernetes:master Feb 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants