Skip to content

Commit 240b154

Browse files
authored
Merge pull request #3 from aramase/aramase/f/kep_3331_review_comments_2
address review feedback (part 2)
2 parents 5204f7e + bd0f644 commit 240b154

File tree

1 file changed

+70
-14
lines changed
  • keps/sig-auth/3331-structured-config-for-oidc-authentication

1 file changed

+70
-14
lines changed

keps/sig-auth/3331-structured-config-for-oidc-authentication/README.md

+70-14
Original file line numberDiff line numberDiff line change
@@ -194,7 +194,6 @@ TODO:
194194
- distributed claims with fancier resolution requirements (such as access tokens as input)
195195
- implementation detail: we should probably parse the `iss` claim out once
196196
- should audit annotations be set on validation failure?
197-
- should we introduce implicit checks that assure at least one of uid/username are nonempty?
198197
- decide what error should be returned if CEL eval fails at runtime
199198
`500 Internal Sever Error` seem appropriate but authentication can only do `401`
200199

@@ -225,10 +224,6 @@ jwt:
225224
extra:
226225
- key: 'client_name'
227226
valueExpression: 'claims.some_claim'
228-
# TODO(enj): drop this and figure out to get from CEL
229-
# claimFilters:
230-
# - username
231-
# - roles
232227
userInfoValidationRules:
233228
- rule: "!userInfo.username.startsWith('system:')"
234229
message: username cannot used reserved system: prefix
@@ -337,16 +332,28 @@ type JWTAuthenticator struct {
337332
// Same value as the --oidc-issuer-url flag.
338333
// Used to fetch discovery information unless overridden by discoveryURL.
339334
// Required to be unique.
340-
URL string `json:"url,omitempty"`
335+
// Note that egress selection configuration is not used for this network connection.
336+
// TODO: decide if we want to support egress selection configuration and how to do so.
337+
URL string `json:"url"`
341338

342339
// discoveryURL if specified, overrides the URL used to fetch discovery information.
340+
// This is for scenarios where the well-known and jwks endpoints are hosted at a different
341+
// location than the issuer (such as locally in the cluster).
343342
// Format must be https://url/path.
343+
//
344344
// Example:
345-
// curl oidc.oidc-namespace (.discoveryURL field)
345+
// A discovery url that is exposed using kubernetes service 'oidc' in namespace 'oidc-namespace'.
346+
// certificateAuthority is used to verify the TLS connection and the hostname on the leaf certifcation
347+
// must be set to 'oidc.oidc-namespace'.
348+
//
349+
// curl https://oidc.oidc-namespace (.discoveryURL field)
346350
// {
347351
// issuer: "https://oidc.example.com" (.url field)
348352
// }
353+
//
349354
// Required to be unique.
355+
// Note that egress selection configuration is not used for this network connection.
356+
// TODO: decide if we want to support egress selection configuration and how to do so.
350357
// +optional
351358
DiscoveryURL *string `json:"discoveryURL,omitempty"`
352359

@@ -420,7 +427,7 @@ type JWTAuthenticator struct {
420427
// username represents an option for the username attribute.
421428
// Claim must be a singular string claim.
422429
// TODO: decide whether to support a distributed claim for username (what are we required to correlate between the data retrieved for distributed claims? sub? something else?). Limit distributed claim support to OIDC things with clientID validation?
423-
// Expression must produce a string value.
430+
// Expression must produce a string value that must be non-empty.
424431
// Possible prefixes based on the config:
425432
// (1) if userName.prefix = "-", no prefix will be added to the username
426433
// (2) if userName.prefix = "" and userName.claim != "email", prefix will be "<issuer.url>#"
@@ -432,6 +439,21 @@ type JWTAuthenticator struct {
432439
// Expression must produce a string or string array value.
433440
// "", [], missing, and null values are treated as having no groups.
434441
// TODO: investigate if you could make a single expression to construct groups from multiple claims. If not, maybe []PrefixedClaimOrExpression?
442+
// For input claim:
443+
// {
444+
// "claims": {
445+
// "roles":"foo,bar",
446+
// "other_roles":"baz,qux"
447+
// "is_admin": true
448+
// }
449+
// }
450+
// To concatenate lists:
451+
// claims.roles.split(",") + claims.other_roles.split(",")
452+
// Constructing single item list and concatenating lists:
453+
// claims.roles.split(",") + ["hardcoded_group"]
454+
// claims.roles.split(",") + (claims.is_admin ? ["admin"]:[])
455+
// Type check and wrap in a list if needed:
456+
// (type(claims.string_or_list_claim) == string ? [claims.string_or_list_claim] : claims.string_or_list_claim) + ["hardcoded_group"]
435457
// +optional
436458
Groups PrefixedClaimOrExpression `json:"groups,omitempty"`
437459
// uid represents an option for the uid attribute.
@@ -544,6 +566,26 @@ type JWTAuthenticator struct {
544566
client_name: kubernetes
545567
```
546568

569+
For distributed claims:
570+
571+
```json
572+
claims = {
573+
"foo":"bar",
574+
"foo.bar": "...",
575+
"true": "...",
576+
"_claim_names": {
577+
"groups": "group_source"
578+
},
579+
"_claim_sources": {
580+
"group_source": {"endpoint": "https://example.com/claim_source"}
581+
}
582+
}
583+
```
584+
585+
- For claim names containing `.`, we can reference using `claims["foo.bar"]`
586+
- TODO: can we implement a CEL type resolver so that a cel expression `claims.foo` gets resolved via a distributed claim the first time it is used?
587+
- this seems likely and preferable so we only resolve the things we need (in case an early validation rule fails and short-circuits).
588+
547589
### CEL
548590

549591
* CEL runtime should be compiled only once if structured authentication config option is enabled.
@@ -552,14 +594,26 @@ type JWTAuthenticator struct {
552594
* `claims` for JWT claims (payload)
553595
* One variable will be available to use in `userInfoValidationRules`:
554596
* `userInfo` with the same schema as [authentication.k8s.io/v1, Kind=UserInfo](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.27/#userinfo-v1-authentication-k8s-io)
555-
* To make working with strings more convenient, `strings` and `encoding`
556-
[CEL extensions](https://github.com/google/cel-go/tree/v0.9.0/ext) will be enabled,
557-
e.g, to be able to split a string with comma separated fields and use them as a single array.
597+
* The standard Kubernetes CEL environment, including extension libraries, will be used.
598+
* Current environment:
599+
* [Extension libraries](https://github.com/kubernetes/kubernetes/blob/5fe3563ad7e04d5470368aa821f42f131d3bd8fc/staging/src/k8s.io/apiserver/pkg/cel/library/libraries.go#L26)
600+
* [Base environment](https://github.com/kubernetes/kubernetes/blob/5fe3563ad7e04d5470368aa821f42f131d3bd8fc/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/schema/cel/compilation.go#L83)
601+
* The encoding library needs to be added to the environment since it's currently not used. Doing so will help keep CEL consistent across the API.
558602
* Benchmarks are required to see how different CEL expressions affects authentication time.
603+
* There will be a upper bound of 5s for the CEL expression evaluation.
559604
* Caching will be used to prevent having to execute the CEL expressions on every request.
560605
- TODO decide what the behavior of the token cache will be on config reload
561606
- TODO should the token expiration cache know about the `exp` field instead of hard coding `10` seconds?
562607
this requires awareness of key rotation to implement safely
608+
* TODO: decide how to safe guard access to fields that might not exist or stop existing at any moment.
609+
* Using `has()` to guard access to fields.
610+
* Could we do some kind of defaulting for fields that don't exist?
611+
612+
> Notes from PR review (jpbetz):
613+
>
614+
> You can pass a context to CEL and cancel runtime evaluation if the context is canceled. This causes the CEL expression to halt execution promptly and evaluate to an error.
615+
> You can also put a runtime limit (measured in abstract cost units that are hardware and wall clock independent) on CEL expressions to bound running time.
616+
> (There is also a way to set a limit for the estimated cost, which is computed statically on compiled CEL programs if you know the worst case size of the input data, but this might be overkill for this feature)
563617

564618
### Flags
565619

@@ -711,11 +765,11 @@ No.
711765

712766
###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
713767

714-
Yes.
768+
Yes. Note that if the `--oidc-*` flags were previously in use, they must be restored for OIDC authentication to function correctly.
715769

716770
###### What happens if we reenable the feature if it was previously rolled back?
717771

718-
No impact.
772+
No impact (generally speaking, authentication does not cause persisted state in the cluster).
719773

720774
###### Are there any tests for feature enablement/disablement?
721775

@@ -732,6 +786,7 @@ It cannot fail until a bug in kube-apiserver connected to parsing structured con
732786
Possible consequences are:
733787
* A cluster administrator rolls out the feature with the addition of some validation rules that may allow access to previously restricted users.
734788
* Other cluster components can depend on claim validations. Rolling back would mean losing validation functionality.
789+
* If the cluster admin fails to restore any previously in-use `--oidc-*` flags on a rollback, OIDC authentication will not function.
735790

736791
###### What specific metrics should inform a rollback?
737792

@@ -760,6 +815,7 @@ TBA
760815

761816
* There will be a corresponding message in kube-apiserver logs.
762817
* By checking the kube-apiserver flags.
818+
* By checking the metrics emitted by the kube-apiserver.
763819

764820
###### How can someone using this feature know that it is working for their instance?
765821

@@ -796,7 +852,7 @@ These goals will help you determine what you need to measure (SLIs) in the next
796852
question.
797853
-->
798854
799-
The feature should work 99.9% of the time (SLOs for actual requests should not change in any way compared to the flag-based OIDC configuration).
855+
SLOs for actual requests should not change in any way compared to the flag-based OIDC configuration.
800856
801857
###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
802858

0 commit comments

Comments
 (0)