add helm template #416
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: Kuromesi. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @Kuromesi. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Signed-off-by: Kuromesi <[email protected]>
✅ Deploy Preview for gateway-api-inference-extension ready!
Thanks @Kuromesi! I'll try to take a look at this today.
/assign
/ok-to-test
Thanks! I have some questions that I'm not quite certain about:
Hey @Kuromesi, thanks for the work on this! I think it would be helpful to think about how we expect users to use this project.

Initial Setup

Day to Day

While your PR seems to do a great job at capturing the config required for our current quickstart guide, it's not where we want to be long term. In the next ~month, I'm hopeful that we'll have built-in support for this pattern from kgateway, Istio, and GKE Gateway implementations. That will mean that instead of manually patching Envoy Gateway like our current quickstart guide (and this Helm chart) do, users will be able to just use these APIs directly.

With that background, I think the original issue was specifically asking for a chart that "simplifies creating an InferencePool with an associated EPP deployment". I think the ideal for this would be a chart that took parameters for InferencePool name, and then had defaults for all the rest, including the EPP configuration (Deployment, Service, HPA, RBAC). It looks like you have a lot of this in the chart already, but ideally the chart could be restructured to be focused exclusively on InferencePool and deploying a corresponding extension. In the future we could expand this chart to include InferenceModels pointing at the InferencePool. I'd recommend leaving all CRD, Gateway, and HTTPRoute configuration out of this chart.

Hopefully that approach makes sense. I'm also happy to chat about this in the #gateway-api-inference-extension channel on Kubernetes Slack if that would be easier.
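To make that concrete, a chart focused on the InferencePool might expose a values.yaml roughly like the sketch below. All key names and defaults here are illustrative assumptions, not the chart's actual API.

```yaml
# Hypothetical values.yaml for an InferencePool-focused chart (illustrative only).
# The user supplies the pool name and model server selector; everything else has defaults.
inferencePool:
  name: pool-1                            # name of the InferencePool to create
  targetPortNumber: 8000                  # port the model servers listen on
  modelServerSelector:
    app: vllm-llama2-7b

inferenceExtension:
  replicas: 1
  image:
    repository: registry.example.com/epp  # placeholder image location
    tag: main
    pullPolicy: Always
  extProcPort: 9002                       # ext-proc gRPC port exposed to the gateway
```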
Got it, thanks! |
Signed-off-by: Kuromesi <[email protected]>
Thanks for all the work on this @Kuromesi! Left some more nits but otherwise LGTM
@@ -0,0 +1 @@
Gateway api inference extension deployed.
Suggested change:
- Gateway api inference extension deployed.
+ InferencePool deployed.
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
Nit: I'd expect this chart to live at config/charts/inferencepool
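For reference, the suggested location would give a layout along these lines (file names are illustrative):

```
config/charts/inferencepool/
├── Chart.yaml
├── values.yaml
└── templates/
    ├── _helpers.tpl
    ├── epp-deployment.yaml
    ├── epp-service.yaml
    └── inferencepool.yaml
```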
@@ -0,0 +1,9 @@
apiVersion: v2
name: gateway-api-inference-extension
Suggested change:
- name: gateway-api-inference-extension
+ name: inferencepool
@@ -0,0 +1,9 @@
apiVersion: v2
name: gateway-api-inference-extension
description: A Helm chart for gateway-api-inference-extension
Suggested change:
- description: A Helm chart for gateway-api-inference-extension
+ description: A Helm chart for InferencePool
*/}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
Recommend adding a trailing newline.
tag: main
pullPolicy: Always

name: inference-gateway-ext-proc
This should probably have the name of the InferencePool in it by default. So if the InferencePool is called base, maybe this is called base-epp.
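One way to get that default is a small helper that derives the EPP name from the pool name. This is a sketch only; the helper name below is made up for illustration.

```yaml
{{/* Hypothetical helper in _helpers.tpl: derive the EPP name from the InferencePool name */}}
{{- define "gateway-api-inference-extension.eppName" -}}
{{ .Values.inferencePool.name }}-epp
{{- end -}}
```

The EPP Deployment and Service could then use `name: {{ include "gateway-api-inference-extension.eppName" . }}`.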
inferencePool:
  namespace: default
  name: vllm-llama2-7b-pool
pool-1 :)
{{- include "gateway-api-inference-extension.labels" . | nindent 4 }} | ||
spec: | ||
selector: | ||
{{- include "gateway-api-inference-extension.selectorLabels" . | nindent 4 }} |
Both of these feel like they should be included in values.yaml
This is generated in _helpers.tpl; should we provide customization in values.yaml?
{{/*
Selector labels
*/}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
I'd missed that, thanks! While I don't think we need the InferencePool labels to be configurable, I think it's important to make the selector configurable.
You mean we should also create an InferencePool in the Helm chart? (Which I haven't done yet.)
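One possible shape for a configurable selector, sketched under the assumption of a new `inferenceExtension.selectorLabels` value (not part of the current chart):

```yaml
{{/* Sketch: selector labels, overridable via .Values.inferenceExtension.selectorLabels */}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
{{- if .Values.inferenceExtension.selectorLabels -}}
{{ toYaml .Values.inferenceExtension.selectorLabels }}
{{- else -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
{{- end -}}
```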
port: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
targetPort: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
Nit: I think I'd call this extProcPort. I also don't think you need to specify targetPort unless it's different from port.
Suggested change:
- port: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
- targetPort: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
+ port: {{ .Values.inferenceExtension.extProcPort | default 9002 }}
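If the rename is adopted, the matching values.yaml entry might look like the sketch below (the key name follows the suggestion above; the default is only illustrative). Omitting targetPort is safe because Kubernetes defaults it to the value of port.

```yaml
inferenceExtension:
  extProcPort: 9002   # gRPC port the endpoint picker's ext-proc server listens on
```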
selector:
  {{- include "gateway-api-inference-extension.selectorLabels" . | nindent 4 }}
ports:
  - name: grpc
Suggested change:
-   - name: grpc
+   - name: ext_proc
Thanks for your patient review! I will fix these issues.
Thanks @Kuromesi for doing this! Is anything blocking this PR now?
Resolves #381: deploy via Helm. A generated file is shown in config/manifests/gateway-api-inference-extension/generated.yaml. To avoid conflicts with other releases, I extend the resource names with the Helm release name, as shown in config/manifests/gateway-api-inference-extension/templates/_helpers.tpl.
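The prefixing pattern is roughly the following; this is a sketch of the idea, and the actual helper in _helpers.tpl may differ.

```yaml
{{/* Sketch: prefix resource names with the release name to avoid cross-release conflicts */}}
{{- define "gateway-api-inference-extension.fullname" -}}
{{ .Release.Name }}-{{ .Values.inferenceExtension.name }}
{{- end -}}
```

Templates would then reference it as `name: {{ include "gateway-api-inference-extension.fullname" . }}`.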