add helm template #416
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: Kuromesi. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @Kuromesi. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Signed-off-by: Kuromesi <[email protected]>
✅ Deploy Preview for gateway-api-inference-extension ready!
Thanks @Kuromesi! I'll try to take a look at this today.
/assign
/ok-to-test
Thanks! I have some questions that I'm not quite certain about:
Hey @Kuromesi, thanks for the work on this! I think it would be helpful to think about how we expect users to use this project.

Initial Setup

Day to Day

While your PR seems to do a great job at capturing the config required for our current quickstart guide, it's not where we want to be long term. In the next ~month, I'm hopeful that we'll have built-in support for this pattern from kgateway, Istio, and GKE Gateway implementations. That will mean that instead of manually patching Envoy Gateway like our current quickstart guide (and this Helm chart) do, users will be able to just use these APIs directly.

With that background, I think the original issue was specifically asking for a chart that "simplifies creating an InferencePool with an associated EPP deployment". I think the ideal for this would be a chart that took parameters for InferencePool name, and then had defaults for all the rest, including the EPP configuration (Deployment, Service, HPA, RBAC). It looks like you have a lot of this in the chart already, but ideally the chart could be restructured to be focused exclusively on InferencePool and deploying a corresponding extension. In the future we could expand this chart to include InferenceModels pointing at the InferencePool. I'd recommend leaving all CRD, Gateway, and HTTPRoute configuration out of this chart.

Hopefully that approach makes sense. I'm also happy to chat about this in the #gateway-api-inference-extension channel on Kubernetes Slack if that would be easier.
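To make that concrete, a chart focused on the InferencePool might expose a values.yaml roughly like the sketch below. All key names and defaults here are illustrative assumptions, not the chart's actual API.

```yaml
# Hypothetical values.yaml for an InferencePool-focused chart (illustrative only).
# The user supplies the pool name and model server selector; everything else has defaults.
inferencePool:
  name: pool-1                            # name of the InferencePool to create
  targetPortNumber: 8000                  # port the model servers listen on
  modelServerSelector:
    app: vllm-llama2-7b

inferenceExtension:
  replicas: 1
  image:
    repository: registry.example.com/epp  # placeholder image location
    tag: main
    pullPolicy: Always
  extProcPort: 9002                       # ext-proc gRPC port exposed to the gateway
```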
Got it, thanks! |
Signed-off-by: Kuromesi <[email protected]>
Thanks for all the work on this @Kuromesi! Left some more nits but otherwise LGTM
@@ -0,0 +1 @@
Gateway api inference extension deployed.
Suggested change:
- Gateway api inference extension deployed.
+ InferencePool deployed.
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
Nit: I'd expect this chart to live at config/charts/inferencepool
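For reference, the suggested location would give a layout along these lines (file names are illustrative):

```
config/charts/inferencepool/
├── Chart.yaml
├── values.yaml
└── templates/
    ├── _helpers.tpl
    ├── epp-deployment.yaml
    ├── epp-service.yaml
    └── inferencepool.yaml
```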
@@ -0,0 +1,9 @@
apiVersion: v2
name: gateway-api-inference-extension
Suggested change:
- name: gateway-api-inference-extension
+ name: inferencepool
@@ -0,0 +1,9 @@
apiVersion: v2
name: gateway-api-inference-extension
description: A Helm chart for gateway-api-inference-extension
Suggested change:
- description: A Helm chart for gateway-api-inference-extension
+ description: A Helm chart for InferencePool
*/}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
Recommend adding a trailing newline.
tag: main
pullPolicy: Always

name: inference-gateway-ext-proc
This should probably have the name of the InferencePool in it by default. So if the InferencePool is called base, maybe this is called base-epp.
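One way to get that default is a small helper that derives the EPP name from the pool name. This is a sketch only; the helper name below is made up for illustration.

```yaml
{{/* Hypothetical helper in _helpers.tpl: derive the EPP name from the InferencePool name */}}
{{- define "gateway-api-inference-extension.eppName" -}}
{{ .Values.inferencePool.name }}-epp
{{- end -}}
```

The EPP Deployment and Service could then use `name: {{ include "gateway-api-inference-extension.eppName" . }}`.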
inferencePool:
  namespace: default
  name: vllm-llama2-7b-pool
pool-1 :)
{{- include "gateway-api-inference-extension.labels" . | nindent 4 }} | ||
spec: | ||
selector: | ||
{{- include "gateway-api-inference-extension.selectorLabels" . | nindent 4 }} |
Both of these feel like they should be included in values.yaml
This is generated in _helpers.tpl; should we provide customization in values.yaml?
{{/*
Selector labels
*/}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
I'd missed that, thanks! While I don't think we need the InferencePool labels to be configurable, I think it's important to make the selector configurable.
You mean we should also create an InferencePool in the Helm chart? (Which I haven't done yet.)
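One possible shape for a configurable selector, sketched under the assumption of a new `inferenceExtension.selectorLabels` value (not part of the current chart):

```yaml
{{/* Sketch: selector labels, overridable via .Values.inferenceExtension.selectorLabels */}}
{{- define "gateway-api-inference-extension.selectorLabels" -}}
{{- if .Values.inferenceExtension.selectorLabels -}}
{{ toYaml .Values.inferenceExtension.selectorLabels }}
{{- else -}}
app: {{ .Values.inferenceExtension.name }}
{{- end -}}
{{- end -}}
```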
port: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
targetPort: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
Nit: I think I'd call this extProcPort. I also don't think you need to specify targetPort unless it's different from port.
Suggested change:
- port: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
- targetPort: {{ .Values.inferenceExtension.grpcPort | default 9002 }}
+ port: {{ .Values.inferenceExtension.extProcPort | default 9002 }}
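If the rename is adopted, the matching values.yaml entry might look like the sketch below (the key name follows the suggestion above; the default is only illustrative). Omitting targetPort is safe because Kubernetes defaults it to the value of port.

```yaml
inferenceExtension:
  extProcPort: 9002   # gRPC port the endpoint picker's ext-proc server listens on
```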
selector:
  {{- include "gateway-api-inference-extension.selectorLabels" . | nindent 4 }}
ports:
  - name: grpc
Suggested change:
-   - name: grpc
+   - name: ext_proc
Thanks for your patient review! I will fix these issues.
Thanks @Kuromesi for doing this! Is anything blocking this PR now?
Resolves #381: deploy via Helm. A generated file is shown in config/manifests/gateway-api-inference-extension/generated.yaml. To avoid conflicts with other releases, I extend the resource names with the Helm release name, as shown in config/manifests/gateway-api-inference-extension/templates/_helpers.tpl.
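The prefixing pattern is roughly the following; this is a sketch of the idea, and the actual helper in _helpers.tpl may differ.

```yaml
{{/* Sketch: prefix resource names with the release name to avoid cross-release conflicts */}}
{{- define "gateway-api-inference-extension.fullname" -}}
{{ .Release.Name }}-{{ .Values.inferenceExtension.name }}
{{- end -}}
```

Templates would then reference it as `name: {{ include "gateway-api-inference-extension.fullname" . }}`.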