Say I have an ensemble model, E, which runs models A and B, both released as version 1. Both A and B are only used in the context of the ensemble.
Say I also have two environments where I'm running Triton: nonprod and prod. Both use the same artifact repository, with the rest of each environment specifying model versions in its requests. Deploying a new model in either environment means changing the requested version, which lets me test in nonprod first and promote to prod later. By specifying allowed versions in the config, I can limit the number of models Triton loads, saving memory.
I create version 2 of model A, and want to test it in my ensemble workflow in nonprod. To deploy it, I have to update E's config.pbtxt, telling the ensemble to use version 2 instead of 1.
But since that file is at the top level of the model structure, and not tied to a version, modifying it changes the workflow for BOTH nonprod and prod.
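To make the problem concrete, here is a minimal sketch of the part of E's config.pbtxt that pins the versions, rendered from Python so the changing values are explicit. The helper name `render_ensemble_steps`, the tensor names, and the assumption that A feeds B are all hypothetical; only the `ensemble_scheduling`, `step`, `model_name`, `model_version`, `input_map` and `output_map` fields come from Triton's ensemble configuration.

```python
def render_ensemble_steps(a_version: int, b_version: int) -> str:
    """Return the ensemble_scheduling block of E's config.pbtxt.

    Tensor names and the A -> B chaining are placeholders for illustration.
    """
    return f"""ensemble_scheduling {{
  step [
    {{
      model_name: "A"
      model_version: {a_version}
      input_map {{ key: "A_IN" value: "ENSEMBLE_IN" }}
      output_map {{ key: "A_OUT" value: "a_to_b" }}
    }},
    {{
      model_name: "B"
      model_version: {b_version}
      input_map {{ key: "B_IN" value: "a_to_b" }}
      output_map {{ key: "B_OUT" value: "ENSEMBLE_OUT" }}
    }}
  ]
}}
"""


# Testing A v2 in nonprod means rewriting the one shared file with a_version=2,
# which prod then picks up as well.
print(render_ensemble_steps(a_version=2, b_version=1))
```

Because there is a single config.pbtxt per model directory, there is nowhere to keep the `a_version=1` and `a_version=2` renderings side by side under the same ensemble name.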
I would like to be able to have different versions of an ensemble model available, with each one pointing to different versions of underlying models. A way to specify a different config.pbtxt for different model versions would be one way to achieve this.
Alternatives I've considered include:
- Using separate model repositories per environment. This means more complicated build pipelines; my real setup has 5 environments, not just two.
- Separate ensemble model paths for each environment (i.e. models named `ensemble-e-nonprod` and `ensemble-e-prod`). This means config.pbtxt has to be redeployed every time we modify the model, and it doesn't work well with A/B testing within a single environment.
- Separate model paths for each version of the ensemble (i.e. `ensemble-e-1.0` and `ensemble-e-2.0`). This would work, but it means using a different versioning scheme for ensemble models than for regular models, which isn't ideal and could lead to confusion.
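The third alternative could be scripted in a build pipeline roughly as follows. This is only a sketch: `write_ensemble_release` and the template are hypothetical, and the template is deliberately incomplete (a real ensemble config.pbtxt also needs input/output declarations and tensor maps for each step).

```python
from pathlib import Path
from string import Template

# Hypothetical template: only the fields that differ per ensemble release.
ENSEMBLE_TEMPLATE = Template("""\
name: "$name"
platform: "ensemble"
ensemble_scheduling {
  step [
    { model_name: "A" model_version: $a_version },
    { model_name: "B" model_version: $b_version }
  ]
}
""")


def write_ensemble_release(repo: Path, release: str, a_version: int, b_version: int) -> Path:
    """Create <repo>/ensemble-e-<release>/ with its own config.pbtxt."""
    model_dir = repo / f"ensemble-e-{release}"
    # The ensemble has no artifacts of its own; the version directory stays empty.
    (model_dir / "1").mkdir(parents=True, exist_ok=True)
    config = ENSEMBLE_TEMPLATE.substitute(
        name=model_dir.name, a_version=a_version, b_version=b_version
    )
    (model_dir / "config.pbtxt").write_text(config)
    return model_dir
```

Calling `write_ensemble_release(repo, "2.0", a_version=2, b_version=1)` next to an existing `ensemble-e-1.0` keeps both pipelines available, at the cost of encoding the version in the model name rather than in a numbered version folder.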
Is there a better way to handle this that I'm missing?
Another note on this: as we update our models, it looks like we will be updating config.pbtxt regularly. Having a way to tie a configuration and a model together as one versioned object would be really helpful from a CI/CD perspective, for example by allowing a config.pbtxt to exist in each numbered version folder. The same convention would allow ensemble configurations to be versioned.
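As a sketch of what is being asked for, the lookup could work like this. To be clear, this is the proposed convention, not existing Triton behaviour, and both the function name `resolve_config` and the fallback to the top-level file are assumptions added for illustration.

```python
from pathlib import Path


def resolve_config(model_dir: Path, version: int) -> Path:
    """Pick the config for one model version under the PROPOSED layout.

    Prefers <model_dir>/<version>/config.pbtxt and falls back to the
    existing top-level <model_dir>/config.pbtxt (fallback is an assumption).
    """
    versioned = model_dir / str(version) / "config.pbtxt"
    if versioned.is_file():
        return versioned
    return model_dir / "config.pbtxt"
```

With this layout, version 1 of E could keep pointing at A version 1 while version 2 of E points at A version 2, and each environment would select a pipeline by requesting the ensemble version it wants.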
Another client here also interested in this feature.
ghicks-novaprime changed the title from "Versioning for ensemble models and/or confg.pbtxt files" to "Versioning for ensemble models and/or config.pbtxt files" on Mar 11, 2025