fix: NATS queue initialization failure caused by customized served_model_name in PD disagg #354

Open
wants to merge 1 commit into
base: main

Conversation

dashanji

Overview:

If a customized served_model_name is used, vLLM parses it as a list rather than a string:
https://github.com/ai-dynamo/dynamo/blob/main/examples/llm/utils/vllm.py#L49

As a result, NATS queue initialization crashes at the line below, because stream_name (which is assigned from served_model_name) is a list rather than a string:
https://github.com/ai-dynamo/dynamo/blob/main/examples/llm/utils/nats_queue.py#L42
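
The failure mode described above can be sketched as follows. This is only an illustration, not the code in this PR: `build_parser` is a hypothetical stand-in that assumes the option is declared with `nargs="+"`, and `resolve_stream_name` is a hypothetical helper that falls back to the model path and otherwise takes the first served name.

```python
import argparse
from typing import Optional, Sequence, Union


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real argument parser. With nargs="+",
    # argparse returns a list even when only one name is passed.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", type=str, required=True)
    parser.add_argument("--served-model-name", nargs="+", type=str, default=None)
    return parser


def resolve_stream_name(
    model: str, served_model_name: Union[str, Sequence[str], None]
) -> str:
    """Return a single string that can be used as a stream name.

    Hypothetical helper: falling back to `model` and picking the first
    served name are assumptions made for this sketch.
    """
    if served_model_name is None:
        return model
    if isinstance(served_model_name, str):
        return served_model_name
    if len(served_model_name) == 0:
        raise ValueError("served_model_name must not be empty")
    return served_model_name[0]


args = build_parser().parse_args(
    ["--model", "org/model", "--served-model-name", "my-model"]
)
# The parsed value is a list, which is what breaks string-only consumers.
print(type(args.served_model_name).__name__)
print(resolve_stream_name(args.model, args.served_model_name))
```

Normalizing the value once, before it reaches the queue, keeps every downstream consumer working with a plain string.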

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)


copy-pr-bot bot commented Mar 23, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


👋 Hi dashanji! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving them.

🚀

@dashanji dashanji changed the title Fix NATS queue initialization failure caused by invalid served_model_name in PD disagg Fix NATS queue initialization failure caused by customized served_model_name in PD disagg Mar 23, 2025
@dashanji dashanji changed the title Fix NATS queue initialization failure caused by customized served_model_name in PD disagg fix: NATS queue initialization failure caused by customized served_model_name in PD disagg Mar 23, 2025
@a4zhangfei

@dashanji When is this expected to be merged into the main branch?

@rainj-me

Please don't just pick the first element. How about using a sha256 hash or base64 encoding of the name as the stream name? That way we don't have to replace special characters later on.

@rainj-me

> Please don't just pick the first element. How about using a sha256 hash or base64 encoding of the name as the stream name? That way we don't have to replace special characters later on.

Refer to commit bytedance-iaas@bd3f004

@dashanji
Author

@rainj-me I don't think using a random value is a good idea. When PD disagg runs in a distributed environment, how would the P worker and the D worker end up using the same NATS stream? In that case, a served_model_name customized by the user makes sense.

@rainj-me

> @rainj-me I think it's not a good idea to use a random one. Once we use the PD disagg in a distributed environment, how to make the P worker and D worker use the same nats stream? In this case, a customized served_model_name by users makes sense.

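sha256 is not a random value. It is deterministic: the same served_model_name always produces the same hash, so the P worker and the D worker would derive the same stream name.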
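
For readers following the thread, the hashing suggestion can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the referenced commit: `hashed_stream_name`, the `stream_` prefix, the newline separator, and the 32-character truncation are all hypothetical choices made for the example.

```python
import hashlib
from typing import Sequence, Union


def hashed_stream_name(
    served_model_name: Union[str, Sequence[str]], prefix: str = "stream_"
) -> str:
    """Derive a stream name from the served model name(s) with SHA-256.

    Hypothetical helper. The hex digest contains only [0-9a-f], so the
    result needs no further replacement of special characters.
    """
    if isinstance(served_model_name, str):
        names = [served_model_name]
    else:
        names = list(served_model_name)
    if not names:
        raise ValueError("served_model_name must not be empty")
    # Join with a newline so that ["a", "b"] and ["ab"] hash differently.
    joined = "\n".join(names)
    digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()
    # Truncation is an assumption made here to keep the name short.
    return prefix + digest[:32]


# Two workers that receive the same configuration compute the same name
# independently, without any coordination between them.
prefill_worker_stream = hashed_stream_name(["my-org/my-model"])
decode_worker_stream = hashed_stream_name(["my-org/my-model"])
print(prefill_worker_stream == decode_worker_stream)
```

The trade-off is readability: a hashed name is safe and reproducible, but it no longer shows which model a stream belongs to, whereas a user-chosen served_model_name stays recognizable.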

3 participants