Replies: 1 comment
-
Hi @khasinski and thank you for the interest in RubyLLM! RubyLLM is designed to be a client for LLMs, not a model host. Adding model serving capabilities would completely change the performance profile: from being IO-bound to CPU/GPU/memory-bound. That's a fundamentally different library with different concerns. I'd recommend keeping your ONNX Runtime implementation separate and, if you want, building a provider for RubyLLM that speaks to your server over HTTP.
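For context, here is a minimal sketch of the server side of that suggestion: an ONNX embedding model exposed behind an OpenAI-style `/v1/embeddings` endpoint using Sinatra plus the `onnxruntime` and `tokenizers` gems. The model file, tokenizer repo, input/output names, and mean pooling are all assumptions about your export, not anything RubyLLM ships.

```ruby
# Hypothetical sketch: wrap an ONNX embedding model behind an OpenAI-style
# HTTP endpoint so a RubyLLM provider (or any HTTP client) can call it.
require "sinatra"
require "json"
require "onnxruntime"
require "tokenizers"

MODEL     = OnnxRuntime::Model.new("model.onnx")  # assumed local export
TOKENIZER = Tokenizers.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")  # assumed tokenizer

post "/v1/embeddings" do
  payload = JSON.parse(request.body.read)
  texts   = Array(payload["input"])

  data = texts.each_with_index.map do |text, i|
    ids  = TOKENIZER.encode(text).ids
    mask = Array.new(ids.length, 1)
    feed = {
      "input_ids"      => [ids],
      "attention_mask" => [mask],
      "token_type_ids" => [Array.new(ids.length, 0)], # drop if the export has no such input
    }
    out = MODEL.predict(feed)
    # Assumes the export names its output "last_hidden_state"; mean-pool over tokens.
    hidden = out["last_hidden_state"][0]
    pooled = hidden.transpose.map { |dim| dim.sum / dim.length.to_f }
    { "object" => "embedding", "index" => i, "embedding" => pooled }
  end

  content_type :json
  { "object" => "list", "data" => data, "model" => payload["model"] }.to_json
end
```

A RubyLLM provider would then just POST text to this endpoint and return the vectors, which keeps RubyLLM itself purely IO-bound as described above.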
-
Hey, I'm currently using ONNX models that I run with ONNX Runtime for faster embeddings. I'm planning to extend this a bit to support token-generating models (something with an API similar to ONNX Runtime GenAI). If you add docs on writing providers, I'd be happy to write a wrapper that translates between the two formats so I could use your DSL to interact with those models.
Config would probably be just a link to a Hugging Face repo, and calling those models wouldn't need any HTTP, just regular function calls; a rough sketch of that path is below.
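To make the "config is just a link to a HF repo" idea concrete, here is a rough in-process sketch using the `onnxruntime` gem. The repo name, file layout under `resolve/main/onnx/`, and input/output names are assumptions about a particular export.

```ruby
# Hypothetical sketch: "config" is just a Hugging Face repo; the ONNX export is
# fetched once, then the model is called as a regular in-process function.
require "open-uri"
require "onnxruntime"

HF_REPO    = "sentence-transformers/all-MiniLM-L6-v2"   # assumed repo
MODEL_URL  = "https://huggingface.co/#{HF_REPO}/resolve/main/onnx/model.onnx"
MODEL_PATH = "model.onnx"

# Download the export once; later runs reuse the local file.
unless File.exist?(MODEL_PATH)
  URI.open(MODEL_URL) { |remote| File.binwrite(MODEL_PATH, remote.read) }
end

model = OnnxRuntime::Model.new(MODEL_PATH)

# The session reports which inputs and outputs the export expects (names vary by export).
p model.inputs.map { |i| i[:name] }   # e.g. ["input_ids", "attention_mask", "token_type_ids"]
p model.outputs.map { |o| o[:name] }  # e.g. ["last_hidden_state"]
```

A wrapper around this is just plain Ruby objects, so a token-generating (GenAI-style) model would presumably follow the same shape: load the session once, then call predict in a loop over generated tokens.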