Add Ollama as a supported provider #10
base: main
Conversation
This is so that calling `models.refresh!` won't attempt to use every provider API, which otherwise results in errors unless a valid key is given for every one. If this is too disruptive as default behavior, it could instead be a "default to offline" mode that has to be explicitly turned on, perhaps via an environment variable; in its absence, behavior would be the same as before. This is all so that `models.refresh!` can be called freely to populate Ollama models at runtime.
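Roughly, that opt-in variant could look like the sketch below. The env var name, the provider registry, and `configured?` are illustrative only, not code from this branch:

```ruby
# Hypothetical sketch of an opt-in "default offline" mode.
module OfflineMode
  PROVIDERS = {} # name => provider object responding to configured?

  def self.offline_by_default?
    ENV.key?('RUBYLLM_DEFAULT_OFFLINE')
  end

  # Providers that models.refresh! is allowed to hit.
  def self.refreshable_providers
    return PROVIDERS.values unless offline_by_default?

    # Only query providers the user explicitly configured, so refresh!
    # never errors out on missing API keys.
    PROVIDERS.values.select(&:configured?)
  end
end
```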
This was done by copying and adapting the existing Gemini provider. Tool usage and media were excised for now; I still need to research what Ollama offers and how it overlaps with this project.

Also, commit 9b387c1 disables all providers by default. The intention is to be able to work on Ollama (or any other single provider) without being forced to provide valid keys for EVERY provider and incur live calls; in particular, because Ollama doesn't come with any default models, its models have to be populated at runtime via `models.refresh!`. As mentioned in the commit message, this might be too intrusive for the intended usage of this project, so an alternative is to only do this when a specific env var puts ruby-llm into such a "default offline" mode.

This might interfere with the tests and/or the models update rake task, neither of which I have touched yet, since they also require valid keys for all providers. As for tests, I'd like some guidance on how to implement unit testing and eventually integration testing specifically for Ollama, in a way that does not require configuring and using every API (and the attendant cost). Since Ollama does not come with default models, I suggest a separate test suite that first ensures a tiny model is downloaded into the Ollama server via its API.
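For example, the setup step of such a suite could pull a small model through Ollama's `/api/pull` endpoint before the Ollama specs run. The model choice, base URL, and helper name below are assumptions, just to show the shape:

```ruby
require 'json'
require 'net/http'
require 'uri'

OLLAMA_BASE = ENV.fetch('OLLAMA_API_BASE', 'http://localhost:11434')
TEST_MODEL  = 'all-minilm' # tiny embedding model to keep CI downloads small

# Make sure the test model is present on the local Ollama server.
def ensure_test_model!
  uri = URI("#{OLLAMA_BASE}/api/pull")
  # stream: false makes Ollama reply once the pull has completed
  res = Net::HTTP.post(uri, { name: TEST_MODEL, stream: false }.to_json,
                       'Content-Type' => 'application/json')
  raise "could not pull #{TEST_MODEL}: #{res.body}" unless res.is_a?(Net::HTTPSuccess)
end
```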
unless data == '[DONE]'
  parsed_data = JSON.parse(data)
  block.call(parsed_data)
content_type = env.response_headers['content-type']
I'm not super sure about this; as far as I could see, Ollama uses newline-delimited JSON lines rather than standard server-sent event streams.
Of the API providers I only have a Gemini key, and it does work after this commit (both streaming and sync), so this might be correct.
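For reference, handling an NDJSON stream boils down to splitting the body on newlines and parsing each complete line. The buffering below is only a rough illustration, not the actual streaming code in this branch:

```ruby
require 'json'

# Yield one parsed JSON object per complete line; a partial trailing
# line stays in `buffer` until the next chunk arrives.
def each_ndjson_line(raw_chunk, buffer)
  buffer << raw_chunk
  *complete, rest = buffer.split("\n", -1)
  buffer.replace(rest || '')
  complete.reject(&:empty?).each { |line| yield JSON.parse(line) }
end

buffer = +''
each_ndjson_line(%({"done":false}\n{"done":true}\n), buffer) { |obj| p obj }
```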
Was just about to file an issue for this — awesome!
As mentioned in issue #2, Ollama support is valuable for users interested in self-hosted/offline inference.
This PR adds initial support for Ollama, including chat completions, streaming, and embeddings. There is no tool support yet; that needs further investigation.
The PR is a rough draft and will likely take some back and forth to get merge-ready; more comments to follow.
Closes #2
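Once this lands, usage could look roughly like the following. The `ollama_api_base` setting and the model name are assumptions based on this draft, not the final API:

```ruby
require 'ruby_llm'

RubyLLM.configure do |config|
  # point at a local Ollama server; no API key needed
  config.ollama_api_base = ENV.fetch('OLLAMA_API_BASE', 'http://localhost:11434')
end

RubyLLM.models.refresh!                 # discover locally installed models
chat = RubyLLM.chat(model: 'llama3.2')  # assumes this model has been pulled
puts chat.ask('Hello from a self-hosted model!').content
```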