Add Ollama as a supported provider #10

Draft · ldmosquera wants to merge 9 commits into main
Conversation

@ldmosquera commented Mar 12, 2025

As mentioned in issue #2, Ollama support is valuable for users interested in self-hosted/offline inference.

This PR adds initial support for Ollama, including chat completions, streaming, and embeddings. No tool support yet; that needs further investigation.

This PR is a rough draft and will likely take some back and forth to get merge-ready; more comments to follow.

Closes #2
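
For context, a rough sketch of what using the provider could look like once merged; the `ollama_api_base` config key, the default port, and the model name are assumptions, not the final API:

```ruby
# Hypothetical usage sketch, not the final API surface.
require 'ruby_llm'

RubyLLM.configure do |config|
  config.ollama_api_base = 'http://localhost:11434' # assumed config key; Ollama's default port
end

RubyLLM.models.refresh!                 # populate models from the local Ollama server
chat = RubyLLM.chat(model: 'llama3.2')  # any model already pulled into Ollama

# Synchronous completion
puts chat.ask('Say hello').content

# Streaming completion
chat.ask('Tell me a short story') { |chunk| print chunk.content }
```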

This is so that calling `models.refresh!` won't attempt to hit every
provider API, which results in errors unless a valid key is given for every one of them.

If this is too disruptive to be the default behavior, then this could be a
"default to offline" mode that needs to be explicitly turned on, perhaps
with an environment variable; in its absence, behavior would be the same
as before.

This is all so that `models.refresh!` can be called freely to populate
Ollama models at runtime.
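
A minimal sketch of the idea, assuming each provider can report whether it was explicitly configured; the `configured?` predicate and surrounding names are placeholders rather than the actual code in the commit:

```ruby
# Placeholder sketch: skip providers that were not explicitly configured,
# so refresh! never hits an API it has no credentials for.
def refresh!(providers)
  providers.flat_map do |provider|
    next [] unless provider.configured? # e.g. an API key or api_base was set

    provider.list_models
  end
end
```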
@ldmosquera (Author) commented Mar 12, 2025

This was done by copying and adapting the existing Gemini provider. Tool usage and media were excised for now; I still need to research what Ollama offers and how it overlaps with this project. Also, capabilities.rb is full of placeholder values until we figure out sane defaults, since Ollama allows running arbitrary models rather than a well-known list.


Commit 9b387c1 disables all providers by default in `models.refresh!` unless they are explicitly configured.

The intention is to be able to work on Ollama (or any other single provider) without being forced to provide valid keys for EVERY provider, which would also incur live calls. This matters in particular because Ollama doesn't come with any default models, so `models.refresh!` is mandatory before any usage in order to populate the models available on the server.

As mentioned in the commit message, this might be too intrusive for the intended usage of this project, so an alternative is to do this only when a specific env var puts ruby-llm into such a "default offline" mode.

This might interfere with tests and/or the models update rake task, neither of which I've touched yet since they also require valid keys for all providers.


As for tests, I'd like some guidance on how to implement unit testing, and eventually integration testing, specifically for Ollama, in a way that does not require configuring and calling all the APIs (and incurring the attendant cost).

Since Ollama does not come with default models, I suggest a separate test suite that first ensures that a tiny model is downloaded into the Ollama server via its API.
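
For example, something along these lines in a `before(:suite)` hook could make the Ollama suite self-contained; the model choice is arbitrary and the host/port assume a default local install:

```ruby
require 'net/http'
require 'json'

RSpec.configure do |config|
  config.before(:suite) do
    # Pull a tiny model via Ollama's /api/pull endpoint so the specs
    # have something to chat with. The model name is just an example.
    response = Net::HTTP.post(
      URI('http://localhost:11434/api/pull'),
      { model: 'qwen2.5:0.5b', stream: false }.to_json,
      'Content-Type' => 'application/json'
    )
    raise "Ollama model pull failed: #{response.body}" unless response.is_a?(Net::HTTPSuccess)
  end
end
```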

    unless data == '[DONE]'
      parsed_data = JSON.parse(data)
      block.call(parsed_data)
    end

    content_type = env.response_headers['content-type']
@ldmosquera (Author) commented on this diff:

I'm not super sure about this; as far as I could see, Ollama uses newline-delimited JSON lines rather than standard event streams.

Of the API providers I only have a Gemini key, and it does work after this commit (both streaming and synchronous), so this might be correct.
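
To illustrate the difference, a minimal sketch of handling both shapes by branching on the content type; `env`, `chunk`, and `block` are assumed to be the same objects the existing streaming handler already receives:

```ruby
# Sketch only: Ollama streams application/x-ndjson (one JSON object per line),
# while most other providers stream SSE frames ("data: {...}" ending with [DONE]).
content_type = env.response_headers['content-type']

if content_type&.include?('application/x-ndjson')
  chunk.each_line do |line|
    line = line.strip
    next if line.empty?

    block.call(JSON.parse(line))
  end
else
  chunk.scan(/^data: (.*)$/).flatten.each do |data|
    block.call(JSON.parse(data)) unless data == '[DONE]'
  end
end
```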

@jm3 commented Mar 13, 2025

was just about to file an issue for this — awesome!

Successfully merging this pull request may close these issues:

Add Local Model Support via Ollama Integration