Custom URL support #9
Hey @zachlatta - thanks for the kind words! RubyLLM is designed around the idea that provider-specific details should be handled inside the library, not exposed as configuration. This gives us cleaner APIs and better handling of models than the "just swap the URL" approach. We have an open issue for Ollama (#2), but we're implementing it as a proper provider rather than through an OpenAI compatibility layer, which tends to be buggy and incomplete in my experience. The same goes for other providers with OpenAI-compatible APIs - we'll add them as first-class providers in RubyLLM (like we did with DeepSeek) rather than making you fiddle with base URLs. This approach lets us properly handle model routing, capabilities, and pricing info. If you're looking to use a specific provider, let us know which one and we can prioritize it. If it's about running models in your own infrastructure, that's valuable feedback for us to consider how to best support that use case. |
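(Editor's note: to make the model-first approach concrete, here is a minimal sketch of what provider switching looks like when the library owns the endpoint details; the configuration keys and model names are illustrative and may not match RubyLLM's exact API.)

require "ruby_llm"

# Keys are configured once; endpoints, model routing, and pricing live inside the library.
RubyLLM.configure do |config|
  config.openai_api_key   = ENV["OPENAI_API_KEY"]
  config.deepseek_api_key = ENV["DEEPSEEK_API_KEY"]
end

# Switching providers is just a different model name - no base URL to swap.
RubyLLM.chat(model: "gpt-4o-mini").ask("Hello!")
RubyLLM.chat(model: "deepseek-chat").ask("Hello!")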
The two others that immediately come to mind for me are LM Studio and Groq!
Quoting the earlier comment from Carmine Paolino (@crmne) that this replies to:
Hi @zachlatta and thank you!
What other providers apart from Ollama would you like to support by changing the base URL?
We already have an issue to support Ollama (#2) and we're not gonna do it through the OpenAI compatibility layer as it doesn't support everything, and I've had a myriad of issues with other OpenAI compatibility layers in the past. Looking at you, Gemini.
|
This makes sense, but I'm not sure how you get around the need to configure the URL for providers like Azure OpenAI Service, where your deployed models are at a unique endpoint for your resource. We use Azure OpenAI at work for its unified billing and access control, and I would like to be able to use this library with our existing deployments. |
Hopping in here with another use case. Fastly's LLM semantic caching product, AI Accelerator, operates as a URL/API key swap within code to initiate middleware caching with Python and Node.js libraries. Disclaimer: I work for Fastly. I'd love to feature a Ruby example within the product. If there's an easy drop-in integration option, we'll find a way to feature it. |
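(Editor's note: to illustrate the kind of drop-in swap being described, here is a minimal sketch of an OpenAI-style chat request pointed at a hypothetical caching proxy; the base URL and key are placeholders, not Fastly's actual endpoints.)

require "http"
require "json"

# Hypothetical values - the real proxy base URL and key come from the caching product.
proxy_base_url = "https://example-ai-accelerator.example.com/v1"
proxy_api_key  = ENV["PROXY_API_KEY"]

# Same request shape as the OpenAI API; only the base URL and key change.
response = HTTP
  .headers("Authorization" => "Bearer #{proxy_api_key}",
           "Content-Type"  => "application/json")
  .post("#{proxy_base_url}/chat/completions",
        body: {
          "model"    => "gpt-4o-mini",
          "messages" => [{ "role" => "user", "content" => "Hello!" }]
        }.to_json)

puts JSON.pretty_generate(JSON.parse(response.body.to_s))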
I still believe in our model-first approach rather than taking the "swap the URL in the config" shortcut. Each provider deserves proper first-class implementation with correct model routing, capabilities, and pricing info. That said, Azure OpenAI presents a legitimate use case where URL configuration is inherent to the provider. I'll need to think about the right design to accommodate this without compromising our principles - maybe extending our OpenAI provider to handle Azure's deployment-based approach. Are pricing and capabilities the same as the original OpenAI models? |
There are probably enough differences in constructing the request that it would make sense for Azure OpenAI to be its own provider. However, the capabilities of Azure's hosted OpenAI APIs are the same as OpenAI's, and their pricing is identical as far as I'm aware. For example, o3-mini costs $1.10 per million tokens on both platforms, and the responses have the same structure. Some differences:
- Models can be deployed under custom names rather than their official model names. For instance, you could deploy the o3-mini model as "my-o3-mini," or whatever you want, and that deployment name is what you use to access it instead of the official name.
- The API key is sent as an "api-key" header rather than an "Authorization: Bearer" token.
- Additionally, Azure requires an explicit API version. My understanding is this helps Azure stay in sync with OpenAI's API in terms of features and capabilities, so if new features are added to OpenAI's API, you would need to update the API version to gain access to them.
An example Azure OpenAI request (using the http gem):

require "http"
require "json"

azure_api_key = "asdf-1234-asdf-1234"
azure_endpoint = "https://resource-name.openai.azure.com"
deployment_name = "my-deployment-name" # Instead of the official model name
api_version = "2024-10-21" # This is the latest GA version

# Instead of "https://api.openai.com/v1/chat/completions"
url = "#{azure_endpoint}/openai/deployments/#{deployment_name}/chat/completions?api-version=#{api_version}"

# Instead of { "Authorization" => "Bearer #{ENV['OPENAI_API_KEY']}" }
headers = {
  "api-key" => azure_api_key,
  "Content-Type" => "application/json"
}

# Note: since the model is identified by the deployment in the URL, it doesn't go in the body payload.
body = {
  "messages" => [{ "role" => "user", "content" => "Hello, how are you?" }]
}.to_json

response = HTTP.headers(headers).post(url, body: body)
puts JSON.pretty_generate(JSON.parse(response.body.to_s))
|
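(Editor's note: for comparison, here is a sketch of the same request against the standard OpenAI endpoint, which is what the "Instead of" comments above refer to; only the URL, the auth header, and the presence of the model field in the body differ.)

require "http"
require "json"

# Standard OpenAI endpoint: the model name goes in the body, and the key goes in an Authorization header.
url = "https://api.openai.com/v1/chat/completions"
headers = {
  "Authorization" => "Bearer #{ENV['OPENAI_API_KEY']}",
  "Content-Type" => "application/json"
}
body = {
  "model" => "o3-mini",
  "messages" => [{ "role" => "user", "content" => "Hello, how are you?" }]
}.to_json

response = HTTP.headers(headers).post(url, body: body)
puts JSON.pretty_generate(JSON.parse(response.body.to_s))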
First - wow! Thanks for making this!
Would love the ability to override the API base URL for any given model. I didn't see anything in the docs about this.
This would enable support for Llama and proxies.
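(Editor's note: as a purely hypothetical illustration of the request, not an option RubyLLM documented at the time of this thread, such an override might look something like this; the "openai_api_base" key name is invented for the example.)

require "ruby_llm"

# Hypothetical configuration - the "openai_api_base" key is illustrative only.
RubyLLM.configure do |config|
  config.openai_api_key  = ENV["OPENAI_API_KEY"]
  config.openai_api_base = "http://localhost:11434/v1" # e.g. a local OpenAI-compatible server or proxy
end

# Requests for OpenAI-routed models would then go to the overridden base URL.
RubyLLM.chat(model: "gpt-4o-mini").ask("Hello!")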