It tries to strike a balance between working out of the box and being flexible... which has its challenges, but it's still nice overall.
One big real-life pain I experienced is that caches don't always work, e.g. for xAI, since it only supports the Completions API and thought signatures are returned incorrectly.
The Responses API is now implemented and is coming in RubyLLM 2.0.
https://github.com/crmne/ruby_llm/blob/main/lib/ruby_llm/pro...
I have been using RubyLLM for quite some time and I am in love with the API design. If someone wants to see how this looks in a real project, including custom tools, you can have a look at the SerpTrail project on GitHub.
The Chat model is still just:
class Chat < ApplicationRecord
  acts_as_chat model: :llm_model
end
More: https://github.com/serpapi/serptrail
I am quite excited for RubyLLM 2.0 and beyond.
Agreed with another commenter on the frustration with the Responses API not being natively supported; that seems like a huge miss. There is a connector from another dev, but it's buggy and not as high quality as the main gem.
Really looking forward to future development and especially 2.0!
Edit: Just saw that responses API is now native? I will definitely check that out.
Since a few mentioned Responses API: the reason why it wasn't implemented in 1.x is because RubyLLM 1.x effectively assumes a 1:1 mapping between provider and protocol. That assumption no longer holds since OpenAI has 2 protocols with different capabilities, and to access all VertexAI models we need to support a bunch under that single provider.
Therefore, a major refactoring to split Protocols from Providers was needed, as well as a way to route different models to different Protocols under the same Provider, transparently.
That's one of the many things that's gonna ship with RubyLLM 2.0.
If you're curious: https://github.com/crmne/ruby_llm/commit/d398354da493570b050... https://github.com/crmne/ruby_llm/commit/0875ce2dfeae9d28a3a...
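For readers who want a picture of what "route different models to different Protocols under the same Provider" could mean, here is a minimal sketch. It is not RubyLLM's code: `Protocol`, `Provider`, `route`, `protocol_for`, and the model ids are all hypothetical names made up for illustration. Only the two endpoint paths are OpenAI's real ones.

```ruby
# Hypothetical sketch, NOT RubyLLM's real classes: a provider that owns
# several protocols and routes each model id to one of them.
class Protocol
  attr_reader :name, :endpoint

  def initialize(name, endpoint)
    @name = name
    @endpoint = endpoint
  end
end

class Provider
  def initialize(name, default:)
    @name = name
    @default = default
    @routes = [] # ordered list of [pattern, protocol] pairs
  end

  # Register a rule: model ids matching +pattern+ use +protocol+.
  def route(pattern, protocol)
    @routes << [pattern, protocol]
    self
  end

  # First matching rule wins; otherwise fall back to the default protocol.
  def protocol_for(model_id)
    match = @routes.find { |pattern, _protocol| pattern.match?(model_id) }
    match ? match.last : @default
  end
end

CHAT_COMPLETIONS = Protocol.new(:chat_completions, "/v1/chat/completions")
RESPONSES        = Protocol.new(:responses, "/v1/responses")

# The routing rule and model ids below are invented for the example.
OPENAI = Provider.new(:openai, default: CHAT_COMPLETIONS)
                 .route(/\Areasoning-/, RESPONSES)

OPENAI.protocol_for("reasoning-large").name # => :responses
OPENAI.protocol_for("classic-small").name   # => :chat_completions
```

The caller only ever names a model; which wire protocol gets used is a lookup inside the provider, which is the "transparently" part.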
Rails-style instrumentation landed in 1.16.0.
I tried submitting some PRs, but got a chilly reception. It was taking so long to make any forward progress on the parts I needed that I gave up, and wrote my own layer to do the parts of this that mattered to me. It didn't take long, and ultimately, I've customized it so much over time that I'm glad I didn't make this a dependency.
While the parts of this gem abstracting the various LLMs are nice and well designed, I think this kind of thing is a liability for anything but the most trivial applications. LLMs are moving too quickly to have the core connection infrastructure be gated on the release cycle of a third-party library. You can see this in the various comments down-thread where people are talking about the library lacking the Responses API -- it's great that the library is about to fix that problem, but if you had just written your own adapter, you'd have been done months ago.
One of the biggest implications of LLMs in software, in my opinion, is that entire classes of third-party dependencies can be eliminated. It's interesting to look at which ones those are (and which ones continue to get interest) because it tells you a bit about where the post-LLM value of software will reside.
The repeated `chat.to_llm` message bug was reported Apr 30, 2025, and fixed May 6, 2025, about 27 minutes after your comment.
It only showed up when reusing the same Rails chat object for multiple turns in the same Ruby object lifetime, e.g. `chat.ask("first"); chat.ask("second")` inside one controller action or one background job.
The usual flow is one turn per request/job, where the record is reloaded each time. Also, it did not overwrite records; it duplicated messages in the in-memory request context.
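To make that class of bug concrete, here is a toy model. It is an assumption about the general shape of such a bug, not the actual RubyLLM code or fix: `ToyChat`, `sync_buggy`, and `sync_fixed` are hypothetical names. The persisted list stands in for the database rows and `context` for the in-memory chat.

```ruby
# Toy model only; names are hypothetical and this is not RubyLLM's code.
class ToyChat
  attr_reader :persisted, :context

  def initialize
    @persisted = [] # stands in for rows in the messages table
    @context = []   # stands in for the in-memory chat sent to the LLM
  end

  # Buggy sync: appends every persisted message on each call, so a
  # second turn on the same object re-adds the first turn's messages.
  def sync_buggy
    @persisted.each { |message| @context << message }
  end

  # Fixed sync: rebuilds the in-memory context before appending.
  def sync_fixed
    @context.clear
    @persisted.each { |message| @context << message }
  end

  # Saves the message, syncs with the chosen strategy, and returns
  # the size of the in-memory context.
  def ask(text, sync:)
    @persisted << text
    public_send(sync)
    @context.size
  end
end

buggy = ToyChat.new
buggy.ask("first", sync: :sync_buggy)  # => 1
buggy.ask("second", sync: :sync_buggy) # => 3, "first" is now in context twice
buggy.persisted.size                   # => 2, the stored records are untouched

fixed = ToyChat.new
fixed.ask("first", sync: :sync_fixed)  # => 1
fixed.ask("second", sync: :sync_fixed) # => 2
```

With one turn per fresh object, the buggy and fixed versions behave the same, which is why the problem only appears on multi-turn reuse.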
Gemini tool calling shipped in 1.0, schema support landed in 1.4, and observability landed in 1.16.
As for "the most trivial of applications": check the docs. RubyLLM goes well beyond that, and several multi-million-dollar companies use it in production every day.
> It only showed up when reusing the same Rails chat object for multiple turns in the same Ruby object lifetime, e.g. `chat.ask("first"); chat.ask("second")` inside one controller action or one background job.
That happens all the time. Without it, you can't pass things around to functions that add to the chat, for example.
> The usual flow is one turn per request/job, where the record is reloaded each time.
This may be your usual flow, but it doesn't have to be everybody's usual flow. No offense, but you're currently reminding me of what I interpreted as "chilly reactions" at the time.
> Gemini tool calling shipped in 1.0, schema support landed in 1.4, and observability landed in 1.16.
Great! I did them myself in less time. It's entirely possible that your library, today, is the tool I needed then.
> As for "the most trivial of applications": check the docs. RubyLLM goes well beyond that, and several multi-million-dollar companies use it in production every day.
OK. Cool. If they want to use the software, I'm sure nothing I say will convince them otherwise.
> Also, the same PR was reverted a bit later, IIRC
No. Your #151 was merged. The regressions it introduced were patched the next day in #157/#159.
Cheers!
Erm, OK. I don't think that's even close to what I did, but I guess people can read for themselves.
Again, this was not the only interaction I had with you folks, and even if you accepted my patch instantly and didn't roll it back or anything else at all (Also, once again: completely fine; I understand why you did it and don't fault you for doing it), the fact remains that there were lots of missing features that I needed, and working with you was slower than doing it myself.
Maybe that has changed, but I have to be honest: being defensive and prickly is not making me want to do it. You're actually pretty much illustrating the primary reason why I stopped.
Now you keep saying you had other bad interactions, but you haven't even said what they were. I have a feeling that if you named them, shortly afterwards I'd read a comment about how it wasn't true. Again.
If the library didn't match your needs, that's fine. Who cares? That's happened to all of us many times, right?
I said they were slower than me. I didn't "call anyone out" for un-merging my PR -- I explicitly did the opposite. I couldn't care less.
More than anything, I said I recalled that the reception was chilly. This thread has reminded me of why I had that opinion.
> If the library didn't match your needs, that's fine. Who cares? That's happened to all of us many times, right?
Sure. And we're allowed to talk about it.
2 days is a great turnaround time for PRs, isn't it?
It's the wrong question; the first person was sharing their experience, and the reply jumped to collecting evidence that only vindicated the characterisation of a 'chilly response'. It's a human social skills thing. This place is not GitHub Issues. People who actually care will see. People who 'clarify' it here were never maintainers.
I found one: #813, opened June 16, 2026. Last week.
Well, do real people want to "interact" with bots, including AI? I noticed this problem recently on prawn. Someone wanted to merge about 1000 lines of code, 95% of which was auto-written by some AI model. Then he complained that the solo maintainer did not want to review those ~1000 lines of code. What can I say ... I understand the issues raised by real humans more than those raised by "vibe" coders here.
I also liked how they run the issue tracker. If you select "Feature Request", it makes you explain how you explored workarounds, why you believe it belongs in RubyLLM etc to prevent scope creep.
To put it differently, is this more like choosing between Fog and aws-sdk-s3, or choosing between Active Storage and aws-sdk-s3?
Some of the things I like about RubyLLM:
1. The DSL - You can chain methods like ActiveRecord.
2. The Structure - It gives a way to organize agents, tools & prompts.
3. The Portability - The costs of AI usage will one day be an issue for any successful product. Being able to easily test and move from Anthropic to DeepSeek cut my bill down by over 90%. Knowing how easy RubyLLM makes it, ignoring this eventuality feels reckless to me.
4. ActiveRecord Integration - With a simple `bin/rails generate ruby_llm:install`, you can save each chat to your database.
5. Agent Training - This is a side benefit of the above, but has been a huge unlock for me. Since I have all my chats saved, I will regularly pull down that history and give it to Claude Code to refine my agent instructions.
You can always build multiple clients however. It could be one thorough integration with one API, and a simpler backup via RubyLLM.
Unless of course RubyLLM supports all of the above, but even then, are they going to be able to keep up as much as a native client in the long term?
The answer is almost always that a native client for a more critical integration is the right call. You can always add another if you need it.
I love how MINASWAN Hacker News is when talking about Ruby!
People point at pi-agent (TypeScript) as a good one, but I find the Ruby version far more elegant, far more concise, and generally easier to read.
I think it could be simplified further at the cost of making the API a bit uglier — and perhaps an educational fork should be — but it's still really good.
I'm not sure where you got that.
`chat.with_temperature(0.2)`
https://rubyllm.com/chat/#controlling-response-behavior
`chat.with_thinking(effort: :high, budget: 8000)`
https://rubyllm.com/thinking/#controlling-extended-thinking
Max tokens is the only one on your list that requires provider-specific params:
https://rubyllm.com/chat/#provider-specific-parameters
I'm one guy doing it for free. Happy to see your contribution!
I will have a deep dive into which things I felt we needed to adapt per provider.
I didn't mean to imply that you have to solve all of our wants of course.
One thing we did do was monkey-patch the spot where tool_calls are performed by RubyLLM. We had our own mechanism for that and were able to skip RubyLLM's and still extract the tool calls and run them through our own tool harness. That all worked beautifully. I don't know if that type of stuff is something you want PRs on or that you want to keep steering towards the route that does everything within RubyLLM classes. Happy to contribute some of that.
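For anyone wondering what such a patch can look like in plain Ruby, here is a minimal sketch using `Module#prepend`. `LibraryChat`, `execute_tool`, and `CustomHarness` are hypothetical stand-ins, not RubyLLM's classes or method names; the real hook point in the gem may look quite different.

```ruby
# Hypothetical stand-in for a library class that runs tool calls itself.
class LibraryChat
  def execute_tool(tool_call)
    "library ran #{tool_call[:name]}"
  end

  # Runs every requested tool call through execute_tool.
  def handle(tool_calls)
    tool_calls.map { |call| execute_tool(call) }
  end
end

# Our own tool harness, also hypothetical.
module CustomHarness
  def self.run(tool_call)
    "harness ran #{tool_call[:name]}"
  end
end

# Prepending places this module ahead of the class in method lookup,
# so execute_tool is intercepted without editing the library source.
module ExternalToolExecution
  def execute_tool(tool_call)
    CustomHarness.run(tool_call)
  end
end

LibraryChat.prepend(ExternalToolExecution)

results = LibraryChat.new.handle([{ name: "search" }])
# results => ["harness ran search"]
```

Calling `super` inside the prepended method would fall through to the library's own implementation, which is handy when only some tools should go through the external harness.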
There is some difference in how OpenAI and Anthropic handle 'max_tokens'. The OpenAI way raises errors in Anthropic for example.
I will look at confirming this in a bit more depth once we've put our RubyLLM-based adapter into production, and see if I can make some contribution.
Thanks for all the work! It is incredibly impressive and I never meant any disrespect.
You work your arse off for free and the guy who made the disparaging comment didn't even bother to research to see if he had the details right.
Hats off to you, Carmine, for all your work. Many people really do appreciate it.
I think it is fair to list limitations from using a library that provides an abstraction; it can suggest why a tool isn't right for a person's use cases.
But it also sounds like this API handles those pretty well.
RubyLLM dev literally had to take time to provide code samples and doc links.
No issue with listing legit limitations, but be a bro and fact check claims before wasting a volunteer’s time - and potentially leading other developers on a public board astray.
I think part of the confusion with that word comes from things like corporate non-disparagement clauses. In those contracts, lawyers write the terms so broad that "disparagement" means saying anything negative, regardless of malice or intent.
I checked and it turns out I remembered correctly that setting effort and some of its settings are not portable between providers.
There are some different settings that each provider uses and in order for it to be portable, you have to force some defaults on provider A when using a setting that is almost only supported in provider B.
In our implementation we decided to drop a certain setting when using OpenAI in one case, and we decided we can just force some other setting when using Anthropic. But this 'solution' might not be what others expect.
When you build an open-source library you can go this opinionated route and force these settings, or you might go the config route and force people to explicitly handle per-provider differences. I will have a look at what I am able to do in terms of a contribution and then in the PR Carmine can decide what they like.
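A rough sketch of the opinionated route, to show the trade-off. Everything here is an assumption for illustration: `provider_params` is a hypothetical helper, not part of RubyLLM, and the provider-side parameter names (`max_completion_tokens`, `reasoning_effort`, `thinking`/`budget_tokens`) are my understanding of the respective APIs and should be checked against the current provider docs. Which setting gets dropped or forced is also just an example choice, not what our adapter actually does.

```ruby
# Hypothetical helper: maps portable settings to per-provider params.
# Parameter names are assumptions; verify them against provider docs.
DEFAULT_MAX_TOKENS = 1024

def provider_params(provider, max_tokens: nil, effort: nil, budget: nil)
  case provider
  when :openai
    params = {}
    params[:max_completion_tokens] = max_tokens if max_tokens
    params[:reasoning_effort] = effort.to_s if effort
    # Opinionated choice: a thinking budget has no counterpart in this
    # sketch for OpenAI, so it is silently dropped.
    params
  when :anthropic
    # Opinionated choice: force a max_tokens default, and leave room
    # above the thinking budget when one is given.
    params = { max_tokens: max_tokens || DEFAULT_MAX_TOKENS + budget.to_i }
    params[:thinking] = { type: "enabled", budget_tokens: budget } if budget
    # effort is dropped here in this sketch.
    params
  else
    raise ArgumentError, "unknown provider: #{provider}"
  end
end

openai = provider_params(:openai, max_tokens: 500, effort: :high, budget: 8000)
anthropic = provider_params(:anthropic, effort: :high, budget: 8000)
```

The silent drops are exactly the part that might surprise people; the config route would raise or warn instead of dropping.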
> Error: You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/rate-limit. * Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 20, model: gemini-2.5-flash Please retry in 41.543129369s.
You can build high quality software with dynamically typed languages, and Ruby is an absolute dream to read and write.
I was on team dynamic typing for about 12 years, and Ruby was a big part of that. I still think dynamic languages can be wonderful to read and write.
But after using modern statically typed languages with good inference, I changed my mind. Many of my old objections were really objections to verbose type systems, not static typing itself. With inference, you can keep a lot of the readability while gaining safer refactoring, better tooling, and earlier feedback.
That doesn’t mean dynamic languages can’t produce high-quality software. They obviously can. But I don’t think appreciating modern static typing is just evangelism.
And yes, I understand what this library is about: it's a "beautiful", easy-to-use interface to AI providers for Ruby apps. It's the popular play nowadays, with litellm, bifrost, gomodel and vercel gateway. We see at least a couple of AI gateways or libraries like that every week on HN.
> But I don’t think appreciating modern static typing is just evangelism.
But it is evangelism, because the type-addicted devs want to slap types onto Ruby. That is evangelism. They try to change an existing language to fit their world view and narrative. And you cannot talk to them because their brain is deadlocked here. This is why the person you replied to is correct here.
> I have been using ruby since 2005 or so. Static types were never relevant or needed in the slightest here.
We are like Leibniz's monads, with disjoint views of the same universe due to our different experiences. I worked for several years at Shopify and there were droves of developers begging for types. Shopify itself recognized this need. We contributed early to Sorbet and began implementing it in the main monolith prior to it being publicly announced. I personally was the incident response on-call for a million-dollar bug that static typing would have prevented from ever being merged. And so on.
Why did people switch to these languages in the first place and what's driving the current back-to-typed-languages trend?