As AI-powered features move from cool experiment to core production feature, "how much is this costing me?" becomes a daily question. Most providers offer their own dashboards, but jumping between three or four different portals just to check usage is a headache.
In my recent projects, I’ve moved away from external trackers and leaned back into the Rails Way by implementing a polymorphic Inference model.
Why Polymorphic?
The AI landscape is fragmented. Today you’re using OpenAI for text; tomorrow you’re using Gemini for multimodal tasks or a self-hosted Llama instance. By making the Inference model polymorphic, you can attach an “inference” to any record in your system—a User, a Conversation, or even a specific BackgroundJob.
The Setup
I treat an Inference as a ledger entry. Whenever a service is called, I wrap the response and write it to the database:
# The core model
class Inference < ApplicationRecord
  belongs_to :inferable, polymorphic: true

  # Columns track provider ("openai"), model ("gpt-4o"),
  # input_tokens, output_tokens, cost, and latency_ms
end
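To make the ledger write concrete, here is a minimal sketch of such a wrapper. `ChatService`, the `client` interface, and the `ledger` callable are hypothetical stand-ins, not from an actual codebase; in a real Rails app the ledger would be something like `conversation.inferences.create!` and the client your provider's SDK.

```ruby
# Hypothetical wrapper — ChatService, client, and ledger are
# illustrative names. The point is the shape: time the call,
# then persist one ledger row per inference.
class ChatService
  def initialize(client:, ledger:)
    @client = client # responds to #chat and reports token usage
    @ledger = ledger # persists one Inference row per call
  end

  def complete(prompt)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    response = @client.chat(prompt)
    latency_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round

    @ledger.call(
      provider: "openai",
      model: "gpt-4o",
      input_tokens: response[:input_tokens],
      output_tokens: response[:output_tokens],
      latency_ms: latency_ms
    )
    response
  end
end
```

Because the service returns the response untouched, callers don't have to know the ledger exists.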
By keeping this data in-house rather than siloed in a third-party tool, we get all the standard Rails benefits and ActiveRecord awesomeness: scopes, validations, and plain SQL when we need it.
Native Charting
Using a gem like groupdate, I can spin up a production admin dashboard in minutes to see daily token burns.
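With groupdate the daily burn is essentially a one-liner — something like `Inference.group_by_day(:created_at).sum(:output_tokens)`, assuming the column names above. To show what that chart is built from, here is the same aggregation in plain Ruby over in-memory rows (the sample data is illustrative):

```ruby
require "date"

# Illustrative rows; in the app these would be Inference records.
rows = [
  { created_at: Date.new(2024, 6, 1), output_tokens: 1200 },
  { created_at: Date.new(2024, 6, 1), output_tokens: 800 },
  { created_at: Date.new(2024, 6, 2), output_tokens: 3000 }
]

# Group by calendar day and sum tokens — the in-memory
# equivalent of group_by_day(:created_at).sum(:output_tokens)
daily_burn = rows.group_by { |r| r[:created_at] }
                 .transform_values { |rs| rs.sum { |r| r[:output_tokens] } }
```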
Business Context: I can easily run queries like User.find(1).inferences.sum(:cost) to see which specific customers are my “power users” (and perhaps my most expensive ones).
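Ranking those power users is one GROUP BY away — in ActiveRecord something like `Inference.where(inferable_type: "User").group(:inferable_id).sum(:cost)` (column names assumed). A plain-Ruby sketch of the ranking, with illustrative data:

```ruby
# Illustrative per-user cost rows; a "cost" column is assumed.
rows = [
  { user_id: 1, cost: 0.42 },
  { user_id: 2, cost: 3.10 },
  { user_id: 1, cost: 1.58 },
  { user_id: 3, cost: 0.05 }
]

# Total spend per user, most expensive first.
spend = rows.group_by { |r| r[:user_id] }
            .transform_values { |rs| rs.sum { |r| r[:cost] } }
            .sort_by { |_id, cost| -cost }
```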
Performance Monitoring: By tracking latency in the model, I can trigger alerts if a specific provider’s response times start to spike.
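A minimal sketch of such a check, assuming a `latency_ms` column; the threshold, window, and function name are illustrative. In a Rails app the rows would come from something like `Inference.where("created_at > ?", 1.hour.ago)`:

```ruby
# Hypothetical spike check: flag providers whose recent average
# latency exceeds a fixed threshold.
THRESHOLD_MS = 2_000

def slow_providers(rows, threshold_ms: THRESHOLD_MS)
  rows.group_by { |r| r[:provider] }
      .filter_map do |provider, rs|
        avg = rs.sum { |r| r[:latency_ms] }.fdiv(rs.size)
        provider if avg > threshold_ms
      end
end
```

A real alert would compare against each provider's own baseline rather than one global threshold, but the query shape is the same.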
Verdict
If you are building an AI-heavy Rails app, don’t overcomplicate your telemetry. Use the tools you already have. A simple polymorphic table gives you the visibility you need without the overhead of another SaaS subscription.