Track LLM Costs
Overview
Confident AI tracks the token usage and cost of your LLM calls, helping you identify high-cost models and heavy usage patterns across your application.
Cost tracking only applies to LLM spans. If you haven’t already, learn how to configure span types first.
How It Works
Confident AI resolves token usage and cost for each LLM span in the following order of precedence, and separately for both input and output tokens:
- Per-token costs and counts set in Evals API/DeepEval (via `observe` or `update_llm_span`/`updateLlmSpan`) take the highest priority and will always override any other source. Integrations may provide token counts, but cost calculation happens the same way.
- Custom set model costs — if you provide token counts but not per-token costs, Confident AI will use the pricing you’ve configured in your Model Costs settings to calculate the cost.
- Automatic inference — if neither per-token costs nor project-level costs are available, Confident AI tokenizes the span’s input/output text using a provider-specific tokenizer and internally looks up pricing based on the model.
Once token counts and per-token costs are resolved for each side, the total cost is computed as:
total cost = (input tokens × input cost per token) + (output tokens × output cost per token)
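For example, a span with 1,200 input tokens at an illustrative $2.50 per million and 400 output tokens at $10.00 per million logs 1,200 × $0.0000025 + 400 × $0.00001 = $0.003 + $0.004 = $0.007.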
Input and output costs are computed independently — cost will only be logged for a side (input or output) if its values are non-null (they do not default to 0).
Automatic inference is only available for OpenAI, Anthropic, and Gemini models. For all other providers, supply token counts and costs manually or configure Model Costs in your project settings.
Track Token Usage Count
You can manually set the input and output token counts on an LLM span using `update_llm_span`/`updateLlmSpan`. This is useful when your provider returns token usage in the response and you want to log it precisely.
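Here is a minimal Python sketch that passes provider-reported usage straight through; the import path and keyword names (`input_token_count`, `output_token_count`) are assumptions, so verify them against the `update_llm_span` API reference.

```python
from deepeval.tracing import observe, update_llm_span  # assumed import path
from openai import OpenAI

client = OpenAI()

@observe(type="llm", model="gpt-4o")
def generate(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    # Log the exact token usage reported by the provider; the keyword
    # names here are illustrative, not confirmed against the SDK.
    update_llm_span(
        input_token_count=response.usage.prompt_tokens,
        output_token_count=response.usage.completion_tokens,
    )
    return response.choices[0].message.content
```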
If you don’t provide token counts and aren’t using an integration, Confident AI will attempt to infer them by tokenizing the span’s input and output text using the appropriate provider tokenizer (available for OpenAI, Anthropic, and Gemini models).
See the OpenAI documentation, Anthropic documentation, or Google documentation for the most up-to-date pricing.
Note that input and output are calculated separately; you don’t have to provide both for the cost of either side to be logged.
Track Token Usage Cost
Once token counts are available (either set manually, captured by an integration, or inferred automatically), Confident AI resolves the per-token cost using the following precedence:
- Per-token costs set in DeepEval/Evals API — if you provide cost per input/output tokens directly in code, these always take priority.
- Custom set model costs — if per-token costs aren’t set in code, Confident AI uses the pricing configured in your Model Costs settings.
- Automatic price lookup — if no project-level costs are configured, Confident AI looks up the per-token pricing internally based on the model. This is only available for OpenAI, Anthropic, and Gemini models.
If none of the above resolve a per-token cost, the cost for that side (input or output) is not logged.
Explicit cost setting
Set the per-token costs explicitly in the `observe` decorator/wrapper alongside your token counts. This covers provider models not supported by automatic price lookup.
Explicit cost setting is best for teams that want programmatic control over cost. For teams wanting to set model costs on the platform directly, see custom price lookup.
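A minimal sketch, assuming `cost_per_input_token`/`cost_per_output_token` are accepted by the `observe` decorator (verify the parameter names against the API reference); `call_my_model` is a hypothetical provider call:

```python
from deepeval.tracing import observe, update_llm_span  # assumed import path

@observe(
    type="llm",
    model="my-finetuned-model",      # a model automatic price lookup doesn't cover
    cost_per_input_token=0.000002,   # assumed kwarg: $2.00 per 1M input tokens
    cost_per_output_token=0.000008,  # assumed kwarg: $8.00 per 1M output tokens
)
def generate(prompt: str) -> str:
    result = call_my_model(prompt)  # hypothetical call returning text + usage
    update_llm_span(
        input_token_count=result.usage.input_tokens,    # assumed kwargs
        output_token_count=result.usage.output_tokens,
    )
    return result.text
```

Because per-token costs set in code sit at the top of the precedence order, these values override any Model Costs rule or automatic lookup for this span.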
Custom price lookup
If you provide token counts but don’t set per-token costs in code, Confident AI will use the pricing you’ve configured in your project’s Model Costs settings. This is useful when you want to manage pricing centrally without changing any code.
Model costs are matched against the model name on your LLM span using wildcard patterns. For example:
- `gpt-4o` — matches only `gpt-4o`
- `gpt-4*` — matches `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, etc.
- `claude-*` — matches all Claude model variants
You can optionally restrict a cost rule to a specific provider, and set input and output costs independently per million tokens. See the full Model Costs settings page for setup instructions.
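The patterns behave like shell-style globbing; a quick way to sanity-check a pattern locally is Python's `fnmatch` (an analogy only, since Confident AI's matcher runs on the platform):

```python
from fnmatch import fnmatch

# Shell-style wildcard matching, analogous to how Model Costs
# patterns are matched against LLM span model names.
for model in ["gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "claude-3-5-sonnet"]:
    print(model, fnmatch(model, "gpt-4*"))
# gpt-4o True, gpt-4o-mini True, gpt-4-turbo True, claude-3-5-sonnet False
```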
Automatic price lookup
If you provide a supported model on your LLM span and neither SDK-level nor project-level costs are configured, Confident AI will automatically look up the per-token pricing and calculate the cost — no additional code needed.
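For example, a sketch where only a supported model name is set on the span (decorator parameters as assumed above):

```python
from deepeval.tracing import observe  # assumed import path
from openai import OpenAI

client = OpenAI()

# No token counts or per-token costs are set anywhere: Confident AI
# tokenizes the input/output and looks up gpt-4o pricing internally.
@observe(type="llm", model="gpt-4o")
def generate(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```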
Automatic price lookup is only available for OpenAI, Anthropic, and Gemini models. For all other providers, set per-token costs manually or configure Model Costs in your project settings.
Cost on Traces
Cost on a trace is automatically set by summing the cost of all LLM spans in that trace. As with LLM spans, trace cost defaults to null if none of its LLM spans have non-null values.
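For example, a trace containing two LLM spans that cost $0.003 and $0.007 logs a trace cost of $0.010; if none of its LLM spans resolve a cost, the trace cost remains null.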
Next Steps
With cost tracking configured, continue setting up the rest of your instrumentation.