Confident AI tracks the token usage and cost of your LLM calls, helping you identify high-cost models and heavy usage patterns across your application.
Cost tracking only applies to LLM spans. If you haven’t already, learn how to configure span types first.
Confident AI resolves token usage and cost for each LLM span in the following order of precedence, and separately for both input and output tokens:
Values set explicitly in code (via observe or update_llm_span / updateLlmSpan) take the highest priority and will always override any other source.
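Automatic inference from the span’s model has the lowest priority.

Once token counts and per-token costs are resolved for each side, the total cost is computed as:

total cost = (input tokens × cost per input token) + (output tokens × cost per output token)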
Input and output costs are computed independently: a cost is logged for a side (input or output) only if its values are non-null (they do not default to 0).
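The per-side rule above can be sketched in a few lines of Python. This is a minimal illustration, not Confident AI’s implementation: the function names `side_cost` and `total_cost` are hypothetical, and summing only the logged sides into a total is an assumption about how a partially known cost would be combined.

```python
from typing import Optional


def side_cost(tokens: Optional[int], cost_per_token: Optional[float]) -> Optional[float]:
    """Return the cost for one side, or None when either value is missing."""
    if tokens is None or cost_per_token is None:
        return None  # missing values stay null; they are not treated as 0
    return tokens * cost_per_token


def total_cost(input_cost: Optional[float], output_cost: Optional[float]) -> Optional[float]:
    """Assumption: the total sums whichever sides were logged."""
    logged = [c for c in (input_cost, output_cost) if c is not None]
    return sum(logged) if logged else None


input_cost = side_cost(1_000, 0.000002)   # both values known
output_cost = side_cost(None, 0.000008)   # token count missing, so no output cost
print(input_cost, output_cost, total_cost(input_cost, output_cost))
```

Note that a token count of 0 is a real value and yields a cost of 0, whereas a missing count yields no cost at all.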
Automatic inference is only available for OpenAI, Anthropic, and Gemini models. For all other providers, supply token counts and costs manually or configure Model Costs in your project settings.
You can manually set the input and output token counts on an LLM span using update_llm_span / updateLlmSpan. This is useful when your provider returns token usage in the response and you want to log it precisely.
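As a rough sketch of where those counts might come from, the snippet below pulls token usage out of an OpenAI-style chat completion response that has been converted to a plain dictionary. The helper name `extract_token_counts` is hypothetical; the `usage.prompt_tokens` and `usage.completion_tokens` fields follow OpenAI’s response format, and other providers name these fields differently. The extracted values are what you would then pass to update_llm_span / updateLlmSpan.

```python
from typing import Optional, Tuple


def extract_token_counts(response: dict) -> Tuple[Optional[int], Optional[int]]:
    """Return (input_tokens, output_tokens) from an OpenAI-style response dict.

    Either value is None when the provider did not report it.
    """
    usage = response.get("usage") or {}
    return usage.get("prompt_tokens"), usage.get("completion_tokens")


# Example payload shaped like an OpenAI chat completion (values are illustrative).
response = {"usage": {"prompt_tokens": 12, "completion_tokens": 48, "total_tokens": 60}}
input_tokens, output_tokens = extract_token_counts(response)
print(input_tokens, output_tokens)
```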
If you don’t provide token counts and aren’t using an integration, Confident AI will attempt to infer them by tokenizing the span’s input and output text using the appropriate provider tokenizer. The table below summarizes each supported provider and its tokenization method.
See the OpenAI documentation, Anthropic documentation, or Google documentation for the most up-to-date pricing.
Note that input and output costs are calculated separately; you don’t have to provide both to set the cost for either.
Once token counts are available (either set manually, captured by an integration, or inferred automatically), Confident AI resolves the per-token cost using the following precedence:
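1. Per-token costs set explicitly in code.
2. Pricing configured in your project’s Model Costs settings.
3. Automatic price lookup based on the span’s model. This is only available for OpenAI, Anthropic, and Gemini models.

If none of the above resolves a per-token cost, the cost for that side (input or output) is not logged.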
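The precedence described here amounts to taking the first source that yields a value. The sketch below illustrates that idea for one side (input or output); the function and parameter names are hypothetical and do not correspond to any SDK call.

```python
from typing import Optional


def resolve_cost_per_token(
    explicit: Optional[float],   # set in code
    project: Optional[float],    # from the project's Model Costs settings
    automatic: Optional[float],  # from automatic price lookup
) -> Optional[float]:
    """Return the first non-null per-token cost in precedence order."""
    for candidate in (explicit, project, automatic):
        if candidate is not None:
            return candidate
    return None  # nothing resolved, so no cost is logged for this side


# The project-level price wins here because no explicit price was set in code.
print(resolve_cost_per_token(None, 0.000002, 0.000001))
```

The check is against None rather than truthiness, so an explicit cost of 0 still overrides the lower-priority sources.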
Set the per-token costs explicitly in the observe decorator/wrapper alongside your token counts. This is especially useful for provider models not supported by automatic price lookup.
Explicit cost setting is best for teams that want programmatic control over cost. For teams that want to set model costs directly on the platform, see custom price lookup.
If you provide token counts but don’t set per-token costs in code, Confident AI will use the pricing you’ve configured in your project’s Model Costs settings. This is useful when you want to manage pricing centrally without changing any code.
Model costs are matched against the model name on your LLM span using wildcard patterns. For example:
- gpt-4o: matches only gpt-4o
- gpt-4*: matches gpt-4o, gpt-4o-mini, gpt-4-turbo, etc.
- claude-*: matches all Claude model variants

You can optionally restrict a cost rule to a specific provider, and set input and output costs independently per million tokens. See the full Model Costs settings page for setup instructions.
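To get a feel for how such patterns behave, the sketch below approximates the matching with Python’s standard `fnmatch` module and converts per-million prices to per-token prices. It rests on several assumptions: the platform’s matcher may differ from `fnmatch`, the first-match-wins ordering is a guess at how overlapping rules are resolved, and the prices are made-up placeholders rather than real rates.

```python
from fnmatch import fnmatchcase
from typing import Optional, Tuple

# Hypothetical rules: (pattern, input price per 1M tokens, output price per 1M tokens).
# Prices are placeholders for illustration only.
RULES = [
    ("gpt-4o", 1.0, 2.0),
    ("gpt-4*", 3.0, 4.0),
    ("claude-*", 5.0, 6.0),
]


def match_rule(model: str) -> Optional[Tuple[str, float, float]]:
    """Return (pattern, input cost per token, output cost per token) or None."""
    for pattern, input_per_million, output_per_million in RULES:
        if fnmatchcase(model, pattern):  # assumption: first matching rule wins
            return pattern, input_per_million / 1_000_000, output_per_million / 1_000_000
    return None


for name in ("gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "claude-3-opus", "mistral-large"):
    print(name, match_rule(name))
```

Listing the exact name before the broader wildcard keeps the more specific rule from being shadowed under this ordering.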
If you provide a supported model on your LLM span and neither SDK-level nor project-level costs are configured, Confident AI will automatically look up the per-token pricing and calculate the cost — no additional code needed.
Automatic price lookup is only available for OpenAI, Anthropic, and Gemini models. For all other providers, set per-token costs manually or configure Model Costs in your project settings.
Trace cost is set automatically by summing the cost of all LLM spans in the trace. As with LLM spans, trace cost is null if none of the trace’s LLM spans has a non-null cost.
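The aggregation can be pictured as a sum that skips spans without a cost and stays null when no span has one. The following is a minimal sketch of that behavior under those assumptions, with a hypothetical function name, not the platform’s actual code.

```python
from typing import Iterable, Optional


def trace_cost(span_costs: Iterable[Optional[float]]) -> Optional[float]:
    """Sum the non-null LLM span costs; return None if there are none."""
    known = [cost for cost in span_costs if cost is not None]
    return sum(known) if known else None


print(trace_cost([0.5, None, 0.25]))  # spans without a cost are skipped
print(trace_cost([None, None]))       # no span has a cost, so the trace has none
```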
With cost tracking configured, continue setting up the rest of your instrumentation.