What is Inference Cost?

The per-request cost of running a prompt through an LLM, calculated from input and output token counts.

Inference cost is the dollar amount an LLM provider charges for processing a single API request. Since prices are quoted per million tokens, it is calculated as: (input_tokens / 1,000,000 × input_price_per_1M) + (output_tokens / 1,000,000 × output_price_per_1M).
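The formula can be sketched as a small helper. The prices in the example call are illustrative placeholders, not a quote from any provider:

```python
def inference_cost(input_tokens: int, output_tokens: int,
                   input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Dollar cost of one request, given per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_1m \
         + (output_tokens / 1_000_000) * output_price_per_1m

# Example: 1,200 input tokens and 350 output tokens
# at illustrative prices of $2.50 / $10.00 per 1M tokens.
cost = inference_cost(1200, 350, 2.50, 10.00)
print(f"${cost:.4f}")  # → $0.0065
```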

Inference costs vary dramatically across models. GPT-4o costs $2.50 per 1M input tokens while GPT-4o mini costs $0.15, a difference of more than 16x. Choosing the right model for each task is one of the most impactful cost levers available.
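A quick check of that ratio, using only the input prices quoted above (output prices, which also differ per model, are omitted here):

```python
# $ per 1M input tokens, as quoted in the text above
PRICES = {"gpt-4o": 2.50, "gpt-4o-mini": 0.15}

tokens = 10_000  # input-side cost of the same 10,000-token prompt
for model, price in PRICES.items():
    print(f"{model}: ${tokens / 1_000_000 * price:.4f}")

ratio = PRICES["gpt-4o"] / PRICES["gpt-4o-mini"]
print(f"ratio: {ratio:.1f}x")  # → ratio: 16.7x
```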

GateCtr reports the exact inference cost for every request in the response metadata, enabling precise cost attribution per project, user, and model. This data feeds the analytics dashboard and budget enforcement system.

How GateCtr Handles Inference Cost

GateCtr handles inference cost automatically on every API call, with no configuration required. Results are visible in real time in the GateCtr dashboard, with per-request breakdowns of tokens, cost, and savings.

Related models

See GateCtr in action for free

No credit card required. Up and running in 5 minutes.

Start free