
GateCtr + LlamaIndex

Add budget control and token optimization to LlamaIndex pipelines

1. Install

No additional packages required. Use your existing LlamaIndex installation.

2. Configure

Before:

from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o", api_key="sk-...")

After GateCtr:

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="gpt-4o",
    api_key="sk-...",
    api_base="https://api.gatectr.com/v1"
)
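If you prefer not to pass the proxy URL on every LLM object, a sketch of the same configuration via the environment, assuming an OpenAI-compatible client that reads these variables (OPENAI_API_BASE is the legacy OpenAI SDK name; newer SDK versions read OPENAI_BASE_URL, so check which your installed version honors):

```python
import os

# Illustrative: point OpenAI-compatible clients at the GateCtr proxy
# process-wide instead of per-object via api_base.
os.environ["OPENAI_API_BASE"] = "https://api.gatectr.com/v1"
os.environ["OPENAI_BASE_URL"] = "https://api.gatectr.com/v1"

def gatectr_base_url() -> str:
    """Return the base URL OpenAI-compatible clients will use."""
    return os.environ["OPENAI_API_BASE"]
```

Per-object api_base, as in the snippet above, still takes precedence and is the safer choice when only some pipelines should route through GateCtr.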

3. Test

Make a test call and check the GateCtr dashboard for token savings and cost data.

What GateCtr does under the hood for LlamaIndex

When you route LlamaIndex calls through GateCtr, every request is automatically compressed (up to 40% fewer tokens), scored for complexity to select the optimal model, and checked against your budget cap before reaching the LLM provider. You get full observability (tokens, cost, latency) in the GateCtr dashboard.
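The budget-cap step described above can be sketched as a pre-flight check; the function names, per-token rate, and thresholds below are illustrative, not GateCtr's actual internals:

```python
def estimate_cost_usd(prompt_tokens: int, rate_per_1k: float = 0.005) -> float:
    """Rough request cost from a token count and a per-1k-token rate (illustrative)."""
    return prompt_tokens / 1000 * rate_per_1k

def within_budget(spent_usd: float, cap_usd: float, prompt_tokens: int) -> bool:
    """True if the request can proceed without breaching the budget cap."""
    return spent_usd + estimate_cost_usd(prompt_tokens) <= cap_usd

# A request is forwarded to the provider only when the cap check passes;
# otherwise it is rejected before any provider tokens are spent.
```
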

Start saving with LlamaIndex for free

No credit card required. Up and running in 5 minutes.