What is Prompt Compression?

Prompt compression is the automated process of reducing the length of a prompt before it is sent to an LLM. Unlike simple truncation, which cuts text off at a length limit regardless of what is lost, compression aims to preserve the semantic content, so the model receives the same information in fewer tokens.

Techniques include removing filler words, condensing verbose instructions, summarizing long context, and eliminating duplicate information. GateCtr applies prompt compression transparently on every API call, reducing prompt size by up to 40%.
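
To make two of these techniques concrete, here is a minimal sketch of filler-word removal and duplicate elimination. It is an illustration only, not GateCtr's implementation: the function name compress_prompt and the FILLERS list are hypothetical, and a production compressor would need far more careful rules to avoid changing meaning.

```python
import re

# Hypothetical filler list for illustration; not GateCtr's actual rule set.
FILLERS = ("please", "kindly", "basically", "actually", "really", "very", "just")

# Match a filler word plus any whitespace that follows it.
_FILLER_RE = re.compile(r"\b(?:" + "|".join(FILLERS) + r")\b\s*", re.IGNORECASE)


def compress_prompt(prompt: str) -> str:
    """Drop filler words and repeated sentences, then normalize whitespace."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", prompt.strip())
    seen = set()
    kept = []
    for sentence in sentences:
        cleaned = _FILLER_RE.sub("", sentence)
        cleaned = re.sub(r"\s+", " ", cleaned).strip()
        key = cleaned.lower()  # case-insensitive duplicate check
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return " ".join(kept)


if __name__ == "__main__":
    text = "Please summarize the report. Please summarize the report. It is really very long."
    print(compress_prompt(text))
```

A word list this blunt can alter meaning (for example, "just" in "a just decision"), which is why real compressors tend to rely on context-aware methods rather than fixed lists.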

The key metric is the compression ratio, the compressed prompt size divided by the original size. A ratio of 0.6 means the compressed prompt is 60% of the original size, which is a 40% reduction. GateCtr returns this metric in every response so you can measure savings per request.
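
The arithmetic behind the ratio can be sketched as follows. The helper names compression_ratio and token_savings are hypothetical, and the token counts are assumed to come from whatever tokenizer you use; the name of the field GateCtr uses to report the ratio is not shown here because the source does not specify it.

```python
def compression_ratio(original_tokens: int, compressed_tokens: int) -> float:
    """Return compressed size as a fraction of the original size."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return compressed_tokens / original_tokens


def token_savings(ratio: float) -> float:
    """Return the fraction of tokens saved for a given compression ratio."""
    return 1.0 - ratio


if __name__ == "__main__":
    # Example values chosen for illustration: a 1000-token prompt cut to 600.
    ratio = compression_ratio(1000, 600)
    print(f"ratio={ratio:.2f}, savings={token_savings(ratio):.0%}")
```

Note that a lower ratio means more compression: a ratio of 1.0 means nothing was removed.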
