Context Summarization

Automatically summarize older messages when the context window fills up — for both CugaAgent and CugaSupervisor.

For long conversations, CUGA can roll older turns into a running summary so the LLM keeps the most useful context without overflowing the context window.

The full option list lives in the Settings reference — Context Summarization.

Enable

[context_summarization]
enabled = true
keep_last_n_messages = 10
trim_tokens_to_summarize = 500
summarization_model = "gpt-4o-mini"
trigger_fraction = 0.75

With this configuration:

  • Summarization fires when the prompt would exceed 75% of the model's context window.
  • The last 10 messages are always preserved verbatim.
  • Older messages are condensed into ~500 tokens by gpt-4o-mini.
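
To make the numbers in the configuration above concrete, here is a minimal Python sketch of the threshold arithmetic and of the split between summarized and preserved messages. It is illustrative only and not CUGA's implementation: the function names are hypothetical, and the 128,000-token context window is an assumed value that depends on your model.

```python
def trigger_threshold(context_window: int, trigger_fraction: float) -> int:
    """Token count above which summarization would fire."""
    return int(context_window * trigger_fraction)


def split_history(messages: list[str], keep_last_n: int) -> tuple[list[str], list[str]]:
    """Return (older messages to summarize, recent messages kept verbatim)."""
    if keep_last_n <= 0:
        return messages, []
    return messages[:-keep_last_n], messages[-keep_last_n:]


# Assumed 128,000-token window (hypothetical; check your model's real limit).
threshold = trigger_threshold(128_000, 0.75)

# A 25-message conversation with keep_last_n_messages = 10.
history = [f"message {i}" for i in range(1, 26)]
older, recent = split_history(history, keep_last_n=10)

print(threshold)                 # token count that triggers summarization
print(len(older), len(recent))   # messages summarized vs. kept verbatim
```

Under these assumptions, summarization would start once the prompt passes 96,000 tokens, and the 15 oldest messages would be candidates for the summary while the 10 most recent stay untouched.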

Trigger options

You can use any combination of the three trigger conditions; whichever fires first wins.

Trigger                   Use when
trigger_fraction = 0.75   You want the trigger to track the model's actual context window. Recommended for production.
trigger_tokens = 2000     You want a fixed token cap regardless of model.
trigger_messages = 20     You want to summarize after a fixed number of turns (useful for testing).

If you set more than one, the first condition that becomes true triggers summarization.
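
The "first condition wins" rule amounts to a logical OR over whichever triggers are configured. The sketch below shows that idea in Python; it is an assumption-laden illustration, not CUGA's code. The function name `should_summarize` is hypothetical, and the exact boundary behavior (`>` versus `>=`) is a guess.

```python
from typing import Optional


def should_summarize(
    prompt_tokens: int,
    message_count: int,
    context_window: int,
    trigger_fraction: Optional[float] = None,
    trigger_tokens: Optional[int] = None,
    trigger_messages: Optional[int] = None,
) -> bool:
    """Return True as soon as any configured trigger condition holds."""
    checks = []
    if trigger_fraction is not None:
        # Prompt exceeds the configured share of the context window.
        checks.append(prompt_tokens > context_window * trigger_fraction)
    if trigger_tokens is not None:
        # Fixed token cap, independent of the model (boundary is assumed).
        checks.append(prompt_tokens >= trigger_tokens)
    if trigger_messages is not None:
        # Fixed number of messages (boundary is assumed).
        checks.append(message_count >= trigger_messages)
    # Unset triggers are skipped; with none set, nothing fires.
    return any(checks)


# The fixed token cap fires long before 75% of an assumed 128,000-token window.
fires = should_summarize(
    prompt_tokens=2_500,
    message_count=8,
    context_window=128_000,
    trigger_fraction=0.75,
    trigger_tokens=2_000,
)
print(fires)
```

In this example the 2,000-token cap is reached while the prompt is still far below the fractional threshold, so the cap is the condition that triggers summarization.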

Custom prompt

By default LangChain's built-in summarization prompt is used. To override:

[context_summarization]
custom_summary_prompt = "Provide a concise summary of the following conversation, preserving all numeric values and named entities: {messages}"

The {messages} placeholder is the only required variable.
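
Before deploying a custom prompt, it can help to check locally that the placeholder is present and that the template renders as expected. The following Python sketch does that under one assumption: that the template is filled with Python-style `str.format` substitution. The helper names `validate_prompt` and `render_prompt` are hypothetical and not part of CUGA.

```python
custom_summary_prompt = (
    "Provide a concise summary of the following conversation, "
    "preserving all numeric values and named entities: {messages}"
)


def validate_prompt(template: str) -> None:
    """Raise if the required {messages} placeholder is missing."""
    if "{messages}" not in template:
        raise ValueError("custom_summary_prompt must contain {messages}")


def render_prompt(template: str, messages: list[str]) -> str:
    """Fill the placeholder with the conversation, one message per line."""
    validate_prompt(template)
    return template.format(messages="\n".join(messages))


rendered = render_prompt(
    custom_summary_prompt,
    ["user: What was Q3 revenue?", "assistant: Q3 revenue was 4.2M USD."],
)
print(rendered)
```

If the assumption about `str.format` holds, any other literal braces in the template would need to be doubled (`{{` and `}}`) so they are not treated as placeholders.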

Choice of summarization model

summarization_model is independent of the agent's main model. Most users keep it on a small, inexpensive model (gpt-4o-mini, claude-haiku, etc.), because the goal is fast, lossy compression rather than deep reasoning.

Works with CugaSupervisor

Context summarization applies to both CugaAgent and CugaSupervisor runs. Each delegated sub-agent invocation gets the summarized history just like a standalone agent.

Summarization is lossy by design. If your task depends on remembering every literal detail (e.g. exact figures from a document), prefer the Knowledge Base — it keeps the original document available for retrieval.