v0.10: Context Engine And Compression [PARTIAL]¶

Goal¶

Keep long-running character sessions useful without stuffing everything into every prompt.

Implementation Status¶

[x] Pluggable ContextEngine interface.
[x] DefaultContextEngine adapter.
[x] Deterministic prompt section building.
[x] ContextCompressor (Initial: summarization + tool pruning + fallback).
[x] Replace /compact internals with ContextCompressor reference digest.
[ ] Automated compression triggering in AgentLoop.
[ ] Dedicated advanced tests/token budgets/iterative structured compression.

Scope¶

Add pluggable context engine interface.
Build prompt sections in a stable order.
Add structured running summary.
Protect recent tail and identity/profile head.
Prune tool output.
Redact sensitive data.
Add fallback model behavior for compression failures.
Make summaries reference-only, not new instructions.
Add tests for prompt injection from context files, memory blocks, and retrieved transcripts.

Prompt Order¶

platform/system policy
character profile
current channel/session state
relevant memory
relevant session search
active skills/routines
recent messages

Module Targets¶

backend/agent/g_agent/context/engine.py
backend/agent/g_agent/context/compressor.py
backend/agent/g_agent/context/redaction.py
backend/agent/g_agent/agent/context.py
backend/agent/tests/test_context_engine.py
backend/agent/tests/test_context_compression.py

Acceptance Criteria¶

Prompt assembly order is deterministic.
Memory and retrieved transcripts are fenced.
Recent turns are protected from over-compression.
Large tool output is summarized or pruned.
Compression failure degrades gracefully.

References¶

hermes-agent-ref/agent/context_engine.py
hermes-agent-ref/agent/context_compressor.py
hermes-agent-ref/agent/prompt_builder.py

Agent Handoff¶

Current G-Agent State¶

ContextBuilder is monolithic and builds system prompt + runtime context.
ContextBuilder.strip_runtime_context() already prevents stale runtime metadata from being persisted.
Session.get_history(max_messages=50) currently truncates history by message count, not tokens.
There is a /compact command that replaces current session messages with a digest built by SessionManager._build_digest().

Implementation Strategy¶

Do not replace ContextBuilder in one pass. Add a context engine interface and move behavior behind it gradually.

Recommended shape:

context/engine.py: interface.
context/default_engine.py: wraps current ContextBuilder.
context/compressor.py: future compression engine.
context/redaction.py: secret redaction helpers.

Implementation Slices¶

Add deterministic prompt section builder.
Preserve current prompt output as much as possible.
Add token/char budgeting.
Start char-based if token counter is unavailable.
Add tool-output pruning for saved history.
Summarize large tool results.
Add structured compression.
Protected head: identity/profile.
Protected tail: latest turns.
Summarize middle.
Add automatic compression trigger in AgentLoop when stable.

Tests¶

test_context_engine.py
section order
memory fencing
profile/head protection
test_context_compression.py
large middle history summarized
recent tail preserved
tool output pruned
compression failure fallback
test_prompt_injection_context.py
malicious memory/context file does not become instructions.

Guardrails¶

Do not remove strip_runtime_context.
Do not summarize away unresolved tasks, paths, commands, URLs, or decisions.
Do not let summaries become instructions.

First PR Boundary¶

Context engine interface + current builder adapter + tests proving output order. Shipped with first-slice compression (summarize middle, fallback, prune tool outputs) but lacks dedicated advanced tests/token budgets/iterative structured compression.