Runtime Roadmap — Lean `g-agent` Runtime

Last updated: 2026-02-12

This roadmap tracks runtime priorities for g-agent. The goal is to keep the runtime lightweight, cross-platform, and operator-controlled.

North Star

Build a personal assistant that is:

proactive (can initiate useful work),
memory-strong (recalls user context reliably across sessions),
workflow-capable (Google + browser + channel orchestration),
safe by default (strict boundaries for personal and guest modes).

Scope Boundaries

This roadmap is deliberately narrow in scope.

In scope:

high-value runtime behavior
memory quality
proactive orchestration
security posture for mixed personal/guest usage
observability for daily operations

Out of scope:

heavy framework expansion
channel sprawl without clear reliability/ops value
large control-plane abstractions that reduce local clarity

Delta Tracks and Status

1) Execution Runtime with Checkpoints

Status: implemented (v1)

Implemented:

task lifecycle checkpoints (plan -> execute -> verify -> reflect -> done)
persisted task state in workspace (state/tasks/*.json)
restart-safe completion tracking

Code:

backend/agent/g_agent/agent/runtime.py
backend/agent/g_agent/agent/loop.py
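
The checkpoint lifecycle above can be illustrated with a minimal sketch. Only the stage order and the idea of one JSON file per task come from this roadmap; the function names, the JSON fields, and the atomic-rename approach are assumptions for illustration and may not match what runtime.py actually does.

```python
import json
import tempfile
from pathlib import Path

# Stage order as listed in the roadmap; everything else is hypothetical.
STAGES = ["plan", "execute", "verify", "reflect", "done"]


def load_stage(state_dir: Path, task_id: str) -> str:
    """Return the last checkpointed stage, or the first stage for a new task."""
    path = state_dir / f"{task_id}.json"
    if not path.exists():
        return STAGES[0]
    return json.loads(path.read_text(encoding="utf-8"))["stage"]


def save_stage(state_dir: Path, task_id: str, stage: str) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a partial file."""
    state_dir.mkdir(parents=True, exist_ok=True)
    path = state_dir / f"{task_id}.json"
    tmp = state_dir / f"{task_id}.json.tmp"
    tmp.write_text(json.dumps({"task_id": task_id, "stage": stage}), encoding="utf-8")
    tmp.replace(path)  # rename within the same directory


def advance(state_dir: Path, task_id: str) -> str:
    """Move a task one stage forward and checkpoint the result."""
    current = load_stage(state_dir, task_id)
    if current == "done":
        return current  # a restart must not re-run a completed task
    next_stage = STAGES[STAGES.index(current) + 1]
    save_stage(state_dir, task_id, next_stage)
    return next_stage


if __name__ == "__main__":
    tasks = Path(tempfile.mkdtemp()) / "state" / "tasks"
    while advance(tasks, "demo-task") != "done":
        pass
    print(load_stage(tasks, "demo-task"))
```

Because the stage is read back from disk on every call, a process that restarts mid-task resumes from the last persisted stage instead of starting over.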

2) Memory Retrieval Intelligence

Status: implemented (v1)

Implemented:

memory metadata schema (type, confidence, source, last_seen, supersedes)
dedup/supersede behavior
ranked recall via confidence + recency + relevance

Code:

backend/agent/g_agent/agent/memory.py
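
As a rough sketch of how ranked recall over this metadata might work: the field names follow the schema listed above, but the `id` field, the weights, the 30-day half-life, and the token-overlap relevance measure are all assumptions chosen for illustration, not the scoring used in memory.py.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class MemoryItem:
    id: str                      # hypothetical identifier, used by `supersedes`
    text: str
    type: str
    confidence: float            # assumed range 0.0 .. 1.0
    source: str
    last_seen: datetime
    supersedes: Optional[str] = None  # id of the older item this one replaces


def relevance(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the memory text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def recency(last_seen: datetime, now: datetime, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 when just seen, 0.5 after one half-life."""
    age_days = max((now - last_seen).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rank(items: List[MemoryItem], query: str, now: datetime) -> List[MemoryItem]:
    """Drop superseded items, then sort by a weighted sum of the three signals."""
    superseded = {m.supersedes for m in items if m.supersedes}
    live = [m for m in items if m.id not in superseded]

    def score(m: MemoryItem) -> float:
        return (
            0.4 * m.confidence
            + 0.2 * recency(m.last_seen, now)
            + 0.4 * relevance(query, m.text)
        )

    return sorted(live, key=score, reverse=True)


if __name__ == "__main__":
    now = datetime(2026, 2, 12, tzinfo=timezone.utc)
    items = [
        MemoryItem("a", "user lives in Berlin", "fact", 0.9, "chat", now),
        MemoryItem("b", "user lives in Lisbon", "fact", 0.9, "chat", now, supersedes="a"),
    ]
    print([m.text for m in rank(items, "lives in which city", now)])
```

Filtering superseded items before scoring keeps a stale fact from outranking its replacement just because it happens to match the query wording better.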

3) Proactive Agent Engine

Status: implemented (v1)

Implemented:

perfect-day schedule prompts
calendar watch lead-time reminders
quiet-hours gating + dedupe state

Code:

backend/agent/g_agent/proactive/engine.py
backend/agent/g_agent/cli/commands.py
backend/agent/g_agent/cron/service.py
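
Quiet-hours gating and dedupe state could be combined roughly as below. This is a sketch under assumptions: the 22:00 to 07:00 window, the reminder key format, and the in-memory set standing in for persisted dedupe state are hypothetical and not taken from engine.py.

```python
from datetime import datetime, time
from typing import Set


def in_quiet_hours(now: time, start: time, end: time) -> bool:
    """True if `now` is inside [start, end), including windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def should_send(
    key: str,
    now: datetime,
    sent_keys: Set[str],
    quiet_start: time = time(22, 0),
    quiet_end: time = time(7, 0),
) -> bool:
    """Allow a proactive message only outside quiet hours and only once per key."""
    if in_quiet_hours(now.time(), quiet_start, quiet_end):
        return False
    if key in sent_keys:
        return False  # already delivered; suppress the duplicate
    sent_keys.add(key)
    return True


if __name__ == "__main__":
    sent: Set[str] = set()
    # Hypothetical key: one reminder per calendar event and lead time.
    key = "calendar:event-123:lead-15m"
    print(should_send(key, datetime(2026, 2, 12, 9, 0), sent))
    print(should_send(key, datetime(2026, 2, 12, 9, 5), sent))
```

Note that a message suppressed by quiet hours is not recorded as sent here, so a later check outside the window can still deliver it.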

4) Multimodal Output

Status: implemented (v1)

Implemented:

outbound message tool supports text/image/voice/sticker/document
generated voice + sticker fallback from plain text
workflow-pack media flags (--voice, --image, --sticker)

Code:

backend/agent/g_agent/agent/tools/message.py
backend/agent/g_agent/agent/workflow_packs.py

5) Workflow Packs (Google-first)

Status: implemented (v1)

Implemented packs:

daily_brief
meeting_prep
inbox_zero_batch

Implemented scope:

intent-level orchestration prompting
multimodal output options
media-first --silent mode

Code:

backend/agent/g_agent/agent/workflow_packs.py
backend/agent/g_agent/agent/loop.py

6) Guest Safety Boundary

Status: implemented (v1)

Implemented:

policy presets: personal_full, guest_limited, guest_readonly
channel/sender scoped policy map
deny-by-default behavior for guest restrictions

Code:

backend/agent/g_agent/config/presets.py
backend/agent/g_agent/agent/loop.py
backend/agent/g_agent/cli/commands.py
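
One way to picture a channel/sender scoped policy map with deny-by-default is sketched below. The three preset names come from this roadmap; the tool names, the wildcard key, the identity normalization, and the choice to fall back to guest_readonly for unknown senders are assumptions for illustration, not the behavior defined in presets.py.

```python
from typing import Dict, FrozenSet, Optional, Tuple

# Preset names are from the roadmap; the tool allowlists are hypothetical.
# None means "no allowlist restriction" for the owner's own sessions.
PRESETS: Dict[str, Optional[FrozenSet[str]]] = {
    "personal_full": None,
    "guest_limited": frozenset({"web_search", "send_message"}),
    "guest_readonly": frozenset({"web_search"}),
}

# Hypothetical scoped map: exact (channel, sender) first, then a channel wildcard.
POLICY_MAP: Dict[Tuple[str, str], str] = {
    ("telegram", "owner-123"): "personal_full",
    ("telegram", "*"): "guest_limited",
}


def normalize(value: str) -> str:
    """Normalize identities so 'Telegram ' and 'telegram' resolve to the same key."""
    return value.strip().lower()


def resolve_preset(channel: str, sender: str) -> str:
    """Most specific match wins; anything unmatched gets the most restrictive preset."""
    channel, sender = normalize(channel), normalize(sender)
    for key in ((channel, sender), (channel, "*")):
        if key in POLICY_MAP:
            return POLICY_MAP[key]
    return "guest_readonly"


def is_allowed(channel: str, sender: str, tool: str) -> bool:
    """Guests may only call tools explicitly listed in their preset."""
    allowed = PRESETS[resolve_preset(channel, sender)]
    if allowed is None:
        return True
    return tool in allowed


if __name__ == "__main__":
    print(is_allowed("telegram", "owner-123", "shell_exec"))
    print(is_allowed("telegram", "stranger-9", "shell_exec"))
    print(is_allowed("whatsapp", "stranger-9", "send_message"))
```

The important property is the direction of the default: a guest tool call succeeds only when an allowlist names it, so adding a new tool never widens guest access by accident.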

7) Observability and Evaluation Harness

Status: implemented (v1)

Implemented:

local metrics event sink
tool/cron reliability snapshots
latency + success-rate surfaces in status/doctor paths

Code:

backend/agent/g_agent/observability/metrics.py
backend/agent/g_agent/cli/commands.py
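
A local event sink with a reliability snapshot might look like the sketch below. The events.jsonl file name appears elsewhere in this roadmap; the event fields, function names, and the use of a median for latency are assumptions for illustration and are not taken from metrics.py.

```python
import json
import tempfile
import time
from pathlib import Path
from statistics import median
from typing import Dict, List


def record_event(path: Path, name: str, ok: bool, latency_ms: float) -> None:
    """Append one event as a JSON line; append-only writes keep the sink simple."""
    path.parent.mkdir(parents=True, exist_ok=True)
    event = {"ts": time.time(), "name": name, "ok": ok, "latency_ms": latency_ms}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")


def snapshot(path: Path) -> Dict[str, dict]:
    """Group events by name and report count, success rate, and median latency."""
    grouped: Dict[str, List[dict]] = {}
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                event = json.loads(line)
                grouped.setdefault(event["name"], []).append(event)
    return {
        name: {
            "count": len(events),
            "success_rate": sum(1 for e in events if e["ok"]) / len(events),
            "median_latency_ms": median(e["latency_ms"] for e in events),
        }
        for name, events in grouped.items()
    }


if __name__ == "__main__":
    sink = Path(tempfile.mkdtemp()) / "events.jsonl"
    record_event(sink, "tool.web_search", True, 120.0)
    record_event(sink, "tool.web_search", False, 900.0)
    print(snapshot(sink))
```

A status or doctor command could render this dictionary directly, which keeps the reporting path free of any external metrics service.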

What Remains (Hardening Backlog)

Recently Completed

Telegram/WhatsApp reconnect harness coverage:
backend/agent/tests/test_channel_reconnect.py
OAuth edge-case regression checks added for expired refresh token and scope drift:
backend/agent/tests/test_google_oauth_edges.py
backend/agent/g_agent/agent/tools/google_workspace.py
Provider-specific retry taxonomy for transient tool failures:
backend/agent/g_agent/agent/loop.py
backend/agent/tests/test_retry_and_idempotency.py
Memory quality fixtures and ranking consistency assertions:
backend/agent/tests/fixtures/memory_conflicts.md
backend/agent/tests/test_memory_intelligence.py
backend/agent/g_agent/agent/memory.py
Multilingual overlap fixture coverage + summary/fact drift checks:
backend/agent/tests/fixtures/memory_multilingual.md
backend/agent/tests/test_memory_intelligence.py
backend/agent/g_agent/agent/memory.py
Metrics export path + dashboard-friendly scrape summary:
backend/agent/g_agent/observability/metrics.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_observability_metrics.py
Optional lightweight HTTP /metrics endpoint mode (disabled by default):
backend/agent/g_agent/observability/http_server.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_metrics_http_server.py
Metrics retention/pruning controls for events.jsonl growth management:
backend/agent/g_agent/observability/metrics.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_metrics_retention_alerts.py
Alert-threshold summary output for uptime/SLO monitoring:
backend/agent/g_agent/observability/metrics.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_metrics_retention_alerts.py
Integration-level reconnect harness (manager + dispatcher + multi-channel recovery):
backend/agent/tests/test_channel_reconnect.py
backend/agent/g_agent/channels/manager.py
Semantic normalization checks for mixed-language recall drift:
backend/agent/g_agent/agent/memory.py
backend/agent/tests/test_memory_intelligence.py
Cross-scope memory conflict regression checks (profile/long-term/custom):
backend/agent/g_agent/agent/memory.py
backend/agent/tests/test_memory_intelligence.py
Operator-facing memory audit diagnostics (memory-audit + doctor checks):
backend/agent/g_agent/cli/commands.py
backend/agent/g_agent/agent/memory.py
backend/agent/tests/test_memory_intelligence.py
Operator-facing security baseline audit automation (security-audit + doctor summary):
backend/agent/g_agent/security/audit.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_security_audit.py
Operator-facing security baseline auto-remediation helper (security-fix dry-run/apply):
backend/agent/g_agent/security/fix.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_security_fix.py
Channel manager long-running supervisor restarts for unexpected channel crashes:
backend/agent/g_agent/channels/manager.py
backend/agent/tests/test_channel_reconnect.py
Outbound dispatch retries with capped exponential backoff for transient channel-send failures:
backend/agent/g_agent/channels/manager.py
backend/agent/tests/test_retry_and_idempotency.py
Metrics alert compact summaries surfaced in status/doctor plus Prometheus alert gauges:
backend/agent/g_agent/observability/metrics.py
backend/agent/g_agent/cli/commands.py
backend/agent/tests/test_metrics_retention_alerts.py
Scoped tool-policy identity normalization + guardrail audit checks:
backend/agent/g_agent/agent/loop.py
backend/agent/g_agent/security/audit.py
backend/agent/tests/test_policy_presets.py
backend/agent/tests/test_security_audit.py
Channel supervisor restart-burst cooldown guard for long crash loops:
backend/agent/g_agent/channels/manager.py
backend/agent/tests/test_channel_reconnect.py
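
For readers unfamiliar with the pattern, the outbound dispatch retries with capped exponential backoff listed above can be sketched as follows. The exception class, function name, and default values are hypothetical, and this version omits jitter; it is not the code in channels/manager.py.

```python
import time
from typing import Any, Callable


class TransientSendError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def send_with_retry(
    send: Callable[[Any], Any],
    payload: Any,
    attempts: int = 5,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Retry transient failures, doubling the delay each time up to `cap` seconds."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send(payload)
        except TransientSendError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the last error to the caller
            sleep(min(cap, base * (2 ** attempt)))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_send(message: str) -> str:
        # Fails twice, then succeeds, to exercise the retry path.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientSendError("temporary outage")
        return f"sent: {message}"

    print(send_with_retry(flaky_send, "hello", sleep=lambda s: None))
```

Only errors classified as transient are retried; anything else propagates immediately, which fits the provider-specific retry taxonomy mentioned earlier in this list. Passing `sleep` as a parameter makes the delays easy to assert in tests without waiting.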

P0 — Reliability Gaps (next)

none (closed)

P1 — Memory Quality

none (closed)

P2 — Observability Ops

none (closed)

Exit Criteria for Delta Phase

Delta phase is considered complete when:

all P0 items are implemented and covered by automated tests
memory regression fixtures are in CI and stable
one optional metrics export path is available and documented
production checklist in backend/agent/SECURITY.md remains valid without exceptions

Relationship to Main Docs

product-level narrative: README.md
backend setup and operations: backend/agent/README.md
security posture and hardening: backend/agent/SECURITY.md

This roadmap only tracks the remaining g-agent runtime work.
