
v0.8: Owner-Reviewed Learning Loop [PARTIAL]

Implementation Status

  • [x] LearningCandidate model exists.
  • [x] LearningQueue persists candidates in SQLite.
  • [x] /learn chat command can list and inspect pending candidates.
  • [x] /learn can approve, reject, edit, apply, and roll back skill candidates.
  • [x] Learning queue persists diff_preview, applied_at, and rollback metadata.
  • [x] Background reviewer first slice exists behind opt-in AgentLoop config.
  • [x] Learning heuristics tightened: applying profile, relationship, routine, or tool_quirk candidates returns manual_review_required; applying explicit memory candidates works; weak memory candidates return manual_review_required. 11 tests pass.
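
The tightened apply heuristics in the last status item can be read as a small decision table. The following is a minimal sketch of that reading, not the shipped code: the function name apply_decision, the explicit flag, and the "apply" return value are hypothetical, and only the manual_review_required outcome comes from the status line itself.

```python
from enum import Enum


class CandidateType(str, Enum):
    MEMORY = "memory_candidate"
    PROFILE = "profile_candidate"
    SKILL = "skill_candidate"
    ROUTINE = "routine_candidate"
    RELATIONSHIP = "relationship_update"
    TOOL_QUIRK = "tool_quirk"


# Types whose apply step is not automated yet, per the status line.
MANUAL_ONLY = {
    CandidateType.PROFILE,
    CandidateType.RELATIONSHIP,
    CandidateType.ROUTINE,
    CandidateType.TOOL_QUIRK,
}


def apply_decision(candidate_type: CandidateType, explicit: bool = False) -> str:
    """Return "apply" or "manual_review_required" for an approved candidate.

    `explicit` is a hypothetical flag: True when the owner stated the fact
    directly, False when the reviewer inferred it from weak evidence.
    """
    if candidate_type in MANUAL_ONLY:
        return "manual_review_required"
    if candidate_type is CandidateType.MEMORY:
        return "apply" if explicit else "manual_review_required"
    # Skill candidates go through the owner-driven apply and rollback path.
    return "apply"
```

Keeping the rule in one pure function makes it easy to cover every candidate type in a table-driven test.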

Goal

Let the character improve without silently drifting.

User Outcome

After useful interactions, G-Agent proposes memory, profile, skill, routine, relationship, or tool-quirk updates. The owner can accept, edit, reject, or roll back changes.

Scope

  • Add background review after response delivery.
  • Inspect recent conversation, tool-heavy work, repeated errors, new owner preferences, new project facts, reusable workflow patterns, and character/profile drift.
  • Produce learning candidates.
  • Store candidates in a learning queue.
  • Expose review through CLI/chat first, Web UI later.
  • Default to owner-reviewed, not auto-apply.
  • Add auto-apply only for low-risk facts after enough trust and tests.

Candidate Types

  • memory_candidate
  • profile_candidate
  • skill_candidate
  • routine_candidate
  • relationship_update
  • tool_quirk

Queue Fields

  • diff
  • reason
  • source session/message ids
  • risk level
  • accept/reject/edit state
  • rollback metadata
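
One way to fill the diff field is a unified diff between the current and proposed text, which also fits the still-open "profile changes show diffs" criterion. This is a sketch under assumptions: the helper name diff_preview and the file labels are made up for illustration, and the real diff_preview column may be produced differently.

```python
import difflib


def diff_preview(before: str, after: str, label: str = "profile") -> str:
    """Return a unified diff of two text blobs, or "" when they are identical."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{label} (current)",
        tofile=f"{label} (proposed)",
        lineterm="",  # inputs have no trailing newlines, so emit none either
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(diff_preview("tone: formal\nname: G", "tone: casual\nname: G"))
```

A text diff keeps the preview readable in both the CLI and a later Web UI without extra rendering logic.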

Module Targets

  • backend/agent/g_agent/learning/queue.py
  • backend/agent/g_agent/learning/reviewer.py
  • backend/agent/g_agent/learning/types.py
  • backend/agent/g_agent/agent/hooks.py
  • backend/agent/tests/test_learning_queue.py
  • backend/agent/tests/test_learning_reviewer.py

Acceptance Criteria

  • [x] Learning candidates persist with source links.
  • [x] Owner can list and inspect candidates.
  • [x] Owner can approve, reject, and edit candidates.
  • [x] Owner can apply and roll back skill candidates.
  • [ ] Profile changes show diffs.
  • [ ] Rejected candidates do not immediately reappear without new evidence.
  • [x] Background review is scheduled without blocking the main response when enabled.
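
The open criterion about rejected candidates could be met by fingerprinting each proposal and remembering which source messages had already been seen when it was rejected. The sketch below is only one possible approach; RejectionLog, fingerprint, and the in-memory storage are hypothetical names and choices, and a real version would presumably persist this next to the queue.

```python
import hashlib


def fingerprint(candidate_type: str, diff: str) -> str:
    """Stable hash of what a candidate proposes, ignoring case and whitespace."""
    normalized = " ".join(diff.lower().split())
    payload = f"{candidate_type}\n{normalized}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class RejectionLog:
    """Remembers the evidence already seen for each rejected proposal."""

    def __init__(self) -> None:
        self._seen: dict[str, set[str]] = {}

    def reject(self, candidate_type: str, diff: str, message_ids: list[str]) -> None:
        key = fingerprint(candidate_type, diff)
        self._seen.setdefault(key, set()).update(message_ids)

    def should_propose(self, candidate_type: str, diff: str, message_ids: list[str]) -> bool:
        seen = self._seen.get(fingerprint(candidate_type, diff))
        if seen is None:
            return True  # never rejected before
        # Re-propose only when at least one source message is new evidence.
        return bool(set(message_ids) - seen)
```

Here "new evidence" is defined narrowly as a message id that was not attached to the rejected candidate; a stricter policy could also require a cooldown period.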

References

  • hermes-agent-ref/run_agent.py

Agent Handoff

Current G-Agent State

  • G-Agent can already write memory directly through tools.
  • LearningQueue, LearningCandidate, and BackgroundLearningReviewer exist.
  • /learn exposes queue list/inspect, approve/reject, edit, skill apply, and skill rollback through chat commands.
  • The background reviewer hook after response delivery exists only as a first slice behind opt-in AgentLoop config; it does not run unless enabled.
  • AgentLoop._process_message() is the right integration point after final response/logging and before task completion.
  • TaskCheckpointStore can link learning candidates back to task ids.
  • SQLite session store from v0.2 should provide session/message ids. If v0.2 is not done, learning candidates can temporarily link to session key and task id.

Implementation Strategy

Implement queue first, reviewer second, auto-apply never by default.

Recommended shape:

  • learning/types.py: candidate models.
  • learning/queue.py: persistence and lifecycle.
  • learning/reviewer.py: LLM or rule-based candidate generation.
  • agent/hooks.py: post-response hook runner.
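
For the persistence part of learning/queue.py, a SQLite table along the following lines would cover the queue fields listed earlier. This is an illustrative sketch only: the table name, column names, default values, and helper functions are assumptions, not the actual LearningQueue schema.

```python
import json
import sqlite3

# Hypothetical schema; list and dict fields are stored as JSON text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS learning_candidates (
    id                 TEXT PRIMARY KEY,
    type               TEXT NOT NULL,
    diff               TEXT NOT NULL,
    reason             TEXT NOT NULL,
    source_session_id  TEXT,
    source_message_ids TEXT NOT NULL DEFAULT '[]',
    risk               TEXT NOT NULL DEFAULT 'low',
    status             TEXT NOT NULL DEFAULT 'pending',
    rollback_metadata  TEXT
)
"""


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def add_candidate(conn, candidate_id, type_, diff, reason, session_id, message_ids, risk="low"):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO learning_candidates "
            "(id, type, diff, reason, source_session_id, source_message_ids, risk) "
            "VALUES (?, ?, ?, ?, ?, ?, ?)",
            (candidate_id, type_, diff, reason, session_id, json.dumps(message_ids), risk),
        )


def set_status(conn, candidate_id, status) -> bool:
    with conn:
        cur = conn.execute(
            "UPDATE learning_candidates SET status = ? WHERE id = ?",
            (status, candidate_id),
        )
    return cur.rowcount == 1


def list_pending(conn) -> list[dict]:
    rows = conn.execute(
        "SELECT * FROM learning_candidates WHERE status = 'pending' ORDER BY rowid"
    ).fetchall()
    return [dict(row) for row in rows]
```

Passing ":memory:" gives each test an isolated database, which keeps queue tests fast and independent.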

Implementation Slices

  1. Add LearningCandidate model.
  2. Add queue persistence.
       • Prefer SQLite if v0.2 exists.
       • Otherwise JSONL under workspace/state/learning/queue.jsonl.
  3. Add CLI/chat list and inspect.
       • Extend /memory, /profile, /skills, or add /learning.
  4. Add approve/reject/edit/apply.
       • Skill candidates can be applied through owner command review.
       • Memory/profile/routine apply remains future work until their manager layers exist.
  5. Add background reviewer.
       • Runs after response delivery.
       • Failures are logged, never user-blocking.
       • Starts conservative: only propose obvious owner facts/preferences/tool quirks.
Candidate Required Fields

  • id
  • type
  • title
  • diff
  • reason
  • source_session_id or session_key
  • source_message_ids
  • source_task_id
  • risk
  • status
  • created_at
  • updated_at
  • rollback_metadata
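
The required fields above could be captured in a dataclass such as the following. This is a sketch, not the existing LearningCandidate model: the status names, the transition table, and the defaults are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Optional
from uuid import uuid4


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


# Assumed lifecycle; the real status names may differ.
ALLOWED_TRANSITIONS = {
    "pending": {"approved", "rejected"},
    "approved": {"applied", "rejected"},
    "applied": {"rolled_back"},
    "rejected": set(),
    "rolled_back": set(),
}


@dataclass
class LearningCandidate:
    type: str
    title: str
    diff: str
    reason: str
    risk: str = "low"
    status: str = "pending"
    source_session_id: Optional[str] = None
    session_key: Optional[str] = None
    source_message_ids: list[str] = field(default_factory=list)
    source_task_id: Optional[str] = None
    rollback_metadata: Optional[dict[str, Any]] = None
    id: str = field(default_factory=lambda: uuid4().hex)
    created_at: str = field(default_factory=_now)
    updated_at: str = field(default_factory=_now)

    def __post_init__(self) -> None:
        # Either a session id or the temporary session key must link the source.
        if not (self.source_session_id or self.session_key):
            raise ValueError("source_session_id or session_key is required")

    def transition(self, new_status: str) -> None:
        if new_status not in ALLOWED_TRANSITIONS.get(self.status, set()):
            raise ValueError(f"cannot move from {self.status} to {new_status}")
        self.status = new_status
        self.updated_at = _now()
```

Validating transitions on the model keeps the queue and the chat commands from having to repeat the same checks.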

Tests

  • test_learning_queue.py
      • create/list/get
      • accept/reject/edit
      • source links
      • status transitions
  • test_learning_skill_lifecycle.py
      • skill candidate edit/apply/rollback
      • rollback of new and replaced skills
  • test_learning_reviewer.py
      • reviewer proposes obvious memory
      • reviewer ignores noisy chat
      • reviewer failure is non-fatal
  • test_learning_commands.py
      • list/inspect/accept/reject output
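
The skill lifecycle cases above distinguish rolling back a new skill from rolling back a replaced one. A minimal sketch of helpers that such tests could exercise follows; apply_skill, rollback_skill, the .md file layout, and the metadata keys are all assumptions, since the document does not describe how skills are stored.

```python
from pathlib import Path


def apply_skill(skills_dir: Path, name: str, content: str) -> dict:
    """Write a skill file and return the metadata needed to undo the write."""
    path = skills_dir / f"{name}.md"
    previous = path.read_text(encoding="utf-8") if path.exists() else None
    path.write_text(content, encoding="utf-8")
    return {"path": str(path), "previous_content": previous}


def rollback_skill(rollback_metadata: dict) -> None:
    """Undo apply_skill: delete a new skill or restore the replaced one."""
    path = Path(rollback_metadata["path"])
    previous = rollback_metadata["previous_content"]
    if previous is None:
        path.unlink(missing_ok=True)  # the skill did not exist before apply
    else:
        path.write_text(previous, encoding="utf-8")
```

Storing the previous content in the rollback metadata means a rollback needs no other state than the queue row itself.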

Guardrails

  • No silent profile mutation.
  • No skill activation without review.
  • No auto-apply until a later trust policy exists.
  • Redact secrets before storing candidate previews.
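
The redaction guardrail could start with a small pattern-based filter applied before a candidate preview is stored. The patterns below are illustrative assumptions and far from exhaustive; a real filter would need review against the secret formats G-Agent actually handles.

```python
import re

_PATTERNS = [
    # key=value or key: value pairs whose key looks sensitive
    re.compile(r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[:=]\s*)(\S+)"),
    # bearer tokens, for example in a pasted Authorization header
    re.compile(r"(?i)\b(bearer)(\s+)([A-Za-z0-9._~+/=-]{8,})"),
]


def redact(text: str) -> str:
    """Replace secret-looking values with [REDACTED], keeping the key name."""
    for pattern in _PATTERNS:
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return text
```

Keeping the key name in the output lets the owner see that something was removed without exposing the value.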

First PR Boundary

Queue + commands + manual candidate creation. Background reviewer can be a second PR if needed. Shipped with tightened heuristics: applying profile, relationship, routine, or tool_quirk candidates returns manual_review_required; applying explicit memory candidates works; weak memory candidates return manual_review_required. 11 tests pass.