v0.8: Owner-Reviewed Learning Loop [PARTIAL]
Implementation Status
- [x] `LearningCandidate` model exists.
- [x] `LearningQueue` persists candidates in SQLite.
- [x] `/learn` chat command can list and inspect pending candidates.
- [x] `/learn` can approve, reject, edit, apply, and roll back skill candidates.
- [x] Learning queue persists `diff_preview`, `applied_at`, and rollback metadata.
- [x] Background reviewer first slice exists behind opt-in `AgentLoop` config.
- [x] Learning heuristics tightened: apply for profile/relationship/routine/tool_quirk candidates is `manual_review_required`; explicit memory apply works; weak memory is `manual_review_required`; 11 tests pass.
Goal
Let the character improve without silently drifting.
User Outcome
After useful interactions, G-Agent proposes memory, profile, skill, routine, relationship, or tool-quirk updates. The owner can accept, edit, reject, or roll back changes.
Scope
- Add background review after response delivery.
- Inspect recent conversation, tool-heavy work, repeated errors, new owner preferences, new project facts, reusable workflow patterns, and character/profile drift.
- Produce learning candidates.
- Store candidates in a learning queue.
- Expose review through CLI/chat first, Web UI later.
- Default to owner-reviewed, not auto-apply.
- Add auto-apply only for low-risk facts after enough trust and tests.
Candidate Types
- `memory_candidate`
- `profile_candidate`
- `skill_candidate`
- `routine_candidate`
- `relationship_update`
- `tool_quirk`
Queue Fields
- diff
- reason
- source session/message ids
- risk level
- accept/reject/edit state
- rollback metadata
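
The fields above could map onto a SQLite table roughly as follows. This is a minimal sketch using the standard-library `sqlite3` module; the table name, column names, default values, and the helpers `open_queue`, `add_candidate`, and `list_pending` are hypothetical and are not taken from the actual `LearningQueue` implementation.

```python
import json
import sqlite3
import uuid

# Hypothetical schema: column names mirror the queue fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS learning_candidates (
    id TEXT PRIMARY KEY,
    diff TEXT NOT NULL,
    reason TEXT NOT NULL,
    source_session_id TEXT,
    source_message_ids TEXT NOT NULL,  -- JSON-encoded list
    risk TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    rollback_metadata TEXT             -- JSON-encoded dict, NULL until applied
)
"""


def open_queue(path=":memory:"):
    """Open (or create) the queue database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def add_candidate(conn, diff, reason, session_id, message_ids, risk="low"):
    """Insert a pending candidate and return its generated id."""
    candidate_id = uuid.uuid4().hex
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO learning_candidates "
            "(id, diff, reason, source_session_id, source_message_ids, risk) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (candidate_id, diff, reason, session_id, json.dumps(message_ids), risk),
        )
    return candidate_id


def list_pending(conn):
    """Return pending candidates as plain dicts, oldest first."""
    rows = conn.execute(
        "SELECT * FROM learning_candidates WHERE status = 'pending' ORDER BY rowid"
    ).fetchall()
    return [dict(row) for row in rows]
```

Storing the message ids as a JSON string keeps the sketch to a single table; a join table would be the alternative if candidates need to be queried by message id.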
Module Targets
- `backend/agent/g_agent/learning/queue.py`
- `backend/agent/g_agent/learning/reviewer.py`
- `backend/agent/g_agent/learning/types.py`
- `backend/agent/g_agent/agent/hooks.py`
- `backend/agent/tests/test_learning_queue.py`
- `backend/agent/tests/test_learning_reviewer.py`
Acceptance Criteria
- [x] Learning candidates persist with source links.
- [x] Owner can list and inspect candidates.
- [x] Owner can approve, reject, and edit candidates.
- [x] Owner can apply and roll back skill candidates.
- [ ] Profile changes show diffs.
- [ ] Rejected candidates do not immediately reappear without new evidence.
- [x] Background review is scheduled without blocking the main response when enabled.
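
For the still-open criterion that rejected candidates should not reappear without new evidence, one possible approach is to fingerprint each proposal and remember which source messages it was based on. The sketch below is an assumption about how this could work, not a description of existing code; `fingerprint` and `RejectionLog` are hypothetical names, and "new evidence" is interpreted here as at least one message id that was not cited by the rejected proposal.

```python
import hashlib


def fingerprint(candidate_type, diff):
    """Stable hash of the proposal itself, ignoring whitespace and case."""
    normalized = " ".join(diff.lower().split())
    return hashlib.sha256(f"{candidate_type}\n{normalized}".encode("utf-8")).hexdigest()


class RejectionLog:
    """Hypothetical helper: remembers the evidence behind rejected proposals."""

    def __init__(self):
        self._seen = {}  # fingerprint -> set of message ids already considered

    def record_rejection(self, candidate_type, diff, message_ids):
        key = fingerprint(candidate_type, diff)
        self._seen.setdefault(key, set()).update(message_ids)

    def should_propose(self, candidate_type, diff, message_ids):
        """Allow a repeat proposal only if it cites at least one unseen message."""
        seen = self._seen.get(fingerprint(candidate_type, diff))
        if seen is None:
            return True
        return bool(set(message_ids) - seen)
```

In a real queue the rejection log would need to persist alongside the candidates; the in-memory dict here only illustrates the comparison.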
References
hermes-agent-ref/run_agent.py
Agent Handoff
Current G-Agent State
- G-Agent can already write memory directly through tools.
- `LearningQueue`, `LearningCandidate`, and `BackgroundLearningReviewer` exist.
- `/learn` exposes queue list/inspect, approve/reject, edit, skill apply, and skill rollback through chat commands.
- There is no background reviewer hook after response delivery unless the opt-in `AgentLoop` config enables it.
- `AgentLoop._process_message()` is the right integration point after final response/logging and before task completion.
- `TaskCheckpointStore` can link learning candidates back to task ids.
- SQLite session store from v0.2 should provide session/message ids. If v0.2 is not done, learning candidates can temporarily link to session key and task id.
Implementation Strategy
Implement queue first, reviewer second, auto-apply never by default.
Recommended shape:
- `learning/types.py`: candidate models.
- `learning/queue.py`: persistence and lifecycle.
- `learning/reviewer.py`: LLM or rule-based candidate generation.
- `agent/hooks.py`: post-response hook runner.
Implementation Slices
- Add `LearningCandidate` model.
- Add queue persistence.
  - Prefer SQLite if v0.2 exists.
  - Otherwise JSONL under `workspace/state/learning/queue.jsonl`.
- Add CLI/chat list and inspect.
  - Extend `/memory`, `/profile`, `/skills`, or add `/learning`.
- Add approve/reject/edit/apply.
  - Skill candidates can be applied through owner command review.
  - Memory/profile/routine apply remains future work until their manager layers exist.
- Add background reviewer.
  - Runs after response delivery.
  - Failures are logged, never user-blocking.
  - Starts conservative: only propose obvious owner facts/preferences/tool quirks.
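
The background reviewer slice (runs after delivery, opt-in, failures logged and never user-blocking) could be wired up along these lines. This is a sketch under stated assumptions: it assumes the agent loop is asyncio-based, and `run_review_safely`, `schedule_review`, and the `enabled` flag are hypothetical names rather than the real `AgentLoop` or `agent/hooks.py` API.

```python
import asyncio
import logging

logger = logging.getLogger("learning.reviewer")


async def run_review_safely(review, session_key):
    """Run the reviewer coroutine and swallow failures so they never reach the user."""
    try:
        return await review(session_key)
    except Exception:
        logger.exception("background learning review failed for %s", session_key)
        return []


def schedule_review(review, session_key, enabled=False):
    """Schedule the review as a background task; returns None when opted out.

    Must be called from inside a running event loop. The caller should keep
    a reference to the returned task so it is not garbage-collected early.
    """
    if not enabled:
        return None
    return asyncio.create_task(run_review_safely(review, session_key))
```

Calling `schedule_review(...)` after the response has been sent returns immediately, so the review runs concurrently with whatever the loop does next instead of delaying the reply.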
Candidate Required Fields
- `id`
- `type`
- `title`
- `diff`
- `reason`
- `source_session_id` or `session_key`
- `source_message_ids`
- `source_task_id`
- `risk`
- `status`
- `created_at`
- `updated_at`
- `rollback_metadata`
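
As an illustration, the required fields could be expressed as a dataclass like the one below. The class is deliberately named `CandidateSketch` to make clear it is not the shipped `LearningCandidate` model; the default values (`"low"`, `"pending"`), the string timestamps, and the validation rule are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import uuid


class CandidateType(str, Enum):
    MEMORY = "memory_candidate"
    PROFILE = "profile_candidate"
    SKILL = "skill_candidate"
    ROUTINE = "routine_candidate"
    RELATIONSHIP = "relationship_update"
    TOOL_QUIRK = "tool_quirk"


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class CandidateSketch:
    type: CandidateType
    title: str
    diff: str
    reason: str
    source_session_id: Optional[str] = None
    session_key: Optional[str] = None
    source_message_ids: list = field(default_factory=list)
    source_task_id: Optional[str] = None
    risk: str = "low"        # assumed default
    status: str = "pending"  # assumed initial state
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: str = field(default_factory=_now)
    updated_at: str = field(default_factory=_now)
    rollback_metadata: Optional[dict] = None

    def __post_init__(self):
        # Either identifier is acceptable, so require at least one of them.
        if not (self.source_session_id or self.session_key):
            raise ValueError("source_session_id or session_key is required")
```

Accepting either `source_session_id` or `session_key` reflects the fallback for the case where the v0.2 session store is not yet available.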
Tests
- `test_learning_queue.py`
  - create/list/get
  - accept/reject/edit
  - source links
  - status transitions
- `test_learning_skill_lifecycle.py`
  - skill candidate edit/apply/rollback
  - rollback of new and replaced skills
- `test_learning_reviewer.py`
  - reviewer proposes obvious memory
  - reviewer ignores noisy chat
  - reviewer failure is non-fatal
- `test_learning_commands.py`
  - list/inspect/accept/reject output.
Guardrails
- No silent profile mutation.
- No skill activation without review.
- No auto-apply until a later trust policy exists.
- Redact secrets before storing candidate previews.
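
The redaction guardrail could start from a small pattern-based filter applied before a preview is written to the queue. The patterns below are illustrative only and will miss many real secret formats; `redact_preview` is a hypothetical helper, not an existing G-Agent function.

```python
import re

# Illustrative patterns only; a real deployment would need a broader, tested set.
_SECRET_PATTERNS = [
    # HTTP bearer tokens (checked first so the token itself is replaced)
    re.compile(r"(?i)\b(bearer)(\s+)([A-Za-z0-9._~+/=-]{8,})"),
    # key=value or key: value pairs whose key name suggests a credential
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)(\S+)"),
]


def redact_preview(text, placeholder="[REDACTED]"):
    """Replace the value part of anything that looks like a credential."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}{placeholder}", text)
    return text
```

For example, `redact_preview("export API_KEY=abc123 now")` keeps the key name and replaces only the value, so the owner can still see what kind of data the candidate touched.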
First PR Boundary
Queue + commands + manual candidate creation. Background reviewer can be a second PR if needed. Shipped with tightened heuristics: apply for profile/relationship/routine/tool_quirk candidates is `manual_review_required`; explicit memory apply works; weak memory is `manual_review_required`; 11 tests pass.