voku/agent-learning

Reviewable finding, proposal, redaction, and decision-history tooling for coding-agent learning loops.

Package info

github.com/voku/agent-learning

pkg:composer/voku/agent-learning

0.7.0 2026-06-23 09:58 UTC

This library provides core domain logic and validation classes to support structured post-session learning for coding agents. It separates raw experiences (Findings) from potential guideline changes (Proposals), keeping the agent's knowledge extraction workflow structured, secure, and fully auditable.

Key Concepts

Findings

A Finding represents a single raw experience or observation captured from a task session. It stores:

  • An observation and a hypothetical rule or pattern.
  • A confidence level.
  • An explicit validation status (unverified, validated, or invalidated).
  • A validated conclusion detailing why the pattern was verified or rejected.
  • Optional learning triage metadata:
    • classification: CREATE_SKILL, UPDATE_SKILL, ADD_LEARNING_NOTE, or IGNORE.
    • pattern_key: stable dot-separated clustering key such as tests.add_before_change.
    • validation_case: concrete given / when / then behavior check.

ADD_LEARNING_NOTE is the default durable capture. CREATE_SKILL should be rare; prefer UPDATE_SKILL when an existing skill already owns the behavior. IGNORE is valid for praise, vague reflection, one-off details, and already-covered guidance.

Proposals

A Proposal defines a potential durable mutation to the repository's guidelines or instructions (e.g., in MEMORY.md or dedicated agent skills).

  • Represents one of the actions ADD, DELETE, REPLACE, REJECT, or NO_DURABLE_LEARNING.
  • References one or more validated source findings that back it up.
  • Contains metadata about target type, scope, proposed boundary, validation checklist, status, and approval.
  • May carry the same learning_decision, pattern_key, and validation_case fields used by consolidation. CREATE_SKILL proposals additionally require an overlap_check proving existing skills were inspected and no overlapping skill owns more than 50% of the behavior.

Constraint Specifications

A ConstraintSpecification is a typed, reviewable bridge from confirmed learning to executable validation. Constraint proposals describe the engine, rule identifier, scope, objective violation, allowed boundaries, false-positive risk, validation commands, local example rules, target rule path, and registration files. The package validates whether the learning is stable and precise enough for a later PHPStan, PHP-CS-Fixer, test, or CI generation step.

Evidence

Findings must be backed by concrete, verifiable evidence. Supported types include:

  • file_reference: References to specific files and line numbers.
  • commit: Reference to a specific git commit.
  • test_result / phpstan_result: The executed command and a summary of its result.
  • review_comment: Pull/merge request comments or reviews.
  • issue_reference: Bounded issue or ticket tracker reference.
  • Others (e.g., schema_reference, runtime_observation, manual_verification).
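
The per-type field check can be pictured with a small Python sketch. It is an illustration only, not the package's PHP EvidenceValidator: the function name missing_evidence_fields is hypothetical, and the required fields are inferred from the descriptions above and the example finding in this README, so the real rules may demand more.

```python
# Required fields per evidence type. These entries are inferred from this
# README's descriptions and example finding; they are assumptions, not the
# package's actual rule table.
REQUIRED_FIELDS = {
    "file_reference": ("path", "line"),
    "test_result": ("command", "summary"),
    "phpstan_result": ("command", "summary"),
}


def missing_evidence_fields(evidence: list[dict]) -> list[str]:
    """Return human-readable problems for a finding's evidence list."""
    problems = []
    if not evidence:
        problems.append("evidence list is empty")
    for index, item in enumerate(evidence):
        kind = item.get("type")
        if kind not in REQUIRED_FIELDS:
            # Types such as commit or review_comment are not modeled here.
            problems.append(f"evidence[{index}]: type {kind!r} not modeled in this sketch")
            continue
        for field in REQUIRED_FIELDS[kind]:
            if item.get(field) in (None, ""):
                problems.append(f"evidence[{index}]: missing {field}")
    return problems
```

Returning a list of problems instead of raising on the first one makes it easier to report every gap in a record during review.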

Decision History

A persistent record of approved or rejected proposals stored in JSON Lines (.jsonl) format.

  • decisions.jsonl logs approved and applied mutations.
  • rejected-proposals.jsonl logs rejected candidate proposals with detailed reasons.
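
For readers unfamiliar with JSON Lines, the following Python sketch shows how such a history file can be read one record per line while keeping the line number for error reporting. It is not the package's JsonlValidator; the name parse_jsonl, the error wording, and the tolerance for blank lines are assumptions made for illustration.

```python
import json


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines text, reporting the 1-based line number on failure."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are tolerated and skipped
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
        if not isinstance(record, dict):
            raise ValueError(f"line {line_no}: expected a JSON object")
        records.append(record)
    return records
```

Because each line is an independent JSON document, a history file can be appended to without rewriting earlier decisions.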

Core Classes & APIs

The package codebase is organized under the voku\AgentLearning namespace in the following structure:

Value Objects & Enums

  • Finding: Read-only entity representing a captured session finding.
  • FindingStatus: Enum defining finding lifecycles (candidate, validated, invalidated, rejected, superseded, consolidated, archived).
  • Proposal: Read-only entity representing a proposed modification to guidelines.
  • ProposalStatus: Enum defining proposal states (candidate, approved, rejected, applied, retired).
  • Action: Enum representing actions (NO_DURABLE_LEARNING, ADD, DELETE, REPLACE, REJECT).
  • ConstraintSpecification: Read-only model for hard-constraint promotion candidates.
  • GuidanceUsageSummary: Read-only projection of recall eligibility, selection, application, explicit outcomes, task spread, timestamps, and evidence event IDs.
  • ConstraintEngine: Enum defining supported hard-constraint engines (phpstan, php_cs_fixer, test, ci).
  • Detectability: Enum describing whether the violation is statically, syntax-locally, runtime, or cross-file detectable.
  • FalsePositiveRisk: Enum declaring expected false-positive risk (low, medium, high, unknown).

Parsers & Repositories

Validators

  • FindingValidator: Enforces structure, format, and lifecycle consistency for findings.
  • ProposalValidator: Validates proposal mutations, targets, actions, and references.
  • EvidenceValidator: Inspects the list of evidence objects to ensure the required fields for each type are present.
  • JsonlValidator: Parses and validates JSON Lines log formats.
  • RedactionGuard: Scans all content for credentials, secrets, or sensitive configuration keys to prevent accidental leaks.
  • DecisionHistoryValidator: Validates log consistency of the decision history.
  • ConstraintPromotionValidator: Validates that constraint proposals come from confirmed findings and contain explicit promotion-gate evidence.
  • ConstraintManifestActivator: Writes the active manifest consumed by recall tooling after an approved or applied constraint rule exists in the project.

Utilities & Infrastructure

  • ConsolidationPromptBuilder: Assembles validated findings and the rejected-proposal history into a structured LLM consolidation prompt.
  • ConstraintGenerationPackageExporter: Exports specification.json, source findings/proposals, examples, validation plan, and generation prompt for coding-agent rule generation.
  • ConstraintLoopRunner: Drives the approved generated-rule close-out path by exporting, applying, and activating a hard constraint with one explicit command.
  • GuidanceUsageProjector: Rebuilds deterministic usage summaries from immutable event histories without mutable counters.
  • GuidanceEvolutionEvaluator: Applies conservative tier-specific promotion and review policies.
  • RecordAccess: Utility helper to extract strongly typed fields from raw array data.
  • Json: Helper for decoding files safely.
  • ValidationException: Custom runtime exception that carries file name, line number, and record ID context.

Validation Specifications

Finding Validation

  1. Finding ID: Must match finding.YYYY-MM-DD.NNN.
  2. Created At: Must be a valid ISO 8601/Atom timestamp string.
  3. Task ID: Must match the configured task ID pattern (passed via $taskIdPattern to the FindingValidator constructor; defaults to '/^(?:[A-Z][A-Z0-9_-]*-\d+|TODO@[\w:\/.-]+)$/').
  4. Observation/Hypothesis Separation: Both must be non-empty strings and cannot be identical.
  5. Confidence: Must be one of low, medium, or high.
  6. Validation Status: Must be one of unverified, validated, or invalidated.
  7. Lifecycle Enforcements:
    • candidate requires validation_status=unverified.
    • validated and consolidated require validation_status=validated.
    • invalidated requires validation_status=invalidated.
    • superseded and rejected require validation_status=validated or validation_status=invalidated.
    • archived preserves the prior validation state and may use any supported validation_status.
    • A validation_status=validated finding requires a non-empty validated_conclusion.
    • The validated_conclusion must not be identical to the hypothesis.
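
The rules above can be sketched in Python as follows. This is an illustration rather than the package's FindingValidator: finding_problems is a hypothetical name, NNN is assumed to mean exactly three digits, and datetime.fromisoformat is somewhat more lenient than a strict Atom check.

```python
import re
from datetime import datetime

# Assumption: NNN means exactly three digits.
FINDING_ID = re.compile(r"finding\.\d{4}-\d{2}-\d{2}\.\d{3}")
# Default task ID pattern from rule 3, translated to Python syntax.
TASK_ID = re.compile(r"(?:[A-Z][A-Z0-9_-]*-\d+|TODO@[\w:/.-]+)", re.ASCII)

# Lifecycle status -> validation_status values it permits (rule 7).
ALLOWED = {
    "candidate": {"unverified"},
    "validated": {"validated"},
    "consolidated": {"validated"},
    "invalidated": {"invalidated"},
    "superseded": {"validated", "invalidated"},
    "rejected": {"validated", "invalidated"},
    "archived": {"unverified", "validated", "invalidated"},
}


def finding_problems(f: dict) -> list[str]:
    """Return the names of the rules that the finding record violates."""
    problems = []
    if not FINDING_ID.fullmatch(str(f.get("id", ""))):
        problems.append("id")
    try:
        datetime.fromisoformat(str(f.get("created_at", "")))
    except ValueError:
        problems.append("created_at")
    if not TASK_ID.fullmatch(str(f.get("task_id", ""))):
        problems.append("task_id")
    observation = f.get("observation") or ""
    hypothesis = f.get("hypothesis") or ""
    if not observation or not hypothesis or observation == hypothesis:
        problems.append("observation/hypothesis")
    if f.get("confidence") not in {"low", "medium", "high"}:
        problems.append("confidence")
    validation = f.get("validation_status")
    if validation not in ALLOWED["archived"]:
        problems.append("validation_status")
    elif validation not in ALLOWED.get(f.get("status"), set()):
        problems.append("lifecycle")
    conclusion = f.get("validated_conclusion") or ""
    if validation == "validated" and not conclusion:
        problems.append("validated_conclusion")
    if conclusion and conclusion == hypothesis:
        problems.append("validated_conclusion")
    return problems
```

Keeping the lifecycle rules in one lookup table makes the status-to-validation mapping easy to review against the list above.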

Proposal Validation

  1. Proposal ID: Must match proposal.YYYY-MM-DD.NNN.
  2. Created At: Must be a valid ISO 8601/Atom timestamp string.
  3. Mutations Constraint: The mutations, changes, or targets field must contain at most one item, which prevents overly broad proposals.
  4. Source Findings: Must have at least 1 referenced source finding.
  5. Action-Specific Constraints:
    • Any action other than NO_DURABLE_LEARNING requires target_type, target, scope (non-empty list), boundary (non-empty), and a validation checklist.
    • ADD action requires new wording.
    • DELETE action requires old wording.
    • REPLACE action requires both old and new wording.
  6. Status Constraints:
    • Proposal action describes the requested durable change (ADD, DELETE, REPLACE, REJECT, NO_DURABLE_LEARNING).
    • Proposal status describes the human lifecycle decision (candidate, approved, rejected, applied, retired).
    • Durable actions (ADD, DELETE, REPLACE) may be candidate, approved, rejected, applied, or retired.
    • REJECT and NO_DURABLE_LEARNING may only be candidate or rejected.
    • A proposal with status approved, applied, or retired requires approved_by and an approved_at timestamp.
    • A proposal with status rejected, or any proposal with a REJECT action, requires a non-empty reason.
    • A retired proposal requires a non-empty reason. Retirement applies only to a previously applied proposal whose durable change is now fully captured in its target skill, doc, or memory home. Because voku/agent-recall-compiler's loadActiveGuidance() scans only proposals/approved/ and proposals/applied/, a retired proposal drops out of all future active recall guidance pools without any change to that package.
  7. Lifecycle Directory Check: Proposal files under proposals/<status>/ must embed the same status value.
  8. Broader Scope Check: If the proposal scope includes entries not present in the referenced findings, a scope_justification must be provided.
  9. Constraint Promotion Gates: Constraint proposals require confirmed source findings, several independent findings or a critical-incident justification, explicit scope, explicit allowed boundaries, objective detectability, validation commands, declared false-positive risk, local example rule references where available, and engine-compatible target paths/commands.
  10. Learning Triage Gates: When present, learning_decision must align with the proposal:
    • IGNORE requires NO_DURABLE_LEARNING.
    • ADD_LEARNING_NOTE preserves the raw learning without pretending it is ready for skill promotion.
    • UPDATE_SKILL requires target_type=skill.
    • CREATE_SKILL requires ADD, target_type=skill, pattern_key, validation_case, and an overlap_check with inspected skills and max_overlap_percent <= 50.
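
Rules 4 to 6 and rule 10 can be sketched in Python as shown below. This is not the package's ProposalValidator: the function names are hypothetical, the field names old, new, and validation are taken from the example proposal in this README, and inspected_skills is an assumed key name for the list of inspected skills inside overlap_check.

```python
ACTIONS = {"ADD", "DELETE", "REPLACE", "REJECT", "NO_DURABLE_LEARNING"}
DURABLE = {"ADD", "DELETE", "REPLACE"}
STATUSES = {"candidate", "approved", "rejected", "applied", "retired"}
# Wording each durable action needs; "old"/"new" follow the example proposal.
WORDING = {"ADD": ("new",), "DELETE": ("old",), "REPLACE": ("old", "new")}


def proposal_problems(p: dict) -> list[str]:
    """Return field names that violate rules 4-6 (sketch, not exhaustive)."""
    action, status = p.get("action"), p.get("status")
    problems = [] if action in ACTIONS else ["action"]
    if not p.get("source_findings"):
        problems.append("source_findings")
    required = list(WORDING.get(action, ()))
    if action != "NO_DURABLE_LEARNING":
        required += ["target_type", "target", "scope", "boundary", "validation"]
    if status in {"approved", "applied", "retired"}:
        required += ["approved_by", "approved_at"]
    if status in {"rejected", "retired"} or action == "REJECT":
        required.append("reason")
    problems += [field for field in required if not p.get(field)]
    allowed = STATUSES if action in DURABLE else {"candidate", "rejected"}
    if status not in allowed:
        problems.append("status")
    return problems


def triage_problems(p: dict) -> list[str]:
    """Check rule 10; returns [] when learning_decision is absent."""
    decision, action = p.get("learning_decision"), p.get("action")
    is_skill = p.get("target_type") == "skill"
    overlap = p.get("overlap_check") or {}
    problems = []
    if decision == "IGNORE" and action != "NO_DURABLE_LEARNING":
        problems.append("IGNORE requires NO_DURABLE_LEARNING")
    if decision in {"UPDATE_SKILL", "CREATE_SKILL"} and not is_skill:
        problems.append(f"{decision} requires target_type=skill")
    if decision == "CREATE_SKILL":
        if action != "ADD":
            problems.append("CREATE_SKILL requires ADD")
        for field in ("pattern_key", "validation_case"):
            if not p.get(field):
                problems.append(f"CREATE_SKILL requires {field}")
        # "inspected_skills" is a hypothetical key name (see lead-in).
        percent = overlap.get("max_overlap_percent")
        if (not overlap.get("inspected_skills")
                or not isinstance(percent, (int, float)) or percent > 50):
            problems.append("CREATE_SKILL requires a passing overlap_check")
    return problems
```

Separating the action (what should change) from the status (what the reviewer decided) keeps each check a simple table lookup.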

Applied Constraint Metadata

When a constraint proposal is marked applied, its validation JSON must include generated_files, registration_file, commit, tests, validation_result, and content_hashes. This preserves lineage from finding to proposal, generated rule, registration, commit, validation, and later outcome.
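
A presence check for these keys might look like the Python sketch below. The function name is hypothetical, and the sketch only tests that each key exists; whether the package also rejects empty values is not stated here.

```python
REQUIRED_APPLIED_KEYS = (
    "generated_files",
    "registration_file",
    "commit",
    "tests",
    "validation_result",
    "content_hashes",
)


def missing_applied_metadata(validation: dict) -> list[str]:
    """Return the required lineage keys absent from the validation JSON."""
    return [key for key in REQUIRED_APPLIED_KEYS if key not in validation]
```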

Active Constraint Manifests

After a generated constraint is approved and implemented, run constraint-activate to create constraints/active/constraint.<rule_id>.json. The command validates the proposal, checks the target rule file and registration files exist relative to the project root, and writes the exact engine, rule identifier, scope, validation commands, and source proposal that voku/agent-recall-compiler selects later.

Learning roots may define config.json to avoid hard-coding one repository layout:

{
  "schema_version": "1.0",
  "project_root": "../../..",
  "constraint_generation_dir": "constraint-generation",
  "active_constraints_dir": "constraints/active"
}

Relative paths are resolved from the learning root. CLI options --project-root, --constraint-generation-dir, and --active-constraints-dir override config.json for one run. Without configuration, the package keeps the legacy project-root inference for infra/doc/agent-learning, .agent-learning, docs/agent-learning, and agent-learning.
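
The precedence described above (CLI option first, then config.json, with relative paths resolved from the learning root) can be sketched in Python. The function name resolve_paths and the overrides dictionary are hypothetical; the sketch also resolves relative override values from the learning root, which is an assumption, and it does not model the legacy inference.

```python
import json
from pathlib import Path

PATH_KEYS = ("project_root", "constraint_generation_dir", "active_constraints_dir")


def resolve_paths(learning_root: Path, overrides: dict) -> dict:
    """Merge config.json with per-run overrides and resolve relative paths."""
    config_file = learning_root / "config.json"
    config = json.loads(config_file.read_text()) if config_file.is_file() else {}
    resolved = {}
    for key in PATH_KEYS:
        # A per-run override wins over the value stored in config.json.
        value = overrides.get(key) or config.get(key)
        if value is None:
            continue  # legacy inference would apply here; not modeled
        path = Path(value)
        resolved[key] = path if path.is_absolute() else (learning_root / path).resolve()
    return resolved
```

With the sample config.json above and a learning root at infra/doc/agent-learning, "../../.." resolves to the directory three levels up, which is the project root.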

Redaction Constraints

RedactionGuard checks all keys and values against secret-assignment patterns. Any match of a standard credential assignment (e.g. password, token, api_key, or ms-Mcs-AdmPwd patterns) throws a validation exception.
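
To make the idea concrete, here is a minimal Python sketch of a recursive scan over keys and values. The regular expression is a hypothetical stand-in built from the names listed above; the package's actual RedactionGuard patterns are not reproduced here and are likely broader.

```python
import re

# Hypothetical pattern: a sensitive name followed by "=" or ":" and a value.
SECRET_ASSIGNMENT = re.compile(
    r"\b(password|token|api_key|ms-Mcs-AdmPwd)\b\s*[=:]\s*\S+",
    re.IGNORECASE,
)


def find_secrets(data, path: str = "$") -> list[str]:
    """Return the paths of keys and string values that look like secret assignments."""
    hits = []
    if isinstance(data, dict):
        for key, value in data.items():
            if SECRET_ASSIGNMENT.search(str(key)):
                hits.append(f"{path}.{key} (key)")
            hits += find_secrets(value, f"{path}.{key}")
    elif isinstance(data, list):
        for index, item in enumerate(data):
            hits += find_secrets(item, f"{path}[{index}]")
    elif isinstance(data, str) and SECRET_ASSIGNMENT.search(data):
        hits.append(path)
    return hits
```

Matching on an assignment shape rather than on the bare word lets ordinary prose such as "rotate the token regularly" pass while "api_key=abc123" is flagged.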

JSON Structure Formats

Example Finding

{
  "id": "finding.2026-06-08.001",
  "task_id": "PROJECT-1234",
  "session": "session_abc123",
  "created_at": "2026-06-08T10:00:00+00:00",
  "created_by": "agent_alpha",
  "scope": [
    "lib/framework/forms"
  ],
  "observation": "FormElement validation fails when checking numeric bounds if string decimals are passed.",
  "evidence": [
    {
      "type": "file_reference",
      "path": "lib/framework/forms/FormElement.php",
      "line": 42
    },
    {
      "type": "test_result",
      "command": "make test_unit_file FILE=tests/FormElement_UnitCest.php",
      "summary": "Failed asserting that false is true on DecimalBound test"
    }
  ],
  "hypothesis": "String decimal inputs should be normalized to float/int before calling range checks in FormElement.",
  "validated_conclusion": "Normalizing value to float in range validation resolves bounds failures without side-effects.",
  "confidence": "high",
  "validation_status": "validated",
  "status": "validated",
  "sensitivity": "public"
}

With learning triage, a finding additionally carries fields such as:

{
  "classification": "ADD_LEARNING_NOTE",
  "pattern_key": "tests.add_before_change",
  "validation_case": {
    "given": "a task modifies behavior covered by existing tests",
    "when": "the agent prepares the implementation plan",
    "then": "it identifies relevant tests before editing production code"
  }
}

Example Proposal

{
  "id": "proposal.2026-06-08.001",
  "created_at": "2026-06-08T11:30:00+00:00",
  "action": "REPLACE",
  "target_type": "skill",
  "target": "form-validation",
  "scope": [
    "lib/framework/forms"
  ],
  "source_findings": [
    "finding.2026-06-08.001"
  ],
  "old": "Validate range bounds directly using the raw inputs.",
  "new": "Ensure numeric inputs are cast/normalized to numeric values before validating range bounds.",
  "reason": "Prevents float/string type comparisons from failing bounds checks.",
  "boundary": "Only run numeric bounds normalization on Decimal and Float FormElement subclasses.",
  "validation": [
    "Ensure unit tests verify decimal string normalization."
  ],
  "status": "candidate",
  "proposed_by": "agent_alpha",
  "approved_by": null,
  "approved_at": null
}

Development & Testing

Bundled Agent Skills

This package ships package-specific skills under skills/:

  • agent-learning-consumer: for end users setting up a learning root, capturing findings, validating proposals, and preparing consolidation input.
  • agent-hard-constraint-author: for end users promoting validated findings into executable PHPStan, PHP-CS-Fixer, test, or CI constraints.
  • agent-learning-maintainer: for maintainers changing voku/agent-learning source, tests, docs, or local vendor syncs.

Running Tests

To run unit and integration tests for this package:

composer test

Or use the local Makefile:

make test

Static Analysis

To run PHPStan checks on the package:

composer phpstan

Or use the local Makefile:

make phpstan

CLI

The Composer binary exposes the package workflow without requiring consuming-project classes:

vendor/bin/agent-learning validate --root infra/doc/agent-learning
vendor/bin/agent-learning prepare --root infra/doc/agent-learning --task PROJECT-1234 --task GH-158
vendor/bin/agent-learning prepare --root infra/doc/agent-learning --finding finding.2026-06-08.001 --scope src/Auth --since 2026-06-01
vendor/bin/agent-learning proposal-validate --root infra/doc/agent-learning --proposal proposal.2026-06-08.001.json
vendor/bin/agent-learning constraint-export --root infra/doc/agent-learning --proposal proposal.2026-06-08.001.json
vendor/bin/agent-learning constraint-activate --root infra/doc/agent-learning --proposal proposal.2026-06-08.001.json
vendor/bin/agent-learning constraint-loop --root infra/doc/agent-learning proposal.2026-06-08.001 --by lars --commit working-tree --validation infra/doc/agent-learning/validation-results/proposal.2026-06-08.001.json --approve-candidate
vendor/bin/agent-learning guidance-evaluate --root infra/doc/agent-learning --selection-history history/recall-selections.jsonl --outcome-history history/outcomes.jsonl

prepare prints the selected finding IDs before writing the prompt. Empty selections fail unless --allow-empty is passed. If templates/consolidation-prompt.md exists under the learning root, its content is appended to the generated consolidation input as a project-specific prompt addendum.

--root may point either to the learning root itself or to a project root containing one of these directories:

  • infra/doc/agent-learning
  • .agent-learning
  • docs/agent-learning
  • agent-learning
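
One way to picture this lookup is the Python sketch below. The function name is hypothetical, and two details are assumptions: the directories are probed in the order listed above, and a --root that contains none of them is treated as the learning root itself.

```python
from pathlib import Path

CANDIDATE_DIRS = (
    "infra/doc/agent-learning",
    ".agent-learning",
    "docs/agent-learning",
    "agent-learning",
)


def find_learning_root(root: Path) -> Path:
    """Map a --root value to the learning root directory."""
    for relative in CANDIDATE_DIRS:
        candidate = root / relative
        if candidate.is_dir():
            return candidate  # root was a project root
    return root  # assumption: root already is the learning root
```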

Zero-byte .json files are treated as extraction placeholders and skipped. Non-empty finding, proposal, and history records are validated strictly.
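
The placeholder rule can be illustrated with a short Python sketch. The function name load_records is hypothetical, and real validation of each decoded record is left out.

```python
import json
from pathlib import Path


def load_records(directory: Path) -> dict[str, dict]:
    """Decode every non-empty .json file in a directory, keyed by file name."""
    records = {}
    for file in sorted(directory.glob("*.json")):
        if file.stat().st_size == 0:
            continue  # zero-byte file: extraction placeholder, skipped
        records[file.name] = json.loads(file.read_text())
    return records
```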

Guidance Usage Evaluation

guidance-evaluate consumes immutable event histories produced by voku/agent-recall-compiler, rebuilds deterministic usage summaries, and prints conservative decisions for finding-to-memory, memory-to-skill, skill-to-constraint, and stale/replacement review paths.

It does not edit MEMORY.md, skills, active constraints, PHPStan configuration, or CI. With --write-candidates, it may write only reviewable proposal files under proposals/candidate/; no proposal is approved, applied, or activated automatically.

Schema details, policy gates, and duplicate behavior are documented in docs/guidance-evaluation.md. A complete findings-to-memory-to-skill-promotion fixture is available under examples/guidance-evaluation.