README

Alpha Release - This package is in early development. APIs may change without notice.

A self-learning text-to-SQL agent for Laravel that converts natural language questions into SQL queries using LLMs.

This package is based on Dash and OpenAI's in-house data agent.

Why This Package?

Raw LLMs writing SQL hit a wall fast. The problems:

Schemas lack meaning — Column names like status or type don't convey business context
Types are misleading — A position column might be TEXT, not INTEGER
Tribal knowledge is missing — "Active customer" means different things to different teams
No learning from mistakes — The same errors repeat endlessly
Results lack interpretation — You get data, not answers

The root cause is missing context and missing memory.

This package solves it with:

Knowledge Base — Curated table metadata, business rules, and query patterns that give the LLM the context it needs
Self-Learning — When a query fails and the agent recovers, it saves that learning. Next time, it knows.
Multi-Layer Context — Schema introspection, semantic search over knowledge, conversation history, and accumulated learnings
SQL Safety — Configurable guardrails to prevent destructive operations

This package provides the foundation to build reliable, context-aware data agents for Laravel applications.

Features
Requirements
Installation
Quick Start
Configuration
Knowledge Base
LLM Drivers
Search Drivers
Web Interface
Artisan Commands
Programmatic Usage
Evaluation System
Self-Learning
Database Support
Events
Testing
Troubleshooting
License

Features

Multi-LLM Support - OpenAI (GPT-4, GPT-4o), Anthropic (Claude), and Ollama for local models
Multi-Database Support - MySQL, PostgreSQL, SQLite, and SQL Server
Self-Learning - Automatically learns from SQL errors and improves over time
Multiple Search Drivers - Database full-text search, Laravel Scout integration, or hybrid approach
Agentic Loop - Uses tool calling to introspect schema, run queries, and refine results
Livewire Chat UI - Ready-to-use chat interface with conversation history
Knowledge Base System - Define table metadata, business rules, and query patterns
SQL Safety - Configurable statement restrictions and row limits
Evaluation Framework - Test your agent's accuracy with automated evaluations

Requirements

PHP 8.2 or higher
Laravel 11.x or 12.x
An LLM API key (OpenAI, Anthropic, or local Ollama installation)
Optional: Livewire 3.x for the chat UI
Optional: Laravel Scout for external search engines

Installation

Install the package via Composer:

composer require knobik/laravel-sql-agent

Run the install command:

php artisan sql-agent:install

This will:

Publish the configuration file
Publish and run migrations
Create the knowledge directory structure at resources/sql-agent/knowledge/

Add your LLM API key to .env:

# For OpenAI (default)
OPENAI_API_KEY=sk-your-api-key

# Or for Anthropic
ANTHROPIC_API_KEY=sk-ant-your-api-key
SQL_AGENT_LLM_DRIVER=anthropic

# Or for Ollama (local)
SQL_AGENT_LLM_DRIVER=ollama
OLLAMA_BASE_URL=http://localhost:11434

Quick Start

1. Create a knowledge file

Create resources/sql-agent/knowledge/tables/users.json:

{
    "table": "users",
    "description": "Contains user account information",
    "columns": {
        "id": "Primary key, auto-incrementing integer",
        "name": "User's full name",
        "email": "User's email address (unique)",
        "created_at": "Account creation timestamp",
        "updated_at": "Last update timestamp"
    }
}

2. Load knowledge into the database

php artisan sql-agent:load-knowledge

3. Run your first query

use Knobik\SqlAgent\Facades\SqlAgent;

$response = SqlAgent::run('How many users signed up this month?');

echo $response->answer;  // "There are 42 users who signed up this month."
echo $response->sql;     // "SELECT COUNT(*) as count FROM users WHERE created_at >= '2026-01-01'"

Configuration

The configuration file is located at config/sql-agent.php. Here are all available options:

Display Name

'name' => 'SqlAgent',

The display name used in the UI and logs.

Database Configuration

'database' => [
    // The database connection to query (your application data)
    'connection' => env('SQL_AGENT_CONNECTION', config('database.default')),

    // The connection for SqlAgent's own tables (knowledge, learnings, etc.)
    'storage_connection' => env('SQL_AGENT_STORAGE_CONNECTION', config('database.default')),
],

You can use a separate database for SqlAgent's internal tables by setting SQL_AGENT_STORAGE_CONNECTION.

LLM Configuration

'llm' => [
    'default' => env('SQL_AGENT_LLM_DRIVER', 'openai'),

    'drivers' => [
        'openai' => [
            'api_key' => env('OPENAI_API_KEY'),
            'model' => env('SQL_AGENT_OPENAI_MODEL', 'gpt-4o'),
            'temperature' => 0.0,
            'max_tokens' => 4096,
        ],

        'anthropic' => [
            'api_key' => env('ANTHROPIC_API_KEY'),
            'model' => env('SQL_AGENT_ANTHROPIC_MODEL', 'claude-sonnet-4-20250514'),
            'temperature' => 0.0,
            'max_tokens' => 4096,
        ],

        'ollama' => [
            'base_url' => env('OLLAMA_BASE_URL', 'http://localhost:11434'),
            'model' => env('SQL_AGENT_OLLAMA_MODEL', 'llama3.1'),
            'temperature' => 0.0,
        ],
    ],
],

Search Configuration

'search' => [
    'default' => env('SQL_AGENT_SEARCH_DRIVER', 'database'),

    'drivers' => [
        'database' => [
            'mysql' => [
                'mode' => 'NATURAL LANGUAGE MODE', // or 'BOOLEAN MODE'
            ],
            'pgsql' => [
                'language' => 'english', // PostgreSQL text search language
            ],
            'sqlsrv' => [],
        ],

        'scout' => [
            'driver' => env('SCOUT_DRIVER', 'meilisearch'),
        ],

        'hybrid' => [
            'primary' => 'scout',
            'fallback' => 'database',
            'merge_results' => false,
        ],
    ],
],

Agent Configuration

'agent' => [
    // Maximum number of tool-calling iterations before stopping
    'max_iterations' => env('SQL_AGENT_MAX_ITERATIONS', 10),

    // Default LIMIT for queries without explicit limits
    'default_limit' => env('SQL_AGENT_DEFAULT_LIMIT', 100),

    // Number of previous messages to include for context
    'chat_history_length' => env('SQL_AGENT_CHAT_HISTORY', 10),
],

Learning Configuration

'learning' => [
    // Enable/disable the self-learning feature
    'enabled' => env('SQL_AGENT_LEARNING_ENABLED', true),

    // Automatically save learnings when SQL errors occur
    'auto_save_errors' => env('SQL_AGENT_AUTO_SAVE_ERRORS', true),

    // Remove learnings older than this many days (via prune command)
    'prune_after_days' => env('SQL_AGENT_LEARNING_PRUNE_DAYS', 90),

    // Maximum auto-generated learnings per day (prevents runaway learning)
    'max_auto_learnings_per_day' => env('SQL_AGENT_MAX_AUTO_LEARNINGS', 50),
],

Knowledge Configuration

'knowledge' => [
    // Path to knowledge files
    'path' => env('SQL_AGENT_KNOWLEDGE_PATH', resource_path('sql-agent/knowledge')),

    // Source for knowledge: 'files' or 'database'
    'source' => env('SQL_AGENT_KNOWLEDGE_SOURCE', 'files'),
],

UI Configuration

'ui' => [
    // Enable/disable the web interface
    'enabled' => env('SQL_AGENT_UI_ENABLED', true),

    // URL prefix (e.g., /sql-agent)
    'route_prefix' => env('SQL_AGENT_ROUTE_PREFIX', 'sql-agent'),

    // Middleware for the UI routes
    'middleware' => ['web', 'auth'],
],

SQL Safety Configuration

'sql' => [
    // Only these statement types are allowed
    'allowed_statements' => ['SELECT', 'WITH'],

    // These keywords will cause queries to be rejected
    'forbidden_keywords' => [
        'DROP', 'DELETE', 'UPDATE', 'INSERT', 'ALTER',
        'CREATE', 'TRUNCATE', 'GRANT', 'REVOKE', 'EXEC', 'EXECUTE',
    ],

    // Maximum rows returned by any query
    'max_rows' => env('SQL_AGENT_MAX_ROWS', 1000),
],

Evaluation Configuration

'evaluation' => [
    // Model used for LLM grading of test results
    'grader_model' => env('SQL_AGENT_GRADER_MODEL', 'gpt-4o-mini'),

    // Minimum score to pass LLM grading (0.0 - 1.0)
    'pass_threshold' => env('SQL_AGENT_EVAL_PASS_THRESHOLD', 0.6),

    // Timeout for each test case in seconds
    'timeout' => env('SQL_AGENT_EVAL_TIMEOUT', 60),
],

Knowledge Base

The knowledge base helps SqlAgent understand your database schema, business rules, and common query patterns.

Directory Structure

resources/sql-agent/knowledge/
├── tables/          # Table metadata (JSON)
├── business/        # Business rules and metrics (JSON)
└── queries/         # Query patterns (SQL or JSON)

Table Metadata

Create JSON files in tables/ to describe your database schema:

{
    "table": "orders",
    "description": "Contains customer orders and their status",
    "columns": {
        "id": "Primary key",
        "customer_id": "Foreign key to customers.id",
        "status": "Order status: pending, processing, shipped, delivered, cancelled",
        "total_amount": "Order total in cents (integer)",
        "created_at": "Order creation timestamp",
        "shipped_at": "Shipping timestamp (null if not shipped)"
    },
    "relationships": [
        "orders.customer_id -> customers.id"
    ],
    "notes": "The total_amount is stored in cents. Divide by 100 for dollars."
}

Business Rules

Create JSON files in business/ to define business logic and metrics:

{
    "name": "Active Customer Definition",
    "description": "A customer is considered active if they have placed an order in the last 90 days",
    "rules": [
        "Active customers have at least one order with created_at >= NOW() - INTERVAL 90 DAY",
        "Inactive customers have no orders in the last 90 days"
    ],
    "examples": [
        {
            "question": "How many active customers do we have?",
            "sql": "SELECT COUNT(DISTINCT customer_id) FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 90 DAY)"
        }
    ]
}

Query Patterns

Create files in queries/ to teach SqlAgent common query patterns:

JSON format (queries/revenue.json):

{
    "name": "Monthly Revenue",
    "description": "Calculate total revenue by month",
    "pattern": "SELECT DATE_FORMAT(created_at, '%Y-%m') as month, SUM(total_amount) / 100 as revenue FROM orders WHERE status != 'cancelled' GROUP BY month ORDER BY month DESC",
    "keywords": ["revenue", "monthly", "sales", "income"]
}

SQL format (queries/top_customers.sql):

-- name: Top Customers by Order Count
-- description: Find customers with the most orders
-- keywords: top, customers, orders, best

SELECT
    c.id,
    c.name,
    COUNT(o.id) as order_count,
    SUM(o.total_amount) / 100 as total_spent
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.status != 'cancelled'
GROUP BY c.id, c.name
ORDER BY order_count DESC
LIMIT 10;

Loading Knowledge

Load all knowledge files into the database:

php artisan sql-agent:load-knowledge

Load specific types:

php artisan sql-agent:load-knowledge --tables
php artisan sql-agent:load-knowledge --rules
php artisan sql-agent:load-knowledge --queries

Recreate all knowledge (clears existing):

php artisan sql-agent:load-knowledge --recreate

Use a custom path:

php artisan sql-agent:load-knowledge --path=/custom/knowledge/path

LLM Drivers

OpenAI

The default driver. Requires the OpenAI API key.

OPENAI_API_KEY=sk-your-api-key
SQL_AGENT_LLM_DRIVER=openai
SQL_AGENT_OPENAI_MODEL=gpt-4o

Anthropic

Use Claude models from Anthropic.

ANTHROPIC_API_KEY=sk-ant-your-api-key
SQL_AGENT_LLM_DRIVER=anthropic
SQL_AGENT_ANTHROPIC_MODEL=claude-sonnet-4-20250514

Ollama

Use local models with Ollama. No API key required.

SQL_AGENT_LLM_DRIVER=ollama
OLLAMA_BASE_URL=http://localhost:11434
SQL_AGENT_OLLAMA_MODEL=llama3.1

Custom Drivers

Implement the Knobik\SqlAgent\Contracts\LlmDriver interface:

<?php

namespace App\Llm;

use Generator;
use Knobik\SqlAgent\Contracts\LlmDriver;
use Knobik\SqlAgent\Contracts\LlmResponse;

class CustomLlmDriver implements LlmDriver
{
    public function chat(array $messages, array $tools = []): LlmResponse
    {
        // Your implementation
    }

    public function stream(array $messages, array $tools = []): Generator
    {
        // Your implementation
    }

    public function supportsToolCalling(): bool
    {
        return true;
    }
}

$this->app->bind('sql-agent.llm.custom', CustomLlmDriver::class);

Search Drivers

Search drivers are used to find relevant knowledge (table metadata, business rules, query patterns) based on the user's question.

Database Driver

Uses native database full-text search. No external services required.

SQL_AGENT_SEARCH_DRIVER=database

MySQL: Uses MATCH ... AGAINST with natural language or boolean mode.

PostgreSQL: Uses to_tsvector and to_tsquery for full-text search.

SQLite: Falls back to LIKE queries (less accurate but functional).

SQL Server: Uses CONTAINS full-text predicates (requires full-text catalog).

Scout Driver

Integrates with Laravel Scout for external search engines like Meilisearch or Algolia.

SQL_AGENT_SEARCH_DRIVER=scout
SCOUT_DRIVER=meilisearch
MEILISEARCH_HOST=http://localhost:7700
MEILISEARCH_KEY=your-key

Requires laravel/scout package:

composer require laravel/scout

Hybrid Driver

Combines Scout as primary with database as fallback. Useful for reliability.

SQL_AGENT_SEARCH_DRIVER=hybrid

Configure in config/sql-agent.php:

'hybrid' => [
    'primary' => 'scout',
    'fallback' => 'database',
    'merge_results' => false, // Set true to combine results from both
],

Web Interface

SqlAgent includes a ready-to-use Livewire chat interface.

Accessing the UI

By default, the UI is available at /sql-agent and protected by web and auth middleware.

Customizing Routes

In config/sql-agent.php:

'ui' => [
    'enabled' => true,
    'route_prefix' => 'admin/sql-agent',  // Change the URL prefix
    'middleware' => ['web', 'auth', 'admin'],  // Add custom middleware
],

Disabling the UI

'ui' => [
    'enabled' => false,
],

Or via environment:

SQL_AGENT_UI_ENABLED=false

Customizing Views

Publish the views:

php artisan vendor:publish --tag=sql-agent-views

Views will be published to resources/views/vendor/sql-agent/.

Using the Livewire Component Directly

<livewire:sql-agent-chat />

{{-- With a specific conversation --}}
<livewire:sql-agent-chat :conversation-id="$conversationId" />

Artisan Commands

sql-agent:install

Install the SqlAgent package.

php artisan sql-agent:install
php artisan sql-agent:install --force  # Overwrite existing files

sql-agent:load-knowledge

Load knowledge files into the database.

php artisan sql-agent:load-knowledge

# Options
--recreate        # Drop and recreate all knowledge
--tables          # Load only table metadata
--rules           # Load only business rules
--queries         # Load only query patterns
--path=<path>     # Custom path to knowledge files

sql-agent:eval

Run evaluation tests to measure agent accuracy.

php artisan sql-agent:eval

# Options
--category=<cat>  # Filter by category (basic, aggregation, complex, etc.)
--llm-grader      # Use LLM to grade responses
--golden-sql      # Compare against golden SQL results
--connection=<c>  # Use specific database connection
--detailed        # Show detailed output for failed tests
--json            # Output results as JSON
--html=<path>     # Generate HTML report at path
--seed            # Seed test cases before running

sql-agent:export-learnings

Export learnings to a JSON file.

php artisan sql-agent:export-learnings
php artisan sql-agent:export-learnings output.json
php artisan sql-agent:export-learnings --category=type_error

Categories: type_error, schema_fix, query_pattern, data_quality, business_logic

sql-agent:import-learnings

Import learnings from a JSON file.

php artisan sql-agent:import-learnings learnings.json
php artisan sql-agent:import-learnings learnings.json --force  # Include duplicates

sql-agent:prune-learnings

Remove old or duplicate learnings.

php artisan sql-agent:prune-learnings

# Options
--days=90         # Remove learnings older than N days (default: 90)
--duplicates      # Only remove duplicate learnings
--include-used    # Also remove learnings that have been used
--dry-run         # Show what would be removed without removing

Programmatic Usage

Basic Usage

use Knobik\SqlAgent\Facades\SqlAgent;

$response = SqlAgent::run('How many users registered last week?');

// Access the response
$response->answer;      // Natural language answer
$response->sql;         // The SQL query that was executed
$response->results;     // Raw query results (array)
$response->toolCalls;   // All tool calls made during execution
$response->iterations;  // Detailed iteration data
$response->error;       // Error message if failed

// Check status
$response->isSuccess(); // true if no error
$response->hasResults(); // true if results is not empty

Streaming Responses

use Knobik\SqlAgent\Facades\SqlAgent;

foreach (SqlAgent::stream('Show me the top 5 customers') as $chunk) {
    echo $chunk->content;

    if ($chunk->isComplete()) {
        // Stream finished
    }
}

Custom Connection

Query a specific database connection:

$response = SqlAgent::run('How many orders today?', 'analytics');

With Conversation History

For the streaming API with chat history:

$history = [
    ['role' => 'user', 'content' => 'Show me all products'],
    ['role' => 'assistant', 'content' => 'Here are the products...'],
];

foreach (SqlAgent::stream('Now filter by price > 100', null, $history) as $chunk) {
    echo $chunk->content;
}

Dependency Injection

use Knobik\SqlAgent\Contracts\Agent;

class ReportController extends Controller
{
    public function __construct(
        private Agent $agent,
    ) {}

    public function generate(Request $request)
    {
        $response = $this->agent->run($request->input('question'));

        return [
            'answer' => $response->answer,
            'sql' => $response->sql,
            'data' => $response->results,
        ];
    }
}

Evaluation System

The evaluation system helps you measure and improve your agent's accuracy.

Creating Test Cases

Test cases are stored in the sql_agent_test_cases table. You can seed them using the built-in seeder or create your own:

use Knobik\SqlAgent\Models\TestCase;

TestCase::create([
    'name' => 'Count active users',
    'category' => 'basic',
    'question' => 'How many active users are there?',
    'expected_strings' => ['active', 'users'], // Strings that should appear in response
    'golden_sql' => 'SELECT COUNT(*) FROM users WHERE status = "active"',
    'metadata' => ['difficulty' => 'easy'],
]);

Running Evaluations

# Run all tests
php artisan sql-agent:eval

# Run with LLM grading
php artisan sql-agent:eval --llm-grader

# Run specific category
php artisan sql-agent:eval --category=aggregation

# Generate HTML report
php artisan sql-agent:eval --html=storage/eval-report.html

Evaluation Modes

String Matching (default): Checks if expected strings appear in the response
LLM Grading: Uses an LLM to semantically evaluate the response
Golden SQL: Compares query results against a known-good SQL query

Self-Learning

SqlAgent can automatically learn from its mistakes and improve over time.

How It Works

When a SQL error occurs, the agent analyzes the error
If it successfully recovers, it creates a "learning" record
Future queries can reference these learnings for context
Learnings are categorized and can be exported/imported

Learning Categories

Type Error: Data type mismatches or casting issues
Schema Fix: Incorrect schema assumptions (wrong table/column names)
Query Pattern: Learned patterns for constructing queries
Data Quality: Observations about data quality or anomalies
Business Logic: Learned business rules or domain knowledge

Managing Learnings

# Export all learnings
php artisan sql-agent:export-learnings

# Export specific category
php artisan sql-agent:export-learnings --category=schema_fix

# Import learnings
php artisan sql-agent:import-learnings learnings.json

# Prune old learnings
php artisan sql-agent:prune-learnings --days=90

# Remove duplicates
php artisan sql-agent:prune-learnings --duplicates

Disabling Self-Learning

SQL_AGENT_LEARNING_ENABLED=false
SQL_AGENT_AUTO_SAVE_ERRORS=false

Database Support

MySQL

Full support including:

Full-text search with MATCH ... AGAINST
Natural language and boolean search modes
JSON column support for metadata

PostgreSQL