goaop / dissect
A set of tools for lexical and syntactical analysis written in pure PHP
Requires
- php: ^8.2.0
Requires (Dev)
- phpunit/phpunit: ^11.0.3
- rector/rector: ^1.0
- symfony/console: >=6.0
Suggests
- symfony/console: for the command-line tool
This package is auto-updated.
Last update: 2026-03-25 20:50:31 UTC
README
A pure-PHP toolkit for building custom lexers and LALR(1) parsers — fast, type-safe, and dependency-free.
✨ What is Dissect?
Dissect is a pure-PHP library for lexical and syntactical analysis — the foundational building blocks for any language tooling: expression evaluators, template engines, DSL interpreters, query parsers, and more.
It powers the GoAOP framework, where it parses pointcut DSL expressions into an AST for aspect-oriented programming.
Data flow
```text
Input String
     │
     ▼
┌─────────┐      ┌──────────────┐      ┌──────────────────┐
│  Lexer  │ ───▶ │ TokenStream  │ ───▶ │  LALR(1) Parser  │ ───▶ Result / AST
└─────────┘      └──────────────┘      └──────────────────┘
                                                ▲
                                         Grammar (rules
                                          + callbacks)
```
🚀 Key Features
🔤 Flexible Lexers
| Lexer | Description |
|---|---|
| SimpleLexer | Fluent builder API: define tokens with strings or regex, mark skippable tokens |
| StatefulLexer | Context-aware tokenization with explicit state transitions (e.g. for string interpolation) |
| RegexLexer | Abstract base class adapted from Doctrine for ultra-fast single-pass regex lexing |
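To illustrate the general idea behind regex-based tokenizing with skippable tokens, here is a minimal standalone sketch. It is not Dissect's implementation and uses none of its classes: `sketchLex()` and its token array shape are hypothetical names chosen for this example. It assumes the rule fragments contain no `~` character and never match an empty string.

```php
<?php
declare(strict_types=1);

/**
 * Toy tokenizer (hypothetical helper, not part of Dissect).
 *
 * @param array<string, string> $rules token name => regex fragment, without delimiters
 * @param list<string>          $skip  token names to drop from the result
 * @return list<array{type: string, value: string, offset: int}>
 */
function sketchLex(string $input, array $rules, array $skip = []): array
{
    // Combine every rule into one alternation of named groups.
    $groups = [];
    foreach ($rules as $name => $fragment) {
        $groups[] = "(?P<{$name}>{$fragment})";
    }
    // \G pins each match to the current offset, so no input is silently skipped.
    $pattern = '~\G(?:' . implode('|', $groups) . ')~';

    $tokens = [];
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        if (preg_match($pattern, $input, $m, 0, $offset) !== 1 || $m[0] === '') {
            throw new RuntimeException("Unexpected input at offset {$offset}");
        }
        // The first non-empty named group tells us which rule matched.
        foreach (array_keys($rules) as $name) {
            if (($m[$name] ?? '') !== '') {
                if (!in_array($name, $skip, true)) {
                    $tokens[] = ['type' => $name, 'value' => $m[$name], 'offset' => $offset];
                }
                break;
            }
        }
        $offset += strlen($m[0]);
    }
    return $tokens;
}

$rules = ['INT' => '[0-9]+', 'PLUS' => '\+', 'MINUS' => '-', 'WS' => '\s+'];
$tokens = sketchLex('3 + 5 - 2', $rules, ['WS']);
```

The input is scanned once from left to right; whitespace is matched like any other token and then dropped, which is the role a skip declaration plays in a lexer of this kind.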
📐 LALR(1) Parser
- Full LALR(1) grammar support: handles the vast majority of real-world grammars
- Fluent grammar API: define productions and semantic actions with readable PHP closures
- Operator precedence & associativity: built-in left(), right(), nonassoc() declarations
- Conflict resolution: configurable strategies (shift-wins, longer-reduce, earlier-reduce)
- Precomputed parse tables: analyze once, serialize to PHP file, load instantly in production
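The "analyze once, serialize to a PHP file, load in production" workflow can be sketched in plain PHP as follows. This only shows the general pattern of dumping an array with `var_export()` and reading it back with `require`; the helper names and the toy table contents are assumptions made for this example and do not reflect Dissect's actual table format or API.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: write an array as a PHP file that returns it.
 */
function sketchDumpTable(array $table, string $path): void
{
    $code = "<?php\n\nreturn " . var_export($table, true) . ";\n";
    if (file_put_contents($path, $code, LOCK_EX) === false) {
        throw new RuntimeException("Cannot write {$path}");
    }
}

/**
 * Hypothetical helper: load the array back without re-running any analysis.
 */
function sketchLoadTable(string $path): array
{
    /** @var array $table */
    $table = require $path;
    return $table;
}

// Toy table with made-up contents; a real LALR(1) table is far larger.
$table = [
    'action' => [0 => ['INT' => 2], 2 => ['$eof' => -1]],
    'goto'   => [0 => ['Expr' => 1]],
];

$path = sys_get_temp_dir() . '/sketch_parse_table_' . uniqid() . '.php';
sketchDumpTable($table, $path);   // build step: done once
$loaded = sketchLoadTable($path); // production: a plain require
unlink($path);
```

Because the dumped file is ordinary PHP, it can also benefit from the opcode cache when one is enabled, which is a common reason for preferring this format over `serialize()` or JSON.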
🌳 AST Construction
- CommonNode: ready-to-use tree node with named children and arbitrary attributes
- Countable & iterable: traverse subtrees with standard PHP constructs
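What "countable and iterable" means in practice can be shown with a small stand-in class. `SketchNode` below is hypothetical and is not CommonNode's actual API; it only assumes a node with named children and attributes, as described above, and implements PHP's standard `Countable` and `IteratorAggregate` interfaces.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a tree node; not CommonNode's real interface.
 *
 * @implements IteratorAggregate<string, SketchNode>
 */
final class SketchNode implements Countable, IteratorAggregate
{
    /**
     * @param array<string, mixed>      $attributes
     * @param array<string, SketchNode> $children
     */
    public function __construct(
        public readonly array $attributes = [],
        private array $children = [],
    ) {
    }

    public function child(string $name): SketchNode
    {
        return $this->children[$name]
            ?? throw new OutOfBoundsException("No child named {$name}");
    }

    // Countable: count($node) returns the number of direct children.
    public function count(): int
    {
        return count($this->children);
    }

    // IteratorAggregate: foreach ($node as $name => $child) walks direct children.
    public function getIterator(): ArrayIterator
    {
        return new ArrayIterator($this->children);
    }
}

// Recursively evaluate a tree of '+' / '-' nodes with integer leaves.
function sketchEvaluate(SketchNode $node): int
{
    if (count($node) === 0) {
        return $node->attributes['value'];
    }
    $left = sketchEvaluate($node->child('left'));
    $right = sketchEvaluate($node->child('right'));
    return $node->attributes['op'] === '+' ? $left + $right : $left - $right;
}

// Tree for (3 + 5) - 2.
$int = fn (int $v): SketchNode => new SketchNode(['value' => $v]);
$tree = new SketchNode(['op' => '-'], [
    'left'  => new SketchNode(['op' => '+'], ['left' => $int(3), 'right' => $int(5)]),
    'right' => $int(2),
]);

$names = [];
foreach ($tree as $name => $child) {
    $names[] = $name;
}
$value = sketchEvaluate($tree);
```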
🛠 Developer Experience
- Zero runtime dependencies — only Symfony Console as an optional CLI dep
- PHPStan level 10 — fully typed with generics, array shapes, and readonly properties
- CLI tool — dump parse tables and visualize automaton states as Graphviz graphs
📦 Installation
composer require goaop/dissect
⚡ Quick Example
```php
use Dissect\Lexer\SimpleLexer;
use Dissect\Parser\Grammar;
use Dissect\Parser\LALR1\Parser;

// 1. Define a lexer
$lexer = new SimpleLexer();
$lexer->regex('INT', '/[0-9]+/')
    ->token('PLUS', '+')
    ->token('MINUS', '-')
    ->regex('WS', '/\s+/')
    ->skip('WS');

// 2. Define a grammar
$grammar = new Grammar();
$grammar('Expr')
    ->is('Expr', 'PLUS', 'Expr')
    ->call(fn($l, $_, $r) => $l + $r)
    ->is('Expr', 'MINUS', 'Expr')
    ->call(fn($l, $_, $r) => $l - $r)
    ->is('INT')
    ->call(fn($t) => (int) $t->getValue());

$grammar->operators('PLUS', 'MINUS')->left()->prec(1);
$grammar->start('Expr');

// 3. Parse!
$parser = new Parser($grammar);
$result = $parser->parse($lexer->lex('3 + 5 - 2')); // → 6
```
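The example declares PLUS and MINUS as left-associative with the same precedence. For `3 + 5 - 2` both groupings happen to give 6, so the effect of associativity is easier to see with `10 - 4 - 3`. The sketch below evaluates that expression both ways using a tiny precedence-climbing evaluator. This is a different technique from LALR(1) parsing, chosen here only because it fits in a few lines; `sketchEval()` is a hypothetical helper and not part of Dissect.

```php
<?php
declare(strict_types=1);

/**
 * Precedence-climbing evaluator for '+' and '-' (illustration only).
 *
 * @param list<int|string> $tokens alternating integers and '+' / '-' operators
 * @param string           $assoc  'left' or 'right'
 */
function sketchEval(array $tokens, string $assoc, int &$pos = 0, int $minPrec = 1): int
{
    $prec = ['+' => 1, '-' => 1];
    $left = $tokens[$pos++];

    while ($pos < count($tokens) && $prec[$tokens[$pos]] >= $minPrec) {
        $op = $tokens[$pos++];
        // Left-associative: the right operand may only contain tighter operators.
        // Right-associative: it may contain operators of the same precedence.
        $nextMin = $assoc === 'left' ? $prec[$op] + 1 : $prec[$op];
        $right = sketchEval($tokens, $assoc, $pos, $nextMin);
        $left = $op === '+' ? $left + $right : $left - $right;
    }
    return $left;
}

$tokens = [10, '-', 4, '-', 3];
$leftResult  = sketchEval($tokens, 'left');  // (10 - 4) - 3
$rightResult = sketchEval($tokens, 'right'); // 10 - (4 - 3)
```

With left associativity the expression evaluates to 3, with right associativity to 9, which is why subtraction is normally declared left-associative.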
📖 Documentation
| Topic | Description |
|---|---|
| Lexical analysis | SimpleLexer, StatefulLexer, RegexLexer, performance tips |
| Writing a grammar | Productions, callbacks, operator precedence, conflict resolution |
| Building an AST | CommonNode, tree traversal |
| Common patterns | Lists, comma-separated sequences, expression grammars |
| CLI tool | Precomputing parse tables, exporting automaton graphs |
🧪 Testing & Quality
```bash
# Run tests
composer test

# Run tests with coverage
composer test-coverage

# Static analysis (PHPStan level 10)
composer phpstan
```
🙏 Credits
Originally created by @jakubledl, extended by @WalterWoshid, maintained by the GoAOP team.
Give a ⭐ if Dissect saved you from writing a parser by hand!