opencat / core
Shared data models and contract interfaces for the OpenCAT Framework
dev-main
2026-05-09 00:57 UTC
Requires
- php: ^8.2
- ext-intl: *
- ext-mbstring: *
Requires (Dev)
- phpunit/phpunit: ^11.0
This package is auto-updated.
Last update: 2026-05-09 00:57:59 UTC
Shared data models, contracts, and enums for the OpenCAT Framework.
Every other OpenCAT package depends on this one. It contains no business logic — only the shapes that the rest of the framework passes around.
Installation
```bash
composer require opencat/core
```
Requires PHP 8.2 or later with the ext-intl and ext-mbstring extensions.
What's inside
Models
| Class | Purpose |
|---|---|
| `Segment` | Ordered sequence of string and `InlineCode` elements; one translatable unit |
| `SegmentPair` | Source `Segment` + target `Segment` (null when untranslated), status, and lock flag |
| `BilingualDocument` | Ordered collection of `SegmentPair` objects plus filter skeleton data |
| `InlineCode` | A non-translatable formatting marker inside a `Segment` (bold tag, link, line break, etc.) |
| `TranslationUnit` | A stored source/target pair in a translation memory, with metadata |
| `MatchResult` | A TM lookup result: `TranslationUnit` plus similarity score and match type |
| `QualityIssue` | One issue raised by a QA check: check ID, severity, message, and character offset |
| `TermEntry` | A bilingual term pair from a glossary: source/target text, domain, and forbidden flag |
| `TermMatch` | A term found in running text: `TermEntry` plus the matched span |
Contracts (interfaces)
| Interface | Implemented by |
|---|---|
| `FileFilterInterface` | `filter-plaintext`, `filter-html`, `filter-docx`, others |
| `SegmentationEngineInterface` | `segmentation` |
| `TranslationMemoryInterface` | `translation-memory` |
| `MachineTranslationInterface` | `mt` |
| `TerminologyProviderInterface` | `terminology` |
| `QualityCheckInterface` | `qa` |
| `DocumentQualityCheckInterface` | `qa` |
Enums
| Enum | Values |
|---|---|
| `SegmentStatus` | `Untranslated`, `Draft`, `Translated`, `Reviewed`, `Approved`, `Rejected` |
| `SegmentState` | States for external interchange formats |
| `InlineCodeType` | `OPENING`, `CLOSING`, `STANDALONE` |
| `MatchType` | `EXACT`, `EXACT_TEXT`, `FUZZY` |
| `QualitySeverity` | `INFO`, `WARNING`, `ERROR` |
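Because these are PHP enums, calling code can branch on them exhaustively with `match`. The sketch below is only an illustration: it uses a local stand-in enum (`QualitySeverityStandIn`) with the case names listed for `QualitySeverity`, makes no assumption about whether the real enum is backed, and the `blocksDelivery()` helper is hypothetical rather than part of the package.

```php
<?php
declare(strict_types=1);

// Local stand-in mirroring the case names listed for QualitySeverity.
// The real enum ships with opencat/core; its backing type is not assumed here.
enum QualitySeverityStandIn
{
    case INFO;
    case WARNING;
    case ERROR;
}

/**
 * Hypothetical helper: decide whether an issue should block delivery.
 * `match` throws \UnhandledMatchError if a case is ever left unhandled.
 */
function blocksDelivery(QualitySeverityStandIn $severity): bool
{
    return match ($severity) {
        QualitySeverityStandIn::INFO,
        QualitySeverityStandIn::WARNING => false,
        QualitySeverityStandIn::ERROR => true,
    };
}

echo blocksDelivery(QualitySeverityStandIn::WARNING) ? 'block' : 'pass', PHP_EOL;
echo blocksDelivery(QualitySeverityStandIn::ERROR) ? 'block' : 'pass', PHP_EOL;
```

With the real enum, the same pattern would apply after swapping in the package's class name.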
Exceptions
Each domain has its own exception class extending \RuntimeException:
FilterException · MtException · SegmentationException · TerminologyException · TmException
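Since every domain exception shares `\RuntimeException` as a parent, a caller can handle one domain specifically and let the rest fall through to a common handler. The following is a minimal sketch under that assumption; the `*StandIn` classes are local placeholders for the package's exceptions, and `describeFailure()` is a hypothetical helper.

```php
<?php
declare(strict_types=1);

// Stand-ins for two of the domain exceptions. The real classes ship with
// opencat/core and, per the text above, extend \RuntimeException.
class FilterExceptionStandIn extends \RuntimeException {}
class TmExceptionStandIn extends \RuntimeException {}

/** Hypothetical wrapper that reports which kind of failure occurred. */
function describeFailure(callable $step): string
{
    try {
        $step();
        return 'ok';
    } catch (FilterExceptionStandIn $e) {
        // Domain-specific handling, e.g. skip the unsupported file.
        return 'filter: ' . $e->getMessage();
    } catch (\RuntimeException $e) {
        // Any other domain exception lands here via the shared parent.
        return 'other: ' . $e->getMessage();
    }
}

echo describeFailure(fn () => throw new FilterExceptionStandIn('cannot parse')), PHP_EOL;
echo describeFailure(fn () => throw new TmExceptionStandIn('store offline')), PHP_EOL;
```

Ordering matters: the more specific `catch` has to come before the `\RuntimeException` one, otherwise the parent clause would swallow everything.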
Working with Segment
A Segment holds an ordered mix of plain strings and InlineCode objects:
```php
use CatFramework\Core\Model\Segment;
use CatFramework\Core\Model\InlineCode;
use CatFramework\Core\Enum\InlineCodeType;

$bold      = new InlineCode('b1', InlineCodeType::OPENING, '<strong>');
$boldClose = new InlineCode('b1', InlineCodeType::CLOSING, '</strong>');

$segment = new Segment('seg-1', [
    'Hello ',
    $bold,
    'world',
    $boldClose,
    '!',
]);

$segment->getPlainText();   // "Hello world!"
$segment->isEmpty();        // false
$segment->getInlineCodes(); // [$bold, $boldClose]
```
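To make the "ordered mix" shape concrete, here is a self-contained sketch of how plain text can be derived from such a list by keeping the strings and skipping the code objects. It is not the package's implementation: `InlineCodeSketch` and `plainTextOf()` are hypothetical stand-ins written for this illustration.

```php
<?php
declare(strict_types=1);

// Stand-in for InlineCode: only the fields needed for this illustration.
final class InlineCodeSketch
{
    public function __construct(
        public readonly string $id,
        public readonly string $data,
    ) {}
}

/**
 * Hypothetical helper: flatten a mixed element list to plain text by
 * keeping strings and skipping inline-code objects.
 *
 * @param array<int, string|InlineCodeSketch> $elements
 */
function plainTextOf(array $elements): string
{
    $text = '';
    foreach ($elements as $element) {
        if (is_string($element)) {
            $text .= $element;
        }
    }
    return $text;
}

$elements = [
    'Hello ',
    new InlineCodeSketch('b1', '<strong>'),
    'world',
    new InlineCodeSketch('b1', '</strong>'),
    '!',
];

echo plainTextOf($elements), PHP_EOL; // Hello world!
```

Keeping markup out of the strings is what lets downstream consumers work on the text alone while the codes travel alongside it in order.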
Working with BilingualDocument
```php
use CatFramework\Core\Model\BilingualDocument;
use CatFramework\Core\Model\SegmentPair;
use CatFramework\Core\Enum\SegmentStatus;

$doc = new BilingualDocument(
    sourceLanguage: 'en-US',
    targetLanguage: 'fr-FR',
    originalFile: 'report.docx',
    mimeType: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
);

foreach ($doc->getSegmentPairs() as $pair) {
    echo $pair->source->getPlainText();            // source text
    echo $pair->status->name;                      // SegmentStatus enum name
    echo $pair->isLocked ? 'locked' : 'editable';
}
```
Implementing a custom file filter
```php
use CatFramework\Core\Contract\FileFilterInterface;
use CatFramework\Core\Model\BilingualDocument;
use CatFramework\Core\Exception\FilterException;

class MyFilter implements FileFilterInterface
{
    public function supports(string $filePath, ?string $mimeType = null): bool
    {
        // Case-insensitive extension check
        return str_ends_with(strtolower($filePath), '.myext');
    }

    public function extract(string $filePath, string $sourceLanguage, string $targetLanguage): BilingualDocument
    {
        // parse $filePath, create and return a BilingualDocument
        // Placeholder so the skeleton fails loudly until parsing is written
        throw new FilterException('MyFilter::extract() is not implemented yet');
    }

    public function rebuild(BilingualDocument $document, string $outputPath): void
    {
        // use $document->skeleton to reconstruct the file
    }

    public function getSupportedExtensions(): array
    {
        return ['.myext'];
    }
}
```
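One way a caller might use `supports()` is to walk a list of filters and take the first one that claims the file. The sketch below assumes that approach; it is self-contained, so it uses a reduced stand-in interface (`SupportsFileSketch`) instead of `FileFilterInterface`, and both `ExtensionFilterSketch` and `pickFilter()` are hypothetical names, not part of the package.

```php
<?php
declare(strict_types=1);

// Reduced stand-in for FileFilterInterface: only supports() is needed here.
interface SupportsFileSketch
{
    public function supports(string $filePath, ?string $mimeType = null): bool;
}

// Hypothetical filter keyed on one extension, mirroring MyFilter::supports().
final class ExtensionFilterSketch implements SupportsFileSketch
{
    public function __construct(public readonly string $extension) {}

    public function supports(string $filePath, ?string $mimeType = null): bool
    {
        return str_ends_with(strtolower($filePath), $this->extension);
    }
}

/**
 * Return the first filter that claims the file, or null when none does.
 *
 * @param list<SupportsFileSketch> $filters
 */
function pickFilter(array $filters, string $filePath): ?SupportsFileSketch
{
    foreach ($filters as $filter) {
        if ($filter->supports($filePath)) {
            return $filter;
        }
    }
    return null;
}

$filters = [
    new ExtensionFilterSketch('.txt'),
    new ExtensionFilterSketch('.myext'),
];

var_dump(pickFilter($filters, 'Report.MYEXT') === $filters[1]); // bool(true)
var_dump(pickFilter($filters, 'image.png'));                    // NULL
```

First-match selection makes the order of the list significant, so more specific filters would go ahead of general ones.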
Related packages
- `opencat/segmentation`: sentence segmentation using `SegmentationEngineInterface`
- `opencat/translation-memory`: TM using `TranslationMemoryInterface`
- `opencat/mt`: machine translation using `MachineTranslationInterface`
- `opencat/qa`: QA checks using `QualityCheckInterface`
- `opencat/workflow`: wires all of the above into one call