Text Extraction API for SilverStripe CMS (mostly used with 'fulltextsearch' module)
Provides a text extraction API for file content that can hook into different extractor
engines, depending on which engines are available and on the format of the file being parsed. The output is always a string containing the file's content.
Via the FileTextExtractable extension, this logic can be used to
cache the extracted content on a DataObject subclass (usually File).
The module supports text extraction on the following file formats:
PDF (via the pdftotext utility)
Install the module with Composer:
composer require silverstripe/textextraction
The module depends on the Guzzle HTTP Library,
which Composer checks out automatically. Alternatively, install Guzzle
through PEAR and ensure it is in your include_path.
Bugs are tracked in the issues section of this repository. Before submitting an issue, please read over
existing issues to ensure yours is unique.
If the issue does look like a new bug, open a new issue in the issues section.
Please report security issues to [email protected] directly. Please don't file security issues in the bug tracker.
If you would like to contribute to the module, please raise a pull request and discuss it
with the module maintainers.