SilverStripe\TextExtraction\Extractor\TikaTextExtractor
Enables text extraction of file content via the Tika CLI
See http://tika.apache.org/1.7/gettingstarted.html
Synopsis
class TikaTextExtractor
extends FileTextExtractor
{
- // members
- private static string $output_mode = '-t';
- // Inherited members from FileTextExtractor
- protected static $sorted_extractor_classes;
- // methods
- public mixed getVersion()
- protected int runShell()
- public string getContent()
- public bool isAvailable()
- public bool supportsExtension()
- public bool supportsMime()
- // Inherited methods from FileTextExtractor
- protected static array get_extractor_classes()
- protected static FileTextExtractor get_extractor()
- public static FileTextExtractor|null for_file()
- protected static string getPathFromFile()
- public abstract boolean isAvailable()
- public abstract boolean supportsExtension()
- public abstract boolean supportsMime()
- public abstract string getContent()
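Hierarchy
- SilverStripe\TextExtraction\Extractor\TikaTextExtractor extends SilverStripe\TextExtraction\Extractor\FileTextExtractor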
Members
private
- $output_mode — string. Text extraction mode. Defaults to -t (plain text).
protected
- $sorted_extractor_classes — array. Cache of extractor class names, sorted by priority.
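To illustrate what $output_mode selects, the sketch below maps a few output flags accepted by the Tika command-line app to the format each one produces. The constant TIKA_OUTPUT_MODES and the helper describeOutputMode() are hypothetical names introduced for this example; they are not part of the module, and the flag list is not exhaustive.

```php
<?php
// Illustrative map of Tika CLI output flags.
// TIKA_OUTPUT_MODES and describeOutputMode() are hypothetical, not module API.
const TIKA_OUTPUT_MODES = [
    '-t' => 'plain text',
    '-T' => 'plain text, main content only',
    '-h' => 'HTML',
    '-x' => 'XHTML',
];

// Return the description for a flag, or null if the flag is not in the map.
function describeOutputMode(string $flag): ?string
{
    return TIKA_OUTPUT_MODES[$flag] ?? null;
}

echo describeOutputMode('-t'), PHP_EOL; // plain text
```

Because $output_mode is a private static, it would presumably be changed through SilverStripe configuration rather than by editing the class.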
Methods
protected
- runShell() — Runs an arbitrary and safely escaped shell command
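As a rough illustration of a safely escaped invocation, the sketch below builds a Tika command line with escapeshellarg() and runs a command with exec(). The helpers buildTikaCommand() and runCommand() are hypothetical and written only for this example; the actual parameters of runShell() are not documented on this page, so this should not be read as its implementation.

```php
<?php
// Hypothetical helper: build a Tika CLI command line with the file path escaped.
function buildTikaCommand(string $mode, string $path): string
{
    // escapeshellarg() quotes the path so spaces and shell metacharacters stay literal
    return sprintf('tika %s %s', $mode, escapeshellarg($path));
}

// Hypothetical helper: run a command, collect its stdout, and return the exit code.
function runCommand(string $command, string &$output = ''): int
{
    $lines = [];
    $exitCode = 0;
    exec($command, $lines, $exitCode);
    $output = implode("\n", $lines);
    return $exitCode;
}

// Show the command that would be run for a path containing a space.
echo buildTikaCommand('-t', '/tmp/annual report.pdf'), PHP_EOL;
```

A caller would initialise an output variable to an empty string, pass it to runCommand(), and treat a non-zero exit code as a failed extraction.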
public
- getContent()
- getVersion() — Get the version of Tika installed, or 0 if not installed
- isAvailable()
- supportsExtension()
- supportsMime()
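The mixed return type of getVersion() (a version, or 0 when Tika is not installed) can be sketched as a small parser over the text printed by `tika --version`. The function parseTikaVersion() is a hypothetical helper for this example, and the banner format "Apache Tika 1.7" is an assumption about the CLI output.

```php
<?php
// Hypothetical helper, not part of the module: extract a version string from
// the output of `tika --version`, or return 0 when no version is found.
function parseTikaVersion(string $stdout)
{
    // Assumes a banner of the form "Apache Tika 1.7"
    if (preg_match('/Apache Tika (?<version>[.\d]+)/', $stdout, $matches)) {
        return $matches['version'];
    }
    return 0;
}

$version = parseTikaVersion("Apache Tika 1.7\n");
// version_compare() gives a numeric-aware comparison of dotted version strings
$isOldEnough = $version !== 0 && version_compare($version, '1.7', '>=');
```

Returning 0 rather than null keeps a simple truthiness check (`if ($version)`) usable as an installed-or-not test.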
Inherited from SilverStripe\TextExtraction\Extractor\FileTextExtractor
protected
- getPathFromFile() — Some text extractors (like pdftotext) may require a physical file to read from, so write the current file contents to a temp file and return its path
- get_extractor() — Get the text file extractor for the given class
- get_extractor_classes() — Gets the list of prioritised extractor classes
public
- for_file() — Given a File object, decide which extractor instance to use to handle it
- getContent() — Given a File instance, extract the contents as text.
- isAvailable() — Checks if the extractor is supported on the current environment, for example if the correct binaries or libraries are available.
- supportsExtension() — Determine if this extractor supports the given extension.
- supportsMime() — Determine if this extractor supports the given mime type.
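One plausible reading of how for_file() could combine the methods above is sketched here: walk the extractors in priority order, skip any that are unavailable, and return the first one that supports the file's extension or MIME type. ExtractorSketch, StubExtractor and pickExtractor() are hypothetical stand-ins so the example runs without SilverStripe; the order of the extension and MIME checks is an assumption.

```php
<?php
// Hypothetical stand-in for the abstract methods listed above; not the module's class.
interface ExtractorSketch
{
    public function isAvailable(): bool;
    public function supportsExtension(string $extension): bool;
    public function supportsMime(string $mime): bool;
}

// Hypothetical stub with fixed capabilities, used only to exercise the loop.
class StubExtractor implements ExtractorSketch
{
    public $name;
    private $available;
    private $extensions;
    private $mimes;

    public function __construct(string $name, bool $available, array $extensions, array $mimes)
    {
        $this->name = $name;
        $this->available = $available;
        $this->extensions = $extensions;
        $this->mimes = $mimes;
    }

    public function isAvailable(): bool
    {
        return $this->available;
    }

    public function supportsExtension(string $extension): bool
    {
        return in_array(strtolower($extension), $this->extensions, true);
    }

    public function supportsMime(string $mime): bool
    {
        return in_array(strtolower($mime), $this->mimes, true);
    }
}

// Walk the extractors in priority order and return the first usable match.
function pickExtractor(array $extractors, string $extension, string $mime): ?ExtractorSketch
{
    foreach ($extractors as $extractor) {
        if (!$extractor->isAvailable()) {
            continue; // skip extractors whose binaries or libraries are missing
        }
        if ($extension !== '' && $extractor->supportsExtension($extension)) {
            return $extractor;
        }
        if ($mime !== '' && $extractor->supportsMime($mime)) {
            return $extractor;
        }
    }
    return null; // matches the FileTextExtractor|null return type of for_file()
}

$candidates = [
    new StubExtractor('pdf-only', false, ['pdf'], ['application/pdf']),
    new StubExtractor('tika-like', true, ['pdf', 'docx'], ['application/pdf']),
];
$chosen = pickExtractor($candidates, 'pdf', 'application/pdf');
echo $chosen->name, PHP_EOL; // tika-like
```

The first candidate is skipped because it reports itself unavailable, so the lower-priority extractor handles the file.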