Interface ITextExtractor
public interface ITextExtractor
Interface for text extractor implementations. Each ITextExtractor implementation is compatible with a specific list of file extensions. If several implementations are found to be compatible with the same file extension, an exception is thrown and extraction cannot proceed.
An ITextExtractor implementation must be declared using the Service Provider Interface (SPI) mechanism.
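The uniqueness rule described above can be sketched as follows. This is an illustration, not the library's own lookup code. The interface declared in the block is a local stand-in so that the example compiles on its own, and the parameter types of extract (InputStream, OutputStream) are assumed from the parameter descriptions. ExtractorRegistry and the choice of IllegalStateException are hypothetical; the documentation only states that an exception is thrown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Local stand-in for the documented interface; the extract parameter types are assumed.
interface ITextExtractor {
    Collection<String> retrieveCompatibleFileExtensions();

    void extract(InputStream in, OutputStream out) throws IOException;
}

// Hypothetical helper, not part of the documented API.
final class ExtractorRegistry {
    private final Map<String, ITextExtractor> byExtension = new HashMap<>();

    // Accepts any Iterable, so the result of
    // java.util.ServiceLoader.load(ITextExtractor.class) could be passed directly.
    ExtractorRegistry(Iterable<? extends ITextExtractor> extractors) {
        for (ITextExtractor extractor : extractors) {
            for (String extension : extractor.retrieveCompatibleFileExtensions()) {
                ITextExtractor previous = byExtension.putIfAbsent(extension, extractor);
                if (previous != null && previous != extractor) {
                    // Two different implementations claim the same extension.
                    throw new IllegalStateException(
                            "Several extractors declared for extension: " + extension);
                }
            }
        }
    }

    // Returns the extractor registered for the extension, or null if there is none.
    ITextExtractor forExtension(String extension) {
        return byExtension.get(extension);
    }
}
```

If the standard java.util.ServiceLoader mechanism is the one in use, an implementation is typically declared by listing its fully qualified class name in a provider-configuration file under META-INF/services/, named after the fully qualified name of the interface, or with a provides ... with ... clause in module-info.java.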
Method Summary
Modifier and Type     Method                              Description
void                  extract                             Extracts the text from the content read from the input stream.
Collection<String>    retrieveCompatibleFileExtensions    Returns a collection of compatible file extensions.
Method Details
retrieveCompatibleFileExtensions
Collection<String> retrieveCompatibleFileExtensions()
Returns a collection of compatible file extensions.
Returns:
the compatible file extensions collection
extract
Extracts the text from the content read from the input stream. The text is written to the output stream.
This extract method can be called several times on the same ITextExtractor instance with different content to process.
Parameters:
in - the input stream from which to extract the text
out - the output stream to write the text result to
Throws:
IOException - if an error occurs during extraction
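A minimal sketch of an implementation that honors the contract above is shown here. It assumes that extract takes a java.io.InputStream and a java.io.OutputStream, which the parameter descriptions suggest but do not state. The interface is redeclared locally so that the block compiles on its own. PlainTextExtractor and PlainTextExtractorDemo are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collection;
import java.util.List;

// Local stand-in for the documented interface; the extract parameter types are assumed.
interface ITextExtractor {
    Collection<String> retrieveCompatibleFileExtensions();

    void extract(InputStream in, OutputStream out) throws IOException;
}

// Hypothetical extractor for plain-text files: the content is already text,
// so it is copied to the output unchanged.
final class PlainTextExtractor implements ITextExtractor {
    @Override
    public Collection<String> retrieveCompatibleFileExtensions() {
        return List.of("txt");
    }

    @Override
    public void extract(InputStream in, OutputStream out) throws IOException {
        // No per-call state is kept in fields, so the same instance
        // can be reused for several contents.
        in.transferTo(out); // requires Java 9 or later
        out.flush();
    }
}

// Hypothetical helper that runs an extractor on an in-memory string.
final class PlainTextExtractorDemo {
    static String extractToString(ITextExtractor extractor, String content) {
        try (InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            extractor.extract(in, out);
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling PlainTextExtractorDemo.extractToString twice with the same PlainTextExtractor instance and different strings returns each string unchanged, which illustrates the reuse described above.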