Interface ITextExtractorProcessor
public interface ITextExtractorProcessor
Represents the text extractor processor used to extract text from various kinds of content.
 
Extraction can be disabled for one or more file extensions by listing them, separated by commas, under the configuration key "index.system.extractor.text.disable.document.types".
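
To make the comma-separated format concrete, here is a minimal sketch of how such a value could be parsed into a set of extensions. The class name DisabledTypes, its methods, and the sample value are hypothetical. The trimming, lower-casing, and leading-dot handling are assumptions made for the example, since the library's own parsing rules are not described here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class DisabledTypes {

    // Configuration key quoted from the documentation.
    static final String KEY = "index.system.extractor.text.disable.document.types";

    // Assumption: entries are trimmed, lower-cased, and a leading dot is ignored.
    static String normalize(String extension) {
        String e = extension.trim().toLowerCase(Locale.ROOT);
        return e.startsWith(".") ? e.substring(1) : e;
    }

    // Splits the comma-separated value into a set of normalized extensions.
    static Set<String> parse(String configValue) {
        if (configValue == null || configValue.trim().isEmpty()) {
            return Collections.emptySet();
        }
        return Arrays.stream(configValue.split(","))
                .map(DisabledTypes::normalize)
                .filter(e -> !e.isEmpty())
                .collect(Collectors.toSet());
    }

    // True when the given extension appears in the disabled set.
    static boolean isDisabled(Set<String> disabled, String fileExtension) {
        return fileExtension != null && disabled.contains(normalize(fileExtension));
    }

    public static void main(String[] args) {
        // Hypothetical configuration value; real deployments choose their own list.
        Set<String> disabled = parse("pdf, docx,.odt");
        System.out.println(KEY + " disables pdf: " + isDisabled(disabled, "PDF"));
        System.out.println(KEY + " disables txt: " + isDisabled(disabled, "txt"));
    }
}
```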

Method Summary
- boolean canHandleExtractionFor(String fileExtension) - Indicates whether text can be extracted from files with the given extension.
- void extract(BufferedInputStream in, BufferedOutputStream out, String fileExtension) - Extracts the text from the content read from the input stream.

Method Details

extract
void extract(BufferedInputStream in, BufferedOutputStream out, String fileExtension) throws ExtractorNotFoundException, IOException
Extracts the text from the content read from the input stream. The resulting text is written to the output stream. A suitable extractor is selected according to the file extension passed as a parameter.
- Parameters:
- in - the input stream from which to extract the text
- out - the output stream to write the text result to
- fileExtension - the file extension indicating the type of content
- Throws:
- ExtractorNotFoundException - if no suitable extractor is available for the type of content
- IOException - if an I/O error occurs
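
The following sketch shows one way a caller might drive extract with buffered streams and deal with both declared exceptions. To stay self-contained it declares local stand-ins for ITextExtractorProcessor and ExtractorNotFoundException that mirror the signature above. PlainTextOnlyProcessor and extractText are hypothetical names, and the toy implementation (copy "txt" content unchanged, reject everything else) is an assumption for illustration, not the library's behavior.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

public class ExtractSketch {

    // Stand-in for the library's exception, declared only to keep the sketch self-contained.
    static class ExtractorNotFoundException extends Exception {
        ExtractorNotFoundException(String message) { super(message); }
    }

    // Stand-in declaring the extract signature documented above.
    interface ITextExtractorProcessor {
        void extract(BufferedInputStream in, BufferedOutputStream out, String fileExtension)
                throws ExtractorNotFoundException, IOException;
    }

    // Hypothetical toy implementation: copies "txt" content unchanged, rejects everything else.
    static class PlainTextOnlyProcessor implements ITextExtractorProcessor {
        @Override
        public void extract(BufferedInputStream in, BufferedOutputStream out, String fileExtension)
                throws ExtractorNotFoundException, IOException {
            if (!"txt".equalsIgnoreCase(fileExtension)) {
                throw new ExtractorNotFoundException("No extractor for: " + fileExtension);
            }
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            out.flush();
        }
    }

    // Returns the extracted text, or an empty Optional when no extractor matches the extension.
    static Optional<String> extractText(ITextExtractorProcessor processor,
                                        byte[] content, String fileExtension) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(content));
             BufferedOutputStream out = new BufferedOutputStream(sink)) {
            processor.extract(in, out, fileExtension);
        } catch (ExtractorNotFoundException e) {
            return Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Both streams are closed (and therefore flushed) here, so the sink holds the full result.
        return Optional.of(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ITextExtractorProcessor processor = new PlainTextOnlyProcessor();
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractText(processor, content, "txt"));
        System.out.println(extractText(processor, content, "bin"));
    }
}
```

Mapping ExtractorNotFoundException to an empty Optional is only one possible policy; a caller could equally log the unsupported type or rethrow.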
 
canHandleExtractionFor
boolean canHandleExtractionFor(String fileExtension)
Indicates whether text can be extracted from files with the given extension.
- Parameters:
- fileExtension - the file extension
- Returns:
- true if text can be extracted from this type of file, false otherwise
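
A caller can use canHandleExtractionFor as a guard before attempting extraction. The sketch below filters a list of file names down to those whose extension is reported as extractable. The nested interface is a reduced stand-in declaring only this one method. The helper extensionOf, the stub processor, and the choice to pass the extension without a leading dot are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CanHandleSketch {

    // Reduced stand-in declaring only the method documented above.
    interface ITextExtractorProcessor {
        boolean canHandleExtractionFor(String fileExtension);
    }

    // Hypothetical helper: text after the last dot, or "" when there is none.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return (dot < 0 || dot == fileName.length() - 1) ? "" : fileName.substring(dot + 1);
    }

    // Keeps only the file names whose extension the processor reports as extractable.
    static List<String> extractable(ITextExtractorProcessor processor, List<String> fileNames) {
        List<String> result = new ArrayList<>();
        for (String name : fileNames) {
            if (processor.canHandleExtractionFor(extensionOf(name))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical stub: pretends only "pdf" and "txt" are supported.
        ITextExtractorProcessor stub =
                ext -> ext.equalsIgnoreCase("pdf") || ext.equalsIgnoreCase("txt");
        List<String> names = Arrays.asList("report.pdf", "photo.png", "notes.txt", "README");
        System.out.println(extractable(stub, names));
    }
}
```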
 
 