public class RegexPlainTextFilter extends AbstractFilter
PlainTextFilter extracts lines of input text, separated by line terminators.
The filter is aware of the following line terminators:
| Modifier and Type | Field and Description |
|---|---|
| static String | FILTER_CONFIG |
| static String | FILTER_CONFIG_LINES |
| static String | FILTER_CONFIG_PARAGRAPHS |
| static String | FILTER_MIME |
| static String | FILTER_NAME |

Inherited fields: SUB_FILTER

| Constructor and Description |
|---|
| RegexPlainTextFilter() |

| Modifier and Type | Method and Description |
|---|---|
| void | cancel(): Cancels the current process. |
| void | close(): Closes the input document. |
| IFilterWriter | createFilterWriter(): Creates a new IFilterWriter object from the most appropriate class to use with this filter. |
| ISkeletonWriter | createSkeletonWriter(): Creates a new ISkeletonWriter object that corresponds to the type of skeleton this filter uses. |
| String | getMimeType(): Gets the input document mime type. |
| String | getName(): Gets the name/identifier of this filter. |
| Parameters | getParameters(): Gets the current parameters for this filter. |
| Parameters | getRegexParameters(): Provides access to the internal line extractor's Parameters object. |
| boolean | hasNext(): Indicates if there is an event to process. |
| Event | next(): Gets the next event available. |
| void | open(RawDocument input): Opens the input document described in a given RawDocument object. |
| void | open(RawDocument input, boolean generateSkeleton): Opens the input document described in a given RawDocument object, and optionally creates skeleton information. |
| void | setParameters(IParameters params): Sets new parameters for this filter. |
| void | setRule(String rule, int sourceGroup, int regexOptions): Configures an internal line extractor. |

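The implementer note on hasNext() in this reference requires that the method can be called several times without changing state. One common way to meet that kind of contract is a one-item look-ahead buffer. The sketch below is only an illustration using plain java.util.Iterator over strings: the class name IdempotentHasNextSketch and its fields are hypothetical, it does not use any Okapi classes, and it is not a description of how RegexPlainTextFilter is actually implemented.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in, not an Okapi class: wraps any Iterator<String> and
// keeps hasNext() free of visible side effects by buffering one item.
public class IdempotentHasNextSketch implements Iterator<String> {
    private final Iterator<String> source;
    private String buffered;      // item read ahead by hasNext(), if any
    private boolean hasBuffered;  // true while 'buffered' holds an unread item

    public IdempotentHasNextSketch(Iterator<String> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        // Read ahead at most once; repeated calls reuse the buffered item.
        if (!hasBuffered && source.hasNext()) {
            buffered = source.next();
            hasBuffered = true;
        }
        return hasBuffered;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String result = buffered;
        buffered = null;
        hasBuffered = false;
        return result;
    }

    public static void main(String[] args) {
        Iterator<String> it =
                new IdempotentHasNextSketch(Arrays.asList("a", "b").iterator());
        // Calling hasNext() twice in a row does not skip an item.
        System.out.println(it.hasNext() && it.hasNext());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

With this arrangement a caller can poll hasNext() as often as needed; only next() advances the iteration.
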
Methods inherited from class AbstractFilter: addConfiguration, addConfiguration, addConfiguration, addConfigurations, createEndFilterEvent, createStartFilterEvent, findConfiguration, getConfiguration, getConfigurations, getDisplayName, getDocumentId, getDocumentName, getEncoderManager, getEncoding, getFilterConfigurationMapper, getFilterWriter, getNewlineType, getParameters, getParametersClassName, getParentId, getSrcLoc, getTrgLoc, isCanceled, isGenerateSkeleton, isMultilingual, isUtf8Bom, isUtf8Encoding, removeConfiguration, setDisplayName, setDocumentName, setEncoding, setFilterConfigurationMapper, setFilterWriter, setGenerateSkeleton, setMimeType, setMultilingual, setName, setNewlineType, setOptions, setParentId, setSrcLoc, setTrgLoc

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Iterator: forEachRemaining, remove

public static final String FILTER_NAME
public static final String FILTER_MIME
public static final String FILTER_CONFIG
public static final String FILTER_CONFIG_LINES
public static final String FILTER_CONFIG_PARAGRAPHS
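
The defaults documented for setRule (rule "^(.*?)$", source group 1, regex options Pattern.MULTILINE) can be tried outside the filter with java.util.regex alone. The following is a minimal sketch under that assumption: the class DefaultRuleSketch and the helper extractLines are hypothetical names introduced for illustration, and the filter's internal line extractor may apply the rule differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not an Okapi class.
public class DefaultRuleSketch {
    // Default values as documented for setRule(rule, sourceGroup, regexOptions).
    static final String RULE = "^(.*?)$";
    static final int SOURCE_GROUP = 1;
    static final int REGEX_OPTIONS = Pattern.MULTILINE;

    // Hypothetical helper: collects the text captured by the source group
    // for every match of the rule in the given text.
    public static List<String> extractLines(String text) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = Pattern.compile(RULE, REGEX_OPTIONS).matcher(text);
        while (matcher.find()) {
            lines.add(matcher.group(SOURCE_GROUP));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Mixed "\n" and "\r\n" terminators; MULTILINE makes ^ and $ match
        // at line boundaries rather than only at the ends of the input.
        String text = "first line\nsecond line\r\nthird line";
        for (String line : extractLines(text)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Running main prints [first line], [second line] and [third line] on separate lines. Without Pattern.MULTILINE the same rule would find no match in this input, because "." does not match line terminators and the anchors would apply only to the whole input.
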
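public void setRule(String rule, int sourceGroup, int regexOptions)
Configures an internal line extractor.
Parameters:
rule - Java regex rule used to extract lines of text. Default: "^(.*?)$".
sourceGroup - Regex capturing group denoting the text to be extracted. Default: 1.
regexOptions - Java regex options. Default: Pattern.MULTILINE.

public Parameters getRegexParameters()
Provides access to the internal line extractor's Parameters object.
Returns: Parameters object; with this object you can access the line extraction rule, source group, regex options, etc.

public void cancel()
Description copied from interface: IFilter
Cancels the current process.
Specified by: cancel in interface IFilter
Overrides: cancel in class AbstractFilter

public void close()
Description copied from interface: IFilter
Closes the input document.
Specified by: close in interface AutoCloseable, close in interface IFilter
Overrides: close in class AbstractFilter

public IFilterWriter createFilterWriter()
Description copied from interface: IFilter
Creates a new IFilterWriter object from the most appropriate class to use with this filter.
Specified by: createFilterWriter in interface IFilter
Overrides: createFilterWriter in class AbstractFilter

public ISkeletonWriter createSkeletonWriter()
Description copied from interface: IFilter
Creates a new ISkeletonWriter object that corresponds to the type of skeleton this filter uses.
Specified by: createSkeletonWriter in interface IFilter
Overrides: createSkeletonWriter in class AbstractFilter

public String getMimeType()
Description copied from class: AbstractFilter
Gets the input document mime type.
Specified by: getMimeType in interface IFilter
Overrides: getMimeType in class AbstractFilter

public String getName()
Description copied from interface: IFilter
Gets the name/identifier of this filter.
Specified by: getName in interface IFilter
Overrides: getName in class AbstractFilter

public Parameters getParameters()
Description copied from interface: IFilter
Gets the current parameters for this filter.
Specified by: getParameters in interface IFilter
Overrides: getParameters in class AbstractFilter

public boolean hasNext()
Description copied from interface: IFilter
Indicates if there is an event to process.
Implementer Note: The caller must be able to call this method several times without changing state.

public Event next()
Description copied from interface: IFilter
Gets the next event available.

public void open(RawDocument input)
Description copied from interface: IFilter
Opens the input document described in a given RawDocument object.
Parameters:
input - The RawDocument object to use to open the document.

public void open(RawDocument input, boolean generateSkeleton)
Description copied from interface: IFilter
Opens the input document described in a given RawDocument object, and optionally creates skeleton information.
Specified by: open in interface IFilter
Overrides: open in class AbstractFilter
Parameters:
input - The RawDocument object to use to open the document.
generateSkeleton - true to generate the skeleton data, false otherwise.

public void setParameters(IParameters params)
Description copied from interface: IFilter
Sets new parameters for this filter.
Specified by: setParameters in interface IFilter
Overrides: setParameters in class AbstractFilter
Parameters:
params - The new parameters to use.

Copyright © 2022. All rights reserved.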