public class HtmlFilter extends AbstractMarkupFilter

Inherited fields: SUB_FILTER

| Constructor and Description |
|---|
| HtmlFilter() |

| Modifier and Type | Method and Description |
|---|---|
| void | `close()` Close the filter and all used resources. |
| protected PropertyTextUnitPlaceholder | `createPropertyTextUnitPlaceholder(PropertyTextUnitPlaceholder.PlaceholderAccessType type, String name, String value, net.htmlparser.jericho.Tag tag, net.htmlparser.jericho.Attribute attribute)` |
| ISkeletonWriter | `createSkeletonWriter()` Creates a new ISkeletonWriter object that corresponds to the type of skeleton this filter uses. |
| protected TextFragment.TagType | `determineTagType(net.htmlparser.jericho.Tag tag)` Filter-specific method for determining the TextFragment.TagType. |
| protected void | `endFilter()` End the current filter processing and send the Ending Event. |
| protected TaggedFilterConfiguration | `getConfig()` Get the current TaggedFilterConfiguration. |
| Parameters | `getParameters()` Gets the current parameters for this filter. |
| protected String | `normalizeAttributeName(String attrName, String attrValue, net.htmlparser.jericho.Tag tag)` Some attribute names are converted to Okapi standards, such as HTML charset to "encoding" and lang to "language". |
| void | `open(RawDocument input, boolean generateSkeleton)` Start a new IFilter using the supplied RawDocument. |
| protected void | `preProcess(net.htmlparser.jericho.Segment segment)` Do any handling needed before the current Segment is processed. |
| void | `setParameters(IParameters params)` Sets new parameters for this filter. |
| void | `setParametersFromFile(File config)` Initialize filter parameters from a Java File. |
| void | `setParametersFromString(String config)` Initialize filter parameters from a String. |
| void | `setParametersFromURL(URL config)` Initialize filter parameters from a URL. |
| protected void | `startFilter()` Initialize rule state and parser. |
| protected TaggedFilterConfiguration.RULE_TYPE | `updateEndTagRuleState(net.htmlparser.jericho.EndTag endTag)` |
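
The attribute-name conversion described for normalizeAttributeName can be illustrated with a minimal, self-contained sketch. It covers only the two mappings this page names (charset to "encoding" and lang to "language"). The class `AttributeNameNormalizer` and its `normalize` method are hypothetical stand-ins: they are not part of Okapi, they ignore the `attrValue` and `tag` arguments that the real method receives, and the case-insensitive matching is an assumption.

```java
import java.util.Locale;

// Hypothetical stand-in, NOT the Okapi implementation: it only mirrors the
// two conversions named in the method summary.
public class AttributeNameNormalizer {

    // Returns the Okapi-style name for a known HTML attribute name,
    // or the input unchanged when no conversion is known.
    static String normalize(String attrName) {
        if (attrName == null) {
            return null;
        }
        // Assumption: HTML attribute names are matched case-insensitively.
        switch (attrName.toLowerCase(Locale.ROOT)) {
            case "charset":
                return "encoding";
            case "lang":
                return "language";
            default:
                return attrName;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("charset"));
        System.out.println(normalize("lang"));
        System.out.println(normalize("href"));
    }
}
```

Names without a known conversion pass through untouched in this sketch, so callers can apply it to every attribute without checking first.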

Methods inherited from class AbstractMarkupFilter:
addCodeToCurrentTextUnit, addCodeToCurrentTextUnit, addFilterEvent, addToDocumentPart, addToTextUnit, addToTextUnit, addToTextUnit, addToTextUnit, addToTextUnit, appendToFirstSkeletonPart, canStartNewTextUnit, createEventBuilder, createPropertyTextUnitPlaceholders, detectEncoding, endDocumentPart, endGroup, endTextUnit, getBufferedWhiteSpace, getCurrentDocName, getDocumentPartId, getEventBuilder, getGroupIdSequence, getParsedHeader, getRuleState, getTextUnitId, handleCdataSection, handleCharacterEntity, handleComment, handleDocTypeDeclaration, handleDocumentPart, handleEndTag, handleMarkupDeclaration, handleNumericEntity, handleProcessingInstruction, handleServerCommon, handleServerCommonEscaped, handleStartTag, handleText, handleXmlDeclaration, hasNext, isBOM, isDocumentEncoding, isInsideTextRun, isPreserveWhitespace, isUtf8Bom, isUtf8Encoding, isWhiteSpace, next, open, peekTempEvent, popTempEvent, postProcessTextUnit, setCurrentDocName, setDocumentPartId, setGroupIdSequence, setMimeType, setPreserveWhitespace, setTextUnitId, setTextUnitMimeType, setTextUnitName, setTextUnitPreserveWhitespace, setTextUnitTranslatable, setTextUnitType, startDocumentPart, startGroup, startGroup, startTextUnit, startTextUnit, startTextUnit, startTextUnit, updateStartTagRuleState

Methods inherited from class AbstractFilter:
addConfiguration, addConfigurations, cancel, createEndFilterEvent, createFilterWriter, createStartFilterEvent, getConfiguration, getConfigurations, getDisplayName, getDocumentId, getDocumentName, getEncoderManager, getEncoding, getFilterConfigurationMapper, getFilterWriter, getMimeType, getName, getNewlineType, getParentId, getSrcLoc, getTrgLoc, isCanceled, isGenerateSkeleton, isMultilingual, removeConfiguration, setDisplayName, setDocumentName, setEncoding, setFilterConfigurationMapper, setFilterWriter, setGenerateSkeleton, setMultilingual, setName, setNewlineType, setOptions, setParentId, setSrcLoc, setTrgLoc

Methods inherited from class Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Iterator:
forEachRemaining, remove

public ISkeletonWriter createSkeletonWriter()
Description copied from interface IFilter: Creates a new ISkeletonWriter object that corresponds to the type of skeleton this filter uses.
Also declared in: interface IFilter, class AbstractFilter

public void open(RawDocument input, boolean generateSkeleton)
Description copied from class AbstractMarkupFilter: Start a new IFilter using the supplied RawDocument.
Also declared in: interface IFilter, class AbstractMarkupFilter
Parameters:
input - input to the IFilter (can be a CharSequence, URI or InputStream)
generateSkeleton - true if the IFilter should store non-translatable blocks (aka skeleton), false otherwise.

public void close()
Description copied from class AbstractMarkupFilter: Close the filter and all used resources.
Also declared in: interface AutoCloseable, interface IFilter, class AbstractMarkupFilter

protected void startFilter()
Initialize rule state and parser.
Also declared in: class AbstractMarkupFilter

protected void endFilter()
End the current filter processing and send the Ending Event.
Also declared in: class AbstractMarkupFilter

protected void preProcess(net.htmlparser.jericho.Segment segment)
Description copied from class AbstractMarkupFilter: Do any handling needed before the current Segment is processed.
Also declared in: class AbstractMarkupFilter

protected TaggedFilterConfiguration.RULE_TYPE updateEndTagRuleState(net.htmlparser.jericho.EndTag endTag)
Also declared in: class AbstractMarkupFilter

protected PropertyTextUnitPlaceholder createPropertyTextUnitPlaceholder(PropertyTextUnitPlaceholder.PlaceholderAccessType type, String name, String value, net.htmlparser.jericho.Tag tag, net.htmlparser.jericho.Attribute attribute)
Also declared in: class AbstractMarkupFilter
Parameters:
type - PropertyTextUnitPlaceholder.PlaceholderAccessType, one of TRANSLATABLE, READ_ONLY_PROPERTY, WRITABLE_PROPERTY
name - attribute name
value - attribute value
tag - Jericho Tag which contains the attribute
attribute - attribute as a Jericho Attribute
Returns:
PropertyTextUnitPlaceholder representing the attribute

protected String normalizeAttributeName(String attrName, String attrValue, net.htmlparser.jericho.Tag tag)
Description copied from class AbstractMarkupFilter: Some attribute names are converted to Okapi standards, such as HTML charset to "encoding" and lang to "language".
Also declared in: class AbstractMarkupFilter
Parameters:
attrName - the attribute name
attrValue - the attribute value
tag - the Jericho Tag that contains the attribute

protected TaggedFilterConfiguration getConfig()
Description copied from class AbstractMarkupFilter: Get the current TaggedFilterConfiguration. A TaggedFilterConfiguration is the result of reading in a YAML configuration file and converting it into Java Objects.
Also declared in: class AbstractMarkupFilter
Returns:
TaggedFilterConfiguration

public void setParameters(IParameters params)
Description copied from interface IFilter: Sets new parameters for this filter.
Parameters:
params - The new parameters to use.

public Parameters getParameters()
Description copied from interface IFilter: Gets the current parameters for this filter.

public void setParametersFromURL(URL config)
Initialize filter parameters from a URL.
Parameters:
config

public void setParametersFromFile(File config)
Initialize filter parameters from a Java File.
Parameters:
config

public void setParametersFromString(String config)
Initialize filter parameters from a String.
Parameters:
config

protected TextFragment.TagType determineTagType(net.htmlparser.jericho.Tag tag)
Description copied from class AbstractMarkupFilter: Filter-specific method for determining the TextFragment.TagType.
Also declared in: class AbstractMarkupFilter
Parameters:
tag - Jericho Tag start or end tag
Returns:
TextFragment.TagType

Copyright © 2021. All rights reserved.