public class MultiWordChunker extends AbstractDisambiguator
| Modifier and Type | Field and Description |
|---|---|
| static String | tagForNotAddingTags |
| Constructor and Description |
|---|
| MultiWordChunker(String filename) |
| MultiWordChunker(String filename, boolean allowFirstCapitalized, boolean allowAllUppercase, boolean allowTitlecase) |
| MultiWordChunker(String filename, boolean allowFirstCapitalized, boolean allowAllUppercase, boolean allowTitlecase, String defaultTag) |
| Modifier and Type | Method and Description |
|---|---|
| AnalyzedSentence | disambiguate(AnalyzedSentence input): If possible, filters out the wrong POS tags. |
| AnalyzedSentence | disambiguate(AnalyzedSentence input, JLanguageTool.CheckCancelledCallback checkCanceled): Implements multiword POS tags, e.g., <ELLIPSIS> for ellipsis (...). |
| List<String> | getTokenLettercaseVariants(String originalToken, Map<String,AnalyzedToken> tokenMap) |
| void | setIgnoreSpelling(boolean ignoreSpelling) |
| void | setRemovePreviousTags(boolean removePreviousTags) |
Methods inherited from class AbstractDisambiguator: preDisambiguate

public static String tagForNotAddingTags

public MultiWordChunker(String filename)
filename - text file with multiwords and tags

public MultiWordChunker(String filename, boolean allowFirstCapitalized, boolean allowAllUppercase, boolean allowTitlecase)
filename - text file with multiwords and tags
allowFirstCapitalized - if set to true, the first word of the multiword can be capitalized
allowAllUppercase - if set to true, the all-uppercase version of the multiword is allowed
allowTitlecase - if set to true, titlecased variants of multi-token words are accepted

public List<String> getTokenLettercaseVariants(String originalToken, Map<String,AnalyzedToken> tokenMap)
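
The effect of the three letter-case flags can be illustrated with a small standalone sketch. It does not use LanguageTool and is not the implementation behind getTokenLettercaseVariants. The class LettercaseVariantsSketch and its variants method are hypothetical, and the sketch assumes that "titlecase" means capitalizing every space-separated word, which the documentation above does not confirm.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of LanguageTool: expands one multiword entry
// into the letter-case variants enabled by the three flags.
final class LettercaseVariantsSketch {

    static List<String> variants(String multiword,
                                 boolean allowFirstCapitalized,
                                 boolean allowAllUppercase,
                                 boolean allowTitlecase) {
        // LinkedHashSet keeps insertion order and drops duplicate variants.
        Set<String> result = new LinkedHashSet<>();
        result.add(multiword);
        if (allowFirstCapitalized) {
            // Only the first character of the whole entry is uppercased.
            result.add(capitalize(multiword));
        }
        if (allowAllUppercase) {
            result.add(multiword.toUpperCase(Locale.ROOT));
        }
        if (allowTitlecase) {
            // Assumption: titlecase capitalizes every space-separated word.
            List<String> titled = new ArrayList<>();
            for (String word : multiword.split(" ")) {
                titled.add(capitalize(word));
            }
            result.add(String.join(" ", titled));
        }
        return new ArrayList<>(result);
    }

    private static String capitalize(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return s.substring(0, 1).toUpperCase(Locale.ROOT) + s.substring(1);
    }

    public static void main(String[] args) {
        // Prints [rock and roll, Rock and roll, ROCK AND ROLL, Rock And Roll]
        System.out.println(variants("rock and roll", true, true, true));
    }
}
```

With all three flags set to false, only the entry exactly as written would be returned, which corresponds to the strictest matching mode in this sketch.
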

public AnalyzedSentence disambiguate(AnalyzedSentence input) throws IOException
Specified by: disambiguate in interface Disambiguator
input - The sentence with already tagged words. The words are expected to have multiple tags.
Throws: IOException

public final AnalyzedSentence disambiguate(AnalyzedSentence input, @Nullable JLanguageTool.CheckCancelledCallback checkCanceled) throws IOException
input - The tokens to be chunked.
Throws: IOException

public void setIgnoreSpelling(boolean ignoreSpelling)

public void setRemovePreviousTags(boolean removePreviousTags)
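
The idea of multiword POS tags such as <ELLIPSIS>, mentioned for the two-argument disambiguate method, can be sketched without LanguageTool. The following is an illustration under stated assumptions, not the class's actual algorithm: MultiwordChunkSketch, its chunk method, the maxLen parameter, and the PREP tag are all hypothetical. The sketch assumes that a matched multiword is marked by an opening tag on its first token and a closing tag on its last token, and that the longest match wins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of LanguageTool: marks multiword spans in a
// token list using an in-memory map from multiword to tag.
final class MultiwordChunkSketch {

    // Returns one tag list per token; tokens outside any multiword get an empty list.
    static List<List<String>> chunk(List<String> tokens,
                                    Map<String, String> multiwords,
                                    int maxLen) {
        List<List<String>> tags = new ArrayList<>();
        for (int k = 0; k < tokens.size(); k++) {
            tags.add(new ArrayList<>());
        }
        int pos = 0;
        while (pos < tokens.size()) {
            int matched = 0;
            // Try the longest candidate first so that longer multiwords win.
            for (int len = Math.min(maxLen, tokens.size() - pos); len >= 1; len--) {
                String candidate = String.join(" ", tokens.subList(pos, pos + len));
                String tag = multiwords.get(candidate);
                if (tag != null) {
                    // Assumption: opening tag on the first token, closing tag on the last.
                    tags.get(pos).add("<" + tag + ">");
                    tags.get(pos + len - 1).add("</" + tag + ">");
                    matched = len;
                    break;
                }
            }
            // Skip past the matched span, or advance by one token if nothing matched.
            pos += Math.max(matched, 1);
        }
        return tags;
    }

    public static void main(String[] args) {
        Map<String, String> multiwords = new HashMap<>();
        multiwords.put("in spite of", "PREP"); // hypothetical entry and tag
        List<String> tokens = Arrays.asList("in", "spite", "of", "that");
        // Prints [[<PREP>], [], [</PREP>], []]
        System.out.println(chunk(tokens, multiwords, 3));
    }
}
```

Marking only the first and last token leaves the inner tokens untouched, so any tags they already carry are unaffected in this sketch. Whether the real class keeps or removes earlier tags is controlled by setRemovePreviousTags, whose exact behavior is not described here.
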