public class TextFragment extends Object implements Appendable, CharSequence, Comparable<TextFragment>
The model uses two objects to store the data: a coded text string and a list of Code objects.
The coded text string is composed of normal characters and markers.
A marker is a sequence of two special characters (in the Unicode PUA)
that indicate the type of underlying code (opening, closing, isolated), and an index
pointing to its corresponding Code object where more information can be found.
The value of the index is encoded as a Unicode PUA character. You can use the
toChar(int) and toIndex(char) methods to encode and decode
the index value.
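
The marker layout described above can be illustrated with a small standalone sketch. It does not use the library itself: the class name `MarkerIndexSketch`, the helper `marker`, and in particular the numeric values of the constants are assumptions made for illustration (this page does not list them), so treat it as a model of the idea rather than the actual implementation.

```java
// Standalone sketch of the two-character marker scheme.
// ASSUMPTION: the constant values below are illustrative; this page does not state them.
final class MarkerIndexSketch {
    static final char MARKER_OPENING = '\uE101';   // assumed PUA value
    static final char MARKER_CLOSING = '\uE102';   // assumed PUA value
    static final char MARKER_ISOLATED = '\uE103';  // assumed PUA value
    static final int CHARBASE = 0xE110;            // assumed base for encoded indices

    // Encodes a code index as a single PUA character.
    static char toChar(int index) {
        return (char) (index + CHARBASE);
    }

    // Decodes the index stored in the second character of a marker.
    static int toIndex(char index) {
        return index - CHARBASE;
    }

    // True when the character is one of the three marker-type characters.
    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    // Hypothetical helper: builds the two-character marker for a code.
    static String marker(char markerType, int index) {
        return new String(new char[] { markerType, toChar(index) });
    }

    public static void main(String[] args) {
        // "Hello <b>world</b>" as coded text: each tag becomes two characters.
        String coded = "Hello " + marker(MARKER_OPENING, 0) + "world" + marker(MARKER_CLOSING, 1);
        System.out.println(coded.length());            // 6 + 2 + 5 + 2 characters
        System.out.println(toIndex(coded.charAt(7)));  // index of the opening code
    }
}
```

The first character of each marker tells you the kind of code, and the second one, once decoded, is the position of the corresponding Code object in the list of codes.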
To get the coded text of a TextFragment object use getCodedText(), and
to get its list of codes use getCodes().
You can directly modify the coded text or the codes and re-apply them to the
TextFragment object using setCodedText(String) and
setCodedText(String, List).
A code can be added to the coded text by appending it with
append(TagType, String, String) or by converting a section of existing text with
changeToCode(int, int, TagType, String).

| Modifier and Type | Class and Description |
|---|---|
| static class | TextFragment.Marker<br>List of the marker types as an Enum. |
| static class | TextFragment.TagType<br>List of the types of tag usable for in-line codes. |

| Modifier and Type | Field and Description |
|---|---|
| static int | CHARBASE<br>Special value used as the base of inline code indices. |
| protected List<Code> | codes<br>List of the inline codes for this fragment. |
| protected boolean | isBalanced<br>Flag indicating if the opening/closing inline codes of this fragment have been balanced or not. |
| protected int | lastCodeID<br>Value of the last inline code ID in this fragment. |
| static int | MARKER_CLOSING<br>Special character marker for a closing inline code. |
| static int | MARKER_ISOLATED<br>Special character marker for an isolated inline code. |
| static int | MARKER_OPENING<br>Special character marker for an opening inline code. |
| static Pattern | MARKERS_REGEX |
| static String | REFMARKER_END<br>Marker for end of reference. |
| static String | REFMARKER_SEP<br>Marker for reference separator. |
| static String | REFMARKER_START<br>Marker for start of reference. |
| protected StringBuilder | text<br>Coded text buffer of this fragment. |

| Constructor and Description |
|---|
| TextFragment()<br>Creates an empty TextFragment. |
| TextFragment(String text)<br>Creates a TextFragment with a given text. |
| TextFragment(String text, int lastCodeId)<br>Creates a TextFragment with a given text and an initial id value for codes. |
| TextFragment(String newCodedText, List<Code> newCodes)<br>Creates a TextFragment with the content made of a given coded text and a list of codes. |
| TextFragment(TextFragment fragment)<br>Creates a TextFragment with the content of a given TextFragment. |

| Modifier and Type | Method and Description |
|---|---|
| void | alignCodeIds(TextFragment base)<br>Aligns the code IDs of this fragment with the ones of a given fragment. |
| int | annotate(int start, int end, String type, InlineAnnotation annotation)<br>Annotates a section of this text. |
| TextFragment | append(char value)<br>Appends a character to the fragment. |
| TextFragment | append(CharSequence csq)<br>Appends the specified character sequence to this fragment. |
| TextFragment | append(CharSequence csq, int start, int end)<br>Appends a subsequence of the specified character sequence to this fragment. |
| TextFragment | append(Code code)<br>Appends an existing code to this fragment. |
| TextFragment | append(String text)<br>Appends a string to the fragment. |
| Code | append(TextFragment.TagType tagType, String type, InlineAnnotation annotation)<br>Appends an annotation-type code to this text. |
| Code | append(TextFragment.TagType tagType, String type, String data)<br>Appends a new code to the text. |
| Code | append(TextFragment.TagType tagType, String type, String data, int id)<br>Appends a new code to the text, when the code has a defined identifier. |
| TextFragment | append(TextFragment fragment)<br>Appends a TextFragment object to this fragment. |
| TextFragment | append(TextFragment fragment, boolean keepCodeIds) |
| void | balanceMarkers()<br>Balances the markers based on the tag type of the codes. |
| int | changeToCode(int start, int end, TextFragment.TagType tagType, String type)<br>Changes a section of the coded text into a single code. |
| int | changeToCode(int start, int end, TextFragment.TagType tagType, String type, boolean setDisplayText)<br>Changes a section of the coded text into a single code. |
| char | charAt(int index)<br>Returns the character at the specified index in the coded text of this fragment. |
| TextFragment | cleanCodes()<br>Removes all codes, both the entries in the codes list and the markers. |
| void | cleanUnusedCodes()<br>Removes all codes that have no data and no annotation. |
| void | clear()<br>Clears the fragment of all content. |
| TextFragment | clone()<br>Clones this TextFragment. |
| void | collapseWhitespace()<br>Collapses all whitespace to a single space character. |
| int | compareTo(TextFragment tf)<br>Compares an object with this TextFragment. |
| int | compareTo(TextFragment frag, boolean codeSensitive)<br>Compares a TextFragment with this one. |
| boolean | equals(Object object) |
| Code | findCode(int id)<br>Finds a code with a given ID in this fragment, or null if there is no such code. |
| static int | fromFragmentToString(TextFragment frag, int pos)<br>Gets the position in the string representation of a fragment of a given position in that fragment. |
| List<AnnotatedSpan> | getAnnotatedSpans(String type)<br>Gets the list of all spans of text annotated with a given type of annotation. |
| List<Code> | getClonedCodes()<br>Gets a list of copies of the codes for this fragment. |
| Code | getCode(char indexAsChar)<br>Gets the code for a given index formatted as character (the second special character in a marker in a coded text string). |
| Code | getCode(int index)<br>Gets the code for a given index. |
| String | getCodedText()<br>Gets the coded text representation of the fragment. |
| String | getCodedText(int start, int end)<br>Gets the portion of coded text for a given section of the coded text. |
| int | getCodePosition(int index) |
| List<Code> | getCodes()<br>Gets the list of all codes for the fragment. |
| List<Code> | getCodes(int start, int end)<br>Gets a copy of the list of the codes that are within a given section of coded text. |
| int | getIndex(int id)<br>Gets the index value for the first in-line code (in the codes list) with a given identifier. |
| int | getIndexForClosing(int id)<br>Gets the index value for the closing in-line code (in the codes list) with a given identifier. |
| Code | getLastCode()<br>Returns the last code appended to this fragment, or null if there are no codes. |
| int | getLastCodeId()<br>Gets the last value used for code id. |
| static Object[] | getRefMarker(StringBuilder text)<br>Helper method to retrieve a reference marker from a string. |
| String | getText()<br>Gets the text of the fragment (all codes are removed). |
| static String | getText(String codedText)<br>Helper method that takes a coded string and returns a text-only version. |
| boolean | hasAnnotation()<br>Indicates if this text has at least one annotation. |
| boolean | hasAnnotation(String type)<br>Indicates if this text has at least one annotation of a given type. |
| boolean | hasCode()<br>Indicates if the fragment contains at least one code. |
| int | hashCode() |
| boolean | hasReference()<br>Indicates if this TextFragment contains any in-line code with a reference. |
| boolean | hasText()<br>Indicates if this fragment contains at least one character other than a whitespace. |
| boolean | hasText(boolean whiteSpacesAreText)<br>Indicates if this fragment contains at least one character (inline codes, segment markers, and annotation markers do not count as characters). |
| static int | indexOfFirstNonWhitespace(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)<br>Helper method to find the first non-whitespace character of a coded text, starting at a given position and no farther than another given position. |
| static int | indexOfLastNonWhitespace(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)<br>Helper method to find, from the back, the first non-whitespace character of a coded text, starting at a given position and no farther than another given position. |
| void | insert(int offset, Code code)<br>Inserts a Code object to this fragment. |
| void | insert(int offset, String str)<br>Inserts a String object to this fragment. |
| void | insert(int offset, TextFragment fragment)<br>Inserts a TextFragment object to this fragment. |
| void | insert(int offset, TextFragment fragment, boolean keepCodeIds)<br>Inserts a TextFragment object to this fragment. |
| void | invalidate()<br>Sets the fragment in a state where it has to be re-balanced before being used for output. |
| boolean | isEmpty()<br>Indicates if the fragment is empty (no text and no codes). |
| static boolean | isMarker(char ch)<br>Helper method that checks if a given character is an inline code marker. |
| int | length()<br>Returns the number of characters in the coded text of this fragment. |
| void | ltrim()<br>Removes leading whitespace from this fragment. |
| static String | makeRefMarker(String id)<br>Helper method to build a reference marker string from a given identifier. |
| static String | makeRefMarker(String id, String propertyName)<br>Helper method to build a reference marker string from a given identifier and a property name. |
| void | remove(int start, int end)<br>Removes a section of the fragment (including its codes). |
| void | removeAnnotations()<br>Removes all annotations in this text. |
| void | removeAnnotations(String type)<br>Removes all annotations of a given type in this text. |
| void | removeCode(Code code)<br>Removes the Code from this TextFragment. |
| int | renumberCodes()<br>Renumbers the IDs of the codes in the fragment. |
| int | renumberCodes(int idBase)<br>Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base. |
| int | renumberCodes(int idBase, boolean mindPosition)<br>Re-assigns IDs of the codes in this fragment to be in a sequential order starting from a given base. |
| void | rtrim()<br>Removes trailing whitespace from this fragment. |
| void | setCodedText(String newCodedText)<br>Sets the coded text of the fragment, using its existing codes. |
| void | setCodedText(String newCodedText, boolean allowCodeDeletion)<br>Sets the coded text of the fragment, using its existing codes. |
| void | setCodedText(String newCodedText, List<Code> newCodes)<br>Sets the coded text of the fragment and its corresponding codes. |
| void | setCodedText(String newCodedText, List<Code> newCodes, boolean allowCodeDeletion)<br>Sets the coded text of the fragment and its corresponding codes. |
| protected void | setCodes(List<Code> codes) |
| TextFragment | subSequence(int start, int end)<br>Gets a copy of a sub-sequence of this object. |
| static char | toChar(int index)<br>Helper method to convert a marker index to its character value in the coded text string. |
| static int | toIndex(char index)<br>Helper method to convert the index-coded-as-character part of a marker into its index value. |
| String | toString()<br>Gets the coded text for this fragment. |
| String | toText()<br>Returns the content of this fragment, including the original codes whenever possible. |
| void | trim()<br>Trims whitespace from the beginning and the end of this fragment. |
| static void | unwrap(TextFragment frag)<br>Unwraps the content of a TextFragment. |

Methods inherited from class Object: finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface CharSequence: chars, codePoints
public static final int MARKER_OPENING
public static final int MARKER_CLOSING
public static final int MARKER_ISOLATED
public static final int CHARBASE
public static final String REFMARKER_START
public static final String REFMARKER_END
public static final String REFMARKER_SEP
public static final Pattern MARKERS_REGEX
protected StringBuilder text
protected boolean isBalanced
protected int lastCodeID
public TextFragment()
public TextFragment(String text)
text - the text to use.
public TextFragment(String text, int lastCodeId)
text - the text to use.
lastCodeId - the value from which code IDs start. The first new code gets this value + 1 as its ID. The value should be -1 or a positive number; values below -1 are automatically reset to -1.
public TextFragment(TextFragment fragment)
fragment - the content to use.
public static char toChar(int index)
index - the index value to encode.
public static int toIndex(char index)
index - the character to decode.
public static String makeRefMarker(String id)
id - the identifier to use.
public static String makeRefMarker(String id, String propertyName)
id - the identifier to use.
propertyName - the name of the property to use.
public static Object[] getRefMarker(StringBuilder text)
text - the text to search for a reference marker.
public static int fromFragmentToString(TextFragment frag, int pos)
For example if you find a match in a coded text string, use this method to convert the boundaries of the match into character position in the string representing the fragment (4 in "xxyyMATCHyyxx" -> 6 in "{b}{i}MATCH{/i}{/b}")
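
The position conversion in that example can be reproduced with a standalone sketch. It models the coded text as a plain string and the codes as a list of their raw data; the class `FragmentPositionSketch`, its helpers, and the constant values are assumptions for illustration only, not the library's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Standalone model of mapping a coded-text position to a position in the expanded string.
// ASSUMPTION: constant values and helper names are illustrative, not taken from this page.
final class FragmentPositionSketch {
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    // Hypothetical helper: two-character placeholder for the code at the given index.
    static String marker(char type, int index) {
        return new String(new char[] { type, (char) (CHARBASE + index) });
    }

    // Walks the coded text up to pos; each placeholder counts for the length of its code data.
    // Assumes pos does not point inside a placeholder and the coded text is well formed.
    static int toExpandedPosition(String codedText, List<String> codeData, int pos) {
        int expanded = 0;
        int i = 0;
        while (i < pos) {
            if (isMarker(codedText.charAt(i))) {
                int index = codedText.charAt(i + 1) - CHARBASE;
                expanded += codeData.get(index).length();
                i += 2; // skip marker type + encoded index
            } else {
                expanded++;
                i++;
            }
        }
        return expanded;
    }

    // Coded form of "{b}{i}MATCH{/i}{/b}".
    static String sampleCodedText() {
        return marker(MARKER_OPENING, 0) + marker(MARKER_OPENING, 1) + "MATCH"
                + marker(MARKER_CLOSING, 2) + marker(MARKER_CLOSING, 3);
    }

    static List<String> sampleCodeData() {
        return Arrays.asList("{b}", "{i}", "{/i}", "{/b}");
    }

    public static void main(String[] args) {
        // Position 4 in the coded text is the 'M' of MATCH.
        System.out.println(toExpandedPosition(sampleCodedText(), sampleCodeData(), 4)); // prints 6
    }
}
```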
frag - the fragment where the position is located.
pos - the position.
public static int indexOfLastNonWhitespace(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)
codedText - the coded text to process.
fromIndex - the first position to check (must be greater than or equal to untilIndex). Use -1 to point to the last position of the text.
untilIndex - the last position to check (must be less than or equal to fromIndex).
openingMarkerIsWS - indicates if opening markers count as whitespace.
closingMarkerIsWS - indicates if closing markers count as whitespace.
isolatedMarkerIsWS - indicates if isolated markers count as whitespace.
whitespaceIsWS - indicates if whitespace characters count as whitespace.
public static int indexOfFirstNonWhitespace(String codedText, int fromIndex, int untilIndex, boolean openingMarkerIsWS, boolean closingMarkerIsWS, boolean isolatedMarkerIsWS, boolean whitespaceIsWS)
codedText - the coded text to process.
fromIndex - the first position to check (must be less than or equal to untilIndex).
untilIndex - the last position to check (must be greater than or equal to fromIndex). Use -1 to point to the last position of the text.
openingMarkerIsWS - indicates if opening markers count as whitespace.
closingMarkerIsWS - indicates if closing markers count as whitespace.
isolatedMarkerIsWS - indicates if isolated markers count as whitespace.
whitespaceIsWS - indicates if whitespace characters count as whitespace.
public static void unwrap(TextFragment frag)
frag - the text fragment to unwrap.
public static boolean isMarker(char ch)
ch - the character to check.
public TextFragment clone()
public boolean hasReference()
public TextFragment append(String text)
text - the string to append.
public TextFragment append(TextFragment fragment)
fragment - the TextFragment to append.
public TextFragment append(TextFragment fragment, boolean keepCodeIds)
public TextFragment append(Code code)
code - the existing code to append.
public Code append(TextFragment.TagType tagType, String type, InlineAnnotation annotation)
tagType - the tag type of the code (e.g. TagType.OPENING).
type - the type of the annotation (e.g. "protected").
annotation - the annotation to add (can be null).
public Code append(TextFragment.TagType tagType, String type, String data)
tagType - the tag type of the code (e.g. TagType.OPENING).
type - the type of the code (e.g. "bold").
data - the raw code itself (e.g. "<b>").
public Code append(TextFragment.TagType tagType, String type, String data, int id)
tagType - the tag type of the code (e.g. TagType.OPENING).
type - the type of the code (e.g. "bold").
data - the raw code itself (e.g. "<b>").
id - the identifier to use for this code.
public void insert(int offset, String str)
Inserts a String object to this fragment.
offset - position in the coded text where to insert the new String. You can use -1 to append at the end of the current content.
str - String to insert.
InvalidPositionException - when offset points inside a marker.
public void insert(int offset, Code code)
Inserts a Code object to this fragment.
offset - position in the coded text where to insert the new Code. You can use -1 to append at the end of the current content.
code - Code to insert.
InvalidPositionException - when offset points inside a marker.
public void insert(int offset, TextFragment fragment)
offset - position in the coded text where to insert the new fragment. You can use -1 to append at the end of the current content.
fragment - the TextFragment to insert.
InvalidPositionException - when offset points inside a marker.
public void insert(int offset, TextFragment fragment, boolean keepCodeIds)
offset - position in the coded text where to insert the new fragment. You can use -1 to append at the end of the current content.
fragment - the TextFragment to insert.
keepCodeIds - true to keep the IDs of the codes of the inserted TextFragment unchanged.
public void clear()
public void trim()
public void ltrim()
public void rtrim()
public void collapseWhitespace()
public String getText()
public static String getText(String codedText)
codedText - string with possible TextFragment codes.
public String getCodedText()
public String getCodedText(int start, int end)
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation). You can use -1 for ending the section at the end of the fragment.
InvalidPositionException - when start or end points inside a marker.
public Code getCode(char indexAsChar)
indexAsChar - the index value coded as character.
public Code getCode(int index)
index - the index of the code.
public List<Code> getCodes()
public List<Code> getClonedCodes()
public List<Code> getCodes(int start, int end)
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation).
InvalidPositionException - when start or end points inside a marker.
public int getIndex(int id)
id - the identifier to look for.
public int getIndexForClosing(int id)
id - the identifier of the closing tag to look for.
public boolean isEmpty()
public boolean hasText()
public boolean hasText(boolean whiteSpacesAreText)
whiteSpacesAreText - indicates if whitespaces should be considered characters or not for the purpose of checking if this fragment is empty.
public boolean hasCode()
public void remove(int start, int end)
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation). You can use -1 to indicate the end of the fragment.
InvalidPositionException - when start or end points inside a marker.
public TextFragment subSequence(int start, int end)
Specified by: subSequence in interface CharSequence
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation). You can use -1 for ending the section at the end of the fragment.
public void setCodedText(String newCodedText)
newCodedText - the coded text to apply.
InvalidContentException - when the coded text is not valid, or does not correspond to the existing codes.
public void setCodedText(String newCodedText, boolean allowCodeDeletion)
newCodedText - the coded text to apply.
allowCodeDeletion - true when missing in-line codes in the coded text means the corresponding codes should be deleted from the fragment.
InvalidContentException - when the coded text is not valid, or does not correspond to the existing codes.
public void setCodedText(String newCodedText, List<Code> newCodes)
newCodedText - the coded text to apply.
newCodes - the list of the corresponding codes.
InvalidContentException - when the coded text is not valid or does not correspond to the new codes.
public void setCodedText(String newCodedText, List<Code> newCodes, boolean allowCodeDeletion)
newCodedText - the coded text to apply.
newCodes - the list of the corresponding codes.
allowCodeDeletion - true when missing in-line codes in the coded text means the corresponding codes should be deleted from the fragment.
InvalidContentException - when the coded text is not valid or does not correspond to the new codes.
public String toString()
Gets the coded text for this fragment; see getCodedText().
Each code is represented by a placeholder made of two special characters.
To get the content with the codes expanded as their original data use toText().
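
The difference between the coded text, the text without codes, and the content with codes expanded can be seen in a standalone sketch. It only models the idea with plain strings: the class `ExpandCodedTextSketch`, its methods `expand` and `stripCodes`, and the constant values are assumptions for illustration, not the library's code.

```java
import java.util.Arrays;
import java.util.List;

// Standalone model of three views of one fragment: coded text, plain text, expanded text.
// ASSUMPTION: constant values and method names are illustrative, not taken from this page.
final class ExpandCodedTextSketch {
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    static String marker(char type, int index) {
        return new String(new char[] { type, (char) (CHARBASE + index) });
    }

    // Replaces each two-character placeholder with the raw data of its code.
    static String expand(String codedText, List<String> codeData) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < codedText.length(); i++) {
            char ch = codedText.charAt(i);
            if (isMarker(ch) && i + 1 < codedText.length()) {
                i++; // move to the encoded index character
                out.append(codeData.get(codedText.charAt(i) - CHARBASE));
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    // Drops every placeholder, keeping only the normal characters.
    static String stripCodes(String codedText) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < codedText.length(); i++) {
            char ch = codedText.charAt(i);
            if (isMarker(ch) && i + 1 < codedText.length()) {
                i++; // skip the encoded index character as well
            } else {
                out.append(ch);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("<b>", "</b>");
        String coded = "Click " + marker(MARKER_OPENING, 0) + "here" + marker(MARKER_CLOSING, 1);
        System.out.println(expand(coded, data));   // prints Click <b>here</b>
        System.out.println(stripCodes(coded));     // prints Click here
    }
}
```

In this model, `expand` plays the role that toText() has for a fragment, and `stripCodes` corresponds to the text-only view returned by getText().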
Specified by: toString in interface CharSequence
Overrides: toString in class Object
public String toText()
Returns the content of this fragment, including the original codes whenever possible. To get the coded text instead, use getCodedText() or toString().
public int compareTo(TextFragment tf)
Equivalent to compareTo(fragment, false). (Note that inline codes are not compared with this method.)
Specified by: compareTo in interface Comparable<TextFragment>
tf - the object to compare with this TextFragment.
public int compareTo(TextFragment frag, boolean codeSensitive)
frag - the TextFragment to compare with this one.
codeSensitive - true if the codes need to be compared as well.
public int changeToCode(int start, int end, TextFragment.TagType tagType, String type)
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation).
tagType - the tag type of the new code.
type - the type of the new code.
InvalidPositionException - when start or end points inside a marker.
public int changeToCode(int start, int end, TextFragment.TagType tagType, String type, boolean setDisplayText)
start - the position of the first character or marker of the section (in the coded text representation).
end - the position just after the last character or marker of the section (in the coded text representation).
tagType - the tag type of the new code.
type - the type of the new code.
setDisplayText - if true, sets the subsequence (sub) as the displayText of the code.
InvalidPositionException - when start or end points inside a marker.
public int annotate(int start, int end, String type, InlineAnnotation annotation)
start - the position of the first character or marker of the section to annotate (in the coded text representation).
end - the position just after the last character or marker of the section to annotate (in the coded text representation).
type - the type of annotation to set.
annotation - the annotation to set (can be null).
InvalidPositionException - when start or end points inside a marker.
public void removeAnnotations()
public void removeAnnotations(String type)
type - the type of annotation to remove.
public boolean hasAnnotation()
public boolean hasAnnotation(String type)
type - the type of annotation to look for.
public void cleanUnusedCodes()
public TextFragment cleanCodes()
Returns: TextFragment, with the codes removed
public int getCodePosition(int index)
public List<AnnotatedSpan> getAnnotatedSpans(String type)
type - the type of annotation to look for.
public int renumberCodes()
public int renumberCodes(int idBase)
idBase - the base from which code IDs start numbering.
public int renumberCodes(int idBase, boolean mindPosition)
idBase - the base from which code IDs start numbering.
mindPosition - if true, the codes with lesser positions in this text fragment will have lesser IDs. If false, the codes with lesser original IDs will be assigned lesser IDs.
public void removeCode(Code code)
Removes the Code from this TextFragment.
code - the Code to remove.
public void balanceMarkers()
invalidate() prior to calling this method.
public void alignCodeIds(TextFragment base)
For example, if the source is "%d equals %s" and the target is "%s equals %d", where %s and %d are codes, you want the IDs to match for the codes with the same content.
base - the fragment to use as the base for the synchronization.
public TextFragment append(char value)
Specified by: append in interface Appendable
value - the character to append.
public TextFragment append(CharSequence csq)
Specified by: append in interface Appendable
csq - the character sequence to append. If the parameter is null, the string "null" is appended.
public TextFragment append(CharSequence csq, int start, int end)
Specified by: append in interface Appendable
csq - the character sequence to append. If csq is null, then characters will be appended as if csq contained the string "null".
start - the index of the first character in the subsequence.
end - the index of the character following the last character in the subsequence.
public char charAt(int index)
For example: If the fragment is "A[xy]B" and "[xy]" is a code, charAt(3) returns 'B' not 'x'.
If the specified index falls on a code placeholder, the character returned is either a marker
(first character of the placeholder) or a special index to access the underlying code (second
character of the placeholder). Markers can be identified using isMarker(char).
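
The "A[xy]B" case can be modelled with a plain string to see what each position holds. This is a standalone sketch, not the library's code: the class `CodedTextIndexSketch`, the method `classify`, and the constant values are assumptions for illustration.

```java
// Standalone model of indexing into coded text: one code occupies two positions.
// ASSUMPTION: constant values and names are illustrative, not taken from this page.
final class CodedTextIndexSketch {
    static final char MARKER_OPENING = '\uE101';
    static final char MARKER_CLOSING = '\uE102';
    static final char MARKER_ISOLATED = '\uE103';
    static final int CHARBASE = 0xE110;

    static boolean isMarker(char ch) {
        return ch == MARKER_OPENING || ch == MARKER_CLOSING || ch == MARKER_ISOLATED;
    }

    // Coded form of "A[xy]B", where "[xy]" is an isolated code stored at index 0.
    static String sample() {
        return "A" + MARKER_ISOLATED + (char) (CHARBASE + 0) + "B";
    }

    // Labels each position: T = normal text, M = marker type, I = encoded code index.
    static String classify(String codedText) {
        StringBuilder kinds = new StringBuilder();
        for (int i = 0; i < codedText.length(); i++) {
            if (isMarker(codedText.charAt(i)) && i + 1 < codedText.length()) {
                kinds.append("MI");
                i++; // the next character is the encoded index, not text
            } else {
                kinds.append('T');
            }
        }
        return kinds.toString();
    }

    public static void main(String[] args) {
        String coded = sample();
        System.out.println(coded.length());     // prints 4
        System.out.println(coded.charAt(3));    // prints B
        System.out.println(classify(coded));    // prints TMIT
    }
}
```

A loop that reads the coded text character by character should therefore test each character for a marker and skip the following character when it finds one.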
Specified by: charAt in interface CharSequence
index - the index of the character to be returned.
IndexOutOfBoundsException - if the index argument is negative or not less than the length of the coded text.
See also: isMarker(char)
public int length()
This is not the length of the content with all its codes. In the coded text, each code is represented by a placeholder made of two characters regardless of the size of the code. For example: If the fragment is "A[xy]B" and "[xy]" is a code, length() returns 4, not 6.
To get the length of the content including codes use toText().length().
Note that codes with references are not expanded by toText().
Specified by: length in interface CharSequence
public void invalidate()
public int getLastCodeId()
public Code getLastCode()
public Code findCode(int id)
id - the ID to look for.
Copyright © 2021. All rights reserved.