- java.lang.Object
-
- com.lowagie.text.pdf.parser.MarkedUpTextAssembler
-
- All Implemented Interfaces:
TextAssembler
public class MarkedUpTextAssembler extends Object implements TextAssembler
We'll get called on a variety of marked section content (perhaps including the results of nested sections), and will assemble it into order as best we can.
- Author:
- dgd
-
Method Summary
Modifier and Type / Method / Description
- FinalText endParsingContext(String containingElementName)
- protected PdfReader getReader(): Getter.
- String getWordId(): assembler can calculate an identifier for each word on a page, for use in markup.
- void process(FinalText completed, String contextName): Slot fully-assembled chunk into our result at the current location.
- void process(ParsedText unassembled, String contextName): Remember an unassembled chunk until we hit the end of this element, or we hit an assembled chunk, and need to pull things together.
- void process(Word completed, String contextName)
- void renderText(FinalText finalText)
- void renderText(ParsedTextImpl partialWord): Captures text using a simplified algorithm for inserting hard returns and spaces
- void reset()
- void setPage(int page)
-
Method Detail
-
process
public void process(ParsedText unassembled, String contextName)
Remember an unassembled chunk until we hit the end of this element, or we hit an assembled chunk, and need to pull things together.
- Specified by:
process in interface TextAssembler
- Parameters:
unassembled - chunk of text rendering instruction to contribute to final text
contextName - Name of the element context we are in. Null value if it's an Artifact.
-
process
public void process(FinalText completed, String contextName)
Slot fully-assembled chunk into our result at the current location. If there are unassembled chunks waiting, assemble them first.
- Specified by:
process in interface TextAssembler
- Parameters:
completed - This is a chunk from a nested element
contextName - Name of the element context we are in. Null value if it's an Artifact.
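The buffering behaviour described for the two process overloads above can be illustrated with a small self-contained sketch. It does not use the library's classes: BufferingAssemblerSketch, its method names, and the use of plain String in place of ParsedText and FinalText are all hypothetical, and skipping the wrapper element when the name is null is an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the buffering idea; not the library's implementation.
class BufferingAssemblerSketch {
    // Raw fragments remembered until something forces assembly.
    private final List<String> pendingFragments = new ArrayList<>();
    // Text assembled so far for the current element.
    private final StringBuilder result = new StringBuilder();

    // Analogue of process(ParsedText, ...): only remember the fragment.
    void processUnassembled(String fragment) {
        pendingFragments.add(fragment);
    }

    // Analogue of process(FinalText, ...): assemble waiting fragments first,
    // then slot the completed chunk in at the current location.
    void processCompleted(String completedText) {
        flushPending();
        result.append(completedText);
    }

    // Analogue of endParsingContext(...): assemble what is left and
    // surround the text with the element name (assumption: no wrapper if null).
    String endContext(String elementName) {
        flushPending();
        String body = result.toString();
        result.setLength(0);
        if (elementName == null) {
            return body;
        }
        return "<" + elementName + ">" + body + "</" + elementName + ">";
    }

    // Append all waiting fragments in arrival order and clear the buffer.
    private void flushPending() {
        for (String fragment : pendingFragments) {
            result.append(fragment);
        }
        pendingFragments.clear();
    }

    public static void main(String[] args) {
        BufferingAssemblerSketch assembler = new BufferingAssemblerSketch();
        assembler.processUnassembled("Hel");
        assembler.processUnassembled("lo ");
        assembler.processCompleted("<b>world</b>");
        assembler.processUnassembled("!");
        System.out.println(assembler.endContext("p"));
    }
}
```

In this sketch the fragments "Hel" and "lo " are held back until the completed chunk arrives, so the nested element lands after them rather than before.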
-
process
public void process(Word completed, String contextName)
- Specified by:
process in interface TextAssembler
- Parameters:
completed - process a complete chunk -- just add this subsection into the proper place.
contextName - Name of the element context we are in. Null value if it's an Artifact.
- See Also:
TextAssembler.process(Word, String)
-
endParsingContext
public FinalText endParsingContext(String containingElementName)
- Specified by:
endParsingContext in interface TextAssembler
- Parameters:
containingElementName - This is an element name to surround the extracted text
- Returns:
- the final text for the set of fragments and fully parsed items we were passed during processing.
- See Also:
TextAssembler.endParsingContext(String)
-
reset
public void reset()
- Specified by:
reset in interface TextAssembler
- See Also:
TextAssembler.reset()
-
renderText
public void renderText(FinalText finalText)
- Specified by:
renderText in interface TextAssembler
- Parameters:
finalText - process a complete chunk -- just add this subsection into the proper place.
-
renderText
public void renderText(ParsedTextImpl partialWord)
Captures text using a simplified algorithm for inserting hard returns and spaces.
- Specified by:
renderText in interface TextAssembler
- Parameters:
partialWord - process one of a number of raw pdf text chunks, with placement, font, etc.
- See Also:
GraphicsState, Matrix
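The page does not spell out the simplified algorithm, so the following is only a rough sketch of one common approach, under stated assumptions: a hard return is inserted when the baseline changes, and a space when the horizontal gap to the previous chunk exceeds half a space width. SpacingSketch, Chunk, and both thresholds are hypothetical and are not taken from the library.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of gap-based spacing; not the library's implementation.
class SpacingSketch {

    // Hypothetical stand-in for a raw text chunk with its placement.
    static final class Chunk {
        final String text;
        final float startX;    // x where the chunk begins
        final float endX;      // x where the chunk ends
        final float baselineY; // y of the text baseline

        Chunk(String text, float startX, float endX, float baselineY) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
            this.baselineY = baselineY;
        }
    }

    // Joins chunks in the order given, inserting '\n' or ' ' between them.
    static String assemble(List<Chunk> chunks, float spaceWidth) {
        StringBuilder out = new StringBuilder();
        Chunk previous = null;
        for (Chunk current : chunks) {
            if (previous != null) {
                if (Math.abs(current.baselineY - previous.baselineY) > 0.01f) {
                    out.append('\n'); // baseline moved: assume a new line
                } else if (current.startX - previous.endX > spaceWidth / 2f) {
                    out.append(' ');  // visible gap: assume a word break
                }
            }
            out.append(current.text);
            previous = current;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Chunk> chunks = Arrays.asList(
                new Chunk("Hel", 0f, 18f, 700f),
                new Chunk("lo", 18f, 30f, 700f),
                new Chunk("world", 34f, 64f, 700f),
                new Chunk("Next", 0f, 24f, 688f));
        System.out.println(assemble(chunks, 4f));
    }
}
```

Adjacent chunks such as "Hel" and "lo" touch, so they are joined into one word; the gap before "world" yields a space, and the lower baseline of "Next" yields a hard return.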
-
getReader
protected PdfReader getReader()
Getter.
- Returns:
- reader
-
setPage
public void setPage(int page)
- Specified by:
setPage in interface TextAssembler
- Parameters:
page - number of the page we are assembling
- See Also:
TextAssembler.setPage(int)
-
getWordId
public String getWordId()
The assembler can calculate an identifier for each word on a page, for use in markup.
- Specified by:
getWordId in interface TextAssembler
- Returns:
- the new unique id.
- See Also:
TextAssembler.getWordId()
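As a rough illustration of how a per-word identifier could be derived from the current page, here is a minimal sketch. The class WordIdSketch, the method nextWordId, the "word<page>-<counter>" format, and the choice to keep one running counter across pages are all assumptions for this example; the page above only states that each call returns a new unique id.

```java
// Hypothetical sketch of page-aware word ids; not the library's implementation.
class WordIdSketch {
    private int page;        // page currently being assembled
    private int counter = 1; // running counter, never reset in this sketch

    // Analogue of setPage(int): record which page we are on.
    void setPage(int page) {
        this.page = page;
    }

    // Analogue of getWordId(): every call hands out a fresh identifier.
    String nextWordId() {
        return "word" + page + "-" + counter++;
    }

    public static void main(String[] args) {
        WordIdSketch ids = new WordIdSketch();
        ids.setPage(3);
        System.out.println(ids.nextWordId());
        System.out.println(ids.nextWordId());
    }
}
```

Because the counter only ever increases, ids stay unique even if the same page number is set more than once.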
-
-