Index
All Classes and Interfaces|All Packages|Constant Field Values
A
- addMetadata(ParagraphManager.Paragraph, ParagraphManager.Paragraph, Document) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
- addRegion(String, Rectangle2D) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Add a new region to group text by.
- ALL_PAGES - Static variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
B
- build() - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
Returns the immutable configuration.
- builder() - Static method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
-
Start building a new configuration.
C
- children() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the children record component.
- computeFontHeight(PDFont) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
D
- DEBUG - Static variable in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
- defaultConfig() - Static method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
-
Returns the default config.
- document - Variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- document - Variable in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
E
- endPageNumber() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the endPageNumber record component.
- equals(Object) - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Indicates whether some other object is "equal to" this one.
- extractRegions(PDPage) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Process the page to extract the region text.
F
- flatten() - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
- ForkPDFLayoutTextStripper - Class in org.springframework.ai.reader.pdf.layout
-
This class extends PDFTextStripper to provide custom text extraction and formatting capabilities for PDF pages.
- ForkPDFLayoutTextStripper() - Constructor for class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
-
Constructor
G
- generateParagraphs(ParagraphManager.Paragraph, PDOutlineNode, Integer) - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
-
For a given PDOutlineNode bookmark, convert all sibling PDOutlineItem items into ParagraphManager.Paragraph instances under the parent Paragraph.
- get() - Method in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- get() - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
-
Reads and processes the PDF document to extract paragraphs.
- getParagraphsByLevel(ParagraphManager.Paragraph, int, boolean) - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
- getRegions() - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Get the list of regions that have been set up.
- getTextBetweenParagraphs(ParagraphManager.Paragraph, ParagraphManager.Paragraph) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
- getTextForRegion(String) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Get the text for the region; this should be called after extractRegions().
H
- hashCode() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns a hash code value for this object.
L
- level() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the level record component.
M
- METADATA_END_PAGE_NUMBER - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- METADATA_FILE_NAME - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- METADATA_START_PAGE_NUMBER - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
O
- org.springframework.ai.reader.pdf - package org.springframework.ai.reader.pdf
- org.springframework.ai.reader.pdf.aot - package org.springframework.ai.reader.pdf.aot
- org.springframework.ai.reader.pdf.config - package org.springframework.ai.reader.pdf.config
- org.springframework.ai.reader.pdf.layout - package org.springframework.ai.reader.pdf.layout
- OUTPUT_SPACE_CHARACTER_WIDTH_IN_PT - Static variable in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
P
- pageBottomMargin - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
- pageExtractedTextFormatter - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
- PagePdfDocumentReader - Class in org.springframework.ai.reader.pdf
-
Groups the parsed PDF pages into Documents.
- PagePdfDocumentReader(String) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- PagePdfDocumentReader(String, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- PagePdfDocumentReader(Resource) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- PagePdfDocumentReader(Resource, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- pagesPerDocument - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
- pageTopMargin - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
- Paragraph(ParagraphManager.Paragraph, String, int, int, int, int) - Constructor for record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
- Paragraph(ParagraphManager.Paragraph, String, int, int, int, int, List<ParagraphManager.Paragraph>) - Constructor for record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Creates an instance of a Paragraph record class.
- ParagraphManager - Class in org.springframework.ai.reader.pdf.config
-
The ParagraphManager class is responsible for managing the paragraphs and hierarchy of a PDF document.
- ParagraphManager(PDDocument) - Constructor for class org.springframework.ai.reader.pdf.config.ParagraphManager
- ParagraphManager.Paragraph - Record Class in org.springframework.ai.reader.pdf.config
-
Represents a document paragraph metadata and hierarchy.
- ParagraphPdfDocumentReader - Class in org.springframework.ai.reader.pdf
-
Uses the PDF catalog (e.g.
- ParagraphPdfDocumentReader(String) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
-
Constructs a ParagraphPdfDocumentReader using a resource URL.
- ParagraphPdfDocumentReader(String, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
-
Constructs a ParagraphPdfDocumentReader using a resource URL and a configuration.
- ParagraphPdfDocumentReader(Resource) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
-
Constructs a ParagraphPdfDocumentReader using a resource.
- ParagraphPdfDocumentReader(Resource, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
-
Constructs a ParagraphPdfDocumentReader using a resource and a configuration.
- parent() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the parent record component.
- PdfDocumentReaderConfig - Class in org.springframework.ai.reader.pdf.config
-
Common configuration builder for the PagePdfDocumentReader and the ParagraphPdfDocumentReader.
- PdfDocumentReaderConfig.Builder - Class in org.springframework.ai.reader.pdf.config
- PDFLayoutTextStripperByArea - Class in org.springframework.ai.reader.pdf.layout
-
Re-implement the PDFLayoutTextStripperByArea on top of the PDFLayoutTextStripper instead of the original PDFTextStripper.
- PDFLayoutTextStripperByArea() - Constructor for class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Constructor.
- PdfReaderRuntimeHints - Class in org.springframework.ai.reader.pdf.aot
-
The PdfReaderRuntimeHints class is responsible for registering runtime hints for PDFBox resources.
- PdfReaderRuntimeHints() - Constructor for class org.springframework.ai.reader.pdf.aot.PdfReaderRuntimeHints
- position() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the position record component.
- processPage(PDPage) - Method in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
- processTextPosition(TextPosition) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
R
- registerHints(RuntimeHints, ClassLoader) - Method in class org.springframework.ai.reader.pdf.aot.PdfReaderRuntimeHints
- removeRegion(String) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
Delete a region to group text by.
- resourceFileName - Variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- resourceFileName - Variable in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
- reversedParagraphPosition - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
S
- setShouldSeparateByBeads(boolean) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
This method does nothing in this derived class, because beads and regions are incompatible.
- showGlyph(Matrix, PDFont, int, Vector) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
- startPageNumber() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the startPageNumber record component.
T
- title() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns the value of the title record component.
- toDocument(String, int, int) - Method in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
- toDocument(ParagraphManager.Paragraph, ParagraphManager.Paragraph) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
- toString() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
-
Returns a string representation of this record class.
W
- withPageBottomMargin(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
Configures the PDF reader page bottom margin.
- withPageExtractedTextFormatter(ExtractedTextFormatter) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
Formatter of the extracted text.
- withPagesPerDocument(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
How many pages to put in a single Document instance.
- withPageTopMargin(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
Configures the PDF reader page top margin.
- withReversedParagraphPosition(boolean) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
-
Configures the PDF reader reversed paragraph position.
- writePage() - Method in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
- writePage() - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
-
This will print the processed page text to the output stream.
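Several entries above (children(), endPageNumber(), level(), parent(), position(), startPageNumber(), title(), equals(Object), hashCode(), toString()) are the members Java generates for a record class. The sketch below is a self-contained stand-in, not the library's ParagraphManager.Paragraph: the class name ParagraphSketch, the helper sampleChapter(), the order of the four int components, and the empty-list default in the six-argument constructor are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ParagraphSketch {

    // Hypothetical stand-in for ParagraphManager.Paragraph.
    // Component names come from the index; their order is an assumption.
    record Paragraph(Paragraph parent, String title, int level,
                     int startPageNumber, int endPageNumber, int position,
                     List<Paragraph> children) {

        // Mirrors the six-argument constructor listed in the index.
        // Defaulting children to an empty list is an assumption.
        Paragraph(Paragraph parent, String title, int level,
                  int startPageNumber, int endPageNumber, int position) {
            this(parent, title, level, startPageNumber, endPageNumber, position,
                    new ArrayList<>());
        }
    }

    // Builds a small two-level hierarchy: a root and one chapter under it.
    static Paragraph sampleChapter() {
        Paragraph root = new Paragraph(null, "root", 0, 1, 10, 0);
        return new Paragraph(root, "Chapter 1", 1, 1, 4, 0);
    }

    public static void main(String[] args) {
        Paragraph chapter = sampleChapter();

        // Each record component gets an accessor method of the same name.
        System.out.println(chapter.title() + " spans pages "
                + chapter.startPageNumber() + "-" + chapter.endPageNumber());

        // The generated equals(Object) compares component by component.
        System.out.println(chapter.equals(sampleChapter()));
    }
}
```

Because the generated equals, hashCode, and toString walk every component, a hierarchy in which a parent's children list points back at nodes that reference that parent would recurse without end; the sketch avoids this by leaving the root's children list empty.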