Index

A B C D E F G H L M O P R S T W 
All Classes and Interfaces|All Packages|Constant Field Values

A

addMetadata(ParagraphManager.Paragraph, ParagraphManager.Paragraph, Document) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
 
addRegion(String, Rectangle2D) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Add a new region to group text by.
ALL_PAGES - Static variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 

B

build() - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
Returns the immutable configuration.
builder() - Static method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
Start building a new configuration.

C

children() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the children record component.
computeFontHeight(PDFont) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
 

D

DEBUG - Static variable in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
 
defaultConfig() - Static method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
Returns the default config.
document - Variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
document - Variable in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
 

E

endPageNumber() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the endPageNumber record component.
equals(Object) - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Indicates whether some other object is "equal to" this one.
extractRegions(PDPage) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Process the page to extract the region text.

F

flatten() - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
 
ForkPDFLayoutTextStripper - Class in org.springframework.ai.reader.pdf.layout
This class extends PDFTextStripper to provide custom text extraction and formatting capabilities for PDF pages.
ForkPDFLayoutTextStripper() - Constructor for class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
Constructor

G

generateParagraphs(ParagraphManager.Paragraph, PDOutlineNode, Integer) - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
For the given PDOutlineNode bookmark, converts all sibling PDOutlineItem items into ParagraphManager.Paragraph instances under the parentParagraph.
get() - Method in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
get() - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
Reads and processes the PDF document to extract paragraphs.
getParagraphsByLevel(ParagraphManager.Paragraph, int, boolean) - Method in class org.springframework.ai.reader.pdf.config.ParagraphManager
 
getRegions() - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Get the list of regions that have been set up.
getTextBetweenParagraphs(ParagraphManager.Paragraph, ParagraphManager.Paragraph) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
 
getTextForRegion(String) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Get the text for the region; this should be called after extractRegions().

H

hashCode() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns a hash code value for this object.

L

level() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the level record component.

M

METADATA_END_PAGE_NUMBER - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
METADATA_FILE_NAME - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
METADATA_START_PAGE_NUMBER - Static variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 

O

org.springframework.ai.reader.pdf - package org.springframework.ai.reader.pdf
 
org.springframework.ai.reader.pdf.aot - package org.springframework.ai.reader.pdf.aot
 
org.springframework.ai.reader.pdf.config - package org.springframework.ai.reader.pdf.config
 
org.springframework.ai.reader.pdf.layout - package org.springframework.ai.reader.pdf.layout
 
OUTPUT_SPACE_CHARACTER_WIDTH_IN_PT - Static variable in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
 

P

pageBottomMargin - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 
pageExtractedTextFormatter - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 
PagePdfDocumentReader - Class in org.springframework.ai.reader.pdf
Groups the parsed PDF pages into Documents.
PagePdfDocumentReader(String) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
PagePdfDocumentReader(String, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
PagePdfDocumentReader(Resource) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
PagePdfDocumentReader(Resource, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
pagesPerDocument - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 
pageTopMargin - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 
Paragraph(ParagraphManager.Paragraph, String, int, int, int, int) - Constructor for record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
 
Paragraph(ParagraphManager.Paragraph, String, int, int, int, int, List<ParagraphManager.Paragraph>) - Constructor for record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Creates an instance of a Paragraph record class.
ParagraphManager - Class in org.springframework.ai.reader.pdf.config
The ParagraphManager class is responsible for managing the paragraphs and hierarchy of a PDF document.
ParagraphManager(PDDocument) - Constructor for class org.springframework.ai.reader.pdf.config.ParagraphManager
 
ParagraphManager.Paragraph - Record Class in org.springframework.ai.reader.pdf.config
Represents a document paragraph's metadata and hierarchy.
ParagraphPdfDocumentReader - Class in org.springframework.ai.reader.pdf
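Uses the PDF catalog (e.g. TOC) information to split the input PDF into text paragraphs and output a single Document per paragraph.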
ParagraphPdfDocumentReader(String) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
Constructs a ParagraphPdfDocumentReader using a resource URL.
ParagraphPdfDocumentReader(String, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
Constructs a ParagraphPdfDocumentReader using a resource URL and a configuration.
ParagraphPdfDocumentReader(Resource) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
Constructs a ParagraphPdfDocumentReader using a resource.
ParagraphPdfDocumentReader(Resource, PdfDocumentReaderConfig) - Constructor for class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
Constructs a ParagraphPdfDocumentReader using a resource and a configuration.
parent() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the parent record component.
PdfDocumentReaderConfig - Class in org.springframework.ai.reader.pdf.config
Common configuration builder for the PagePdfDocumentReader and the ParagraphPdfDocumentReader.
PdfDocumentReaderConfig.Builder - Class in org.springframework.ai.reader.pdf.config
 
PDFLayoutTextStripperByArea - Class in org.springframework.ai.reader.pdf.layout
Re-implements the PDFLayoutTextStripperByArea on top of the PDFLayoutTextStripper instead of the original PDFTextStripper.
PDFLayoutTextStripperByArea() - Constructor for class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Constructor.
PdfReaderRuntimeHints - Class in org.springframework.ai.reader.pdf.aot
The PdfReaderRuntimeHints class is responsible for registering runtime hints for PDFBox resources.
PdfReaderRuntimeHints() - Constructor for class org.springframework.ai.reader.pdf.aot.PdfReaderRuntimeHints
 
position() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the position record component.
processPage(PDPage) - Method in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
 
processTextPosition(TextPosition) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea

R

registerHints(RuntimeHints, ClassLoader) - Method in class org.springframework.ai.reader.pdf.aot.PdfReaderRuntimeHints
 
removeRegion(String) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
Delete a region to group text by.
resourceFileName - Variable in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
resourceFileName - Variable in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
 
reversedParagraphPosition - Variable in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig
 

S

setShouldSeparateByBeads(boolean) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
This method does nothing in this derived class, because beads and regions are incompatible.
showGlyph(Matrix, PDFont, int, Vector) - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
 
startPageNumber() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the startPageNumber record component.

T

title() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns the value of the title record component.
toDocument(String, int, int) - Method in class org.springframework.ai.reader.pdf.PagePdfDocumentReader
 
toDocument(ParagraphManager.Paragraph, ParagraphManager.Paragraph) - Method in class org.springframework.ai.reader.pdf.ParagraphPdfDocumentReader
 
toString() - Method in record class org.springframework.ai.reader.pdf.config.ParagraphManager.Paragraph
Returns a string representation of this record class.
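
The ParagraphManager.Paragraph entries in this index (parent, title, level, startPageNumber, endPageNumber, position, children) describe a tree of paragraphs, and ParagraphManager lists a flatten() method. The following is a minimal, self-contained sketch of that shape. SketchParagraph and SketchParagraphs are hypothetical stand-ins, not the library classes; the component order is assumed from the seven-argument constructor signature, and the depth-first, pre-order traversal is only one plausible reading of flatten().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the indexed record; component order is an assumption.
record SketchParagraph(SketchParagraph parent, String title, int level,
        int startPageNumber, int endPageNumber, int position,
        List<SketchParagraph> children) {

    // Six-argument form: starts with an empty, mutable child list.
    SketchParagraph(SketchParagraph parent, String title, int level,
            int startPageNumber, int endPageNumber, int position) {
        this(parent, title, level, startPageNumber, endPageNumber, position, new ArrayList<>());
    }

    // The generated toString would walk parent and children and recurse
    // forever on this cyclic structure, so the sketch prints the title only.
    @Override
    public String toString() {
        return "SketchParagraph[" + title + "]";
    }
}

final class SketchParagraphs {

    // Assumed behaviour: depth-first, pre-order list of everything below the root.
    static List<SketchParagraph> flatten(SketchParagraph root) {
        List<SketchParagraph> out = new ArrayList<>();
        for (SketchParagraph child : root.children()) {
            collect(child, out);
        }
        return out;
    }

    private static void collect(SketchParagraph current, List<SketchParagraph> out) {
        out.add(current);
        for (SketchParagraph child : current.children()) {
            collect(child, out);
        }
    }

    // Builds a tiny outline and returns the flattened titles.
    static List<String> sampleTitles() {
        SketchParagraph root = new SketchParagraph(null, "root", 0, 1, 10, 0);
        SketchParagraph ch1 = new SketchParagraph(root, "Chapter 1", 1, 1, 5, 0);
        SketchParagraph s11 = new SketchParagraph(ch1, "Section 1.1", 2, 2, 5, 0);
        SketchParagraph ch2 = new SketchParagraph(root, "Chapter 2", 1, 6, 10, 0);
        root.children().add(ch1);
        ch1.children().add(s11);
        root.children().add(ch2);

        List<String> titles = new ArrayList<>();
        for (SketchParagraph p : flatten(root)) {
            titles.add(p.title());
        }
        return titles;
    }
}
```

In this sketch, sampleTitles() yields Chapter 1, Section 1.1, Chapter 2: each parent appears before its own children, and the root itself is left out.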

W

withPageBottomMargin(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
Configures the PDF reader's page bottom margin.
withPageExtractedTextFormatter(ExtractedTextFormatter) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
Formatter of the extracted text.
withPagesPerDocument(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
How many pages to put in a single Document instance.
withPageTopMargin(int) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
Configures the PDF reader's page top margin.
withReversedParagraphPosition(boolean) - Method in class org.springframework.ai.reader.pdf.config.PdfDocumentReaderConfig.Builder
Configures whether the PDF reader uses reversed paragraph positions.
writePage() - Method in class org.springframework.ai.reader.pdf.layout.ForkPDFLayoutTextStripper
 
writePage() - Method in class org.springframework.ai.reader.pdf.layout.PDFLayoutTextStripperByArea
This will print the processed page text to the output stream.
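
The builder(), build(), defaultConfig() and with...() entries in this index follow the usual fluent-builder pattern. The sketch below illustrates that pattern with a self-contained stand-in. SketchPdfConfig is hypothetical and is not the library's PdfDocumentReaderConfig; the default values, the ALL_PAGES value, and the non-negative checks are assumptions made for illustration. The formatter option is omitted because it needs a library type.

```java
// Hypothetical stand-in mirroring the builder entries listed in this index.
final class SketchPdfConfig {

    static final int ALL_PAGES = 0; // assumed sentinel value, illustration only

    final int pagesPerDocument;
    final int pageTopMargin;
    final int pageBottomMargin;
    final boolean reversedParagraphPosition;

    private SketchPdfConfig(Builder b) {
        this.pagesPerDocument = b.pagesPerDocument;
        this.pageTopMargin = b.pageTopMargin;
        this.pageBottomMargin = b.pageBottomMargin;
        this.reversedParagraphPosition = b.reversedParagraphPosition;
    }

    // Start building a new configuration.
    static Builder builder() {
        return new Builder();
    }

    // A configuration with every option left at its (assumed) default.
    static SketchPdfConfig defaultConfig() {
        return builder().build();
    }

    static final class Builder {
        private int pagesPerDocument = 1;          // assumed default
        private int pageTopMargin = 0;             // assumed default
        private int pageBottomMargin = 0;          // assumed default
        private boolean reversedParagraphPosition; // false unless set

        Builder withPagesPerDocument(int pagesPerDocument) {
            requireNonNegative(pagesPerDocument, "pagesPerDocument");
            this.pagesPerDocument = pagesPerDocument;
            return this;
        }

        Builder withPageTopMargin(int pageTopMargin) {
            requireNonNegative(pageTopMargin, "pageTopMargin");
            this.pageTopMargin = pageTopMargin;
            return this;
        }

        Builder withPageBottomMargin(int pageBottomMargin) {
            requireNonNegative(pageBottomMargin, "pageBottomMargin");
            this.pageBottomMargin = pageBottomMargin;
            return this;
        }

        Builder withReversedParagraphPosition(boolean reversed) {
            this.reversedParagraphPosition = reversed;
            return this;
        }

        // Returns the immutable configuration.
        SketchPdfConfig build() {
            return new SketchPdfConfig(this);
        }

        private static void requireNonNegative(int value, String name) {
            if (value < 0) {
                throw new IllegalArgumentException(name + " must be >= 0");
            }
        }
    }
}

final class SketchPdfConfigDemo {

    // Example: two pages per Document, with 10-unit top and bottom margins.
    static SketchPdfConfig sample() {
        return SketchPdfConfig.builder()
                .withPagesPerDocument(2)
                .withPageTopMargin(10)
                .withPageBottomMargin(10)
                .build();
    }

    public static void main(String[] args) {
        SketchPdfConfig config = sample();
        System.out.println(config.pagesPerDocument + " pages per document");
    }
}
```

Because every field is final and set only in the private constructor, a built configuration cannot change afterwards, which matches the "Returns the immutable configuration" description of build().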