Package org.apache.parquet.format
Class ColumnIndex
- java.lang.Object
-
- org.apache.parquet.format.ColumnIndex
-
- All Implemented Interfaces:
Serializable, Cloneable, Comparable<ColumnIndex>, org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>, org.apache.thrift.TSerializable
@Generated(value="Autogenerated by Thrift Compiler (0.22.0)", date="2025-12-22")
public class ColumnIndex extends Object implements org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>, Serializable, Cloneable, Comparable<ColumnIndex>
Optional statistics for each data page in a ColumnChunk. Forms part of the page index, along with OffsetIndex. If this structure is present, OffsetIndex must also be present. For each field in this structure, <field>[i] refers to the page at OffsetIndex.page_locations[i].
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class ColumnIndex._Fields
  The set of fields this struct contains, along with convenience methods for finding and manipulating them.
-
Field Summary
Fields
Modifier and Type, Field, Description
BoundaryOrder boundary_order
  Stores whether both min_values and max_values are ordered and if so, in which direction.
List<Long> definition_level_histograms
  Same as repetition_level_histograms except for definition levels.
List<ByteBuffer> max_values
static Map<ColumnIndex._Fields,org.apache.thrift.meta_data.FieldMetaData> metaDataMap
List<ByteBuffer> min_values
  Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column.
List<Long> null_counts
  A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable.
List<Boolean> null_pages
  A list of Boolean values to determine the validity of the corresponding min and max values.
List<Long> repetition_level_histograms
  Contains repetition level histograms for each page concatenated together.
-
Constructor Summary
Constructors
Constructor, Description
ColumnIndex()
ColumnIndex(List<Boolean> null_pages, List<ByteBuffer> min_values, List<ByteBuffer> max_values, BoundaryOrder boundary_order)
ColumnIndex(ColumnIndex other)
  Performs a deep copy on other.
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, Description
void addToDefinition_level_histograms(long elem)
void addToMax_values(ByteBuffer elem)
void addToMin_values(ByteBuffer elem)
void addToNull_counts(long elem)
void addToNull_pages(boolean elem)
void addToRepetition_level_histograms(long elem)
void clear()
int compareTo(ColumnIndex other)
ColumnIndex deepCopy()
boolean equals(Object that)
boolean equals(ColumnIndex that)
ColumnIndex._Fields fieldForId(int fieldId)
BoundaryOrder getBoundary_order()
  Stores whether both min_values and max_values are ordered and if so, in which direction.
List<Long> getDefinition_level_histograms()
  Same as repetition_level_histograms except for definition levels.
Iterator<Long> getDefinition_level_histogramsIterator()
int getDefinition_level_histogramsSize()
Object getFieldValue(ColumnIndex._Fields field)
List<ByteBuffer> getMax_values()
Iterator<ByteBuffer> getMax_valuesIterator()
int getMax_valuesSize()
List<ByteBuffer> getMin_values()
  Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column.
Iterator<ByteBuffer> getMin_valuesIterator()
int getMin_valuesSize()
List<Long> getNull_counts()
  A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable.
Iterator<Long> getNull_countsIterator()
int getNull_countsSize()
List<Boolean> getNull_pages()
  A list of Boolean values to determine the validity of the corresponding min and max values.
Iterator<Boolean> getNull_pagesIterator()
int getNull_pagesSize()
List<Long> getRepetition_level_histograms()
  Contains repetition level histograms for each page concatenated together.
Iterator<Long> getRepetition_level_histogramsIterator()
int getRepetition_level_histogramsSize()
int hashCode()
boolean isSet(ColumnIndex._Fields field)
  Returns true if field corresponding to fieldID is set (has been assigned a value) and false otherwise.
boolean isSetBoundary_order()
  Returns true if field boundary_order is set (has been assigned a value) and false otherwise.
boolean isSetDefinition_level_histograms()
  Returns true if field definition_level_histograms is set (has been assigned a value) and false otherwise.
boolean isSetMax_values()
  Returns true if field max_values is set (has been assigned a value) and false otherwise.
boolean isSetMin_values()
  Returns true if field min_values is set (has been assigned a value) and false otherwise.
boolean isSetNull_counts()
  Returns true if field null_counts is set (has been assigned a value) and false otherwise.
boolean isSetNull_pages()
  Returns true if field null_pages is set (has been assigned a value) and false otherwise.
boolean isSetRepetition_level_histograms()
  Returns true if field repetition_level_histograms is set (has been assigned a value) and false otherwise.
void read(org.apache.thrift.protocol.TProtocol iprot)
ColumnIndex setBoundary_order(BoundaryOrder boundary_order)
  Stores whether both min_values and max_values are ordered and if so, in which direction.
void setBoundary_orderIsSet(boolean value)
ColumnIndex setDefinition_level_histograms(List<Long> definition_level_histograms)
  Same as repetition_level_histograms except for definition levels.
void setDefinition_level_histogramsIsSet(boolean value)
void setFieldValue(ColumnIndex._Fields field, Object value)
ColumnIndex setMax_values(List<ByteBuffer> max_values)
void setMax_valuesIsSet(boolean value)
ColumnIndex setMin_values(List<ByteBuffer> min_values)
  Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column.
void setMin_valuesIsSet(boolean value)
ColumnIndex setNull_counts(List<Long> null_counts)
  A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable.
void setNull_countsIsSet(boolean value)
ColumnIndex setNull_pages(List<Boolean> null_pages)
  A list of Boolean values to determine the validity of the corresponding min and max values.
void setNull_pagesIsSet(boolean value)
ColumnIndex setRepetition_level_histograms(List<Long> repetition_level_histograms)
  Contains repetition level histograms for each page concatenated together.
void setRepetition_level_histogramsIsSet(boolean value)
String toString()
void unsetBoundary_order()
void unsetDefinition_level_histograms()
void unsetMax_values()
void unsetMin_values()
void unsetNull_counts()
void unsetNull_pages()
void unsetRepetition_level_histograms()
void validate()
void write(org.apache.thrift.protocol.TProtocol oprot)
-
-
-
Field Detail
-
null_pages
public List<Boolean> null_pages
A list of Boolean values to determine the validity of the corresponding min and max values. If true, a page contains only null values, and writers have to set the corresponding entries in min_values and max_values to byte[0], so that all lists have the same length. If false, the corresponding entries in min_values and max_values must be valid.
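The reader-side consequence of this rule can be sketched as follows. This is a minimal illustration, not part of the generated class: it uses plain java.util lists as stand-ins for the null_pages and min_values fields so that it runs without the Thrift runtime, and it assumes an INT32 column whose bounds are stored as 4-byte little-endian values. The names NullPagesSketch and minOfPage are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

final class NullPagesSketch {
    /**
     * Returns the lower bound of the given page, or an empty result when
     * null_pages marks the page as containing only nulls. In that case the
     * min_values entry is a zero-length placeholder and must not be decoded.
     */
    static OptionalInt minOfPage(List<Boolean> nullPages, List<ByteBuffer> minValues, int page) {
        if (nullPages.get(page)) {
            return OptionalInt.empty();
        }
        // Assumption: INT32 bounds are 4 little-endian bytes.
        ByteBuffer raw = minValues.get(page).duplicate().order(ByteOrder.LITTLE_ENDIAN);
        return OptionalInt.of(raw.getInt(raw.position()));
    }

    public static void main(String[] args) {
        List<Boolean> nullPages = Arrays.asList(false, true);
        List<ByteBuffer> minValues = Arrays.asList(
            ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(0, 7),
            ByteBuffer.allocate(0)); // placeholder entry for the all-null page
        for (int i = 0; i < nullPages.size(); i++) {
            System.out.println("page " + i + ": " + minOfPage(nullPages, minValues, i));
        }
    }
}
```

With the real class, the same check would read the lists through getNull_pages() and getMin_values().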
-
min_values
public List<ByteBuffer> min_values
Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column. These may be the actual minimum and maximum values found on a page, but can also be (more compact) values that do not exist on a page. For example, instead of storing "Blart Versenwald III", a writer may set min_values[i]="B", max_values[i]="C". Such more compact values must still be valid values within the column's logical type. Readers must make sure that list entries are populated before using them by inspecting null_pages.
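One way a writer might derive such compact bounds is sketched below. This is an illustrative assumption, not the behavior of any particular writer: it truncates to a byte prefix for the lower bound and increments the last byte of the prefix for the upper bound, comparing bytes as unsigned. The names BoundTruncationSketch, lowerBound and upperBound are hypothetical. Byte-wise incrementing is only safe for raw binary or ASCII data; for other string data the result would also have to remain a valid value of the logical type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BoundTruncationSketch {
    /** A prefix of the value sorts at or before the value, so it is a valid lower bound. */
    static byte[] lowerBound(byte[] value, int maxLen) {
        return value.length <= maxLen ? value.clone() : Arrays.copyOf(value, maxLen);
    }

    /**
     * Truncates to maxLen bytes and increments the last byte that is not 0xFF,
     * which yields a value that sorts after the original. If every byte of the
     * prefix is 0xFF, no shorter upper bound exists and the value is kept whole.
     */
    static byte[] upperBound(byte[] value, int maxLen) {
        if (value.length <= maxLen) {
            return value.clone();
        }
        byte[] prefix = Arrays.copyOf(value, maxLen);
        for (int i = maxLen - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xFF) {
                prefix[i]++;
                return Arrays.copyOf(prefix, i + 1);
            }
        }
        return value.clone();
    }

    public static void main(String[] args) {
        byte[] v = "Blart Versenwald III".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(lowerBound(v, 1), StandardCharsets.US_ASCII));
        System.out.println(new String(upperBound(v, 1), StandardCharsets.US_ASCII));
    }
}
```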
-
max_values
public List<ByteBuffer> max_values
-
boundary_order
public BoundaryOrder boundary_order
Stores whether both min_values and max_values are ordered and if so, in which direction. This allows readers to perform binary searches in both lists. Readers cannot assume that max_values[i] <= min_values[i+1], even if the lists are ordered.
- See Also:
BoundaryOrder
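The binary search mentioned for boundary_order can be sketched as follows. This is a simplified illustration under stated assumptions: the bounds are already decoded into long arrays, both lists are in ascending order, and there are no all-null pages. The names BoundarySearchSketch and candidateRange are hypothetical. Because neighbouring page ranges may overlap, the result is a range of pages rather than a single page.

```java
final class BoundarySearchSketch {
    /**
     * Returns {first, last} (inclusive) for the pages whose [min, max] range
     * may contain v, or an empty array when no page can contain it.
     * Assumes mins and maxs are both sorted in ascending order.
     */
    static int[] candidateRange(long[] mins, long[] maxs, long v) {
        // First page whose upper bound is >= v.
        int lo = 0, hi = maxs.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (maxs[mid] < v) lo = mid + 1; else hi = mid;
        }
        int first = lo;

        // Last page whose lower bound is <= v.
        lo = 0;
        hi = mins.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (mins[mid] <= v) lo = mid + 1; else hi = mid;
        }
        int last = lo - 1;

        return first <= last ? new int[] {first, last} : new int[0];
    }

    public static void main(String[] args) {
        long[] mins = {1, 5, 8};
        long[] maxs = {6, 9, 12};
        // Pages 0 and 1 overlap on [5, 6], so both are candidates for v = 5.
        int[] range = candidateRange(mins, maxs, 5);
        System.out.println(range[0] + ".." + range[1]);
    }
}
```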
-
null_counts
public List<Long> null_counts
A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable. Readers MUST distinguish between null_counts not being present and null_count being 0. If null_counts are not present, readers MUST NOT assume all null counts are 0.
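The distinction between an absent list and a count of 0 can be sketched as follows. This is a standalone illustration, not the generated API: a null list stands in for the case where isSetNull_counts() would return false, and the names NullCountsSketch, nullCount and provablyNoNulls are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalLong;

final class NullCountsSketch {
    /** Returns the null count of a page, or empty when the list was not written at all. */
    static OptionalLong nullCount(List<Long> nullCounts, int page) {
        if (nullCounts == null) { // models an unset null_counts field
            return OptionalLong.empty();
        }
        return OptionalLong.of(nullCounts.get(page));
    }

    /** True only when a count is present and equal to 0; an absent list proves nothing. */
    static boolean provablyNoNulls(List<Long> nullCounts, int page) {
        OptionalLong count = nullCount(nullCounts, page);
        return count.isPresent() && count.getAsLong() == 0L;
    }

    public static void main(String[] args) {
        System.out.println(provablyNoNulls(null, 0));
        System.out.println(provablyNoNulls(Arrays.asList(0L, 3L), 0));
    }
}
```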
-
repetition_level_histograms
public List<Long> repetition_level_histograms
Contains repetition level histograms for each page concatenated together. The repetition_level_histogram field on SizeStatistics contains more details. When present, the list should always contain (number of pages * (max_repetition_level + 1)) elements. Element 0 is the first element of the histogram for the first page. Element (max_repetition_level + 1) is the first element of the histogram for the second page.
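The index arithmetic described above can be sketched as follows. This is a standalone illustration that uses a plain list in place of the field; the names LevelHistogramSketch and pageHistogram are hypothetical, and the sample counts are made up.

```java
import java.util.Arrays;
import java.util.List;

final class LevelHistogramSketch {
    /**
     * Returns the histogram of one page as a view into the concatenated list.
     * Each page contributes (maxLevel + 1) entries, one per level 0..maxLevel.
     */
    static List<Long> pageHistogram(List<Long> concatenated, int page, int maxLevel) {
        int width = maxLevel + 1;
        return concatenated.subList(page * width, (page + 1) * width);
    }

    public static void main(String[] args) {
        // Two pages, max_repetition_level = 2, so three entries per page.
        List<Long> concatenated = Arrays.asList(10L, 2L, 1L, 7L, 0L, 4L);
        System.out.println(pageHistogram(concatenated, 0, 2));
        System.out.println(pageHistogram(concatenated, 1, 2));
    }
}
```

The same slicing applies to definition_level_histograms, with the maximum definition level in place of the maximum repetition level.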
-
definition_level_histograms
public List<Long> definition_level_histograms
Same as repetition_level_histograms except for definition levels.
-
metaDataMap
public static final Map<ColumnIndex._Fields,org.apache.thrift.meta_data.FieldMetaData> metaDataMap
-
-
Constructor Detail
-
ColumnIndex
public ColumnIndex()
-
ColumnIndex
public ColumnIndex(List<Boolean> null_pages, List<ByteBuffer> min_values, List<ByteBuffer> max_values, BoundaryOrder boundary_order)
-
ColumnIndex
public ColumnIndex(ColumnIndex other)
Performs a deep copy on other.
-
-
Method Detail
-
deepCopy
public ColumnIndex deepCopy()
- Specified by:
deepCopy in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
clear
public void clear()
- Specified by:
clear in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
getNull_pagesSize
public int getNull_pagesSize()
-
addToNull_pages
public void addToNull_pages(boolean elem)
-
getNull_pages
public List<Boolean> getNull_pages()
A list of Boolean values to determine the validity of the corresponding min and max values. If true, a page contains only null values, and writers have to set the corresponding entries in min_values and max_values to byte[0], so that all lists have the same length. If false, the corresponding entries in min_values and max_values must be valid.
-
setNull_pages
public ColumnIndex setNull_pages(List<Boolean> null_pages)
A list of Boolean values to determine the validity of the corresponding min and max values. If true, a page contains only null values, and writers have to set the corresponding entries in min_values and max_values to byte[0], so that all lists have the same length. If false, the corresponding entries in min_values and max_values must be valid.
-
unsetNull_pages
public void unsetNull_pages()
-
isSetNull_pages
public boolean isSetNull_pages()
Returns true if field null_pages is set (has been assigned a value) and false otherwise.
-
setNull_pagesIsSet
public void setNull_pagesIsSet(boolean value)
-
getMin_valuesSize
public int getMin_valuesSize()
-
getMin_valuesIterator
public Iterator<ByteBuffer> getMin_valuesIterator()
-
addToMin_values
public void addToMin_values(ByteBuffer elem)
-
getMin_values
public List<ByteBuffer> getMin_values()
Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column. These may be the actual minimum and maximum values found on a page, but can also be (more compact) values that do not exist on a page. For example, instead of storing "Blart Versenwald III", a writer may set min_values[i]="B", max_values[i]="C". Such more compact values must still be valid values within the column's logical type. Readers must make sure that list entries are populated before using them by inspecting null_pages.
-
setMin_values
public ColumnIndex setMin_values(List<ByteBuffer> min_values)
Two lists containing lower and upper bounds for the values of each page determined by the ColumnOrder of the column. These may be the actual minimum and maximum values found on a page, but can also be (more compact) values that do not exist on a page. For example, instead of storing "Blart Versenwald III", a writer may set min_values[i]="B", max_values[i]="C". Such more compact values must still be valid values within the column's logical type. Readers must make sure that list entries are populated before using them by inspecting null_pages.
-
unsetMin_values
public void unsetMin_values()
-
isSetMin_values
public boolean isSetMin_values()
Returns true if field min_values is set (has been assigned a value) and false otherwise.
-
setMin_valuesIsSet
public void setMin_valuesIsSet(boolean value)
-
getMax_valuesSize
public int getMax_valuesSize()
-
getMax_valuesIterator
public Iterator<ByteBuffer> getMax_valuesIterator()
-
addToMax_values
public void addToMax_values(ByteBuffer elem)
-
getMax_values
public List<ByteBuffer> getMax_values()
-
setMax_values
public ColumnIndex setMax_values(List<ByteBuffer> max_values)
-
unsetMax_values
public void unsetMax_values()
-
isSetMax_values
public boolean isSetMax_values()
Returns true if field max_values is set (has been assigned a value) and false otherwise.
-
setMax_valuesIsSet
public void setMax_valuesIsSet(boolean value)
-
getBoundary_order
public BoundaryOrder getBoundary_order()
Stores whether both min_values and max_values are ordered and if so, in which direction. This allows readers to perform binary searches in both lists. Readers cannot assume that max_values[i] <= min_values[i+1], even if the lists are ordered.
- See Also:
BoundaryOrder
-
setBoundary_order
public ColumnIndex setBoundary_order(BoundaryOrder boundary_order)
Stores whether both min_values and max_values are ordered and if so, in which direction. This allows readers to perform binary searches in both lists. Readers cannot assume that max_values[i] <= min_values[i+1], even if the lists are ordered.
- See Also:
BoundaryOrder
-
unsetBoundary_order
public void unsetBoundary_order()
-
isSetBoundary_order
public boolean isSetBoundary_order()
Returns true if field boundary_order is set (has been assigned a value) and false otherwise.
-
setBoundary_orderIsSet
public void setBoundary_orderIsSet(boolean value)
-
getNull_countsSize
public int getNull_countsSize()
-
addToNull_counts
public void addToNull_counts(long elem)
-
getNull_counts
public List<Long> getNull_counts()
A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable. Readers MUST distinguish between null_counts not being present and null_count being 0. If null_counts are not present, readers MUST NOT assume all null counts are 0.
-
setNull_counts
public ColumnIndex setNull_counts(List<Long> null_counts)
A list containing the number of null values for each page. Writers SHOULD always write this field even if no null values are present or the column is not nullable. Readers MUST distinguish between null_counts not being present and null_count being 0. If null_counts are not present, readers MUST NOT assume all null counts are 0.
-
unsetNull_counts
public void unsetNull_counts()
-
isSetNull_counts
public boolean isSetNull_counts()
Returns true if field null_counts is set (has been assigned a value) and false otherwise.
-
setNull_countsIsSet
public void setNull_countsIsSet(boolean value)
-
getRepetition_level_histogramsSize
public int getRepetition_level_histogramsSize()
-
getRepetition_level_histogramsIterator
public Iterator<Long> getRepetition_level_histogramsIterator()
-
addToRepetition_level_histograms
public void addToRepetition_level_histograms(long elem)
-
getRepetition_level_histograms
public List<Long> getRepetition_level_histograms()
Contains repetition level histograms for each page concatenated together. The repetition_level_histogram field on SizeStatistics contains more details. When present, the list should always contain (number of pages * (max_repetition_level + 1)) elements. Element 0 is the first element of the histogram for the first page. Element (max_repetition_level + 1) is the first element of the histogram for the second page.
-
setRepetition_level_histograms
public ColumnIndex setRepetition_level_histograms(List<Long> repetition_level_histograms)
Contains repetition level histograms for each page concatenated together. The repetition_level_histogram field on SizeStatistics contains more details. When present, the list should always contain (number of pages * (max_repetition_level + 1)) elements. Element 0 is the first element of the histogram for the first page. Element (max_repetition_level + 1) is the first element of the histogram for the second page.
-
unsetRepetition_level_histograms
public void unsetRepetition_level_histograms()
-
isSetRepetition_level_histograms
public boolean isSetRepetition_level_histograms()
Returns true if field repetition_level_histograms is set (has been assigned a value) and false otherwise.
-
setRepetition_level_histogramsIsSet
public void setRepetition_level_histogramsIsSet(boolean value)
-
getDefinition_level_histogramsSize
public int getDefinition_level_histogramsSize()
-
getDefinition_level_histogramsIterator
public Iterator<Long> getDefinition_level_histogramsIterator()
-
addToDefinition_level_histograms
public void addToDefinition_level_histograms(long elem)
-
getDefinition_level_histograms
public List<Long> getDefinition_level_histograms()
Same as repetition_level_histograms except for definition levels.
-
setDefinition_level_histograms
public ColumnIndex setDefinition_level_histograms(List<Long> definition_level_histograms)
Same as repetition_level_histograms except for definition levels.
-
unsetDefinition_level_histograms
public void unsetDefinition_level_histograms()
-
isSetDefinition_level_histograms
public boolean isSetDefinition_level_histograms()
Returns true if field definition_level_histograms is set (has been assigned a value) and false otherwise.
-
setDefinition_level_histogramsIsSet
public void setDefinition_level_histogramsIsSet(boolean value)
-
setFieldValue
public void setFieldValue(ColumnIndex._Fields field, Object value)
- Specified by:
setFieldValue in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
getFieldValue
public Object getFieldValue(ColumnIndex._Fields field)
- Specified by:
getFieldValue in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
isSet
public boolean isSet(ColumnIndex._Fields field)
Returns true if field corresponding to fieldID is set (has been assigned a value) and false otherwise.
- Specified by:
isSet in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
equals
public boolean equals(ColumnIndex that)
-
compareTo
public int compareTo(ColumnIndex other)
- Specified by:
compareTo in interface Comparable<ColumnIndex>
-
fieldForId
public ColumnIndex._Fields fieldForId(int fieldId)
- Specified by:
fieldForId in interface org.apache.thrift.TBase<ColumnIndex,ColumnIndex._Fields>
-
read
public void read(org.apache.thrift.protocol.TProtocol iprot) throws org.apache.thrift.TException
- Specified by:
read in interface org.apache.thrift.TSerializable
- Throws:
org.apache.thrift.TException
-
write
public void write(org.apache.thrift.protocol.TProtocol oprot) throws org.apache.thrift.TException
- Specified by:
write in interface org.apache.thrift.TSerializable
- Throws:
org.apache.thrift.TException
-
validate
public void validate() throws org.apache.thrift.TException
- Throws:
org.apache.thrift.TException
-
-