Module org.apache.lucene.core
Class Lucene99ScalarQuantizedVectorsWriter
java.lang.Object
org.apache.lucene.codecs.hnsw.FlatVectorsWriter
org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsWriter
- All Implemented Interfaces:
Closeable, AutoCloseable, Accountable
Writes quantized vector values and metadata to index segments.
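As a rough illustration of what "quantized vector values" means here, the sketch below maps each float component onto a small integer range bounded by a lower and an upper quantile. It is a minimal, self-contained example and not the Lucene implementation; the class name `ScalarQuantizeSketch` and its method signatures are hypothetical.

```java
// Hypothetical sketch of scalar quantization; not Lucene's ScalarQuantizer.
final class ScalarQuantizeSketch {
    /** Clamps each component to [lower, upper] and maps it to an integer in [0, 2^bits - 1]. */
    static byte[] quantize(float[] vector, float lower, float upper, int bits) {
        if (!(upper > lower)) throw new IllegalArgumentException("upper must exceed lower");
        if (bits < 1 || bits > 7) throw new IllegalArgumentException("bits must be in [1, 7]");
        int maxLevel = (1 << bits) - 1;
        float scale = maxLevel / (upper - lower);
        byte[] quantized = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            float clamped = Math.max(lower, Math.min(upper, vector[i]));
            quantized[i] = (byte) Math.round((clamped - lower) * scale);
        }
        return quantized;
    }

    /** Approximately inverts quantize; the error is at most half a quantization step. */
    static float[] dequantize(byte[] quantized, float lower, float upper, int bits) {
        float step = (upper - lower) / ((1 << bits) - 1);
        float[] restored = new float[quantized.length];
        for (int i = 0; i < quantized.length; i++) {
            restored[i] = lower + quantized[i] * step;
        }
        return restored;
    }
}
```

Values outside the quantile range are clamped, which is why the choice of quantiles (see `confidenceInterval` and `bits` in the constructors) affects how much precision is lost.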
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description)
(package private) static class
(package private) static class
(package private) static class
    Returns a merged view over all the segment's QuantizedByteVectorValues.
(package private) static final class
(package private) static class
(package private) static class
(package private) static final class
-
Field Summary
Fields (Modifier and Type, Field)
private final byte bits
private final boolean compress
private final Float confidenceInterval
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
private boolean finished
private final IndexOutput meta
private static final float QUANTILE_RECOMPUTE_LIMIT
private final IndexOutput quantizedVectorData
private final FlatVectorsWriter rawVectorDelegate
private static final float REQUANTIZATION_LIMIT
private final SegmentWriteState segmentWriteState
private static final long SHALLOW_RAM_BYTES_USED
private final int version
Fields inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
vectorsScorer
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Constructors (Modifier, Constructor)
private Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
-
Method Summary
Methods (Modifier and Type, Method, Description)
FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter)
    Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
(package private) static ScalarQuantizer buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, Float confidenceInterval, byte bits)
void close()
void finish()
    Called once at the end before close
void flush(int maxDoc, Sorter.DocMap sortMap)
    Flush all buffered data to disk
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, Float confidenceInterval, byte bits)
    Merges the quantiles of the segments and recalculates the quantiles if necessary.
void mergeOneField(FieldInfo fieldInfo, MergeState mergeState)
    Write field for merging
CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState)
    Write the field for merging, providing a scorer over the newly merged flat vectors.
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState)
(package private) static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits)
long ramBytesUsed()
    Return the memory usage of this object in bytes.
(package private) static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
    Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
(package private) static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
    Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state.
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc)
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, byte bits, boolean compress, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField)
static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress)
    Writes the vector values to the output and returns the set of documents that contain vectors.
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData)
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap)
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap)
Methods inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
getFlatVectorScorer
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
SHALLOW_RAM_BYTES_USED
private static final long SHALLOW_RAM_BYTES_USED -
QUANTILE_RECOMPUTE_LIMIT
private static final float QUANTILE_RECOMPUTE_LIMIT
-
REQUANTIZATION_LIMIT
private static final float REQUANTIZATION_LIMIT
-
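segmentWriteState
private final SegmentWriteState segmentWriteState
-
fields
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
-
meta
private final IndexOutput meta
-
quantizedVectorData
private final IndexOutput quantizedVectorData
-
confidenceInterval
private final Float confidenceInterval
-
rawVectorDelegate
private final FlatVectorsWriter rawVectorDelegate
-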
bits
private final byte bits -
compress
private final boolean compress -
version
private final int version -
finished
private boolean finished
-
-
Constructor Details
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
Lucene99ScalarQuantizedVectorsWriter
private Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
-
Method Details
-
addField
public FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter) throws IOException
Description copied from class: FlatVectorsWriter
Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
- Specified by:
addField in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to add
indexWriter - the writer to delegate to, can be null
- Returns:
- a writer for the field
- Throws:
IOException - if an I/O error occurs when adding the field
-
mergeOneField
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: FlatVectorsWriter
Write field for merging
- Overrides:
mergeOneField in class FlatVectorsWriter
- Throws:
IOException
-
mergeOneFieldToIndex
public CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: FlatVectorsWriter
Write the field for merging, providing a scorer over the newly merged flat vectors. This way any additional merging logic can be implemented by the user of this class.
- Specified by:
mergeOneFieldToIndex in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to merge
mergeState - mergeState of the segments to merge
- Returns:
- a scorer over the newly merged flat vectors, which should be closed as it holds a temporary file handle to read over the newly merged vectors
- Throws:
IOException - if an I/O error occurs when merging
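Because the returned supplier holds a temporary file handle, callers are expected to close it. The sketch below shows that ownership pattern with plain `java.nio`; `TempBackedSupplierSketch` is a hypothetical stand-in and not the Lucene `CloseableRandomVectorScorerSupplier`, and it wraps `IOException` in `UncheckedIOException` only to keep the example short.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a resource that owns a temporary file of merged vectors.
final class TempBackedSupplierSketch implements Closeable {
    private final Path tempFile;
    private boolean closed;

    TempBackedSupplierSketch(byte[] mergedVectors) {
        try {
            tempFile = Files.createTempFile("merged-vectors", ".tmp");
            Files.write(tempFile, mergedVectors);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path path() {
        return tempFile;
    }

    byte[] read() {
        if (closed) throw new IllegalStateException("supplier is closed");
        try {
            return Files.readAllBytes(tempFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the temporary file; calling close more than once is harmless. */
    @Override
    public void close() {
        if (closed) return;
        closed = true;
        try {
            Files.deleteIfExists(tempFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Usage: try-with-resources releases the temporary file even if reading fails. */
    static int mergedLength(byte[] mergedVectors) {
        try (TempBackedSupplierSketch supplier = new TempBackedSupplierSketch(mergedVectors)) {
            return supplier.read().length;
        }
    }
}
```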
-
flush
public void flush(int maxDoc, Sorter.DocMap sortMap) throws IOException
Description copied from class: FlatVectorsWriter
Flush all buffered data to disk
- Specified by:
flush in class FlatVectorsWriter
- Throws:
IOException
-
finish
public void finish() throws IOException
Description copied from class: FlatVectorsWriter
Called once at the end before close
- Specified by:
finish in class FlatVectorsWriter
- Throws:
IOException
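The "called once" contract, together with the private `finished` field listed above, suggests a simple guard. The sketch below shows one way such a guard could look; `WriterLifecycleSketch` and its behavior on a second call are assumptions for illustration, not a description of what this class does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical writer that records its lifecycle calls and rejects a second finish().
final class WriterLifecycleSketch implements AutoCloseable {
    private final List<String> calls = new ArrayList<>();
    private boolean finished;

    void flush() {
        calls.add("flush");
    }

    /** May be called only once, after all flushes and before close. */
    void finish() {
        if (finished) {
            throw new IllegalStateException("already finished");
        }
        finished = true;
        calls.add("finish");
    }

    @Override
    public void close() {
        calls.add("close");
    }

    List<String> calls() {
        return calls;
    }
}
```

The expected call order in this sketch is flush, then finish, then close.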
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
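A common way to satisfy this contract is to add a fixed shallow size to the sizes reported by child objects. The sketch below illustrates that idea with a hypothetical `AccountableSketch` interface and an arbitrary shallow size of 64 bytes; it is not the Lucene `Accountable` interface and does not reflect how this class computes its total.

```java
import java.util.List;

// Hypothetical single-method interface mirroring the ramBytesUsed() contract.
interface AccountableSketch {
    long ramBytesUsed();
}

final class RamAccountingSketch implements AccountableSketch {
    // Assumed fixed overhead of this object; the real value is not given in this page.
    private static final long SHALLOW_BYTES = 64;
    private final List<AccountableSketch> children;

    RamAccountingSketch(List<AccountableSketch> children) {
        this.children = children;
    }

    /** Shallow size plus the sum of all child sizes; negative child sizes are rejected. */
    @Override
    public long ramBytesUsed() {
        long total = SHALLOW_BYTES;
        for (AccountableSketch child : children) {
            long bytes = child.ramBytesUsed();
            if (bytes < 0) {
                throw new IllegalStateException("negative memory usage: " + bytes);
            }
            total += bytes;
        }
        return total;
    }
}
```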
writeField
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc) throws IOException
- Throws:
IOException
-
writeMeta
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, byte bits, boolean compress, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField) throws IOException
- Throws:
IOException
-
writeQuantizedVectors
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData) throws IOException
- Throws:
IOException
-
writeSortingField
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap) throws IOException
- Throws:
IOException
-
writeSortedQuantizedVectors
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap) throws IOException
- Throws:
IOException
-
mergeOneFieldToIndex
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState) throws IOException
- Throws:
IOException
-
mergeQuantiles
static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits) -
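The signature takes per-segment quantization states together with segment sizes, which suggests a size-weighted combination. The sketch below shows one plausible form, a weighted mean of the lower and upper quantiles. This is an assumption made for illustration; the page does not state the actual formula, and `QuantileMergeSketch` is a hypothetical name.

```java
// Hypothetical sketch: size-weighted mean of per-segment quantiles.
final class QuantileMergeSketch {
    /** Returns {lower, upper}, each the mean of the inputs weighted by segment size. */
    static float[] mergeQuantiles(float[] lowers, float[] uppers, int[] segmentSizes) {
        if (lowers.length != uppers.length || lowers.length != segmentSizes.length) {
            throw new IllegalArgumentException("inputs must have the same length");
        }
        double lowerSum = 0;
        double upperSum = 0;
        long totalSize = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            lowerSum += lowers[i] * (double) segmentSizes[i];
            upperSum += uppers[i] * (double) segmentSizes[i];
            totalSize += segmentSizes[i];
        }
        if (totalSize == 0) {
            throw new IllegalArgumentException("no vectors to merge");
        }
        return new float[] {(float) (lowerSum / totalSize), (float) (upperSum / totalSize)};
    }
}
```

Weighting by size keeps a small segment with unusual quantiles from dominating the merged state.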
shouldRecomputeQuantiles
static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
- Parameters:
mergedQuantizationState - The merged quantization state
quantizationStates - The quantization states of the individual segments
- Returns:
- true if the quantiles should be recomputed
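A "too far" test of this kind can be sketched as a comparison against a limit derived from the merged quantile range. In the example below the limit is the merged range divided by a `divisor` parameter. Both the parameter and its relationship to `QUANTILE_RECOMPUTE_LIMIT` are assumptions, since this page does not show the constant's value or how it is used.

```java
// Hypothetical drift check between merged quantiles and per-segment quantiles.
final class QuantileDriftSketch {
    /**
     * @param segmentQuantiles one {lower, upper} pair per segment
     * @param divisor illustrative tuning value; a larger divisor gives a stricter limit
     */
    static boolean shouldRecompute(
            float mergedLower, float mergedUpper, float[][] segmentQuantiles, float divisor) {
        float limit = (mergedUpper - mergedLower) / divisor;
        for (float[] quantiles : segmentQuantiles) {
            if (Math.abs(quantiles[0] - mergedLower) > limit
                    || Math.abs(quantiles[1] - mergedUpper) > limit) {
                return true;
            }
        }
        return false;
    }
}
```

With a merged range of [-1, 1] and a divisor of 32, any segment quantile more than 0.0625 away from the merged value would trigger recomputation in this sketch.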
-
getQuantizedKnnVectorsReader
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
-
getQuantizedState
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
-
mergeAndRecalculateQuantiles
public static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, Float confidenceInterval, byte bits) throws IOException
Merges the quantiles of the segments and recalculates the quantiles if necessary.
- Parameters:
mergeState - The merge state
fieldInfo - The field info
confidenceInterval - The confidence interval
bits - The number of bits
- Returns:
- The merged quantiles
- Throws:
IOException - If there is a low-level I/O error
-
buildScalarQuantizer
static ScalarQuantizer buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, Float confidenceInterval, byte bits) throws IOException
- Throws:
IOException
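The `confidenceInterval` parameter can be read as the fraction of component values that should fall between the lower and upper quantiles. The sketch below derives the two quantiles from a sorted copy of the values under that reading. It is a simplified assumption and not the algorithm used by this method; `ConfidenceIntervalSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical sketch: pick symmetric quantiles that cover a given fraction of the values.
final class ConfidenceIntervalSketch {
    /** Returns {lower, upper} so that roughly confidenceInterval of the values lie between them. */
    static float[] quantiles(float[] values, float confidenceInterval) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        if (!(confidenceInterval > 0f && confidenceInterval <= 1f)) {
            throw new IllegalArgumentException("confidenceInterval must be in (0, 1]");
        }
        float[] sorted = values.clone();
        Arrays.sort(sorted);
        // Fraction of values cut off at each tail.
        double tail = (1.0 - confidenceInterval) / 2.0;
        int lowerIndex = (int) Math.floor(tail * (sorted.length - 1));
        int upperIndex = (sorted.length - 1) - lowerIndex;
        return new float[] {sorted[lowerIndex], sorted[upperIndex]};
    }
}
```

A confidence interval of 1 keeps the full minimum-to-maximum range, while smaller values discard outliers at both ends.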
-
shouldRequantize
static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state. Such a difference implies that floating-point values would shift slightly between quantization buckets.
- Parameters:
existingQuantiles - The existing quantiles for a segment
newQuantiles - The new quantiles for a segment, could be merged, or fully re-calculated
- Returns:
- true if the floating point values should be requantized
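One way to picture the bucket-shift criterion is to compare the quantile change against a fraction of one bucket width under the new quantiles. The sketch below does this with a hypothetical `bucketFraction` parameter; the actual tolerance and its relation to `REQUANTIZATION_LIMIT` are not given in this page, so treat the formula as an assumption.

```java
// Hypothetical sketch: requantize when a quantile moved by more than a fraction of a bucket.
final class RequantizeSketch {
    static boolean shouldRequantize(
            float oldLower, float oldUpper,
            float newLower, float newUpper,
            int bits, float bucketFraction) {
        // Width of one quantization bucket under the new quantiles.
        float bucketWidth = (newUpper - newLower) / ((1 << bits) - 1);
        float tolerance = bucketFraction * bucketWidth;
        return Math.abs(oldUpper - newUpper) > tolerance
                || Math.abs(oldLower - newLower) > tolerance;
    }
}
```

If neither quantile moved by more than the tolerance, the previously quantized bytes can be reused in this sketch; otherwise the original floats would need to be quantized again.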
-
writeQuantizedVectorData
public static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress) throws IOException
Writes the vector values to the output and returns the set of documents that contain vectors.
- Throws:
IOException
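The general shape of such a method, writing each vector while recording which documents have one, can be sketched with standard library types. In the example below a `SortedMap` stands in for the vector values, a `BitSet` stands in for `DocsWithFieldSet`, and the nibble packing used when `compress` is set is a hypothetical layout, not the Lucene on-disk format.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical sketch: write quantized vectors in doc order and track docs that have one.
final class QuantizedWriteSketch {
    static BitSet write(
            ByteArrayOutputStream out,
            SortedMap<Integer, byte[]> vectorsByDoc,
            int bits,
            boolean compress) {
        BitSet docsWithVectors = new BitSet();
        for (Map.Entry<Integer, byte[]> entry : vectorsByDoc.entrySet()) {
            byte[] vector = entry.getValue();
            // Assumed layout: with at most 4 bits per value, two values share one byte.
            byte[] encoded = (compress && bits <= 4) ? packNibbles(vector) : vector;
            out.write(encoded, 0, encoded.length);
            docsWithVectors.set(entry.getKey());
        }
        return docsWithVectors;
    }

    /** Packs pairs of 4-bit values: even index in the high nibble, odd index in the low nibble. */
    static byte[] packNibbles(byte[] values) {
        byte[] packed = new byte[(values.length + 1) / 2];
        for (int i = 0; i < values.length; i++) {
            int nibble = values[i] & 0x0F;
            packed[i / 2] |= (byte) ((i % 2 == 0) ? nibble << 4 : nibble);
        }
        return packed;
    }
}
```

Returning the document set from the write call lets the caller record sparse fields in metadata without a second pass over the vectors.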
-
close
public void close() throws IOException
- Throws:
IOException
-