Module org.apache.lucene.sandbox
Class SampleReader
java.lang.Object
org.apache.lucene.index.KnnVectorValues
org.apache.lucene.index.FloatVectorValues
org.apache.lucene.sandbox.codecs.quantization.SampleReader
All Implemented Interfaces:
HasIndexSlice

A reader of vector values that samples a subset of the vectors.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.KnnVectorValues
KnnVectorValues.DocIndexIterator
Field Summary
Fields
Modifier and Type | Field
private final FloatVectorValues | origin
private final IntUnaryOperator | sampleFunction
private final int | sampleSize
Constructor Summary
Constructors
Constructor | Description
SampleReader(FloatVectorValues origin, int sampleSize, IntUnaryOperator sampleFunction) |
Method Summary
Modifier and Type | Method | Description
FloatVectorValues | copy() | Creates a new copy of this KnnVectorValues.
static SampleReader | createSampleReader(FloatVectorValues origin, int k, long seed) |
int | dimension() | Return the dimension of the vectors
Bits | getAcceptOrds(Bits acceptDocs) | Returns a Bits accepting docs accepted by the argument and having a vector value
IndexInput | getSlice() | Returns an IndexInput from which to read this instance's values.
int | getVectorByteLength() | Returns the vector byte length, defaults to dimension multiplied by float byte size
int | ordToDoc(int ord) | Return the docid of the document indexed with the given vector ordinal.
static int[] | reservoirSample(int n, int k, long seed) | Sample k elements from n elements according to reservoir sampling algorithm.
static int[] | reservoirSampleFromArray(int[] origin, int k, long seed) | Sample k elements from the origin array using reservoir sampling algorithm.
int | size() | Return the number of vectors for this field.
float[] | vectorValue(int targetOrd) | Return the vector value for the given vector ordinal which must be in [0, size() - 1], otherwise IndexOutOfBoundsException is thrown.

Methods inherited from class org.apache.lucene.index.FloatVectorValues
checkField, fromFloats, getEncoding, scorer
Methods inherited from class org.apache.lucene.index.KnnVectorValues
createDenseIterator, createSparseIterator, fromDISI, iterator
Field Details
origin
private final FloatVectorValues origin

sampleSize
private final int sampleSize

sampleFunction
private final IntUnaryOperator sampleFunction
Constructor Details
SampleReader
SampleReader(FloatVectorValues origin, int sampleSize, IntUnaryOperator sampleFunction)
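
The constructor parameters suggest that the reader exposes sampleSize ordinals and uses sampleFunction to translate each sampled ordinal into an ordinal of origin. That mapping is an assumption drawn from the parameter names, not something this page states. The sketch below models the idea with plain arrays; SampledViewSketch and its float[][] stand-in for FloatVectorValues are hypothetical and are not part of Lucene.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical model of a sampled view; not the Lucene implementation.
final class SampledViewSketch {
    private final float[][] origin;               // stand-in for the origin FloatVectorValues
    private final int sampleSize;                 // number of ordinals the view exposes
    private final IntUnaryOperator sampleFunction; // assumed: sampled ordinal -> origin ordinal

    public SampledViewSketch(float[][] origin, int sampleSize, IntUnaryOperator sampleFunction) {
        this.origin = origin;
        this.sampleSize = sampleSize;
        this.sampleFunction = sampleFunction;
    }

    public int size() {
        return sampleSize;
    }

    public float[] vectorValue(int targetOrd) {
        if (targetOrd < 0 || targetOrd >= sampleSize) {
            throw new IndexOutOfBoundsException("ordinal out of range: " + targetOrd);
        }
        // Translate the sampled ordinal, then read from the origin.
        return origin[sampleFunction.applyAsInt(targetOrd)];
    }

    public static void main(String[] args) {
        float[][] vectors = {{0f, 0f}, {1f, 1f}, {2f, 2f}, {3f, 3f}};
        int[] picked = {3, 1}; // origin ordinals chosen by some sampler
        SampledViewSketch view = new SampledViewSketch(vectors, picked.length, i -> picked[i]);
        System.out.println(view.size());            // 2
        System.out.println(view.vectorValue(0)[0]); // 3.0
    }
}
```

Passing the mapping as an IntUnaryOperator keeps the view independent of how the sample was chosen; an int[] lookup as above is one simple choice.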
Method Details
size
public int size()
Description copied from class: KnnVectorValues
Return the number of vectors for this field.
Specified by:
size in class KnnVectorValues
Returns:
the number of vectors returned by this iterator
dimension
public int dimension()
Description copied from class: KnnVectorValues
Return the dimension of the vectors
Specified by:
dimension in class KnnVectorValues
copy
public FloatVectorValues copy() throws IOException
Description copied from class: KnnVectorValues
Creates a new copy of this KnnVectorValues. This is helpful when you need to access different values at once, to avoid overwriting the underlying vector returned.
Specified by:
copy in class FloatVectorValues
Throws:
IOException
getSlice
public IndexInput getSlice()
Description copied from interface: HasIndexSlice
Returns an IndexInput from which to read this instance's values.
Specified by:
getSlice in interface HasIndexSlice
vectorValue
public float[] vectorValue(int targetOrd) throws IOException
Description copied from class: FloatVectorValues
Return the vector value for the given vector ordinal which must be in [0, size() - 1], otherwise IndexOutOfBoundsException is thrown. The returned array may be shared across calls.
Specified by:
vectorValue in class FloatVectorValues
Returns:
the vector value
Throws:
IOException
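
Because the returned array may be shared across calls, a caller that needs two vectors at once should copy the first one before requesting the second (or read through a separate copy() of the reader). The sketch below illustrates the hazard with a hypothetical reader that always reuses one buffer. SharedBufferSketch is an illustration only; whether SampleReader actually reuses a buffer depends on the origin it wraps.

```java
// Hypothetical reader that reuses a single scratch buffer for every call.
final class SharedBufferSketch {
    private final float[][] data;
    private final float[] scratch;

    public SharedBufferSketch(float[][] data) {
        this.data = data;
        this.scratch = new float[data[0].length];
    }

    public float[] vectorValue(int ord) {
        // Overwrites the buffer handed out by the previous call.
        System.arraycopy(data[ord], 0, scratch, 0, scratch.length);
        return scratch;
    }

    public static void main(String[] args) {
        SharedBufferSketch reader = new SharedBufferSketch(new float[][] {{1f, 1f}, {2f, 2f}});
        float[] first = reader.vectorValue(0);             // aliases the shared buffer
        float[] firstCopy = reader.vectorValue(0).clone(); // defensive copy
        float[] second = reader.vectorValue(1);            // overwrites the buffer
        System.out.println(first == second); // true: same array instance
        System.out.println(first[0]);        // 2.0: the first vector was overwritten
        System.out.println(firstCopy[0]);    // 1.0: the copy is unaffected
    }
}
```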
getVectorByteLength
public int getVectorByteLength()
Description copied from class: KnnVectorValues
Returns the vector byte length, defaults to dimension multiplied by float byte size
Overrides:
getVectorByteLength in class KnnVectorValues
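
The default described above is simple arithmetic: a Java float occupies Float.BYTES (4) bytes, so the byte length is the dimension times four. The helper below is a hypothetical illustration of that formula, not the Lucene method.

```java
// Hypothetical helper mirroring the documented default: dimension * float byte size.
final class VectorByteLengthSketch {
    public static int vectorByteLength(int dimension) {
        return dimension * Float.BYTES;
    }

    public static void main(String[] args) {
        System.out.println(vectorByteLength(768)); // 3072
    }
}
```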
ordToDoc
public int ordToDoc(int ord)
Description copied from class: KnnVectorValues
Return the docid of the document indexed with the given vector ordinal. This default implementation returns the argument and is appropriate for dense values implementations where every doc has a single value.
Overrides:
ordToDoc in class KnnVectorValues
getAcceptOrds
public Bits getAcceptOrds(Bits acceptDocs)
Description copied from class: KnnVectorValues
Returns a Bits accepting docs accepted by the argument and having a vector value
Overrides:
getAcceptOrds in class KnnVectorValues
createSampleReader
public static SampleReader createSampleReader(FloatVectorValues origin, int k, long seed)
reservoirSample
public static int[] reservoirSample(int n, int k, long seed)
Samples k elements from n elements using the reservoir sampling algorithm.
Parameters:
n - number of elements
k - number of samples
seed - random seed
Returns:
array of k samples
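
For readers unfamiliar with reservoir sampling, the following is a minimal sketch of the classic single-pass variant (often called Algorithm R) over the indices 0 to n - 1. It is not the Lucene source. The class name ReservoirSketch, the use of java.util.Random, and the choice to cap the result at n elements when k exceeds n are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative Algorithm R over indices [0, n); not the Lucene implementation.
final class ReservoirSketch {
    public static int[] reservoirSample(int n, int k, long seed) {
        if (n < 0 || k < 0) {
            throw new IllegalArgumentException("n and k must be non-negative");
        }
        int size = Math.min(n, k); // assumption: never return more than n samples
        int[] reservoir = new int[size];
        // Fill the reservoir with the first `size` indices.
        for (int i = 0; i < size; i++) {
            reservoir[i] = i;
        }
        Random random = new Random(seed);
        // Each later index i replaces a random slot with probability size / (i + 1).
        for (int i = size; i < n; i++) {
            int j = random.nextInt(i + 1);
            if (j < size) {
                reservoir[j] = i;
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        // The same seed yields the same sample in this sketch.
        System.out.println(Arrays.toString(reservoirSample(1000, 5, 42L)));
    }
}
```

Every index ends up in the result with equal probability k / n, and the samples are distinct because each index is considered exactly once.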
reservoirSampleFromArray
public static int[] reservoirSampleFromArray(int[] origin, int k, long seed)
Samples k elements from the origin array using the reservoir sampling algorithm.
Parameters:
origin - original array
k - number of samples
seed - random seed
Returns:
array of k samples
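
Sampling from an array differs from sampling a range only in that the reservoir holds the array's values rather than positions. The sketch below shows one way this could look; it is an assumption-laden illustration, not the Lucene source. ArrayReservoirSketch is a hypothetical name, and capping the result at origin.length is a choice made here.

```java
import java.util.Arrays;
import java.util.Random;

// Illustrative reservoir sampling of values from an array; not the Lucene implementation.
final class ArrayReservoirSketch {
    public static int[] reservoirSampleFromArray(int[] origin, int k, long seed) {
        if (k < 0) {
            throw new IllegalArgumentException("k must be non-negative");
        }
        int size = Math.min(origin.length, k); // assumption: cap at the array length
        // Start with the first `size` values of the array.
        int[] reservoir = Arrays.copyOf(origin, size);
        Random random = new Random(seed);
        for (int i = size; i < origin.length; i++) {
            int j = random.nextInt(i + 1);
            if (j < size) {
                reservoir[j] = origin[i]; // store the value, not the position
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        int[] candidates = {10, 20, 30, 40, 50, 60}; // e.g. ordinals that passed some filter
        System.out.println(Arrays.toString(reservoirSampleFromArray(candidates, 3, 7L)));
    }
}
```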