Class Lucene90HnswGraphBuilder
- java.lang.Object
-
- org.apache.lucene.backward_codecs.lucene90.Lucene90HnswGraphBuilder
-
public final class Lucene90HnswGraphBuilder extends java.lang.ObjectBuilder for HNSW graph. SeeLucene90OnHeapHnswGraphfor a gloss on the algorithm and the meaning of the hyperparameters.This class is preserved here only for tests.
-
-
Field Summary
Fields Modifier and Type Field Description private intbeamWidthprivate BoundsCheckerboundprivate RandomAccessVectorValuesbuildVectorsprivate static longDEFAULT_RAND_SEEDDefault random seed for level generation *(package private) Lucene90OnHeapHnswGraphhnswstatic java.lang.StringHNSW_COMPONENTA name for the HNSW component for the info-stream *private InfoStreaminfoStreamprivate intmaxConnprivate java.util.SplittableRandomrandomstatic longrandSeedRandom seed for level generation; public to expose for testing *private Lucene90NeighborArrayscratchprivate VectorSimilarityFunctionsimilarityFunctionprivate RandomAccessVectorValuesvectorValues
-
Constructor Summary
Constructors Constructor Description Lucene90HnswGraphBuilder(RandomAccessVectorValuesProducer vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed)Reads all the vectors from a VectorValues, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description private voidaddDiverseNeighbors(int node, NeighborQueue candidates)(package private) voidaddGraphNode(float[] value)Inserts a doc with vector value to the graphLucene90OnHeapHnswGraphbuild(RandomAccessVectorValues vectors)Reads all the vectors from two copies of a random access VectorValues.private booleandiversityCheck(float[] candidate, float score, Lucene90NeighborArray neighbors, RandomAccessVectorValues vectorValues)private voiddiversityUpdate(Lucene90NeighborArray neighbors)private intfindNonDiverse(Lucene90NeighborArray neighbors)private voidpopToScratch(NeighborQueue candidates)private voidselectDiverse(Lucene90NeighborArray neighbors, Lucene90NeighborArray candidates)voidsetInfoStream(InfoStream infoStream)Set info-stream to output debugging information *
-
-
-
Field Detail
-
DEFAULT_RAND_SEED
private static final long DEFAULT_RAND_SEED
Default random seed for level generation *- See Also:
- Constant Field Values
-
HNSW_COMPONENT
public static final java.lang.String HNSW_COMPONENT
A name for the HNSW component for the info-stream *- See Also:
- Constant Field Values
-
randSeed
public static long randSeed
Random seed for level generation; public to expose for testing *
-
maxConn
private final int maxConn
-
beamWidth
private final int beamWidth
-
scratch
private final Lucene90NeighborArray scratch
-
similarityFunction
private final VectorSimilarityFunction similarityFunction
-
vectorValues
private final RandomAccessVectorValues vectorValues
-
random
private final java.util.SplittableRandom random
-
bound
private final BoundsChecker bound
-
hnsw
final Lucene90OnHeapHnswGraph hnsw
-
infoStream
private InfoStream infoStream
-
buildVectors
private RandomAccessVectorValues buildVectors
-
-
Constructor Detail
-
Lucene90HnswGraphBuilder
public Lucene90HnswGraphBuilder(RandomAccessVectorValuesProducer vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed) throws java.io.IOException
Reads all the vectors from a VectorValues, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.- Parameters:
vectors- the vectors whose relations are represented by the graph - must provide a different view over those vectors than the one used to add via addGraphNode.maxConn- the number of connections to make when adding a new graph node; roughly speaking the graph fanout.beamWidth- the size of the beam search to use when finding nearest neighbors.seed- the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.- Throws:
java.io.IOException
-
-
Method Detail
-
build
public Lucene90OnHeapHnswGraph build(RandomAccessVectorValues vectors) throws java.io.IOException
Reads all the vectors from two copies of a random access VectorValues. Providing two copies enables efficient retrieval without extra data copying, while avoiding collision of the returned values.- Parameters:
vectors- the vectors for which to build a nearest neighbors graph. Must be an independet accessor for the vectors- Throws:
java.io.IOException
-
setInfoStream
public void setInfoStream(InfoStream infoStream)
Set info-stream to output debugging information *
-
addGraphNode
void addGraphNode(float[] value) throws java.io.IOExceptionInserts a doc with vector value to the graph- Throws:
java.io.IOException
-
addDiverseNeighbors
private void addDiverseNeighbors(int node, NeighborQueue candidates) throws java.io.IOException- Throws:
java.io.IOException
-
selectDiverse
private void selectDiverse(Lucene90NeighborArray neighbors, Lucene90NeighborArray candidates) throws java.io.IOException
- Throws:
java.io.IOException
-
popToScratch
private void popToScratch(NeighborQueue candidates)
-
diversityCheck
private boolean diversityCheck(float[] candidate, float score, Lucene90NeighborArray neighbors, RandomAccessVectorValues vectorValues) throws java.io.IOException- Parameters:
candidate- the vector of a new candidate neighbor of a node nscore- the score of the new candidate and node n, to be compared with scores of the candidate and n's neighborsneighbors- the neighbors selected so farvectorValues- source of values used for making comparisons between candidate and existing neighbors- Returns:
- whether the candidate is diverse given the existing neighbors
- Throws:
java.io.IOException
-
diversityUpdate
private void diversityUpdate(Lucene90NeighborArray neighbors) throws java.io.IOException
- Throws:
java.io.IOException
-
findNonDiverse
private int findNonDiverse(Lucene90NeighborArray neighbors) throws java.io.IOException
- Throws:
java.io.IOException
-
-