Class WANDScorer
- java.lang.Object
-
- org.apache.lucene.search.Scorable
-
- org.apache.lucene.search.Scorer
-
- org.apache.lucene.search.WANDScorer
-
final class WANDScorer extends Scorer
This implements the WAND (Weak AND) algorithm for dynamic pruning described in "Efficient Query Evaluation using a Two-Level Retrieval Process" by Broder, Carmel, Herscovici, Soffer and Zien. Enhanced with techniques described in "Faster Top-k Document Retrieval Using Block-Max Indexes" by Ding and Suel. For scoreMode ==ScoreMode.TOP_SCORES, this scorer maintains a feedback loop with the collector in order to know at any time the minimum score that is required in order for a hit to be competitive.The implementation supports both minCompetitiveScore by enforce that
∑ max_score >= minCompetitiveScore, and minShouldMatch by enforcingfreq >= minShouldMatch. It keeps sub scorers in 3 different places: - tail: a heap that contains scorers that are behind the desired doc ID. These scorers are ordered by cost so that we can advance the least costly ones first. - lead: a linked list of scorer that are positioned on the desired doc ID - head: a heap that contains scorers which are beyond the desired doc ID, ordered by doc ID in order to move quickly to the next candidate.When scoreMode ==
ScoreMode.TOP_SCORES, it leverages themax scorefrom each scorer in order to know when it may callDocIdSetIterator.advance(int)rather thanDocIdSetIterator.nextDoc()to move to the next competitive hit. When scoreMode !=ScoreMode.TOP_SCORES, block-max scoring related logic is skipped. Finding the next match consists of first setting the desired doc ID to the least entry in 'head', and then advance 'tail' until there is a match, by meeting the configuredfreq >= minShouldMatchand / or∑ max_score >= minCompetitiveScorerequirements.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.search.Scorable
Scorable.ChildScorable
-
-
Field Summary
Fields Modifier and Type Field Description (package private) longcost(package private) intdoc(package private) static intFLOAT_MANTISSA_BITS(package private) intfreq(package private) DisiPriorityQueuehead(package private) DisiWrapperlead(package private) longleadMaxScoreprivate static longMAX_SCALED_SCORE(package private) MaxScoreSumPropagatormaxScorePropagatorprivate longminCompetitiveScore(package private) intminShouldMatchprivate intscalingFactor(package private) ScoreModescoreMode(package private) DisiWrapper[]tail(package private) longtailMaxScore(package private) inttailSize(package private) intupTo
-
Constructor Summary
Constructors Constructor Description WANDScorer(Weight weight, java.util.Collection<Scorer> scorers, int minShouldMatch, ScoreMode scoreMode)
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description private voidaddLead(DisiWrapper lead)Add a disi to the linked list of leads.private voidaddTail(DisiWrapper s)Add an entry to 'tail'.private voidadvanceAllTail()Advance all entries from the tail to know about all matches on the current doc.private voidadvanceHead(int target)Make sure all disis in 'head' are on or after 'target'.intadvanceShallow(int target)Advance to the block of documents that containstargetin order to get scoring information about this block.private voidadvanceTail()Pop the entry from the 'tail' that has the greatest score contribution, advance it to the current doc and then add it to 'lead' or 'head' depending on whether it matches.private voidadvanceTail(DisiWrapper disi)intdocID()Returns the doc ID that is currently being scored.private intdoNextCompetitiveCandidate()Move iterators to the tail until there is a potential match.private static voiddownHeapMaxScore(DisiWrapper[] heap, int size)private booleanensureConsistent()java.util.Collection<Scorable.ChildScorable>getChildren()Returns child sub-scorers positioned on the current documentfloatgetMaxScore(int upTo)Return the maximum score that documents between the lasttargetthat this iterator wasshallow-advancedto included andupToincluded.private static booleangreaterMaxScore(DisiWrapper w1, DisiWrapper w2)In the tail, we want to get first entries that produce the maximum scores and in case of ties (eg.private DisiWrapperinsertTailWithOverFlow(DisiWrapper s)Insert an entry in 'tail' and evict the least-costly scorer if full.DocIdSetIteratoriterator()Return aDocIdSetIteratorover matching documents.private voidmoveToNextCandidate(int target)Set 'doc' to the next potential match, and move all disis of 'head' that are on this doc into 'lead'.private DisiWrapperpopTail()Pop the least-costly scorer from 'tail'.private voidpushBackLeads(int target)Move disis that are in 'lead' back to the tail.(package private) static longscaleMaxScore(float maxScore, int scalingFactor)Scale max scores in an unsigned integer to avoid overflows (only the lower 32 bits of the long are used) as well as floating-point arithmetic errors.private static longscaleMinScore(float minScore, int scalingFactor)Scale min competitive scores the same way as max scores but this time by rounding down in order to make sure that we do not miss any matches.(package private) static intscalingFactor(float f)Return a scaling factor for the given float so thatf x 2^scalingFactorwould be in[2^23, 2^24[.floatscore()Returns the score of the current document matching the query.voidsetMinCompetitiveScore(float minScore)Optional method: Tell the scorer that its iterator may safely ignore all documents whose score is less than the givenminScore.TwoPhaseIteratortwoPhaseIterator()Optional method: Return aTwoPhaseIteratorview of thisScorer.private voidupdateMaxScores(int target)private voidupdateMaxScoresIfNecessary(int target)UpdateupToand maximum scores of sub scorers so thatupTois greater than or equal to the next candidate aftertarget, i.e.private static voidupHeapMaxScore(DisiWrapper[] heap, int i)Heap helpers-
Methods inherited from class org.apache.lucene.search.Scorable
smoothingScore
-
-
-
-
Field Detail
-
FLOAT_MANTISSA_BITS
static final int FLOAT_MANTISSA_BITS
- See Also:
- Constant Field Values
-
MAX_SCALED_SCORE
private static final long MAX_SCALED_SCORE
- See Also:
- Constant Field Values
-
scalingFactor
private final int scalingFactor
-
minCompetitiveScore
private long minCompetitiveScore
-
lead
DisiWrapper lead
-
doc
int doc
-
leadMaxScore
long leadMaxScore
-
head
final DisiPriorityQueue head
-
tail
final DisiWrapper[] tail
-
tailMaxScore
long tailMaxScore
-
tailSize
int tailSize
-
cost
final long cost
-
maxScorePropagator
final MaxScoreSumPropagator maxScorePropagator
-
upTo
int upTo
-
minShouldMatch
final int minShouldMatch
-
freq
int freq
-
scoreMode
final ScoreMode scoreMode
-
-
Method Detail
-
scalingFactor
static int scalingFactor(float f)
Return a scaling factor for the given float so thatf x 2^scalingFactorwould be in[2^23, 2^24[. Special cases:scalingFactor(0) = scalingFactor(MIN_VALUE) + 1 scalingFactor(+Infty) = scalingFactor(MAX_VALUE) - 1
-
scaleMaxScore
static long scaleMaxScore(float maxScore, int scalingFactor)Scale max scores in an unsigned integer to avoid overflows (only the lower 32 bits of the long are used) as well as floating-point arithmetic errors. Those are rounded up in order to make sure we do not miss any matches.
-
scaleMinScore
private static long scaleMinScore(float minScore, int scalingFactor)Scale min competitive scores the same way as max scores but this time by rounding down in order to make sure that we do not miss any matches.
-
ensureConsistent
private boolean ensureConsistent()
-
setMinCompetitiveScore
public void setMinCompetitiveScore(float minScore) throws java.io.IOExceptionDescription copied from class:ScorableOptional method: Tell the scorer that its iterator may safely ignore all documents whose score is less than the givenminScore. This is a no-op by default.This method may only be called from collectors that use
ScoreMode.TOP_SCORES, and successive calls may only set increasing values ofminScore.- Overrides:
setMinCompetitiveScorein classScorable- Throws:
java.io.IOException
-
getChildren
public final java.util.Collection<Scorable.ChildScorable> getChildren() throws java.io.IOException
Description copied from class:ScorableReturns child sub-scorers positioned on the current document- Overrides:
getChildrenin classScorable- Throws:
java.io.IOException
-
iterator
public DocIdSetIterator iterator()
Description copied from class:ScorerReturn aDocIdSetIteratorover matching documents.The returned iterator will either be positioned on
-1if no documents have been scored yet,DocIdSetIterator.NO_MORE_DOCSif all documents have been scored already, or the last document id that has been scored otherwise.The returned iterator is a view: calling this method several times will return iterators that have the same state.
-
twoPhaseIterator
public TwoPhaseIterator twoPhaseIterator()
Description copied from class:ScorerOptional method: Return aTwoPhaseIteratorview of thisScorer. A return value ofnullindicates that two-phase iteration is not supported.Note that the returned
TwoPhaseIterator'sapproximationmust advance synchronously with theScorer.iterator(): advancing the approximation must advance the iterator and vice-versa.Implementing this method is typically useful on
Scorers that have a high per-document overhead in order to confirm matches.The default implementation returns
null.- Overrides:
twoPhaseIteratorin classScorer
-
addLead
private void addLead(DisiWrapper lead)
Add a disi to the linked list of leads.
-
pushBackLeads
private void pushBackLeads(int target) throws java.io.IOExceptionMove disis that are in 'lead' back to the tail.- Throws:
java.io.IOException
-
advanceHead
private void advanceHead(int target) throws java.io.IOExceptionMake sure all disis in 'head' are on or after 'target'.- Throws:
java.io.IOException
-
advanceTail
private void advanceTail(DisiWrapper disi) throws java.io.IOException
- Throws:
java.io.IOException
-
advanceTail
private void advanceTail() throws java.io.IOExceptionPop the entry from the 'tail' that has the greatest score contribution, advance it to the current doc and then add it to 'lead' or 'head' depending on whether it matches.- Throws:
java.io.IOException
-
updateMaxScores
private void updateMaxScores(int target) throws java.io.IOException- Throws:
java.io.IOException
-
updateMaxScoresIfNecessary
private void updateMaxScoresIfNecessary(int target) throws java.io.IOExceptionUpdateupToand maximum scores of sub scorers so thatupTois greater than or equal to the next candidate aftertarget, i.e. the top of `head`.- Throws:
java.io.IOException
-
moveToNextCandidate
private void moveToNextCandidate(int target) throws java.io.IOExceptionSet 'doc' to the next potential match, and move all disis of 'head' that are on this doc into 'lead'.- Throws:
java.io.IOException
-
doNextCompetitiveCandidate
private int doNextCompetitiveCandidate() throws java.io.IOExceptionMove iterators to the tail until there is a potential match.- Throws:
java.io.IOException
-
advanceAllTail
private void advanceAllTail() throws java.io.IOExceptionAdvance all entries from the tail to know about all matches on the current doc.- Throws:
java.io.IOException
-
score
public float score() throws java.io.IOExceptionDescription copied from class:ScorableReturns the score of the current document matching the query.
-
advanceShallow
public int advanceShallow(int target) throws java.io.IOExceptionDescription copied from class:ScorerAdvance to the block of documents that containstargetin order to get scoring information about this block. This method is implicitly called byDocIdSetIterator.advance(int)andDocIdSetIterator.nextDoc()on the returned doc ID. Calling this method doesn't modify the currentDocIdSetIterator.docID(). It returns a number that is greater than or equal to all documents contained in the current block, but less than any doc IDS of the next block.targetmust be >=Scorable.docID()as well as all targets that have been passed toScorer.advanceShallow(int)so far.- Overrides:
advanceShallowin classScorer- Throws:
java.io.IOException
-
getMaxScore
public float getMaxScore(int upTo) throws java.io.IOExceptionDescription copied from class:ScorerReturn the maximum score that documents between the lasttargetthat this iterator wasshallow-advancedto included andupToincluded.- Specified by:
getMaxScorein classScorer- Throws:
java.io.IOException
-
docID
public int docID()
Description copied from class:ScorableReturns the doc ID that is currently being scored.
-
insertTailWithOverFlow
private DisiWrapper insertTailWithOverFlow(DisiWrapper s)
Insert an entry in 'tail' and evict the least-costly scorer if full.
-
addTail
private void addTail(DisiWrapper s)
Add an entry to 'tail'. Fails if over capacity.
-
popTail
private DisiWrapper popTail()
Pop the least-costly scorer from 'tail'.
-
upHeapMaxScore
private static void upHeapMaxScore(DisiWrapper[] heap, int i)
Heap helpers
-
downHeapMaxScore
private static void downHeapMaxScore(DisiWrapper[] heap, int size)
-
greaterMaxScore
private static boolean greaterMaxScore(DisiWrapper w1, DisiWrapper w2)
In the tail, we want to get first entries that produce the maximum scores and in case of ties (eg. constant-score queries), those that have the least cost so that they are likely to advance further.
-
-