Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenizerFactory
      - org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
public class WhitespaceTokenizerFactory extends TokenizerFactory
Factory for WhitespaceTokenizer.

<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
  </analyzer>
</fieldType>

Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: max token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). It is rare to need to change this; when it is not set, CharTokenizer::DEFAULT_MAX_TOKEN_LEN is used.
- Since:
- 3.1
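The two values of the rule option differ in what counts as whitespace. The sketch below illustrates that difference with the JDK alone and does not call Lucene. It assumes that the "java" rule follows Character.isWhitespace and that the "unicode" rule follows the Unicode White_Space property; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.regex.Pattern;

public class WhitespaceRuleSketch {
    // Unicode White_Space binary property, as exposed by java.util.regex.
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}");

    /** Assumed behavior of rule="java": Character.isWhitespace(int). */
    public static boolean isJavaWhitespace(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    /** Assumed behavior of rule="unicode": the Unicode White_Space property. */
    public static boolean isUnicodeWhitespace(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return UNICODE_WS.matcher(s).matches();
    }

    public static void main(String[] args) {
        int nbsp = 0x00A0; // NO-BREAK SPACE
        System.out.println(isJavaWhitespace(nbsp));    // false
        System.out.println(isUnicodeWhitespace(nbsp)); // true
    }
}
```

Under these assumptions, text containing a no-break space would stay in one token with rule="java" and would be split at that character with rule="unicode".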
Field Summary
Fields
Modifier and Type                                      Field          Description
private int                                            maxTokenLen
static java.lang.String                                NAME           SPI name
private java.lang.String                               rule
static java.lang.String                                RULE_JAVA
private static java.util.Collection<java.lang.String>  RULE_NAMES
static java.lang.String                                RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                                                        Description
WhitespaceTokenizerFactory()                                                       Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(java.util.Map<java.lang.String,java.lang.String> args)  Creates a new WhitespaceTokenizerFactory
-
Method Summary
Modifier and Type  Method                            Description
Tokenizer          create(AttributeFactory factory)  Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
RULE_JAVA
public static final java.lang.String RULE_JAVA
- See Also:
- Constant Field Values
-
RULE_UNICODE
public static final java.lang.String RULE_UNICODE
- See Also:
- Constant Field Values
-
RULE_NAMES
private static final java.util.Collection<java.lang.String> RULE_NAMES
-
rule
private final java.lang.String rule
-
maxTokenLen
private final int maxTokenLen
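
The effect of maxTokenLen can be sketched without Lucene. The code below is a hypothetical stand-in, not the library's implementation. It assumes that a run of non-whitespace characters longer than maxTokenLen is split into chunks rather than dropped, that length is counted in UTF-16 chars, and that the upper bound is exclusive, following the wording of the class description. The class name, method names, and error message are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxTokenLenSketch {
    // Limit quoted in the class description (1024*1024).
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;

    /** Splits text on Character.isWhitespace, cutting runs that reach maxTokenLen. */
    public static List<String> tokenize(String text, int maxTokenLen) {
        // Assumption: bounds are checked as "greater than 0 and less than the limit".
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLen must be > 0 and < " + MAX_TOKEN_LENGTH_LIMIT + ", got " + maxTokenLen);
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                flush(current, tokens);           // whitespace ends the current token
            } else {
                current.appendCodePoint(cp);
                if (current.length() >= maxTokenLen) {
                    flush(current, tokens);       // assumed: over-long runs are split
                }
            }
        }
        flush(current, tokens);                   // emit any trailing token
        return tokens;
    }

    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("ab abcdefg", 3)); // [ab, abc, def, g]
    }
}
```

With a limit as generous as the 256 used in the configuration example, ordinary text is unaffected, which is consistent with the note that this option rarely needs changing.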
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
- Specified by:
- create in class TokenizerFactory