Class CapitalizationFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenFilterFactory
      - org.apache.lucene.analysis.miscellaneous.CapitalizationFilterFactory
public class CapitalizationFilterFactory extends TokenFilterFactory
Factory for CapitalizationFilter. The factory takes parameters:
- "onlyFirstWord" - whether only the first word of a token is capitalized (true) or every word (false).
- "keep" - a keep word list. The words are separated by whitespace.
- "keepIgnoreCase" - true or false. If true, the keep list is treated as case-insensitive.
- "forceFirstLetter" - force the first letter to be capitalized even if the word is in the keep list.
- "okPrefix" - do not change a word's capitalization if the word begins with an entry in this list. For example, if "McK" is on the okPrefix list, the word "McKinley" is not changed to "Mckinley".
- "minWordLength" - how long a word must be for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
- "maxWordCount" - if the token contains more than maxWordCount words, the capitalization is assumed to be correct.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true"
            keep="java solr lucene" keepIgnoreCase="false"
            okPrefix="McK McD McA"/>
  </analyzer>
</fieldType>
- Since:
- solr 1.3
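The parameter descriptions above can be made concrete with a small, standard-library-only sketch. This is not Lucene's implementation: the class name `CapitalizationSketch`, the helper methods, the single-space word splitting, and the order in which the rules are checked are all assumptions made for illustration. It only mirrors the behaviour that the parameter list describes.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Simplified illustration of the documented parameters.
// NOT Lucene's code; names and rule order are assumptions.
final class CapitalizationSketch {

    static String capitalize(String token, boolean onlyFirstWord, Set<String> keep,
                             boolean keepIgnoreCase, boolean forceFirstLetter,
                             List<String> okPrefix, int minWordLength, int maxWordCount) {
        // Assumption: words inside a token are separated by single spaces.
        String[] words = token.split(" ", -1);
        if (words.length > maxWordCount) {
            return token; // too many words: capitalization is assumed to be correct
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(processWord(words[i], i, onlyFirstWord, keep, keepIgnoreCase,
                    forceFirstLetter, okPrefix, minWordLength));
        }
        return out.toString();
    }

    private static String processWord(String word, int index, boolean onlyFirstWord,
                                      Set<String> keep, boolean keepIgnoreCase,
                                      boolean forceFirstLetter, List<String> okPrefix,
                                      int minWordLength) {
        if (word.isEmpty()) {
            return word;
        }
        if (onlyFirstWord && index > 0) {
            return word.toLowerCase(Locale.ROOT); // only the first word is capitalized
        }
        boolean kept = keep.stream().anyMatch(
                k -> keepIgnoreCase ? k.equalsIgnoreCase(word) : k.equals(word));
        if (kept) {
            // A kept word is left alone, except that forceFirstLetter may
            // upper-case the first letter of the token.
            if (index == 0 && forceFirstLetter) {
                return Character.toUpperCase(word.charAt(0)) + word.substring(1);
            }
            return word;
        }
        if (word.length() < minWordLength) {
            return word; // too short to be capitalized
        }
        for (String prefix : okPrefix) {
            if (word.startsWith(prefix)) {
                return word; // e.g. "McKinley" with okPrefix "McK"
            }
        }
        return Character.toUpperCase(word.charAt(0))
                + word.substring(1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Set<String> noKeep = Set.of();
        List<String> noPrefix = List.of();
        int noLimit = Integer.MAX_VALUE;

        // minWordLength = 3: "and" is capitalized, "or" is left alone.
        System.out.println(capitalize("and", false, noKeep, false, false, noPrefix, 3, noLimit));
        System.out.println(capitalize("or", false, noKeep, false, false, noPrefix, 3, noLimit));

        // okPrefix = "McK": the inner capital K survives.
        System.out.println(capitalize("McKinley", false, noKeep, false, false,
                List.of("McK"), 0, noLimit));

        // onlyFirstWord = true versus false on a multi-word token.
        System.out.println(capitalize("hello big WORLD", true, noKeep, false, false,
                noPrefix, 0, noLimit));
        System.out.println(capitalize("hello big WORLD", false, noKeep, false, false,
                noPrefix, 0, noLimit));
    }
}
```

Multi-word tokens only arise when the upstream tokenizer does not split on whitespace, so with a whitespace tokenizer, as in the fieldType example, "onlyFirstWord" and "maxWordCount" would rarely come into play.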
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static java.lang.String FORCE_FIRST_LETTER
- (package private) boolean forceFirstLetter
- (package private) CharArraySet keep
- static java.lang.String KEEP
- static java.lang.String KEEP_IGNORE_CASE
- static java.lang.String MAX_TOKEN_LENGTH
- static java.lang.String MAX_WORD_COUNT
- (package private) int maxTokenLength
- (package private) int maxWordCount
- static java.lang.String MIN_WORD_LENGTH
- (package private) int minWordLength
- static java.lang.String NAME - SPI name
- static java.lang.String OK_PREFIX
- (package private) java.util.Collection<char[]> okPrefix
- static java.lang.String ONLY_FIRST_WORD
- (package private) boolean onlyFirstWord
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors (Constructor, Description):
- CapitalizationFilterFactory() - Default ctor for compatibility with SPI
- CapitalizationFilterFactory(java.util.Map<java.lang.String,java.lang.String> args) - Creates a new CapitalizationFilterFactory
-
Method Summary
Methods (Modifier and Type, Method, Description):
- CapitalizationFilter create(TokenStream input) - Transform the specified input TokenStream
-
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
-
-
-
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
KEEP
public static final java.lang.String KEEP
- See Also:
- Constant Field Values
-
KEEP_IGNORE_CASE
public static final java.lang.String KEEP_IGNORE_CASE
- See Also:
- Constant Field Values
-
OK_PREFIX
public static final java.lang.String OK_PREFIX
- See Also:
- Constant Field Values
-
MIN_WORD_LENGTH
public static final java.lang.String MIN_WORD_LENGTH
- See Also:
- Constant Field Values
-
MAX_WORD_COUNT
public static final java.lang.String MAX_WORD_COUNT
- See Also:
- Constant Field Values
-
MAX_TOKEN_LENGTH
public static final java.lang.String MAX_TOKEN_LENGTH
- See Also:
- Constant Field Values
-
ONLY_FIRST_WORD
public static final java.lang.String ONLY_FIRST_WORD
- See Also:
- Constant Field Values
-
FORCE_FIRST_LETTER
public static final java.lang.String FORCE_FIRST_LETTER
- See Also:
- Constant Field Values
-
keep
CharArraySet keep
-
okPrefix
java.util.Collection<char[]> okPrefix
-
minWordLength
final int minWordLength
-
maxWordCount
final int maxWordCount
-
maxTokenLength
final int maxTokenLength
-
onlyFirstWord
final boolean onlyFirstWord
-
forceFirstLetter
final boolean forceFirstLetter
-
-
Method Detail
-
create
public CapitalizationFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
- create in class TokenFilterFactory
-
-