Class PhoneticFilterFactory
java.lang.Object
  org.apache.lucene.analysis.AbstractAnalysisFactory
    org.apache.lucene.analysis.TokenFilterFactory
      org.apache.lucene.analysis.phonetic.PhoneticFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class PhoneticFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Factory for PhoneticFilter. Creates tokens based on phonetic encoders from Apache Commons Codec.
This takes one required argument, "encoder", and the rest are optional:
- encoder
- required, one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex", "Caverphone" (v2.0), "ColognePhonetic" or "Nysiis" (case insensitive). Any other value is resolved as a class name: it is used as-is if it already contains a '.', otherwise it is looked up in the same package as the encoders listed above.
- inject
- (default=true) add the encoded tokens to the stream as synonyms of the original tokens, with offset=0 (at the same position as the original token)
- maxCodeLength
- optional. The maximum length of the phonetic codes, as defined by the encoder. Specifying this for an encoder that doesn't support it is an error.
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
  </analyzer>
</fieldType>
- Since:
- 3.1
- See Also:
PhoneticFilter
Field Summary
Fields
Modifier and Type    Field    Description
private java.lang.Class<? extends org.apache.commons.codec.Encoder>    clazz
static java.lang.String    ENCODER    parameter name: either a short name or a full class name
(package private) boolean    inject
static java.lang.String    INJECT    parameter name: true if encoded tokens should be added as synonyms
static java.lang.String    MAX_CODE_LENGTH    parameter name: restricts the length of the phonetic code
private java.lang.Integer    maxCodeLength
private java.lang.String    name
static java.lang.String    NAME    SPI name
private static java.lang.String    PACKAGE_CONTAINING_ENCODERS
private static java.util.Map<java.lang.String,java.lang.Class<? extends org.apache.commons.codec.Encoder>>    registry
private java.lang.reflect.Method    setMaxCodeLenMethod
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor    Description
PhoneticFilterFactory()    Default ctor for compatibility with SPI
PhoneticFilterFactory(java.util.Map<java.lang.String,java.lang.String> args)    Creates a new PhoneticFilterFactory
-
Method Summary
Modifier and Type    Method    Description
PhoneticFilter    create(TokenStream input)    Transform the specified input TokenStream
protected org.apache.commons.codec.Encoder    getEncoder()    Must be thread-safe.
void    inform(ResourceLoader loader)    Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
private java.lang.Class<? extends org.apache.commons.codec.Encoder>    resolveEncoder(java.lang.String name, ResourceLoader loader)
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
ENCODER
public static final java.lang.String ENCODER
parameter name: either a short name or a full class name
- See Also:
- Constant Field Values
-
INJECT
public static final java.lang.String INJECT
parameter name: true if encoded tokens should be added as synonyms
- See Also:
- Constant Field Values
-
MAX_CODE_LENGTH
public static final java.lang.String MAX_CODE_LENGTH
parameter name: restricts the length of the phonetic code
- See Also:
- Constant Field Values
-
PACKAGE_CONTAINING_ENCODERS
private static final java.lang.String PACKAGE_CONTAINING_ENCODERS
- See Also:
- Constant Field Values
-
registry
private static final java.util.Map<java.lang.String,java.lang.Class<? extends org.apache.commons.codec.Encoder>> registry
-
inject
final boolean inject
-
name
private final java.lang.String name
-
maxCodeLength
private final java.lang.Integer maxCodeLength
-
clazz
private java.lang.Class<? extends org.apache.commons.codec.Encoder> clazz
-
setMaxCodeLenMethod
private java.lang.reflect.Method setMaxCodeLenMethod
Method Detail
-
inform
public void inform(ResourceLoader loader) throws java.io.IOException
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
java.io.IOException
-
resolveEncoder
private java.lang.Class<? extends org.apache.commons.codec.Encoder> resolveEncoder(java.lang.String name, ResourceLoader loader)
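The class description states that an encoder value outside the built-in short names is treated as a class name. The following is a minimal, dependency-free sketch of that rule, not the factory's actual code. The class name EncoderNameResolutionSketch, the method resolveClassName, the package string, and the mapping of "Caverphone" to Caverphone2 are assumptions made for illustration; class names are kept as plain strings so nothing is loaded.

```java
import java.util.Map;
import java.util.TreeMap;

final class EncoderNameResolutionSketch {
    // Assumption: the Commons Codec package that holds the built-in encoders.
    static final String ENCODER_PACKAGE = "org.apache.commons.codec.language.";

    // Short names from the class description, looked up case-insensitively.
    // Class names are kept as strings so the sketch needs no dependencies.
    static final Map<String, String> SHORT_NAMES = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
    static {
        for (String n : new String[] {"DoubleMetaphone", "Metaphone", "Soundex",
                                      "RefinedSoundex", "ColognePhonetic", "Nysiis"}) {
            SHORT_NAMES.put(n, ENCODER_PACKAGE + n);
        }
        // Assumption: "Caverphone" (v2.0) corresponds to the Caverphone2 class.
        SHORT_NAMES.put("Caverphone", ENCODER_PACKAGE + "Caverphone2");
    }

    /** Returns the fully qualified class name an encoder argument would resolve to. */
    static String resolveClassName(String encoder) {
        String known = SHORT_NAMES.get(encoder);
        if (known != null) {
            return known;
        }
        // Not a short name: use it as-is if it contains a '.', else prepend the package.
        return encoder.indexOf('.') >= 0 ? encoder : ENCODER_PACKAGE + encoder;
    }

    public static void main(String[] args) {
        System.out.println(resolveClassName("soundex"));
        System.out.println(resolveClassName("com.example.MyEncoder")); // hypothetical class
        System.out.println(resolveClassName("MatchRatingApproachEncoder"));
    }
}
```

In the real factory the resolved name would then be loaded through the ResourceLoader passed to inform; the sketch stops at the name so it can run without Lucene or Commons Codec on the classpath.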
-
getEncoder
protected org.apache.commons.codec.Encoder getEncoder()
Must be thread-safe.
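The maxCodeLength argument is an error for encoders that do not support it, and the class keeps a java.lang.reflect.Method field named setMaxCodeLenMethod. One plausible reading is that the setter is looked up reflectively and applied to each new encoder. The sketch below illustrates that idea under stated assumptions: the setter name setMaxCodeLen, the helper newEncoder, and both encoder classes are hypothetical stand-ins, not Commons Codec or Lucene types.

```java
import java.lang.reflect.Method;
import java.util.Locale;

final class MaxCodeLengthSketch {
    /** Hypothetical encoder that supports a length limit. */
    static final class LimitedEncoder {
        private int maxCodeLen = 4;
        public void setMaxCodeLen(int maxCodeLen) { this.maxCodeLen = maxCodeLen; }
        public String encode(String s) {
            String upper = s.toUpperCase(Locale.ROOT);
            return upper.substring(0, Math.min(maxCodeLen, upper.length()));
        }
    }

    /** Hypothetical encoder with no length setting. */
    static final class FixedEncoder {
        public String encode(String s) { return s.toUpperCase(Locale.ROOT); }
    }

    /** Creates a fresh encoder; rejects maxCodeLength if the class has no setter. */
    static Object newEncoder(Class<?> clazz, Integer maxCodeLength) {
        Method setter = null;
        if (maxCodeLength != null) {
            try {
                // Assumed setter name; looked up once, before any instance exists.
                setter = clazz.getMethod("setMaxCodeLen", int.class);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(
                    clazz.getName() + " does not support maxCodeLength", e);
            }
        }
        try {
            Object encoder = clazz.getDeclaredConstructor().newInstance();
            if (setter != null) {
                setter.invoke(encoder, maxCodeLength);
            }
            return encoder;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not create " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        LimitedEncoder e = (LimitedEncoder) newEncoder(LimitedEncoder.class, 2);
        System.out.println(e.encode("smith")); // prints "SM"
    }
}
```

Creating a new instance on every call is one simple way to satisfy a thread-safety requirement, since no encoder state is shared between callers; whether the factory does exactly this is not stated on this page.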
-
create
public PhoneticFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
create in class TokenFilterFactory
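The filter returned by create presumably honors the factory's inject setting, which the field documentation describes as adding encoded tokens as synonyms. The sketch below models the effect on a token stream with plain Java collections. It is a simplified illustration, not PhoneticFilter itself: the Token class, the filter method, and TOY_ENCODER (upper-case, vowels and Y removed) are hypothetical, the replace-on-false behavior is this sketch's reading of "added as synonyms", and the order of the two tokens sharing a position is arbitrary here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

final class InjectSketch {
    /** Minimal token model: a term plus its position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;
        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
        @Override public String toString() { return term + "(+" + positionIncrement + ")"; }
    }

    // Hypothetical stand-in for a phonetic encoder: upper-case, drop vowels and Y.
    static final UnaryOperator<String> TOY_ENCODER =
        s -> s.toUpperCase(Locale.ROOT).replaceAll("[AEIOUY]", "");

    static List<Token> filter(List<String> terms, UnaryOperator<String> encoder, boolean inject) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            String code = encoder.apply(term);
            if (inject) {
                // Keep the original and add the code at the same position (increment 0).
                out.add(new Token(term, 1));
                out.add(new Token(code, 0));
            } else {
                // Replace the original term with its code.
                out.add(new Token(code, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("smith");
        System.out.println(filter(terms, TOY_ENCODER, true));  // [smith(+1), SMTH(+0)]
        System.out.println(filter(terms, TOY_ENCODER, false)); // [SMTH(+1)]
    }
}
```

With inject enabled, exact matches on the original term remain possible while "smith" and "smyth" also meet on the shared code; with it disabled, only the code is indexed.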