Class TokenInfoMorphData
java.lang.Object
org.apache.lucene.analysis.ja.dict.TokenInfoMorphData
- All Implemented Interfaces:
JaMorphData, MorphData
- Direct Known Subclasses:
UnknownMorphData
Morphological information for the system dictionary.
-
Field Summary
Fields
Modifier and Type | Field | Description
private final ByteBuffer | buffer |
static final int | HAS_BASEFORM | Flag that the entry has base form data.
static final int | HAS_PRONUNCIATION | Flag that the entry has pronunciation data.
static final int | HAS_READING | Flag that the entry has reading data.
private final String[] | inflFormDict |
private final String[] | inflTypeDict |
private final String[] | posDict |
-
Constructor Summary
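Constructors
Constructor | Description
TokenInfoMorphData(ByteBuffer buffer, IOSupplier<InputStream> posResource) |
-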
Method Summary
Modifier and Type | Method | Description
private static int | baseFormOffset(int wordId) |
String | getBaseForm(int morphId, char[] surfaceForm, int off, int len) | Get base form of word
String | getInflectionForm(int wordId) | Get inflection form of tokens
String | getInflectionType(int morphId) | Get inflection type of tokens
int | getLeftId(int morphId) | Get left id of specified word
String | getPartOfSpeech(int morphId) | Get Part-Of-Speech of tokens
String | getPronunciation(int morphId, char[] surface, int off, int len) | Get pronunciation of tokens
String | getReading(int morphId, char[] surface, int off, int len) | Get reading of tokens
int | getRightId(int morphId) | Get right id of specified word
int | getWordCost(int morphId) | Get word cost of specified word
private boolean | hasBaseFormData(int wordId) |
private boolean | hasPronunciationData(int wordId) |
private boolean | hasReadingData(int wordId) |
private static void | populatePosDict(DataInput in, int posSize, String[] posDict, String[] inflTypeDict, String[] inflFormDict) |
private int | pronunciationOffset(int wordId) |
private int | readingOffset(int wordId) |
private String | readString(int offset, int length, boolean kana) |
-
Field Details
-
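buffer
private final ByteBuffer buffer
-
posDict
private final String[] posDict
-
inflTypeDict
private final String[] inflTypeDict
-
inflFormDict
private final String[] inflFormDict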
-
HAS_BASEFORM
public static final int HAS_BASEFORM
Flag indicating that the entry has base form data. Otherwise the word is not inflected and its base form is the same as the surface form.
- See Also:
Constant Field Values
-
HAS_READING
public static final int HAS_READING
Flag indicating that the entry has reading data. Otherwise the reading is the surface form converted to katakana.
- See Also:
Constant Field Values
-
HAS_PRONUNCIATION
public static final int HAS_PRONUNCIATION
Flag indicating that the entry has pronunciation data. Otherwise the pronunciation is the same as the reading.
- See Also:
Constant Field Values
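The fallback rules described by the three flags can be sketched as follows. This is a minimal illustration, not the class's actual code: `MorphFlagSketch` and its methods are hypothetical, the bit values 1, 2 and 4 are assumed for the example (this page does not list the real values), and the katakana conversion assumes the standard Unicode offset of 0x60 between the hiragana and katakana blocks.

```java
/** Hypothetical sketch of the fallback rules implied by the HAS_* flags. */
final class MorphFlagSketch {
  // Assumed bit values, for illustration only.
  static final int HAS_BASEFORM = 1;
  static final int HAS_READING = 2;
  static final int HAS_PRONUNCIATION = 4;

  /** Returns true if the given flag bit is set in the entry's flag bits. */
  static boolean has(int flags, int flag) {
    return (flags & flag) != 0;
  }

  /** Converts hiragana (U+3041..U+3096) to katakana by adding 0x60; other chars are copied. */
  static String toKatakana(char[] surface, int off, int len) {
    char[] out = new char[len];
    for (int i = 0; i < len; i++) {
      char ch = surface[off + i];
      out[i] = (ch >= 0x3041 && ch <= 0x3096) ? (char) (ch + 0x60) : ch;
    }
    return new String(out);
  }

  /** Stored base form if flagged; otherwise null, meaning "same as surface form". */
  static String baseForm(int flags, String storedBaseForm) {
    return has(flags, HAS_BASEFORM) ? storedBaseForm : null;
  }

  /** Stored reading if flagged; otherwise the surface form converted to katakana. */
  static String reading(int flags, String storedReading, char[] surface, int off, int len) {
    return has(flags, HAS_READING) ? storedReading : toKatakana(surface, off, len);
  }

  /** Stored pronunciation if flagged; otherwise the reading. */
  static String pronunciation(int flags, String storedPronunciation, String reading) {
    return has(flags, HAS_PRONUNCIATION) ? storedPronunciation : reading;
  }

  public static void main(String[] args) {
    // Hiragana "sushi" with no flags set: the reading falls back to katakana.
    char[] surface = "\u3059\u3057".toCharArray();
    String reading = reading(0, null, surface, 0, surface.length);
    System.out.println(reading);
    System.out.println(pronunciation(0, null, reading));
  }
}
```

Storing a field only when it differs from what can be derived from the surface form presumably keeps the dictionary data compact, which is why each accessor has a defined fallback.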
-
-
Constructor Details
-
TokenInfoMorphData
TokenInfoMorphData(ByteBuffer buffer, IOSupplier<InputStream> posResource) throws IOException
- Throws:
IOException
-
-
Method Details
-
populatePosDict
private static void populatePosDict(DataInput in, int posSize, String[] posDict, String[] inflTypeDict, String[] inflFormDict) throws IOException
- Throws:
IOException
-
getLeftId
public int getLeftId(int morphId)
Description copied from interface: MorphData
Get left id of specified word
-
getRightId
public int getRightId(int morphId)
Description copied from interface: MorphData
Get right id of specified word
- Specified by:
getRightId in interface MorphData
- Returns:
right id
-
getWordCost
public int getWordCost(int morphId)
Description copied from interface: MorphData
Get word cost of specified word
- Specified by:
getWordCost in interface MorphData
- Returns:
word's cost
-
getBaseForm
public String getBaseForm(int morphId, char[] surfaceForm, int off, int len)
Description copied from interface: JaMorphData
Get base form of word
- Specified by:
getBaseForm in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Base form (only different for inflected words, otherwise null)
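Because the base form is null for words that are not inflected, callers typically need a fallback to the surface text. A minimal sketch of that pattern, where `BaseFormFallbackSketch` and `baseFormOrSurface` are hypothetical names and not part of this class:

```java
/** Hypothetical helper showing how a caller might handle a null base form. */
final class BaseFormFallbackSketch {
  /** Returns baseForm if non-null; otherwise the surface text surface[off, off + len). */
  static String baseFormOrSurface(String baseForm, char[] surface, int off, int len) {
    return baseForm != null ? baseForm : new String(surface, off, len);
  }

  public static void main(String[] args) {
    char[] surface = "walked home".toCharArray();
    // Inflected word: a base form is available and is used as-is.
    System.out.println(baseFormOrSurface("walk", surface, 0, 6));
    // Uninflected word: null base form, so the surface slice is returned.
    System.out.println(baseFormOrSurface(null, surface, 7, 4));
  }
}
```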
-
getReading
public String getReading(int morphId, char[] surface, int off, int len)
Description copied from interface: JaMorphData
Get reading of tokens
- Specified by:
getReading in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Reading of the token
-
getPartOfSpeech
public String getPartOfSpeech(int morphId)
Description copied from interface: JaMorphData
Get Part-Of-Speech of tokens
- Specified by:
getPartOfSpeech in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Part-Of-Speech of the token
-
getPronunciation
public String getPronunciation(int morphId, char[] surface, int off, int len)
Description copied from interface: JaMorphData
Get pronunciation of tokens
- Specified by:
getPronunciation in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int morphId)
Description copied from interface: JaMorphData
Get inflection type of tokens
- Specified by:
getInflectionType in interface JaMorphData
- Parameters:
morphId - word ID of token
- Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: JaMorphData
Get inflection form of tokens
- Specified by:
getInflectionForm in interface JaMorphData
- Parameters:
wordId - word ID of token
- Returns:
inflection form, or null
-
readingOffset
private int readingOffset(int wordId)
-
pronunciationOffset
private int pronunciationOffset(int wordId)
-
baseFormOffset
private static int baseFormOffset(int wordId)
-
hasBaseFormData
private boolean hasBaseFormData(int wordId)
-
hasReadingData
private boolean hasReadingData(int wordId)
-
hasPronunciationData
private boolean hasPronunciationData(int wordId)
-
readString
private String readString(int offset, int length, boolean kana)
-