Package org.apache.lucene.analysis.id
Class IndonesianStemmer
java.lang.Object
org.apache.lucene.analysis.id.IndonesianStemmer
Stemmer for Indonesian.
Stems Indonesian words with the algorithm presented in: A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia, Fadillah Z Tala. http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf
-
Field Summary
Fields
Modifier and Type           Field
private int                 flags
private int                 numSyllables
private static final int    REMOVED_BER
private static final int    REMOVED_DI
private static final int    REMOVED_KE
private static final int    REMOVED_MENG
private static final int    REMOVED_PE
private static final int    REMOVED_PENG
private static final int    REMOVED_TER
-
Constructor Summary
Constructors
Constructor
IndonesianStemmer()
-
Method Summary
Modifier and Type    Method                                                     Description
private boolean      isVowel(char ch)
private int          removeFirstOrderPrefix(char[] text, int length)
private int          removeParticle(char[] text, int length)
private int          removePossessivePronoun(char[] text, int length)
private int          removeSecondOrderPrefix(char[] text, int length)
private int          removeSuffix(char[] text, int length)
int                  stem(char[] text, int length, boolean stemDerivational)    Stem a term (returning its new length).
private int          stemDerivational(char[] text, int length)
-
Field Details
-
numSyllables
private int numSyllables -
flags
private int flags -
REMOVED_KE
private static final int REMOVED_KE
-
REMOVED_PENG
private static final int REMOVED_PENG
-
REMOVED_DI
private static final int REMOVED_DI
-
REMOVED_MENG
private static final int REMOVED_MENG
-
REMOVED_TER
private static final int REMOVED_TER
-
REMOVED_BER
private static final int REMOVED_BER
-
REMOVED_PE
private static final int REMOVED_PE
-
-
Constructor Details
-
IndonesianStemmer
public IndonesianStemmer()
-
-
Method Details
-
stem
public int stem(char[] text, int length, boolean stemDerivational)
Stem a term (returning its new length).
Use stemDerivational to control whether full stemming or only light inflectional stemming is done. -
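The calling convention of stem (a char[] buffer plus a length goes in, the new length comes out) can be illustrated with a minimal, self-contained sketch. The class StemBufferSketch and its methods stripParticle and endsWith are hypothetical stand-ins written for this illustration and are not Lucene's implementation. The sketch only drops a trailing particle (-kah, -lah, -pun), and it assumes a simple minimum-length guard where a real stemmer would apply its own rules.

```java
class StemBufferSketch {
    // Hypothetical stand-in, NOT Lucene's code: drops one trailing particle
    // (-kah, -lah, -pun) when at least two characters would remain, and
    // returns the new logical length. The buffer itself is left untouched;
    // callers simply ignore everything past the returned length.
    static int stripParticle(char[] text, int length) {
        String[] particles = {"kah", "lah", "pun"};
        for (String particle : particles) {
            if (length - particle.length() >= 2 && endsWith(text, length, particle)) {
                return length - particle.length();
            }
        }
        return length;
    }

    // True if the first `length` chars of `text` end with `suffix`.
    static boolean endsWith(char[] text, int length, String suffix) {
        int start = length - suffix.length();
        if (start < 0) {
            return false;
        }
        for (int i = 0; i < suffix.length(); i++) {
            if (text[start + i] != suffix.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        char[] buffer = "bukulah".toCharArray();
        // Same shape as a call to stem(text, length, ...): pass the buffer
        // and its current length, keep only the returned length.
        int newLength = stripParticle(buffer, buffer.length);
        System.out.println(new String(buffer, 0, newLength));
    }
}
```

With Lucene on the classpath, the documented method would be called in the same way, as stem(buffer, length, stemDerivational), and the caller would read the stemmed term from the first returned-length characters of the buffer.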
stemDerivational
private int stemDerivational(char[] text, int length) -
isVowel
private boolean isVowel(char ch) -
removeParticle
private int removeParticle(char[] text, int length) -
removePossessivePronoun
private int removePossessivePronoun(char[] text, int length) -
removeFirstOrderPrefix
private int removeFirstOrderPrefix(char[] text, int length) -
removeSecondOrderPrefix
private int removeSecondOrderPrefix(char[] text, int length) -
removeSuffix
private int removeSuffix(char[] text, int length)
-