Class BreakIteratorWrapper
java.lang.Object
org.apache.lucene.analysis.icu.segmentation.BreakIteratorWrapper
Wraps RuleBasedBreakIterator, making object reuse convenient and emitting a rule status for emoji
sequences.
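The reuse pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Lucene class: `ReusableBreakWrapper` and its `segments` helper are hypothetical names, and the JDK's `java.text.BreakIterator` stands in for ICU's `RuleBasedBreakIterator` so the example needs no external library. It assumes callers set the text once and then call `next()` until `BreakIterator.DONE`.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in, not the Lucene class: wraps a JDK BreakIterator
// so that a single instance can be re-pointed at new text many times.
final class ReusableBreakWrapper {
    private final BreakIterator breaker;

    ReusableBreakWrapper(BreakIterator breaker) {
        this.breaker = breaker;
    }

    // Re-targets the shared iterator at text[start, start + length).
    void setText(char[] text, int start, int length) {
        // Copying into a String keeps the sketch short; a reusable
        // character iterator would avoid this allocation.
        breaker.setText(new String(text, start, length));
    }

    int current() {
        return breaker.current();
    }

    int next() {
        return breaker.next();
    }

    // Collects the segments between consecutive breaks of the given slice.
    static List<String> segments(ReusableBreakWrapper w, char[] text, int start, int length) {
        w.setText(text, start, length);
        List<String> out = new ArrayList<>();
        int begin = w.current();
        for (int end = w.next(); end != BreakIterator.DONE; begin = end, end = w.next()) {
            // Break offsets are relative to the slice, so add start back in.
            out.add(new String(text, start + begin, end - begin));
        }
        return out;
    }

    public static void main(String[] args) {
        ReusableBreakWrapper w =
            new ReusableBreakWrapper(BreakIterator.getWordInstance(Locale.ROOT));
        char[] buffer = "xxhello worldyy".toCharArray();
        // The same wrapper instance serves both calls.
        System.out.println(segments(w, buffer, 2, 11));
        System.out.println(segments(w, buffer, 2, 5));
    }
}
```

The point of the design is that the iterator is built once and only its text is swapped, which matters when many short inputs are segmented in sequence.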
-
Field Summary
Fields
Modifier and Type                                               Field           Description
(package private) static final com.ibm.icu.text.UnicodeSet     EMOJI
(package private) static final com.ibm.icu.text.UnicodeSet     EMOJI_RK
private final com.ibm.icu.text.RuleBasedBreakIterator           rbbi
private int                                                     start
private int                                                     status
private char[]                                                  text
private final CharArrayIterator                                 textIterator
-
Constructor Summary
Constructors
Constructor                                                                 Description
BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
-
Method Summary
Modifier and Type         Method                                        Description
private int               calcStatus(int current, int next)             Returns the current rule status for the text between breaks.
(package private) int     current()
(package private) int     getRuleStatus()
private boolean           isEmoji(int current, int next)                Returns true if the current text represents an emoji character or sequence.
(package private) int     next()
(package private) void    setText(char[] text, int start, int length)
-
Field Details
-
textIterator
private final CharArrayIterator textIterator -
rbbi
private final com.ibm.icu.text.RuleBasedBreakIterator rbbi -
text
private char[] text -
start
private int start -
status
private int status -
EMOJI_RK
static final com.ibm.icu.text.UnicodeSet EMOJI_RK -
EMOJI
static final com.ibm.icu.text.UnicodeSet EMOJI
-
-
Constructor Details
-
BreakIteratorWrapper
BreakIteratorWrapper(com.ibm.icu.text.RuleBasedBreakIterator rbbi)
-
-
Method Details
-
current
int current() -
getRuleStatus
int getRuleStatus() -
next
int next() -
calcStatus
private int calcStatus(int current, int next)
Returns the current rule status for the text between breaks (this determines the token type). -
isEmoji
private boolean isEmoji(int current, int next)
Returns true if the current text represents an emoji character or sequence. -
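One way such a check and the status override could fit together is sketched below. Everything here is an assumption made for illustration: the class name, the code point ranges standing in for `EMOJI` and `EMOJI_RK` (the real fields are ICU `UnicodeSet`s whose contents are not listed on this page), the rule that ambiguous characters such as digits count as emoji only when followed by U+FE0F or U+20E3, and the `HYPOTHETICAL_EMOJI_STATUS` value.

```java
// Hypothetical sketch, not the Lucene implementation.
final class EmojiStatusSketch {
    static final int VARIATION_SELECTOR_16 = 0xFE0F;      // emoji presentation selector
    static final int COMBINING_ENCLOSING_KEYCAP = 0x20E3; // keycap mark
    static final int HYPOTHETICAL_EMOJI_STATUS = -1;      // placeholder status value

    // Crude stand-in for an "ambiguous" set: characters that are ordinary
    // text unless something emoji-specific follows them.
    static boolean inAmbiguousSet(int cp) {
        return (cp >= '0' && cp <= '9') || cp == '#' || cp == '*';
    }

    // Crude stand-in for an emoji set: one pictographic range plus the
    // ambiguous characters. A real set would come from Unicode data.
    static boolean inEmojiSet(int cp) {
        return (cp >= 0x1F300 && cp <= 0x1FAFF) || inAmbiguousSet(cp);
    }

    // True if text[begin, end) starts with an emoji, requiring a trailing
    // selector or keycap for the ambiguous characters.
    static boolean isEmoji(char[] text, int begin, int end) {
        if (begin >= end) {
            return false;
        }
        int cp = Character.codePointAt(text, begin, end);
        if (!inEmojiSet(cp)) {
            return false;
        }
        if (!inAmbiguousSet(cp)) {
            return true;
        }
        int trailer = begin + Character.charCount(cp);
        if (trailer >= end) {
            return false;
        }
        int following = Character.codePointAt(text, trailer, end);
        return following == VARIATION_SELECTOR_16
            || following == COMBINING_ENCLOSING_KEYCAP;
    }

    // Overrides the underlying iterator's status only for emoji segments.
    static int calcStatus(char[] text, int begin, int end, int underlyingStatus) {
        return isEmoji(text, begin, end) ? HYPOTHETICAL_EMOJI_STATUS : underlyingStatus;
    }
}
```

Under these assumptions a bare digit keeps whatever status the underlying iterator reported, while a keycap sequence such as a digit followed by U+FE0F U+20E3 is reported with the emoji status.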
setText
void setText(char[] text, int start, int length)
-