java.lang.Object
org.apache.lucene.search.spell.WordBreakSpellChecker
A spell checker whose sole function is to offer suggestions by combining multiple terms into one
word and/or breaking terms into multiple words.
-
Nested Class Summary
Nested Classes
static enum WordBreakSpellChecker.BreakSuggestionSortMethod
    Determines the order to list word break suggestions
private static class
private static class
private static class
private static class
private static class
-
Field Summary
Fields
private int maxChanges
private int maxCombineWordLength
private int maxEvaluations
private int minBreakWordLength
private int minSuggestionFrequency
static final Term SEPARATOR_TERM
    Term that can be used to prohibit adjacent terms from being combined
-
Constructor Summary
Constructors
WordBreakSpellChecker()
    Creates a new spellchecker with default configuration values
-
Method Summary
Modifier and Type / Method / Description
private int generateBreakUpSuggestions(Term term, IndexReader ir, int numberBreaks, int maxSuggestions, int useMinSuggestionFrequency, SuggestWord[] prefix, Queue<WordBreakSpellChecker.SuggestWordArrayWrapper> suggestions, int totalEvaluations, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod)
private SuggestWord generateSuggestWord(IndexReader ir, String fieldname, String text)
int getMaxChanges()
    Returns the maximum number of changes to perform on the input.
int getMaxCombineWordLength()
    Returns the maximum length of a combined suggestion.
int getMaxEvaluations()
    Returns the maximum number of word combinations to evaluate.
int getMinBreakWordLength()
    Returns the minimum size of a broken word.
int getMinSuggestionFrequency()
    Returns the minimum frequency a term must have to be part of a suggestion.
private SuggestWord[] newPrefix(SuggestWord[] oldPrefix, SuggestWord append)
private SuggestWord[] newSuggestion(SuggestWord[] prefix, SuggestWord append1, SuggestWord append2)
void setMaxChanges(int maxChanges)
    The maximum number of changes (word breaks or combinations) to make on the original term(s).
void setMaxCombineWordLength(int maxCombineWordLength)
    The maximum length of a suggestion made by combining 1 or more original terms.
void setMaxEvaluations(int maxEvaluations)
    The maximum number of word combinations to evaluate.
void setMinBreakWordLength(int minBreakWordLength)
    The minimum length to break words down to.
void setMinSuggestionFrequency(int minSuggestionFrequency)
    The minimum frequency a term must have to be included as part of a suggestion.
SuggestWord[][] suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod)
    Generate suggestions by breaking the passed-in term into multiple words.
CombineSuggestion[] suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode)
    Generate suggestions by combining one or more of the passed-in terms into single words.
-
Field Details
-
minSuggestionFrequency
private int minSuggestionFrequency
-
minBreakWordLength
private int minBreakWordLength
-
maxCombineWordLength
private int maxCombineWordLength
-
maxChanges
private int maxChanges
-
maxEvaluations
private int maxEvaluations
-
SEPARATOR_TERM
public static final Term SEPARATOR_TERM
Term that can be used to prohibit adjacent terms from being combined
-
-
Constructor Details
-
WordBreakSpellChecker
public WordBreakSpellChecker()
Creates a new spellchecker with default configuration values.
-
-
Method Details
-
suggestWordBreaks
public SuggestWord[][] suggestWordBreaks(Term term, int maxSuggestions, IndexReader ir, SuggestMode suggestMode, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) throws IOException
Generate suggestions by breaking the passed-in term into multiple words. The scores returned are equal to the number of word breaks needed, so a lower score is generally preferred over a higher score.
- Parameters:
suggestMode - default = SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX
sortMethod - default = WordBreakSpellChecker.BreakSuggestionSortMethod.NUM_CHANGES_THEN_MAX_FREQUENCY
- Returns:
- one or more arrays of words formed by breaking up the original term
- Throws:
IOException - If there is a low-level I/O error.
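The scoring rule above (score equals the number of breaks, lower is better) can be illustrated with a small standalone model. This is a simplified sketch, not Lucene's implementation: the class `WordBreakSketch`, its `Suggestion` type, and the `docFreq` map (standing in for document frequencies read from an `IndexReader`) are all hypothetical, and suggest modes and the evaluation limit are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified model of word-break suggestions; NOT Lucene's implementation. */
final class WordBreakSketch {

    /** One suggestion: the pieces of the broken term, scored by number of breaks. */
    static final class Suggestion {
        final List<String> words;
        final int score; // number of breaks = pieces - 1

        Suggestion(List<String> words) {
            this.words = new ArrayList<>(words);
            this.score = words.size() - 1;
        }
    }

    /**
     * Hypothetical helper. docFreq is an assumed stand-in for index frequencies;
     * the three int parameters mirror maxChanges, minBreakWordLength and
     * minSuggestionFrequency from the class configuration.
     */
    static List<Suggestion> suggestBreaks(String term, Map<String, Integer> docFreq,
            int maxChanges, int minBreakWordLength, int minSuggestionFrequency) {
        List<Suggestion> out = new ArrayList<>();
        collect(term, docFreq, maxChanges, Math.max(1, minBreakWordLength),
                minSuggestionFrequency, new ArrayList<>(), out);
        // Fewer breaks first, matching "a lower score is generally preferred".
        out.sort((a, b) -> Integer.compare(a.score, b.score));
        return out;
    }

    private static void collect(String rest, Map<String, Integer> docFreq, int breaksLeft,
            int minLen, int minFreq, List<String> prefix, List<Suggestion> out) {
        if (breaksLeft == 0) {
            return; // no more breaks allowed
        }
        for (int i = minLen; i <= rest.length() - minLen; i++) {
            String left = rest.substring(0, i);
            String right = rest.substring(i);
            if (docFreq.getOrDefault(left, 0) < minFreq) {
                continue; // the left piece must be frequent enough
            }
            prefix.add(left);
            if (docFreq.getOrDefault(right, 0) >= minFreq) {
                prefix.add(right);
                out.add(new Suggestion(prefix)); // every piece is a known word
                prefix.remove(prefix.size() - 1);
            }
            // Try breaking the remainder further with one fewer break available.
            collect(right, docFreq, breaksLeft - 1, minLen, minFreq, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
    }
}
```

In this model, raising `maxChanges` admits suggestions with more pieces, while raising `minBreakWordLength` rejects splits that would produce very short fragments.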
-
suggestWordCombinations
public CombineSuggestion[] suggestWordCombinations(Term[] terms, int maxSuggestions, IndexReader ir, SuggestMode suggestMode) throws IOException
Generate suggestions by combining one or more of the passed-in terms into single words. The returned CombineSuggestion contains both a SuggestWord and an array detailing which passed-in terms were involved in creating this combination. The scores returned are equal to the number of word combinations needed, which is also one less than the length of the array CombineSuggestion.originalTermIndexes. Generally, a suggestion with a lower score is preferred over a higher score.
To prevent two adjacent terms from being combined (for instance, if one is mandatory and the other is prohibited), separate the two terms with SEPARATOR_TERM.
When suggestMode equals SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, each suggestion will include at least one term not in the index.
When suggestMode equals SuggestMode.SUGGEST_MORE_POPULAR, each suggestion will have the same or better frequency than the most-popular included term.
- Returns:
- an array of words generated by combining original terms
- Throws:
IOException - If there is a low-level I/O error.
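The relationship between the score, the original term indexes, and the separator can be shown with a small standalone model. This is a sketch under stated assumptions, not Lucene's implementation: `WordCombineSketch`, `Combination`, the `SEPARATOR` string, and the `docFreq` map (standing in for index frequencies) are hypothetical, and the two suggest modes are not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified model of word-combination suggestions; NOT Lucene's implementation. */
final class WordCombineSketch {

    /** Hypothetical stand-in for SEPARATOR_TERM in this model. */
    static final String SEPARATOR = "\u0000";

    /** A combined word plus the positions of the input terms that formed it. */
    static final class Combination {
        final String word;
        final int[] originalTermIndexes;
        final int score; // number of joins = originalTermIndexes.length - 1

        Combination(String word, int[] originalTermIndexes) {
            this.word = word;
            this.originalTermIndexes = originalTermIndexes;
            this.score = originalTermIndexes.length - 1;
        }
    }

    /**
     * Hypothetical helper. docFreq is an assumed stand-in for index frequencies;
     * the int parameters mirror maxChanges, maxCombineWordLength and
     * minSuggestionFrequency from the class configuration.
     */
    static List<Combination> suggestCombinations(String[] terms, Map<String, Integer> docFreq,
            int maxChanges, int maxCombineWordLength, int minSuggestionFrequency) {
        List<Combination> out = new ArrayList<>();
        for (int start = 0; start < terms.length; start++) {
            if (terms[start].equals(SEPARATOR)) {
                continue; // a separator never starts a combination
            }
            StringBuilder combined = new StringBuilder(terms[start]);
            for (int end = start + 1; end < terms.length && end - start <= maxChanges; end++) {
                if (terms[end].equals(SEPARATOR)) {
                    break; // never combine across a separator
                }
                combined.append(terms[end]);
                if (combined.length() > maxCombineWordLength) {
                    break; // longer runs from this start would only be longer still
                }
                String word = combined.toString();
                if (docFreq.getOrDefault(word, 0) >= minSuggestionFrequency) {
                    int[] indexes = new int[end - start + 1];
                    for (int k = 0; k < indexes.length; k++) {
                        indexes[k] = start + k;
                    }
                    out.add(new Combination(word, indexes));
                }
            }
        }
        // Fewer joins first, matching "a lower score is preferred".
        out.sort((a, b) -> Integer.compare(a.score, b.score));
        return out;
    }
}
```

Placing `SEPARATOR` between two terms in the input array removes every candidate that would span that position, which is the role the documentation describes for SEPARATOR_TERM.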
-
generateBreakUpSuggestions
private int generateBreakUpSuggestions(Term term, IndexReader ir, int numberBreaks, int maxSuggestions, int useMinSuggestionFrequency, SuggestWord[] prefix, Queue<WordBreakSpellChecker.SuggestWordArrayWrapper> suggestions, int totalEvaluations, WordBreakSpellChecker.BreakSuggestionSortMethod sortMethod) throws IOException
- Throws:
IOException
-
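newPrefix
private SuggestWord[] newPrefix(SuggestWord[] oldPrefix, SuggestWord append)
-
newSuggestion
private SuggestWord[] newSuggestion(SuggestWord[] prefix, SuggestWord append1, SuggestWord append2)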
-
generateSuggestWord
private SuggestWord generateSuggestWord(IndexReader ir, String fieldname, String text) throws IOException
- Throws:
IOException
-
getMinSuggestionFrequency
public int getMinSuggestionFrequency()
Returns the minimum frequency a term must have to be part of a suggestion.
-
getMaxCombineWordLength
public int getMaxCombineWordLength()
Returns the maximum length of a combined suggestion.
-
getMinBreakWordLength
public int getMinBreakWordLength()
Returns the minimum size of a broken word.
-
getMaxChanges
public int getMaxChanges()
Returns the maximum number of changes to perform on the input.
-
getMaxEvaluations
public int getMaxEvaluations()
Returns the maximum number of word combinations to evaluate.
-
setMinSuggestionFrequency
public void setMinSuggestionFrequency(int minSuggestionFrequency)
The minimum frequency a term must have to be included as part of a suggestion. Default=1. Not applicable when used with SuggestMode.SUGGEST_MORE_POPULAR.
-
setMaxCombineWordLength
public void setMaxCombineWordLength(int maxCombineWordLength)
The maximum length of a suggestion made by combining 1 or more original terms. Default=20.
-
setMinBreakWordLength
public void setMinBreakWordLength(int minBreakWordLength)
The minimum length to break words down to. Default=1.
-
setMaxChanges
public void setMaxChanges(int maxChanges)
The maximum number of changes (word breaks or combinations) to make on the original term(s). Default=1.
-
setMaxEvaluations
public void setMaxEvaluations(int maxEvaluations)
The maximum number of word combinations to evaluate. Default=1000. A higher value might improve result quality. A lower value might improve performance.
-