Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenizerFactory
org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
Factory for WhitespaceTokenizer.
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
</analyzer>
</fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). This rarely needs changing; when omitted, CharTokenizer::DEFAULT_MAX_TOKEN_LEN is used.
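The practical difference between the two rule values can be illustrated without Lucene on the classpath. The sketch below is not the library's implementation: it assumes the "java" rule follows Character.isWhitespace and approximates the "unicode" rule with the regex property \p{IsWhite_Space}. The class WhitespaceRuleSketch and its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.regex.Pattern;

/** Hypothetical, Lucene-free sketch contrasting the "java" and "unicode" rules. */
final class WhitespaceRuleSketch {
    // Assumption: the Unicode White_Space property approximates the "unicode" rule.
    private static final Pattern UNICODE_WS = Pattern.compile("\\p{IsWhite_Space}");

    /** "java" rule: whitespace as defined by Character.isWhitespace. */
    static boolean isJavaWhitespace(int codePoint) {
        return Character.isWhitespace(codePoint);
    }

    /** "unicode" rule (approximation): code points with the White_Space property. */
    static boolean isUnicodeWhitespace(int codePoint) {
        return UNICODE_WS.matcher(new String(Character.toChars(codePoint))).matches();
    }

    /** Splits text into runs of non-whitespace code points under the given rule. */
    static List<String> split(String text, IntPredicate isWhitespace) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isWhitespace.test(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // U+00A0 (no-break space) separates "foo" and "bar".
        String text = "foo\u00A0bar baz";
        System.out.println(split(text, WhitespaceRuleSketch::isJavaWhitespace));
        System.out.println(split(text, WhitespaceRuleSketch::isUnicodeWhitespace));
    }
}
```

Character.isWhitespace deliberately excludes non-breaking spaces such as U+00A0, so the two predicates disagree on that character. This is the kind of input where choosing rule="unicode" is likely to change the token boundaries.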
Since: 3.1
Field Summary
Fields (Modifier and Type, Field, Description):
- private final int maxTokenLen
- static final String NAME: SPI name
- private final String rule
- static final String RULE_JAVA
- private static final Collection<String> RULE_NAMES
- static final String RULE_UNICODE
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory:
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors (Constructor, Description):
- WhitespaceTokenizerFactory(): Default ctor for compatibility with SPI
- WhitespaceTokenizerFactory(Map<String,String> args): Creates a new WhitespaceTokenizerFactory
Method Summary
Methods (Modifier and Type, Method, Description):
- Tokenizer create(AttributeFactory factory): Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory:
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory:
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
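NAME
static final String NAME
SPI name
RULE_JAVA
static final String RULE_JAVA
RULE_UNICODE
static final String RULE_UNICODE
RULE_NAMES
private static final Collection<String> RULE_NAMES
rule
private final String rule
maxTokenLen
private final int maxTokenLen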
Constructor Details
WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory(Map<String,String> args)
Creates a new WhitespaceTokenizerFactory
WhitespaceTokenizerFactory
public WhitespaceTokenizerFactory()
Default ctor for compatibility with SPI
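To make the documented constraints on rule and maxTokenLen concrete, the following is a minimal, Lucene-free sketch of how a constructor taking the tokenizer attributes might validate them. It is not the factory's actual code. The class name WhitespaceOptionsSketch, the default of "java" for rule, the default length of 255 and the decision to accept the limit itself are all assumptions made for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, Lucene-free sketch of validating the two documented options. */
final class WhitespaceOptionsSketch {
    static final String RULE_JAVA = "java";
    static final String RULE_UNICODE = "unicode";
    static final Collection<String> RULE_NAMES = Arrays.asList(RULE_JAVA, RULE_UNICODE);

    // Upper bound taken from the Options text (1024*1024).
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024;
    // Assumption: stand-in for CharTokenizer.DEFAULT_MAX_TOKEN_LEN.
    static final int ASSUMED_DEFAULT_MAX_TOKEN_LEN = 255;

    final String rule;
    final int maxTokenLen;

    WhitespaceOptionsSketch(Map<String, String> args) {
        // Work on a copy so that leftover (unknown) keys can be reported.
        Map<String, String> remaining = new HashMap<>(args);

        String ruleArg = remaining.remove("rule");
        this.rule = (ruleArg == null) ? RULE_JAVA : ruleArg; // assumed default
        if (!RULE_NAMES.contains(this.rule)) {
            throw new IllegalArgumentException("rule must be one of " + RULE_NAMES);
        }

        String lenArg = remaining.remove("maxTokenLen");
        this.maxTokenLen = (lenArg == null)
                ? ASSUMED_DEFAULT_MAX_TOKEN_LEN
                : Integer.parseInt(lenArg);
        // Assumption: the limit itself is accepted; only larger values are rejected.
        if (this.maxTokenLen <= 0 || this.maxTokenLen > MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                    "maxTokenLen must be greater than 0 and within " + MAX_TOKEN_LENGTH_LIMIT);
        }

        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }

    public static void main(String[] args) {
        // Same attributes as the text_ws field type example.
        Map<String, String> attrs = new HashMap<>();
        attrs.put("rule", "unicode");
        attrs.put("maxTokenLen", "256");
        WhitespaceOptionsSketch options = new WhitespaceOptionsSketch(attrs);
        System.out.println(options.rule + " " + options.maxTokenLen);
    }
}
```

Rejecting bad values in the constructor means a misconfigured field type fails when the factory is built, not later when a tokenizer is created.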
Method Details
create
public Tokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
Specified by: create in class TokenizerFactory