Package org.apache.lucene.analysis.ngram
Class EdgeNGramTokenFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
Tokenizes the given token into n-grams of given size(s).
This TokenFilter creates n-grams from the beginning edge of an input token.
As of Lucene 4.4, this filter correctly handles supplementary characters.
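The note on supplementary characters can be illustrated with plain Java. The sketch below is not this filter's code; it assumes that handling supplementary characters correctly means taking a prefix measured in Unicode code points rather than in UTF-16 chars, so that a surrogate pair is never split. The class name CodePointPrefix and the method prefix are hypothetical.

```java
final class CodePointPrefix {

    /**
     * Returns the first n code points of s, or all of s if it has fewer.
     * Hypothetical helper for illustration; n must be >= 0.
     */
    static String prefix(String s, int n) {
        int count = s.codePointCount(0, s.length());
        // Convert a length in code points into an index in UTF-16 chars.
        int end = s.offsetByCodePoints(0, Math.min(n, count));
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        // U+1D538 is one code point stored as two chars, followed by "bc".
        String term = "\uD835\uDD38bc";
        System.out.println(term.length());            // prints 4
        System.out.println(prefix(term, 1).length()); // prints 2
    }
}
```

A naive term.substring(0, 1) would return only the high surrogate here, which is not a valid character on its own.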
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
-
Field Summary
Fields
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
-
Constructor Summary
Constructors
Constructor: Description
EdgeNGramTokenFilter(TokenStream input, int gramSize): Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
EdgeNGramTokenFilter(TokenStream input, int minGram, int maxGram, boolean preserveOriginal): Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram.
-
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
-
Field Details
-
DEFAULT_PRESERVE_ORIGINAL
public static final boolean DEFAULT_PRESERVE_ORIGINAL
-
Constructor Details
-
EdgeNGramTokenFilter
Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram. Optionally preserves the original term when its length is outside of the defined range.
- Parameters:
input - TokenStream holding the input to be tokenized
minGram - the minimum length of the generated n-grams
maxGram - the maximum length of the generated n-grams
preserveOriginal - whether or not to keep the original term when it is outside the min/max size range
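The interplay of minGram, maxGram and preserveOriginal described above can be sketched in plain Java without any Lucene dependency. This is an illustration of the documented behaviour for a single term, not the filter's implementation: the class EdgeNGramSketch and the method edgeNGrams are hypothetical, lengths are assumed to be counted in code points, and both the argument validation and the position of the preserved original after the n-grams are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class EdgeNGramSketch {

    /** Hypothetical helper: edge n-grams of one term, shortest first. */
    static List<String> edgeNGrams(String term, int minGram, int maxGram,
                                   boolean preserveOriginal) {
        // Assumption: reject ranges that cannot produce any n-gram.
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        int length = term.codePointCount(0, term.length());
        List<String> grams = new ArrayList<>();
        // Emit every prefix whose length lies in [minGram, maxGram].
        for (int size = minGram; size <= Math.min(maxGram, length); size++) {
            grams.add(term.substring(0, term.offsetByCodePoints(0, size)));
        }
        // Keep the original only when its length is outside the range.
        if (preserveOriginal && (length < minGram || length > maxGram)) {
            grams.add(term);
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 2, 4, false)); // [lu, luc, luce]
        System.out.println(edgeNGrams("lucene", 2, 4, true));  // [lu, luc, luce, lucene]
        System.out.println(edgeNGrams("a", 2, 4, true));       // [a]
    }
}
```

With preserveOriginal set to false, a term shorter than minGram yields no output at all in this sketch, which is why the flag matters for very short terms.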
-
EdgeNGramTokenFilter
Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
- Parameters:
input - TokenStream holding the input to be tokenized
gramSize - the n-gram size to generate
-
-
Method Details
-
incrementToken
- Specified by:
incrementToken in class TokenStream
- Throws:
IOException
-
reset
- Overrides:
reset in class TokenFilter
- Throws:
IOException
-
end
- Overrides:
end in class TokenFilter
- Throws:
IOException
-