Class ShingleAnalyzerWrapper
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.AnalyzerWrapper
org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
- All Implemented Interfaces:
Closeable, AutoCloseable
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
A shingle is another name for a token-based n-gram.
- Since:
- 3.1
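To make the term "shingle" concrete, the following is a minimal plain-Java sketch of token n-grams. It is only an illustrative model and not Lucene's ShingleFilter implementation: the class ShingleSketch and its shingles method are hypothetical names, tokens are joined with a single space, and the ordering of the output is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of shingling; hypothetical helper, not part of Lucene. */
final class ShingleSketch {

    /**
     * Returns every run of minSize..maxSize adjacent tokens, joined with a space.
     * Results are grouped by start position, shortest shingle first.
     */
    static List<String> shingles(List<String> tokens, int minSize, int maxSize) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            // Grow the window until it hits maxSize or runs past the last token.
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(" ", tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [please divide, please divide this, divide this, divide this sentence, this sentence]
        System.out.println(shingles(List.of("please", "divide", "this", "sentence"), 2, 3));
    }
}
```

With a minimum size of 2 and a maximum size of 3, each token starts at most one 2-token and one 3-token shingle, which is why the last token contributes none.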
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
Field Summary
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
Constructor Summary
Constructors
Constructor: Description
ShingleAnalyzerWrapper(): Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(int minShingleSize, int maxShingleSize): Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, String fillerToken): Creates a new ShingleAnalyzerWrapper
-
Method Summary
Modifier and Type, Method: Description
String getFillerToken()
int getMaxShingleSize(): The max shingle (token ngram) size
int getMinShingleSize(): The min shingle (token ngram) size
String getTokenSeparator()
final Analyzer getWrappedAnalyzer(String fieldName)
boolean isOutputUnigrams()
boolean isOutputUnigramsIfNoShingles()
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
Methods inherited from class org.apache.lucene.analysis.AnalyzerWrapper
attributeFactory, createComponents, getOffsetGap, getPositionIncrementGap, initReader, initReaderForNormalization, normalize, wrapReader, wrapReaderForNormalization, wrapTokenStreamForNormalization
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getReuseStrategy, normalize, tokenStream, tokenStream
-
Constructor Details
-
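ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)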
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, String fillerToken)
Creates a new ShingleAnalyzerWrapper.
- Parameters:
delegate - Analyzer whose TokenStream is to be filtered
minShingleSize - Min shingle (token ngram) size
maxShingleSize - Max shingle (token ngram) size
tokenSeparator - Used to separate input stream tokens in output shingles
outputUnigrams - Whether the filter passes the original tokens to the output stream
outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, then unigrams are always output, regardless of whether any shingles are available.
fillerToken - Filler token to use when positionIncrement is more than 1
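The interaction between tokenSeparator, outputUnigrams and outputUnigramsIfNoShingles can be sketched in plain Java as follows. This is a simplified model under stated assumptions, not the Lucene implementation: ShingleOptionsSketch and its shingle method are hypothetical names, minShingleSize is assumed to be at least 2, fillerToken and position increments are not modelled, and emitting each unigram ahead of the shingles that start with it is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the shingle options; hypothetical helper, not part of Lucene. */
final class ShingleOptionsSketch {

    static List<String> shingle(List<String> tokens, int minSize, int maxSize,
                                String tokenSeparator, boolean outputUnigrams,
                                boolean outputUnigramsIfNoShingles) {
        // No shingle can be formed when the input has fewer than minSize tokens.
        boolean shinglesAvailable = tokens.size() >= minSize;
        // outputUnigrams==true always wins; the fallback flag only matters otherwise.
        boolean emitUnigrams = outputUnigrams
                || (outputUnigramsIfNoShingles && !shinglesAvailable);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(tokenSeparator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints: [a, a_b, b, b_c, c]
        System.out.println(shingle(List.of("a", "b", "c"), 2, 2, "_", true, false));
        // Prints: [a]
        System.out.println(shingle(List.of("a"), 2, 2, "_", false, true));
    }
}
```

In the second call the single token is too short to form a shingle, so the fallback flag lets the unigram through even though outputUnigrams is false.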
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper()
Wraps StandardAnalyzer.
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(int minShingleSize, int maxShingleSize)
Wraps StandardAnalyzer.
-
-
Method Details
-
getMaxShingleSize
public int getMaxShingleSize()
The max shingle (token ngram) size
- Returns:
- The max shingle (token ngram) size
-
getMinShingleSize
public int getMinShingleSize()
The min shingle (token ngram) size
- Returns:
- The min shingle (token ngram) size
-
getTokenSeparator
public String getTokenSeparator()
-
isOutputUnigrams
public boolean isOutputUnigrams()
-
isOutputUnigramsIfNoShingles
public boolean isOutputUnigramsIfNoShingles()
-
getFillerToken
public String getFillerToken()
-
getWrappedAnalyzer
public final Analyzer getWrappedAnalyzer(String fieldName)
- Specified by:
getWrappedAnalyzer in class AnalyzerWrapper
-
wrapComponents
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
- Overrides:
wrapComponents in class AnalyzerWrapper
-