Package org.apache.lucene.analysis.core
Basic, general-purpose analysis components.
Classes

Class                          Description
DecimalDigitFilter             Folds all Unicode digits in [:General_Category=Decimal_Number:] to Basic Latin digits (0-9).
DecimalDigitFilterFactory      Factory for DecimalDigitFilter.
FlattenGraphFilter             Converts an incoming graph token stream, such as one from SynonymGraphFilter, into a flat form so that all nodes form a single linear chain with no side paths.
FlattenGraphFilterFactory      Factory for FlattenGraphFilter.
KeywordAnalyzer                "Tokenizes" the entire stream as a single token.
KeywordTokenizer               Emits the entire input as a single token.
KeywordTokenizerFactory        Factory for KeywordTokenizer.
LetterTokenizer                A LetterTokenizer is a tokenizer that divides text at non-letters.
LetterTokenizerFactory         Factory for LetterTokenizer.
LowerCaseFilter                Normalizes token text to lower case.
LowerCaseFilterFactory         Factory for LowerCaseFilter.
StopFilter                     Removes stop words from a token stream.
StopFilterFactory              Factory for StopFilter.
TypeTokenFilter                Removes tokens whose types appear in a set of blocked types from a token stream.
TypeTokenFilterFactory         Factory class for TypeTokenFilter.
UnicodeWhitespaceAnalyzer      An Analyzer that uses UnicodeWhitespaceTokenizer.
UnicodeWhitespaceTokenizer     A UnicodeWhitespaceTokenizer is a tokenizer that divides text at whitespace.
UpperCaseFilter                Normalizes token text to UPPER CASE.
UpperCaseFilterFactory         Factory for UpperCaseFilter.
WhitespaceAnalyzer             An Analyzer that uses WhitespaceTokenizer.
WhitespaceTokenizer            A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
WhitespaceTokenizerFactory     Factory for WhitespaceTokenizer.
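To make a few of these descriptions concrete, the following is a minimal plain-JDK sketch of what three of the components are described as doing: splitting at whitespace as defined by Character.isWhitespace(int), lower-casing token text, and folding Decimal_Number digits to 0-9. It deliberately does not use the Lucene API. The class name CoreAnalysisSketch and its methods are hypothetical, invented for illustration, and the real classes operate on token streams rather than on strings and lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not the Lucene implementation.
final class CoreAnalysisSketch {

    // Splits text at code points for which Character.isWhitespace(int) is true,
    // mirroring the behavior described for WhitespaceTokenizer.
    static List<String> splitOnWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (Character.isWhitespace(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        });
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Replaces every code point in General_Category=Decimal_Number with the
    // matching Basic Latin digit, as described for DecimalDigitFilter.
    static String foldDecimalDigits(String token) {
        StringBuilder out = new StringBuilder(token.length());
        token.codePoints().forEach(cp -> {
            int value = Character.isDigit(cp) ? Character.digit(cp, 10) : -1;
            if (value >= 0) {
                out.append((char) ('0' + value));
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    // Chains the steps: tokenize, lower-case, then fold digits.
    static List<String> analyze(String text) {
        List<String> result = new ArrayList<>();
        for (String token : splitOnWhitespace(text)) {
            result.add(foldDecimalDigits(token.toLowerCase(Locale.ROOT)));
        }
        return result;
    }

    public static void main(String[] args) {
        // The last word is written with Arabic-Indic digits one, two, three.
        System.out.println(analyze("Hello  World \u0661\u0662\u0663"));
        // prints [hello, world, 123]
    }
}
```

In Lucene itself the same chain would normally be assembled from a tokenizer and filters inside an Analyzer; the sketch only shows the text transformations each description implies.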