Package org.apache.lucene.analysis.de
Class GermanNormalizationFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.de.GermanNormalizationFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
Normalizes German characters according to the heuristics of the German2 snowball algorithm. It
allows for the fact that ä, ö and ü are sometimes written as ae, oe and ue.
- 'ß' is replaced by 'ss'.
- 'ä', 'ö', 'ü' are replaced by 'a', 'o', 'u', respectively.
- 'ae' and 'oe' are replaced by 'a' and 'o', respectively.
- 'ue' is replaced by 'u' when not following a vowel or q.
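The rules above can be illustrated with a small standalone sketch. It is not the Lucene implementation and does not depend on Lucene; the class name `GermanNormalizationSketch`, the method `normalize`, and the three state names are hypothetical. It assumes lowercase input and, as a further assumption, treats 'i' and 'y' as vowels for the 'ue' rule.

```java
// Standalone sketch of the documented normalization rules.
// Assumption: input is already lowercased; this is not the Lucene source.
class GermanNormalizationSketch {
    // Scanner states (hypothetical names):
    private static final int PLAIN = 0;   // previous char was neither a vowel nor 'q'
    private static final int BLOCKED = 1; // previous char was a vowel or 'q'; a following 'e' is kept
    private static final int UMLAUT = 2;  // previous char was a/o/u that absorbs a following 'e'

    static String normalize(String term) {
        StringBuilder out = new StringBuilder(term.length() + 2);
        int state = PLAIN;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            switch (c) {
                case 'a':
                case 'o':
                    out.append(c);
                    state = UMLAUT; // 'ae' -> 'a', 'oe' -> 'o'
                    break;
                case 'u':
                    out.append(c);
                    // 'ue' -> 'u' only when the 'u' does not follow a vowel or 'q'
                    state = (state == PLAIN) ? UMLAUT : BLOCKED;
                    break;
                case 'e':
                    if (state != UMLAUT) {
                        out.append(c); // keep 'e' unless it completes ae/oe/ue
                    }
                    state = BLOCKED;
                    break;
                case 'i':
                case 'q':
                case 'y': // assumption: 'y' counts as a vowel here
                    out.append(c);
                    state = BLOCKED;
                    break;
                case '\u00e4': // a-umlaut
                    out.append('a');
                    state = BLOCKED;
                    break;
                case '\u00f6': // o-umlaut
                    out.append('o');
                    state = BLOCKED;
                    break;
                case '\u00fc': // u-umlaut
                    out.append('u');
                    state = BLOCKED;
                    break;
                case '\u00df': // sharp s
                    out.append("ss");
                    state = PLAIN;
                    break;
                default:
                    out.append(c);
                    state = PLAIN;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] words = {"stra\u00dfe", "m\u00fcller", "mueller", "goethe", "quelle", "bauer"};
        for (String w : words) {
            System.out.println(w + " -> " + normalize(w));
        }
    }
}
```

Under these assumptions, "straße" becomes "strasse", both "müller" and "mueller" become "muller", and "quelle" and "bauer" are left unchanged because the 'u' follows 'q' or a vowel.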
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
Field Summary
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor Summary
Constructors
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, reset, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
Constructor Details
GermanNormalizationFilter
Method Details
incrementToken
- Specified by:
incrementToken in class TokenStream
- Throws:
IOException