public final class CodepointCountFilter extends FilteringTokenFilter
Removes tokens that are too short or too long from the stream.
Note: Length is calculated as the number of Unicode codepoints.
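The difference between counting `char` values and counting code points matters for characters outside the Basic Multilingual Plane. The following is a minimal JDK-only sketch of that difference; the class name `CodePointLengthDemo` and the helper `codePointLength` are hypothetical and are not part of Lucene.

```java
final class CodePointLengthDemo {
    // Hypothetical helper: number of Unicode code points in the string.
    // A surrogate pair counts once here, but twice in String.length().
    static int codePointLength(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String ascii = "lucene";
        // U+1D11E (MUSICAL SYMBOL G CLEF) is encoded as a surrogate pair in UTF-16.
        String clef = "\uD834\uDD1E";
        System.out.println(ascii.length() + " chars, " + codePointLength(ascii) + " code points");
        System.out.println(clef.length() + " chars, " + codePointLength(clef) + " code points");
    }
}
```

A filter that measured length in `char` units would treat the clef as a two-character token, whereas a code point count treats it as one.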
Nested classes inherited from AttributeSource: AttributeSource.State

| Modifier and Type | Field and Description |
|---|---|
| private int | max |
| private int | min |
| private CharTermAttribute | termAtt |
Fields inherited from TokenFilter: input
Fields inherited from TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| CodepointCountFilter(TokenStream in, int min, int max) Create a new CodepointCountFilter. |
| Modifier and Type | Method and Description |
|---|---|
| boolean | accept() Override this method and return true if the current input token should be returned by FilteringTokenFilter.incrementToken(). |
Methods inherited from FilteringTokenFilter: end, incrementToken, reset
Methods inherited from TokenFilter: close
Methods inherited from AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

private final int min
private final int max
private final CharTermAttribute termAtt
public CodepointCountFilter(TokenStream in, int min, int max)
Create a new CodepointCountFilter. This will filter out tokens whose CharTermAttribute is either too short (Character.codePointCount(char[], int, int) < min) or too long (Character.codePointCount(char[], int, int) > max).

Parameters:
in - the TokenStream to consume
min - the minimum length
max - the maximum length

public boolean accept()
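Description copied from class: FilteringTokenFilter
Override this method and return true if the current input token should be returned by FilteringTokenFilter.incrementToken().
Specified by: accept in class FilteringTokenFilter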
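The rule stated in the constructor description (drop a token when its code point count is below `min` or above `max`) can be sketched without any Lucene dependency. This is an illustration of the documented behavior, not the Lucene source: the class `CodePointRangeCheck` and its static `accept` helper are hypothetical, and the `buffer`/`length` parameters stand in for the term's character buffer and its valid length.

```java
final class CodePointRangeCheck {
    // Hypothetical stand-in for the filter's decision: keep a term only if
    // its code point count lies in the inclusive range [min, max].
    static boolean accept(char[] buffer, int length, int min, int max) {
        // Count code points in the first `length` chars of the buffer.
        int count = Character.codePointCount(buffer, 0, length);
        return count >= min && count <= max;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "ab", "abc", "abcd", "\uD834\uDD1Ea"};
        for (String term : terms) {
            char[] buf = term.toCharArray();
            // With min = 2 and max = 3, terms of one or four code points are rejected.
            System.out.println(term.length() + " chars -> " + accept(buf, buf.length, 2, 3));
        }
    }
}
```

Both bounds are treated as inclusive here, which follows from the description: a token is removed only when its count is strictly less than `min` or strictly greater than `max`.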