| Package | Description |
|---|---|
| org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |

| Class | Description |
|---|---|
| ClassicTokenizer | A grammar-based tokenizer constructed with JFlex. |
| ClassicTokenizerImpl | This class implements the classic Lucene StandardTokenizer up until 3.0. |
| StandardTokenizer | A grammar-based tokenizer constructed with JFlex. |
| StandardTokenizerImpl | This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
| UAX29URLEmailTokenizer | This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. |
| UAX29URLEmailTokenizerImpl | This class implements Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. URLs and email addresses are also tokenized according to the relevant RFCs. |
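
To give a rough feel for what word-break tokenization looks like, the sketch below uses the JDK's `java.text.BreakIterator` rather than any Lucene class. This is a swapped-in technique: `BreakIterator` performs word-boundary analysis in a similar spirit, but it is not the JFlex grammar behind StandardTokenizer and may split some inputs differently (URLs and email addresses in particular). The class name `WordBreakSketch` and the method `tokenize` are hypothetical names chosen for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
class WordBreakSketch {

    /**
     * Splits text at the word boundaries reported by the JDK's BreakIterator
     * and keeps only the segments that contain at least one letter or digit,
     * so whitespace and punctuation segments are dropped.
     */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);

        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            // Assumption: a "token" is any segment with a letter or digit in it.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                tokens.add(segment);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world"));
    }
}
```

Filtering on letters and digits is a simplification made for this sketch; the classes listed above also assign a type to each token, which this example does not attempt to reproduce.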