public class UniformSplitTerms extends Terms implements Accountable
Terms based on the Uniform Split technique.
The index dictionary is lazily loaded, only when
TermsEnum.seekCeil(org.apache.lucene.util.BytesRef) or TermsEnum.seekExact(org.apache.lucene.util.BytesRef) is called;
it is not loaded for a direct terms enumeration.
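
The lazy-loading behavior described above can be pictured with a small, library-free sketch. LazyDictionary and ToyTerms are hypothetical names, not Lucene classes. The sketch assumes only what the description states: the dictionary is built on the first seek and never for a plain enumeration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical memoizing supplier: builds the value on the first get() only. */
class LazyDictionary<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    int loadCount = 0; // exposed for the demo only

    LazyDictionary(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public synchronized T get() {
        if (value == null) {
            value = loader.get();
            loadCount++;
        }
        return value;
    }
}

/** Toy stand-in for a terms view; not the Lucene implementation. */
class ToyTerms {
    private final List<String> sortedTerms;
    final LazyDictionary<TreeSet<String>> dictionary;

    ToyTerms(List<String> sortedTerms) {
        this.sortedTerms = sortedTerms;
        this.dictionary = new LazyDictionary<>(() -> new TreeSet<>(sortedTerms));
    }

    /** Direct enumeration: never touches the dictionary. */
    Iterator<String> iterator() {
        return sortedTerms.iterator();
    }

    /** Seek: the first call triggers the dictionary load. Returns the smallest term >= target, or null. */
    String seekCeil(String target) {
        return dictionary.get().ceiling(target);
    }
}

class LazyDictionaryDemo {
    public static void main(String[] args) {
        ToyTerms terms = new ToyTerms(Arrays.asList("apple", "banana", "cherry"));
        terms.iterator().forEachRemaining(t -> { });
        System.out.println("loads after enumeration: " + terms.dictionary.loadCount);
        System.out.println("seekCeil(\"b\") = " + terms.seekCeil("b"));
        System.out.println("loads after seek: " + terms.dictionary.loadCount);
    }
}
```

Keeping the load behind a supplier means callers that only enumerate terms never pay the dictionary's memory cost.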
See also: UniformSplitTermsWriter

| Modifier and Type | Field and Description |
|---|---|
| private static long | BASE_RAM_USAGE |
| protected BlockDecoder | blockDecoder |
| protected IndexInput | blockInput |
| protected DictionaryBrowserSupplier | dictionaryBrowserSupplier |
| protected FieldMetadata | fieldMetadata |
| protected PostingsReaderBase | postingsReader |

Fields inherited from class Terms: EMPTY_ARRAY

| Modifier | Constructor and Description |
|---|---|
| protected | UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, DictionaryBrowserSupplier dictionaryBrowserSupplier) |
| protected | UniformSplitTerms(IndexInput dictionaryInput, IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder) |

| Modifier and Type | Method and Description |
|---|---|
| protected void | checkIntersectAutomatonType(CompiledAutomaton automaton) |
| long | getDictionaryRamBytesUsed() |
| int | getDocCount(): Returns the number of documents that have at least one term for this field. |
| BytesRef | getMax(): Returns the largest term (in lexicographic order) in the field. |
| long | getSumDocFreq(): Returns the sum of TermsEnum.docFreq() for all terms in this field. |
| long | getSumTotalTermFreq(): Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. |
| boolean | hasFreqs(): Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()). |
| boolean | hasOffsets(): Returns true if documents in this field store offsets. |
| boolean | hasPayloads(): Returns true if documents in this field store payloads. |
| boolean | hasPositions(): Returns true if documents in this field store positions. |
| TermsEnum | intersect(CompiledAutomaton compiled, BytesRef startTerm): Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton. |
| TermsEnum | iterator(): Returns an iterator that will step through all terms. |
| long | ramBytesUsed(): Return the memory usage of this object in bytes. |
| long | ramBytesUsedWithoutDictionary() |
| long | size(): Returns the number of terms for this field, or -1 if this measure isn't stored by the codec. |
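
To make the per-field statistics in the table above concrete, here is a minimal sketch over a toy in-memory postings map. ToyFieldStats is a hypothetical class, not part of Lucene, and plain String ordering stands in for the byte-wise term order (the two agree for ASCII terms). It ignores deletions entirely.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy per-field postings: term -> (docId -> frequency of the term in that doc). */
class ToyFieldStats {
    private final TreeMap<String, Map<Integer, Integer>> postings = new TreeMap<>();

    void add(String term, int docId, int freq) {
        postings.computeIfAbsent(term, t -> new TreeMap<>()).merge(docId, freq, Integer::sum);
    }

    /** Number of distinct terms in the field. */
    long size() {
        return postings.size();
    }

    /** Sum over all terms of the number of documents containing the term. */
    long sumDocFreq() {
        long sum = 0;
        for (Map<Integer, Integer> docs : postings.values()) {
            sum += docs.size();
        }
        return sum;
    }

    /** Sum over all terms of the total number of occurrences of the term. */
    long sumTotalTermFreq() {
        long sum = 0;
        for (Map<Integer, Integer> docs : postings.values()) {
            for (int freq : docs.values()) {
                sum += freq;
            }
        }
        return sum;
    }

    /** Number of documents that have at least one term for the field. */
    int docCount() {
        Set<Integer> docs = new HashSet<>();
        for (Map<Integer, Integer> perTerm : postings.values()) {
            docs.addAll(perTerm.keySet());
        }
        return docs.size();
    }

    /** Largest term in sorted order, or null if the field has no terms. */
    String max() {
        return postings.isEmpty() ? null : postings.lastKey();
    }
}

class ToyFieldStatsDemo {
    public static void main(String[] args) {
        ToyFieldStats stats = new ToyFieldStats();
        stats.add("a", 0, 2); // doc 0: "a b a"
        stats.add("b", 0, 1);
        stats.add("b", 1, 1); // doc 1: "b c"
        stats.add("c", 1, 1);
        stats.add("c", 2, 3); // doc 2: "c c c"
        System.out.println("size=" + stats.size()
                + " sumDocFreq=" + stats.sumDocFreq()
                + " sumTotalTermFreq=" + stats.sumTotalTermFreq()
                + " docCount=" + stats.docCount()
                + " max=" + stats.max());
    }
}
```
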

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Accountable: getChildResources

Field Detail

private static final long BASE_RAM_USAGE

protected final IndexInput blockInput

protected final FieldMetadata fieldMetadata

protected final PostingsReaderBase postingsReader

protected final BlockDecoder blockDecoder

protected final DictionaryBrowserSupplier dictionaryBrowserSupplier

Constructor Detail

protected UniformSplitTerms(IndexInput dictionaryInput, IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder) throws java.io.IOException
Parameters:
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
Throws:
java.io.IOException

protected UniformSplitTerms(IndexInput blockInput, FieldMetadata fieldMetadata, PostingsReaderBase postingsReader, BlockDecoder blockDecoder, DictionaryBrowserSupplier dictionaryBrowserSupplier)
Parameters:
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.

Method Detail

public TermsEnum iterator() throws java.io.IOException
Description copied from class: Terms
Returns an iterator that will step through all terms.

public TermsEnum intersect(CompiledAutomaton compiled, BytesRef startTerm) throws java.io.IOException
Description copied from class: Terms
Returns a TermsEnum that iterates over all terms and documents that are accepted by the provided CompiledAutomaton. If the startTerm is provided then the returned enum will only return terms > startTerm, but you still must call next() first to get to the first term. Note that the provided startTerm must be accepted by the automaton.
This is an expert low-level API and will only work for NORMAL compiled automata. To handle any compiled automata, use CompiledAutomaton.getTermsEnum(org.apache.lucene.index.Terms) instead.
NOTE: the returned TermsEnum cannot seek.

protected void checkIntersectAutomatonType(CompiledAutomaton automaton)
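
The startTerm contract of intersect(CompiledAutomaton, BytesRef) can be illustrated without Lucene. In the sketch below, ToyIntersectEnum is a hypothetical class, a Predicate stands in for the compiled automaton, and a null return from next() signals exhaustion. It is an illustration of the documented contract under those assumptions, not the actual enum.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy forward-only enum: offers next() only, no seek methods. */
class ToyIntersectEnum {
    private final Iterator<String> candidates;
    private final Predicate<String> accepts; // stand-in for the automaton

    ToyIntersectEnum(TreeSet<String> terms, Predicate<String> accepts, String startTerm) {
        NavigableSet<String> view = terms;
        if (startTerm != null) {
            // Only terms strictly greater than startTerm are candidates.
            view = terms.tailSet(startTerm, false);
        }
        this.candidates = view.iterator();
        this.accepts = accepts;
    }

    /** Returns the next accepted term, or null when no terms remain. */
    String next() {
        while (candidates.hasNext()) {
            String term = candidates.next();
            if (accepts.test(term)) {
                return term;
            }
        }
        return null;
    }
}

class ToyIntersectDemo {
    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList("car", "cat", "cow", "dog"));
        // "car" is itself accepted by the predicate, as the contract requires of startTerm.
        ToyIntersectEnum te = new ToyIntersectEnum(terms, t -> t.startsWith("c"), "car");
        // The enum is positioned before the first term, so next() must be called first.
        for (String term = te.next(); term != null; term = te.next()) {
            System.out.println(term);
        }
    }
}
```
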
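
public BytesRef getMax()
Description copied from class: Terms
Returns the largest term (in lexicographic order) in the field.

public long size()
Description copied from class: Terms
Returns the number of terms for this field, or -1 if this measure isn't stored by the codec.

public long getSumTotalTermFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumTotalTermFreq in class Terms

public long getSumDocFreq()
Description copied from class: Terms
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Specified by: getSumDocFreq in class Terms

public int getDocCount()
Description copied from class: Terms
Returns the number of documents that have at least one term for this field.
Specified by: getDocCount in class Terms

public boolean hasFreqs()
Description copied from class: Terms
Returns true if documents in this field store per-document term frequency (PostingsEnum.freq()).

public boolean hasOffsets()
Description copied from class: Terms
Returns true if documents in this field store offsets.
Specified by: hasOffsets in class Terms

public boolean hasPositions()
Description copied from class: Terms
Returns true if documents in this field store positions.
Specified by: hasPositions in class Terms

public boolean hasPayloads()
Description copied from class: Terms
Returns true if documents in this field store payloads.
Specified by: hasPayloads in class Terms

public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable

public long ramBytesUsedWithoutDictionary()

public long getDictionaryRamBytesUsed()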
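
The page does not say how the three memory figures relate. The sketch below assumes, based only on the method names, that ramBytesUsed() is the sum of ramBytesUsedWithoutDictionary() and getDictionaryRamBytesUsed(), and that the dictionary contributes nothing until it is loaded. ToyRamAccounting and all byte counts are hypothetical.

```java
/** Hypothetical accounting holder; every number here is made up for illustration. */
class ToyRamAccounting {
    private static final long BASE_RAM_USAGE = 64; // assumed fixed shallow size
    private final long fieldMetadataBytes;
    private long dictionaryBytes = 0; // assumption: 0 until the dictionary is loaded

    ToyRamAccounting(long fieldMetadataBytes) {
        this.fieldMetadataBytes = fieldMetadataBytes;
    }

    /** Simulates the lazy dictionary load triggered by a seek. */
    void loadDictionary(long bytes) {
        dictionaryBytes = bytes;
    }

    long ramBytesUsedWithoutDictionary() {
        return BASE_RAM_USAGE + fieldMetadataBytes;
    }

    long getDictionaryRamBytesUsed() {
        return dictionaryBytes;
    }

    /** Assumed relation: total = part without dictionary + dictionary part. */
    long ramBytesUsed() {
        return ramBytesUsedWithoutDictionary() + getDictionaryRamBytesUsed();
    }
}

class ToyRamAccountingDemo {
    public static void main(String[] args) {
        ToyRamAccounting ram = new ToyRamAccounting(100);
        System.out.println("before seek: " + ram.ramBytesUsed());
        ram.loadDictionary(1000);
        System.out.println("after seek:  " + ram.ramBytesUsed());
    }
}
```

Reporting the dictionary separately lets a caller that shares one dictionary across several fields avoid counting it more than once; whether the real class is used that way is not stated here.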