java.lang.Object
org.apache.lucene.search.CollectionStatistics
Contains statistics for a collection (field).
This class holds statistics across all documents for scoring purposes:
- maxDoc(): number of documents.
- docCount(): number of documents that contain this field.
- sumDocFreq(): number of postings-list entries.
- sumTotalTermFreq(): number of tokens.
The following conditions are always true:
- All statistics are positive integers: never zero or negative.
- docCount <= maxDoc
- docCount <= sumDocFreq <= sumTotalTermFreq
Values may include statistics on deleted documents that have not yet been merged away.
Be careful when performing calculations on these values: they are represented as 64-bit
integers, so you may need to cast them to double before using them in floating-point arithmetic.
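Because every statistic is a long, dividing one by another performs integer division and silently truncates. The following is a minimal sketch of an average-field-length calculation. It uses plain long values in place of a real CollectionStatistics instance, and the names AvgFieldLengthSketch and avgFieldLength are hypothetical, not part of Lucene.

```java
final class AvgFieldLengthSketch {
    // Average number of tokens per document that contains the field.
    // The arguments stand in for sumTotalTermFreq() and docCount().
    static double avgFieldLength(long sumTotalTermFreq, long docCount) {
        // Cast before dividing; long / long would truncate toward zero.
        return (double) sumTotalTermFreq / docCount;
    }

    public static void main(String[] args) {
        long sumTotalTermFreq = 7;
        long docCount = 2;
        System.out.println(sumTotalTermFreq / docCount);                // integer division: 3
        System.out.println(avgFieldLength(sumTotalTermFreq, docCount)); // 3.5
    }
}
```

Since docCount is documented as always positive, the division above needs no zero check when the inputs come from a valid instance.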
-
Field Summary
Fields (Modifier and Type, Field):
- private final long docCount
- private final String field
- private final long maxDoc
- private final long sumDocFreq
- private final long sumTotalTermFreq
Constructor Summary
Constructors (Constructor, Description):
- CollectionStatistics(String field, long maxDoc, long docCount, long sumTotalTermFreq, long sumDocFreq): Creates a statistics instance for a collection (field).
Method Summary
Methods (Modifier and Type, Method, Description):
- final long docCount(): The total number of documents that have at least one term for this field.
- final String field(): The field's name.
- final long maxDoc(): The total number of documents, regardless of whether they all contain values for this field.
- final long sumDocFreq(): The total number of posting list entries for this field.
- final long sumTotalTermFreq(): The total number of tokens for this field.
- String toString()
-
Field Details
-
field
private final String field
-
maxDoc
private final long maxDoc
-
docCount
private final long docCount
-
sumTotalTermFreq
private final long sumTotalTermFreq
-
sumDocFreq
private final long sumDocFreq
-
-
Constructor Details
-
CollectionStatistics
public CollectionStatistics(String field, long maxDoc, long docCount, long sumTotalTermFreq, long sumDocFreq)
Creates a statistics instance for a collection (field).
Parameters:
- field - field's name.
- maxDoc - total number of documents.
- docCount - number of documents containing the field.
- sumTotalTermFreq - number of tokens in the field.
- sumDocFreq - number of postings list entries for the field.
Throws:
- IllegalArgumentException - if maxDoc is negative or zero.
- IllegalArgumentException - if docCount is negative or zero.
- IllegalArgumentException - if docCount is more than maxDoc.
- IllegalArgumentException - if sumDocFreq is less than docCount.
- IllegalArgumentException - if sumTotalTermFreq is less than sumDocFreq.
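The documented failure conditions can be sketched as a standalone check. This is an illustration of the listed rules only, not the library's actual constructor body; the class CollectionStatsChecks, its methods, and the exception messages are hypothetical.

```java
final class CollectionStatsChecks {
    // Hypothetical helper: throws IllegalArgumentException under the same
    // conditions the constructor documentation lists, in the same order.
    static void validate(long maxDoc, long docCount, long sumTotalTermFreq, long sumDocFreq) {
        if (maxDoc <= 0) {
            throw new IllegalArgumentException("maxDoc must be positive, got " + maxDoc);
        }
        if (docCount <= 0) {
            throw new IllegalArgumentException("docCount must be positive, got " + docCount);
        }
        if (docCount > maxDoc) {
            throw new IllegalArgumentException("docCount must not exceed maxDoc");
        }
        if (sumDocFreq < docCount) {
            throw new IllegalArgumentException("sumDocFreq must be at least docCount");
        }
        if (sumTotalTermFreq < sumDocFreq) {
            throw new IllegalArgumentException("sumTotalTermFreq must be at least sumDocFreq");
        }
    }

    // Convenience wrapper that reports validity instead of throwing.
    static boolean isValid(long maxDoc, long docCount, long sumTotalTermFreq, long sumDocFreq) {
        try {
            validate(maxDoc, docCount, sumTotalTermFreq, sumDocFreq);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(10, 8, 50, 20));  // true
        System.out.println(isValid(10, 11, 50, 20)); // false: docCount exceeds maxDoc
    }
}
```

The argument order follows the constructor signature, where sumTotalTermFreq comes before sumDocFreq.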
-
-
Method Details
-
field
public final String field()
The field's name.
This value is never null.
Returns:
- field's name, not null
-
maxDoc
public final long maxDoc()
The total number of documents, regardless of whether they all contain values for this field.
This value is always a positive number.
Returns:
- total number of documents, in the range [1 .. Long.MAX_VALUE]
-
docCount
public final long docCount()
The total number of documents that have at least one term for this field.
This value is always a positive number, and never exceeds maxDoc().
Returns:
- total number of documents containing this field, in the range [1 .. maxDoc()]
-
sumTotalTermFreq
public final long sumTotalTermFreq()
The total number of tokens for this field. This is the "word count" for this field across all documents. It is the sum of TermStatistics.totalTermFreq() across all terms. It is also the sum of each document's field length across all documents.
This value is always a positive number, and always at least sumDocFreq().
Returns:
- total number of tokens in the field, in the range [sumDocFreq() .. Long.MAX_VALUE]
-
sumDocFreq
public final long sumDocFreq()
The total number of posting list entries for this field. This is the sum of term-document pairs: the sum of TermStatistics.docFreq() across all terms. It is also the sum of each document's unique term count for this field across all documents.
This value is always a positive number, always at least docCount(), and never exceeds sumTotalTermFreq().
Returns:
- number of posting list entries, in the range [docCount() .. sumTotalTermFreq()]
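To make the four statistics concrete, here is a small sketch that derives them from an in-memory toy corpus. It assumes each document's field is already tokenized into a list of strings and ignores deleted documents; FieldStatsSketch and compute are hypothetical names, and no Lucene classes are involved.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

final class FieldStatsSketch {
    // Returns {maxDoc, docCount, sumDocFreq, sumTotalTermFreq} for one field.
    // Each inner list holds the tokens of that field in one document;
    // an empty list means the document has no terms for the field.
    static long[] compute(List<List<String>> docs) {
        long maxDoc = docs.size();
        long docCount = 0;
        long sumDocFreq = 0;
        long sumTotalTermFreq = 0;
        for (List<String> tokens : docs) {
            if (tokens.isEmpty()) {
                continue; // counts toward maxDoc only
            }
            docCount++;
            sumTotalTermFreq += tokens.size();           // field length of this document
            sumDocFreq += new HashSet<>(tokens).size();  // unique terms in this document
        }
        return new long[] {maxDoc, docCount, sumDocFreq, sumTotalTermFreq};
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("a", "b", "a"),  // 3 tokens, 2 unique terms
            List.of("b", "c"),       // 2 tokens, 2 unique terms
            List.<String>of());      // field absent
        System.out.println(Arrays.toString(compute(docs))); // [3, 2, 4, 5]
    }
}
```

In this toy corpus the documented ordering holds: docCount (2) <= maxDoc (3), and docCount (2) <= sumDocFreq (4) <= sumTotalTermFreq (5).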
-
toString
-