public class CommonsDigester extends java.lang.Object implements DigestingParser.Digester
A DigestingParser.Digester that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
This digester first tries the regular mark/reset protocol on the InputStream, reading through an internal BoundedInputStream. If the bound is hit before the InputStream has been fully read, the digester resets the stream, spools the InputStream to disk (via TikaInputStream), and then digests the resulting file.
If a TikaInputStream is passed in and it has an underlying file that is longer than the markLimit, then this digester digests the file directly.
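The bounded mark/reset attempt described above can be sketched with the JDK alone. This is a minimal illustration, not the CommonsDigester source: the class name MarkResetDigestSketch and the method digestWithinLimit are hypothetical, java.security.MessageDigest stands in for DigestUtils, and a null return stands in for "the bound was hit, so the caller should fall back to spooling the stream to a file".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a bounded mark/reset digest; not Tika's implementation.
class MarkResetDigestSketch {

    /**
     * Reads at most markLimit bytes past the mark. Returns the hex digest if the
     * stream ended within the limit, or null if the bound was hit. The stream is
     * reset to its starting position in both cases.
     */
    static String digestWithinLimit(InputStream is, String jcaAlgorithm, int markLimit)
            throws IOException, NoSuchAlgorithmException {
        if (!is.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        if (markLimit < 0 || markLimit == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("markLimit out of range for this sketch");
        }
        MessageDigest md = MessageDigest.getInstance(jcaAlgorithm);
        // One extra byte lets us tell "exactly markLimit bytes" from "more than markLimit".
        int bound = markLimit + 1;
        is.mark(bound);
        byte[] buf = new byte[4096];
        int total = 0;
        boolean hitBound = false;
        try {
            while (true) {
                int want = Math.min(buf.length, bound - total);
                if (want == 0) {
                    hitBound = true; // read markLimit + 1 bytes: the stream is too long
                    break;
                }
                int n = is.read(buf, 0, want);
                if (n == -1) {
                    break; // end of stream reached within the limit
                }
                md.update(buf, 0, n);
                total += n;
            }
        } finally {
            is.reset(); // leave the stream where we found it
        }
        return hitBound ? null : toHex(md.digest());
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        InputStream is = new ByteArrayInputStream(data);
        System.out.println(digestWithinLimit(is, "MD5", 1024)); // hex digest: the stream fits
        System.out.println(digestWithinLimit(is, "MD5", 3));    // null: bound hit
    }
}
```

In the sketch, a null result is the point at which a caller would spool the reset stream to a temporary file and digest that file instead, as the class description outlines.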
| Modifier and Type | Class and Description |
|---|---|
| static class | CommonsDigester.DigestAlgorithm |
| private class | CommonsDigester.SimpleBoundedInputStream<br>Very slight modification of Commons' BoundedInputStream so that we can figure out if this hit the bound or not. |
| Modifier and Type | Field and Description |
|---|---|
| private java.util.List<CommonsDigester.DigestAlgorithm> | algorithms |
| private int | markLimit |
| Constructor and Description |
|---|
| CommonsDigester(int markLimit, CommonsDigester.DigestAlgorithm... algorithms) |
| Modifier and Type | Method and Description |
|---|---|
| void | digest(java.io.InputStream is, Metadata m, ParseContext parseContext)<br>Digests an InputStream and sets the appropriate value(s) in the metadata. |
| private boolean | digestEach(CommonsDigester.DigestAlgorithm algorithm, java.io.InputStream is, Metadata metadata) |
| private void | digestFile(java.io.File f, Metadata m) |
| static CommonsDigester.DigestAlgorithm[] | parse(java.lang.String s) |
private final java.util.List<CommonsDigester.DigestAlgorithm> algorithms
private final int markLimit
public CommonsDigester(int markLimit,
CommonsDigester.DigestAlgorithm... algorithms)
public void digest(java.io.InputStream is,
Metadata m,
ParseContext parseContext)
throws java.io.IOException
Description copied from interface: DigestingParser.Digester
The given stream is guaranteed to support the mark feature, and the digester is expected to mark the stream before reading any bytes from it and to reset the stream before returning. The stream must not be closed by the digester.
Specified by: digest in interface DigestingParser.Digester
Parameters:
is - InputStream to digest
m - Metadata to set the values for
parseContext - ParseContext
Throws: java.io.IOException
private void digestFile(java.io.File f,
Metadata m)
throws java.io.IOException
Throws: java.io.IOException
private boolean digestEach(CommonsDigester.DigestAlgorithm algorithm,
java.io.InputStream is,
Metadata metadata)
throws java.io.IOException
Parameters:
algorithm - algo to use
is - input stream to read from
metadata - metadata for reporting the digest
Throws: java.io.IOException
public static CommonsDigester.DigestAlgorithm[] parse(java.lang.String s)
Parameters:
s - comma-delimited (no space) list of algorithms to use: md5,sha256
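As an illustration of the comma-delimited (no space) format that parse accepts, the following sketch splits such a string and maps each token to an enum constant. It is an assumption-laden stand-in, not the CommonsDigester source: ParseAlgorithmsSketch and its Algo enum are hypothetical names, the set of constants is assumed, and the real method may handle case and errors differently.

```java
import java.util.Locale;

// Hypothetical stand-in for parsing a list such as "md5,sha256"; not Tika's code.
class ParseAlgorithmsSketch {

    // Assumed set of algorithm names, for illustration only.
    enum Algo { MD2, MD5, SHA1, SHA256, SHA384, SHA512 }

    /** Splits on commas and maps each token, case-insensitively, to an Algo. */
    static Algo[] parse(String s) {
        if (s == null || s.isEmpty()) {
            return new Algo[0];
        }
        String[] parts = s.split(",");
        Algo[] algos = new Algo[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                // Tokens are not trimmed, so "md5, sha256" is rejected (no spaces allowed).
                algos[i] = Algo.valueOf(parts[i].toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                throw new IllegalArgumentException(
                        "Unrecognized digest algorithm: '" + parts[i] + "'", e);
            }
        }
        return algos;
    }

    public static void main(String[] args) {
        for (Algo a : parse("md5,sha256")) {
            System.out.println(a);
        }
    }
}
```

Under these assumptions, the parsed array is what a caller would pass as the varargs algorithms argument of the constructor, alongside a markLimit.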