public class FSListCrawler extends FileResourceCrawler
| Modifier and Type | Field and Description |
|---|---|
| `private java.io.BufferedReader` | `reader` |
| `private java.nio.file.Path` | `root` |

Fields inherited from class FileResourceCrawler: ADDED, LOG, SKIPPED, STOP_NOW

| Constructor and Description |
|---|
| `FSListCrawler(java.util.concurrent.ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, java.io.File root, java.io.File list, java.lang.String encoding)` Deprecated. |
| `FSListCrawler(java.util.concurrent.ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, java.nio.file.Path root, java.nio.file.Path list, java.nio.charset.Charset charset)` Constructor for a crawler that reads a list of files to process. |
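
The `Path`-based constructor takes a root directory, a list file, and a charset for reading that list. As a rough illustration of what such a list file means, the sketch below reads one relative path per line and resolves each against the root. It uses only the JDK and is not the crawler's actual implementation: `ListFileSketch`, `resolveAll`, and the choice to skip blank lines are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListFileSketch {

    // Hypothetical helper, not part of FSListCrawler: reads one relative path
    // per line from the list file and resolves each against the root.
    static List<Path> resolveAll(Path root, Path list, Charset charset) throws IOException {
        List<Path> resolved = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(list, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // assumption: blank lines are ignored
                }
                resolved.add(root.resolve(line).normalize());
            }
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("crawl-root");
        Path list = Files.createTempFile("file-list", ".txt");
        // Two entries, each relative to the root directory.
        Files.write(list, Arrays.asList("a.pdf", "sub/b.docx"), StandardCharsets.UTF_8);
        for (Path p : resolveAll(root, list, StandardCharsets.UTF_8)) {
            System.out.println(p);
        }
    }
}
```

Passing the charset explicitly, rather than relying on the platform default, keeps the interpretation of non-ASCII file names in the list predictable across machines.
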

| Modifier and Type | Method and Description |
|---|---|
| `private java.lang.String` | `nextLine()` |
| `void` | `start()` Implement this to control the addition of FileResources. |

Methods inherited from class FileResourceCrawler: call, getAdded, getConsidered, isActive, isQueueEmpty, select, setDocumentSelector, setMaxConsecWaitInMillis, setMaxFilesToAdd, setMaxFilesToConsider, shutDownNoPoison, tryToAdd, wasTimedOut

`private final java.io.BufferedReader reader`

`private final java.nio.file.Path root`

`@Deprecated public FSListCrawler(java.util.concurrent.ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, java.io.File root, java.io.File list, java.lang.String encoding) throws java.io.FileNotFoundException, java.io.UnsupportedEncodingException`

Parameters: `fileQueue`, `numConsumers`, `root`, `list`, `encoding`

Throws:
- `java.io.FileNotFoundException`
- `java.io.UnsupportedEncodingException`

See Also: `FSListCrawler(ArrayBlockingQueue, int, Path, Path, Charset)`

`public FSListCrawler(java.util.concurrent.ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, java.nio.file.Path root, java.nio.file.Path list, java.nio.charset.Charset charset) throws java.io.IOException`

Constructor for a crawler that reads a list of files to process. The list should contain paths relative to the root.

Parameters:
- `fileQueue` - queue for batch
- `numConsumers` - number of consumers
- `root` - root input directory
- `list` - text file list (one file per line) of paths relative to the root for processing
- `charset` - charset of the list file

Throws: `java.io.IOException`

`public void start() throws java.lang.InterruptedException`

Description copied from class: `FileResourceCrawler`

Implement this to control the addition of FileResources. Call `FileResourceCrawler.tryToAdd(org.apache.tika.batch.FileResource)` to add FileResources to the queue.

Specified by: `start` in class `FileResourceCrawler`

Throws: `java.lang.InterruptedException`

`private java.lang.String nextLine()`
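
The `start()` contract, in which a crawler feeds a bounded `ArrayBlockingQueue` and may be interrupted while doing so, can be pictured with a plain JDK producer loop. The sketch below is only an analogy under stated assumptions: it does not use any Tika classes, and `BoundedProducerSketch`, `offerWithTimeout`, `addAll`, and the stop-on-timeout policy are hypothetical names and choices made for this example, not the behavior of `tryToAdd`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

final class BoundedProducerSketch {

    // Hypothetical stand-in for a crawler's add step: waits up to maxWaitMillis
    // for space in the queue and reports whether the item was accepted.
    static <T> boolean offerWithTimeout(ArrayBlockingQueue<T> queue, T item, long maxWaitMillis)
            throws InterruptedException {
        return queue.offer(item, maxWaitMillis, TimeUnit.MILLISECONDS);
    }

    // Adds items in order and stops at the first one that cannot be queued
    // within the wait limit. Returns the number of items actually added.
    static <T> int addAll(ArrayBlockingQueue<T> queue, List<T> items, long maxWaitMillis)
            throws InterruptedException {
        int added = 0;
        for (T item : items) {
            if (!offerWithTimeout(queue, item, maxWaitMillis)) {
                break; // assumption: give up once the queue stays full too long
            }
            added++;
        }
        return added;
    }

    public static void main(String[] args) throws InterruptedException {
        // Capacity 2 and no consumer, so the third offer will time out.
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        int added = addAll(queue, Arrays.asList("a.pdf", "b.docx", "c.txt"), 50L);
        System.out.println("added " + added + " of 3");
    }
}
```

A timed `offer` lets the producer notice stalled consumers instead of blocking forever, which is the kind of situation that inherited methods such as `setMaxConsecWaitInMillis` and `wasTimedOut` suggest the real crawler also accounts for.
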