public class HtmlToPlainText
extends java.lang.Object
Note that this is a fairly simplistic formatter -- for real world use you'll want to embrace and extend.
To invoke from the command line, assuming you've downloaded the jsoup jar to your current directory:
java -cp jsoup.jar org.jsoup.examples.HtmlToPlainText url [selector]
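The usage line above takes a required `url` and an optional `selector`. As a rough illustration of that convention only, the sketch below shows one way such arguments could be interpreted. The class `PlainTextArgs` and its members are hypothetical names invented for this example; they are not part of jsoup, and this is not a copy of the class's own `main` method.

```java
import java.util.Optional;

// Hypothetical helper, not part of jsoup: models the "url [selector]" argument convention.
public class PlainTextArgs {
    public final String url;                 // required first argument
    public final Optional<String> selector;  // optional second argument (a CSS selector)

    private PlainTextArgs(String url, Optional<String> selector) {
        this.url = url;
        this.selector = selector;
    }

    // Accepts exactly one or two arguments; anything else is a usage error.
    public static PlainTextArgs parse(String... args) {
        if (args.length < 1 || args.length > 2) {
            throw new IllegalArgumentException(
                "usage: java -cp jsoup.jar org.jsoup.examples.HtmlToPlainText url [selector]");
        }
        Optional<String> selector =
            args.length == 2 ? Optional.of(args[1]) : Optional.empty();
        return new PlainTextArgs(args[0], selector);
    }

    public static void main(String... args) {
        PlainTextArgs parsed = parse("https://example.com/", "div.content");
        System.out.println("url = " + parsed.url);
        System.out.println("selector = " + parsed.selector.orElse("(whole document)"));
    }
}
```

When the selector is absent, a caller would presumably format the whole document rather than a subset of elements.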
| Modifier and Type | Class and Description |
|---|---|
| private class | HtmlToPlainText.FormattingVisitor |
| Modifier and Type | Field and Description |
|---|---|
| private static int | timeout |
| private static java.lang.String | userAgent |
| Constructor and Description |
|---|
| HtmlToPlainText() |
| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | getPlainText(Element element): Format an Element to plain-text |
| static void | main(java.lang.String... args) |
private static final java.lang.String userAgent
private static final int timeout
public static void main(java.lang.String... args)
throws java.io.IOException
Throws: java.io.IOException
public java.lang.String getPlainText(Element element)
Format an Element to plain-text
Parameters: element - the root element to format
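A minimal sketch of calling `getPlainText` from your own code rather than from the command line. It assumes the jsoup jar (which, per the usage line above, contains `org.jsoup.examples.HtmlToPlainText`) is on the classpath. The wrapper class `PlainTextDemo` and its `toPlainText` method are hypothetical names used only for this illustration; it parses an inline HTML string so that no network access is needed.

```java
import org.jsoup.Jsoup;
import org.jsoup.examples.HtmlToPlainText;
import org.jsoup.nodes.Document;

// Hypothetical demo class, not part of jsoup.
public class PlainTextDemo {

    // Parses an HTML string and formats its <body> element as plain text.
    public static String toPlainText(String html) {
        Document doc = Jsoup.parse(html);
        HtmlToPlainText formatter = new HtmlToPlainText();
        return formatter.getPlainText(doc.body());
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1><p>Hello <b>world</b></p>";
        System.out.println(toPlainText(html));
    }
}
```

Because the formatter accepts any `Element` as its root, the same call should work on a single element picked out of a larger document, for example one matched by a CSS selector.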