java.lang.Object
  |
  +--HttpTokenizer
A simple tokenizer for web pages. Objects of this class provide low-level, stream-based parsing of web pages. Given a Reader object that reads a web page, an HttpTokenizer provides a stream of tokens, in the same style as the StreamTokenizer class in java.io.

The possible tokens are as follows:

HT_EOF: The end of file
HT_NUMBER: A number, converted to a double
HT_WORD: A word, converted to all lowercase
HT_STRING: A quoted string
HT_TAGOPEN: A "<" character
HT_TAGCLOSE: A ">" character
HT_EQUALS: A "=" character
HT_SLASH: A "/" character
HT_DASH: A "-" character
HT_BANG: A "!" character
HT_A: The keyword "a"
HT_HREF: The keyword "href"
HT_IMG: The keyword "img"

When the nextToken() method returns HT_NUMBER, the instance variable nval contains the number as a double. When the nextToken() method returns HT_WORD or HT_STRING, the instance variable sval contains the token's string representation.
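The read loop described above has the same shape as the one used with java.io.StreamTokenizer. Because the HttpTokenizer source is not reproduced on this page, the following minimal sketch uses StreamTokenizer directly to show that style. The class name StreamTokenizerLoopDemo and the helper describeTokens are illustrative assumptions, not part of this API. With an HttpTokenizer, the same loop would presumably compare the result of nextToken() against the HT_* constants instead of the TT_* constants.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: shows the StreamTokenizer read loop that HttpTokenizer mirrors.
class StreamTokenizerLoopDemo {

    // Tokenizes the given text and returns one printable description per token.
    static List<String> describeTokens(String text) {
        StreamTokenizer tokens = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = tokens.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_NUMBER) {
                    out.add("NUMBER " + tokens.nval);   // numeric value is in nval
                } else if (type == StreamTokenizer.TT_WORD) {
                    out.add("WORD " + tokens.sval);     // word text is in sval
                } else if (type == '"' || type == '\'') {
                    out.add("STRING " + tokens.sval);   // quoted body is in sval
                } else {
                    out.add("CHAR " + (char) type);     // ordinary single character
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describeTokens("<img width=42 alt=\"logo\">")) {
            System.out.println(line);
        }
    }
}
```

Note that a default StreamTokenizer treats '/' as a comment character, so the sample input deliberately avoids closing tags.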
| Field Summary | |
static int |
HT_A
A constant indicating an "a" has been read. |
static int |
HT_BANG
A constant indicating a '!' has been read. |
static int |
HT_DASH
A constant indicating a '-' has been read. |
static int |
HT_EOF
A constant indicating the end of the web document has been reached. |
static int |
HT_EQUALS
A constant indicating a '=' has been read. |
static int |
HT_HREF
A constant indicating an "href" has been read. |
static int |
HT_IMG
A constant indicating an "img" has been read. |
static int |
HT_NUMBER
A constant indicating a number token has been read. |
static int |
HT_SLASH
A constant indicating a '/' has been read. |
static int |
HT_STRING
A constant indicating a string token has been read. |
static int |
HT_TAGCLOSE
A constant indicating a '>' has been read. |
static int |
HT_TAGOPEN
A constant indicating a '<' has been read. |
static int |
HT_WORD
A constant indicating a word token has been read. |
double |
nval
If the current token is a number, this field contains the value of that number. |
java.lang.String |
sval
If the current token is a word or string, this field gives the string. |
private java.io.StreamTokenizer |
tokens
|
private java.util.StringTokenizer |
word
|
| Constructor Summary | |
HttpTokenizer(java.io.Reader page)
Create an HTTP tokenizer, given a Reader for the web page. |
|
| Method Summary | |
int |
nextToken()
Parses the next token from the web page. |
| Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
public static final int HT_EOF
public static final int HT_NUMBER
public static final int HT_WORD
public static final int HT_STRING
public static final int HT_TAGOPEN
public static final int HT_TAGCLOSE
public static final int HT_EQUALS
public static final int HT_SLASH
public static final int HT_DASH
public static final int HT_BANG
public static final int HT_A
public static final int HT_HREF
public static final int HT_IMG
public java.lang.String sval
public double nval
private java.io.StreamTokenizer tokens
private java.util.StringTokenizer word
| Constructor Detail |
public HttpTokenizer(java.io.Reader page)
throws java.io.IOException
| Method Detail |
public int nextToken()
throws java.io.IOException
Throws: java.io.IOException
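To make the behavior of nextToken() more concrete, here is a minimal sketch of how a wrapper around java.io.StreamTokenizer could produce the token types listed in the class description. It is not the actual HttpTokenizer implementation: the class name SketchHttpTokenizer, the numeric values of the constants, the helper tokenTypes, and the choice to skip unrecognized characters are all assumptions made for illustration. The sketch also omits the private StringTokenizer field named word, since its role is not documented here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HttpTokenizer; constant values and internals are assumed.
class SketchHttpTokenizer {
    static final int HT_EOF = 0, HT_NUMBER = 1, HT_WORD = 2, HT_STRING = 3,
            HT_TAGOPEN = 4, HT_TAGCLOSE = 5, HT_EQUALS = 6, HT_SLASH = 7,
            HT_DASH = 8, HT_BANG = 9, HT_A = 10, HT_HREF = 11, HT_IMG = 12;

    String sval;   // set when a word, keyword, or string token is returned
    double nval;   // set when a number token is returned
    private final StreamTokenizer tokens;

    SketchHttpTokenizer(Reader page) {
        tokens = new StreamTokenizer(page);
        tokens.lowerCaseMode(true);   // words are converted to lowercase
        tokens.ordinaryChar('/');     // by default '/' starts a comment
        tokens.ordinaryChar('-');     // by default '-' is parsed as part of a number
    }

    // Returns the next recognized token type; unrecognized characters are skipped (assumption).
    int nextToken() throws IOException {
        while (true) {
            int type = tokens.nextToken();
            switch (type) {
                case StreamTokenizer.TT_EOF:
                    return HT_EOF;
                case StreamTokenizer.TT_NUMBER:
                    nval = tokens.nval;
                    return HT_NUMBER;
                case StreamTokenizer.TT_WORD:
                    sval = tokens.sval;
                    if (sval.equals("a")) return HT_A;
                    if (sval.equals("href")) return HT_HREF;
                    if (sval.equals("img")) return HT_IMG;
                    return HT_WORD;
                case '"':
                case '\'':
                    sval = tokens.sval;
                    return HT_STRING;
                case '<': return HT_TAGOPEN;
                case '>': return HT_TAGCLOSE;
                case '=': return HT_EQUALS;
                case '/': return HT_SLASH;
                case '-': return HT_DASH;
                case '!': return HT_BANG;
                default:
                    break;   // not a token this sketch knows about; read the next one
            }
        }
    }

    // Convenience helper: collects all token types up to, but not including, HT_EOF.
    static List<Integer> tokenTypes(String html) {
        SketchHttpTokenizer t = new SketchHttpTokenizer(new StringReader(html));
        List<Integer> types = new ArrayList<>();
        try {
            int type;
            while ((type = t.nextToken()) != HT_EOF) {
                types.add(type);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return types;
    }
}
```

Under these assumptions, a caller looking for links would watch for the sequence HT_TAGOPEN, HT_A, HT_HREF, HT_EQUALS, HT_STRING and read the link target from sval after the HT_STRING token.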