A
C
D
E
F
G
H
I
K
L
M
N
O
P
R
S
T
U
W
A
action
- Variable in class
PageLexer
The action table. action.doit(state) performs the action for the given state.
addPage(URL, ObjectIterator, ObjectIterator)
- Method in class
WebIndex
Add the given web page to the index.
C
clearup()
- Method in class
CoolSearch
Clear the web index.
close()
- Method in class
URLTextReader
Close the stream.
compareTo(Object)
- Method in class
Page
CoolSearch
- class
CoolSearch
.
The servlet class.
CoolSearch()
- Constructor for class
CoolSearch
crawl()
- Method in class
WebSpider
Crawl the web, up to the default number of web pages.
crawl(int)
- Method in class
WebSpider
Crawl the web, up to a certain number of web pages.
crawl(int, String, WebIndex, String)
- Method in class
CoolSearch
Crawl the web.
crawlLimitDefault
- Variable in class
WebSpider
The maximum number of pages to crawl.
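A crawl bounded by a page limit can be pictured roughly as follows. This is a minimal sketch, not the WebSpider implementation: the class BoundedCrawlSketch, its in-memory link map, and the breadth-first order are all assumptions made for illustration, since the index does not say how WebSpider traverses links.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: breadth-first traversal of an in-memory link graph
// that stops once `limit` pages have been visited.
final class BoundedCrawlSketch {
    static List<String> crawl(Map<String, List<String>> links, String start, int limit) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();      // pages already queued
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty() && visited.size() < limit) {
            String page = queue.remove();
            visited.add(page);
            for (String next : links.getOrDefault(page, List.of())) {
                if (seen.add(next)) {            // add() is false for pages seen before
                    queue.add(next);
                }
            }
        }
        return visited;
    }
}
```

A real spider would fetch each page over the network and extract its links instead of reading them from a map.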
D
data
- Variable in class
ObjectIterator
delta
- Variable in class
PageLexer
The state-transition table.
doGet(HttpServletRequest, HttpServletResponse)
- Method in class
CoolSearch
The GET handler that responds to search.html.
doit(int)
- Method in class
PageLexer.Action
doPost(HttpServletRequest, HttpServletResponse)
- Method in class
CoolSearch
The POST handler that responds to manager.html.
E
elts
- Variable in class
PageLexer
F
fix(URL, String)
- Static method in class
URLFixer
Takes the parent-level URL and the link candidate and attempts to resolve any relative link.
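Resolving a relative link against its parent page can be sketched with the standard java.net.URI class. This is an illustration under assumptions, not the URLFixer code: the class RelativeLinkSketch and its resolve method are hypothetical names, and URLFixer may handle edge cases differently.

```java
import java.net.URI;

// Hypothetical sketch: resolve a link found on a page against that page's URL.
final class RelativeLinkSketch {
    static String resolve(String parentUrl, String linkCandidate) {
        // URI.resolve leaves absolute links unchanged and merges relative ones
        // with the parent's path.
        return URI.create(parentUrl).resolve(linkCandidate).toString();
    }
}
```

For example, resolving "../c.html" against "http://example.com/a/b.html" yields "http://example.com/c.html", while an absolute link is returned as given.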
G
getDirectoryAndFile(String)
- Static method in class
URLFixer
Function takes the non-host part of the string and returns the directory structure only.
getIndegree()
- Method in class
Page
getUrl()
- Method in class
Page
H
hasNext()
- Method in class
ObjectIterator
Determine whether there are more elements in the Iterator.
hasNext()
- Method in class
PageLexer
Determine whether there are more PageElements in the page.
hit
- Variable in class
WebIndex
href
- Variable in class
PageHref
HT_A
- Static variable in class
HttpTokenizer
A constant indicating an "a" has been read.
HT_BANG
- Static variable in class
HttpTokenizer
A constant indicating a '!' has been read.
HT_DASH
- Static variable in class
HttpTokenizer
A constant indicating a '-' has been read.
HT_EOF
- Static variable in class
HttpTokenizer
A constant indicating the end of the web document has been reached.
HT_EQUALS
- Static variable in class
HttpTokenizer
A constant indicating a '=' has been read.
HT_HREF
- Static variable in class
HttpTokenizer
A constant indicating an "href" has been read.
HT_IMG
- Static variable in class
HttpTokenizer
A constant indicating an "img" has been read.
HT_NUMBER
- Static variable in class
HttpTokenizer
A constant indicating a number token has been read.
HT_SLASH
- Static variable in class
HttpTokenizer
A constant indicating a '/' has been read.
HT_STRING
- Static variable in class
HttpTokenizer
A constant indicating a string token has been read.
HT_TAGCLOSE
- Static variable in class
HttpTokenizer
A constant indicating a '>' has been read.
HT_TAGOPEN
- Static variable in class
HttpTokenizer
A constant indicating a '<' has been read.
HT_WORD
- Static variable in class
HttpTokenizer
A constant indicating a word token has been read.
HttpTokenizer
- class
HttpTokenizer
.
A simple tokenizer for web pages.
HttpTokenizer(Reader)
- Constructor for class
HttpTokenizer
Create an HTTP tokenizer, given a Reader for the web page.
I
i
- Variable in class
ObjectIterator
i
- Variable in class
WebSpider
indegree
- Variable in class
Page
index
- Variable in class
WebIndex
intersect(ObjectIterator, ObjectIterator)
- Method in class
WebIndex
Get the intersection of two collections of Page objects.
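An intersection of two iterated collections might look like the sketch below. The class IntersectSketch is hypothetical and uses java.util.Iterator in place of ObjectIterator; how WebIndex actually compares Page objects is not stated in this index, so equals/hashCode equality is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: keep the elements of `first` that also occur in `second`,
// in the order they appear in `first`, without duplicates.
final class IntersectSketch {
    static <T> List<T> intersect(Iterator<T> first, Iterator<T> second) {
        Set<T> inSecond = new HashSet<>();
        while (second.hasNext()) {
            inSecond.add(second.next());
        }
        List<T> common = new ArrayList<>();
        Set<T> emitted = new HashSet<>();
        while (first.hasNext()) {
            T item = first.next();
            if (inSecond.contains(item) && emitted.add(item)) {
                common.add(item);
            }
        }
        return common;
    }
}
```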
K
keywords
- Variable in class
Page
L
links
- Variable in class
Page
M
makeIndex()
- Method in class
WebIndex
Index all of the pages that have been added by the addPage method.
N
next()
- Method in class
ObjectIterator
Get the next Object in the iteration.
next()
- Method in class
PageLexer
Return the next PageElement in the page.
nextToken()
- Method in class
HttpTokenizer
Parses the next token from the web page.
num
- Variable in class
PageNum
nval
- Variable in class
HttpTokenizer
If the current token is a number, this field contains the value of that number.
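The nval and sval fields resemble those of the standard java.io.StreamTokenizer, so the read-a-token-then-inspect-a-field pattern can be illustrated with that class. This sketch does not use HttpTokenizer itself; whether HttpTokenizer behaves the same way is an assumption, and TokenFieldsSketch is a hypothetical name.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch using java.io.StreamTokenizer (not HttpTokenizer):
// after each nextToken() call, the token type says which field is meaningful.
final class TokenFieldsSketch {
    static List<String> describe(String text) {
        StreamTokenizer tokens = new StreamTokenizer(new StringReader(text));
        List<String> out = new ArrayList<>();
        try {
            int type;
            while ((type = tokens.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_WORD) {
                    out.add("WORD:" + tokens.sval);    // sval holds the word
                } else if (type == StreamTokenizer.TT_NUMBER) {
                    out.add("NUMBER:" + tokens.nval);  // nval holds the number as a double
                } else {
                    out.add("CHAR:" + (char) type);    // ordinary characters are their own type
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```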
O
ObjectIterator
- class
ObjectIterator
.
This class creates Iterators out of Vectors.
ObjectIterator(Vector)
- Constructor for class
ObjectIterator
The constructor for ObjectIterator.
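An Iterator built over a Vector, with remove() left unimplemented as the index indicates, could be sketched as below. The field names data and i mirror the ObjectIterator variables listed in this index; the class name VectorIteratorSketch and everything else are assumptions.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Vector;

// Hypothetical sketch of an Iterator over a Vector; not the ObjectIterator source.
final class VectorIteratorSketch implements Iterator<Object> {
    private final Vector<?> data;  // the elements being iterated
    private int i = 0;             // index of the next element to return

    VectorIteratorSketch(Vector<?> data) {
        this.data = data;
    }

    @Override
    public boolean hasNext() {
        return i < data.size();
    }

    @Override
    public Object next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return data.elementAt(i++);
    }

    @Override
    public void remove() {
        // Removal is not supported, matching the "Unimplemented" entry for remove().
        throw new UnsupportedOperationException();
    }
}
```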
P
Page
- class
Page
.
An inner class that represents a page's URL and its indegree.
Page(String, int)
- Constructor for class
Page
Page(URL, ObjectIterator, ObjectIterator)
- Constructor for class
Page
PageElement
- interface
PageElement
.
Elements of web documents.
PageHref
- class
PageHref
.
A hyperlink in a web page.
PageHref(String)
- Constructor for class
PageHref
PageHref(URL, String)
- Constructor for class
PageHref
PageLexer
- class
PageLexer
.
A lexical analyzer for web documents, based on a finite-state machine.
PageLexer.Action
- class
PageLexer.Action
.
A private class that performs the action.
PageLexer.Action()
- Constructor for class
PageLexer.Action
PageLexer(Reader, URL)
- Constructor for class
PageLexer
Creates a new web page lexer.
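A lexer driven by a state-transition table and an action table can be sketched in miniature. The states, input classes, and tables below are invented for illustration and are far simpler than whatever PageLexer's delta and action tables contain; TableLexerSketch is a hypothetical name.

```java
// Hypothetical sketch: a two-state, table-driven machine that keeps only the
// text outside of <...> tags.
final class TableLexerSketch {
    private static final int TEXT = 0, TAG = 1;                    // states
    private static final int OPEN = 0, CLOSE = 1, OTHER = 2;       // input classes

    // State-transition table: DELTA[state][input] is the next state.
    private static final int[][] DELTA = {
        {TAG, TEXT, TEXT},   // from TEXT
        {TAG, TEXT, TAG},    // from TAG
    };

    // Action table: EMIT[state][input] says whether to output the character.
    private static final boolean[][] EMIT = {
        {false, false, true},   // in TEXT, emit ordinary characters only
        {false, false, false},  // in TAG, emit nothing
    };

    static String textOnly(String html) {
        StringBuilder out = new StringBuilder();
        int state = TEXT;
        for (char c : html.toCharArray()) {
            int input = c == '<' ? OPEN : c == '>' ? CLOSE : OTHER;
            if (EMIT[state][input]) {
                out.append(c);
            }
            state = DELTA[state][input];
        }
        return out.toString();
    }
}
```

Keeping transitions and actions in tables means new states or token kinds are added by extending the tables rather than by rewriting control flow.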
PageNum
- class
PageNum
.
A number in a web page.
PageNum(double)
- Constructor for class
PageNum
pages
- Variable in class
WebIndex
PageWord
- class
PageWord
.
A text word in a web page.
PageWord(String)
- Constructor for class
PageWord
R
read(char[], int, int)
- Method in class
URLTextReader
Read characters into a portion of an array.
reader
- Variable in class
URLTextReader
readLine()
- Method in class
URLTextReader
Read a line of text.
remove()
- Method in class
ObjectIterator
Unimplemented.
remove()
- Method in class
PageLexer
Unimplemented.
restore(FileInputStream)
- Method in class
WebIndex
Restore the index from a file.
restore(String)
- Method in class
CoolSearch
Restore the web index from a file.
retrievePages(ObjectIterator)
- Method in class
WebIndex
Retrieve all of the web pages that contain any of the given keywords.
retrievePages(PageWord)
- Method in class
WebIndex
Retrieve all of the web pages that contain the given keyword.
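The two retrievePages variants suggest a keyword-to-pages lookup. The sketch below shows one way such a lookup might work, assuming an inverted index held in a map; KeywordIndexSketch and its use of plain strings for pages and keywords are hypothetical and not taken from WebIndex.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of an inverted index: each keyword maps to the pages containing it.
final class KeywordIndexSketch {
    private final Map<String, Set<String>> index = new TreeMap<>();

    void addPage(String url, List<String> keywords) {
        for (String keyword : keywords) {
            index.computeIfAbsent(keyword, k -> new LinkedHashSet<>()).add(url);
        }
    }

    // Pages that contain the given keyword.
    Set<String> retrievePages(String keyword) {
        return index.getOrDefault(keyword, Set.of());
    }

    // Pages that contain any of the given keywords (a union of the single-keyword results).
    Set<String> retrievePages(List<String> keywords) {
        Set<String> result = new LinkedHashSet<>();
        for (String keyword : keywords) {
            result.addAll(retrievePages(keyword));
        }
        return result;
    }
}
```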
S
save(FileOutputStream)
- Method in class
WebIndex
Save the index to a file.
setIndegree(int)
- Method in class
Page
setUrl(String)
- Method in class
Page
StripPalm(String)
- Method in class
WebSpider
Strip all '#' and "/" from the URL in order to avoid intra-page links.
sval
- Variable in class
HttpTokenizer
If the current token is a word or string, this field gives the string.
T
tempHit
- Variable in class
WebIndex
tokens
- Variable in class
HttpTokenizer
tokenStream
- Variable in class
PageLexer
toString()
- Method in interface
PageElement
toString()
- Method in class
PageHref
toString()
- Method in class
PageNum
toString()
- Method in class
PageWord
toString()
- Method in class
WebIndex
Produce a printable representation of the index.
toString()
- Method in class
Page
U
u
- Variable in class
WebSpider
url
- Variable in class
PageLexer
url
- Variable in class
Page
URLFixer
- class
URLFixer
.
Helper class used to resolve relative URLs, given a parent context URL.
URLFixer()
- Constructor for class
URLFixer
URLTextReader
- class
URLTextReader
.
Read text from the Internet, given a URL.
URLTextReader(URL)
- Constructor for class
URLTextReader
Create a buffering URL text reader, given a URL.
W
w
- Static variable in class
CoolSearch
The index that maps keywords to pages.
WebIndex
- class
WebIndex
.
The web-indexing object.
WebIndex()
- Constructor for class
WebIndex
The constructor for WebIndex.
WebSpider
- class
WebSpider
.
Web-crawling objects.
WebSpider(URL, WebIndex)
- Constructor for class
WebSpider
Create a new web spider.
word
- Variable in class
HttpTokenizer
word
- Variable in class
PageWord