java.lang.Object
  |
  +--WebSpider
Web-crawling objects. Instances of this class will crawl a given web site in breadth-first order.
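The breadth-first order mentioned above can be illustrated with a minimal sketch. This is not the WebSpider implementation: the class `BfsCrawlSketch`, its `crawl` method, and the in-memory `links` map (standing in for fetching a page and extracting its links) are all hypothetical names introduced here for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical sketch of a breadth-first crawl; not the WebSpider source.
final class BfsCrawlSketch {

    // Visits pages breadth-first from `start`, stopping after `limit` pages.
    // `links` maps each page to the pages it links to (assumption: a real
    // crawler would download the page and parse its links instead).
    static List<String> crawl(String start, Map<String, List<String>> links, int limit) {
        List<String> visited = new ArrayList<>();
        if (limit <= 0) {
            return visited;
        }
        Set<String> seen = new HashSet<>();          // pages already queued
        Queue<String> frontier = new ArrayDeque<>(); // FIFO queue gives BFS order
        frontier.add(start);
        seen.add(start);
        while (!frontier.isEmpty() && visited.size() < limit) {
            String page = frontier.remove();
            visited.add(page);
            for (String next : links.getOrDefault(page, List.of())) {
                if (seen.add(next)) {                // true only for unseen pages
                    frontier.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = Map.of(
            "/", List.of("/a", "/b"),
            "/a", List.of("/c", "/"),
            "/b", List.of("/c"),
            "/c", List.of());
        System.out.println(crawl("/", links, 3)); // prints [/, /a, /b]
    }
}
```

The FIFO queue is what makes the traversal breadth-first: all pages one link away from the start are visited before any page two links away. The `seen` set keeps a page from being queued twice when several pages link to it.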
| Field Summary |
| int | crawlLimitDefault | The maximum number of pages to crawl. |
| private WebIndex | i | |
| private java.net.URL | u | |
| Constructor Summary |
| WebSpider(java.net.URL u, WebIndex i) | Create a new web spider. |
| Method Summary |
| WebIndex | crawl() | Crawl the web, up to the default number of web pages. |
| WebIndex | crawl(int limit) | Crawl the web, up to the given number of web pages. |
| private java.lang.String | StripPalm(java.lang.String s) | Strip the '#' and "/" characters from the URL so that intra-page links are avoided. |
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
public int crawlLimitDefault
private java.net.URL u
private WebIndex i
| Constructor Detail |
public WebSpider(java.net.URL u, WebIndex i)
u - The URL of the web site to crawl.
i - The initial web index object to extend.
| Method Detail |
public WebIndex crawl(int limit)
limit - The maximum number of pages to crawl.
public WebIndex crawl()
private java.lang.String StripPalm(java.lang.String s)
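The page does not show how StripPalm is implemented. As one plausible reading of its summary, the sketch below drops the `#fragment` part and any trailing "/" so that links to different positions within the same page compare equal. The class `UrlNormalizeSketch` and the method `stripFragmentAndTrailingSlash` are hypothetical names, and the exact rules are an assumption rather than the documented behavior.

```java
// Hypothetical sketch of intra-page link normalization; not the actual StripPalm.
final class UrlNormalizeSketch {

    // Removes everything from the first '#' onward, then removes trailing
    // slashes, leaving the "//" that follows the scheme untouched.
    static String stripFragmentAndTrailingSlash(String url) {
        int hash = url.indexOf('#');
        String base = (hash >= 0) ? url.substring(0, hash) : url;
        while (base.endsWith("/") && !base.endsWith("://")) {
            base = base.substring(0, base.length() - 1);
        }
        return base;
    }

    public static void main(String[] args) {
        // Both links point into the same page, so both normalize to the same string.
        System.out.println(stripFragmentAndTrailingSlash("http://example.com/page#top"));
        System.out.println(stripFragmentAndTrailingSlash("http://example.com/page/"));
    }
}
```

Normalizing before checking whether a URL has already been seen keeps a crawler from counting `page`, `page/`, and `page#top` as three separate pages against its crawl limit.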