#1
Member (2 bit)
Join Date: Jun 2006
Posts: 2
Website indexing?
Hi,
I need a program that can retrieve all sites related to a certain keyword (with their description, link, and title, but not their content) and save them to my hard drive in some format (Excel, Access, text, ...). For example, if I search for Lebanon, it would retrieve all related sites (description, link, title) and save them. I'm trying to make a web directory, so it would be wonderful if such a program existed. Any ideas? Thanks
#2
Member (7 bit)
Join Date: Sep 2005
Location: UK
Posts: 114
I believe Google is what you're looking for?
If not, you could use Google's own index to your advantage: write software that 'googles' your search term, takes the results, and re-formats them to your own specifications. Then you could save them in whatever format you like. Just an idea.
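A rough sketch of the re-formatting and saving half of that idea, in Python. It assumes the results have already been collected somehow as a list of records with title, link, and description; how they are fetched is deliberately left out. The field names, the function names, and the sample record are all made up for illustration.

```python
import csv
import io

# Hypothetical record layout: one dict per result, using the three
# fields asked for in the first post.
FIELDS = ["title", "link", "description"]


def write_results(results, out):
    """Write result dicts to `out` (any text file object) as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for row in results:
        # Missing fields become empty cells; extra keys are dropped.
        writer.writerow({field: row.get(field, "") for field in FIELDS})


def results_to_csv_text(results):
    """Return the CSV as a string, handy for checking the output."""
    buf = io.StringIO(newline="")
    write_results(results, buf)
    return buf.getvalue()


# Placeholder data, not real search output.
sample = [
    {
        "title": "Example site",
        "link": "https://example.com/",
        "description": "A placeholder entry, not real search output.",
    },
]

if __name__ == "__main__":
    print(results_to_csv_text(sample))
```

To save to disk instead, open a file with `open("results.csv", "w", newline="", encoding="utf-8")` and pass it to `write_results`. Excel can open the resulting CSV file directly.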
#3
Wx geek
Join Date: Aug 2005
Location: Indiana
Posts: 6,638
If you want to index every website on the Web, you will need a massive amount of processing power and hard drive space. There are literally billions of pages on the Web.
A billion pages, with 1 KB of data stored for each, would require about a terabyte of hard drive space. I know you just want sites relating to a certain keyword. Searching for "Lebanon" on Google brings up about 195 million results, which would be about 195 GB of space. Now, I'm sure you wouldn't need that many pages, maybe just the top 100 or so, but you would still have to index every page, find which ones are related to what you are looking for, and sort them by relevance and popularity, just like Google does. Something like what moistmule described might be better. Just don't go loading thousands of pages from Google; I don't think they'll appreciate that. That said, I don't see this as very ethical: you're taking the work that Google has done and using it as your own.
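The arithmetic behind those figures, as a quick Python sketch. It assumes 1 KB means 1,000 bytes and that every page stores exactly that much; the function names are just for illustration.

```python
def storage_bytes(pages, bytes_per_page=1000):
    """Total bytes needed to keep `bytes_per_page` for each page."""
    return pages * bytes_per_page


def human(n):
    """Format a byte count using decimal units (1 KB = 1000 B)."""
    for unit in ["B", "KB", "MB", "GB", "TB"]:
        if n < 1000 or unit == "TB":
            return f"{n:g} {unit}"
        n /= 1000


if __name__ == "__main__":
    print(human(storage_bytes(1_000_000_000)))  # a billion pages
    print(human(storage_bytes(195_000_000)))    # the "Lebanon" result count
```

With these assumptions the two calls print `1 TB` and `195 GB`. Keeping only the top 100 results at the same size would come to about 100 KB.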
__________________
"It is the way of man to make monsters and it is the nature of monsters to destroy their makers."
#4
Member (2 bit)
Join Date: Jun 2006
Posts: 2
Hi,
Thanks for the replies. Indeed, I don't want to index all sites, just a couple of thousand. I don't know how to write such a program (one that can retrieve Google listings and re-format them), so I'm looking for an existing one that can do that. If there is no such program, how do people make directories? I see people adding a 2,000-page directory to their site from one day to the next: how do they get all those sites? Do they index them manually? Thanks