[gopher] Re: Bot update
Yes, very much like WAIS, with each gopher server (for this particular
project, searching John's database) maintaining a part of the larger data set.
The idea is that you're searching a static database and each link points to
the cached database, so it should be faster than WAIS as it was used and could
still be used. I feel there is a distinction between WAIS and using WAIS
software to work on one static database, even if it's done similarly, because
with WAIS there was a lot of lag and latency, which we could trim by selecting
servers and groups of servers dedicated to the project.
And maybe we should have some people running WAIS anyhow.
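
As a rough sketch of the split described above (each server holding part of
the larger data set), one simple option is to hash each document's key and
take it modulo the number of servers. This is only an illustration under
stated assumptions: the host names, the use of a selector string as the key,
and the function names are all hypothetical and nothing in the thread settles
on this scheme.

```python
import hashlib

# Hypothetical host names; the thread does not name the participating servers.
SERVERS = ["gopher-a.example", "gopher-b.example", "gopher-c.example"]


def server_for(document_key, servers=SERVERS):
    """Pick the server that holds a document by hashing its key."""
    digest = hashlib.sha1(document_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # around the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def partition(document_keys, servers=SERVERS):
    """Group document keys by the server assigned to each one."""
    shards = {server: [] for server in servers}
    for key in document_keys:
        shards[server_for(key, servers)].append(key)
    return shards
```

The assignment is deterministic, so any front end can work out which server to
ask for a given document. The drawback is that adding or removing a server
moves most documents, so a fixed, dedicated group of servers suits it best.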
John had some ideas for making pygopherd's directories easier for Google's
spider to traverse... What about a way to have the full text be googled,
derived from pygopherd's HTML-ized pages? One would run a pygopherd server
with the 28G of data on it, let Google spider it, and then make a gopher
gateway to Google's results for http://gophersite:80/bigdataset.
That way Google does the work of the search and it's brought back "down" into
gopherland. Just more thoughts...
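
To illustrate the last step of that idea (bringing results back "down" into
gopherland), here is a minimal sketch that turns a list of search hits into a
gopher menu as laid out in RFC 1436: one line per item with the type
character, display string, selector, host and port separated by tabs, and a
lone "." ending the listing. How the hits would be fetched from Google is not
shown. The (title, selector) result shape and the function names are
assumptions made for the example.

```python
def menu_line(item_type, display, selector, host, port):
    """Format one gopher menu item (RFC 1436 directory entry)."""
    # Tabs and line breaks in the title would corrupt the menu, so
    # replace them with spaces.
    clean = display.replace("\t", " ").replace("\r", " ").replace("\n", " ")
    return f"{item_type}{clean}\t{selector}\t{host}\t{port}\r\n"


def results_to_menu(results, host, port=70):
    """Build a complete menu from (title, selector) pairs.

    Every hit is listed as item type "0" (plain text file). This is an
    assumption; a real gateway would pick the type per document.
    """
    lines = [menu_line("0", title, selector, host, port)
             for title, selector in results]
    # A line holding a single period marks the end of the listing.
    return "".join(lines) + ".\r\n"
```

For example, results_to_menu([("Hello", "/docs/hello.txt")], "gophersite")
yields one type-0 entry pointing at gophersite port 70, followed by the
closing "." line.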
Chris
On Wed, 30 Nov 2005 12:06:58 +0300
"R.A.Pavlov" <webmaster@xxxxxxxxxx> wrote:
> On Tue, Nov 29, 2005 at 09:03:35PM -0600, Chris wrote:
> > Some other possibilities came to mind, things such as breaking it up into
> > datasets and having various boxen here as well as at other gophers each
> > maintain a dataset or sets.
>
> And this is close to the idea of WAIS searches, where each server indexes
> its own content and other servers maintain lists of links to these WAIS
> servers, if you mean that each gopher server maintains a database
> of its own content.
>
> By the way I have some progress with WAIS and will show some results to
> the public very soon.
>
> > These were just some thoughts I had. Thanks, John, for getting it. I think
> > it's awesome and am excited to see what we can all do with it.
> > Chris
> > gopher://hal3000.cx
> >
> >
> > On Tue, 29 Nov 2005 17:20:06 -0600
> > John Goerzen <jgoerzen@xxxxxxxxxxxx> wrote:
> >
> > > On Wed, Nov 16, 2005 at 10:04:17PM -0600, Jeff wrote:
> > > > On Sun, 30 Oct 2005 21:48:51 -0600, John Goerzen
> > > > <jgoerzen@xxxxxxxxxxxx>
> > > > wrote:
> > > >
> > > > > Here's an update on the gopher bot:
> > > > >
> > > > > There is currently 28G of data archived representing 386,315
> > > > > documents. 1.3 million documents remain to be visited, from
> > > > > approximately 20 very large Gopher servers. I believe, then, that the
> > > > > majority of gopher servers have been cached by this point. 3,987
> > > > > different servers are presently represented in the archive.
> > > >
> > > > Any news?
> > >
> > > Not really. The bot hit a point where its algorithm for storing page
> > > information was getting to be too slow, and there was also a problem
> > > with the database layer I'm using segfaulting. When I get some time, I
> > > will write a new layer.
> > >
> > > In the meantime, I'd like to talk about how to get this data to others
> > > that might be willing to host it, as well as how to store it out there
> > > for the public. Any ideas?
> > >
> > >
> > >
> > >
> >
> >
> > --
> > Join FSF as an Associate Member at:
> > <URL:http://member.fsf.org/join?referrer=3014>
> >
>
> --
> Yours, etc.
> Roman A. Pavlov
>
> gopher://sdf.lonestar.org/1/users/rp
>
>
>
>
>
--
Join FSF as an Associate Member at:
<URL:http://member.fsf.org/join?referrer=3014>