Complete.Org: Mailing Lists: Archives: gopher: May 2004:
[gopher] Cicada Incomplete Gopher Census

To: gopher@xxxxxxxxxxxx
Subject: [gopher] Cicada Incomplete Gopher Census
From: Tim Fraser <tfraser@xxxxxxxxxx>
Date: Thu, 27 May 2004 22:23:33 -0400
Reply-to: gopher@xxxxxxxxxxxx

I recently indulged my long-time fascination with Gopherspace by
writing a spider program to determine how many Gophers are actually
out there.  Of course, now that I'm finished, I've found Cameron
Kaiser's post in the gopher mailing list archive announcing that he'd
finished updating the much nicer V-2 database at floodgap.org,
even as my little toy spider was chugging away.  Seems I should have
picked a different project to pass the time! ;^)

Fortunately, nobody ever said a fun hacking project had to be
*useful*, so I'll post this note about my results nonetheless.  While
my program is not perfect, its results still seem interesting: After
spidering intermittently between Sunday 23 May and Thursday 27 May
2004, I found 154 Gopher servers live and ready to serve useful data.
I found an additional 16 live Gophers that could not serve any useful
data due to an inability to access their log files, configuration
files, data directories, or back-end NNTP servers.  My spider found
links to 1777 Gopher servers that were not operational at the time of
my spidering.
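
For readers curious about what a spider like this has to do at each step,
here is a minimal sketch in Python.  It is not the actual census program;
it only assumes the menu format from RFC 1436 (item type character, display
string, selector, host, and port, separated by tabs, with a lone "." ending
the listing).  The names MenuItem, parse_menu, and fetch_menu are made up
for this illustration.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str   # single character, e.g. "1" for a directory
    display: str
    selector: str
    host: str
    port: int


def parse_menu(raw: str) -> List[MenuItem]:
    """Parse an RFC 1436 style menu into items, skipping malformed lines."""
    items = []
    for line in raw.split("\n"):
        line = line.rstrip("\r")
        if line == ".":          # a lone dot ends the listing
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:      # e.g. informational lines without host/port
            continue
        try:
            port = int(fields[3].strip())
        except ValueError:
            continue
        # Hostnames are case-insensitive, so normalize for later counting.
        items.append(MenuItem(line[0], fields[0], fields[1],
                              fields[2].lower(), port))
    return items


def fetch_menu(host: str, port: int = 70, selector: str = "",
               timeout: float = 10.0) -> str:
    """Send one selector and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("latin-1") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")
```

A server that refuses the connection or times out in fetch_menu would be
counted as not operational under this sketch; how the real spider told
"live but unable to serve data" apart from "useful" is not shown here.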

I've somewhat pretentiously decided to call this the "Cicada
Incomplete Gopher Census".  "Cicada" is in honor of the 17-year brood
of magicicadas that kept my computer room filled with their
astonishing multidecibel flying-saucer wailing throughout the entire
spidering process.  (It's *really* loud!)  "Incomplete" is in
recognition of the fact that the census almost certainly missed some
Gophers.  The full results of this incomplete Gopher census and the
spidering program I used to produce them are available via Gopher at:

sdf.lonestar.org/11/users/tfraser

Any feedback would be appreciated.

Perhaps the most egregious error in my survey is my failure to
completely spider two large sites: Gopher.dna.affrc.go.jp:70 and
bbs.nsysu.edu.tw:70.  Near the end of my spidering, I realized that my
primitive spider had been automatically banned from at least one site
(floodgap.com) for behaving rudely in its quest to traverse all of the
site's directories looking for links to other sites.  Horrified that
my little project to celebrate Gopher's longevity might be causing
trouble for Gopher administrators, I cut the spidering of these last
two sites short.  My apologies to anyone I might have inconvenienced!
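
One common way to keep a spider from behaving rudely is to enforce a
minimum pause between requests to the same host.  The sketch below is only
an illustration of that idea, not something the census spider did; the
class name HostThrottle and the 2-second default are assumptions chosen for
the example.

```python
import time
from typing import Callable, Dict


class HostThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(self, min_interval: float = 2.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = min_interval
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._last: Dict[str, float] = {}

    def wait(self, host: str) -> float:
        """Block until `host` may be contacted again; return seconds slept."""
        now = self._clock()
        last = self._last.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self._min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record the time at which this request is allowed to go out.
        self._last[host] = now + delay
        return delay
```

Calling throttle.wait(host) before every request spreads the load on any
one server while still letting the spider move quickly between different
hosts.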

With these shortcomings in mind, here are some highlights:

The five Gophers with the most selectors might be:

74632 ftp.std.com:70
      Software Tool & Die's The World (ISP)
29931 fas.sfu.ca:75
      Faculty of Applied Sciences, Simon Fraser University
29799 nic.merit.edu:7043
      Merit Network Information Center Services (MichNet)
28201 dongpo.math.ncu.edu.tw:70
      Department of Mathematics, National Central University
21775 osiris.wu-wien.ac.at:71
      A Gopher at the University of Vienna

The five Gophers with links to the most other sites might be:

851 gopher.csie.nctu.edu.tw:70
    Dept. of Comp Sci and Info Eng, National Chiao-Tung University
211 gopher.floodgap.com:70
    Floodgap Systems, formerly gopher.ptloma.edu
185 www.polarhome.com:27070
    Polarhome (public access UNIX and VMS)
180 sdf.lonestar.org:70
    Super Dimension Fortress (public access UNIX)
178 iubio.bio.indiana.edu:70
    Indiana University Bio-Archive
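
A ranking like the one above can be produced from a list of (source site,
linked site) pairs collected while spidering.  The sketch below assumes
that "links to other sites" means distinct host:port pairs other than the
source itself; that reading, the function name top_linkers, and the
example.org hosts in the usage note are all assumptions made for
illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def top_linkers(links: Iterable[Tuple[str, str]],
                n: int = 5) -> List[Tuple[int, str]]:
    """Rank source sites by how many distinct other sites they link to."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for source, destination in links:
        if source != destination:        # ignore links within the same site
            targets[source].add(destination)
    ranked = [(len(dests), source) for source, dests in targets.items()]
    # Highest count first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda pair: (-pair[0], pair[1]))
    return ranked[:n]
```

For example, feeding it [("a.example.org:70", "b.example.org:70"),
("a.example.org:70", "c.example.org:70"), ("b.example.org:70",
"a.example.org:70")] ranks a.example.org:70 first with a count of 2.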

Many thanks to all who continue to support Gopherspace,

Tim Fraser

