[gopher] Improved binary file detection in Bucktooth 0.2.2

To:	gopher@xxxxxxxxxxxx
Subject:	[gopher] Improved binary file detection in Bucktooth 0.2.2
From:	brian@xxxxxxxxxxxxx
Date:	Fri, 28 Dec 2007 01:23:39 -0600
Reply-to:	gopher@xxxxxxxxxxxx

I'm using buckd to serve up binary files, and noticed that several
binary files (mostly older PDFs with a lot of text in the file header)
were being identified as item type "0" rather than "9". It turns out
that buckd uses the Perl -B operator to detect binary files.  To do
this, Perl examines the first block or so of the file for certain
characteristics (nul bytes, high-order bits set, etc.), and if more
than about 30% of those bytes look odd, Perl identifies the file as
binary.
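
As a rough illustration of that kind of heuristic (this is not Perl's
actual implementation; the looks_binary name and the 512-byte sample
size are assumptions made for the sketch), something like the following
shows why a PDF with a long text header can slip through as text:

```perl
use strict;
use warnings;

# Rough approximation of a -B style check; NOT Perl's real implementation.
# looks_binary() and the 512-byte sample size are illustrative assumptions.
sub looks_binary {
    my ($buf) = @_;
    my $sample = substr($buf, 0, 512);

    # Simplification: treat an empty sample as "not binary".
    return 0 if length($sample) == 0;

    # Any nul byte in the sample marks the file as binary.
    return 1 if $sample =~ /\0/;

    # Count bytes that are neither printable ASCII nor common whitespace
    # or control characters (tab, LF, CR, FF, backspace, escape).
    my $odd = () = $sample =~ /[^\t\n\r\f\x08\x1b\x20-\x7e]/g;

    # Binary if more than 30% of the sampled bytes look odd.
    return ($odd / length($sample)) > 0.30 ? 1 : 0;
}

# A PDF-like header that is almost entirely text stays under the threshold.
my $pdfish = "%PDF-1.4\n" . ("text " x 100) . "\xff\xfe";
print looks_binary($pdfish) ? "binary\n" : "text\n";
```

With a sample like $pdfish, only 2 of the 511 bytes are odd, so the
check falls well short of 30% and the data is treated as text.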

This wasn't accurate enough for my purposes, so I modified buckd.in so
that it calls the UNIX "file" command and greps for the string "text"
(guaranteed to be returned if a file is identified as a text file).
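
For anyone adapting this, here is a minimal sketch of the same idea as
a standalone helper.  The names type_from_description and
gopher_type_for are hypothetical and not part of Bucktooth, and the -b
(brief) flag is assumed to be available, as it is in the common
file(1) implementation.  Since plain "file NAME" prints the filename
before the description, a filename that itself contains "text" would
match the grep; -b avoids that, and the list form of open keeps the
filename away from the shell.

```perl
use strict;
use warnings;

# Map a description printed by file(1) to a gopher item type:
# "0" (text) if it contains the word "text", otherwise "9" (binary).
sub type_from_description {
    my ($desc) = @_;
    return (defined $desc && $desc =~ /text/) ? "0" : "9";
}

# Hypothetical helper: run file(1) on $path and classify the result.
# Assumes a file(1) that supports -b (omit the filename from the output).
sub gopher_type_for {
    my ($path) = @_;

    # List-form open passes $path as a single argument, so spaces and
    # shell metacharacters in the filename are not interpreted.
    open(my $fh, '-|', 'file', '-b', '--', $path) or return "9";
    my $desc = <$fh>;    # undef if file(1) printed nothing
    close($fh);

    return type_from_description($desc);
}
```

Falling back to "9" when file(1) cannot be run is a design choice for
the sketch: serving an unknown file as binary seems safer than serving
it as text.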

I just want to emphasize that this is *not* a problem with Bucktooth,
but rather an issue with Perl.

Here's the patchfile with the change.  I opted to modify buckd.in and
simply regenerate buckd.

--- buckd.in    2007-12-28 01:21:30.000000000 -0600
+++ buckd.in.new        2007-12-28 01:20:58.000000000 -0600
@@ -289,7 +289,7 @@
                ($xentr =~ /\.jpe?g$/i) ? "I" :
                ($xentr =~ /\.html?$/i) ? "h" :
                ($xentr =~ /\.hqx$/i) ? "4" :
-               (-B $xentr) ? "9" :
+               (grep(!/text/, `file $xentr`)) ? "9" :
        "0";
        $xentr =~ s/^$DIR//;
        return ($itype, ($pentr eq $xentr) ? '' : $xentr);

  --Brian
