Complete.Org: Mailing Lists: Archives: offlineimap: April 2004:
Re: Connection problems: fix with wrapper

To: Jared Rhine <jared@xxxxxxxxxxx>
Cc: offlineimap@xxxxxxxxxxxx
Subject: Re: Connection problems: fix with wrapper
From: Scott Lambert <lambert@xxxxxxxxxxxxxx>
Date: Fri, 23 Apr 2004 01:56:16 -0400

On Thu, Apr 22, 2004 at 07:54:57AM -0700, Jared Rhine wrote:
> [Matthew == matthew@xxxxxxxxxxx on Wed, 21 Apr 2004 16:18:00 -0600]
> 
> >> while true; do offlineimap; done
> 
> Matthew> Seriously, though, I guess one of us needs to step up to the
> Matthew> plate and put in some exception handling.
> 
> Respectfully, it's not obvious to everyone that is the correct route.
> 
> No matter how much exception-handling code you put in, there is always
> the chance of an uncaught bug forcing an unwelcome exit.  If you don't
> use a "real" system to monitor and restart your daemons, your process
> will then stay down without manual intervention.  If you use the
> widely-accepted technique of wrapping offlineimap inside a
> daemon-management system (daemontools, runit, cron), then you've
> actually solved the problem, and you don't care whether offlineimap
> exits after an uncaught exception.

Offlineimap just isn't that high a priority for me.  I will notice that
I'm not getting new mail and click the pretty little icon.  Three or
four minutes later my mailboxes will be back in good shape.  Or it blows
up again partway through and I have to go delete the affected local
mailbox and metadata.  In the latter case, wrapping does not "solve" the
problem.  You would end up wondering why only some mailboxes were being
updated, because offlineimap would be running every time you went to look.

Letting Python throw exceptions without the application catching them
is acceptable for the, I'm assuming here, early stages of a project,
especially one that probably started as a hobby for the author's sole
satisfaction.  I'm grateful it has been released as it has made my mail
reading more efficient.  I'm not knocking OfflineIMAP, nor it's author.
I just take exception to your, apparantly, cavalier attitude toward
unhandled exceptions.

Critical services can be wrapped, especially when the code is not robust
enough to take care of itself.  But that should only be regarded as a
last line of defense.  The only critical system I accept being that
fragile is spamd from spamassassin.  I have few alternatives there and
the situation within spamassassin is improving, i.e., the failures are
happening further apart.

But not handling 90+% of the exceptions that an application can
reasonably be expected to see is not a sign of a mature, reliable service.
Accepting the blowups is a lowering of coding standards that does not
bode well for the future of programming.  Proper error handling is
important stuff.  Or it was when I was taught to program.  But back
then, Microsoft hadn't quite established their monopoly, I was running
OS/2, and you didn't expect the compiler to do your garbage collection
for you.
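
To make "handling the exceptions an application can reasonably expect"
concrete, here is a minimal sketch in Python.  It is not OfflineIMAP's
code: SyncError, sync_all and sync_one are hypothetical names invented
for illustration.  The idea is only that a failure in one folder gets
reported and the remaining folders still sync, while truly unexpected
exceptions are left to propagate.

```python
import logging

log = logging.getLogger("sync")


class SyncError(Exception):
    """Hypothetical error type for failures the application expects."""


def sync_all(folders, sync_one):
    """Sync each folder in turn; return the names of folders that failed.

    `sync_one` is a hypothetical callable that syncs a single folder and
    raises SyncError or OSError (e.g. a dropped connection) on failure.
    """
    failed = []
    for name in folders:
        try:
            sync_one(name)
        except (SyncError, OSError) as exc:
            # Expected failure: report it and carry on with the rest.
            log.error("sync of %s failed: %s", name, exc)
            failed.append(name)
    # Anything else (a genuine bug) is deliberately not caught here.
    return failed
```

The caller can then report exactly which folders need attention instead
of leaving the user to guess why only some mailboxes were updated.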

Let me exaggerate a bit on your apparent position on handling exceptions.
I may be reading you wrong, and I apologize profusely if that is the case.
However, it seems to me that, by extension of your statement above that
wrapping a daemon solves the problem of the daemon dying from unhandled
exceptions, we should just add hardware watchdogs to our systems and rip
out all the exception handling in the kernels.  It would save a lot of
kernel development time, but is it really a solution?

> If you do actually add some more exception checks, it will likely be a
> welcome addition and the community will thank you...but (IMHO) you
> will still have an incomplete solution until you wrap offlineimap
> properly.  This isn't just an offlineimap issue; EVERY daemon should
> be so wrapped if at all possible.

The wrappers should scream loudly when the monitored process has to be
restarted, possibly to your pager for critical services.  A restart
should be a sign that something has gone horribly wrong and someone
needs to sift through the logs ASAP to prevent it happening again.
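
As a rough sketch of a wrapper that screams, assuming nothing about
daemontools, runit or cron: the supervise function and its alert
callback below are hypothetical, written only to illustrate the point.
It restarts the command a bounded number of times and calls alert on
every abnormal exit, so a crash is never silent.

```python
import subprocess
import time


def supervise(command, alert, max_restarts=3, delay=1.0):
    """Run `command`, restarting it after each abnormal exit.

    `alert` is a hypothetical callback (pager, mail, syslog) called once
    per abnormal exit.  Returns the list of exit statuses observed.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        if result.returncode == 0:
            break  # clean exit: nothing to report
        alert(f"{command[0]} exited with status {result.returncode} "
              f"(run {attempt + 1}); check the logs")
        if attempt < max_restarts:
            time.sleep(delay)  # brief pause before the next restart
    return exit_codes
```

Something like supervise(["offlineimap"], alert=print) would be the
noisy counterpart of the bare "while true" loop quoted earlier; the
bound on restarts is a design choice so that a process which dies
immediately every time does not spin forever.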

-- 
Scott Lambert                    KC5MLE                       Unix SysAdmin
lambert@xxxxxxxxxxxxxx


