Re: [PATCH] LocalStatus in sqlite (take2)

To: offlineimap@xxxxxxxxxxxx
Cc: Stewart Smith <stewart@xxxxxxxxxxxxxxxx>
Subject: Re: [PATCH] LocalStatus in sqlite (take2)
From: John Goerzen <jgoerzen@xxxxxxxxxxxx>
Date: Fri, 6 Jul 2007 10:15:21 -0500

On Fri July 6 2007 2:23:59 am Stewart Smith wrote:
> Hi there!

Hi Stewart,

First, thanks much for your patches.  I have been doing some development in 
OfflineIMAP recently, and you happened to grab OfflineIMAP from darcs at a 
time when it wasn't buildable.  I have since decided to go a different 
direction with the UI code, and the changes your first two patches addressed 
have been backed out.

>       - NOTE: total sync time was no different between the unpatched and
> patched versions, suggesting that threads for a single mailbox does
> nothing to improve performance.

That's probably because you were testing on a low-latency connection (wifi on 
your LAN).  If you're going across the Internet, there are significant 
benefits even for single mailboxes.

[ snips ]

Let's look at the performance numbers:

> Unpatched offlineimap took:
>
> For the initial sync (all messages):
>               real    9m35.872s
>
> The second sync (doing nothing) took:
>               real    0m6.846s
>
> With this patch, offlineimap took:
>               real    9m26.239s
>
> For the second (doing nothing) sync:
> 1st run:
>               real    0m11.403s
> 2nd run:
>               real    0m6.893s

Looking at the real times only here, it looks to me like the initial sync is 
about 2% faster.  The second sync is either about 67% slower or about 1% 
slower, depending on whether the first or the second patched run is compared.
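
The percentages above come straight from the quoted real times.  A small 
sketch of the arithmetic, where the helper names are made up for 
illustration and are not part of OfflineIMAP or the patch:

```python
def seconds(minutes, secs):
    """Convert a `time`-style reading such as 9m35.872s to seconds."""
    return minutes * 60 + secs


def percent_change(before, after):
    """Relative change of `after` versus `before`; positive means slower."""
    return (after - before) / before * 100


# Real times quoted in this thread.
initial_unpatched = seconds(9, 35.872)
initial_patched = seconds(9, 26.239)
noop_unpatched = seconds(0, 6.846)
noop_patched_first = seconds(0, 11.403)
noop_patched_second = seconds(0, 6.893)

initial_delta = percent_change(initial_unpatched, initial_patched)
noop_first_delta = percent_change(noop_unpatched, noop_patched_first)
noop_second_delta = percent_change(noop_unpatched, noop_patched_second)

print("initial sync:         %+.1f%%" % initial_delta)
print("no-op sync (1st run): %+.1f%%" % noop_first_delta)
print("no-op sync (2nd run): %+.1f%%" % noop_second_delta)
```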

My initial inclination is to say that this is a marginal benefit even for 
this extreme use case (huge mailbox, connected over LAN).  There are a 
couple of things that concern me here:

1) No longer supporting multithreaded sync for IMAP

I think that for the majority of use cases, this will make real time 
significantly slower (300% or more wouldn't surprise me).

This can probably be solved by careful use of locks around the database 
accesses.
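
As a rough illustration of the locking idea only: the sketch below shares 
one sqlite3 connection between several threads and serializes every access 
with a threading.Lock.  The class, table, and column names are hypothetical 
and are not taken from the patch or from OfflineIMAP.

```python
import sqlite3
import threading


class LockedStatusStore:
    """Hypothetical status store: one SQLite connection, many threads."""

    def __init__(self, path):
        # check_same_thread=False allows use from threads other than the
        # creating one; the lock is what actually serializes the accesses.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS status "
                "(uid INTEGER PRIMARY KEY, flags TEXT)"
            )
            self._conn.commit()

    def save(self, uid, flags):
        with self._lock:
            self._conn.execute(
                "INSERT OR REPLACE INTO status (uid, flags) VALUES (?, ?)",
                (uid, flags),
            )
            self._conn.commit()

    def count(self):
        with self._lock:
            cursor = self._conn.execute("SELECT COUNT(*) FROM status")
            return cursor.fetchone()[0]


def worker(store, uids):
    # Stand-in for a sync thread recording the messages it has handled.
    for uid in uids:
        store.save(uid, "S")


store = LockedStatusStore(":memory:")
threads = [
    threading.Thread(target=worker, args=(store, range(i * 100, (i + 1) * 100)))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store.count())
```

A single coarse lock is the simplest option; whether it costs too much 
under real sync load would need measuring.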

2) The added complexity, code size, and prerequisites don't seem (yet) to 
justify the performance difference (in real time).

Something such as Python's anydbm module may be able to achieve similar 
performance gains without the added complexity and prerequisites.
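
To give an idea of what that could look like, here is a minimal sketch 
using the standard-library dbm module, which is the Python 3 name for 
anydbm.  The file name and the uid-to-flags layout are assumptions made for 
the example, not OfflineIMAP's actual LocalStatus format.

```python
import dbm
import os
import tempfile

# Hypothetical location for one folder's status database.
status_dir = tempfile.mkdtemp()
status_path = os.path.join(status_dir, "INBOX.status")

# "c" opens the database read-write, creating it if it does not exist.
with dbm.open(status_path, "c") as db:
    # Keys and values are stored as bytes: message uid -> flags.
    db[b"1001"] = b"S"
    db[b"1002"] = b"FS"

# Reopen read-only and load everything back into a plain dict.
with dbm.open(status_path, "r") as db:
    loaded = {int(key): db[key].decode("ascii") for key in db.keys()}

print(loaded)
```

Which backend dbm picks depends on what is available on the platform, so 
the performance would have to be measured rather than assumed.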

That said, I am a big fan of SQLite (my hpodder program, for instance, uses 
it).  I would encourage you to maintain a Darcs branch of OfflineIMAP with 
your SQLite code in it, and work on the threading support in particular.  I 
think that this could be appropriate for inclusion in mainline at some 
point.

I'm particularly glad that you added automatic migration support to your 
code, BTW.

> I also experimented with bulk committing new mail messages to
> LocalStatus (every 10 and every 100 messages). It was possible to shave
> ~30 and ~60 seconds off the initial sync respectively. Although I don't
> think the possibility of duplicate messages is really worth it at this
> stage. I think there are other (better) ways to improve initial sync
> performance than this.

Agreed.
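
For anyone following the thread, the trade-off can be sketched roughly as 
below.  This is a hypothetical illustration with sqlite3, not the code from 
the patch; the function and table names are invented.

```python
import sqlite3


def record_messages(conn, uids, batch_size):
    """Insert uids, committing every `batch_size` rows.

    Returns the number of commits issued.  Rows inserted since the last
    commit are lost if the process dies, so with batch_size > 1 some
    already-downloaded messages may be unrecorded after an interruption.
    """
    commits = 0
    pending = 0
    for uid in uids:
        conn.execute("INSERT INTO status (uid) VALUES (?)", (uid,))
        pending += 1
        if pending >= batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Flush the final partial batch.
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status (uid INTEGER PRIMARY KEY)")
conn.commit()

commits_single = record_messages(conn, range(0, 250), 1)
commits_batched = record_messages(conn, range(250, 500), 100)
print(commits_single, commits_batched)
```

Fewer commits mean less disk syncing, which is presumably where the saved 
time comes from.  The price is the window of unrecorded messages, which 
would be fetched again on the next sync.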

-- John
