Re: offlineimap imap<->imap reliability?

To: James Leifer <James.Leifer@xxxxxxxx>
Cc: offlineimap@xxxxxxxxxxxx
Subject: Re: offlineimap imap<->imap reliability?
From: John Goerzen <jgoerzen@xxxxxxxxxxxx>
Date: Wed, 22 Oct 2003 08:18:27 -0500

On Wed, Oct 22, 2003 at 10:37:00AM +0200, James Leifer wrote:
> Hi John,
> 
> Thanks for those answers.  I must say, it looks like an impressive
> piece of engineering.
> 
> A few more questions... (snipping liberally from your reply).
> 
> > record the state of each folder as of the last sync.  These are used to
> > determine what changed locally (new messages, deleted messages, flag
> > changes, etc).  There is one file per message and that file may be updated
> > zero or more times per sync.

That should have read "one file per FOLDER".  My mistake.
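
To illustrate how a per-folder state file can be used to work out what
changed, here is a minimal sketch.  It is not OfflineIMAP's actual code:
the function name diff_folder and the dict-of-flag-sets layout are
assumptions made for the example.

```python
# Hypothetical sketch: compare a folder's saved state with its current
# contents.  Both arguments map a message UID to a set of flag characters.
def diff_folder(saved, current):
    # Messages present now but absent from the last-sync state are new.
    new = sorted(uid for uid in current if uid not in saved)
    # Messages recorded at the last sync but missing now were deleted.
    deleted = sorted(uid for uid in saved if uid not in current)
    # Messages on both sides whose flags differ had a flag change.
    reflagged = sorted(uid for uid in current
                       if uid in saved and current[uid] != saved[uid])
    return new, deleted, reflagged

saved = {1: {"S"}, 2: set(), 3: {"S"}}
current = {1: {"S"}, 2: {"S", "R"}, 4: set()}
print(diff_folder(saved, current))  # ([4], [3], [2])
```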

> If the IMAP server has, say, 100K messages, does that mean offlineimap
> will open 100K local files to check what's changed or can it get the
> list of changed messages from the server?  Ditto for the uidvalidity

No, I told you wrong before: the state is kept in one file per folder.
Maildir itself, though, is one file per message.

The UID validity and UID mapping files are also one file per folder.

> and uidmapping files?  In general terms, how does the performance
> scale with respect to:
> 
> * the number of changed messages (new, deleted, flags modified) on
>   each IMAP server?

The only difference here is in the actual actions that need to be carried
out (having the IMAP server delete messages, deleting messages locally,
etc.).  Whenever possible, requests are batched: OfflineIMAP sends a single
command to the IMAP server whether it is deleting 40 messages in a folder
or just 1.  How that command scales is up to the IMAP server, of course, but
it's almost certainly faster than sending 40 separate commands.
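
As a rough illustration of batching, the sketch below collapses a list of
UIDs into one IMAP sequence set so that a single UID STORE command can flag
all of them as deleted.  The helper names uid_set and delete_command are
made up for this example, and this is not necessarily how OfflineIMAP
builds its commands.

```python
def uid_set(uids):
    """Collapse UIDs into an IMAP sequence set, e.g. [1, 2, 3, 7] -> "1:3,7"."""
    uids = sorted(set(uids))
    parts = []
    i = 0
    while i < len(uids):
        j = i
        # Extend the run while the next UID is consecutive.
        while j + 1 < len(uids) and uids[j + 1] == uids[j] + 1:
            j += 1
        if i == j:
            parts.append(str(uids[i]))
        else:
            parts.append("%d:%d" % (uids[i], uids[j]))
        i = j + 1
    return ",".join(parts)

def delete_command(uids):
    # One command covers every UID, whether there are 40 of them or 1.
    return "UID STORE %s +FLAGS (\\Deleted)" % uid_set(uids)

print(delete_command([4, 5, 6, 9, 12, 13]))
# UID STORE 4:6,9,12:13 +FLAGS (\Deleted)
```

With Python's imaplib, the same sequence set could be passed along the
lines of conn.uid("STORE", uid_set(uids), "+FLAGS", "(\\Deleted)"),
followed by conn.expunge().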

Making changes to a Maildir is trivial -- each change is just an
unlink() or rename().  Flag changes are handled by rename(), and
deletions by unlink().
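
A minimal sketch of those two operations, assuming the standard Maildir
naming convention in which flags follow a ":2," suffix in the filename.
The function names set_flags and delete_message are illustrative only.

```python
import os

def set_flags(path, flags):
    """Change a message's flags by renaming its file; returns the new path."""
    directory, name = os.path.split(path)
    # Everything before ":2," is the unique part of the name and is kept.
    base = name.split(":2,")[0]
    # Maildir stores flag letters in sorted (ASCII) order.
    new_name = base + ":2," + "".join(sorted(set(flags)))
    new_path = os.path.join(directory, new_name)
    os.rename(path, new_path)
    return new_path

def delete_message(path):
    """Delete a message: in a Maildir that is just removing its file."""
    os.unlink(path)
```

Marking a message as seen and replied-to would then be
set_flags(path, "SR"), which renames the file so that it ends in ":2,RS".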

> * the total number of messages on each IMAP server?

This has an impact in terms of the amount of time it takes to request the
summary of messages in each folder.  OfflineIMAP requests certain data
during each sync (again, it sends one command to get that data, and it
requests very little information, so the IMAP server should be able to
respond quickly).  If the folder has many messages, it will take longer to
transfer that data.

Barring very exceptional circumstances, you will find that your network
link and IMAP server have a far greater impact on performance than
OfflineIMAP's own algorithms.

OfflineIMAP can help to lessen that impact as well -- especially for
high-latency links -- by using its multiple connections feature.

For Maildir storage, OfflineIMAP is significantly faster at scanning the
Maildir than programs like mutt.  The reason is that mutt has to open each
file to get information for its summary (Subject line, Date line, etc.).  All
the information OfflineIMAP needs is stored in the filename, so it just does
a readdir().
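
To show why this is cheap, here is a sketch that builds a folder summary
from directory listings alone, never opening a message file.  It assumes
the standard Maildir ":2," flag suffix; the function name scan_maildir and
the returned layout are assumptions for the example, not OfflineIMAP's own
code.

```python
import os

def scan_maildir(maildir):
    """Return {unique name: set of flag letters} using only directory listings."""
    messages = {}
    for sub in ("new", "cur"):
        for name in os.listdir(os.path.join(maildir, sub)):
            # Split "unique:2,FLAGS"; files without the suffix have no flags.
            base, sep, info = name.partition(":2,")
            messages[base] = set(info) if sep else set()
    return messages
```

The cost is one directory read per subdirectory plus some string
splitting, regardless of how large the individual messages are.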

> Incidentally, do you use a fs like reiser that handles lots of small
> files well?

No, all my systems use ext2 or ext3.

> Got it.  Is the move optimized or is the message from the new folder
> recopied over the network?  If it's optimized, how does offlineimap
> know that it already has the message locally?  Does it store md5s of
> all messages or rely on messageids to be unique?

It is re-copied.  Optimization is on the to-do list, but it probably won't
happen unless someone else sends me a patch, because honestly there are a
lot of things that are more important to me than that :-)

> Let me try to make more sense despite my cold...  Suppose we're
> starting from an *unsynchronized* state, i.e. before the first run of
> offlineimap when it has no statefiles around. Suppose some messages

When you start from an unsynchronized state, OfflineIMAP has no information
regarding how to map messages on the local machine to those on the remote. 
Therefore, every message in the local folder gets copied to the remote, and
every message on the remote gets copied to the local.  If the same message
is present in each folder, it will wind up being present twice in each
folder, and one will represent the state from the remote, and the other will
represent the state from the local end.
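
A toy simulation of that first sync, with each folder modelled as a plain
list of message bodies.  The function first_sync is hypothetical; it only
mirrors the behaviour described above, where nothing on one side can be
matched to anything on the other.

```python
def first_sync(local, remote):
    """Simulate a sync with no saved state: every message is copied across."""
    # Local keeps its own messages and receives a copy of every remote one.
    new_local = local + remote
    # Remote keeps its own messages and receives a copy of every local one.
    new_remote = remote + local
    return new_local, new_remote

local = ["A", "B"]
remote = ["B", "C"]
new_local, new_remote = first_sync(local, remote)
print(new_local.count("B"), new_remote.count("B"))  # 2 2
```

Message "B" existed on both ends beforehand, so it ends up twice in each
folder, while "A" and "C" each appear once on both sides.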

> present on one end are not present on the other or some flags are
> different for some of the same messages.  How does offlineimap make
> the two consistent initially?  Does it propagate changes
> unidirectionally?  Does it warn but not actually make any changes?

OfflineIMAP has limited unidirectional support if a remote folder is
marked read-only by IMAP.  This support is not well-tested, however, as I do
not have any such folders.

OfflineIMAP does not have any "dry run" mode.  My "dry run" for testing is
called "tar" :-)

-- John

