Complete.Org: Mailing Lists: Archives: offlineimap: September 2008:
Re: Darwin patches
To: offlineimap@xxxxxxxxxxxx
Subject: Re: Darwin patches
From: Vincent Beffara <vbeffara+ml@xxxxxxxxx>
Date: Mon, 29 Sep 2008 07:33:50 +0000

Hi,

(I am the patch author.)

> I see that recently the following patch was included:
>
> > diff --git a/offlineimap/imapserver.py b/offlineimap/imapserver.py
> > index 4c89cca..b0830c7 100644
> > --- a/offlineimap/imapserver.py
> > +++ b/offlineimap/imapserver.py
> > ...
> > +    # This is a hack around Darwin's implementation of realloc() (which
> > +    # Python uses inside the socket code). On Darwin, we split the
> > +    # message into 100k chunks, which should be small enough - smaller
> > +    # might start seriously hurting performance ...
> >
> > +                data = imaplib.IMAP4.read (self, min(size-read,8192))
> > ..
> > +                data = imaplibutil.WrappedIMAP4_SSL.read (self, min(size-read,8192))
>
> This particular problem is fixed in python 2.6, and they use:
>
> > read(min(size-read, 16384))

Good to know! The last time I checked (I think it was 2.6rc1), the bug
(or mismatch, or whatever you want to call the realloc thing) was still
happening. In any case, I would say that some kind of workaround should
remain, provided it doesn't break anything else (as the previous version
of the patch did, by not checking for size>0), since the spread of
Python 2.6 will not be instantaneous. Plus, as you mention, OfflineIMAP
doesn't quite work with Python 2.6 yet ...
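
For illustration, here is a minimal sketch of the chunked read discussed above: ask the underlying stream for at most 8192 bytes at a time (the cap used in the quoted patch) until the requested size has been read. The names read_in_chunks and CHUNK_SIZE are hypothetical, and the stream is assumed to be any object with a file-like read(n) method; this is not the actual OfflineIMAP code.

```python
import io

CHUNK_SIZE = 8192  # cap taken from the quoted patch; the name is hypothetical


def read_in_chunks(stream, size, chunk_size=CHUNK_SIZE):
    """Read up to `size` bytes from `stream`, at most `chunk_size` per call."""
    parts = []
    read = 0
    while read < size:
        # Never request more than chunk_size bytes in a single read().
        data = stream.read(min(size - read, chunk_size))
        if not data:
            # EOF or closed connection: stop rather than loop forever.
            break
        parts.append(data)
        read += len(data)
    return b"".join(parts)


if __name__ == "__main__":
    payload = b"x" * 20000
    assert read_in_chunks(io.BytesIO(payload), len(payload)) == payload
```

With size equal to 0 the loop body never runs and an empty bytes object is returned, so a non-positive size needs no special handling in this sketch.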

> Because this is going to be fixed at a fundamental level, maybe you
> should include some kind of warning message at program start, so that
> we can remember to remove this special treatment later.

Makes sense.
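
One possible shape for such a startup warning, as a rough sketch only: the function name warn_if_darwin_workaround and the message text are made up for illustration, and the optional system_name argument exists only so the check can be exercised on any platform.

```python
import platform
import warnings


def warn_if_darwin_workaround(system_name=None):
    """Emit a reminder when the Darwin read() workaround would be active.

    Returns True if the warning was issued, False otherwise.
    """
    if system_name is None:
        system_name = platform.system()
    if system_name != "Darwin":
        return False
    warnings.warn(
        "Darwin read() workaround is active; revisit it once the "
        "underlying realloc() issue is fixed in the Python in use.",
        RuntimeWarning,
    )
    return True


if __name__ == "__main__":
    # Call once at program start.
    warn_if_darwin_workaround()
```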

> Also, you guys seem to be wasting a lot of cycles every read():
>
> > if (system() == 'Darwin') and (size>0) :
>
> Shouldn't that information be cached somewhere? Why not bind 'read' to
> a particular implementation at program startup, so that there is
> virtually no overhead.

There is already "virtually no overhead": timeit tells me (on a Linux
machine, but I am assuming a Mac would give similar results) that size>0
takes 145ns whereas the system() equality test takes 1.1us. That is much
less than any network-related task.
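
A measurement of this kind can be reproduced roughly as follows. This is a sketch using the standard timeit module; the helper per_call_seconds and the iteration count are arbitrary choices, and the absolute numbers will differ from machine to machine.

```python
import timeit

N = 100000  # iterations per measurement; an arbitrary choice


def per_call_seconds(stmt, setup="pass", number=N):
    """Average wall-clock time of one execution of `stmt`, in seconds."""
    return timeit.timeit(stmt, setup=setup, number=number) / number


# Cost of the integer comparison alone.
size_test = per_call_seconds("size > 0", setup="size = 4096")

# Cost of calling platform.system() and comparing the result.
system_test = per_call_seconds(
    "system() == 'Darwin'", setup="from platform import system"
)

print("size > 0:             %.0f ns" % (size_test * 1e9))
print("system() == 'Darwin': %.0f ns" % (system_test * 1e9))
```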

Admittedly I should have swapped the size and platform tests, so that
the cheaper one runs first, but the system() result is definitely cached
somewhere already (if it is not a trivial call returning a constant
string).
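
To make the two suggestions concrete (evaluate the platform test once, and put the cheap size test first), here is a hedged sketch. IS_DARWIN, CHUNK_SIZE and request_size are hypothetical names, and the non-Darwin branch assumes that the remaining bytes are requested in a single call, which the thread does not spell out.

```python
import platform

# Evaluated once at import time instead of on every read().
IS_DARWIN = platform.system() == "Darwin"
CHUNK_SIZE = 8192  # cap taken from the quoted patch


def request_size(size, read, is_darwin=IS_DARWIN):
    """How many bytes to ask for in the next read() call."""
    remaining = size - read
    # The cheap integer test comes first and short-circuits the rest.
    if size > 0 and is_darwin:
        return min(remaining, CHUNK_SIZE)
    # Assumption: other platforms request everything that is left at once.
    return remaining


if __name__ == "__main__":
    print(request_size(20000, 0, is_darwin=True))
    print(request_size(20000, 0, is_darwin=False))
```

The is_darwin parameter is there only so both branches can be tried on any machine; real code would simply use the cached constant.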


Anyway, this was a hack from the start, and the best it can become is an
optimal hack ;-)


  /vincent

-- 
| Vincent Beffara    Section de Mathématiques |
|                           2-4 rue du Lièvre |
| Tél: (+41) 22 379 11 45     Case postale 64 |
| Fax: (+41) 22 379 11 76       1211 Genève 4 |
| Vincent.Beffara@xxxxxxxx             Suisse |


