Re: MUPDATE database problems -- the importance of thread safety

Wesley Craig <wes@xxxxxxxxx> · Wed, 17 Jun 2009 14:16:44 -0400

Please open a report in Bugzilla and mark it as a "blocker".  Thanks
for finding the issue.

:wes

On 17 Jun 2009, at 09:44, Michael Bacon wrote:
> It turns out that this was an issue with mupdate being a
> multi-threaded daemon.  In a critical place in the non-blocking prot
> code (in prot_flush_internal()), the behavior relies on the value
> of errno.  If it's EAGAIN, the write will try again; otherwise it
> sets s->error and quits.  Naturally, since errno is normally a plain
> global variable, it doesn't work terribly well in multi-threaded
> code unless the necessary thread safety switch is passed to the
> compiler.  Hence, when thread #5 was getting a -1 from the write(2)
> system call, it was reading errno as 0, rather than EAGAIN as it
> should have been.
>
> The solution, should anyone else run into this, is as simple as
> recompiling with the thread safety switch.  (In the case of Sun's
> SPro, it's -mt.  I think it's -pthread for gcc, but I'm not sure.)
> Maddening that the fix was that simple, as I spent two solid weeks
> hunting for the dratted bug.
>
> I have two requests to the CVS maintainers out there. First, the  
> below patch to current CVS isn't terribly comprehensive, and  
> doesn't narrow it down from about a dozen places s->error could be  
> set, but at least would have given SOME kind of indication on the  
> server that something had gone wrong, and might have saved me about  
> a week of hunting.
>
> Secondly, I am very weak in the ways of autoconf, but it strikes me  
> that since Cyrus now builds mupdate as multithreaded by default  
> (good decision, IMO), autoconf should make some attempt to figure  
> out what thread safety switch is appropriate and add it to CFLAGS.
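
For anyone following along, the retry-on-EAGAIN logic described above
looks roughly like the sketch below.  This is not the Cyrus
prot_flush_internal() code; flush_all() and demo_roundtrip() are
hypothetical names made up for illustration, and the one-second poll()
timeout is an arbitrary assumption.  Note that copying errno into a
local right after write(2) only protects against later calls in the
same thread; the sketch still depends on errno being per-thread, i.e.
on building with the thread safety switch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not Cyrus code): write all of buf to a
 * non-blocking fd.  Returns 0 on success, or -1 with *saved_err set
 * to the errno value that stopped the flush. */
static int flush_all(int fd, const char *buf, size_t len, int *saved_err)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }

        int err = errno;        /* capture before any other call */
        if (err == EINTR)
            continue;           /* interrupted, just retry */

        if (err == EAGAIN || err == EWOULDBLOCK) {
            /* Would block: wait (up to 1s, an assumed limit) for room. */
            struct pollfd p = { .fd = fd, .events = POLLOUT };
            int rc = poll(&p, 1, 1000);
            if (rc > 0 || (rc < 0 && errno == EINTR))
                continue;
            *saved_err = (rc == 0) ? EAGAIN : errno;
            return -1;
        }

        *saved_err = err;       /* hard error: report it, don't hide it */
        return -1;
    }
    return 0;
}

/* Hypothetical demo: push a short message through a non-blocking pipe
 * and read it back.  Returns 0 if the bytes arrive intact. */
static int demo_roundtrip(void)
{
    int fds[2], err = 0;
    char in[16] = {0};
    const char msg[] = "ping";

    if (pipe(fds) != 0)
        return -1;
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);

    int rc = flush_all(fds[1], msg, sizeof msg, &err);
    ssize_t got = (rc == 0) ? read(fds[0], in, sizeof in) : -1;

    close(fds[0]);
    close(fds[1]);
    return (got == (ssize_t)sizeof msg && strcmp(in, msg) == 0) ? 0 : -1;
}
```

If errno is not thread-safe in the build, the EAGAIN branch can be
skipped exactly as in the report above, and the caller sees a hard
error with a saved errno of 0.  Logging the saved value at that point
is the kind of server-side indication the first request asks for.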
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html