Re: reconstruct caused mailboxes (skiplist) corruption?

We saw something similar:

syslog() messages showing up 'on the wire' (imap, pop3, etcetera) after
we restarted syslog on an in-production cyrus backend.

In summary: DON'T DO IT (stopping syslog) while cyrus is running.


On 11/11/2010 07:54 PM, Bron Gondwana wrote:
> On Thu, Nov 11, 2010 at 02:24:47PM -0200, Henrique de Moraes Holschuh wrote:
>> On Thu, 11 Nov 2010, Paul Dekkers wrote:
>>> Uhoh! And then I looked at mailboxes.db: It looks like part of it was
>>> completely rewritten, including the skiplist header, and the first line
>>> now said:
>>> user.bla: System I/O error System I/O error
>> This is something that has plagued cyrus for a long time.  Can we find a
>> way to actually keep tabs on our FDs so it cannot ever happen again,
>> please?  I recall reports of crap showing inside prot streams 10 years
>> ago... if now it is leaking into even worse places, well...
> It's a standalone program.  Reconstruct was running all by itself.
>
>> This probably needs a redesign of master/service fd-passing protocol,
>> and of prot streams to be fixed for good.   While at it, we should
>> switch the master/service interaction to a modern design, since the
>> operating systems worth bothering with nowadays deal sanely with the
>> thundering herd effect, and all of them have proper socket event support
>> (epoll-like; this would require one of the event abstraction libraries,
>> though, so as to support linux/bsd/solaris with minimum fuss).
> Since that wasn't the issue - why on earth was it allowed to have fd 2
> in the first place?  Is Cyrus closing fd 2, or is truss closing it??
>
> There was no issue outside truss, it was when it ran under truss that
> the issue happened.
>
> Here's the start of an strace of a reconstruct run on my machine:
>
> execve("/usr/cyrus/bin/reconstruct", ["/usr/cyrus/bin/reconstruct", "-C", "/tmp/ct-slot2/etc/imapd.conf", "-s"], [/* 20 vars */]) = 0
> brk(0)                                  = 0x12f1000
> access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
> mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fceb52d8000
> access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
> open("db-4.6/lib/tls/x86_64/libsasl2.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("db-4.6/lib/tls/libsasl2.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("db-4.6/lib/x86_64/libsasl2.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("db-4.6/lib/libsasl2.so.2", O_RDONLY) = -1 ENOENT (No such file or directory)
> open("/etc/ld.so.cache", O_RDONLY)      = 3
>
>
> Notice the first fd allocated: 3.
>
> And here's a run under truss on FreeBSD:
>
> [root@cyrus1 /var/imap]# sudo -u cyrus truss /usr/local/cyrus/bin/reconstruct user.foo
> __sysctl(0x7fffffffe390,0x2,0x7fffffffe3ac,0x7fffffffe3a0,0x0,0x0) = 0 (0x0)
> mmap(0x0,672,PROT_READ|PROT_WRITE,MAP_ANON,-1,0x0) = 34366398464 (0x80065a000)
> munmap(0x80065a000,672)		     = 0 (0x0)
> __sysctl(0x7fffffffe400,0x2,0x800763428,0x7fffffffe3f8,0x0,0x0) = 0 (0x0)
> mmap(0x0,32768,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34366398464 (0x80065a000)
> issetugid(0x80065b015,0x800654cc4,0x80076fc50,0x80076fc20,0x6351,0x0) = 0 (0x0)
> open("/etc/libmap.conf",O_RDONLY,0666)	     ERR#2 'No such file or directory'
> access("/usr/lib/libsasl2.so.2",0)	 ERR#2 'No such file or directory'
> access("/usr/local/lib/libsasl2.so.2",0)     = 0 (0x0)
> open("/usr/local/lib/libsasl2.so.2",O_RDONLY,035431400) = 2 (0x2)
>
> Note the first fd allocated: 2!!!!!
>
>
> The question is - why is fd 2 being allocated?  Is it necessary to explicitly
> open stderr?  The function that's scribbling all over everything is com_err,
> which is supposed to be a BSD error reporting library, it SHOULD know what
> it's doing...
>
> Bron ( a while later, fd 2 gets re-used as the mailboxes.db handle, and hence
>         the mess is created )
> ----
> Cyrus Home Page: http://www.cyrusimap.org/
> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
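
For anyone who wants to see the mechanism Bron describes without running
truss, here is a minimal sketch. It is Python only because that keeps it
short; os.open() and os.close() are thin wrappers over the same POSIX calls
that show up in the traces above. The helper names lowest_fd_is_reused()
and ensure_std_fds_open() are made up for illustration and are not Cyrus
code, and the /dev/null guard is just one common way to keep fds 0-2
occupied, not something the thread says Cyrus does.

```python
import os


def lowest_fd_is_reused():
    """Show that open() hands out the lowest free descriptor number.

    This is why a process started with fd 2 closed gets fd 2 back from
    its first open(), as in the truss run quoted above.
    """
    a = os.open(os.devnull, os.O_RDONLY)  # lowest free number right now
    b = os.open(os.devnull, os.O_RDONLY)  # another descriptor, kept open
    os.close(a)                           # a's number becomes free again
    c = os.open(os.devnull, os.O_RDONLY)  # lands on the number a had
    os.close(b)
    os.close(c)
    return a, c


def ensure_std_fds_open():
    """Hypothetical startup guard: plug any closed fd 0-2 with /dev/null.

    Returns the list of descriptor numbers that had to be plugged
    (empty when stdin, stdout and stderr were all already open).
    """
    plugged = []
    while True:
        fd = os.open(os.devnull, os.O_RDWR)
        if fd > 2:
            # Slots 0-2 are all occupied; drop the probe descriptor.
            os.close(fd)
            return plugged
        # fd is 0, 1 or 2, so that slot was closed. Leave it open so
        # no later open() (a database file, say) can land on it.
        plugged.append(fd)


if __name__ == "__main__":
    first, reopened = lowest_fd_is_reused()
    print("closed fd", first, "and the next open() returned fd", reopened)
    print("standard fds plugged at startup:", ensure_std_fds_open())
```

With a guard like this run first thing in main(), anything written to
"stderr" by an error-reporting library would go to /dev/null instead of
to whatever file happened to be opened as fd 2 later on.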
