Could someone whip up a small test that could be used to check different operating systems (and filesystems) for this concurrency problem?
It doesn't even need to use any Cyrus code (in fact, it would probably be better if it didn't).
It sounds like there are a couple of different aspects to check:

1. Large number of copies of a single program running; find the impact of starting and stopping a process.
   1a. A single process that forks lots of copies.
   1b. A master process that execs lots of copies.
2. Large number of processes mmapping a single file.
   2a. The impact of adding or removing a process from this group.
   2b. The impact of modifying this file.

Personally, I expect 1a and 1b to be significantly different on different OSes. Some OSes will gain huge memory savings in 1a due to copy-on-write (to partially account for this, it may be worth making the program allocate a chunk of RAM and write to it after the fork), while on other OSes the overhead of multiple mappings of a page will dominate.
David Lang
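As a rough starting point for test 2 above, here is a minimal sketch in Python (not Cyrus code; all the counts and sizes are arbitrary, and it assumes a Unix-like OS with os.fork and MAP_SHARED). It forks a number of processes that each hold a shared mapping of one file, then times modifications to that file from the parent:

```python
# Hypothetical micro-benchmark for test 2: many processes mmap one file,
# one process modifies it. Counts and sizes are arbitrary illustrations.
import mmap
import os
import signal
import tempfile
import time

NUM_READERS = 40          # processes that just hold a shared mapping
FILE_SIZE = 1024 * 1024   # 1 MiB backing file
WRITES = 1000             # modifications to time from the parent

fd, path = tempfile.mkstemp()
os.ftruncate(fd, FILE_SIZE)

children = []
for _ in range(NUM_READERS):
    pid = os.fork()
    if pid == 0:
        # Child: map the shared file, touch a page, and wait to be killed.
        child_map = mmap.mmap(fd, FILE_SIZE, mmap.MAP_SHARED)
        _ = child_map[0]
        time.sleep(60)
        os._exit(0)
    children.append(pid)

# Parent: modify the file through its own mapping while all the readers
# hold theirs, and time it. On an affected kernel this should degrade
# sharply as NUM_READERS grows.
parent_map = mmap.mmap(fd, FILE_SIZE, mmap.MAP_SHARED)
start = time.monotonic()
for i in range(WRITES):
    parent_map[i % FILE_SIZE] = i % 256
elapsed = time.monotonic() - start
print(f"{WRITES} writes with {NUM_READERS} mappers took {elapsed:.4f}s")

for pid in children:          # clean up the reader processes and the file
    os.kill(pid, signal.SIGTERM)
    os.waitpid(pid, 0)
os.close(fd)
os.unlink(path)
```

Running it with increasing NUM_READERS on each candidate OS, and comparing the timings, would cover 2a/2b; tests 1a/1b would need a similar harness that forks or execs copies of one binary.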
--On Tuesday, October 16, 2007 3:39 PM -0700 Vincent Fox <vbfox@xxxxxxxxxxx> wrote:
------------ Omen Wild (University of California Davis)

The root problem seems to be an interaction between Solaris' concept of global memory consistency and the fact that Cyrus spawns many processes that all memory map (mmap) the same file. Whenever any process updates any part of a memory-mapped file, Solaris freezes all of the processes that have that file mmapped, updates their memory tables, and then re-schedules the processes to run. When we have problems, we see the load average go extremely high and no useful work gets done by Cyrus. Logins get processed by saslauthd, but listing an inbox either takes a long time or completely times out.

Apparently AIX also runs into this issue. I talked to one email administrator who had this exact issue under AIX. That admin talked to the kernel engineers at IBM, who explained that this is a feature, not a bug. They eventually switched to Linux, which solved their issues, although they did move to more Linux boxes with fewer users per box.
Oh man... Horrible memories just flood right back... Wow. I was reading your e-mail and thinking to myself that this sounded like the same problem we had. Then I got to the above section and *bam*, there it was...

We had significant problems with our e-mail last year (this year was a perfect start!) a week before students came back. We didn't resolve the problems until the end of September, and we were dismayed at our final solution. We run Tru64 5.1b on a 4-member cluster. Tru64's kernel suffers from the exact same issue as described above. We regularly have 12,000 Cyrus procs running at any one time during the day, and that cluster also receives on average 300k-500k e-mails each day (that is after spam/virus work).

What was finally identified was that the number of "processes" mapped to that single physical "executable" (/usr/cyrus/imapd) was causing a lot of lock contention in the kernel. The executable would have a linked list, in kernel memory, of all the processes running off of it. When one of the processes went away, the kernel would start at the beginning of the list and search for the process in order to clean up its resources. During that time, the kernel would lock everything and execution would essentially stop (basically, the whole system appeared to simply freeze on us). The kernel would reach a time threshold and stop in order to let other things happen (unfreeze). This time was very short, but if we had a lot of processes going away in a very short period of time, we would noticeably see the freeze, since the kernel was going into this lock-down mode many times in quick succession. That is a simplified view of what really happened. HP recommends that we keep the linked list down to only a few hundred processes at most.
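[The linked-list-versus-hash tradeoff HP ran into can be seen in miniature with a toy Python comparison (my illustration, not anything like the actual Tru64 kernel code): removing an entry from a long list scans from the head, while removing it from a hash table is a constant-time lookup.]

```python
# Toy illustration: per-removal cost of a scanned list vs. a hash table.
import time

N = 200_000
procs_list = list(range(N))          # like the kernel's per-binary linked list
procs_dict = dict.fromkeys(range(N)) # like the proposed hash

t0 = time.monotonic()
procs_list.remove(N - 1)             # scans ~N entries from the head
t_list = time.monotonic() - t0

t0 = time.monotonic()
del procs_dict[N - 2]                # one hash lookup
t_dict = time.monotonic() - t0

print(f"list removal: {t_list:.6f}s   hash removal: {t_dict:.6f}s")
```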
They were working on a kernel patch to make it a hash instead of a linked list, but as they got deeper into making the patch, they found that it impacts a lot more than they initially realized. The last I heard, this might make it into the PK7 patch release, which is likely sometime next year.

Meanwhile, we hacked around this in a very cool way. We copied the imapd binary 60 times (assuming an average of 12,000 processes and shooting for 200 processes per executable, that is 60 individual executables). These were named /usr/cyrus/bin/imapd_001 through /usr/cyrus/bin/imapd_060. We then symlinked the "imapd" binary to imapd_001, and wrote a cron job that ran once a minute and relinked the imapd symlink to the next numbered executable, rotating around to imapd_001 when the end was reached. This worked like a charm and *all* of our problems went away... In fact, our system has continued to get busier and we are still running pretty well. I don't think the hack is ideal, but man, does it work!

Scott
--
+-----------------------------------------------------------------------+
 Scott W. Adkins               Work (740)593-9478    Fax (740)593-1944
 UNIX Systems Engineer         <mailto:adkinss@xxxxxxxx>
+-----------------------------------------------------------------------+
 PGP Public Key <http://edirectory.ohio.edu/?$search?uid=adkinss>
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html