Can %util values way above 100% be trusted? If so, it's pretty bad (this is
from a situation where there are 200 lmtpd processes, which is the
current limit I set):
I've never seen it go over 100%, and it doesn't seem to make sense, so I'm
guessing it's a bogus value.
avg-cpu:  %user   %nice %system %iowait   %idle
           2.53    0.00    5.26   89.98    2.23
However, this shows that the system is mainly waiting on IO (89.98% iowait),
as we expected.
Device:      rrqm/s wrqm/s   r/s    w/s  rsec/s  wsec/s  rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm    %util
etherd/e0.0    0.00   0.00  5.87 235.02  225.10 2513.77 112.55 1256.88    11.37     0.00 750.32 750.32 18074.51
Ugh, if you line those up, await = 750.32.
await - The average time (in milliseconds) for I/O requests issued to
the device to be served. This includes the time spent by the requests in
queue and the time spent servicing them.
So it's taking 0.75 seconds on average to service an IO request, which is
really bad.
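
If you want to watch that number without iostat, here's a minimal sketch (in
Python, assuming the standard Linux /proc/diskstats layout) of roughly the
same calculation iostat does for await: time spent on IO divided by completed
requests over the interval. The device name is just the one from the output
above; use whatever name your device has in /proc/diskstats.

#!/usr/bin/env python
# Rough sketch, not the iostat source: estimate "await" for one block
# device by sampling /proc/diskstats twice and dividing the time spent
# on IO by the number of requests completed over the interval.

import time

def diskstats(device):
    # /proc/diskstats fields: major minor name reads rd_merged rd_sectors
    # rd_ms writes wr_merged wr_sectors wr_ms in_flight io_ms weighted_ms
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                return (int(fields[3]), int(fields[6]),
                        int(fields[7]), int(fields[10]))
    raise ValueError("device %s not found in /proc/diskstats" % device)

def await_ms(device, interval=5):
    r1, rms1, w1, wms1 = diskstats(device)
    time.sleep(interval)
    r2, rms2, w2, wms2 = diskstats(device)
    ios = (r2 - r1) + (w2 - w1)
    if ios == 0:
        return 0.0
    return ((rms2 - rms1) + (wms2 - wms1)) / float(ios)

if __name__ == "__main__":
    # "etherd/e0.0" is only an example taken from the output above
    print("await over 5s: %.2f ms" % await_ms("etherd/e0.0"))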
Load average tends to get really high. It starts increasing really fast
after the number of lmtpd processes reaches the limit set in cyrus.conf,
and can easily get to 150 or 200. One of the moments where the problem
Makes sense. There are 200 lmtpd processes waiting on IO, and on Linux at
least the load average basically counts every process that isn't sleeping,
i.e. anything runnable or stuck in uninterruptible ("D" state) IO wait.
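
A quick way to see this on a box that's suffering is to count the processes
in the "R" (runnable) and "D" (uninterruptible IO wait) states next to the
load average. A minimal sketch, assuming the usual Linux /proc layout:

#!/usr/bin/env python
# Count runnable (R) and IO-blocked (D) processes and print them beside
# the 1-minute load average; 200 lmtpd processes stuck on disk will show
# up here as ~200 blocked and a load average to match.

import os

def count_busy():
    running = blocked = 0
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open("/proc/%s/stat" % pid) as f:
                data = f.read()
        except IOError:            # process exited while we were scanning
            continue
        # state is the first field after the parenthesised command name
        state = data.rsplit(")", 1)[1].split()[0]
        if state == "R":
            running += 1
        elif state == "D":
            blocked += 1
    return running, blocked

if __name__ == "__main__":
    running, blocked = count_busy()
    loadavg = open("/proc/loadavg").read().split()[0]
    print("running: %d  blocked on IO: %d  1-min load: %s"
          % (running, blocked, loadavg))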
Really, you never want that many lmtpd processes; if they're all in use, it's
clear you've got an IO problem. Limiting it to 10 or so is probably
reasonable to avoid complete IO saturation and long IO service delays.
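
For example, something along these lines in the SERVICES section of
cyrus.conf caps it at 10 concurrent deliveries. This is only a sketch; the
socket path is a common default, so adjust it to whatever your install uses:

# cyrus.conf sketch -- the listen path is only an example
SERVICES {
  # deliveries queue once 10 lmtpd processes are busy, instead of
  # piling 200 IO-bound processes onto the spool
  lmtpunix  cmd="lmtpd" listen="/var/imap/socket/lmtp" prefork=1 maxchild=10
}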
- The ones that don't have the problem use local disks instead of AoE
- The ones that don't have the problem are limited to 2000 domains
(around 8000 accounts), while the one using the AoE storage serves 4000
domains (around 20000 accounts).
Anyone running cyrus with that many accounts?
Yes, no problem, though using local disks.
I think the problem is probably the latency that AoE introduces into the
disk path. A couple of questions:
1. How many disks in the AoE array?
2. Are they all one RAID array, or multiple RAID arrays? What type?
3. Are they one volume, or multiple volumes?
Because of the latency of system <-> drive IO, the thing you want to do is
let the OS send more outstanding requests in parallel. The problem is I don't
know where in the FS <-> RAID <-> AoE path the serialising bits are, so I'm
not sure what the best way to increase parallelism is, but the usual things
to try are more RAID arrays with fewer drives per array, and more volumes per
RAID array. That gives more places for parallelism to occur, assuming there's
not something holding an internal lock somewhere.
Some of our machines have 4 RAID arrays divided up into 40 separate
filesystems/volumes.
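
If those volumes are exposed to Cyrus as separate partitions, the imapd.conf
side looks something like this (illustration only, the paths are made up):

# imapd.conf sketch -- paths are made up. Each partition-<name> entry
# points at a separate filesystem/volume so IO can be spread across
# them; mailboxes can be placed on a given partition at create time.
defaultpartition: default
partition-default: /var/spool/imap
partition-vol01: /var/spool/imap-vol01
partition-vol02: /var/spool/imap-vol02
# ...one entry per volume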
Rob