PostFix+SpamAssassin - Disk I/O Issues!

Greetings All!
	I have just put a PostFix+SpamAssassin system into production for a
network of about 750,000 users, running on Red Hat Enterprise Linux 3.0 ES.
I have three front-end PostFix servers acting as mail relays for all
incoming SMTP traffic and four back-end SpamAssassin (spamd) boxes, to
which the PostFix boxes ship mail for scanning via a shell script I wrote.
Everything is working wonderfully and mail is flowing quickly.  I do have
one concern, however, that I believe could become a problem in the near
future, and I wanted to get the list's thoughts on the solution I have
come up with.
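	For the curious, the handoff is conceptually the pipe-based
content_filter arrangement from PostFix's FILTER_README.  A stripped-down
sketch follows - the script name, the "filter" account, and the back-end
hostname are illustrative, and my real script also spreads load across
the four spamd boxes:

    /etc/postfix/master.cf:
        # incoming SMTP gets the filter; re-injected mail does not
        smtp       inet  n  -  n  -  -   smtpd
          -o content_filter=spamfilter:

        spamfilter unix  -  n  n  -  10  pipe
          flags=Rq user=filter argv=/usr/local/bin/spamfilter.sh
          -f ${sender} -- ${recipient}

    /usr/local/bin/spamfilter.sh:
        #!/bin/sh
        # Scan the message on a back-end spamd box, then hand the
        # scanned result back to PostFix.  (A production script needs
        # real error handling and tempfail exit codes.)
        /usr/bin/spamc -d spamd1.example.com -p 783 \
            | /usr/sbin/sendmail -i "$@"
        exit $?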
	The PostFix servers seem to be heavily loaded with disk I/O.
Running "top" shows that the processors spend the vast majority of their
time in the "iowait" state, sometimes remaining above 90% in this category
for several minutes on end.  "vmstat 1" also shows a hefty disk load,
consistently reporting 1000+ in the "bo" field under the "io" heading.
Running "iostat -x 1" I see that write requests are waiting about 30ms to
40ms before being serviced, sometimes jumping to nearly 100ms during heavy
usage (I can ping one of our routers at a remote site and get a response
faster than I can write a byte of data to a local disk... heh).  Disk
service times remain fairly low, however, consistently below 10ms, so once
a request reaches the disk it is at least handled quickly.  I'm not sure
how much I can trust the utilization percentage, as I regularly see it
jump to 150% or 160%, but that too is a general indication that the disk
subsystem is stressed.
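For reference, the tools (standard procps/sysstat) and the specific
fields I'm watching:

    top            # "iowait" in the CPU state line
    vmstat 1       # "bo" (blocks written out) under the "io" heading
    iostat -x 1    # await (ms a request waits), svctm (ms to service
                   # a request), %util (device utilization)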
	Turning off logging (and with it the truly massive amount of log
data that was being written to disk) helped, but not quite enough.
Unfortunately, the number of physical hard drives and their RAID
configuration is suboptimal, and changing it isn't an option at this
point.  Fortunately, the machine has plenty of memory...
	...which leads me to what I was considering.  What about mounting
the "/var/spool/postfix/incoming" and "/var/spool/postfix/active" queues
as tmpfs, with a maximum filesystem size of about 500MB apiece?  I would
mount the filesystems at boot via "/etc/fstab" (obviously) and then, in
the PostFix startup script before PostFix actually starts, recreate the
hashed subdirectories the queues need ("0" through "F" - the hex digits -
which remain static as far as I can tell) and "chown"/"chmod" them to the
appropriate owner and permissions.  All file access in these two
directories would then happen in physical memory, taking the load
completely off the disks and being blazingly fast... as long as tmpfs
doesn't have to dip into swap during a physical memory shortage (which is
unlikely, since there's a ton of memory in these boxes).  Hell, even if
there were a shortage and tmpfs had to use swap, it would probably still
be faster than reading and writing the normal ext3 filesystem, according
to a whitepaper on tmpfs performance I read over at IBM.  The only
downside of this setup I can think of is that in the (very unlikely) event
of a server crash, all of the mail sitting in the PostFix "incoming" and
"active" queues would be lost.  Poof.  This, however, is an acceptable
risk according to management.
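	Concretely, I'm picturing something like the following.  The
owner and mode are what a stock PostFix install uses for its queue
directories here, so treat them as assumptions and check against your
own spool:

    /etc/fstab:
        tmpfs  /var/spool/postfix/incoming  tmpfs  size=500m,mode=0700  0 0
        tmpfs  /var/spool/postfix/active    tmpfs  size=500m,mode=0700  0 0

    # In the PostFix init script, before PostFix starts: recreate the
    # hash subdirectories, since tmpfs comes up empty on every boot.
    for q in incoming active; do
        for d in 0 1 2 3 4 5 6 7 8 9 A B C D E F; do
            mkdir -p /var/spool/postfix/$q/$d
        done
        chown -R postfix:root /var/spool/postfix/$q  # assumed owner; verify
        chmod -R 700 /var/spool/postfix/$q
    done

Running "postfix check" afterward should flag anything I got wrong with
the ownership or permissions.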
	Thoughts?

XOR


-- 
redhat-list mailing list
unsubscribe mailto:redhat-list-request@xxxxxxxxxx?subject=unsubscribe
https://www.redhat.com/mailman/listinfo/redhat-list
