Re: I/O wait problem with hardware raid




Eric S. Johansson wrote:
Bill Davidsen wrote:

iowait means that there is a program waiting for I/O. That's all.

I was under the impression that I/O wait created a blocking condition.

It means a process is waiting for I/O, so that one is blocked. And often it means that heavy disk access will slow down other disk i/o. But the CPU time counted as iowait is available to any CPU-bound process (any process which needs it).
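To see how much of the CPU time is actually being counted as iowait, one option is to sample the aggregate "cpu" line of /proc/stat twice and compare. The following is only a rough sketch in Python; the function names and the 5 second interval are my own choices for illustration, and it assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def cpu_times(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(x) for x in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_share(before, after):
    """Fraction of elapsed CPU time counted as iowait between two samples."""
    delta = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    # Sum user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest time is already included in user, so it is left out.
    total = sum(delta[:8])
    return delta[4] / total if total else 0.0


def measure(interval=5.0, path="/proc/stat"):
    """Sample the file twice, `interval` seconds apart (hypothetical helper)."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return iowait_share(before, after)
```

Calling measure() while the rsync is running and again while it is not should show the difference. A high number only says that processes were blocked on the disk while the CPU had nothing else to run; it does not say the CPU was unavailable.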

One of the things Linux does poorly is to balance reads and writes. If you are doing heavy writes you don't have reads jumping the queue and getting done in reasonable time. Use of the deadline i/o scheduler may help this, as may making the dirty ratio _smaller_ to slow writes to let reads get run.
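For what it's worth, the scheduler can be inspected and switched per block device through sysfs. Here is a minimal sketch in Python, assuming the usual /sys/block/<device>/queue/scheduler file, in which the active scheduler is shown in square brackets. The device name "sda" in the usage note is only a placeholder, the function names are made up for illustration, and on newer multi-queue kernels the equivalent scheduler is called mq-deadline rather than deadline.

```python
def parse_schedulers(text):
    """Split e.g. 'noop anticipatory deadline [cfq]' into (active, available)."""
    names = text.split()
    active = next((n.strip("[]") for n in names if n.startswith("[")), None)
    return active, [n.strip("[]") for n in names]


def set_scheduler(device, name, sysfs="/sys/block"):
    """Switch `device` to scheduler `name`; return the previously active one.

    Needs root. The change is not persistent across reboots.
    """
    path = f"{sysfs}/{device}/queue/scheduler"
    with open(path) as f:
        active, available = parse_schedulers(f.read())
    if name not in available:
        raise ValueError(f"{name!r} not offered for {device}: {available}")
    if name != active:
        with open(path, "w") as f:
            f.write(name)
    return active
```

Something like set_scheduler("sda", "deadline") would then make the switch, and the return value tells you what to switch back to if it doesn't help.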

I played with allowing the reads to bypass a certain number of writes to balance performance. It worked beautifully, and I could tune it for any load, but I never got it to tune itself, so there was always a "jackpot case" which worked WAY worse than standard. Needless to say I never followed up on it, I haven't had inspiration.
Of course when you do a copy (regardless of software) the CPU is waiting
for disk transfers. I'm not sure what you think you should debug, i/o
takes time, and if the program is blocked until the next input comes in
it will enter the waitio state. If there is no other process to use the
available CPU it becomes waitio, which is essentially available CPU
cycles similar to idle.

What exactly do you think is wrong?

As I run rsync which increases the I/O wait state, the first thing I notice is
that IMAP starts getting slow, users start experiencing failures in sending
e-mail, and the initial exchange for ssh gets significantly longer.

All of these problems have both networking and file I/O in common and I'm trying
to separate out where the problem is coming from.  I have run netcat which has
shown that the network throughput is not wonderful but that's a different
problem for me to solve.  When I run netcat, there is no degradation of ssh,
IMAP or SMTP response times. The problem shows up if I run cp or rsync with a
local source and target.  It is at its worst when I'm doing one rsync within
the local filesystem and another rsync to an external rsync server.  At that
point, the system becomes very close to unusable.

Of course, I can throttle back rsync and regain some usability, but I'm backing
up a couple of terabytes of information. It's a time-consuming process even with
rsync, and I would like it to run as quickly as possible. I should probably point
out that the disk array is a relatively small RAID 5 setup with six 1 TB
drives.  I never did like RAID 5, especially when it's done in a bit of firmware.
Can't wait for ZFS (or its equivalent) on linux to reach production quality.

From where I stand right now, this might be "it sucks but it's perfectly
normal".  In a situation with heavy disk I/O, I would expect anything that
accesses the disk to run slowly. In a more naïve moment, I thought that the
GUI wouldn't be hurt by heavy disk I/O, and then I remembered that GNOME and its
kindred have lots of configuration files to read every time you move the mouse.  :-)

In any case, the people who sign my check aren't happy because they spent all this
money on an HP server and it performs no better than an ordinary PC.  I'm hoping I
can learn enough to give them a cogent explanation if I can't give them a solution.

Some of the tuning I suggested may help, perhaps a lot. If you have a lot of memory, heavy writes can fill it with dirty buffers, and response time will be a problem.

I appreciate the help.

Then let me know if any of the things I suggested help. Someone else may have other ideas. I would cut the dirty ratio by half and see whether it makes any difference, for better or worse.
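If it saves some typing, halving the dirty ratio could look roughly like the sketch below, in Python. It assumes the standard knobs under /proc/sys/vm. Touching dirty_background_ratio as well as dirty_ratio is my own assumption about what is worth trying, and the function names are made up for illustration. A knob that reads 0 is left alone, since that usually means the matching *_bytes setting is in use instead.

```python
def halved(value, floor=1):
    """Half of `value`, rounded down, but never below `floor`."""
    return max(floor, value // 2)


def halve_dirty_ratios(procdir="/proc/sys/vm"):
    """Halve both dirty ratios; return {knob: (old, new)} so it can be undone.

    Needs root. The change is lost on reboot unless it is also put
    in /etc/sysctl.conf.
    """
    changes = {}
    for knob in ("dirty_background_ratio", "dirty_ratio"):
        path = f"{procdir}/{knob}"
        with open(path) as f:
            old = int(f.read())
        if old == 0:
            continue  # byte-based limit in effect; leave it alone
        new = halved(old)
        with open(path, "w") as f:
            f.write(str(new))
        changes[knob] = (old, new)
    return changes
```

The returned old values make it easy to put things back if the change makes matters worse.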

---eric



--
Bill Davidsen <davidsen@xxxxxxx>
 "Woe unto the statesman who makes war without a reason that will still
be valid when the war is over..." Otto von Bismarck

