RE: RAID halting

On Sat, 2009-04-04 at 00:57 -0500, Lelsie Rhorer wrote:

> >> The issue is the entire array will occasionally pause completely for
> >> about 40 seconds when a file is created.
> 
> >I had symptoms like this once. It turned out to be a defective disk. The
> >disk would never return a read or write error but just intermittently
> >took a really long time to respond.
> 
> >I found it by running atop. All the other drives would be running at low
> >utilization and this one drive would be at 100% when the symptoms
> >occurred (which in atop gets colored red so it jumps out at you)
> 
> Thanks.  I gave this a try, but not being at all familiar with atop, I'm not
> sure what, if anything, the results mean in terms of any additional
> diagnostic data.

It's the same info as iostat, just in color.

> Depending somewhat upon the I/O load on the RAID array,
> atop sometimes reports the drive utilization on several or all of the drives
> to be well in excess of 85% - occasionally even 99%, but never flat 100% at
> any time.  

High 90s is what I meant by 100% :-)

> Oddly, even under relatively light loads of 20 or 30 Mbps,
> sometimes the RAID members would show utilization in the high 90s, usually
> on all the drives on a multiplier channel.

I think that's the filesystem buffering and then writing all at once.
It's normal if it's periodic: do they go briefly to ~100% and then back
to ~0%?

Did you watch the atop display when the problem occurred?

> I don't know if this is ordinary
> behavior for atop, but all the drives also periodically disappear from the
> status display.

That's a config option (and I find the default annoying). atop also
sorts the drives by utilization every second, which can be a little hard
to watch. But if you have the problem I had, then that one drive stays at
the top of the list when the problem occurs.

> Additionally, while atop is running and I am using my usual
> video editor, Video Redo, on a Windows workstation to stream video from the
> server, every time atop updates, the video and audio skip when reading from
> a drive not on the RAID array.  I did not notice the same behavior from the
> RAID array.  Odd.

I think this is heavy /proc filesystem access which I have noticed can
screw up even realtime processes.

> Anyway, on to the diagnostics.
> 
> I ran both `atop` and `watch iostat 1 2` concurrently and triggered several
> events while under heavy load ( >450 Mbps, total ). In atop, drives sdb,
> sdd, sde, sdg, and sdi consistently disappeared from atop entirely, and
> writes for the other drives fell to dead zero.  Reads fell to a very small
> number.  The iostat session returned information in agreement with atop:
> both reads and writes for sdb, sdd, sde, sdg, sdi, and md0 all fell to dead
> zero from nominal values frequently exceeding 20,000 reads / sec and 5000
> writes / sec.  Meanwhile, writes to sda, sdc, sdf, sdh, and sdj also dropped
> to dead zero, but reads only fell to between 230 and 256 reads/sec.

I used:

  iostat -t -k -x 1 | egrep -v 'sd.[0-9]'

to get percent utilization and not show each partition but just whole
drives.
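
A rough sketch of a variation on that filter, not the command used
above: it prints only the whole drives at or above a chosen utilization.
The function name busy_disks and the threshold are made up for
illustration, and it assumes the drives are named sdX and that %util is
the last column of the iostat -x output.

```shell
# busy_disks THRESHOLD
# Reads `iostat -x` style text on stdin and prints "device %util" for
# whole sd* disks whose utilization is at or above THRESHOLD.
# Assumption: %util is the last column of each device line.
busy_disks() {
    awk -v min="$1" '
        # Match sda, sdb, ... but not partitions such as sda1.
        $1 ~ /^sd[a-z]+$/ && $NF + 0 >= min + 0 { print $1, $NF }
    '
}
```

It would be used as `iostat -t -k -x 1 | busy_disks 90`, so the screen
stays empty until some drive gets busy.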

For atop you want the -f option to 'fixate' the number of lines so
drives with zero utilization don't disappear.

If you didn't get constant 100% utilization while the event occurred
then I guess you don't have the problem I had.
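
The pattern described here (one drive pegged while all the others sit
near idle) can be checked mechanically for a single sample. This is only
a sketch: lone_busy_disk is a hypothetical name, the HIGH and LOW
thresholds are arbitrary, and the input is assumed to be one
"device %util" pair per line.

```shell
# lone_busy_disk HIGH LOW
# Reads "device %util" pairs on stdin, one per line, for one sample.
# Prints the device name only if exactly one device is >= HIGH and
# every other device is <= LOW; otherwise prints nothing.
lone_busy_disk() {
    awk -v high="$1" -v low="$2" '
        $2 + 0 >= high + 0 { busy++; name = $1; next }
        $2 + 0 >  low + 0  { mixed = 1 }   # neither pegged nor idle
        END { if (busy == 1 && !mixed) print name }
    '
}
```

For example, `printf '%s\n' 'sda 2.0' 'sdb 99.1' 'sdc 4.5' | lone_busy_disk 90 20`
prints sdb. A single sample proves little; the symptom is the same drive
showing up for the whole duration of the stall.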

Does the SATA multiplier have its own driver and if so, is it the
latest? Any other complaints on the net about it? I would think a
problem there would show up as 100% utilization though...

And I think you already said the CPU usage is low when the event occurs?
No single core at near 100%? (atop would show this too...)

