RE: RAID halting

>This sounds like a filesys problem rather than a RAID problem.

I considered that.  It may well be.

>One thing that can cause this sort of behaviour is if the filesystem is in
>the middle of a sync and has to complete it before the create can
>complete, and the sync is writing out many megabytes of data.

For between 40 seconds and 2 minutes?  The drive subsystem can easily gulp
down 200 megabytes to 6000 megabytes in that period of time.  What sync
would be that large?  Second, the problem also occurs when there is
relatively little or no data being written to the array.  Finally, unless I
am misunderstanding which layers iostat and atop report traffic at, the
fact that all drive writes invariably fall to dead zero during an event,
and that reads on precisely half the drives (always the same drives) also
drop to dead zero, suggests to me this is not the case.


>You can see if this is happening by running

>     watch 'grep Dirty /proc/meminfo'

>if that is large when the hang starts, and drops down to zero, and the
>hang lets go when it hits (close to) zero, then this is the problem.

Thanks, I'll give it a try later today.  Right now I am dead tired, plus
there are some batches running I really don't want interrupted, and
triggering an event might halt them.
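
Since 'watch' only shows the current value, a timestamped log may make it
easier to line the Dirty count up against the start and end of a hang.  A
minimal sketch, assuming Python is available on the machine; the function
names parse_dirty_kb and log_dirty are made up for illustration:

```python
import re
import time


def parse_dirty_kb(meminfo_text):
    """Return the 'Dirty' value in kB from /proc/meminfo text, or None."""
    # Anchor at line start so only the "Dirty:" line itself matches.
    match = re.search(r"^Dirty:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def log_dirty(samples=60, interval=1.0, path="/proc/meminfo"):
    """Print one timestamped Dirty reading every `interval` seconds."""
    for _ in range(samples):
        with open(path) as f:
            dirty = parse_dirty_kb(f.read())
        print(time.strftime("%H:%M:%S"), "Dirty kB:", dirty)
        time.sleep(interval)
```

Calling log_dirty() in a second terminal and redirecting the output to a
file should leave a record to compare against the iostat timestamps.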

>The answer then is to keep that number low by writing a suitable
>number into
>   /proc/sys/vm/dirty_ratio   (a percentage of system RAM)
>or
>   /proc/sys/vm/dirty_bytes

Um, OK.  What would constitute suitable numbers, assuming it turns out to be
the issue?
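
One rough way to reason about candidate values is to work out how much
dirty data a given percentage allows and how long it would take to flush.
The sketch below is back-of-the-envelope arithmetic only: it assumes the
percentage applies to all 4G of RAM (the kernel may use a smaller base, so
this is an upper bound), and the 50 MB/s write rate is an illustrative
figure, not a measurement.  The function names are hypothetical.

```python
def dirty_ceiling_mb(ram_mb, dirty_ratio_pct):
    """Upper bound on dirty data, in MB, for a given dirty_ratio."""
    return ram_mb * dirty_ratio_pct / 100.0


def drain_seconds(dirty_mb, write_mb_per_s):
    """Time to flush `dirty_mb` at a sustained write rate."""
    return dirty_mb / write_mb_per_s


RAM_MB = 4096          # assumption: 4G of RAM, all of it counted
WRITE_MB_PER_S = 50.0  # assumption: illustrative sustained write rate

for pct in (1, 5, 10, 20, 40):
    ceiling = dirty_ceiling_mb(RAM_MB, pct)
    seconds = drain_seconds(ceiling, WRITE_MB_PER_S)
    print("dirty_ratio=%2d%%  up to %7.1f MB dirty  ~%5.1f s to flush"
          % (pct, ceiling, seconds))
```

Picking the ratio whose flush time is comfortably shorter than the stall
one is willing to tolerate would be one way to choose a starting value.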

>If that doesn't turn out to be the problem, then knowing how the
>"Dirty" count is behaving might still be useful, and I would probably
>look at what processes are in 'D' state, (ps axgu) and look at their
>stack (/proc/$PID/stack)..

I'll surely try that, too.
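
To avoid picking the 'D' entries out of the ps listing by eye during an
event, the same check can be scripted.  A minimal sketch, assuming the
usual "pid (comm) state ..." layout of /proc/<pid>/stat; the helper names
are made up for illustration, and read_stack returns None where the
kernel does not provide /proc/<pid>/stack or the file is not readable:

```python
import os


def parse_state(stat_text):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm may itself contain spaces or parentheses, so split on the
    # first "(" and the last ")".
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for processes in uninterruptible sleep ('D')."""
    found = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack text for `pid`, or None if unavailable."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        return None
```

Run as root during a hang, looping over d_state_processes() and printing
read_stack(pid) for each entry should capture where each blocked process
is waiting.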

>You didn't say what kernel you are running.  It might make a difference.

>NeilBrown

Oh, sorry!  2.6.26-1-amd64, with 4G of RAM, of which typically 600 - 800M
is in use.  The swap space is 4.7G, but the used swap space has never
exceeded 200K.
