Re: raid5 hang on get_active_stripe

On Thu, 11 May 2006, dean gaudet wrote:

> On Tue, 14 Mar 2006, Neil Brown wrote:
> 
> > On Monday March 13, patrik@xxxxxxxxxxx wrote:
> > > I just experienced some kind of lockup accessing my 8-drive raid5
> > > (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
> > > now processes that try to read the md device hang. ps tells me they are
> > > all sleeping in get_active_stripe. There is nothing in the syslog, and I
> > > can read from the individual drives fine with dd. mdadm says the state
> > > is "active".
...
> 
> i seem to be running into this as well... it has happened several times
> in the past three weeks.  i attached the kernel log output...

it happened again...  same system as before...


> > You could try increasing the size of the stripe cache
> >   echo 512 > /sys/block/mdX/md/stripe_cache_size
> > (choose an appropriate 'X').
> 
> yeah that got things going again -- it took a minute or so maybe, i
> wasn't paying attention as to how fast things cleared up.

i tried 768 this time and it wasn't enough... 1024 did it again...

> 
> > Maybe check the content of
> >          /sys/block/mdX/md/stripe_cache_active
> > as well.
> 
> next time i'll check this before i increase stripe_cache_size... it's
> 0 now, but the raid5 is working again...

here's a sequence of things i did... not sure if it helps:

# cat /sys/block/md4/md/stripe_cache_active
435
# cat /sys/block/md4/md/stripe_cache_size
512
# echo 768 >/sys/block/md4/md/stripe_cache_size
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# cat /sys/block/md4/md/stripe_cache_active
752
# echo 1024 >/sys/block/md4/md/stripe_cache_size
# cat /sys/block/md4/md/stripe_cache_active
927
# cat /sys/block/md4/md/stripe_cache_active
151
# cat /sys/block/md4/md/stripe_cache_active
66
# cat /sys/block/md4/md/stripe_cache_active
2
# cat /sys/block/md4/md/stripe_cache_active
1
# cat /sys/block/md4/md/stripe_cache_active
0
# cat /sys/block/md4/md/stripe_cache_active
3
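
The repeated cat commands above could be scripted. This is only a minimal sketch: watch_stripe_cache is a hypothetical helper name (not part of mdadm or the kernel), and the sample count and interval are arbitrary defaults.

```shell
#!/bin/sh
# Hypothetical helper: read a sysfs counter several times, one value per line.
# Arguments: file to read, number of samples (default 10),
#            seconds to wait between samples (default 1).
watch_stripe_cache() {
    file=$1
    samples=${2:-10}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Stop early if the file cannot be read (wrong mdX, not raid5, ...).
        cat "$file" || return 1
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
}

# Example call (md4 is the array from the transcript above):
#   watch_stripe_cache /sys/block/md4/md/stripe_cache_active 10 1
```

A count that stays pinned just below stripe_cache_size, like the 752 readings above, is the stuck case; a count that drains toward 0 means the array is moving again.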

and it's OK again... except i'm going to lower the stripe_cache_size to
256 again because i'm not sure i want to keep having to double it each
freeze :)
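
For a rough idea of what each doubling costs, the md(4) man page gives memory_consumed = system_page_size * nr_disks * stripe_cache_size. The sketch below just does that arithmetic. stripe_cache_bytes is a hypothetical helper name, the 4096-byte page size is an assumption, and the 8 disks in the example come from the original report in this thread (the disk count of md4 is not stated here).

```shell
#!/bin/sh
# Hypothetical helper: estimate stripe cache memory in bytes using
#   page_size * nr_disks * stripe_cache_size   (formula from md(4)).
# Arguments: stripe_cache_size, number of disks, page size (assumed 4096).
stripe_cache_bytes() {
    entries=$1
    disks=$2
    page_size=${3:-4096}
    echo $((entries * disks * page_size))
}

# Example, assuming 8 disks and 4 KiB pages:
#   stripe_cache_bytes 256 8     (prints 8388608, i.e. 8 MiB)
#   stripe_cache_bytes 1024 8    (prints 33554432, i.e. 32 MiB)
```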

let me know if you want the task dump output from this one too.

-dean