Re: raid5 hang on get_active_stripe

dean gaudet <dean@xxxxxxxxxx> · Thu, 11 May 2006 08:13:03 -0700 (PDT)

On Tue, 14 Mar 2006, Neil Brown wrote:

> On Monday March 13, patrik@xxxxxxxxxxx wrote:
> > Hi all,
> > 
> > I just experienced some kind of lockup accessing my 8-drive raid5
> > (2.6.16-rc4-mm2). The system has been up for 16 days running fine, but
> > now processes that try to read the md device hang. ps tells me they are
> > all sleeping in get_active_stripe. There is nothing in the syslog, and I
> > can read from the individual drives fine with dd. mdadm says the state
> > is "active".
> 
> Hmmm... That's sad. That's going to be very hard to track down.
> 
> If you could
>   echo t > /proc/sysrq-trigger
> 
> and send me the dump that appears in the kernel log, I would
> appreciate it.  I doubt it will be very helpful, but it is the best
> bet I can come up with.

i seem to be running into this as well... it has happened several times
in the past three weeks.  i attached the kernel log output...

it's a debian 2.6.16 kernel, which is based mostly on 2.6.16.10.

md4 : active raid5 sdd1[0] sde1[5](S) sdh1[4] sdg1[3] sdf1[2] sdc1[1]
      1562834944 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 3/187 pages [12KB], 1024KB chunk

those drives are on 3w-xxxx (7508 controller).  i'm using lvm2 and
xfs as the filesystem (although i'm pretty sure an ext3 fs on another lv
is hanging too -- but i forgot to check before i unwedged it).

let me know if anything else is useful and i can try to catch it next
time.

> You could try increasing the size of the stripe cache
>   echo 512 > /sys/block/mdX/md/stripe_cache_size
> (choose an appropriate 'X').

yeah that got things going again -- it took maybe a minute or so, i
wasn't paying close attention to how fast things cleared up.

> Maybe check the content of
>          /sys/block/mdX/md/stripe_cache_active
> as well.

next time i'll check this before i increase stripe_cache_size... it's
0 now, but the raid5 is working again...
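
for reference, a rough sketch of that order of operations: read both
sysfs files first, then enlarge the cache. the function name
bump_stripe_cache and the MD_SYSFS override are made up for
illustration (they are not part of md); only the two sysfs paths come
from the thread above.

```sh
#!/bin/sh
# sketch only: record the stripe cache state before enlarging the cache.
# MD_SYSFS and bump_stripe_cache are illustrative names, not part of md.
MD_SYSFS="${MD_SYSFS:-/sys/block}"

bump_stripe_cache() {
    dev="$1"        # array name, e.g. md4
    new_size="$2"   # new stripe_cache_size, e.g. 512
    dir="$MD_SYSFS/$dev/md"

    # capture the evidence first; it may change once the array unwedges
    echo "stripe_cache_size before:   $(cat "$dir/stripe_cache_size")"
    echo "stripe_cache_active before: $(cat "$dir/stripe_cache_active")"

    # then enlarge the cache (needs root on a real system)
    echo "$new_size" > "$dir/stripe_cache_size"
}
```

e.g. `bump_stripe_cache md4 512` as root, saving the output somewhere
that doesn't live on the hung array.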

-dean

Attachment: messages.gz (binary data)