Re: Raid5 bitmap - Bug in bitmap_startwrite()

"Francois Barre" <francois.barre@xxxxxxxxx> · Fri, 4 Aug 2006 18:46:18 +0200

As promised, here is the dmesg of the Oops caused by the BUG_ON() on line 1166.
So it seems that (*bmc & COUNTER_MAX) == COUNTER_MAX, which means the system
is issuing many more bitmap_startwrite() calls than bitmap_endwrite() calls.
I'll try to compile with more verbose options and see what happens.
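
To make the failure mode concrete, here is a minimal user-space sketch of a
counter that saturates when start calls outnumber end calls. It is not the
kernel code: toy_counter_t, TOY_COUNTER_MAX, toy_startwrite() and
toy_endwrite() are hypothetical names, and the 14-bit width is an
illustrative assumption rather than a value taken from the md headers.

```c
#include <stdint.h>

/* Illustrative width only; assumed for this sketch, not read from md. */
#define TOY_COUNTER_BITS 14
#define TOY_COUNTER_MAX  ((1u << TOY_COUNTER_BITS) - 1)

typedef uint16_t toy_counter_t;

/*
 * Account for one write in flight.
 * Returns 0 on success, -1 if the counter is already saturated,
 * which is the condition the BUG_ON() in the real code trips on.
 */
static int toy_startwrite(toy_counter_t *bmc)
{
    if ((*bmc & TOY_COUNTER_MAX) == TOY_COUNTER_MAX)
        return -1;
    (*bmc)++;
    return 0;
}

/* Account for one completed write; never goes below zero. */
static void toy_endwrite(toy_counter_t *bmc)
{
    if ((*bmc & TOY_COUNTER_MAX) != 0)
        (*bmc)--;
}

/* Count how many unmatched start calls succeed before saturation. */
static unsigned toy_writes_until_saturation(void)
{
    toy_counter_t bmc = 0;
    unsigned n = 0;

    while (toy_startwrite(&bmc) == 0)
        n++;
    return n;
}
```

With balanced start/end pairs the counter returns to zero, so reaching the
maximum suggests either very many writes in flight on one chunk or end
calls that never arrive.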

> This may be the 'hijacked' logic, but I'm a little puzzled here.

Yes. When we fail to allocate a page for the map (which should be rare),
we don't fail the whole operation; instead we just use the pointer to the
page, so we're basically using 4 bytes (the page pointer itself) instead of
4K (the page) for that part of the bitmap. So each bit represents more
data (about 1000x more in the case of x86).
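
The "1000x" figure follows from the sizes quoted above. A small sketch of
the arithmetic, assuming a 4096-byte page, a 4-byte pointer and a 2-byte
counter (the counter size is an assumption for illustration; the function
names are hypothetical):

```c
#include <stddef.h>

/* How many counters of counter_bytes fit in storage_bytes. */
static size_t toy_counters_in(size_t storage_bytes, size_t counter_bytes)
{
    return storage_bytes / counter_bytes;
}

/*
 * How much more data each counter must cover when the pointer slot
 * is used as storage in place of a full page.
 */
static size_t toy_coarsening_factor(size_t page_bytes,
                                    size_t pointer_bytes,
                                    size_t counter_bytes)
{
    return toy_counters_in(page_bytes, counter_bytes) /
           toy_counters_in(pointer_bytes, counter_bytes);
}
```

For toy_coarsening_factor(4096, 4, 2) this gives 2048 / 2 = 1024, in line
with the rough 1000x quoted for x86.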

Thanks, I got it once I re-read the code and understood
bitmap_checkpage() as well. Now it seems pretty clear.
Anyway, it looks pretty unlikely that my system would run OOM and hijack
a page in bitmap_checkpage()...

Regards,
Attachment: dmesg_bitmap_trace_tail
Description: Binary data