As promised, here is the dmesg of the Oops caused by the BUG_ON() on line 1166. It seems that (*bcm & COUNTER_MAX) == COUNTER_MAX, so the system is issuing many more bitmap_startwrite() calls than bitmap_endwrite() calls. I'll try compiling with more verbose options and see what happens.
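
To make the failure mode concrete, here is a small userspace model of a per-chunk write counter. It is only a sketch: the 16-bit counter width and the two flag bits above the count are my assumptions about the layout, and counter_t, model_startwrite(), model_endwrite() and starts_until_saturation() are hypothetical names, not the kernel's functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 16-bit counter, top two bits are flags,
 * the low 14 bits count writes that are still in flight. */
typedef uint16_t counter_t;
#define NEEDED_MASK ((counter_t)(1u << 15))
#define RESYNC_MASK ((counter_t)(1u << 14))
#define COUNTER_MAX ((counter_t)(RESYNC_MASK - 1))

/* Model of a write start: returns false at the point where the
 * saturation check (the BUG_ON() condition) would trigger. */
static bool model_startwrite(counter_t *c)
{
    if ((*c & COUNTER_MAX) == COUNTER_MAX)
        return false;
    (*c)++;
    return true;
}

/* Model of a write completion: drops one pending write, if any. */
static void model_endwrite(counter_t *c)
{
    if ((*c & COUNTER_MAX) > 0)
        (*c)--;
}

/* Number of unbalanced starts accepted before the counter saturates. */
static unsigned starts_until_saturation(void)
{
    counter_t c = 0;
    unsigned n = 0;

    while (model_startwrite(&c))
        n++;
    return n;
}
```

Under these assumptions, balanced start/end pairs never move the counter away from zero, while starts with no matching ends saturate it after COUNTER_MAX (16383) accepted calls. That is why hitting the check points at a start/end imbalance on a single chunk.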
> This may be the 'hijacked' logic, but I'm a little puzzled here.
>
> Yes. When we fail to allocate a page for the map (which should be rare), then instead of failing the whole operation we just use the pointer to the page, so we're basically using 4 bytes (the page pointer itself) instead of 4K (the page) for that part of the bitmap. So each bit represents more data (1000x more in the case of x86).
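
The fallback described in the quoted text can be sketched roughly as follows. This is a userspace illustration under assumptions, not the kernel code: I assume 16-bit counters, a 4K page, and two counters stored in the space of the page pointer; struct model_page, model_checkpage(), model_get_counter() and failing_alloc() are hypothetical names, and the allocator hook exists only so that an allocation failure can be simulated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint16_t counter_t;

#define MODEL_PAGE_SIZE      4096u
#define COUNTERS_PER_PAGE    (MODEL_PAGE_SIZE / sizeof(counter_t)) /* 2048 */
#define HIJACKED_COUNTERS    2u  /* assumed: two 16-bit counters in 4 bytes */
#define CHUNKS_PER_HIJACKED_COUNTER (COUNTERS_PER_PAGE / HIJACKED_COUNTERS)

struct model_page {
    union {
        counter_t *map;                       /* normal case: page of counters */
        counter_t hijack[HIJACKED_COUNTERS];  /* fallback: counters stored in place */
    } u;
    int hijacked;
};

/* Hypothetical allocator that always fails, to exercise the fallback. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Ensure the page has counter storage; fall back to hijacking on failure. */
static void model_checkpage(struct model_page *p, void *(*alloc)(size_t))
{
    void *page;

    if (p->hijacked || p->u.map)
        return;                       /* storage already available */

    page = alloc(MODEL_PAGE_SIZE);
    if (!page) {
        p->hijacked = 1;              /* reuse the pointer's space as counters */
        memset(p->u.hijack, 0, sizeof(p->u.hijack));
        return;
    }
    memset(page, 0, MODEL_PAGE_SIZE);
    p->u.map = page;
}

/* Map a counter index within the page to its storage location. */
static counter_t *model_get_counter(struct model_page *p, size_t idx)
{
    if (p->hijacked) {
        /* lower half of the page shares counter 0, upper half counter 1 */
        size_t hi = idx >= COUNTERS_PER_PAGE / 2;
        return &p->u.hijack[hi];
    }
    return &p->u.map[idx];
}
```

With these numbers, one hijacked counter stands in for 1024 ordinary counters, which matches the "1000x more" figure for a 4-byte pointer. The operation keeps working, but at a much coarser granularity for that part of the bitmap.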
Thanks, I got it when I re-read the code and understood bitmap_checkpage() as well. Now it seems pretty clear. Anyway, it looks pretty unlikely that my system would run OOM and hijack a page in bitmap_checkpage()...

Regards,
Attachment:
dmesg_bitmap_trace_tail
Description: Binary data