Re: possible bug - bitmap dirty pages status

On Wed, Nov 16, 2011 at 7:07 AM, NeilBrown <neilb@xxxxxxx> wrote:
> On Wed, 16 Nov 2011 03:13:51 +0400 CoolCold <coolthecold@xxxxxxxxx> wrote:
>
>> As promised, I was collecting data but forgot to return to this
>> problem; the thread being bumped reminded me of it ;)
>> The data was collected for almost a month, from 31 August to 26 September:
>> root@gamma2:/root# grep -A 1 dirty component_examine.txt |head
>>           Bitmap : 44054 bits (chunks), 190 dirty (0.4%)
>> Wed Aug 31 17:32:16 MSD 2011
>>
>> root@gamma2:/root# grep -A 1 dirty component_examine.txt |tail -n 2
>>           Bitmap : 44054 bits (chunks), 1 dirty (0.0%)
>> Mon Sep 26 00:28:33 MSD 2011
>>
>> As far as I can tell from that dump, it is a bitmap examination (-X option)
>> of component /dev/sdc3 of RAID array /dev/md3.
>> The count did eventually decrease, after some increase on 23 September, and
>> the first decrease to 0 happened on 24 September (line number 436418).
>>
>> So for almost a month the dirty count did not decrease!
>> I'm attaching that log; maybe it will help somehow.
>
> Thanks a lot.
> Any idea what happened on Fri Sep 23?
> Between 6:23am and midnight the number of dirty bits dropped from 180 to 2.
No idea, sorry. Log rotation is scheduled in cron for 6:25 AM, but
nothing specific runs at 6:23.

However, the changes (the dirty count increasing) began at 2:30 AM, which
coincides with a cron script that does a data import and database
update; the database lives on that md array, under LVM.
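
The correlation with cron times can be checked directly against the collected log. Below is a minimal Python sketch, assuming component_examine.txt has the layout the grep output above suggests: each "Bitmap : ..." line is immediately followed by a line of date output. The function names are illustrative only and are not part of mdadm.

```python
import re

# Matches e.g. "Bitmap : 44054 bits (chunks), 190 dirty (0.4%)"
DIRTY_RE = re.compile(r"Bitmap\s*:\s*(\d+) bits \(chunks\),\s*(\d+) dirty")


def dirty_samples(lines):
    """Yield (timestamp_text, dirty_count) pairs.

    Assumption: the first non-empty line after a Bitmap line is the
    timestamp for that sample. "--" separators from grep -A are skipped.
    """
    pending = None
    for raw in lines:
        line = raw.strip()
        match = DIRTY_RE.search(line)
        if match:
            pending = int(match.group(2))
        elif pending is not None and line and line != "--":
            yield line, pending
            pending = None


def dirty_changes(lines):
    """Keep only the samples whose dirty count differs from the previous one."""
    previous = None
    for stamp, dirty in dirty_samples(lines):
        if dirty != previous:
            yield stamp, dirty
            previous = dirty


# Possible usage (file name taken from the thread):
# with open("component_examine.txt") as log:
#     for stamp, dirty in dirty_changes(log):
#         print(stamp, dirty)
```

Each reported timestamp marks a point where the dirty count changed, which can then be compared with the cron schedule.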

>
> This does seem to suggest that md is just losing track of some of the pages
> of bits and once they are modified again md remembers to flush them and write
> them out - which is a fairly safe way to fail.
>
> The one issue I have found is that set_page_attr uses a non-atomic __set_bit
> because it should always be called under a spinlock.  But bitmap_write_all()
> - which is called when a spare is added - calls it without the spinlock so
> that could corrupt some of the bits.
>
> Thanks,
> NeilBrown
>
>
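
The hazard Neil describes can be illustrated outside the kernel. A non-atomic bit set is a plain load, OR, and store, so two unserialized callers touching the same word can overwrite each other's update. The Python sketch below is only a deterministic simulation of that worst-case interleaving; the function names are hypothetical and this is not the md code.

```python
def nonatomic_set_bit_interleaved(word, bit_a, bit_b):
    """Simulate two writers doing load / OR / store with no lock.

    Both writers load the old value before either one stores, which is
    the interleaving that loses an update.
    """
    seen_by_a = word                    # writer A loads the word
    seen_by_b = word                    # writer B loads the same old value
    word = seen_by_a | (1 << bit_a)     # writer A stores its result
    word = seen_by_b | (1 << bit_b)     # writer B stores, discarding A's bit
    return word


def serialized_set_bit(word, bit_a, bit_b):
    """Same two updates, each load / OR / store completed as one unit,
    as it would be when every caller holds the same lock."""
    word = word | (1 << bit_a)
    word = word | (1 << bit_b)
    return word


lost = nonatomic_set_bit_interleaved(0, 0, 1)
kept = serialized_set_bit(0, 0, 1)
print(bin(lost), bin(kept))
```

In the simulated race only the second writer's bit survives, while the serialized version keeps both. A lost attribute bit of this kind would be consistent with a page of bitmap bits not being flushed until it is modified again.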



-- 
Best regards,
[COOLCOLD-RIPN]