Re: A lot of NILFS: bad btree node messages (readonly fs)

On Jan 4, 2013, at 12:49 AM, Piotr Szymaniak wrote:

> On Tue, Dec 25, 2012 at 10:05:52AM +0400, Vyacheslav Dubeyko wrote:
>> Hi Piotr,
>> 
>> On Fri, 2012-11-30 at 08:32 +0100, Piotr Szymaniak wrote:
>>> On Fri, Nov 30, 2012 at 10:09:04AM +0400, Vyacheslav Dubeyko wrote:
>>>> Hi,
>>>> 
>>>> On Thu, 2012-11-29 at 22:06 +0100, Piotr Szymaniak wrote:
>>>>> Hello.
>>>>> 
>>>>> Doing some work today (system update, kernel update, rsync rootfs with
>>>>> external drive) I noticed some unwanted behaviour:
>>>>> maszn ~ # touch afile
>>>>> touch: cannot touch ‘afile’: System plików wyłącznie do odczytu
>>>>>                        it's Read only filesystem
>>>>> 
>>>>> dmesg shows a lot (1812 in the log) of:
>>>>> [ 8288.319012] NILFS: bad btree node (blocknr=26229286): level = 178,
>>>>> flags = 0x0, nchildren = 0
>>>>> [ 8288.319017] NILFS error (device sda2): nilfs_bmap_lookup_contig:
>>>>> broken bmap (inode number=102230)
>>>>> 
>>>>> After mount -o remount,rw /
>>>>> [ 8384.734024] segctord starting. Construction interval = 300 seconds,
>>>>> CP frequency < 30 seconds
>>>>> [ 8384.734028] NILFS warning: mounting fs with errors
>>>>> 
>>>>> This never happened before, and it looks like it works fine now (using
>>>>> vanilla kernel 3.6.4).
>>>>> 
>>>>> After a reboot (to vanilla kernel 3.6.8) dmesg shows:
>>>>> [    6.046883] segctord starting. Construction interval = 300 seconds,
>>>>> CP frequency < 30 seconds
>>>>> [    6.046884] NILFS warning: mounting fs with errors
>>>>> 
>>>>> I don't like the "errors" part...
>>>>> 
>> 
>> Did you have any snapshots on this volume or not?
> 
> No, only checkpoints (just checked).
> 

OK. That means snapshots play no role in this issue.

> Also, got this issue again. A *lot* of:
> Jan 03 22:36:38 [kernel] [  953.289973] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:36:38 [kernel] [  953.289976] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> Jan 03 22:36:38 [kernel] [  953.289976] 
> Jan 03 22:36:38 [kernel] [  953.290248] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:36:38 [kernel] [  953.290251] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> Jan 03 22:36:38 [kernel] [  953.290251] 
> Jan 03 22:36:38 [kernel] [  953.290523] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:36:38 [kernel] [  953.290526] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> 
> Then I remounted rw and tried again (I was rsyncing my whole rootfs to
> another drive):

So the issue did not occur because of rsync itself. Am I correct? I ask because I remember that rsync was also in use in another report about this issue.

> Jan 03 22:38:15 [kernel] [ 1050.290134] segctord starting. Construction
> interval = 300 seconds, CP frequency < 30 seconds
> Jan 03 22:38:15 [kernel] [ 1050.290135] NILFS warning: mounting fs with
> errors
> Jan 03 22:38:15 [nilfs_cleanerd] start
> Jan 03 22:38:15 [nilfs_cleanerd] pause (clean check)
> Jan 03 22:38:40 [kernel] [ 1075.157346] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:38:40 [kernel] [ 1075.157351] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> Jan 03 22:38:40 [kernel] [ 1075.157351] 
> Jan 03 22:38:40 [kernel] [ 1075.157352] Remounting filesystem read-only
> Jan 03 22:38:40 [kernel] [ 1075.159162] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:38:40 [kernel] [ 1075.159168] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> Jan 03 22:38:40 [kernel] [ 1075.159168] 
> Jan 03 22:38:40 [kernel] [ 1075.161003] NILFS: bad btree node
> (blocknr=26229286): level = 67, flags = 0xee, nchildren = 40
> Jan 03 22:38:40 [kernel] [ 1075.161019] NILFS error (device sda2):
> nilfs_bmap_lookup_contig: broken bmap (inode number=102230)
> 
> But thanks to rsync I found a (corrupted?) file that causes the problem.
> It's some /var/tmp/kdecache-$USER/foo.kcache. Copying this file to
> another place ends with the filesystem going read-only.
> 

So maybe the nature of /var/tmp/kdecache-$USER/foo.kcache is the reason for the issue. That is a very interesting detail. Could you share more details about this file: size, creation time, modification time, owner, permissions and so on? Also, could you share more details about your hardware environment, in particular the CPU and RAM?
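If it helps, the file and hardware details could be collected with something like the following Python sketch. It is only an illustration: the helper names (describe_file, cpu_models, mem_total_kb) are made up for this mail, and it assumes a Linux system where /proc/cpuinfo and /proc/meminfo are available. It calls only lstat() on the file, so it does not read the file's data. Note that st_ctime on Linux is the inode change time, not a true creation time.

```python
import os
import pwd
import stat
import time


def _fmt(ts):
    """Format a POSIX timestamp as local time."""
    return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(ts))


def describe_file(path):
    """Return basic inode metadata for `path` using lstat() only."""
    st = os.lstat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the number.
        owner = str(st.st_uid)
    return {
        "path": path,
        "size_bytes": st.st_size,
        "mode": stat.filemode(st.st_mode),  # e.g. "-rw-r--r--"
        "owner": owner,
        "mtime": _fmt(st.st_mtime),  # last data modification
        "ctime": _fmt(st.st_ctime),  # last inode change (not creation)
    }


def cpu_models(cpuinfo_text):
    """Return the distinct 'model name' values from /proc/cpuinfo text."""
    models = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("model name") and ":" in line:
            models.add(line.split(":", 1)[1].strip())
    return sorted(models)


def mem_total_kb(meminfo_text):
    """Return MemTotal in kB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    return None
```

To use it, pass the real path of the problem file to describe_file(), and pass the contents of /proc/cpuinfo and /proc/meminfo to the other two helpers.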

I think it would be very useful to know how this file is laid out on the volume. Could you use dumpseg to get information about the last several segments? I don't know how many are needed to understand the problem, but about 10 may be enough to begin with.
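One possible way to pick the segments is sketched below in Python. It rests on several assumptions: that "last" means the most recently written segments, that the text comes from lssu (nilfs-utils) with data rows of the form "SEGNUM DATE TIME STAT NBLOCKS", and that dumpseg accepts a device followed by segment numbers. The function names are invented for this mail, and /dev/sda2 is taken from the "device sda2" part of the log.

```python
from datetime import datetime


def latest_segments(lssu_text, count=10):
    """Return the `count` most recently written segment numbers, oldest first.

    Assumes each data row looks like: SEGNUM DATE TIME STAT NBLOCKS.
    """
    rows = []
    for line in lssu_text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a data row.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        try:
            written = datetime.strptime(
                fields[1] + " " + fields[2], "%Y-%m-%d %H:%M:%S"
            )
        except ValueError:
            # Rows without a valid timestamp are ignored.
            continue
        rows.append((written, int(fields[0])))
    rows.sort()
    return [segnum for _, segnum in rows[-count:]]


def dumpseg_command(device, segnums):
    """Build (but do not run) a dumpseg command line as an argument list."""
    return ["dumpseg", device] + [str(n) for n in segnums]
```

The resulting list can be passed to subprocess.run() as root, with the output redirected to a file and attached to the reply.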

Thanks,
Vyacheslav Dubeyko.

> 
> Piotr Szymaniak.
> -- 
>  - You're not one of those bleeding-heart liberals, are you?
>  - I refuse to answer, because it could be used against me,
> I replied. The cabbie gave a snort that meant
> why-do-I-always-get-the-wise-guys... but he shut up.
>  -- Stephen King, "The Breathing Method"
