Hi Ryusuke, On Fri, 2013-04-26 at 01:37 +0900, Ryusuke Konishi wrote: [snip] > > Adding BH_NILFS_Allocated flag looks overkill for this issue. > > Why not modify btree or direct mapping code so that the buffer dirty > flag is propery set or cleared also for 1 KB block? Basically, the > block mapping code of NILFS is designed to be able to handle 1 KB > block. > My idea is simple. As I can see, usual way of processing page's buffers in NILFS2 driver is likewise: struct buffer_head *bh, *head; bh = head = page_buffers(page); do { /* Do something */ } while (bh = bh->b_this_page, bh != head); And, as I understand, the essence of the issue is in processing of spare buffers that were allocated during create_empty_buffers() call but these buffers shouldn't be used yet. So, from my viewpoint, the reasonable way of the issue fix is to ignore such spare buffers. Because operations with spare buffers can be resulted in presence of very sophisticated bugs and performance degradation. As a result, I suggested to use BH_NILFS_Allocated flag in this patch. > If this issue occurs after file size extension with ftruncate(), I > guess handling of hole blocks created by the extension seems to have > problem. > The nature of this issue is more complex, as I described in patch. The ftruncate call is simply beginning of the issue. The peculiarity of this call for this issue consists in growing file from zero bytes to 60 bytes. So, for 1 KB block size we should use only one block (hole) but, as a result, we have 4 allocated buffers (1 for hole block and 3 spare buffers). [ 223.543981] nilfs_truncate:715: blkoff 1, inode->i_size 60 [ 223.544068] nilfs_get_block:89: inode->i_ino 21, ii->i_bmap->b_last_allocated_ptr 0 [ 223.544079] nilfs_truncate:729: inode->i_ino 21 [ 223.544086] Pid: 2424, comm: darcs Tainted: G I 3.8.0-rc1 #147 [ 223.544089] Call Trace: [ 223.544098] [<ffffffff812e0c77>] nilfs_truncate+0x117/0x130 [ 223.544105] [<ffffffff811516de>] ? truncate_pagecache+0x5e/0x70 [ 223.544111] [<ffffffff812e0d11>] nilfs_setattr+0x81/0xb0 [ 223.544119] [<ffffffff817285b1>] ? mutex_lock_nested+0x2a1/0x370 [ 223.544126] [<ffffffff811bf60b>] notify_change+0x1cb/0x390 [ 223.544132] [<ffffffff811a1290>] do_truncate+0x60/0xa0 [ 223.544138] [<ffffffff811a1644>] sys_ftruncate+0x114/0x180 [ 223.544146] [<ffffffff8139fcae>] ? trace_hardirqs_on_thunk+0x3a/0x3f [ 223.544152] [<ffffffff81735119>] system_call_fastpath+0x16/0x1b Then, it is possible to identify trying to write in file mapping: [ 489.785334] nilfs_page_mkwrite:146: inode->i_ino 21 [ 489.785336] Pid: 2447, comm: darcs Tainted: G I 3.8.0-rc1 #149 [ 489.785338] Call Trace: [ 489.785342] [<ffffffff812e1a55>] nilfs_page_mkwrite+0x315/0x390 [ 489.785347] [<ffffffff81169562>] __do_fault+0xe2/0x4c0 [ 489.785351] [<ffffffff8116c260>] handle_pte_fault+0x90/0x8e0 [ 489.785355] [<ffffffff8101dcc9>] ? sched_clock+0x9/0x10 [ 489.785359] [<ffffffff81098985>] ? sched_clock_local+0x25/0x90 [ 489.785363] [<ffffffff81730644>] ? __do_page_fault+0x114/0x5b0 [ 489.785367] [<ffffffff8116dc85>] handle_mm_fault+0x235/0x330 [ 489.785371] [<ffffffff817306cc>] __do_page_fault+0x19c/0x5b0 [ 489.785375] [<ffffffff81098b18>] ? sched_clock_cpu+0xa8/0x110 [ 489.785378] [<ffffffff810bd9fd>] ? trace_hardirqs_off+0xd/0x10 [ 489.785382] [<ffffffff81098bef>] ? local_clock+0x6f/0x80 [ 489.785386] [<ffffffff810be345>] ? lock_release_holdtime.part.25+0x15/0x1a0 [ 489.785390] [<ffffffff811c1183>] ? __close_fd+0x83/0xb0 [ 489.785394] [<ffffffff8139fe9d>] ? trace_hardirqs_off_thunk+0x3a/0x3c [ 489.785398] [<ffffffff81730aee>] do_page_fault+0xe/0x10 [ 489.785402] [<ffffffff8172cc48>] page_fault+0x28/0x30 It is possible to see that after segment construction we have such state of bmap (namely, it was set two block numbers): [ 490.177923] nilfs_direct_set_ptr: inode: 21 [ 490.177925] NILFS2: 00 00 00 00 00 00 00 00 01 14 00 00 00 00 00 00 ................ [ 490.177927] NILFS2: 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 490.177929] NILFS2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [ 490.177931] NILFS2: 00 00 00 00 00 00 00 00 And, finally, we have issue: [ 490.179790] nilfs_direct_assign: inode: 21 [ 490.179800] nilfs_direct_assign: ptr: 5121 [ 490.179802] nilfs_dat_prepare_entry:57 req->pr_entry_nr 5121 [ 490.179805] nilfs_palloc_get_entry_block:294 nr 5121, create 0 [ 490.179815] nilfs_direct_assign: inode: 21 [ 490.179828] nilfs_direct_assign: ptr: 10 [ 490.179830] nilfs_dat_prepare_entry:57 req->pr_entry_nr 10 [ 490.179832] nilfs_palloc_get_entry_block:294 nr 10, create 0 [ 490.179843] nilfs_direct_assign: inode: 21 [ 490.179857] nilfs_direct_assign: ptr: 0 [ 490.179859] nilfs_direct_assign: invalid pointer: 0 [ 490.179862] NILFS error (device dm-0): nilfs_bmap_assign: broken bmap (inode number=21) With the best regards, Vyacheslav Dubeyko. > I will look into the issue too. > > > Regards, > Ryusuke Konishi > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html