* Theodore Ts'o <tytso@xxxxxxx>: > On Wed, Feb 12, 2014 at 10:44:03AM -0500, Theodore Ts'o wrote: > > On Mon, Feb 10, 2014 at 03:04:14PM -0500, Eric Whitney wrote: > > > Commit a115f749c1 (ext4: remove wait for unwritten extent conversion from > > > ext4_truncate) exposed a bug in ext4_ext_handle_uninitialized_extents(). > > > It can be triggered by xfstest generic/299 when run on a test file > > > system created without a journal. > > Hey Eric, I'm still seeing generic/299 failures in the nojournal case, > although instead of block validity errors, they are ENOSPC errors: > > generic/299 192s ... [05:55:05][16439.429067] EXT4-fs warning (device vdc): ext4_convert_unwritten_extents:4725: inode #15: block 692480: len 32: ext4_ext_map_blocks returned -28 > [16441.203606] EXT4-fs warning (device vdc): ext4_convert_unwritten_extents:4725: inode #14: block 258688: len 32: ext4_ext_map_blocks returned -28 > [16441.508472] EXT4-fs warning (device vdc): ext4_convert_unwritten_extents:4725: inode #14: block 257792: len 32: ext4_ext_map_blocks returned -28 > . > . > . > [16479.132762] EXT4-fs warning (device vdc): ext4_convert_unwritten_extents:4725: inode #15: block 739808: len 32: ext4_ext_map_blocks returned -28 > [05:56:18] [failed, exit status 1] - output mismatch (see /root/xfstests/results//generic/299.out.bad) > --- tests/generic/299.out 2014-02-16 22:20:24.000000000 -0500 > +++ /root/xfstests/results//generic/299.out.bad 2014-02-18 05:56:18.816438707 -0500 > @@ -3,3 +3,5 @@ > Run fio with random aio-dio pattern > > Start fallocate/truncate loop > +failed: '/root/xfstests/bin/fio /tmp/22707.fio' > +(see /root/xfstests/results//generic/299.full for details) > ... > (Run 'diff -u tests/generic/299.out /root/xfstests/results//generic/299.out.bad' to see the entire diff) > > > I'm also seeing a soft lockup warning with generic/299 when using a 1k > block: > > generic/299 192s ... [08:26:39][25800.514234] INFO: task umount:655 blocked for more than 120 seconds. > [25800.515135] Not tainted 3.14.0-rc2 #1604 > [25800.515764] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [25800.516824] e1b9ddac 00000046 c0df1b40 e1b9c000 00000ddc 00000000 00000000 00001752 > [25800.518038] c8c563d0 e1b9dd68 c0138b9f 00000000 e1b9dd98 c018fac4 0003fb49 00000000 > [25800.519225] f63b2140 85ba09ee 00001752 85ba081b 00001752 00000001 f63b2140 00000000 > [25800.520411] Call Trace: > [25800.520756] [<c0138b9f>] ? sched_clock+0x9/0xc > [25800.521395] [<c018fac4>] ? sched_clock_local+0x11/0xfa > [25800.522105] [<c018fdcd>] ? sched_clock_cpu+0xc6/0xe7 > [25800.522794] [<c07b26f8>] schedule+0x63/0x65 > [25800.523374] [<c07b1c4a>] schedule_timeout+0x1a/0x99 > [25800.524046] [<c019e398>] ? mark_held_locks+0x5b/0x72 > [25800.524777] [<c07b5623>] ? _raw_spin_unlock_irq+0x27/0x36 > [25800.525518] [<c019e4fd>] ? trace_hardirqs_on_caller+0x14e/0x169 > [25800.526338] [<c019e523>] ? trace_hardirqs_on+0xb/0xd > [25800.527023] [<c07b5628>] ? _raw_spin_unlock_irq+0x2c/0x36 > [25800.527887] [<c07b2d2d>] __wait_for_common+0xc4/0xee > [25800.528586] [<c07b1c30>] ? ieee80211_assoc_success+0x95e/0x95e > [25800.529386] [<c018df97>] ? wake_up_state+0x11/0x11 > [25800.530044] [<c07b2d70>] wait_for_completion+0x19/0x1c > [25800.530748] [<c02562bf>] writeback_inodes_sb_nr+0xc2/0xcd > [25800.531507] [<c07b2c9a>] ? __wait_for_common+0x31/0xee > [25800.532222] [<c02562e8>] writeback_inodes_sb+0x1e/0x22 > [25800.532931] [<c0259be5>] sync_filesystem+0x3b/0x8c > [25800.533594] [<c0238d56>] generic_shutdown_super+0x22/0xcd > [25800.534351] [<c0238f9f>] kill_block_super+0x22/0x63 > [25800.535026] [<c0239188>] deactivate_locked_super+0x25/0x42 > [25800.535788] [<c02395cb>] deactivate_super+0x31/0x34 > [25800.536462] [<c024e235>] mntput_no_expire+0xd5/0xf1 > [25800.537134] [<c024efbd>] SYSC_umount+0x283/0x29a > [25800.537793] [<c024f0ac>] SyS_oldumount+0x1f/0x21 > [25800.538590] [<c07bbec6>] sysenter_do_call+0x12/0x38 > [25800.539330] 1 lock held by umount/655: > [25800.539864] #0: (&type->s_umount_key#19){++++++}, at: [<c02395c4>] deactivate_super+0x2a/0x34 > [08:31:50] 311s > > Have you seen either of these in your testing? > Hi Ted - We discussed the ENOSPC warnings from ext4_ext_map_blocks() in a concall a couple of weeks ago. They were visible in my testing both before and after my patch on both x86-64 and ARM, and are very reproducible. I don't believe they're directly related to the block validity problem, although they appeared simultaneously with it after commit a115f749c1. In the concall, you mentioned you were going to take a look, so I set them aside to look at a few other problems. I'd be happy to pick that up again if you'd like - just let me know. (It's unclear whether it's relevant, but I'll note that I'm getting warnings from the extent status tree debugging code (ES_AGGRESSIVE_TEST) even after my patch when running generic/299 nojournal - an item I've made a note to come back to.) The 299 test failure you're seeing could be due to the version of fio you're running. I don't recall all the details, but when upgrading my test environment in preparation for 3.14 I found that the current version of fio at the time would fail. I chose to stay with what worked for my 3.13 runs - fio-2.1.4-8-g3e26 (commit 3e260a46ea). It might be worth trying that to see if you get a clean run. I've not seen that softlockup warning on x86-64 or ARM on the 1K or any other test case (nor any softlockup warnings at all in 3.14 through -rc3). Also, it did not appear during the full xfstests runs of my patch on x86-64 and ARM prior to posting. That's the kind of warning I'd hope to see when running generic/208 and the umount happens late. Unfortunately, not. Thanks, Eric -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html