On Mon, Aug 28, 2017 at 11:36:38AM +0300, Amir Goldstein wrote: > On Mon, Aug 28, 2017 at 11:33 AM, Amir Goldstein <amir73il@xxxxxxxxx> wrote: > > Christoph, Darrick, > > > > As I reported last week, I started running Josef's log-writes crash > > tests and immediately got reports on data checksum errors when > > running the tests on xfs. > > > > Unlike ext4 and btrfs, xfs tests seemed to fail arbitrarily for any > > value of random seed I tried. Unlike xfs, I never observed data > > checksum errors on ext4 and btrfs (only fsck errors). > > > > It's quite easy to reproduce the reported checksum errors when > > running the test currently on my xfstests branch: > > https://github.com/amir73il/xfstests/commits/dm-log-writes > > > > Looking closer at the reported checksum errors, in all cases > > I examined, the problem was, that after a sequence of > > PUNCH_HOLE+FSYNC on a test file, a partially zeroed block, > > both at beginning and end of zero range is not zeroed after > > crash. > > > > For example, the following file does not have zeroes after crash > > at end of logical block #11: > > Block #10 that is... #11 is unmapped. > > > > > --------------------- > > Filesystem type is: 58465342 > > File size of /mnt/scratch/testfile2 is 248338 (61 blocks of 4096 bytes) > > ext: logical_offset: physical_offset: length: expected: flags: > > 0: 1.. 3: 33.. 35: 3: unwritten > > 1: 10.. 10: 93.. 93: 1: 36: > > 2: 20.. 23: 147.. 150: 4: 94: unwritten > > 3: 24.. 31: 158.. 165: 8: 151: unwritten > > 4: 34.. 34: 146.. 146: 1: 166: unwritten > > 5: 35.. 38: 151.. 154: 4: 147: unwritten > > 6: 41.. 44: 167.. 170: 4: 155: unwritten > > 7: 46.. 46: 166.. 166: 1: 171: > > 8: 47.. 50: 89.. 92: 4: 167: > > 9: 51.. 60: 171.. 180: 10: 93: last,eof > > /mnt/scratch/testfile2: 10 extents found > > > > /mnt/scratch/testfile2 (bad): > > 0000000 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 000a510 5858 5858 5858 5858 5858 5858 5858 5858 > > * > > 000b000 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 002e280 0000 0000 0000 0000 0000 0000 5858 5858 > > 002e290 5858 5858 5858 5858 5858 5858 5858 5858 > > * > > 0038720 5858 5858 0000 0000 0000 0000 0000 0000 > > 0038730 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 003ca12 > > ------------------ > > > > However, this crash checkpoint (testfile2.mark1) was taken > > after punch+fsync that should have zeroed the end of block #11 > > (0xa988..000b000): > > ------------------- > > ... > > 2: 16 punch from 0xa988 to 0xf126, (0x479e bytes) > > 2: 17 read 0x12420 thru 0x1aa08 (0x85e9 bytes) > > 2: 18 write 0x30d11 thru 0x3a723 (0x9a13 bytes) > > 2: 19 punch from 0x27988 to 0x2aaed, (0x3165 bytes) > > 2: 20 write 0x2d6ff thru 0x369f3 (0x92f5 bytes) > > 2: 21 zero from 0x22882 to 0x22e14, (0x592 bytes) > > 2: 22 zero from 0x14655 to 0x1e636, (0x9fe1 bytes) > > 2: 23 zero from 0x17c91 to 0x1fb75, (0x7ee4 bytes) > > 2: 24 punch from 0x273eb to 0x3028c, (0x8ea1 bytes) > > 2: 25 zero from 0x29eb2 to 0x2c692, (0x27e0 bytes) > > 2: 26 zero from 0x11ac to 0x3910, (0x2764 bytes) > > 2: truncating to largest ever: 0x3ea12 > > 2: 27 trunc from 0x3a724 to 0x3ea12 > > 2: 28 collapse from 0x2d000 to 0x2f000, (0x2000 bytes) > > 2: 29 falloc from 0x22cf2 to 0x2733b (0x4649 bytes) > > 2: 30 mapread 0x3466a thru 0x3ca11 (0x83a8 bytes) > > 2: 31 fsync > > 2: Dumped fsync buffer to testfile2.mark1 > > > > /mnt/test/fsxtests/testfile2.mark1 (good): > > 0000000 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 000a510 5858 5858 5858 5858 5858 5858 5858 5858 > > * > > 000a980 5858 5858 5858 5858 0000 0000 0000 0000 > > 000a990 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 002e280 0000 0000 0000 0000 0000 0000 5858 5858 > > 002e290 5858 5858 5858 5858 5858 5858 5858 5858 > > * > > 0038720 5858 5858 0000 0000 0000 0000 0000 0000 > > 0038730 0000 0000 0000 0000 0000 0000 0000 0000 > > * > > 003ca12 > > > > -------------------------- > > > > Anyway, I went to look at xfs_zero_range() and while I admit > > it was hard for me to follow down all the actors into block > > layer, I couldn't find where partial zeroed page is marked dirty. > > > > Can you please have a look and say what you make of this? <shrug> /me would have thought the page gets dirty via: xfs_zero_range -> iomap_zero_range -> iomap_zero_range_actor -> iomap_zero -> iomap_write_end -> generic_write_end -> block_write_end -> __block_commit_write -> mark_buffer_dirty -> __set_page_dirty But maybe that doesn't happen? --D > > > > Thanks, > > Amir. > > > > P.S. if needed I can provide the recorded writes log to replay the > > I/O sequence that results in the reported error (it's 13K compressed) > > but the problem seems obvious and easy to reproduce using the > > xfstest (reproduced at high probability not always). > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html