https://bugzilla.kernel.org/show_bug.cgi?id=201089

            Bug ID: 201089
           Summary: [xfstests generic/417]: XFS corruption attribute entry #0
                    in attr block 0, inode 674 is INCOMPLETE
           Product: File System
           Version: 2.5
    Kernel Version: 4.19-rc3
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@xxxxxxxxxxxxxxxxxxxxxx
          Reporter: zlang@xxxxxxxxxx
        Regression: No

Created attachment 278449
  --> https://bugzilla.kernel.org/attachment.cgi?id=278449&action=edit
xfs (512 blocksize) with the orphan list

I just hit an XFS corruption while running xfstests generic/417 on a 512
blocksize XFS (reproduced on linux 4.19-rc3):

_check_xfs_filesystem: filesystem on /dev/sda5 is inconsistent (r)
*** xfs_repair -n output ***
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
sb_icount 64, counted 128
sb_ifree 61, counted 124
sb_fdblocks 31436740, counted 31436706
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
attribute entry #0 in attr block 0, inode 674 is INCOMPLETE
problem with attribute contents in inode 674
would clear attr fork
bad nblocks 2 for inode 674, would reset to 0
bad anextents 2 for inode 674, would reset to 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
*** end xfs_repair output

Although g/417 is the reproducer, the bug is very hard to reproduce with it.
So I captured a metadump that triggers this bug while running g/417, taken at
the point marked below:

  ...
  echo "mount dirty orphans ro, then unmount"
  create_dirty_orphans
                                  <<< metadump at HERE
  _scratch_mount -o ro
  _scratch_unmount

  # We should be clean at this point
  echo "check fs consistency"
  _check_scratch_fs
  ...

Steps to Reproduce:
1. Download the attachment from this bug
2. mdrestore the metadump
3. mount && umount the XFS to replay the log
4. Run xfs_repair -n on the above XFS image

Additional info:
Brian (bfoster@) has left some comments on this bug, but they are behind an
internal link that can't be opened from outside, so I paste his comment below:

---
From skimming through the code and reminding myself about the xattr INCOMPLETE
flag semantics, I think this flag can be expected after a crash regardless of
log recovery. For example, if we're setting a largish xattr value that requires
remote block allocation, we'd set the xattr name and mark the entry INCOMPLETE,
roll the transaction, allocate the remote block(s) (rolling the transaction
again), synchronously write the remote value, clear the INCOMPLETE flag (and
roll the tx) and then finally commit the transaction.

So IOW, it's quite possible to leave a partially constructed (i.e., no value)
xattr in place after a crash, and the purpose of the flag is to accommodate
that. It looks like there are cases where incomplete xattrs might be quietly
cleaned out, so this isn't a catastrophic problem that requires immediate
repair, but otherwise it makes sense for repair to detect and clear them out as
well.

It's not clear that the block accounting error is to be expected, however, so
there still could be something going on here.
---
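The reproduction steps listed above can be sketched as a small shell helper.
This is only a sketch: the function names run and reproduce, the DRY_RUN
variable, and every file and directory name passed in are assumptions made for
illustration and do not come from the bug report. It also assumes the
attachment has already been downloaded (and decompressed, if it was compressed)
and that the restored image is a regular file, hence "mount -o loop" and
"xfs_repair -f".

```shell
#!/bin/sh
# Sketch of the reproduction steps. All names are placeholders supplied by
# the caller; none of them are taken from the bug report.

# Print each command; execute it only when DRY_RUN=0 (which needs root).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Usage: reproduce METADUMP IMAGE MOUNTPOINT
reproduce() {
    metadump=$1
    image=$2
    mnt=$3
    # Step 2: restore the metadump into a filesystem image.
    run xfs_mdrestore "$metadump" "$image" || return 1
    # Step 3: mounting replays the dirty log; unmount right after.
    run mount -o loop "$image" "$mnt" || return 1
    run umount "$mnt" || return 1
    # Step 4: check the image without modifying it (-n), as a file (-f).
    run xfs_repair -n -f "$image"
}
```

By default the helper only prints the four commands, which makes it safe to
review first. Running it for real would look something like
"DRY_RUN=0 reproduce ./metadump.md ./fs.img /mnt/scratch" as root, where all
three arguments are hypothetical paths.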