xfs clones crash issue - illegal state 13 in block map

Hi guys,

I often hit the error in the subject line when running the crash tests
with cloned files (generic/502 in my xfstests patches).

Hitting this error requires first fixing two other issues
that otherwise mask it:
"xfs: fix incorrect log_flushed on fsync" (in master)
"xfs: fix leftover CoW extent after truncate"
available on my tree based on Darrick's simple fix:
https://github.com/amir73il/linux/commits/xfs-fsync

I get the errors more often (about 1 run out of 5) on a 100G fs on a spinning disk.
On a 10G fs on an SSD they are less frequent.
The log in this email was captured on a patched stable 4.9.47 kernel,
but I get the same errors on a patched upstream kernel.

I wasn't able to create a deterministic reproducer, so I am attaching
the full log from a failed test along with an IO log that can be
replayed on your disk to examine the outcome.

Following is the output of fsx process #5, which is the process
that wrote the problematic testfile5.mark0 to the log.
This process performs only read, zero and fsync before creating
the log mark.
The file testfile5 was cloned from an origin 256K file before
running fsx.
Later, I used the random seed from this log (35484) for all
processes, and it seemed to increase the probability of failure.

# /old/home/amir/src/xfstests-dev/ltp/fsx -N 100 -d -k -P /mnt/test/fsxtests \
    -i /dev/mapper/logwrites-test -S 0 -j 5 /mnt/scratch/testfile5
Seed set to 35484
file_size=262144
5: 1 read 0x3f959 thru 0x3ffff (0x6a7 bytes)
5: 2 zero from 0x3307e to 0x34f74, (0x1ef6 bytes)
5: 3 fsync
5: Dumped fsync buffer to testfile5.mark0
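
As a sanity check on the fsx output above, here is a small sketch that recomputes the byte counts and shows which filesystem blocks the zero operation touches. It assumes a 4096-byte block size, which is not stated in this report, and the function names are made up for illustration. It only does arithmetic on the logged offsets and says nothing about the root cause.

```shell
#!/usr/bin/env bash
# Assumption: 4096-byte filesystem blocks (not stated in the report).
BLOCK_SIZE=${BLOCK_SIZE:-4096}

# Byte length of an inclusive range, as in fsx "read A thru B".
len_inclusive() {
    printf '0x%x\n' $(( $2 - $1 + 1 ))
}

# Byte length of a half-open range, as in fsx "zero from A to B".
len_exclusive() {
    printf '0x%x\n' $(( $2 - $1 ))
}

# For the half-open byte range [start, end), print:
#   first block touched, last block touched, number of fully covered blocks.
blocks_touched() {
    local start=$(( $1 )) end=$(( $2 ))
    local first=$(( start / BLOCK_SIZE ))
    local last=$(( (end - 1) / BLOCK_SIZE ))
    local full_first=$(( (start + BLOCK_SIZE - 1) / BLOCK_SIZE ))
    local full_end=$(( end / BLOCK_SIZE ))
    local full=$(( full_end > full_first ? full_end - full_first : 0 ))
    echo "$first $last $full"
}

len_inclusive 0x3f959 0x3ffff    # operation 1: read
len_exclusive 0x3307e 0x34f74    # operation 2: zero
blocks_touched 0x3307e 0x34f74   # blocks touched by the zero range
```

With the offsets from the log this prints 0x6a7, 0x1ef6 and "51 52 0": the lengths match what fsx reported, and under the 4096-byte assumption the zero range partially covers blocks 51 and 52 of the cloned file without covering either one completely.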

To get to the crash state you need my xfstests replay-log patches;
then replay the attached log on a scratch device of at least 100G:

# ./src/log-writes/replay-log --log log.xfs.testfile5.mark0 \
    --replay $SCRATCH_DEV --end-mark testfile5.mark0
# mount $SCRATCH_DEV $SCRATCH_MNT
# umount $SCRATCH_MNT
# xfs_repair -n $SCRATCH_DEV
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0

fatal error -- illegal state 13 in block map 376
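
For convenience, the steps above can be wrapped in a small script. This is only a sketch: run() and check_crash_state() are hypothetical helper names, DRY_RUN is an illustrative variable, and the script uses only the commands and options shown above. It expects SCRATCH_DEV and SCRATCH_MNT to be set and to be run from the xfstests source directory.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the commands from this report.
# run(), check_crash_state() and DRY_RUN are illustrative names.

# Execute a command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Replay the IO log up to the given mark, cycle a mount so that
# XFS log recovery runs, then check the result without modifying it.
check_crash_state() {
    local log=$1 mark=$2
    run ./src/log-writes/replay-log --log "$log" \
        --replay "$SCRATCH_DEV" --end-mark "$mark" || return 1
    run mount "$SCRATCH_DEV" "$SCRATCH_MNT" || return 1
    run umount "$SCRATCH_MNT" || return 1
    # -n: no-modify mode; a non-zero status here means xfs_repair
    # found a problem (or aborted, as in the output above).
    run xfs_repair -n "$SCRATCH_DEV"
}
```

Example, after decompressing the attached log: `DRY_RUN=1 check_crash_state log.xfs.testfile5.mark0 testfile5.mark0` prints the four commands without touching any device. Drop DRY_RUN to actually run them.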

Can anyone provide some insight?

Thanks,
Amir.

Attachment: 502.full.xfs.testfile5.mark0
Description: Binary data

Attachment: log.xfs.testfile5.mark0.bz2
Description: BZip2 compressed data

