> > I tested it on top of 5.10.109 + these 5 patches:
> > https://github.com/amir73il/linux/commits/xfs-5.10.y-1
> >
> > I can test it in isolation if you like. Let me know if there are
> > other forensics that you would like me to collect.
> >
>
> Hm. Still no luck if I move to .109 and pull in those few patches. I
> assume there's nothing else potentially interesting about the test env
> other than the sparse file scratch dev (i.e., default mkfs options,

Oh! right, this guest is debian/10 with xfsprogs 4.20, so the defaults
are reflink=0.

Actually, the section I am running is reflink_normapbt, but...

** mkfs failed with extra mkfs options added to "-f -m reflink=1,rmapbt=0, -i sparse=1," by test 076 **
** attempting to mkfs using only test 076 options: -m crc=1 -i sparse **
** mkfs failed with extra mkfs options added to "-f -m reflink=1,rmapbt=0, -i sparse=1," by test 076 **
** attempting to mkfs using only test 076 options: -d size=50m -m crc=1 -i sparse **

mkfs.xfs does not accept a repeated sparse argument, so the test
falls back to mkfs defaults (+ sparse).

I checked and xfsprogs 5.3 behaves the same. I did not check newer
xfsprogs, but that seems like a test bug(?)

IOW, unless xfsprogs was changed to be more tolerant of repeated
arguments, maybe nobody is testing xfs/076 with reflink=0 (?)

> etc.)? If so and you can reliably reproduce, I suppose it couldn't hurt
> to try and grab a tracepoint dump of the test when it fails (feel free
> to send directly or upload somewhere as the list may punt it, and please
> also include the dmesg output that goes along with it) and I can see if
> that shows anything helpful.
>
> I think what we want to know initially is what error code we're
> producing (-ENOSPC?) and where it originates, and from there we can
> probably work out how the transaction might be dirty. I'm not sure a
> trace dump will express that conclusively. If you wanted to increase the
> odds of getting some useful information it might be helpful to stick a
> few trace_printk() calls in the various trans cancel error paths out of
> xfs_create() to determine whether it's the inode allocation attempt that
> fails or the subsequent attempt to create the directory entry..
>

Well, the full output is filled with ENOSPC (also in a good run), so
it's probably that, but I will try to get to that failing stack, no
need for all the noisy traces.

Signing off for the day. Hope I will get to it tomorrow.

Thanks,
Amir.

P.S: this is how 076.full ends if it makes any difference:

touch: cannot touch '/media/scratch/offset.21889024/63': No space left on device
touch: cannot touch '/media/scratch/offset.21823488/63': No space left on device
touch: cannot touch '/media/scratch/offset.21757952/63': No space left on device
touch: cannot touch '/media/scratch/offset.21692416/63': No space left on device
touch: cannot touch '/media/scratch/offset.21626880/63': No space left on device
touch: cannot touch '/media/scratch/offset.21561344/63': No space left on device
touch: cannot touch '/media/scratch/offset.21495808/63': No space left on device
touch: cannot touch '/media/scratch/offset.21430272/63': No space left on device
stat: Input/output error
fpunch failed
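
Since the .full output is dominated by ENOSPC messages even in a good
run, one way to surface the interesting tail is to filter those lines
out and look at what remains. The following is only a minimal sketch,
assuming the log format shown above; the function name
non_enospc_lines and the sample text are hypothetical and not part of
fstests.

```python
ENOSPC_MSG = "No space left on device"


def non_enospc_lines(log_text):
    """Split a .full-style log into an ENOSPC count and everything else.

    Returns a tuple (enospc_count, other_lines), where other_lines keeps
    the original order of the remaining non-empty lines.
    """
    enospc_count = 0
    other_lines = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if ENOSPC_MSG in line:
            enospc_count += 1
        else:
            other_lines.append(line)
    return enospc_count, other_lines


if __name__ == "__main__":
    # Hypothetical sample modelled on the tail of 076.full quoted above.
    sample = (
        "touch: cannot touch '/media/scratch/offset.21495808/63': "
        "No space left on device\n"
        "touch: cannot touch '/media/scratch/offset.21430272/63': "
        "No space left on device\n"
        "stat: Input/output error\n"
        "fpunch failed\n"
    )
    count, others = non_enospc_lines(sample)
    print("ENOSPC lines:", count)
    for line in others:
        print("other:", line)
```

On the tail quoted above this would leave only the stat and fpunch
lines, which is where the failure differs from a good run.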