On Mon, May 03, 2021 at 05:20:53PM -0700, Darrick J. Wong wrote:
> So... I have a machine with an nvme drive manufactured by a certain
> manufacturer who isn't known for the quality of their firmware
> implementation. I'm pretty sure that this is a result of the use of
> fallocate(FALLOC_FL_ZERO_RANGE) to zero the log during format.
>
> If I format a device, mounting and repair both fail because the primary
> superblock UUID doesn't match the log UUID:
.....
> And the format works this time too:
>
> [root@abacus654 ~]# strace -s99 -o /tmp/a mkfs.xfs /dev/nvme0n1 -f
> meta-data=/dev/nvme0n1           isize=512    agcount=6, agsize=268435455 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1, sparse=1, rmapbt=0
>          =                       reflink=1
> data     =                       bsize=4096   blocks=1542990848, imaxpct=5
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> log      =internal log           bsize=4096   blocks=521728, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> Discarding blocks...Done.
> (reverse-i-search)`-n': od -tx1 -Ad -c /tmp/badlog3 | head ^C15
> [root@abacus654 ~]# xfs_repair -n /dev/nvme0n1
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
>         - zero log...
>         - scan filesystem freespace and inode maps...
>         - found root inode chunk
> Phase 3 - for each AG...
>
> In conclusion, the drive firmware is broken.
>
> Question: Should we be doing /some/ kind of re-read after zeroing the
> log to detect these sh*tty firmwares and fall back to a pwrite()?

No, userspace should not have to work around broken hardware. The
kernel needs to blacklist/quirk this device so that it does one of
the following:

a) redirect to a zeroing mechanism that actually works on that
   device; or
b) fail the fallocate() call with -EOPNOTSUPP so that the
   application can fall back to manual zeroing.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
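
For illustration, the application-side fallback in option (b) could look
roughly like the sketch below. It assumes Linux with glibc. The helper
names zero_range() and zero_range_pwrite() and the 64 KiB buffer size are
made up for this example and are not taken from mkfs.xfs or libxfs. Note
that this only helps once the kernel actually reports EOPNOTSUPP for the
device; it does nothing to detect firmware that reports success without
zeroing anything.

```c
#define _GNU_SOURCE             /* exposes fallocate() in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_ZERO_RANGE */
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: zero [offset, offset + len) by writing zero buffers. */
static int zero_range_pwrite(int fd, off_t offset, off_t len)
{
    const size_t bufsz = 64 * 1024;     /* arbitrary chunk size */
    char *buf = calloc(1, bufsz);       /* calloc() returns zeroed memory */
    int saved_errno;

    if (!buf)
        return -1;

    while (len > 0) {
        size_t chunk = (off_t)bufsz < len ? bufsz : (size_t)len;
        ssize_t n = pwrite(fd, buf, chunk, offset);

        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, retry the same chunk */
        if (n <= 0) {
            /* Treat a zero-length write as an I/O error to avoid spinning. */
            saved_errno = n < 0 ? errno : EIO;
            free(buf);
            errno = saved_errno;
            return -1;
        }
        offset += n;                    /* short writes are simply continued */
        len -= n;
    }

    free(buf);
    return 0;
}

/*
 * Hypothetical helper: try FALLOC_FL_ZERO_RANGE first and fall back to
 * manual zeroing only when the kernel says the operation is unsupported.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int zero_range(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, offset, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;                      /* a real error, do not mask it */

    return zero_range_pwrite(fd, offset, len);
}
```

A caller would use zero_range(fd, log_start, log_len) in place of a bare
fallocate() call, where log_start and log_len stand for whatever byte
range needs zeroing. Any error other than EOPNOTSUPP is passed straight
back to the caller rather than being papered over by the fallback.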