On Tue, Jul 06, 2021 at 05:21:28PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@xxxxxxxxxx> > > Every now and then I see the following failure when running generic/347: > > --- generic/347.out > +++ generic/347.out.bad > @@ -1,2 +1,2 @@ > QA output created by 347 > -=== completed > +failed to create dm thin pool device Note that I have to add space indention to above diff lines, otherwise it seems git am will treat lines after "---" as the beginning of patch, and failed to apply. Thanks, Eryu > > Accompanied by the following dmesg spew: > > device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0 > device-mapper: block manager: superblock validator check failed for block 0 > device-mapper: thin metadata: couldn't read superblock > device-mapper: table: 253:2: thin-pool: Error creating metadata object > device-mapper: ioctl: error adding target to table > > 7016996765293437281 is of course the magic number 0x6161616161616161, > which are stale ondisk contents left behind by previous tests that wrote > known tests patterns to files on the scratch device. This is a bit > surprising, since _dmthin_init supposedly zeroes the first 4k of the > thin pool metadata device before initializing the pool. Or does it? > > dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > > Herein lies the problem: the dd process writes zeroes into the page > cache and exits. Normally the block layer will flush the page cache > after the last file descriptor is closed, but once in a while the > terminating dd process won't be the only process in the system with an > open file descriptor! > > That process is of course udev. The write() call from dd triggers a > kernel uevent, which starts udev. If udev is running particularly > slowly, it'll still be running an instant later when dd terminates, > thereby preventing the page cache flush. If udev is still running a > moment later when we call dmsetup to set up the thin pool, the pool > creation will issue a bio to read the ondisk superblock. This read > isn't coherent with the page cache, so it sees old disk contents and the > test fails even though we supposedly formatted the metadata device. > > Fix this by explicitly flushing the page cache after writing the zeroes. > > Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin") > Signed-off-by: Darrick J. Wong <djwong@xxxxxxxxxx> > --- > common/dmthin | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > > diff --git a/common/dmthin b/common/dmthin > index 3b1c7d45..91147e47 100644 > --- a/common/dmthin > +++ b/common/dmthin > @@ -113,8 +113,12 @@ _dmthin_init() > _dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \ > _fatal "failed to create dm thin data device" > > - # Zap the pool metadata dev > - dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > + # Zap the pool metadata dev. Explicitly fsync the zeroes to disk > + # because a slow-running udev running concurrently with dd can maintain > + # an open file descriptor. The block layer only flushes the page cache > + # on last close, which means that the thin pool creation below will > + # see the (stale) ondisk contents and fail. > + dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null > > # Thin pool > # "start length thin-pool metadata_dev data_dev data_block_size low_water_mark"