Re: Stuck IOs with dm-integrity + md raid1 + dm-thin

On Fri, 1 Dec 2023, David Teigland wrote:
> On Thu, Nov 30, 2023 at 04:40:00PM -0800, Eric Wheeler wrote:
> > 	integritysetup format $idev0 --integrity xxhash64 --batch-mode
> > 	integritysetup format $idev1 --integrity xxhash64 --batch-mode
> > 
> > 	integritysetup open --integrity xxhash64 --allow-discards $idev0 dm-integrity0
> > 	integritysetup open --integrity xxhash64 --allow-discards $idev1 dm-integrity1
> > 
> > 	mdadm --create $MD_DEV --metadata=1.2 --assume-clean --level=1 --raid-devices=2 /dev/mapper/dm-integrity[01]
> > 
> > 	# 1. This should be enough to trigger it:
> > 	SSD_DEV=$MD_DEV
> > 
> > 	# 2. If not, then wrap /dev/md9 in a linear target:
> > 	#linear_add ssd $MD_DEV
> > 	#SSD_DEV=/dev/mapper/ssd
> 
> Interesting that just adding a linear layer there would have some effect.

It seemed (sometimes) to be the difference between triggering on the 
first iteration instead of the second.  Without the linear wrap, it would 
sometimes not crash until the second loop, typically on `lvchange -an 
testvg`.  Maybe the extra linear hop delayed the IO just enough to hit 
the race, but that is only speculation.
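For reference, the linear wrap is just a 1:1 pass-through dm table; a
minimal sketch, using /dev/md9 from the earlier script:

	# dm table line: "<start> <length> linear <device> <offset>", all in 512-byte sectors
	SIZE=`blockdev --getsize /dev/md9`
	echo "0 $SIZE linear /dev/md9 0" | dmsetup create ssd
	# IO to /dev/mapper/ssd then passes straight through to the md device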

> > 	# Create a writable header for the PV meta:
> > 	dd if=/dev/zero bs=1M count=16 oflag=direct of=/tmp/pvheader
> > 	loop=`losetup -f --show /tmp/pvheader`
> > 	linear_add pv $loop /dev/nullb0
> > 
> > 	# Create the VG
> > 	lvmdevices --adddev $SSD_DEV
> > 	lvmdevices --adddev /dev/mapper/pv
> > 	vgcreate $VGNAME /dev/mapper/pv $SSD_DEV
> > 
> > 	# Create the pool:
> > 	lvcreate -n pool0 -L 1T $VGNAME /dev/mapper/pv
> > 	lvcreate -n meta0 -L 512m $VGNAME $SSD_DEV
> > 
> > 	# Make sure the meta volume is on the SSD (it should be already from above):
> > 	pvmove -n meta0 /dev/mapper/pv
> 
> I'd omit that pvmove if possible just in case it makes some unexpected
> change.  You have more than enough layers to complicate things as it is
> without pvmove adding dm-mirror to the mix.

Good point.  Since we specify the PV with `lvcreate ... $SSD_DEV`, the 
pvmove isn't necessary at all.  (Specifying the PV on lvcreate was added 
later; the pvmove was left over from earlier testing.)
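
If the placement ever needs double-checking, a quick listing should do 
it without pulling dm-mirror into the stack (just a sketch, not part of 
the repro):

	# show which PV(s) each LV actually sits on
	lvs -a -o lv_name,devices $VGNAME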

> 
> > 	lvconvert -y --force --force --chunksize 64k --type thin-pool --poolmetadata $VGNAME/meta0 $VGNAME/pool0
> 
> It's not a bad idea to use mdraid over dm-integrity, but it would be
> interesting to know if doing raid+integrity in lvm would have the same
> problems. e.g.
> 
> lvcreate --type raid1 --raidintegrity y -m1 -L 512m -n meta0 $vg /dev/ram[01]
> lvcreate -n pool0 -L 1T $vg /dev/nullb0
> lvconvert --type thin-pool --poolmetadata meta0 $vg/pool0

Good idea!  As it turns out, that crashes just as easily.  Here is the 
script with LVM-based RAID1, which is somewhat simpler:

--------------------------------------------------------------------
#!/bin/bash

# Notice: /dev/ram0 and /dev/ram1 will be wiped unconditionally.

# Configure these if you need to:
VGNAME=testvg
LVNAME=thin
LVSIZE=$((10 * 1024*1024*1024/512))	# thin LV virtual size
FIOTIME=60				# seconds each fio job runs

echo "NOTICE: THIS MAY BE UNSAFE. ONLY RUN THIS IN A TEST ENVIRONMENT!"
echo "Press enter twice to continue or CTRL-C to abort."

read
read

set -x


# append disks into a linear target
linear_add()
{
	name=$1
	shift

	offset=0
	for vol in "$@"; do
		size=`blockdev --getsize $vol`
		echo "$offset $size linear $vol 0"
		offset=$(($offset + $size))
	done \
		| dmsetup create $name

	echo /dev/mapper/$name
}

lvthin_add()
{
	id=$1
	lvcreate -An -V $LVSIZE -n $LVNAME$id --thinpool pool0 $VGNAME >&2
	echo /dev/$VGNAME/$LVNAME$id
}

lvthin_snapshot()
{
	origin=$1
	id=$2

	lvcreate -An -s $VGNAME/$LVNAME$origin -n $LVNAME$id >&2
	echo /dev/$VGNAME/$LVNAME$id
}

fio()
{
	dev=$1
	/bin/fio --name=$dev --rw=randrw --direct=1 --bs=512 --numjobs=1 --filename=$dev --time_based --runtime=$FIOTIME --ioengine=libaio --iodepth=1 &> /dev/null
}

do_reset()
{
	killall -9 fio
	lvchange -an $VGNAME
	rmdir /dev/$VGNAME
	dmsetup remove pv
	lvmdevices --deldev /dev/mapper/pv
	losetup -d /dev/loop?
	rmmod brd
	rmmod null_blk
	echo ==== reset done
	sleep 1
}

do_init()
{
	modprobe null_blk gb=30000 bs=512

	ramsize_gb=1
	modprobe brd rd_size=$(($ramsize_gb * 1024*1024)) rd_nr=2

	# Create a writable header for the PV meta:
	dd if=/dev/zero bs=1M count=16 oflag=direct of=/tmp/pvheader
	loop=`losetup -f --show /tmp/pvheader`
	linear_add pv $loop /dev/nullb0

	# Create the VG
	lvmdevices --adddev /dev/ram0
	lvmdevices --adddev /dev/ram1
	lvmdevices --adddev /dev/mapper/pv
	vgcreate $VGNAME /dev/mapper/pv /dev/ram[01]

	# Create the pool:
	lvcreate -n pool0 -L 1T $VGNAME /dev/mapper/pv
	lvcreate --type raid1 --raidintegrity y -m1 -L 512m -n meta0 $VGNAME /dev/ram[01]

	lvconvert -y --force --force --chunksize 64k --type thin-pool --poolmetadata $VGNAME/meta0 $VGNAME/pool0
}


while true; do
	do_reset
	do_init

	thin1=`lvthin_add 1`
	fio $thin1 &

	thin2=`lvthin_snapshot 1 2`
	fio $thin2 &

	wait

done




