Re: dm-mq and end_clone_request()

----- Original Message -----
> From: "Mike Snitzer" <snitzer@xxxxxxxxxx>
> To: "Laurence Oberman" <loberman@xxxxxxxxxx>
> Cc: "Bart Van Assche" <bart.vanassche@xxxxxxxxxxx>, dm-devel@xxxxxxxxxx, linux-scsi@xxxxxxxxxxxxxxx
> Sent: Tuesday, August 2, 2016 10:10:12 PM
> Subject: Re: dm-mq and end_clone_request()
> 
> On Tue, Aug 02 2016 at  9:33pm -0400,
> Laurence Oberman <loberman@xxxxxxxxxx> wrote:
> 
> > Hi Bart
> > 
> > I simplified the test to 2 simple scripts and only running against one XFS
> > file system.
> > Can you validate these and tell me if it's enough to emulate what you are
> > doing?
> > Perhaps our test-suite is too simple.
> > 
> > Start the test
> > 
> > # cat run_test.sh
> > #!/bin/bash
> > logger "Starting Bart's test"
> > #for i in `seq 1 10`
> > for i in 1
> > do
> > 	fio --verify=md5 -rw=randwrite --size=10M --bs=4K --loops=$((10**6)) \
> >         --iodepth=64 --group_reporting --sync=1 --direct=1 --ioengine=libaio \
> >         --directory="/data-$i" --name=data-integrity-test --thread --numjobs=16 \
> >         --runtime=600 --output=fio-output.txt >/dev/null &
> > done
> > 
> > Delete the host, I wait 10s in between host deletions.
> > But I also tested with 3s and it's still stable with Mike's patches.
> > 
> > #!/bin/bash
> > for i in /sys/class/srp_remote_ports/*
> > do
> >  echo "Deleting host $i, it will re-connect via srp_daemon"
> >  echo 1 > $i/delete
> >  sleep 10
> > done
> > 
> > Check for I/O errors affecting XFS and we now have none with the patches
> > Mike provided.
> > After recovery I can create files in the xfs mount with no issues.
> > 
> > Can you use my scripts and 1 mount and see if it still fails for you.
> 
> In parallel we can try Bart's testsuite that he shared earlier in this
> thread: https://github.com/bvanassche/srp-test
> 
> README.md says:
> "Running these tests manually is tedious. Hence this test suite that
> tests the SRP initiator and target drivers by loading both drivers on
> the same server, by logging in using the IB loopback functionality and
> by sending I/O through the SRP initiator driver to a RAM disk exported
> by the SRP target driver."
> 
> This could explain why Bart is still seeing issues.  He isn't testing
> real hardware -- as such he is using ramdisk to expose races, etc.
> 
> Mike
> 

Hi Mike,

I looked at Bart's scripts; they looked fine, but I wanted a simpler way to bring the error out.
Using a ramdisk is not uncommon as an LIO backend via ib_srpt to serve LUNs.
That is the same way I do it when I am not connected to a large array, as it is the only way I can get EDR-like speeds.

I don't think it's racing due to the ramdisk back-end, but maybe we need to ramp ours up to run more in parallel in a loop.

I will run 21 parallel runs and see if it makes a difference tonight and report back tomorrow.
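
For reference, a minimal sketch of how such parallel runs could be launched. It assumes "parallel runs" means one fio instance per mount point, /data-1 through /data-N; those extra mount points, the per-run output file names, and the DRY_RUN switch are assumptions for illustration and are not part of the scripts quoted above.

```shell
#!/bin/bash
# Sketch only. Assumptions not taken from the thread: mount points
# /data-1 .. /data-N all exist, each run writes its own output file, and
# DRY_RUN=1 prints the fio command lines instead of executing them.

run_one() {
    idx=$1
    # Same fio options as run_test.sh, with the mount point and the
    # output file name parameterized by the run index.
    set -- fio --verify=md5 --rw=randwrite --size=10M --bs=4K \
        --loops=1000000 --iodepth=64 --group_reporting --sync=1 \
        --direct=1 --ioengine=libaio --directory="/data-$idx" \
        --name=data-integrity-test --thread --numjobs=16 \
        --runtime=600 --output="fio-output-$idx.txt"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"                 # dry run: show the command only
    else
        "$@" >/dev/null &         # real run: start fio in the background
    fi
}

run_parallel() {
    n=$1
    for i in $(seq 1 "$n"); do
        run_one "$i"
    done
    wait    # block until all background fio runs have exited
}
```

Setting DRY_RUN=1 and calling `run_parallel 21` prints the 21 command lines for review; with DRY_RUN unset, the same call starts the runs and waits for them to finish.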
Clearly, prior to your final patches we were escaping back to the FS layer with errors, but since your patches, at least in our test harness, that is resolved.
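
One way to make the "no errors reached the FS layer" check repeatable is to count suspicious kernel log lines after each run. This is a rough sketch: the match patterns are assumptions about what the relevant messages look like and would need tuning for a given kernel, and the function name is hypothetical.

```shell
#!/bin/bash
# Sketch only. The patterns below are assumptions: any line containing
# "I/O error", or an XFS line mentioning an error or a shutdown, is counted.

count_io_errors() {
    # Reads log text on stdin and prints the number of matching lines.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -Eic 'I/O error|XFS.*(error|shut ?down)' || true
}
```

Typical use would be `dmesg | count_io_errors` after the hosts have reconnected; a result of 0 is consistent with no errors having escaped to the file system.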

Thanks
Laurence

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel


