On Tue, 2019-09-17 at 10:59 -0700, Bart Van Assche wrote:
> On 9/12/19 12:48 PM, Laurence Oberman wrote:
> > My usual 3 month SRP test results show all is still well with SRP
> > client drivers and multipath.
> > I am still using 4.16 for the ib_srpt on the target server.
> >
> > 5.3-rc8 ib_srp CX4 100Gbit EDR tests
> > direct and unbuffered, large and small I/O sizes
> > port recovery with fault injection
> >
> > One small observation was that after fault injection it seemed to
> > take longer to log back in, in that I needed to extend my sleep in
> > the injection script to avoid some multipaths lose all paths.
> >
> > I was sleeping 30s between resets prior to this and I would log
> > back in quick enough to not lose all paths.
> > My sleep is now 60s
> >
> > #on ibclient server in /sys/class/srp_remote_ports, using
> > #echo 1 > delete for the particular port will simulate a port reset.
> >
> > #/sys/class/srp_remote_ports
> > #[root@ibclient srp_remote_ports]# ls
> > #port-1:1  port-2:1
> > for d in /sys/class/srp_remote_ports/*
> > do
> >         echo 1 > $d/delete
> >         sleep 60
> > done
>
> Hi Laurence,
>
> This is weird. Has this behavior change been observed once or has it
> been observed multiple times? I'm asking because in my tests I noticed
> that there can be variation between tests depending on how much time
> the SCSI error handler spends in its error recovery strategy.
>
> Thanks,
>
> Bart.

Hi Bart,

Well, the tests have been at 30s for quite a while, and 30s used to be
long enough to get by in my tests.
I would have to go back a bit to see when I started seeing these longer
delays, but to be honest I fully expect it to be related to EH changes.

When I started getting hard errors I realized I was now taking the
second port out before the first had recovered.

Just letting you know in case you have to change your blktests etc.

Thanks
Laurence
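
A fixed sleep has to be retuned whenever recovery time changes, as happened going from 30s to 60s here. A minimal sketch of a polling variant of the injection loop follows. It is only an illustration under assumptions: that a completed re-login shows up again as an entry under /sys/class/srp_remote_ports, and that a short settle delay is enough for the deleted port to disappear before polling starts. It does not check multipath path state. The function names (count_ports, wait_for_ports, inject_faults) and the settle/timeout parameters are hypothetical, not part of any existing script.

```shell
#!/bin/bash
# Sketch only: poll for re-login instead of sleeping a fixed time.
# Assumption: a recovered login reappears as an entry in the ports directory.

# count_ports DIR: print the number of entries in DIR.
count_ports() {
    local n=0 p
    for p in "$1"/*; do
        [ -e "$p" ] && n=$((n + 1))
    done
    echo "$n"
}

# wait_for_ports DIR EXPECTED TIMEOUT_SECONDS
# Returns 0 once DIR holds at least EXPECTED entries, 1 on timeout.
wait_for_ports() {
    local dir=$1 expected=$2 timeout=$3 waited=0
    while [ "$(count_ports "$dir")" -lt "$expected" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# inject_faults DIR SETTLE_SECONDS TIMEOUT_SECONDS
# Reset each port in turn, and do not touch the next port until the
# number of ports is back to what it was before the first reset.
inject_faults() {
    local dir=$1 settle=$2 timeout=$3 expected d
    expected=$(count_ports "$dir")
    for d in "$dir"/*; do
        [ -e "$d/delete" ] || continue
        echo 1 > "$d/delete"
        # Give the deleted port time to disappear before polling.
        sleep "$settle"
        if ! wait_for_ports "$dir" "$expected" "$timeout"; then
            echo "timed out waiting for re-login after $d" >&2
            return 1
        fi
    done
}
```

On the client this could be run as `inject_faults /sys/class/srp_remote_ports 5 120`, where the 5s settle and 120s timeout are placeholder values rather than measured ones. A timeout then reports a slow recovery directly instead of showing up later as a multipath device that lost all paths.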