On 8/23/23 14:46, Bart Van Assche wrote:
> On 8/23/23 09:19, Bob Pearson wrote:
>> I have also seen the same hangs in siw. Not as frequently but the same symptoms.
>> About every month or so I take another run at trying to find and fix this bug but
>> I have not succeeded yet. I haven't seen anything that looks like bad behavior from
>> the rxe side but that doesn't prove anything. I also saw these hangs on my system
>> before the WQ patch went in if my memory serves. Our main application for this
>> driver at HPE is Lustre which is a little different than SRP but uses the same
>> general approach with fast MRs. Currently we are finding the driver to be quite stable
>> even under very heavy stress.
>>
>> I would be happy to collaborate with someone (you?) who knows the SRP side well to resolve
>> this hang. I think that is the quickest way to fix this. I have no idea what SRP is waiting for.
>
> Hi Bob,
>
> I cannot reproduce these issues. All SRP tests work reliably on my test setup on
> top of the v6.5-rc7 kernel, whether I use the siw driver or whether I use the
> rdma_rxe driver. Additionally, I do not see any SRP abort messages.

Thank you for this. This is good news.

>
> # uname -a
> Linux opensuse-vm 6.5.0-rc7 #28 SMP PREEMPT_DYNAMIC Wed Aug 23 10:42:35 PDT 2023 x86_64 x86_64 x86_64 GNU/Linux
> # journalctl --since=today | grep 'SRP abort' | wc
>       0       0       0
>
> Since I installed openSUSE Tumbleweed in the VM in which I run kernel tests: if
> you are using a Linux distro that is based on Debian it may include a buggy
> version of multipathd. Last time I ran the SRP tests in a Debian VM I had to
> build multipathd from source - the SRP tests did not work with the Debian version
> of multipathd. The shell script that I use to build and install multipathd is as
> follows (must be run in the multipath-tools source directory):

I run on Ubuntu, which is Debian-based, so perhaps that is the root of the problems
I have been seeing. I'll try to follow your lead here.

Bob

>
> #!/bin/bash
>
> scriptdir="$(dirname "$0")"
>
> if type -p zypper >/dev/null 2>&1; then
>     rpms=(device-mapper-devel libaio-devel libjson-c-devel librados-devel
>           liburcu-devel readline-devel systemd-devel)
>     for p in "${rpms[@]}"; do
>         sudo zypper install -y "$p"
>     done
> elif type -p apt-get >/dev/null 2>&1; then
>     export LIB=/lib
>     sudo apt-get install -y libaio-dev libdevmapper-dev libjson-c-dev librados-dev \
>         libreadline-dev libsystemd-dev liburcu-dev
> fi
>
> git clean -f
> make -s "$@"
> sudo make -s "$@" install
>
> Bart.