On 8/20/24 15:50, Shin'ichiro Kawasaki wrote: > The CKI project reported that the test case nvme/052 fails occasionally > with the errors below: > > nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [failed] > runtime ... 22.209s > --- tests/nvme/052.out 2024-07-30 18:38:29.041716566 -0400 > +++ > +/mnt/tests/gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/archive/production/kernel-t\ > ests-production.zip/storage/blktests/nvme/nvme-loop/blktests > +/results/nodev_tr_loop/nvme/052.out.bad 2024-07-30 18:45:35.438067452 -0400 > @@ -1,2 +1,4 @@ > Running nvme/052 > +cat: /sys/block/nvme1n2/uuid: No such file or directory > +cat: /sys/block/nvme1n2/uuid: No such file or directory > Test complete > > The test case repeats creating and removing namespaces. When the test > case removes the namespace by echoing 0 to the sysfs enable file, this > echo write does not wait for the completion of the namespace removal. > Before the removal completes, the test case recreates the namespace. > At this point, the sysfs uuid file for the old namespace still exists. > The test case misunderstands that the the sysfs uuid file would be for > the recreated namespace, and tries to read it. However, the removal > process for the old namespace deletes the sysfs uuid file at this point. > Then the read attempt fails and results in the errors. > > To avoid the failure, wait for the namespace removal before recreating > the namespace. For this purpose, add the new helper function > nvmf_wait_for_ns_removal(). To specify the namespace to wait for, get > the name of the namespace from nvmf_wait_for_ns(), and pass it to > nvmf_wait_for_ns_removal(). > > The test case intends to catch the regression fixed by the kernel commit > ff0ffe5b7c3c ("nvme: fix namespace removal list"). I reverted the commit > from the kernel v6.11-rc4, then confirmed that the test case still can > catch the regression with this change. > > Link: https://lore.kernel.org/linux-block/tczctp5tkr34o3k3f4dlyhuutgp2ycex6gdbjuqx4trn6ewm2i@qbkza3yr5wdd/ > Fixes: 077211a0e9ff ("nvme: add test for creating/deleting file-ns") > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> > --- > Nelay, Yi, thank you for the feedbacks for the discussion > thread at the Link. Here's the formal fix patch. > > tests/nvme/052 | 22 ++++++++++++++++++++-- > 1 file changed, 20 insertions(+), 2 deletions(-) > > diff --git a/tests/nvme/052 b/tests/nvme/052 > index cf6061a..e1ac823 100755 > --- a/tests/nvme/052 > +++ b/tests/nvme/052 > @@ -39,15 +39,32 @@ nvmf_wait_for_ns() { > ns=$(_find_nvme_ns "${uuid}") > done > > + echo "$ns" > return 0 > } > > +nvmf_wait_for_ns_removal() { > + local ns=$1 i > + > + for ((i = 0; i < 10; i++)); do > + if [[ ! -e /dev/$ns ]]; then > + return > + fi > + sleep .1 > + echo "wait removal of $ns" >> "$FULL" > + done > + > + if [[ -e /dev/$ns ]]; then > + echo "Failed to remove the namespace $ns" > + fi > +} > + Under nvmf_wait_for_ns_removal(), instead of checking the existence of "/dev/$ns", how about checking the existence of file "/sys/block/$ns"? As we know, when this issue manifests, we have a stale entry "/sys/block/$ns/$uuid" lurking around from the previous iteration for sometime causing the observed symptom. So I think, we may reuse the _find_nvme_ns() function to wait until the stale "/sys/block/$ns/$uuid" file exists. Maybe something like below: nvmf_wait_for_ns_removal() { local ns local timeout="5" local uuid="$1" ns=$(_find_nvme_ns "${uuid}") start_time=$(date +%s) while [[ ! -z "$ns" ]]; do sleep 1 end_time=$(date +%s) if (( end_time - start_time > timeout )); then echo "namespace with uuid \"${uuid}\" still " \ "not deleted within ${timeout} seconds" return 1 fi echo "Waiting for $ns removal" >> ${FULL} ns=$(_find_nvme_ns "${uuid}") done return 0 } Thanks, --Nilay