On 8/21/24 12:51, Shinichiro Kawasaki wrote: >> Under nvmf_wait_for_ns_removal(), instead of checking the existence of "/dev/$ns", >> how about checking the existence of file "/sys/block/$ns"? As we know, when this issue >> manifests, we have a stale entry "/sys/block/$ns/$uuid" lurking around from the >> previous iteration for sometime causing the observed symptom. So I think, we may reuse the >> _find_nvme_ns() function to wait until the stale "/sys/block/$ns/$uuid" file >> exists. > > It sounds a good idea to reuse _find_nvme_ns(). > >> Maybe something like below: >> >> nvmf_wait_for_ns_removal() { >> local ns >> local timeout="5" >> local uuid="$1" >> >> ns=$(_find_nvme_ns "${uuid}") > > I tried this, and found that the _find_nvme_ns call spits out the failure > "cat: /sys/block/nvme1n2/uuid: No such file or directory", because the > delayed namespace removal can happen here. To suppress the error message, > this line should be, > > ns=$(_find_nvme_ns "${uuid}" 2> /dev/null) > yeah I agree here. We need to suppress this error. >> >> start_time=$(date +%s) >> while [[ ! -z "$ns" ]]; do >> sleep 1 >> end_time=$(date +%s) >> if (( end_time - start_time > timeout )); then >> echo "namespace with uuid \"${uuid}\" still " \ >> "not deleted within ${timeout} seconds" >> return 1 >> fi >> echo "Waiting for $ns removal" >> ${FULL} >> ns=$(_find_nvme_ns "${uuid}") > > Same comment as above. > >> >> done >> >> return 0 >> } > > I found that your nvmf_wait_for_ns_removal() above has certain amount of > duplication with the existing nvmf_wait_for_ns(). To avoid the duplication, > I suggest to reuse nvmf_wait_for_ns() and add a new argument to control wait > target event: namespace 'created' or 'removed'. With this idea, I created the > patch below. I confirmed the patch avoids the failure. > Sounds good! > One drawback of this new patch based on your suggestion is that it extends > execution time of the test case from 20+ seconds to 40+ seconds. In most cases, > the while loop condition check in nvmf_wait_for_ns() is true at the first time, > and false at the the second time. So, nvmf_wait_for_ns() takes 1 second for one > time "sleep 1". Before applying this patch, it took 20+ seconds for 20 times > iteration. After applying the patch, it takes 40+ seconds, since one iteration > calls nvmf_wait_for_ns() twice. So how about to reduce the sleep time from 1 to > 0.1? I tried it and observed that it reduced the runtime from 40+ seconds to 10+ > seconds. > If using sleep of 0.1 second saves execution time then yes this makes sense. > diff --git a/tests/nvme/052 b/tests/nvme/052 > index cf6061a..22e0bf5 100755 > --- a/tests/nvme/052 > +++ b/tests/nvme/052 > @@ -20,23 +20,35 @@ set_conditions() { > _set_nvme_trtype "$@" > } > > +find_nvme_ns() { > + if [[ "$2" == removed ]]; then > + _find_nvme_ns "$1" 2> /dev/null > + else > + _find_nvme_ns "$1" > + fi > +} > + > +# Wait for the namespace with specified uuid to fulfill the specified condtion, > +# "created" or "removed". > nvmf_wait_for_ns() { > local ns > local timeout="5" > local uuid="$1" > + local condition="$2" > > - ns=$(_find_nvme_ns "${uuid}") > + ns=$(find_nvme_ns "${uuid}" "${condition}") > > start_time=$(date +%s) > - while [[ -z "$ns" ]]; do > + while [[ -z "$ns" && "$condition" == created ]] || > + [[ -n "$ns" && "$condition" == removed ]]; do > sleep 1 > end_time=$(date +%s) > if (( end_time - start_time > timeout )); then > echo "namespace with uuid \"${uuid}\" not " \ > - "found within ${timeout} seconds" > + "${condition} within ${timeout} seconds" > return 1 > fi > - ns=$(_find_nvme_ns "${uuid}") > + ns=$(find_nvme_ns "${uuid}" "${condition}") > done > > return 0 > @@ -63,7 +75,7 @@ test() { > _create_nvmet_ns "${def_subsysnqn}" "${i}" "$(_nvme_def_file_path).$i" "${uuid}" > > # wait until async request is processed and ns is created > - nvmf_wait_for_ns "${uuid}" > + nvmf_wait_for_ns "${uuid}" created > if [ $? -eq 1 ]; then > echo "FAIL" > rm "$(_nvme_def_file_path).$i" > @@ -71,6 +83,10 @@ test() { > fi > > _remove_nvmet_ns "${def_subsysnqn}" "${i}" > + > + # wait until async request is processed and ns is removed > + nvmf_wait_for_ns "${uuid}" removed > + As nvme_wait_for_ns() returns either 0 (success) or 1 (failure), I think we should check the return status of nvme_wait_for_ns() here and bail out in case it returns failure same as what we do above while creating namespace. Another point: I think, we may always suppress error from _find_nvme_ns() irrespective of it's being called while "creating" or "removing" the namespace assuming we always check the return status of nvme_wait_for_ns() in the main loop. So ideally we shall invoke _find_nvme_ns() from nvme_wait_for_ns() as below: ns=$(_find_nvme_ns "${uuid}" 2>/dev/null) On a cosmetic note: Maybe we can use readonly constants to make "created" and "removed" parameters looks more elegant/readable. # Define constants readonly NS_ADD="added" readonly NS_DEL="deleted" Now we may reuse above constants instead of "created" and "removed". You may rename constant name if you don't like the name I used above :) Thanks, --Nilay