On Tue, Jun 23, 2020 at 04:17:30PM -0700, Sagi Grimberg wrote: > > > > > > - len = nvme_process_ns_desc(ctrl, ids, cur); > > > > > + len = nvme_process_ns_desc(ctrl, ids, cur, &csi_seen); > > > > > if (len < 0) > > > > > goto free_data; > > > > > len += sizeof(*cur); > > > > > } > > > > > free_data: > > > > > + if (!status && nvme_multi_css(ctrl) && !csi_seen) { > > > > > > > > We will clear the status if we detect a path error, that is to > > > > avoid needlessly removing the ns for path failures, so you should > > > > check at the goto site. > > > > > > The problem is that this check has to be done after checking all the ns descs, > > > so this check to be done as the final thing, at least after processing all the > > > ns descs. No matter if nvme_process_ns_desc() returned an error, or if > > > simply NVME_NIDT_CSI wasn't part of the ns desc list, so the loop reached the > > > end without error. > > > > > > Even if the nvme command failed and the status was cleared: > > > > > > if (status > 0 && !(status & NVME_SC_DNR)) > > > status = 0; > > > > This check is so weird. What has DNR got to do with whether or not we > > want to continue with this namespace? The commit that adds this says > > it's to check for a host failed IO, but a controller can just as validly > > set DNR in its error status, in which case we'd still want clear the > > status. > > The reason is that if this error is not a DNR error, it means that we > should retry and succeed, we don't want to remove the namespace. And what if it is a DNR error? For example, the controller simply doesn't support this CNS value. While the controller should support it, we definitely don't need it for NVM command set namespaces. > The problem that this solves is the fact that we have various error > recovery conditions (interconnect issues, failures, resets) and the > async works are designed to continue to run, issue I/O and fail. The > scan work, will revalidate namespaces and remove them if it fails to > do so, which is inherently wrong if these are simply inaccessible by > the host at this time. Relying on a specific bit in the status code seems a bit indirect for this kind of condition.