On Mon, Jun 10, 2024 at 10:17:42PM +0300, Sagi Grimberg wrote:
> On 10/06/2024 22:15, Keith Busch wrote:
> > On Mon, Jun 10, 2024 at 10:05:00PM +0300, Sagi Grimberg wrote:
> > >
> > > On 10/06/2024 21:53, Keith Busch wrote:
> > > > On Mon, Jun 10, 2024 at 01:21:00PM +0530, Venkat Rao Bagalkote wrote:
> > > > > Issue is introduced by the patch: be647e2c76b27f409cdd520f66c95be888b553a3.
> > > > My mistake. The namespace remove list appears to be getting corrupted
> > > > because I'm using the wrong APIs to replace a "list_move_tail". This is
> > > > fixing the issue on my end:
> > > >
> > > > ---
> > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> > > > index 7c9f91314d366..c667290de5133 100644
> > > > --- a/drivers/nvme/host/core.c
> > > > +++ b/drivers/nvme/host/core.c
> > > > @@ -3959,9 +3959,10 @@ static void nvme_remove_invalid_namespaces(struct nvme_ctrl *ctrl,
> > > >  	mutex_lock(&ctrl->namespaces_lock);
> > > >  	list_for_each_entry_safe(ns, next, &ctrl->namespaces, list) {
> > > > -		if (ns->head->ns_id > nsid)
> > > > -			list_splice_init_rcu(&ns->list, &rm_list,
> > > > -					     synchronize_rcu);
> > > > +		if (ns->head->ns_id > nsid) {
> > > > +			list_del_rcu(&ns->list);
> > > > +			list_add_tail_rcu(&ns->list, &rm_list);
> > > > +		}
> > > >  	}
> > > >  	mutex_unlock(&ctrl->namespaces_lock);
> > > >  	synchronize_srcu(&ctrl->srcu);
> > > > --
> > > Can we add a reproducer for this in blktests? I'm assuming that we can
> > > easily trigger this
> > > with adding/removing nvmet namespaces?
> > I'm testing this with Namespace Management commands, which nvmet doesn't
> > support. You can recreate the issue by detaching the last namespace.
> >
>
> I think the same will happen in a test that creates two namespaces and then
> echo 0 > ns/enable.

Looks like nvme/016 tests this. It's reporting as "passed" on my end, but I
don't think it's actually testing the driver as intended. Still messing
with it.
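
For anyone following along, here is a rough userspace sketch of why the splice call was the wrong replacement for "list_move_tail". It uses plain pointer operations with no RCU and no sync callback, and the helper names (init_list_head, list_del_entry, list_add_tail_entry, list_splice_init_plain, list_count) are hypothetical stand-ins, not the kernel APIs. The point it tries to show: a splice treats its first argument as a list head, so passing a single entry moves everything else on that entry's list, including the original head sentinel, and leaves the entry itself behind.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Unlink a single entry from whatever list it is on. */
static void list_del_entry(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e;
	e->prev = e;
}

/* Link a single entry in just before head, i.e. at the tail. */
static void list_add_tail_entry(struct list_head *e, struct list_head *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/*
 * Splice: treats 'list' as a list HEAD, moves everything linked to it
 * onto the front of 'head', then reinitialises 'list' as empty.
 */
static void list_splice_init_plain(struct list_head *list,
				   struct list_head *head)
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;
	struct list_head *at = head->next;

	if (first == list)
		return;		/* nothing to splice */

	init_list_head(list);
	first->prev = head;
	head->next = first;
	last->next = at;
	at->prev = last;
}

/* Count nodes reachable from head; bounded so a corrupted ring terminates. */
static int list_count(const struct list_head *head)
{
	const struct list_head *p;
	int n = 0;

	for (p = head->next; p != head && n < 64; p = p->next)
		n++;
	return n;
}
```

With three entries a, b, c on a namespaces-style list and an empty remove list, del plus add_tail on c leaves two entries behind and one on the remove list. Splicing &c instead leaves c isolated, and the remove list ends up pointing at the original list head, a, and b.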