 static void __nvmet_req_complete(struct nvmet_req *req, u16 status)
 [...]
 	trace_nvmet_req_complete(req);
-	if (req->ns)
-		nvmet_put_namespace(req->ns);
 	req->ops->queue_response(req);
+	if (ns)
+		nvmet_put_namespace(ns);
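For context, the full completion path after this patch looks roughly like the following; this is paraphrased from drivers/nvme/target/core.c and abridged (the elided lines fill in the CQE), so it may differ slightly from the exact tree under discussion:

static void __nvmet_req_complete(struct nvmet_req *req, u16 status)
{
	struct nvmet_ns *ns = req->ns;	/* sampled before queue_response() */

	if (unlikely(status))
		nvmet_set_error(req, status);

	trace_nvmet_req_complete(req);

	/* After queue_response() the transport owns req again, so req is
	 * not dereferenced past this point; the put uses the local ns. */
	req->ops->queue_response(req);
	if (ns)
		nvmet_put_namespace(ns);
}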
Why did the put change position?
I'm not exactly clear on what was used after free here...
Hi Sagi,
Is my understanding correct that the NVMe target namespace owns the
block device `req` is associated with, and hence that the namespace
reference count must only be dropped after all dereferences of the
`req` pointer have finished?
This is what I found in the NVMe target code:
* nvmet_put_namespace() decreases ns->ref (both helpers involved here
  are sketched below this list).
* Dropping the last ns->ref causes nvmet_destroy_namespace() to be
called. That function completes ns->disable_done.
* nvmet_ns_disable() waits on that completion and calls
nvmet_ns_dev_disable().
* For a block device, nvmet_ns_dev_disable() calls blkdev_put().
* The last blkdev_put() call triggers a disk_release() call.
* disk_release() calls blk_put_queue().
* The last blk_put_queue() call triggers blk_release_queue().
* blk_release_queue() frees struct request_queue.
* blk_mq_complete_request_remote() dereferences req->q.
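The first two steps, paraphrased from drivers/nvme/target/core.c as I
read it (abridged, and possibly differing slightly between trees):

void nvmet_put_namespace(struct nvmet_ns *ns)
{
	percpu_ref_put(&ns->ref);	/* last put -> nvmet_destroy_namespace() */
}

static void nvmet_destroy_namespace(struct percpu_ref *ref)
{
	struct nvmet_ns *ns = container_of(ref, struct nvmet_ns, ref);

	complete(&ns->disable_done);	/* wakes up nvmet_ns_disable() */
}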
The analysis is correct, specifically for loop transport, but it is not
the same request_queue...
blk_mq_complete_request_remote() dereferences the request_queue of the
initiator block device created from the namespace, but
nvmet_ns_dev_disable() puts the backend blockdev, which is a separate
block device with its own request_queue...
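To spell out the distinction (a sketch; the function names are real,
the mapping between them is the claim above):

/*
 * Two distinct request_queues in the loop transport:
 *
 *   initiator side: the nvme block device created on top of the loop
 *   controller; blk_mq_complete_request_remote(rq) runs here and
 *   dereferences rq->q (this queue).
 *
 *   target side: the backend device the nvmet namespace points at;
 *   nvmet_ns_dev_disable() -> blkdev_put() releases
 *   ns->bdev->bd_disk->queue (that queue).
 *
 * Freeing the target-side queue therefore does not, by itself, explain
 * a use-after-free on the initiator-side rq->q.
 */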
Maybe I'm misunderstanding your point?
And specifically, your stack trace references nvmet_execute_io_connect(),
which is not addressed to a namespace... Thus I think that the
request_queue is actually ctrl->connect_q, which is used for
io_connect commands.
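A sketch of why that is, paraphrasing nvmf_connect_io_queue() in
drivers/nvme/host/fabrics.c (abridged; connect_io_queue_sketch() is a
made-up name, and the real submission helper's argument list varies
across kernel versions):

static int connect_io_queue_sketch(struct nvme_ctrl *ctrl, u16 qid)
{
	struct nvme_command cmd = { };

	cmd.connect.opcode = nvme_fabrics_command;
	cmd.connect.fctype = nvme_fabrics_type_connect;
	cmd.connect.qid = cpu_to_le16(qid);

	/*
	 * The real code submits this synchronously via
	 * __nvme_submit_sync_cmd() on ctrl->connect_q, so the host-side
	 * request has rq->q == ctrl->connect_q rather than a namespace
	 * queue.
	 */
	return 0;	/* submission elided in this sketch */
}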
It is unclear to me how this one ended up with a use-after-free...
Maybe we have a stray io_connect command lingering after we tear down
the controller?
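If so, the interleaving would look something like this (the function
names are real; the race itself is only the speculation above):

/*
 *   controller teardown                  stray io_connect completion
 *   -------------------                  ---------------------------
 *   destroy ctrl->connect_q
 *     ...
 *     last blk_put_queue() frees
 *     the request_queue
 *                                        blk_mq_complete_request_remote(rq)
 *                                          dereferences rq->q, which is
 *                                          the already-freed connect_q
 */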