On 4/18/19 8:02 AM, Hannes Reinecke wrote:
On 4/18/19 4:34 PM, Ming Lei wrote:
Hi Hannes,
On Thu, Apr 18, 2019 at 04:06:32PM +0200, Hannes Reinecke wrote:
When calling blk_queue_split() it will be using the per-queue
bioset to allocate the split bio from. However, blk_steal_bios()
might move the bio to another queue, _and_ the original queue
might be removed completely (nvme is especially prone to do so).
Could you explain a bit how the original queue is removed in case
that blk_steal_bios() is involved?
It's not blk_steal_bios() which removes the queue. What happens is:
- bio returns with error
- blk_steal_bios() moves bio over to a different queue
- nvme_reset_ctrl() is called
- Error detection finds that the original device is gone
- nvme calls nvme_remove_ns()
- nvme_remove_ns() removes the original request queue alongside the
bio_set from which the bvecs have been allocated from.
- 'stolen' bio is completed
- 'stolen' bio calls bio_endio()
- bio_endio() calls mempool_free() on the bvecs, referencing the mempool
from the original queue
- crash
That leaves the bvecs of the split bio with a missing / destroyed
mempool, and a really fun crash in bio_endio().
per-queue bioset is used originally for avoiding deadlock, are you
sure the static bioset is safe?
If that turns out be be an issue we could be having a per 'ns_head'
bio_set for allocating the split bio from.
But the main point is that we cannot use the bioset from the request
queue as the queue (and the bioset) might be removed during the lifetime
of the bio.
Does this mean that as-is, this is safe because nvme (currently) is the
only consumer of blk_steal_bios()?
(Sorry if this is a lame question, I'm in the process of learning this
code.)
TIA,
Ed