Currently, blk_mq_tagset_iter() iterates over the initial hctx tags only.
When a scheduler is in use, it doesn't iterate over the hctx scheduler tags,
so their static requests aren't updated.
For example, with the NVMe over Fabrics RDMA host, this means we don't
reinit the scheduler requests and thus don't re-register all the memory
regions during tagset re-initialization in the reconnect flow.
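To make the gap concrete, here is a minimal userspace sketch of a reinit pass that walks both the driver tags and the scheduler tags. It is a simplified model, not kernel code: the model_* types and functions are hypothetical, and only the field names tags, sched_tags and static_rqs are borrowed from blk-mq.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a pre-allocated request that owns an MR. */
struct model_rq {
    int mr_registered;
};

/* Hypothetical model of a tag map and its static requests. */
struct model_tags {
    struct model_rq *static_rqs;
    unsigned int nr;
};

/* Hypothetical model of a hardware context. */
struct model_hctx {
    struct model_tags *tags;        /* driver tags */
    struct model_tags *sched_tags;  /* NULL when no scheduler is attached */
};

typedef int (*reinit_fn)(struct model_rq *rq);

/* Call fn on every static request of one tag map; stop on first error. */
static int iter_tags(struct model_tags *t, reinit_fn fn)
{
    if (!t)
        return 0;
    for (unsigned int i = 0; i < t->nr; i++) {
        int ret = fn(&t->static_rqs[i]);
        if (ret)
            return ret;
    }
    return 0;
}

/* Reinit the driver tags first, then the scheduler tags if present. */
int model_hctx_reinit(struct model_hctx *h, reinit_fn fn)
{
    int ret;

    assert(fn != NULL);
    ret = iter_tags(h->tags, fn);
    if (ret)
        return ret;
    return iter_tags(h->sched_tags, fn);
}

/* Stand-in for re-registering the memory region of one request. */
int model_register_mr(struct model_rq *rq)
{
    rq->mr_registered = 1;
    return 0;
}
```

Skipping the second iter_tags() call in this model reproduces the problem described above: the scheduler requests keep their stale registrations.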
I think this is a sign that we should stop embedding memory
regions in the pre-allocated requests. It wastes too many
resources. In our case, tags are not really cheap, given
that each one takes a physical HW resource (an RDMA memory region).
I think we should switch (again) to a pool design instead. I guess it's
time for a generic MR pool that will serve nvmf, xprt, srp, iser and
friends.
Like drivers/infiniband/core/mr_pool.c? :)
Yeah :)
Forgot we had that...
Note that it does introduce a new spinlock into our hot path, but given
the current over-allocation scheme with schedulers, it's probably still
the better option.
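For reference, the trade-off can be sketched in userspace as a free list guarded by a single lock. This is only an illustration under stated assumptions: struct fake_mr, struct mr_pool and the mr_pool_* helpers are hypothetical names, a C11 atomic_flag stands in for the kernel spinlock, and none of it is the actual drivers/infiniband/core/mr_pool.c code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered memory region. */
struct fake_mr {
    int id;
    struct fake_mr *next;   /* free-list link */
};

/* Hypothetical pool: one lock, one free list. */
struct mr_pool {
    atomic_flag lock;       /* plays the role of the hot-path spinlock */
    struct fake_mr *free;   /* singly linked list of idle MRs */
    int in_use;             /* MRs currently handed out */
};

static void pool_lock(struct mr_pool *p)
{
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;   /* spin until the holder releases the lock */
}

static void pool_unlock(struct mr_pool *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Put nr caller-provided MRs on the free list, lowest id first. */
void mr_pool_init(struct mr_pool *p, struct fake_mr *mrs, int nr)
{
    atomic_flag_clear(&p->lock);
    p->free = NULL;
    p->in_use = 0;
    for (int i = nr - 1; i >= 0; i--) {
        mrs[i].id = i;
        mrs[i].next = p->free;
        p->free = &mrs[i];
    }
}

/* Take one MR under the lock; returns NULL when the pool is empty. */
struct fake_mr *mr_pool_get(struct mr_pool *p)
{
    struct fake_mr *mr;

    pool_lock(p);
    mr = p->free;
    if (mr) {
        p->free = mr->next;
        mr->next = NULL;
        p->in_use++;
    }
    pool_unlock(p);
    return mr;
}

/* Return an MR to the head of the free list under the lock. */
void mr_pool_put(struct mr_pool *p, struct fake_mr *mr)
{
    pool_lock(p);
    assert(p->in_use > 0);
    mr->next = p->free;
    p->free = mr;
    p->in_use--;
    pool_unlock(p);
}
```

Every get and put takes the lock, which is the hot-path cost mentioned above. In exchange, the pool can be sized for the I/O that actually needs an MR rather than for every pre-allocated request.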