On 26/11/2019 15:14, Jens Axboe wrote:
> On 11/26/19 2:14 AM, Hannes Reinecke wrote:
>> Instead of allocating the tag bitmap in place we should be using a
>> pointer. This is in preparation for shared host-wide bitmaps.
>
> Not a huge fan of this, it's an extra indirection in the hot path
> of both submission and completion.

Hi Jens,
Thanks for having a look.
I checked the disassembly of blk_mq_get_tag() as a sample, since I
assume it is one of the hot path functions you care about. The cost of
the indirection is a load instruction in place of an add, marked with
*** below:
Before:
static inline struct blk_mq_tags *blk_mq_tags_from_data(struct
blk_mq_alloc_data *data)
{
if (data->flags & BLK_MQ_REQ_INTERNAL)
return data->hctx->sched_tags;
6ac: a9554c64 ldp x4, x19, [x3, #336]
return data->hctx->tags;
6b0: f27e003f tst x1, #0x4
6b4: f9003ba0 str x0, [x29, #112]
6b8: a9078ba2 stp x2, x2, [x29, #120]
6bc: 9a841273 csel x19, x19, x4, ne // ne = any
if (data->flags & BLK_MQ_REQ_RESERVED) {
6c0: 36081021 tbz w1, #1, 8c4 <blk_mq_get_tag+0x264>
if (unlikely(!tags->nr_reserved_tags)) {
6c4: b9400660 ldr w0, [x19, #4]
6d4: f90027ba str x26, [x29, #72]
tag_offset = 0;
6c8: 52800018 mov w24, #0x0 // #0
bt = &tags->breserved_tags;
6cc: 91014273 add x19, x19, #0x50 ***
if (unlikely(!tags->nr_reserved_tags)) {
6d0: 340012e0 cbz w0, 92c <blk_mq_get_tag+0x2cc>
tag = __blk_mq_get_tag(data, bt);
6d8: aa1303e1 mov x1, x19
6dc: aa1403e0 mov x0, x20
6e0: 97fffe92 bl 128 <__blk_mq_get_tag>
After:
static inline struct blk_mq_tags *blk_mq_tags_from_data(struct
blk_mq_alloc_data *data)
{
if (data->flags & BLK_MQ_REQ_INTERNAL)
return data->hctx->sched_tags;
6b4: a9550004 ldp x4, x0, [x0, #336]
return data->hctx->tags;
6ac: f27e005f tst x2, #0x4
6b0: f9003ba1 str x1, [x29, #112]
return data->hctx->sched_tags;
6b8: a9078fa3 stp x3, x3, [x29, #120]
return data->hctx->tags;
6bc: 9a841000 csel x0, x0, x4, ne // ne = any
if (data->flags & BLK_MQ_REQ_RESERVED) {
6c0: 36080fa2 tbz w2, #1, 8b4 <blk_mq_get_tag+0x254>
if (unlikely(!tags->nr_reserved_tags)) {
6c4: b9400401 ldr w1, [x0, #4]
6cc: f90027ba str x26, [x29, #72]
tag_offset = 0;
6d0: 52800017 mov w23, #0x0 // #0
bt = tags->breserved_tags;
6c8: 340012a1 cbz w1, 91c <blk_mq_get_tag+0x2bc>
6d4: f9400c14 ldr x20, [x0, #24] ***
tag = __blk_mq_get_tag(data, bt);
6d8: aa1303e0 mov x0, x19
6dc: aa1403e1 mov x1, x20
6e0: 97fffe92 bl 128 <__blk_mq_get_tag>
This is arm64 disassembly.
I'm just saying this to provide some illustration of the potential
performance impact of this change.
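
For anyone who wants to reproduce the add-versus-load difference outside
the kernel, here is a minimal standalone sketch in C. The demo_* types
and the bt_embedded()/bt_pointer()/demo_check() helpers are made up for
illustration and are not the kernel definitions; only the member name
breserved_tags is borrowed from the listings above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bitmap type; not the kernel's. */
struct demo_bitmap {
	unsigned long word;
};

/* "Before" layout: the bitmap lives inside the tags structure. */
struct demo_tags_embedded {
	unsigned int nr_tags;
	unsigned int nr_reserved_tags;
	struct demo_bitmap breserved_tags;
};

/* "After" layout: the tags structure only points at the bitmap. */
struct demo_tags_pointer {
	unsigned int nr_tags;
	unsigned int nr_reserved_tags;
	struct demo_bitmap *breserved_tags;
};

/* Pure address arithmetic: tags + offsetof(..., breserved_tags). */
static struct demo_bitmap *bt_embedded(struct demo_tags_embedded *tags)
{
	return &tags->breserved_tags;
}

/* Needs a memory read to fetch the pointer stored in the structure. */
static struct demo_bitmap *bt_pointer(struct demo_tags_pointer *tags)
{
	return tags->breserved_tags;
}

/* Returns 1 if both accessors hand back the bitmap we expect. */
int demo_check(void)
{
	struct demo_bitmap shared = { .word = 0x5 };
	struct demo_tags_embedded e = {
		.nr_tags = 32,
		.nr_reserved_tags = 1,
		.breserved_tags = { .word = 0x3 },
	};
	struct demo_tags_pointer p = {
		.nr_tags = 32,
		.nr_reserved_tags = 1,
		.breserved_tags = &shared,
	};

	/* Embedded case: result is the struct address plus a fixed offset. */
	assert((char *)bt_embedded(&e) ==
	       (char *)&e + offsetof(struct demo_tags_embedded, breserved_tags));

	return bt_embedded(&e)->word == 0x3 && bt_pointer(&p) == &shared;
}
```

Compiling the two accessors with optimisation and comparing the output
should show the same pattern, an address computation for the embedded
layout and a load for the pointer layout, although the exact
instructions will depend on the compiler and target.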
Thanks,
John