On 11/27/19 6:12 AM, Hannes Reinecke wrote:
On 11/27/19 2:05 PM, John Garry wrote:
On 27/11/2019 01:46, Jens Axboe wrote:
Would be interesting to check the generated code for that, ideally we'd
get rid of the extra load for that case, even if it is in the same
cacheline.
I checked the disassembly and we still have the load instead of the add.
This is not surprising, as the compiler would not know for certain that
we point to a field within the same struct. But at least we still should
point to a close memory.
Note that the pointer could be dropped, which would remove the load, but
then we have many if-elses which could be slower, not to mention that
the blk-mq-tag code deals in bitmap pointers anyway.
Hi Jens,
It might still be worthwhile to do:
if (tags->ptr == &tags->__default)
foo(&tags->__default);
to make it clear, as that branch will predict easily.
Not sure. So this code does produce the same assembly, as we still need
to do the tags->ptr load for the comparison.
And then if you consider blk_mq_get_tags() as an example, there is no
other hot value available to indicate whether the shared tags are used
to decide whether to use &tags->__default.
I'll consider it more.
After talking to our compiler folks I guess it should be possible to
resurrect your original patch (with both the embedded bitmap and the
point to the bitmap), linking the pointer to the embedded bitmap in the
non-shared case.
Then we can access the pointer in all cases, but in the non-shared case
that will then point to the embedded one, thus avoiding the possible
cache miss.
I'll see how that goes.
That's exactly what I suggested yesterday, the discussion now is on how
to make that work in the cleanest way.
--
Jens Axboe