On 11/26/19 11:08 AM, John Garry wrote: > On 26/11/2019 17:25, Jens Axboe wrote: >> On 11/26/19 10:23 AM, John Garry wrote: >>> On 26/11/2019 17:11, Jens Axboe wrote: >>>> On 11/26/19 9:54 AM, John Garry wrote: >>>>> On 26/11/2019 15:14, Jens Axboe wrote: >>>>>> On 11/26/19 2:14 AM, Hannes Reinecke wrote: >>>>>>> Instead of allocating the tag bitmap in place we should be using a >>>>>>> pointer. This is in preparation for shared host-wide bitmaps. >>>>>> >>>>>> Not a huge fan of this, it's an extra indirection in the hot path >>>>>> of both submission and completion. >>>>> >>>>> Hi Jens, >>>>> >>>>> Thanks for having a look. >>>>> >>>>> I checked the disassembly for blk_mq_get_tag() as a sample - which I >>>>> assume is one hot path function which you care about - and the cost of >>>>> the indirection is a load instruction instead of an add, denoted by ***, >>>>> below: >>>> >>> >>> Hi Jens, >>> >>>> I'm not that worried about an extra instruction, my worry is the extra >>>> load is from different memory. When it's embedded in the struct, we're >>>> on the same cache line or adjacent. >>> >>> Right, so the earlier iteration of this series kept the embedded struct >>> and we simply pointed at that, so I wouldn't expect a caching issue of >>> different memory in that case. >> > > Hi Jens, > >> That would be a much better solution for the common case, my concern >> here is slowing down the fast path for device that don't need shared >> tags. >> >> Would be interesting to check the generated code for that, ideally we'd >> get rid of the extra load for that case, even if it is in the same >> cacheline. >> > > I checked the disassembly and we still have the load instead of the add. > > This is not surprising, as the compiler would not know for certain that > we point to a field within the same struct. But at least we still should > point to a close memory. > > Note that the pointer could be dropped, which would remove the load, but > then we have many if-elses which could be slower, not to mention that > the blk-mq-tag code deals in bitmap pointers anyway. It might still be worthwhile to do: if (tags->ptr == &tags->__default) foo(&tags->__default); to make it clear, as that branch will predict easily. If if can be done in a nice enough fashion and not sprinkled everywhere, in some fashion. Should be testable, though. -- Jens Axboe