Re: [PATCH RFC v7 10/12] megaraid_sas: switch fusion adapters to MQ

Ming Lei <ming.lei@xxxxxxxxxx> · Tue, 28 Jul 2020 16:45:11 +0800

On Tue, Jul 28, 2020 at 08:54:27AM +0100, John Garry wrote:
> On 24/07/2020 03:47, Ming Lei wrote:
> > On Thu, Jul 23, 2020 at 06:29:01PM +0100, John Garry wrote:
> > > > > As I see, since megaraid will have 1:1 mapping of CPU to hw queue, will
> > > > > there only ever possibly a single bit set in ctx_map? If so, it seems a
> > > > > waste to always check every sbitmap map. But adding logic for this may
> > > > > negate any possible gains.
> > > > 
> > > > It really depends on min and max cpu id in the map, then sbitmap
> > > > depth can be reduced to (max - min + 1). I'd suggest to double check that
> > > > cost of sbitmap_any_bit_set() really matters.
> > > 
> > > Hi Ming,
> > > 
> > > I'm not sure that reducing the search range would help much, as we still
> > > need to load some indexes of map[], and at best this may be reduced from 2/3
> > > -> 1 elements, depending on nr_cpus.
> > 
> > I believe you misunderstood my idea, and you have to think it from implementation
> > viewpoint.
> > 
> > The only workable way is to store the min cpu id as 'offset' and set the sbitmap
> > depth as (max - min + 1), isn't it? Then the actual cpu id can be figured out via
> > 'offset' + nr_bit. And the whole indexes are just spread on the actual depth. BTW,
> > max & min is the max / min cpu id in hctx->cpu_map. So we can improve not only on 1:1,
> > and I guess most of MQ cases can benefit from the change, since it shouldn't be usual
> > for one ctx_map to cover both 0 & nr_cpu_id - 1.
> > 
> > Meantime, we need to allocate the sbitmap dynamically.
> 
> OK, so dynamically allocating the sbitmap could be good. I was thinking
> previously that we still allocate for nr_cpus size, and search a limited
> range - but this would have heavier runtime overhead.
> 
> So if you really think that this may have some value, then let me know, so
> we can look to take it forward.

Forgot to mention: the in-tree code has had this shape for a long
time, please see sbitmap_resize() called from blk_mq_map_swqueue().
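
For reference, the 'offset' + (max - min + 1) scheme quoted above can be
sketched in plain userspace C as below. This is only an illustration under
my own assumptions: struct offset_bitmap and the offset_bitmap_*() helpers
are hypothetical names, and it uses neither the kernel's struct sbitmap
nor any blk-mq code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical type for illustration; not the kernel's struct sbitmap. */
struct offset_bitmap {
	unsigned int offset;	/* smallest CPU id covered by this map */
	unsigned int depth;	/* max_cpu - min_cpu + 1 */
	unsigned int nr_words;	/* words needed to hold 'depth' bits */
	unsigned long *words;	/* allocated dynamically, sized by depth */
};

/* Size the map by the covered CPU range instead of by nr_cpus. */
static int offset_bitmap_init(struct offset_bitmap *ob,
			      unsigned int min_cpu, unsigned int max_cpu)
{
	assert(min_cpu <= max_cpu);
	ob->offset = min_cpu;
	ob->depth = max_cpu - min_cpu + 1;
	ob->nr_words = (ob->depth + WORD_BITS - 1) / WORD_BITS;
	ob->words = calloc(ob->nr_words, sizeof(*ob->words));
	return ob->words ? 0 : -1;
}

/* Mark a CPU as pending; the stored bit is (cpu - offset). */
static void offset_bitmap_set_cpu(struct offset_bitmap *ob, unsigned int cpu)
{
	unsigned int bit = cpu - ob->offset;

	assert(cpu >= ob->offset && bit < ob->depth);
	ob->words[bit / WORD_BITS] |= 1UL << (bit % WORD_BITS);
}

/* Only nr_words words are loaded, however large nr_cpus is. */
static bool offset_bitmap_any_set(const struct offset_bitmap *ob)
{
	for (unsigned int i = 0; i < ob->nr_words; i++)
		if (ob->words[i])
			return true;
	return false;
}

/* Recover the actual CPU id as 'offset' + bit number; -1 if none set. */
static int offset_bitmap_first_cpu(const struct offset_bitmap *ob)
{
	for (unsigned int bit = 0; bit < ob->depth; bit++)
		if (ob->words[bit / WORD_BITS] & (1UL << (bit % WORD_BITS)))
			return (int)(ob->offset + bit);
	return -1;
}

static void offset_bitmap_free(struct offset_bitmap *ob)
{
	free(ob->words);
	ob->words = NULL;
}
```

With a map covering CPUs 64..67, for example, the depth is 4 and the
any-bit-set check touches a single word, while a map sized for CPUs
0..67 would need two words on a 64-bit build.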

Another update: V4 of 'scsi: core: only re-run queue in scsi_end_request()
if device queue is busy' is quite hard to implement since commit b4fd63f42647110c9
("Revert "scsi: core: run queue if SCSI device queue isn't ready and queue is idle"").

Thanks,
Ming