On Sat, Nov 17, 2018 at 10:34:18AM +0800, Ming Lei wrote:
> On Fri, Nov 16, 2018 at 06:06:23AM -0800, Greg Kroah-Hartman wrote:
> > On Fri, Nov 16, 2018 at 07:23:11PM +0800, Ming Lei wrote:
> > > Now q->queue_ctx is just one read-mostly table for query the
> > > 'blk_mq_ctx' instance from one cpu index, it isn't necessary
> > > to allocate it as percpu variable. One simple array may be
> > > more efficient.
> >
> > "may be", have you run benchmarks to be sure?  If so, can you add the
> > results of them to this changelog?  If there is no measurable
> > difference, then why make this change at all?
>
> __blk_mq_get_ctx() is used in fast path, what do you think about which
> one is more efficient?
>
> - *per_cpu_ptr(q->queue_ctx, cpu);
>
> - q->queue_ctx[cpu]

You need to actually test to see which one is faster, you might be
surprised :)

In other words, do not just guess.

> At least the latter isn't worse than the former.

How do you know?

> Especially q->queue_ctx is just a read-only look-up table, it doesn't
> make sense to make it percpu any more.
>
> Not mention q->queue_ctx[cpu] is more clean/readable.

Again, please test to verify this.

thanks,

greg k-h