On Fri, Nov 16, 2018 at 06:06:23AM -0800, Greg Kroah-Hartman wrote:
> On Fri, Nov 16, 2018 at 07:23:11PM +0800, Ming Lei wrote:
> > Now q->queue_ctx is just one read-mostly table for query the
> > 'blk_mq_ctx' instance from one cpu index, it isn't necessary
> > to allocate it as percpu variable. One simple array may be
> > more efficient.
>
> "may be", have you run benchmarks to be sure? If so, can you add the
> results of them to this changelog? If there is no measurable
> difference, then why make this change at all?

__blk_mq_get_ctx() is used in the fast path. Which of these do you
think is more efficient?

- *per_cpu_ptr(q->queue_ctx, cpu);
- q->queue_ctx[cpu]

At least the latter isn't worse than the former.

In particular, q->queue_ctx is just a read-only look-up table, so it
doesn't make sense to keep it percpu any more. Not to mention that
q->queue_ctx[cpu] is cleaner and more readable.

Thanks,
Ming
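
For illustration, the two lookups compared above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: it models the generic form of per_cpu_ptr() as "base address plus a per-CPU offset read from a table", and all names here (struct ctx, cpu_offset, percpu_area, ctx_table, setup_tables, lookup_percpu, lookup_array, and the NR_CPUS value) are hypothetical stand-ins. Architectures may implement percpu access differently, and the sketch makes no performance claim; it only shows which memory accesses each form involves.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4        /* hypothetical CPU count for the sketch */
#define SLOTS_PER_CPU 8  /* hypothetical size of each CPU's private area */

/* Hypothetical stand-in for struct blk_mq_ctx. */
struct ctx {
	unsigned int cpu;
};

static struct ctx ctxs[NR_CPUS];

/* Percpu model: one private area per CPU, reached via an offset table. */
static struct ctx *percpu_area[NR_CPUS][SLOTS_PER_CPU];
static size_t cpu_offset[NR_CPUS];

/* Array model: one flat table indexed by CPU number. */
static struct ctx *ctx_table[NR_CPUS];

/* Fill both tables so they describe the same cpu -> ctx mapping. */
void setup_tables(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		ctxs[cpu].cpu = cpu;
		cpu_offset[cpu] = cpu * sizeof(percpu_area[0]);
		percpu_area[cpu][0] = &ctxs[cpu];
		ctx_table[cpu] = &ctxs[cpu];
	}
}

/* Models *per_cpu_ptr(table, cpu): load the offset, add it, load the slot. */
struct ctx *lookup_percpu(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	char *base = (char *)percpu_area;
	struct ctx **slot = (struct ctx **)(base + cpu_offset[cpu]);
	return *slot;
}

/* Models table[cpu]: a single indexed load. */
struct ctx *lookup_array(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return ctx_table[cpu];
}
```

After setup_tables() has run, both functions return the same pointer for every CPU index. In this model the percpu form needs one extra dependent load (the offset) before it can read the slot, which is the basis for the "isn't worse" argument above.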