On Wed, Nov 18, 2020 at 10:15:19AM +0100, Hannes Reinecke wrote:
> On 11/18/20 8:44 AM, Ming Lei wrote:
> > On Wed, Nov 18, 2020 at 08:15:47AM +0100, Hannes Reinecke wrote:
> > > Hey Ming,
> > >
> > > On 11/18/20 3:35 AM, Ming Lei wrote:
> > > > Hello Kashyap & Sumanesh,
> > > >
> > > > On Mon, Nov 16, 2020 at 07:49:31PM +0800, kernel test robot wrote:
> > > > > Hi Ming,
> > > > >
> > > > > Thank you for the patch! Yet something to improve:
> > > > >
> > > > > [auto build test ERROR on block/for-next]
> > > > > [also build test ERROR on mkp-scsi/for-next scsi/for-next v5.10-rc4 next-20201116]
> > > > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > > > And when submitting patch, we suggest to use '--base' as documented in
> > > > > https://git-scm.com/docs/git-format-patch]
> > > > >
> > > > > url:    https://github.com/0day-ci/linux/commits/Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > > base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
> > > > > config: powerpc64-randconfig-r026-20201116 (attached as .config)
> > > > > compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c044709b8fbea2a9a375e4173a6bd735f6866c0c)
> > > > > reproduce (this is a W=1 build):
> > > > >         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> > > > >         chmod +x ~/bin/make.cross
> > > > >         # install powerpc64 cross compiling tool for clang build
> > > > >         # apt-get install binutils-powerpc64-linux-gnu
> > > > >         # https://github.com/0day-ci/linux/commit/cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >         git remote add linux-review https://github.com/0day-ci/linux
> > > > >         git fetch --no-tags linux-review Ming-Lei/blk-mq-scsi-tracking-device-queue-depth-via-sbitmap/20201116-171449
> > > > >         git checkout cc286ae987be50d7b8e152cc80a5ccaa8682e3ff
> > > > >         # save the attached .config to linux build tree
> > > > >         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
> > > > >
> > > > > If you fix the issue, kindly add following tag as appropriate
> > > > > Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > > > >
> > > > > All errors (new ones prefixed by >>):
> > > > > >> drivers/scsi/megaraid/megaraid_sas_fusion.c:365:41: error: no member named 'device_busy' in 'struct scsi_device'
> > > > >                    sdev_busy = atomic_read(&scmd->device->device_busy);
> > > >
> > > > This new reference to sdev->device_busy was added by the recent shared
> > > > host tag patch, and according to the comment, you may have planned to
> > > > convert it into a megaraid-internal counter:
> > > >
> > > > 	/* TBD - if sml remove device_busy in future, driver
> > > > 	 * should track counter in internal structure.
> > > > 	 */
> > > >
> > > > So can you post a patch? I am happy to fold it into this series.
> > > >
> > > Seeing that we already have the accessor 'scsi_device_busy()', it's
> > > probably easier to just use that and not fiddle with driver internals.
> > > See the attached patch.
> > >
> > > Cheers,
> > >
> > > Hannes
> > > --
> > > Dr. Hannes Reinecke                Kernel Storage Architect
> > > hare@xxxxxxx                              +49 911 74053 688
> > > SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
> > > HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
> >
> > > From d8fa5e61187dbe851b8da9c65a5df5ec5809f8ea Mon Sep 17 00:00:00 2001
> > > From: Hannes Reinecke <hare@xxxxxxx>
> > > Date: Wed, 18 Nov 2020 08:08:41 +0100
> > > Subject: [PATCH] megaraid_sas: use scsi_device_busy() instead of direct
> > >  access to atomic counter
> > >
> > > It's always bad style to access structure internals, especially if
> > > there is an accessor for it. So convert to use scsi_device_busy()
> > > instead of accessing the atomic counter directly.
> > >
> > > Cc: Kashyap Desai <kashyap.desai@xxxxxxxxxxxx>
> > > Cc: Sumanesh Samanta <sumanesh.samanta@xxxxxxxxxxxx>
> > > Signed-off-by: Hannes Reinecke <hare@xxxxxxx>
> > > ---
> > >  drivers/scsi/megaraid/megaraid_sas_fusion.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > index fd607287608e..272ff123bc6b 100644
> > > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
> > > @@ -362,7 +362,7 @@ megasas_get_msix_index(struct megasas_instance *instance,
> > >  	/* TBD - if sml remove device_busy in future, driver
> > >  	 * should track counter in internal structure.
> > >  	 */
> > > -	sdev_busy = atomic_read(&scmd->device->device_busy);
> > > +	sdev_busy = scsi_device_busy(scmd->device);
> >
> > megasas_get_msix_index() is called in the .queuecommand() path, and
> > scsi_device_busy() might take more cycles since it has to iterate over
> > the sbitmap words, especially when the sbitmap depth is high.
> >
> > I'd suggest Kashyap/Sumanesh check if there is a better way to deal
> > with it. If not, scsi_device_busy() should be fine.
> >
> I guess this whole codepath will become obsolete if and when support for
> polling queues / io_uring is implemented for megaraid_sas.

Not sure if it is related to iopoll, which requires host tags. I
understand that if host tags are enabled, the code path for selecting
the MSI-x index should simply be replaced with the following:

	if (instance->host->nr_hw_queues > 1) {
		u32 tag = blk_mq_unique_tag(scmd->request);

		cmd->request_desc->SCSIIO.MSIxIndex =
			blk_mq_unique_tag_to_hwq(tag) +
			instance->low_latency_index_start;
	} else {
		if (instance->perf_mode == MR_BALANCED_PERF_MODE)
			...
		else if (instance->msix_load_balance)
			...
		else
			cmd->request_desc->SCSIIO.MSIxIndex =
				instance->reply_map[raw_smp_processor_id()];
	}

Otherwise there might be a risk of triggering soft lockup in the host
tags case.

sdev->device_busy is only required for MR_BALANCED_PERF_MODE, so your
patch can be adjusted to read the counter only for that mode.
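A rough sketch of that adjustment (untested, just to illustrate the
idea; it keeps the existing branch structure of megasas_get_msix_index()
and elides the unchanged branches):

	if (instance->perf_mode == MR_BALANCED_PERF_MODE) {
		/*
		 * Pay for the sbitmap walk in scsi_device_busy() only
		 * here, where the per-device busy count is actually
		 * consumed; the other modes keep their cheap paths.
		 */
		int sdev_busy = scsi_device_busy(scmd->device);

		/* existing high-IOPS threshold logic, using sdev_busy */
		...
	} else if (instance->msix_load_balance) {
		...
	} else {
		cmd->request_desc->SCSIIO.MSIxIndex =
			instance->reply_map[raw_smp_processor_id()];
	}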
> This whole section deals with spreading the load over several hardware
> queues once the dedicated one is at risk of being congested.
> But this is only required if someone wants to reach high IOPS; so if we
> have poll/io_uring support, there won't be a need for this anymore.

I understand that poll is for low-latency usage at the extra cost of CPU
utilization, and that iopoll can't replace irq-based IO. But I may have
misunderstood your point.

Thanks,
Ming
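For context on the overhead concern raised above: once the device queue
depth is tracked via sbitmap, scsi_device_busy() can no longer be a
single atomic_read(); it has to walk the per-device bitmap and sum the
set bits. A rough sketch of that shape (hypothetical helper name; the
real code in lib/sbitmap.c differs in detail):

	/*
	 * Illustrative only: counting busy commands means visiting
	 * every word of the sbitmap, so the cost grows with the
	 * configured queue depth, unlike a plain atomic_read().
	 */
	static unsigned int sbitmap_busy_count(const struct sbitmap *sb)
	{
		unsigned int i, weight = 0;

		for (i = 0; i < sb->map_nr; i++)
			weight += bitmap_weight(&sb->map[i].word,
						sb->map[i].depth);
		return weight;
	}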