RE: [PATCH] aacraid: reply queue mapping to CPUs based of IRQ affinity


 




-----Original Message-----
From: John Garry <john.g.garry@xxxxxxxxxx> 
Sent: Wednesday, March 29, 2023 12:08 AM
To: Sagar Biradar - C34249 <Sagar.Biradar@xxxxxxxxxxxxx>; Don Brace - C33706 <Don.Brace@xxxxxxxxxxxxx>; Gilbert Wu - C33504 <Gilbert.Wu@xxxxxxxxxxxxx>; linux-scsi@xxxxxxxxxxxxxxx; Martin Petersen <martin.petersen@xxxxxxxxxx>; James Bottomley <jejb@xxxxxxxxxxxxx>; Brian King <brking@xxxxxxxxxxxxxxxxxx>; stable@xxxxxxxxxxxxxxx; Tom White - C33503 <Tom.White@xxxxxxxxxxxxx>
Subject: Re: [PATCH] aacraid: reply queue mapping to CPUs based of IRQ affinity


On 28/03/2023 22:41, Sagar Biradar wrote:
> Fix the IO hang that arises because of MSIx vector not having a mapped 
> online CPU upon receiving completion.

What about if the CPU targeted goes offline while the IO is in-flight?

> This patch sets up a reply queue mapping to CPUs based on the IRQ 
> affinity retrieved using pci_irq_get_affinity() API.
>

blk-mq already does what you want here, including handling for the case I mention above. It maintains a CPU -> HW queue mapping, and using a reply map in the LLD is the old way of doing this.

Could you instead follow the example in commit 664f0dce2058 ("scsi: mpt3sas: Add support for shared host tagset for CPU hotplug") and expose the HW queues to the upper layer? Alternatively, you can check the example of any SCSI driver that sets shost->host_tagset.

Thanks,
John
[Sagar Biradar] 

***What about if the CPU targeted goes offline while the IO is in-flight?
We ran multiple randomized test cases with IOs running in parallel while taking load-bearing CPUs offline. In every run, the load was transferred to the remaining online CPUs successfully.
The same was tested at the vendor and at their customer site, and they did not see any issues either.


***blk-mq already does what you want here, including handling for the case I mention above. It maintains a CPU -> HW queue mapping, and using a reply map in the LLD is the old way of doing this.
We also tried implementing the blk-mq mechanism in the driver, and we saw command timeouts.
The firmware has a limitation of a fixed number of queues per vector, and the blk-mq changes would saturate that limit.
That likely explains the command timeouts.

Also, this is an EOL product and there will be no firmware code changes. Given this, we have decided to stick to the reply_map mechanism.
(https://storage.microsemi.com/en-us/support/series8/index.php)
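
For readers following the thread, the reply_map idea can be sketched in plain userspace C. This is only an illustration of the mapping logic in the patch below, not driver code: the names sketch_dev, sketch_setup_reply_map and sketch_pick_vector are hypothetical, and a 64-bit bitmask per vector stands in for the cpumask that pci_irq_get_affinity() would return (0 here plays the role of a NULL mask).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 64	/* assumption: at most 64 CPUs in this sketch */

/* Hypothetical stand-in for the relevant parts of struct aac_dev. */
struct sketch_dev {
	unsigned int max_msix;			/* number of MSI-X vectors */
	const uint64_t *vector_affinity;	/* bit N set: vector may run on CPU N */
	unsigned int reply_map[SKETCH_MAX_CPUS];	/* CPU -> vector */
};

/*
 * Build the CPU -> vector map from per-vector affinity masks.
 * As in the patch, the loop starts at vector 1, so CPUs that appear
 * in no mask keep the default vector 0. If any vector has no
 * affinity mask, every CPU falls back to vector 0.
 */
static void sketch_setup_reply_map(struct sketch_dev *dev, unsigned int nr_cpus)
{
	unsigned int i, cpu;

	assert(nr_cpus <= SKETCH_MAX_CPUS);

	/* Mirrors the zeroed allocation in the patch. */
	for (cpu = 0; cpu < nr_cpus; cpu++)
		dev->reply_map[cpu] = 0;

	for (i = 1; i < dev->max_msix; i++) {
		uint64_t mask = dev->vector_affinity[i];

		if (!mask)
			goto fallback;

		for (cpu = 0; cpu < nr_cpus; cpu++) {
			if (mask & (UINT64_C(1) << cpu))
				dev->reply_map[cpu] = i;
		}
	}
	return;

fallback:
	for (cpu = 0; cpu < nr_cpus; cpu++)
		dev->reply_map[cpu] = 0;
}

/* Submission path: pick the vector mapped to the submitting CPU. */
static unsigned int sketch_pick_vector(const struct sketch_dev *dev,
				       unsigned int cpu)
{
	return dev->reply_map[cpu];
}
```

With three vectors whose masks cover CPU 0, CPUs 1-2 and CPU 3 respectively, a command submitted on CPU 2 would be steered to vector 1, so its completion interrupt is delivered to a CPU in that vector's affinity mask.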

Thank you for your review comments. We hope you will reconsider the original patch.

Thanks
Sagar

> Reviewed-by: Gilbert Wu <gilbert.wu@xxxxxxxxxxxxx>
> Signed-off-by: Sagar Biradar <Sagar.Biradar@xxxxxxxxxxxxx>
> ---
>   drivers/scsi/aacraid/aacraid.h  |  1 +
>   drivers/scsi/aacraid/comminit.c | 25 +++++++++++++++++++++++++
>   drivers/scsi/aacraid/linit.c    | 11 +++++++++++
>   drivers/scsi/aacraid/src.c      |  2 +-
>   4 files changed, 38 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/aacraid/aacraid.h b/drivers/scsi/aacraid/aacraid.h
> index 5e115e8b2ba4..4a23f9fab61f 100644
> --- a/drivers/scsi/aacraid/aacraid.h
> +++ b/drivers/scsi/aacraid/aacraid.h
> @@ -1678,6 +1678,7 @@ struct aac_dev
>       u32                     handle_pci_error;
>       bool                    init_reset;
>       u8                      soft_reset_support;
> +     unsigned int *reply_map;
>   };
>
>   #define aac_adapter_interrupt(dev) \
> diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
> index bd99c5492b7d..6fc323844a31 100644
> --- a/drivers/scsi/aacraid/comminit.c
> +++ b/drivers/scsi/aacraid/comminit.c
> @@ -33,6 +33,8 @@
>
>   #include "aacraid.h"
>
> +void aac_setup_reply_map(struct aac_dev *dev);
> +
>   struct aac_common aac_config = {
>       .irq_mod = 1
>   };
> @@ -630,6 +632,9 @@ struct aac_dev *aac_init_adapter(struct aac_dev *dev)
>
>       if (aac_is_src(dev))
>               aac_define_int_mode(dev);
> +
> +     aac_setup_reply_map(dev);
> +
>       /*
>        *      Ok now init the communication subsystem
>        */
> @@ -658,3 +663,23 @@ struct aac_dev *aac_init_adapter(struct aac_dev *dev)
>       return dev;
>   }
>
> +void aac_setup_reply_map(struct aac_dev *dev)
> +{
> +     const struct cpumask *mask;
> +     unsigned int i, cpu = 1;
> +
> +     for (i = 1; i < dev->max_msix; i++) {
> +             mask = pci_irq_get_affinity(dev->pdev, i);
> +             if (!mask)
> +                     goto fallback;
> +
> +             for_each_cpu(cpu, mask) {
> +                     dev->reply_map[cpu] = i;
> +             }
> +     }
> +     return;
> +
> +fallback:
> +     for_each_possible_cpu(cpu)
> +             dev->reply_map[cpu] = 0;
> +}
> diff --git a/drivers/scsi/aacraid/linit.c b/drivers/scsi/aacraid/linit.c
> index 5ba5c18b77b4..af60c7d26407 100644
> --- a/drivers/scsi/aacraid/linit.c
> +++ b/drivers/scsi/aacraid/linit.c
> @@ -1668,6 +1668,14 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>               goto out_free_host;
>       }
>
> +     aac->reply_map = kzalloc(sizeof(unsigned int) * nr_cpu_ids,
> +                             GFP_KERNEL);
> +     if (!aac->reply_map) {
> +             error = -ENOMEM;
> +             dev_err(&pdev->dev, "reply_map allocation failed\n");
> +             goto out_free_host;
> +     }
> +
>       spin_lock_init(&aac->fib_lock);
>
>       mutex_init(&aac->ioctl_mutex);
> @@ -1797,6 +1805,8 @@ static int aac_probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
>                                 aac->comm_addr, aac->comm_phys);
>       kfree(aac->queues);
>       aac_adapter_ioremap(aac, 0);
> +     /* By now we should have configured the reply_map */
> +     kfree(aac->reply_map);
>       kfree(aac->fibs);
>       kfree(aac->fsa_dev);
>    out_free_host:
> @@ -1918,6 +1928,7 @@ static void aac_remove_one(struct pci_dev *pdev)
>
>       aac_adapter_ioremap(aac, 0);
>
> +     kfree(aac->reply_map);
>       kfree(aac->fibs);
>       kfree(aac->fsa_dev);
>
> diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
> index 11ef58204e96..e84ec60a655b 100644
> --- a/drivers/scsi/aacraid/src.c
> +++ b/drivers/scsi/aacraid/src.c
> @@ -506,7 +506,7 @@ static int aac_src_deliver_message(struct fib *fib)
>                       && dev->sa_firmware)
>                       vector_no = aac_get_vector(dev);
>               else
> -                     vector_no = fib->vector_no;
> +                     vector_no = dev->reply_map[raw_smp_processor_id()];
>
>               if (native_hba) {
>                       if (fib->flags & FIB_CONTEXT_FLAG_NATIVE_HBA_TMF) {




