RE: [issue report] pm8001 issues (was driver crashes with IOMMU enabled)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> >>>   From an earlier mail [0] I got the impression that you tested on
> >>> an arm platform – did you?
> >> Yes, with respect to my previous mail update, at that time got the
> >> chance to load the driver on ARM server/enclosure connected in one of
> >> our tester's arm server after attaching the controller card.
> >> There this error handling issue was observed.
> >>
> >> The card/driver was never tested or validated on ARM server before,
> >> was curious to see the behavior for the first time. Whereas driver
> >> loads smoothly on x86 server.
> >>
> >> Currently busy with some other issues, debugging on ARM server is not
> >> planned for now.
> >>
> >
> > OK, since you do see this same/similar issue with another card on arm
> > then I think that it is safe to assume that it is a driver issue.
> >
> > If you can share the dmesg on the arm machine then at least that would
> > be helpful.
> 
> I notice that UBSAN complains:
> 
>     19.231481]
> ================================================================
> ================
> 
> [   19.239926] UBSAN: shift-out-of-bounds in
> drivers/scsi/pm8001/pm80xx_hwi.c:1743:17
> [   19.247490] shift exponent 32 is too large for 32-bit type 'int'
> [   19.253490] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
> 5.16.0-rc3-00389-g1758b8fcdbf7 #1018
> [   19.261915] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
> RC0 - V1.16.01 03/15/2019
> [   19.270426] Workqueue: events work_for_cpu_fn
> [   19.274777] Call trace:
> [   19.277211]  dump_backtrace+0x0/0x1b0
> [   19.280863]  show_stack+0x1c/0x30
> [   19.284167]  dump_stack_lvl+0x7c/0xa8
> [   19.287818]  dump_stack+0x1c/0x38
> [   19.291121]  ubsan_epilogue+0x10/0x54
> [   19.294771]  __ubsan_handle_shift_out_of_bounds+0x148/0x180
> [   19.300332]  pm80xx_chip_interrupt_enable+0x74/0x19c
> [   19.305287]  pm8001_pci_probe+0xf8c/0x1610
> [   19.309372]  local_pci_probe+0x44/0xb0
> [   19.313112]  work_for_cpu_fn+0x20/0x34
> [   19.316851]  process_one_work+0x224/0x42c
> [   19.320849]  worker_thread+0x204/0x44c
> [   19.324585]  kthread+0x174/0x190
> [   19.327802]  ret_from_fork+0x10/0x20
> [   19.331377] ==========================
> 
> Here's the code:
> static void
> pm80xx_chip_interrupt_enable(struct pm8001_hba_info *pm8001_ha, u8
> vec) { #ifdef PM8001_USE_MSIX
>         u32 mask;
>         mask = (u32)(1 << vec);
> 
>         pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_CLR, (u32)(mask &
> 0xFFFFFFFF));
>         return;
> #endif
>         pm80xx_chip_intx_interrupt_enable(pm8001_ha);
> 
> }
> 
> So vec can be >= 32 now and those interrupts are now used - are we missing
> some operations for the upper bits?

Yes. At first look like we are missing it 

#define MSGU_ODMR_CLR 0x38
#define MSGU_ODMR_CLR_U 0x3C

0x38
Address offset 0x38 - bits 31:0
Address offset 0x3C - bits 63:32

The same analogy applies to these
registers too 
#define MSGU_ODMR 0x30
#define MSGU_ODMR_U 0x34

0x30
Address offset 0x30 - bits 31:0
Address offset 0x34 - bits 63:32

Let me go through the internals first.

> 
> Something else I notice is that pm80xx_set_sas_protocol_timer_config()
> is called before the tags are setup in pm8001_init_ccb_tag(), and this always
> fails silently as no tags are available for the command.

You are right here. 
Currently the code sequence and error handling both are
not proper.

Probe()
rc = PM8001_CHIP_DISP->chip_init(pm8001_ha); {pm80xx_chip_init()}
        ret = pm80xx_set_sas_protocol_timer_config() // error handling
                 rc = pm8001_tag_alloc(pm8001_ha, &tag);
                            void *bitmap = pm8001_ha->tags;
		

rc = pm8001_init_ccb_tag(pm8001_ha, shost, pdev);
	pm8001_ha->tags = kzalloc(ccb_count, GFP_KERNEL);
	if (!pm8001_ha->tags)
		goto err_out;

Will submit the patch for the same.

> 
> I also think that for the tags management, since you use spinlock in alloc,
> spinlock in the free path should also be used, like:
> 
> diff --git a/drivers/scsi/pm8001/pm8001_sas.c
> b/drivers/scsi/pm8001/pm8001_sas.c
> index 83e73009db5c..0a5e5b5f6975 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.c
> +++ b/drivers/scsi/pm8001/pm8001_sas.c
> @@ -65,7 +65,11 @@ static int pm8001_find_tag(struct sas_task *task, u32
> *tag)
>   void pm8001_tag_free(struct pm8001_hba_info *pm8001_ha, u32 tag)
>   {
>         void *bitmap = pm8001_ha->tags;
> -       clear_bit(tag, bitmap);
> +       unsigned long flags;
> +
> +       spin_lock_irqsave(&pm8001_ha->bitmap_lock, flags);
> +       __clear_bit(tag, bitmap);
> +       spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
>   }
> 
>   /**
> @@ -85,7 +89,7 @@ inline int pm8001_tag_alloc(struct pm8001_hba_info
> *pm8001_ha, u32 *tag_out)
>                 spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
>                 return -SAS_QUEUE_FULL;
>         }
> -       set_bit(tag, bitmap);
> +       __set_bit(tag, bitmap);
>         spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
>         *tag_out = tag;
>         return 0;

Diff changes look fine for me here.

> 
> 
> Thanks,
> John




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [SCSI Target Devel]     [Linux SCSI Target Infrastructure]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Linux IIO]     [Samba]     [Device Mapper]

  Powered by Linux