On 13/01/2022 14:17, John Garry wrote:
From an earlier mail [0] I got the impression that you tested on an
arm
platform – did you?
Yes, with respect to my previous mail update, at that time got the
chance to
load the driver on ARM server/enclosure connected in one of our tester's
arm server after attaching the controller card.
There this error handling issue was observed.
The card/driver was never tested or validated on ARM server before,
was curious to see the behavior for the first time. Whereas driver
loads smoothly on x86 server.
Currently busy with some other issues, debugging on ARM server is not
planned for now.
OK, since you do see this same/similar issue with another card on arm
then I think that it is safe to assume that it is a driver issue.
If you can share the dmesg on the arm machine then at least that would
be helpful.
I notice that UBSAN complains:
19.231481]
================================================================================
[ 19.239926] UBSAN: shift-out-of-bounds in
drivers/scsi/pm8001/pm80xx_hwi.c:1743:17
[ 19.247490] shift exponent 32 is too large for 32-bit type 'int'
[ 19.253490] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted
5.16.0-rc3-00389-g1758b8fcdbf7 #1018
[ 19.261915] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI
RC0 - V1.16.01 03/15/2019
[ 19.270426] Workqueue: events work_for_cpu_fn
[ 19.274777] Call trace:
[ 19.277211] dump_backtrace+0x0/0x1b0
[ 19.280863] show_stack+0x1c/0x30
[ 19.284167] dump_stack_lvl+0x7c/0xa8
[ 19.287818] dump_stack+0x1c/0x38
[ 19.291121] ubsan_epilogue+0x10/0x54
[ 19.294771] __ubsan_handle_shift_out_of_bounds+0x148/0x180
[ 19.300332] pm80xx_chip_interrupt_enable+0x74/0x19c
[ 19.305287] pm8001_pci_probe+0xf8c/0x1610
[ 19.309372] local_pci_probe+0x44/0xb0
[ 19.313112] work_for_cpu_fn+0x20/0x34
[ 19.316851] process_one_work+0x224/0x42c
[ 19.320849] worker_thread+0x204/0x44c
[ 19.324585] kthread+0x174/0x190
[ 19.327802] ret_from_fork+0x10/0x20
[ 19.331377] ==========================
Here's the code:
static void
pm80xx_chip_interrupt_enable(struct pm8001_hba_info *pm8001_ha, u8 vec)
{
#ifdef PM8001_USE_MSIX
u32 mask;
mask = (u32)(1 << vec);
pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_CLR, (u32)(mask & 0xFFFFFFFF));
return;
#endif
pm80xx_chip_intx_interrupt_enable(pm8001_ha);
}
So vec can be >= 32 now and those interrupts are now used - are we
missing some operations for the upper bits?
Something else I notice is that pm80xx_set_sas_protocol_timer_config()
is called before the tags are setup in pm8001_init_ccb_tag(), and this
always fails silently as no tags are available for the command.
I also think that for the tags management, since you use spinlock in
alloc, spinlock in the free path should also be used, like:
diff --git a/drivers/scsi/pm8001/pm8001_sas.c
b/drivers/scsi/pm8001/pm8001_sas.c
index 83e73009db5c..0a5e5b5f6975 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -65,7 +65,11 @@ static int pm8001_find_tag(struct sas_task *task, u32
*tag)
void pm8001_tag_free(struct pm8001_hba_info *pm8001_ha, u32 tag)
{
void *bitmap = pm8001_ha->tags;
- clear_bit(tag, bitmap);
+ unsigned long flags;
+
+ spin_lock_irqsave(&pm8001_ha->bitmap_lock, flags);
+ __clear_bit(tag, bitmap);
+ spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
}
/**
@@ -85,7 +89,7 @@ inline int pm8001_tag_alloc(struct pm8001_hba_info
*pm8001_ha, u32 *tag_out)
spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
return -SAS_QUEUE_FULL;
}
- set_bit(tag, bitmap);
+ __set_bit(tag, bitmap);
spin_unlock_irqrestore(&pm8001_ha->bitmap_lock, flags);
*tag_out = tag;
return 0;
Thanks,
John