+cc folks from microchips On Wed, Nov 24, 2021 at 1:28 PM John Garry <john.garry@xxxxxxxxxx> wrote: > > Hi, > > When I enable the IOMMU on my arm64 system, the pm8001 driver crashes as > follows: > > [ 8.649365] pm80xx 0000:04:00.0: Adding to iommu group 0 > [ 8.655901] pm80xx 0000:04:00.0: pm80xx: driver version 0.1.40 > [ 8.661755] pm80xx 0000:04:00.0: enabling device (0140 -> 0142) > [ 8.667864] :: pm8001_pci_alloc 530:Setting link rate to default value > [ 9.716548] scsi host0: pm80xx > [ 10.423522] Freeing initrd memory: 413456K > [ 11.693443] Unable to handle kernel paging request at virtual address > ffff0000fcebfb00 > [ 11.701348] Mem abort info: > [ 11.704129] ESR = 0x96000005 > [ 11.707170] EC = 0x25: DABT (current EL), IL = 32 bits > [ 11.712468] SET = 0, FnV = 0 > [ 11.715510] EA = 0, S1PTW = 0 > [ 11.718637] FSC = 0x05: level 1 translation fault > [ 11.723501] Data abort info: > [ 11.726368] ISV = 0, ISS = 0x00000005 > [ 11.730190] CM = 0, WnR = 0 > [ 11.733145] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000013d43000 > [ 11.739832] [ffff0000fcebfb00] pgd=18000a4fffff8003, > p4d=18000a4fffff8003, pud=0000000000000000 > [ 11.748521] Internal error: Oops: 96000005 [#1] PREEMPT SMP > [ 11.754080] Modules linked in: > [ 11.757122] CPU: 1 PID: 7 Comm: kworker/u192:0 Not tainted > 5.16.0-rc2-dirty #102 > [ 11.764505] Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI > RC0 - V1.16.01 03/15/2019 > [ 11.773015] Workqueue: 0000:04:00.0_disco_q sas_discover_domain > [ 11.778926] pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS > BTYPE=--) > [ 11.785874] pc : pm80xx_chip_smp_req+0x2d0/0x3d0 > [ 11.790479] lr : pm80xx_chip_smp_req+0xac/0x3d0 > [ 11.794996] sp : ffff80001258ba60 > [ 11.798297] x29: ffff80001258ba60 x28: ffff0020a2892b50 x27: > ffff0020a2898000 > [ 11.805421] x26: ffff0020a3ee0000 x25: 0000000000000008 x24: > ffff0000fcebfb00 > [ 11.812546] x23: ffff8000113ab6b8 x22: 0000000000000000 x21: > ffff0020a3ed0038 > [ 11.819670] x20: ffff0020a2890000 x19: ffff80001258badc x18: > 00000000fffffffb > [ 11.826794] x17: 0000000000000000 x16: 0000000000000000 x15: > 0000000000000000 > [ 11.833917] x14: 0000000000000000 x13: 0000000000000000 x12: > 0000000000000002 > [ 11.841041] x11: 00000a20098b1000 x10: ffff0020b36515f0 x9 : > 0000000000001000 > [ 11.848165] x8 : 00000a20098b0000 x7 : ffff8000117eb7f0 x6 : > 0000000000000001 > [ 11.855288] x5 : 0000000000000f44 x4 : 0000000000001000 x3 : > 0000000000000000 > [ 11.862412] x2 : ffff8000113ab698 x1 : 0000000000000004 x0 : > ffff8000117eb000 > [ 11.869535] Call trace: > [ 11.871969] pm80xx_chip_smp_req+0x2d0/0x3d0 > [ 11.876226] pm8001_task_exec.constprop.0+0x368/0x520 > [ 11.881266] pm8001_queue_command+0x1c/0x30 > [ 11.885437] smp_execute_task_sg+0xdc/0x204 > [ 11.889607] sas_discover_expander.part.0+0xac/0x6cc > [ 11.894559] sas_discover_root_expander+0x8c/0x150 > [ 11.899337] sas_discover_domain+0x3ac/0x6a0 > [ 11.903594] process_one_work+0x1d0/0x354 > [ 11.907592] worker_thread+0x13c/0x470 > [ 11.911328] kthread+0x17c/0x190 > [ 11.914545] ret_from_fork+0x10/0x20 > [ 11.918110] Code: 371806e1 910006d6 6b16033f 54000249 (38766b05) > [ 11.924192] ---[ end trace b91d59aaee98ea2d ]--- > [ 11.928796] note: kworker/u192:0[7] exited with preempt_count 1 > > > I notice that the driver is calling virt_to_phys() on a dma_addr_t, > which is broken: phys_to_virt you meant. > > static int pm80xx_chip_smp_req(struct pm8001_hba_info *pm8001_ha, > struct pm8001_ccb_info *ccb) > { > char *preq_dma_addr = NULL; > __le64 tmp_addr; > > tmp_addr = cpu_to_le64((u64)sg_dma_address(&task->smp_task.smp_req)); > preq_dma_addr = (char *)phys_to_virt(tmp_addr); > The code was there since the initial support in 2013. f5860992db55 ("[SCSI] pm80xx: Added SPCv/ve specific hardware functionalities and relevant changes in common files") > How is this supposed to work? I assume that someone has enabled the > IOMMU on a system with one of these cards before. I guess it's due to the unaligned access to memory on ARM? AFAIK most of the user are on x86_64. > > I have encountered some other RAID cards which bypasses the IOMMU to > access host memory - is that the case here potentially? I don't know, maybe guys from microchip can answer. > > Thanks, > John Thanks for reporting!