Re: [PATCH 2/2] qla2xxx: Fix missed DMA unmap for aborted cmds

> On Apr 20, 2022, at 7:42 AM, Chesnokov Gleb <Chesnokov.G@xxxxxxxxxx> wrote:
> 
>> Do you have a log showing this error sequence?
> 
> Yes, I have, but the problem is that I use a different target stack, not LIO. So the call trace contains only code from that target stack,
> except for the call to qlt_free_cmd() that triggers the BUG: BUG_ON(cmd->sg_mapped).
> Regardless, I think the problem lies on the qlogic driver side, because the driver is responsible for mapping and unmapping the SG list.

Agreed. I am curious about the test case or steps that trigger this issue in your environment. Sharing your test scenario would be helpful.

> 
>> Can you share more details?
> 
> What I am observing:
> 
> 1) Command processing calls qlt_rdy_to_xfer(), maps sgl and sends a command to the firmware
> 2) Qlogic adapter reset occurs
> 
> qla2xxx [0000:82:00.1]-5003:13: ISP System Error - mbx1=110eh mbx2=10h mbx3=dh mbx4=0h mbx5=8a1h mbx6=0h mbx7=0h.

This message indicates that there was a firmware crash. The Qlogic/Marvell folks should be able to help you capture and save a firmware dump. That dump might give you clues about the cause of the crash.

> qla2xxx [0000:82:00.1]-d01e:13: -> fwdump no buffer

> qla2xxx [0000:82:00.1]-00af:13: Performing ISP error recovery - ha=ffff9dd7d6058000.
> 

> 3) Somehow the command is aborted, which means the command's aborted flag has already been set.
> I think something like this may happen:
> qla2x00_abort_isp_cleanup() --> qla2x00_abort_all_cmds()
> 

I think this is the aftereffect of a firmware crash and the driver is just recovering from that. A good firmware analysis will shed more light on this issue. 

> 4) The target stack calls qlt_abort_cmd(), and since the aborted flag has already been set, this call ends as a multiple abort.
> 
> 5) The target stack calls xmit_response, and since the command has already been aborted, this call starts the code sequence that releases the command and ends in qlt_free_cmd().
> 
> I think I could try to reproduce the problem with the LIO target stack, but my target stack has a special case that leads to a reset of the qlogic adapter (ISP error recovery), and this is an important part of the error sequence. So I think I will not be able to reproduce the problem with LIO until I find out how to trigger a similar adapter reset while it is processing active commands that have already been sent to the firmware.
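
For readers following the five steps quoted above, here is a minimal user-space sketch of the flag handling they describe. It is a toy model under stated assumptions, not the qla2xxx code: every toy_* name is hypothetical, only the sg_mapped and aborted flags are modeled, and the unmap_on_abort switch merely stands in for the idea in the patch subject (unmapping for aborted commands). Where the actual patch performs the unmap is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's command structure.
 * Only the two flags discussed in this thread are modeled. */
struct toy_cmd {
    bool sg_mapped; /* SG list is currently DMA-mapped */
    bool aborted;   /* aborted flag has been set */
};

/* Step 1: the ready-to-transfer path maps the SG list. */
static void toy_rdy_to_xfer(struct toy_cmd *cmd)
{
    cmd->sg_mapped = true;
}

/* Steps 2-3: adapter reset cleanup marks outstanding commands aborted. */
static void toy_abort_all_cmds(struct toy_cmd *cmd)
{
    cmd->aborted = true;
}

/* Unmap only if a mapping is still outstanding. */
static void toy_unmap_sg(struct toy_cmd *cmd)
{
    if (cmd->sg_mapped)
        cmd->sg_mapped = false;
}

/* Step 5: the response path releases an already aborted command.
 * Returns true if freeing is safe, false where the real driver
 * would hit BUG_ON(cmd->sg_mapped) in qlt_free_cmd(). */
static bool toy_xmit_response(struct toy_cmd *cmd, bool unmap_on_abort)
{
    if (cmd->aborted && !unmap_on_abort)
        return !cmd->sg_mapped; /* mapping from step 1 may be left behind */

    toy_unmap_sg(cmd);
    return !cmd->sg_mapped;
}

/* Runs steps 1, 3 and 5 in order on a fresh command. */
static bool toy_run_sequence(bool unmap_on_abort)
{
    struct toy_cmd cmd = { false, false };

    toy_rdy_to_xfer(&cmd);
    toy_abort_all_cmds(&cmd);
    return toy_xmit_response(&cmd, unmap_on_abort);
}
```

In this model, toy_run_sequence(false) reports an unsafe free because the mapping from step 1 is still set when the command is released, while toy_run_sequence(true) unmaps first and reports a safe free.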


Himanshu Madhani	Oracle Linux Engineering




