> On Apr 20, 2022, at 7:42 AM, Chesnokov Gleb <Chesnokov.G@xxxxxxxxxx> wrote:
>
>> Do you have a log showing this error sequence?
>
> Yes, I have, but the problem is that I have a different target stack, not LIO. So the Call Trace basically contains code sequence from this target stack only,
> except for the call of the qlt_free_cmd() that trigger BUG: BUG_ON(cmd->sg_mapped).
> Regardless, I think the problem lies on the qlogic driver side, because it is responsible for management to map/unmap sgl list.

Agreed. I am curious about the test case/steps that trigger this issue in your environment. If you can share your test scenario, that would be helpful.

>
>> Can you share more details?
>
> What I am observing:
>
> 1) Command processing calls qlt_rdy_to_xfer(), maps sgl and sends a command to the firmware
> 2) Qlogic adapter reset occurs
>
> qla2xxx [0000:82:00.1]-5003:13: ISP System Error - mbx1=110eh mbx2=10h mbx3=dh mbx4=0h mbx5=8a1h mbx6=0h mbx7=0h.

This message indicates a firmware crash. The QLogic/Marvell folks should be able to help you capture and save a firmware dump, which might give you clues about the cause of the crash.

> qla2xxx [0000:82:00.1]-d01e:13: -> fwdump no buffer
> qla2xxx [0000:82:00.1]-00af:13: Performing ISP error recovery - ha=ffff9dd7d6058000.
>
> 3) Somehow the command is being aborted, so that means the command's abort flag has already been set.
> I think it may happens something like this:
> qla2x00_abort_isp_cleanup() --> qla2x00_abort_all_cmds()
>

I think this is an aftereffect of the firmware crash, and the driver is just recovering from it. A good firmware analysis will shed more light on this issue.

> 4) The target stack calls qlt_abort_cmd(), and since aborted flag has already been set, this call ended as multiple abort.
>
> 5) The target stack calls xmit_response, and since command has already been aborted, this call starts the code sequence to release the command that ended with qlt_free_cmd()
>
> I think I could try to reproduce the problem with LIO target stack, but I have special case with my target stack that lead to reset of qlogic adapter (ISP error recovery) and this is one important part of the error sequence. So, I think I will not be able to reproduce the problem with the LIO until I find out how to similarly reset qlogic adapter during processing active commands that have already been sent to the firmware.

Himanshu Madhani	Oracle Linux Engineering