Re: [PATCH] mpt3sas: Fix for regression caused due to cf6bf9710c patch

Sreekanth Reddy <sreekanth.reddy@xxxxxxxxxxxx> · Wed, 4 Jul 2018 16:54:05 +0530

On Tue, Jul 3, 2018 at 8:53 PM, James Bottomley
<James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 2018-07-03 at 22:49 +0900, David Miller wrote:
>> From: Sreekanth Reddy <sreekanth.reddy@xxxxxxxxxxxx>
>> Date: Tue, 3 Jul 2018 17:48:49 +0530
>>
>> > Any suggestion/update over my previous mail. I am using 4.13
>> kernel.
>>
>> I think the issue is that if you are reading a 32-bit word and then
>> interpreting it as a struct full of individual bytes, you have to
>> order the bytes in the structure appropriately for the cpu
>> endianness.
>
> This is undoubtedly it.  The point being if you read from a structure
> using readX, you have to read every element at its correct length for
> the endian swaps to work.  You can't do a readq on 2 32 bit words and
> expect the endianness to be correct (you'll find they come out in the
> wrong order).
>
> I think you're using a shared (device and cpu) memory mapped structured
> data with a doorbell register,

[Sreekanth] Yes it is correct, we are using a shared memory mapped
structured data with a doorbell register.
In this particular handshake function, HBA will send only 16 bit data
at a time. So driver has to read 16 bit data at a time in a loop until
complete reply message is read.

>  which is pretty identical to how the
> qla1280 does it.  We went through several iterations of fixing that
> driver for big endian but finally settled on putting __le annotations
> on all the structures and doing cpu_to_leX() swaps as we wrote them
> (and obviously leX_to_cpu() swaps to read them), meaning the structure
> in memory is always correct for the device.  Then we used a writeX to
> poke the doorbell and the device just picked up the correct
> information.
>
> The rule you want to be following is: memory mapped structure, you're
> responsible for annotation and swapping; readX/writeX to correctly
> sized data, the API will swap for you.
>
> So, can we just revert the original patch which is clearly now a
> regression and try to get this fixed in the merge window?
[Sreekanth] Yes we can revert the original patch.

>  I think the
> actual bug is simply you're missing __leX annotations on the shared
> memory mapped structure to fix sparse, but otherwise everything is
> working.

Actually our memory mapped structure variables are declared with
__leX annotations as shown below,

typedef struct _MPI2_DEFAULT_REPLY {
        U16 FunctionDependent1; /*0x00 */
        U8 MsgLength;           /*0x02 */
        U8 Function;            /*0x03 */
        U16 FunctionDependent2; /*0x04 */
        U8 FunctionDependent3;  /*0x06 */
        U8 MsgFlags;            /*0x07 */
        U8 VP_ID;               /*0x08 */
        U8 VF_ID;               /*0x09 */
        U16 Reserved1;          /*0x0A */
        U16 FunctionDependent5; /*0x0C */
        U16 IOCStatus;          /*0x0E */
        U32 IOCLogInfo;         /*0x10 */
} MPI2_DEFAULT_REPLY, *PTR_MPI2_DEFAULT_REPLY,
        MPI2DefaultReply_t, *pMPI2DefaultReply_t;

and

typedef u8 U8;
typedef __le16 U16;
typedef __le32 U32;
typedef __le64 U64 __attribute__ ((aligned(4)));

Also I tried replacing readl() API with readw()API (as HBA FW will
send 16 bit data at a time) as shown below and still I see same issue,

        MPI2DefaultReply_t *default_reply = (MPI2DefaultReply_t *)reply;
        u16 reply1;
        reply1 = readw(&ioc->chip->Doorbell);
        reply[1] = reply1;

        printk("LSI debug.. 0x%x, 0x%x, 0x%x \n", reply1, reply[1]);
        writel(0, &ioc->chip->HostInterruptStatus);

        printk("LSI MsgLength :%d\n", default_reply->MsgLength);

And I got below output where message length is wrong, when I execute
above code on SPARC64 machine,

LSI debug.. 0x311, 0x311
LSI MsgLength :3

Thanks,
Sreekanth

>
> James
>