On Tue, 14 Feb 2023 14:51:10 +0100 Lukas Wunner <lukas@xxxxxxxxx> wrote: > On Tue, Feb 14, 2023 at 11:15:04AM +0000, Jonathan Cameron wrote: > > On Fri, 10 Feb 2023 21:25:01 +0100 Lukas Wunner <lukas@xxxxxxxxx> wrote: > > > The CDAT exposed in sysfs differs between little endian and big endian > > > arches: On big endian, every 4 bytes are byte-swapped. > > > > > > PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors > > > such as pci_read_config_dword() implicitly swap bytes on big endian. > > > That way, the macros in include/uapi/linux/pci_regs.h work regardless of > > > the arch's endianness. For an example of implicit byte-swapping, see > > > ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx > > > (Load Word Byte-Reverse Indexed). > > > > > > DOE Read/Write Data Mailbox Registers are unlike other registers in > > > Configuration Space in that they contain or receive a 4 byte portion of > > > an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). > > > They need to be copied to or from the request/response buffer verbatim. > > > So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit > > > byte-swapping. > > > > > > The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume > > > implicit byte-swapping. Byte-swap requests after constructing them with > > > those macros and byte-swap responses before parsing them. > > > > > > Change the request and response type to __le32 to avoid sparse warnings. > > > > In brief this patch set ensures that internal handling of the payloads > > is in le32 chunks by unwinding the implicit conversions in the pci config > > accessors for BE platforms. Thus the data in the payloads is always > > in the same order. Good start. Hence the binary read back of CDAT is > > little endian (including all fields within it). > > Correct. On big endian this means writing request dwords and > reading response dwords involves 2 endianness conversions. > > If the request is constructed with macros such as CXL_DOE_TABLE_ACCESS_* > and PCI_DOE_DATA_OBJECT_DISC_*, then an additional (3rd) endianness > conversion is necessary: The request is constructed in native big endian > order, then converted to a little endian DOE payload. That payload is > then converted back to big endian so that it can be written with the > config space accessor, which converts to little endian. > > I recognize this is crazy but I do not see a better solution. > Dan suggested amending struct pci_ops with ->read_raw and ->write_raw > callbacks which omit the endianness conversion. However there are > 175 struct pci_ops in the kernel and each one would have to be looked at. > I also do not know the Assembler instructions for every architecture to > perform a non-byte-swapped write or read. It's a big task and lots of > code to add. Given that DOE is likely the only user and big endian > is seeing decreasing use, it's probably not worth the effort and LoC. > > > > > @@ -322,15 +322,17 @@ static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 *index, u16 *vid, > > > struct pci_doe_task task = { > > > .prot.vid = PCI_VENDOR_ID_PCI_SIG, > > > .prot.type = PCI_DOE_PROTOCOL_DISCOVERY, > > > - .request_pl = &request_pl, > > > + .request_pl = (__le32 *)&request_pl, > > > > I don't like this type cast. request_pl is local > > anyway, so why not change it's type and set it appropriately in > > the first place. > > That doesn't work I'm afraid. The request_pl is constructed with > FIELD_PREP() here and the response_pl is parsed with FIELD_GET(). > > Those macros operate on an unsigned integer of native endianness. > If I declare them __le32, I get sparse warnings and rightfully so. > > The casts are the simplest, least intrusive solution. The only > alternative would be to use separate variables but then I'd have > to change a lot more lines and it would be more difficult to review > and probably not all that more readable. Separate variables was what I meant (less than clear) Add a __le32 le32_req_pl then do the endian conversion into that rather than in place conversion. Equivalent in teh other direction for the response. That way the endian markings end up correct on all the local variables and it shouldn't require many more lines to change. > > Thanks, > > Lukas > > > > .request_pl_sz = sizeof(request_pl), > > > - .response_pl = &response_pl, > > > + .response_pl = (__le32 *)&response_pl, > > > > Likewise here. > > > > > .response_pl_sz = sizeof(response_pl), > > > .complete = pci_doe_task_complete, > > > .private = &c, > > > }; > > > int rc; > > > > > > + cpu_to_le32s(&request_pl); > > > + > > > rc = pci_doe_submit_task(doe_mb, &task); > > > if (rc < 0) > > > return rc; > > > @@ -340,6 +342,7 @@ static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 *index, u16 *vid, > > > if (task.rv != sizeof(response_pl)) > > > return -EIO; > > > > > > + le32_to_cpus(&response_pl); > > > *vid = FIELD_GET(PCI_DOE_DATA_OBJECT_DISC_RSP_3_VID, response_pl); > > > *protocol = FIELD_GET(PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL, > > > response_pl); >