On Wednesday 18 May 2016 21:14:55 Christian Lamparter wrote: > On Tuesday, May 17, 2016 04:50:48 PM John Youn wrote: > > On 5/14/2016 6:11 AM, Christian Lamparter wrote: > Hey, that's really nice of you to do that :-D. Please keep me in the > loop (Cc) for those then. > > Yes, this needs definitely testing on all the affected ARCHs. > I've attached a diff to a updated version of the patch. It > drops the special MIPS case (as requested by Arnd). Ok, thanks! > BTW, I looked into the ioread32_rep and iowrite32_rep again. I'm > not entirely convinced that the hardware FIFOs are actually endian > neutral. But I can't verify it since my Western Digital My Book Live > only supports the host configuration (forces host mode), so I don't > know what a device in dual-mode or peripheral do here. I think it's highly unlikely that designware would have screwed up their part in such an unusual way, by intentionally adding a byte-reversal on something that is not connected to a 32-bit register. Note that the reason why ioread32_rep() doesn't swap is not that the registers are endian neutral, but that they use endianess in the same way that the memory does: If you want to look at the first byte of a (theoretical) four-byte USB data packet, we read four bytes from the FIFO register using __raw_readl() (a pointer dereference) and store it to memory using a 32-bit write: *(u32 *)buffer = __raw_readl(FIFO); Then we expect the first byte of the packet to be at the start: byte0 = *(u8*)buffer; If you replace the __raw_readl() with ioread32(), it gets byteswapped on big-endian *CPUs*, and then written to memory without an extra swap. This means that now you get the wrong data depending on the kernel endianess configuration, and independent of the device endianess. If the big-endian mode of the dwc2 block indeed contains a byteswap on the FIFO, that would mean not having to use ioread32_be() for the FIFO, but using fifo_read32(void *buffer) { u32 data = __raw_readl(FIFO_ADDRESS); if (big_endian_registers) data = bswap32(data); *(u32*)buffer = data; } so we byteswap the FIFO contents back, regardless of the CPU endianess. As I said, it's unlikely that the hardware is this broken, but not impossible. > The reason why I think it was broken is because there's a PIO copy > to and from the HCFIFO(x) in dwc2_hc_write_packet and > dwc2_hc_read_packet access in the hcd.c file as well... And there, > the code was using the dwc2_readl and dwc2_writel to access the data. Well, we know for a fact that those functions get endianess wrong, see dwc2_hc_write_packet: if (((unsigned long)data_buf & 0x3) == 0) { /* xfer_buf is DWORD aligned */ for (i = 0; i < dword_count; i++, data_buf++) dwc2_writel(*data_buf, data_fifo); } else { /* xfer_buf is not DWORD aligned */ for (i = 0; i < dword_count; i++, data_buf++) { u32 data = data_buf[0] | data_buf[1] << 8 | data_buf[2] << 16 | data_buf[3] << 24; dwc2_writel(data, data_fifo); } } On big-endian machines, unaligned case performs a byte swap while the aligned case does not. My best guess is that this function never got called on either the MIPS machine that first got the fix or your PowerPC machine. Another possibility is that you are right that there is a byteswap on the FIFO register in big-endian mode, and that this function always gets unaligned buffers, so the byteswap here cancels out the byteswap on the FIFO when both the CPU and the device are big-endian. > - if (((unsigned long)data_buf & 0x3) == 0) { > - /* xfer_buf is DWORD aligned */ > - for (i = 0; i < dword_count; i++, data_buf++) > - dwc2_writel(hsotg, *data_buf, HCFIFO(chan->hc_num)); > - } else { > - /* xfer_buf is not DWORD aligned */ > - for (i = 0; i < dword_count; i++, data_buf++) { > - u32 data = data_buf[0] | data_buf[1] << 8 | > - data_buf[2] << 16 | data_buf[3] << 24; > - dwc2_writel(hsotg, data, HCFIFO(chan->hc_num)); > - } > - } > + dwc2_writel_rep(hsotg, HCFIFO(chan->hc_num), data_buf, byte_count); > and here you are dropping the byteswap on big-endian. Arnd