Re: [PATCH v2 1/7] usb/isp1760: Move to native-endian ptds

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sebastian Andrzej Siewior wrote:
> Arvid Brodin wrote:
>> This helps users with platform-bus-connected isp176xs with endianness
>> problems
> 
> could you please define endianness problems? This driver works on BE and
> LE machines. However, if the chip is wired wrong you need additional swaps
> in read/write routines.

When I started working with this driver I found several instances where people
wired this chip wrong and had problems getting it to work. The data bus byteswap
needed is actually not documented in the datasheet, but only in an accompanying
FAQ, which might explain this. The idea here is to make things easier for those
people (including me) by removing most of the byteswap calls, concentrating them
to one place.

So yes, endianness problems == BE cpu and no data bus HW byte swap. Before this
change, one needed to remove a lot of cpu_to_le32 calls to avoid unneccessary SW
byteswaps in this case, which is a pain. 


>> and fixes a bug with non-32-bit io transfers on big endian hardware with
>> straight data bus, where payload has to be byteswapped instead of ptds.
> 
>> diff --git a/drivers/usb/host/isp1760-hcd.c
>> b/drivers/usb/host/isp1760-hcd.c
>> index c470cc8..428a255 100644
>> --- a/drivers/usb/host/isp1760-hcd.c
>> +++ b/drivers/usb/host/isp1760-hcd.c
>> @@ -114,73 +114,136 @@ struct isp1760_qh {
>>  
>>  #define ehci_port_speed(priv, portsc) USB_PORT_STAT_HIGH_SPEED
>>  
>> -static unsigned int isp1760_readl(__u32 __iomem *regs)
>> +static u32 isp176x_reg_read32(void __iomem *base, u32 reg)
>>  {
>> -    return readl(regs);
>> +    BUG_ON(reg >= 0x0400);
>> +    return readl(base + reg);
> 
> Why do you push base address into read/write routines?

Before, a single address (regs == base + reg) was pushed into these functions.

The main reason for pushing these separately is to be able to check whether
byteswap should be done inside the mem read/write routines, which geatly eases
the situation for people who's wired the data bus wrong. Another nice benefit is
that it reduces clutter since cpu_to_le32() and le32_to_cpu() does not have to
be used in a lot of places outside these routines. Call it encapsulation, if you
like. :)

It also allowed me to make sure I accessed the correct part of the chip address
space while developing/debugging (the BUG_ON() calls). These calls will be
removed after comments from Greg KH so this point is moot.

Also, I think it is important to use the same style throughout the driver, so I
don't really like the idea of using 

mem_reads8(base, reg);
and 
reg_read32(base + reg);

Better to have both functions take the same arguments.

 
>>  }
>>  
>>  /*
>>   * The next two copy via MMIO data to/from the device.
>> memcpy_{to|from}io()
>>   * doesn't quite work because some people have to enforce 32-bit access
>>   */
>> -static void priv_read_copy(struct isp1760_hcd *priv, u32 *src,
>> -        __u32 __iomem *dst, u32 len)
>> +static void isp176x_bank_reads8(void __iomem *src_base, u32
>> src_offset, u32 bank_addr, __u32 *dst, u32 bytes)
>>  {
>> +    __u32 __iomem *src;
>>      u32 val;
>> -    u8 *buff8;
>> +    __u8 *src_byteptr;
>> +    __u8 *dst_byteptr;
>>  
>> -    if (!src) {
>> -        printk(KERN_ERR "ERROR: buffer: %p len: %d\n", src, len);
>> -        return;
>> -    }
>> +    BUG_ON(src_offset & 0x03);
>> +    BUG_ON(src_offset < PTD_OFFSET);
>>  
>> -    while (len >= 4) {
>> -        *src = __raw_readl(dst);
>> -        len -= 4;
>> -        src++;
>> -        dst++;
>> +    src = src_base + (bank_addr | src_offset);
>> +
>> +    if (src_offset < PAYLOAD_OFFSET) {
>> +        while (bytes >= 4) {
>> +            *dst = le32_to_cpu(__raw_readl(src));
> 
> Not sure if this does not break the other BE machines and fixes yours.
> Don't have any HW around for a quick test.

Actually the code above does not work on my machine. To make it work on an
incorrectly wired data bus, an additional patch is needed to actually switch the
byteswaps: 

-    if (src_offset < PAYLOAD_OFFSET) {
+    if (src_offset >= PAYLOAD_OFFSET) {

Nonetheless, this patch changes basic things and should be tested on both little
endian and correctly-wired big endian machines before it is accepted into the
tree. I hope that someone has the means to and is kind enough to help with this.


> As far as I can see you do three things here:
> - split the register read/write function into base and reg. Why? The only
>   "benefit I see is that you check if one tries to reach beyond the
>   register space. You should only map the register space so any access
>   after the register space should be catched by the VM.
> 
> - you push bank address the read/write functions. You lose the cool trick
>   where we implicit delay 90ns.
> 
> - you change the PTD struct from LE to native. Later you do an endian
>   swap. How does it help?

The base and register split is part of/needed for the endian change, since I
look at the register address offset to decide when to byteswap. The LE-to-native
change allows me to remove a lot of cpu_to_le32() and le32_to_cpu() calls and do
the byteswap "automatically" in the read/write functions only.

Regarding the 90 ns implicit delay, I could do

	isp176x_reg_write32(hcd->regs, HC_MEMORY_REG, ATL_PTD_OFFSET + ISP_BANK(0));
	isp176x_reg_write32(hcd->regs, HC_MEMORY_REG, payload + ISP_BANK(1));
	ndelay(90);
	isp176x_bank_reads8(hcd->regs, ATL_PTD_OFFSET + queue_entry*sizeof(ptd), ISP_BANK(0), &ptd, sizeof(ptd));
	...
	isp176x_bank_reads8(hcd->regs, payload, ISP_BANK(1), priv->atl_ints[queue_entry].data_buffer, length);

instead of

	isp176x_ptd_read(hcd->regs, ATL_PTD_OFFSET, queue_entry, &ptd);
	...
	isp176x_mem_reads8(hcd->regs, payload, priv->atl_ints[queue_entry].data_buffer, length);

This would save one 90 ns delay each time a payload needs to be read from the
chip, at the cost of more complicated code. That's 18 clock cycles on a 200 MHz
cpu. Is that tradeoff really worth it?

Also, since this chip is not always on a PCI bus, the original code which used a
duplicate isp1760_writel(...) for the delay may not delay for the correct
duration.

 
> Please split this patch in three since you have three logical changes as
> far as I can see. I saw some missing spaces and long lines. Please make
> checkpatch happy.

While waiting for your comments on the above, I'll try to make checkpatch happy.
:)


> 
> Sebastian

-- 
Arvid Brodin
Enea Services Stockholm AB
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux