Re: [PATCH] lpfc: correct writeq failures on 32-bit architectures

Arnd Bergmann <arnd@xxxxxxxx> · Tue, 27 Feb 2018 21:15:23 +0100

On Tue, Feb 27, 2018 at 7:05 PM, James Smart <james.smart@xxxxxxxxxxxx> wrote:
> On 2/27/2018 12:58 AM, Arnd Bergmann wrote:
>
> So you point out a very real concern, as in most cases the source buffer is
> a bytestream and the desire is to send the bytestream in the same byte order
> as in memory.  It turns out we're somewhat lucky as the driver's source
> buffer wasn't a real bytestream, it was a cpu-endian relative structure so
> writex behavior would be ok. However, the driver is confused as it wants to
> use 64-bit copies, but the cpu-endian structure was based on 32-bit's at a
> time when mapping to the hardware. So things are ok if using writel, but not
> so ok if writeq.

Ah, so it is a structure of only 32-bit integers? Obviously if you have 16-bit
or 64-bit integers mixed in with the structure, things get even more
complicated.

For an array of 32-bit little-endian registers, this variant should work
on all architectures:

#include <linux/io-64-nonatomic-lo-hi.h>

         for (i = 0; i < q->entry_size; i += sizeof(uint64_t))
                   writeq_relaxed(*((uint32_t *)(tmp + i) |
                                           *((uint32_t *)(tmp + i + 1) << 32,
                                           (q->dpp_regaddr + i)

On x86_64 and x86_32, that should do the exact same thing as
your version, but it should also work efficiently on other architectures.
The __raw_writeq() version I suggested earlier is not correct
if we are dealing with an array of __le32 values in hardware.

On  powerpc, the writeq_relaxed() unfortunately still prevents
write-combining, so that won't be efficient.

> As the code stands write now - this feature and the routine is only used
> when x86, thus LE, thus the whole LSB conversion by writeX works.  It's a
> mute point.
>
> Please accept the patch as proposed. We will address these other issues as
> we include the "push" support for other architectures.

Could you add an #ifdef and comment around the 'if (q->dpp_enable ...)'
block then to make sure that if anybody tries to make it work on other
architectures, they are aware of the problem?

      Arnd