Re: [PATCH] tg3: fix big endian MAC address collection failure

"Matt Carlson" <mcarlson@xxxxxxxxxxxx> · Mon, 13 Apr 2009 18:17:10 -0700

Sorry for the delay.  I just ran the same experiment on an ia64 we have
in-house.  The results were the same.  The machine was a HP integrity
BL870C pre-production blade with dual 5704S LOMs.  More responses
inline.

On Mon, Apr 13, 2009 at 01:48:18PM -0700, James Bottomley wrote:
> On Mon, 2009-04-13 at 11:37 -0700, Matt Carlson wrote:
> > But that is exactly what the code is doing.  tg3_nvram_read_be32() will
> > return the data in bytestream format.  A memcpy() should be all that is
> > needed to transport the data to a different memory location.
> 
> But not the one you've done.  cpu_to_be32 is a nop pass through on our
> architecture, so tg3_nvram_read_be32 is equivalent to tg3_nvram_read on
> our architecture (i.e. identical to the code that was doing the read in
> 2.6.29).

tg3_nvram_read() returns a 32-bit word of NVRAM data, swapped according to
the byteswapping rules for all register accesses.  On big endian machines,
the data returned will be exactly as it would have been in NVRAM.  On
little endian machines, the data will be swapped when compared to a raw
read of the NVRAM.  To bring the data back to a bytestream format on LE
systems, a swap is needed.  After that, a memcpy should be all that is
required to extract the MAC address.

Code that operates on data in a bytestream format will necessarily look
different than code that operates on data as a 32-bit native endian
quantity.  The code changes can look a bit drastic and are easily
culpable at first glance.

> However, the memcpy is the wrong way around for us.

It shouldn't be, and that is the problem.  We are trying to determine
why the data isn't in the form we think it should be in.  The most
difficult part of this discussion is convincing ourselves that the end
result is the same, even though the path to get there changes in
significant ways.

The old code read the data from NVRAM, did a blind swap of the data and
then operated on that data as a 32-bit quantity.  The blind swap was
completely superfluous and only changed which byte the driver attempted
to get its data.  An important property of this version is that the
same code when working with machines of either endianness.

If you are with me this far, then hopefully you can see that, while the
two versions of the code appear different, they wind up with the same
result.  The change was made because the code became much more readable
when we describe the operations as a bytestream.

> If you look at an example, the original code said
> 
> dev_addr[0] = hi >> 16;
> dev_addr[1] = hi >> 24
> 
> So MSB-1 and MSB.  However, on a BE machine these are at offset one and
> zero from the start of the word.  The replacement memcopy is:
> 
> memcpy(&dev->dev_addr[0], ((char *)&hi) + 2, 2)
> 
> i.e. offset 3 and 4, which actually copies LSB-1 and LSB into there.
> You can follow similar logic to show that the lo copy is wrong too.

Yes.  The code will be different depending on whether you treat the data
as a native endian 32-bit word or as a 4 byte segment of a bytestream.

> Perhaps the fix is just to put the tg3_nvram_read() back as well as the
> original by loads?

The code is much more readable with the changes in place.  I'd rather
not back them out.  I'd like to find out why you are getting different
results than I am with my tests.

I just saw your NVRAM dump post.  Perhaps that'll shed some light on the
matter.

--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html