On Wed, Jan 05, 2011 at 04:47:35PM +1100, Herbert Xu wrote: > On Wed, Jan 05, 2011 at 04:52:22AM +0100, Mario 'BitKoenig' Holbe wrote: > > According to the VIA PadLock Programming Guide, XSTORE increments EDI by > > the number of bytes stored. > > This did obviously not matter as long as the buffer was "sufficiently > > distant", but now you have the buffer close on the stack and I believe, > > the optimizer crops up, hence the EDI increment begins to matter. > Interesting. So your compiler was producing incorrect output. Why incorrect? How should the compiler know that EDI gets modified? It's listed as XSTORE input only. > What version of gcc were you using? gcc version 4.4.5 (Debian 4.4.5-10) > Please attach the assembly output of the function in question. attached. I hope I got the gcc call right. However, I prefer the objdump output anyways, so this one is attached as well. As you can see (from both, maybe less from the one, more from the other), it is as I supposed it to be: the optimizer uses EDI to store via_rng_datum - reasonable, since EDI is a required input for the asm() directive anyways. And since it doesn't know EDI gets modified, it just continues using it as via_rng_datum afterwards. Btw: this is also the reason why it did work before: before, &rng->priv was never used again in a close-enough (and static enough) context, so it didn't matter whether EDI (which surely was used as &rng->priv) was clobbered or not. The IMHO best solution would be to tell the compiler that EDI gets clobbered: attached via-rng1-preferred.patch to apply on top of your first patch. However, as I already said, the compiler starts whining with either of both inputs on the clobber-list (don't really know why it cries for edx, but well...). The overkill solution is to preserve EDI manually: attached via-rng1-overkill.patch to apply on top of your first patch. And... yes, this works, really :) As I already said in my mail before: IMHO, for completeness, EDX should be preserved as well: the PadLock Quick Reference states the upper 30 bits of EDX will be zeroed by XSTORE, the VIA PadLock Programming Guide states they may be zeroed. This does currently not really matter, since a) VIA_RNG_CHUNK_1 (0x03) is quite zero in the upper 30 bits, and b) the XSTORE quality factor is only defined with 2 bits. Though it's hard to believe this could ever change, I could imagine future code that, for example, tries to balance quality/speed, and chooses different values for EDX (and overloads the upper 30 bits for some additional internal information) and after xstore() behaves different based on the value chosen before. Then, it would matter. Mario -- It is practically impossible to teach good programming style to students that have had prior exposure to BASIC: as potential programmers they are mentally mutilated beyond hope of regeneration. -- Dijkstra
Attachment:
via-rng.S.bz2
Description: Binary data
Attachment:
via-rng.o.objdump.bz2
Description: Binary data
--- linux-source-2.6.37-rc7/drivers/char/hw_random/via-rng.c.hxu1 2011-01-05 11:32:12.508274625 +0100 +++ linux-source-2.6.37-rc7/drivers/char/hw_random/via-rng.c 2011-01-05 12:57:19.712085325 +0100 @@ -82,7 +82,8 @@ static inline u32 xstore(u32 *addr, u32 asm(".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */" :"=m"(*addr), "=a"(eax_out) - :"D"(addr), "d"(edx_in)); + :"D"(addr), "d"(edx_in) + :"edi", "edx"); irq_ts_restore(ts_state); return eax_out;
--- linux-source-2.6.37-rc7/drivers/char/hw_random/via-rng.c.hxu1 2011-01-05 11:32:12.508274625 +0100 +++ linux-source-2.6.37-rc7/drivers/char/hw_random/via-rng.c 2011-01-05 12:45:43.459208713 +0100 @@ -80,7 +80,9 @@ static inline u32 xstore(u32 *addr, u32 ts_state = irq_ts_save(); - asm(".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */" + asm("pushl %%edi\n" + ".byte 0x0F,0xA7,0xC0 /* xstore %%edi (addr=%0) */\n" + "popl %%edi\n" :"=m"(*addr), "=a"(eax_out) :"D"(addr), "d"(edx_in));
Attachment:
signature.asc
Description: Digital signature