Re: memcpy is leaking secret data through ZMM vector registers

On Fri, 19 Apr 2024, Zack Weinberg wrote:

> On Fri, Apr 19, 2024, at 4:15 PM, Mikulas Patocka wrote:
> > On Fri, 19 Apr 2024, Zack Weinberg wrote:
> >> ... the copy
> >> of round_keys in the vector registers *won't* get erased -- the exact
> >> problem being discussed in this thread.
> >
> > On the SYSV ABI, all the vector registers are volatile, so you can erase 
> > them in explicit_bzero.
> >
> > On Windows 64-bit ABI, it is more problematic, because some of the vector 
> > registers must be preserved.
> 
> Oh, huh. Yes, that would work.

I've just realized that this wouldn't work: if explicit_bzero is lazily 
resolved, the dynamic linker's resolver would spill the vector registers 
to the stack on the first call, before explicit_bzero itself ever runs.
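
To illustrate the lazy-resolution point: with lazy binding the resolver 
runs only on the first call through a given PLT entry, so one possible 
mitigation is to make that first call before any secret exists. This is 
only a sketch under assumptions: prebind_explicit_bzero and wipe_key are 
hypothetical names, it assumes glibc (explicit_bzero is available since 
2.25), and it does nothing about the other leaks discussed in this thread.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: call once at startup, while no secret is live.
 * The first call goes through the lazy resolver; afterwards the GOT
 * entry points directly at the resolved function.
 */
static void prebind_explicit_bzero(void)
{
    char dummy[1] = { 0 };
    explicit_bzero(dummy, sizeof dummy);
}

/* Hypothetical wrapper: erase a key buffer once it is no longer needed. */
static void wipe_key(unsigned char *key, size_t len)
{
    explicit_bzero(key, len);
}
```

Linking with -Wl,-z,now or running with LD_BIND_NOW=1 should have a 
similar effect without any code change, since all function symbols are 
then resolved at load time.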

> Call-preserved registers are not a 
> problem, because any function that puts secret data in a call-preserved 
> register in the first place, must erase it again (by restoring the old 
> value) before returning. Therefore, if we made explicit_bzero wipe *all* 
> the call-clobbered registers before returning, my example function would 
> be safe.
> 
> There's still a place secrets could leak to and not get erased, though: 
> register spill slots on the stack. Only the compiler could plug this 
> leak. Long term, I think what we want is something like 
> __attribute__((sensitive)), which can only be applied to variables with 
> automatic storage duration, and which means "erase all copies of this 
> variable's value, wherever they wound up, at the end of its lifetime." 
> Note that such variables must not be put in call-preserved registers in 
> non-leaf functions, because then they might get spilled to the stack by 
> a callee, which has no way of knowing that it's just leaked a secret. 
> And I suppose we might also want to worry about signal frames. Nobody 
> said this was gonna be easy ;-)
> 
> zw

Yes.

Another problem is varargs - if a variadic function is called with at 
least one floating-point argument, its prologue will store all 8 XMM 
argument registers (XMM0-XMM7) on the stack, regardless of whether they 
are used or not.

In the past it didn't do this (it made an indirect jump based on the 
value in the %AL register to save only the used registers), but someone 
probably found out that indirect jumps are expensive and that storing all 
8 registers is faster.
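
A small example of the kind of function I mean (sum_doubles is just a 
hypothetical name for illustration). On the x86-64 SYSV ABI the caller 
puts an upper bound on the number of vector registers used into %AL, and 
the prologue of the variadic callee decides from that whether to dump the 
XMM registers into its register save area. The exact code depends on the 
compiler version and options, so check the output of "gcc -O2 -S" rather 
than taking this as a statement about any particular release.

```c
#include <stdarg.h>

/*
 * Hypothetical variadic function: sums `count` double arguments.
 * Because it uses va_start, the compiler must make the register-passed
 * arguments reachable through the register save area on the stack.
 */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);
    va_end(ap);

    return total;
}
```

A call such as sum_doubles(1, x) uses only XMM0, but if the prologue 
stores all eight registers, whatever happened to be in XMM1-XMM7 at the 
call site ends up on the stack as well.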

Mikulas




