On Sun, Apr 17, 2022 at 12:41 PM Borislav Petkov <bp@xxxxxxxxx> wrote: > > Anyway, more playing with this later to make sure it really does what it > should. I think the special calling conventions have tripped you up: > SYM_FUNC_START(clear_user_original) > - ASM_STAC > movq %rcx,%rax > shrq $3,%rcx > andq $7,%rax > @@ -86,7 +84,7 @@ SYM_FUNC_START(clear_user_original) > decl %ecx > jnz 2b > > -3: ASM_CLAC > +3: > movq %rcx,%rax > RET That 'movq %rcx,%rax' can't be right. The caller expects it to be zero on input and stay zero on output. But I think "xorl %eax,%eax" is good, since %eax was used as a temporary in that function. And the comment above the function should be fixed too. > SYM_FUNC_START(clear_user_rep_good) > - ASM_STAC > movq %rcx,%rdx > - xorq %rax,%rax > shrq $3,%rcx > andq $7,%rdx > > @@ -118,7 +113,7 @@ SYM_FUNC_START(clear_user_rep_good) > > 1: rep stosb > > -3: ASM_CLAC > +3: > movq %rcx,%rax > RET Same issue here. Probably nothing notices, since %rcx *does* end up containing the right value, and it's _likely_ that the compiler doesn't actually end up re-using the zero value in %rax after the inline asm (that this bug has corrupted), but if the compiler ever goes "Oh, I put zero in %rax, so I'll just use that afterwards", this is going to blow up very spectacularly and be very hard to debug. > @@ -135,15 +130,13 @@ EXPORT_SYMBOL(clear_user_rep_good) > * > * Output: > * rax uncopied bytes or 0 if successful. > + * > + * XXX: check for small sizes and call the original version. > + * Benchmark it first though. > */ > - > SYM_FUNC_START(clear_user_erms) > - xorq %rax,%rax > - ASM_STAC > - > 0: rep stosb > - > -3: ASM_CLAC > +3: > movq %rcx,%rax > RET .. and one more time. Also, I do think that the rep_good and erms cases should probably check for small copes, and use the clear_user_original thing for %rcx < 64 or something like that. It's what we do on the memcpy side - and more importantly, it's the only difference between "erms" and FSRM. If the ERMS code doesn't do anything different for small copies, why have it at all? But other than these small issues, it looks good to me. Linus