On Sat, Apr 16, 2022 at 10:42:22AM -0700, Linus Torvalds wrote:
> On Sat, Apr 16, 2022 at 10:28 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > you also need a _fsrm() one which checks X86_FEATURE_FSRM. That one
> > should simply do rep; stosb regardless of the size. For that you can
> > define an alternative_call_3 similar to how the _2 variant is defined.
>
> Honestly, my personal preference would be that with FSRM, we'd have an
> alternative that looks something like
>
>     asm volatile(
>         "1:"
>         ALTERNATIVE("call __stosb_user", "rep movsb", X86_FEATURE_FSRM)
>         "2:"
>         _ASM_EXTABLE_UA(1b, 2b)
>         :"=c" (count), "=D" (dest),ASM_CALL_CONSTRAINT
>         :"0" (count), "1" (dest), "a" (0)
>         :"memory");
>
> iow, the 'rep stosb' case would be inline.

I knew you were gonna say that - we have talked about this in the past.
And I'll do you one better -- we have the patch-if-bit-not-set thing now
too, so I think it should work if we did:

	alternative_call_3(__clear_user_fsrm,
			   __clear_user_erms,   ALT_NOT(X86_FEATURE_FSRM),
			   __clear_user_string, ALT_NOT(X86_FEATURE_ERMS),
			   __clear_user_orig,   ALT_NOT(X86_FEATURE_REP_GOOD),
			   : "+&c" (size), "+&D" (addr)
			   :: "eax");

and yeah, you wanna get rid of the CALL even and I guess that could be
made to work - I just need to play with it a bit to hammer out the
details.

I.e., it would be most optimal if it ended up being

	ALTERNATIVE_3("rep stosb",
		      "call ... ", ALT_NOT(X86_FEATURE_FSRM),
		      ...

> Note that the above would have a few things to look out for:
>
>  - special 'stosb' calling convention:
>
>      %rax/%rcx/%rdx as inputs
>      %rcx as "bytes not copied" return value
>      %rdi can be clobbered
>
>    so the actual functions would look a bit odd and would need to
>    save/restore some registers, but they'd basically just emulate "rep
>    stosb".

Right.
>  - since the whole point is that the "rep movsb" is inlined, it also
>    means that the "call __stosb_user" is done within the STAC/CLAC
>    region, so objdump would have to be taught that's ok
>
> but wouldn't it be lovely if we could start moving towards a model
> where we can just inline 'memset' and 'memcpy' like this?

Yeah, inlined insns without even a CALL insn would be the most optimal
thing to do.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette