On Mon, Apr 18, 2022 at 10:10:42AM -0700, Linus Torvalds wrote:
> Ugh. If you do this, you need to have a big comment about how that
> %rcx value gets fixed up with EX_TYPE_UCOPY_LEN (which basically ends
> up doing "%rcx = %rcx*8+%rax" in ex_handler_ucopy_len() for the
> exception case).

Yap, and I reused your text and expanded it.

You made me look at that crazy DEFINE_EXTABLE_TYPE_REG macro finally so
that I know what it does in detail.

So I have the below now, it boots in the guest so it must be perfect.

---

/*
 * Default clear user-space.
 * Input:
 * rdi destination
 * rcx count
 *
 * Output:
 * rcx uncopied bytes or 0 if successful.
 */
SYM_FUNC_START(clear_user_original)
	mov %rcx,%rax
	shr $3,%rcx		# qwords
	and $7,%rax		# rest bytes
	test %rcx,%rcx
	jz 1f

	# do the qwords first
	.p2align 4
0:	movq $0,(%rdi)
	lea 8(%rdi),%rdi
	dec %rcx
	jnz 0b

1:	test %rax,%rax
	jz 3f

	# now do the rest bytes
2:	movb $0,(%rdi)
	inc %rdi
	decl %eax
	jnz 2b

3:
	xorl %eax,%eax
	RET

	_ASM_EXTABLE_UA(0b, 3b)

	/*
	 * The %rcx value gets fixed up with EX_TYPE_UCOPY_LEN (which basically ends
	 * up doing "%rcx = %rcx*8 + %rax" in ex_handler_ucopy_len() for the exception
	 * case). That is, we use %rax above at label 2: for simpler asm but the number
	 * of uncleared bytes will land in %rcx, as expected by the caller.
	 *
	 * %rax at label 3: still needs to be cleared in the exception case because this
	 * is called from inline asm and the compiler expects %rax to be zero when exiting
	 * the inline asm, in case it might reuse it somewhere.
	 */
	_ASM_EXTABLE_TYPE_REG(2b, 3b, EX_TYPE_UCOPY_LEN8, %rax)
SYM_FUNC_END(clear_user_original)
EXPORT_SYMBOL(clear_user_original)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
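
For readers following the register arithmetic, the "%rcx = %rcx*8 + %rax" fixup quoted above can be modeled in plain C. This is only a userspace sketch of the arithmetic, not the kernel's handler: the names model_split_count() and model_ucopy_len8_fixup() are made up for illustration, and the struct merely stands in for the two registers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two registers the asm uses. */
struct model_regs {
	uint64_t rcx;	/* qwords still to clear */
	uint64_t rax;	/* rest bytes still to clear */
};

/* Mirrors the function prologue: shr $3,%rcx and and $7,%rax. */
static struct model_regs model_split_count(uint64_t count)
{
	struct model_regs r;

	r.rcx = count >> 3;
	r.rax = count & 7;
	return r;
}

/*
 * Models the arithmetic described for EX_TYPE_UCOPY_LEN8:
 * the value that lands in %rcx is rcx * 8 + rax.
 */
static uint64_t model_ucopy_len8_fixup(struct model_regs r)
{
	return r.rcx * 8 + r.rax;
}
```

With count = 29 the split gives rcx = 3 and rax = 5, and the fixup returns 29 if nothing has been cleared yet. Inside the byte loop at label 2 the qword count has already run down to 0, so the fixup result is simply the value left in rax.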