Re: [patch 02/14] tmpfs: fix regressions from wider use of ZERO_PAGE

On Sat, Apr 16, 2022 at 10:42:22AM -0700, Linus Torvalds wrote:
> On Sat, Apr 16, 2022 at 10:28 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > you also need a _fsrm() one which checks X86_FEATURE_FSRM. That one
> > should simply do rep; stosb regardless of the size. For that you can
> > define an alternative_call_3 similar to how the _2 variant is defined.
> 
> Honestly, my personal preference would be that with FSRM, we'd have an
> alternative that looks something like
> 
>     asm volatile(
>         "1:"
>         ALTERNATIVE("call __stosb_user", "rep stosb", X86_FEATURE_FSRM)
>         "2:"
>         _ASM_EXTABLE_UA(1b, 2b)
>         : "=c" (count), "=D" (dest), ASM_CALL_CONSTRAINT
>         : "0" (count), "1" (dest), "a" (0)
>         : "memory");
> 
> iow, the 'rep stosb' case would be inline.

I knew you were gonna say that - we have talked about this in the past.
And I'll do you one better -- we have the patch-if-bit-not-set thing now
too, so I think it should work if we did:


       alternative_call_3(__clear_user_fsrm,
                          __clear_user_erms,   ALT_NOT(X86_FEATURE_FSRM),
                          __clear_user_string, ALT_NOT(X86_FEATURE_ERMS),
                          __clear_user_orig,   ALT_NOT(X86_FEATURE_REP_GOOD),
                          : "+&c" (size), "+&D" (addr)
                          :: "eax");

and yeah, you wanna get rid of the CALL even and I guess that could
be made to work - I just need to play with it a bit to hammer out the
details.

I.e., ideally it would end up being

	ALTERNATIVE_3("rep stosb",
		      "call ... ", ALT_NOT(X86_FEATURE_FSRM),
		      ...
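
To spell out the selection order I have in mind, here is a rough userspace
model, not kernel code. It assumes the alternatives are applied in order and
that a later matching entry overwrites an earlier one, and that FSRM implies
ERMS implies REP_GOOD. The enum names, struct alt_entry and pick_variant()
are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the CPU feature bits named above. */
enum { FEAT_REP_GOOD = 1 << 0, FEAT_ERMS = 1 << 1, FEAT_FSRM = 1 << 2 };

/* Hypothetical names for the four code variants. */
enum variant { V_REP_STOSB, V_CALL_ERMS, V_CALL_STRING, V_CALL_ORIG };

struct alt_entry {
	enum variant repl;     /* what gets patched in */
	unsigned int feature;  /* feature bit being tested */
	bool inverted;         /* true models ALT_NOT(): patch if bit is clear */
};

/*
 * Walk the entries in order. Every entry whose condition holds overwrites
 * the current choice, so the last matching entry wins (assumption).
 */
static enum variant pick_variant(unsigned int cpu_features)
{
	static const struct alt_entry alts[] = {
		{ V_CALL_ERMS,   FEAT_FSRM,     true },
		{ V_CALL_STRING, FEAT_ERMS,     true },
		{ V_CALL_ORIG,   FEAT_REP_GOOD, true },
	};
	enum variant chosen = V_REP_STOSB; /* default, stays inline with FSRM */
	size_t i;

	for (i = 0; i < sizeof(alts) / sizeof(alts[0]); i++) {
		bool has = (cpu_features & alts[i].feature) != 0;

		/* Plain entry: patch if set. ALT_NOT entry: patch if clear. */
		if (has != alts[i].inverted)
			chosen = alts[i].repl;
	}
	return chosen;
}
```

With all three bits set nothing is patched and the inline "rep stosb" stays;
with none set, the last entry wins and we end up at the _orig variant.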


> Note that the above would have a few things to look out for:
> 
>  - special 'stosb' calling convention:
> 
>      %rax/%rcx/%rdi as inputs
>      %rcx as "bytes not copied" return value
>      %rdi can be clobbered
> 
>    so the actual functions would look a bit odd and would need to
> save/restore some registers, but they'd basically just emulate "rep
> stosb".

Right.
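
For illustration only, the contract such a helper would have to honour can be
modelled in plain C. This is a userspace sketch, not the real thing: struct
stos_regs, stosb_emul(), clear_buf() and the 'limit' pointer (a stand-in for
the address where a fault would hit) are all hypothetical, and I am assuming
the destination register is %rdi, as "rep stosb" itself uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the registers the convention touches. */
struct stos_regs {
	uint8_t al;    /* byte to store (low byte of %rax) */
	size_t rcx;    /* in: byte count, out: bytes NOT stored */
	uint8_t *rdi;  /* in: destination, out: clobbered */
};

/*
 * Store regs->al at regs->rdi, regs->rcx times. Stores stop just before
 * 'limit' (if non-NULL), which leaves the remaining count in rcx the way
 * an exception fixup would.
 */
static void stosb_emul(struct stos_regs *regs, const uint8_t *limit)
{
	while (regs->rcx != 0) {
		if (limit && regs->rdi >= limit)
			break; /* simulated fault */
		*regs->rdi++ = regs->al;
		regs->rcx--;
	}
}

/* Wrapper with a clear_user()-like shape: returns bytes not cleared. */
static size_t clear_buf(uint8_t *dst, size_t len, const uint8_t *limit)
{
	struct stos_regs regs = { .al = 0, .rcx = len, .rdi = dst };

	stosb_emul(&regs, limit);
	return regs.rcx;
}
```

The caller only ever looks at the count that comes back, which is why the
destination pointer is free to be clobbered.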

>  - since the whole point is that the "rep stosb" is inlined, it also
> means that the "call __stosb_user" is done within the STAC/CLAC
> region, so objdump would have to be taught that's ok
> 
> but wouldn't it be lovely if we could start moving towards a model
> where we can just inline 'memset' and 'memcpy' like this?

Yeah, inlined insns without even a CALL insn would be the optimal
thing to do.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


