Re: [PATCH] x86: bring back rep movsq for user access on CPUs without ERMS

On 8/29/23, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 28 Aug 2023 at 10:07, Mateusz Guzik <mjguzik@xxxxxxxxx> wrote:
>>
>> The hand-rolled mov loops executed in this case are quite pessimal
>> compared to rep movsq for bigger sizes. While the upper limit depends
>> on the uarch, everyone is well south of 1KB AFAICS, and sizes bigger
>> than that are common. The problem can be easily remedied, so do it.
>
> Ok, looking at the actual code now, and your patch is buggy.
>
>> +.Llarge_movsq:
>> +       movq %rcx,%r8
>> +       movq %rcx,%rax
>> +       shrq $3,%rcx
>> +       andl $7,%eax
>> +6:     rep movsq
>> +       movl %eax,%ecx
>>         testl %ecx,%ecx
>>         jne .Lcopy_user_tail
>>         RET
>
> The fixup code is very very broken:
>
>> +/*
>> + * Recovery after failed rep movsq
>> + */
>> +7:     movq %r8,%rcx
>> +       jmp .Lcopy_user_tail
>> +
>> +       _ASM_EXTABLE_UA( 6b, 7b)
>
> That just copies the original value back into %rcx. That's not at all
> ok. The "rep movsq" may have succeeded partially, and updated %rcx
> (and %rsi/%rdi) accordingly. You will now do the "tail" for entirely
> too much, and return the wrong return value.
>
> In fact, if this then races with a mmap() in another thread, the user
> copy might end up then succeeding for the part that used to fail, and
> in that case it will possibly end up copying much more than asked for
> and overrunning the buffers provided.
>
> So all those games with %r8 are entirely bogus. There is no way that
> "save the original length" can ever be relevant or correct.
>

Huh, pretty obvious now that you mention it; I don't know why I thought
the registers would be rolled back on a fault. More importantly, I
should have checked how this was handled in the now-removed movsq
routine (copy_user_generic_string):

[snip]
        movl %edx,%ecx
        shrl $3,%ecx
        andl $7,%edx
1:      rep movsq
2:      movl %edx,%ecx
3:      rep movsb
        xorl %eax,%eax
        ASM_CLAC
        RET

11:     leal (%rdx,%rcx,8),%ecx
12:     movl %ecx,%edx          /* ecx is zerorest also */
        jmp .Lcopy_user_handle_tail

        _ASM_EXTABLE_CPY(1b, 11b)
        _ASM_EXTABLE_CPY(3b, 12b)
[/snip]

So I think I know how to fix it, but I'm going to sleep on it.
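
For the record, the rough idea, modeled on the old routine's fixup
above: drop the %r8 save and, when rep movsq faults, recompute the
remaining byte count from the live registers instead of restoring the
original length. %rcx still holds the number of quadwords left and
%rax the byte tail, so the tail handler wants %rcx * 8 + %rax bytes.
Untested sketch against the posted patch (register use follows the
patch, not necessarily what I'll end up sending):

[sketch]
.Llarge_movsq:
	movq %rcx,%rax		/* stash the full length */
	shrq $3,%rcx		/* quadwords to copy with rep movsq */
	andl $7,%eax		/* byte tail for .Lcopy_user_tail */
6:	rep movsq
	movl %eax,%ecx
	testl %ecx,%ecx
	jne .Lcopy_user_tail
	RET

/*
 * Recovery after a faulting rep movsq: %rcx is the number of quadwords
 * still to be copied and %rsi/%rdi already point past the copied part,
 * so the remaining byte count is %rcx * 8 plus the tail kept in %rax.
 */
7:	leaq (%rax,%rcx,8),%rcx
	jmp .Lcopy_user_tail

	_ASM_EXTABLE_UA( 6b, 7b)
[/sketch]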

-- 
Mateusz Guzik <mjguzik gmail.com>


