Re: [PATCH v2 07/11] x86, memory_failure: Introduce {set, clear}_mce_nospec()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 4, 2018 at 11:08 AM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
> On Mon, Jun 04, 2018 at 10:39:48AM -0700, Dan Williams wrote:
>> On Mon, Jun 4, 2018 at 10:08 AM, Luck, Tony <tony.luck@xxxxxxxxx> wrote:
>> > On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote:
>> >> +static inline int set_mce_nospec(unsigned long pfn)
>> >> +{
>> >> +     int rc;
>> >> +
>> >> +     rc = set_memory_uc((unsigned long) __va(PFN_PHYS(pfn)), 1);
>> >
>> > You should really do the decoy_addr thing here that I had in mce_unmap_kpfn().
>> > Putting the virtual address of the page you mustn't accidentally prefetch
>> > from into a register is a pretty good way to make sure that the processor
>> > does do a prefetch.
>>
>> Maybe I'm misreading, but doesn't that make the page completely
>> inaccessible? We still want to read pmem through the driver and the
>> linear mapping with memcpy_mcsafe(). Alternatively I could just drop
>> this patch and setup a private / alias mapping for the pmem driver to
>> use. It seems aliased mappings would be the safer option, but I want
>> to make sure I've comprehended your suggestion correctly?
>
> I'm OK with the call to set_memory_uc() to make this uncacheable
> instead of set_memory_np() to make it inaccessible.
>
> The problem is how to achieve that.
>
> The result of __va(PFN_PHYS(pfn) is the virtual address where the poison
> page is currently mapped into the kernel. That value gets put into
> register %rdi to make the call to set_memory_uc() (which goes on to
> call a bunch of other functions passing the virtual address along
> the way).
>
> Now imagine an impatient super-speculative processor is waiting for
> some result to decide where to jump next, and picks a path that isn't
> going to be taken ... out in the weeds somewhere it runs into:
>
>         movzbl  (%rdi), %eax
>
> Oops ... now you just read from the address you were trying to
> avoid. So we log an error. Eventually the speculation gets sorted
> out and the processor knows not to signal a machine check. But the
> log is sitting in a machine check bank waiting to cause an overflow
> if we try to log a second error.
>
> The decoy_addr trick in mce_unmap_kpfn() throws in the high bit
> to the address passed.  The set_memory_np() code (and I assume the
> set_memory_uc()) code ignores it, but it means any stray speculative
> access won't point at the poison page.
>
> -Tony
>
> Note: this is *mostly* a problem if the poison is in the first
> cache line of the page.  But you could hit other lines if the
> instruction you speculatively ran into had the right offset. E.g.
> to hit the third line:
>
>         movzbl  128(%rdi), %eax

Ok, makes sense and I do see now that this decoy resolves to the same
physical address once PTE_PFN_MASK is applied when we start messing
with page tables.

However, set_memory_uc() is currently not prepared for this trick as
it specifies the unmasked physical address to reserve_memtype().

        ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
                              _PAGE_CACHE_MODE_UC_MINUS, NULL);

...compared to set_memory_np() which does not manipulate the memtype tracking.

I'll fix up reserve_memtype() and free_memtype() to be prepared for
decoy addresses.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux