On Thu, May 13, 2021 at 11:57:39AM +0100, Steven Price wrote: > On 12/05/2021 18:45, Catalin Marinas wrote: > > On Wed, May 12, 2021 at 04:46:48PM +0100, Steven Price wrote: > >> On 10/05/2021 19:35, Catalin Marinas wrote: > >>>> On Thu, May 06, 2021 at 05:15:25PM +0100, Steven Price wrote: > >>>>>> On Thu, Apr 29, 2021 at 05:06:41PM +0100, Steven Price wrote: > >>>>>>> Given the changes to set_pte_at() which means that tags are restored from > >>>>>>> swap even if !PROT_MTE, the only race I can see remaining is the creation of > >>>>>>> new PROT_MTE mappings. As you mention an attempt to change mappings in the > >>>>>>> VMM memory space should involve a mmu notifier call which I think serialises > >>>>>>> this. So the remaining issue is doing this in a separate address space. > >>>>>>> > >>>>>>> So I guess the potential problem is: > >>>>>>> > >>>>>>> * allocate memory MAP_SHARED but !PROT_MTE > >>>>>>> * fork() > >>>>>>> * VM causes a fault in parent address space > >>>>>>> * child does a mprotect(PROT_MTE) > >>>>>>> > >>>>>>> With the last two potentially racing. Sadly I can't see a good way of > >>>>>>> handling that. > > [...] > >>> Options: > >>> > >>> 1. Change the mte_sync_tags() code path to set the flag after clearing > >>> and avoid reading stale tags. We document that mprotect() on > >>> MAP_SHARED may lead to tag loss. Maybe we can intercept this in the > >>> arch code and return an error. > >> > >> This is the best option I've come up with so far - but it's not a good > >> one! We can replace the set_bit() with a test_and_set_bit() to catch the > >> race after it has occurred - but I'm not sure what we can do about it > >> then (we've already wiped the data). Returning an error doesn't seem > >> particularly useful at that point, a message in dmesg is about the best > >> I can come up with. > > > > What I meant about intercepting is on something like > > arch_validate_flags() to prevent VM_SHARED and VM_MTE together but only > > for mprotect(), not mmap(). However, arch_validate_flags() is currently > > called on both mmap() and mprotect() paths. > > I think even if we were to restrict mprotect() there would be corner > cases around swapping in. For example if a page mapped VM_SHARED|VM_MTE > is faulted simultaneously in both processes then we have the same situation: > > * with test_and_set_bit() one process could potentially see the tags > before they have been restored - i.e. a data leak. > > * with separated test and set then one process could write to the tags > before the second restore has completed causing a lost update. I don't think this can happen. We have shmem_swapin_page() which I think would be called on any faulting pte that was sharing such page. It takes the page lock around arch_swap_restore(). -- Catalin _______________________________________________ kvmarm mailing list kvmarm@xxxxxxxxxxxxxxxxxxxxx https://lists.cs.columbia.edu/mailman/listinfo/kvmarm