On Mon, Oct 24, 2022 at 12:58 PM Jann Horn <jannh@xxxxxxxxxx> wrote: > > Unless I'm completely misunderstanding what's going on here, the whole > "remove_table" thing only happens when you "remove a table", meaning > you free an entire *pagetable*. Just zapping PTEs doesn't trigger that > logic. I do have to admit that I'd be happier if this code - and the GUP code that also relies on "interrupts off" behavior - would just use a sequence counter instead. Relying on blocking IPI's is clever, but also clearly very subtle and somewhat dangerous. I think our GUP code is a *lot* more important than some "legacy x86-32 has problems in case you have an incredibly unlikely race that re-populates the page table with a different page that just happens to be exactly the same MOD-4GB", so honestly, I don't think the load-tearing is even worth worrying about - if you have hardware that is good enough at virtualizing things, it's almost certainly already 64-bit, and running 32-bit virtual machines with PAE you really only have yourself to blame. So I can't find it in myself to care about the 32-bit tearing thing, but this discussion makes me worried about Fast GUP. Note that even with proper atomic pte_t pte = ptep_get_lockless(ptep); in gup_pte_range(), and even if the page tables are RCU-free'd, that just means that the 'ptep' access itself is safe. But then you have the whole "the lookup of the page pointer is not atomic" wrt that. And right now that GUP code does rely on the "block IPI" to make it basically valid. I don't think it matters if GUP races with munmap or madvise() or something like that - if you get the old page, that's still a valid page, and the user only has himself to blame. But if we have memory pressure that causes vmscan to push out a page, and it gets replaced with a new page, and GUP gets the old page with no serialization, that sounds like a possible source of data inconsistency. I don't know if this can happen, but the whole "interrupts disabled doesn't actually block IPI's and synchronize with TLB flushes" really sounds like it would affect GUP too. And be much more serious there than on some x86-32 platform that nobody should be using anyway. Linus