On Thu, Sep 10, 2020 at 11:13 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
>
> So.. To change away from the stack option I think we'd have to pass
> the READ_ONCE value to pXX_offset() as an extra argument instead of it
> derefing the pointer internally.

Yeah, but I think that would actually be the better model than passing
an address to a random stack location.

It's also effectively what we do in some other places, eg the whole
logic with "orig" in the regular pte fault handling is basically doing
unlocked loads of the pte, various decisions on that, and then doing a
final "is this still the same pte" after it has gotten the page table
lock.

(And yes, those other pte fault handling cases are different, since
they _do_ hold the mmap lock, so they know the page *tables* are
stable, and it's only the last level that then gets re-checked against
the pte once the pte itself has also been stabilized with the page
table lock).

So I think it would actually be a better conceptual match to make the
page table walking interface be "here, this is the value I read once
carefully, and this is the address, now give me the next address".

The folded case would then just return the address it was given, and
the non-folded case would return the inner page table based on the
value.

I dunno. I don't actually feel all that strongly about this, so
whatever works, I guess.

              Linus
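
For illustration only, here is a rough userspace sketch of the interface
shape described above: the caller reads the upper-level entry once, then
passes both the entry's address and that value to the offset helper.
Every name here (toy_pud_t, toy_read_once_pud(), toy_pmd_offset_folded(),
toy_pmd_offset_unfolded(), the 4-entry table and the 12-bit shift) is
hypothetical and chosen for the example. None of it is the kernel's
actual API or the code from the patch under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: all names and sizes are hypothetical, not kernel definitions. */
#define TOY_ENTRIES 4
#define TOY_SHIFT   12

typedef struct { uintptr_t val; } toy_pud_t;  /* upper-level entry */
typedef struct { uintptr_t val; } toy_pmd_t;  /* next-level entry  */

/* Stand-in for READ_ONCE(): one volatile load of the entry. */
static inline toy_pud_t toy_read_once_pud(const toy_pud_t *pudp)
{
    toy_pud_t pud = { *(const volatile uintptr_t *)&pudp->val };
    return pud;
}

static inline size_t toy_pmd_index(unsigned long addr)
{
    return (addr >> TOY_SHIFT) & (TOY_ENTRIES - 1);
}

/*
 * Non-folded level: the next table is derived from the value the caller
 * already read. The pointer is not dereferenced again.
 */
static inline toy_pmd_t *toy_pmd_offset_unfolded(toy_pud_t *pudp,
                                                 toy_pud_t pud,
                                                 unsigned long addr)
{
    (void)pudp;
    return (toy_pmd_t *)pud.val + toy_pmd_index(addr);
}

/*
 * Folded level: there is no separate table, so the helper just hands
 * back the address it was given.
 */
static inline toy_pmd_t *toy_pmd_offset_folded(toy_pud_t *pudp,
                                               toy_pud_t pud,
                                               unsigned long addr)
{
    (void)pud;
    (void)addr;
    return (toy_pmd_t *)pudp;
}

/* Small self-check; returns 0 if the helpers behave as described. */
int toy_walk_demo(void)
{
    toy_pmd_t table[TOY_ENTRIES] = { { 0 } };
    toy_pud_t entry = { (uintptr_t)table };
    unsigned long addr = 2UL << TOY_SHIFT;   /* selects slot 2 */

    toy_pud_t snap = toy_read_once_pud(&entry);

    /* Simulate a concurrent update after the single careful read. */
    entry.val = 0;

    /* The non-folded walk still follows the snapshot, not the new value. */
    assert(toy_pmd_offset_unfolded(&entry, snap, addr) == &table[2]);
    /* The folded walk returns the address it was handed. */
    assert(toy_pmd_offset_folded(&entry, snap, addr) == (toy_pmd_t *)&entry);
    return 0;
}
```

The point of the sketch is that neither helper re-reads *pudp, so a
lockless walker never sees two different values for the same level and
the caller does not need to pass the address of a stack copy.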