On 9/28/20 4:57 PM, Jason Gunthorpe wrote:
> On Mon, Sep 28, 2020 at 12:29:55PM -0700, Linus Torvalds wrote:
...
> I think this is really hard to use and ugly. My thinking has been to
> just stick:
>
>     if (flags & FOLL_LONGTERM)
>         flags |= FOLL_FORCE | FOLL_WRITE
>
> In pin_user_pages(). It would make the driver API cleaner. If we can
+1, yes. The other choices so far are, as you say, really difficult to figure out.
> do a bit better somehow by not COW'ing for certain VMAs as you
> explained then all the better, but not my primary goal.
>
> Basically, I think if a driver is using FOLL_LONGTERM | FOLL_PIN we
> should guarantee that driver a consistent MM and take the gup_fast
> performance hit to do it.
>
> AFAICT the giant whack of other cases not using FOLL_LONGTERM really
> shouldn't care about read-decoherence. For those cases the user should
> really not be racing writes with data under read-only pin, and the new
> COW logic looks like it solves the other issues with this.
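For readers following along, the flag promotion proposed above can be sketched as a small user-space helper. This is only an illustration under stated assumptions: the bit values below are made up for the example (the kernel's real FOLL_* definitions are not reproduced here), and promote_longterm_flags() is a hypothetical name, not an existing kernel function.

```c
/*
 * Illustrative flag bits only. These are NOT the kernel's real FOLL_*
 * values; they are chosen so the example is self-contained.
 */
#define FOLL_WRITE    0x01u
#define FOLL_FORCE    0x02u
#define FOLL_LONGTERM 0x04u
#define FOLL_PIN      0x08u

/*
 * Hypothetical helper mirroring the proposal: a long-term pin is always
 * treated as a forced write pin, so the caller does not have to work out
 * the right flag combination itself.
 */
static unsigned int promote_longterm_flags(unsigned int flags)
{
    if (flags & FOLL_LONGTERM)
        flags |= FOLL_FORCE | FOLL_WRITE;
    return flags;
}
```

With this shape, a caller passing FOLL_LONGTERM | FOLL_PIN ends up with FOLL_FORCE | FOLL_WRITE set as well, while callers that do not ask for a long-term pin are left untouched.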
I hope this doesn't kill the seqcount() idea, though. That was my favorite part of the discussion, because it neatly separates out the two racing domains (fork, gup/pup) and allows easy reasoning about them--without really impacting performance. Truly elegant. We should go there.
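To make the shape of that idea concrete, here is a rough user-space sketch using C11 atomics. It assumes the fork side brackets its page-table copy with a counter bump on each side, and the fast gup/pup side samples the counter before and after taking a speculative pin. All names (toy_seqcount, toy_try_fast_pin, and so on) are hypothetical and this is not the kernel's seqcount API, just a toy model of the pattern.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy sequence counter: an odd value means a writer ("fork") is mid-update. */
struct toy_seqcount {
    atomic_uint seq;
};

static void toy_seqcount_init(struct toy_seqcount *s)
{
    atomic_init(&s->seq, 0u);
}

/* Writer side (assumed to be fork): bracket the page-table copy. */
static void toy_write_begin(struct toy_seqcount *s)
{
    atomic_fetch_add(&s->seq, 1u);   /* counter becomes odd */
}

static void toy_write_end(struct toy_seqcount *s)
{
    atomic_fetch_add(&s->seq, 1u);   /* counter becomes even again */
}

/*
 * Reader side (assumed to be the fast pin path). Returns true if the pin
 * was taken with no writer observed. Returns false, with the speculative
 * pin undone, if a writer was active or raced with us; the caller would
 * then fall back to a slower, locked path.
 */
static bool toy_try_fast_pin(struct toy_seqcount *s, atomic_int *pins)
{
    unsigned int start = atomic_load(&s->seq);

    if (start & 1u)
        return false;                /* writer in progress */

    atomic_fetch_add(pins, 1);       /* speculative pin */

    if (atomic_load(&s->seq) != start) {
        atomic_fetch_sub(pins, 1);   /* writer raced with us: undo */
        return false;
    }
    return true;
}
```

The point of the sketch is that neither side takes a lock in the common case: the reader pays two counter loads, and only a pin that actually overlaps a fork is pushed to the slow path.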
> I know Jann/John have been careful to not have special behaviors for
> the DMA case, but I think it makes sense here. It is actually
> different.
I think that makes sense. Everyone knew that DMA/FOLL_LONGTERM call sites were at least potentially special, despite the spirited debates in at least two conferences about the meaning and implications of "long term". :) And here we are seeing an example of such a special case, which I think is natural enough.

thanks,
-- 
John Hubbard
NVIDIA