On Fri, Jan 28, 2022 at 12:02 AM David Hildenbrand <david@xxxxxxxxxx> wrote: > > On 27.01.22 22:16, Yang Shi wrote: > > On Wed, Jan 26, 2022 at 10:54 AM David Hildenbrand <david@xxxxxxxxxx> wrote: > >> > >>>>> Just page lock or elevated page refcount could serialize against THP > >>>>> split AFAIK. > >>>>> > >>>>>> > >>>>>> But yeah, using the mapcount of a page that is not even mapped > >>>>>> (migration entry) is clearly wrong. > >>>>>> > >>>>>> To summarize: reading the mapcount on an unlocked page will easily > >>>>>> return a wrong result and the result should not be relied upon. reading > >>>>>> the mapcount of a migration entry is dangerous and certainly wrong. > >>>>> > >>>>> Depends on your usecase. Some just want to get a snapshot, just like > >>>>> smaps, they don't care. > >>>> > >>>> Right, but as discussed, even the snapshot might be slightly wrong. That > >>>> might be just fine for smaps (and I would have enjoyed a comment in the > >>>> code stating that :) ). > >>> > >>> I think that is documented already, see Documentation/filesystems/proc.rst: > >>> > >>> Note: reading /proc/PID/maps or /proc/PID/smaps is inherently racy (consistent > >>> output can be achieved only in the single read call). > >> > >> Right, but I think there is a difference between > >> > >> * Atomic values that change immediately afterwards ("this value used to > >> be true at one point in time") > >> * Values that are unstable because we cannot read them atomically ("this > >> value never used to be true") > >> > >> I'd assume with the documented race we actually talk about the first > >> point, but I might be just wrong. > >> > >>> > >>> Of course, if the extra note is preferred in the code, I could try to > >>> add some in a separate patch. > >> > >> When staring at the (original) code I would have hoped to find something > >> like: > >> > >> /* > >> * We use page_mapcount() to get a snapshot of the mapcount. Without > >> * holding the page lock this snapshot can be slightly wrong as we > >> * cannot always read the mapcount atomically. As long we hold the PT > >> * lock, the page cannot get unmapped and it's at safe to call > >> * page_mapcount(). > >> */ > >> > >> With the addition of > >> > >> "... For unmapped pages (e.g., migration entries) we cannot guarantee > >> that, so treat the mapcount as being 1." > > > > It seems a little bit confusing to me, it is not safe to call with PTL > > held either, right? I'd like to rephrase the note to: > > The implication that could have been spelled out is that only a mapped > page can get unmapped. (I know, there are some weird migration entries > nowadays ...) Yes, I see your point. Just felt "only a mapped page can get unmapped" sounds not that straightforward (just my personal feeling). How's about "It is not safe to call page_mapcount() even with PTL held if the page is not mapped, especially for migration entries". > > /* > * We use page_mapcount() to get a snapshot of the mapcount. Without > * holding the page lock this snapshot can be slightly wrong as we > * cannot always read the mapcount atomically. As long we hold the PT > * lock, a mapped page cannot get unmapped and it's at safe to call > * page_mapcount(). Especially for migration entries, it's not safe to > * call page_mapcount(), so we treat the mapcount as being 1. > */ > > > > -- > Thanks, > > David / dhildenb >