Re: [RFC] mm: gup: add helper page_try_gup_pin(page)

Hillf Danton <hdanton@xxxxxxxx> · Mon, 4 Nov 2019 12:34:20 +0800

On Sun, 3 Nov 2019 12:20:10 -0800 John Hubbard wrote:
> On 11/3/19 3:21 AM, Hillf Danton wrote:
> > 
> > A helper is added for mitigating the gup issue described at
> > https://lwn.net/Articles/784574/. It is unsafe to write out
> > a dirty page that is already gup pinned for DMA.
> > 
> > In the current writeback context, dirty pages are written out with
> > no detecting whether they have been gup pinned; Nor mark to keep
> > gupers off. In the gup context, file pages can be pinned with other
> > gupers and writeback ignored.
> > 
> > The factor, that no room, supposedly even one bit, in the current
> > page struct can be used for tracking gupers, makes the issue harder
> > to tackle.
> 
> Well, as long as we're counting bits, I've taken 21 bits (!) to track
> "gupers". :)  More accurately, I'm sharing 31 bits with get_page()...please

Would you please specify the reasoning of tracking multiple gupers
for a dirty page? Do you mean that it is all fine for guper-A to add
changes to guper-B's data without warning and vice versa?

> see my recently posted patchset for tracking dma-pinned pages:
> 
> https://lore.kernel.org/r/20191030224930.3990755-1-jhubbard@xxxxxxxxxx
> 
> Once that is merged, you will have this available:
> 
>    static inline bool page_dma_pinned(struct page *page);
> 
> ...which will reliably track dma-pinned pages.
> 
> After that, we still need to convert a some more call sites (block/bio 
> in particular) to the new pin_user_pages()/put_user_page() system, in 
> order for filesystems to take advantage of it, but this approach has 
> the advantage of actually tracking such pages, rather than faking it by 
> hoping that there is only one gup caller at a time.
> 
> 
> > 
> > The approach here is, because it makes no sense to allow a file page
> > to have multiple gupers at the same time, looking to make gupers
> 
> ohhh...no, that's definitely not a claim you can make.
> 
> 
> > mutually exclusive, and then guper's singulairty helps to tell if a
> > guper is existing by staring at the change in page count.
> > 
> > The result of that sigularity is not yet 100% correct but something
> > of "best effort" as the effect of random get_page() is perhaps also
> > folded in it.
> > It is assumed the best effort is feasible/acceptable in practice
> > without the the cost of growing the page struct size by one bit,
> > were it true that something similar has been applied to the page
> > migrate and reclaim contexts for a while.
> > 
> > With the helper in place, we skip writing out a dirty page if a
> > guper is detected; On gupping, we give up pinning a file page due
> > to writeback or losing the race to become a guper.
> > 
> > The end result is, no gup-pinned page will be put under writeback.
> 
> I think you must have missed the many contentious debates about the
> tension between gup-pinned pages, and writeback. File systems can't
> just ignore writeback in all cases. This patch leads to either
> system hangs or filesystem corruption, in the presence of long-lasting
> gup pins.

The current risk of data corruption due to writeback with long-lived
gup references all ignored is zeroed out by detecting gup-pinned dirty
pages and skipping them; that may lead to problems you mention above.

Though I doubt anything helpful about it can be expected from fs in near
future, we have options for instance that gupers periodically release
their references and re-pin pages after data sync the same way as the
current flusher does.

> Really, this won't work. sorry.
> 
> 
> thanks,
> 
> John Hubbard
> NVIDIA