On Tue, Apr 05, 2022 at 03:48:19PM +1000, NeilBrown wrote: > On Tue, 05 Apr 2022, Matthew Wilcox wrote: > > It's a shame to not have these functions documented. I'm sure I've > > missed a few things that would be useful to document here. > > > > diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h > > index ab47579af434..47b7851f1b64 100644 > > --- a/include/linux/pagemap.h > > +++ b/include/linux/pagemap.h > > @@ -888,6 +888,18 @@ bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm, > > void unlock_page(struct page *page); > > void folio_unlock(struct folio *folio); > > > > +/** > > + * folio_trylock() - Attempt to lock a folio. > > + * @folio: The folio to attempt to lock. > > + * > > + * Sometimes it is undesirable to wait for a folio to be unlocked (eg > > + * when the locks are being taken in the wrong order, or if making > > + * progress through a batch of folios is more important than processing > > + * them in order). Usually folio_lock() is the correct function to call. > > Usually? > I think a "see also" type reference to folio_lock() is useful, but I > don't think "usually" is helpful. That was supposed to stand in contrast to the "Sometimes". How about this: * folio_lock() is a sleeping function. If sleeping is not the right * behaviour (eg when the locks are being taken in the wrong order, * or if making progress through a batch of folios is more important * than processing them in order) then you can use folio_trylock(). * It is never appropriate to implement a spinlock using folio_trylock(). ... if not, could you suggest some better wording? > > +/** > > + * folio_lock() - Lock this folio. > > + * @folio: The folio to lock. > > + * > > + * The folio lock protects against many things, probably more than it > > + * should. It is primarily held while a folio is being read from storage, > > + * either from its backing file or from swap. It is also held while a > > + * folio is being truncated from its address_space. > > + * > > + * Holding the lock usually prevents the contents of the folio from being > > + * modified by other kernel users, although it does not prevent userspace > > + * from modifying data if it's mapped. The lock is not consistently held > > + * while a folio is being modified by DMA. > > I don't think this paragraph is helpful... maybe if it listed which > change *are* prevented by the lock, rather than a few which aren't? I put that in because it's a common misconception ("holding the page locked will prevent it from being modified"). > I think it is significant that the lock prevents removal from the page > cache, and so ->mapping is only stable while the lock is held. It might > be worth adding something about that. That's implied by the last sentence of the first paragraph, but I can include that. Actually, ->mapping is also stable if the page is mapped and you hold the page table lock of a page table it's mapped in. That's not theoretical BTW, it's the conditions under which ->dirty_folio() is called -- either the folio lock is held, OR the folio is mapped and the PTL is held. That prevents truncation because truncation unmaps pages before clearing ->mapping, and it needs to take the PTL to unmap the page. Hard to express that in a lockdep expression because the PTL isn't passed to folio_mark_dirty(). That explanation probably doesn't belong here, so how about ... * folio_lock() - Lock this folio. * @folio: The folio to lock. * * The folio lock protects against many things, probably more than it * should. It is primarily held while a folio is being brought uptodate, * either from its backing file or from swap. It is also held while a * folio is being truncated from its address_space, so holding the lock * is sufficient to keep folio->mapping stable. * * The folio lock is also held while write() is modifying the page to * provide POSIX atomicity guarantees (as long as the write does not * cross a page boundary). Other modifications to the data in the folio * do not hold the folio lock and can race with writes, eg DMA and stores * to mapped pages. * * Context: May sleep. If you need to acquire the locks of two or * more folios, they must be in order of ascending index, if they are * in the same address_space. If they are in different address_spaces, * acquire the lock of the folio which belongs to the address_space which * has the lowest address in memory first. Looking at the comment on folio_mark_dirty(), it's a bit unhelpful. How about this: * folio_mark_dirty - Mark a folio as being modified. * @folio: The folio. * - * For folios with a mapping this should be done with the folio lock held - * for the benefit of asynchronous memory errors who prefer a consistent - * dirty state. This rule can be broken in some special cases, - * but should be better not to. + * The folio may not be truncated while this function is running. + * Holding the folio lock is sufficient to prevent truncation, but some + * callers cannot acquire a sleeping lock. These callers instead hold + * the page table lock for a page table which contains at least one page + * in this folio. Truncation will block on the page table lock as it + * unmaps pages before removing the folio from its mapping. * * Return: True if the folio was newly dirtied, false if it was already dirty.