On 2024/11/8 21:57, Lorenzo Stoakes wrote:
Locking around VMAs is complicated and confusing. While we have a number of disparate comments scattered around the place, we seem to be reaching a level of complexity that justifies a serious effort at clearly documenting how locks are expected to be used when it comes to interacting with mm_struct and vm_area_struct objects.

This is especially pertinent as regards the efforts to find sensible abstractions for these fundamental objects in kernel rust code whose compiler strictly requires some means of expressing these rules (and through this expression, self-document these requirements as well as enforce them).

The document limits scope to mmap and VMA locks and those that are immediately adjacent and relevant to them - so additionally covers page table locking as this is so very closely tied to VMA operations (and relies upon us handling these correctly).

The document tries to cover some of the nastier and more confusing edge cases and concerns especially around lock ordering and page table teardown.

The document is split between generally useful information for users of mm interfaces, and separately a section intended for mm kernel developers providing a discussion around internal implementation details.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
---
For the page table locks part:

Acked-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
+
+.. note:: Interestingly, :c:func:`!pte_offset_map_lock` holds an RCU read lock
+          while the PTE page table lock is held.
+
Yes, some paths free PTE pages asynchronously via RCU (currently only collapse_pte_mapped_thp() and retract_page_tables(), and perhaps madvise(MADV_DONTNEED) or even more places in the future), so the RCU read lock taken in pte_offset_map_lock() ensures the stability of the PTE page.

Although the spinlock critical section can itself serve as an RCU read-side critical section, holding the RCU read lock at the same time allows spin_unlock(ptl) and pte_unmap(pte) to be called separately later.

Thanks!
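To illustrate the split release described above, here is a minimal userspace model. It is not kernel code: every `model_` name is a hypothetical stand-in invented for this sketch, and the "stability" rule simply encodes the assumption stated above that a PTE page freed via RCU cannot go away while either the RCU read-side section or the page table lock is in force.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only. All model_* names are hypothetical and merely
 * mimic the roles of pte_offset_map_lock(), spin_unlock(ptl) and
 * pte_unmap(pte); none of this is the kernel implementation.
 */
struct model_state {
	int rcu_depth;   /* nesting depth of the modelled RCU read-side section */
	bool ptl_held;   /* whether the modelled PTE page table lock is held */
	int pte;         /* stand-in for an entry in the PTE page */
};

/* Models pte_offset_map_lock(): enter the RCU section, then take the lock. */
static int *model_pte_offset_map_lock(struct model_state *s)
{
	s->rcu_depth++;
	s->ptl_held = true;
	return &s->pte;
}

/* Models spin_unlock(ptl): drops only the page table lock. */
static void model_spin_unlock(struct model_state *s)
{
	assert(s->ptl_held);
	s->ptl_held = false;
}

/* Models pte_unmap(pte): ends only the RCU read-side section. */
static void model_pte_unmap(struct model_state *s)
{
	assert(s->rcu_depth > 0);
	s->rcu_depth--;
}

/* Assumed rule: the PTE page is stable while either protection is held. */
static bool model_pte_page_stable(const struct model_state *s)
{
	return s->rcu_depth > 0 || s->ptl_held;
}

/* Returns 0 if the PTE page stays stable between the two release steps. */
int model_split_release(void)
{
	struct model_state s = { 0, false, 0 };
	int *pte = model_pte_offset_map_lock(&s);

	*pte = 1;                       /* modify the entry under the lock */
	model_spin_unlock(&s);          /* lock dropped first ... */

	if (!model_pte_page_stable(&s)) /* ... RCU still pins the PTE page */
		return -1;

	model_pte_unmap(&s);            /* RCU section ends separately, later */
	return model_pte_page_stable(&s) ? -1 : 0;
}
```

Calling `model_split_release()` returns 0 in this model: after the lock is dropped the page is still protected by the read-side section, and only once the unmap step has run is it no longer protected.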