On 2024/11/19 00:47, Jann Horn wrote:
Make it clearer that holding the mmap lock in read mode is not enough
to traverse page tables, and that just having a stable VMA is not enough
to read PTEs.
Suggested-by: Matteo Rizzo <matteorizzo@xxxxxxxxxx>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
Acked-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
+
+* On 32-bit architectures, they may be in high memory (meaning they need to be
+ mapped into kernel memory to be accessible).
+* When empty, they can be unlinked and RCU-freed while holding an mmap lock or
+ rmap lock for reading in combination with the PTE and PMD page table locks.
+ In particular, this happens in :c:func:`!retract_page_tables` when handling
+ :c:macro:`!MADV_COLLAPSE`.
+ So accessing PTE-level page tables requires at least holding an RCU read lock;
+ but that only suffices for readers that can tolerate racing with concurrent
+ page table updates such that an empty PTE is observed (in a page table that
+ has actually already been detached and marked for RCU freeing) while another
+ new page table has been installed in the same location and filled with
+ entries. Writers normally need to take the PTE lock and revalidate that the
+ PMD entry still refers to the same PTE-level page table.
+
In practice, this also happens in the retract_page_tables(). Maybe can
add a note about this after my patch[1] is merged. ;)
[1].
https://lore.kernel.org/lkml/e5b321ffc3ebfcc46e53830e917ad246f7d2825f.1731566457.git.zhengqi.arch@xxxxxxxxxxxxx/
Thanks!