On 2024/11/19 15:48, Lorenzo Stoakes wrote:
On Tue, Nov 19, 2024 at 02:53:52PM +0800, Qi Zheng wrote:
On 2024/11/19 00:47, Jann Horn wrote:
Make it clearer that holding the mmap lock in read mode is not enough
to traverse page tables, and that just having a stable VMA is not enough
to read PTEs.
Suggested-by: Matteo Rizzo <matteorizzo@xxxxxxxxxx>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
Acked-by: Qi Zheng <zhengqi.arch@xxxxxxxxxxxxx>
+
+* On 32-bit architectures, they may be in high memory (meaning they need to be
+ mapped into kernel memory to be accessible).
+* When empty, they can be unlinked and RCU-freed while holding an mmap lock or
+ rmap lock for reading in combination with the PTE and PMD page table locks.
+ In particular, this happens in :c:func:`!retract_page_tables` when handling
+ :c:macro:`!MADV_COLLAPSE`.
+ So accessing PTE-level page tables requires at least holding an RCU read lock;
+ but that only suffices for readers that can tolerate racing with concurrent
+ page table updates such that an empty PTE is observed (in a page table that
+ has actually already been detached and marked for RCU freeing) while another
+ new page table has been installed in the same location and filled with
+ entries. Writers normally need to take the PTE lock and revalidate that the
+ PMD entry still refers to the same PTE-level page table.
+
In practice, this also happens in retract_page_tables(). Maybe we can
add a note about this after my patch[1] is merged. ;)
[1]. https://lore.kernel.org/lkml/e5b321ffc3ebfcc46e53830e917ad246f7d2825f.1731566457.git.zhengqi.arch@xxxxxxxxxxxxx/
You could even queue the doc change up there? :>)
OK, I can add this note to my patch after this patch is merged.
I think one really nice thing with having docs in-tree like this is when we
change things that alter the doc's accuracy we can queue them up with the
patch so the doc always stays in sync.
Agreed.
I feel you may have accidentally self-volunteered there ;)
Thanks!