On 2/26/24 04:03, Ryan Roberts wrote:
Hi Ryan!
Make clear the atmicity/consistency requirements of the API and how we
"atomicity"
achieve them.
Link: https://lore.kernel.org/linux-mm/Zc-Tqqfksho3BHmU@xxxxxxx/
Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
---
arch/arm64/mm/contpte.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
index be0a226c4ff9..1b64b4c3f8bf 100644
--- a/arch/arm64/mm/contpte.c
+++ b/arch/arm64/mm/contpte.c
@@ -183,16 +183,20 @@ EXPORT_SYMBOL_GPL(contpte_ptep_get);
pte_t contpte_ptep_get_lockless(pte_t *orig_ptep)
{
/*
- * Gather access/dirty bits, which may be populated in any of the ptes
- * of the contig range. We may not be holding the PTL, so any contiguous
- * range may be unfolded/modified/refolded under our feet. Therefore we
- * ensure we read a _consistent_ contpte range by checking that all ptes
- * in the range are valid and have CONT_PTE set, that all pfns are
- * contiguous and that all pgprots are the same (ignoring access/dirty).
- * If we find a pte that is not consistent, then we must be racing with
- * an update so start again. If the target pte does not have CONT_PTE
- * set then that is considered consistent on its own because it is not
- * part of a contpte range.
+ * The ptep_get_lockless() API requires us to read and return *orig_ptep
+ * so that it is self-consistent, without the PTL held, so we may be
+ * racing with other threads modifying the pte. Usually a READ_ONCE()
+ * would suffice, but for the contpte case, we also need to gather the
+ * access and dirty bits from across all ptes in the contiguous block,
+ * and we can't read all of those neighbouring ptes atomically, so any
This still leaves a key detail unexplained: how the accessed and dirty bits
are handled. The above raises the *problem*, but then only talks about getting
a consistent set of reads. While those reads are in progress, the HW could
still set the accessed or dirty bit on any pte in the range, and this code
returns only a single pte. So it is still unclear to me what we're trying to
say about the accessed and dirty bits.
+ * contiguous range may be unfolded/modified/refolded under our feet.
+ * Therefore we ensure we read a _consistent_ contpte range by checking
+ * that all ptes in the range are valid and have CONT_PTE set, that all
+ * pfns are contiguous and that all pgprots are the same (ignoring
+ * access/dirty). If we find a pte that is not consistent, then we must
+ * be racing with an update so start again. If the target pte does not
+ * have CONT_PTE set then that is considered consistent on its own
+ * because it is not part of a contpte range.
*/
pgprot_t orig_prot;
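
For illustration, here is a minimal userspace sketch of the retry scheme that
the new comment describes. It is not the kernel code: the sk_* names, the bit
layout and the block size of 16 are assumptions made up for this example, and
C11 relaxed atomic loads stand in for READ_ONCE(). It shows one reading of the
accessed/dirty handling: the bits seen on any pte of a consistent block are
OR'ed into the single pte that is returned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only; NOT the arm64 pte format. */
typedef uint64_t sk_pte_t;

#define SK_CONT_PTES  16u            /* ptes per contiguous block (assumed) */
#define SK_VALID      (1ull << 0)
#define SK_CONT       (1ull << 1)
#define SK_AF         (1ull << 2)    /* accessed */
#define SK_DIRTY      (1ull << 3)
#define SK_WRITE      (1ull << 4)    /* stand-in for the other pgprot bits */
#define SK_PFN_SHIFT  12
#define SK_FLAGS_MASK ((1ull << SK_PFN_SHIFT) - 1)

/* Flag bits with access/dirty masked out, i.e. the part that must match. */
static uint64_t sk_prot(sk_pte_t pte)
{
	return pte & SK_FLAGS_MASK & ~(SK_AF | SK_DIRTY);
}

/* Populate one aligned block with consecutive pfns and identical flags. */
static void sk_fill_block(_Atomic sk_pte_t *table, size_t base,
			  uint64_t pfn, uint64_t flags)
{
	assert(base % SK_CONT_PTES == 0);
	for (size_t i = 0; i < SK_CONT_PTES; i++)
		atomic_store_explicit(&table[base + i],
				      ((pfn + i) << SK_PFN_SHIFT) | flags,
				      memory_order_relaxed);
}

/* Lockless read of table[idx], gathering access/dirty across its block. */
static sk_pte_t sk_get_lockless(_Atomic sk_pte_t *table, size_t idx)
{
	size_t base = idx & ~(size_t)(SK_CONT_PTES - 1);
	sk_pte_t orig, pte;
	uint64_t prot, pfn;

retry:
	orig = atomic_load_explicit(&table[idx], memory_order_relaxed);

	/* Not part of a contpte range: consistent on its own. */
	if (!(orig & SK_VALID) || !(orig & SK_CONT))
		return orig;

	prot = sk_prot(orig);
	pfn = (orig >> SK_PFN_SHIFT) - (idx - base);

	for (size_t i = 0; i < SK_CONT_PTES; i++, pfn++) {
		pte = atomic_load_explicit(&table[base + i],
					   memory_order_relaxed);

		/* sk_prot() includes VALID and CONT, so this checks both. */
		if (sk_prot(pte) != prot || (pte >> SK_PFN_SHIFT) != pfn)
			goto retry;	/* racing with an update: start again */

		orig |= pte & (SK_AF | SK_DIRTY);
	}

	return orig;
}
```

In this sketch the returned value is the target pte plus whatever accessed and
dirty bits were observed during one pass that passed every check. It is only a
snapshot: the HW may set further bits right after the pass completes, so the
result can be stale by the time the caller looks at it.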
thanks,
--
John Hubbard
NVIDIA