Re: [PATCH V2 7/7] mm: Use pgdp_get() for accessing PGD entries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Let's just rewind a bit. This thread exists because the kernel test robot failed
to compile pgd_none_or_clear_bad() (a core-mm function) for the arm architecture
after Anshuman changed the direct pgd dereference to pgdp_get(). The reason
compilation failed is because arm defines its own pgdp_get() override, but it is
broken (there is a typo).

Let's not rewind, because had you fully read and digested my reply, you
would have seen why this isn't a problem... but let me spell it out.


Code before Anshuman's change:

static inline int pgd_none_or_clear_bad(pgd_t *pgd)
{
	if (pgd_none(*pgd))
		return 1;
	if (unlikely(pgd_bad(*pgd))) {
		pgd_clear_bad(pgd);
		return 1;
	}
	return 0;
}

This isn't a problem as the code stands. While there is a dereference
in C, that dereference is a simple struct copy, something that we use
everywhere in the kernel. However, that is as far as it goes, because
neither pgd_none() and pgd_bad() make use of their argument, and thus
the compiler will optimise it away, resulting in no actual access to
the page tables - _as_ _intended_.

Right. Are you saying you depend upon those loads being optimized away for
correctness or performance reasons?


If these are going to be converted to pgd_get(), then we need pgd_get()
to _also_ be optimised away, 

OK, agreed.

So perhaps the best approach is to modify the existing default pxdp_get()
implementations to just do a C dereference. That will ensure that there are no
intended consequences, unlike moving to READ_ONCE() by default. Then riscv
(which I think is the only arch to actually use pxdp_get() currently?) will need
its own pxdp_get() overrides, which use READ_ONCE(). arm64 would also define its
own overrides in terms of READ_ONCE() to ensure single copy atomicity in the
presence of HW updates.

How does that sound to you?

and if e.g. this is the only place that
pgd_get() is going to be used, the suggestion I made in my previous
email is entirely reasonable, since we know that the result of pgd_get()
will not actually be used.

I guess you could do that as an arm-specific override, but I don't think it adds
anything over using my proposed reworked default? Your call.


As an aside, the kernel also dereferences p4d, pud, pmd and pte pointers in
various circumstances.

I already covered these in my previous reply.

And other changes in this series are also replacing those
direct dereferences with calls to similar helpers. The fact that these are all
folded (by a custom arm implementation if I've understood the below correctly)
just means that each dereference is returning what you would call the pmd from
the HW perspective, I think?

It'll "return" the first of each pair of level-1 page table entries,
which is pgd[0] or *p4d, *pud, *pmd - but all of these except *pmd
need to be optimised away, so throwing lots of READ_ONCE() around
this code without considering this is certainly the wrong approach.

Yep, got it.


The core-mm today
dereferences pgd pointers (and p4d, pud, pmd pointers) directly in its code. See
follow_pfnmap_start(),

Doesn't seem to exist at least not in 6.11.

Appologies, I'm on mm-unstable and that isn't upstream yet. See follow_pte() in
v6.11 or __apply_to_page_range(), or pgd_none_or_clear_bad() as per above.

Looking at follow_pte(), it's not a problem.

I think we wouldn't be having this conversation before:

commit a32618d28dbe6e9bf8ec508ccbc3561a7d7d32f0
Author: Russell King <rmk+kernel@xxxxxxxxxxxxxxxx>
Date:   Tue Nov 22 17:30:28 2011 +0000

    ARM: pgtable: switch to use pgtable-nopud.h

where:
-#define pgd_none(pgd)          (0)
-#define pgd_bad(pgd)           (0)

existed before this commit - and thus the dereference in things like:

	pgd_none(*pgd)

wouldn't even be visible to beyond the preprocessor step.






[Index of Archives]     [Video for Linux]     [Yosemite News]     [Linux S/390]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux