On Fri, 15 Dec 2023, Robin Murphy wrote:
On 14/12/2023 9:35 pm, Christoph Lameter (Ampere) wrote:
On Thu, 14 Dec 2023, Robin Murphy wrote:
It seems somewhat suspect that these counts only ever increase. It's not
often that we change or remove parts of the linear map, but it certainly
can happen.
Well yes, in the case of hotplug I guess ... OK, here is V2
There are also paths where we remove and reinstate parts of the linear map
via set_memory_valid(), set_direct_map_*(), and possibly others. If we're
exposing a user ABI that claims to be accounting kernel VA mappings, then I
think users are within their rights to expect it to actually account kernel
VA mappings, not just expose numbers whose only guaranteed significance is
whether they are zero or nonzero.
set_memory_valid() changes mappings via __change_memory_common()
and apply_to_page_range(). It seems that apply_to_page_range() creates
PTEs as needed etc.
However, I do not see any accounting for direct map modifications
on x86 either. Since this was satisfactory for x86, I don't
believe that more is needed. Introducing atomics in potentially
performance sensitive functions run when the kernel is up is not that
advisable and doing so would require core kernel changes going beyond the
enablement of arch_report_meminfo() on ARM64.
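
The concern about counts that only ever increase can be illustrated with a toy model. This is a userspace sketch only, not kernel code: the class names, the replay() helper, and the page numbers are all hypothetical, and it assumes that re-mapping a previously removed range goes through the same counting path as the initial mapping.

```python
# Toy model of two accounting strategies. All names are hypothetical
# and nothing here corresponds to a real kernel interface.

class IncrementOnlyCounter:
    """Counts pages when a mapping is created and never subtracts."""

    def __init__(self):
        self.pages = 0

    def map(self, n):
        self.pages += n

    def unmap(self, n):
        # Teardown is not accounted, so the count can only grow.
        pass


class PairedCounter:
    """Counts pages on map and subtracts them again on unmap."""

    def __init__(self):
        self.pages = 0

    def map(self, n):
        self.pages += n

    def unmap(self, n):
        if n > self.pages:
            raise ValueError("cannot unmap more pages than are mapped")
        self.pages -= n


def replay(counter, events):
    """Apply a list of ("map" | "unmap", page_count) events to a counter."""
    for op, n in events:
        if op == "map":
            counter.map(n)
        elif op == "unmap":
            counter.unmap(n)
        else:
            raise ValueError(f"unknown event: {op}")
    return counter.pages


# Map 1024 pages, remove 256 of them, then reinstate the same 256.
EVENTS = [("map", 1024), ("unmap", 256), ("map", 256)]
```

Replaying EVENTS leaves the paired counter at the 1024 pages that are actually mapped, while the increment-only counter has drifted upwards. The paired variant is the one that would need the extra bookkeeping in the modification paths discussed above.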
Looking again, am I also right in thinking that what I assumed were the
non-contiguous counts here are actually total counts of *either* type of
mapping at that level, and inclusive of the contiguous counts? If so, that
seems a bit non-obvious - my intuitive expectation would be for the sum of
all these numbers to represent the total amount of direct-mapped RAM, where
either we're interested in each distinct type of mapping and accounting them
all separately, or we're simply interested in the general shape of the
pagetables, and thus would account per-level and ignore the contiguous bit
since we don't know whether it's actually doing anything useful anyway.
Yes, the CONT PTEs are a subset of the other counts since they are only a
special case of the PTE type. They are important for performance on ARM64
in particular with the anticipated use of them for the various sizes of
pages supported in the kernel with the introduction of folios for the page
cache.
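
Since the contiguous counts are a subset of the per-level totals, a reader of these numbers has to subtract rather than add to get the non-contiguous share. A minimal sketch of that arithmetic follows. The level names, the LEVEL_SIZE table, and both helper functions are hypothetical, and the sizes assume a 4K granule; they are not the field names or units used by the patch.

```python
# Hypothetical per-level mapping sizes for a 4K-granule kernel.
# These names and values are assumptions for illustration only.
LEVEL_SIZE = {
    "pte": 4 << 10,   # 4 KiB page
    "pmd": 2 << 20,   # 2 MiB block
    "pud": 1 << 30,   # 1 GiB block
}


def non_contiguous(total, cont):
    """Entries at one level that do not have the contiguous bit set.

    `total` is the inclusive count for the level and `cont` is the
    contiguous subset of it.
    """
    if not 0 <= cont <= total:
        raise ValueError("contiguous count must be a subset of the level total")
    return total - cont


def direct_mapped_bytes(totals):
    """Total bytes covered, summing only the inclusive per-level totals.

    Contiguous counts are deliberately not added, because they are
    already included in the totals.
    """
    return sum(LEVEL_SIZE[level] * count for level, count in totals.items())
```

For example, with 512 PTEs of which 128 are contiguous, non_contiguous(512, 128) gives 384, and the 128 must not be added a second time when summing the mapped size.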
The problem with CONT_PTE is that it is not clear whether the architecture
supports it or not. The amount of CONT_PTE can influence the TLB coverage
possible in kernel space.
We are generally interested in the shape of the page tables. If the user
later uses processes that require a degradation through smaller mappings
then that is load dependent.