On Tue, Nov 30, 2021, Peter Xu wrote:
> On Mon, Nov 29, 2021 at 10:31:14AM -0800, Ben Gardon wrote:
> > 2. There could be a pointer to the page table in a vCPU's paging
> > structure caches, which are similar to the TLB but cache partial
> > translations. These are also cleared out on TLB flush.
>
> Could you elaborate what's the structure cache that you mentioned? I thought
> the processor page walker will just use the data cache (L1-L3) as pgtable
> caches, in which case IIUC the invalidation happens when we do WRITE_ONCE()
> that'll invalidate all the rest data cache besides the writter core. But I
> could be completely missing something..

Ben is referring to the Intel SDM's use of the term "paging-structure caches".
Intel CPUs, and I'm guessing other x86 CPUs, cache upper level entries, e.g. the
L4 PTE for a given address, to avoid having to do data cache lookups, reserved
bit checks, A/D assists, etc... Like full VA=>PA TLB entries, these entries are
associated with the PCID, VPID, EPT4A, etc...

The data caches are still used when reading PTEs that aren't cached in the TLB;
the extra caching in the "TLB" is an optimization on top.

  28.3.1 Information That May Be Cached

  Section 4.10, “Caching Translation Information” in Intel® 64 and IA-32
  Architectures Software Developer’s Manual, Volume 3A identifies two kinds of
  translation-related information that may be cached by a logical processor:
  translations, which are mappings from linear page numbers to physical page
  frames, and paging-structure caches, which map the upper bits of a linear page
  number to information from the paging-structure entries used to translate
  linear addresses matching those upper bits.
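
To make the "upper bits of a linear page number" part concrete, here is a rough
sketch of how a paging-structure cache entry could be tagged. It assumes 4-level
paging with 4 KiB pages and 48-bit linear addresses, and it ignores the
PCID/VPID/EPT tagging entirely. The helper names pt_index() and ps_cache_tag()
are made up for illustration; they are not KVM or SDM names, and real hardware
is free to implement this differently.

```c
#include <stdint.h>

/* Assumed layout: 4-level paging, 4 KiB pages, 48-bit linear addresses. */
#define PAGE_SHIFT	12
#define LEVEL_BITS	9
#define VA_BITS		48

/*
 * Index into the paging structure at 'level' for linear address 'va'.
 * Level 4 is the PML4, level 3 the PDPT, level 2 the PD, level 1 the PT.
 */
static unsigned int pt_index(uint64_t va, int level)
{
	return (va >> (PAGE_SHIFT + LEVEL_BITS * (level - 1))) & 0x1ff;
}

/*
 * Hypothetical tag for a paging-structure cache entry that caches the
 * entry read at 'level': all linear address bits from bit 47 down to the
 * lowest bit used to index that level.  Level 4 uses bits 47:39, level 3
 * bits 47:30, level 2 bits 47:21.  Any two addresses with the same tag
 * can reuse the cached entry and skip the upper part of the walk.
 */
static uint64_t ps_cache_tag(uint64_t va, int level)
{
	int low = PAGE_SHIFT + LEVEL_BITS * (level - 1);

	return (va & ((1ULL << VA_BITS) - 1)) >> low;
}
```

With this model, two addresses in the same 2 MiB region get the same level 2
tag, so a cached PDE lets the walker go straight to the page table. Two
addresses 1 GiB apart still share the level 4 tag but not the level 3 tag, so
only the cached PML4E helps. That sharing is also why a stale entry here is a
problem when a page table is freed, and why these caches have to be flushed
along with the TLB.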