On 10/7/24 01:44, David Hildenbrand wrote:
> On 02.10.24 19:35, Dave Hansen wrote:
>> We were just chatting about this on David Rientjes's MM alignment call.
>
> Unfortunately I was not able to attend this time, my body decided it's a
> good idea to stay in bed for a couple of days.
>
>> I thought I'd try to give a little brain dump.
>>
>> Let's start by thinking about KVM and secondary MMUs. KVM has a primary
>> mm: the QEMU (or whatever) process mm. The virtualization (EPT/NPT)
>> tables get entries that effectively mirror the primary mm page tables
>> and constitute a secondary MMU. If the primary page tables change,
>> mmu_notifiers ensure that the changes get reflected into the
>> virtualization tables and also that the virtualization paging structure
>> caches are flushed.
>>
>> msharefs is doing something very similar. But, in the msharefs case,
>> the secondary MMUs are actually normal CPU MMUs. The page tables are
>> normal old page tables and the caches are the normal old TLB. That's
>> what makes it so confusing: we have lots of infrastructure for dealing
>> with that "stuff" (CPU page tables and TLB), but msharefs has
>> short-circuited the infrastructure and it doesn't work any more.
>
> It's quite different IMHO, to a degree that I believe they are different
> beasts:
>
> Secondary MMUs:
>  * "Belongs" to same MM context as the primary MMU (process page tables)

I think you're speaking to the ratio here. For each secondary MMU, I
think you're saying that there's one and only one mm_struct. Is that
right?

>  * Maintains separate tables/PTEs, in completely separate page table
>    hierarchy

This is the case for KVM and the VMX/SVM MMUs, but it's not generally
true about hardware. IOMMUs can walk x86 page tables and populate the
IOTLB from the _same_ page table hierarchy as the CPU.
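
To make that shared-hierarchy case concrete, it's roughly what the IOMMU
SVA interface exposes to drivers today.  A rough sketch (not a real
driver, the function name and the device-programming step are made up;
signature per recent kernels, older ones took a third 'drvdata' argument):

	#include <linux/iommu.h>
	#include <linux/sched.h>
	#include <linux/err.h>

	/*
	 * Sketch: bind a device to a process mm so the IOMMU walks the
	 * *same* CPU page table hierarchy (Shared Virtual Addressing).
	 */
	static int example_bind_to_current_mm(struct device *dev)
	{
		struct iommu_sva *handle;
		u32 pasid;

		handle = iommu_sva_bind_device(dev, current->mm);
		if (IS_ERR(handle))
			return PTR_ERR(handle);

		pasid = iommu_sva_get_pasid(handle);

		/*
		 * Program 'pasid' into the device.  DMA tagged with that
		 * PASID is then translated through current->mm's page
		 * tables; there is no separate mirrored hierarchy, only
		 * the IOTLB to keep coherent.
		 */

		iommu_sva_unbind_device(handle);
		return 0;
	}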
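
And for contrast, the mirroring case I described above is the classic
mmu_notifier consumer pattern.  A minimal sketch, with made-up names and
an empty callback body (KVM's real implementation is far more involved):

	#include <linux/mmu_notifier.h>
	#include <linux/mm.h>

	/*
	 * When the primary page tables change, zap the mirrored entries
	 * in the separate secondary hierarchy (what KVM does for EPT/NPT).
	 */
	static int example_invalidate_range_start(struct mmu_notifier *mn,
				const struct mmu_notifier_range *range)
	{
		/* Drop secondary-MMU entries covering [range->start, range->end). */
		return 0;
	}

	static const struct mmu_notifier_ops example_ops = {
		.invalidate_range_start	= example_invalidate_range_start,
	};

	static struct mmu_notifier example_mn = {
		.ops = &example_ops,
	};

	static int example_mirror_mm(struct mm_struct *mm)
	{
		/* Subscribe to changes of the primary mm's page tables. */
		return mmu_notifier_register(&example_mn, mm);
	}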