On Tue, 07 Mar 2023 03:45:55 +0000,
Ricardo Koller <ricarkol@xxxxxxxxxx> wrote:
>
> From: Marc Zyngier <maz@xxxxxxxxxx>

Thanks for writing a commit message for my hacks!

>
> Broadcasted TLB invalidations (TLBI) are usually less performant than

More precisely, TLBIs targeting the Inner Shareable domain. Also,
's/broadcasted/broadcast/', as this is an adjective and not a verb
indicative of the past tense.

> their local variant. In particular, we observed some implementations

non-shareable rather than local. 'Local' has all sorts of odd
implementation-specific meanings (local to *what* is the usual
question that follows...).

> that take millliseconds to complete parallel broadcasted TLBIs.
>
> It's safe to use local, non-shareable, TLBIs when relaxing permissions

s/local//

> on a PTE in the KVM case for a couple of reasons. First, according to
> the ARM Arm (DDI 0487H.a D5-4913), permission relaxation does not need
> break-before-make.

This requires some more details, and references to the latest
revision of the ARM ARM (0487I.a). In that particular revision, the
relevant information is contained in D8.13.1 "Using break-before-make
when updating translation table entries", and more importantly in the
rule R_WHZWS, which states that only a change of output address or
block size requires a BBM.

> Second, the VTTBR_EL2.CnP==0 case, where each PE
> has its own TLB entry for the same page, is tolerated correctly by KVM
> when doing permission relaxation. Not having changes broadcasted to
> all PEs is correct for this case, as it's safe to have other PEs fault
> on permission on the same page.

I'm not sure mentioning CnP is relevant here. If CnP==1, the TLBI will
nuke the TLB visible to the sibling PE, but not any other. So this is
always a partial TLB invalidation, irrespective of CnP.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.