On Thu, Apr 21, 2022 at 09:11:37AM -0700, Ben Gardon wrote:
> On Fri, Apr 15, 2022 at 2:59 PM Oliver Upton <oupton@xxxxxxxxxx> wrote:
> >
> > For parallel walks that collapse a table into a block KVM ensures a
> > locked invalid pte is visible to all observers in pre-order traversal.
> > As such, there is no need to try breaking the pte again.
>
> When you're doing the pre and post-order traversals, are they
> implemented as separate traversals from the root, or is it a kind of
> pre and post-order where non-leaf nodes are visited on the way down
> and on the way up?

The latter. We do one walk of the tables and fire the appropriate
visitor callbacks based on what part of the walk we're in.

> I assume either could be made to work, but the re-traversal from the
> root probably minimizes TLB flushes, whereas the pre-and-post-order
> would be a more efficient walk?

When we need to start doing operations on a whole range of memory this
way I completely agree (collapse to 2M, shatter to 4K for a memslot,
etc.).

For the current use cases of the stage 2 walker, to coalesce TLBIs we'd
need a better science around when to blast all of stage 2 vs. TLBI with
an IPA argument. IOW, we go through a decent bit of trouble to avoid
flushing all of stage 2 unless deemed necessary. And the other
unfortunate thing about that is I doubt observations are portable
between implementations, so the point where we cut over to a full flush
is likely highly dependent on the microarch.

Later revisions of the ARM architecture bring us TLBI instructions that
take a range argument, which could help a lot in this department.

--
Thanks,
Oliver
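
To illustrate the "one walk, callbacks fired by phase" answer above, here
is a minimal userspace sketch. It is not the kernel's walker: the
two-way tree, struct node, enum visit_flag, walk(), record() and
demo_walk() are all hypothetical names made up for this example. They
only loosely echo the idea behind the stage 2 walker's table-pre, leaf
and table-post visits, and there is no locking or TLB maintenance here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical phases of a single walk (not the kernel's definitions). */
enum visit_flag { VISIT_TABLE_PRE, VISIT_LEAF, VISIT_TABLE_POST };

/* Toy stand-in for a page-table entry: a table has children, a leaf has none. */
struct node {
	const char *name;
	struct node *child[2];
};

typedef void (*visitor_fn)(const struct node *n, enum visit_flag flag,
			   void *arg);

/*
 * One traversal from the root. A table node is visited twice, once on
 * the way down (pre-order) and once on the way up (post-order); a leaf
 * is visited once.
 */
static void walk(const struct node *n, visitor_fn cb, void *arg)
{
	size_t i;

	assert(cb != NULL);
	if (!n)
		return;

	if (!n->child[0] && !n->child[1]) {
		cb(n, VISIT_LEAF, arg);
		return;
	}

	cb(n, VISIT_TABLE_PRE, arg);
	for (i = 0; i < 2; i++)
		walk(n->child[i], cb, arg);
	cb(n, VISIT_TABLE_POST, arg);
}

/* Buffer that accumulates the order in which callbacks fire. */
struct trace {
	char buf[128];
	size_t len;
};

/* Example visitor: append "<phase>:<name> " to the trace. */
static void record(const struct node *n, enum visit_flag flag, void *arg)
{
	static const char *const tag[] = { "pre", "leaf", "post" };
	struct trace *t = arg;
	size_t room = sizeof(t->buf) - t->len;
	int w = snprintf(t->buf + t->len, room, "%s:%s ", tag[flag], n->name);

	/* Only advance if the whole entry fit. */
	if (w > 0 && (size_t)w < room)
		t->len += (size_t)w;
}

/* Build a small tree (root -> {mid -> {l0, l1}, l2}) and walk it once. */
const char *demo_walk(void)
{
	static struct trace t;
	static struct node l0 = { "l0", { NULL, NULL } };
	static struct node l1 = { "l1", { NULL, NULL } };
	static struct node l2 = { "l2", { NULL, NULL } };
	static struct node mid = { "mid", { &l0, &l1 } };
	static struct node root = { "root", { &mid, &l2 } };

	t.len = 0;
	t.buf[0] = '\0';
	walk(&root, record, &t);
	return t.buf;
}
```

Calling demo_walk() visits the nodes in the order pre:root, pre:mid,
leaf:l0, leaf:l1, post:mid, leaf:l2, post:root. The pre-order visit of a
table always comes before anything beneath it is touched, and the
post-order visit comes only after the whole subtree is done, all within
a single descent from the root.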