On 14/05/2018 16:32, Liran Alon wrote:
> Eventually I implemented (2) just because it was simpler to implement
> and gave good enough performance for our needs. It could also be
> further optimized by having a per-PCID array of X VAs to
> virtual-invlpg on the next switch to that PCID. If the array is full,
> fall back to treating the CR3 switch as if it were done with a TLB
> flush, even if bit 63 is set.
>
> The above approach worked pretty well, and it made me wonder how it
> compares to the approach suggested by this patch series. I think
> there is an interesting trade-off between the two approaches: mine is
> simpler and supports PCID & INVPCID generically, without limiting the
> performance gain to the specific workload of KPTI's usage of PCID.
> For example, a guest that uses PCIDs to minimize TLB flushes (Linux
> has such a mechanism) would see improved performance. However, the
> main drawback of this approach is that GPTs mapping the user-space
> portion of a process will have multiple SPTs mirroring them, one per
> user/kernel PCID. This approach therefore uses more SPT pages and
> could result in more MMU flushes. This is in contrast to this patch
> series, which can share the SPTs mirroring the user-space GPTs
> between both the user PCID and the kernel PCID.
>
> Turned out to be quite long, but I wanted to give my two cents of
> analysis on this series. :)

I agree that your approach should result in less code. I am not sure
where you stored the PCID bits, but if it's in the hash key, then we
might as well store fewer bits and force a TLB flush when there is a
conflict. This should limit the amount of memory used by SPTEs.

Paolo
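For concreteness, a minimal sketch of the deferred-invlpg bookkeeping
Liran describes: a small per-PCID array of VAs recorded while the PCID
is inactive, replayed on the next switch to it, with an overflow flag
that demotes the switch to a full flush. All names here (the
MAX_PENDING_VAS bound, the callback parameters) are hypothetical
illustrations, not the actual KVM code:

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_PCIDS        4096    /* PCID is a 12-bit field */
#define MAX_PENDING_VAS 8       /* the "X" in the description above */

struct pcid_flush_state {
	uint64_t pending_va[MAX_PENDING_VAS];
	unsigned int nr_pending;
	bool overflowed;        /* too many VAs: demote to full flush */
};

static struct pcid_flush_state pcid_state[NR_PCIDS];

/* Record an INVLPG the guest issued against a currently inactive PCID. */
static void record_invlpg(uint16_t pcid, uint64_t va)
{
	struct pcid_flush_state *s = &pcid_state[pcid];

	if (s->nr_pending < MAX_PENDING_VAS)
		s->pending_va[s->nr_pending++] = va;
	else
		s->overflowed = true;
}

/*
 * Called when the guest switches to @pcid with CR3 bit 63 set,
 * i.e. it asked to preserve the TLB entries tagged with this PCID.
 */
static void replay_on_switch(uint16_t pcid,
			     void (*virtual_invlpg)(uint64_t va),
			     void (*full_flush)(void))
{
	struct pcid_flush_state *s = &pcid_state[pcid];
	unsigned int i;

	if (s->overflowed) {
		/* Fallback: behave as if bit 63 were clear. */
		full_flush();
	} else {
		for (i = 0; i < s->nr_pending; i++)
			virtual_invlpg(s->pending_va[i]);
	}
	s->nr_pending = 0;
	s->overflowed = false;
}
```

On this reading, the hash-key variant Paolo suggests would fold only a
few PCID bits into the shadow-page lookup key and treat a collision on
those bits the way the sketch treats overflow, i.e. as a full TLB
flush, trading occasional extra flushes for fewer duplicated SPTs.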