On Thu, 4 Apr 2019, Khalid Aziz wrote:
> When xpfo unmaps a page from physmap only (after mapping the page in
> userspace in response to an allocation request from userspace) on one
> processor, there is a small window of opportunity for ret2dir attack on
> other cpus until the TLB entry in physmap for the unmapped pages on
> other cpus is cleared. Forcing that to happen synchronously is the
> expensive part. A multiple of these requests can come in over a very
> short time across multiple processors resulting in every cpu asking
> every other cpus to flush TLB just to close this small window of
> vulnerability in the kernel. If each request is processed synchronously,
> each CPU will do multiple TLB flushes in short order. If we could
> consolidate these TLB flush requests instead and do one TLB flush on
> each cpu at the time of context switch, we can reduce the performance
> impact significantly. This bears out in real life measuring the system
> time when doing a parallel kernel build on a large server. Without this,
> system time on 96-core server when doing "make -j60 all" went up 26x.
> After this optimization, impact went down to 1.44x.
>
> The trade-off with this strategy is, the kernel on a cpu is vulnerable
> for a short time if the current running processor is the malicious

The "short" time to next context switch on the other CPUs is how short
exactly? Anything from 1us to seconds - think NOHZ FULL - and even w/o
that 10ms on a HZ=100 kernel is plenty of time to launch an attack.

> process. Is that an acceptable trade-off?

You are not seriously asking whether creating a user controllable ret2dir
attack window is an acceptable trade-off? April 1st was a few days ago.

Thanks,

	tglx