Hi Sean, Andy and Paolo,
On 11/3/2020 2:33 AM, Sean Christopherson wrote:
> On Mon, Nov 02, 2020 at 10:01:16AM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 2, 2020 at 9:31 AM Sean Christopherson
>> <sean.j.christopherson@xxxxxxxxx> wrote:
>>> Tao, this patch should probably be tagged RFC, at least until we can experiment
>>> with the threshold on real silicon. KVM and kernel behavior may depend on the
>>> accuracy of detecting actual attacks, e.g. if we can set a threshold that has
>>> zero false negatives and near-zero false positives, then it probably makes sense
>>> to be more assertive in how such VM-Exits are reported and logged.
>>
>> If you can actually find a threshold that reliably mitigates the bug
>> and does not allow a guest to cause undesirably large latency in the
>> host, then fine. 1/10 of a tick is way too long, I think.
>
> Yes, this was my internal review feedback as well. Either that got lost along
> the way or I wasn't clear enough in stating what should be used as a placeholder
> until we have silicon in hand.

We have tested on real silicon and found that it works even with the
threshold set to 0.
The hardware has an internal threshold, which is added to
vmcs.notify_window to form the final effective threshold. The internal
threshold is big enough to cover normal instructions. For long-latency
instructions like WBINVD, the processor knows they cannot be used for a
no-interrupt-window attack, so no Notify VM exit will happen on them.
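
As a rough illustration of the above (a sketch only, not KVM or hardware
code: the struct, the function names, and the strict ">" comparison are
assumptions made for this example, and the real internal threshold value
is implementation-specific and not stated here):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the behavior described above. */
struct notify_model {
	uint64_t internal_threshold; /* hardware-internal; value is an assumption */
	uint32_t notify_window;      /* value programmed into vmcs.notify_window */
};

/* Effective threshold = internal threshold + vmcs.notify_window. */
static uint64_t effective_threshold(const struct notify_model *m)
{
	return m->internal_threshold + (uint64_t)m->notify_window;
}

/*
 * Would a Notify VM exit fire after 'elapsed' units with no interrupt
 * window? Long-latency instructions the processor treats as harmless
 * (e.g. WBINVD) are modeled as exempt. The strict comparison is assumed.
 */
static bool notify_exit_fires(const struct notify_model *m,
			      uint64_t elapsed, bool exempt_insn)
{
	if (exempt_insn)
		return false;
	return elapsed > effective_threshold(m);
}
```

With notify_window set to 0, the effective threshold in this model is just
the internal threshold, so normal instructions still stay below it.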

Initially, our hardware architect wanted to set the notify window to a
scheduler tick so as not to break kernel scheduling. But you folks want a
smaller one. So are you OK with setting the window to 0?