On Thu, Apr 18, 2019 at 06:47:56PM +0200, Paolo Bonzini wrote: > On 18/04/19 16:17, Sean Christopherson wrote: > > On Wed, Mar 20, 2019 at 06:43:20PM +0100, Vitaly Kuznetsov wrote: > >> It was reported that with some special Multi Processor Group configuration, > >> e.g: > >> bcdedit.exe /set groupsize 1 > >> bcdedit.exe /set maxgroup on > >> bcdedit.exe /set groupaware on > >> for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism > >> is in use. > >> > >> Tracing kvm_hv_flush_tlb immediately reveals the issue: > >> > >> kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2 > >> > >> The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, > >> however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified. > >> We don't flush anything and apparently it's not what Windows expects. > >> > >> TLFS doesn't say anything about such requests and newer Windows versions > >> seem to be unaffected. This all feels like a WS2012 bug, which is, however, > >> easy to workaround in KVM: let's flush everything when we see an empty > >> flush request, over-flushing doesn't hurt. > >> > >> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> > >> --- > >> arch/x86/kvm/hyperv.c | 12 +++++++++++- > >> 1 file changed, 11 insertions(+), 1 deletion(-) > >> > >> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c > >> index 421899f6ad7b..5887f7d22ac6 100644 > >> --- a/arch/x86/kvm/hyperv.c > >> +++ b/arch/x86/kvm/hyperv.c > >> @@ -1371,7 +1371,17 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa, > >> > >> valid_bank_mask = BIT_ULL(0); > >> sparse_banks[0] = flush.processor_mask; > >> - all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS; > >> + > >> + /* > >> + * WS2012 seems to be buggy, under certain conditions it is > >> + * possible to observe requests with processor_mask = 0x0 and > >> + * no HV_FLUSH_ALL_PROCESSORS flag set. It also seems that > > > > "and no HV_FLUSH_ALL_PROCESSORS flag set" is awkward, and probably > > extraneous. The whole comment is a probably a bit more verbose than it > > needs to be, e.g. most readers won't care how we came to the conclusion > > that 'processor_mask == 0', and those that care about the background will > > read the changelog anyways. > > > > Maybe something like this: > > > > /* > > * Some Windows versions, e.g. WS2012, use processor_mask = 0 > > * in lieu of the dedicated flag to flush all processors. > > */ > > Hmm, not really. "In lieu" seems intentional. "without" is more accurate. True, probably isn't exactly by design. > My take: > > * Work around possible WS2012 bug: it sends hypercalls > * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear, > * while also expecting us to flush something and crashing if > * we don't. Let's treat processor_mask == 0 same as > * HV_FLUSH_ALL_PROCESSORS. Nice. What about this for the last sentence? Either way, LGTM. For simplicity, flush all CPUs if the guest neglected to set processor_mask. > */ > > Paolo > > > > > > > >> + * Windows actually expects us to flush something and crashes > >> + * otherwise. Let's treat processor_mask == 0 same as > >> + * HV_FLUSH_ALL_PROCESSORS. > >> + */ > >> + all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) || > >> + (flush.processor_mask == 0); > > > > Nits: > > > > Personal preference, but I like '!flush.processor_mask' in this case as it > > immediately conveys that we're handling the scenario where the guest didn't > > set a mask. Then there wouldn't be a visual need for the second set of > > parentheses. > > > > Aligning its indentation with the first first chunk of the statement would > > also be nice, but again, personal preference. :-) > > > >> } else { > >> if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex, > >> sizeof(flush_ex)))) > >> -- > >> 2.20.1 > >> >