On 0, Nadav Har'El <NYH@xxxxxxxxxx> wrote:
> > No, both patches are wrong.
>
> Guys, thanks for looking into this bug. I'm afraid I'm still at a loss as
> to why a TSC bug would even cause a guest lockup :(
>
> When Avi Kivity saw my nested TSC handling code he remarked "this is
> probably wrong". When I asked him where it was wrong, he basically said
> that he didn't know where, but TSC handling code is always wrong ;-)
> And it turns out he was right.
>
> > The correct fix is to make kvm_get_msr() return the L1 guest TSC at all times.
> > We are serving the L1 guest in this hypervisor, not the L2 guest, and so
> > should never read its TSC for any reason.
> ...
> > allows the L2 guest to overwrite the L1 guest TSC, which at first seems wrong,
> > but is in fact the correct virtualization of a security hole in the L1 guest.
>
> I think I'm beginning to see the error in my ways...
>
> When L1 gives L2 (using the MSR bitmap) direct read/write access to the TSC,
> it doesn't want L0 to be "clever" and give L2 its own separate TSC (like
> I do now), but rather to give it full control over L1's TSC - so reading or
> writing it should actually return L1's TSC, and the TSC_OFFSET in vmcs12
> is to be ignored.
>
> So basically, if I understand correctly, what I need to change is
> in prepare_vmcs02(): if the MSR_IA32_TSC is on the MSR bitmap (read?
> write?), instead of doing
>	vmcs_write64(TSC_OFFSET,
>		vmx->nested.vmcs01_tsc_offset + vmcs12->tsc_offset);
> I just need to do
>	vmcs_write64(TSC_OFFSET,
>		vmx->nested.vmcs01_tsc_offset);
> thereby giving L2 exactly the same TSC that L1 had.
> Brandan, if I remember correctly you once tried this sort of fix and
> it actually worked?

That is correct. That is still the "workaround fix" that I have been using
on my systems. But as you have mentioned above (and below), I am still
struggling with two questions:

1. Why does L1 hang even if the TSC has wrong values?
2. I see this on a Dell R610, and I don't know why you and some others
   don't see it. I assumed from the symptoms that this should be fairly
   easy to reproduce on any system.

Bandan

> Then, guest_read_tsc() will return (without need to change this code)
> the correct L1 TSC.
>
> And vmx_write_tsc_offset() should, in the is_guest_mode() case, not do
> what it does now (vmcs12->tsc_offset is of no importance when the TSC MSR
> is passed through) but rather set vmcs01_tsc_offset (which will be
> applied on the next exit to L1).
>
> Is my analysis correct? Or perhaps completely wrong? ;-)
> Am I missing anything else that should be changed?
>
> In any case, I don't understand why on my machine I never encountered
> these problems, and nothing broke even if I replaced the TSC nesting
> code with randomly broken code. Are the people who are seeing this
> breakage actually passing the MSR from L1 to L2 - using the MSR bitmap -
> like I guessed above? Or am I missing something completely different?
>
> Sorry, but I'm really becoming confused by these TSC issues...
>
> Thanks,
> Nadav.
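
For reference, the offset arithmetic discussed above can be written down as a small stand-alone model. This is only an illustrative sketch, not kernel code: the struct, the l2_tsc_passthrough flag and the helper names are made up for the example, and only the two field names (vmcs01_tsc_offset, vmcs12 tsc_offset) mirror the ones quoted in the thread. It assumes the guest-visible TSC is simply the host TSC plus the active offset, with unsigned wraparound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the state discussed in the thread (not the real vcpu_vmx). */
struct nested_tsc_model {
	uint64_t vmcs01_tsc_offset; /* offset L0 applies while L1 runs */
	uint64_t vmcs12_tsc_offset; /* offset L1 asked for on behalf of L2 */
	bool l2_tsc_passthrough;    /* hypothetical: L1's MSR bitmap exposes MSR_IA32_TSC to L2 */
};

/* Offset that would be written to TSC_OFFSET in vmcs02 under the proposal. */
static uint64_t vmcs02_tsc_offset(const struct nested_tsc_model *m)
{
	if (m->l2_tsc_passthrough)
		return m->vmcs01_tsc_offset; /* L2 shares L1's TSC */
	return m->vmcs01_tsc_offset + m->vmcs12_tsc_offset;
}

/* TSC as seen by L1: host TSC plus the L1 offset. */
static uint64_t l1_tsc(const struct nested_tsc_model *m, uint64_t host_tsc)
{
	return host_tsc + m->vmcs01_tsc_offset;
}

/* TSC as seen by L2: host TSC plus whatever ended up in vmcs02. */
static uint64_t l2_tsc(const struct nested_tsc_model *m, uint64_t host_tsc)
{
	return host_tsc + vmcs02_tsc_offset(m);
}

/* Sanity check of the model: with passthrough, L2 reads exactly L1's TSC. */
static void check_passthrough_matches_l1(uint64_t host_tsc)
{
	struct nested_tsc_model m = {
		.vmcs01_tsc_offset = 1000,
		.vmcs12_tsc_offset = 50,
		.l2_tsc_passthrough = true,
	};
	assert(l2_tsc(&m, host_tsc) == l1_tsc(&m, host_tsc));
}
```

In this model the only difference between the current behaviour and the workaround is whether vmcs12's offset is added in; it says nothing about why a wrong TSC value would hang L1, which is still the open question.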