On 14/12/2017 14:01, Liran Alon wrote:
>> But in our test, we found that there is a possible situation that Vcpu
>> fails to read RTC_REG_C to clear irq, This could happens while two
>> VCpus are writing/reading registers at the same time, for example,
>> vcpu 0 is trying to read RTC_REG_C, so it write RTC_REG_C first, where
>> the s->cmos_index will be RTC_REG_C, but before it tries to read
>> register C, another vcpu1 is going to read RTC_YEAR, it changes
>> s->cmos_index to RTC_YEAR by a writing action.
>> The next operation of vcpu0 will be lead to read RTC_YEAR, In this
>> case, we will miss calling qemu_irq_lower(s->irq) to clear the irq.
>> After this, kvm will never inject RTC irq, and Windows VM will hang.
>
> If I understood correctly, this looks to me like a race-condition bug in
> the Windows guest kernel. In real-hardware this race-condition will also
> cause the RTC_YEAR to be read instead of RTC_REG_C.
> Guest kernel should make sure that 2 CPUs does not attempt to read a
> CMOS register in parallel as they can override each other's cmos_index.
>
> See for example how Linux kernel makes sure to avoid such kind of issues
> in rtc_cmos_read() (arch/x86/kernel/rtc.c) by grabbing a cmos_lock.

Lei and I looked at it further, and the root cause is not the missed EOI
in QEMU. Rather it's a bug in ioapic.c's tracking of RTC interrupts.

Paolo
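
For reference, the interleaving described in the quoted report can be
sketched as a small sequential model. This is only an illustration of the
index/data race on cmos_index, under stated assumptions: struct toy_rtc and
the toy_* helpers are hypothetical names and are not QEMU's RTCState or its
I/O handlers, the two "vcpus" are just steps replayed in a fixed order
rather than real threads, and the irq_raised flag stands in for the effect
of qemu_irq_lower(s->irq). It does not model the ioapic.c problem that
turned out to be the actual root cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Register indices from the MC146818 register map. */
enum { RTC_YEAR = 0x09, RTC_REG_C = 0x0c };

/* Hypothetical toy model of the device, not QEMU's RTCState. */
struct toy_rtc {
    uint8_t cmos_index;      /* last value written to the index port */
    uint8_t cmos_data[128];  /* register contents */
    bool irq_raised;         /* stands in for the state of s->irq */
};

/* Guest write to the index port: selects the register to access next. */
static void toy_write_index(struct toy_rtc *s, uint8_t index)
{
    s->cmos_index = index & 0x7f;
}

/* Guest read of the data port: reading register C clears the interrupt. */
static uint8_t toy_read_data(struct toy_rtc *s)
{
    assert(s->cmos_index < sizeof(s->cmos_data));
    uint8_t val = s->cmos_data[s->cmos_index];
    if (s->cmos_index == RTC_REG_C) {
        s->cmos_data[RTC_REG_C] = 0;
        s->irq_raised = false;
    }
    return val;
}

/* Interleaved accesses as in the report. Returns whether the interrupt is
 * still raised after vcpu0 believes it has read register C. */
static bool toy_irq_stuck_after_race(void)
{
    struct toy_rtc s = { .irq_raised = true };
    s.cmos_data[RTC_REG_C] = 0x90;      /* some pending flags (assumed) */

    toy_write_index(&s, RTC_REG_C);     /* vcpu0 selects register C */
    toy_write_index(&s, RTC_YEAR);      /* vcpu1 overrides the index */
    (void)toy_read_data(&s);            /* vcpu0 actually reads RTC_YEAR */
    (void)toy_read_data(&s);            /* vcpu1 reads RTC_YEAR */
    return s.irq_raised;
}

/* Same accesses with each index+data pair kept atomic, which is what a
 * guest-side lock such as Linux's cmos_lock guarantees. */
static bool toy_irq_stuck_when_serialized(void)
{
    struct toy_rtc s = { .irq_raised = true };
    s.cmos_data[RTC_REG_C] = 0x90;

    toy_write_index(&s, RTC_REG_C);     /* vcpu0: select, then read C */
    (void)toy_read_data(&s);
    toy_write_index(&s, RTC_YEAR);      /* vcpu1: select, then read YEAR */
    (void)toy_read_data(&s);
    return s.irq_raised;
}
```

In this model, calling toy_irq_stuck_after_race() leaves the interrupt
raised, while toy_irq_stuck_when_serialized() clears it. That matches the
point made in the quoted reply: the index write and the data read have to
be kept together by the guest, since the device only remembers the last
index written.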