Re: 2.6.35-rc1 regression with pvclock and smp guests

Avi Kivity wrote:
  On 07/27/2010 03:21 PM, Andre Przywara wrote:
Avi Kivity wrote:
  On 07/27/2010 02:49 PM, Andre Przywara wrote:
What is the guest executing when it hangs?
Both VCPUs are halted; the monitor and System.map tell me they are in native_safe_halt(). The code sequence confirms this: it is an intentional sti;hlt sequence.
Using -smp 16 also shows that all 16 VCPUs are stuck.

Well, strange. The intent of that patch was to make the clock never go backwards. Perhaps the change made it go forwards by a large amount, and the guest is not hung, just waiting for some timer that is far in the future.

Can you do something like

-      if (ret < last)
+      if (ret < last) {
+            static u64 max_delta;
+            if (last - ret > max_delta) {
+                  max_delta = last - ret;
+                  printk("advancing kvmclock by: %llx\n", max_delta);
+            }
              return last;
+      }

to see if this is happening?
No change, it still hangs. I also don't see the printk.
The output with smp=1 is like this:
[    1.186549] ACPI: Power Button [PWRF]
[    1.189204] XENFS: not registering filesystem on non-xen platform
[    1.195001] Non-volatile memory driver v1.3
[    1.196358] Linux agpgart interface v0.103
[    1.197687] [drm] Initialized drm 1.1.0 20060810
[    1.198926] [drm:i915_init] *ERROR* drm/i915 can't work without intel_agp module!
[    1.201213] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
ÿ[    1.460714] serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[    1.463243] 00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
[    1.467153] brd: module loaded
[    1.469245] loop: module loaded
With smp=2 the output stops just before the strange "ÿ" character (I guess it's ASCII 255), which I assume is an artifact of the serial console. As the timestamps show, there is a gap of about 0.26 s between the last line shown (1.201213) and the first missing one (1.460714).

Weird.  Maybe the clock goes crazy.

Let's see if it jumps forward a lot:

         } while (unlikely(last != ret));
+
+       {
+            static u64 last_report;
+            if (ret > last_report + 10000) {
+                    last_report = ret;
+                    printk("kvmclock: %llx\n", ret);
+            }
+
+       }

         return ret;
  }

It's worth changing the 'return last' to set ret = last and jump to the new code instead, so we don't miss that path.
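A userspace sketch of what the combined instrumentation might look like, with the clamped path falling through to the report instead of returning early. This is not the kernel code: it assumes the loop in the hunk above is a compare-and-swap against a global last value, uses C11 atomics in place of the kernel's atomic helpers, and replaces printk with fprintf. The names clamped_read, report, reports and REPORT_STEP are made up for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define REPORT_STEP 10000ULL            /* same threshold as the debug hunk */

static _Atomic uint64_t last_value;     /* stand-in for the global last value */
static uint64_t last_report;            /* unsynchronized, like the debug hunk */
static unsigned long reports;           /* number of report lines emitted */

/* Print the clock value whenever it has advanced by more than REPORT_STEP. */
static void report(uint64_t ret)
{
    if (ret > last_report + REPORT_STEP) {
        last_report = ret;
        reports++;
        fprintf(stderr, "kvmclock: %llx\n", (unsigned long long)ret);
    }
}

/* raw plays the role of the per-VCPU reading ("ret" in the thread). */
uint64_t clamped_read(uint64_t raw)
{
    uint64_t ret = raw;
    uint64_t last = atomic_load(&last_value);

    for (;;) {
        if (ret < last) {
            ret = last;                 /* was "return last": now also reported */
            break;
        }
        /* On failure, last is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&last_value, &last, ret))
            break;
    }

    report(ret);
    return ret;
}
```

Calling clamped_read(100) and then clamped_read(50) returns 100 both times; the second call takes the clamped path and still reaches report().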
Did that. There is _a lot_ of output (about 350 lines per second via the 115k serial console), both with smp=1 and smp=2. Most consecutive values differ by about 2,000,000 (ticks?), but a handful of the differences are in the range of 20 million. There is no difference between smp=2 and smp=1. I also get some "BUG: recent printk recursion!" messages, and I don't see any kernel boot progress beyond the BogoMIPS value being printed.
BTW: I found two messages from your earlier debug statement:
[    0.000000] kvm-clock: cpu 0, msr 0:1ac0401, boot clock
[    0.000000] kvm-clock: cpu 0, msr 0:1e15401, primary cpu clock
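A rough cross-check of the rate reported above, as a sketch under two assumptions: that the "115k" console runs at 115200 baud with 8N1 framing (10 bits on the wire per byte), and that the printed kvmclock values are nanoseconds. The helper names are made up for illustration.

```c
/* Bytes per second on an 8N1 serial line: start bit + 8 data bits + stop bit. */
double serial_bytes_per_sec(double baud)
{
    return baud / 10.0;
}

/* Average line length the console can sustain at a given line rate. */
double bytes_per_line(double baud, double lines_per_sec)
{
    return serial_bytes_per_sec(baud) / lines_per_sec;
}

/* Average spacing between consecutive lines, in nanoseconds. */
double ns_between_lines(double lines_per_sec)
{
    return 1e9 / lines_per_sec;
}
```

With these assumptions, bytes_per_line(115200, 350) is about 33 bytes, roughly the length of one timestamped "kvmclock: ..." line, and ns_between_lines(350) is about 2.9 million. That is the same order of magnitude as the typical difference of about 2,000,000, so the common step may simply be the time it takes to push one printk line through the serial port rather than a jump in the clock. The occasional 20 million steps are not explained by this.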

Regards,
Andre.

--
Andre Przywara
AMD-OSRC (Dresden)
Tel: x29712
