On Mon, Jan 28, 2013 at 3:14 AM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Mon, Jan 28, 2013 at 12:04:50AM +0300, Andrey Korolyov wrote:
>> On Sat, Jan 26, 2013 at 12:49 AM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>> > On Fri, Jan 25, 2013 at 10:45:02AM +0300, Andrey Korolyov wrote:
>> >> On Thu, Jan 24, 2013 at 4:20 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
>> >> > On Thu, Jan 24, 2013 at 01:54:03PM +0300, Andrey Korolyov wrote:
>> >> >> Thank you Marcelo,
>> >> >>
>> >> >> Host node locking up sometimes later than yesterday, but problem still
>> >> >> here, please see attached dmesg. Stuck process looks like
>> >> >> root 19251 0.0 0.0 228476 12488 ? D 14:42 0:00
>> >> >> /usr/bin/kvm -no-user-config -device ? -device pci-assign,? -device
>> >> >> virtio-blk-pci,? -device
>> >> >>
>> >> >> on fourth vm by count.
>> >> >>
>> >> >> Should I try upstream kernel instead of applying patch to the latest
>> >> >> 3.4 or it is useless?
>> >> >
>> >> > If you can upgrade to an upstream kernel, please do that.
>> >> >
>> >>
>> >> With vanilla 3.7.4 there is almost no changes, and NMI started firing
>> >> again. External symptoms looks like following: starting from some
>> >> count, may be third or sixth vm, qemu-kvm process allocating its
>> >> memory very slowly and by jumps, 20M-200M-700M-1.6G in minutes. Patch
>> >> helps, of course - on both patched 3.4 and vanilla 3.7 I`m able to
>> >> kill stuck kvm processes and node returned back to the normal, when on
>> >> 3.2 sending SIGKILL to the process causing zombies and hanged ``ps''
>> >> output (problem and workaround when no scheduler involved described
>> >> here http://www.spinics.net/lists/kvm/msg84799.html).
>> >
>> > Try disabling pause loop exiting with ple_gap=0 kvm-intel.ko module parameter.
>> >
>>
>> Hi Marcelo,
>>
>> thanks, this parameter helped to increase number of working VMs in a
>> half of order of magnitude, from 3-4 to 10-15. Very high SY load, 10
>> to 15 percents, persists on such numbers for a long time, where linux
>> guests in same configuration do not jump over one percent even under
>> stress bench. After I disabled HT, crash happens only in long runs and
>> now it is kernel panic :)
>> Stair-like memory allocation behaviour disappeared, but other symptom
>> leading to the crash which I have not counted previously, persists: if
>> VM count is ``enough'' for crash, some qemu processes starting to eat
>> one core, and they`ll panic system after run in tens of minutes in
>> such state or if I try to attach debugger to one of them. If needed, I
>> can log entire crash output via netconsole, now I have some tail,
>> almost the same every time:
>> http://xdel.ru/downloads/btwin.png
>
> Yes, please log entire crash output, thanks.
>

Here please, 3.7.4-vanilla, 16 vms, ple_gap=0:

http://xdel.ru/downloads/oops-default-kvmintel.txt
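
For anyone reproducing the ple_gap=0 setup discussed in this thread, it may help to confirm that the parameter actually took effect before starting the VMs. The following is only a minimal sketch: it assumes kvm_intel exposes ple_gap read-only under /sys/module/kvm_intel/parameters/, and the helper names read_ple_gap and ple_disabled are made up for illustration, not part of any tool.

```python
from pathlib import Path

# Assumed sysfs location of the kvm_intel ple_gap module parameter.
PLE_GAP_PATH = Path("/sys/module/kvm_intel/parameters/ple_gap")


def read_ple_gap(path=PLE_GAP_PATH):
    """Return the ple_gap value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # File missing (module not loaded) or content is not an integer.
        return None


def ple_disabled(value):
    """Pause loop exiting is treated as disabled when ple_gap is 0."""
    return value == 0


if __name__ == "__main__":
    gap = read_ple_gap()
    if gap is None:
        print("ple_gap not readable (is kvm_intel loaded?)")
    else:
        print(f"ple_gap={gap}, PLE disabled: {ple_disabled(gap)}")
```

If the value is not 0, the module was most likely loaded without the parameter and would need to be reloaded with ple_gap=0 while no guests are running.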