Re: xen 3.4.3 SRPM / console problem

Update:

Completely stable with 7 VMs. Hasn't missed a beat for several days now.

Will now run the pings again in the 2 64PV guests from their virtual consoles
(no graphics, and disconnected) and see if the host clock tick dies again.

Cheers
V

On Thu, 24 Jun 2010 03:17:51 pm Virgil wrote:
> Quick update:
> 
> Added 3 more VMs. Total of 7 now on this "desktop" computer.
> 
> 3x32PVFC6
> 1x32HVFC12
> 1x32HVwinXP-pro
> 2x64PVFC12
> 
> Host is maxed out now.
> 
> Everything going well when the pings are not running in the 64PVFC12
> machines.
> 
> Will leave it going for another couple of days.
> 
> Cheers
> V
> 
> p.s. FYI Samba4-Alpha12 Active Directory controller is working well on
> 64PVFC12.
> 
> On Wed, 23 Jun 2010 05:18:56 pm Virgil wrote:
> > On Wed, 23 Jun 2010 03:30:52 pm Pasi Kärkkäinen wrote:
> > > On Wed, Jun 23, 2010 at 11:11:58AM +1000, Virgil wrote:
> > > > Hi Pasi,
> > > > 
> > > > > Had a hiccup overnight:
> > > > 
> > > > The host became unresponsive in a weird way. The time stopped
> > > > incrementing.
> > > > 
> > > > Turns out the clock stopped ticking (which I put down to the
> > > > interrupts being disconnected).
> > > > 
> > > > > Anyway I decided I'd reset the time using 'date -s 10:41:30'.
> > > > 
> > > > Kaboom, or actually deathly silence. The machine fully stopped dead
> > > > in its tracks.
> > > > 
> > > > Just prior to this I connected to the console of one of the 64PV
> > > > machines which was just running a ping from yesterday. Anyway, 60,000
> > > > or so lines of pings went to the console zipping up the screen. Then
> > > > it was dead. I did a CTRL-C and eventually it returned to the prompt.
> > > > 
> > > > > So I looked at the other 64PV machine, which was also pinging, and
> > > > > found the identical situation.
> > > > 
> > > > > So I reckon there's some kind of buffer overflow going on when
> > > > > you're not connected via "xm console MACHINE". Once you pass 60,000
> > > > > lines of text this buffer overflow causes the RTC to hang somehow.
> > > 
> > > Do you have xenconsoled running?
> > > 
> > > I've noticed PV guests that print a lot to the console will stall if
> > > xenconsoled is not running.. xenconsoled needs to clear the guest
> > > console buffer..
> > > 
> > > -- Pasi
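
To make the stall Pasi describes concrete, here is a toy model in Python: the
guest console is treated as a fixed-size ring that the guest fills and a
daemon drains. This is a simplified sketch only, not Xen's actual console ring
code. ConsoleRing, lines_before_stall and the 2048-byte size are assumptions
made up for the example.

```python
class ConsoleRing:
    """Toy fixed-size ring shared by a guest (producer) and a daemon (consumer)."""

    def __init__(self, size):
        self.size = size
        self.prod = 0  # total bytes the guest has written
        self.cons = 0  # total bytes the daemon has consumed

    def write(self, data):
        """Accept as many bytes as currently fit; return how many were accepted."""
        free = self.size - (self.prod - self.cons)
        accepted = min(free, len(data))
        self.prod += accepted
        return accepted

    def drain(self):
        """Consume everything pending; return the number of bytes drained."""
        pending = self.prod - self.cons
        self.cons = self.prod
        return pending


def lines_before_stall(ring, line, max_lines, daemon_running):
    """Count the lines a guest can print before it would have to wait."""
    for n in range(max_lines):
        if daemon_running:
            ring.drain()  # the daemon empties the ring between guest writes
        if ring.write(line) < len(line):
            return n  # ring is full: the guest would now block on the daemon
    return max_lines


line = b"x" * 63 + b"\n"  # one 64-byte line of console output
print(lines_before_stall(ConsoleRing(2048), line, 1000, daemon_running=False))
print(lines_before_stall(ConsoleRing(2048), line, 1000, daemon_running=True))
```

With nothing draining the ring the guest gets 32 lines out (2048 / 64) and
then has to wait. With the drain running it never stalls in this model.
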
> > 
> > Seems to be now. Pretty sure it was then too.
> > 
> > udev-post       0:off   1:on    2:on    3:on    4:on    5:on    6:off
> > wpa_supplicant  0:off   1:off   2:off   3:off   4:off   5:off   6:off
> > xenconsoled     0:off   1:off   2:off   3:on    4:on    5:on    6:off
> > xend            0:off   1:off   2:off   3:on    4:on    5:on    6:off
> > xendomains      0:off   1:off   2:off   3:on    4:on    5:on    6:off
> > xenstored       0:off   1:off   2:off   3:on    4:on    5:on    6:off
> > ypbind          0:off   1:off   2:off   3:off   4:off   5:off   6:off
> > [root@seanl64 ~]# ps -ef | grep xenconso
> > root      1508     1  0 10:19 ?        00:00:00 /usr/sbin/xenconsoled --log=none --log-dir=/var/log/xen/console
> > root      7815  7732  0 17:07 pts/5    00:00:00 grep xenconso
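
The same check can be scripted so the grep process itself is not mistaken for
the daemon. A minimal Python sketch, assuming the usual 8-column `ps -ef`
layout (UID PID PPID C STIME TTY TIME CMD). The helper name daemon_pids is
made up for this example, and the sample text is the ps output quoted above
with the mail client's line wrapping undone.

```python
def daemon_pids(ps_output, name):
    """Return PIDs from `ps -ef` output whose executable is exactly `name`."""
    pids = []
    for row in ps_output.splitlines():
        fields = row.split(None, 7)  # UID PID PPID C STIME TTY TIME CMD
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header row or malformed line
        argv = fields[7].split()
        executable = argv[0].rsplit("/", 1)[-1]  # strip any leading path
        if executable == name:
            pids.append(int(fields[1]))
    return pids


sample = (
    "root      1508     1  0 10:19 ?        00:00:00 /usr/sbin/xenconsoled"
    " --log=none --log-dir=/var/log/xen/console\n"
    "root      7815  7732  0 17:07 pts/5    00:00:00 grep xenconso\n"
)
print(daemon_pids(sample, "xenconsoled"))  # the grep line is ignored
```
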
> > 
> > Cheers
> > V
> > 
> > > > I pressed the reset button, but this time the 2 64PV machines are not
> > > > logged in. I'll just let it go and see if it keeps going.
> > > > 
> > > > Cheers
> > > > V
> > > > 
> > > > On Tue, 22 Jun 2010 04:29:06 pm Pasi Kärkkäinen wrote:
> > > > > On Tue, Jun 22, 2010 at 12:03:53PM +1000, Virgil wrote:
> > > > > > Hi Pasi,
> > > > > > 
> > > > > > On Mon, 21 Jun 2010 08:57:55 pm Pasi Kärkkäinen wrote:
> > > > > > > On Mon, Jun 21, 2010 at 01:56:36PM +0300, Pasi Kärkkäinen wrote:
> > > > > > > > On Mon, Jun 21, 2010 at 02:28:15PM +1000, Virgil wrote:
> > > > > > > > > Another quick update....
> > > > > > > > > 
> > > > > > > > > > Just compiled xen-4.0.1-0.1.rc3.fc13.src.rpm under
> > > > > > > > > > fc12.
> > > > > > > > > 
> > > > > > > > > Identical results with this too (i.e. it's probably in the
> > > > > > > > > kernel).
> > > > > > > > > 
> > > > > > > > > I have a (silly) idea for the serial console. The wiki page
> > > > > > > > > recommends using a phone camera to capture the screen....
> > > > > > > > > 
> > > > > > > > > Well my idea is to add an n-millisecond delay every time
> > > > > > > > > the output stream in Xen sees a \n. This would delay the
> > > > > > > > > screen updates enough for the camera to see them. The n
> > > > > > > > > should be configurable on the kernel boot command line.
> > > > > > > > > It's set to 0 right now.
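
The per-newline delay idea can be sketched in user space like this. It is only
an illustration in Python of the behaviour described, not a patch to Xen's
console code. SlowLineWriter and delay_ms are hypothetical names, and a delay
of 0 keeps the current behaviour.

```python
import sys
import time


class SlowLineWriter:
    """Wrap a text stream and pause after every newline that is written."""

    def __init__(self, stream, delay_ms=0, sleep=time.sleep):
        self.stream = stream
        self.delay_s = delay_ms / 1000.0
        self.sleep = sleep  # injectable so the delay can be faked in tests

    def write(self, text):
        # keepends=True so each chunk still carries its own line ending
        for chunk in text.splitlines(keepends=True):
            self.stream.write(chunk)
            if chunk.endswith("\n") and self.delay_s > 0:
                self.stream.flush()  # show the line before pausing
                self.sleep(self.delay_s)
        return len(text)


if __name__ == "__main__":
    out = SlowLineWriter(sys.stdout, delay_ms=100)
    out.write("first line\nsecond line\n")
```
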
> > > > > > > > 
> > > > > > > > Yeah, we really need to get a log somehow to troubleshoot
> > > > > > > > your problem.
> > > > > > > > 
> > > > > > > > Serial console log would be the best:
> > > > > > > > http://wiki.xensource.com/xenwiki/XenSerialConsole
> > > > > > > 
> > > > > > > Btw are you running the latest kernel:
> > > > > > > http://koji.fedoraproject.org/koji/taskinfo?taskID=2254110
> > > > > > > 
> > > > > > > Or are you running custom/self compiled kernel?
> > > > > > 
> > > > > > Everything is working with:
> > > > > > xen-4.0.1-0.1.rc3 compiled from source on fc12 machine and
> > > > > > 2.6.32.14-1.2.107.xendom0.fc12.x86_64 from the  myoung repo.
> > > > > > 
> > > > > > All fixed.
> > > > > 
> > > > > Good to hear it works!
> > > > > 
> > > > > > We also now have a "null modem" cable to another old computer
> > > > > > with a COM port. Turns out I was the only old man that could
> > > > > > remember what a null modem cable is. The young guy said "wtf"?
> > > > > > Also turns out I'm the only one who knows what minicom is and
> > > > > > what 8N1 means
> > > > > > 
> > > > > > :-)
> > > > > 
> > > > > Hehe.. yeah I guess young people don't get to play with serial
> > > > > consoles nowadays, until they're doing networking stuff..
> > > > > 
> > > > > So I guess most SOL devices in servers go unused.. :)
> > > > > 
> > > > > -- Pasi
> > > > > 
> > > > > > All VMs are now running concurrently.
> > > > > > 
> > > > > > Very happy again. Thanks.
> > > > > > V
> > > > > > 
> > > > > > > -- Pasi
> > > > > > > 
> > > > > > > > > Cheers
> > > > > > > > > V
> > > > > > > > > 
> > > > > > > > > On Mon, 21 Jun 2010 12:10:17 pm Virgil wrote:
> > > > > > > > > > Just a quick update:
> > > > > > > > > > 
> > > > > > > > > > > Just tried xen-4.0.0-2. Recompiled from source on
> > > > > > > > > > > fc12.x86_64.
> > > > > > > > > > > 
> > > > > > > > > > > Identical behaviour.
> > > > > > > > > > 
> > > > > > > > > > Cheers
> > > > > > > > > > V
> > > > > > > > > > 
> > > > > > > > > > On Fri, 18 Jun 2010 03:17:19 pm Virgil wrote:
> > > > > > > > > > > On Sat, 29 May 2010 11:26:50 pm M A Young wrote:
> > > > > > > > > > > > If anyone wants to test xen 3.4.3, I have put up a
> > > > > > > > > > > > source RPM at
> > > > > > > > > > > > > http://myoung.fedorapeople.org/dom0/src/xen-3.4.3-0.91.fc13.src.rpm
> > > > > > > > > > > > 
> > > > > > > > > > > >   	Michael Young
> > > > > > > > > > > > 
> > > > > > > > > > > > --
> > > > > > > > > > > > xen mailing list
> > > > > > > > > > > > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > > > > > > > 
> > > > > > > > > > > Hi list,
> > > > > > > > > > > 
> > > > > > > > > > > Host crashing on 64FC12 kernel -105 dom0 when 2 PV64
> > > > > > > > > > > machines are run.
> > > > > > > > > > > 
> > > > > > > > > > > I can run HV32WinXP and HV32FC12 and 1 PV64FC12 all at
> > > > > > > > > > > the same time.
> > > > > > > > > > > 
> > > > > > > > > > > However, when any combination involves 2 PV64FC12
> > > > > > > > > > > (kernel version doesn't matter) the host crashes.
> > > > > > > > > > > 
> > > > > > > > > > > Running on the -97 dom0 everything works in all combos.
> > > > > > > > > > > 
> > > > > > > > > > > Using Xen 3.4.3.
> > > > > > > > > > > 
> > > > > > > > > > > Turning off the virt network cards in the PV64FC12
> > > > > > > > > > > machines makes things go (obviously not much use
> > > > > > > > > > > though).
> > > > > > > > > > > 
> > > > > > > > > > > Tried disabling IPV6, firewall stuff etc. etc.
> > > > > > > > > > > 
> > > > > > > > > > > Sometimes it would fire up and go but whichever machine
> > > > > > > > > > > is started second gets really long ping times like it's
> > > > > > > > > > > not receiving unless it sends something (if that makes
> > > > > > > > > > > sense). Sooner or later the host crashes.
> > > > > > > > > > > 
> > > > > > > > > > > Strangely a PV64FC12 and a PV64FC10 machine coexist
> > > > > > > > > > > happily. It's only when a second PV64FC12 machine
> > > > > > > > > > > starts up.
> > > > > > > > > > > 
> > > > > > > > > > > V
> > > > > > > > > > > --
> > > > > > > > > > > xen mailing list
> > > > > > > > > > > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > > > > > > 
> > > > > > > > > > --
> > > > > > > > > > xen mailing list
> > > > > > > > > > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > > > > > 
> > > > > > > > > --
> > > > > > > > > xen mailing list
> > > > > > > > > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > > > > > > > 
> > > > > > > > --
> > > > > > > > xen mailing list
> > > > > > > > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > > https://admin.fedoraproject.org/mailman/listinfo/xen
> > 
> > --
> > xen mailing list
> > xen@xxxxxxxxxxxxxxxxxxxxxxx
> > https://admin.fedoraproject.org/mailman/listinfo/xen
> 
> --
> xen mailing list
> xen@xxxxxxxxxxxxxxxxxxxxxxx
> https://admin.fedoraproject.org/mailman/listinfo/xen


