On 06/22/2016 12:45 PM, Jim Fehlig wrote: > On 06/22/2016 06:07 AM, Cole Robinson wrote: >> On 06/21/2016 10:32 PM, Jim Fehlig wrote: >>> On 06/21/2016 04:20 AM, Joao Martins wrote: >>>> On 06/21/2016 01:38 AM, Jim Fehlig wrote: >>>>> Joao Martins wrote: >>>>>> Guests use a <console /> (and sometimes <serial /> pair) to represent >>>>>> the console. On the callback that is called when console is brought up >>>>>> (NB: before domain starts), we fill the path of the console element with >>>>>> the appropriate "/dev/pts/X". For PV guests it all works fine, although >>>>>> for HVM guests it doesn't. Consequently we end up seeing erronous >>>>>> behaviour on HVM guests where console path is not displayed on the XML >>>>>> but still can be accessed with virDomainOpenConsole after booting guest. >>>>>> Consumers of this XML (e.g. Openstack) also won't be able to use the >>>>>> console path. >>>>> Can you provide example input domXML config that causes this problem? >>>>> >>>> Ah yes - I probably should include that in the commit description? >>> Yes, I think it helps clarify the problem being solved by the commit. >>> >>>> So, for PV guests for an input console XML element: >>>> >>>> <console type='pty'> >>>> <target type='xen' port='0'/> >>>> </console> >>>> >>>> Which then on domain start gets filled like: >>>> >>>> <console type='pty' tty='/dev/pts/3'> >>>> <source path='/dev/pts/3'/> >>>> <target type='xen' port='0'/> >>>> </console> >>>> >>>> Although for HVM guests we have: >>>> >>>> <serial type='pty'> >>>> <target port='0'/> >>>> </serial> >>>> <console type='pty'> >>>> <target type='serial' port='0'/> >>>> </console> >>>> >>>> And no changes happen i.e. source path isn't written there _even though_ it's >>>> set on the console element - being the reason why we can access the console in the >>>> first place. IIUC The expected output should be: >>>> >>>> <serial type='pty'> >>>> <source path='/dev/pts/4'/> >>>> <target port='0'/> >>>> </serial> >>>> <console type='pty' tty='/dev/pts/4'> >>>> <source path='/dev/pts/4'/> >>>> <target type='serial' port='0'/> >>>> </console> >>> I asked for an example after having problems testing your patch, but it turned >>> out to be an issue which I think was introduced long ago by commit 482e5f15. I >>> was testing with transient domains and noticed 'virsh create; 'virsh console' >>> always failed with >>> >>> error: internal error: character device <null> is not using a PTY >>> >>> My config only contained >>> >>> <console type='pty'> >>> <target port='0'/> >>> </console> >>> >>> so the "really crazy backcompat stuff for consoles" in >>> virDomainDefAddConsoleCompat() moved the config over to def->serials[0] and >>> created a new def->consoles[0] without def->consoles[0].source.type set to >>> VIR_DOMAIN_CHR_TYPE_PTY. Without source.type set, libxlConsoleCallback() would >>> ignore the console and not copy the PTY path to source.data.file.path. 'virsh >>> console' would then fail with the error noted above. >>> >>> I spent too much time debugging this, partly because I was confused by odd >>> behavior with persistent domains. E.g. 'virsh define; virsh start; virsh >>> console' would fail with the same error, but all subsequent 'virsh shutdown; >>> virsh start; virsh console' sequences would work :-). That behavior was due to >>> vm->newDef being populated with correct console config when calling >>> virDomainObjSetDefTransient() in libxlDomainStart(). After the first shutdown of >>> the domain, vm->newDef would be copied over to vm->def, allowing subsequent >>> 'virsh start; virsh console' to work correctly. >>> >> Hmm, that sounds weird. Can you explain more how exactly the XML changes? > > It only changes when defining the vm. > >> What >> is the diff between the initial 'virsh define; virsh dumpxml' > > Initial XML has > > <console type='pty'> > <target port='0'/> > </console> > > After define > > <serial type='pty'> > <target port='0'/> > </serial> > <console type='pty'> > <target type='serial' port='0'/> > </console> > > The config doesn't change after start; shutdown. But the contents of > virDomainDef does change, specifically def->consoles[0]->source.type changes > from VIR_DOMAIN_CHR_TYPE_PTY to VIR_DOMAIN_CHR_TYPE_NULL. And as mentioned > below, libxlConsoleCallback() only checks for source.type == > VIR_DOMAIN_CHR_TYPE_PTY before copying the PTY path to > def->consoles[0]->source.data.file.path. Another potential fix I also mentioned > below is to loosen that condition in libxlConsoleCallback(). I.e. copy the path > if source.type equals VIR_DOMAIN_CHR_TYPE_PTY or VIR_DOMAIN_CHR_TYPE_NULL (and > perhaps VIR_DOMAIN_CHR_TYPE_FILE too?). > Okay I think I follow. The thing I was missing is that virDomainDefAddConsoleCompat will alter ->def, but that's _not_ what is formated/printed in the XML, which has some special handling to fully duplicate serials[0] XML to consoles[0], if console target type=serial I _think_ what the proper thing to do here is something like: diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c index 221af87..6bcb4d9 100644 --- a/src/libxl/libxl_domain.c +++ b/src/libxl/libxl_domain.c @@ -964,13 +964,17 @@ libxlConsoleCallback(libxl_ctx *ctx, libxl_event *ev, void *for_callback) virObjectLock(vm); for (i = 0; i < vm->def->nconsoles; i++) { virDomainChrDefPtr chr = vm->def->consoles[i]; - if (chr && chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { + if (i == 0 && + chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL) + chr = vm->def->serials[0]; + + if (chr->source.type == VIR_DOMAIN_CHR_TYPE_PTY) { libxl_console_type console_type; char *console = NULL; int ret; console_type = - (chr->targetType == VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL ? + (chr->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL ? LIBXL_CONSOLE_TYPE_SERIAL : LIBXL_CONSOLE_TYPE_PV); ret = libxl_console_get_tty(ctx, ev->domid, chr->target.port, console_type, Untested, but basically, libxl driver needs to learn that if consoles[0].target.type == SERIAL, then you should use serials[0] instead. The qemu driver has some magic like this sprinkled around... The problem with your patch below is that it's not a whole fix, there's many other bits besides source.type that would technically need to be copied over. Rather than doing that, which has its own problems trying to keep the console[0] and serial[0] data in sync, the convention seems to be that console gets only target.type=serial and the drivers need to map that to serials[0]. That's my reading of things anyways Thanks, Cole > Regards, > Jim > >> and the dumpxml >> after the VM has started+shutdown once? Whatever that diff is, and why it's >> not in the XML to begin with, sounds like the root issue to me. >> >>> Long story short, I found the problem but not sure how to fix it :-). One >>> approach would be the below patch. Another would be to loosen the restriction in >>> libxlConsoleCallback, filing in source.data.file.path when the console source >>> type is also set to VIR_DOMAIN_CHR_TYPE_NULL. Adding Cole and Peter (both of >>> which have toiled in and likely loathe this code) to see if they have opinions >>> on a proper fix. >>> >>> Regards, >>> Jim >>> >>> >>> From a231636e8eec08152220e32af5f4026a50954f41 Mon Sep 17 00:00:00 2001 >>> From: Jim Fehlig <jfehlig@xxxxxxxx> >>> Date: Tue, 21 Jun 2016 20:25:23 -0600 >>> Subject: [PATCH] domain conf: copy source type from stolen console >>> >>> When domXML contains only <console type='pty'> and no corresponding >>> <serial>, the console is "stolen" [1] and used as the first <serial> >>> device. A new console is created as an alias to the first <serial> >>> device, but missed copying some configuration such as the source >>> 'type' attribute. >>> >>> [1] See comments associated with virDomainDefAddConsoleCompat() in >>> $LIBVIRT-SRC/src/conf/domain_conf.c: >>> >>> Signed-off-by: Jim Fehlig <jfehlig@xxxxxxxx> >>> --- >>> src/conf/domain_conf.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c >>> index 75ad03f..2fda4fe 100644 >>> --- a/src/conf/domain_conf.c >>> +++ b/src/conf/domain_conf.c >>> @@ -3845,6 +3845,7 @@ virDomainDefAddConsoleCompat(virDomainDefPtr def) >>> /* Create an console alias for the serial port */ >>> def->consoles[0]->deviceType = VIR_DOMAIN_CHR_DEVICE_TYPE_CONSOLE; >>> def->consoles[0]->targetType = >>> VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL; >>> + def->consoles[0]->source.type = def->serials[0]->source.type; >>> } >>> } else if (def->os.type == VIR_DOMAIN_OSTYPE_HVM && def->nserials > 0 && >>> def->serials[0]->deviceType == VIR_DOMAIN_CHR_DEVICE_TYPE_SERIAL && >>> >> Then again this also seems okay to me, but it's concerning that I applied this >> but it didn't cause the test suite to change at all. >> >> - Cole >> > -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list