Re: [[PATCH v2] 0/4] try harder to get dest qemu errors on migration

On 06.10.2016 10:29, Peter Krempa wrote:
> On Thu, Oct 06, 2016 at 10:23:05 +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 05.10.2016 18:13, Peter Krempa wrote:
>>> On Mon, Sep 12, 2016 at 17:34:39 +0300, Nikolay Shirokovskiy wrote:
>>>> Hi, all.
>>>>
>>>>   In case migration fails because the destination qemu exits unexpectedly, the
>>>> user receives the qemu log in the error message. Unfortunately the log is
>>>> truncated and the most interesting part is missing (below is an example of
>>>> such a log [1]).
>>>>
>>>> Actually, in most cases the first patch will be enough to fix the issue.
>>>> Originally I thought the problem was that qemu logging and reading the log are
>>>> not in sync (which is true), so I tried to fix that as well in the next patches.
>>>>
>>>> * diff from v1:
>>>>
>>>> 1. split changes to libvirtd and virtlogd to different patches
>>>> 2. split virtlogd patch further
>>>> 3. simplify handling eofs and hangups in draining function
>>>>
>>>> [1] log example:
>>>>
>>>> CPU Reset (CPU 0)
>>>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
>>>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>>>> EIP=00000000 EFL=00000000 [-------] CPL=0 II=0 A20=0 SMM=0 HLT=0
>>>> ES =0000 00000000 00000000 00000000
>>>> CS =0000 00000000 00000000 00000000
>>>> SS =0000 00000000 00000000 00000000
>>>> DS =0000 00000000 00000000 00000000
>>>> FS =0000 00000000 00000000 00000000
>>>> GS =0000 00000000 00000000 00000000
>>>> LDT=0000 00000000 00000000 00000000
>>>> TR =0000 00000000 00000000 00000000
>>>> GDT=     00000000 00000000
>>>> IDT=     00000000 00000000
>>>> CR0=00000000 CR2=00000000 CR3=00000000 CR4=00000000
>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
>>>> DR6=0000000000000000 DR7=0000000000000000
>>>> CCS=00000000 CCD=00000000 CCO=DYNAMIC 
>>>> EFER=0000000000000000
>>>> FCW=0000 FSW=0000 [ST=0] FTW=ff MXCSR=00000000
>>>> FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
>>>> FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
>>>> FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
>>>> FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
>>>> XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
>>>> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
>>>> XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
>>>> XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
>>>> CPU Reset (CPU 1)
>>>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=000206a1
>>>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>>>> EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>>>> ES =0000 00000000 0000ffff 00009300
>>>> CS =f000 ffff0000 0000ffff 00009b00
>>>> SS =0000 00000000 0000ffff 00009300
>>>> DS =0000 00000000 0000ffff 00009300
>>>> FS =0000 00000000 0000ffff 00009300
>>>> GS =0000 00000000 0000ffff 00009300
>>>> LDT=0000 00000000 0000ffff 00008200
>>>> TR =0000 00000000 0000ffff 00008b00
>>>> GDT=     00000000 0000ffff
>>>> IDT=     00000000 0000ffff
>>>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>>>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000 
>>>> DR6=00000000ffff0ff0 DR7=0000000000000400
>>>> CCS=00000000 CCD=00000000 CCO=DYNAMIC 
>>>> EFER=0000000000000000
>>>> FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
>>>> FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
>>>> FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
>>>> FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
>>>> FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
>>>> XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
>>>> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
>>>> XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
>>>> XMM06=00000000000000000000000000000000 XMM07=000
>>>> qemu: terminating on signal 15 from pid 168133
>>>
>>> I don't think that reporting all of the above is a good idea. We should
>>> perhaps report at most the last two lines.
>>>
>>
>> We already report about half of this; this patch just removes the random
>> truncation. As for keeping at most two lines, AFAIU one cannot say which part
>> of this log will be useful for crash investigation.
> 
> This is not about the log but about the error message. The error message
> containing ALL of the above stuff is useless for any user. For crash
> investigation you can always get the full log from the actual log file.

Isn't leaving 1-2 lines random as well? Sometimes it will help, sometimes not.
If qemu writes 10 lines before dying (besides the register dump), the first
line will probably be the most interesting. I think we'd better just print
something like "qemu died" ("go and see its log if you want to").
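
To make the two options concrete, here is a minimal sketch (not libvirt
code) of an error message that keeps only the last few log lines and
points the user at the log file. The names tail_lines and format_error,
the log_path argument and the message wording are all hypothetical,
chosen only for illustration.

```python
def tail_lines(log_text, max_lines=2):
    """Return the last max_lines non-empty lines of log_text."""
    if max_lines <= 0:
        return ""
    # Drop blank lines so trailing newlines do not use up the budget.
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    return "\n".join(lines[-max_lines:])


def format_error(log_text, log_path, max_lines=2):
    """Build a short error message that refers to the full log file.

    With max_lines=0 this is the bare "qemu died" variant; with a small
    positive value it appends only the end of the log.
    """
    msg = "qemu died (see %s for the full log)" % log_path
    tail = tail_lines(log_text, max_lines)
    if tail:
        msg += ":\n" + tail
    return msg


if __name__ == "__main__":
    sample = (
        "XMM06=00000000000000000000000000000000 XMM07=000\n"
        "qemu: terminating on signal 15 from pid 168133\n"
    )
    print(format_error(sample, "/path/to/domain.log", max_lines=1))
```

Either way the register dump stays out of the error message, and whoever
investigates the crash still has the whole log file.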

> 
> When I implemented this I did not see such an error message. I'd
> otherwise truncate it to the end since all of the above in an error message
> is clearly ridiculous.
> 

It is pretty impressive actually )) It scared me a couple of times when
I started with libvirt. It looks like a BSOD or something. (Hmm, this could be
a reason not to print the dump for an unprepared user ))

Nikolay

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list


