Hi Jon,

Thanks a lot for attaching the patch. Replies inline.

On Thu, Sep 24, 2015 at 1:52 AM, John Ferlan <jferlan@xxxxxxxxxx> wrote:
>
>
> On 09/22/2015 07:21 AM, Shivaprasad bhat wrote:
>> On Mon, Sep 21, 2015 at 8:04 PM, John Ferlan <jferlan@xxxxxxxxxx> wrote:
>>>
>>>
>>> On 09/21/2015 05:09 AM, Shivaprasad bhat wrote:
>>>> Thanks John for the comments.
>>>>
>>>>
>>>> On Fri, Sep 18, 2015 at 10:34 PM, John Ferlan <jferlan@xxxxxxxxxx> wrote:
>>>>>
>>>>>
>>>>> On 09/14/2015 10:44 AM, Shivaprasad G Bhat wrote:
>>>>>> Tunnelled migration can hang if the destination qemu exits despite all the
>>>>>> ABI checks. This happens whenever the destination qemu exits before the
>>>>>> complete transfer is noticed by source qemu. The savevm state checks at
>>>>>> runtime can fail at destination and cause qemu to error out.
>>>>>> The source qemu cant notice it as the EPIPE is not propogated to it.
>>>>>> The qemuMigrationIOFunc() notices the stream being broken from virStreamSend()
>>>>>> and it cleans up the stream alone. The qemuMigrationWaitForCompletion() would
>>>>>> never get to 100% transfer completion.
>>>>>> The qemuMigrationWaitForCompletion() never breaks out as well since
>>>>>> the ssh connection to destination is healthy, and the source qemu also thinks
>>>>>> the migration is ongoing as the Fd to which it transfers, is never
>>>>>> closed or broken. So, the migration will hang forever. Even Ctrl-C on the
>>>>>> virsh migrate wouldn't be honoured. Close the source side FD when there is
>>>>>> an error in the stream. That way, the source qemu updates itself and
>>>>>> qemuMigrationWaitForCompletion() notices the failure.
>>>>>>
>>>>>> Close the FD for all kinds of errors to be sure. The error message is not
>>>>>> copied for EPIPE so that the destination error is copied instead later.
>>>>>>
>>>>>> Note:
>>>>>> Reproducible with repeated migrations between Power hosts running in different
>>>>>> subcores-per-core modes.
>>>>>>
>>>>>> Changes from v1 -> v2:
>>>>>> VIR_FORCE_CLOSE() was called twice for this use case which would log
>>>>>> unneccessary warnings. So, move the fd close to qemuMigrationIOFunc
>>>>>> so that there are no unnecessary duplicate attempts.(Would this trigger
>>>>>> a Coverity error? I don't have a setup to check.)
>>>>>>
>>>>>> Signed-off-by: Shivaprasad G Bhat <sbhat@xxxxxxxxxxxxxxxxxx>
>>>>>> ---
>>>>>>  src/qemu/qemu_migration.c | 8 ++++++--
>>>>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
>>>>>> index ff89ab5..9602fb2 100644
>>>>>> --- a/src/qemu/qemu_migration.c
>>>>>> +++ b/src/qemu/qemu_migration.c
>>>>>> @@ -4012,6 +4012,7 @@ static void qemuMigrationIOFunc(void *arg)
>>>>>>      if (virStreamFinish(data->st) < 0)
>>>>>>          goto error;
>>>>>>
>>>>>> +    VIR_FORCE_CLOSE(data->sock);
>>>>>>      VIR_FREE(buffer);
>>>>>>
>>>>>>      return;
>>>>>> @@ -4029,7 +4030,11 @@ static void qemuMigrationIOFunc(void *arg)
>>>>>>      }
>>>>>>
>>>>>>  error:
>>>>>> -    virCopyLastError(&data->err);
>>>>>> +    /* Let the source qemu know that the transfer cant continue anymore.
>>>>>> +     * Don't copy the error for EPIPE as destination has the actual error. */
>>>>>> +    VIR_FORCE_CLOSE(data->sock);
>>>>>> +    if (!virLastErrorIsSystemErrno(EPIPE))
>>>>>> +        virCopyLastError(&data->err);
>>>>>>      virResetLastError();
>>>>>>      VIR_FREE(buffer);
>>>>>>  }
>>>>>> @@ -4397,7 +4402,6 @@ qemuMigrationRun(virQEMUDriverPtr driver,
>>>>>>          if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0)
>>>>>>              ret = -1;
>>>>>>      }
>>>>>> -    VIR_FORCE_CLOSE(fd);
>>>>>
>>>>> ^^^
>>>>>
>>>>> This causes Coverity to claim a RESOURCE_LEAK
>>>>>
>>>>> Feels like this was a mistake edit...
>>>>>
>>>>
>>>> The VIR_FORCE_CLOSE() inside qemuMigrationIOFunc() alone is sufficient.
>>>> Having this again here would lead to Warning in the logs. I too thought coverity
>>>> would complain. Is there a way to force ignore a coverity warning?
>>>>
>>>
>>> Typically a marker of sorts, such as
>>>
>>> /* coverity[leaked_handle] <some extra comment explaining why> */
>>>
>>> Although I'm not sure that's the best way to handle this either.
>>>
>>> The problem I see though is this is an "error path" issue and when
>>> perhaps it's safe/required to close fd/io->sock/data->sock.
>>>
>>> Your commit comment notes that the issue is seen on a fairly specific
>>> event (virStreamSend failure) for a specific type of migration.
>>
>> I believe the failure can be seen for all types of migration with savestate
>> mismatch.
>>
>
> My thoughts were based mostly on your commit message comments:
>
> "The qemuMigrationIOFunc() notices the stream being broken from
> virStreamSend() and it cleans up the stream alone. The
> qemuMigrationWaitForCompletion() would never get to 100% transfer
> completion."
>
> and the belief that the core issue you described is there's no way for
> qemuMigrationCompleted to know the qemuMigrationIOFunc thread failed and
> thus that's what caused the hang. Since we can pass the iothread "io"
> object to the Completion function, I figured we could also use that to
> check the status of the thread in the list of things already checked. I
> guess it's not clear to me why your test fails below. Maybe it'll help
> to attach/show you my thoughts in a patch format. I just took your
> changes and "adjusted" them, merging the differences. So you could 'git
> am' to top of trunk in order to compare my thoughts.

I tried your patch and ran migrations in a while loop. I hit the hang on the
7th run itself, so I feel the timing window, though small, is not negligible.

>
> I'm not opposed to the chosen method, although perhaps Dan or Jiri who
> understand more of the inner workings of migration than I do can provide
> some comments. Changing where the fd is closed could have some corner
> race condition I'm not aware of...
>
> One issue that needs to be resolved with your patch is what happens to
> the 'fd' when qemuMigrationIOFunc isn't involved? IOW: if "fwdType ==
> MIGRATION_FWD_DIRECT", then qemuMigrationStartTunnel isn't called and
> thus qemuMigrationIOFunc won't be closing the 'fd' in the form of
> data->sock. That's perhaps a path Coverity was considering (and I hadn't
> considered initially).
>
> I think to resolve this the following should be added to your patch (I
> tried it, Coverity is happy too)...
>
>     if (spec->fwdType != MIGRATION_FWD_DIRECT) {
>         if (!(iothread = qemuMigrationStartTunnel(spec->fwd.stream, fd)))
>             goto cancel;
>         /* If we've created a tunnel, then the 'fd' will be closed in the
>          * qemuMigrationIOFunc as data->sock
>          */
>         fd = -1;
>     }
>
> and restore the VIR_FORCE_CLOSE(fd);
>

I agree. Let me know if we should go with this approach.
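
Just so we are looking at the same thing, here is roughly how I picture the
fd ownership with your hunk folded in (an untested sketch from my reading of
qemuMigrationRun(), so the surrounding context may not match git exactly):

    if (spec->fwdType != MIGRATION_FWD_DIRECT) {
        if (!(iothread = qemuMigrationStartTunnel(spec->fwd.stream, fd)))
            goto cancel;
        /* The IO thread now owns 'fd' as data->sock and closes it in
         * qemuMigrationIOFunc() on both the success and the error path. */
        fd = -1;
    }

    ...

    /* later, on the way out of qemuMigrationRun() */
    if (spec->fwdType != MIGRATION_FWD_DIRECT) {
        if (iothread && qemuMigrationStopTunnel(iothread, ret < 0) < 0)
            ret = -1;
    }
    /* A no-op for tunnelled migration since 'fd' was set to -1 above, but
     * it still closes the fd for MIGRATION_FWD_DIRECT. */
    VIR_FORCE_CLOSE(fd);

That keeps a single owner for the fd in the tunnelled case, which is what
avoids the duplicate close warnings from v1.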
I'll spin the next version with the above corrections. I am copying Jiri on
this patch for his opinion as well.

>
>>> As I
>>> read the code, that failure jumps to error (as does virStreamFinish). So
>>> now you have a fairly specific set of instances which perhaps would
>>> cause qemuMigrationWaitForCompletion to need to fail. The fix you have
>>> proposed is to close the data->sock (io->sock, fd). However, your
>>> proposal is a larger hammer. I assume closure of data->sock causes
>>> WaitForCompletion to fail (perhaps differently) and that's why you chose it.
>>>
>>> Going back to the beginning of qemuMigrationRun, setting the 'fd' and
>>> using StartTunnel seems to rely on MIGRATION_FWD_DIRECT, but if a tunnel
>>> is created an (ugh) 'iothread' variable is NON-NULL. What if 'iothread'
>>> were passed to qemuMigrationWaitForCompletion? Then again passed to
>>> qemuMigrationCompleted which would check if iothread non-NULL and for
>>> some new flag that could be created in _qemuMigrationIOThread, being
>>> true. If it were true, then that would cause failure so that
>>> WaitForCompletion would return error status. The only way the flag is
>>> set to true is in qemuMigrationIOFunc when the code jumps to error.
>>> (e.g. add bool stream_abort and data->stream_abort = true in "error:",
>>> then check iothread && iothread->stream_abort, force error).
>>>
>>
>> I tried this. Until the fd is closed, the 'info migrate' wouldn't
>> return from the
>> qemu. So, the migration hangs. Checking the stream status before/after
>> fetching the job status would still leave a race. So, having it in a different
>> thread (qemuMigrationIOFunc) here seems inevitable to me.
>
> Although perhaps susceptible to a smaller "timing" window, if we move
> the iothread check to before the CheckJobStatus call in Completed, then
> that'll work better?
>
> It seems the "query-migrate" command has no timeout... I assume qemu
> would indicate that if you call it when the migration pipe had an error,
> then you unpredictable results (only a guess - I didn't look at that
> code as well). Perhaps that means libvirt needs some code to handle the
> condition where the command doesn't finish in a timely manner.
>

No, if the pipe was broken, query-migrate returns "failed". This is already
handled in a generic way in libvirt:

90.408 > 0x3fff80007410 {"execute":"query-migrate","id":"libvirt-725"}
90.408 < 0x3fff80007410 {"return": {"status": "failed"}, "id": "libvirt-725"}
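
To spell out why I still hit the hang even with the iothread check moved
before the CheckJobStatus call, this is the ordering I am worried about
(pseudo-code around the stream_abort flag you suggested, not the actual
patch):

    /* inside qemuMigrationCompleted() */
    if (iothread && iothread->stream_abort)                      /* (1) */
        goto error;

    if (qemuMigrationCheckJobStatus(driver, vm, asyncJob) < 0)   /* (2) */
        goto error;

If the stream breaks after (1) while the fd is still open, the flag is not
seen, and (2) then sits in qemuMonitorSend() because qemu doesn't answer
'info migrate' until the fd is closed (that is the backtrace quoted below).
We never get back to (1), so only the IO thread itself can break the
deadlock by closing data->sock.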
Thanks and Regards,
Shiva

> Perhaps worth a shot... I did change my attached patch to reflect this thought..
>
> John
>>
>> #1  0x00003fff83ff593c in virCondWait (c=<optimized out>, m=<optimized out>)
>>     at util/virthread.c:154
>> #2  0x00003fff76235544 in qemuMonitorSend (mon=0x3fff54003970,
>>     msg=<optimized out>) at qemu/qemu_monitor.c:1035
>> #3  0x00003fff7624fb30 in qemuMonitorJSONCommandWithFd (mon=0x3fff54003970,
>>     cmd=0x3fff5c001420, scm_fd=-1, reply=0x3fff79fbd388)
>>     at qemu/qemu_monitor_json.c:293
>> #4  0x00003fff76254b90 in qemuMonitorJSONCommand (reply=0x3fff79fbd388,
>>     cmd=0x3fff5c001420, mon=0x3fff54003970) at qemu/qemu_monitor_json.c:323
>> #5  qemuMonitorJSONGetMigrationStatus (mon=0x3fff54003970,
>>     status=0x3fff79fbd538) at qemu/qemu_monitor_json.c:2620
>> #6  0x00003fff7623a664 in qemuMonitorGetMigrationStatus (mon=<optimized out>,
>>     status=<optimized out>) at qemu/qemu_monitor.c:2134
>> #7  0x00003fff76228e6c in qemuMigrationFetchJobStatus (driver=0x3fff6c118d80,
>>     vm=0x3fff6c27faf0, asyncJob=<optimized out>, jobInfo=0x3fff79fbd4f0)
>>     at qemu/qemu_migration.c:2528
>> #8  0x00003fff76228fb4 in qemuMigrationUpdateJobStatus (driver=0x3fff6c118d80,
>>     vm=0x3fff6c27faf0, asyncJob=QEMU_ASYNC_JOB_MIGRATION_OUT)
>>     at qemu/qemu_migration.c:2565
>> #9  0x00003fff76229200 in qemuMigrationCheckJobStatus (driver=0x3fff6c118d80,
>>     vm=0x3fff6c27faf0, asyncJob=QEMU_ASYNC_JOB_MIGRATION_OUT)
>> ---Type <return> to continue, or q <return> to quit---cont
>>     at qemu/qemu_migration.c:2585
>> #10 0x00003fff76229408 in qemuMigrationCompleted (iothread=<optimized out>,
>>
>>> The only question then in my mind then is would this also be something
>>> that should be done for the virStreamFinish failure path which also
>>> jumps to error?
>>>
>>
>> Yes, I too am not sure if its appropriate for virStreamFinish failure
>> case. But, given
>> this an error path, I feel we can safely go ahead and close the
>> data->sock.
>>
>>> Doing this means you shouldn't need the VIR_FILE_CLOSE(data->sock) for
>>> either the normal or error path. Does this seem to be a reasonable
>>> approach and solve the issue you're facing?
>>>
>>> John
>>>> Thanks and Regards,
>>>> Shivaprasad
>>>>
>>>>> The rest of the patch looks reasonable; however, I'm far from the expert
>>>>> in these matters.
>>>>>
>>>>> John
>>>>>>
>>>>>>      if (priv->job.completed) {
>>>>>>          qemuDomainJobInfoUpdateTime(priv->job.completed);
>>>>>>

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list