Re: Entering freeze for libvirt-3.5.0

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  Daniel P. Berrange wrote:

> On Mon, Jul 03, 2017 at 05:20:13PM +0400, Roman Bogorodskiy wrote:
> >   Ján Tomko wrote:
> > 
> > > [cc: Guido]
> > > 
> > > On Sat, Jul 01, 2017 at 02:18:58PM +0400, Roman Bogorodskiy wrote:
> > > >  Andrea Bolognani wrote:
> > > >> virnetsockettest also fails pretty often for me, certainly
> > > >> more than your figure; even if that wasn't the case, 1/5
> > > >> failure rate is way too high for a CI job.
> > > >
> > > >I played a little more with virnetsockettest to get real stats and
> > > >figured the following:
> > > >
> > > > 1. On my desktop (i5) and laptop (i3), I didn't get any failures in 50
> > > > 'check' runs
> > > > 2. On a VM that I use to run test builds in Jenkins, out of 50 runs it
> > > > fails from 1 to 6 times; I did this test a couple of times and either I
> > > > was lucky or failure rate is higher when my Jenkins perform regular
> > > > builds.
> > > >
> > > >Anyway, I'll try to find a way to debug what's going on with
> > > >virnetsockettest.
> > > >
> > > 
> > > IIRC Debian disabled this test years ago.
> > > 
> > > Guido, have you ever discovered the cause?
> > > 
> > > Jan
> > 
> > I made some experiments on the weekend, and here are my results:
> > 
> > On a box where test fails from time to time, it fails at this point:
> > 
> >     virObjectUnref(csock);
> > 
> >     for (i = 0; i < nlsock; i++) {
> >         if (virNetSocketAccept(lsock[i], &ssock) != -1 && ssock) {
> >             char c = 'a';
> >             if (virNetSocketWrite(ssock, &c, 1) != -1 &&
> >                 virNetSocketRead(ssock, &c, 1) != -1) {
> >                 VIR_DEBUG("Unexpected client socket present"); <--- HERE
> >                 goto cleanup;
> >             }   
> >         }   
> >         virObjectUnref(ssock);
> >         ssock = NULL;
> >     }  
> > 
> > On a box where this test never fails, it reaches this block, but:
> > 
> >  * virNetSocketWrite(ssock, &c, 1) != -1
> >  * virNetSocketRead(ssock, &c, 1) == -1
> > 
> > It's enough to make the test pass. On a failing box both Write() and Read()
> > return != -1 when the test fails.
> 
> We discussed this on IRC and what is happening here is a race condition.
> 
> The test suite is connecting a client to the server and then immediately
> closing the client connection. When the server tries to accept the
> client, usually it'll get -1 because the client has already gone away.
> Socket termination is a multi-stage process at the network layer, and
> so there is a non-zero chance that Accept will succeed. The test suite
> is assuming that if the accept did succeeed, then we'll get I/O error
> on read & write, but this is also not actually guaranteed. It is
> possible that write may suceed buffering output, and read may simply
> see '0' for EOF. When this happens the test suite will fail. Roman
> debugging a failing run & confirmed this is exactly what's happening.
> 
> IOW, this test suite is just plain broken. It needs rewriting to do
> a more sensible real world test. ie spawn a thread to act as the
> server, and have the server just read from the client & echo it
> back to the client. The main thread acts as the client and tests
> this echo'ing. This is race free and is an real-world example of
> usage.

Yeah, I'll try to re-write this test, this weekend hopefully.

Roman Bogorodskiy

Attachment: signature.asc
Description: PGP signature

--
libvir-list mailing list
libvir-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/libvir-list

[Index of Archives]     [Virt Tools]     [Libvirt Users]     [Lib OS Info]     [Fedora Users]     [Fedora Desktop]     [Fedora SELinux]     [Big List of Linux Books]     [Yosemite News]     [KDE Users]     [Fedora Tools]
  Powered by Linux