On Thursday, April 4, 2019 at 19:10 +0300, Yaniv Kaul wrote:
> I'm not convinced this is solved. I just had what I believe is a
> similar failure:
>
> *00:12:02.532* A dependency job for rpc-statd.service failed. See
> 'journalctl -xe' for details.
> *00:12:02.532* mount.nfs: rpc.statd is not running but is required
> for remote locking.
> *00:12:02.532* mount.nfs: Either use '-o nolock' to keep locks
> local, or start statd.
> *00:12:02.532* mount.nfs: an incorrect mount option was specified
>
> (Of course, it can always be my patch!)
>
> https://build.gluster.org/job/centos7-regression/5384/console

Same issue, different builder (206). I will check them all, as the
issue is more widespread than I expected (or it has popped up since I
last checked).

> On Thu, Apr 4, 2019 at 6:56 PM Atin Mukherjee <amukherj@xxxxxxxxxx>
> wrote:
>
> > Thanks misc. I have always seen a pattern where, on a reattempt
> > (recheck centos), the same builder is picked up many times, even
> > though builders are supposed to be picked in a round-robin
> > manner.
> >
> > On Thu, Apr 4, 2019 at 7:24 PM Michael Scherer
> > <mscherer@xxxxxxxxxx> wrote:
> >
> > > On Thursday, April 4, 2019 at 15:19 +0200, Michael Scherer
> > > wrote:
> > > > On Thursday, April 4, 2019 at 13:53 +0200, Michael Scherer
> > > > wrote:
> > > > > On Thursday, April 4, 2019 at 16:13 +0530, Atin Mukherjee
> > > > > wrote:
> > > > > > Based on what I have seen, any multi-node test case will
> > > > > > fail, and the above one is picked first from that group.
> > > > > > If I am correct, none of the code fixes will get through
> > > > > > regression until this is fixed. I suspect it to be an
> > > > > > infra issue again. If we look at
> > > > > > https://review.gluster.org/#/c/glusterfs/+/22501/ and
> > > > > > https://build.gluster.org/job/centos7-regression/5382/,
> > > > > > peer handshaking is stuck, as 127.1.1.1 is unable to
> > > > > > receive a response back. Did we end up with firewall and
> > > > > > other n/w settings screwed up? The test never fails
> > > > > > locally.
> > > > >
> > > > > The firewall didn't change, and has had the line
> > > > > "-A INPUT -i lo -j ACCEPT" since the start, so all traffic
> > > > > on the loopback interface works. (I am not even sure that
> > > > > netfilter does anything meaningful on the loopback
> > > > > interface, but maybe I am wrong, and I am not keen on
> > > > > digging through kernel code to check.)
> > > > >
> > > > > Ping seems to work fine as well, so we can exclude a
> > > > > routing issue.
> > > > >
> > > > > Maybe we should look at the socket: does it listen on a
> > > > > specific address or not?
> > > >
> > > > So, I looked at the first 20 failures, removed all not
> > > > related to rebal-all-nodes-migrate.t, and saw that all were
> > > > run on builder203, which was freshly reinstalled. As
> > > > Deepshika noticed today, this one had an issue with ipv6,
> > > > the second issue we were tracking.
> > > >
> > > > Summary: the rpcbind.socket systemd unit listens on ipv6
> > > > despite ipv6 being disabled, and the fix is to reload
> > > > systemd.
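
(For reference, the check and the reload workaround just described
would look roughly like this; a minimal sketch, assuming the stock
CentOS 7 rpcbind unit and its standard port 111, not the exact
commands run on the builders:)

    # If rpcbind.socket was set up before ipv6 got disabled, the
    # stale listener survives: a [::]:111 line here shows it.
    ss -ln | grep ':111'

    # Workaround: make systemd re-read its state and recreate the
    # socket, this time without the ipv6 listeners.
    systemctl daemon-reload
    systemctl restart rpcbind.socket rpcbind.service
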
> > > > We have so far no idea why it happens, but we suspect it
> > > > might be related to the network issue we did identify, as it
> > > > happens only after a reboot, and a reboot happens only if a
> > > > build is cancelled/crashed/aborted.
> > > >
> > > > I applied the workaround on builder203, so if the culprit is
> > > > that specific issue, I guess that's fixed.
> > > >
> > > > I started a test to see how it goes:
> > > > https://build.gluster.org/job/centos7-regression/5383/
> > >
> > > The test did just pass, so I would assume the problem was
> > > local to builder203. I am not sure why it was always selected,
> > > except that it was the only one failing, so it was always free
> > > to pick up new jobs.
> > >
> > > Maybe we should increase the number of builders so this
> > > doesn't happen, as I guess the other builders were busy at
> > > that time?
> > >
> > > --
> > > Michael Scherer
> > > Sysadmin, Community Infrastructure and Platform, OSAS
> >
> > _______________________________________________
> > Gluster-devel mailing list
> > Gluster-devel@xxxxxxxxxxx
> > https://lists.gluster.org/mailman/listinfo/gluster-devel
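
(As for the "does the socket listen on a specific address" question
above, a minimal sketch of the checks; 24007 is the default glusterd
port, the iptables rule is the one quoted earlier, and the rest is
generic CentOS 7 tooling:)

    # Listening sockets with the owning process: a 0.0.0.0:24007
    # entry means glusterd is bound to all addresses, a
    # 127.0.0.1:24007 entry means loopback only.
    ss -ltnp | grep 24007

    # Confirm loopback traffic really hits the ACCEPT rule: the
    # packet counters on the '-i lo' rule should grow while pinging
    # the address the peer handshake was stuck on.
    iptables -L INPUT -v -n | grep ' lo '
    ping -c 3 127.1.1.1
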
--
Michael Scherer
Sysadmin, Community Infrastructure and Platform, OSAS

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel