Copy that. Thanks for looking into the issue.
David
------ Original Message ------
Sent: 2/5/2015 9:05:43 PM
Subject: Re: [Gluster-devel] missing files
Correct! I have seen (back in the day; it's been about three years since I last saw it) 50+ volumes, each with a geo-rep session, push system load to the point where pings couldn't be serviced within the ping timeout. So it is known to happen, but there has been a lot of work in the geo-rep space to help here, some of which has been discussed (think tar + ssh and other fixes).
Your symptoms remind me of that case of 50+ geo-replicated volumes; that's why I mentioned it from the start. My current shoot-from-the-hip theory is that while rsyncing all that data the servers got too busy to service the pings, and that led to the disconnects. This is common across all of the clustering / distributed software I have worked on: if the system gets too busy to service the heartbeat within the timeout, things go crazy (think fork bomb on a single host).
Now, this could be a case of me projecting symptoms from an old issue onto what you are describing, but that's where my head is at. If I'm correct, I should be able to repro using a similar workload. I think the multi-threaded epoll changes that _just_ landed in master will help resolve this, but they are so new that I haven't been able to test them. I'll know more when I get a chance to test tomorrow.
-b
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users