It's worth me adding that since geo-replication broke, if I query the
volume status (in this instance, on test1), I get this:

test1# gluster volume status
Another transaction is in progress. Please try again after sometime.

It's still giving this error, 24 hours later.

Cheers,
Kingsley.

On Mon, 2014-10-13 at 16:51 +0100, Kingsley wrote:
> Hi,
>
> I have a small script to simulate file activity for an application we
> have. It breaks geo-replication within about 15 - 20 seconds when I try
> it.
>
> This is on a small Gluster test environment running in VMs on CentOS 6.5
> with gluster 3.6.0 beta3. I have 6 VMs - test1, test2, test3, test4,
> test5 and test6. test1, test2, test3 and test4 are gluster servers,
> while test5 and test6 are the clients. test3 is not actually used in
> this test.
>
> Before the test, I had a single gluster volume as follows:
>
> test1# gluster volume status
> Status of volume: gv0
> Gluster process                                         Port    Online  Pid
> ------------------------------------------------------------------------------
> Brick test1:/data/brick/gv0                             49168   Y       12017
> Brick test2:/data/brick/gv0                             49168   Y       11835
> NFS Server on localhost                                 2049    Y       12032
> Self-heal Daemon on localhost                           N/A     Y       12039
> NFS Server on test4                                     2049    Y       7934
> Self-heal Daemon on test4                               N/A     Y       7939
> NFS Server on test3                                     2049    Y       11768
> Self-heal Daemon on test3                               N/A     Y       11775
> NFS Server on test2                                     2049    Y       11849
> Self-heal Daemon on test2                               N/A     Y       11855
>
> Task Status of Volume gv0
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> I created a new volume and set up geo-replication as follows (as these
> are test machines I only have one file system on each, hence using
> "force" to create the bricks in the root FS):
>
> test4# date ; gluster volume create gv0-slave test4:/data/brick/gv0-slave force; date
> Mon Oct 13 15:03:14 BST 2014
> volume create: gv0-slave: success: please start the volume to access data
> Mon Oct 13 15:03:15 BST 2014
>
> test4# date ; gluster volume start gv0-slave; date
> Mon Oct 13 15:03:36 BST 2014
> volume start: gv0-slave: success
> Mon Oct 13 15:03:39 BST 2014
>
> test4# date ; gluster volume geo-replication gv0 test4::gv0-slave create push-pem force ; date
> Mon Oct 13 15:05:59 BST 2014
> Creating geo-replication session between gv0 & test4::gv0-slave has been successful
> Mon Oct 13 15:06:11 BST 2014
>
> I then mount volume gv0 on one of the client machines. I can create
> files within the gv0 volume and can see the changes being replicated to
> the gv0-slave volume, so I know that geo-replication is working at the
> start.
>
> When I run my script (which quickly creates, deletes and renames files),
> geo-replication breaks within a very short time. The test script output
> is in
> http://gluster.dogwind.com/files/georep20141013/test6_script-output.log
> (I interrupted the script once I saw that geo-replication was broken).
> Note that when the script deletes a file, it renames any later-numbered
> files so that the file numbering remains sequential with no gaps; this
> simulates a real-world application that we use.
>
> If you want a copy of the test script, it's here:
> http://gluster.dogwind.com/files/georep20141013/test_script.tar.gz
>
> The various gluster log files can be downloaded from here:
> http://gluster.dogwind.com/files/georep20141013/ - each log file has the
> actual log file path at the top of the file.
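>
> In case it's useful to see the shape of the workload without unpacking
> the tarball, the loop is roughly like the sketch below (a simplified
> illustration, not the exact test.pl; the directory path and the exact
> mix of operations here are just placeholders):
>
> #!/usr/bin/perl
> # Rough sketch of the churn pattern: keep files msg.1 .. msg.N in a
> # directory on the mounted gluster volume, and repeatedly create,
> # delete and rename them, closing any numbering gap after a delete.
> use strict;
> use warnings;
>
> my $dir = "/home/gv0/mailstore";   # placeholder path on the mounted gv0 volume
> my $count = 0;
>
> for (1 .. 1000) {
>     my $op = int(rand(3));
>     if ($op == 0 || $count == 0) {
>         # create a new highest-numbered file
>         $count++;
>         open(my $fh, '>', "$dir/msg.$count") or die $!;
>         print $fh "test data\n";
>         close($fh);
>     } elsif ($op == 1) {
>         # delete a random file, then rename later files down by one
>         my $victim = 1 + int(rand($count));
>         unlink("$dir/msg.$victim");
>         for my $n ($victim + 1 .. $count) {
>             rename("$dir/msg.$n", "$dir/msg." . ($n - 1));
>         }
>         $count--;
>     } else {
>         # rename a random file to a temporary name and back again
>         my $n = 1 + int(rand($count));
>         rename("$dir/msg.$n", "$dir/msg.$n.tmp");
>         rename("$dir/msg.$n.tmp", "$dir/msg.$n");
>     }
> }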
>
> If you want to run the test script on your own system, edit test.pl so
> that @mailstores contains a directory path to a gluster volume.
>
> My systems' timezone is BST (GMT+1 / UTC+1), so any timestamps outside
> of the gluster logs are in this timezone.
>
> Let me know if you need any more info.
>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users