On Tue, Jan 15, 2019 at 2:18 AM Diego Remolina <dijuremo@xxxxxxxxx> wrote:
Dear all,

I was running gluster 3.10.12 on a pair of servers and recently upgraded to 4.1.6. There is a cron job that runs nightly on one machine, which rsyncs the data on the servers over to another machine for backup purposes. The rsync operation runs on one of the gluster servers, which mounts the gluster volume via fuse on /export.

When using 3.10.12, this process would start at 8:00PM nightly and usually finish at around 4:30AM when the servers had been freshly rebooted. From that point, runs would start taking a bit longer and stabilize, ending at around 7-9AM depending on actual file changes. At some point the servers would start eating up so much RAM (up to 30GB) that I would have to reboot them to bring things back to normal, as the file system would become extremely slow (perhaps the memory leak I have read was present in 3.10.x).

After upgrading to 4.1.6 over the weekend, I was shocked to see the rsync process finish in about 1 hour and 26 minutes, compared to 8 hours 30 minutes with the older version. This is a nice speedup; however, I can only ask myself what has changed so drastically that this process is now so fast. Have there really been improvements in 4.1.6 that could speed this up so dramatically? In both of my test cases, there would not really have been a lot to copy via rsync, given that the fresh reboots are done on Saturday after the sync from the day before has finished.

In general, the servers (which are accessed via samba by windows clients) are much faster and more responsive since the update to 4.1.6. Tonight I will have the first rsync run which will actually have to copy the day's changes, and I will have another point of comparison.

I am still using fuse mounts for samba, due to prior problems with vfs = gluster, which are currently present in Samba 4.8.3-4 and already documented in bugs for which patches exist, but no official updated samba packages have been released yet. Since I was going from 3.10.12 to 4.1.6, I also did not want to change other things, so that I could track any issues related just to the change in gluster versions and eliminate other complexity.

The file system currently has about 16TB of data in 5142816 files and 696544 directories.

I have just run the following command to count files and dirs, and it took 67 mins 38.957 secs to complete on this gluster volume:

# time ( /root/sbin/dircnt /export )
/export contains 5142816 files and 696544 directories

real    67m38.957s
user    0m6.225s
sys     0m48.939s

The gluster options set on the volume are:

# gluster v status export
Status of volume: export
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.0.1.7:/bricks/hdds/brick           49157     0          Y       13986
Brick 10.0.1.6:/bricks/hdds/brick           49153     0          Y       9953
Self-heal Daemon on localhost               N/A       N/A        Y       21934
Self-heal Daemon on 10.0.1.5                N/A       N/A        Y       4598
Self-heal Daemon on 10.0.1.6                N/A       N/A        Y       14485

Task Status of Volume export
------------------------------------------------------------------------------
There are no active volume tasks

True, there is a 3rd server here, but with no bricks on it.

Thoughts?

Diego
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
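The source of the /root/sbin/dircnt tool used in the post above is not shown in this thread. For anyone wanting to reproduce a similar count on their own mount, here is a minimal sketch in Python. It is a stand-in, not the original tool: the function names count_entries and entries_per_second are made up for illustration, and anything that is not a directory (symlinks, sockets and so on) is counted as a file, which may differ from how dircnt counts.

```python
import os


def count_entries(root):
    """Count non-directory entries and directories below root.

    Symlinks are not followed, so a link to a directory counts as a file.
    This is an illustrative stand-in, not the dircnt tool from the post.
    """
    files = 0
    dirs = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        dirs += 1
                        stack.append(entry.path)
                    else:
                        files += 1
        except OSError:
            # Unreadable or vanished directory: skip it and keep walking.
            continue
    return files, dirs


def entries_per_second(files, dirs, elapsed_seconds):
    """Average number of entries visited per second of wall-clock time."""
    return (files + dirs) / elapsed_seconds
```

Plugging in the figures from the post, entries_per_second(5142816, 696544, 67 * 60 + 38.957) works out to roughly 1,400 entries per second, which is a handy baseline to compare against after changing volume options.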
Hi Diego,
Besides the actual improvements made in the code, I think new releases might enable by default volume options that previously had different settings. It would have been interesting to diff "gluster volume get <volname> all" before and after the upgrade. Out of curiosity, and because I am trying to figure out volume options for rsync-type workloads, could you share that command's output anyway, along with "gluster volume info <volname>"?
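For the before/after comparison, a rough sketch in Python follows. It assumes the output of "gluster volume get <volname> all" was saved to a text file on each version, and that the output is a two-column Option/Value listing with a header row and a row of dashes. The function names are made up for illustration and are not part of gluster.

```python
def parse_volume_options(text):
    """Parse saved 'gluster volume get <volname> all' output into a dict.

    Assumes a two-column 'Option  Value' layout; the header row and the
    dashed separator row are skipped. Options with no value map to "".
    """
    options = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # blank line
        name = parts[0]
        if name == "Option" or set(name) == {"-"}:
            continue  # header or separator row
        options[name] = parts[1].strip() if len(parts) > 1 else ""
    return options


def diff_options(before, after):
    """Return {option: (old, new)} for options that differ.

    An option missing on one side is reported with None for that side.
    """
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes
```

Reading the two saved files, passing each through parse_volume_options and then calling diff_options on the results would list only the options whose defaults or values changed across the upgrade.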
thanks