Excerpts from Gionatan Danti's message of 2020-09-11 00:35:52 +0200:
> The main point was the potentially long heal time

could you (or anyone else) please elaborate on what long heal times are to be expected?

we have a 3-node replica cluster running version 3.12.9 (we are building a new cluster now) with 32TiB of space. each node has a single brick on top of a 7-disk raid5 (linux softraid).

at one point we had one node unavailable for one month (gluster failed to start up properly on that node and we didn't have monitoring in place to notice), and the accumulated changes of one month of operation took 4 months to heal. i would have expected this to take 2 weeks or less ideally, one month at the worst (i.e. faster than, or at least as fast as, it took to create the data, but not slower, and especially not 4 times slower).

the initial heal count was about 6 million files for one node and 5.4 million for the other.

the healing speed was not constant. at first the heal count increased, that is, healing was seemingly slower than the rate at which new files were added. then it started to speed up: the first million on each node took about 46 days to heal, while the last million took 4 days.

i logged the output of "gluster volume heal gluster-volume statistics heal-count" every hour to monitor the healing process.

what makes healing so slow?

almost all files are newly added and not changed, so they were missing on the node that was offline. the files are backups of user devices, so almost all files are written once and rarely, if ever, read.

we do have a few huge directories with 250000, 88000, 60000 and 29000 subdirectories each. in total there are 26TiB of small files, but no more than a few thousand per directory. (it's user data, some have more, some have less.)

could those huge directories be responsible for the slow healing?

the filesystem is ext4 on top of a 7-disk raid5.

after this ordeal was over we discovered the readdir-ahead setting, which was on.
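for reference, a minimal sketch of how the value of that option could be checked from a script. it assumes the option is named performance.readdir-ahead and that "gluster volume get" prints a two-column Option/Value table; both should be verified against the installed version. the function names and the sample text are made up for illustration.

```python
import subprocess


def parse_volume_option(output, option):
    """Return the value of `option` from 'gluster volume get' style output, or None."""
    for line in output.splitlines():
        fields = line.split()
        # data rows are assumed to look like "<option-name> <value>";
        # the header row and the dashed separator never match the option name
        if len(fields) >= 2 and fields[0] == option:
            return " ".join(fields[1:])
    return None


def readdir_ahead_value(volume):
    """Ask the gluster CLI for the current setting (needs gluster and privileges)."""
    result = subprocess.run(
        ["gluster", "volume", "get", volume, "performance.readdir-ahead"],
        check=True, capture_output=True, text=True,
    )
    return parse_volume_option(result.stdout, "performance.readdir-ahead")


# sample text in the assumed output layout, not captured from a real cluster
sample = """Option                                  Value
------                                  -----
performance.readdir-ahead               on
"""
print(parse_volume_option(sample, "performance.readdir-ahead"))  # on
```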
we turned that off based on other discussions on performance that suggested an improvement from this change, but we haven't had the opportunity to do a large healing test since, so we can't tell if it makes a difference for us.

any insights would be appreciated.

greetings, martin.

--
general manager realss.com
student mentor fossasia.org
community mentor blug.sh beijinglug.club
pike programmer pike.lysator.liu.se caudium.net societyserver.org
Martin Bähr working in china http://societyserver.org/mbaehr/
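a rough sketch of how hourly heal-count logs like the ones mentioned above could be turned into per-brick counts and a healing rate. the output layout assumed here (a "Brick ..." line followed by "Number of entries: N") is an assumption and may differ between versions. the brick names in the sample are hypothetical; the 46-day and 4-day figures are the ones quoted in this message.

```python
import re


def heal_counts(output):
    """Map each brick to its pending heal count in heal-count style output."""
    # bricks whose count is not a plain number (e.g. an unreachable brick) are skipped
    pairs = re.findall(r"Brick (\S+)\s+Number of entries:\s*(\d+)", output)
    return {brick: int(count) for brick, count in pairs}


def files_per_day(healed_files, days):
    """Average healing rate over a period, in files per day."""
    return healed_files / days


# sample text in the assumed layout, with made-up brick names
sample = """Gathering count of entries to be healed on volume gluster-volume has been successful

Brick node1:/data/brick
Number of entries: 6000000

Brick node2:/data/brick
Number of entries: 5400000

Brick node3:/data/brick
Number of entries: 0
"""

print(heal_counts(sample))
print(round(files_per_day(1_000_000, 46)))  # 21739 files/day for the first million
print(round(files_per_day(1_000_000, 4)))   # 250000 files/day for the last million
```

by that measure the last million healed about 11.5 times faster than the first (46 days versus 4 days).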