Re: gluster heal performance

Gionatan Danti <g.danti@xxxxxxxxxx> · Fri, 11 Sep 2020 08:34:04 +0200

On 2020-09-11 05:27 Martin Bähr wrote:
Excerpts from Gionatan Danti's message of 2020-09-11 00:35:52 +0200:
The main point was the potentially long heal time

could you (or anyone else) please elaborate on what long heal times are
to be expected?

Hi, there are multiple factors at work here:
- healing goes over the network (gluster) rather than over the internal 
bus (RAID rebuild);
- gluster is a user-space application and puts a significant load on 
the CPU;
- healing proceeds per-file and not in LBA order (i.e. it has to 
traverse all the affected files/dirs, which means scattered random IO 
for the most part);
- other things which I am surely missing.

we have a 3-node replica cluster running version 3.12.9 (we are building
a new cluster now) with 32TiB of space. each node has a single brick on
top of a 7-disk raid5 (linux softraid)

3.12.9, while being the official RHEL 7 release, is very old now.

at one point we had one node unavailable for one month (gluster failed
to start up properly on that node and we didn't have monitoring in place
to notice) and the accumulated changes of one month of operation took 4
months to heal. i would have expected this ideally to take 2 weeks or
less, one month at the worst (ie faster than or at least as fast as it
took to create the data but not slower, and especially not 4 times
slower)

Wow, 4 months is a lot... but at least you had internal redundancy 
(RAID5 bricks). The OP was asking about running with *no* internal 
redundancy, and this is the reason I advise against it: losing a disk 
while needing weeks to heal is not good.

the initial heal count was about 6million files for one node and
5.4million for the other.
...
we do have a few huge directories with 250000, 88000, 60000 and 29000
subdirectories each. in total 26TiB of small files, but no more than
a few 1000 per directory. (it's user data, some have more, some have
less)

could those huge directories be responsible for the slow healing?

The very high number of to-be-healed files surely has a negative impact 
on your heal speed.
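
To put the numbers from your message in perspective, here is a rough 
back-of-the-envelope sketch. It assumes "4 months" means about 120 days 
and uses the larger of the two heal counts (6 million) as the number of 
files to heal; the function names are mine, nothing gluster-specific. 
It only computes average rates, not what your hardware can actually do.

```python
SECONDS_PER_DAY = 86_400


def heal_rate(files: int, days: float) -> float:
    """Average number of files healed per second over `days` days."""
    return files / (days * SECONDS_PER_DAY)


def days_to_heal(files: int, files_per_second: float) -> float:
    """Days needed to heal `files` files at a constant average rate."""
    return files / files_per_second / SECONDS_PER_DAY


files_to_heal = 6_000_000  # from the thread: larger initial heal count
observed_days = 4 * 30     # assumption: "4 months" taken as 120 days
target_days = 14           # the "2 weeks or less" expectation

observed = heal_rate(files_to_heal, observed_days)
needed = heal_rate(files_to_heal, target_days)

print(f"observed average rate: {observed:.2f} files/s")
print(f"rate needed for a {target_days}-day heal: {needed:.2f} files/s")
print(f"speed-up required: {needed / observed:.1f}x")
```

Under those assumptions the observed average is well below one file per 
second, which is consistent with a heal dominated by per-file overhead 
and random IO rather than by raw bandwidth.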

Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8