Re: EC planning

+gluster-users

On 13/10/15 12:34, Xavier Hernandez wrote:
Hi Serkan,

On 12/10/15 16:52, Serkan Çoban wrote:
Hi,

I am planning to use GlusterFS for backup purposes. I write big files
(>100MB) with a throughput of 2-3GB/s. To save space, we plan to use
erasure coding. I have some questions about EC and brick planning:
- I am planning to use a 200TB XFS/ZFS RAID6 volume to hold one brick
per server. Should I increase the brick count? Does increasing the
brick count also increase performance?

Using a distributed-dispersed volume increases performance. You can
split each RAID6 volume into multiple bricks to create such a volume.
A single brick process cannot achieve the maximum throughput of the
disks, so creating multiple bricks improves this. However, having too
many bricks could be counterproductive because, in your case, all
requests would go to the same filesystem and compete with each other.
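
As an illustration only (the volume name, hostnames and brick paths
below are made up, and I'm using the 8+2 geometry discussed further
down), a 10-server layout with two bricks carved out of each RAID6
could look like this:

    # Hypothetical: 10 servers, two bricks per server on the same RAID6.
    # Bricks are ordered so that each 8+2 set spans all 10 servers,
    # giving a 2 x (8+2) distributed-dispersed volume.
    gluster volume create backupvol disperse-data 8 redundancy 2 \
        server{1..10}:/bricks/raid6/brick1 \
        server{1..10}:/bricks/raid6/brick2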

Another thing to consider is the size of the RAID volume. A 200TB RAID
will require *a lot* of time to rebuild after any disk failure. Also, a
200TB RAID means you need almost 30 8TB disks, and a RAID6 of 30 disks
is quite fragile. It may be better to create multiple RAID6 volumes,
each with 18 disks at most (16+2 is a good and efficient configuration,
especially for XFS on non-hardware RAID). Even in this configuration,
you can create multiple bricks in each RAID6 volume.
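
As a rough back-of-the-envelope check (assuming 8TB disks and counting
usable capacity as data disks only):

    # single RAID6 with ~200TB usable: 25 data + 2 parity = 27 disks in one array
    echo $(( 200 / 8 + 2 ))
    # two 16+2 RAID6 arrays instead: 36 disks, 256TB usable, and each
    # array rebuilds independently after a disk failure
    echo $(( 2 * 16 * 8 ))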

- I plan to use 16+2 for EC. Is this a problem? Should I decrease this
to 12+2 or 10+2? Or is it completely safe to use whatever we want?

16+2 is a very big configuration. It requires a lot of computation
power and forces you to grow (if you need to grow the gluster volume at
some point) in multiples of 18 bricks.

Considering that you are already using RAID6 in your servers, what you
are really protecting against with the disperse redundancy is the
failure of the servers themselves. Maybe an 8+1 configuration would be
enough for your needs, and it requires less computation. If you really
need redundancy 2, 8+2 should be ok.
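
A quick sketch comparing the redundancy overhead and growth granularity
of these geometries (the percentages are just redundancy-to-total
ratios):

    echo "scale=1; 2*100/18" | bc   # 16+2: 11.1% overhead; grow in steps of 18 bricks
    echo "scale=1; 2*100/10" | bc   # 8+2:  20.0% overhead; steps of 10 bricks
    echo "scale=1; 1*100/9"  | bc   # 8+1:  11.1% overhead; steps of 9 bricks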

Using values that are not a power of 2 has a theoretical impact on the
performance of the disperse volume when applications write blocks whose
size is a power of 2 (which is the most common case). This means it's
possible that a 10+2 performs worse than an 8+2. However, this depends
on many other factors, some internal to gluster itself, like caching,
meaning that the real impact could be almost negligible in some cases.
You should test it with your workload.
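
To make the alignment issue concrete (assuming, and this chunk size is
an assumption on my part, that disperse encodes 512 bytes per data
brick, so the stripe size is data-bricks x 512):

    echo $(( 131072 % (8 * 512) ))    # 8+2:  128KiB write, stripe 4096 -> 0, aligned
    echo $(( 131072 % (10 * 512) ))   # 10+2: stripe 5120 -> 3072 bytes left over,
                                      # i.e. a partial stripe => read-modify-write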

- I understand that the EC calculation is performed on the client side.
I want to know if there are any benchmarks showing how EC affects CPU
usage. For example, might each 100MB/s of traffic use one CPU core?

I don't have a detailed measurement of CPU usage related to bandwidth;
however, we have made some tests that seem to indicate that the CPU
overhead caused by disperse is quite small for a 4+2 configuration. I
don't have access to this data right now. When I have it, I'll send it
to you.

I will also try to do some tests with 8+2 and 16+2 configurations to
see the difference.

- Does the number of clients affect cluster performance? Is there any
difference between connecting 100 clients each writing at 20-30MB/s to
the cluster vs. 1000 clients each writing at 2-3MB/s?

Increasing the number of clients improves performance; however, I
wouldn't go over 100 clients, as the overhead of managing all of them
could have a negative impact on performance. In our tests, the maximum
performance was obtained with ~8 parallel clients (if my memory doesn't
fail me).

You will also probably want to tweak some volume parameters, like
server.event-threads, client.event-threads,
performance.client-io-threads and server.outstanding-rpc-limit, to
increase performance.
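
For example (the volume name and starting values below are guesses, not
recommendations; benchmark and adjust for your workload):

    gluster volume set backupvol server.event-threads 4
    gluster volume set backupvol client.event-threads 4
    gluster volume set backupvol performance.client-io-threads on
    gluster volume set backupvol server.outstanding-rpc-limit 128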

Xavi


Thank you for your time,
Serkan


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
