Re: Hello, I have a question about the erasure code translator, hope someone can give me some advice, thank you!

Hi,

On Mon, Apr 8, 2019 at 8:50 AM PSC <1173701037@xxxxxx> wrote:

Hi, I am a storage software developer interested in Gluster, and I am trying to improve its read/write performance.


I noticed that Gluster uses a Vandermonde matrix in the erasure code encoding and decoding process. However, it is quite complicated to generate the inverse of a Vandermonde matrix, which is necessary for decoding. The cost is O(n³).


That's not true, actually. A Vandermonde matrix can be inverted in O(n²), as the current code does (look at ec_method_matrix_inverse() in ec-method.c). Additionally, the current code caches inverted matrices, so in normal circumstances there shouldn't be many inverse computations. A new inverted matrix is only needed when something changes (a brick dies or comes online).
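To illustrate why the Vandermonde structure admits an O(n²) inverse, here is a sketch over exact rationals rather than GF(2^8) (so it is not the actual ec_method_matrix_inverse() code, just the same idea): the columns of the inverse are the coefficients of Lagrange basis polynomials, which can all be extracted from one master polynomial by synthetic division.

```python
from fractions import Fraction

def vandermonde_inverse(xs):
    """Invert V[i][j] = xs[i]**j in O(n^2) operations.

    Column i of the inverse holds the coefficients of the Lagrange basis
    polynomial L_i(t) = prod_{k!=i} (t - xs[k]) / (xs[i] - xs[k]),
    since sum_j V[m][j] * inv[j][i] = L_i(xs[m]) = 1 if m == i else 0.
    """
    n = len(xs)
    # Master polynomial P(t) = prod_k (t - xs[k]), built in O(n^2).
    master = [Fraction(1)]
    for x in xs:
        nxt = [Fraction(0)] * (len(master) + 1)
        for deg, c in enumerate(master):
            nxt[deg + 1] += c        # multiply by t
            nxt[deg] -= x * c        # multiply by -x
        master = nxt
    inv = [[Fraction(0)] * n for _ in range(n)]
    for i, x in enumerate(xs):
        # Synthetic division: Q_i(t) = P(t) / (t - x), O(n) per root.
        q = [Fraction(0)] * n
        q[n - 1] = master[n]
        for deg in range(n - 2, -1, -1):
            q[deg] = master[deg + 1] + x * q[deg + 1]
        # P'(x) = Q_i(x), evaluated by Horner's rule in O(n).
        d = Fraction(0)
        for c in reversed(q):
            d = d * x + c
        for j in range(n):
            inv[j][i] = q[j] / d     # coefficient of t^j in L_i(t)
    return inv
```

The same two tricks (one shared master polynomial, O(n) division per row) are what make the specialized inverse quadratic instead of cubic.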
 


Using a Cauchy matrix instead can greatly cut down the cost of finding an inverse matrix, to O(n²).
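For reference, the closed-form Cauchy inverse behind that O(n²) claim can be sketched over exact rationals (illustrative names; production EC code would run the same formulas in GF(2^8)). Four tables of products are precomputed in O(n²), after which every entry of the inverse costs O(1):

```python
from fractions import Fraction

def cauchy_inverse(xs, ys):
    """Invert the Cauchy matrix C[i][j] = 1/(xs[i] - ys[j]) in O(n^2).

    Uses the classical closed form: with A(t) = prod(t - xs[k]) and
    B(t) = prod(t - ys[k]), the inverse entry is
    inv[i][j] = (xs[j] - ys[i]) * A_j(ys[i]) * B_i(xs[j]),
    where A_j(z) = A(z) / (A'(xs[j]) * (z - xs[j])) and similarly B_i.
    Requires all xs, ys distinct and xs[i] != ys[j].
    """
    n = len(xs)

    def prod_except(roots, z, skip=None):
        p = Fraction(1)
        for k, r in enumerate(roots):
            if k != skip:
                p *= z - r
        return p

    # Four O(n) tables, each entry an O(n) product: O(n^2) total.
    A_at_y  = [prod_except(xs, y) for y in ys]                       # A(y_i)
    Ad_at_x = [prod_except(xs, x, skip=j) for j, x in enumerate(xs)] # A'(x_j)
    B_at_x  = [prod_except(ys, x) for x in xs]                       # B(x_j)
    Bd_at_y = [prod_except(ys, y, skip=i) for i, y in enumerate(ys)] # B'(y_i)

    inv = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            a_j = A_at_y[i] / (Ad_at_x[j] * (ys[i] - xs[j]))
            b_i = B_at_x[j] / (Bd_at_y[i] * (xs[j] - ys[i]))
            inv[i][j] = (xs[j] - ys[i]) * a_j * b_i
    return inv
```

So both matrix families invert in O(n²); the practical difference between them is elsewhere (e.g. density of the matrix and multiplication cost), not in inversion asymptotics.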


I used the Intel Intelligent Storage Acceleration Library (ISA-L) to replace the original EC encode/decode part of Gluster, and it reduced the encode and decode time to about 50% of the original.


How did you test that? I also did some tests long ago and I didn't observe that difference.

Doing a raw test of the encoding/decoding performance of the current code using Intel AVX2 extensions, it's able to process 7.6 GiB/s on a single core of an Intel Xeon Silver 4114 when the L1 cache is used. Without relying on the internal cache, it performs at 3.9 GiB/s. Does ISA-L provide better performance for a matrix of the same size (a 4+2 non-systematic matrix)?
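For comparability, a raw encode benchmark of that kind can be structured roughly like the following sketch. The xor_fold stand-in workload is purely illustrative; a real test would call the actual EC encode kernel (Gluster's or ISA-L's) on the same buffers:

```python
import time

def measure_throughput(encode, buf_size=1 << 20, rounds=64):
    """Feed `rounds` buffers of `buf_size` bytes through `encode`
    and return GiB/s of data processed."""
    data = bytes(buf_size)
    start = time.perf_counter()
    for _ in range(rounds):
        encode(data)
    elapsed = time.perf_counter() - start
    return (buf_size * rounds) / elapsed / (1 << 30)

def xor_fold(data):
    # Stand-in workload: XOR one byte per 4 KiB page. A real comparison
    # would invoke the EC encode routine here instead, once per buffer,
    # with identical buffer sizes for both implementations.
    acc = 0
    for b in memoryview(data)[::4096]:
        acc ^= b
    return acc

gibs = measure_throughput(xor_fold)
```

Whether the source buffers fit in L1/L2 cache makes a large difference (as the 7.6 vs 3.9 GiB/s numbers above show), so both implementations must be measured with the same buffer sizes to be comparable.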


However, when I test the whole system, the read/write performance is almost the same as the original Gluster.


Yes, there are many more things involved in the read and write operations in Gluster. For the particular case of EC, having to deal with many bricks simultaneously (6 in this case) means that it's very sensitive to network latency and communication delays, and this is probably one of the biggest contributors. There are also some other small latencies added by other xlators.


I tested it on three machines acting as servers. Each one has two bricks, both on SSDs, so there are 6 bricks in total. Two of them are used as coding bricks. That is a 4+2 disperse volume configuration.
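For context, such a layout corresponds to a volume created along these lines (hostnames and brick paths are placeholders; Gluster will warn that placing two bricks of one disperse set on the same server reduces fault tolerance):

```shell
gluster volume create testvol disperse 6 redundancy 2 \
    server1:/bricks/ssd1 server1:/bricks/ssd2 \
    server2:/bricks/ssd1 server2:/bricks/ssd2 \
    server3:/bricks/ssd1 server3:/bricks/ssd2
gluster volume start testvol
```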


The network cards are 10000 Mbps (10 GbE). Theoretically they can support read and write speeds faster than 1000 MB/s.
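A back-of-envelope check of that ceiling, assuming the client's NIC is the bottleneck and counting the on-wire expansion of a 4+2 volume for writes:

```python
# 10 GbE line rate, in decimal MB/s as NIC speeds are quoted.
line_rate = 10_000 / 8                  # 1250 MB/s of raw payload
# A 4+2 disperse write puts 6 fragments on the wire for every
# 4 fragments' worth of user data: a 6/4 redundancy overhead.
read_ceiling = line_rate                # reads fetch only 4 fragments' worth
write_ceiling = line_rate / (6 / 4)     # ~833 MB/s
print(round(read_ceiling), round(write_ceiling))  # 1250 833
```

So even before protocol overhead and latency, the theoretical write ceiling for this configuration is closer to 833 MB/s than 1250 MB/s.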


The actual read performance is about 492 MB/s.

The actual write performance is about 336 MB/s.


The original code reads at 461 MB/s and writes at 322 MB/s.


Can someone give me some advice on how to improve its performance? Which part is the critical bottleneck, if it's not the EC translator?


I timed the translators, and it shows the EC translator takes just 7% of the whole read/write process. I know that some translators run asynchronously, so the real percentage may be somewhat larger than that.
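That 7% figure by itself is consistent with the whole-system result via Amdahl's law: speeding up only the EC stage can never remove more than that 7% of the total time. A quick check:

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: end-to-end speedup when `fraction` of total time
    is accelerated by a factor of `local_speedup`."""
    return 1 / ((1 - fraction) + fraction / local_speedup)

# Halving encode time (as the ISA-L swap did) moves the needle ~3.6%;
# even an infinitely fast encoder is capped at ~7.5% end to end.
print(round(overall_speedup(0.07, 2), 3),
      round(overall_speedup(0.07, float("inf")), 3))  # 1.036 1.075
```

A ~3.6% predicted gain matches the observed 461 → 492 MB/s read improvement (~6.7%) to within the accuracy of such a rough model, which points at the other 93% (network round trips, other xlators) as the place to look.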


Thank you sincerely for your patience in reading my question!

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-devel
