On Fri, Jun 17, 2016 at 4:59 AM, Xavier Hernandez <xhernandez@xxxxxxxxxx> wrote:
> Hi all,
>
> I've seen in many places the belief that disperse, or erasure coding in
> general, is slow because of the complex or costly math involved. It's true
> that there's an overhead compared to a simple copy like replica does, but
> this overhead is much smaller than many people think.
>
> The math used by disperse, if tested alone outside gluster, is much faster
> than it seems. AFAIK the real problem of EC is the communications layer. It
> adds a lot of latency, and having to communicate with and coordinate 6 or
> more bricks simultaneously has a big impact.
>
> Erasure coding also suffers from partial writes, which require a
> read-modify-write cycle. However, this is completely avoided in many
> situations where the volume is optimally configured and writes are aligned
> and in blocks of multiples of 4096 bytes (typical of VMs, databases and
> many other workloads). It could even be avoided in other situations by
> taking advantage of the write-behind xlator (not done yet).
>
> I've used a single core of two machines to test the raw math: one quite
> limited (Atom D525 1.8 GHz) and another more powerful but not a top CPU
> (Xeon E5-2630L 2.0 GHz).
>
> Common parameters:
>
> * non-systematic Vandermonde matrix (the same used by ec)
> * algorithm slightly slower than the one used by ec (I haven't implemented
>   some optimizations in the test program, but I think the difference should
>   be very small)
> * buffer size: 128 KiB
> * number of iterations: 16384
> * total size processed: 2 GiB
> * results in MiB/s for a single core
>
> Config    Atom    Xeon
> 2+1        633    1856
> 4+1        405    1203
> 4+2        324     984
> 4+3        275     807
> 8+2        227     611
> 8+3        202     545
> 8+4        182     501
> 16+3       116     303
> 16+4       111     295
>
> The same tests using Intel SSE2 extensions (not present in EC yet, but the
> patch is in review):
>
> Config    Atom    Xeon
> 2+1        821    3047
> 4+1        767    2246
> 4+2        629    1887
> 4+3        535    1632
> 8+2        466    1237
> 8+3        423    1104
> 8+4        388    1044
> 16+3       289     675
> 16+4       271     637
>
> With AVX2 it should be faster, but my machines don't support it.
>
> This is even much faster when a systematic matrix is used. For example, a
> 16+4 configuration using SSE on a Xeon core can encode at 3865 MiB/s.
> However, this won't make a big difference inside gluster.
>
> Currently EC encoding/decoding for small/medium configurations is not the
> bottleneck of disperse. Maybe for big configurations on slow machines it
> could have some impact (I don't have the resources to test those big
> configurations properly).

Agree here. In the performance results that I have observed, EC outperforms
AFR when multi-threaded large sequential read and write workloads are
involved.

-Vijay

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
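
For anyone curious what the "raw math" above looks like in practice, here is a
minimal scalar sketch of encoding k data fragments into k+m coded fragments
with a Vandermonde matrix over GF(2^8). This is illustrative only: it is not
the ec xlator code nor the test program that produced the numbers above, and
the matrix, polynomial and fragment sizes are arbitrary choices. The SSE2 work
mentioned in the thread essentially vectorizes the inner multiply loop.

/* sketch: non-systematic Vandermonde encoding over GF(2^8), scalar only */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define GF_POLY 0x11d                   /* primitive polynomial x^8+x^4+x^3+x^2+1 */

static uint8_t gf_exp[512];
static uint8_t gf_log[256];

static void gf_init(void)
{
    int x = 1;
    for (int i = 0; i < 255; i++) {     /* build log/antilog tables */
        gf_exp[i] = x;
        gf_log[x] = i;
        x <<= 1;
        if (x & 0x100)
            x ^= GF_POLY;
    }
    for (int i = 255; i < 512; i++)     /* duplicate so gf_mul needs no modulo */
        gf_exp[i] = gf_exp[i - 255];
}

static inline uint8_t gf_mul(uint8_t a, uint8_t b)
{
    if (a == 0 || b == 0)
        return 0;
    return gf_exp[gf_log[a] + gf_log[b]];
}

/* out[r] = sum_c V[r][c] * data[c], with V[r][c] = (r+1)^c (Vandermonde rows) */
static void encode(int k, int n, const uint8_t **data, uint8_t **out, size_t len)
{
    for (int r = 0; r < n; r++) {
        memset(out[r], 0, len);
        uint8_t coef = 1;               /* (r+1)^0 */
        for (int c = 0; c < k; c++) {
            for (size_t i = 0; i < len; i++)
                out[r][i] ^= gf_mul(coef, data[c][i]);
            coef = gf_mul(coef, (uint8_t)(r + 1));   /* next power of (r+1) */
        }
    }
}

int main(void)
{
    enum { K = 4, M = 2, LEN = 4096 };  /* a 4+2 configuration, 4 KiB fragments */
    static uint8_t data[K][LEN], coded[K + M][LEN];
    const uint8_t *in[K];
    uint8_t *out[K + M];

    gf_init();
    for (int c = 0; c < K; c++) {
        memset(data[c], c + 1, LEN);    /* dummy payload */
        in[c] = data[c];
    }
    for (int r = 0; r < K + M; r++)
        out[r] = coded[r];

    encode(K, K + M, in, out, LEN);
    printf("encoded %d data fragments into %d coded fragments of %d bytes\n",
           K, K + M, LEN);
    return 0;
}

Timing encode() in a loop over a 128 KiB buffer would roughly reproduce the
kind of single-core measurement described above, though the absolute numbers
will obviously differ from Xavier's.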