Re: Why vandermonde matrix is used in EC?

Xavier Hernandez <xhernandez@xxxxxxxxxx> · Mon, 28 Nov 2016 08:51:29 +0100

On 11/27/2016 09:58 PM, 한우형 wrote:
Hi,

Thank you so much for the speedy reply, but I have some more questions.

1) I understand that non-systematic encoding/decoding doesn't alter
performance when one or more bricks are down, but why does the systematic
approach cause service degradation?
I think there is no performance degradation when a parity part is down, and
when a non-parity part is down the data needs to be decoded. But that is the
same as in the non-systematic case.

If a systematic implementation does increase performance in a 
perceptible way, then a failure of one brick will reduce the performance 
that users see. Even if the degraded performance is the same as what we 
have today, it will look worse from the users' perspective.

Note that there's no distinction between "data" bricks and "parity" 
bricks. Each file will use a different brick for its parity, so a 
failure of a brick will always cause trouble to some files. This would 
also allow a distribution of the read load among all available bricks.
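
A minimal sketch of what per-file parity rotation could look like in a 
systematic 4+2 layout. The brick names, the file identifiers and the 
CRC-based rotation rule are all assumptions made for illustration; the 
thread does not say how the rotation would be done.

```python
import zlib

BRICKS = ["brick-0", "brick-1", "brick-2", "brick-3", "brick-4", "brick-5"]
K = 4  # data fragments per file in a 4+2 configuration


def layout(file_id):
    """Hypothetical placement: rotate the brick order by a stable hash."""
    start = zlib.crc32(file_id.encode("utf-8")) % len(BRICKS)
    order = [BRICKS[(start + i) % len(BRICKS)] for i in range(len(BRICKS))]
    return {"data": order[:K], "parity": order[K:]}


def needs_decode(file_id, down_brick):
    """A read must decode only if the failed brick holds data for this file."""
    return down_brick in layout(file_id)["data"]
```

With this rule every brick holds data for some files and parity for 
others, so any single brick failure forces decoding for part of the 
files, and healthy reads are spread over all bricks.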

Anyway, as I said in the other email, it's not so clear that a 
systematic implementation would really improve performance 
significantly.

2) In the systematic approach, what kind of metadata needs to be checked?
Can't we just try to read the non-parity part?

If a brick is down, it's clear that we'll need to read from parity, but 
when the brick comes up again it can contain old data (data modified 
while it was down), so we cannot simply read from that brick. We need to 
verify in some way that the other bricks do not contain updated data.
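
The check described above can be sketched like this, assuming each brick 
keeps a per-file version counter that is bumped on every write. The 
counter, the brick names and the function are hypothetical; the point is 
only that all bricks must be queried before any of them can be trusted.

```python
def pick_consistent_bricks(versions, k):
    """Return the bricks that are safe to read from.

    versions maps a brick name to its (hypothetical) version counter,
    or to None when the brick is unreachable. k is the number of
    fragments needed to rebuild the data.
    """
    reachable = {b: v for b, v in versions.items() if v is not None}
    if not reachable:
        raise RuntimeError("no bricks reachable")
    newest = max(reachable.values())
    # A brick that was down during a write still carries an older counter.
    good = sorted(b for b, v in reachable.items() if v == newest)
    if len(good) < k:
        raise RuntimeError("not enough up-to-date bricks to read")
    return good


# brick-2 came back after missing two writes, so it is skipped.
versions = {"brick-0": 9, "brick-1": 9, "brick-2": 7,
            "brick-3": 9, "brick-4": 9, "brick-5": 9}
usable = pick_consistent_bricks(versions, k=4)
```

Even if brick-2 holds a "data" fragment, reading it directly would return 
old contents, so a systematic read still pays for the version query.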

Best regards,

Xavi

Best regards,
Han

2016-11-24 17:26 GMT+09:00 Xavier Hernandez <xhernandez@xxxxxxxxxx>:

    Hi Han,

    On 11/24/2016 04:25 AM, 한우형 wrote:

        Hi,

        I'm working on dispersed volumes (ec) and I found that the ec
        encode/decode algorithm uses a non-systematic Vandermonde matrix.

        My question is this: why is a non-systematic algorithm used?

    Non-systematic encoding/decoding doesn't alter performance when one
    or more bricks are down. This means that you won't have service
    degradation when you are having trouble with one brick or you are
    doing maintenance.

    From the implementation perspective, a systematic approach would
    need to talk to all bricks anyway to check for critical metadata
    (gluster doesn't have a centralized metadata server). This means
    that the theoretical benefit of a systematic decoding for reads
    would be masked by the overhead needed for metadata operations
    (involving additional network round-trips).

    That said, it's true that a systematic approach would have some
    benefits, like a little less CPU overhead. Not sure if the
    performance would benefit significantly though.

        If we use a
        systematic algorithm (not systematic Vandermonde; it's not MDS)

    A non-systematic Vandermonde matrix *IS* MDS. In fact, pure
    Vandermonde matrices are non-systematic by definition. Some
    alterations need to be done to make them systematic, and these
    transformations can lead to a non MDS matrix if not made with care.
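
    A small sketch of the MDS property for a pure Vandermonde matrix in a
    4+2 configuration. It works over the prime field GF(257) only to keep
    the arithmetic short; the field, the evaluation points and all
    function names are assumptions, not the ec implementation.

```python
from itertools import combinations

P = 257  # small prime field, chosen only for simplicity


def vandermonde(n, k):
    """Row i is [x^0, x^1, ..., x^(k-1)] for the distinct point x = i + 1."""
    return [[pow(i + 1, j, P) for j in range(k)] for i in range(n)]


def encode(matrix, data):
    """Each fragment is one row of the matrix times the data vector."""
    return [sum(c * d for c, d in zip(row, data)) % P for row in matrix]


def solve(rows, values):
    """Gauss-Jordan elimination mod P; returns None if rows are singular."""
    k = len(rows)
    a = [list(r) + [v] for r, v in zip(rows, values)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if a[r][col] % P), None)
        if pivot is None:
            return None
        a[col], a[pivot] = a[pivot], a[col]
        inv = pow(a[col][col], P - 2, P)  # modular inverse (Fermat)
        a[col] = [x * inv % P for x in a[col]]
        for r in range(k):
            if r != col and a[r][col]:
                f = a[r][col]
                a[r] = [(x - f * y) % P for x, y in zip(a[r], a[col])]
    return [a[r][k] for r in range(k)]


K, N = 4, 6
G = vandermonde(N, K)
data = [10, 20, 30, 40]
fragments = encode(G, data)

# MDS: every choice of K fragments out of N recovers the original data.
is_mds = all(
    solve([G[i] for i in s], [fragments[i] for i in s]) == data
    for s in combinations(range(N), K)
)

# Non-systematic: no row of G is a unit vector, so no fragment is a
# verbatim copy of a data symbol and every read needs a decode step.
has_unit_row = any(sorted(row) == [0] * (K - 1) + [1] for row in G)
```

    Any K rows of a Vandermonde matrix built from distinct points form an
    invertible matrix, which is why every subset decodes. Replacing rows
    with the identity to make the code systematic has to preserve that
    invertibility, which is the care mentioned above.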

        we can
        boost read performance (no decode step is needed on reads).

    Though it would probably have some benefits, I'm not so sure that
    performance would improve significantly.

    The current implementation of ec decoding can process 1GB/s of data
    per CPU core on low end processors (Intel Atoms with SSE2) using
    block sizes of 128KB and a 4+2 configuration. Currently this is much
    faster than what a pure distributed volume on the same hardware can
    read for a single client/single thread.

    So, for now, the non-systematic approach doesn't seem to be a
    bottleneck for gluster. Anyway, there are plans to provide a
    systematic version, but it's not a priority as of now.

    Best regards,

    Xavi

        Best regards,
        Han

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel