Re: slow write perf for disperse volume

2017-04-25 9:03 GMT+02:00 Xavier Hernandez <xhernandez@xxxxxxxxxx>:
Hi Ingard,

On 24/04/17 14:43, Ingard Mevåg wrote:
I've done some more testing with tc, introducing latency on one of my
test servers. With 9 ms of artificial latency added via tc ( sudo tc
qdisc add dev bond0 root netem delay 9ms ) to a test server in the same
DC as the disperse volume servers, I get more or less the same throughput
as I do when testing DC1 <-> DC2 (which has ~9 ms ping).

I know distribute volumes were more sensitive to latency in the past. At
least I can max out a 1gig link with 9-10ms latency when using
distribute. Disperse seems to max at 12-14MB/s with 8-10ms latency.

A pure distributed volume is a simple configuration that just forwards each request to one of the bricks; no additional overhead is needed.
 
Well, we've still got gluster 3.0 running on an old cluster, and that cluster is also dead slow when mounted at the other DC - about the same performance as we get with disperse on 3.10. So some work has been done to make distribute volumes work better with increased latency.
 

However, a dispersed 4+2 volume needs to talk to 6 bricks simultaneously, meaning 6 network round-trips for every request. Additionally, it needs to maintain integrity, so one or more extra requests are needed.
 
The number of bricks doesn't appear to affect the throughput. I've tried different combinations of data and redundancy bricks, but the throughput seems to stay the same. For instance, 8+4 compared to 4+2 has double the number of connections, but half the throughput per connection.


If network latency is high, all these requests contribute to increasing the overall request latency, limiting the throughput.
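As a rough sanity check, if one assumes the FUSE client issues 128 KiB writes one at a time, each gated by a full round-trip (an assumption, not a measured fact), the per-stream ceiling can be estimated:

```shell
# Per-stream throughput ceiling = write size / round-trip time.
# 128 KiB chunks over a ~9 ms RTT land right in the observed 12-14 MB/s range.
awk 'BEGIN { rtt = 0.009; chunk = 128 * 1024;
             printf "%.1f MB/s\n", chunk / rtt / (1024 * 1024) }'
```

The same arithmetic also explains why the brick count barely matters: the serialized round-trip dominates regardless of how many bricks each request fans out to.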

Have you tried replica 2 or 3? It uses very similar integrity mechanisms, so it'll also add some latency. Maybe not as much as a dispersed 4+2, but it should be perceptible.

We're after capacity with this setup.


Another test to confirm that the limitation is caused by latency is to do multiple writes in parallel. Each write will be limited by the latency, but the aggregated throughput should saturate the bandwidth, especially on a 1Gb ethernet.

That has been confirmed.
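A test along those lines can be sketched as follows (the mount point is a placeholder, and the stream count and file sizes are illustrative; the sketch falls back to a temp directory so it runs anywhere):

```shell
# Hypothetical mount point -- substitute your disperse volume mount.
MNT=${MNT:-$(mktemp -d)}

# Each stream alone is capped by round-trip latency, but the aggregate
# of several parallel writers should approach link bandwidth.
for i in 1 2 3 4; do
  dd if=/dev/zero of="$MNT/parallel.$i" bs=1M count=16 2>/dev/null &
done
wait
du -sh "$MNT"
```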
 

Even better performance can be achieved if you distribute the writes to multiple clients or mount points (assuming they are not writing to the same file).

Xavi


ingard

2017-04-24 14:03 GMT+02:00 Ingard Mevåg <ingard@xxxxxxxx>:

    I can confirm that mounting the disperse volume locally on one of the
    three servers, I got 211 MB/s with dd if=/dev/zero of=./local.dd.test
    bs=1M count=10000.

    It's not very good considering the 10gig network, but at least 20x
    better than 10-12 MB/s.

    2017-04-24 13:53 GMT+02:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:

        +Ashish

        Ashish,
               Could you help Ingard? Do let me know what you find.

        On Mon, Apr 24, 2017 at 4:50 PM, Ingard Mevåg <ingard@xxxxxxxx> wrote:

            Hi. I can't see a fuse thread at all. Please see attached
            screenshot of top process with threads. Keep in mind this is
            from inside the container.

            2017-04-24 12:17 GMT+02:00 Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>:

                We were able to saturate the hardware with EC as well.
                Could you check 'top' in threaded mode to see if the
                fuse thread is saturated when you run dd?
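If top is awkward to use inside a container, ps can show the same per-thread view; a minimal sketch, assuming the client process is named glusterfs (it falls back to the current shell so the command always has a target):

```shell
# List threads with per-thread CPU usage -- the same view as pressing
# 'H' in top. A single thread pegged near 100% CPU would point at a
# client-side bottleneck rather than the network.
pid=$(pgrep -o -x glusterfs || echo $$)
ps -L -o lwp,pcpu,comm -p "$pid"
```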

                On Mon, Apr 24, 2017 at 3:27 PM, Ingard Mevåg <ingard@xxxxxxxx> wrote:

                    Hi
                    I've been playing with disperse volumes for the past
                    week, and so far I cannot get more than 12 MB/s when
                    I run a write test. I've tried a distributed volume
                    on the same bricks and got close to gigabit speeds.
                    iperf confirms gigabit speeds to all three servers
                    in the storage pool.

                    The three storage servers have 10gig NICs (connected
                    to the same switch). The client is, for now, a Docker
                    container in a 2nd DC (latency roughly 8-9 ms).

                    dpkg -l | grep -i gluster
                    ii  glusterfs-client  3.10.1-ubuntu1~xenial1  amd64  clustered file-system (client package)
                    ii  glusterfs-common  3.10.1-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
                    ii  glusterfs-server  3.10.1-ubuntu1~xenial1  amd64  clustered file-system (server package)

                    $ gluster volume info

                    Volume Name: DFS-ARCHIVE-001
                    Type: Disperse
                    Volume ID: 1497bc85-cb47-4123-8f91-a07f55c11dcc
                    Status: Started
                    Snapshot Count: 0
                    Number of Bricks: 1 x (4 + 2) = 6
                    Transport-type: tcp
                    Bricks:
                    Brick1: dna-001:/mnt/data01/brick
                    Brick2: dna-001:/mnt/data02/brick
                    Brick3: dna-002:/mnt/data01/brick
                    Brick4: dna-002:/mnt/data02/brick
                    Brick5: dna-003:/mnt/data01/brick
                    Brick6: dna-003:/mnt/data02/brick
                    Options Reconfigured:
                    transport.address-family: inet
                    nfs.disable: on

                    Anyone know the reason for the slow speeds on
                    disperse vs distribute?

                    kind regards
                    ingard

                    _______________________________________________
                    Gluster-users mailing list
                    Gluster-users@xxxxxxxxxxx
                    http://lists.gluster.org/mailman/listinfo/gluster-users




                --
                Pranith




            --
            Ingard Mevåg
            Driftssjef
            Jottacloud

            Mobil: +47 450 22 834
            E-post: ingard@xxxxxxxxxxxxxx
            Webside: www.jottacloud.com




        --
        Pranith




    --
    Ingard Mevåg
    Driftssjef
    Jottacloud

    Mobil: +47 450 22 834
    E-post: ingard@xxxxxxxxxxxxxx
    Webside: www.jottacloud.com




--
Ingard Mevåg
Driftssjef
Jottacloud

Mobil: +47 450 22 834
E-post: ingard@xxxxxxxxxxxxxx
Webside: www.jottacloud.com







