I have just tried to copy a VM image (raw) and it causes the same problem. I have GlusterFS 3.5.2.

On 9/3/2014 9:14 AM, Roman wrote:

Hi, I had some issues with files generated from /dev/zero as well. Try real files or /dev/urandom :) I don't know whether there is a real issue/bug with files generated from /dev/zero; the devs should check that out, /me thinks.
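For instance, something along these lines (the file name and the smaller count are only an example; /dev/urandom is much slower than /dev/zero, so a shorter run is more practical):

    dd if=/dev/urandom of=test-rand.img bs=1M count=2000 conv=fdatasync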
2014-09-03 16:11 GMT+03:00 Milos Kozak <milos.kozak@xxxxxxxxx>:

Hi, I am facing a quite strange problem with two servers that have the same configuration and the same hardware. The servers are connected by bonded 1GE. I have one volume:

[root@nodef02i 103]# gluster volume info

Volume Name: ph-fs-0
Type: Replicate
Volume ID: f8f569ea-e30c-43d0-bb94-b2f1164a7c9a
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.11.100.1:/gfs/s3-sata-10k/fs
Brick2: 10.11.100.2:/gfs/s3-sata-10k/fs
Options Reconfigured:
storage.owner-gid: 498
storage.owner-uid: 498
network.ping-timeout: 2
performance.io-thread-count: 3
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

The volume is intended to host virtual servers (KVM), and the configuration follows the Gluster blog. Currently I have only one virtual server deployed on top of this volume, so that I can see the effects of my stress tests. During the tests I write to the volume, mounted through FUSE, with dd (currently only one write at a time):

dd if=/dev/zero of=test2.img bs=1M count=20000 conv=fdatasync

Test 1) I run dd on nodef02i. The load on nodef02i is at most 1 erl, but on nodef01i it is around 14 erl (I have a CPU with 12 threads). After the write is done the load on nodef02i goes down, but on nodef01i it climbs to 28 erl and stays there for 20 minutes. In the meantime I can see:

[root@nodef01i 103]# gluster volume heal ph-fs-0 info
Volume ph-fs-0 is not started (Or) All the bricks are not running.
Volume heal failed

[root@nodef02i 103]# gluster volume heal ph-fs-0 info
Brick nodef01i.czprg:/gfs/s3-sata-10k/fs/
/3706a2cb0bb27ba5787b3c12388f4ebb - Possibly undergoing heal
/test.img - Possibly undergoing heal
Number of entries: 2
Brick nodef02i.czprg:/gfs/s3-sata-10k/fs/
/3706a2cb0bb27ba5787b3c12388f4ebb - Possibly undergoing heal
/test.img - Possibly undergoing heal
Number of entries: 2

[root@nodef01i 103]# gluster volume status
Status of volume: ph-fs-0
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick 10.11.100.1:/gfs/s3-sata-10k/fs           49152   Y       56631
Brick 10.11.100.2:/gfs/s3-sata-10k/fs           49152   Y       3372
NFS Server on localhost                         2049    Y       56645
Self-heal Daemon on localhost                   N/A     Y       56649
NFS Server on 10.11.100.2                       2049    Y       3386
Self-heal Daemon on 10.11.100.2                 N/A     Y       3387

Task Status of Volume ph-fs-0
------------------------------------------------------------------------------
There are no active volume tasks

This very high load lasts another 20-30 minutes. During the first test I restarted the glusterd service after 10 minutes, because it seemed to me that the service was not working, yet the load on nodef01i stayed very high. Consequently, the virtual server reported errors about problems with its EXT4 filesystem, and MySQL stopped. When the load culminated I ran the same test in the opposite direction: I wrote (dd) test2 from nodef01i. More or less the same thing happened: extremely high load on nodef01i and minimal load on nodef02i. The outputs from heal were also more or less the same.

I would like to tweak this, but I don't know what I should focus on.
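For reference, the volume is mounted over FUSE roughly like this (the mount point path here is illustrative, not copied from my fstab):

    mount -t glusterfs 10.11.100.1:/ph-fs-0 /mnt/ph-fs-0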
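While the test runs I keep an eye on the heal state and on the self-heal daemon log, roughly like this (the 10-second interval is arbitrary):

    watch -n 10 'gluster volume heal ph-fs-0 info'
    tail -f /var/log/glusterfs/glustershd.log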
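I expect the fix to be a volume option. For example, I notice my network.ping-timeout of 2 is far below the default of 42, so one candidate change (only a guess on my part) would be:

    gluster volume set ph-fs-0 network.ping-timeout 42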
Thank you for your help.

Milos

--
Best regards,
Roman.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users