Re: Issue with proactive self-healing for erasure coding

On 06/15/2015 09:25 AM, Mohamed Pakkeer wrote:
Hi Xavier,

When can we expect the 3.7.2 release that fixes the I/O error we
discussed on this mail thread?

As per the latest meeting, held last Wednesday [1], it will be released this week.

Xavi

[1] http://meetbot.fedoraproject.org/gluster-meeting/2015-06-10/gluster-meeting.2015-06-10-12.01.html


Thanks
Backer

On Wed, May 27, 2015 at 8:02 PM, Xavier Hernandez <xhernandez@xxxxxxxxxx> wrote:

    Hi again,

    In today's Gluster meeting [1] it has been decided that 3.7.1 will
    be released urgently to solve a bug in glusterd. All fixes planned
    for 3.7.1 will be moved to 3.7.2, which will be released soon after.

    Xavi

    [1]
    http://meetbot.fedoraproject.org/gluster-meeting/2015-05-27/gluster-meeting.2015-05-27-12.01.html


    On 05/27/2015 12:01 PM, Xavier Hernandez wrote:

        On 05/27/2015 11:26 AM, Mohamed Pakkeer wrote:

            Hi Xavier,

            Thanks for your reply. When can we expect the 3.7.1 release?


        AFAIK a beta of 3.7.1 will be released very soon.


            cheers
            Backer

            On Wed, May 27, 2015 at 1:22 PM, Xavier Hernandez
            <xhernandez@xxxxxxxxxx> wrote:

                 Hi,

                 Some Input/Output error issues have been identified and
                 fixed. These fixes will be available in 3.7.1.

                 Xavi


                 On 05/26/2015 10:15 AM, Mohamed Pakkeer wrote:

                     Hi Glusterfs Experts,

                     We are testing the glusterfs 3.7.0 tarball on our
                     10-node glusterfs cluster. Each node has 36 drives;
                     please find the volume info below.

                     Volume Name: vaulttest5
                     Type: Distributed-Disperse
                     Volume ID: 68e082a6-9819-4885-856c-1510cd201bd9
                     Status: Started
                     Number of Bricks: 36 x (8 + 2) = 360
                     Transport-type: tcp
                     Bricks:
                     Brick1: 10.1.2.1:/media/disk1
                     Brick2: 10.1.2.2:/media/disk1
                     Brick3: 10.1.2.3:/media/disk1
                     Brick4: 10.1.2.4:/media/disk1
                     Brick5: 10.1.2.5:/media/disk1
                     Brick6: 10.1.2.6:/media/disk1
                     Brick7: 10.1.2.7:/media/disk1
                     Brick8: 10.1.2.8:/media/disk1
                     Brick9: 10.1.2.9:/media/disk1
                     Brick10: 10.1.2.10:/media/disk1
                     Brick11: 10.1.2.1:/media/disk2
                     Brick12: 10.1.2.2:/media/disk2
                     Brick13: 10.1.2.3:/media/disk2
                     Brick14: 10.1.2.4:/media/disk2
                     Brick15: 10.1.2.5:/media/disk2
                     Brick16: 10.1.2.6:/media/disk2
                     Brick17: 10.1.2.7:/media/disk2
                     Brick18: 10.1.2.8:/media/disk2
                     Brick19: 10.1.2.9:/media/disk2
                     Brick20: 10.1.2.10:/media/disk2
                     ...
                     ....
                     Brick351: 10.1.2.1:/media/disk36
                     Brick352: 10.1.2.2:/media/disk36
                     Brick353: 10.1.2.3:/media/disk36
                     Brick354: 10.1.2.4:/media/disk36
                     Brick355: 10.1.2.5:/media/disk36
                     Brick356: 10.1.2.6:/media/disk36
                     Brick357: 10.1.2.7:/media/disk36
                     Brick358: 10.1.2.8:/media/disk36
                     Brick359: 10.1.2.9:/media/disk36
                     Brick360: 10.1.2.10:/media/disk36
                     Options Reconfigured:
                     performance.readdir-ahead: on
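
                     For context, here is a minimal sketch of how a
                     36 x (8 + 2) distributed-disperse layout like the one
                     above could be created with the gluster CLI. The loop
                     and the brick ordering are assumptions reconstructed
                     from the brick list; they are not the exact commands
                     we ran.

                     # Build the brick list so that each group of 10
                     # consecutive bricks (one disk across all 10 nodes)
                     # forms one 8+2 disperse set.
                     bricks=""
                     for disk in $(seq 1 36); do
                         for node in $(seq 1 10); do
                             bricks="$bricks 10.1.2.$node:/media/disk$disk"
                         done
                     done

                     # 'disperse 10 redundancy 2' gives 8 data + 2 redundancy
                     # bricks per set; 360 bricks / 10 = 36 distributed sets.
                     gluster volume create vaulttest5 disperse 10 redundancy 2 \
                         transport tcp $bricks
                     gluster volume start vaulttest5
                     gluster volume set vaulttest5 performance.readdir-ahead on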

                     We did some performance testing and simulated proactive
                     self-healing for erasure coding. The disperse volume has
                     been created across nodes.

                     Description of the problem:

                     I disconnected the network of two nodes and tried to
                     write some video files, and glusterfs wrote them to the
                     remaining 8 nodes perfectly. I tried to download the
                     uploaded file and it downloaded perfectly. Then I
                     re-enabled the network of the two nodes; the proactive
                     self-healing mechanism worked perfectly and wrote the
                     unavailable chunks of data to the recently re-enabled
                     nodes from the other 8 nodes. But when I then tried to
                     download the same file, it showed an Input/output error
                     and I couldn't download it. I think there is an issue
                     in proactive self-healing.
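
                     In case it helps to reproduce this, here is a rough
                     sketch of the test sequence, assuming iptables on the
                     client is used to simulate the two-node outage. The
                     choice of nodes .9 and .10 and the source path of the
                     test file are made up for illustration.

                     # Cut off two of the ten nodes from the client.
                     iptables -A OUTPUT -d 10.1.2.9 -j DROP
                     iptables -A OUTPUT -d 10.1.2.10 -j DROP

                     # Write a test file while only 8 nodes are reachable,
                     # then verify it reads back correctly.
                     cp /home/admin/video/file13_AN /mnt/gluster/file13_AN
                     md5sum /mnt/gluster/file13_AN

                     # Restore connectivity; proactive self-healing should
                     # rebuild the missing fragments on the two nodes.
                     iptables -D OUTPUT -d 10.1.2.9 -j DROP
                     iptables -D OUTPUT -d 10.1.2.10 -j DROP
                     gluster volume heal vaulttest5 info   # wait until no entries remain

                     # Read the file again after healing; this is the step
                     # that fails with Input/output error.
                     md5sum /mnt/gluster/file13_AN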

                     We also tried the simulation with a single-node network
                     failure and faced the same I/O error while downloading
                     the file.


                     Error while downloading the file:

                     root@master02:/home/admin# rsync -r --progress /mnt/gluster/file13_AN ./1/file13_AN-2
                     sending incremental file list
                     file13_AN
                         3,342,355,597 100%    4.87MB/s    0:10:54 (xfr#1, to-chk=0/1)
                     rsync: read errors mapping "/mnt/gluster/file13_AN": Input/output error (5)
                     WARNING: file13_AN failed verification -- update discarded (will try again).

                     root@master02:/home/admin# cp /mnt/gluster/file13_AN ./1/file13_AN-3
                     cp: error reading ‘/mnt/gluster/file13_AN’: Input/output error
                     cp: failed to extend ‘./1/file13_AN-3’: Input/output error


                     We can't tell whether the issue lies in glusterfs 3.7.0
                     itself or in our glusterfs configuration.
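
                     One check that might help narrow this down (a sketch
                     only; the trusted.ec.* xattr names are what the disperse
                     translator appears to use in 3.7, and the brick paths
                     follow the layout above): compare the erasure-coding
                     metadata of the healed fragments with the fragments on
                     bricks that never went offline. Mismatching version/size
                     values would point at the heal rather than at our
                     configuration.

                     # Dump the EC xattrs of the file's fragments on every
                     # node; the file lives on one disk number across all 10
                     # nodes, so a glob over /media/disk* finds it.
                     for node in $(seq 1 10); do
                         echo "--- 10.1.2.$node"
                         ssh 10.1.2.$node \
                             "getfattr -d -m trusted.ec -e hex /media/disk*/file13_AN" \
                             2>/dev/null
                     done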

                     Any help would be greatly appreciated

                     --
                     Cheers
                     Backer














_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users



