Re: one brick one volume process dies?

On 13/09/17 20:47, Ben Werthmann wrote:
These symptoms appear to be the same as I've recorded in this post:

http://lists.gluster.org/pipermail/gluster-users/2017-September/032435.html

On Wed, Sep 13, 2017 at 7:01 AM, Atin Mukherjee <atin.mukherjee83@xxxxxxxxx> wrote:

    Additionally, the brick log file of the same brick
    would be required. Please check whether the brick
    process went down or crashed. Doing a volume start
    force should resolve the issue.
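
In concrete terms, the suggestion above amounts to roughly the following (a minimal sketch; the brick log path assumes the default /var/log/glusterfs/bricks/ location, where the file name is derived from the brick path with slashes turned into dashes):

$ gluster volume start CYTO-DATA force
$ gluster volume status CYTO-DATA
# then look for crash or shutdown messages in the brick's own log, e.g.:
$ tail -n 100 /var/log/glusterfs/bricks/__.aLocalStorages-0-0-GLUSTERs-0GLUSTER-CYTO-DATA.log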


When I do a vol start force, I see this among the log lines:

[2017-09-28 16:00:55.120726] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 308300
[2017-09-28 16:00:55.128867] W [socket.c:593:__socket_rwv] 0-glustershd: readv on /var/run/gluster/0853a4555820d3442b1c3909f1cb8466.socket failed (No data available)
[2017-09-28 16:00:56.122687] I [MSGID: 106568] [glusterd-svc-mgmt.c:228:glusterd_svc_stop] 0-management: glustershd service is stopped

Funnily (or not), a week later I now see:

$ gluster vol status CYTO-DATA
Status of volume: CYTO-DATA
Gluster process                                                     TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA  49161     0          Y       1743719
Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA 49152     0          Y       20438
Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA  49152     0          Y       5607
Self-heal Daemon on localhost                                       N/A       N/A        Y       41106
Quota Daemon on localhost                                           N/A       N/A        Y       41117
Self-heal Daemon on 10.5.6.17                                       N/A       N/A        Y       19088
Quota Daemon on 10.5.6.17                                           N/A       N/A        Y       19097
Self-heal Daemon on 10.5.6.32                                       N/A       N/A        Y       1832978
Quota Daemon on 10.5.6.32                                           N/A       N/A        Y       1832987
Self-heal Daemon on 10.5.6.49                                       N/A       N/A        Y       320291
Quota Daemon on 10.5.6.49                                           N/A       N/A        Y       320303

Task Status of Volume CYTO-DATA
------------------------------------------------------------------------------
There are no active volume tasks


$ gluster vol heal CYTO-DATA info
Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
Status: Transport endpoint is not connected
Number of entries: -

Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA
....
....
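
"Transport endpoint is not connected" for a single brick usually means the heal and quota daemons cannot reach that brick's glusterfsd process. A rough way to confirm whether the brick process is even running (a sketch, run on the node hosting the unreachable brick, 10.5.6.49 in this output):

$ ps ax | grep '[g]lusterfsd' | grep 0GLUSTER-CYTO-DATA
$ gluster volume status CYTO-DATA 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-CYTO-DATA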


    On Wed, 13 Sep 2017 at 16:28, Gaurav Yadav
    <gyadav@xxxxxxxxxx> wrote:

        Please send me the logs as well, i.e. glusterd.log
        and cmd_history.log.


        On Wed, Sep 13, 2017 at 1:45 PM, lejeczek
        <peljasz@xxxxxxxxxxx> wrote:



            On 13/09/17 06:21, Gaurav Yadav wrote:

                Please provide the output of gluster
                volume info, gluster volume status and
                gluster peer status.

                Apart from the above info, please provide
                the glusterd logs and cmd_history.log.

                Thanks
                Gaurav
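
                For completeness, gathering what is asked for
                here looks roughly like this (a sketch; the log
                paths assume the default /var/log/glusterfs
                location, and on older releases the glusterd
                log may be named etc-glusterfs-glusterd.vol.log
                rather than glusterd.log):

                $ gluster volume info > vol-info.txt
                $ gluster volume status > vol-status.txt
                $ gluster peer status > peer-status.txt
                $ tar czf gluster-logs.tar.gz \
                      /var/log/glusterfs/glusterd.log \
                      /var/log/glusterfs/cmd_history.log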

                On Tue, Sep 12, 2017 at 2:22 PM, lejeczek
                <peljasz@xxxxxxxxxxx> wrote:

                    hi everyone

                    I have a 3-peer cluster with all vols in
                    replica mode, 9 vols.
                    What I see, unfortunately, is that one
                    brick fails in one vol, and when it
                    happens it's always the same vol on the
                    same brick.
                    Command: gluster vol status $vol - would
                    show the brick not online.
                    Restarting glusterd with systemctl does
                    not help, only a system reboot seems to
                    help, until it happens the next time.

                    How to troubleshoot this weird
                    misbehaviour?
                    many thanks, L.
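
                    As a rough sketch of the checks this
                    thread converges on (volume name is a
                    placeholder), one can restart glusterd
                    and then verify whether the failed
                    brick's glusterfsd process actually
                    came back:

                    $ systemctl restart glusterd
                    $ gluster vol status $vol
                    $ ps ax | grep '[g]lusterfsd'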




            hi, here:

            $ gluster vol info C-DATA

            Volume Name: C-DATA
            Type: Replicate
            Volume ID: 18ffba73-532e-4a4d-84da-fceea52f8c2e
            Status: Started
            Snapshot Count: 0
            Number of Bricks: 1 x 3 = 3
            Transport-type: tcp
            Bricks:
            Brick1:
            10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
            Brick2:
            10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
            Brick3:
            10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA
            Options Reconfigured:
            performance.md-cache-timeout: 600
            performance.cache-invalidation: on
            performance.stat-prefetch: on
            features.cache-invalidation-timeout: 600
            features.cache-invalidation: on
            performance.io-thread-count: 64
            performance.cache-size: 128MB
            cluster.self-heal-daemon: enable
            features.quota-deem-statfs: on
            changelog.changelog: on
            geo-replication.ignore-pid-check: on
            geo-replication.indexing: on
            features.inode-quota: on
            features.quota: on
            performance.readdir-ahead: on
            nfs.disable: on
            transport.address-family: inet
            performance.cache-samba-metadata: on


            $ gluster vol status C-DATA
            Status of volume: C-DATA
            Gluster process                                                   TCP Port  RDMA Port  Online  Pid
            ------------------------------------------------------------------------------
            Brick 10.5.6.49:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA   N/A       N/A        N       N/A
            Brick 10.5.6.100:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA  49152     0          Y       9376
            Brick 10.5.6.32:/__.aLocalStorages/0/0-GLUSTERs/0GLUSTER-C-DATA   49152     0          Y       8638
            Self-heal Daemon on localhost                                     N/A       N/A        Y       387879
            Quota Daemon on localhost                                         N/A       N/A        Y       387891
            Self-heal Daemon on rider.private.ccnr.ceb.private.cam.ac.uk      N/A       N/A        Y       16439
            Quota Daemon on rider.private.ccnr.ceb.private.cam.ac.uk          N/A       N/A        Y       16451
            Self-heal Daemon on 10.5.6.32                                     N/A       N/A        Y       7708
            Quota Daemon on 10.5.6.32                                         N/A       N/A        Y       8623
            Self-heal Daemon on 10.5.6.17                                     N/A       N/A        Y       20549
            Quota Daemon on 10.5.6.17                                         N/A       N/A        Y       9337

            Task Status of Volume C-DATA
            ------------------------------------------------------------------------------
            There are no active volume tasks







--
--Atin





_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



