FUSE clients: unusual disconnects/reconnects

Web client: http://fpaste.org/105784/14014066/

Gluster storage server: http://fpaste.org/105786/40140677/

The fpastes above capture a situation I have now hit twice, where my 8-node Gluster 2x4 distributed-replicate volume was playing the connect/disconnect game.
My servers are RHEL 6.4 running GlusterFS 3.4.2.

Volume Name: prodstatic
Type: Distributed-Replicate
Volume ID: 187c241d-0eeb-4405-80f2-c704ea44bc36
Status: Started
Number of Bricks: 2 x 4 = 8
Transport-type: tcp
Bricks:
Brick1: omhq1140:/export/content/static
Brick2: omdx1c5d:/export/content/static
Brick3: omhq11ad:/export/content/static
Brick4: omdx1781:/export/content/static
Brick5: omhq1c56:/export/content/static
Brick6: omdx1c58:/export/content/static
Brick7: omhq1c57:/export/content/static
Brick8: omdx1c59:/export/content/static
Options Reconfigured:
server.statedump-path: /debug
features.quota: on
server.allow-insecure: on
network.ping-timeout: 10

My question is: why the frequent disconnects/reconnects? It was very sporadic in my environment, across multiple clients and storage nodes. The chaos lasted from roughly 10:20 to 12:00. My network admins are looking at the network while I'm trying to make sense of my storage logs.
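For what it's worth, here is roughly how I've been lining up connect/disconnect events across nodes (paths assume the default /var/log/glusterfs layout, and since each log line starts with a timestamp, a plain sort comes out chronological):

  # on each storage node: pull connect/disconnect events out of the brick logs
  grep -ih 'connect' /var/log/glusterfs/bricks/*.log | sort

  # on a client: same idea against the fuse mount's log
  # (the log file is named after the mount point; mnt-prodstatic is just a placeholder)
  grep -ih 'connect' /var/log/glusterfs/mnt-prodstatic.log | sort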

30 minutes prior, I made one config change: 'gluster volume profile prodstatic stop'. I can't imagine that issuing that command could drop my storage nodes and clients like flies, but can it? From the client logs it appears to establish an RPC connection and then immediately drop it. Are the logs lying? Is my network.ping-timeout set too low? My intent was that if a storage node stops responding, the client should drop it and move on to the next available replica. If that is not how the option works, then a higher network.ping-timeout would just mean a longer stall, which is not what I want.
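If 10 seconds turns out to be too aggressive, my understanding is I can check the current setting and walk it back toward the 42-second default with something like:

  # confirm what the volume currently has configured
  gluster volume info prodstatic | grep ping-timeout

  # raise it back to the default, trading faster failover for fewer false drops
  gluster volume set prodstatic network.ping-timeout 42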

All my storage disks are fine, and I never once got an alert for an offline server or client. During this window the mounted FUSE filesystems were crawling! No rebalance was running.
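Since server.statedump-path already points at /debug, my plan for the next occurrence is to grab statedumps while the mounts are crawling, which as I understand the 3.4 CLI would be:

  # dump the brick processes' state (call pools, fds, inodes, memory) into /debug on each server
  gluster volume statedump prodstatic all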

Pointers to any relevant Bugzilla entries or mailing list threads would be much appreciated.

Khoi


