Hi
I noticed severe throughput degradation while the gdb script is attached to a glusterfs client process. Write speed drops to 2% or less, so I cannot keep it running in production.
Could you provide the custom build that you mentioned before? I am going to keep trying to reproduce the problem outside of the production environment.
Regards
On January 8, 2017, at 21:54, Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:
Is there any update on this?
Regards
Rafi KC
On 12/24/2016 03:53 PM, yonex wrote:
Rafi,
Thanks again. I will try that and get back to you.
Regards.

2016-12-23 18:03 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
Hi Yonex,
As we discussed on IRC in #gluster-devel, I have attached the gdb script to this mail.
Procedure to run the gdb script:
1) Install gdb.
2) Download and install the gluster debuginfo packages for your machine. Package location: https://cbs.centos.org/koji/buildinfo?buildID=12757
3) Find the process ID and attach gdb to the process using the command gdb attach <pid> -x <path_to_script>
4) Keep the script running until you hit the problem.
5) Stop gdb.
6) You will see a file called mylog.txt in the location where you ran gdb.
Please keep an eye on the attached process. If you have any doubts, please feel free to get back to me.
Regards
Rafi KC

On 12/19/2016 05:33 PM, Mohammed Rafi K C wrote:
On 12/19/2016 05:32 PM, Mohammed Rafi K C wrote:
Client 0-glusterfs01-client-2 disconnected from the bricks around 2016-12-15 11:21:17.854249. Can you look at and/or paste the brick logs from around that time? You can find the brick name and hostname for 0-glusterfs01-client-2 from the client graph.
Rafi

Are you on any of the gluster IRC channels? If so, do you have a nickname that I can search for?
Regards
Rafi KC

On 12/19/2016 04:28 PM, yonex wrote:
Rafi,
OK. Thanks for your guidance. I found the debug log and pasted the lines around that.
http://pastebin.com/vhHR6PQN
Regards

2016-12-19 14:58 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
On 12/16/2016 09:10 PM, yonex wrote:
Rafi,
Thanks, the .meta feature, which I didn't know about, is very nice. I have finally captured debug logs from a client and the bricks.
A mount log:
- http://pastebin.com/Tjy7wGGj
FYI rickdom126 is my client's hostname.
Brick logs around that time:
- Brick1: http://pastebin.com/qzbVRSF3
- Brick2: http://pastebin.com/j3yMNhP3
- Brick3: http://pastebin.com/m81mVj6L
- Brick4: http://pastebin.com/JDAbChf6
- Brick5: http://pastebin.com/7saP6rsm
However, I could not find any message like "EOF on socket". I hope there is some helpful information in the logs above.

Indeed. I understand that the connections are in a disconnected state, but what I'm particularly looking for is the cause of the disconnect. Can you paste the debug logs from when the disconnects start, and around that time? You may see a debug log that says "disconnecting now".
Regards
Rafi KC

Regards.

2016-12-14 15:20 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
On 12/13/2016 09:56 PM, yonex wrote:
Hi Rafi,
Thanks for your response. OK, I think it is possible to capture debug logs, since the error seems to be reproduced a few times per day. I will try that. However, since I want to avoid redundant debug output if possible, is there a way to enable debug logging only on specific client nodes?

If you are using a fuse mount, there is a proc-like feature called .meta. You can set the log level through that for a particular client [1]. But I also want logs from the bricks, because I suspect the brick processes of initiating the disconnects.
[1] e.g.: echo 8 > /mnt/glusterfs/.meta/logging/loglevel

Regards
Yonex

2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
Hi Yonex,
Is this consistently reproducible? If so, can you enable debug logging [1] and check for any message similar to [2]? Basically, you can even search for "EOF on socket".
You can set your log level back to the default (INFO) after capturing for some time.
[1]: gluster volume set <volname> diagnostics.brick-log-level DEBUG and gluster volume set <volname> diagnostics.client-log-level DEBUG
[2]: http://pastebin.com/xn8QHXWa
Regards
Rafi KC

On 12/12/2016 09:35 PM, yonex wrote:
Hi,
When my application moves a file from its local disk to a FUSE-mounted GlusterFS volume, the client occasionally, though not always, outputs many warnings and errors. The volume is a simple distributed volume.
A sample of the logs is pasted here: http://pastebin.com/axkTCRJX
At a glance it seems to come from something like a network disconnection ("Transport endpoint is not connected"), but other networking applications on the same machine don't observe such a thing. So I guess there may be a problem somewhere in the GlusterFS stack.
It ended in a failure to rename a file, logging PHP warnings like the ones below:
PHP Warning: rename(/glusterfs01/db1/stack/f0/13a9a2f0): failed to open stream: Input/output error in [snipped].php on line 278
PHP Warning: rename(/var/stack/13a9a2f0,/glusterfs01/db1/stack/f0/13a9a2f0): Input/output error in [snipped].php on line 278
Conditions:
- GlusterFS 3.8.5 installed via yum CentOS-Gluster-3.8.repo
- Volume info and status pasted: http://pastebin.com/JPt2KeD8
- Client machines' OS: Scientific Linux 6 or CentOS 6.
- Server machines' OS: CentOS 6.
- Kernel version is 2.6.32-642.6.2.el6.x86_64 on all machines.
- The number of connected FUSE clients is 260.
- No firewall between connected machines.
- Neither remounting volumes nor rebooting client machines has any effect.
- It is triggered not only by rename() but also by copy() and filesize() operations.
- No output in the brick logs when it happens.
Any ideas? I'd appreciate any help.
Regards.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
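
For reference, the log-capture steps discussed in this thread could be scripted roughly as below. This is only a sketch under assumptions: the function names, the DRY_RUN switch, and the idea of sleeping for a fixed capture window are illustrative and do not come from the thread. The gluster volume set options, the .meta/logging/loglevel path, and the search strings are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: helper names and DRY_RUN are hypothetical, not part of GlusterFS.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = volume name, $2 = log level (DEBUG or INFO in the thread)
set_log_levels() {
    run gluster volume set "$1" diagnostics.brick-log-level "$2"
    run gluster volume set "$1" diagnostics.client-log-level "$2"
}

# $1 = volume name, $2 = seconds to keep DEBUG enabled (assumed capture window)
capture_debug_logs() {
    set_log_levels "$1" DEBUG
    run sleep "$2"
    set_log_levels "$1" INFO   # back to the default level
}

# Per-client alternative via the FUSE .meta feature.
# $1 = mount point, $2 = numeric level (the thread uses 8)
set_mount_log_level() {
    echo "$2" > "$1/.meta/logging/loglevel"
}

# $1 = log file; print lines with the disconnect markers named in the thread.
find_disconnects() {
    grep -n -e "EOF on socket" -e "disconnecting now" "$1"
}
```

Running capture_debug_logs with DRY_RUN=1 first prints the commands without touching the volume, which may be worth doing before trying it on a production cluster.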
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://lists.gluster.org/mailman/listinfo/gluster-users