Hi Yonex,
Are you still hitting this issue?
Regards
Rafi KC
On 01/16/2017 10:36 AM, yonex wrote:
Hi,
I noticed a severe throughput degradation while the gdb script is attached to a glusterfs client process: write speed drops to 2% or less, so it cannot be left running in production.
Could you provide the custom build that you mentioned before? I am going to keep trying to reproduce the problem outside of the production environment.
Regards
Is there any update on this?
Regards
Rafi KC
On 12/24/2016 03:53 PM, yonex wrote:
Rafi,
Thanks again. I will try that and get back to you.
Regards.
2016-12-23 18:03 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
Hi Yonex,
As we discussed on IRC in #gluster-devel, I have attached the gdb script along with this mail.
Procedure to run the gdb script (a consolidated example is sketched below the list):
1) Install gdb.
2) Download and install the gluster debuginfo packages for your machine. Package location: https://cbs.centos.org/koji/buildinfo?buildID=12757
3) Find the process id and attach gdb to the process using the command: gdb attach <pid> -x <path_to_script>
4) Continue running the script until you hit the problem.
5) Stop gdb.
6) You will see a file called mylog.txt in the location where you ran gdb.
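For reference, here is a minimal shell sketch of the whole sequence on one client. The pgrep pattern and the script path /root/gdb_script.txt are illustrative assumptions; only the gdb invocation itself is taken from the steps above.

    # install gdb (the debuginfo RPMs come from the koji link in step 2)
    yum install -y gdb

    # find the pid of the glusterfs client (FUSE mount) process
    pgrep -f glusterfs

    # attach gdb with the script, exactly as in step 3
    gdb attach <pid> -x /root/gdb_script.txt

    # reproduce the problem, then quit gdb; mylog.txt is written to the
    # directory gdb was started from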
Please keep an eye on the attached process. If you have any doubts, please feel free to get back to me.
Regards
Rafi KC
On 12/19/2016 05:33 PM, Mohammed Rafi K C wrote:
On 12/19/2016 05:32 PM, Mohammed Rafi K C wrote:
Client 0-glusterfs01-client-2 has disconnected from its brick around 2016-12-15 11:21:17.854249. Can you look at and/or paste the brick logs from around that time?
You can find the brick name and hostname for 0-glusterfs01-client-2 in the client graph (see the sketch below).
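As a hedged illustration of how to look that up: the client graph is dumped into the FUSE mount log, so grepping for the client xlator name shows its remote-host and remote-subvolume (brick path). The log file name below is an assumption based on the usual mount-log location, not taken from your setup.

    # show the client-2 subvolume definition from the client graph in the mount log
    grep -A 8 'volume glusterfs01-client-2' /var/log/glusterfs/<mount-log>.log
    # the 'option remote-host' and 'option remote-subvolume' lines identify the brick;
    # equivalently, client-2 should correspond to the third brick listed by
    # 'gluster volume info glusterfs01' (assuming the usual ordering)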
Rafi
Are you on any of the Gluster IRC channels? If so, do you have a nickname that I can search for?
Regards
Rafi KC
On 12/19/2016 04:28 PM, yonex wrote:
Rafi,
OK. Thanks for your guidance. I found the debug log and pasted the lines around that point: http://pastebin.com/vhHR6PQN
Regards
2016-12-19 14:58 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
On 12/16/2016 09:10 PM, yonex wrote:
Rafi,
Thanks. The .meta feature, which I didn't know about, is very nice. I have finally captured debug logs from a client and the bricks.

A mount log:
- http://pastebin.com/Tjy7wGGj

FYI, rickdom126 is my client's hostname.

Brick logs around that time:
- Brick1: http://pastebin.com/qzbVRSF3
- Brick2: http://pastebin.com/j3yMNhP3
- Brick3: http://pastebin.com/m81mVj6L
- Brick4: http://pastebin.com/JDAbChf6
- Brick5: http://pastebin.com/7saP6rsm

However, I could not find any message like "EOF on socket". I hope there is some helpful information in the logs above.
Indeed. I understand that the connections are in a disconnected state, but what I'm particularly looking for is the cause of the disconnect. Can you paste the debug logs from when it starts disconnecting, and from around that time? You may see a debug log entry that says "disconnecting now" (a short grep sketch follows below).
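As a minimal sketch of how to pull that out of the client log (the log path is an assumption based on the usual FUSE mount-log location; adjust it for your mount point):

    # search the FUSE mount log for the disconnect and keep surrounding context
    grep -n -B 5 -A 20 'disconnecting now' /var/log/glusterfs/<mount-log>.log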
Regards
Rafi KC
Regards.
2016-12-14 15:20 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
On 12/13/2016 09:56 PM, yonex wrote:
Hi Rafi,
Thanks for your response. OK, I think it is possible to capture debug logs, since the error seems to be reproduced a few times per day. I will try that. However, since I want to avoid redundant debug output if possible, is there a way to enable debug logging only on specific client nodes?
If you are using a FUSE mount, there is a proc-like feature called .meta. You can set the log level for a particular client through it [1] (a short sketch follows below). But I also want logs from the bricks, because I suspect the brick processes are initiating the disconnects.

[1] e.g.: echo 8 > /mnt/glusterfs/.meta/logging/loglevel
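A minimal sketch of using that interface on one client, assuming the volume is mounted at /mnt/glusterfs; only the echo line in [1] is from this mail, the rest is illustrative:

    # check the current client log level through the .meta interface
    cat /mnt/glusterfs/.meta/logging/loglevel

    # raise the log level for just this client (value from [1] above)
    echo 8 > /mnt/glusterfs/.meta/logging/loglevel

    # after capturing, write the previously-read value back to restore it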
Regards
Yonex
2016-12-13 23:33 GMT+09:00 Mohammed Rafi K C <rkavunga@xxxxxxxxxx>:
Hi Yonex,
Is this consistently reproducible? If so, can you enable debug logging [1] and check for any message similar to [2]? Basically, you can even just search for "EOF on socket". You can set your log level back to the default (INFO) after capturing for some time. An end-to-end sketch of these commands follows below.

[1]: gluster volume set <volname> diagnostics.brick-log-level DEBUG
     and
     gluster volume set <volname> diagnostics.client-log-level DEBUG
[2]: http://pastebin.com/xn8QHXWa
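For reference, a minimal sketch of the whole capture cycle, using a hypothetical volume name glusterfs01 and the usual brick-log location (both assumptions; only the two volume-set commands in [1] are from this mail):

    # enable DEBUG logging on the bricks and on the clients
    gluster volume set glusterfs01 diagnostics.brick-log-level DEBUG
    gluster volume set glusterfs01 diagnostics.client-log-level DEBUG

    # ...after the problem reproduces, search the brick logs for the pattern in [2]
    grep 'EOF on socket' /var/log/glusterfs/bricks/*.log

    # set the log levels back to the default (INFO)
    gluster volume set glusterfs01 diagnostics.brick-log-level INFO
    gluster volume set glusterfs01 diagnostics.client-log-level INFO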
Regards
Rafi KC
On 12/12/2016 09:35 PM, yonex wrote:
Hi,
When my application moves a file from its local disk to a FUSE-mounted GlusterFS volume, the client occasionally (not always) outputs many warnings and errors. The volume is a simple distributed volume.

A sample of the logs is pasted here: http://pastebin.com/axkTCRJX

At a glance it looks like a network disconnection ("Transport endpoint is not connected"), but other networking applications on the same machine don't observe anything like that, so I guess there may be a problem somewhere in the GlusterFS stack.
It ended up failing to rename a file, logging PHP warnings like the ones below:

PHP Warning: rename(/glusterfs01/db1/stack/f0/13a9a2f0): failed to open stream: Input/output error in [snipped].php on line 278
PHP Warning: rename(/var/stack/13a9a2f0,/glusterfs01/db1/stack/f0/13a9a2f0): Input/output error in [snipped].php on line 278
Conditions:
- GlusterFS 3.8.5 installed via yum from CentOS-Gluster-3.8.repo.
- Volume info and status pasted: http://pastebin.com/JPt2KeD8
- Client machines' OS: Scientific Linux 6 or CentOS 6.
- Server machines' OS: CentOS 6.
- Kernel version is 2.6.32-642.6.2.el6.x86_64 on all machines.
- The number of connected FUSE clients is 260.
- No firewall between connected machines.
- Neither remounting the volume nor rebooting client machines has any effect.
- It is triggered not only by rename() but also by copy() and filesize() operations.
- There is no output in the brick logs when it happens.
Any ideas? I'd appreciate any help.
Regards.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users