On 22/07/2016 21:12, Yannick Perret wrote:
On 22/07/2016 17:47, Mykola Ulianytskyi wrote:
Hi
3.7 clients are not compatible with 3.6 servers
Can you provide more info?
I use some 3.7 clients with 3.6 servers and don't see issues.
Well, with client 3.7.13 compiled on the same machine, when I try the same mount I get:
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
Mount failed. Please check the log file for more details.
Checking the logs (/var/log/glusterfs/zog.log) I have:
[2016-07-22 19:05:40.249143] I [MSGID: 100030] [glusterfsd.c:2338:main] 0-/usr/local/sbin/glusterfs: Started running /usr/local/sbin/glusterfs version 3.7.13 (args: /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain --volfile-id=BACKUP-ADMIN-DATA /zog)
[2016-07-22 19:05:40.258437] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-07-22 19:05:40.259480] W [socket.c:701:__socket_rwv] 0-glusterfs: readv on <the-IP>:24007 failed (Aucune donnée disponible)
[2016-07-22 19:05:40.259859] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x175)[0x7fad7d039335] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1b3)[0x7fad7ce04e73] (--> /usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fad7ce04f6e] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7e)[0x7fad7ce065ee] (--> /usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7fad7ce06de8] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2016-07-22 19:05:40.258858 (xid=0x1)
[2016-07-22 19:05:40.259894] E [glusterfsd-mgmt.c:1690:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:BACKUP-ADMIN-DATA)
[2016-07-22 19:05:40.259939] W [glusterfsd.c:1251:cleanup_and_exit] (-->/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1de) [0x7fad7ce04e9e] -->/usr/local/sbin/glusterfs(mgmt_getspec_cbk+0x454) [0x40d564] -->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b) [0x407eab] ) 0-: received signum (0), shutting down
[2016-07-22 19:05:40.259965] I [fuse-bridge.c:5720:fini] 0-fuse: Unmounting '/zog'.
[2016-07-22 19:05:40.260913] W [glusterfsd.c:1251:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x80a4) [0x7fad7c0a30a4] -->/usr/local/sbin/glusterfs(glusterfs_sigwaiter+0xc5) [0x408015] -->/usr/local/sbin/glusterfs(cleanup_and_exit+0x4b) [0x407eab] ) 0-: received signum (15), shutting down
Hmmm… I just noticed that the logs are (partly) localized, which can make them harder to understand for non-French speakers.
"Aucune donnée disponible" means "No data available" (ENODATA).
BTW, if I could get 3.7 clients to work with my servers, and if the memory leak doesn't exist in 3.7, that would be fine for me.
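Also, to get the untranslated error strings and a bit more detail from that failing mount, it can be re-run with an English locale and a higher client log level; a minimal sketch, assuming the log-level mount option behaves as on stock 3.7 builds and the locale is inherited by the client process:
# LC_ALL=C mount -t glusterfs -o log-level=DEBUG sto1.my.domain:BACKUP-ADMIN-DATA /zog/
# tail -n 50 /var/log/glusterfs/zog.log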
--
Y.
I did not investigate further, as I just presumed that the 3.7 series was not compatible with 3.6 servers, but it may be something else. In any case it is the same client, the same server(s) and the same volume.
The 3.7.13 client was built with the following features (configured with "configure --disable-tiering" as I don't have the tiering dependencies installed); the build steps themselves are sketched after the list:
FUSE client : yes
Infiniband verbs : no
epoll IO multiplex : yes
argp-standalone : no
fusermount : yes
readline : yes
georeplication : yes
Linux-AIO : no
Enable Debug : no
Block Device xlator : no
glupy : yes
Use syslog : yes
XML output : yes
QEMU Block formats : no
Encryption xlator : yes
Unit Tests : no
POSIX ACLs : yes
Data Classification : no
firewalld-config : no
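The build itself follows the usual from-source steps; a minimal sketch, assuming the default /usr/local prefix (./autogen.sh is only needed when building from a git checkout, not from a release tarball):
./autogen.sh
./configure --disable-tiering
make
make install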
Regards,
--
Y.
Thank you
--
With best regards,
Mykola
On Fri, Jul 22, 2016 at 4:31 PM, Yannick Perret <yannick.perret@xxxxxxxxxxxxx> wrote:
Note: I have a dev client machine, so I can run tests or recompile the glusterfs client if that can help gather data about this.
I did not test this problem against the 3.7.x versions, as my 2 servers are in use and I can't upgrade them at this time, and 3.7 clients are not compatible with 3.6 servers (as far as I can tell from my tests).
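If the cluster op-version matters for that compatibility question, it can be read on the servers; a sketch, assuming the usual glusterd working directory:
# grep operating-version /var/lib/glusterd/glusterd.info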
--
Y.
On 22/07/2016 14:06, Yannick Perret wrote:
Hello,
Some time ago I posted about a memory leak in the client process, but it was on a very old 32-bit machine (both kernel and OS) and I didn't find evidence of a similar problem on our recent machines.
But I have performed more tests and I see the same problem.
Clients are 64-bit Debian 8.2 machines. The glusterfs client on these machines is compiled from sources with the following features enabled:
FUSE client : yes
Infiniband verbs : no
epoll IO multiplex : yes
argp-standalone : no
fusermount : yes
readline : yes
georeplication : yes
Linux-AIO : no
Enable Debug : no
systemtap : no
Block Device xlator : no
glupy : no
Use syslog : yes
XML output : yes
QEMU Block formats : no
Encryption xlator : yes
Erasure Code xlator : yes
I tested both the 3.6.7 and 3.6.9 versions on the client (3.6.7 is the one installed on our machines, including the servers; 3.6.9 is for testing with the latest 3.6 version).
Here are the operations on the client (also performed, with similar results, with the 3.6.7 version):
# /usr/local/sbin/glusterfs --version
glusterfs 3.6.9 built on Jul 22 2016 13:27:42
(…)
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
# cd /usr/
# cp -Rp * /zog/TEMP/
Then, monitoring the memory used by the glusterfs process while the 'cp' is running (VSZ and RSS from 'ps', respectively; a sampling sketch is shown after the numbers):
284740 70232
284740 70232
284876 71704
285000 72684
285136 74008
285416 75940
(…)
368684 151980
369324 153768
369836 155576
370092 156192
370092 156192
Here both sizes are stable and correspond to the end of the 'cp' command.
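The sampling can be done with a simple loop of this kind; a minimal sketch, assuming a single glusterfs client process matches the (illustrative) pgrep pattern:
# while sleep 5; do ps -o vsz=,rss= -p "$(pgrep -f 'volfile-id=BACKUP-ADMIN-DATA')"; done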
If I restart another 'cp' (even on the same directories) the size starts to increase again.
If I perform an 'ls -lR' in the directory the size also increases:
370756 192488
389964 212148
390948 213232
(here I ^C the 'ls')
When doing nothing the size doesn't increase, but it never decreases (calling 'sync' doesn't change the situation).
Sending a HUP signal to the glusterfs process also increases memory (390948 213324 → 456484 213320).
Changing the volume configuration (changing the diagnostics.client-sys-log-level value) doesn't change anything.
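(Changing that option is done with the standard volume-set command, e.g.:
# gluster volume set BACKUP-ADMIN-DATA diagnostics.client-sys-log-level WARNING
)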
Here is the current ps output:
root 17041 4.9 5.2 456484 213320 ? Ssl 13:29 1:21 /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain --volfile-id=BACKUP-ADMIN-DATA /zog
Of course, unmounting/remounting falls back to the "start" size:
# umount /zog
# mount -t glusterfs sto1.my.domain:BACKUP-ADMIN-DATA /zog/
→ root 28741 0.3 0.7 273320 30484 ? Ssl 13:57 0:00 /usr/local/sbin/glusterfs --volfile-server=sto1.my.domain --volfile-id=BACKUP-ADMIN-DATA /zog
I didn't see this before because most of our volumes are mounted "on demand" for some storage activities, or are permanently mounted but with very little activity.
But clearly this memory usage drift is a long-term problem. On the old 32-bit machine I had this problem ("solved" by using NFS mounts while waiting for that old machine to be replaced), and it led to glusterfs being killed by the OS when it ran out of free memory. It was faster there than what I describe here, but it's just a question of time.
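If it can help the debugging, my understanding is that a statedump of the client can be obtained by sending SIGUSR1 to the glusterfs process; a sketch, assuming the default dump directory (it may differ on a from-source install):
# kill -USR1 17041
# ls /var/run/gluster/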
Thanks for any help about that.
Regards,
--
Y.
The corresponding volume on the servers is (if it can help):
Volume Name: BACKUP-ADMIN-DATA
Type: Replicate
Volume ID: 306d57f3-fb30-4bcc-8687-08bf0a3d7878
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: sto1.my.domain:/glusterfs/backup-admin/data
Brick2: sto2.my.domain:/glusterfs/backup-admin/data
Options Reconfigured:
diagnostics.client-sys-log-level: WARNING
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users