Hi,
We have a cluster whose common storage is a gluster volume
consisting of 5 bricks residing on 3 servers.
- Gluster volume machines
  - mseas-data2: CentOS release 6.8 (Final)
  - mseas-data3: CentOS release 6.10 (Final)
  - mseas-data4: CentOS Linux release 7.9.2009 (Core)
- Client machines
  - CentOS Linux release 7.9.2009 (Core)
More details on the gluster volume are included below.
We were recently trying to gunzip a file on the gluster volume and
got a "Transport endpoint is not connected" error, even though every
test we try shows that gluster is fully up and running fine. We
traced the file to brick 3 on the server mseas-data3. Below the
gluster information, we have included the relevant portions of the
various log files from the client (mseas), where we were running the
gunzip command, and from the server hosting the file (mseas-data3).
What can you suggest we do to further debug and/or solve this
issue?
Thanks
Pat
============================================================
Gluster volume information
============================================================
---------------------------------------------------
gluster volume info
-----------------------------------------
Volume Name: data-volume
Type: Distribute
Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
Status: Started
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: mseas-data2:/mnt/brick1
Brick2: mseas-data2:/mnt/brick2
Brick3: mseas-data3:/export/sda/brick3
Brick4: mseas-data3:/export/sdc/brick4
Brick5: mseas-data4:/export/brick5
Options Reconfigured:
diagnostics.client-log-level: ERROR
network.inode-lru-limit: 50000
performance.md-cache-timeout: 60
performance.open-behind: off
disperse.eager-lock: off
auth.allow: *
server.allow-insecure: on
nfs.exports-auth-enable: on
diagnostics.brick-sys-log-level: WARNING
performance.readdir-ahead: on
nfs.disable: on
nfs.export-volumes: off
cluster.min-free-disk: 1%
---------------------------------------------------
gluster volume status
--------------------------------------------
Status of volume: data-volume
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick mseas-data2:/mnt/brick1               49154     0          Y       15978
Brick mseas-data2:/mnt/brick2               49155     0          Y       15997
Brick mseas-data3:/export/sda/brick3        49153     0          Y       14221
Brick mseas-data3:/export/sdc/brick4        49154     0          Y       14240
Brick mseas-data4:/export/brick5            49152     0          Y       50569
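
Since the status output only reports what glusterd believes, one extra check is whether each brick port actually accepts a TCP connection from the client. The following is a minimal sketch, assuming it is run on the client (mseas) and that the hostnames resolve there. The function names and the BRICKS list are our own illustration (ports copied from the status output above), not part of gluster. A successful connect only shows that the port is open, not that the brick process is answering RPC requests.

```python
import socket


def brick_port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Hypothetical list for illustration: hosts and TCP ports as reported by
# "gluster volume status" above.
BRICKS = [
    ("mseas-data2", 49154),
    ("mseas-data2", 49155),
    ("mseas-data3", 49153),
    ("mseas-data3", 49154),
    ("mseas-data4", 49152),
]


def check_all(bricks, timeout=5.0):
    """Return a dict mapping (host, port) to True/False reachability."""
    return {(h, p): brick_port_reachable(h, p, timeout) for h, p in bricks}


# Example use (run on the client machine):
#     for (host, port), ok in check_all(BRICKS).items():
#         print(host, port, "reachable" if ok else "NOT reachable")
```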
---------------------------------------------------
gluster peer status
-----------------------------------------
Number of Peers: 2
Hostname: mseas-data3
Uuid: b39d4deb-c291-437e-8013-09050c1fa9e3
State: Peer in Cluster (Connected)
Hostname: mseas-data4
Uuid: 5c4d06eb-df89-4e5c-92e4-441fb401a9ef
State: Peer in Cluster (Connected)
---------------------------------------------------
glusterfs --version
--------------------------------------------
glusterfs 3.7.11 built on Apr 18 2016 13:20:46
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2013 Red Hat, Inc.
<http://www.redhat.com/>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
============================================================
Relevant sections from log files
============================================================
---------------------------------------------------
mseas: gdata.log
-----------------------------------------
[2022-06-15 14:51:17.263858] C
[rpc-clnt-ping.c:165:rpc_clnt_ping_timer_expired]
0-data-volume-client-2: server 172.16.1.113:49153 has not
responded in the last 42 seconds, disconnecting.
[2022-06-15 14:51:17.264522] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
))))) 0-data-volume-client-2: forced unwinding frame
type(GlusterFS 3.3) op(READ(12)) called at 2022-06-15
14:49:52.113795 (xid=0xb4f49b)
[2022-06-15 14:51:17.264859] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
))))) 0-data-volume-client-2: forced unwinding frame
type(GF-DUMP) op(NULL(2)) called at 2022-06-15 14:49:53.251903
(xid=0xb4f49c)
[2022-06-15 14:51:17.265111] E
[rpc-clnt.c:362:saved_frames_unwind] (-->
/usr/local/lib/libglusterfs.so.0(_gf_log_callingfn+0x172)[0x7f84886a0202]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_unwind+0x1c2)[0x7f848846c3e2]
(-->
/usr/local/lib/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f848846c4de]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x7a)[0x7f848846dd2a]
(-->
/usr/local/lib/libgfrpc.so.0(rpc_clnt_notify+0x88)[0x7f848846e538]
))))) 0-data-volume-client-2: forced unwinding frame
type(GlusterFS 3.3) op(FSTAT(25)) called at 2022-06-15
14:50:00.103768 (xid=0xb4f49d)
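
To see how often the client hits this ping timeout, the client log can be scanned for the "has not responded" entries. This is a minimal sketch, assuming the log format shown in the excerpt above (entries may be wrapped across lines, as they are in this mail). The function name and the regular expression are our own illustration, not part of gluster.

```python
import re
from datetime import datetime

# Pattern for the ping-timer-expired entry, written against the format of the
# gdata.log excerpt above. Whitespace is matched loosely so that wrapped
# entries still match.
PING_TIMEOUT = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d+)\]\s+C\s+"
    r"\[rpc-clnt-ping\.c:\d+:rpc_clnt_ping_timer_expired\]\s+"
    r"(?P<xlator>\S+):\s+server\s+(?P<host>[\d.]+):(?P<port>\d+)\s+"
    r"has\s+not\s+responded\s+in\s+the\s+last\s+(?P<secs>\d+)\s+seconds"
)


def find_ping_timeouts(log_text):
    """Return one dict per ping-timeout entry found in log_text."""
    # Collapse all runs of whitespace (including line breaks) to single
    # spaces so that entries wrapped over several lines are handled.
    flat = " ".join(log_text.split())
    events = []
    for m in PING_TIMEOUT.finditer(flat):
        events.append({
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f"),
            "xlator": m.group("xlator"),
            "host": m.group("host"),
            "port": int(m.group("port")),
            "seconds": int(m.group("secs")),
        })
    return events


# Example use, with the path to the client log as an assumption:
#     with open("/var/log/glusterfs/gdata.log") as fh:
#         for ev in find_ping_timeouts(fh.read()):
#             print(ev["time"], ev["xlator"], ev["host"], ev["port"])
```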
---------------------------------------------------
mseas-data3: cli.log
-----------------------------------------
[2022-06-15 14:27:12.982510] I [cli.c:721:main] 0-cli: Started
running gluster with version 3.7.11
[2022-06-15 14:27:13.206046] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
[2022-06-15 14:27:13.206152] I
[socket.c:2356:socket_event_handler] 0-transport: disconnecting
now
[2022-06-15 14:27:13.208711] I [input.c:36:cli_batch] 0-:
Exiting with: 0
[2022-06-15 14:27:23.579669] I [cli.c:721:main] 0-cli: Started
running gluster with version 3.7.11
[2022-06-15 14:27:23.711445] I [MSGID: 101190]
[event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started
thread with index 1
[2022-06-15 14:27:23.711551] I
[socket.c:2356:socket_event_handler] 0-transport: disconnecting
now
[2022-06-15 14:27:23.735073] I [input.c:36:cli_batch] 0-:
Exiting with: 0
---------------------------------------------------
mseas-data3: usr-local-etc-glusterfs-glusterd.vol.log
-----------------------------------------
[2022-06-15 14:27:13.208084] I [MSGID: 106487]
[glusterd-handler.c:1472:__glusterd_handle_cli_list_friends]
0-glusterd: Received cli list req
[2022-06-15 14:27:23.721724] I [MSGID: 106499]
[glusterd-handler.c:4331:__glusterd_handle_status_volume]
0-management: Received status volume req for volume data-volume
[2022-06-15 14:27:23.732286] W [MSGID: 106217]
[glusterd-op-sm.c:4630:glusterd_op_modify_op_ctx] 0-management:
Failed uuid to hostname conversion
[2022-06-15 14:27:23.732328] W [MSGID: 106387]
[glusterd-op-sm.c:4734:glusterd_op_modify_op_ctx] 0-management:
op_ctx modification failed
---------------------------------------------------
mseas-data3: bricks/export-sda-brick3.log
-----------------------------------------
[2022-06-15 14:50:42.588143] I [MSGID: 115036]
[server.c:552:server_rpc_notify] 0-data-volume-server:
disconnecting connection from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-28
[2022-06-15 14:50:42.588220] I [MSGID: 115013]
[server-helpers.c:294:do_fd_cleanup] 0-data-volume-server: fd
cleanup on
/projects/posydon/Acoustics_ASA/MSEAS-ParEq-DO/Save/2D/Test_Cases/RI/DO_NAPE_JASA_Paper/Uncertain_Pekeris_Waveguide_DO_MC
[2022-06-15 14:50:42.588259] I [MSGID: 115013]
[server-helpers.c:294:do_fd_cleanup] 0-data-volume-server: fd
cleanup on
/projects/dri_calypso/PE/2019/Apr09/Ens3R200deg001/pe_out.nc.gz
[2022-06-15 14:50:42.588288] I [MSGID: 101055]
[client_t.c:420:gf_client_unref] 0-data-volume-server: Shutting
down connection
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-28
[2022-06-15 14:50:53.605215] I [MSGID: 115029]
[server-handshake.c:690:server_setvolume] 0-data-volume-server:
accepted client from
mseas.mit.edu-155483-2022/05/13-03:24:14:618694-data-volume-client-2-0-29
(version: 3.7.11)
[2022-06-15 14:50:42.588247] I [MSGID: 115013]
[server-helpers.c:294:do_fd_cleanup] 0-data-volume-server: fd
cleanup on
/projects/posydon/Acoustics_ASA/MSEAS-ParEq-DO/Save/2D/Test_Cases/RI/DO_NAPE_JASA_Paper/Uncertain_Pekeris_Waveguide_DO_MC
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley@xxxxxxx
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301