Re: "Transport endpoint is not connected" error + long list of files to be healed

Ashish Pandey <aspandey@xxxxxxxxxx> · Wed, 13 Nov 2019 14:05:56 -0500 (EST)

Hi Mauro,

Yes, healing these files will take time. How long depends on the number of files and directories you created and the amount of data you wrote while the
bricks were down.

You can run the following command repeatedly and watch whether the count keeps changing:

gluster volume heal tier2 info | grep entries
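
As a rough sketch of that check, the per-brick counts can be added up into a single number, which makes it easier to see whether the backlog is shrinking. This assumes the heal info output contains one "Number of entries: N" line per brick. The function name sum_heal_entries is made up for illustration, and values that are not numeric (for example for a brick that cannot be reached) are skipped.

```shell
# Hypothetical helper: read "gluster volume heal <vol> info" output on stdin
# and print the sum of all "Number of entries: N" lines.
sum_heal_entries() {
    awk -F': ' '
        # Only count lines whose value starts with digits; skip anything else.
        /^Number of entries:/ && $2 ~ /^[0-9]+/ { total += $2 }
        END { print total + 0 }
    '
}

# Example use (run by hand every few minutes and compare the numbers):
#   gluster volume heal tier2 info | sum_heal_entries
```

If the printed total keeps going down between runs, the self-heal daemon is making progress.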

---
Ashish

From: "Mauro Tridici" <mauro.tridici@xxxxxxx>
To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
Cc: "Gluster-users" <gluster-users@xxxxxxxxxxx>
Sent: Wednesday, November 13, 2019 7:00:37 PM
Subject: [Gluster-users] "Transport endpoint is not connected" error + long list of files to be healed

Dear All,

Our GlusterFS filesystem was showing problems during some simple user actions (for example, directory or file creation).

mkdir -p test
mkdir: cannot create directory `test': Transport endpoint is not connected

After receiving some user reports, I investigated the issue and found that 3 bricks (each on a separate gluster server) were down. So I forced the bricks up using “gluster vol start tier force”, and the bricks came back successfully. All the bricks are now up.

However, the “gluster vol status” output showed that 2 self-heal daemons were also down, and I had to restart them to fix the problem.
Now everything looks fine in the output of “gluster vol status”, and I can create a test directory on the file system.

But during the last check, made using “gluster volume heal tier2 info”, I saw a long list of files and directories that need to be healed.
The list is very long, and the command output is still scrolling on my terminal.

What can I do to fix this issue? Does the self-heal feature automatically fix each file that needs to be healed?
Could you please help me understand what I need to do in this case?

You can find below some information about our GlusterFS configuration:

Volume Name: tier2
Type: Distributed-Disperse
Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
Status: Started
Snapshot Count: 0
Number of Bricks: 12 x (4 + 2) = 72
Transport-type: tcp
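
For readers unfamiliar with the notation, the "Number of Bricks" line can be unpacked with a little arithmetic. This is only an illustrative sketch: the variable names are made up, and it assumes the usual reading of a Distributed-Disperse layout, where 12 is the number of disperse subvolumes and each subvolume has 4 data bricks plus 2 redundancy bricks.

```shell
# Layout arithmetic for "Number of Bricks: 12 x (4 + 2) = 72".
subvolumes=12   # disperse subvolumes that files are distributed across
data=4          # data bricks per subvolume
redundancy=2    # redundancy bricks per subvolume

bricks_per_subvol=$((data + redundancy))
total_bricks=$((subvolumes * bricks_per_subvol))

echo "bricks per subvolume: $bricks_per_subvol"
echo "total bricks: $total_bricks"
echo "bricks that can be down per subvolume: $redundancy"
```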

Thank you in advance.
Regards,
Mauro

________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
