In GlusterFS the long string is called a "gfid" and does not represent the file name.

Best Regards,
Strahil Nikolov

On Friday, August 7, 2020, 21:40:11 GMT+3, Mathias Waack <mathias.waack@xxxxxxxxxxxxxxx> wrote:

Hi Strahil,

but I cannot find these files in the heal info:

find /zbrick/.glusterfs -links 1 -ls | grep -v ' -> '
...
7443397 132463 -rw------- 1 999 docker 1073741824 Aug 3 10:35 /zbrick/.glusterfs/b5/3c/b53c8e46-068b-4286-94a6-7cf54f711983

Now looking for this file in the heal info:

gluster volume heal gvol info | grep b53c8e46-068b-4286-94a6-7cf54f711983

shows nothing. So I do not know what I have to heal...

Mathias

On 07.08.20 14:32, Strahil Nikolov wrote:
> Have you tried to run a gluster heal and check if the files are back in their place?
>
> I always thought that those hard links are used by the healing mechanism, and if that is true, gluster should restore the files to their original location; wiping the correct files from the FUSE mount will then be easy.
>
> Best Regards,
> Strahil Nikolov
>
> On August 7, 2020, 10:24:38 GMT+03:00, Mathias Waack <mathias.waack@xxxxxxxxxxxxxxx> wrote:
>> Hi all,
>>
>> maybe I should add some more information:
>>
>> The container which filled up the space was running on node x, which
>> still shows a nearly full filesystem:
>>
>> 192.168.1.x:/gvol    2.6T  2.5T  149G  95%  /gluster
>>
>> Nearly the same situation on the underlying brick partition on node x:
>>
>> zdata/brick          2.6T  2.4T  176G  94%  /zbrick
>>
>> On node y (the one whose network card crashed), glusterfs shows the same values:
>>
>> 192.168.1.y:/gvol    2.6T  2.5T  149G  95%  /gluster
>>
>> but different values on the brick:
>>
>> zdata/brick          2.9T  1.6T  1.4T  54%  /zbrick
>>
>> I think this happened because glusterfs still has hard links to the
>> deleted files on node x? So I can find these files with:
>>
>> find /zbrick/.glusterfs -links 1 -ls | grep -v ' -> '
>>
>> But now I am lost. How can I verify that these files really belong to the
>> right container?
>> Or can I just delete these files, because there is no way
>> to access them? Or does glusterfs offer a way to resolve this situation?
>>
>> Mathias
>>
>> On 05.08.20 15:48, Mathias Waack wrote:
>>> Hi all,
>>>
>>> we are running a gluster setup with two nodes:
>>>
>>> Status of volume: gvol
>>> Gluster process                           TCP Port  RDMA Port  Online  Pid
>>> ------------------------------------------------------------------------------
>>> Brick 192.168.1.x:/zbrick                 49152     0          Y       13350
>>> Brick 192.168.1.y:/zbrick                 49152     0          Y       5965
>>> Self-heal Daemon on localhost             N/A       N/A        Y       14188
>>> Self-heal Daemon on 192.168.1.93          N/A       N/A        Y       6003
>>>
>>> Task Status of Volume gvol
>>> ------------------------------------------------------------------------------
>>> There are no active volume tasks
>>>
>>> The glusterfs hosts a bunch of containers with their data volumes. The
>>> underlying fs is zfs. A few days ago one of the containers created a lot
>>> of files in one of its data volumes, and in the end it completely
>>> filled up the space of the glusterfs volume. But this happened only on
>>> one host; on the other host there was still enough space. We finally
>>> were able to identify this container and found out that the sizes of the
>>> data under /zbrick were different on the two hosts for this container. Then
>>> we made the big mistake of deleting these files on both hosts in the
>>> /zbrick volume, not on the mounted glusterfs volume.
>>>
>>> Later we found the reason for this behavior: the network driver on the
>>> second node partially crashed (we were still able to log in on
>>> the node, so we assumed the network was running, but the card was
>>> already dropping packets at this time) at the same time as the failed
>>> container started to fill up the gluster volume. After rebooting the
>>> second node the gluster became available again.
>>>
>>> Now the glusterfs volume is running again - but it is still (nearly)
>>> full: the files created by the container are not visible, but they
>>> still count against the amount of free space. How can we fix this?
>>>
>>> In addition, there are some files which are no longer accessible since
>>> this accident:
>>>
>>> tail access.log.old
>>> tail: cannot open 'access.log.old' for reading: Input/output error
>>>
>>> It looks like the files affected by this error are those which were changed
>>> during the accident. Is there a way to fix this too?
>>>
>>> Thanks
>>> Mathias
>>>
>>> ________
>>>
>>> Community Meeting Calendar:
>>>
>>> Schedule -
>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> Bridge: https://bluejeans.com/441850968
>>>
>>> Gluster-users mailing list
>>> Gluster-users@xxxxxxxxxxx
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
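[Editor's note] The mechanism the thread is circling around can be sketched with plain POSIX tools. This is a simulation, not GlusterFS itself: the temp directory stands in for a brick like /zbrick, and both gfid names are made up. It shows why `find /zbrick/.glusterfs -links 1` surfaces exactly the problem entries: every regular file on a brick has a hard link under `.glusterfs/XX/YY/<gfid>`, so when the user-visible name is deleted directly on the brick (as happened here), only the gfid link survives and its link count drops to 1.

```shell
#!/bin/sh
# Simulated brick layout (NOT GlusterFS itself) -- a tempdir stands in
# for /zbrick, and the gfid file names below are invented for illustration.
set -eu
brick=$(mktemp -d)
mkdir -p "$brick/.glusterfs/b5/3c" "$brick/data"

# Healthy file: the gfid entry is a second hard link to the real name.
echo "payload" > "$brick/data/access.log"
gfid="$brick/.glusterfs/b5/3c/b53c8e46-068b-4286-94a6-7cf54f711983"
ln "$brick/data/access.log" "$gfid"

# File whose user-visible name was deleted directly on the brick:
# only the gfid hard link survives, so its link count drops to 1.
echo "orphan" > "$brick/.glusterfs/b5/3c/deadbeef-0000-4000-8000-000000000000"

# A healthy gfid entry can be resolved back to its path via the shared
# inode (prune .glusterfs so only the user-visible name is printed):
resolved=$(find "$brick" -path "$brick/.glusterfs" -prune -o \
                -samefile "$gfid" -print)
echo "resolved: $resolved"

# Orphaned gfid entries: regular files with link count 1. They occupy
# space but no longer have a name anywhere under the volume.
orphans=$(find "$brick/.glusterfs" -type f -links 1)
echo "orphans: $orphans"

rm -rf "$brick"
```

This is also why self-heal has nothing to report for Mathias's b53c8e46... entry: heal works from the gfid index toward the named file, and once the link count is already 1 there is no remaining path to restore, so the entry just consumes space silently.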