Repair after accident

Hi all,

We are running a gluster setup with two nodes:

Status of volume: gvol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.1.x:/zbrick                   49152     0          Y       13350
Brick 192.168.1.y:/zbrick                   49152     0          Y       5965
Self-heal Daemon on localhost               N/A       N/A        Y       14188
Self-heal Daemon on 192.168.1.93            N/A       N/A        Y       6003

Task Status of Volume gvol
------------------------------------------------------------------------------
There are no active volume tasks

The glusterfs volume hosts a number of containers with their data volumes. The underlying filesystem is ZFS. A few days ago one of the containers created a large number of files in one of its data volumes and eventually filled up the glusterfs volume completely. This happened on only one host, however; the other host still had enough space. We were eventually able to identify the container and found that the size of its data on /zbrick differed between the two hosts. We then made the big mistake of deleting these files directly in /zbrick on both hosts, rather than on the mounted glusterfs volume.

Later we found the reason for this behavior: the network driver on the second node partially crashed at the same time as the failing container started to fill up the gluster volume. (We were still able to log in to the node, so we assumed the network was working, but the card was already dropping packets by then.) After rebooting the second node, the gluster volume became available again.

Now the glusterfs volume is running again, but it is still (nearly) full: the files created by the container are no longer visible, but they still count against the free space. How can we fix this?
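
To make the question more concrete, here is a minimal sketch of how the leftover data might be located. It assumes that the brick keeps a hard link for every regular file under /zbrick/.glusterfs, so that a file deleted directly on the brick leaves behind an entry there with a link count of 1. The function name list_single_link_files is made up for illustration. Gluster's own internal files can also have a single link, so the output is only a list of candidates to inspect and not something to delete blindly.

```shell
#!/bin/sh
# List regular files below a directory that have exactly one hard link.
# Assumption: on a brick, entries under .glusterfs normally share an inode
# with the user-visible file, so a link count of 1 hints at a leftover copy.
# Gluster's own bookkeeping files may show up too; inspect before acting.
list_single_link_files() {
    dir="$1"
    find "$dir" -type f -links 1 -print
}

# Example (brick path from this report; would be run on each node):
# list_single_link_files /zbrick/.glusterfs
```

Comparing the total size of the listed files with the space that is missing on the volume would show whether these entries account for it.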

In addition there are some files which are no longer accessible since this accident:

tail access.log.old
tail: cannot open 'access.log.old' for reading: Input/output error

The affected files appear to be the ones that were changed during the accident. Is there a way to fix this too?

Thanks
    Mathias

