This is the reason behind my earlier questions re replacing a node.
I had *yet another* disk failure. However, I have now replaced all the desktop drives on the server (WD Blacks & Blues) with WD Reds, which are rated for 24/7 NAS operation. Two nodes are 4 * WD Red in ZFS RAID10, one node has 8 SAS drives in RAID10.
However :( the ZFS resilver on node vng revealed data corruption in one file:
/tank/vmdata/datastore4/.glusterfs/d7/0f/d70f39ea-e831-45ef-b2bc-899d921ea572
However, checking its hard link, it's not linked to a data file but rather to what was presumably a shard of a data file.
Checking the shard directory, there are 1024 64 MB "719041d0-d755-4bc6-a5fc-6b59071fac17.*" files. Checking the actual gluster mount, there are no files with a gfid of "719041d0-d755-4bc6-a5fc-6b59071fac17". All three nodes are the same in this regard.
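As far as I understand it, the shard file names under /.shard are the gfid of the base file, and every file on a brick carries its gfid in the trusted.gfid xattr, so in theory I could walk the brick data and look for a file whose gfid matches - which would confirm whether the base file is really gone. Something like this rough C sketch (run against brick paths, not the fuse mount; the 16-byte-xattr assumption is my reading of how gluster stores it):

/* gfid_of.c - print the trusted.gfid xattr of a file on a brick.
 * Rough sketch only; trusted.* xattrs are visible on the bricks,
 * not through the fuse mount.
 * Build: gcc gfid_of.c -o gfid_of
 */
#include <stdio.h>
#include <sys/xattr.h>  /* lgetxattr */

int main(int argc, char **argv)
{
    unsigned char gfid[16];

    if (argc < 2) {
        fprintf(stderr, "usage: %s <brick-path>\n", argv[0]);
        return 1;
    }

    /* trusted.gfid should be a 16-byte binary UUID on every brick file */
    if (lgetxattr(argv[1], "trusted.gfid", gfid, sizeof(gfid)) != 16) {
        perror("lgetxattr");
        return 1;
    }

    printf("%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
           "%02x%02x%02x%02x%02x%02x\n",
           gfid[0], gfid[1], gfid[2], gfid[3], gfid[4], gfid[5],
           gfid[6], gfid[7], gfid[8], gfid[9], gfid[10], gfid[11],
           gfid[12], gfid[13], gfid[14], gfid[15]);
    return 0;
}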
1. I'm not too concerned about it, as it seems to be the result of straight-out undetected disk corruption - a result of a crappy setup on our part, since corrected.
2. I have no idea what image file was originally represented by these shards, or whether it's possible to find out. After a quick check they all appear to be ok.
3. Not sure what to do about it - should I just delete the shards?
On a broader note, I'm contemplating putting together some scripts to do basic integrity checks on a sharded setup:
- Check for orphaned shards (rough sketch below)
- Check for files missing shards (is that possible?)
- Anything else?
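For the orphaned-shard case, my rough idea would be to walk /.shard on each brick and, for every <gfid>.<n> entry, check whether the brick still has a .glusterfs/<aa>/<bb>/<gfid> entry for the base file. Something like the sketch below - the layout assumptions (shard names being <base-gfid>.<index>, and a live base file always having a .glusterfs entry) are my reading of how shard/gfid storage works, so treat it as a starting point rather than anything authoritative:

/* shard_orphans.c - rough sketch of an orphaned-shard check on one brick.
 * Build: gcc shard_orphans.c -o shard_orphans
 * Usage: ./shard_orphans /tank/vmdata/datastore4
 */
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    char shard_dir[4096], gfid[64], link_path[4096];
    struct stat st;
    struct dirent *de;
    DIR *d;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <brick-root>\n", argv[0]);
        return 1;
    }

    snprintf(shard_dir, sizeof(shard_dir), "%s/.shard", argv[1]);
    d = opendir(shard_dir);
    if (!d) {
        perror("opendir .shard");
        return 1;
    }

    while ((de = readdir(d)) != NULL) {
        const char *dot = strrchr(de->d_name, '.');
        size_t len = dot ? (size_t)(dot - de->d_name) : 0;

        /* skip ".", ".." and anything that isn't <36-char-gfid>.<index> */
        if (len != 36)
            continue;

        memcpy(gfid, de->d_name, len);
        gfid[len] = '\0';

        /* the base file should exist as .glusterfs/aa/bb/<gfid> */
        snprintf(link_path, sizeof(link_path),
                 "%s/.glusterfs/%.2s/%.2s/%s",
                 argv[1], gfid, gfid + 2, gfid);

        if (stat(link_path, &st) != 0)
            printf("possible orphan: %s/%s\n", shard_dir, de->d_name);
    }

    closedir(d);
    return 0;
}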
I'd also like to write a service (C/C++) that can do an online scrub (md5 check) of shards. Would that be possible via the gfapi, or is it too high level for that?
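For what it's worth, the basic read-and-checksum plumbing over gfapi would look something like the sketch below (volume name, server and path are just placeholders, and the md5 comes from OpenSSL). What I don't know is whether /.shard is even visible through the client stack, where the shard xlator sits - that's really my question above.

/* shard_md5.c - minimal sketch of reading a file over gfapi and
 * md5-summing it; only shows the read/checksum plumbing.
 * Build: gcc shard_md5.c -o shard_md5 -lgfapi -lcrypto
 * Usage: ./shard_md5 <volume> <server> <path-on-volume>
 */
#include <fcntl.h>
#include <stdio.h>
#include <glusterfs/api/glfs.h>
#include <openssl/md5.h>

int main(int argc, char **argv)
{
    unsigned char buf[1 << 16], digest[MD5_DIGEST_LENGTH];
    MD5_CTX ctx;
    ssize_t n;
    int i;

    if (argc < 4) {
        fprintf(stderr, "usage: %s <volume> <server> <path>\n", argv[0]);
        return 1;
    }

    /* connect to the volume as a gfapi client */
    glfs_t *fs = glfs_new(argv[1]);
    glfs_set_volfile_server(fs, "tcp", argv[2], 24007);
    if (glfs_init(fs) != 0) {
        perror("glfs_init");
        return 1;
    }

    glfs_fd_t *fd = glfs_open(fs, argv[3], O_RDONLY);
    if (!fd) {
        perror("glfs_open");
        glfs_fini(fs);
        return 1;
    }

    /* stream the file through md5 */
    MD5_Init(&ctx);
    while ((n = glfs_read(fd, buf, sizeof(buf), 0)) > 0)
        MD5_Update(&ctx, buf, (size_t)n);
    MD5_Final(digest, &ctx);

    for (i = 0; i < MD5_DIGEST_LENGTH; i++)
        printf("%02x", digest[i]);
    printf("  %s\n", argv[3]);

    glfs_close(fd);
    glfs_fini(fs);
    return 0;
}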
Thanks,
-- Lindsay Mathieson