Re: PG inconsistency

On Thu, 6 Nov 2014, GuangYang wrote:
> Hello Cephers,
> Recently we observed a couple of inconsistencies in our Ceph cluster. 
> Two major patterns led to them, as far as I could see: 1) EIO when 
> reading the file, and 2) an inconsistent digest (for EC) even though 
> there is no read error.
> 
> While Ceph has built-in tool sets to repair the inconsistencies, I also 
> would like to check with the community on what the best way to handle 
> such issues is (e.g. should we run fsck / xfs_repair when such an 
> issue happens).
> 
> In more detail, I have the following questions:
> 1. When an inconsistency is detected, what is the chance that there is 
> a hardware issue which needs to be repaired physically, or should I 
> run some disk/filesystem tools to check further?

I'm not really an operator so I'm not as familiar with these tools as I 
should be :(, but I suspect the prudent route is to check the SMART info 
on the disk, and/or trigger a scrub of everything else on the OSD (ceph 
osd scrub N).  For DreamObjects, I think they usually just fail the OSD 
once it starts throwing bad sectors (most of the hardware is already 
reasonably aged).
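
A rough sketch of the two checks mentioned above, in case it helps: it only 
builds and runs `smartctl -H <device>` and `ceph osd scrub <id>`.  The 
function names, the OSD id and the device path are made up for 
illustration; which device backs which OSD is something you have to look 
up yourself.

```python
import subprocess


def osd_check_commands(osd_id, device):
    """Build the command lines for a SMART health check and an OSD scrub.

    osd_id and device are supplied by the caller (hypothetical values in
    the example below); nothing here discovers the OSD-to-disk mapping.
    """
    return [
        ["smartctl", "-H", device],             # overall SMART health verdict
        ["ceph", "osd", "scrub", str(osd_id)],  # ask the OSD to scrub its PGs
    ]


def run_checks(osd_id, device, run=subprocess.call):
    """Run both commands and return their exit statuses.

    subprocess.call is used (not check_call) because smartctl reports
    problems through a non-zero exit status, which should not raise here.
    """
    return [run(cmd) for cmd in osd_check_commands(osd_id, device)]


if __name__ == "__main__":
    # Hypothetical OSD id and device path.
    print(run_checks(12, "/dev/sdb"))
```

A non-zero status from smartctl is a hint to look at the full SMART output 
before trusting the disk any further.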

> 2. Should we use fsck / xfs_repair to fix the inconsistencies, or should 
> we solely rely on Ceph's repair tool sets?

That might not be a bad idea, but I would urge caution if xfs_repair finds 
any issues or makes any changes, as subtle changes to the fs contents can 
confuse ceph-osd.  At an absolute minimum, do a full scrub after, but 
even better would be to fail the OSD.
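
One cautious way to follow the advice above is to run xfs_repair in its 
no-modify mode (`-n`) first and decide from the result.  This is only a 
sketch under my own assumptions: the helper names are invented, "full 
scrub" is read as `ceph osd deep-scrub`, "fail the OSD" is read as marking 
it out, and the filesystem must already be unmounted (ceph-osd stopped) 
before xfs_repair is run.

```python
import subprocess


def xfs_dry_run(device, run=subprocess.call):
    """Run `xfs_repair -n` (no-modify mode) and return its exit status.

    With -n, xfs_repair exits with 1 if it detected corruption and 0 if
    it found none.  The filesystem must be unmounted first.
    """
    return run(["xfs_repair", "-n", device])


def follow_up(osd_id, dry_run_status):
    """Pick the next ceph command from the dry-run result (an assumption).

    Clean filesystem: still deep-scrub the OSD once it is back up.
    Problems found:   take the OSD out rather than repairing in place.
    """
    if dry_run_status == 0:
        return ["ceph", "osd", "deep-scrub", str(osd_id)]
    return ["ceph", "osd", "out", str(osd_id)]


if __name__ == "__main__":
    # Hypothetical device and OSD id.
    status = xfs_dry_run("/dev/sdb1")
    print(follow_up(12, status))
```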

(FWIW I think we should document a recommended "safe" process for 
failing/replacing an OSD that takes the suspect data offline but waits for 
the cluster to heal before destroying any data.  Simply marking the OSD 
out will work, but then when a fresh drive is added there will be a second 
repair/rebalance event, which isn't ideal.)
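
Until such a process is written down, the simple variant (mark the OSD 
out, then wait for the cluster to heal before touching the disk) could be 
scripted roughly like this.  It is a sketch, not a recommended procedure: 
the function names, poll interval and retry count are my own, and it does 
nothing about the second rebalance when the new drive goes in.

```python
import subprocess
import time


def ceph_health():
    """Return the first line of `ceph health`, e.g. 'HEALTH_OK'."""
    return subprocess.check_output(["ceph", "health"], text=True).strip()


def wait_until_healthy(get_health=ceph_health, poll_seconds=30,
                       max_polls=120, sleep=time.sleep):
    """Poll cluster health until it reports HEALTH_OK.

    Returns True once healthy, False if max_polls is exhausted first.
    """
    for _ in range(max_polls):
        if get_health().startswith("HEALTH_OK"):
            return True
        sleep(poll_seconds)
    return False


def retire_osd(osd_id, run=subprocess.check_call, **wait_kwargs):
    """Mark the OSD out, then wait for recovery to finish.

    Only when this returns True should the suspect disk be wiped or
    physically removed.
    """
    run(["ceph", "osd", "out", str(osd_id)])
    return wait_until_healthy(**wait_kwargs)


if __name__ == "__main__":
    # Hypothetical OSD id.
    if retire_osd(12):
        print("cluster healthy; safe to pull the disk")
    else:
        print("cluster still recovering; leave the data in place")
```

The point of returning False rather than raising on timeout is that the 
suspect data stays available as a last-resort copy while recovery is 
still in progress.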

sage

> 
> It would be great to hear your experience and suggestions.
> 
> BTW, we are using XFS in the cluster.
> 
> Thanks,
> Guang