Re: HDD bad sector, pg inconsistent, no object remapping

Hi David,

Thanks for taking the time to look into this for us.

Isn't the checksum calculated over the data? If so, wouldn't it
then be easy for ceph to tell which copy is good (because the
checksum matches) and so an automatic repair should be possible?
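
To make the question concrete, here is a minimal sketch of the distinction I have in mind. It is not Ceph code: the function names, the use of CRC32, and the idea of a checksum recorded at write time are all assumptions for illustration. If a reference checksum is stored with the object, the good copy can be picked out. If replicas are only compared against each other, a mismatch can be detected but not attributed.

```python
import zlib
from typing import Dict, List


def crc(data: bytes) -> int:
    """CRC32 of an object's data (stand-in for whatever checksum is used)."""
    return zlib.crc32(data)


def good_replicas(replicas: Dict[str, bytes], stored_crc: int) -> List[str]:
    """Hypothetical: with a checksum recorded at write time, any replica
    whose data still matches it can be treated as a good copy."""
    return sorted(name for name, data in replicas.items()
                  if crc(data) == stored_crc)


def replicas_disagree(replicas: Dict[str, bytes]) -> bool:
    """Hypothetical: with no reference checksum, comparing replicas to each
    other only shows that they differ, not which one is corrupt."""
    return len({crc(data) for data in replicas.values()}) > 1


if __name__ == "__main__":
    original = b"hello world"
    replicas = {"osd.0": b"hello wor1d",   # silently corrupted copy
                "osd.1": original}
    print(replicas_disagree(replicas))             # mismatch is detected
    print(good_replicas(replicas, crc(original)))  # good copy is identified
```

With only two differing copies and no stored reference, the first check is all that is available, which I take to be the situation David describes below.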

Is the lack of this functionality once again just a matter of
having sufficient tuits?

Cheers,

Chris

On Mon, Nov 18, 2013 at 04:39:37PM -0800, David Zafman wrote:
> 
> I looked at the code.  The automatic repair should handle getting an EIO during read of the object replica.  It does NOT require removing the object as I said before, so it doesn’t matter which copy has bad sectors.  It will copy from a good replica to the primary, if necessary.  By default a deep-scrub which would catch this case is performed weekly.  A repair must be initiated by administrative action.
> 
> When replicas differ due to comparison of checksums, we currently don’t have a way to determine which copy(s) are corrupt.  This is where a manual intervention may be necessary if the administrator can determine which copy(s) are bad.
> 
> David Zafman
> Senior Developer
> http://www.inktank.com
> 
> 
> 
> 
> On Nov 18, 2013, at 1:11 PM, Chris Dunlop <chris@xxxxxxxxxxxx> wrote:
> 
>> OK, that's good (as far as it goes, being a manual process).
>> 
>> So then, back to what I think was Mihály's original issue:
>> 
>>> pg repair or deep-scrub can not fix this issue. But if I
>>> understand correctly, osd has to known it can not retrieve
>>> object from osd.0 and need to be replicate an another osd
>>> because there is no 3 working replicas now.
>> 
>> Given a bad checksum and/or read error tells ceph that an object
>> is corrupt, it would seem to be a natural step to then have ceph
>> automatically use another good-checksum copy, and even rewrite
>> the corrupt object, either in normal operation or under a scrub
>> or repair.
>> 
>> Is there a reason this isn't done, apart from lack of tuits?
>> 
>> Cheers,
>> 
>> Chris




