On Thu, Mar 1, 2012 at 10:07 AM, Oliver Francke <Oliver.Francke@xxxxxxxx> wrote:
> Well,
>
> Am 01.03.2012 um 18:15 schrieb Oliver Francke:
>
>> Hi *,
>>
>> after some crashes we still had to care for some remaining inconsistencies reported via
>> ceph -w
>> and friends.
>> Well, we traced one of them down via
>> ceph pg dump
>>
>> and we picked pg=79.7 and found the corresponding file via /var/log/ceph/osd.2.log:
>> /data/osd4/current/79.7_head/rb.0.0.00000000136c__head_9FB2FA17
>> and the dup on
>> /data/osd2/...
>> Strange though, they had the same checksum but reported a stat error. Anyway, we decided to do a:
>> ceph pg repair 79.7
>> ... bye-bye ceph-osd on node2!
>>
>> Here the trace:
>>
>> === 8-< ===
>>
>> 2012-03-01 17:49:13.024571 7f3944584700 -- 10.10.10.14:6802/4892 >> 10.10.10.10:6802/19139 pipe(0xfcd2c80 sd=16 pgs=0 cs=0 l=0).connect protocol version mismatch, my 9 != 0
>> 2012-03-01 17:49:23.674162 7f395001b700 log [ERR] : 79.7 osd.4: soid 9fb2fa17/rb.0.0.00000000136c/head extra attr _, extra attr snapset
>
> One clarification we worked out ourselves: one copy is missing the xattrs, checked via
> getfattr
> but why can't it be corrected, and, worse, why does this crash happen?

You've got a lot of odd things going on here, some of which are obviously connected and some of which aren't.

Right now Ceph doesn't automatically handle conflicts like missing xattrs because doing so with 2x replication is really hard, and even when you have more copies (and can do some form of voting) you have to write some complicated code to make it actually happen. :) At some point in the future it will, but for now we really want the attention anyway.

So the reason it's crashing is that it lost the "_" xattr, which we could handle, except... losing that xattr is really, really bad. What backing filesystem are you using? Are you using snapshots?

>> === 8-< ===
>>
>> ... after some journal replay things calmed down, but:
>>
>> 2012-03-01 17:58:29.470446 log 2012-03-01 17:58:24.242369 osd.2 10.10.10.14:6801/3111 368 : [WRN] bad locator @56 on object @79 loc @56 op osd_op(client.44350.0:1412387 rb.0.0.00000000136c [write 2465792~49152] 56.9fb2fa17) v4

Does the bad locator always look like that, with the @56 and @79 values?
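
For anyone who wants to redo the comparison by hand, here is a minimal shell sketch of the checks described above (checksum, stat, xattrs on both replicas). The osd.4 path is the one quoted in the mail; the osd.2 path was elided ("/data/osd2/...") in the original message, so the full path below is only a guess that mirrors the osd.4 layout -- adjust it, and run each check on whichever host actually holds that OSD's data directory:

check() {
    echo "== $1"
    md5sum "$1"                     # data checksum -- these matched for the poster
    stat -c '%s %y' "$1"            # size and mtime, since a stat error was reported
    getfattr -d -m '.' -e hex "$1"  # dump all xattrs; the bad copy lacks "_" and "snapset"
}

check /data/osd4/current/79.7_head/rb.0.0.00000000136c__head_9FB2FA17
# Assumed path -- "/data/osd2/..." was truncated in the mail above:
check /data/osd2/current/79.7_head/rb.0.0.00000000136c__head_9FB2FA17

The -m '.' just makes getfattr list every xattr namespace instead of only user.*, so nothing is missed regardless of how the attributes happen to be stored on the backing filesystem.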