The ceph-osd daemon relies on fs barriers for correctness. You will want to
remove the nobarrier option to prevent future corruption.
-Sam

On Mon, Dec 31, 2012 at 3:59 AM, Stefan Priebe <s.priebe@xxxxxxxxxxxx> wrote:
> On 31.12.2012 02:10, Samuel Just wrote:
>> Are you using xfs? If so, what mount options?
>
> Yes,
> noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k
>
> Stefan
>
>> On Dec 30, 2012 1:28 PM, "Stefan Priebe" <s.priebe@xxxxxxxxxxxx> wrote:
>>> On 30.12.2012 19:17, Samuel Just wrote:
>>>> This is somewhat more likely to have been a bug in the replication
>>>> logic (there were a few fixed between 0.53 and 0.55). Had there been
>>>> any recent osd failures?
>>>
>>> Yes, I was stressing Ceph with failures (power, link, disk, ...).
>>>
>>> Stefan
>>>
>>>> On Dec 24, 2012 10:55 PM, "Sage Weil" <sage@xxxxxxxxxxx> wrote:
>>>>> On Tue, 25 Dec 2012, Stefan Priebe wrote:
>>>>>> Hello list,
>>>>>>
>>>>>> today I got the following ceph status output:
>>>>>> 2012-12-25 02:57:00.632945 mon.0 [INF] pgmap v1394388: 7632 pgs:
>>>>>> 7631 active+clean, 1 active+clean+inconsistent; 151 GB data,
>>>>>> 307 GB used, 5028 GB / 5336 GB avail
>>>>>>
>>>>>> I then grepped for the inconsistent pg:
>>>>>> # ceph pg dump - | grep inconsistent
>>>>>> 3.ccf  10  0  0  0  41037824  155930  155930
>>>>>>   active+clean+inconsistent  2012-12-25 01:51:35.318459
>>>>>>   6243'2107  6190'9847  [14,42]  [14,42]  6243'2107
>>>>>>   2012-12-25 01:51:35.318436  6007'2074  2012-12-23 01:51:24.386366
>>>>>>
>>>>>> and initiated a repair:
>>>>>> # ceph pg repair 3.ccf
>>>>>> instructing pg 3.ccf on osd.14 to repair
>>>>>>
>>>>>> The log output then was:
>>>>>> 2012-12-25 02:56:59.056382 osd.14 [ERR] 3.ccf osd.42 missing
>>>>>> 1c602ccf/rbd_data.4904d6b8b4567.0000000000000b84/head//3
>>>>>> 2012-12-25 02:56:59.056385 osd.14 [ERR] 3.ccf osd.42 missing
>>>>>> ceb55ccf/rbd_data.48cc66b8b4567.0000000000001538/head//3
>>>>>> 2012-12-25 02:56:59.097989 osd.14 [ERR] 3.ccf osd.42 missing
>>>>>> dba6bccf/rbd_data.4797d6b8b4567.00000000000015ad/head//3
>>>>>> 2012-12-25 02:56:59.097991 osd.14 [ERR] 3.ccf osd.42 missing
>>>>>> a4deccf/rbd_data.45f956b8b4567.00000000000003d5/head//3
>>>>>> 2012-12-25 02:56:59.098022 osd.14 [ERR] 3.ccf repair 4 missing,
>>>>>> 0 inconsistent objects
>>>>>> 2012-12-25 02:56:59.098046 osd.14 [ERR] 3.ccf repair 4 errors, 4 fixed
>>>>>>
>>>>>> Why doesn't ceph repair this automatically? How could this happen
>>>>>> at all?
>>>>>
>>>>> We just made some fixes to repair in next (it was broken sometime
>>>>> between ~0.53 and 0.55). The latest next should repair it. In general
>>>>> we don't repair automatically lest we inadvertently propagate bad
>>>>> data or paper over a bug.
>>>>>
>>>>> As for the original source of the missing objects... I'm not sure.
>>>>> There were some fixed races related to backfill that could lead to an
>>>>> object being missed, but Sam would know more about how likely that
>>>>> actually is.
>>>>>
>>>>> sage
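
To make the suggestion at the top concrete, dropping nobarrier on a single
OSD could look roughly like the sketch below. The mount options are the
ones Stefan listed minus nobarrier; the device path, mount point and the
sysvinit invocation are placeholders and have to be adapted to the actual
deployment:

    service ceph stop osd.14                  # stop the OSD before touching its mount
    umount /var/lib/ceph/osd/ceph-14          # placeholder mount point
    # remount with the same options minus nobarrier, so write barriers are on again
    mount -t xfs -o noatime,nodiratime,logbufs=8,logbsize=256k \
        /dev/sdb1 /var/lib/ceph/osd/ceph-14   # /dev/sdb1 is a placeholder device
    service ceph start osd.14

The same change also has to land in /etc/fstab (or in "osd mount options
xfs" in ceph.conf, if the disks are mounted from there), otherwise
nobarrier comes back on the next reboot.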
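
And the detect/repair cycle discussed in the thread, end to end, is
roughly the following. The pg id and osd numbers are just the values from
above, and the commands are the stock ceph CLI of that era, so double-check
them against the installed version:

    ceph pg dump - | grep inconsistent   # find inconsistent pgs, as Stefan did above
    ceph pg map 3.ccf                    # show the up/acting osds, here [14,42]
    ceph pg repair 3.ccf                 # same repair command, run on a fixed build
    ceph pg scrub 3.ccf                  # re-scrub afterwards to confirm the pg is clean
    ceph -s                              # the inconsistent pg count should drop back to 0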