OK thanks! Will change that.
On 31.12.2012 20:21, Samuel Just wrote:
ceph-osd relies on filesystem barriers for correctness. You will want
to remove the nobarrier mount option to prevent future corruption.
-Sam
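
(A minimal sketch of the change, assuming the OSD data partition is
/dev/sdb1 mounted at /var/lib/ceph/osd/ceph-14; device and path are
placeholders, adjust them to your layout:

  # /etc/fstab: drop "nobarrier" so XFS write barriers stay enabled
  /dev/sdb1 /var/lib/ceph/osd/ceph-14 xfs noatime,nodiratime,logbufs=8,logbsize=256k 0 0

Older kernels may not apply a barrier change via "mount -o remount", so
the safe path is to stop the OSD and do a full umount/mount, here with
the sysvinit script:

  # service ceph stop osd.14
  # umount /var/lib/ceph/osd/ceph-14
  # mount /var/lib/ceph/osd/ceph-14
  # service ceph start osd.14
)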
On Mon, Dec 31, 2012 at 3:59 AM, Stefan Priebe <s.priebe@xxxxxxxxxxxx> wrote:
On 31.12.2012 02:10, Samuel Just wrote:
Are you using xfs? If so, what mount options?
Yes,
noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k
Stefan
On Dec 30, 2012 1:28 PM, "Stefan Priebe" <s.priebe@xxxxxxxxxxxx> wrote:
>
> On 30.12.2012 19:17, Samuel Just wrote:
>>
>> This is somewhat more likely to have been a bug in the replication
>> logic (there were a few fixed between 0.53 and 0.55). Had there been
>> any recent OSD failures?
>
> Yes, I was stressing Ceph with failures (power, link, disk, ...).
>
> Stefan
>
>> On Dec 24, 2012 10:55 PM, "Sage Weil" <sage@xxxxxxxxxxx> wrote:
>>
>> On Tue, 25 Dec 2012, Stefan Priebe wrote:
>> > Hello list,
>> >
>> > today I got the following ceph status output:
>> > 2012-12-25 02:57:00.632945 mon.0 [INF] pgmap v1394388: 7632 pgs: 7631 active+clean, 1 active+clean+inconsistent; 151 GB data, 307 GB used, 5028 GB / 5336 GB avail
>> >
>> >
>> > I then grepped for the inconsistent pg:
>> > # ceph pg dump - | grep inconsistent
>> > 3.ccf 10 0 0 0 41037824 155930 155930 active+clean+inconsistent 2012-12-25 01:51:35.318459 6243'2107 6190'9847 [14,42] [14,42] 6243'2107 2012-12-25 01:51:35.318436 6007'2074 2012-12-23 01:51:24.386366
>> >
>> > and initiated a repair:
>> > # ceph pg repair 3.ccf
>> > instructing pg 3.ccf on osd.14 to repair
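>> >
>> > (To watch the repair as it runs, assuming the stock CLI tools, one
>> > can stream the cluster log or query the pg directly:
>> >
>> >   # ceph -w
>> >   # ceph pg 3.ccf query
>> >
>> > The first shows the scrub/repair [ERR] and [INF] lines as they
>> > arrive; the second dumps per-pg state, including scrub stamps.)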
>> >
>> > The log output then was:
>> > 2012-12-25 02:56:59.056382 osd.14 [ERR] 3.ccf osd.42 missing 1c602ccf/rbd_data.4904d6b8b4567.0000000000000b84/head//3
>> > 2012-12-25 02:56:59.056385 osd.14 [ERR] 3.ccf osd.42 missing ceb55ccf/rbd_data.48cc66b8b4567.0000000000001538/head//3
>> > 2012-12-25 02:56:59.097989 osd.14 [ERR] 3.ccf osd.42 missing dba6bccf/rbd_data.4797d6b8b4567.00000000000015ad/head//3
>> > 2012-12-25 02:56:59.097991 osd.14 [ERR] 3.ccf osd.42 missing a4deccf/rbd_data.45f956b8b4567.00000000000003d5/head//3
>> > 2012-12-25 02:56:59.098022 osd.14 [ERR] 3.ccf repair 4 missing, 0 inconsistent objects
>> > 2012-12-25 02:56:59.098046 osd.14 [ERR] 3.ccf repair 4 errors, 4 fixed
>> >
>> > Why doesn't Ceph repair this automatically? How could this happen
>> > at all?
>>
>> We just made some fixes to repair in next (it was broken sometime
>> between ~0.53 and 0.55). The latest next should repair it. In general
>> we don't repair automatically lest we inadvertently propagate bad data
>> or paper over a bug.
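>>
>> (Before retrying the repair it may be worth confirming which version
>> the osds are actually running; a sketch, assuming the default admin
>> socket path:
>>
>>   # ceph -v
>>   # ceph --admin-daemon /var/run/ceph/ceph-osd.14.asok version
>>
>> The first prints the local client version, the second asks the
>> running osd.14 itself.)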
>>
>> As for the original source of the missing objects... I'm not sure.
>> There were some fixed races related to backfill that could lead to an
>> object being missed, but Sam would know more about how likely that
>> actually is.
>>
>> sage