Re: Ceph doesn't detect journal failure while the OSD is running

I used umount -l to bypass the warning.
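
For reference, here is a minimal C sketch (not Ceph code) of the
difference: a plain umount2() fails with EBUSY while the OSD still holds
the journal open, and -l corresponds to the MNT_DETACH flag, which
detaches the mount point immediately and finishes the unmount once the
last open file is closed. The /mnt/journal path is just an illustrative
placeholder, not taken from my setup.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

int main(void)
{
    const char *mnt = "/mnt/journal";   /* illustrative placeholder; needs root */

    if (umount2(mnt, 0) == 0) {
        printf("plain umount succeeded\n");
    } else if (errno == EBUSY && umount2(mnt, MNT_DETACH) == 0) {
        /* the 'device is busy' case: detach lazily, like umount -l */
        printf("busy: lazily detached instead\n");
    } else {
        fprintf(stderr, "umount2 %s: %s\n", mnt, strerror(errno));
        return 1;
    }
    return 0;
}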

I agree about the second one: that's normal behavior. Even though the
name doesn't exist anymore, the file is still held open by the OSD process.
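
To make that concrete, here is a standalone demonstration of the POSIX
behavior (journal.tmp is just a throwaway name for the example): after
unlink(2) the name is gone, exactly as with rm -rf /journals/*, but I/O
through the already-open descriptor keeps succeeding, so the writer
notices nothing.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[16] = {0};
    int fd = open("journal.tmp", O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    unlink("journal.tmp");            /* the name disappears here */

    if (write(fd, "still here", 10) != 10) perror("write");
    lseek(fd, 0, SEEK_SET);
    if (read(fd, buf, 10) != 10) perror("read");
    printf("read back: %s\n", buf);   /* prints "still here" */

    close(fd);                        /* the inode is freed only now */
    return 0;
}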
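
On the detection side: because write(2) keeps succeeding on an unlinked
file, checking I/O return codes alone would never catch this case. This
is not how ceph-osd works, just a sketch of a periodic health check
(journal_vanished is a hypothetical helper) that would notice an rm'd
journal through its link count; a dead disk would still have to be
caught separately, e.g. via EIO from write()/fsync().

#include <sys/stat.h>

/* Returns 1 if the file behind jfd has been unlinked, 0 if it still
 * has at least one name, -1 on fstat error. */
int journal_vanished(int jfd)
{
    struct stat st;
    if (fstat(jfd, &st) < 0)
        return -1;
    return st.st_nlink == 0;   /* every name removed: journal gone */
}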

On Sun, Aug 26, 2012 at 10:59 PM, Sage Weil <sage@xxxxxxxxxxx> wrote:
> On Sun, 26 Aug 2012, Sébastien Han wrote:
>> Hi Greg,
>>
>> My first test was:
>>
>> * put a journal on tmpfs
>> * umount the tmpfs
>
> This should have errored out with 'device is busy'.  Are you sure it
> actually umounted?
>
>> The second one was almost the same:
>>
>> * rm -rf /journals/*
>>
>> Here /journals contains every journal... (3 actually)
>
> That's normal Unix behavior.  The file doesn't go away until all names are
> unlinked/removed *and* all open file handles are closed...
>
> sage
>
>
>> This action wiped out the journals and Ceph didn't detect anything.
>>
>> After that I created a new pool, a new image inside it, mapped it,
>> formatted it and wrote data to it with dd. I also used rados bench to
>> write data.
>> Ceph didn't notice anything; only a 'service ceph restart osd' made
>> the OSDs crash.
>
>
>
>>
>>
>> On Sun, Aug 26, 2012 at 9:43 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> >
>> > On Sunday, August 26, 2012 at 11:09 AM, Sébastien Han wrote:
>> > > Hi guys!
>> > >
>> > > Ceph doesn't seem to detect a journal failure. The cluster keeps
>> > > writing data even if the journal doesn't exist anymore.
>> > > I can't find anywhere in the logs or in the ceph command output any
>> > > information about a journal failure. Obviously, if an OSD is restarted
>> > > Ceph will complain, but a failure on the fly won't be detected.
>> > >
>> > > It seems that Ceph just writes directly to the backend filesystem
>> > > without complaining.
>> > >
>> > > Yes, my monitoring system will tell me that the disk which contains
>> > > my journals is down... but that's not the point here :)
>> > >
>> > > The good point for me is that the cluster keeps running even if
>> > > the journal is gone. The bad point, obviously, is that the cluster
>> > > keeps writing data to the backend filesystem (without O_DIRECT I
>> > > guess...). I'd prefer a 'read-only' cluster facility while the
>> > > journal is down: being able to retrieve the data is as crucial as
>> > > writing it.
>> > >
>> > > Any reaction to this? A roadmap feature, maybe?
>> > So are you saying the filesystem that the journal was located on disappeared? Or the underlying disk disappeared?
>> > And then the OSD didn't notice?
>> >
>> > If so, that's definitely a problem to be corrected as soon as we can.
>> > It's more likely to make the OSD shut down than to continue serving
>> > reads, though.
>> > -Greg
>> >

