Re: Ceph doesn't detect journal failure while the OSD is running

On Sunday, August 26, 2012 at 11:09 AM, Sébastien Han wrote:
> Hi guys!
>  
> Ceph doesn't seem to detect a journal failure. The cluster keeps
> writing data even if the journal doesn't exist anymore.
> I can't find any information about a journal failure anywhere in the
> logs or in the output of the ceph command. Obviously, if an OSD is
> restarted Ceph will complain, but a failure on the fly won't be detected.
>  
> It seems that Ceph just writes directly to the backend filesystem
> without complaining.
>  
> Yes my monitoring system will tell me that the disk which contains my
> journals is down... But it's not the point here :)
>  
> The really good point for me is that the cluster keeps running even if
> the journal is gone. The bad point is obviously that the cluster keeps
> writing data to the backend filesystem (without O_DIRECT I guess...).
> I'd prefer a 'read only' cluster facility while the journal is down.
> Being able to retrieve the data is as crucial as writing data.
>  
> Any reaction about that? Roadmap feature maybe?
So are you saying the filesystem that the journal was located on disappeared? Or the underlying disk disappeared?
And then the OSD didn't notice?

If so, that's definitely a problem to be corrected as soon as we can. The fix is more likely to make the OSD shut down than to continue serving reads, though.
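
Until the OSD notices this by itself, an external check could probe the journal from the outside. The following is only a rough sketch and not anything Ceph provides: the function name journal_readable and the journal path in the usage note are assumptions for illustration. It simply tries to open the path and read a few bytes, treating any OS-level error as a failure.

```python
import os

def journal_readable(path, nbytes=4096):
    """Return True if `path` can be opened and its first bytes read.

    Hypothetical helper for an external monitoring check; it is not
    part of Ceph and does not talk to the OSD.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return False  # path vanished or is not accessible
    try:
        os.read(fd, nbytes)
        return True
    except OSError:
        return False  # e.g. EIO from a failed device
    finally:
        os.close(fd)
```

It could be called as journal_readable("/var/lib/ceph/osd/ceph-0/journal"), where the path is an assumed location and should be replaced with the configured "osd journal" value. Note that a buffered read like this may be served from the page cache, so the check can miss a device that has already failed.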
-Greg
