Re: SSD recommendations for OSD journals

"Chen, Xiaoxi" <xiaoxi.chen@xxxxxxxxx> · Mon, 22 Jul 2013 15:28:10 +0000

Sent from my iPhone

On 2013-7-22, at 23:16, "Gandalf Corvotempesta" <gandalf.corvotempesta@xxxxxxxxx> wrote:

> 2013/7/22 Chen, Xiaoxi <xiaoxi.chen@xxxxxxxxx>:
>> With "journal writeahead", the data is first written to the journal,
>> acked to the client, and then written to the OSD. Note that the data is
>> always kept in memory until it has been written to both the OSD and the
>> journal, so the write to the OSD goes directly from memory. This mode
>> suits XFS and EXT4.
> 
> What happens in case of journal failure during a write (to the
> journal, not to the OSDs)?
> Is ceph smart enough to retry the write to another OSD/journal?
Yes and no. For btrfs the write to the OSD will succeed, but for xfs the write will fail and the OSD will suicide on EIO. The inter-OSD ping will then detect the OSD failure, and finally the CRUSH map will change. So if the client retries, the write will eventually go to another OSD.
> 
> And what happens in case of server failure during a write?
For btrfs, it's fine. For xfs, the OSD daemon needs to replay the journal when it starts up, to keep the OSD data consistent.
> 
> I'm asking this to evaluate if journals on RAM disks will be good or not.
For battery-backed RAM it should be fine, but with a normal RAM disk you will lose data, no matter whether you use xfs or btrfs.

Imagine that several writes have been flushed to the journal and acked, but not yet written to disk. Now the system crashes because of a kernel panic or power failure: you lose the data in the RAM disk, and thus lose data that was assumed to be successfully written.
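
To make the scenario above concrete, here is a rough toy model in Python. It is not Ceph code: the Osd class, its fields and the run() helper are all made up for illustration, and it only assumes the write-ahead behaviour described in this thread (journal first, ack, then data disk, replay on restart).

```python
class Osd:
    """Toy model of one OSD with a write-ahead journal (hypothetical, not Ceph code)."""

    def __init__(self, journal_is_volatile):
        # True models a plain RAM disk journal, False a durable one
        # (SSD or battery-backed RAM).
        self.journal_is_volatile = journal_is_volatile
        self.journal = []    # entries written to the journal device
        self.memory = []     # the same entries, still held in RAM
        self.data_disk = []  # entries that have reached the OSD data disk
        self.acked = []      # entries the client was told are safe

    def write(self, entry):
        # Write-ahead: journal first, then ack to the client.
        self.journal.append(entry)
        self.memory.append(entry)
        self.acked.append(entry)

    def flush(self):
        # Later, the data goes from memory to the data disk and the
        # corresponding journal entries can be dropped.
        self.data_disk.extend(self.memory)
        self.memory.clear()
        self.journal.clear()

    def crash_and_restart(self):
        # Kernel panic or power failure: RAM contents are gone.
        self.memory.clear()
        if self.journal_is_volatile:
            self.journal.clear()
        # On start-up, replay whatever survived in the journal.
        self.data_disk.extend(self.journal)
        self.journal.clear()

    def lost_writes(self):
        # Acked writes that never made it to the data disk.
        return [e for e in self.acked if e not in self.data_disk]


def run(journal_is_volatile):
    osd = Osd(journal_is_volatile)
    osd.write("a")
    osd.flush()           # "a" is safely on the data disk
    osd.write("b")
    osd.write("c")        # "b" and "c" are journaled and acked only
    osd.crash_and_restart()
    return osd.lost_writes()


print("RAM disk journal, lost writes:", run(True))
print("durable journal, lost writes:", run(False))
```

With the RAM disk journal, the acked writes "b" and "c" are gone after the restart. With a journal that survives the crash, replay puts them on the data disk and nothing is lost.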
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com