On Sun, Mar 3, 2013 at 4:36 AM, Xing Lin <xinglin@xxxxxxxxxxx> wrote:
> Hi,
>
> There were some discussions about this before on the mailing list, but I
> am still confused. I thought Ceph would flush data from the journal to
> disk only when the journal is full or when a scheduled synchronization
> comes due. In my test experiment, I used 24 OSDs (one OSD per disk),
> with a 10 GB tmpfs file as the journal for each OSD. To deliberately
> delay synchronization between the journal and disk, I increased
> 'journal min sync interval' to 60 s and 'journal max sync interval' to
> 300 s. Then I created an rbd image and ran a 4 MB sequential write
> workload with fio for 30 seconds. I expected no IO to reach the disks
> until we had written 240 GB of data (10 GB * 24). However, iostat showed
> data being written to the disks (at about 20 MB/s per disk) right after
> I started the sequential workload. Could someone help explain this?

The "journal [min|max] sync interval" values specify how frequently the
OSD's "FileStore" sends a sync to the disk. However, data is still
written into the normal filesystem as it comes in, and the normal
filesystem continues to schedule its own dirty-data writeouts. This is
good -- it means that when we do send a sync down, you don't need to
wait for all (30 seconds * 100 MB/s) 3 GB or whatever of data to go to
disk before it completes.

> I am running 0.48.2. The related configuration is as follows.

If you're starting up a new cluster, I recommend upgrading to the
bobtail series (0.56.3) instead of using Argonaut -- it has a number of
enhancements you'll appreciate!
-Greg
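
P.S. For anyone following along, here is a rough sketch of the kind of
ceph.conf stanza Xing describes. The option names below are copied from
his message rather than verified against a release, so check them
against the documentation for your version:

    [osd]
        ; delay FileStore syncs (values in seconds, per Xing's test)
        journal min sync interval = 60
        journal max sync interval = 300

Keep in mind these only govern when the FileStore issues its sync call.
In between syncs, the kernel's own writeback settings (on most Linux
systems, knobs like vm.dirty_expire_centisecs and
vm.dirty_background_ratio, which typically start flushing dirty pages
within tens of seconds) decide when dirty pages begin hitting the disk,
which is why iostat shows traffic almost immediately.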