Re: ceph journal system vs filesystem journal system

한승진 <yongiman@xxxxxxxxx> · Mon, 5 Sep 2016 11:31:13 +0900

Hi huang jun,
Thanks for your reply.

I am still quite confused.

According to your reply, Ceph's journal operates as a full journal (meaning it holds both metadata and object data).

Is that right?

I tested journal I/O by running rados bench and tracing the journal device with blktrace.

[client node]
root@ceph-mon01:~# rados -p rados-test df
pool name                 KB      objects       clones     degraded      unfound           rd        rd KB           wr      B
rados-test            528384          129            0          258            0            0            0          129      4
  total used          658876          129
  total avail      156547604
  total space      157206480

[osd node - blktrace output]
CPU0 (8,16):
 Reads Queued:           0,        0KiB  Writes Queued:         821,   551804KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:      588,   529428KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:      704,   529428KiB
 Read Merges:            0,        0KiB  Write Merges:          117,     6144KiB
 Read depth:             0               Write depth:            32
 IO unplugs:           117               Timer unplugs:           0

The result also seems to say that the full data is written to the journal.
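
As a rough cross-check of the numbers above, a short Python sketch can compare the pool size reported by rados df with the amount written on the traced device. It assumes that device (8,16) is the journal device and that it received the writes for every object in the pool; the output above does not confirm either point by itself.

```python
# Figures copied from the rados df and blktrace output above.
pool_kb = 528384              # "KB" column for pool rados-test
objects = 129                 # "objects" column for pool rados-test
journal_written_kib = 529428  # "Writes Completed" on device (8,16)

# Average object size in the pool.
kb_per_object = pool_kb / objects

# How much more was written to the traced device than the pool holds.
extra_kib = journal_written_kib - pool_kb
ratio = journal_written_kib / pool_kb

print(f"average object size: {kb_per_object:.0f} KB")
print(f"device writes minus pool data: {extra_kib} KiB")
print(f"device writes / pool data: {ratio:.4f}")
```

Under those assumptions, the traced device received slightly more than the full object payload, which would be consistent with whole objects (plus a small overhead) passing through the journal rather than metadata alone.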

However, the Ceph documentation says:

Consistency: Ceph OSD Daemons require a filesystem interface that guarantees atomic compound operations.
Ceph OSD Daemons write a description of the operation to the journal and apply the operation to the filesystem

How should I interpret this passage from the documentation?

I would really appreciate your help.

Thanks.

2016-09-01 19:09 GMT+09:00 huang jun <hjwsm1989@xxxxxxxxx>:
2016-09-01 17:25 GMT+08:00 한승진 <yongiman@xxxxxxxxx>:

> Hi all.
>
> I'm very confused about the Ceph journal system.
>
> Some people say the Ceph journal works like a Linux journaling filesystem.
>
> Others say all data is written to the journal first and then written to
> the OSD data.
>
> Does the Ceph journal record only the metadata of an object, or all of the
> object's data?
>
> Which is right?
>

Data written to an OSD is first written to the OSD journal using direct
I/O (dio) and then submitted to the object store.
This improves small-file write performance because journal writes are
sequential rather than random.
The journal can also recover data that was written to the journal but not
yet written to the object store, for example after an outage.
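
The sequence described above can be illustrated with a toy Python model. This is only a sketch of the journal-then-apply idea, not Ceph's FileStore code; the class ToyOSD and its methods are hypothetical names made up for the example, and the in-memory lists stand in for real devices.

```python
class ToyOSD:
    """Toy model of journal-then-apply. Not Ceph's actual implementation."""

    def __init__(self):
        self.journal = []   # append-only log, models sequential journal writes
        self.store = {}     # stands in for the object store
        self.applied = 0    # count of journal entries already in the store

    def write(self, name, data):
        # Step 1: record the whole operation, payload included, in the journal.
        self.journal.append((name, data))

    def apply_pending(self):
        # Step 2: submit journaled operations to the object store.
        for name, data in self.journal[self.applied:]:
            self.store[name] = data
        self.applied = len(self.journal)

    def replay(self):
        # After an outage, re-apply entries that never reached the store.
        self.apply_pending()


osd = ToyOSD()
osd.write("obj1", b"aaaa")
osd.apply_pending()

# Simulated outage: obj2 reaches the journal but not the object store.
osd.write("obj2", b"bbbb")
missing_before_replay = "obj2" not in osd.store

osd.replay()
recovered = osd.store["obj2"]
print(missing_before_replay, recovered)
```

Because the journal entry in this sketch carries the payload as well as the object name, replay can rebuild the object without asking the client to resend it.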


> Thanks for your help
>
> Best regards.
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>

--
Thank you!
HuangJun
