On Sat, Jan 11, 2014 at 11:18 PM, Dong Yuan <yuandong1222@xxxxxxxxx> wrote:
> It is not only for consistency between memory and disk. The key point
> is to implement the atomicity of a transaction.
>
> That is, when a transaction needs to write an object and update the
> pglog at the same time, we must make sure the two IOs are either both
> done or neither is.
>
> With the journal, when the OSD recovers from a failure, the replay
> process can redo the transaction. I think that is why the journal
> cannot be disabled.

Hmm, I missed it. The journal is what guarantees the atomicity of a
transaction. Thanks!

>
> On 11 January 2014 13:24, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>> On Fri, Jan 10, 2014 at 11:13 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>> Exactly. We can't do a safe update without a journal: what if power
>>> goes out while the write is happening? When we boot back up, we don't
>>> know what version the object is actually at. So if you're using btrfs,
>>> you can run without a journal already (and depend on snapshots for
>>> recovering after failures); if you are using xfs or ext4 a journal is
>>> required for any safety at all, even when it's fronted by a cache
>>> pool.
>>
>> I don't fully agree with that. Why can't we call "fdatasync()" during
>> each transaction to ensure consistency if there is a cache in front?
>>
>>>
>>> On Thu, Jan 9, 2014 at 7:08 PM, Dong Yuan <yuandong1222@xxxxxxxxx> wrote:
>>>> The journal is part of the implementation of the ObjectStore
>>>> Transaction Interface, and the PG uses a transaction to write the
>>>> pglog together with the object data in one transaction.
>>>> So I think if the FileJournal could be disabled, there must be
>>>> something else to implement the Transaction Interface. But that seems
>>>> hard, since in my opinion no local file system provides such a function.
>>>>
>>>>
>>>> On 10 January 2014 10:04, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>>>>> On Fri, Jan 10, 2014 at 1:28 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>>>>>
>>>>>> The FileJournal is also for data safety whenever we're using write
>>>>>> ahead. To disable it we need a backing store that we know can provide
>>>>>> us consistent checkpoints (i.e., we can use parallel journaling mode;
>>>>>> so for the FileJournal, we're using btrfs, or maybe zfs someday). But
>>>>>> for those systems you can already configure the system not to use a
>>>>>> journal.
>>>>>
>>>>> Yes, it depends on the backend. For example, FileStore can write an
>>>>> object with sync to ensure consistency. If we add an option to disable
>>>>> the FileJournal, we need some work on FileStore to implement it.
>>>>>
>>>>>> -Greg
>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 9, 2014 at 12:13 AM, Haomai Wang <haomaiwang@xxxxxxxxx> wrote:
>>>>>> > Hi all,
>>>>>> >
>>>>>> > We know the FileJournal plays an important role in the FileStore
>>>>>> > backend: it can hugely reduce write latency and improve small write
>>>>>> > operations.
>>>>>> >
>>>>>> > But in practice there are exceptions, such as when we already use
>>>>>> > FlashCache or cachepool (although it's not ready).
>>>>>> >
>>>>>> > If cachepool is enabled, we may use a journal in cache_pool but may
>>>>>> > not want to use a journal in base_pool. The main reason to drop the
>>>>>> > journal in base_pool is that the journal takes over a whole physical
>>>>>> > device, which is too wasteful in base_pool.
>>>>>> >
>>>>>> > Likewise, if I enable FlashCache or another cache, I would prefer not
>>>>>> > to enable the journal in the OSD layer.
>>>>>> >
>>>>>> > So is it necessary to be able to disable the journal in this special
>>>>>> > (not really special) case?
>>>>>> >
>>>>>> > Best regards,
>>>>>> > Wheats

--
Best Regards,

Wheat
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
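
For readers of the archive, the atomicity argument in this thread (the
object write and the pglog update must both happen or neither) can be
illustrated with a minimal write-ahead sketch. This is not Ceph's
FileJournal code. The record format, the file names "object", "pglog"
and "journal", and all function names are assumptions made up for the
example, and journal trimming after a successful apply is left out.

```python
import json
import os
import tempfile
import zlib


def _write_file(path, data):
    """Write data to path and flush it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def journal_commit(journal_path, ops):
    """Append ONE record holding all ops of the transaction, then fsync.

    Once this returns, the whole transaction is durable as a unit,
    even if none of the target files has been touched yet.
    """
    payload = json.dumps(ops).encode()
    record = b"%08x %s\n" % (zlib.crc32(payload), payload)
    with open(journal_path, "ab") as j:
        j.write(record)
        j.flush()
        os.fsync(j.fileno())


def apply_ops(root, ops, crash_after=None):
    """Apply ops in order; crash_after simulates power loss part-way."""
    for i, (name, data) in enumerate(ops):
        if crash_after is not None and i >= crash_after:
            raise RuntimeError("simulated power loss")
        _write_file(os.path.join(root, name), data.encode())


def replay(journal_path, root):
    """Redo every complete journal record; stop at a torn record."""
    if not os.path.exists(journal_path):
        return
    with open(journal_path, "rb") as j:
        for line in j:
            crc, _, payload = line.rstrip(b"\n").partition(b" ")
            try:
                ok = int(crc, 16) == zlib.crc32(payload)
            except ValueError:
                ok = False
            if not ok:
                break  # incomplete record: that transaction never committed
            apply_ops(root, json.loads(payload))  # full rewrites, so idempotent


def demo():
    root = tempfile.mkdtemp()
    journal = os.path.join(root, "journal")
    ops = [["object", "v2 data"], ["pglog", "object is at v2"]]

    journal_commit(journal, ops)
    try:
        apply_ops(root, ops, crash_after=1)  # object written, pglog not
    except RuntimeError:
        pass
    before = sorted(os.listdir(root))  # state right after the "crash"

    replay(journal, root)  # what the OSD would do on restart
    after = {}
    for name in ("object", "pglog"):
        with open(os.path.join(root, name)) as f:
            after[name] = f.read()
    return before, after
```

In this sketch, demo() leaves only "journal" and "object" on disk after
the simulated crash, and replay() then brings "pglog" in line with the
object. Calling fdatasync() on each file separately would make each
write durable, but it would not tell us after a crash whether the
second write was still missing, which is the gap the journal fills.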