2011/7/6 huang jun <hjwsm1989@xxxxxxxxx>:
> we know we should use journal if we want keep data integrity, but it really
> influence the performance even we use btrfs as osd low-level fs, not
> fs for journal.
> in the debug log, find that :
> journal prepare_single_write 1 will write 3649536 : seq 445 len
> 4195462 -> 4202496 (head 40 pre_pad 3895 ebl 4195462 post_pad 3059
> tail 40) (ebl alignment 3935)
>
> it will generate 4195462 bytes journal of writing 4MB data from client.
> that means we will write 8MB when it actually write 4MB data,
> so what advices you can give if we want to lower the overhead of journal?

Journals pretty much by definition mean you write the data twice;
that's inherent in their design.

As far as I know, if you want good performance, you should use btrfs
and a recent kernel. If you insist on using ext3, different mount
options may bring you performance improvements, but we have not had
the time to explore those yet -- and in any case, I expect our efforts
would focus on btrfs first.

We are building tools for better benchmarking of Ceph right now, but
the truth is that most of the performance work is still ahead of us.
Current wisdom is pretty much "if it's slow, add machines". Right now,
we're hard at work on correctness.
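
For reference, the numbers in the quoted prepare_single_write debug line can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and reading "ebl" as the payload and head/pre_pad/post_pad/tail as framing is an assumption based on the field names, not on the journal code. The 4096-byte block size is likewise an assumption.

```python
import re

# The debug line exactly as quoted in the question.
LOG_LINE = (
    "journal prepare_single_write 1 will write 3649536 : seq 445 len "
    "4195462 -> 4202496 (head 40 pre_pad 3895 ebl 4195462 post_pad 3059 "
    "tail 40) (ebl alignment 3935)"
)


def journal_entry_breakdown(line):
    """Split one journal entry into payload and framing bytes.

    Hypothetical helper; the field meanings are inferred from their names.
    """
    m = re.search(
        r"len (\d+) -> (\d+) \(head (\d+) pre_pad (\d+) ebl (\d+) "
        r"post_pad (\d+) tail (\d+)\)",
        line,
    )
    if m is None:
        raise ValueError("unrecognized log line")
    payload, on_disk, head, pre_pad, _ebl, post_pad, tail = map(int, m.groups())
    framing = head + pre_pad + post_pad + tail  # everything that is not payload
    return {
        "payload": payload,
        "on_disk": on_disk,
        "framing": framing,
        "framing_pct": 100.0 * framing / on_disk,
    }


info = journal_entry_breakdown(LOG_LINE)
print(info)
```

For this entry the header, padding and trailer add up to 7034 bytes, which is about 0.17% of the 4202496 bytes written to the journal, and the total is a multiple of 4096. So the framing is negligible; the cost the question is asking about is the second copy of the 4MB payload itself.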