On Tue, 27 Dec 2011, Leander Yu wrote:
> Thanks for the feedback. We found that the performance issue actually
> comes from the journal writing. Even when we moved the journal onto a
> separate SATA HDD, it didn't show any performance difference.
> In the end we used an SSD to store the journal file, which gives us
> 52.3 MB/s with oflag=direct, compared to 6 MB/s on SATA.
> I haven't had a chance to trace the journal writing code, but I am
> wondering whether there is a way to improve the journal writing so we
> can get better performance on a SATA HDD.

We can.  The FileJournal code is currently doing a single sync direct-io
write to the device, which means that you usually do a full disk rotation
between each write.  It needs to do aio so that we can keep multiple
writes in flight.

        http://tracker.newdream.net/issues/1836

This would be a great thing for an external contributor to work on, as
it's not dependent on anything else!

Thanks-
sage

> Thanks.
>
> Regards,
> Leander Yu.
>
> On Tue, Dec 27, 2011 at 12:27 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> > Hi Leander,
> >
> > On Mon, 26 Dec 2011, Leander Yu wrote:
> >> Hi all,
> >> I have a question regarding RBD performance. Recently I tested RBD in
> >> two clusters, using dd if=/dev/zero of=/dev/rbd5 bs=4M count=1024
> >> oflag=direct. I got 33.4 MB/s in one cluster, but only 6 MB/s in the
> >> other cluster.
> >>
> >> I have confirmed that the network bandwidth is ok, and osd bench shows
> >> around 32-40 MB/s on all OSDs. Is this performance normal? If not,
> >> how can I troubleshoot it?
> >> I moved the journal to a ramdisk for testing and got a performance
> >> boost to 7x MB/s. Does this mean the HDD is the bottleneck?
> >>
> >> The other cluster, the one that got 33.4 MB/s, shows around 44 MB/s in
> >> osd bench. The major difference between the two clusters is that the
> >> better-performing one uses SAS with a RAID controller and the other
> >> uses SATA with no RAID controller.
> >
> > If you are using the kernel client, each 4M write is fully synchronous,
> > which means (among other things) that the client is completely idle while
> > the write is being replicated by the OSDs.  This makes the performance
> > very sensitive to the write latency; it's not really testing write
> > throughput.  Your SAS RAID controller is probably doing writeback caching
> > of some kind that keeps the write latency low, while the SATA disks
> > clearly are not.
> >
> > 'dd oflag=direct' is probably not the benchmark you want; no application
> > should do a single synchronous IO at a time and expect good throughput.
> > The more interesting number would be steady-state buffered write
> > throughput when you have a file system on top, or something along those
> > lines.
> >
> > sage
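
To put rough numbers on the latency point above: with 4 MB synchronous
writes, 6 MB/s means each replicated write is taking roughly 0.67 seconds
end to end, and 33.4 MB/s corresponds to about 0.12 seconds per write, so
the test is really measuring per-write latency rather than disk bandwidth.
A closer approximation of steady-state buffered throughput, assuming a
file system is created on the RBD image and mounted at (for example)
/mnt/rbd, would be something along the lines of
dd if=/dev/zero of=/mnt/rbd/testfile bs=4M count=1024 conv=fdatasync,
where conv=fdatasync lets the page cache keep many writes in flight and
only forces the data out to the cluster once at the end.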
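
To make the aio suggestion for issue 1836 concrete, below is a minimal
standalone sketch (not the actual FileJournal code) of the pattern
involved: Linux native AIO via libaio keeping several O_DIRECT writes in
flight at once, so the device can start the next write without waiting a
full rotation between submissions. The file name, queue depth, block
size, and write count are illustrative assumptions, not values taken from
Ceph.

/* journal_aio.c -- illustrative only; build: cc -o journal_aio journal_aio.c -laio */
#define _GNU_SOURCE
#include <fcntl.h>
#include <libaio.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define QUEUE_DEPTH   8           /* writes kept in flight (assumption) */
#define BLOCK_SIZE    (1 << 20)   /* 1 MiB, aligned as O_DIRECT requires */
#define TOTAL_WRITES  256         /* 256 MiB total for this toy run */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <journal device or file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    io_context_t ctx = 0;
    if (io_setup(QUEUE_DEPTH, &ctx) < 0) { fprintf(stderr, "io_setup failed\n"); return 1; }

    void *buf;                    /* O_DIRECT requires an aligned buffer */
    if (posix_memalign(&buf, 4096, BLOCK_SIZE)) { perror("posix_memalign"); return 1; }
    memset(buf, 0, BLOCK_SIZE);

    struct iocb iocbs[QUEUE_DEPTH], *iocbp[1];
    struct io_event events[QUEUE_DEPTH];
    long long offset = 0;
    int submitted = 0, completed = 0;

    /* Fill the pipeline: QUEUE_DEPTH writes in flight instead of one. */
    for (int i = 0; i < QUEUE_DEPTH && submitted < TOTAL_WRITES; i++, submitted++) {
        io_prep_pwrite(&iocbs[i], fd, buf, BLOCK_SIZE, offset);
        offset += BLOCK_SIZE;
        iocbp[0] = &iocbs[i];
        if (io_submit(ctx, 1, iocbp) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }
    }

    /* Each time a write completes, queue the next so the disk never idles. */
    while (completed < TOTAL_WRITES) {
        int n = io_getevents(ctx, 1, QUEUE_DEPTH, events, NULL);
        if (n < 0) { fprintf(stderr, "io_getevents failed\n"); return 1; }
        for (int i = 0; i < n; i++) {
            if (events[i].res != BLOCK_SIZE) { fprintf(stderr, "short or failed write\n"); return 1; }
            completed++;
            if (submitted < TOTAL_WRITES) {
                struct iocb *cb = events[i].obj;   /* reuse the completed iocb */
                io_prep_pwrite(cb, fd, buf, BLOCK_SIZE, offset);
                offset += BLOCK_SIZE;
                iocbp[0] = cb;
                if (io_submit(ctx, 1, iocbp) != 1) { fprintf(stderr, "io_submit failed\n"); return 1; }
                submitted++;
            }
        }
    }

    io_destroy(ctx);
    close(fd);
    free(buf);
    return 0;
}

The sketch only shows the submit/reap pipeline; a real journal
implementation would presumably also need to track completions so that
writes can be acknowledged in order, handle errors and short writes, and
bound the amount of data in flight.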