Hi Sage,

Thanks for the feedback.
We found that the performance issue actually comes from journal writing.
Even when we moved the journal to a separate SATA HDD, it didn't show any
performance difference. In the end, we used an SSD to store the journal
file, which gave us 52.3MB/s with oflag=direct, compared with 6MB/s on
SATA.

I haven't had a chance to trace the journal-writing code; however, I am
wondering whether there is a way to improve journal writing so that we can
get better performance on SATA HDDs.

Thanks.

Regards,
Leander Yu.

On Tue, Dec 27, 2011 at 12:27 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> Hi Leander,
>
> On Mon, 26 Dec 2011, Leander Yu wrote:
>> Hi all,
>> I have a question regarding to RBD performance. Recently I test RBD in
>> 2 cluster, using dd if=/dev/zero of=/dev/rbd5 bs=4M count=1024
>> oflag=direct. I got 33.4MB/s in one cluster , however got only 6MB/s
>> in the other cluster.
>>
>> I have confirmed that the network bandwidth is ok. and osd bench shows
>> around 32 ~ 40 MB/s on all OSD. Is this performance normal? if not ,
>> how can I troubleshoot it?
>> I have move journal to ramdisk for testing and get performance boost
>> to 7x MB/s. Does this means the HDD is the bottleneck ?
>>
>> The other cluster that got 33.4 MB/s , it's osd bench shows around
>> 44MB/s. The major difference between those two cluster is the better
>> performed one use SAS with RAID controller the other use SATA with no
>> RAID control.
>
> If you are using the kernel client, each 4M write is fully synchronous,
> which means (among other things) that the client is completely idle while
> the write is being replicated by the OSDs. This makes the performance
> very sensitive to the write latency; it's not really testing write
> throughput. Your SAS RAID controller is probably doing writeback caching
> of some kind that keeps the write latency low, while the SATA disks
> clearly are not.
>
> 'dd oflag=direct' is probably not the benchmark you want; no application
> should do a single synchronous IO at a time and expect good throughput.
> The more interesting number would be steady-state buffered write
> throughput when you have a file system on top, or something along those
> lines.
>
> sage
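
As a rough illustration of the latency point in the quoted reply: if only
one synchronous 4M write is in flight at a time, throughput is simply the
write size divided by the per-write latency, so each measured number in
this thread implies a latency. The sketch below is a back-of-the-envelope
calculation only. The function and variable names are hypothetical, the
figures are the ones quoted in the thread, and it assumes exactly one
outstanding write and treats MB and MiB as interchangeable.

```python
# Back-of-the-envelope sketch, not a benchmark.
# Assumption: exactly one synchronous write is outstanding at any time
# (as with dd bs=4M oflag=direct), so
#     throughput = write_size / per_write_latency
# Names below are hypothetical; the throughput figures come from the thread.

WRITE_SIZE_MB = 4.0  # dd bs=4M (MB vs MiB difference ignored here)

# Throughput reported in the thread for each setup, in MB/s.
MEASURED_MB_S = {
    "SATA journal": 6.0,
    "SAS + RAID controller": 33.4,
    "SSD journal": 52.3,
}


def implied_latency_ms(write_size_mb: float, throughput_mb_s: float) -> float:
    """Per-write latency (ms) implied by a throughput with one write in flight."""
    return write_size_mb / throughput_mb_s * 1000.0


def report() -> dict:
    """Map each setup to the latency implied by its measured throughput."""
    return {
        name: implied_latency_ms(WRITE_SIZE_MB, mb_s)
        for name, mb_s in MEASURED_MB_S.items()
    }


if __name__ == "__main__":
    for name, ms in report().items():
        print(f"{name}: ~{ms:.0f} ms per 4 MB write")
```

Under these assumptions, the SATA journal case works out to roughly two
thirds of a second per 4M write, against well under a tenth of a second
with the SSD journal. The dd test is effectively measuring that per-write
latency rather than sustained throughput.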