Hi Sage:

> The dbench workload is metadata intensive, and most metadata (update)
> operations require a round trip to the MDS.  The client/mds protocol
> is pretty aggressive about issuing leases to the client, but that
> only goes so far.  Metadata performance will never be as fast as a
> local fs for a single process.

I see. What about the metadata workload for typical block-based
benchmarks, e.g. iometer or iozone doing random read/write and
sequential read/write -- how metadata intensive are those?

When I test the whole system's performance, I need to mount ceph and
create a file the size of the two 1TB disks spread over the two OSDs
-- 2TB! Even creating this blank file takes more than several hours,
so I think this will be very hard to scale to larger benchmarks, e.g.
10TB. Is there a better/simpler approach to testing performance,
possibly just accessing /dev/sdX directly over the network? I could
simply create a small file (say 1GB) for testing, but depending on
where on the disk the chunk is created (especially for an HDD), it
could change the result. (One thought is to create the file sparse --
a quick sketch is further down in this mail.)

Also, for mounting/creating, how do I check and enable -o big_writes
and -o direct_io? (My guess at how to check is also below.)

> Hmmm -- it looks like the scripts are split on whether you have
> tcmalloc installed in your system, and I don't see any obvious issues
> when I go over them.  Do you have tcmalloc installed?  Did you try a
> "make clean; make" cycle?

Sadly, I still get the error on some of my other PCs:

  cosd.cc: In function 'int main(int, const char**)':
  cosd.cc:65: error: 'IsHeapProfilerRunning' was not declared in this scope
  cosd.cc:310: warning: ignoring return value of 'int chdir(const char*)',
  declared with attribute warn_unused_result
  make[2]: *** [cosd-cosd.o] Error 1

I've tried "make clean" and re-cloned everything from git again; only
reverting to the 'stable' branch compiles, 'unstable' just doesn't.
Any way to fix this without reinstalling the OS? ;-) (A guess at the
cause is sketched below.)

Another silly question: since I'm using dpkg to build the .deb and
install it, is there a faster way to make and compile? It seems that
every time something changes I need to rebuild the package (which
takes several minutes) and reinstall. I've tried just using
configure/make, but that doesn't include the nifty deb stuff like the
automatic /etc/init.d/ceph daemon script. (A possible shortcut is
sketched below too.)

Thanks a lot.

On Fri, Sep 24, 2010 at 5:20 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Fri, 24 Sep 2010, DongJin Lee wrote:
>> > Wow, these levels seem oddly slow. What filesystem are those disks running?
>> > What tool are you using to test IOPS, and what's your network setup?
>> > If you can move the journal onto a separate device it will help.
>> > -Greg
>> >
>>
>> I have changed the OSD to SSD (OCZ Vertex2).
>> To simplify the problem, only one OSD is used,
>> so now, 3 PCs - 1 OSD, 1 MDS, 1 MON (all similar spec, except the
>> OSD's 1TB disk replaced with a 60GB SSD), using btrfs,
>> all connected to a 1Gb/s managed switch. Iperf gives about 980Mb/s.
>>
>> I ran dbench on the OSD PC,
>> local vs /ceph:
>>
>> using dbench -t 10 100 -D /data/osd0/
>> - gives 150MB/s, good and as expected.
>>
>> using dbench -t 10 100 -D /ceph
>> - gives 15MB/s
>
> The dbench workload is metadata intensive, and most metadata (update)
> operations require a round trip to the MDS.  The client/mds protocol
> is pretty aggressive about issuing leases to the client, but that
> only goes so far.  Metadata performance will never be as fast as a
> local fs for a single process.
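
On the big test file: one idea, untested on ceph, is to create the
file sparse so that blocks are only allocated when the benchmark
actually writes them. Something like:

  # create a 2TB sparse file; only metadata is written up front,
  # so this should return immediately
  dd if=/dev/zero of=/ceph/testfile bs=1 count=0 seek=2T

  # or, equivalently, with coreutils
  truncate -s 2T /ceph/testfile

Though I'm not sure a sparse file exercises the OSDs the same way a
fully written one does -- reads of never-written regions would just
return zeros.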
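
On checking the mount options: the options actually in effect should
at least show up in /proc/mounts, e.g.

  # list ceph/fuse mounts with their active options (assumes the
  # mount appears as type 'ceph' or 'fuse')
  grep -E 'ceph|fuse' /proc/mounts

As far as I can tell, big_writes and direct_io are FUSE options, so I
assume they only apply to the FUSE client (cfuse) and not to a kernel
mount -- is that right? I'm not sure of the exact cfuse syntax for
passing them.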
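
On the IsHeapProfilerRunning error: my (unverified) guess is that the
symbol comes from the google-perftools heap profiler header, so
installing the dev package and rebuilding might get past it:

  # package name assumed for Debian/Ubuntu (google-perftools)
  sudo apt-get install libgoogle-perftools-dev
  make clean && make

Or is configure supposed to compile that code out when tcmalloc isn't
present?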
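
And on rebuild speed, the shortcut I'm considering: install the .deb
once to get the init script etc., then iterate with plain make and
copy only the changed binary over the installed one, roughly:

  make                             # only recompiles what changed
  sudo cp src/cosd /usr/bin/cosd   # assuming that's where the .deb put it
  sudo /etc/init.d/ceph restart    # pick up the new binary

Alternatively, dpkg-buildpackage -nc skips the pre-build clean, so the
package rebuild should only recompile what changed. Would either of
these cause problems?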

>> Also, I often get errors from dbench while benchmarking (only when
>> /ceph mounts, not elsewhere)
>>
>> [323] unlink '/media/cephlocal/clients/client25/~dmtmp/WORD/CHAP10.DOC'
>> failed - No such file or directory
>> 52 cleanup 14 sec
>> 39 cleanup 15 sec
>> [323] unlink '/media/cephlocal/clients/client24/~dmtmp/WORDPRO/NEWS1_1.LWP'
>> failed - No such file or directory
>> [323] unlink '/media/cephlocal/clients/client91/~dmtmp/WORD/CHAP10.DOC'
>> failed - No such file or directory
>> [323] unlink '/media/cephlocal/clients/client27/~dmtmp/WORDPRO/RESULTS.XLS'
>> failed - No such file or directory
>>
>> and
>>
>> [31] open /media/cephlocal/clients/client258/filler.001 failed for
>> handle 9939 (No such file or directory)
>> [31] open /media/cephlocal/clients/client220/filler.001 failed for
>> handle 9939 (No such file or directory)
>> [111] open /media/cephlocal/clients/client218/~dmtmp/WORD failed for
>> handle 9943 (No such file or directory)
>> [91] open /media/cephlocal/clients/client986/filler.004 failed for
>> handle 9942 (No such file or directory)
>
> What version of the client are you running?  I was hitting this (or a
> similar) problem with the dbench workload a couple weeks ago and it
> turned out to be a long standing bug in fs/namei.c that has finally
> (finally!) been fixed in 2.6.36-rc2 (2e2e88ea).  I hadn't seen it
> actually come up for over a year, so something must have changed with
> the kclient or MDS recently that made the bug happen, but in any case
> it's fixed now.
>
> sage
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html