Thanks for your reply.
We now have two points that confuse us:

1) The kernel client executes sequential reads through the aio_read
   function, but the OSD log shows that the dispatch_queue length in the
   OSD is always 0. This means the OSD does not get the next READ message
   until the client sends it. It seems that the async read has effectively
   become a sync read: the OSD cannot read data in parallel, so it cannot
   make the most of its resources. What was the original purpose when you
   designed this part? Perfect reliability?

2) In the single-read case, while the OSD reads data from its disk, it
   does nothing but wait for the read to finish. We think this is a
   result of 1): the OSD has nothing else to do, so it just waits.

2011/7/19 Sage Weil <sage@xxxxxxxxxxxx>:
> On Mon, 18 Jul 2011, huang jun wrote:
>> hi, all
>> We tested ceph's read performance last week and found something weird.
>> We use ceph v0.30 on linux 2.6.37,
>> with ceph mounted on a backend consisting of 2 osds, 1 mon, 1 mds
>> $mount -t ceph 192.168.1.103:/ /mnt -vv
>> $ dd if=/dev/zero of=/mnt/test bs=4M count=200
>> $ cd .. && umount /mnt
>> $mount -t ceph 192.168.1.103:/ /mnt -vv
>> $dd if=test of=/dev/zero bs=4M
>> 200+0 records in
>> 200+0 records out
>> 838860800 bytes (839 MB) copied, 16.2327 s, 51.7 MB/s
>> But if we use rados to test it
>> $ rados -m 192.168.1.103:6789 -p data bench 60 write
>> $ rados -m 192.168.1.103:6789 -p data bench 60 seq
>> the result is:
>> Total time run: 24.733935
>> Total reads made: 438
>> Read size: 4194304
>> Bandwidth (MB/sec): 70.834
>>
>> Average Latency: 0.899429
>> Max latency: 1.85106
>> Min latency: 0.128017
>> This phenomenon attracted our attention, so we began to analyze the
>> osd debug log.
>> We found that:
>> 1) the kernel client sends READ requests; the first one asks for 1MB,
>> and the ones after that ask for 512KB
>> 2) from the rados test cmd log, the OSD receives READ ops with 4MB of
>> data to handle
>> We know the ceph developers pay attention to read and write
>> performance, so I just want to confirm:
>> does the communication between the client and the OSD take more time
>> than it should? Can we request a bigger size, such as the default
>> object size of 4MB, for READ operations? Or is this related to OS
>> management, and if so, what can we do to improve the performance?
>
> I think it's related to the way the Linux VFS is doing readahead, and how
> the ceph fs code is handling it.  It's issue #1122 in the tracker and I
> plan to look at it today or tomorrow!
>
> Thanks-
> sage
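
One way to check whether a single outstanding read is the limiting factor
might be to read the same file with one request in flight and then with
several. The sketch below is only an illustration under assumptions: it
uses nothing but the Python standard library, it is not part of ceph, and
the names read_chunk, read_file and workers are made up for this example.
The 4 MB chunk size is taken from the default object size mentioned above.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4 * 1024 * 1024  # 4 MB, the default object size mentioned in the thread


def read_chunk(fd, offset, size=CHUNK):
    """Read up to `size` bytes starting at `offset`; return the byte count.

    os.pread does not move the shared file position, so several threads
    can safely use the same descriptor. Short reads are retried until the
    chunk is complete or end of file is reached.
    """
    done = 0
    while done < size:
        data = os.pread(fd, size - done, offset + done)
        if not data:  # end of file
            break
        done += len(data)
    return done


def read_file(path, workers=1, chunk=CHUNK):
    """Read the whole file in `chunk`-sized pieces with `workers` threads.

    workers=1 keeps one request in flight at a time (similar to the dd
    run); larger values keep several reads outstanding at once.
    Returns (bytes_read, elapsed_seconds).
    """
    size = os.path.getsize(path)
    offsets = range(0, size, chunk)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.monotonic()
        with ThreadPoolExecutor(max_workers=workers) as pool:
            total = sum(pool.map(lambda off: read_chunk(fd, off, chunk), offsets))
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return total, elapsed


# Hypothetical usage on the mounted file from the test above:
#   total, elapsed = read_file("/mnt/test", workers=4)
#   print(total / elapsed / 1e6, "MB/s")
```

For a fair comparison the file should be unmounted and remounted between
runs, as in the dd test, so that the second run is not served from the page
cache. If throughput rises noticeably with more workers, that would be
consistent with the serial request pattern being the bottleneck, though it
would not prove it.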