Here is some more info:

rados bench -p cephfs_data 100 write --no-cleanup
Total time run:         100.096473
Total writes made:      21900
Write size:             4194304
Bandwidth (MB/sec):     875.156
Stddev Bandwidth:       96.1234
Max bandwidth (MB/sec): 932
Min bandwidth (MB/sec): 0
Average Latency:        0.0731273
Stddev Latency:         0.0439909
Max latency:            1.23972
Min latency:            0.0306901

(Again, the numbers from bench don't match what is listed in client io.
"ceph -s" shows anywhere from 200 MB/s to 1700 MB/s even though rados bench
lists a max bandwidth of only 932 MB/s.)

rados bench -p cephfs_data 100 seq
Total time run:         29.460172
Total reads made:       21900
Read size:              4194304
Bandwidth (MB/sec):     2973.506
Average Latency:        0.0215173
Max latency:            0.693831
Min latency:            0.00519763

On client:

[root@blarg cephfs]# time for i in {1..100000}; do mkdir blarg"$i" ; done

real    10m36.794s
user    1m45.329s
sys     6m29.982s

[root@blarg cephfs]# time for i in {1..100000}; do touch yadda"$i" ; done

real    13m29.155s
user    3m55.256s
sys     7m50.301s

What variables are most important in the perf dump? I would like to grep out
the meaningful variables
(ceph daemon /var/run/ceph-mds.cephnautilus01.asok perf dump | jq '.')
while running the bonnie++ test again with -s 0.

Thanks,
BJ

On Fri, May 22, 2015 at 10:34 AM, John Spray <john.spray@xxxxxxxxxx> wrote:
>
> On 22/05/2015 16:25, Barclay Jameson wrote:
>>
>> The Bonnie++ job _FINALLY_ finished. If I am reading this correctly it
>> took days to create, stat, and delete 16 files??
>> [root@blarg cephfs]# ~/bonnie++-1.03e/bonnie++ -u root:root -s 256g -r
>> 131072 -d /cephfs/ -m CephBench -f -b
>> Using uid:0, gid:0.
>> Writing intelligently...done
>> Rewriting...done
>> Reading intelligently...done
>> start 'em...done...done...done...
>> Create files in sequential order...done.
>> Stat files in sequential order...done.
>> Delete files in sequential order...done.
>> Create files in random order...done.
>> Stat files in random order...done.
>> Delete files in random order...done.
>> Version 1.03e       ------Sequential Output------ --Sequential Input- --Random-
>>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
>> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
>> CephBench      256G           1006417  76  90114  13           137110   8 329.8   7
>>                     ------Sequential Create------ --------Random Create--------
>>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>>                  16     0   0 +++++ +++     0   0     0   0  5267  19     0   0
>>
>> CephBench,256G,,,1006417,76,90114,13,,,137110,8,329.8,7,16,0,0,+++++,+++,0,0,0,0,5267,19,0,0
>>
>> Any thoughts?
>>
> It's 16000 files by default (not 16), but this usually takes only a few
> minutes.
>
> FWIW I tried running a quick bonnie++ (with -s 0 to skip the IO phase) on
> a development (vstart.sh) cluster with a fuse client, and it readily
> handles several hundred client requests per second (checked with "ceph
> daemonperf mds.<id>").
>
> Nothing immediately leapt out at me from a quick look at the log you
> posted, but with issues like these it is always worth trying to narrow it
> down by trying the fuse client instead of the kernel client, and/or
> different kernel versions.
>
> You may also want to check that your underlying RADOS cluster is
> performing reasonably by doing a rados bench too.
>
> Cheers,
> John
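
As a starting point for the perf dump question above, a rough sketch of a
watch loop; it assumes jq is installed and that the dump has top-level "mds"
and "mds_server" sections, which is what current MDS daemons report (counter
and section names vary between releases, so check the full
perf dump | jq '.' output first and adjust the filter accordingly):

    # admin socket path quoted from the message above
    asok=/var/run/ceph-mds.cephnautilus01.asok
    while true; do
        date +%T
        # keep only the request-handling and latency related sections of the full dump
        ceph daemon "$asok" perf dump | jq '{mds: .mds, mds_server: .mds_server}'
        sleep 5
    done

"ceph daemonperf mds.<id>", mentioned in the reply above, gives a similar
per-second view of the same counters without any scripting.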