What could the possible solutions be? Updating CentOS to 6.3? As for the issue with writes to lots of disks at the same time, I think running dd on all the disks in parallel would be a good test! :) (A rough sketch of such a test, plus a note on PG sizing, is at the bottom of this mail, below the quoted thread.)

2012/11/4 Mark Nelson <mark.nelson@xxxxxxxxxxx>:
> On 11/04/2012 07:18 AM, Aleksey Samarin wrote:
>>
>> Well, i create ceph cluster with 2 osd ( 1 osd per node), 2 mon, 2 mds.
>> here is what I did:
>> ceph osd pool create bench
>> ceph osd tell \* bench
>> rados -p bench bench 30 write --no-cleanup
>> output:
>>
>> Maintaining 16 concurrent writes of 4194304 bytes for at least 30 seconds.
>> Object prefix: benchmark_data_host01_11635
>> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
>> 0 0 0 0 0 0 - 0
>> 1 16 16 0 0 0 - 0
>> 2 16 37 21 41.9911 42 0.139005 1.08941
>> 3 16 53 37 49.3243 64 0.754114 1.09392
>> 4 16 75 59 58.9893 88 0.284647 0.914221
>> 5 16 89 73 58.3896 56 0.072228 0.881008
>> 6 16 95 79 52.6575 24 1.56959 0.961477
>> 7 16 111 95 54.2764 64 0.046105 1.08791
>> 8 16 128 112 55.9906 68 0.035714 1.04594
>> 9 16 150 134 59.5457 88 0.046298 1.04415
>> 10 16 166 150 59.9901 64 0.048635 0.986384
>> 11 16 176 160 58.1723 40 0.727784 0.988408
>> 12 16 206 190 63.3231 120 0.28869 0.946624
>> 13 16 225 209 64.2976 76 1.34472 0.919464
>> 14 16 263 247 70.5605 152 0.070926 0.90046
>> 15 16 295 279 74.3887 128 0.041517 0.830466
>> 16 16 315 299 74.7388 80 0.296037 0.841527
>> 17 16 333 317 74.5772 72 0.286097 0.849558
>> 18 16 340 324 71.9891 28 0.295084 0.83922
>> 19 16 343 327 68.8317 12 1.46948 0.845797
>> 2012-11-04 17:14:52.090941 min lat: 0.035714 max lat: 2.64841 avg lat: 0.861539
>> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
>> 20 16 378 362 72.389 140 0.566232 0.861539
>> 21 16 400 384 73.1313 88 0.038835 0.857785
>> 22 16 404 388 70.5344 16 0.801216 0.857002
>> 23 16 413 397 69.0327 36 0.062256 0.86376
>> 24 16 428 412 68.6543 60 0.042583 0.89389
>> 25 16 450 434 69.4277 88 0.383877 0.905833
>> 26 16 472 456 70.1415 88 0.269878 0.898023
>> 27 16 472 456 67.5437 0 - 0.898023
>> 28 16 512 496 70.8448 80 0.056798 0.891163
>> 29 16 530 514 70.8843 72 1.20653 0.898112
>> 30 16 542 526 70.1212 48 0.744383 0.890733
>> Total time run: 30.174151
>> Total writes made: 543
>> Write size: 4194304
>> Bandwidth (MB/sec): 71.982
>>
>> Stddev Bandwidth: 38.318
>> Max bandwidth (MB/sec): 152
>> Min bandwidth (MB/sec): 0
>> Average Latency: 0.889026
>> Stddev Latency: 0.677425
>> Max latency: 2.94467
>> Min latency: 0.035714
>>
>
> Much better for 1 disk per node! I suspect that lack of syncfs is hurting
> you, or perhaps some other issue with writes to lots of disks at the same
> time.
>
>
>>
>> 2012/11/4 Aleksey Samarin <nrg3tik@xxxxxxxxx>:
>>>
>>> Ok!
>>> Well, I'll take these tests and write about the results.
>>>
>>> btw,
>>> disks are the same, as some may be faster than others?
>>>
>>> 2012/11/4 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>>
>>>> That's only nine — where are the other three? If you have three slow
>>>> disks that could definitely cause the troubles you're seeing.
>>>>
>>>> Also, what Mark said about sync versus syncfs.
>>>>
>>>> On Sun, Nov 4, 2012 at 1:26 PM, Aleksey Samarin <nrg3tik@xxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> It`s ok!
>>>>>
>>>>> Output:
>>>>>
>>>>> 2012-11-04 16:19:23.195891 osd.0 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 11.441035 sec at 91650 KB/sec
>>>>> 2012-11-04 16:19:24.981631 osd.1 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.225048 sec at 79287 KB/sec
>>>>> 2012-11-04 16:19:25.672896 osd.2 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.917157 sec at 75344 KB/sec
>>>>> 2012-11-04 16:19:28.058517 osd.21 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 16.453375 sec at 63730 KB/sec
>>>>> 2012-11-04 16:19:28.715552 osd.22 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 17.108887 sec at 61288 KB/sec
>>>>> 2012-11-04 16:19:23.440054 osd.23 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 11.834639 sec at 88602 KB/sec
>>>>> 2012-11-04 16:19:24.023650 osd.24 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 12.418276 sec at 84438 KB/sec
>>>>> 2012-11-04 16:19:24.617514 osd.25 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.011955 sec at 80585 KB/sec
>>>>> 2012-11-04 16:19:25.148613 osd.26 [INF] bench: wrote 1024 MB in blocks of 4096 KB in 13.541710 sec at 77433 KB/sec
>>>>>
>>>>> All the best.
>>>>>
>>>>> 2012/11/4 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>>>>
>>>>>> [Sorry for the blank email; I missed!]
>>>>>> On Sun, Nov 4, 2012 at 1:04 PM, Aleksey Samarin <nrg3tik@xxxxxxxxx>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi!
>>>>>>> This command? ceph tell osd \* bench
>>>>>>> Output: tell target 'osd' not a valid entity name
>>>>>>
>>>>>> I guess it's "ceph osd tell \* bench". Try that one. :)
>>>>>>
>>>>>>> Well, i did pool by command ceph osd pool create bench2 120
>>>>>>> This output of rados -p bench2 bench 30 write --no-cleanup
>>>>>>>
>>>>>>> rados -p bench2 bench 30 write --no-cleanup
>>>>>>>
>>>>>>> Maintaining 16 concurrent writes of 4194304 bytes for at least 30 seconds.
>>>>>>> Object prefix: benchmark_data_host01_5827
>>>>>>> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
>>>>>>> 0 0 0 0 0 0 - 0
>>>>>>> 1 16 29 13 51.9885 52 0.489268 0.186749
>>>>>>> 2 16 52 36 71.9866 92 1.87226 0.711888
>>>>>>> 3 16 57 41 54.657 20 0.089697 0.697821
>>>>>>> 4 16 60 44 43.9923 12 1.61868 0.765361
>>>>>>> 5 16 60 44 35.1941 0 - 0.765361
>>>>>>> 6 16 60 44 29.3285 0 - 0.765361
>>>>>>> 7 16 60 44 25.1388 0 - 0.765361
>>>>>>> 8 16 61 45 22.4964 1 5.89643 0.879384
>>>>>>> 9 16 62 46 20.4412 4 6.0234 0.991211
>>>>>>> 10 16 62 46 18.3971 0 - 0.991211
>>>>>>> 11 16 63 47 17.0883 2 8.79749 1.1573
>>>>>>> 12 16 63 47 15.6643 0 - 1.1573
>>>>>>> 13 16 63 47 14.4593 0 - 1.1573
>>>>>>> 14 16 63 47 13.4266 0 - 1.1573
>>>>>>> 15 16 63 47 12.5315 0 - 1.1573
>>>>>>> 16 16 63 47 11.7483 0 - 1.1573
>>>>>>> 17 16 63 47 11.0572 0 - 1.1573
>>>>>>> 18 16 63 47 10.4429 0 - 1.1573
>>>>>>> 19 16 63 47 9.89331 0 - 1.1573
>>>>>>> 2012-11-04 15:58:15.473733 min lat: 0.036475 max lat: 8.79749 avg lat: 1.1573
>>>>>>> sec Cur ops started finished avg MB/s cur MB/s last lat avg lat
>>>>>>> 20 16 63 47 9.39865 0 - 1.1573
>>>>>>> 21 16 63 47 8.95105 0 - 1.1573
>>>>>>> 22 16 63 47 8.54419 0 - 1.1573
>>>>>>> 23 16 63 47 8.17271 0 - 1.1573
>>>>>>> 24 16 63 47 7.83218 0 - 1.1573
>>>>>>> 25 16 63 47 7.5189 0 - 1.1573
>>>>>>> 26 16 63 47 7.22972 0 - 1.1573
>>>>>>> 27 16 81 65 9.62824 4.5 0.076456 4.9428
>>>>>>> 28 16 118 102 14.5693 148 0.427273 4.34095
>>>>>>> 29 16 119 103 14.2049 4 1.57897 4.31414
>>>>>>> 30 16 132 116 15.4645 52 2.25424 4.01492
>>>>>>> 31 16 133 117 15.0946 4 0.974652 3.98893
>>>>>>> 32 16 133 117 14.6229 0 - 3.98893
>>>>>>> Total time run: 32.575351
>>>>>>> Total writes made: 133
>>>>>>> Write size: 4194304
>>>>>>> Bandwidth (MB/sec): 16.331
>>>>>>>
>>>>>>> Stddev Bandwidth: 31.8794
>>>>>>> Max bandwidth (MB/sec): 148
>>>>>>> Min bandwidth (MB/sec): 0
>>>>>>> Average Latency: 3.91583
>>>>>>> Stddev Latency: 7.42821
>>>>>>> Max latency: 25.24
>>>>>>> Min latency: 0.036475
>>>>>>>
>>>>>>> Im think problem not in pg. This output of ceph pg dump > http://pastebin.com/BqLsyMBC
>>>>>>
>>>>>> Well, that did improve it a bit; but yes, I think there's something
>>>>>> else going on. Just wanted to verify. :)
>>>>>>
>>>>>>>
>>>>>>> I have still no idea.
>>>>>>>
>>>>>>> All the best. Alex
>>>>>>>
>>>>>>>
>>>>>>> 2012/11/4 Gregory Farnum <greg@xxxxxxxxxxx>:
>>>>>>>>
>>>>>>>> On Sun, Nov 4, 2012 at 10:58 AM, Aleksey Samarin <nrg3tik@xxxxxxxxx>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi all
>>>>>>>>>
>>>>>>>>> Im planning use ceph for cloud storage.
>>>>>>>>> My test setup is 2 servers connected via infiniband 40Gb, 6x2Tb disks per node.
>>>>>>>>> Centos 6.2
>>>>>>>>> Ceph 0.52 from http://ceph.com/rpms/el6/x86_64
>>>>>>>>> This is my config http://pastebin.com/Pzxafnsm
>>>>>>>>> journal on tmpfs
>>>>>>>>> well, im create bench pool and test it:
>>>>>>>>> ceph osd pool create bench
>>>>>>>>> rados -p bench bench 30 write
>>>>>>>>>
>>>>>>>>> Total time run: 43.258228
>>>>>>>>> Total writes made: 151
>>>>>>>>> Write size: 4194304
>>>>>>>>> Bandwidth (MB/sec): 13.963
>>>>>>>>> Stddev Bandwidth: 26.307
>>>>>>>>> Max bandwidth (MB/sec): 128
>>>>>>>>> Min bandwidth (MB/sec): 0
>>>>>>>>> Average Latency: 4.48605
>>>>>>>>> Stddev Latency: 8.17709
>>>>>>>>> Max latency: 29.7957
>>>>>>>>> Min latency: 0.039435
>>>>>>>>>
>>>>>>>>> when i do rados -p bench bench 30 seq
>>>>>>>>> Total time run: 20.626935
>>>>>>>>> Total reads made: 275
>>>>>>>>> Read size: 4194304
>>>>>>>>> Bandwidth (MB/sec): 53.328
>>>>>>>>> Average Latency: 1.19754
>>>>>>>>> Max latency: 7.0215
>>>>>>>>> Min latency: 0.011647
>>>>>>>>>
>>>>>>>>> I tested the single drive via dd if=/dev/zero of=/mnt/hdd2/testfile bs=1024k count=20000
>>>>>>>>> result: 158 MB/sec
>>>>>>>>>
>>>>>>>>> Anyone can tell me why such a weak performance? Maybe I missed
>>>>>>>>> something?
>>>>>>>>
>>>>>>>> Can you run "ceph tell osd \* bench" and report the results? (It'll go
>>>>>>>> to the "central log" which you can keep an eye on if you run "ceph -w"
>>>>>>>> in another terminal.)
>>>>>>>> I think you also didn't create your bench pool correctly; it probably
>>>>>>>> only has 8 PGs which is not going to perform very well with your disk
>>>>>>>> count. Try "ceph pool create bench2 120" and run the benchmark against
>>>>>>>> that pool. The extra number at the end tells it to create 120
>>>>>>>> placement groups.
>>>>>>>> -Greg
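
P.S. Here is a rough sketch of the parallel dd test I mean above. It is only
a sketch: it assumes the six data disks on each node are mounted at
/mnt/hdd1 .. /mnt/hdd6, so substitute the real OSD data paths.

#!/bin/bash
# Write to every data disk at the same time, one dd per disk.
# The mount points are assumptions -- replace them with the real ones.
DISKS="/mnt/hdd1 /mnt/hdd2 /mnt/hdd3 /mnt/hdd4 /mnt/hdd5 /mnt/hdd6"

for d in $DISKS; do
    # conv=fdatasync makes dd flush to disk before it reports, so the
    # page cache does not inflate the number. 4 GB per disk.
    dd if=/dev/zero of=$d/ddtest bs=1M count=4096 conv=fdatasync 2> $d/ddtest.log &
done
wait

for d in $DISKS; do
    echo "$d: $(tail -n 1 $d/ddtest.log)"   # last line of dd output has the MB/s
    rm -f $d/ddtest $d/ddtest.log
done

If every stream still reports something near the ~158 MB/s a single dd gets,
the disks and controller can sustain simultaneous writes and the problem is
more likely the sync-versus-syncfs behaviour Mark mentioned; if the per-disk
numbers collapse, the controller or bus is the limit.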
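
Also, on the placement groups: Greg's 120 was just a quick improvement over
the default 8. The usual rule of thumb (a general Ceph guideline, not
something from this thread) is about 100 PGs per OSD divided by the number
of replicas, so with 12 OSDs and 2 replicas something like:

# hypothetical follow-up pool: 12 OSDs * 100 / 2 replicas ~= 600 PGs
ceph osd pool create bench3 600
rados -p bench3 bench 30 write --no-cleanup

The pool name bench3 is made up for the example; the point is only that more
PGs let the 16 concurrent writes spread over more of the disks.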