On 03/25/17 23:01, Nick Fisk wrote:
>
>> I think I owe you another graph later when I put all my VMs on there
>> (probably finally fixed my rbd snapshot hanging VM issue ... worked
>> around it by disabling exclusive-lock,object-map,fast-diff). The
>> bandwidth-hungry ones (which hung the most often) were moved shortly
>> after the bcache change, and it's hard to explain how it affects the
>> graphs ... it was easier to see with iostat while changing it, with a
>> mix of cached and uncached disks, than in ganglia afterwards.
>
> Please do, I can't resist a nice graph. What I would be really
> interested in is answers to these questions, if you can:
>
> 1. Has your per-disk bandwidth gone up due to removing random writes?
>    I.e. I struggle to get more than about 50MB/s writes per disk due
>    to the extra random IO per request.
> 2. Any feeling on how it helps with dentry/inode lookups? As mentioned
>    above, I'm using 8TB disks, and cold data has an extra penalty for
>    reads/writes as it has to look up the FS metadata first.
> 3. I assume with the 4.9 kernel you don't have the bcache fix which
>    allows partitions. What method are you using to create OSDs?
> 4. As mentioned above, any stats on the percentage of MB/s hitting
>    your cache device vs the journal (assuming the journal sees 100% of
>    IO)? This is to calculate extra wear.
>
> Thanks,
> Nick

So it's graph time...

Here's basically what you saw before, but I made it stacked (so 900 on
the %util axis means roughly 18 of the 27 disks in the whole cluster
averaged 50% utilization in the sample period for that one pixel width
of the graph). Remove gtype=stack from the URL and it won't be stacked,
or go to http://www.brockmann-consult.de/ganglia/?c=ceph and fill out
the aggregate report form there yourself. I manually added dates (cs
and ce) copied from another URL, since that form doesn't have them and
only covers the last x time periods. You can also find more metrics in
the drop-downs on that page. (sda and sdb have always been the SSDs;
disk metrics are 30-second averages from iostat.)

With no bcache until a bit at the end, plus possibly some load in there
from migrating to bcache (I didn't record the dates on that):

%util - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_util&glegend=show&aggregate=1&_=1491205396888&cs=11%2F1%2F2016+21%3A18&ce=12%2F15%2F2016+4%3A21&z=xlarge&gtype=stack&x=1000
await - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_await&glegend=show&aggregate=1&_=1491205396888&cs=11%2F1%2F2016+21%3A18&ce=12%2F15%2F2016+4%3A21&z=xlarge&gtype=stack&x=1000
wMBps - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_wMBps&glegend=show&aggregate=1&_=1491205396888&cs=11%2F1%2F2016+21%3A18&ce=12%2F15%2F2016+4%3A21&z=xlarge&gtype=stack&x=300

And here is the period since most VMs were on ceph (more than in the
"before" graphs), with some osd reweight-by-utilization runs started a
few days ago (but with scrub disabled during that) making the last part
look higher. The last VMs were moved today, which is also visible on
the graph, plus some extra backup load some time later.

%util - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_util&glegend=show&aggregate=1&_=1491205396888&cs=3%2F24%2F2017+23%3A3&z=xlarge&gtype=stack&x=1000
await - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_await&glegend=show&aggregate=1&_=1491205396888&cs=3%2F24%2F2017+23%3A3&z=xlarge&gtype=stack&x=1000
wMBps - http://www.brockmann-consult.de/ganglia/graph.php?hreg[]=ceph.*&mreg[]=sd[c-z]_wMBps&glegend=show&aggregate=1&_=1491205396888&cs=3%2F24%2F2017+23%3A3&z=xlarge&gtype=stack&x=300

Looking at the wMBps graph, you can see the cluster doesn't really have
that high a load on average, only in bursts; but since the load is
similar before and after, the other graphs should be at least somewhat
comparable.
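For reference, here is a minimal sketch (not my actual collection script) of how the per-disk numbers behind those graphs could be pulled out of `iostat -x` output and summed into the stacked %util reading. The sample output and column layout below are made up for illustration; real sysstat output varies by version, so the parser keys off the header line:

```python
# Hypothetical iostat -x report text; real column sets differ by
# sysstat version, which is why we parse the header instead of
# hardcoding column positions.
SAMPLE = """\
Device:  rrqm/s wrqm/s  r/s   w/s  rkB/s  wkB/s await %util
sdc        0.00   1.20  3.5  40.0   56.0 5120.0  8.40 52.00
sdd        0.00   0.80  2.1  38.5   33.6 4900.0  7.90 48.50
"""

def parse_iostat(text):
    """Return {device: {column_name: float}} for one iostat -x report."""
    lines = [l for l in text.splitlines() if l.strip()]
    cols = lines[0].split()[1:]          # drop the 'Device:' label
    stats = {}
    for line in lines[1:]:
        parts = line.split()
        stats[parts[0]] = dict(zip(cols, [float(v) for v in parts[1:]]))
    return stats

stats = parse_iostat(SAMPLE)

# wkB/s -> MB/s, matching the ganglia wMBps metric name.
wmbps = {dev: s["wkB/s"] / 1024 for dev, s in stats.items()}

# Stacked %util as in the graphs: the sum over all spinning disks,
# so e.g. 18 disks averaging 50% would read 900 on the stacked axis.
stacked_util = sum(s["%util"] for s in stats.values())

print(wmbps)         # per-disk write MB/s
print(stacked_util)  # 52.0 + 48.5 = 100.5
```

In a real collector you would run `iostat -x 30` and feed each 30-second report through something like this, which is roughly what produces the 30-second averages mentioned above.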
I think the %util graph speaks for itself, but I don't know how to show
you what it does in VMs. I figure it will smooth out the performance at
times when lots of requests happen that hdds are bad at but ssds are
good at (snap trimming, directory splitting, etc.). Lots of issues I
find are clearly seen in %util.

Or both time ranges together in the main reports page:
http://www.brockmann-consult.de/ganglia/?r=year&cs=10%2F21%2F2016+20%3A33&ce=4%2F7%2F2017+7%3A6&c=ceph&h=&tab=m&vn=&hide-hf=false&m=load_one&sh=1&z=small&hc=4&host_regex=&max_graphs=0&s=by+name

And be sure to share some of your own results. :)
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com