Hi,
benchmarking is done via fio with different blocksizes. I compared the results with
benchmarks I did before the ceph.conf change and saw very similar numbers. Thanks
for the hint about MySQL benchmarking, I will try it out.

Cheers
Nick

On Friday, January 08, 2016 06:59:13 AM Josef Johansson wrote:
> Hi,
>
> How did you benchmark?
>
> I would recommend having a lot of MySQL instances with a lot of InnoDB tables
> that are heavily utilised. During a recovery you should at least see the
> latency rise. Maybe use one of the tools here:
> https://dev.mysql.com/downloads/benchmarks.html
>
> Regards,
> Josef
>
> On 7 Jan 2016 16:36, "Robert LeBlanc" <robert@xxxxxxxxxxxxx> wrote:
> > With these min/max settings, we didn't have any problem going to more
> > backfills.
> >
> > Robert LeBlanc
> >
> > Sent from a mobile device, please excuse any typos.
> >
> > On Jan 7, 2016 8:30 AM, "nick" <nick@xxxxxxx> wrote:
> >> Heya,
> >> thank you for your answers. We will try setting 16/32 as the values for
> >> osd_backfill_scan_[min|max]. I also set the debug logging config. Here is
> >> an excerpt from our new ceph.conf:
> >>
> >> """
> >> [osd]
> >> osd max backfills = 1
> >> osd backfill scan max = 32
> >> osd backfill scan min = 16
> >> osd recovery max active = 1
> >> osd recovery op priority = 1
> >> osd op threads = 8
> >>
> >> [global]
> >> debug optracker = 0/0
> >> debug asok = 0/0
> >> debug hadoop = 0/0
> >> debug mds migrator = 0/0
> >> debug objclass = 0/0
> >> debug paxos = 0/0
> >> debug context = 0/0
> >> debug objecter = 0/0
> >> debug mds balancer = 0/0
> >> debug finisher = 0/0
> >> debug auth = 0/0
> >> debug buffer = 0/0
> >> debug lockdep = 0/0
> >> debug mds log = 0/0
> >> debug heartbeatmap = 0/0
> >> debug journaler = 0/0
> >> debug mon = 0/0
> >> debug client = 0/0
> >> debug mds = 0/0
> >> debug throttle = 0/0
> >> debug journal = 0/0
> >> debug crush = 0/0
> >> debug objectcacher = 0/0
> >> debug filer = 0/0
> >> debug perfcounter = 0/0
> >> debug filestore = 0/0
> >> debug rgw = 0/0
> >> debug monc = 0/0
> >> debug rbd = 0/0
> >> debug tp = 0/0
> >> debug osd = 0/0
> >> debug ms = 0/0
> >> debug mds locker = 0/0
> >> debug timer = 0/0
> >> debug mds log expire = 0/0
> >> debug rados = 0/0
> >> debug striper = 0/0
> >> debug rbd replay = 0/0
> >> debug none = 0/0
> >> debug keyvaluestore = 0/0
> >> debug compressor = 0/0
> >> debug crypto = 0/0
> >> debug xio = 0/0
> >> debug civetweb = 0/0
> >> debug newstore = 0/0
> >> """
> >>
> >> I already ran a benchmark on our staging setup with the new config and
> >> fio, but did not really get different results from before.
> >>
> >> For us it is hardly possible to reproduce the 'stalling' problems on the
> >> staging cluster, so I will have to wait and test this in production.
> >>
> >> Does anyone know if 'osd max backfills' > 1 could have an impact as well?
> >> The default seems to be 10...
> >>
> >> Cheers
> >> Nick
> >>
> >> On Wednesday, January 06, 2016 09:17:43 PM Josef Johansson wrote:
> >> > Hi,
> >> >
> >> > Also make sure that you optimize the debug log config. There is a lot
> >> > on the ML on how to set them all to low values (0/0).
> >> >
> >> > Not sure how it is in infernalis, but it helped a lot in previous
> >> > versions.
> >> >
> >> > Regards,
> >> > Josef
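
For reference, the debug settings in the ceph.conf excerpt above can also be pushed
to a running cluster without restarting the daemons. A minimal sketch, assuming admin
credentials are available on the node; injectargs changes are not persistent, so
ceph.conf still needs the same values for the next restart:

"""
# push a few of the debug settings to all running OSDs
# (repeat for the other debug subsystems as needed)
ceph tell osd.* injectargs '--debug-osd 0/0 --debug-ms 0/0 --debug-filestore 0/0 --debug-journal 0/0'

# verify the change on one daemon via its admin socket
# (osd.0 here stands for any OSD id local to the node)
ceph daemon osd.0 config get debug_osd
"""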
> >> >
> >> > On 6 Jan 2016 18:16, "Robert LeBlanc" <robert@xxxxxxxxxxxxx> wrote:
> >> > > There has been a lot of "discussion" about osd_backfill_scan[min,max]
> >> > > lately. My experience with hammer has been the opposite of what
> >> > > people have said before. Increasing those values has, for us, reduced
> >> > > the load of recovery and prevented a lot of the disruption in our
> >> > > cluster caused by backfilling. It does increase the amount of time
> >> > > the recovery takes (a new node added to the cluster used to take
> >> > > about 3-4 hours; it now takes about 24 hours).
> >> > >
> >> > > We are currently using these values, and they seem to work well for us:
> >> > > osd_max_backfills = 1
> >> > > osd_backfill_scan_min = 16
> >> > > osd_recovery_max_active = 1
> >> > > osd_backfill_scan_max = 32
> >> > >
> >> > > I would be interested in your results if you try these values.
> >> > > ----------------
> >> > > Robert LeBlanc
> >> > > PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
> >> > >
> >> > > On Wed, Jan 6, 2016 at 7:13 AM, nick <nick@xxxxxxx> wrote:
> >> > > > Heya,
> >> > > > we are using a ceph cluster (6 nodes, each with 10x 4 TB HDDs and
> >> > > > 2x SSDs for the journals) in combination with KVM virtualization.
> >> > > > All our virtual machine hard disks are stored on the ceph cluster.
> >> > > > The ceph cluster was updated to the 'infernalis' release recently.
> >> > > >
> >> > > > We are experiencing problems during cluster maintenance. A normal
> >> > > > workflow for us looks like this:
> >> > > >
> >> > > > - set the noout flag for the cluster
> >> > > > - stop all OSDs on one node
> >> > > > - update the node
> >> > > > - reboot the node
> >> > > > - start all OSDs
> >> > > > - wait for the backfilling to finish
> >> > > > - unset the noout flag
> >> > > >
> >> > > > After we start all OSDs on the node again, the cluster backfills
> >> > > > and tries to get all the OSDs in sync. During the beginning of
> >> > > > this process we experience 'stalls' in our running virtual
> >> > > > machines. On some, the load rises to a very high value. On others,
> >> > > > a running webserver responds only with 5xx HTTP codes. It takes
> >> > > > around 5-6 minutes until everything is OK again. After those 5-6
> >> > > > minutes the cluster is still backfilling, but the virtual machines
> >> > > > behave normally again.
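
A rough sketch of the maintenance workflow described above, for reference; the
systemd unit name (ceph-osd.target) is an assumption and depends on the distribution
and ceph packaging, and noout only prevents the OSDs from being marked out while the
node is down, it does not avoid the catch-up recovery once they rejoin:

"""
# before maintenance: keep the stopped OSDs from being marked out
ceph osd set noout

# stop the OSDs on the node to be updated (unit names vary by release/distro)
systemctl stop ceph-osd.target

# ... update and reboot the node; start the OSDs again if they do not start on boot
systemctl start ceph-osd.target

# watch recovery/backfill progress until the cluster is healthy again
ceph -w

# after backfilling has finished
ceph osd unset noout
"""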
> >> > > >
> >> > > > I already set the following parameters in ceph.conf on the nodes
> >> > > > to get a better rebalance-traffic/user-traffic ratio:
> >> > > >
> >> > > > """
> >> > > > [osd]
> >> > > > osd max backfills = 1
> >> > > > osd backfill scan max = 8
> >> > > > osd backfill scan min = 4
> >> > > > osd recovery max active = 1
> >> > > > osd recovery op priority = 1
> >> > > > osd op threads = 8
> >> > > > """
> >> > > >
> >> > > > It helped a bit, but we are still experiencing the problems
> >> > > > described above. It feels as if some virtual hard disks are locked
> >> > > > for a short time. Our ceph nodes are using bonded 10G network
> >> > > > interfaces for the 'OSD network', so I do not think the network is
> >> > > > a bottleneck.
> >> > > >
> >> > > > After reading this blog post:
> >> > > > http://dachary.org/?p=2182
> >> > > > I wonder if there really is a 'read lock' during the object push.
> >> > > >
> >> > > > Does anyone know more about this, or have others had the same
> >> > > > problems and been able to fix them?
> >> > > >
> >> > > > Best Regards
> >> > > > Nick
> >> > > >
> >> > > > --
> >> > > > Sebastian Nickel
> >> > > > Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
> >> > > > Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch
> >>
> >> --
> >> Sebastian Nickel
> >> Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
> >> Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch

--
Sebastian Nickel
Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch
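
As for the fio benchmarking mentioned at the top of the thread, a minimal sketch of
a run over several blocksizes from inside one of the VMs; the file path, size,
runtime and queue depth are illustrative assumptions, not the values actually used:

"""
# short random-write tests at several blocksizes against a file on the VM's
# RBD-backed disk; compare the reported IOPS/latency before and after config changes
for bs in 4k 64k 1m; do
    fio --name=randwrite-$bs --filename=/mnt/test/fio.dat --size=4G \
        --rw=randwrite --bs=$bs --ioengine=libaio --direct=1 \
        --iodepth=32 --runtime=60 --time_based --group_reporting
done
"""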
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com