Heya,

we are using a Ceph cluster (6 nodes, each with 10x 4TB HDD + 2x SSD for journals) in combination with KVM virtualization. All our virtual machine hard disks are stored on the Ceph cluster. The cluster was recently updated to the 'infernalis' release.

We are experiencing problems during cluster maintenance. A normal workflow for us looks like this (the concrete commands are sketched in the P.S. below):

- set the noout flag for the cluster
- stop all OSDs on one node
- update the node
- reboot the node
- start all OSDs
- wait for the backfilling to finish
- unset the noout flag

After we start all OSDs on the node again, the cluster backfills and brings the OSDs back in sync. During the beginning of this process we experience 'stalls' in our running virtual machines: on some the load rises to a very high value, on others a running webserver responds only with 5xx HTTP codes. It takes around 5-6 minutes until everything is fine again. After those 5-6 minutes the cluster is still backfilling, but the virtual machines behave normally again.

I have already set the following parameters in ceph.conf on the nodes to get a better recovery traffic / client traffic ratio:

"""
[osd]
osd max backfills = 1
osd backfill scan max = 8
osd backfill scan min = 4
osd recovery max active = 1
osd recovery op priority = 1
osd op threads = 8
"""

It helped a bit, but we are still seeing the problems described above. It feels as if some virtual hard disks are locked for a short time. Our Ceph nodes use bonded 10G network interfaces for the OSD network, so I do not think the network is a bottleneck.

After reading this blog post: http://dachary.org/?p=2182 I wonder if there really is a 'read lock' on an object while it is being pushed. Does anyone know more about this, or have others had the same problems and been able to fix them?

Best Regards

Nick

--
Sebastian Nickel
Nine Internet Solutions AG, Albisriederstr. 243a, CH-8047 Zuerich
Tel +41 44 637 40 00 | Support +41 44 637 40 40 | www.nine.ch
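P.S. A minimal sketch of the maintenance workflow above as shell commands, assuming the OSDs are managed via systemd (ceph-osd.target); on sysvinit hosts the stop/start steps would go through the 'ceph' init script instead:

"""
# flag the cluster so the stopped OSDs are not marked out and rebalanced
ceph osd set noout

# on the node being maintained: stop all OSD daemons on that host
systemctl stop ceph-osd.target

# ... update packages and reboot the node ...

# after the reboot: start the OSDs again
systemctl start ceph-osd.target

# watch backfill/recovery progress until the cluster is healthy again
ceph -w        # or check repeatedly with: ceph -s

# remove the flag once backfilling has finished
ceph osd unset noout
"""

The recovery/backfill settings from ceph.conf can also be injected into running OSDs for experimenting, without a restart, e.g.:

"""
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'
"""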