On Wed, Dec 6, 2017 at 4:09 AM, Marcus Priesch <marcus@xxxxxxxxxxxxx> wrote:
> Dear Ceph Users,
>
> first of all, big thanks to all the devs and people who made all this
> possible, ceph is amazing!!!
>
> OK, so let me get to the point where I need your help:
>
> I have a cluster of 6 hosts, with a mix of SSDs and HDDs.
>
> On 4 of the 6 hosts, 21 VMs are running in total with little to no
> workload (web, mail, elasticsearch) for a couple of users.
>
> 4 nodes are running Ubuntu Server and 2 of them are running Proxmox
> (because we are now in the process of migrating towards Proxmox).
>
> I am running Ceph Luminous (upgraded two weeks ago).
>
> Ceph communication is carried out on a separate 1 Gbit network, which we
> plan to upgrade to bonded 2x10 Gbit during the next couple of weeks.
>
> I have two pools defined, which I only use for disk images via libvirt/rbd.
>
> The hdd pool has two replicas and is for large (~4 TB) backup images, and
> the ssd pool has three replicas (two on SSD OSDs and one on HDD OSDs)
> for improved fail safety and faster access for "live data" and OS
> images.
>
> In the crush map I have two different rules for the two pools so that
> replicas are always stored on different hosts - I have verified this and
> it works. It is coded via the "host" attribute (host node1-hdd and host
> node1 are actually both on the same physical host).
>
> So, now comes the interesting part:
>
> When I turn off one of the hosts that only run Ceph (let's say node7),
> after some time the VMs stall and hang until the host comes up again.
>
> When I don't turn the host back on, after some time the cluster starts
> rebalancing...
>
> Yesterday I saw that after a couple of hours of rebalancing the
> VMs started working again - I think that's when the cluster had
> finished rebalancing? Haven't really dug into this.
>
> Well, today we turned off the same host (node7) again and I got stuck
> PGs again.
>
> This time I did some investigation and to my surprise I found the
> following in the output of ceph health detail:
>
> REQUEST_SLOW 17 slow requests are blocked > 32 sec
>     3 ops are blocked > 2097.15 sec
>     14 ops are blocked > 1048.58 sec
>     osds 9,10 have blocked requests > 1048.58 sec
>     osd.5 has blocked requests > 2097.15 sec
>
> I think the blocked requests are my problem, aren't they?
>
> But none of OSDs 9, 10 or 5 are located on node7 - so can any of you
> tell me why the requests to these OSDs got stuck?
>
> I have one PG in state "stuck unclean" which has its replicas on OSDs
> 2, 3 and 15. 3 is on node7, but the first in the acting set is 2 - I
> thought the "write op" should have gone there... so why unclean? The
> manual states "For stuck unclean placement groups, there is usually
> something preventing recovery from completing, like unfound objects",
> but there aren't any...
>
> Do I have a configuration issue here (number of replicas?) or is this
> behavior simply because my cluster network is too slow?
>
> You can find detailed outputs here:
>
> https://owncloud.priesch.co.at/index.php/s/toYdGekchqpbydY
>
> I hope some of you can help me shed light on this...
>
> After all, the point is that a single host should be allowed to
> fail and the VMs continue running... ;)

You don't really have six MONs, do you (although I know the answer to
this question)? I think you need to take another look at some of the
docs about monitors.
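If you do run a MON on every host (a guess on my part - your message
doesn't say where they live), keep in mind that six monitors need four
of them up to form quorum, so an even count gives you no more failure
tolerance than five would. You can check what the cluster actually sees
with:

    ceph mon stat
    ceph quorum_status --format json-pretty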
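A few other things are worth checking while node7 is down. If min_size
on your two-replica hdd pool is 2 (i.e. equal to size), every PG with a
replica on node7 will block I/O until recovery restores a second copy -
which would look exactly like the stalls you describe. The pool name
below is a placeholder, substitute your own:

    ceph osd dump | grep pool
    ceph osd pool get <hdd-pool> size
    ceph osd pool get <hdd-pool> min_size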
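Also, the slow requests on OSDs 5, 9 and 10 don't have to originate on
node7 - an op on a primary can sit waiting for a replica that lives on
the dead host, or for peering to finish. To see what an op is actually
waiting for, dump it from the admin socket on the host that carries the
OSD, and query the stuck PG directly (the PG id below is made up - use
the one from ceph health detail):

    # run on the host that carries osd.5
    ceph daemon osd.5 dump_ops_in_flight

    # run from any node with an admin keyring
    ceph pg dump_stuck unclean
    ceph pg 2.3f query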
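Finally, to rule out placement, verify which CRUSH rule each pool uses
and what that rule actually does:

    ceph osd pool get <ssd-pool> crush_rule
    ceph osd crush rule dump

For reference, a rule that puts two copies on SSD hosts and the third
on an HDD host typically looks something like this in a decompiled
crushmap - a sketch only, assuming separate 'ssd' and 'hdd' roots, not
taken from your actual map:

    rule ssd_then_hdd {
        id 1
        type replicated
        min_size 2
        max_size 3
        # first two copies on distinct hosts under the ssd root
        step take ssd
        step chooseleaf firstn 2 type host
        step emit
        # remaining copies (pool size minus 2) on hosts under the hdd root
        step take hdd
        step chooseleaf firstn -2 type host
        step emit
    }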
> regards and thanks in advance,
> marcus.
>
> --
> Marcus Priesch
> open source consultant - solution provider
> www.priesch.co.at / office@xxxxxxxxxxxxx
> A-2122 Riedenthal, In Prandnern 31 / +43 650 62 72 870

--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com