Hello Alwin, dear all,

yesterday we finished the cluster migration to Proxmox and I had the same
problem again: a couple of OSDs down and out, and a stuck request on a
completely different OSD which blocked the VMs. So I put this specific OSD
out (ceph osd out xx) and voila, the problem was gone. Later on I put the
OSD back in and everything works as expected again.

In the meantime I read this post:
http://ceph.com/community/new-luminous-rados-improvements/
where network problems with switches are also mentioned. As the 1Gb network
is completely busy in such a scenario, my guess is that some network
communication got stuck somewhere.

All in all the transition from Ubuntu/Jewel to Ubuntu/Luminous to
Proxmox/Luminous went rather smoothly - despite the problem stated above -
but I am aware that I am running Ceph outside its requirements, so
definitely *thumbs up* for Ceph in general!

To your comments:

>> i am running ceph luminous (have upgraded two weeks ago)
> I guess, you are running on ceph 12.2.1 (12.2.2 is out)? What does ceph versions say?

12.2.1

>> ceph communication is carried out on a seperate 1Gbit Network where we
>> plan to upgrade to bonded 2x10Gbit during the next couple of weeks.
> With 6 hosts you will need 10GbE, alone for lower latency. Also a ceph
> recovery/rebalance might max out the bandwidth of your link.

Yes, I think this is the problem.

> Mixing of spinners with SSDs is not recommended, as spinners will slow
> down the pools residing on that root.

Why should this happen? I would have assumed that OSDs are separate daemons
running on the hosts, not influencing each other. Otherwise I would need a
different set of hosts for the SSDs and the HDDs? (A rough sketch of how I
understand the device-class separation follows further down.)

>> when i turn off one of the hosts (lets say node7) that do only ceph,
>> after some time the vm's stall and hang until the host comes up again.
> A stall of I/O shouldn't happen, what is your min_size of the pools? How
> is your 'ceph osd tree' looking?

You can find it in the ownCloud link, at least 'ceph osd df tree'.

>> but neither osd's 9, 10 or 5 are located on host7 - so can anyone of you
>> tell me why the requests to this nodes got stuck ?
> Those OSDs are waiting on other OSDs on host7, you can see that in the
> ceph logs and you see with 'ceph pg dump' which pgs are located on which
> OSDs.

OK, you mean they are waiting for operations to finish on the OSDs that just
went offline? That should be a normal scenario when hardware fails, so it
shouldn't lead to a stuck VM, I assume?

>> i have one pg in state "stuck unclean" which has its replicas on osd's
>> 2, 3 and 15. 3 is on node7, but the first in the active set is 2 - i
>> thought the "write op" should have gone there ... so why unclean ? the
>> manual states "For stuck unclean placement groups, there is usually
>> something preventing recovery from completing, like unfound objects" but
>> there arent ...
> unclean - The placement group has not been clean for too long (i.e., it
> hasn’t been able to completely recover from a previous failure).
> http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#stuck-placement-groups

I know this, but there was no previous failure. Whenever I turn off some
OSDs I get this after some time.

> How is your 1GbE utilized? I guess, with 6 nodes (3-4 OSDs) your link
> might be maxed out. But you should get something in the ceph
> logs.

Yes, it is maxed out; see further down for what I plan to try against that.
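Coming back to the spinners/SSDs point above: if I understand the Luminous
device classes correctly, I would not need separate hosts for that - one
CRUSH rule per device class should be enough to keep the SSD pools off the
spinners. A minimal sketch of what I have in mind (the pool names are just
placeholders, I have not tried this on the cluster yet):

  # Luminous assigns a device class (hdd/ssd) to every OSD automatically
  ceph osd crush class ls
  ceph osd tree          # the CLASS column shows the assignment

  # one replicated rule per class, root "default", failure domain "host"
  ceph osd crush rule create-replicated replicated_hdd default host hdd
  ceph osd crush rule create-replicated replicated_ssd default host ssd

  # point each pool at the matching rule (placeholder pool names)
  ceph osd pool set my_hdd_pool crush_rule replicated_hdd
  ceph osd pool set my_ssd_pool crush_rule replicated_ssd

That way HDDs and SSDs can stay on the same hosts but end up in different
rules, so a pool on the SSD rule should not be dragged down by the spinners -
unless I am missing something? Changing the rule of an existing pool will
shuffle data around, so that would be something for a quiet moment and for
after the 10GbE upgrade anyway.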
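And regarding the maxed-out 1GbE link: until the 2x10Gbit bond is there, I
will probably try to see what a blocked request is actually waiting for and
throttle recovery/backfill a bit so it cannot eat the whole link. Roughly
like this (osd.9 only as an example, and the injected values are a first
guess, not tuned for this cluster):

  # what is a stuck request actually waiting for?
  # (the daemon command has to be run on the host that carries that OSD)
  ceph health detail
  ceph daemon osd.9 dump_ops_in_flight

  # throttle recovery/backfill at runtime so it does not saturate the link
  # (example values - revert or make them persistent in ceph.conf if they help)
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

If dump_ops_in_flight shows the op waiting for a sub-op on one of the OSDs of
node7, that would at least confirm what Alwin wrote above.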
Beyond that, I suspect it may also be a problem of the network hardware,
i.e. that some packets get lost or stuck somewhere.

>> do i have a configuration issue here (amount of replicas?) or is this
>> behavior simply just because my cluster network is too slow ?
>>
>> you can find detailed outputs here :
>>
>> https://owncloud.priesch.co.at/index.php/s/toYdGekchqpbydY
>>
>> i hope any of you can help me shed any light on this ...
>>
>> at least the point of all is that a single host should be allowed to
>> fail and the vm's continue running ... ;)
> To get a better look at your setup, a crush map, ceph osd dump, ceph -s
> and some log output would be nice.

You should find it all in ceph_report.txt in the link above.

> Also you are moving to Proxmox, you might want to have look at the docs
> & the forum.
>
> Docs: https://pve.proxmox.com/pve-docs/
> Forum: https://forum.proxmox.com

Thanks, I have already been there.

> Some more useful information on PVE + Ceph: https://forum.proxmox.com/threads/ceph-raw-usage-grows-by-itself.38395/#post-189842

I haven't read this one yet. Thanks a lot!

marcus.