I think I might have found something. When I start an OSD it generates high
I/O, around 95%, and the other OSDs are triggered as well, so altogether
they generate the same high I/O. This happens even when I set the noup
flag. So every OSD start drives all the OSDs to high I/O, and I think this
is too much. I have 168 OSDs, and when I start them the OSD I/O never
finishes. I left the cluster alone for 70 hours and the high I/O never
stopped. We are trying to start the OSDs host by host and wait for things
to settle, but it takes too much time (a rough sketch of the procedure we
follow is at the bottom of this mail, below the quoted thread). A busy OSD
cannot even answer "ceph tell osd.158 version". It becomes so busy that
this looks like a loop, since one OSD starting up triggers I/O on the other
OSDs again.

So I ran the OSD with debug logging and I hope it can be examined. This is
the OSD log with debug = 20:

Full log: https://www.dropbox.com/s/pwzqeajlsdwaoi1/ceph-osd.90.log?dl=0
Shorter log, only the last part before the high I/O finished:
https://paste.ubuntu.com/p/7ZfwH8CBC5/

strace -f -p <osd pid>:
- When I start the OSD: https://paste.ubuntu.com/p/8n2kTvwnG6/
- After the I/O has finished: https://paste.ubuntu.com/p/4sGfj7Bf4c/

Some people on IRC say this is a bug and suggest trying Ubuntu and the new
Ceph repo; maybe that will help. I agree with them and I will give it a
shot. What do you think?

On Thu, 27 Sep 2018 at 16:27, by morphin <morphinwithyou@xxxxxxxxx> wrote:
>
> I should not have any client I/O right now. All of my VMs are down right
> now. There is only a single pool.
>
> Here is my crush map: https://paste.ubuntu.com/p/Z9G5hSdqCR/
>
> The cluster does not recover. After starting the OSDs with the specified
> flags, the OSD up count drops from 168 to 50 within 24 hours.
>
> On Thu, 27 Sep 2018 at 16:10, Stefan Kooman <stefan@xxxxxx> wrote:
> >
> > Quoting by morphin (morphinwithyou@xxxxxxxxx):
> > > After 72 hours I believe we may have hit a bug. Any help would be
> > > greatly appreciated.
> >
> > Is it feasible for you to stop all client IO to the Ceph cluster? At
> > least until it stabilizes again. "ceph osd pause" would do the trick
> > ("ceph osd unpause" would unset it).
> >
> > What kind of workload are you running on the cluster? What does your
> > crush map look like (ceph osd getcrushmap -o /tmp/crush_raw;
> > crushtool -d /tmp/crush_raw -o /tmp/crush_edit)?
> >
> > I have seen a (test) Ceph cluster "healing" itself to the point where
> > there was nothing left to recover. In *that* case the disks were
> > overbooked (multiple OSDs per physical disk) ... The flags you set
> > (noout, nodown, nobackfill, norecover, noscrub, etc.) helped to get it
> > to recover again. I would try to get all OSDs online again (and
> > manually keep them up / restart them, because you have set nodown).
> >
> > Does the cluster recover at all?
> >
> > Gr. Stefan
> >
> > --
> > | BIT BV  http://www.bit.nl/        Kamer van Koophandel 09090351
> > | GPG: 0xD14839C6                   +31 318 648 688 / info@xxxxxx
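
For reference, this is roughly how we bring up one host's OSDs and wait for
them before moving on to the next host. It is only a sketch: the OSD ids,
the systemd unit name (ceph-osd@<id>) and the exact set of flags are
assumptions about our setup, not something taken from the logs above.

    # keep CRUSH and recovery quiet while OSDs are being restarted
    ceph osd set noout
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set norecover

    # start this host's OSDs (ids are placeholders) and poll until each
    # one answers "ceph tell" before touching the next one
    for id in 88 89 90; do
        systemctl start ceph-osd@"$id"
        until timeout 10 ceph tell osd."$id" version >/dev/null 2>&1; do
            sleep 30    # OSD is still busy with its startup I/O
        done
    done

    # once the whole cluster has settled, clear the flags again
    for flag in noout norebalance nobackfill norecover; do
        ceph osd unset "$flag"
    done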