Hi,

The cluster is still down :( Up to now we have managed to stabilize the OSDs: 118 of 160 OSDs are stable and the cluster is still in the process of settling. Thanks to Be-El in the ceph IRC channel, who helped a lot to make the flapping OSDs stable.

What we have learned so far is that the cause was the sudden death of 2 of our 3 monitor servers, and that when they come back, if the OSDs do not start one by one (each joining the cluster before the next starts), this can happen: the cluster can become unhealthy and it can take countless hours to come back.

Right now here is our status:
ceph -s: https://paste.ubuntu.com/p/6DbgqnGS7t/
health detail: https://paste.ubuntu.com/p/w4gccnqZjR/

Since the OSD disks are NL-SAS, it can take up to 24 hours even for an online cluster. What is more, it has been said that we would be extremely lucky if all the data is rescued. Most unhappily, our strategy is just to sit and wait :(

As soon as the peering and activating count drops to 300-500 PGs, we will restart the stopped OSDs one by one, and after each OSD we will wait for the cluster to settle down. The amount of data stored in the OSDs is 33 TB.

Our main concern is to export our rbd pool data to a backup space. Then we will start again with a clean cluster. I hope to validate our analysis with an expert. Any help or advice would be greatly appreciated.

On Tue, 25 Sep 2018 at 15:08, by morphin <morphinwithyou@xxxxxxxxx> wrote:
>
> After reducing the recovery parameter values, things did not change much.
> There are still a lot of OSDs marked down.
>
> I don't know what I need to do after this point.
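The restart-one-and-wait plan above could be scripted roughly like this. This is only a sketch: the `pgs_unsettled` helper, the 500-PG threshold, and the `ceph pg stat` parsing are my assumptions and may need adjusting for your Ceph release; supply the real stopped OSD ids yourself.

```shell
#!/usr/bin/env bash
# Count PGs that are still peering or activating, by parsing `ceph pg stat`.
# (Assumption: the output contains entries like "1372 peering, 73 activating".)
pgs_unsettled() {
    ceph pg stat | grep -oE '[0-9]+ (peering|activating)' | awk '{s+=$1} END {print s+0}'
}

# Start each stopped OSD in turn, and do not touch the next one until the
# number of unsettled PGs has dropped back under the chosen threshold.
restart_osds_one_by_one() {
    local osd
    for osd in "$@"; do
        systemctl start "ceph-osd@${osd}"
        while [ "$(pgs_unsettled)" -gt 500 ]; do
            sleep 60
        done
    done
}

# Example invocation (commented out; use your actual stopped OSD ids):
# restart_osds_one_by_one 12 37 84
```

The threshold and sleep interval are arbitrary starting points; on NL-SAS disks the settle time per OSD can be long, so err on the patient side.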
>
> [osd]
> osd recovery op priority = 63
> osd client op priority = 1
> osd recovery max active = 1
> osd max scrubs = 1
>
> ceph -s
>   cluster:
>     id:     89569e73-eb89-41a4-9fc9-d2a5ec5f4106
>     health: HEALTH_ERR
>             42 osds down
>             1 host (6 osds) down
>             61/8948582 objects unfound (0.001%)
>             Reduced data availability: 3837 pgs inactive, 1822 pgs down, 1900 pgs peering, 6 pgs stale
>             Possible data damage: 18 pgs recovery_unfound
>             Degraded data redundancy: 457246/17897164 objects degraded (2.555%), 213 pgs degraded, 209 pgs undersized
>             2554 slow requests are blocked > 32 sec
>             3273 slow ops, oldest one blocked for 1453 sec, daemons [osd.0,osd.1,osd.10,osd.100,osd.101,osd.102,osd.103,osd.104,osd.105,osd.106]... have slow ops.
>
>   services:
>     mon: 3 daemons, quorum SRV-SEKUARK3,SRV-SBKUARK2,SRV-SBKUARK3
>     mgr: SRV-SBKUARK2(active), standbys: SRV-SEKUARK2, SRV-SEKUARK3, SRV-SEKUARK4
>     osd: 168 osds: 118 up, 160 in
>
>   data:
>     pools:   1 pools, 4096 pgs
>     objects: 8.95 M objects, 17 TiB
>     usage:   33 TiB used, 553 TiB / 586 TiB avail
>     pgs:     93.677% pgs not active
>              457246/17897164 objects degraded (2.555%)
>              61/8948582 objects unfound (0.001%)
>              1676 down
>              1372 peering
>              528  stale+peering
>              164  active+undersized+degraded
>              145  stale+down
>              73   activating
>              40   active+clean
>              29   stale+activating
>              17   active+recovery_unfound+undersized+degraded
>              16   stale+active+clean
>              16   stale+active+undersized+degraded
>              9    activating+undersized+degraded
>              3    active+recovery_wait+degraded
>              2    activating+undersized
>              2    activating+degraded
>              1    creating+down
>              1    stale+active+recovery_unfound+undersized+degraded
>              1    stale+active+clean+scrubbing+deep
>              1    stale+active+recovery_wait+degraded
>
> ceph -w: https://paste.ubuntu.com/p/WZ2YqzS86S/
> ceph health detail: https://paste.ubuntu.com/p/8w7Jpms8fj/
>
> On Tue, 25 Sep 2018 at 14:32, by morphin <morphinwithyou@xxxxxxxxx> wrote:
> >
> > The config didn't work: increasing the numbers only led to more OSD drops.
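Those [osd] values can also be pushed into the already-running daemons without a restart, via injectargs. A minimal sketch; the exact flag spelling can vary slightly between releases, so verify the result with `ceph daemon osd.N config show`:

```shell
# Sketch: apply the throttled recovery settings from the [osd] section above
# to all running OSDs at once, without restarting the daemons.
apply_recovery_throttle() {
    ceph tell 'osd.*' injectargs \
        '--osd_recovery_max_active 1 --osd_max_scrubs 1'
}
```

Injected values only last until the daemon restarts, so the ceph.conf entries are still needed for persistence.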
> >
> > ceph -s
> >   cluster:
> >     id:     89569e73-eb89-41a4-9fc9-d2a5ec5f4106
> >     health: HEALTH_ERR
> >             norebalance,norecover flag(s) set
> >             1 osds down
> >             17/8839434 objects unfound (0.000%)
> >             Reduced data availability: 3578 pgs inactive, 861 pgs down, 1928 pgs peering, 11 pgs stale
> >             Degraded data redundancy: 44853/17678868 objects degraded (0.254%), 221 pgs degraded, 20 pgs undersized
> >             610 slow requests are blocked > 32 sec
> >             3996 stuck requests are blocked > 4096 sec
> >             6076 slow ops, oldest one blocked for 4129 sec, daemons [osd.0,osd.1,osd.10,osd.100,osd.101,osd.102,osd.103,osd.104,osd.105,osd.106]... have slow ops.
> >
> >   services:
> >     mon: 3 daemons, quorum SRV-SEKUARK3,SRV-SBKUARK2,SRV-SBKUARK3
> >     mgr: SRV-SBKUARK2(active), standbys: SRV-SEKUARK2, SRV-SEKUARK3
> >     osd: 168 osds: 128 up, 129 in; 2 remapped pgs
> >          flags norebalance,norecover
> >
> >   data:
> >     pools:   1 pools, 4096 pgs
> >     objects: 8.84 M objects, 17 TiB
> >     usage:   26 TiB used, 450 TiB / 477 TiB avail
> >     pgs:     0.024% pgs unknown
> >              89.160% pgs not active
> >              44853/17678868 objects degraded (0.254%)
> >              17/8839434 objects unfound (0.000%)
> >              1612 peering
> >              720  down
> >              583  activating
> >              319  stale+peering
> >              255  active+clean
> >              157  stale+activating
> >              108  stale+down
> >              95   activating+degraded
> >              84   stale+active+clean
> >              50   active+recovery_wait+degraded
> >              29   creating+down
> >              23   stale+activating+degraded
> >              18   stale+active+recovery_wait+degraded
> >              14   active+undersized+degraded
> >              12   active+recovering+degraded
> >              4    stale+creating+down
> >              3    stale+active+recovering+degraded
> >              3    stale+active+undersized+degraded
> >              2    stale
> >              1    active+recovery_wait+undersized+degraded
> >              1    active+clean+scrubbing+deep
> >              1    unknown
> >              1    active+undersized+degraded+remapped+backfilling
> >              1    active+recovering+undersized+degraded
> >
> > I guess the OSD down/drop issue increases the recovery time.
> > So I decided to try decreasing the recovery parameters to put less load on the cluster.
> >
> > I have NVMe and SAS disks. The servers are powerful enough and the network is 4x10Gb.
> > I don't think my cluster is in bad shape, because I have datacenter redundancy (14 servers + 14 servers). The 7 crashed servers are all in datacenter A, and it took only a few minutes to bring them back online. Also, 2 of them are monitors, so cluster I/O should have been suspended and there should be little data difference.
> >
> > On the other hand, I don't understand the burden of this recovery. I have been through many recoveries, but none of them stopped my cluster from working. This recovery burden is so high that it hasn't stopped for hours. I wish I could just decrease the recovery speed and continue to serve my VMs. Is the recovery load somehow different in Mimic? Luminous was pretty fine indeed.
> >
> > On Tue, 25 Sep 2018 at 13:57, by morphin <morphinwithyou@xxxxxxxxx> wrote:
> > >
> > > Thank you for the answer.
> > >
> > > What do you think of this conf to speed up the recovery?
> > >
> > > [osd]
> > > osd recovery op priority = 63
> > > osd client op priority = 1
> > > osd recovery max active = 16
> > > osd max scrubs = 16
> > >
> > > On Tue, 25 Sep 2018 at 13:37, the user with the address <admin@xxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Just let it recover.
> > > >
> > > >   data:
> > > >     pools:   1 pools, 4096 pgs
> > > >     objects: 8.95 M objects, 17 TiB
> > > >     usage:   34 TiB used, 577 TiB / 611 TiB avail
> > > >     pgs:     94.873% pgs not active
> > > >              48475/17901254 objects degraded (0.271%)
> > > >              1/8950627 objects unfound (0.000%)
> > > >              2631 peering
> > > >              637  activating
> > > >              562  down
> > > >              159  active+clean
> > > >              44   activating+degraded
> > > >              30   active+recovery_wait+degraded
> > > >              12   activating+undersized+degraded
> > > >              10   active+recovering+degraded
> > > >              10   active+undersized+degraded
> > > >              1    active+clean+scrubbing+deep
> > > >
> > > > You've got PGs being deep scrubbed, which puts considerable IO load on the OSDs.
> > > >
> > > > September 25, 2018 1:23 PM, "by morphin" <morphinwithyou@xxxxxxxxx> wrote:
> > > >
> > > > > What should I do now?

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
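On the deep-scrub load mentioned above: while the cluster is this far from healthy, scrubbing can be paused cluster-wide with the standard noscrub/nodeep-scrub flags and re-enabled once recovery finishes. A small sketch; only use it if you are comfortable deferring scrubs for the duration:

```shell
# Sketch: pause all scrubbing while recovery is running, then resume it
# once the cluster reports HEALTH_OK again.
pause_scrubbing() {
    ceph osd set noscrub
    ceph osd set nodeep-scrub
}

resume_scrubbing() {
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub
}
```

Remember to unset the flags afterwards; leaving scrubs disabled long-term hides latent data corruption.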