Hello Simon,

Another idea is to increase choose_total_tries.

Hope this helps,
Mehmet

On 7 March 2019 09:56:17 CET, Martin Verges <martin.verges@xxxxxxxx> wrote:
>Hello,
>
>try restarting every OSD if possible.
>Upgrade to a recent Ceph version.
>
>--
>Martin Verges
>Managing director
>
>Mobile: +49 174 9335695
>E-Mail: martin.verges@xxxxxxxx
>Chat: https://t.me/MartinVerges
>
>croit GmbH, Freseniusstr. 31h, 81247 Munich
>CEO: Martin Verges - VAT-ID: DE310638492
>Com. register: Amtsgericht Munich HRB 231263
>
>Web: https://croit.io
>YouTube: https://goo.gl/PGE1Bx
>
>
>On Thu, 7 Mar 2019 at 08:39, simon falicon <simonfalicon@xxxxxxxxx> wrote:
>
>> Hello Ceph Users,
>>
>> I have an issue with my Ceph cluster: after a serious failure of four
>> SSDs (electrically dead) I have lost PGs (and their replicas) and now
>> have 14 PGs stuck.
>>
>> To correct this, I tried to force-create those PGs (with the same IDs),
>> but now the PGs are stuck in the creating state -_-" :
>>
>> ~# ceph -s
>>     health HEALTH_ERR
>>            14 pgs are stuck inactive for more than 300 seconds
>>            ....
>>
>> ~# ceph pg dump | grep creating
>> dumped all in format plain
>> 9.3  0 0 0 0 0 0 0 0  creating  2019-02-25 09:32:12.333979  0'0  0:0  [20,26]  20  [20,11]  20  0'0  2019-02-25 09:32:12.333979  0'0  2019-02-25 09:32:12.333979
>> 3.9  0 0 0 0 0 0 0 0  creating  2019-02-25 09:32:11.295451  0'0  0:0  [16,39]  16  [17,6]   17  0'0  2019-02-25 09:32:11.295451  0'0  2019-02-25 09:32:11.295451
>> ...
>>
>> I have tried creating a new PG that did not exist before and that works,
>> but these PGs stay stuck in the creating state.
>>
>> In my monitor logs I have this message:
>>
>> 2019-02-25 11:02:46.904897 7f5a371ed700  0 mon.controller1@1(peon) e7 handle_command mon_command({"prefix": "pg force_create_pg", "pgid": "4.20e"} v 0) v1
>> 2019-02-25 11:02:46.904938 7f5a371ed700  0 log_channel(audit) log [INF] : from='client.? 172.31.101.107:0/3101034432' entity='client.admin' cmd=[{"prefix": "pg force_create_pg", "pgid": "4.20e"}]: dispatch
>>
>> When I check the PG map I have:
>>
>> ~# ceph pg map 4.20e
>> osdmap e428069 pg 4.20e (4.20e) -> up [27,37,36] acting [13,17]
>>
>> I have restarted OSDs 27, 37, 36, 13 and 17 (one by one) but with no effect.
>>
>> I have seen this issue http://tracker.ceph.com/issues/18298, but I am
>> running Ceph 10.2.11.
>>
>> So could you help me please?
>>
>> Many thanks in advance,
>> Sfalicon.
>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
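
A minimal sketch of Mehmet's choose_total_tries suggestion, assuming default tunables (where the value is commonly 50) and illustrative file names (crushmap.bin, crushmap.txt, crushmap.new); this only shows the general CRUSH-map export/edit/reinject workflow and should be adapted and tested on a non-production cluster first:

    ~# ceph osd crush show-tunables                 # inspect the current choose_total_tries value
    ~# ceph osd getcrushmap -o crushmap.bin         # export the compiled CRUSH map
    ~# crushtool -d crushmap.bin -o crushmap.txt    # decompile it to editable text
    # in crushmap.txt, raise the "tunable choose_total_tries 50" line to e.g. 100,
    # then recompile and inject the updated map:
    ~# crushtool -c crushmap.txt -o crushmap.new
    ~# ceph osd setcrushmap -i crushmap.new

Raising choose_total_tries gives CRUSH more attempts to find enough distinct OSDs for each PG, which can help after several OSDs have disappeared from the map, but it will not by itself recover PGs whose only copies were on the failed SSDs.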