Hi Jan,

yes, I'm watching this TT as well. I'll post an update there (together with
a quick & dirty patch to get more debugging info).

BR

nik

On Mon, Mar 23, 2020 at 12:12:43PM +0100, Jan Fajerski wrote:
> https://tracker.ceph.com/issues/44184
> Looks similar, maybe you're also seeing other symptoms listed there?
> In any case would be good to track this in one place.
>
> On Mon, Mar 23, 2020 at 11:29:53AM +0100, Nikola Ciprich wrote:
> >OK, so after some debugging, I've pinned the problem down to
> >OSDMonitor::get_trim_to:
> >
> >    std::lock_guard<std::mutex> l(creating_pgs_lock);
> >    if (!creating_pgs.pgs.empty()) {
> >      return 0;
> >    }
> >
> >apparently creating_pgs.pgs.empty() is not true, do I understand it
> >correctly that the cluster thinks the list of creating pgs is not empty?
> >
> >all pgs are in clean+active state, so maybe there's something malformed
> >in the db? How can I check?
> >
> >I tried dumping the list of creating_pgs according to
> >http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030297.html
> >but to no avail
> >
> >On Tue, Mar 17, 2020 at 12:25:29PM +0100, Nikola Ciprich wrote:
> >>Hello dear cephers,
> >>
> >>lately, there's been some discussion about slow requests hanging
> >>in "wait for new map" status. At least in my case, it's being caused
> >>by osdmaps not being properly trimmed. I tried all possible steps
> >>to force osdmap pruning (restarting mons, restarting everything,
> >>poking crushmap), to no avail. Still all OSDs keep min osdmap version
> >>1, while newest is 4734. Otherwise the cluster is healthy, with no down
> >>OSDs, network communication works flawlessly, all seems to be fine.
> >>Just can't get old osdmaps to go away. It's a very small cluster and I've
> >>moved all production traffic elsewhere, so I'm free to investigate
> >>and debug, however I'm out of ideas on what to try or where to look.
> >>
> >>Any ideas somebody please?
> >>
> >>The cluster is running 13.2.8
> >>
> >>I'd be very grateful for any tips
> >>
> >>with best regards
> >>
> >>nikola ciprich
> >>
> >>--
> >>-------------------------------------
> >>Ing. Nikola CIPRICH
> >>LinuxBox.cz, s.r.o.
> >>28.rijna 168, 709 00 Ostrava
> >>
> >>tel.:   +420 591 166 214
> >>fax:    +420 596 621 273
> >>mobil:  +420 777 093 799
> >>www.linuxbox.cz
> >>
> >>mobil servis: +420 737 238 656
> >>email servis: servis@xxxxxxxxxxx
> >>-------------------------------------
> >_______________________________________________
> >ceph-users mailing list -- ceph-users@xxxxxxx
> >To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> Jan Fajerski
> Senior Software Engineer Enterprise Storage
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer