Re: can't get healthy cluster to trim osdmaps (13.2.8)

Hi Jan,

Yes, I'm watching this ticket as well. I'll post an update there
(together with a quick & dirty patch to get more debugging info).

BR

nik


On Mon, Mar 23, 2020 at 12:12:43PM +0100, Jan Fajerski wrote:
> https://tracker.ceph.com/issues/44184
> Looks similar, maybe you're also seeing other symptoms listed there?
> In any case would be good to track this in one place.
> 
> On Mon, Mar 23, 2020 at 11:29:53AM +0100, Nikola Ciprich wrote:
> >OK, so after some debugging, I've pinned the problem down to
> >OSDMonitor::get_trim_to:
> >
> >   std::lock_guard<std::mutex> l(creating_pgs_lock);
> >   if (!creating_pgs.pgs.empty()) {
> >     return 0;
> >   }
> >
> >Apparently creating_pgs.pgs.empty() is not true. Do I understand
> >correctly that the cluster thinks the list of creating PGs is not empty?
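
To illustrate the guard quoted above, here is a minimal, self-contained sketch of how a non-empty creating-PG list would keep the trim target at 0. It is a toy model, not the actual OSDMonitor code: `TrimModel`, `CreatingPGs`, `mark_creating`, `mark_created`, and the `floor` parameter are all hypothetical names, and whatever the real monitor computes after the guard is reduced here to returning `floor`.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>
#include <string>

using epoch_t = uint32_t;

// Hypothetical stand-in for the monitor's record of PGs still being created.
struct CreatingPGs {
  std::set<std::string> pgs;
};

// Toy model of the trim decision; only the early-return guard mirrors the
// snippet quoted in the thread, everything else is an assumption.
class TrimModel {
public:
  void mark_creating(const std::string& pgid) {
    std::lock_guard<std::mutex> l(creating_pgs_lock);
    creating_pgs.pgs.insert(pgid);
  }

  void mark_created(const std::string& pgid) {
    std::lock_guard<std::mutex> l(creating_pgs_lock);
    creating_pgs.pgs.erase(pgid);
  }

  // Returns the epoch up to which osdmaps may be trimmed; 0 means "trim nothing".
  // `floor` is a hypothetical input standing in for the value the real
  // monitor would derive once the guard has been passed.
  epoch_t get_trim_to(epoch_t floor) {
    std::lock_guard<std::mutex> l(creating_pgs_lock);
    if (!creating_pgs.pgs.empty()) {
      return 0;  // same early return as in the quoted code
    }
    return floor;
  }

private:
  std::mutex creating_pgs_lock;
  CreatingPGs creating_pgs;
};

// Small usage example: one leftover entry is enough to block trimming.
inline void trim_model_demo() {
  TrimModel m;
  m.mark_creating("1.0");            // hypothetical PG id
  assert(m.get_trim_to(4700) == 0);  // blocked while the entry remains
  m.mark_created("1.0");
  assert(m.get_trim_to(4700) == 4700);
}
```

In this model a single stale entry is sufficient to pin the result at 0 regardless of PG health, which would be consistent with the symptom described in the thread if such an entry exists.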
> >
> >All PGs are in active+clean state, so maybe there's something malformed
> >in the DB? How can I check?
> >
> >I tried dumping the list of creating_pgs as described in
> >http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030297.html
> >but to no avail.
> >
> >On Tue, Mar 17, 2020 at 12:25:29PM +0100, Nikola Ciprich wrote:
> >>Hello dear cephers,
> >>
> >>lately, there's been some discussion about slow requests hanging
> >>in "wait for new map" status. At least in my case, it's being caused
> >>by osdmaps not being properly trimmed. I tried every step I could think of
> >>to force osdmap pruning (restarting mons, restarting everything,
> >>poking the crushmap), to no avail. All OSDs still keep min osdmap version
> >>1, while the newest is 4734. Otherwise the cluster is healthy, with no down
> >>OSDs, network communication works flawlessly, and all seems to be fine.
> >>I just can't get the old osdmaps to go away. It's a very small cluster and I've
> >>moved all production traffic elsewhere, so I'm free to investigate
> >>and debug; however, I'm out of ideas on what to try or where to look.
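
As a rough way to quantify the backlog described above, the sketch below counts how many osdmap epochs are retained between the oldest and newest epoch and compares that with an expected upper bound. The helper names and the `expected_max` parameter are hypothetical; the value 500 used in the usage note is an assumption about what a healthy cluster might keep, not something stated in the thread.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = uint32_t;

// Number of osdmap epochs retained, counting both ends of the range.
inline uint64_t retained_osdmaps(epoch_t oldest, epoch_t newest) {
  assert(newest >= oldest);
  return static_cast<uint64_t>(newest) - oldest + 1;
}

// True when more epochs are retained than the caller expects.
// `expected_max` is a hypothetical threshold chosen by the caller.
inline bool looks_untrimmed(epoch_t oldest, epoch_t newest,
                            uint64_t expected_max) {
  return retained_osdmaps(oldest, newest) > expected_max;
}
```

With the figures from the message (oldest 1, newest 4734), `retained_osdmaps(1, 4734)` gives 4734, so `looks_untrimmed(1, 4734, 500)` is true under the assumed bound of 500.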
> >>
> >>Any ideas, anybody?
> >>
> >>The cluster is running 13.2.8
> >>
> >>I'd be very grateful for any tips
> >>
> >>with best regards
> >>
> >>nikola ciprich
> >>
> >>--
> >>-------------------------------------
> >>Ing. Nikola CIPRICH
> >>LinuxBox.cz, s.r.o.
> >>28.rijna 168, 709 00 Ostrava
> >>
> >>tel.:   +420 591 166 214
> >>fax:    +420 596 621 273
> >>mobil:  +420 777 093 799
> >>www.linuxbox.cz
> >>
> >>mobil servis: +420 737 238 656
> >>email servis: servis@xxxxxxxxxxx
> >>-------------------------------------
> >>
> >
> >_______________________________________________
> >ceph-users mailing list -- ceph-users@xxxxxxx
> >To unsubscribe send an email to ceph-users-leave@xxxxxxx
> 
> -- 
> Jan Fajerski
> Senior Software Engineer Enterprise Storage
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




