Re: can't get healthy cluster to trim osdmaps (13.2.8)

https://tracker.ceph.com/issues/44184
This looks similar; maybe you're also seeing the other symptoms listed there? In any case, it would be good to track this in one place.

On Mon, Mar 23, 2020 at 11:29:53AM +0100, Nikola Ciprich wrote:
OK, so after some debugging, I've pinned the problem down to
OSDMonitor::get_trim_to:

   std::lock_guard<std::mutex> l(creating_pgs_lock);
   if (!creating_pgs.pgs.empty()) {
     return 0;
   }

Apparently creating_pgs.pgs.empty() returns false. Do I understand
correctly that the cluster thinks the list of creating PGs is not empty?
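
To make the effect of that guard concrete, here is a minimal, self-contained sketch of the same pattern. It is not the Ceph implementation: TrimModel, its members, and the value 4700 are hypothetical stand-ins, and the fallback return value is only an assumption about what happens once the guard passes. The point it illustrates is that a single leftover entry in the "creating" set is enough to make the function return 0, which means nothing gets trimmed.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <set>

using epoch_t = std::uint32_t;

// Hypothetical, simplified stand-in for the monitor state (NOT Ceph's class).
struct TrimModel {
  std::mutex creating_pgs_lock;
  std::set<std::uint64_t> creating_pgs;  // placeholder ids of PGs still marked "creating"
  epoch_t min_last_epoch_clean = 0;      // assumed trim target when nothing blocks trimming

  epoch_t get_trim_to() {
    std::lock_guard<std::mutex> l(creating_pgs_lock);
    if (!creating_pgs.empty()) {
      return 0;  // same guard as quoted above: any entry blocks trimming entirely
    }
    return min_last_epoch_clean;  // assumption: stands in for the rest of the function
  }
};

// Small usage example: one stale entry pins the trim target at 0.
inline void example_usage() {
  TrimModel m;
  m.min_last_epoch_clean = 4700;  // hypothetical value
  m.creating_pgs.insert(42);      // a stale "creating" entry
  assert(m.get_trim_to() == 0);
  m.creating_pgs.clear();         // once the set is empty, trimming can proceed
  assert(m.get_trim_to() == 4700);
}
```

If this model matches the real behaviour, a stale entry in the creating-PGs list would be sufficient to explain why the oldest osdmap epoch never advances, even with every PG reported clean.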

All PGs are in the active+clean state, so maybe there's something malformed
in the db? How can I check?

I tried dumping the list of creating_pgs as described in
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-October/030297.html
but to no avail.

On Tue, Mar 17, 2020 at 12:25:29PM +0100, Nikola Ciprich wrote:
Hello dear cephers,

Lately, there's been some discussion about slow requests hanging
in "wait for new map" status. At least in my case, it's being caused
by osdmaps not being properly trimmed. I tried every step I could think of
to force osdmap pruning (restarting mons, restarting everything,
poking the crushmap), to no avail. All OSDs still keep min osdmap version
1, while the newest is 4734. Otherwise the cluster is healthy, with no down
OSDs, network communication works flawlessly, and all seems to be fine.
I just can't get the old osdmaps to go away. It's a very small cluster and I've
moved all production traffic elsewhere, so I'm free to investigate
and debug; however, I'm out of ideas on what to try or where to look.

Any ideas, anybody?

The cluster is running 13.2.8

I'd be very grateful for any tips

with best regards

nikola ciprich

--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@xxxxxxxxxxx
-------------------------------------


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer