Hi Dan,

Thanks for the hint, I'll try this tomorrow with a test bed first. This evening I had to fix some Bareos client systems to get a quiet sleep. ;-) Will give you feedback asap.

Best regards,
Christoph

On Tue, Jun 15, 2021 at 21:03, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi Christoph,
>
> What about the max osd? If "ceph osd getmaxosd" is not 76 on this
> cluster, then set it: `ceph osd setmaxosd 76`.
>
> -- dan
>
> On Tue, Jun 15, 2021 at 8:54 PM Ackermann, Christoph
> <c.ackermann@xxxxxxxxxxxx> wrote:
> >
> > Dan,
> >
> > sorry, we have no gaps in osd numbering:
> > isceph@ceph-deploy:~$ sudo ceph osd ls | wc -l; sudo ceph osd tree | sort -n -k1 | tail
> > 76
> > [..]
> > 73   ssd   0.28600   osd.73   up   1.00000   1.00000
> > 74   ssd   0.27689   osd.74   up   1.00000   1.00000
> > 75   ssd   0.28600   osd.75   up   1.00000   1.00000
> >
> > The (quite old) cluster is running v15.2.13 very well. :-) The OSDs run
> > on top of (newest) CentOS 8.4 bare metal; mon/mds run on (newest)
> > CentOS 7.9 VMs. The problem appears only with the newest CentOS 8
> > client libceph.
> >
> > Christoph
> >
> > On Tue, Jun 15, 2021 at 20:26, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >>
> >> Replying to own mail...
> >>
> >> On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> >> >
> >> > Hi Ilya,
> >> >
> >> > We're now hitting this on CentOS 8.4.
> >> >
> >> > The "setmaxosd" workaround fixed access to one of our clusters, but
> >> > isn't working for another, where we have gaps in the osd ids, e.g.
> >> >
> >> > # ceph osd getmaxosd
> >> > max_osd = 553 in epoch 691642
> >> > # ceph osd tree | sort -n -k1 | tail
> >> > 541   ssd   0.87299   osd.541   up   1.00000   1.00000
> >> > 543   ssd   0.87299   osd.543   up   1.00000   1.00000
> >> > 548   ssd   0.87299   osd.548   up   1.00000   1.00000
> >> > 552   ssd   0.87299   osd.552   up   1.00000   1.00000
> >> >
> >> > Is there another workaround for this?
> >>
> >> The following seems to have fixed this cluster:
> >>
> >> 1. Fill all gaps with: ceph osd new `uuid`
> >>    ^^ after this, the cluster is still not mountable.
> >> 2. Purge all the gap osds: ceph osd purge <id>
> >>
> >> I filled/purged a couple hundred gap osds, and now the cluster can be
> >> mounted.
> >>
> >> Cheers!
> >>
> >> Dan
> >>
> >> P.S. The bugzilla is not public:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1972278
> >>
> >> >
> >> > Cheers, dan
> >> >
> >> > On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote:
> >> > >
> >> > > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander <magnus@xxxxxxxxx> wrote:
> >> > > >
> >> > > > On 03.05.21 at 12:25, Ilya Dryomov wrote:
> >> > > >
> >> > > > ceph osd setmaxosd 10
> >> > > >
> >> > > > Bingo! Mount works again.
> >> > > >
> >> > > > Veeeery strange things are going on here (-:
> >> > > >
> >> > > > Thanx a lot for now!! If I can help to track it down, please let
> >> > > > me know.
> >> > >
> >> > > Good to know it helped! I'll think about this some more and probably
> >> > > plan to patch the kernel client to be less stringent and not choke on
> >> > > this sort of misconfiguration.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Ilya
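
For reference, here is a rough shell sketch of the two workarounds discussed above: raising max_osd with setmaxosd, then filling and purging the gap ids as Dan describes. It is untested; the use of `uuidgen`, the awk parsing of `ceph osd getmaxosd`, and the `--yes-i-really-mean-it` flag on purge are assumptions for illustration, not the exact commands from the thread.

    #!/bin/bash
    # Untested sketch of the workaround above: make sure max_osd covers the
    # highest OSD id, then fill every unused id with a placeholder OSD and
    # purge it again, so the osdmap has no gaps for the kernel client.
    set -euo pipefail

    cur_max=$(ceph osd getmaxosd | awk '{print $3}')   # "max_osd = 553 in epoch ..." -> 553
    top_id=$(ceph osd ls | sort -n | tail -1)          # highest OSD id actually in use

    # Workaround 1: max_osd should be at least highest id + 1.
    if [ "$cur_max" -le "$top_id" ]; then
        ceph osd setmaxosd $((top_id + 1))
    fi

    # Workaround 2: find the unused ids between 0 and the highest id ...
    gaps=$(comm -13 <(ceph osd ls | sort) <(seq 0 "$top_id" | sort))

    # ... fill each gap; "ceph osd new" with only a uuid takes the lowest free id ...
    for _ in $gaps; do
        ceph osd new "$(uuidgen)"
    done

    # ... then purge the placeholder OSDs again, as in step 2 of Dan's mail.
    for id in $gaps; do
        ceph osd purge "$id" --yes-i-really-mean-it
    done

After the purge step the mount should succeed again, as Dan reports above for his cluster with a couple hundred gap ids.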