Hi Eugen,

Yes, I do have inactive PGs when the OSDs go down. I started the OSDs
manually, but the RGWs still fail to start. So far, upgrading to a newer
version is the only thing that has resolved the issue, and we have faced
it two times. I don't know why it is happening. Could it be that the RGWs
run on separate machines, and that this is causing the issue?
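For reference, the cluster state Eugen is asking about can be captured
with standard Ceph CLI commands. A minimal sketch (the rgw_init_timeout
value below is only an illustrative example, not a verified fix):

    # PG and OSD state at the moment the RGWs fail to start
    ceph status
    ceph health detail
    ceph osd tree
    ceph pg dump_stuck inactive

    # full or near-full OSDs can also keep an RGW from starting
    ceph osd df

    # both "Initialization timeout, failed to initialize" lines in the
    # log appear exactly 300 s after radosgw starts, matching this
    # cluster's rgw_init_timeout of 300 s; raising it (example value)
    # buys startup time but does not fix whatever the RGW is blocked on
    ceph config set client.rgw rgw_init_timeout 1200

If the inactive PGs include the RGW metadata pools (.rgw.root and the
zone's .rgw.meta/.rgw.log pools under default naming), radosgw will block
on reads during startup until those PGs become active again, which would
explain a timeout rather than a crash.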
On Sat, Sep 10, 2022 at 11:27 PM Eugen Block <eblock@xxxxxx> wrote:

> You didn't respond to the other questions. If you want people to be
> able to help, you need to provide more information. When your OSDs
> fail, do you have inactive PGs? Or do you have full OSDs, which would
> prevent the RGWs from starting? I'm assuming that if you fix your OSDs
> the RGWs will start working again. But then again, we still don't know
> anything about the current situation.
>
> Zitat von Monish Selvaraj <monish@xxxxxxxxxxxxxxx>:
>
> > Hi Eugen,
> >
> > Below is the log output:
> >
> > 2022-09-07T12:03:42.893+0000 7fdd23fdc5c0  0 framework: beast
> > 2022-09-07T12:03:42.893+0000 7fdd23fdc5c0  0 framework conf key: port, val: 80
> > 2022-09-07T12:03:42.893+0000 7fdd23fdc5c0  1 radosgw_Main not setting numa affinity
> > 2022-09-07T12:03:42.893+0000 7fdd23fdc5c0  1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
> > 2022-09-07T12:03:42.893+0000 7fdd23fdc5c0  1 D3N datacache enabled: 0
> > 2022-09-07T12:03:53.313+0000 7fdd23fdc5c0  1 rgw main: int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yi>
> > 2022-09-07T12:03:53.313+0000 7fdd23fdc5c0  1 rgw main: int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yi>
> > 2022-09-07T12:08:42.891+0000 7fdd1661c700 -1 Initialization timeout, failed to initialize
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  0 deferred set uid:gid to 167:167 (ceph:ceph)
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  0 ceph version 17.2.0 (43e2e60a7559d3f46c9d53f1ca875fd499a1e35e) quincy (stable), process radosgw, pid 7
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  0 framework: beast
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  0 framework conf key: port, val: 80
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  1 radosgw_Main not setting numa affinity
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  1 rgw_d3n: rgw_d3n_l1_local_datacache_enabled=0
> > 2022-09-07T12:08:53.395+0000 7f69017095c0  1 D3N datacache enabled: 0
> > 2022-09-07T12:09:03.747+0000 7f69017095c0  1 rgw main: int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yi>
> > 2022-09-07T12:09:03.747+0000 7f69017095c0  1 rgw main: int RGWSI_Notify::robust_notify(const DoutPrefixProvider*, RGWSI_RADOS::Obj&, const RGWCacheNotifyInfo&, optional_yi>
> > 2022-09-07T12:13:53.397+0000 7f68f3d49700 -1 Initialization timeout, failed to initialize
> >
> > I installed the cluster on Quincy.
> >
> > On Sat, Sep 10, 2022 at 4:02 PM Eugen Block <eblock@xxxxxx> wrote:
> >
> >> What troubleshooting have you tried? You don't provide any log output
> >> or information about the cluster setup, for example the ceph osd tree
> >> or ceph status. Are the failing OSDs random, or do they all belong to
> >> the same pool? Any log output from the failing OSDs and the RGWs
> >> might help; otherwise it's just wild guessing. Is the cluster a new
> >> installation with cephadm or an older cluster upgraded to Quincy?
> >>
> >> Zitat von Monish Selvaraj <monish@xxxxxxxxxxxxxxx>:
> >>
> >> > Hi all,
> >> >
> >> > I have one critical issue in my prod cluster. It appears when the
> >> > incoming customer data reaches about 600 MiB.
> >> >
> >> > *Between 8 and 20 of my 238 OSDs* go down. I then bring the OSDs up
> >> > manually, and after a few minutes all my RGWs crash.
> >> >
> >> > We did some troubleshooting, but nothing worked. Upgrading Ceph
> >> > from 17.2.0 to 17.2.1 resolved it. We have faced this issue two
> >> > times, and both times we resolved it by upgrading Ceph.
> >> >
> >> > *Node schema:*
> >> >
> >> > *Node 1 to Node 5 --> mon, mgr and OSDs*
> >> > *Node 6 to Node 15 --> only OSDs*
> >> > *Node 16 to Node 20 --> only RGWs*
> >> >
> >> > Kindly check this issue and let me know the correct troubleshooting
> >> > method.

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx