Hello Anthony / Users,

After some initial analysis I had increased max_pg_per_osd to 480, but we're still out of luck. I also tried force-backfill and force-repair. On querying the PGs with "# ceph pg <pg.ID> query", the output lists 3 to 4 OSDs under blocked_by that are already out of the cluster, and I suspect these have something to do with the stalled recovery.

Thanks,
Jayanth Reddy

On Sat, Jun 17, 2023 at 4:17 PM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:

> Your cluster's configuration is preventing CRUSH from calculating full
> placements.
>
> Set max_pg_per_osd = 1000, either in central config or in ceph.conf if you
> have it set there now.
>
> If you have it set in ceph.conf, you may need to serially restart the mons.
>
> ceph osd down 214
> sleep 60
> ceph osd down 223
> sleep 60
> ceph osd down 548
> sleep 60
> ceph osd down 584
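For reference, a minimal sketch of applying the suggestion above and of pulling the blocked_by list out of a stuck PG's query output. The central-config option name (mon_max_pg_per_osd) and the jq filter are assumptions to verify against this release before running anything:

# Raise the PG-per-OSD limit in central config (assumed option: mon_max_pg_per_osd)
ceph config set global mon_max_pg_per_osd 1000

# Collect every blocked_by entry found anywhere in a stuck PG's query output (needs jq)
ceph pg 15.3f3 query | jq '[.. | .blocked_by? // empty] | flatten | unique'

# Check whether a reported blocker still exists in the OSD map
ceph osd find <osd-id>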
>
> > On Jun 17, 2023, at 2:22 AM, Jayanth Reddy <jayanthreddy5666@xxxxxxxxx> wrote:
> >
> > Hello Users,
> > Greetings. We've a Ceph cluster with the version
> > ceph version 14.2.5-382-g8881d33957 (8881d33957b54b101eae9c7627b351af10e87ee8) nautilus (stable)
> >
> > 5 PGs belonging to our RGW 8+3 EC pool are stuck in incomplete and
> > incomplete+remapped states. Below are the PGs,
> >
> > # ceph pg dump_stuck inactive
> > ok
> > PG_STAT  STATE                UP                                             UP_PRIMARY  ACTING                                                                            ACTING_PRIMARY
> > 15.251e  incomplete           [151,464,146,503,166,41,555,542,9,565,268]     151         [151,464,146,503,166,41,555,542,9,565,268]                                        151
> > 15.3f3   incomplete           [584,281,672,699,199,224,239,430,355,504,196]  584         [584,281,672,699,199,224,239,430,355,504,196]                                     584
> > 15.985   remapped+incomplete  [396,690,493,214,319,209,546,91,599,237,352]   396         [2147483647,2147483647,2147483647,214,319,2147483647,546,91,599,2147483647,352]  214
> > 15.39d3  remapped+incomplete  [404,221,223,585,38,102,533,471,568,451,195]   404         [2147483647,2147483647,223,585,38,102,533,2147483647,231,451,2147483647]         223
> > 15.d46   remapped+incomplete  [297,646,212,254,110,169,500,372,623,470,678]  297         [2147483647,548,2147483647,2147483647,110,169,500,372,2147483647,470,678]        548
> >
> > Some of the OSDs had gone down on the cluster. Below is the # ceph status
> >
> > # ceph -s
> >   cluster:
> >     id:     30d6f7ee-fa02-4ab3-8a09-9321c8002794
> >     health: HEALTH_WARN
> >             noscrub,nodeep-scrub flag(s) set
> >             1 pools have many more objects per pg than average
> >             Reduced data availability: 5 pgs inactive, 5 pgs incomplete
> >             Degraded data redundancy: 44798/8718528059 objects degraded (0.001%), 1 pg degraded, 1 pg undersized
> >             22726 pgs not deep-scrubbed in time
> >             23552 pgs not scrubbed in time
> >             77 slow ops, oldest one blocked for 56400 sec, daemons [osd.214,osd.223,osd.548,osd.584] have slow ops.
> >             too many PGs per OSD (330 > max 250)
> >
> >   services:
> >     mon: 3 daemons, quorum brc1mon2,brc1mon3,brc1mon1 (age 2y)
> >     mgr: brc1mon2(active, since 8d), standbys: brc1mon1, brc1mon3
> >     mds: cephfs:1 {0=brc1mds2=up:active} 1 up:standby
> >     osd: 1012 osds: 698 up (since 14h), 698 in (since 2d); 3 remapped pgs
> >          flags noscrub,nodeep-scrub
> >     rgw: 2 daemons active (brc1rgw1, brc1rgw2)
> >
> >   data:
> >     pools:   17 pools, 23552 pgs
> >     objects: 863.74M objects, 1.2 PiB
> >     usage:   2.4 PiB used, 6.2 PiB / 8.6 PiB avail
> >     pgs:     0.021% pgs not active
> >              44798/8718528059 objects degraded (0.001%)
> >              23546 active+clean
> >              3     remapped+incomplete
> >              2     incomplete
> >              1     active+undersized+degraded
> >
> >   io:
> >     client: 24 MiB/s rd, 3.2 KiB/s wr, 56 op/s rd, 4 op/s wr
> >
> > And the health detail shows as
> >
> > # ceph health detail
> > HEALTH_WARN noscrub,nodeep-scrub flag(s) set; 1 pools have many more objects per pg than average; Reduced data availability: 5 pgs inactive, 5 pgs incomplete; Degraded data redundancy: 44798/8718528081 objects degraded (0.001%), 1 pg degraded, 1 pg undersized; 22726 pgs not deep-scrubbed in time; 23552 pgs not scrubbed in time; 77 slow ops, oldest one blocked for 56440 sec, daemons [osd.214,osd.223,osd.548,osd.584] have slow ops.; too many PGs per OSD (330 > max 250)
> > OSDMAP_FLAGS noscrub,nodeep-scrub flag(s) set
> > MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
> >     pool iscsi-images objects per pg (540004) is more than 14.7248 times cluster average (36673)
> > PG_AVAILABILITY Reduced data availability: 5 pgs inactive, 5 pgs incomplete
> >     pg 15.3f3 is incomplete, acting [584,281,672,699,199,224,239,430,355,504,196] (reducing pool default.rgw.buckets.data min_size from 9 may help; search ceph.com/docs for 'incomplete')
> >     pg 15.985 is remapped+incomplete, acting [2147483647,2147483647,2147483647,214,319,2147483647,546,91,599,2147483647,352] (reducing pool default.rgw.buckets.data min_size from 9 may help; search ceph.com/docs for 'incomplete')
> >     pg 15.d46 is remapped+incomplete, acting [2147483647,548,2147483647,2147483647,110,169,500,372,2147483647,470,678] (reducing pool default.rgw.buckets.data min_size from 9 may help; search ceph.com/docs for 'incomplete')
> >     pg 15.251e is incomplete, acting [151,464,146,503,166,41,555,542,9,565,268] (reducing pool default.rgw.buckets.data min_size from 9 may help; search ceph.com/docs for 'incomplete')
> >     pg 15.39d3 is remapped+incomplete, acting [2147483647,2147483647,223,585,38,102,533,2147483647,231,451,2147483647] (reducing pool default.rgw.buckets.data min_size from 9 may help; search ceph.com/docs for 'incomplete')
> > PG_DEGRADED Degraded data redundancy: 44798/8718528081 objects degraded (0.001%), 1 pg degraded, 1 pg undersized
> >     pg 15.28f0 is stuck undersized for 67359238.592403, current state active+undersized+degraded, last acting [2147483647,343,355,415,426,640,302,392,78,202,607]
> > PG_NOT_DEEP_SCRUBBED 22726 pgs not deep-scrubbed in time
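Regarding the "reducing pool default.rgw.buckets.data min_size from 9 may help" hint above: on an 8+3 EC pool, min_size 9 is k+1 and 8 equals k, so 8 is as far down as it should go, and only temporarily to let the incomplete PGs peer (running at k leaves no redundancy margin for writes). A minimal sketch, to be reverted as soon as the PGs go active:

ceph osd pool get default.rgw.buckets.data min_size
ceph osd pool set default.rgw.buckets.data min_size 8
# ...wait for the incomplete PGs to peer and recover, then restore...
ceph osd pool set default.rgw.buckets.data min_size 9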
> >
> > We've the pools as below
> >
> > # ceph osd lspools
> > 1 iscsi-images
> > 2 cephfs_data
> > 3 cephfs_metadata
> > 4 .rgw.root
> > 5 default.rgw.control
> > 6 default.rgw.meta
> > 7 default.rgw.log
> > 8 default.rgw.buckets.index
> > 13 rbd
> > 15 default.rgw.buckets.data
> > 16 default.rgw.buckets.non-ec
> > 19 cephfs_data-ec
> > 22 rbd-ec
> > 23 iscsi-images-ec
> > 24 hpecpool
> > 25 hpec.rgw.buckets.index
> > 26 hpec.rgw.buckets.non-ec
> >
> > We've been struggling for a long time to fix this, with no luck so far. Our RGW
> > daemons, hosted on dedicated machines, are continuously failing to respond; since
> > they sit behind a load balancer, the LB throws 504 Gateway Timeout when the daemons
> > do not respond within the expected time. We perform active health checks from the
> > LB on '/' via HTTP HEAD, but these are failing as well, very frequently. Currently
> > we are surviving with a script that restarts the RGW daemons whenever the LB
> > responds with HTTP status code 504. Any help is highly appreciated!
> >
> > Regards,
> > Jayanth Reddy
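As for the stop-gap mentioned above, a rough sketch of that kind of 504 watchdog, assuming systemd-managed radosgw instances; the unit name, LB endpoint, and intervals below are placeholders for illustration only:

#!/usr/bin/env bash
# Hypothetical watchdog: restart the local radosgw whenever the LB health
# check starts answering with 504. Unit name and endpoint are placeholders.
ENDPOINT="http://rgw-lb.example.internal/"
UNIT="ceph-radosgw@rgw.brc1rgw1"

while true; do
    # HEAD request against the LB, mirroring its own '/' health check
    code=$(curl -sI -o /dev/null -w '%{http_code}' --max-time 10 "$ENDPOINT")
    if [ "$code" = "504" ]; then
        logger -t rgw-watchdog "LB returned 504, restarting ${UNIT}"
        systemctl restart "${UNIT}"
        sleep 120   # give the daemon time to come back before probing again
    fi
    sleep 15
done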