Hi Frank,

CRUSH can only find 5 OSDs given your current tree, rule, and reweights.
This is why there is a NONE in the UP set for shard 6. But in ACTING we
see that it is refusing to remove shard 6 from osd.1 -- that is the only
copy of that shard, so in this case it's helping you rather than
deleting the shard altogether.

ACTING == what the OSDs are serving now.
UP == where CRUSH wants to place the shards.

I suspect that this is a case of CRUSH tunables + your reweights putting
CRUSH in a corner case of not finding 6 OSDs for that particular PG. If
you set the reweights all back to 1, it probably finds 6 OSDs?

Cheers, Dan

On Mon, Aug 29, 2022 at 4:44 PM Frank Schilder <frans@xxxxxx> wrote:
>
> Hi all,
>
> I'm investigating a problem with a degraded PG on an Octopus 15.2.16
> test cluster. It has 3 hosts x 3 OSDs and a 4+2 EC pool with failure
> domain OSD. After simulating a disk failure by removing an OSD and
> letting the cluster recover (all under load), I end up with a PG with
> the same OSD allocated twice:
>
> PG 4.1c, UP: [6,1,4,5,3,NONE] ACTING: [6,1,4,5,3,1]
>
> OSD 1 is allocated twice. How is this even possible?
>
> Here is the OSD tree:
>
> ID  CLASS  WEIGHT   TYPE NAME          STATUS     REWEIGHT  PRI-AFF
> -1         2.44798  root default
> -7         0.81599      host tceph-01
>  0    hdd  0.27199          osd.0             up   0.87999  1.00000
>  3    hdd  0.27199          osd.3             up   0.98000  1.00000
>  6    hdd  0.27199          osd.6             up   0.92999  1.00000
> -3         0.81599      host tceph-02
>  2    hdd  0.27199          osd.2             up   0.95999  1.00000
>  4    hdd  0.27199          osd.4             up   0.89999  1.00000
>  8    hdd  0.27199          osd.8             up   0.89999  1.00000
> -5         0.81599      host tceph-03
>  1    hdd  0.27199          osd.1             up   0.89999  1.00000
>  5    hdd  0.27199          osd.5             up   1.00000  1.00000
>  7    hdd  0.27199          osd.7      destroyed         0  1.00000
>
> I already tried changing some tunables, thinking of
> https://docs.ceph.com/en/octopus/rados/troubleshooting/troubleshooting-pg/#crush-gives-up-too-soon,
> but giving up too soon is obviously not the problem: it is accepting a
> wrong mapping.
>
> Is there a way out of this?
> Clearly this is asking for trouble, if not data loss, and should not
> happen at all.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
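[Editor's note] Dan's hypothesis -- that the reweights push CRUSH into a
corner where it cannot find 6 OSDs for this PG -- can be checked offline
with crushtool, without changing anything on the cluster. The commands
below are a sketch; the rule id (1 here) is a placeholder and must be
replaced with the actual crush rule id of the 4+2 EC pool:

```shell
# Export the cluster's current CRUSH map (binary form).
ceph osd getcrushmap -o /tmp/crushmap

# Replay the mapping for the EC rule with the current reweights applied.
# --weight <osd-id> <w> overrides an OSD's reweight for the test run;
# the values below mirror the REWEIGHT column of the OSD tree.
crushtool -i /tmp/crushmap --test --rule 1 --num-rep 6 \
    --weight 0 0.87999 --weight 3 0.98    --weight 6 0.92999 \
    --weight 2 0.95999 --weight 4 0.89999 --weight 8 0.89999 \
    --weight 1 0.89999 --weight 5 1.0     --weight 7 0 \
    --show-bad-mappings

# Repeat without overrides (all reweights default to 1). If this run
# reports no bad mappings, resetting the reweights on the live cluster
# (ceph osd reweight <id> 1) should let CRUSH find 6 OSDs for the PG.
crushtool -i /tmp/crushmap --test --rule 1 --num-rep 6 --show-bad-mappings
```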