Hi,

My cluster is in a warning state because it is rebalancing after I added a bunch of disks (no issue there!). Still, there are a few things I just cannot understand, and I'm getting hopeless trying to find the answers. If you can answer any of these questions (even one), it will be greatly appreciated.

Environment: Ceph Nautilus 14.2.11, 282 OSDs, mainly erasure coding.

1. The Ceph dashboard shows more PGs for an OSD than I can extract from the pg dump output. I assume the value in the dashboard comes from Prometheus, though I'm not sure. Querying Prometheus through Grafana gives me the same figures as the dashboard (which look incorrect to me). I can't find out how this value is calculated; all the information I can find about counting the PGs stored on an OSD is derived from pg dump :-( Help help help! (Snippet 1 below shows how I count them.)

2. Because the cluster is rebalancing, there is a huge gap between the acting and the up OSDs for a bunch of PGs, and some disks occasionally respond slowly due to the backfilling, so they are marked down from time to time. (I know I can work around this by setting "nodown" temporarily.) If the disk that goes down is in the acting set of a PG (but not in the up set for that PG), the PG is marked degraded and the system tries to rebuild the missing data towards an OSD in the up set. That much I understand. What I don't understand is that when the OSD is restarted and marked "up" again (or when I am simply patient), it is not added back to the acting set. Restarting other OSDs (and thus triggering more peering) does get it added back to the acting set. I don't understand how this happens. (Snippet 2 below is how I watch the up/acting sets.)

3. Follow-up to question 2: if an OSD that was in an acting set goes down and is removed from the acting set (replaced with -1), which process removes the obsolete data from that OSD once it is back up? Which process cleans up the obsolete copy in the end? Does scrubbing take this into account? I assume this might be related to my first question too.

4. I'm data mining the pg dump output. (If I get answers to the previous questions, this one will probably answer itself.) When I take all the acting PGs for a specific OSD, look at the num_bytes of each PG, and calculate the size that should be stored on that OSD (taking the erasure coding into account), I get a difference between the disk space that should be used and the space effectively used. E.g. for one disk the system says 11.3 TiB is used, while my calculation from pg dump gives +/- 10 TiB. I know a delta can occur due to block size etc., but it doesn't seem right; the usage is too high. (Snippet 3 below is the calculation.)

I've tried to search for the processes that clean up OSDs, garbage collection, etc., but no good information is available. You can find tons of information on garbage collection in combination with RGW, but not for the RADOS mechanism. I really can't find any clue about how the PGs on a disk are removed after it went down and is no longer used in the acting/up sets of those PGs.

One more question, which I should probably post to the dev list: which IDE is recommended for developing on the Ceph project? I'm working on a Mac; I don't know if there are any recommendations.

I really hope I can get some help with these questions. Many thanks!

For reference, here are the snippets mentioned above. They are rough sketches against the JSON output of "ceph pg dump --format=json" as I see it on Nautilus, so corrections are very welcome.
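Snippet 1: counting, per OSD, how many PGs list that OSD in their up and acting sets. My guess is that the dashboard figure is the ceph_osd_numpg gauge from the mgr prometheus module, but I don't know what that gauge actually counts, so this is just my pg-dump-based view. The pg_map.pg_stats path is what I see on Nautilus; older releases apparently have pg_stats at the top level, so the code tries both.

    #!/usr/bin/env python3
    # Snippet 1: PGs per OSD, derived from `ceph pg dump --format=json`.
    import json
    import subprocess
    from collections import Counter

    NONE = 2147483647  # CRUSH_ITEM_NONE: "no OSD holds this shard"

    raw = subprocess.run(["ceph", "pg", "dump", "--format=json"],
                         capture_output=True, check=True).stdout
    data = json.loads(raw)
    pg_stats = data.get("pg_map", data).get("pg_stats", [])

    up_count, acting_count = Counter(), Counter()
    for pg in pg_stats:
        for osd in pg["up"]:
            if osd not in (-1, NONE):
                up_count[osd] += 1
        for osd in pg["acting"]:
            if osd not in (-1, NONE):
                acting_count[osd] += 1

    for osd in sorted(set(up_count) | set(acting_count)):
        print(f"osd.{osd}: up={up_count[osd]} acting={acting_count[osd]}")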
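Snippet 2: listing the PGs whose acting set differs from their up set, and flagging acting sets with a missing shard. (In my output the missing OSD shows as -1; I've read it can also appear as 2147483647, i.e. CRUSH_ITEM_NONE, so I check for both.)

    #!/usr/bin/env python3
    # Snippet 2: PGs that are remapped (up != acting) and/or have a hole
    # in their acting set.
    import json
    import subprocess

    NONE = 2147483647  # CRUSH_ITEM_NONE

    raw = subprocess.run(["ceph", "pg", "dump", "--format=json"],
                         capture_output=True, check=True).stdout
    data = json.loads(raw)
    pg_stats = data.get("pg_map", data).get("pg_stats", [])

    for pg in pg_stats:
        up, acting = pg["up"], pg["acting"]
        if up != acting:
            missing = any(osd in (-1, NONE) for osd in acting)
            note = "  <- hole in acting set" if missing else ""
            print(f'{pg["pgid"]}: up={up} acting={acting}{note}')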
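Snippet 3: my expected-usage calculation for question 4. For an EC pool with a k+m profile I assume each shard stores roughly num_bytes / k; K below is a placeholder for my profile's k, so substitute your own (a replicated pool would store the full num_bytes per copy, i.e. K = 1). This deliberately ignores omap/metadata and BlueStore allocation overhead (min_alloc_size padding per object), which is why I only expect a small delta, not the ~1.3 TiB I'm seeing.

    #!/usr/bin/env python3
    # Snippet 3: expected bytes per OSD, summed over the PGs in whose
    # acting set the OSD appears.
    import json
    import subprocess
    from collections import Counter

    NONE = 2147483647  # CRUSH_ITEM_NONE
    K = 8              # placeholder: the k of my erasure-code profile

    raw = subprocess.run(["ceph", "pg", "dump", "--format=json"],
                         capture_output=True, check=True).stdout
    data = json.loads(raw)
    pg_stats = data.get("pg_map", data).get("pg_stats", [])

    expected = Counter()
    for pg in pg_stats:
        shard_bytes = pg["stat_sum"]["num_bytes"] / K
        for osd in pg["acting"]:
            if osd not in (-1, NONE):
                expected[osd] += shard_bytes

    for osd, nbytes in sorted(expected.items()):
        print(f"osd.{osd}: expected ~{nbytes / 2**40:.2f} TiB")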
Regards,
Kristof