Thanks again to Jan, Burkhard, Marc and Hector for responses on this. To review, I am removing OSDs from a small cluster and running up against the “too many PGs per OSD” problem, largely due to a lack of clarity about which approach is safest. Here’s a summary of what I have collected on it:
- The default CephFS data pool can’t be changed after the filesystem is created; additional data pools can only be added alongside it.
- CephFS metadata pool might be rebuildable via https://www.spinics.net/lists/ceph-users/msg29536.html, but the post is a couple of years old, and even then, the author stated that he wouldn’t do this unless it was an emergency.
- Running multiple clusters on the same hardware is deprecated, so there’s no way to stand up a second cluster with properly sized pools on the same machine and `cpio` everything across.
- Running multiple filesystems within the same cluster is considered experimental: http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster. It’s unclear what permanent changes enabling that would leave on the cluster I’d like to keep using going forward. This would be a second way to mount both filesystems and `cpio` across.
- Importing pools from another cluster (i.e. the equivalent of `zpool export …` / `zpool import …`) is likely not supported, so even if I created a new cluster on a different machine, getting those pools back into the original cluster is fraught.
- There’s really no way, that I can see, to tell Ceph which OSDs a pool should live on, so when the new drives are added to CRUSH, everything starts rebalancing unless `max pg per osd` is set to some small number that is already exceeded. But if that limit is already exceeded and I start copying data into the new pool, doesn’t it fail?
- Maybe the rebalancing can be avoided by changing the CRUSH weights of the OSDs... a rough sketch of what I mean follows this list.
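
For concreteness, here is roughly the sequence I have in mind for the limit and the reweighting. This is only a sketch: I haven’t run it against this cluster, the runtime `ceph config set` form assumes Mimic or later (on Luminous I’d put `mon_max_pg_per_osd` in ceph.conf and restart the mons instead), and osd.4 is just a placeholder for whichever OSD is being drained:

    # show PGs, CRUSH weights and utilisation per OSD
    ceph osd df tree

    # raise the per-OSD PG limit so the smaller cluster isn't immediately over it
    ceph config set mon mon_max_pg_per_osd 400

    # drain a to-be-removed OSD by setting its CRUSH weight to zero (osd.4 is a placeholder)
    ceph osd crush reweight osd.4 0.0

    # once its PGs have moved elsewhere, mark it out and remove it
    ceph osd out osd.4
    ceph osd purge osd.4 --yes-i-really-mean-it

Whether the reweighting actually avoids the flood of data movement when the new pool’s PGs are created is exactly the part I’m unsure about.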
All these approaches so far seem either a) dangerous or b) likely to leave me with a less-than-pristine cluster to kick off the next ten years with. Unless I am mistaken in that, the only remaining options involve copying everything at least once or twice more:
- Copy everything off CephFS to an `mdadm` RAID 1 built from two of the 6TB drives (rough sketch after this list). Blow away the cluster and start over with the other two drives, copy everything back to CephFS, then re-add the freed drives that served as the staging store. Might be done by the end of next week.
- Create a new, properly sized cluster on a second machine, copy everything over Ethernet, then move the drives plus `/var/lib/ceph` and `/etc/ceph` back to the original cluster seed.
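
If I go the first route, the staging step would look roughly like the following. Again, a sketch only: /dev/sdc and /dev/sdd stand in for the two 6TB drives being repurposed, /mnt/cephfs is wherever CephFS is mounted, and I’d verify the cpio flags on a small tree before trusting it with the real data:

    # build a temporary RAID 1 from two of the 6TB drives (device names are placeholders)
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
    mkfs.xfs /dev/md0
    mkdir -p /mnt/staging
    mount /dev/md0 /mnt/staging

    # copy the CephFS contents off, preserving directories and mtimes
    cd /mnt/cephfs
    find . -depth -print0 | cpio -0 -pdm /mnt/staging

    # ...rebuild the cluster on the other two drives, reverse the copy,
    # then wipe these two drives and add them back as OSDs

(`rsync -aHAX` would be an alternative to the cpio pass-through.)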
I appreciate that small clusters are not the target use case of Ceph, but everyone has to start somewhere!