Thanks again to Jan, Burkhard, Marc and Hector for responses on this. To review, I am removing OSDs from a small cluster and running up against the “too many PGs per OSD” problem, largely due to a lack of clarity about which approach is safest. Here’s a summary of what I have collected on it:
- The default CephFS data pool can’t be changed after the filesystem is created; additional data pools can only be added alongside it.
- CephFS metadata pool might be rebuildable via https://www.spinics.net/lists/ceph-users/msg29536.html, but the post is a couple of years old, and even then, the author stated that he wouldn’t do this unless it was an emergency.
- Running multiple clusters on the same hardware is deprecated, so there’s no way to stand up a second cluster with properly sized pools on the same machine and `cpio` everything across.
- Running multiple filesystems within the same cluster is considered experimental: http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster. It’s unclear what permanent changes enabling that would leave on the cluster I’d like to keep using going forward. This would be a second way to mount both filesystems and `cpio` across.
- Importing pools from another cluster (i.e. the equivalent of `zpool export …` / `zpool import …`) is likely not supported, so even if I created a new cluster on a different machine, getting those pools back into the original cluster is fraught.
- There’s really no way, that I can see, to tell Ceph which OSDs a pool should live on, so when the new drives are added to CRUSH, everything starts rebalancing unless `max pg per osd` is set to some small number that is already exceeded. But if that limit is already exceeded and I start copying data into the new pool, doesn’t it fail?
- Maybe the rebalancing can be avoided by changing the CRUSH weights of the OSDs... a rough sketch of what I mean follows this list.
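
For concreteness, here is roughly the sequence I have in mind for the limit and the reweighting. This is only a sketch: I haven’t run it against this cluster, the runtime `ceph config set` form assumes Mimic or later (on Luminous I’d put `mon_max_pg_per_osd` in ceph.conf and restart the mons instead), and osd.4 is just a placeholder for whichever OSD is being drained:

    # show PGs, CRUSH weights and utilisation per OSD
    ceph osd df tree

    # raise the per-OSD PG limit so the smaller cluster isn't immediately over it
    ceph config set mon mon_max_pg_per_osd 400

    # drain a to-be-removed OSD by setting its CRUSH weight to zero (osd.4 is a placeholder)
    ceph osd crush reweight osd.4 0.0

    # once its PGs have moved elsewhere, mark it out and remove it
    ceph osd out osd.4
    ceph osd purge osd.4 --yes-i-really-mean-it

Whether the reweighting actually avoids the flood of data movement when the new pool’s PGs are created is exactly the part I’m unsure about.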
All these approaches so far seem either a) dangerous or b) likely to leave me with a less-than-pristine cluster to kick off the next ten years with. Unless I am mistaken in that, the only remaining options involve copying everything at least once or twice more:
- Copy everything off CephFS to an `mdadm` RAID 1 built from two of the 6TB drives (rough sketch after this list). Blow away the cluster and start over with the other two drives, copy everything back to CephFS, then re-add the freed drives that served as the staging store. Might be done by the end of next week.
- Create a new, properly sized cluster on a second machine, copy everything over Ethernet, then move the drives plus `/var/lib/ceph` and `/etc/ceph` back to the original cluster seed.
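
If I go the first route, the staging step would look roughly like the following. Again, a sketch only: /dev/sdc and /dev/sdd stand in for the two 6TB drives being repurposed, /mnt/cephfs is wherever CephFS is mounted, and I’d verify the cpio flags on a small tree before trusting it with the real data:

    # build a temporary RAID 1 from two of the 6TB drives (device names are placeholders)
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
    mkfs.xfs /dev/md0
    mkdir -p /mnt/staging
    mount /dev/md0 /mnt/staging

    # copy the CephFS contents off, preserving directories and mtimes
    cd /mnt/cephfs
    find . -depth -print0 | cpio -0 -pdm /mnt/staging

    # ...rebuild the cluster on the other two drives, reverse the copy,
    # then wipe these two drives and add them back as OSDs

(`rsync -aHAX` would be an alternative to the cpio pass-through.)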
I appreciate that small clusters are not the target use case of Ceph, but everyone has to start somewhere!