On Fri, Jul 20, 2012 at 11:08 AM, Florian Haas <florian@xxxxxxxxxxx> wrote:
>
> On Fri, Jul 20, 2012 at 9:33 AM, François Charlier
> <francois.charlier@xxxxxxxxxxxx> wrote:
> > Hello,
> >
> > Reading http://ceph.com/docs/master/ops/manage/grow/placement-groups/
> > and thinking of building a Ceph cluster with potentially 1000 OSDs.
> >
> > Using the recommendations on the previously cited link, it would require
> > pg_num being set between 10,000 and 30,000. Okay with that. Let's use the
> > recommended value of 16,384; this is already about 160 placement groups
> > per OSD.
> >
> > What if, for a start, we choose to reach this number of 1000 OSDs
> > slowly, starting with 100 OSDs? It's now 1600 placement groups per OSD.
> >
> > What if we chose 30,000 (or 32,768) placement groups to keep room for
> > expansion?
> >
> > My question is: how will a Ceph pool behave with 1000, 5000 or even
> > 10,000 placement groups per OSD? Will this impact performance? How badly?
> > Can it be worked around? Is this a problem of RAM size? CPU usage?
> >
> > Any hint about this would be much appreciated.
>
> If I may, I'd like to add an additional point of consideration,
> specifically for radosgw setups:
>
> What's the recommended way to set the number of PGs for the half-dozen
> pools that radosgw normally creates on its own (.rgw, .rgw.users,
> .rgw.buckets and so on)? I *think* wanting to set a custom number of
> PGs would require pre-creating these pools manually, but there may be
> a way -- undocumented? -- to instruct radosgw to set a user-configured
> number of PGs on pool creation. Insight on that would be much
> appreciated.

At the moment there's no way to tell radosgw how many PGs should be in
the pools it creates automatically. One way to get around that is to
create these pools yourself before running radosgw for the first time.

For the data pools, you can modify the set of pools that will be used
for data placement with the radosgw-admin 'pool add', 'pool rm', and
'pool list' commands. Note that buckets that have already been created
will retain their original pool.

Data in pools that were created automatically can now be copied to a
different pool (rados cppool), and pools can now be renamed (ceph osd
pool rename <oldname> <newname>). So you can create a new pool with the
required number of PGs, copy the old data into it, and then swap the
names of the old and new pools.

NOTE: this should not be done for the data pool (.rgw.buckets by
default)! It can only be done for the pools that hold the various
indexes and metadata. The bucket index in the data pool relies on
internal PG state, which will be broken if the pool's contents are
moved around.

Yehuda
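
To make the pre-creation workaround concrete, here is a rough sketch of
the commands involved. The pool names and PG counts below are only
illustrative (check which pools your radosgw version actually creates);
this is not an exact recipe from the thread.

    # Create radosgw's metadata/index pools with a chosen pg_num
    # *before* radosgw is started for the first time, so that radosgw
    # does not create them itself with the default PG count.
    ceph osd pool create .rgw 128
    ceph osd pool create .rgw.control 128
    ceph osd pool create .users.uid 128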
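
And a sketch of the copy-and-rename swap for a pool that was already
auto-created with too few PGs -- per the warning above, only for the
index/metadata pools, never for .rgw.buckets. The .rgw.new/.rgw.old
names are hypothetical, ideally radosgw is not writing to the pool
while it is copied, and the exact radosgw-admin flag syntax may differ
between versions.

    # Create a replacement pool with the desired number of PGs
    # (rados cppool needs the destination to exist already).
    ceph osd pool create .rgw.new 512
    # Copy the existing contents into it.
    rados cppool .rgw .rgw.new
    # Swap the names so radosgw keeps using the original pool name.
    ceph osd pool rename .rgw .rgw.old
    ceph osd pool rename .rgw.new .rgw

    # Data placement pools (where bucket data goes) are managed through
    # radosgw-admin instead, e.g. adding another data pool and listing
    # the current set:
    radosgw-admin pool add --pool=.rgw.buckets.custom
    radosgw-admin pool list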