On Mon, Feb 18, 2013 at 7:00 AM, femi anjorin <femi.anjorin@xxxxxxxxx> wrote:
> Hi,
>
> Please can somebody help with two questions:
>
> 1. I have 96 OSDs and 18624 PGs. Ceph created the placement groups
> itself when I issued the mkcephfs command initially.
>
> I don't know how Ceph arrived at that number of PGs. I thought the
> system would use the formula in this reference,
> http://ceph.com/docs/master/rados/operations/placement-groups/ ,
> an average of 100 per OSD, but apparently there might be some other
> factors considered?

This is just because mkcephfs uses a very basic left-shift on the number of OSDs to arrive at the number of PGs; that number is only a rough guide anyway.

> 2. What would be a good/ideal placement group number for the data
> pool and the metadata pool in my cluster with 96 OSDs?

This is still a bit more complicated than it should be. I've reproduced my answer to a similar question below (since unfortunately ceph-users still isn't showing up in gmane):

Unfortunately, the number of PGs isn't really that cut and dried. The recommendation of 100 per OSD is based on statistical tests of the evenness of the data distribution across the cluster, but those tests were all run with only one pool in the cluster. If your pools all see roughly the same amount of usage /and they had uncorrelated PG placements/, then this distribution would roughly maintain itself if you split that 100 PGs/OSD across multiple pools. Unfortunately, as Sage mentioned, the PG placements are currently correlated (whoops!).

Now, your OSDs should be able to handle quite a lot more than 100 PGs each. Sam guesstimates that (modulo weird hardware configs) you don't really run into trouble until each OSD is hosting in the neighborhood of 5000 PGs (so 1600-2500 PGs/OSD with 3x or 2x replication), so I'd bias toward a per-pool count close to 100/OSD, and then reduce it if necessary to keep your total from getting ridiculous.
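If it helps, the rule of thumb above can be sketched roughly like this. Note this is my own illustration, not the actual mkcephfs heuristic: the function name is made up, and the round-up-to-a-power-of-two step is just a common convention from the placement-groups docs.

```python
import math

def suggested_pg_num(num_osds, pgs_per_osd=100, replication=3, num_pools=1):
    """Suggest a per-pool pg_num: aim for roughly pgs_per_osd PG
    replicas per OSD across all pools, then round up to the next
    power of two. Illustrative only, not the mkcephfs left-shift."""
    # Total PGs across the cluster such that each OSD holds about
    # pgs_per_osd replicas, counting every replica of every PG.
    total_pgs = num_osds * pgs_per_osd / replication
    # Divide the budget evenly among the pools.
    per_pool = total_pgs / num_pools
    # Round up to a power of two.
    return 2 ** math.ceil(math.log2(per_pool))

# 96 OSDs, 3x replication, the three default pools (data, metadata, rbd):
print(suggested_pg_num(96, replication=3, num_pools=3))  # 2048
```

With 96 OSDs that lands on 2048 PGs per pool, i.e. 6144 PGs total at 3 pools, which is in the same ballpark as your 18624 once you remember mkcephfs's own heuristic runs hotter than 100/OSD.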
Of course, the long-term vision, once the PG merge functionality is written and splitting is a bit more baked, is that the cluster will auto-scale your PG counts based on the quality of the data distribution and the amount of data in each PG.

Hope this clarifies the tradeoffs you're making a bit more! :)
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com