learning about increasing osd / pg_num for a pool

I started a cluster with 9 OSDs across 3 nodes, then expanded it to 419 OSDs across 7 nodes. Along the way, thanks to help earlier on this list, I increased pg_num/pgp_num on the rbd pool.

Tonight I started doing some perf testing and quickly realized that I never updated pg_num/pgp_num on the .rgw.buckets and .rgw.buckets.index pools. When I went to do so, I was told the pg_num was too high with "new PGs on ~8 OSDs exceed per-OSD max of 32". I did a lot of reading up on OSDs, PGs, and the CRUSH map to try to figure out why .rgw.buckets was only seeing 8 OSDs when it should see 419.

Through trial and error I increased pg_num as far as it would let me and let everything rebalance, and afterwards the number of OSDs reported for the pool had gone up. The only way I currently know to see how many OSDs a pool can reach is to try to set pg_num to some ridiculously high value and read the count out of the error:

```
root@ljb01:# ceph osd pool set .rgw.buckets.index pg_num 9999999999                                                                                                                  
2016-02-13 01:27:06.913291 7f6a1c9cc700  0 -- :/3369124606 >> 172.29.4.154:6789/0 pipe(0x7f6a20064550 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f6a2005e220).fault
Error E2BIG: specified pg_num 9999999999 is too large (creating 9999999743 new PGs on ~256 OSDs exceeds per-OSD max of 32)
```
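
From the reading I've done, the "~256 OSDs" in that error seems to come from the CRUSH rule the pool uses, i.e. which part of the CRUSH tree the pool is allowed to map onto. Here is a sketch of what I've been poking at to inspect that (the pool name is just the one from above; the ruleset it reports will obviously be cluster-specific):

```
# Which CRUSH ruleset the pool is assigned to:
ceph osd pool get .rgw.buckets.index crush_ruleset

# Dump the CRUSH rules to see which root/bucket type that ruleset draws from:
ceph osd crush rule dump

# Show the CRUSH tree itself, to count the OSDs under that root:
ceph osd tree
```

I'm guessing that if the rule behind the rgw pools selects from a smaller root or bucket than the rbd pool's rule does, that would explain the ~8 vs ~256 difference, but I'd appreciate confirmation that this is the right place to look.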

My questions:

1. How can I see/determine the number of OSDs a pool can access?
2. For ".rgw", ".rgw.buckets", and ".rgw.buckets.index" how should I plan out the PG number for these?  We're only doing object storage, so .rgw.buckets will get the most objects, and .rgw.buckets.index will (I assume) get a smaller amount by some ratio.
3. What are .rgw, .rgw.buckets.extra, and .rgw.control used for?

More generally, I'm looking for guidance on how to plan the pool and PG configuration for storing a lot of objects across a lot of OSDs.
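
For what it's worth, the rule of thumb I've been working from is roughly 100 PGs per OSD divided by the replica count, split across pools by their expected share of the data and rounded to a power of two. A back-of-the-envelope sketch, with my assumptions spelled out (the replica count and per-pool shares below are guesses on my part, not measurements):

```
# Assumptions: ~420 OSDs, 3x replication, target of ~100 PGs per OSD
# across all pools combined.
OSDS=420
REPLICAS=3
TOTAL_PGS=$(( OSDS * 100 / REPLICAS ))   # overall PG budget, ~14000
echo "total PG budget: ${TOTAL_PGS}"

# Split that budget by expected data share, then round each pool to a
# power of two:
#   .rgw.buckets        ~90% -> ~12600 -> pg_num 8192 or 16384
#   .rgw.buckets.index   ~5% -> ~700   -> pg_num 512 or 1024
#   .rgw, .rgw.gc, etc.  ~5% -> small  -> pg_num 64-128 each
```

Does that line of thinking hold for an RGW-only workload, or should the index pool get proportionally more PGs given its read/write traffic? For reference, the current pool stats are below.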

```
pool name                 KB      objects       clones     degraded      unfound           rd        rd KB           wr        wr KB
.log                       0          127            0            0            0      5076804      5076677      3384550            0
.rgw                    1607         8676            0            0            0       135382        94455        62465        17676
.rgw.buckets        61241052        34145            0            0            0       790047     36795213       328715     64702857
.rgw.buckets.extra            0          330            0            0            0      6910047      4607382         1723            0
.rgw.buckets.index            0         8083            0            0            0     26064639     26084041     18862332            0
.rgw.control               0            8            0            0            0            0            0            0            0
.rgw.gc                    0           32            0            0            0       240948       249931       192926            0
.rgw.root                  1            3            0            0            0          864          576            3            3
```
