Re: Re: Re: too many PGs per OSD (307 > max 300)

On Fri, Jul 29, 2016 at 04:46:54AM +0000, zhu tong wrote:
> Right, that was the formula I used to calculate osd_pool_default_pg_num in our
> test cluster.
> 
> 
> 7 OSDs, 11 pools, osd_pool_default_pg_num is calculated to be 256, but ceph
> status shows
> 
> health HEALTH_WARN
>             too many PGs per OSD (5818 > max 300)
>      monmap e1: 1 mons at {open-kvm-app63=192.168.32.103:6789/0}
>             election epoch 1, quorum 0 open-kvm-app63
>      osdmap e143: 7 osds: 7 up, 7 in
>       pgmap v717609: 6916 pgs, 11 pools, 1617 MB data, 4577 objects
>             17600 MB used, 3481 GB / 3498 GB avail
>                 6916 active+clean
> 
> How so?

It says there are 6916 pgs; 6916 / 11 = 629, so about 629 pgs per pool, which is
much larger than the 256 you mentioned.

PGs per OSD = total pgs * pool size / number of OSDs, so pool size =
5818 * 7 / 6916, roughly 5.9, so I think you have quite a large pool size?

You can get pg_num and pool size with the commands below:

$ ceph osd pool get <pool> pg_num
pg_num: 4096
$ ceph osd pool get <pool> size
size: 3
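
As a sanity check, the arithmetic above can be scripted. This is only a rough
sketch: the pool_size of 3 is the value from the example output above, and real
clusters have a different size per pool, so treat the result as an estimate.

```shell
# Rough sketch: estimate PGs per OSD from cluster totals (integer math).
# Values come from the status output above; pool sizes vary per pool.
total_pgs=6916
pool_size=3      # replica count, as reported by `ceph osd pool get <pool> size`
num_osds=7

pgs_per_osd=$(( total_pgs * pool_size / num_osds ))
echo "PGs per OSD: ${pgs_per_osd}"

# Solving the same equation backwards for the reported 5818:
# implied average pool size = 5818 * 7 / 6916, roughly 5.9
```

With size 3 this gives 2964; the reported 5818 is what implies an average pool
size near 5.9.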


-- 
Thanks,
Chengwei
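
P.S. The pgcalc-style estimate Christian mentions below can be sketched roughly
like this. This is my own sketch, assuming size=3 replication and equal-sized
pools; pgcalc itself accounts for more factors, such as the expected %data per
pool.

```shell
# Sketch of the pgcalc-style estimate:
#   per-pool pg_num = (target PGs per OSD * OSD count) / (size * pool count),
# rounded up to the next power of two.
target_per_osd=100   # common target; small clusters sometimes aim higher
osds=40
size=3
pools=1

raw=$(( target_per_osd * osds / (size * pools) ))

# round up to the next power of two
pg_num=1
while [ "$pg_num" -lt "$raw" ]; do pg_num=$(( pg_num * 2 )); done
echo "suggested pg_num: ${pg_num}"
```

For the 40-OSD, single-pool case discussed in this thread it yields 2048, which
matches what Christian says pgcalc would recommend at most.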

> 
> 
> Thanks.
> 
> ________________________________
> From: Christian Balzer <chibi@xxxxxxx>
> Sent: July 29, 2016 3:31:18
> To: ceph-users@xxxxxxxxxxxxxx
> Cc: zhu tong
> Subject: Re: Re: too many PGs per OSD (307 > max 300)
>  
> 
> Hello,
> 
> On Fri, 29 Jul 2016 03:18:10 +0000 zhu tong wrote:
> 
> > The same problem has been confusing me recently too; I am trying to figure
> > out the relationship (an equation would be best) among the number of pools,
> > OSDs and PGs.
> >
> The pgcalc tool and the equation on that page are your best bet/friend.
>  http://ceph.com/pgcalc/
> 
> > For example, having 10 OSDs and 7 pools in one cluster, and
> > osd_pool_default_pg_num = 128, how many PGs would the health status show?
> > I have seen some recommend calculating the other way round -- inferring the
> > osd_pool_default_pg_num value from a fixed number of OSDs and PGs -- but when
> > I try it the way mentioned above, the two results do not match.
> >
> Number of PGs per OSD is your goal.
> To use a simpler example, 20 OSDs, 4 pools, all of equal (expected amount
> of data) size.
> So that's 1024 total PGs (about 150 per OSD),  thus 256 per pool.
>  
> Again, see pgcalc.
> 
> Christian
> > Thanks.
> > ________________________________
> > From: ceph-users <ceph-users-bounces@xxxxxxxxxxxxxx> on behalf of Christian
> > Balzer <chibi@xxxxxxx>
> > Sent: July 29, 2016 2:47:59
> > To: ceph-users
> > Subject: Re: too many PGs per OSD (307 > max 300)
> >
> > On Fri, 29 Jul 2016 09:59:38 +0800 Chengwei Yang wrote:
> >
> > > Hi list,
> > >
> > > I just followed the placement group guide to set pg_num for the rbd pool.
> > >
> > How many other pools do you have, or is that the only pool?
> >
> > The numbers mentioned are for all pools, not per pool, something that
> > isn't abundantly clear from the documentation either.
> >
> > >   "
> > >   Less than 5 OSDs set pg_num to 128
> > >   Between 5 and 10 OSDs set pg_num to 512
> > >   Between 10 and 50 OSDs set pg_num to 4096
> > >   If you have more than 50 OSDs, you need to understand the tradeoffs and
> > >   how to calculate the pg_num value by yourself
> > >   For calculating pg_num value by yourself please take help of pgcalc tool
> > >   "
> > >
> > You should have heeded the hint about pgcalc, which is by far the best
> > thing to do.
> > The above numbers are an (imprecise) attempt to give a quick answer to a
> > complex question.
> >
> > > Since I have 40 OSDs, I set pg_num to 4096 according to the above
> > > recommendation.
> > >
> > > However, after setting both pg_num and pgp_num to 4096, I found that my
> > > ceph cluster was in **HEALTH_WARN** status, which surprised me and still
> > > confuses me.
> > >
> > PGcalc would recommend 2048 PGs at most (for a single pool) with 40 OSDs.
> >
> > I assume the above high number of 4096 stems from the wisdom that with
> > small clusters more PGs than normally recommended (100 per OSD) can be
> > helpful.
> > It was also probably written before those WARN calculations were added to
> > Ceph.
> >
> > The above would better read:
> > ---
> > Use PGcalc!
> > [...]
> > Between 10 and 20 OSDs set pg_num to 1024
> > Between 20 and 40 OSDs set pg_num to 2048
> >
> > Over 40 definitely use and understand PGcalc.
> > ---
> >
> > >   cluster bf6fa9e4-56db-481e-8585-29f0c8317773
> > >      health HEALTH_WARN
> > >             too many PGs per OSD (307 > max 300)
> > >
> > > I see the cluster also says "4096 active+clean" so it's safe, but I do not
> > > like the HEALTH_WARN at all.
> > >
> > You can ignore it, but yes, it is annoying.
> >
> > > As I know (from ceph -s output), the recommended pg_num per OSD is in the
> > > range [30, 300]; any pg_num outside this range will bring the cluster to
> > > HEALTH_WARN.
> > >
> > > So what I would like to say: is the document misleading? Should we fix it?
> > >
> > Definitely.
> >
> > Christian
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx    Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@xxxxxxxxxxxxxx
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> --
> Christian Balzer        Network/Systems Engineer               
> chibi@xxxxxxx    Global OnLine Japan/Rakuten Communications
> http://www.gol.com/
> SECURITY NOTE: file ~/.netrc must not be accessible by others



