Re: Firefly to Hammer Upgrade -- HEALTH_WARN; too many PGs per OSD (480 > max 300)

10 minus <t10tennn@xxxxxxxxx> · Tue, 1 Sep 2015 14:07:26 +0200

Hi Greg,

Thanks for the update.
I think the Ceph documentation should be reworded.

--snip--

http://ceph.com/docs/master/rados/operations/placement-groups/#choosing-the-number-of-placement-groups

* Less than 5 OSDs set pg_num to 128
* Between 5 and 10 OSDs set pg_num to 512
* Between 10 and 50 OSDs set pg_num to 4096
* If you have more than 50 OSDs, you need to understand the tradeoffs
    and how to calculate the pg_num value by yourself
--snip--
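
To make the concern concrete, here is a rough sketch of the PGs-per-OSD arithmetic applied to the preset quoted above. It is not taken from the Ceph source. It assumes that the health check counts every replica of a PG, that the pool uses replica size 3, and that the usual sizing target is about 100 PGs per OSD rounded up to a power of two. The function names are made up for illustration.

```python
def pgs_per_osd(pool_pg_nums, replica_size, num_osds):
    """Average number of PG replicas each OSD hosts (assumes uniform size)."""
    return sum(pool_pg_nums) * replica_size / num_osds


def suggested_total_pgs(num_osds, replica_size, target_per_osd=100):
    """Total PGs across all pools for ~target_per_osd, rounded up to a power of two."""
    raw = num_osds * target_per_osd / replica_size
    power = 1
    while power < raw:
        power *= 2
    return power


# A single pool created with the preset for "between 10 and 50 OSDs",
# on a 12-OSD cluster with 3 replicas:
print(pgs_per_osd([4096], replica_size=3, num_osds=12))   # 1024.0

# Total PG budget under the ~100-per-OSD assumption for the same cluster:
print(suggested_total_pgs(num_osds=12, replica_size=3))   # 512
```

Under these assumptions the preset alone lands well above the 300 limit in the warning, and every additional pool adds to the total.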

On Mon, Aug 31, 2015 at 10:31 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
On Mon, Aug 31, 2015 at 8:30 AM, 10 minus <t10tennn@xxxxxxxxx> wrote:

> Hi,
>
> I'm in the process of upgrading my Ceph cluster from Firefly to Hammer.
>
> The Ceph cluster has 12 OSDs spread across 4 nodes.
>
> The mons have been upgraded to Hammer. Since I created the pools with pg_num
> values of 512 and 256, I am a bit confused by the warning message.
>
> --snip--
>
> ceph -s
>     cluster a7160e16-0aaf-4e78-9e7c-7fbec08642f0
>      health HEALTH_WARN
>             too many PGs per OSD (480 > max 300)
>      monmap e1: 3 mons at {mon01=172.16.10.5:6789/0,mon02=172.16.10.6:6789/0,mon03=172.16.10.7:6789/0}
>             election epoch 116, quorum 0,1,2 mon01,mon02,mon03
>      osdmap e6814: 12 osds: 12 up, 12 in
>       pgmap v2961763: 1920 pgs, 4 pools, 230 GB data, 29600 objects
>             692 GB used, 21652 GB / 22345 GB avail
>                 1920 active+clean
>
> --snip--
>
> ## Conf and ceph output
>
> --snip--
>
> [global]
> fsid = a7160e16-0aaf-4e78-9e7c-7fbec08642f0
> public_network = 172.16.10.0/24
> cluster_network = 172.16.10.0/24
> mon_initial_members = mon01, mon02, mon03
> mon_host = 172.16.10.5,172.16.10.6,172.16.10.7
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
> mon_clock_drift_allowed = .15
> mon_clock_drift_warn_backoff = 30
> mon_osd_down_out_interval = 300
> mon_osd_report_timeout = 300
> mon_osd_full_ratio = .85
> mon_osd_nearfull_ratio = .75
> osd_backfill_full_ratio = .75
> osd_pool_default_size = 3
> osd_pool_default_min_size = 2
> osd_pool_default_pg_num = 512
> osd_pool_default_pgp_num = 512
> --snip--
>
> ceph df
>
> POOLS:
>     NAME        ID     USED       %USED     MAX AVAIL     OBJECTS
>     images      3      216G       0.97      7179G         27793
>     vms         4      14181M     0.06      7179G         1804
>     volumes     5      0          0         7179G         1
>     backups     6      0          0         7179G         0
>
> ceph osd pool get poolname pg_num
> images: 256
> backup: 512
> vms: 512
> volumes: 512
>
> --snip--
>
> Since it is only a warning, can I upgrade the OSDs without destroying the data,
> or should I roll back?

It's not a problem, just a diagnostic warning that appears to be
misbehaving. If you can create a bug at tracker.ceph.com listing what
Ceph versions are involved and exactly what's happened it can get
investigated, but you should feel free to keep upgrading. :)

-Greg
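
For reference, the 480 in the warning can be reproduced from the ceph -s output quoted in this thread. This is only a sketch of the arithmetic, not the monitor's actual code. It assumes that all four pools use size 3 (osd_pool_default_size = 3 in the posted config) and that the check counts PG replicas rather than PGs.

```python
total_pgs = 1920      # from "pgmap v2961763: 1920 pgs, 4 pools"
num_osds = 12         # from "osdmap e6814: 12 osds: 12 up, 12 in"
replica_size = 3      # assumption: every pool uses osd_pool_default_size = 3
warn_threshold = 300  # from the warning text "(480 > max 300)"

# Each PG is stored replica_size times, spread over num_osds OSDs.
per_osd = total_pgs * replica_size // num_osds
print(per_osd, per_osd > warn_threshold)  # 480 True
```

If the assumption about replica counting holds, the reported figure follows from the pool settings. Without counting replicas, the same cluster would come out at 160 PGs per OSD, below the limit.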

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com