You might want to know about this change that's coming:
"This would be a semi-incompatible change with pre-luminous ceph CLI"
cheers,
Gregory
---------- Forwarded message ----------
From: Sage Weil <sweil@xxxxxxxxxx>
Date: Tue, Jun 13, 2017 at 12:34 PM
Subject: cluster health checks
To: jspray@xxxxxxxxxx
Cc: ceph-devel@xxxxxxxxxxxxxxx
I've put together a rework of the cluster health checks at
https://github.com/ceph/ceph/pull/15643
based on John's original proposal in
http://tracker.ceph.com/issues/7192
(with a few changes). I think it's pretty complete except that the
MDSMonitor new-style checks aren't implemented yet.
This would be a semi-incompatible change with pre-luminous ceph in that
- the structured (json/xml) health output is totally different
- the plaintext health *detail* output is different
- specific error messages are a bit different. I was reimplementing them
and took the liberty of revising what information was in the
summary and detail in several cases.
Let me know what you think!
Thanks-
sage
$ ceph -s
  cluster:
    id:     9ee7f49c-57c3-4686-afd1-75b3a8f08c73
    health: HEALTH_WARN
            2 osds down
            1 host (2 osds) down
            1 root (2 osds) down
            8 pgs stale

  services:
    mon: 3 daemons, quorum a,b,c
    mgr: x(active)
    osd: 2 osds: 0 up, 2 in

  data:
    pools:   1 pools, 8 pgs
    objects: 0 objects, 0 bytes
    usage:   414 GB used, 330 GB / 744 GB avail
    pgs:     8 stale+active+clean
$ ceph health detail -f json-pretty
{
    "checks": {
        "OSD_DOWN": {
            "severity": "HEALTH_WARN",
            "message": "2 osds down"
        },
        "OSD_HOST_DOWN": {
            "severity": "HEALTH_WARN",
            "message": "1 host (2 osds) down"
        },
        "OSD_ROOT_DOWN": {
            "severity": "HEALTH_WARN",
            "message": "1 root (2 osds) down"
        },
        "PG_STALE": {
            "severity": "HEALTH_WARN",
            "message": "8 pgs stale"
        }
    },
    "status": "HEALTH_WARN",
    "detail": {
        "OSD_DOWN": [
            "osd.0 (root=default,host=gnit) is down",
            "osd.1 (root=default,host=gnit) is down"
        ],
        "OSD_HOST_DOWN": [
            "host gnit (root=default) (2 osds) is down"
        ],
        "OSD_ROOT_DOWN": [
            "root default (2 osds) is down"
        ],
        "PG_STALE": [
            "pg 0.7 is stale+active+clean, acting [1,0]",
            "pg 0.6 is stale+active+clean, acting [0,1]",
            "pg 0.5 is stale+active+clean, acting [0,1]",
            "pg 0.4 is stale+active+clean, acting [0,1]",
            "pg 0.0 is stale+active+clean, acting [0,1]",
            "pg 0.1 is stale+active+clean, acting [1,0]",
            "pg 0.2 is stale+active+clean, acting [0,1]",
            "pg 0.3 is stale+active+clean, acting [0,1]"
        ]
    }
}
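
As an illustration (not from the original mail), here is a minimal Python sketch of how a monitoring script might consume the structured output above. The `ceph health detail -f json` invocation and the exit-code policy are assumptions of mine, and the final merged format may still change while the PR is under review.

#!/usr/bin/env python
# Minimal sketch: consume the structured health output shown above.
# The invocation and the exit-code policy are illustrative choices,
# not part of the proposal itself.
import json
import subprocess
import sys

health = json.loads(subprocess.check_output(
    ['ceph', 'health', 'detail', '-f', 'json']))

print('overall: %s' % health['status'])
for code, check in sorted(health.get('checks', {}).items()):
    # Each check has a stable code (e.g. OSD_DOWN) plus a severity and message.
    print('%s %s: %s' % (check['severity'], code, check['message']))
    # Per-check detail lines live in the parallel "detail" map.
    for line in health.get('detail', {}).get(code, []):
        print('    ' + line)

# Non-zero exit if anything is worse than HEALTH_OK, for cron-style use.
sys.exit(0 if health['status'] == 'HEALTH_OK' else 1)
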
$ ceph health detail
HEALTH_WARN 2 osds down; 1 host (2 osds) down; 1 root (2 osds) down; 8 pgs stale
OSD_DOWN 2 osds down
    osd.0 (root=default,host=gnit) is down
    osd.1 (root=default,host=gnit) is down
OSD_HOST_DOWN 1 host (2 osds) down
    host gnit (root=default) (2 osds) is down
OSD_ROOT_DOWN 1 root (2 osds) down
    root default (2 osds) is down
PG_STALE 8 pgs stale
    pg 0.7 is stale+active+clean, acting [1,0]
    pg 0.6 is stale+active+clean, acting [0,1]
    pg 0.5 is stale+active+clean, acting [0,1]
    pg 0.4 is stale+active+clean, acting [0,1]
    pg 0.0 is stale+active+clean, acting [0,1]
    pg 0.1 is stale+active+clean, acting [1,0]
    pg 0.2 is stale+active+clean, acting [0,1]
    pg 0.3 is stale+active+clean, acting [0,1]
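
Because this is a semi-incompatible change, external checks that parse the structured health output will likely want a small compatibility shim during the transition. Below is a sketch of one; the pre-luminous key names ("overall_status", "summary") are recalled from older releases rather than taken from this mail, so verify them against your cluster before relying on this.

# Compatibility sketch for tools that must handle both pre-luminous and the
# proposed luminous health JSON. The old key names are assumptions to verify;
# the new keys ("status", "checks", "detail") match the example output above.
def summarize_health(health):
    if 'checks' in health:
        # New-style: checks keyed by a stable code, each with severity + message.
        status = health['status']
        messages = ['%s: %s' % (code, c['message'])
                    for code, c in sorted(health['checks'].items())]
    else:
        # Old-style (assumed pre-luminous layout).
        status = health.get('overall_status', 'HEALTH_UNKNOWN')
        messages = [s['summary'] for s in health.get('summary', [])]
    return status, messages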