all my osds are down, but ceph -s tells they are up and in.

sweil@xxxxxxxxxx (Sage Weil) · Mon, 8 Sep 2014 21:02:05 -0700 (PDT)

On Tue, 9 Sep 2014, yuelongguang wrote:
> hi,all
>
> that is crazy.
> 1.
> all my osds are down, but ceph -s tells they are up and in. why?

Peer OSDs normally handle failure detection.  If all OSDs are down, 
there is nobody to report the failures.

If the OSDs don't report any stats to the monitor for 5 or 10 minutes, the
monitor will eventually assume they are dead and mark them down.
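
To make the two detection paths above concrete, here is a toy model in Python. It is only a sketch and not Ceph code: the ToyCluster class, its method names, and the 600-second timeout are all made up for illustration (the timeout stands in for the "5 or 10 minutes" mentioned above). It assumes that any live OSD immediately reports a dead peer, and that the monitor otherwise falls back to the age of the last stats report.

```python
class ToyCluster:
    """Toy model of OSD failure detection; all names here are hypothetical."""

    def __init__(self, osd_ids, report_timeout=600.0):
        self.alive = set(osd_ids)        # ground truth: daemons actually running
        self.marked_up = set(osd_ids)    # the monitor's view (what "ceph -s" shows)
        self.last_stats = {osd: 0.0 for osd in osd_ids}
        self.report_timeout = report_timeout  # assumed value, in seconds

    def kill(self, osd):
        # The daemon stops, but the monitor is not told directly.
        self.alive.discard(osd)

    def tick(self, now):
        # Live OSDs keep sending stats to the monitor.
        for osd in self.alive:
            self.last_stats[osd] = now
        # Path 1: peer detection. It only works while some OSD is alive
        # to notice and report the dead ones.
        if self.alive:
            self.marked_up &= self.alive
        # Path 2: fallback. The monitor marks down any OSD whose stats
        # are older than the report timeout.
        for osd in list(self.marked_up):
            if now - self.last_stats[osd] > self.report_timeout:
                self.marked_up.discard(osd)


cluster = ToyCluster(range(5))
cluster.kill(0)
cluster.tick(1.0)      # peers report osd 0, so it is marked down quickly
for osd in (1, 2, 3, 4):
    cluster.kill(osd)
cluster.tick(2.0)      # nobody is left to report: the map is stale
stale_view = set(cluster.marked_up)
cluster.tick(700.0)    # stats are now older than the timeout
final_view = set(cluster.marked_up)
```

In this model stale_view still lists four OSDs as up even though none are running, which is the situation in the question, and final_view is empty once the timeout has passed.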

> 2.
> now all osds are down, a vm is using rbd as its disk, and inside the vm fio is
> r/wing the disk, but it hangs and can not be killed. why?

The IOs will block indefinitely until the cluster is available.  Once 
the OSDs are started, the VM will become responsive again.

sage

>
> thanks
>
> [root@cephosd2-monb ~]# ceph -v
> ceph version 0.81 (8de9501df275a5fe29f2c64cb44f195130e4a8fc)
>
> [root@cephosd2-monb ~]# ceph -s
>     cluster 508634f6-20c9-43bb-bc6f-b777f4bb1651
>      health HEALTH_WARN mds 0 is laggy
>      monmap e13: 3 mons at {cephosd1-mona=10.154.249.3:6789/0,cephosd2-monb=10.154.249.4:6789/0,cephosd3-monc=10.154.249.5:6789/0}, election epoch 154, quorum 0,1,2 cephosd1-mona,cephosd2-monb,cephosd3-monc
>      mdsmap e21: 1/1/1 up {0=0=up:active(laggy or crashed)}
>      osdmap e196: 5 osds: 5 up, 5 in
>       pgmap v21836: 512 pgs, 5 pools, 3115 MB data, 805 objects
>             9623 MB used, 92721 MB / 102344 MB avail
>                  512 active+clean