Hi List,
I am testing the stability of my Ceph cluster under power failure.
I brutally powered off two Ceph nodes, each hosting 90 OSDs, while client I/O was continuing.
Since then, some of the PGs in my cluster have been stuck in peering:
pgmap v3261136: 17408 pgs, 4 pools, 176 TB data, 5082 kobjects
236 TB used, 5652 TB / 5889 TB avail
8563455/38919024 objects degraded (22.003%)
13526 active+undersized+degraded
3769 active+clean
104 down+remapped+peering
9 down+peering
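As a quick sanity check (my own arithmetic, not from the pgmap itself), the degraded percentage reported above is just the ratio of degraded object instances to total object instances:

```python
# Check the degraded ratio shown in the pgmap line above.
degraded = 8563455
total = 38919024
pct = degraded / total * 100
print(f"{pct:.3f}%")  # matches the reported 22.003%
```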
I queried the stuck PGs (all on an EC pool with 7+2) and they report being blocked (full query: http://pastebin.com/pRkaMG2h ):
"probing_osds": [
"153(6)",
"183(3)",
"345(0)",
"401(7)",
"516(8)",
"622(1)",
"685(2)"
],
"blocked": "peering is blocked due to down osds",
"down_osds_we_would_probe": [
792
],
"peering_blocked_by": [
{
"osd": 792,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us proceed"
}
]
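To pull the blocking OSDs out of such a query programmatically, here is a minimal sketch I use (the JSON below is just the relevant fields trimmed from the `ceph pg <pgid> query` output above):

```python
import json

# Trimmed example of the fields we care about from `ceph pg <pgid> query`.
query = json.loads("""
{
  "blocked": "peering is blocked due to down osds",
  "down_osds_we_would_probe": [792],
  "peering_blocked_by": [
    {"osd": 792, "current_lost_at": 0,
     "comment": "starting or marking this osd lost may let us proceed"}
  ]
}
""")

# Print each OSD that is blocking peering, with the advice Ceph attaches.
for entry in query.get("peering_blocked_by", []):
    print(f"blocked by osd.{entry['osd']}: {entry['comment']}")
```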
osd.792 sits on one of the nodes I powered off, and I believe the client I/O associated with this PG is paused as well.
I have checked the troubleshooting page on the Ceph website ( http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ ); it says that starting the OSD or marking it lost will let peering proceed.
I am sure my cluster was healthy before the power outage. What I am wondering is: if such a power outage happened in a production environment, would it also freeze my client I/O if I did nothing? Since I only lost 2 redundancies (my erasure code is 7+2), I would expect the cluster to keep serving normally.
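My understanding of the erasure-code availability math, as a sketch under my own assumptions (k=7, m=2; I believe min_size defaults to k+1 on EC pools, which I have not yet verified with `ceph osd pool get <pool> min_size`):

```python
# Sketch of EC availability reasoning for a k=7, m=2 profile.
# Assumption: a PG stays active only while at least min_size shards are
# up; I believe min_size defaults to k + 1 on EC pools (unverified).
K, M = 7, 2
TOTAL = K + M

def pg_state(up_shards: int, min_size: int = K + 1) -> str:
    """Rough PG availability given the number of surviving shards."""
    if up_shards >= min_size:
        return "active"
    if up_shards >= K:
        # Enough shards to reconstruct data, but below min_size.
        return "readable but below min_size (I/O blocked)"
    return "data unavailable until shards recover"

for lost in range(0, 4):
    print(lost, "shards lost ->", pg_state(TOTAL - lost))
```

If this assumption is right, losing 2 shards of 9 leaves exactly k=7 survivors, which is enough to reconstruct the data but one short of min_size, and that would explain why I/O freezes even though no data is lost.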
Or am I doing something wrong? Please give me some suggestions, thanks.
Sincerely,
Craig Chi