That will fix itself over time. "remapped" just means that Ceph is moving the data around; it's normal to see PGs in the remapped and/or backfilling state after OSD restarts.
The count should go down steadily. How long it takes depends on how much data is in the PGs, how fast your hardware is, how many OSDs are affected, and how much you allow recovery to impact cluster performance. Mine currently take about 20 minutes per PG. If all 47 are on the same OSD, it'll be a while; if they're evenly split between multiple OSDs, parallelism will speed that up.
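If you want to watch the progress and see where those PGs live, something along these lines should work (assuming the Firefly/Giant-era CLI; the exact output columns may differ on your version):

    # follow recovery progress live
    ceph -w

    # list the PGs that are stuck unclean, with their acting OSD sets
    ceph pg dump_stuck unclean

    # show only the remapped PGs from a full pg dump
    ceph pg dump | grep remapped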
On Tue, Oct 21, 2014 at 1:22 AM, Harald Rößler <Harald.Roessler@xxxxxx> wrote:
Hi all,

thank you for your support, the file system is not degraded any more. Now I have a negative degraded count :-)

2014-10-21 10:15:22.303139 mon.0 [INF] pgmap v43376478: 3328 pgs: 3281 active+clean, 47 active+remapped; 1609 GB data, 5022 GB used, 1155 GB / 6178 GB avail; 8034B/s rd, 3548KB/s wr, 161op/s; -1638/1329293 degraded (-0.123%)

but ceph still reports a warning: HEALTH_WARN 47 pgs stuck unclean; recovery -1638/1329293 degraded (-0.123%)

I think this warning is reported because there are 47 active+remapped PGs. Any ideas how to fix that now?

Kind Regards
Harald Roessler

On 21.10.2014, at 01:03, Craig Lewis <clewis@xxxxxxxxxxxxxxxxxx> wrote:

I've been in a state where reweight-by-utilization was deadlocked (not the daemons, but the remap scheduling). After successive osd reweight commands, two OSDs wanted to swap PGs, but they were both too full. I ended up temporarily increasing mon_osd_nearfull_ratio to 0.87. That removed the impediment and everything finished remapping. Everything went smoothly, and I changed it back when all the remapping finished.

Just be careful if you need to get close to mon_osd_full_ratio. Ceph does greater-than on these percentages, not greater-than-or-equal. You really don't want the disks to go over mon_osd_full_ratio, because all external IO will stop until you resolve that.

On Mon, Oct 20, 2014 at 10:18 AM, Leszek Master <keksior@xxxxxxxxx> wrote:

You can set a lower weight on the full OSDs, or try changing the osd_near_full_ratio parameter in your cluster from 85 to, for example, 89. But I don't know what can go wrong when you do that.

2014-10-20 17:12 GMT+02:00 Wido den Hollander <wido@xxxxxxxx>:

On 10/20/2014 05:10 PM, Harald Rößler wrote:
> Yes, tomorrow I will get the replacement for the failed disk; getting a new node with many disks will take a few days.
> No other idea?
>
If the disks are all full, then no.
Sorry to say this, but it came down to poor capacity management. Never
let any disk in your cluster fill over 80% to prevent these situations.
Wido
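A quick way to keep an eye on that is to check the fill level of each OSD directly (assuming the OSD data directories are in the default location; newer releases also have "ceph osd df"):

    # which OSDs the cluster itself flags as near full or full
    ceph health detail | grep -i full

    # raw filesystem usage of each OSD data directory on a node
    df -h /var/lib/ceph/osd/*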
> Harald Rößler
>
>
>> Am 20.10.2014 um 16:45 schrieb Wido den Hollander <wido@xxxxxxxx>:
>>
>> On 10/20/2014 04:43 PM, Harald Rößler wrote:
>>> Yes, I had some OSDs which were near full. After that I tried to fix the problem with "ceph osd reweight-by-utilization", but this did not help. Then I set the near-full ratio to 88% with the idea that the remapping would fix the issue. A restart of the OSDs doesn't help either. At the same time I had a hardware failure of one disk. :-( After that failure the recovery process started at "degraded ~ 13%" and stopped at 7%.
>>> Honestly, I am scared at the moment that I am doing the wrong operation.
>>>
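If reweight-by-utilization does not move enough data off the fullest OSDs, nudging them down individually sometimes works better; a sketch only, with the OSD id and weight below as placeholders to adjust for your cluster:

    # temporarily lower the override weight of a nearly full OSD
    ceph osd reweight 12 0.90

    # restore it once backfilling has evened things out
    ceph osd reweight 12 1.0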
>>
>> Any chance of adding a new node with some fresh disks? It seems like you
>> are operating at the storage capacity limit of the nodes and that your
>> only remedy would be adding more spindles.
>>
>> Wido
>>
>>> Regards
>>> Harald Rößler
>>>
>>>
>>>
>>>> Am 20.10.2014 um 14:51 schrieb Wido den Hollander <wido@xxxxxxxx>:
>>>>
>>>> On 10/20/2014 02:45 PM, Harald Rößler wrote:
>>>>> Dear All
>>>>>
>>>>> I have an issue with my cluster at the moment: the recovery process has stopped.
>>>>>
>>>>
>>>> See this: 2 active+degraded+remapped+backfill_toofull
>>>>
>>>> 156 pgs backfill_toofull
>>>>
>>>> You have one or more OSDs which are too full, and that causes recovery to
>>>> stop.
>>>>
>>>> If you add more capacity to the cluster, recovery will continue and finish.
>>>>
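If adding capacity right away is not an option, the backfill_toofull threshold can be loosened a little as a stop-gap. A sketch, assuming a Firefly/Giant-era release where osd_backfill_full_ratio (default 0.85) is still an OSD option; treat the values as examples and set them back once there is headroom again:

    # allow backfill onto OSDs up to 90% full instead of the 85% default
    ceph tell osd.* injectargs '--osd_backfill_full_ratio 0.90'

    # optionally also raise the cluster-wide nearfull warning threshold
    ceph pg set_nearfull_ratio 0.88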
>>>>> ceph -s
>>>>> health HEALTH_WARN 188 pgs backfill; 156 pgs backfill_toofull; 4 pgs backfilling; 55 pgs degraded; 49 pgs recovery_wait; 297 pgs stuck unclean; recovery 111487/1488290 degraded (7.491%)
>>>>> monmap e2: 3 mons at {0=10.99.10.10:6789/0,12=10.99.10.22:6789/0,6=10.99.10.16:6789/0}, election epoch 332, quorum 0,1,2 0,12,6
>>>>> osdmap e6748: 24 osds: 23 up, 23 in
>>>>> pgmap v43314672: 3328 pgs: 3031 active+clean, 43 active+remapped+wait_backfill, 3 active+degraded+wait_backfill, 96 active+remapped+wait_backfill+backfill_toofull, 31 active+recovery_wait, 19 active+degraded+wait_backfill+backfill_toofull, 36 active+remapped, 3 active+remapped+backfilling, 18 active+remapped+backfill_toofull, 6 active+degraded+remapped+wait_backfill, 15 active+recovery_wait+remapped, 21 active+degraded+remapped+wait_backfill+backfill_toofull, 1 active+recovery_wait+degraded, 1 active+degraded+remapped+backfilling, 2 active+degraded+remapped+backfill_toofull, 2 active+recovery_wait+degraded+remapped; 1698 GB data, 5206 GB used, 971 GB / 6178 GB avail; 24382B/s rd, 12411KB/s wr, 320op/s; 111487/1488290 degraded (7.491%)
>>>>>
>>>>>
>>>>> I have tried to restart all OSDs in the cluster, but that does not help to finish the recovery of the cluster.
>>>>>
>>>>> Does anyone have any idea?
>>>>>
>>>>> Kind Regards
>>>>> Harald Rößler
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Wido den Hollander
>>>> Ceph consultant and trainer
>>>> 42on B.V.
>>>>
>>>> Phone: +31 (0)20 700 9902
>>>> Skype: contact42on
>>>
>>
>>
>> --
>> Wido den Hollander
>> Ceph consultant and trainer
>> 42on B.V.
>>
>> Phone: +31 (0)20 700 9902
>> Skype: contact42on
>
--
Wido den Hollander
Ceph consultant and trainer
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com