Thanks, Greg. Following your lead, we discovered that the proper 'set_choose_tries xxx' value had not been applied to *this* pool's rule, and we updated the cluster accordingly. We then moved a random OSD out and back in to 'kick' things, but no joy: we still have the 4 'remapped' PGs.

BTW: the 4 PGs look OK from a basic rule perspective: they're on different OSDs on different hosts, which is what we're concerned with... but it seems CRUSH has different goals for them, and they are inactive.
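(For reference, the set_choose_tries change above was applied via the usual decompile/edit/recompile cycle, roughly as follows; the file names are arbitrary and 'xxx' stands for the value we picked:)

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt: inside this pool's rule, add near the top of the steps:
  #   step set_choose_tries xxx
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new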
So, back to the basic question: can we get just the 'remapped' PGs to re-sort themselves without causing massive data movement, or is a complete re-sort the only way to get to the desired CRUSH state?
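(In other words, for each of these PGs the 'up' set CRUSH now wants differs from the 'acting' set they currently sit on, which is easy to see per PG with something like the following, using the first stuck PG from the dump below as an example:)

  ceph pg map 11.6e5     # shows the up set vs. the acting set for that PG
  ceph pg 11.6e5 query   # full peering/recovery detail, if that helps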
As for the force_create_pg command: if it creates a blank PG element on a specific OSD (yes?), what happens to the existing PG elements on other OSDs? Could we use force_create_pg followed by a 'pg repair' command to get things back to the proper state, in a very targeted way?
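(To be explicit, the targeted sequence we have in mind would be something like the following; we have not tried it yet, and the PG id is just one of the stuck PGs from the dump below:)

  ceph pg force_create_pg 11.6e5   # recreate the PG (blank) where CRUSH wants it?
  ceph pg repair 11.6e5            # then ask the OSDs to repair/reconcile the copies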
For reference, below is the (reduced) output of dump_stuck:
pg_stat  objects  mip  degr  unf  bytes       log   disklog  state     state_stamp                 v            reported      up      up_pri  acting  acting_pri
11.6e5   284      0    0     0    2366787669  3012  3012     remapped  2015-04-23 13:19:02.373507  68310'49068  78500:123712  [0,92]  0       [0,84]  0
11.8bb   283      0    0     0    2349260884  3001  3001     remapped  2015-04-23 13:19:02.550735  70105'49776  78500:125026  [0,92]  0       [0,88]  0
11.e2f   280      0    0     0    2339844181  3001  3001     remapped  2015-04-23 13:18:59.299589  68310'51082  78500:119555  [77,4]  77      [77,34] 77
11.323   282      0    0     0    2357186647  3001  3001     remapped  2015-04-23 13:18:58.970396  70105'48961  78500:123987  [0,37]  0       [0,19]  0