Re: Kicking 'Remapped' PGs

Thanks, Greg.  Following your lead, we discovered that the proper 'set_choose_tries xxx' value had not been applied to *this* pool's rule, and we updated the cluster accordingly. We then moved a random OSD out and back in to 'kick' things, but no joy: we still have the 4 'remapped' PGs.  BTW, the 4 PGs look OK from a basic rule perspective: each is on different OSDs on different hosts, which is what we care about... but it seems CRUSH has different goals for them, and they remain inactive.
So, back to the basic question: can we get just the 'remapped' PGs to re-sort themselves without causing massive data movement, or is a complete re-sort the only way to reach the desired CRUSH state?
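
For anyone following along, here is a rough sketch of how one can confirm the rule change actually took and replay the mapping offline before moving any data (the rule id '1' and '--num-rep 2' below are placeholders for the real rule id and pool size):

    # Pull and decompile the current CRUSH map; the pool's rule should now
    # contain a line like "step set_choose_tries 100" (value as chosen):
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt

    # Dry-run the mapping against the compiled map; any PGs CRUSH still
    # cannot place are reported as bad mappings:
    crushtool --test -i crushmap.bin --rule 1 --num-rep 2 --show-bad-mappings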

As for the force_create_pg command: if it creates a blank PG on a specific set of OSDs (yes?), what happens to any existing copies of that PG on other OSDs? Could we use force_create_pg followed by a 'pg repair' command to get things back to the proper state (in a very targeted way)?
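
(For concreteness, the invocations we have in mind would look like the lines below, using the first pgid from the dump further down; we have not actually run these.)

    ceph pg force_create_pg 11.6e5
    ceph pg repair 11.6e5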

For reference, below is the (reduced) output of 'ceph pg dump_stuck':

pg_stat  objects  mip  degr  unf  bytes       log   disklog  state     state_stamp                 v            reported      up      up_pri  acting   acting_pri
11.6e5   284      0    0     0    2366787669  3012  3012     remapped  2015-04-23 13:19:02.373507  68310'49068  78500:123712  [0,92]  0       [0,84]   0
11.8bb   283      0    0     0    2349260884  3001  3001     remapped  2015-04-23 13:19:02.550735  70105'49776  78500:125026  [0,92]  0       [0,88]   0
11.e2f   280      0    0     0    2339844181  3001  3001     remapped  2015-04-23 13:18:59.299589  68310'51082  78500:119555  [77,4]  77      [77,34]  77
11.323   282      0    0     0    2357186647  3001  3001     remapped  2015-04-23 13:18:58.970396  70105'48961  78500:123987  [0,37]  0       [0,19]   0
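
(Per-PG detail is available with 'pg map' and 'pg query' if it helps, e.g. for the first one above:)

    ceph pg map 11.6e5
    ceph pg 11.6e5 query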



On Apr 30, 2015, at 10:30 AM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:

Remapped PGs that are stuck that way mean that CRUSH is failing to map
them appropriately — I think we talked about the circumstances around
that previously. :) So nudging CRUSH can't do anything; it will just
fail to map them appropriately again. (And indeed this is what happens
whenever anyone does something to that PG or the OSD Map gets
changed.)

The force_create_pg command does exactly what it sounds like: it tells
the OSDs which should currently host the named PG to create it. You
shouldn't need to run it and I don't remember exactly what checks it
goes through, but it's generally for when you've given up on
retrieving any data out of a PG whose OSDs died and want to just start
over with a completely blank one.

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
