Re: pgs stuck inactive and unclean, too few PGs per OSD


 



Here it is:
esta@monitorOne:~$ sudo ceph osd tree
ID WEIGHT  TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
-3 4.39996 root defualt
-2 1.09999     host storageTwo
 0 0.09999         osd.0             up  1.00000          1.00000
 1 1.00000         osd.1             up  1.00000          1.00000
-4 1.09999     host storageFour
 2 0.09999         osd.2             up  1.00000          1.00000
 3 1.00000         osd.3             up  1.00000          1.00000
-5 1.09999     host storageLast
 4 0.09999         osd.4             up  1.00000          1.00000
 5 1.00000         osd.5             up  1.00000          1.00000
-6 1.09999     host storageOne
 6 0.09999         osd.6             up  1.00000          1.00000
 7 1.00000         osd.7             up  1.00000          1.00000
-1       0 root default
I have four storage nodes. Each of them has two independent drives to store data: one is a 120GB SSD and the other is a 1TB HDD. I set the weight of the SSD to 0.1 and the weight of the HDD to 1.0.
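
One thing worth noting in the tree above: all four hosts sit under a root named "defualt" (misspelled), while the separate "root default" at the bottom is empty with weight 0. If the CRUSH rule used by the rbd pool does its "take" step on that empty "default" root, no OSDs can ever be chosen and the PGs will sit in "creating" forever, which would match Christian's "no weight" suspicion. As a rough sketch only (not verified on this cluster; the bucket names are taken from the ceph osd tree output above), one way to check and fix that would be:

sudo ceph osd crush rule dump            # check which root the rule's "take" step uses
# If it takes "default", move the hosts out of the misspelled root into it:
sudo ceph osd crush move storageOne root=default
sudo ceph osd crush move storageTwo root=default
sudo ceph osd crush move storageFour root=default
sudo ceph osd crush move storageLast root=default
sudo ceph osd crush remove defualt       # remove the now-empty, misspelled root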




--
Zhen Wang
Shanghai Jiao Tong University


At 2015-10-08 11:32:52, "Christian Balzer" <chibi@xxxxxxx> wrote:
>
>Hello,
>
>On Thu, 8 Oct 2015 11:27:46 +0800 (CST) wikison wrote:
>
>> Hi,
>> I've removed the rbd pool and created it again. It picked up my
>> default settings but there are still some problems. After running "sudo
>> ceph -s", the output is as follows:
>>     cluster 0b9b05db-98fe-49e6-b12b-1cce0645c015
>>      health HEALTH_WARN
>>             512 pgs stuck inactive
>>             512 pgs stuck unclean
>>      monmap e1: 1 mons at {monitorOne=192.168.1.153:6789/0}
>>             election epoch 1, quorum 0 monitorOne
>>      osdmap e62: 8 osds: 8 up, 8 in
>>       pgmap v219: 512 pgs, 1 pools, 0 bytes data, 0 objects
>>             8460 MB used, 4162 GB / 4171 GB avail
>>                  512 creating
>>
>Output of "ceph osd tree" please.
>
>The only reason I can think of is if your OSDs are up, but have no weight.
>
>Christian
>
>> Ceph gets stuck creating the pgs forever. Those pgs are stuck in inactive
>> and unclean. And the Ceph pg query hangs forever. I googled this problem
>> and didn't get a clue. Is there anything I missed?
>> Any idea to help me?
>>
>> --
>> Zhen Wang
>>
>> At 2015-10-07 13:05:51, "Christian Balzer" <chibi@xxxxxxx> wrote:
>> >
>> >Hello,
>> >On Wed, 7 Oct 2015 12:57:58 +0800 (CST) wikison wrote:
>> >
>> >This is a very old bug, misfeature.
>> >And creeps up every week or so here, google is your friend.
>> >
>> >> Hi,
>> >> I have a cluster of one monitor and eight OSDs. These OSDs are running
>> >> on four hosts (each host has two OSDs). When I set up everything and
>> >> started Ceph, I got this:
>> >> esta@monitorOne:~$ sudo ceph -s
>> >> [sudo] password for esta:
>> >>     cluster 0b9b05db-98fe-49e6-b12b-1cce0645c015
>> >>      health HEALTH_WARN
>> >>             64 pgs stuck inactive
>> >>             64 pgs stuck unclean
>> >>             too few PGs per OSD (8 < min 30)
>> >
>> >Those 3 lines tell you pretty much all there is wrong.
>> >You did (correctly) set the default pg and pgp nums to something sensible
>> >(512) in your ceph.conf.
>> >Unfortunately when creating the initial pool (rbd) it still ignores
>> >those settings.
>> >
>> >You could try to increase those for your pool, which may or may not
>> >work.
>> >
>> >The easier and faster way is to remove the rbd pool and create it again.
>> >This should pick up your default settings.
>> >
>> >Christian
>> >
>> >>      monmap e1: 1 mons at {monitorOne=192.168.1.153:6789/0}
>> >>             election epoch 1, quorum 0 monitorOne
>> >>      osdmap e58: 8 osds: 8 up, 8 in
>> >>       pgmap v191: 64 pgs, 1 pools, 0 bytes data, 0 objects
>> >>             8460 MB used, 4162 GB / 4171 GB avail
>> >>                   64 creating
>> >>
>> >> How to deal with this HEALTH_WARN status?
>> >> This is my ceph.conf:
>> >> [global]
>> >> fsid = 0b9b05db-98fe-49e6-b12b-1cce0645c015
>> >> mon initial members = monitorOne
>> >> mon host = 192.168.1.153
>> >> filestore_xattr_use_omap = true
>> >> public network = 192.168.1.0/24
>> >> cluster network = 10.0.0.0/24
>> >> pid file = /var/run/ceph/$name.pid
>> >> auth cluster required = cephx
>> >> auth service required = cephx
>> >> auth client required = cephx
>> >> osd pool default size = 3
>> >> osd pool default min size = 2
>> >> osd pool default pg num = 512
>> >> osd pool default pgp num = 512
>> >> osd crush chooseleaf type = 1
>> >> osd journal size = 1024
>> >>
>> >> [mon]
>> >>
>> >> [mon.0]
>> >> host = monitorOne
>> >> mon addr = 192.168.1.153:6789
>> >>
>> >> [osd]
>> >>
>> >> [osd.0]
>> >> host = storageOne
>> >>
>> >> [osd.1]
>> >> host = storageTwo
>> >>
>> >> [osd.2]
>> >> host = storageFour
>> >>
>> >> [osd.3]
>> >> host = storageLast
>> >>
>> >> Could anybody help me?
>> >>
>> >> best regards,
>> >>
>> >> --
>> >> Zhen Wang
>> >
>> >--
>> >Christian Balzer           Network/Systems Engineer
>> >chibi@xxxxxxx              Global OnLine Japan/Fusion Communications
>> >http://www.gol.com/
>
>
>--
>Christian Balzer           Network/Systems Engineer
>chibi@xxxxxxx              Global OnLine Japan/Fusion Communications
>http://www.gol.com/
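
For reference, the two options Christian describes above correspond roughly to the following commands (a sketch only, not verified against this cluster; deleting a pool destroys any data in it, and pg_num can only be increased, never decreased):

# Option 1: raise the PG count on the existing rbd pool
sudo ceph osd pool set rbd pg_num 512
sudo ceph osd pool set rbd pgp_num 512

# Option 2: delete and recreate the pool so it picks up the ceph.conf defaults
sudo ceph osd pool delete rbd rbd --yes-i-really-really-mean-it
sudo ceph osd pool create rbd 512 512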


 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
