On Wed, 9 Jan 2013, Roman Hlynovskiy wrote:
> Thanks a lot Greg,
>
> that was the black magic command I was looking for )
>
> I deleted some obsolete data and reached these figures:
>
> chef@cephgw:~$ ./clu.sh exec "df -kh" | grep osd
> /dev/mapper/vg00-osd  252G  153G  100G  61% /var/lib/ceph/osd/ceph-0
> /dev/mapper/vg00-osd  252G  180G   73G  72% /var/lib/ceph/osd/ceph-1
> /dev/mapper/vg00-osd  252G  213G   40G  85% /var/lib/ceph/osd/ceph-2
>
> which, in comparison to the previous figures:
>
> /dev/mapper/vg00-osd  252G  173G   80G  69% /var/lib/ceph/osd/ceph-0
> /dev/mapper/vg00-osd  252G  203G   50G  81% /var/lib/ceph/osd/ceph-1
> /dev/mapper/vg00-osd  252G  240G   13G  96% /var/lib/ceph/osd/ceph-2
>
> show that 20 GB were removed from osd-1, 23 GB from osd-2 and 27 GB from osd-3.
> So the cleaned-up space is also somewhat disproportionate.
>
> At the same time:
>
> chef@cephgw:~$ ceph osd tree
>
> # id    weight  type name               up/down reweight
> -1      3       pool default
> -3      3               rack unknownrack
> -2      1                       host ceph-node01
> 0       1                               osd.0   up      1
> -4      1               host ceph-node02
> 1       1                       osd.1   up      1
> -5      1               host ceph-node03
> 2       1                       osd.2   up      1
>
> All osd weights are the same. I guess there is no automatic way to
> balance storage usage in my case, and I have to play with osd weights
> using 'ceph osd reweight-by-utilization xx' until storage is used more
> or less equally, and then set the weights back to 1?

How many pgs do you have?  ('ceph osd dump | grep ^pool')

You might also adjust the crush tunables; see

  http://ceph.com/docs/master/rados/operations/crush-map/?highlight=tunable#tunables

sage

>
> 2013/1/8 Gregory Farnum <greg@xxxxxxxxxxx>:
> > On Tue, Jan 8, 2013 at 2:42 AM, Roman Hlynovskiy
> > <roman.hlynovskiy@xxxxxxxxx> wrote:
> >> Hello,
> >>
> >> I am running ceph v0.56 and at the moment I am trying to recover a ceph
> >> cluster which got completely stuck after one osd filled up to 95%. It
> >> looks like the distribution algorithm is not perfect, since all 3 OSDs
> >> I use are 256 GB each; however, one of them filled up faster than the
> >> others:
> >>
> >> osd-1:
> >> Filesystem            Size  Used Avail Use% Mounted on
> >> /dev/mapper/vg00-osd  252G  173G   80G  69% /var/lib/ceph/osd/ceph-0
> >>
> >> osd-2:
> >> Filesystem            Size  Used Avail Use% Mounted on
> >> /dev/mapper/vg00-osd  252G  203G   50G  81% /var/lib/ceph/osd/ceph-1
> >>
> >> osd-3:
> >> Filesystem            Size  Used Avail Use% Mounted on
> >> /dev/mapper/vg00-osd  252G  240G   13G  96% /var/lib/ceph/osd/ceph-2
> >>
> >> At the moment the mds is showing the following behaviour:
> >>
> >> 2013-01-08 16:25:47.006354 b4a73b70  0 mds.0.objecter  FULL, paused modify 0x9ba63c0 tid 23448
> >> 2013-01-08 16:26:47.005211 b4a73b70  0 mds.0.objecter  FULL, paused modify 0xca86c30 tid 23449
> >>
> >> so it does not respond to any mount requests.
> >>
> >> I've played around with all kinds of commands like:
> >>
> >> ceph mon tell \* injectargs '--mon-osd-full-ratio 98'
> >> ceph mon tell \* injectargs '--mon-osd-full-ratio 0.98'
> >>
> >> and
> >>
> >> 'mon osd full ratio = 0.98' in the mon configuration for each mon.
> >>
> >> However:
> >>
> >> chef@ceph-node03:/var/log/ceph$ ceph health detail
> >> HEALTH_ERR 1 full osd(s)
> >> osd.2 is full at 95%
> >>
> >> The mds still believes 95% is the threshold, so no responses to mount requests.
> >>
> >> chef@ceph-node03:/var/log/ceph$ rados -p data bench 10 write
> >> Maintaining 16 concurrent writes of 4194304 bytes for at least 10 seconds.
> >> Object prefix: benchmark_data_ceph-node03_3903
> >> 2013-01-08 16:33:02.363206 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa467ff0 tid 1
> >> 2013-01-08 16:33:02.363618 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa468780 tid 2
> >> 2013-01-08 16:33:02.363741 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa468f88 tid 3
> >> 2013-01-08 16:33:02.364056 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa469348 tid 4
> >> 2013-01-08 16:33:02.364171 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa469708 tid 5
> >> 2013-01-08 16:33:02.365024 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa469ac8 tid 6
> >> 2013-01-08 16:33:02.365187 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46a2d0 tid 7
> >> 2013-01-08 16:33:02.365296 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46a690 tid 8
> >> 2013-01-08 16:33:02.365402 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46aa50 tid 9
> >> 2013-01-08 16:33:02.365508 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46ae10 tid 10
> >> 2013-01-08 16:33:02.365635 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46b1d0 tid 11
> >> 2013-01-08 16:33:02.365742 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46b590 tid 12
> >> 2013-01-08 16:33:02.365868 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46b950 tid 13
> >> 2013-01-08 16:33:02.365975 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46bd10 tid 14
> >> 2013-01-08 16:33:02.366096 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46c0d0 tid 15
> >> 2013-01-08 16:33:02.366203 b6be3710  0 client.9958.objecter  FULL, paused modify 0xa46c490 tid 16
> >>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> >>     0      16        16         0         0         0         -         0
> >>     1      16        16         0         0         0         -         0
> >>     2      16        16         0         0         0         -         0
> >>
> >> rados doesn't work.
> >>
> >> chef@ceph-node03:/var/log/ceph$ ceph osd reweight-by-utilization
> >> no change: average_util: 0.812678, overload_util: 0.975214. overloaded osds: (none)
> >>
> >> this one also.
> >>
> >> Is there any chance to recover ceph?
> >
> > "ceph pg set_full_ratio 0.98"
> >
> > However, as Mark mentioned, you want to figure out why one OSD is so
> > much fuller than the others first. Even in a small cluster I don't
> > think you should be able to see that kind of variance. Simply setting
> > the full ratio to 98% and then continuing to run could cause bigger
> > problems if that OSD continues to get a disproportionate share of the
> > writes and fills up its disk.
> > -Greg
>
> --
> ...WBR, Roman Hlynovskiy
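
Sage's question about pg counts matters because with only a handful of placement
groups per OSD, CRUSH has very little granularity to balance with, so one OSD
ending up noticeably fuller than the others is expected. A minimal sketch of
checking and raising the pg count, assuming a pool named 'data', the commonly
cited rule of thumb of roughly 100 PGs per OSD, and a release where pg_num can
still be increased on an existing pool (the value 256 is illustrative, not taken
from this thread):

  # show pg_num / pgp_num for every pool
  ceph osd dump | grep ^pool

  # raise the placement-group count for the 'data' pool (value is hypothetical)
  ceph osd pool set data pg_num 256
  # pgp_num must be raised as well, or the new PGs will not actually be rebalanced
  ceph osd pool set data pgp_num 256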
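
The tunables page Sage links to describes editing the CRUSH map offline; a sketch
of that workflow, under the assumption that all clients in use are new enough to
understand the newer tunables (the three values shown are the bobtail-era settings
described on that page, not values verified against this cluster):

  # export and decompile the current CRUSH map
  ceph osd getcrushmap -o crush.bin
  crushtool -d crush.bin -o crush.txt

  # add or adjust the tunables at the top of crush.txt, e.g.:
  #   tunable choose_local_tries 0
  #   tunable choose_local_fallback_tries 0
  #   tunable choose_total_tries 50

  # recompile the edited map and inject it back into the cluster
  crushtool -c crush.txt -o crush.new
  ceph osd setcrushmap -i crush.new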
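
Greg's 'ceph pg set_full_ratio 0.98' is what unblocks the paused objecter; combined
with a temporary manual reweight of the fullest OSD it gives enough headroom to
delete data and let the cluster rebalance. A sketch, where the 0.85 weight for
osd.2 and the closing 0.95 ratio are assumed values rather than recommendations
from the thread:

  # temporarily raise the cluster-wide full threshold so paused writes resume
  ceph pg set_full_ratio 0.98

  # push data off the fullest OSD by lowering its override weight (value is hypothetical)
  ceph osd reweight 2 0.85

  # once utilization has evened out, restore the defaults
  ceph osd reweight 2 1.0
  ceph pg set_full_ratio 0.95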