Eric,

Yeah, your OSD weights are a little crazy... For example, looking at one host from your output of "ceph osd tree":

-3      31.5            host tca23
1       3.63                    osd.1   up      1
7       0.26                    osd.7   up      1
13      2.72                    osd.13  up      1
19      2.72                    osd.19  up      1
25      0.26                    osd.25  up      1
31      3.63                    osd.31  up      1
37      2.72                    osd.37  up      1
43      0.26                    osd.43  up      1
49      3.63                    osd.49  up      1
55      0.26                    osd.55  up      1
61      3.63                    osd.61  up      1
67      0.26                    osd.67  up      1
73      3.63                    osd.73  up      1
79      0.26                    osd.79  up      1
85      3.63                    osd.85  up      1

osd.7 is set to 0.26, while most of the others are set to 2.72 or 3.63. Under normal circumstances, the rule of thumb is to set each weight equal to the disk size in TB: a 2 TB disk gets a weight of 2, a 1.5 TB disk gets 1.5, and so on. These weights control what proportion of the data is directed to each OSD.

I'm guessing you really do have very different disk sizes, though, since the disks that are reporting near full all have relatively small weights (osd.43 is at 91%, weight = 0.26). Is that really a 260 GB disk? A mix of HDDs and SSDs? Or maybe just a small partition? Either way, you probably have something wrong with the weights, and I'd look into that (see the command sketch at the end of this message). Having a single pool made of disks of such varied sizes may not be a good option, but I'm not sure whether that's your setup.

To the best of my knowledge, Ceph halts IO operations when any disk reaches the near-full scenario (85% by default). I'm not 100% certain on that one, but I believe it is true.

Hope that helps,

- Travis

On Tue, Oct 1, 2013 at 2:51 AM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> On Mon, Sep 30, 2013 at 11:50 PM, Eric Eastman <eric0e@xxxxxxx> wrote:
>>
>> Thank you for the reply
>>
>>> -28 == -ENOSPC (No space left on device). I think it is due to the
>>> fact that some osds are near full.
>>>
>>> Yan, Zheng
>>
>> I thought that might be the case, but I would expect ceph health to
>> tell me I had full OSDs, and it only says they are near full:
>>
>>>> # ceph health detail
>>>> HEALTH_WARN 9 near full osd(s)
>>>> osd.9 is near full at 85%
>>>> osd.29 is near full at 85%
>>>> osd.43 is near full at 91%
>>>> osd.45 is near full at 88%
>>>> osd.47 is near full at 88%
>>>> osd.55 is near full at 94%
>>>> osd.59 is near full at 94%
>>>> osd.67 is near full at 94%
>>>> osd.83 is near full at 94%
>>
> Are these OSDs' disks smaller than the other OSDs' disks? If they are,
> you need to lower these OSDs' weights.
>
> Regards
> Yan, Zheng
>
>> As I still have lots of space:
>>
>>>> # ceph df
>>>> GLOBAL:
>>>>     SIZE     AVAIL     RAW USED     %RAW USED
>>>>     249T     118T      131T         52.60
>>>>
>>>> POOLS:
>>>>     NAME         ID     USED       %USED     OBJECTS
>>>>     data         0      0          0         0
>>>>     metadata     1      0          0         0
>>>>     rbd          2      8          0         1
>>>>     rbd-pool     3      67187G     26.30     17713336
>>
>> And I set up lots of placement groups:
>>
>> # ceph osd dump | grep 'rep size' | grep rbd-pool
>> pool 3 'rbd-pool' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins
>> pg_num 4500 pgp_num 4500 last_change 360 owner 0
>>
>> Why did the OSDs fill up long before I ran out of space?
>>
>> Thanks,
>>
>> Eric
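
P.S. Here is a rough sketch of how the weight fix could look. The OSD number, weight value, and data path below are illustrative assumptions rather than values taken from your cluster, so adjust them to what you actually find.

Check what is really behind osd.7 (assuming the default OSD data path):

    # df -h /var/lib/ceph/osd/ceph-7

If it turns out to be a ~3.6 TB disk like its neighbors, raise its CRUSH weight to match the size in TB:

    # ceph osd crush reweight osd.7 3.63

Then watch the data rebalance and the near-full warnings clear:

    # ceph -w
    # ceph health detail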
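
P.P.S. On the near-full behavior: as far as I know, the cluster only warns at the "nearfull" ratio and blocks writes once an OSD crosses the "full" ratio. The monitor settings below are the stock defaults; bumping them in ceph.conf is only a stopgap while the weights get fixed, and you should double-check the option names against the docs for your release.

    [global]
        # HEALTH_WARN threshold (default)
        mon osd nearfull ratio = .85
        # writes are blocked once an OSD crosses this (default)
        mon osd full ratio = .95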