Re: Cinder + CEPH Storage Full Scenario

Sorry about that, I guess newer releases than my Dumpling calculate it differently, then.
I can take a look tomorrow at the exact numbers I get, but I'm pretty sure it's just a plain sum on Dumpling.

Jan

> On 19 Oct 2015, at 20:40, John Spray <jspray@xxxxxxxxxx> wrote:
> 
> On Mon, Oct 19, 2015 at 7:28 PM, Jan Schermer <jan@xxxxxxxxxxx> wrote:
>> Cinder checking free space will not help.
>> You will get one full OSD long before you run "out of space" from Ceph's
>> perspective, and it gets worse with the number of OSDs you have. Using 99%
>> of the space in Ceph is not the same as having all the OSDs 99% full,
>> because the data is not distributed evenly across them. Not sure how much
>> that can be helped, but my cluster can store at most 2TB of data while
>> claiming to have 14TB free.
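>>
>> As a toy illustration (made-up numbers, and it optimistically assumes
>> new writes keep spreading evenly), the fullest OSD caps how much you
>> can still write, not the summed free space:
>>
>> osd_size_tb = 1.0
>> osd_used_frac = [0.90, 0.60, 0.55, 0.50]  # one OSD much fuller than the rest
>> full_ratio = 0.95                         # writes stop once any OSD crosses this
>>
>> raw_free_tb = sum((1 - u) * osd_size_tb for u in osd_used_frac)
>> # with an even spread the fullest OSD hits full_ratio after absorbing
>> # only its own headroom, so the writable total is roughly that
>> # headroom times the number of OSDs -- an optimistic upper bound
>> headroom_tb = (full_ratio - max(osd_used_frac)) * osd_size_tb
>> writable_tb = headroom_tb * len(osd_used_frac)
>> print("raw free: %.2f TB, writable: <= %.2f TB" % (raw_free_tb, writable_tb))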
> 
> The "max_avail" per-pool number that you get out of "ceph df" is aware
> of this, and will calculate that actual writeable capacity based on
> whatever OSD has the least available space.  From a quick look at the
> code, it seems that the RBD Cinder plugin reports free_capacity_gb
> from max_avail, so unless you're seeing a different behaviour I don't
> think we have a problem.
> 
> This is me looking at master ceph and master cinder, so no idea which
> released versions got this behaviour (the cinder code was modified in
> March this year).
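> 
> For illustration, here is a minimal sketch (not the actual Cinder RBD
> driver code; the field names match current "ceph df --format json"
> output and may differ between releases) of deriving free capacity in
> GiB from the per-pool max_avail:
> 
> import json
> import subprocess
> 
> def pool_capacity_gb(pool_name):
>     # per-pool stats from "ceph df"; max_avail already accounts for
>     # the most-full OSD backing the pool's placement groups
>     out = subprocess.check_output(["ceph", "df", "--format", "json"])
>     for pool in json.loads(out)["pools"]:
>         if pool["name"] == pool_name:
>             free_gb = pool["stats"]["max_avail"] / float(1024 ** 3)
>             used_gb = pool["stats"]["bytes_used"] / float(1024 ** 3)
>             return free_gb, used_gb
>     raise ValueError("pool %r not found" % pool_name)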
> 
> John
> 
>> 
>> You *really* need to monitor each OSD's free space and treat that metric as
>> absolutely critical.
>> 
>> Jan
>> 
>> 
>> On 19 Oct 2015, at 20:00, Andrew Woodward <xarses@xxxxxxxxx> wrote:
>> 
>> Cinder will periodically inspect the free space of the volume services and
>> use this data when deciding which one to schedule to when a request is
>> received. In this case the cinder volume create request may error out
>> during scheduling. You may also see an error when instantiating a volume
>> from an image if it passes that check but then runs out of space while the
>> image is being written to the volume.
>>
>> I'm not sure if it's still the case, but in Havana (and I see no reason for
>> it to have changed) the free space check in cinder didn't account for
>> promised space (the sum of the sizes of the volumes already assigned);
>> instead it would literally look at the free space in the output of
>> `rados df`.
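>>
>> To illustrate the gap (sketch only; the numbers and names are made up):
>> with thin-provisioned RBD volumes the promised capacity can already
>> exceed what the pool can actually hold while a naive free-space check
>> still passes:
>>
>> def overcommit_ratio(volume_sizes_gb, reported_free_gb, reported_used_gb):
>>     # promised space vs. what the backend can actually hold right now
>>     provisioned = sum(volume_sizes_gb)
>>     total = reported_free_gb + reported_used_gb
>>     return provisioned / float(total) if total else float("inf")
>>
>> # 40 x 500 GB volumes promised against ~14 TB of usable space passes a
>> # naive free-space check, yet is ~1.4x overcommitted
>> print(overcommit_ratio([500] * 40, 13000, 1000))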
>> 
>> As noted above, if the cluster gets to "100%" used, bad things will happen
>> to your VMs. The most likely outcome is that they all end up with read-only
>> filesystems. (100% is a misnomer: there is a configured maximum percentage
>> at which the cluster stops accepting data writes, to ensure that important
>> object replication / maintenance can still occur and the cluster does not
>> fall over.)
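>>
>> The relevant thresholds live in ceph.conf (the values shown are the
>> usual defaults; check your release and tune to taste):
>>
>> [global]
>> # writes are refused once any OSD crosses the full ratio; nearfull
>> # triggers HEALTH_WARN -- your cue to add capacity or rebalance
>> mon osd full ratio = .95
>> mon osd nearfull ratio = .85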
>> 
>> On Mon, Oct 19, 2015 at 7:51 AM LOPEZ Jean-Charles <jelopez@xxxxxxxxxx>
>> wrote:
>>> 
>>> Hi,
>>> 
>>> when an OSD gets full, any write operation to the entire cluster will be
>>> disabled.
>>>
>>> As a result, creating a single RBD will become impossible, and all VMs that
>>> need to write to one of their Ceph-backed RBDs will suffer the same pain.
>>>
>>> Usually, this ends up as a bad story for the VMs.
>>>
>>> The best practice is to monitor the disk space usage of the OSDs; as a
>>> matter of fact, RHCS 1.# includes a ceph osd df command to do this. You can
>>> also use the output of the ceph report command to grab the appropriate info
>>> and compute it yourself, or rely on external SNMP monitoring tools to grab
>>> the usage details of the individual OSD disk drives.
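>>>
>>> A minimal monitoring sketch (it assumes your release has ceph osd df
>>> and that the JSON layout matches the fields used here; the 80%
>>> threshold is just an example):
>>>
>>> import json
>>> import subprocess
>>>
>>> WARN_PCT = 80.0
>>>
>>> out = subprocess.check_output(["ceph", "osd", "df", "--format", "json"])
>>> for node in json.loads(out)["nodes"]:
>>>     used_pct = node.get("utilization", 0.0)
>>>     if used_pct >= WARN_PCT:
>>>         print("WARNING: %s is %.1f%% full" % (node["name"], used_pct))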
>>> 
>>> Have a great day.
>>> JC
>>> 
>>>> On Oct 19, 2015, at 02:32, Bharath Krishna <BKrishna@xxxxxxxxxxxxxxx>
>>>> wrote:
>>>> 
>>>> I mean the cluster OSDs are physically full.
>>>>
>>>> I understand it's not a pretty way to operate Ceph, letting it become
>>>> full, but I just wanted to know the boundary condition if it does
>>>> become full.
>>>>
>>>> Will the Cinder create-volume operation create a new volume at all, or
>>>> is an error thrown at the Cinder API level itself stating that no
>>>> space is available?
>>>>
>>>> When IO stalls, will I be able to read data from the Ceph cluster,
>>>> i.e. can I still read data from existing volumes created on the Ceph
>>>> cluster?
>>>> 
>>>> Thanks for the quick reply.
>>>> 
>>>> Regards
>>>> M Bharath Krishna
>>>> 
>>>> On 10/19/15, 2:51 PM, "Jan Schermer" <jan@xxxxxxxxxxx> wrote:
>>>> 
>>>>> Do you mean when the Ceph cluster (OSDs) is physically full, or when a
>>>>> quota is reached?
>>>>>
>>>>> If Ceph becomes full it just stalls all IO (maybe just write IO, but
>>>>> effectively the same thing) - not pretty, and you must never, ever let
>>>>> it become full.
>>>>> 
>>>>> Jan
>>>>> 
>>>>> 
>>>>>> On 19 Oct 2015, at 11:15, Bharath Krishna <BKrishna@xxxxxxxxxxxxxxx>
>>>>>> wrote:
>>>>>> 
>>>>>> Hi
>>>>>> 
>>>>>> What happens when the Ceph storage cluster backing the Cinder service
>>>>>> becomes FULL?
>>>>>>
>>>>>> What would be the outcome of a new Cinder create-volume request?
>>>>>>
>>>>>> Will the volume be created even though no space is available for use,
>>>>>> or will an error be thrown from the Cinder API stating that no space
>>>>>> is available for the new volume?
>>>>>>
>>>>>> I could not try this in my environment by filling up the cluster.
>>>>>>
>>>>>> Please reply if you have ever tried and tested this.
>>>>>> 
>>>>>> Thank you.
>>>>>> 
>>>>>> Regards,
>>>>>> M Bharath Krishna
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>> 
>>>> 
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>> 
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> --
>> 
>> --
>> 
>> Andrew Woodward
>> 
>> Mirantis
>> 
>> Fuel Community Ambassador
>> 
>> Ceph Community
>> 
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> 
>> 
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


