Re: subtrees have overcommitted (target_size_bytes / target_size_ratio)

Is there a way to get rid of these warnings with the autoscaler activated, besides adding new OSDs?

So far I haven't gotten a satisfactory answer to the question of why this happens at all.

$ ceph osd pool autoscale-status
 POOL               SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
 cephfs_data      122.2T                1.5        165.4T  1.1085        0.8500   1.0    1024              on 

versus

$ ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
    hdd       165 TiB      41 TiB     124 TiB      124 TiB         74.95 
 
POOLS:
    POOL                ID     STORED     OBJECTS     USED        %USED     MAX AVAIL 
    cephfs_data          1     75 TiB      49.31M     122 TiB     87.16        12 TiB 


It seems that the overcommitment is calculated incorrectly: USED in "ceph df" equals SIZE in autoscale-status, so the replication factor RATE already appears to be included in the SIZE. Isn't the RATE then applied a second time when the RATIO is calculated?
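
To make the arithmetic concrete, here is a small check of the columns above (my own reading of the output, not the autoscaler's actual code), showing that the reported RATIO matches SIZE * RATE / RAW CAPACITY, i.e. RATE is multiplied in on top of a SIZE that already equals USED from "ceph df":

```python
# Back-of-the-envelope check of the autoscale-status numbers above.
# Assumed formula (inferred from the output, not from the Ceph source):
#   RATIO = SIZE * RATE / RAW CAPACITY

size_tib = 122.2          # SIZE of cephfs_data (equals USED in "ceph df")
rate = 1.5                # RATE, the replication/erasure-coding overhead
raw_capacity_tib = 165.4  # RAW CAPACITY of the hdd device class

ratio = size_tib * rate / raw_capacity_tib
print(round(ratio, 4))    # ~1.1082, matching the reported RATIO of 1.1085
```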

Could someone please explain the numbers to me?


Thanks!
Lars

Fri, 25 Oct 2019 07:42:58 +0200
Lars Täuber <taeuber@xxxxxxx> ==> Nathan Fish <lordcirth@xxxxxxxxx> :
> Hi Nathan,
> 
> Thu, 24 Oct 2019 10:59:55 -0400
> Nathan Fish <lordcirth@xxxxxxxxx> ==> Lars Täuber <taeuber@xxxxxxx> :
> > Ah, I see! The BIAS is a multiplier on the number of placement groups
> > the autoscaler would otherwise choose. Since cephfs metadata pools are
> > usually very small, but have many objects and high IO, the autoscaler
> > gives them 4x the number of placement groups that it would normally
> > give for that amount of data.
> >   
> ah ok, I understand.
> 
> > So, your cephfs_data is set to a ratio of 0.9, and cephfs_metadata to
> > 0.3? Are the two pools using entirely different device classes, so
> > they are not sharing space?  
> 
> Yes, the metadata is on SSDs and the data on HDDs.
> 
> > Anyway, I see that your overcommit is only "1.031x". So if you set
> > cephfs_data to 0.85, it should go away.  
> 
> This is not the case. I set the target_ratio to 0.7 and get this:
> 
>  POOL               SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
>  cephfs_metadata  15736M                3.0         2454G  0.0188        0.3000   4.0     256              on        
>  cephfs_data      122.2T                1.5        165.4T  1.1085        0.7000   1.0    1024              on        
> 
> The RATIO seems to have nothing to do with the TARGET RATIO, only with the SIZE and the RAW CAPACITY.
> Because the pool is still receiving data, the SIZE increases and therefore the RATIO increases.
> The RATIO seems to be calculated by this formula:
> RATIO = SIZE * RATE / RAW CAPACITY
> 
> This is what I don't understand. The data in the cephfs_data pool seems to need more space than the raw capacity of the cluster provides. Hence the situation is called "overcommitment".
> 
> But why is this only the case when the autoscaler is active?
> 
> Thanks
> Lars
> 
> > 
> > On Thu, Oct 24, 2019 at 10:09 AM Lars Täuber <taeuber@xxxxxxx> wrote:  
> > >
> > > Thanks Nathan for your answer,
> > >
> > > but I set the Target Ratio to 0.9. It is the cephfs_data pool that causes the trouble.
> > >
> > > The 4.0 is the BIAS of the cephfs_metadata pool. This "BIAS" is not explained on the page linked below, so I don't know its meaning.
> > >
> > > How can a pool be overcommitted when it is the only pool on a set of OSDs?
> > >
> > > Best regards,
> > > Lars
> > >
> > > Thu, 24 Oct 2019 09:39:51 -0400
> > > Nathan Fish <lordcirth@xxxxxxxxx> ==> Lars Täuber <taeuber@xxxxxxx> :    
> > > > The formatting is mangled on my phone, but if I am reading it correctly,
> > > > you have set the Target Ratio to 4.0. This means you have told the
> > > > autoscaler that this pool will occupy 4x the space of your whole
> > > > cluster, and to optimize accordingly. This is naturally a problem.
> > > > Setting it to 0 will clear the setting and allow the autoscaler to work.
> > > >
> > > > On Thu., Oct. 24, 2019, 5:18 a.m. Lars Täuber, <taeuber@xxxxxxx> wrote:
> > > >    
> > > > > This question is answered here:
> > > > > https://ceph.io/rados/new-in-nautilus-pg-merging-and-autotuning/
> > > > >
> > > > > But it tells me that there is more data stored in the pool than the raw
> > > > > capacity provides (taking the replication factor RATE into account),
> > > > > hence the RATIO being above 1.0.
> > > > >
> > > > > How come this is the case? Is data stored outside of the pool?
> > > > > How come this is only the case when the autoscaler is active?
> > > > >
> > > > > Thanks
> > > > > Lars
> > > > >
> > > > >
> > > > > Thu, 24 Oct 2019 10:36:52 +0200
> > > > > Lars Täuber <taeuber@xxxxxxx> ==> ceph-users@xxxxxxx :    
> > > > > > My question requires too complex an answer.
> > > > > > So let me ask a simple question:
> > > > > >
> > > > > > What does the SIZE column of "osd pool autoscale-status" mean, and where does it come from?
> > > > > >
> > > > > > Thanks
> > > > > > Lars
> > > > > >
> > > > > > Wed, 23 Oct 2019 14:28:10 +0200
> > > > > > Lars Täuber <taeuber@xxxxxxx> ==> ceph-users@xxxxxxx :    
> > > > > > > Hello everybody!
> > > > > > >
> > > > > > > What does this mean?
> > > > > > >
> > > > > > >     health: HEALTH_WARN
> > > > > > >             1 subtrees have overcommitted pool target_size_bytes
> > > > > > >             1 subtrees have overcommitted pool target_size_ratio
> > > > > > >
> > > > > > > and what does it have to do with the autoscaler?
> > > > > > > When I deactivate the autoscaler the warning goes away.
> > > > > > >
> > > > > > >
> > > > > > > $ ceph osd pool autoscale-status
> > > > > > >  POOL               SIZE  TARGET SIZE  RATE  RAW CAPACITY   RATIO  TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE 
> > > > > > >  cephfs_metadata  15106M                3.0         2454G  0.0180        0.3000   4.0     256              on 
> > > > > > >  cephfs_data      113.6T                1.5        165.4T  1.0306        0.9000   1.0     512              on 
> > > > > > >
> > > > > > >
> > > > > > > $ ceph health detail
> > > > > > > HEALTH_WARN 1 subtrees have overcommitted pool target_size_bytes; 1 subtrees have overcommitted pool target_size_ratio
> > > > > > > POOL_TARGET_SIZE_BYTES_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_bytes
> > > > > > >     Pools ['cephfs_data'] overcommit available storage by 1.031x due to target_size_bytes 0 on pools []
> > > > > > > POOL_TARGET_SIZE_RATIO_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_ratio
> > > > > > >     Pools ['cephfs_data'] overcommit available storage by 1.031x due to target_size_ratio 0.900 on pools ['cephfs_data']
> > > > > > >
> > > > > > >
> > > > > > > Thanks
> > > > > > > Lars
> > > > > > > _______________________________________________
> > > > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx    
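
As a footnote to the thread: the 1.031x factor in the health warning quoted above equals the RATIO column of cephfs_data at that time (1.0306). That suggests (my inference from the numbers, not documented behaviour) that the autoscaler takes each pool's effective capacity share as max(actual RATIO, TARGET RATIO) and flags a subtree as overcommitted when those shares sum above 1.0:

```python
# Sketch of the inferred overcommit check (an assumption based on the
# numbers in this thread, not a quote of the Ceph implementation):
# per CRUSH subtree, sum each pool's effective capacity share, taken as
# max(actual RATIO, TARGET RATIO); a sum above 1.0 raises the warning.

def overcommit_factor(pools):
    """pools: list of (actual_ratio, target_ratio) tuples for one subtree."""
    return sum(max(actual, target) for actual, target in pools)

# cephfs_data was the only pool on the hdd subtree at the time:
factor = overcommit_factor([(1.0306, 0.9000)])
print(round(factor, 3))  # 1.031, matching "overcommit available storage by 1.031x"
```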



