Hi Caspar,
Thank you for the response; the problem is solved now. After some searching, it turned out that since Luminous, setting
mon_osd_backfillfull_ratio and
mon_osd_nearfull_ratio no longer takes effect. These values are now read from the OSD map, and the commands "ceph osd set-nearfull-ratio" and "ceph osd set-backfillfull-ratio" are used to change them.
This was verified by running "ceph osd dump | head": all ratios were still at 0.92, 0.95, etc. After setting them to 0.85 the flags started to work normally and we were able to control our cluster much better.
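For anyone checking the same thing, the ratios can also be pulled out of the dump header programmatically. A quick sketch (the sample text and values below are illustrative, not output from our cluster):

```python
import re

def parse_ratios(dump_text):
    """Extract the full/backfillfull/nearfull ratios from the header
    lines that `ceph osd dump` prints."""
    ratios = {}
    for name in ("full_ratio", "backfillfull_ratio", "nearfull_ratio"):
        m = re.search(rf"^{name} (\S+)", dump_text, re.MULTILINE)
        if m:
            ratios[name] = float(m.group(1))
    return ratios

# Illustrative dump header, not real cluster output
sample = """epoch 1234
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
"""
print(parse_ratios(sample))
```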
Moreover, setting the backfillfull ratio lower than the nearfull ratio triggers a HEALTH_ERR with out-of-order flags. Therefore, we set them to the same value for now and started reweighting to rebalance the cluster.
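The ordering rule that bit us can be stated simply; a sketch in Python (the function name is ours, not from the Ceph source):

```python
def ratios_out_of_order(nearfull, backfillfull, full):
    """Ceph expects nearfull <= backfillfull <= full; anything else
    raises the out-of-order health error we saw."""
    return not (nearfull <= backfillfull <= full)

print(ratios_out_of_order(0.85, 0.80, 0.95))  # True: backfillfull below nearfull
print(ratios_out_of_order(0.85, 0.85, 0.95))  # False: equal values were accepted for us
```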
The backfillfull flags actually prevented data movement to those OSDs, and data was moved to other OSDs with more free space. Nevertheless, some PGs got stuck and were flagged backfill_toofull; in the end we reweighted those and everything returned to normal. Finally, we set the backfillfull ratio back higher than the nearfull ratio. END OF STORY.
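For reference, the reweight value Caspar mentioned is a 0..1 fraction of the OSD's capacity that CRUSH will target. A rough illustration (the capacity figure is made up):

```python
def effective_capacity(raw_capacity_gb, reweight):
    """Capacity CRUSH effectively targets after `ceph osd reweight osd.X <w>`,
    where <w> is a fraction between 0 and 1 (1 = use 100% of the OSD)."""
    if not 0.0 <= reweight <= 1.0:
        raise ValueError("reweight must be between 0 and 1")
    return raw_capacity_gb * reweight

print(effective_capacity(4000, 0.95))  # roughly 3800 GB targeted
```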
Thanks
On Wed, Apr 18, 2018 at 11:20 AM, Caspar Smit <casparsmit@xxxxxxxxxxx> wrote:
Hi Monis,

The settings you mention do not prevent data movement to overloaded OSDs; they are thresholds at which Ceph warns that an OSD is nearfull or backfillfull.

No expert on this, but setting backfillfull lower than nearfull is not recommended; the nearfull state should be reached first, before backfillfull. You can reweight the overloaded OSDs manually by issuing: ceph osd reweight osd.X 0.95 (the last value should be between 0 and 1, where 1 is the default and can be seen as 100%; setting this to 0.95 means only use 95% of the OSD. To move more PGs off this OSD you can set the value lower, to 0.9 or 0.85.)

Kind regards,
Caspar

2018-04-18 9:07 GMT+02:00 Monis Monther <mmmm82@xxxxxxxxx>:

Hi,

We are running a cluster with ceph luminous 12.2.0. Some of the OSDs are getting full and we are running ceph osd reweight-by-utilization to re-balance the OSDs. We have also set

mon_osd_backfillfull_ratio 0.8 (this is to prevent moving data to an overloaded OSD when re-weighting)
mon_osd_nearfull_ratio 0.85

However, reweight is worsening the problem by moving data from an 85% full OSD to an 84.7% full OSD instead of moving it to a half-empty OSD. This is causing the latter to increase up to 85.6%. Some OSDs have now reached 87% and 86%.

Moreover, the cluster does not show any OSD as nearfull although some OSDs have passed 86%, and it is totally ignoring the backfillfull setting by moving data to OSDs that are above 80%.

Are the settings above wrong? What can we do to prevent moving data to overloaded OSDs?

--
Best Regards
Monis
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Best Regards
Monis