Hi Luis,

I was able to reproduce this issue locally and it does look like a bug. I have raised a tracker issue to track the fix: https://tracker.ceph.com/issues/55153.

The issue here is that with the 'custom' profile enabled, a change to the config parameters is written to the configuration db, as you noted, but the values do not come into effect on the OSD(s). I will look into this further and come back with a fix.
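As a quick way to confirm the mismatch on a live cluster, the value recorded in the configuration database can be compared with the value the running daemon reports over its admin socket. This is only a minimal sketch; osd.0 and the option shown are just examples:

```
# Value stored in the cluster configuration database
ceph config get osd.0 osd_mclock_scheduler_background_recovery_res

# Value the running OSD daemon is actually using (queried via its admin socket)
ceph daemon osd.0 config get osd_mclock_scheduler_background_recovery_res
```

While the bug is present, the two outputs can disagree after switching back to the custom profile.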
Thank you for trying out mClock and for your feedback.

-Sridhar

On Wed, Mar 30, 2022 at 8:40 PM Sridhar Seshasayee <sseshasa@xxxxxxxxxx> wrote:

Hi Luis,

As Neha mentioned, I am trying out your steps and investigating this further. I will get back to you in the next day or two. Thanks for your patience.

-Sridhar

On Thu, Mar 17, 2022 at 11:51 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:

Hi Luis,

Thanks for testing the Quincy rc and trying out the mClock settings! Sridhar is looking into this issue and will provide his feedback as soon as possible.

Thanks,
Neha

On Thu, Mar 3, 2022 at 5:05 AM Luis Domingues <luis.domingues@xxxxxxxxx> wrote:

Hi all,

As we are doing some tests on our lab cluster, running Quincy 17.1.0, we observed some strange behavior regarding the propagation of the mClock parameters to the OSDs. Basically, when the profile is set to one of the pre-configured ones and we then change it to custom, changes to the individual mClock parameters are no longer propagated.

For more details, here is how we reproduce the issue in our lab:

********************** Step 1

We start the OSDs with this configuration set, using ceph config dump:

```
osd advanced osd_mclock_profile custom
osd advanced osd_mclock_scheduler_background_recovery_lim 512
osd advanced osd_mclock_scheduler_background_recovery_res 128
osd advanced osd_mclock_scheduler_background_recovery_wgt 3
osd advanced osd_mclock_scheduler_client_lim 80
osd advanced osd_mclock_scheduler_client_res 30
osd advanced osd_mclock_scheduler_client_wgt 1
osd advanced osd_op_queue mclock_scheduler *
```

And we can observe that this is what the OSD is running, using ceph daemon osd.X config show:

```
"osd_mclock_profile": "custom",
"osd_mclock_scheduler_anticipation_timeout": "0.000000",
"osd_mclock_scheduler_background_best_effort_lim": "999999",
"osd_mclock_scheduler_background_best_effort_res": "1",
"osd_mclock_scheduler_background_best_effort_wgt": "1",
"osd_mclock_scheduler_background_recovery_lim": "512",
"osd_mclock_scheduler_background_recovery_res": "128",
"osd_mclock_scheduler_background_recovery_wgt": "3",
"osd_mclock_scheduler_client_lim": "80",
"osd_mclock_scheduler_client_res": "30",
"osd_mclock_scheduler_client_wgt": "1",
"osd_mclock_skip_benchmark": "false",
"osd_op_queue": "mclock_scheduler",
```

At this point, if we change something, the change can be seen on the OSD. Let's say we change the background recovery reservation to 100:

`ceph config set osd osd_mclock_scheduler_background_recovery_res 100`

The change has been set properly on the OSDs:

```
"osd_mclock_profile": "custom",
"osd_mclock_scheduler_anticipation_timeout": "0.000000",
"osd_mclock_scheduler_background_best_effort_lim": "999999",
"osd_mclock_scheduler_background_best_effort_res": "1",
"osd_mclock_scheduler_background_best_effort_wgt": "1",
"osd_mclock_scheduler_background_recovery_lim": "512",
"osd_mclock_scheduler_background_recovery_res": "100",
"osd_mclock_scheduler_background_recovery_wgt": "3",
"osd_mclock_scheduler_client_lim": "80",
"osd_mclock_scheduler_client_res": "30",
"osd_mclock_scheduler_client_wgt": "1",
"osd_mclock_skip_benchmark": "false",
"osd_op_queue": "mclock_scheduler",
```

********************** Step 2

We change the profile to high_recovery_ops and remove the old configuration:

```
ceph config set osd osd_mclock_profile high_recovery_ops
ceph config rm osd osd_mclock_scheduler_background_recovery_lim
ceph config rm osd osd_mclock_scheduler_background_recovery_res
ceph config rm osd osd_mclock_scheduler_background_recovery_wgt
ceph config rm osd osd_mclock_scheduler_client_lim
ceph config rm osd osd_mclock_scheduler_client_res
ceph config rm osd osd_mclock_scheduler_client_wgt
```

The config now contains this:

```
osd advanced osd_mclock_profile high_recovery_ops
osd advanced osd_op_queue mclock_scheduler *
```

And we can see that the configuration was propagated to the OSDs:

```
"osd_mclock_profile": "high_recovery_ops",
"osd_mclock_scheduler_anticipation_timeout": "0.000000",
"osd_mclock_scheduler_background_best_effort_lim": "999999",
"osd_mclock_scheduler_background_best_effort_res": "1",
"osd_mclock_scheduler_background_best_effort_wgt": "2",
"osd_mclock_scheduler_background_recovery_lim": "343",
"osd_mclock_scheduler_background_recovery_res": "103",
"osd_mclock_scheduler_background_recovery_wgt": "2",
"osd_mclock_scheduler_client_lim": "137",
"osd_mclock_scheduler_client_res": "51",
"osd_mclock_scheduler_client_wgt": "1",
"osd_mclock_skip_benchmark": "false",
"osd_op_queue": "mclock_scheduler",
```

********************** Step 3

The issue comes now, when we try to go back to the custom profile:

```
ceph config set osd osd_mclock_profile custom
ceph config set osd osd_mclock_scheduler_background_recovery_lim 512
ceph config set osd osd_mclock_scheduler_background_recovery_res 128
ceph config set osd osd_mclock_scheduler_background_recovery_wgt 3
ceph config set osd osd_mclock_scheduler_client_lim 80
ceph config set osd osd_mclock_scheduler_client_res 30
ceph config set osd osd_mclock_scheduler_client_wgt 1
```

The Ceph configuration looks good:

```
osd advanced osd_mclock_profile custom
osd advanced osd_mclock_scheduler_background_recovery_lim 512
osd advanced osd_mclock_scheduler_background_recovery_res 128
osd advanced osd_mclock_scheduler_background_recovery_wgt 3
osd advanced osd_mclock_scheduler_client_lim 80
osd advanced osd_mclock_scheduler_client_res 30
osd advanced osd_mclock_scheduler_client_wgt 1
osd advanced osd_op_queue mclock_scheduler *
```

But the lim, res and wgt values on the OSDs still reflect the old high_recovery_ops configuration, even though the profile is custom:

```
"osd_mclock_profile": "custom",
"osd_mclock_scheduler_anticipation_timeout": "0.000000",
"osd_mclock_scheduler_background_best_effort_lim": "999999",
"osd_mclock_scheduler_background_best_effort_res": "1",
"osd_mclock_scheduler_background_best_effort_wgt": "2",
"osd_mclock_scheduler_background_recovery_lim": "343",
"osd_mclock_scheduler_background_recovery_res": "103",
"osd_mclock_scheduler_background_recovery_wgt": "2",
"osd_mclock_scheduler_client_lim": "137",
"osd_mclock_scheduler_client_res": "51",
"osd_mclock_scheduler_client_wgt": "1",
"osd_mclock_skip_benchmark": "false",
"osd_op_queue": "mclock_scheduler",
```

At this point, no matter what we try to change, the only changes that have an effect on the mClock parameters are switches to a pre-configured profile, such as balanced, high_recovery_ops or high_client_ops. Setting custom again leaves the OSDs on their last pre-configured profile.

The only way I found to be able to change these parameters is to restart the OSDs.
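For reference, this is roughly what that restart looks like; only a sketch, since the exact command depends on how the OSDs are deployed, and osd.0 is just an example:

```
# Package-based deployment, with one systemd unit per OSD id:
sudo systemctl restart ceph-osd@0

# Or, on a cephadm-managed cluster:
ceph orch daemon restart osd.0

# Afterwards, check that the custom value is now active on the daemon:
ceph daemon osd.0 config get osd_mclock_scheduler_background_recovery_res
```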
This looks to me like a bug, but tell me if this is expected.

Regards,
Luis Domingues
Proton AG

--
Sridhar Seshasayee
Principal Software Engineer
Red Hat <https://www.redhat.com>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx