Re: Quincy: mClock config propagation does not work properly

Hi Luis,

Thanks for testing the Quincy rc and trying out the mClock settings!
Sridhar is looking into this issue and will provide his feedback as
soon as possible.

Thanks,
Neha

On Thu, Mar 3, 2022 at 5:05 AM Luis Domingues <luis.domingues@xxxxxxxxx> wrote:
>
> Hi all,
>
> As we are doing some tests on our lab cluster, running Quincy 17.1.0, we observed some strange behavior regarding the propagation of the mClock parameters to the OSDs. Basically, when the profile is set to a pre-configured one and we then change it to custom, changes to the individual mClock parameters are not propagated.
>
> For more details, here is how we reproduce the issue on our lab:
>
> ********************** Step 1
>
> We start the OSDs with this configuration set, as shown by ceph config dump:
>
> ```
>
> osd advanced osd_mclock_profile custom
> osd advanced osd_mclock_scheduler_background_recovery_lim 512
> osd advanced osd_mclock_scheduler_background_recovery_res 128
> osd advanced osd_mclock_scheduler_background_recovery_wgt 3
> osd advanced osd_mclock_scheduler_client_lim 80
> osd advanced osd_mclock_scheduler_client_res 30
> osd advanced osd_mclock_scheduler_client_wgt 1
> osd advanced osd_op_queue mclock_scheduler *
> ```
>
> And we can observe that this is what the OSD is running, using ceph daemon osd.X config show:
>
> ```
> "osd_mclock_profile": "custom",
> "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> "osd_mclock_scheduler_background_best_effort_lim": "999999",
> "osd_mclock_scheduler_background_best_effort_res": "1",
> "osd_mclock_scheduler_background_best_effort_wgt": "1",
> "osd_mclock_scheduler_background_recovery_lim": "512",
> "osd_mclock_scheduler_background_recovery_res": "128",
> "osd_mclock_scheduler_background_recovery_wgt": "3",
> "osd_mclock_scheduler_client_lim": "80",
> "osd_mclock_scheduler_client_res": "30",
> "osd_mclock_scheduler_client_wgt": "1",
> "osd_mclock_skip_benchmark": "false",
> "osd_op_queue": "mclock_scheduler",
> ```
>
> At this point, if we change something, the change is visible on the OSD. Let's say we change the background recovery reservation to 100:
>
> `ceph config set osd osd_mclock_scheduler_background_recovery_res 100`
>
> The change has been set properly on the OSDs:
>
> ```
> "osd_mclock_profile": "custom",
> "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> "osd_mclock_scheduler_background_best_effort_lim": "999999",
> "osd_mclock_scheduler_background_best_effort_res": "1",
> "osd_mclock_scheduler_background_best_effort_wgt": "1",
> "osd_mclock_scheduler_background_recovery_lim": "512",
> "osd_mclock_scheduler_background_recovery_res": "100",
> "osd_mclock_scheduler_background_recovery_wgt": "3",
> "osd_mclock_scheduler_client_lim": "80",
> "osd_mclock_scheduler_client_res": "30",
> "osd_mclock_scheduler_client_wgt": "1",
> "osd_mclock_skip_benchmark": "false",
> "osd_op_queue": "mclock_scheduler",
> ```
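>
> As a side note, a single option can also be checked on a running OSD through the admin socket, without dumping the whole config (osd.X being a concrete OSD id, e.g. osd.0):
>
> ```
> ceph daemon osd.X config get osd_mclock_scheduler_background_recovery_res
> ```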
>
> ********************** Step 2
>
> We change the profile to high_recovery_ops and remove the old configuration:
>
> ```
> ceph config set osd osd_mclock_profile high_recovery_ops
> ceph config rm osd osd_mclock_scheduler_background_recovery_lim
> ceph config rm osd osd_mclock_scheduler_background_recovery_res
> ceph config rm osd osd_mclock_scheduler_background_recovery_wgt
> ceph config rm osd osd_mclock_scheduler_client_lim
> ceph config rm osd osd_mclock_scheduler_client_res
> ceph config rm osd osd_mclock_scheduler_client_wgt
> ```
>
> The config contains this now:
>
> ```
> osd advanced osd_mclock_profile high_recovery_ops
> osd advanced osd_op_queue mclock_scheduler *
> ```
>
> And we can see that the configuration was propagated to the OSDs:
>
> ```
> "osd_mclock_profile": "high_recovery_ops",
> "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> "osd_mclock_scheduler_background_best_effort_lim": "999999",
> "osd_mclock_scheduler_background_best_effort_res": "1",
> "osd_mclock_scheduler_background_best_effort_wgt": "2",
> "osd_mclock_scheduler_background_recovery_lim": "343",
> "osd_mclock_scheduler_background_recovery_res": "103",
> "osd_mclock_scheduler_background_recovery_wgt": "2",
> "osd_mclock_scheduler_client_lim": "137",
> "osd_mclock_scheduler_client_res": "51",
> "osd_mclock_scheduler_client_wgt": "1",
> "osd_mclock_skip_benchmark": "false",
> "osd_op_queue": "mclock_scheduler",
>
> ```
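>
> If we understand the mClock documentation correctly, the values above are not stored in the config database; the pre-configured profile derives them from the OSD's measured IOPS capacity, which can be inspected with something like (osd.X being a concrete OSD id, _ssd instead of _hdd for flash OSDs):
>
> ```
> ceph daemon osd.X config get osd_mclock_max_capacity_iops_hdd
> ```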
>
> ********************** Step 3
>
> The issue comes now, when we try to go back to the custom profile:
>
> ```
> ceph config set osd osd_mclock_profile custom
> ceph config set osd osd_mclock_scheduler_background_recovery_lim 512
> ceph config set osd osd_mclock_scheduler_background_recovery_res 128
> ceph config set osd osd_mclock_scheduler_background_recovery_wgt 3
> ceph config set osd osd_mclock_scheduler_client_lim 80
> ceph config set osd osd_mclock_scheduler_client_res 30
> ceph config set osd osd_mclock_scheduler_client_wgt 1
>
> ```
>
> The ceph configuration looks good:
>
> ```
> osd advanced osd_mclock_profile custom
> osd advanced osd_mclock_scheduler_background_recovery_lim 512
> osd advanced osd_mclock_scheduler_background_recovery_res 128
> osd advanced osd_mclock_scheduler_background_recovery_wgt 3
> osd advanced osd_mclock_scheduler_client_lim 80
> osd advanced osd_mclock_scheduler_client_res 30
> osd advanced osd_mclock_scheduler_client_wgt 1
> osd advanced osd_op_queue mclock_scheduler *
> ```
>
> But the lim, res and wgt values on the OSDs still reflect the old high_recovery_ops config, even though the profile is now custom:
>
> ```
> "osd_mclock_profile": "custom",
> "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> "osd_mclock_scheduler_background_best_effort_lim": "999999",
> "osd_mclock_scheduler_background_best_effort_res": "1",
> "osd_mclock_scheduler_background_best_effort_wgt": "2",
> "osd_mclock_scheduler_background_recovery_lim": "343",
> "osd_mclock_scheduler_background_recovery_res": "103",
> "osd_mclock_scheduler_background_recovery_wgt": "2",
> "osd_mclock_scheduler_client_lim": "137",
> "osd_mclock_scheduler_client_res": "51",
> "osd_mclock_scheduler_client_wgt": "1",
> "osd_mclock_skip_benchmark": "false",
> "osd_op_queue": "mclock_scheduler",
> ```
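>
> To make the mismatch explicit, we can compare what the monitors store with what the OSD actually runs (osd.X being a concrete OSD id; the values are the ones from our cluster):
>
> ```
> ceph config get osd osd_mclock_scheduler_background_recovery_res          # monitor side: 128
> ceph daemon osd.X config get osd_mclock_scheduler_background_recovery_res # on the OSD: still 103
> ```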
>
> At this point, no matter what we try to change, the only changes that have an effect on the mClock parameters are switches to a pre-configured profile, such as balanced, high_recovery_ops or high_client_ops. Setting custom again leaves the OSDs on their last pre-configured profile.
>
> The only way I found to be able to change these parameters is to restart the OSDs.
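>
> For illustration, depending on how the OSDs are deployed, that restart can be something like (osd.X / X being a concrete OSD id):
>
> ```
> # cephadm / orchestrator managed clusters
> ceph orch daemon restart osd.X
> # package-based deployments
> systemctl restart ceph-osd@X
> ```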
>
> This looks like a bug to me, but please tell me if this is expected.
>
> Regards,
> Luis Domingues
> Proton AG
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


