Re: Quincy: mClock config propagation does not work properly

Hi Luis,

As Neha mentioned, I am trying out your steps and investigating this
further.
I will get back to you in the next day or two. Thanks for your patience.

-Sridhar

On Thu, Mar 17, 2022 at 11:51 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:

> Hi Luis,
>
> Thanks for testing the Quincy rc and trying out the mClock settings!
> Sridhar is looking into this issue and will provide his feedback as
> soon as possible.
>
> Thanks,
> Neha
>
> On Thu, Mar 3, 2022 at 5:05 AM Luis Domingues <luis.domingues@xxxxxxxxx>
> wrote:
> >
> > Hi all,
> >
> > As we are doing some tests on our lab cluster, running Quincy 17.1.0, we
> > observed some strange behavior regarding the propagation of the mClock
> > parameters to the OSDs. Basically, once the profile has been set to one of
> > the pre-configured profiles and we then switch back to custom, changes to
> > the individual mClock parameters are not propagated.
> >
> > For more details, here is how we reproduce the issue on our lab:
> >
> > ********************** Step 1
> >
> > We start the OSDs with the following configuration set, as shown by ceph config dump:
> >
> > ```
> >
> > osd advanced osd_mclock_profile custom
> > osd advanced osd_mclock_scheduler_background_recovery_lim 512
> > osd advanced osd_mclock_scheduler_background_recovery_res 128
> > osd advanced osd_mclock_scheduler_background_recovery_wgt 3
> > osd advanced osd_mclock_scheduler_client_lim 80
> > osd advanced osd_mclock_scheduler_client_res 30
> > osd advanced osd_mclock_scheduler_client_wgt 1
> > osd advanced osd_op_queue mclock_scheduler *
> > ```
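> >
> > (These are just the mclock-related entries; to keep the dump readable we
> > filter it roughly like this, where grep is only a plain text filter:)
> >
> > ```
> > # show only the mclock / op-queue related entries of the central config
> > ceph config dump | grep -E 'mclock|osd_op_queue'
> > ```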
> >
> > And we can observe that this is what the OSD is running, using ceph
> daemon osd.X config show:
> >
> > ```
> > "osd_mclock_profile": "custom",
> > "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> > "osd_mclock_scheduler_background_best_effort_lim": "999999",
> > "osd_mclock_scheduler_background_best_effort_res": "1",
> > "osd_mclock_scheduler_background_best_effort_wgt": "1",
> > "osd_mclock_scheduler_background_recovery_lim": "512",
> > "osd_mclock_scheduler_background_recovery_res": "128",
> > "osd_mclock_scheduler_background_recovery_wgt": "3",
> > "osd_mclock_scheduler_client_lim": "80",
> > "osd_mclock_scheduler_client_res": "30",
> > "osd_mclock_scheduler_client_wgt": "1",
> > "osd_mclock_skip_benchmark": "false",
> > "osd_op_queue": "mclock_scheduler",
> > ```
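> >
> > For reference, the runtime values above come straight from the daemon's
> > admin socket; osd.X stands for any OSD id (osd.0 in the sketch below), and
> > the grep is only there to trim the output:
> >
> > ```
> > # runtime values as the OSD itself sees them (admin socket)
> > ceph daemon osd.0 config show | grep osd_mclock
> > ```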
> >
> > At this point, if we change something, the change can be seen on the
> > OSD. Let's say we change the background recovery reservation to 100:
> >
> > `ceph config set osd osd_mclock_scheduler_background_recovery_res 100`
> >
> > The change has been set properly on the OSDs:
> >
> > ```
> > "osd_mclock_profile": "custom",
> > "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> > "osd_mclock_scheduler_background_best_effort_lim": "999999",
> > "osd_mclock_scheduler_background_best_effort_res": "1",
> > "osd_mclock_scheduler_background_best_effort_wgt": "1",
> > "osd_mclock_scheduler_background_recovery_lim": "512",
> > "osd_mclock_scheduler_background_recovery_res": "100",
> > "osd_mclock_scheduler_background_recovery_wgt": "3",
> > "osd_mclock_scheduler_client_lim": "80",
> > "osd_mclock_scheduler_client_res": "30",
> > "osd_mclock_scheduler_client_wgt": "1",
> > "osd_mclock_skip_benchmark": "false",
> > "osd_op_queue": "mclock_scheduler",
> > ```
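> >
> > A single option can also be queried directly from the daemon, which is
> > handy when watching whether a change actually propagates (osd.0 again is
> > just an example id):
> >
> > ```
> > # ask the running OSD for one specific option instead of the full dump
> > ceph daemon osd.0 config get osd_mclock_scheduler_background_recovery_res
> > ```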
> >
> > ********************** Step 2
> >
> > We change the profile to high_recovery_ops and remove the old custom configuration:
> >
> > ```
> > ceph config set osd osd_mclock_profile high_recovery_ops
> > ceph config rm osd osd_mclock_scheduler_background_recovery_lim
> > ceph config rm osd osd_mclock_scheduler_background_recovery_res
> > ceph config rm osd osd_mclock_scheduler_background_recovery_wgt
> > ceph config rm osd osd_mclock_scheduler_client_lim
> > ceph config rm osd osd_mclock_scheduler_client_res
> > ceph config rm osd osd_mclock_scheduler_client_wgt
> > ```
> >
> > The config contains this now:
> >
> > ```
> > osd advanced osd_mclock_profile high_recovery_ops
> > osd advanced osd_op_queue mclock_scheduler *
> > ```
> >
> > And we can see that the configuration was propagated to the OSDs:
> >
> > ```
> > "osd_mclock_profile": "high_recovery_ops",
> > "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> > "osd_mclock_scheduler_background_best_effort_lim": "999999",
> > "osd_mclock_scheduler_background_best_effort_res": "1",
> > "osd_mclock_scheduler_background_best_effort_wgt": "2",
> > "osd_mclock_scheduler_background_recovery_lim": "343",
> > "osd_mclock_scheduler_background_recovery_res": "103",
> > "osd_mclock_scheduler_background_recovery_wgt": "2",
> > "osd_mclock_scheduler_client_lim": "137",
> > "osd_mclock_scheduler_client_res": "51",
> > "osd_mclock_scheduler_client_wgt": "1",
> > "osd_mclock_skip_benchmark": "false",
> > "osd_op_queue": "mclock_scheduler",
> >
> > ```
> >
> > ********************** Step 3
> >
> > The issue comes now, when we try to go back to custom profile:
> >
> > ```
> > ceph config set osd osd_mclock_profile custom
> > ceph config set osd osd_mclock_scheduler_background_recovery_lim 512
> > ceph config set osd osd_mclock_scheduler_background_recovery_res 128
> > ceph config set osd osd_mclock_scheduler_background_recovery_wgt 3
> > ceph config set osd osd_mclock_scheduler_client_lim 80
> > ceph config set osd osd_mclock_scheduler_client_res 30
> > ceph config set osd osd_mclock_scheduler_client_wgt 1
> >
> > ```
> >
> > The ceph configuration looks good:
> >
> > ```
> > osd advanced osd_mclock_profile custom
> > osd advanced osd_mclock_scheduler_background_recovery_lim 512
> > osd advanced osd_mclock_scheduler_background_recovery_res 128
> > osd advanced osd_mclock_scheduler_background_recovery_wgt 3
> > osd advanced osd_mclock_scheduler_client_lim 80
> > osd advanced osd_mclock_scheduler_client_res 30
> > osd advanced osd_mclock_scheduler_client_wgt 1
> > osd advanced osd_op_queue mclock_scheduler *
> > ```
> >
> > But the lim, res and wgt values on the OSDs still reflect the old
> > high_recovery_ops settings, even though the profile is now custom:
> >
> > ```
> > "osd_mclock_profile": "custom",
> > "osd_mclock_scheduler_anticipation_timeout": "0.000000",
> > "osd_mclock_scheduler_background_best_effort_lim": "999999",
> > "osd_mclock_scheduler_background_best_effort_res": "1",
> > "osd_mclock_scheduler_background_best_effort_wgt": "2",
> > "osd_mclock_scheduler_background_recovery_lim": "343",
> > "osd_mclock_scheduler_background_recovery_res": "103",
> > "osd_mclock_scheduler_background_recovery_wgt": "2",
> > "osd_mclock_scheduler_client_lim": "137",
> > "osd_mclock_scheduler_client_res": "51",
> > "osd_mclock_scheduler_client_wgt": "1",
> > "osd_mclock_skip_benchmark": "false",
> > "osd_op_queue": "mclock_scheduler",
> > ```
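> >
> > The mismatch is easy to see per option by comparing what the monitors
> > think the OSD should run with what the daemon actually reports; osd.0 is
> > just an example, and this is simply how we sanity-check it:
> >
> > ```
> > # value according to the centralized config (monitor view)
> > ceph config show osd.0 osd_mclock_scheduler_background_recovery_res
> >
> > # value the running OSD actually uses (admin socket view)
> > ceph daemon osd.0 config get osd_mclock_scheduler_background_recovery_res
> > ```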
> >
> > At this point, no matter what we try to change, the only changes that
> > have an effect on the mClock parameters are switches to one of the
> > pre-configured profiles, such as balanced, high_recovery_ops or
> > high_client_ops. Setting custom again will leave the OSDs on their last
> > pre-configured profile.
> >
> > The only way I found to actually change these parameters again is to
> > restart the OSDs.
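> >
> > For example, on a cephadm-managed cluster something like the following
> > does it (osd.0 as an example; on package-based installs restarting the
> > ceph-osd@<id> systemd unit would be the equivalent):
> >
> > ```
> > # restart a single OSD daemon via the orchestrator
> > ceph orch daemon restart osd.0
> > ```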
> >
> > This looks like a bug to me, but please tell me if this is expected behavior.
> >
> > Regards,
> > Luis Domingues
> > Proton AG
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
>

-- 

Sridhar Seshasayee

Principal Software Engineer

Red Hat <https://www.redhat.com>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


