Re: snap_schedule works after 1 hour of scheduling

I don't see sensible output for the commands:
# ls -ld /<mount-path>/volumes/subvolgrp/test
# ls -l /<mount-path>/volumes/subvolgrp/test/.snap

Please remember to replace /<mount-path> with the path to the mount
point on your system.
I'm presuming /<mount-path> is the path where you have mounted the
root dir of your cephfs filesystem.
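
If it helps, here is a minimal sketch of how to get those listings, assuming a
kernel-client mount and that ceph.conf plus the admin keyring are present on
the node (adjust the mount point, filesystem name and credentials to your setup):
# mkdir -p /mnt/cephfs
# mount -t ceph :/ /mnt/cephfs -o name=admin,fs=cephfs
# ls -ld /mnt/cephfs/volumes/subvolgrp/test
# ls -l /mnt/cephfs/volumes/subvolgrp/test/.snap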

On Thu, Oct 5, 2023 at 1:44 PM Kushagr Gupta
<kushagrguptasps.mun@xxxxxxxxx> wrote:
>
> Hi Milind,
>
> Thank you for your response.
> Please find the logs attached, as instructed.
>
> Thanks and Regards,
> Kushagra Gupta
>
>
> On Thu, Oct 5, 2023 at 12:09 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>>
>> This is really odd.
>>
>> Please run the following commands and send over their outputs:
>> # ceph status
>> # ceph fs status
>> # ceph report
>> # ls -ld /<mount-path>/volumes/subvolgrp/test
>> # ls -l /<mount-path>/volumes/subvolgrp/test/.snap
>>
>> On Thu, Oct 5, 2023 at 11:17 AM Kushagr Gupta
>> <kushagrguptasps.mun@xxxxxxxxx> wrote:
>> >
>> > Hi Milind, Team,
>> >
>> > Thank you for your response, @Milind Changire.
>> >
>> > >>The only thing I can think of is a stale mgr that wasn't restarted
>> > >>after an upgrade.
>> > >>Was an upgrade performed lately ?
>> >
>> > Yes, an upgrade was performed, after which we faced this. But we were facing this issue previously as well.
>> > Another interesting thing we observed was that even after the upgrade, the schedules that we had created before the upgrade were still running.
>> >
>> > But to eliminate this, I installed a fresh cluster after purging the old one.
>> > The commands used are as follows:
>> > ```
>> > ansible-playbook -i hosts infrastructure-playbooks/purge-cluster.yml
>> > ansible-playbook -i hosts site.yml
>> > ```
>> >
>> > After this, kindly note the commands which we followed:
>> > ```
>> > [root@storagenode-1 ~]# ceph mgr module  enable snap_schedule
>> > [root@storagenode-1 ~]# ceph config set mgr mgr/snap_schedule/log_level debug
>> > [root@storagenode-1 ~]# sudo ceph fs subvolumegroup create cephfs subvolgrp
>> > [root@storagenode-1 ~]# ceph fs subvolume create cephfs test subvolgrp
>> > [root@storagenode-1 ~]# date
>> > Thu Oct  5 04:23:09 UTC 2023
>> > [root@storagenode-1 ~]# ceph fs snap-schedule add /volumes/subvolgrp/test 1h 2023-10-05T04:30:00
>> > Schedule set for path /volumes/subvolgrp/test
>> > [root@storagenode-1 ~]#  ceph fs snap-schedule list / --recursive=true
>> > /volumes/subvolgrp/test 1h
>> > [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
>> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test", "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
>> > [root@storagenode-1 ~]# ceph fs subvolume info cephfs test subvolgrp
>> > {
>> >     "atime": "2023-10-05 04:20:18",
>> >     "bytes_pcent": "undefined",
>> >     "bytes_quota": "infinite",
>> >     "bytes_used": 0,
>> >     "created_at": "2023-10-05 04:20:18",
>> >     "ctime": "2023-10-05 04:20:18",
>> >     "data_pool": "cephfs_data",
>> >     "features": [
>> >         "snapshot-clone",
>> >         "snapshot-autoprotect",
>> >         "snapshot-retention"
>> >     ],
>> >     "gid": 0,
>> >     "mode": 16877,
>> >     "mon_addrs": [
>> >         "[abcd:abcd:abcd::34]:6789",
>> >         "[abcd:abcd:abcd::35]:6789",
>> >         "[abcd:abcd:abcd::36]:6789"
>> >     ],
>> >     "mtime": "2023-10-05 04:20:18",
>> >     "path": "/volumes/subvolgrp/test/73d82b1a-6fb1-4160-a388-66b898967a85",
>> >     "pool_namespace": "",
>> >     "state": "complete",
>> >     "type": "subvolume",
>> >     "uid": 0
>> > }
>> > [root@storagenode-1 ~]#
>> > [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
>> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test", "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {"h": 4}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
>> > [root@storagenode-1 ~]# date
>> > Thu Oct  5 05:31:20 UTC 2023
>> > [root@storagenode-1 ~]#
>> > ```
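>> >
>> > For completeness, a minimal sketch of how the mgr/module state could be checked at this point (the journalctl unit name is only an assumption for a package/ceph-ansible deployment and may differ on your nodes):
>> > ```
>> > # ceph mgr module ls | grep snap_schedule
>> > # ceph mgr stat
>> > # journalctl -u ceph-mgr@storagenode-1 | grep -i snap_schedule | tail
>> > ```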
>> >
>> > Could you please help us? Are we doing something wrong? The schedules are still not getting created.
>> >
>> > Thanks and Regards,
>> > Kushagra Gupta
>> >
>> > On Wed, Oct 4, 2023 at 9:33 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>> >>
>> >> On Wed, Oct 4, 2023 at 7:19 PM Kushagr Gupta
>> >> <kushagrguptasps.mun@xxxxxxxxx> wrote:
>> >> >
>> >> > Hi Milind,
>> >> >
>> >> > Thank you for your swift response.
>> >> >
>> >> > >>How many hours did you wait after the "start time" and decide to restart mgr ?
>> >> > We waited for ~3 days before restarting the mgr-service.
>> >>
>> >> The only thing I can think of is a stale mgr that wasn't restarted
>> >> after an upgrade.
>> >> Was an upgrade performed lately?
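>> >>
>> >> To rule that out, you could compare the daemon versions the cluster reports, e.g.:
>> >> # ceph versions
>> >> # ceph mgr stat
>> >> If the mgr is listed under an older version than the mons and osds, it is
>> >> still running the pre-upgrade binary and should be restarted or failed over.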
>> >>
>> >> Did the dir exist at the time the snapshot was scheduled to take place?
>> >> If it didn't, then the schedule gets disabled until explicitly enabled.
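>> >> You can check for that in the status output ("active" would show false)
>> >> and re-enable the schedule explicitly, e.g.:
>> >> # ceph fs snap-schedule status /volumes/subvolgrp/test
>> >> # ceph fs snap-schedule activate /volumes/subvolgrp/test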
>> >>
>> >> >
>> >> > There was one more instance where we waited for 2 hours and then restarted the mgr, and in the third hour the schedule started working.
>> >> >
>> >> > Could you please guide us if we are doing anything wrong?
>> >> > Kindly let us know if any logs are required.
>> >> >
>> >> > Thanks and Regards,
>> >> > Kushagra Gupta
>> >> >
>> >> > On Wed, Oct 4, 2023 at 5:39 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>> >> >>
>> >> >> On Wed, Oct 4, 2023 at 3:40 PM Kushagr Gupta
>> >> >> <kushagrguptasps.mun@xxxxxxxxx> wrote:
>> >> >> >
>> >> >> > Hi Team, Milind,
>> >> >> >
>> >> >> > Ceph-version: Quincy, Reef
>> >> >> > OS: Almalinux 8
>> >> >> >
>> >> >> > Issue: snap_schedule works after 1 hour of scheduling
>> >> >> >
>> >> >> > Description:
>> >> >> >
>> >> >> > We are currently working with a 3-node Ceph cluster.
>> >> >> > We are currently exploring the scheduled snapshot capability of the ceph-mgr module.
>> >> >> > To enable/configure scheduled snapshots, we followed the link below:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > https://docs.ceph.com/en/quincy/cephfs/snap-schedule/
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > We were able to create snap schedules for the subvolumes as suggested.
>> >> >> > But we have observed two very strange behaviours:
>> >> >> > 1. The snap_schedules only work when we restart the ceph-mgr service on the mgr node:
>> >> >> > We then restarted the mgr service on the active mgr node, and after 1 hour the snapshots started getting created. I am attaching the log file for the same after the restart. The behaviour looks abnormal.
>> >> >>
>> >> >> A mgr restart is not required for the schedule to get triggered.
>> >> >> How many hours did you wait after the "start time" and decide to restart mgr ?
>> >> >>
>> >> >> >
>> >> >> > So, for example, consider the below output:
>> >> >> > ```
>> >> >> > [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test3
>> >> >> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test3", "rel_path": "/volumes/subvolgrp/test3", "schedule": "1h", "retention": {}, "start": "2023-10-04T07:20:00", "created": "2023-10-04T07:18:41", "first": "2023-10-04T08:20:00", "last": "2023-10-04T09:20:00", "last_pruned": null, "created_count": 2, "pruned_count": 0, "active": true}
>> >> >> > [root@storagenode-1 ~]#
>> >> >> > ```
>> >> >> > As we can see in the above output, we created the schedule at 2023-10-04T07:18:41. The schedule was supposed to start at 2023-10-04T07:20:00, but it started at 2023-10-04T08:20:00.
>> >> >>
>> >> >> Seems normal behavior to me.
>> >> >> The schedule starts the 1h countdown from 2023-10-04T07:20:00 and
>> >> >> created the first snapshot at 2023-10-04T08:20:00.
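>> >> >>
>> >> >> In other words, snapshots are expected at start + N * period. A quick
>> >> >> sanity check (GNU date, purely illustrative):
>> >> >> # date -u -d '2023-10-04 07:20:00 UTC +1 hour'    # expected first snapshot
>> >> >> # date -u -d '2023-10-04 07:20:00 UTC +2 hours'   # expected second snapshot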
>> >> >>
>> >> >> >
>> >> >> > Any input w.r.t the same will be of great help.
>> >> >> >
>> >> >> > Thanks and Regards
>> >> >> > Kushagra Gupta
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Milind
>> >> >>
>> >>
>> >>
>> >> --
>> >> Milind
>> >>
>>
>>
>> --
>> Milind
>>


-- 
Milind



