Re: snap_schedule works after 1 hour of scheduling

This is really odd.

Please run the following commands and send over their outputs:
# ceph status
# ceph fs status
# ceph report
# ls -ld /<mount-path>/volumes/subvolgrp/test
# ls -l /<mount-path>/volumes/subvolgrp/test/.snap
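
In case it helps rule out a stale mgr or a disabled module (standard Ceph CLI, nothing deployment-specific assumed), these two might also be worth capturing:
# ceph versions
# ceph mgr module ls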

On Thu, Oct 5, 2023 at 11:17 AM Kushagr Gupta
<kushagrguptasps.mun@xxxxxxxxx> wrote:
>
> Hi Milind, Team,
>
> Thank you for your response @Milind Changire
>
> >>The only thing I can think of is a stale mgr that wasn't restarted
> >>after an upgrade.
> >>Was an upgrade performed lately ?
>
> Yes, an upgrade was performed, after which we faced this. But we were facing this issue previously as well.
> Another interesting thing we observed was that even after the upgrade, the schedules we had created before the upgrade were still running.
>
> But to eliminate this, I installed a fresh cluster after purging the old one.
> The commands used were as follows:
> ```
> ansible-playbook -i hosts infrastructure-playbooks/purge-cluster.yml
> ansible-playbook -i hosts site.yml
> ```
>
> After this, kindly note the commands we followed:
> ```
> [root@storagenode-1 ~]# ceph mgr module  enable snap_schedule
> [root@storagenode-1 ~]# ceph config set mgr mgr/snap_schedule/log_level debug
> [root@storagenode-1 ~]# sudo ceph fs subvolumegroup create cephfs subvolgrp
> [root@storagenode-1 ~]# ceph fs subvolume create cephfs test subvolgrp
> [root@storagenode-1 ~]# date
> Thu Oct  5 04:23:09 UTC 2023
> [root@storagenode-1 ~]# ceph fs snap-schedule add /volumes/subvolgrp/test 1h 2023-10-05T04:30:00
> Schedule set for path /volumes/subvolgrp/test
> [root@storagenode-1 ~]#  ceph fs snap-schedule list / --recursive=true
> /volumes/subvolgrp/test 1h
> [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
> {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test", "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
> [root@storagenode-1 ~]# ceph fs subvolume info cephfs test subvolgrp
> {
>     "atime": "2023-10-05 04:20:18",
>     "bytes_pcent": "undefined",
>     "bytes_quota": "infinite",
>     "bytes_used": 0,
>     "created_at": "2023-10-05 04:20:18",
>     "ctime": "2023-10-05 04:20:18",
>     "data_pool": "cephfs_data",
>     "features": [
>         "snapshot-clone",
>         "snapshot-autoprotect",
>         "snapshot-retention"
>     ],
>     "gid": 0,
>     "mode": 16877,
>     "mon_addrs": [
>         "[abcd:abcd:abcd::34]:6789",
>         "[abcd:abcd:abcd::35]:6789",
>         "[abcd:abcd:abcd::36]:6789"
>     ],
>     "mtime": "2023-10-05 04:20:18",
>     "path": "/volumes/subvolgrp/test/73d82b1a-6fb1-4160-a388-66b898967a85",
>     "pool_namespace": "",
>     "state": "complete",
>     "type": "subvolume",
>     "uid": 0
> }
> [root@storagenode-1 ~]#
> [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test
> {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test", "rel_path": "/volumes/subvolgrp/test", "schedule": "1h", "retention": {"h": 4}, "start": "2023-10-05T04:30:00", "created": "2023-10-05T04:23:39", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": true}
> [root@storagenode-1 ~]# date
> Thu Oct  5 05:31:20 UTC 2023
> [root@storagenode-1 ~]#
> ```
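>
> Since we set mgr/snap_schedule/log_level to debug above, one more thing we can capture is the active mgr's log around the scheduled time. A rough sketch (the mgr unit name depends on how the cluster was deployed, so adjust as needed):
> ```
> # find the currently active mgr
> ceph mgr stat
> # on that node, look for snap_schedule messages in the mgr log,
> # e.g. with a packaged (non-containerized) mgr:
> journalctl -u ceph-mgr@$(hostname -s) | grep -i snap_schedule
> ```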
>
> Could you please help us? Are we doing something wrong? The schedules are still not getting created.
>
> Thanks and Regards,
> Kushagra Gupta
>
> On Wed, Oct 4, 2023 at 9:33 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>>
>> On Wed, Oct 4, 2023 at 7:19 PM Kushagr Gupta
>> <kushagrguptasps.mun@xxxxxxxxx> wrote:
>> >
>> > Hi Milind,
>> >
>> > Thank you for your swift response.
>> >
>> > >>How many hours did you wait after the "start time" and decide to restart mgr ?
>> > We waited for ~3 days before restarting the mgr-service.
>>
>> The only thing I can think of is a stale mgr that wasn't restarted
>> after an upgrade.
>> Was an upgrade performed lately?
>>
>> Did the dir exist at the time the snapshot was scheduled to take place?
>> If it didn't, then the schedule gets disabled until explicitly enabled.
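>>
>> If that's what happened, I'd expect the status output to show "active": false for
>> the path, and the schedule can be re-enabled with the stock snap-schedule CLI:
>> # ceph fs snap-schedule status <path>
>> # ceph fs snap-schedule activate <path>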
>>
>> >
>> > There was one more instance where we waited for 2 hours and then restarted, and in the third hour the schedule started working.
>> >
>> > Could you please guide us if we are doing anything wrong.
>> > Kindly let us know if any logs are required.
>> >
>> > Thanks and Regards,
>> > Kushagra Gupta
>> >
>> > On Wed, Oct 4, 2023 at 5:39 PM Milind Changire <mchangir@xxxxxxxxxx> wrote:
>> >>
>> >> On Wed, Oct 4, 2023 at 3:40 PM Kushagr Gupta
>> >> <kushagrguptasps.mun@xxxxxxxxx> wrote:
>> >> >
>> >> > Hi Team, Milind,
>> >> >
>> >> > Ceph-version: Quincy, Reef
>> >> > OS: Almalinux 8
>> >> >
>> >> > Issue: snap_schedule works after 1 hour of scheduling
>> >> >
>> >> > Description:
>> >> >
>> >> > We are currently working on a 3-node Ceph cluster.
>> >> > We are currently exploring the scheduled snapshot capability of the ceph-mgr module.
>> >> > To enable/configure scheduled snapshots, we followed this link:
>> >> >
>> >> >
>> >> >
>> >> > https://docs.ceph.com/en/quincy/cephfs/snap-schedule/
>> >> >
>> >> >
>> >> >
>> >> > We were able to create snap schedules for the subvolumes as suggested.
>> >> > But we have observed two very strange behaviours:
>> >> > 1. The snap_schedules only work when we restart the ceph-mgr service on the mgr node:
>> >> > We restarted the mgr service on the active mgr node, and after 1 hour the snapshots started getting created. I am attaching the log file for the same after the restart. The behaviour looks abnormal.
>> >>
>> >> A mgr restart is not required for the schedule to get triggered.
>> >> How many hours did you wait after the "start time" before deciding to restart the mgr?
>> >>
>> >> >
>> >> > So, for example, consider the output below:
>> >> > ```
>> >> > [root@storagenode-1 ~]# ceph fs snap-schedule status /volumes/subvolgrp/test3
>> >> > {"fs": "cephfs", "subvol": null, "path": "/volumes/subvolgrp/test3", "rel_path": "/volumes/subvolgrp/test3", "schedule": "1h", "retention": {}, "start": "2023-10-04T07:20:00", "created": "2023-10-04T07:18:41", "first": "2023-10-04T08:20:00", "last": "2023-10-04T09:20:00", "last_pruned": null, "created_count": 2, "pruned_count": 0, "active": true}
>> >> > [root@storagenode-1 ~]#
>> >> > ```
>> >> > As we can see in the output above, we created the schedule at 2023-10-04T07:18:41. The schedule was supposed to start at 2023-10-04T07:20:00, but it started at 2023-10-04T08:20:00.
>> >>
>> >> This seems like normal behavior to me:
>> >> the schedule starts its 1h countdown from 2023-10-04T07:20:00 and
>> >> creates the first snapshot at 2023-10-04T08:20:00.
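>> >>
>> >> just to spell out the arithmetic (GNU date used purely as an illustration):
>> >> $ date -u -d '2023-10-04 07:20:00 + 1 hour' '+%Y-%m-%dT%H:%M:%S'
>> >> 2023-10-04T08:20:00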
>> >>
>> >> >
>> >> > Any input w.r.t the same will be of great help.
>> >> >
>> >> > Thanks and Regards
>> >> > Kushagra Gupta
>> >>
>> >>
>> >>
>> >> --
>> >> Milind
>> >>
>>
>>
>> --
>> Milind
>>


-- 
Milind
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



