On Mon, Dec 18, 2017 at 11:11 AM, 13605702596@xxxxxxx <13605702596@xxxxxxx> wrote:
> hi Yan
>
> my test script:
>
> #!/bin/sh
>
> rm -f /root/cephfs/time.txt
>
> while true
> do
>     echo `date` >> /root/cephfs/time.txt
>     sync
>     sleep 1
> done
>
> i run this script and then reboot the master mds
>
> from the file /root/cephfs/time.txt, i can see that more than 15 lines
> got lost:
> Mon Dec 18 03:07:43 UTC 2017
> Mon Dec 18 03:07:44 UTC 2017
> Mon Dec 18 03:07:45 UTC 2017
> Mon Dec 18 03:07:47 UTC 2017   <-- reboot
> Mon Dec 18 03:08:05 UTC 2017   <-- mds failover works; this is caused by write stall
> Mon Dec 18 03:08:06 UTC 2017
> Mon Dec 18 03:08:07 UTC 2017
> Mon Dec 18 03:08:08 UTC 2017
> Mon Dec 18 03:08:09 UTC 2017
> Mon Dec 18 03:08:10 UTC 2017
>
> ________________________________
> 13605702596@xxxxxxx
>
>
> From: Yan, Zheng
> Date: 2017-12-18 10:59
> To: 13605702596@xxxxxxx
> CC: John Spray; ceph-users
> Subject: Re: Re: cephfs miss data for 15s when master mds rebooting
> On Mon, Dec 18, 2017 at 10:10 AM, 13605702596@xxxxxxx
> <13605702596@xxxxxxx> wrote:
>> hi Yan
>>
>> 1. run "ceph mds fail" before rebooting the host
>> 2. the host reboots by itself for some reason
>>
>> do you mean no data gets lost in BOTH conditions?
>>
>> in my test, i echo the date string once per second into a file under the
>> cephfs dir; when i reboot the master mds, 15 lines got lost.
>>
>
> what do you mean by 15 lines got lost? are you sure it's not caused by a
> write stall?
>
>
>> thanks
>>
>> ________________________________
>> 13605702596@xxxxxxx
>>
>>
>> From: Yan, Zheng
>> Date: 2017-12-18 09:55
>> To: 13605702596@xxxxxxx
>> CC: John Spray; ceph-users
>> Subject: Re: cephfs miss data for 15s when master mds rebooting
>> On Mon, Dec 18, 2017 at 9:24 AM, 13605702596@xxxxxxx
>> <13605702596@xxxxxxx> wrote:
>>> hi John
>>>
>>> thanks for your answer.
>>>
>>> in normal conditions, i can run "ceph mds fail" before the reboot.
>>> but if the host reboots by itself for some reason, i can do nothing!
>>> if this happens, data must be lost.
>>>
>>> so, is there any other way to stop data from being lost?
>>>
>>
>> no data gets lost in this condition, just IO stalls for a few seconds
>>
>>> thanks
>>>
>>> ________________________________
>>> 13605702596@xxxxxxx
>>>
>>>
>>> From: John Spray
>>> Date: 2017-12-15 18:08
>>> To: 13605702596@xxxxxxx
>>> CC: ceph-users
>>> Subject: Re: cephfs miss data for 15s when master mds rebooting
>>> On Fri, Dec 15, 2017 at 1:45 AM, 13605702596@xxxxxxx
>>> <13605702596@xxxxxxx> wrote:
>>>> hi
>>>>
>>>> i used 3 nodes to deploy mds (each node also has a mon on it)
>>>>
>>>> my config:
>>>> [mds.ceph-node-10-101-4-17]
>>>> mds_standby_replay = true
>>>> mds_standby_for_rank = 0
>>>>
>>>> [mds.ceph-node-10-101-4-21]
>>>> mds_standby_replay = true
>>>> mds_standby_for_rank = 0
>>>>
>>>> [mds.ceph-node-10-101-4-22]
>>>> mds_standby_replay = true
>>>> mds_standby_for_rank = 0
>>>>
>>>> the mds stat:
>>>> e29: 1/1/1 up {0=ceph-node-10-101-4-22=up:active}, 1 up:standby-replay,
>>>> 1 up:standby
>>>>
>>>> i mount the cephfs on the ceph client and run the test script to write
>>>> data into a file under the cephfs dir.
>>>> when i reboot the master mds, i find the data is not written into the
>>>> file; after 15 seconds, data can be written into the file again.
>>>>
>>>> so my questions are:
>>>> is this normal when rebooting the master mds?
>>>> when will the up:standby-replay mds take over the cephfs?
>>>
>>> The standby takes over after the active daemon has not reported to the
>>> monitors for `mds_beacon_grace` seconds, which as you have noticed is
>>> 15s by default.
>>>
>>> If you know you are rebooting something, you can pre-empt the timeout
>>> mechanism by using "ceph mds fail" on the active daemon, to cause
>>> another to take over right away.
>>>
>>> John
>>>
>>>> thanks
>>>>
>>>> ________________________________
>>>> 13605702596@xxxxxxx
>>>>
>>>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
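To illustrate Yan's point that the gap in time.txt is an IO stall rather than lost data, here is a minimal variation of the test script above that also records how long each write + sync takes. This is only a sketch; the local log path /root/latency.txt is an invented placeholder, and it assumes /root/cephfs is the CephFS mount from the thread.

#!/bin/sh
# Same loop as the original test script, but additionally log the duration
# of each write + sync to a file on local disk. During an MDS failover the
# loop simply blocks, so one iteration shows a ~15s duration instead of
# timestamps being written and then disappearing.

rm -f /root/cephfs/time.txt /root/latency.txt

while true
do
    start=$(date +%s)
    echo "$(date)" >> /root/cephfs/time.txt
    sync
    end=$(date +%s)
    echo "$(date) write+sync took $((end - start))s" >> /root/latency.txt
    sleep 1
done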
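And a sketch of the planned-maintenance procedure John describes, assuming a single active rank 0 as in the mds stat output above; the host name is only the example from this thread.

#!/bin/sh
# Before a planned reboot of the host running the active MDS, fail the
# active daemon explicitly so a standby takes over immediately, instead of
# waiting mds_beacon_grace (15s by default) for the beacon timeout.

# check which daemon currently holds rank 0
ceph mds stat

# hand rank 0 over to a standby right away ("ceph mds fail" as John suggests)
ceph mds fail 0

# then reboot the host that was active, e.g.
# ssh ceph-node-10-101-4-22 reboot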