Re: cephfs miss data for 15s when master mds rebooting

"Yan, Zheng" <ukernel@xxxxxxxxx> · Mon, 18 Dec 2017 12:01:55 +0800

On Mon, Dec 18, 2017 at 11:34 AM, 13605702596@xxxxxxx
<13605702596@xxxxxxx> wrote:
> hi Yan
>
>> Mon Dec 18 03:07:47 UTC 2017  <-- reboot
>> Mon Dec 18 03:08:05 UTC 2017  <-- mds failover works
>
> this is caused by write stall
>
> but the data below got lost, is this normal?

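Your script never wrote the data below to the file: it was blocked
during the stall, so those timestamps were never generated. Try this
script, which also prints each timestamp to the terminal: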

while true; do now=$(date); echo "$now"; echo "$now" >> time.txt; sync; sleep 1; done
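
To see afterwards where the stall shows up in time.txt, something like the
following could be used. It is only a sketch: report_gaps is a made-up
helper name, and it assumes every line has the default `date` layout
(time of day in the fourth field) and that the file crosses midnight at
most once between two consecutive lines.

```shell
#!/bin/sh
# report_gaps FILE [THRESHOLD]
# Print every pair of consecutive timestamps that are more than THRESHOLD
# seconds apart (default 1). Hypothetical helper, not part of ceph.
# Assumes lines such as "Mon Dec 18 03:07:47 UTC 2017", so the time of
# day (HH:MM:SS) is the fourth whitespace-separated field.
report_gaps() {
    awk -v threshold="${2:-1}" '
        {
            # Convert HH:MM:SS to seconds since midnight.
            split($4, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (NR > 1) {
                gap = now - prev
                if (gap < 0) gap += 86400   # crossed midnight once
                if (gap > threshold)
                    printf "%d s gap: %s -> %s\n", gap, prev_line, $0
            }
            prev = now
            prev_line = $0
        }
    ' "$1"
}
```

For example, `report_gaps /root/cephfs/time.txt 5` on the file quoted in
this thread would report an 18 s gap between 03:07:47 and 03:08:05. A gap
like that with no missing lines on the terminal means the writer was
blocked, not that written data disappeared.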

> Mon Dec 18 03:07:48 UTC 2017
> Mon Dec 18 03:07:49 UTC 2017
> Mon Dec 18 03:07:50 UTC 2017
> Mon Dec 18 03:07:51 UTC 2017
> Mon Dec 18 03:07:52 UTC 2017
> Mon Dec 18 03:07:53 UTC 2017
> Mon Dec 18 03:07:54 UTC 2017
> Mon Dec 18 03:07:55 UTC 2017
> Mon Dec 18 03:07:56 UTC 2017
> Mon Dec 18 03:07:57 UTC 2017
> Mon Dec 18 03:07:58 UTC 2017
> Mon Dec 18 03:07:59 UTC 2017
> Mon Dec 18 03:08:00 UTC 2017
> Mon Dec 18 03:08:01 UTC 2017
> Mon Dec 18 03:08:02 UTC 2017
> Mon Dec 18 03:08:03 UTC 2017
> Mon Dec 18 03:08:04 UTC 2017
>
> ________________________________
> 13605702596@xxxxxxx
>
>
> From: Yan, Zheng
> Date: 2017-12-18 11:27
> To: 13605702596@xxxxxxx
> CC: John Spray; ceph-users
> Subject: Re: Re: cephfs miss data for 15s when master mds rebooting
> On Mon, Dec 18, 2017 at 11:11 AM, 13605702596@xxxxxxx
> <13605702596@xxxxxxx> wrote:
>> hi Yan
>>
>> my test script:
>>
>> #!/bin/sh
>>
>> rm -f /root/cephfs/time.txt
>>
>> while true
>> do
>>     echo `date` >> /root/cephfs/time.txt
>>     sync
>>     sleep 1
>> done
>>
>> i run this script and then reboot the master mds
>>
>> from the file /root/cephfs/time.txt, i can see that more than 15 lines
>> got lost:
>> Mon Dec 18 03:07:43 UTC 2017
>> Mon Dec 18 03:07:44 UTC 2017
>> Mon Dec 18 03:07:45 UTC 2017
>> Mon Dec 18 03:07:47 UTC 2017  <-- reboot
>> Mon Dec 18 03:08:05 UTC 2017  <-- mds failover works
>
> this is caused by write stall
>
>> Mon Dec 18 03:08:06 UTC 2017
>> Mon Dec 18 03:08:07 UTC 2017
>> Mon Dec 18 03:08:08 UTC 2017
>> Mon Dec 18 03:08:09 UTC 2017
>> Mon Dec 18 03:08:10 UTC 2017
>>
>> ________________________________
>> 13605702596@xxxxxxx
>>
>>
>> From: Yan, Zheng
>> Date: 2017-12-18 10:59
>> To: 13605702596@xxxxxxx
>> CC: John Spray; ceph-users
>> Subject: Re: Re: cephfs miss data for 15s when master mds rebooting
>> On Mon, Dec 18, 2017 at 10:10 AM, 13605702596@xxxxxxx
>> <13605702596@xxxxxxx> wrote:
>>> hi Yan
>>>
>>> 1. run "ceph mds fail" before rebooting host
>>> 2. host reboot by itself for some reason
>>>
>>> do you mean no data gets lost in BOTH conditions?
>>>
>>> in my test, i echo the date string once per second into a file under the
>>> cephfs dir. when i reboot the master mds, 15 lines get lost.
>>>
>>
>> what do you mean by "15 lines got lost"? are you sure it's not caused by
>> a write stall?
>>
>>
>>> thanks
>>>
>>> ________________________________
>>> 13605702596@xxxxxxx
>>>
>>>
>>> From: Yan, Zheng
>>> Date: 2017-12-18 09:55
>>> To: 13605702596@xxxxxxx
>>> CC: John Spray; ceph-users
>>> Subject: Re: cephfs miss data for 15s when master mds rebooting
>>> On Mon, Dec 18, 2017 at 9:24 AM, 13605702596@xxxxxxx
>>> <13605702596@xxxxxxx> wrote:
>>>> hi John
>>>>
>>>> thanks for your answer.
>>>>
>>>> under normal conditions, i can run "ceph mds fail" before the reboot.
>>>> but if the host reboots by itself for some reason, i can do nothing!
>>>> if this happens, data will be lost.
>>>>
>>>> so, is there any other way to stop data from being lost?
>>>>
>>>
>>> no data gets lost in this case. IO just stalls for a few seconds
>>>
>>>> thanks
>>>>
>>>> ________________________________
>>>> 13605702596@xxxxxxx
>>>>
>>>>
>>>> From: John Spray
>>>> Date: 2017-12-15 18:08
>>>> To: 13605702596@xxxxxxx
>>>> CC: ceph-users
>>>> Subject: Re: cephfs miss data for 15s when master mds rebooting
>>>> On Fri, Dec 15, 2017 at 1:45 AM, 13605702596@xxxxxxx
>>>> <13605702596@xxxxxxx> wrote:
>>>>> hi
>>>>>
>>>>> i used 3 nodes to deploy mds (each node also has mon on it)
>>>>>
>>>>> my config:
>>>>> [mds.ceph-node-10-101-4-17]
>>>>> mds_standby_replay = true
>>>>> mds_standby_for_rank = 0
>>>>>
>>>>> [mds.ceph-node-10-101-4-21]
>>>>> mds_standby_replay = true
>>>>> mds_standby_for_rank = 0
>>>>>
>>>>> [mds.ceph-node-10-101-4-22]
>>>>> mds_standby_replay = true
>>>>> mds_standby_for_rank = 0
>>>>>
>>>>> the mds stat:
>>>>> e29: 1/1/1 up {0=ceph-node-10-101-4-22=up:active}, 1 up:standby-replay, 1 up:standby
>>>>>
>>>>> i mount the cephfs on the ceph client, and run the test script to write
>>>>> data into a file under the cephfs dir.
>>>>> when i reboot the master mds, i find that the data is not written into
>>>>> the file.
>>>>> after 15 seconds, data can be written into the file again
>>>>>
>>>>> so my question is:
>>>>> is this normal when rebooting the master mds?
>>>>> when will the up:standby-replay mds take over the cephfs?
>>>>
>>>> The standby takes over after the active daemon has not reported to the
>>>> monitors for `mds_beacon_grace` seconds, which as you have noticed is
>>>> 15s by default.
>>>>
>>>> If you know you are rebooting something, you can pre-empt the timeout
>>>> mechanism by using "ceph mds fail" on the active daemon, to cause
>>>> another to take over right away.
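
For a planned reboot, the step described above could be wrapped roughly as
follows. This is a sketch under assumptions: fail_mds_before_reboot and the
overridable CEPH variable are invented here for illustration, rank 0 is
taken from the config quoted in this thread, and the node running it needs
a keyring that is allowed to run "ceph mds fail".

```shell
#!/bin/sh
# CEPH is the ceph CLI to invoke. Making it overridable is an assumption
# of this sketch (not a ceph convention); it allows a dry run with
# CEPH=echo.
CEPH="${CEPH:-ceph}"

# fail_mds_before_reboot [RANK_OR_NAME]
# Mark the given MDS (rank 0 by default) as failed so that a standby
# takes over without waiting for the beacon timeout, then print the
# resulting MDS state. Hypothetical helper, not part of ceph.
fail_mds_before_reboot() {
    target="${1:-0}"
    "$CEPH" mds fail "$target" || return 1
    "$CEPH" mds stat
}
```

Run `fail_mds_before_reboot 0`, and reboot the MDS host once the
"ceph mds stat" output shows another daemon as up:active. This does not
help with an unplanned reboot, where the takeover still waits for
mds_beacon_grace.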
>>>>
>>>> John
>>>>
>>>>> thanks
>>>>>
>>>>> ________________________________
>>>>> 13605702596@xxxxxxx
>>>>>
>>>>> _______________________________________________
>>>>> ceph-users mailing list
>>>>> ceph-users@xxxxxxxxxxxxxx
>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com