Re: cephfs miss data for 15s when master mds rebooting

"13605702596@xxxxxxx" <13605702596@xxxxxxx> · Mon, 18 Dec 2017 10:10:39 +0800

Hi Yan,

1. Run "ceph mds fail" before rebooting the host.
2. The host reboots by itself for some reason.

Do you mean that no data gets lost in BOTH cases?

In my test, I echo the date string once per second into a file under the CephFS directory.
When I reboot the master MDS, 15 lines are missing.
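
A rough Python sketch of how the output file from such a test could be checked (this is not the script used above; the one-timestamp-per-line format and the name `find_gaps` are assumptions for illustration). It reports every place where two consecutive lines are further apart than the expected interval, so a 15-second hole shows up as a single gap:

```python
from datetime import datetime

def find_gaps(lines, fmt="%Y-%m-%d %H:%M:%S", interval=1.0, slack=0.5):
    """Return (previous, next, missing_seconds) for each hole in the log.

    `lines` holds one timestamp per line in `fmt` (an assumed format).
    A hole is any pair of consecutive stamps that are more than
    `interval + slack` seconds apart.
    """
    stamps = [datetime.strptime(l.strip(), fmt) for l in lines if l.strip()]
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > interval + slack:
            # Seconds for which no line exists between prev and cur.
            gaps.append((prev, cur, delta - interval))
    return gaps
```

Feeding it the lines of the test file (for example `find_gaps(open(path))`) gives the start, end, and length of each hole, which can then be compared against the time of the MDS reboot.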

thanks

13605702596@xxxxxxx

From: Yan, Zheng
Date: 2017-12-18 09:55
To: 13605702596@xxxxxxx
CC: John Spray; ceph-users
Subject: Re:  cephfs miss data for 15s when master mds rebooting
On Mon, Dec 18, 2017 at 9:24 AM, 13605702596@xxxxxxx
<13605702596@xxxxxxx> wrote:
> hi John
>
> thanks for your answer.
>
> In the normal case, I can run "ceph mds fail" before the reboot.
> But if the host reboots by itself for some reason, I can do nothing!
> If this happens, data must be lost.
>
> So, is there any other way to prevent data from being lost?
>

No data gets lost in this case. IO just stalls for a few seconds.
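
One way to tell a stall from real loss, as a minimal sketch: number every record and time every write. If the sequence numbers in the file stay contiguous and only the timestamps jump, the writer was blocked during the failover and nothing that was written went missing. The record format and the function names below are hypothetical, chosen only for illustration.

```python
import os
import time

def append_records(path, count, interval=1.0):
    """Append `count` numbered records to `path`, one per `interval` seconds.

    Returns how long each open/write/fsync took, in seconds; a blocked
    filesystem call shows up as one large value.
    """
    latencies = []
    for seq in range(count):
        start = time.monotonic()
        with open(path, "a") as f:
            f.write(f"{seq} {time.time():.3f}\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to flush the data to storage
        latencies.append(time.monotonic() - start)
        time.sleep(interval)
    return latencies

def missing_sequence_numbers(path):
    """Return sequence numbers absent from the file (empty list = no loss)."""
    with open(path) as f:
        seen = [int(line.split()[0]) for line in f if line.strip()]
    if not seen:
        return []
    return sorted(set(range(seen[0], seen[-1] + 1)) - set(seen))
```

Run `append_records` against a file under the CephFS mount, reboot the active MDS midway, then call `missing_sequence_numbers` on the same file. An empty result together with one large latency value would match the "stall, not loss" explanation.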

> thanks
>
> ________________________________
> 13605702596@xxxxxxx
>
>
> From: John Spray
> Date: 2017-12-15 18:08
> To: 13605702596@xxxxxxx
> CC: ceph-users
> Subject: Re:  cephfs miss data for 15s when master mds rebooting
> On Fri, Dec 15, 2017 at 1:45 AM, 13605702596@xxxxxxx
> <13605702596@xxxxxxx> wrote:
>> hi
>>
>> I used 3 nodes to deploy MDS daemons (each node also runs a mon).
>>
>> my config:
>> [mds.ceph-node-10-101-4-17]
>> mds_standby_replay = true
>> mds_standby_for_rank = 0
>>
>> [mds.ceph-node-10-101-4-21]
>> mds_standby_replay = true
>> mds_standby_for_rank = 0
>>
>> [mds.ceph-node-10-101-4-22]
>> mds_standby_replay = true
>> mds_standby_for_rank = 0
>>
>> the mds stat:
>> e29: 1/1/1 up {0=ceph-node-10-101-4-22=up:active}, 1 up:standby-replay, 1 up:standby
>>
>> I mount the CephFS on the Ceph client and run a test script that writes
>> data into a file under the CephFS directory.
>> When I reboot the master MDS, I find that the data is not written into the
>> file.
>> After 15 seconds, data can be written into the file again.
>>
>> So my questions are:
>> Is this normal when rebooting the master MDS?
>> When will the up:standby-replay MDS take over the CephFS?
>
> The standby takes over after the active daemon has not reported to the
> monitors for `mds_beacon_grace` seconds, which as you have noticed is
> 15s by default.
>
> If you know you are rebooting something, you can pre-empt the timeout
> mechanism by using "ceph mds fail" on the active daemon, to cause
> another to take over right away.
>
> John
>
>> thanks
>>
>> ________________________________
>> 13605702596@xxxxxxx
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
