It should take ~25 seconds by default to detect a network failure; the config option that controls this is "osd heartbeat grace" (default 20 seconds, but it takes a little longer for the failure to actually be detected). Check "ceph -w" while performing the test.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, Nov 29, 2019 at 8:14 AM majia xiao <xiaomajia.st@xxxxxxxxx> wrote:
>
> Hello,
>
> We have a Ceph cluster (version 12.2.4) with 10 hosts, and there are 21 OSDs on each host.
>
> An EC pool is created with the following commands:
>
> ceph osd erasure-code-profile set profile_jerasure_4_3_reed_sol_van \
>     plugin=jerasure \
>     k=4 \
>     m=3 \
>     technique=reed_sol_van \
>     packetsize=2048 \
>     crush-device-class=hdd \
>     crush-failure-domain=host
>
> ceph osd pool create pool_jerasure_4_3_reed_sol_van 2048 2048 erasure profile_jerasure_4_3_reed_sol_van
>
> Here are my questions:
>
> The EC pool is created with k=4, m=3, and crush-device-class=hdd, so we simply disable the network interfaces of some hosts (using the "ifdown" command) to verify the functionality of the EC pool while running the "rados bench" command.
> However, the IO rate drops to 0 immediately when a single host goes offline, and it takes a long time (~100 seconds) for the IO rate to return to normal.
> As far as I know, the default value of min_size is k+1, i.e. 5, which means that the EC pool should still work even with two hosts offline.
> Is there something wrong with my understanding?
> According to our observations, it seems that the IO rate returns to normal once Ceph has detected all the OSDs on the failed host as down.
> Is there any way to reduce the time needed for Ceph to detect all the failed OSDs?
>
> Thanks for any help.
>
> Best regards,
> Majia Xiao
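
For reference, a minimal sketch of how the grace period mentioned above could be inspected and lowered on a Luminous-era cluster. The 10-second value and the osd.0 daemon name are only illustrative assumptions; the same grace must be known to both the OSDs and the monitors, and setting it too low risks OSDs being marked down spuriously under load.

    # Watch cluster events while running the failure test
    ceph -w

    # Inspect the current value on a running OSD
    # (run on the host where osd.0 lives; osd.0 is just an example)
    ceph daemon osd.0 config show | grep osd_heartbeat_grace

    # Lower the grace period at runtime on all OSDs and monitors
    # (10 seconds is an illustrative value, not a recommendation)
    ceph tell osd.* injectargs '--osd_heartbeat_grace 10'
    ceph tell mon.* injectargs '--osd_heartbeat_grace 10'

    # To persist the change across restarts, add it to ceph.conf:
    # [global]
    # osd heartbeat grace = 10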
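
The min_size assumption in the question can also be checked directly against the pool; the expected values below are assumptions based on the k=4, m=3 profile and the usual default of min_size = k+1.

    # Confirm the pool's size and min_size
    ceph osd pool get pool_jerasure_4_3_reed_sol_van size      # expected 7 (k+m)
    ceph osd pool get pool_jerasure_4_3_reed_sol_van min_size  # expected 5 (k+1)

    # min_size could be lowered to k (4) so PGs stay active with a third host down,
    # at the cost of new writes having no redundancy margin; use with care.
    # ceph osd pool set pool_jerasure_4_3_reed_sol_van min_size 4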