Re: Ceph availability test & recovering question

Hello,

I'm experiencing the same long-standing problem: during recovery
operations, some fraction of read I/O stays in flight for whole seconds,
which makes the upper-level filesystem on the QEMU client very slow and
almost unusable. Different striping settings have almost no effect on the
visible delays, and even a very light read workload is still very slow.

Here are some fio results for randread with small blocks, so unlike a
linear workload they are not helped by readahead:

Intensive reads during recovery:
    lat (msec) : 2=0.01%, 4=0.08%, 10=1.87%, 20=4.17%, 50=8.34%
    lat (msec) : 100=13.93%, 250=2.77%, 500=1.19%, 750=25.13%, 1000=0.41%
    lat (msec) : 2000=15.45%, >=2000=26.66%

The same test on a healthy cluster:
    lat (msec) : 20=0.33%, 50=9.17%, 100=23.35%, 250=25.47%, 750=6.53%
    lat (msec) : 1000=0.42%, 2000=34.17%, >=2000=0.56%
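
For reference, a fio invocation that produces this kind of randread latency
breakdown looks roughly like the line below; the job name, target device, and
4k block size are illustrative assumptions rather than the exact command I ran:

fio --name=randread-test --filename=/dev/vdb --rw=randread --bs=4k --iodepth=32 --numjobs=1 --runtime=120 --direct=1 --ioengine=libaio --group_reporting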


On Sun, Mar 17, 2013 at 8:18 AM,  <Kelvin_Huang@xxxxxxxxxx> wrote:
> Hi, all
>
> I have some problem after availability test
>
> Setup:
> Linux kernel: 3.2.0
> OS: Ubuntu 12.04
> Storage server: 11 HDDs (each storage server runs 11 OSDs; 7200 rpm, 1 TB drives) + 10GbE NIC
> RAID card: LSI MegaRAID SAS 9260-4i; every HDD is a single-drive RAID0, Write Policy: Write Back with BBU, Read Policy: ReadAhead, IO Policy: Direct
> Storage server number : 2
>
> Ceph version : 0.48.2
> Replicas : 2
> Monitor number:3
>
>
> We have two storage servers as a cluster, and use a Ceph client to create a 1 TB RBD image for testing; the client also
> has a 10GbE NIC, Linux kernel 3.2.0, Ubuntu 12.04
>
> We also use fio to generate the workload
>
> fio commands:
> [Sequential Read]
> fio --iodepth=32 --numjobs=1 --runtime=120 --bs=65536 --rw=read --ioengine=libaio --group_reporting --direct=1 --eta=always --ramp_time=10 --thinktime=10
>
> [Sequential Write]
> fio --iodepth=32 --numjobs=1 --runtime=120 --bs=65536 --rw=write --ioengine=libaio --group_reporting --direct=1 --eta=always --ramp_time=10 --thinktime=10
>
>
> Now I want to observe the Ceph state when one storage server crashes, so I turn off the networking on one storage server.
> We expect read and write operations to resume quickly, or not be suspended at all, while Ceph recovers, but the experiment shows
> that reads and writes pause for about 20~30 seconds during recovery.
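
A quick way to watch what the cluster does while that server is offline is to
follow the monitors from another node; these are standard ceph CLI calls,
though the exact output differs between versions:

ceph -w         # stream cluster events: OSDs going down, degraded PGs, recovery progress
ceph health     # one-shot health summary
ceph osd tree   # which OSDs are currently up or down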
>
> My questions are:
> 1. Is an I/O pause normal while Ceph is recovering?
> 2. Can the I/O pause be avoided during recovery?
> 3. How can the I/O pause time be reduced?
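
On the questions above: some initial stall is expected, because I/O to the
dead server's OSDs hangs until the cluster notices the failure and the PGs
re-peer, but its length is tunable. The settings that usually matter are the
failure-detection timeouts and the recovery throttles. A minimal ceph.conf
sketch follows; the option names come from the Ceph documentation and the
values are only illustrative, so check what your 0.48.2 build actually supports:

[osd]
    # seconds without heartbeats before a peer reports an OSD as down
    osd heartbeat grace = 20
    # limit concurrent recovery ops per OSD so client I/O keeps some bandwidth
    osd recovery max active = 1

[mon]
    # seconds before a down OSD is marked out and re-replication starts
    mon osd down out interval = 300

Lowering the heartbeat grace shortens the window during which client I/O
simply hangs, at the cost of more false down reports on a busy network;
throttling recovery mainly helps with the slowness after the cluster has
already re-peered.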
>
>
> Thanks!!
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

