Re: performance degradation every 30 seconds

I did a git pull of the latest fio from git://git.kernel.dk/fio.git
and built it with:
# gcc --version
gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
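
(For reference, the build was roughly the following; it assumes the librbd/librados development headers are already installed so that configure picks up the rbd and rados ioengines:

    git clone git://git.kernel.dk/fio.git
    cd fio
    ./configure      # should report rbd/rados engine support if the Ceph dev headers are found
    make
    ./fio --enghelp  # lists the compiled-in ioengines; rbd and rados should show up
)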


Results were as expected.

Using straight rados, there were no performance hiccups.

But running

fio --direct=1 --rw=randwrite --bs=4k --ioengine=rbd --pool=testpool --rbdname=testrbd --iodepth=256 --numjobs=1 --time_based --group_reporting --name=iops-test-job --runtime=120 --eta-newline=1

exhibited the same behaviour, albeit with slightly different timing.
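
(The same run expressed as a fio job file, in case that is easier to tweak -- the section name is arbitrary and the parameters just mirror the command line above:

    [iops-test-job]
    ioengine=rbd
    pool=testpool
    rbdname=testrbd
    direct=1
    rw=randwrite
    bs=4k
    iodepth=256
    numjobs=1
    time_based
    runtime=120
    group_reporting

Run it with e.g. "fio --eta-newline=1 rbd-randwrite.fio", where rbd-randwrite.fio is whatever you save it as.)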


Jobs: 1 (f=1): [w(1)][19.0%][r=0KiB/s,w=69.7MiB/s][r=0,w=17.8k IOPS][eta 01m:38s]
Jobs: 1 (f=1): [w(1)][20.7%][r=0KiB/s,w=2044KiB/s][r=0,w=511 IOPS][eta 01m:36s]
Jobs: 1 (f=1): [w(1)][22.3%][r=0KiB/s,w=52.5MiB/s][r=0,w=13.4k IOPS][eta 01m:34s]

  .. SKIP ..

Jobs: 1 (f=1): [w(1)][38.8%][r=0KiB/s,w=56.8MiB/s][r=0,w=14.5k IOPS][eta 01m:14s]
Jobs: 1 (f=1): [w(1)][40.5%][r=0KiB/s,w=16.3MiB/s][r=0,w=4182 IOPS][eta 01m:12s]
Jobs: 1 (f=1): [w(1)][42.1%][r=0KiB/s,w=15.6MiB/s][r=0,w=3990 IOPS][eta 01m:10s]
Jobs: 1 (f=1): [w(1)][43.8%][r=0KiB/s,w=16.1MiB/s][r=0,w=4114 IOPS][eta 01m:08s]
Jobs: 1 (f=1): [w(1)][45.5%][r=0KiB/s,w=11.1MiB/s][r=0,w=2853 IOPS][eta 01m:06s]
Jobs: 1 (f=1): [w(1)][47.1%][r=0KiB/s,w=9793KiB/s][r=0,w=2448 IOPS][eta 01m:04s]
Jobs: 1 (f=1): [w(1)][48.8%][r=0KiB/s,w=55.1MiB/s][r=0,w=14.1k IOPS][eta 01m:02s]



Side note:

# fio --filename=/dev/rbd0 --direct=1 --rw=randwrite --bs=4k --ioengine=rados --iodepth=128 --numjobs=1 --time_based --group_reporting --name=iops-test-job --runtime=120 --eta-newline=1
iops-test-job: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=rbd, iodepth=128
fio-3.7
Starting 1 process
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_S_construct null not valid
Aborted (core dumped)

It would be nice if it did a more user-friendly argument check and said "you need to specify a pool" instead of core dumping.
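
(Presumably the working form needs a pool rather than a filename -- something like the following, reusing the pool from the rbd run above:

    fio --ioengine=rados --pool=testpool --direct=1 --rw=randwrite --bs=4k --iodepth=128 --numjobs=1 --time_based --group_reporting --name=iops-test-job --runtime=120 --eta-newline=1
)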





----- Original Message -----
From: "Jason Dillaman" <jdillama@xxxxxxxxxx>
To: "Philip Brown" <pbrown@xxxxxxxxxx>
Cc: "ceph-users" <ceph-users@xxxxxxx>
Sent: Tuesday, December 15, 2020 9:41:12 AM
Subject: Re: Re: performance degradation every 30 seconds

On Tue, Dec 15, 2020 at 12:24 PM Philip Brown <pbrown@xxxxxxxxxx> wrote:
>
> It won't be on the same node...
> but since, as you saw, the problem still shows up with iodepth=32... it seems we're still in the same problem ballpark.
> Also, there may be 100 client machines, but each client can have anywhere between 1 and 30 threads running at a time.
>
> as far as fio using the rados engine as you suggested...
> Wouldn't that bypass /dev/rbd?
>
> That would negate the whole point of benchmarking. We can't use direct rados for our actual application.
> We need to benchmark the performance of the end-to-end system through /dev/rbd

Yup, that's the goal -- to better isolate whether this is a client-side
or a server-side issue. You could also use "ioengine=rbd" to the same
effect.

> We specifically want to use rbds, because that's how our clients will be accessing it.
>
> New information:
>
> When I drop the iodepth down to 16, the problem still happens, but not at 30 seconds.
> With a high iodepth it's more dependably around 30 seconds, but with iodepth=16 I've seen times between 50 and 60 seconds.
> And then the second hit is unevenly spaced. On this run it took 100 more seconds.
>
>
>
> # fio --filename=/dev/rbd0 --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=16 --numjobs=1 --time_based --group_reporting --name=iops-test-job --runtime=240 --eta-newline=1
> iops-test-job: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
> fio-3.7
> Starting 1 process
> fio: file /dev/rbd0 exceeds 32-bit tausworthe random generator.
> fio: Switching to tausworthe64. Use the random_generator= option to get rid of this warning.
> Jobs: 1 (f=1): [m(1)][1.2%][r=16.6MiB/s,w=16.8MiB/s][r=4237,w=4301 IOPS][eta 03m:58s]
> Jobs: 1 (f=1): [m(1)][2.1%][r=17.4MiB/s,w=17.5MiB/s][r=4452,w=4471 IOPS][eta 03m:56s]
> Jobs: 1 (f=1): [m(1)][2.9%][r=19.2MiB/s,w=18.8MiB/s][r=4925,w=4810 IOPS][eta 03m:54s]
> Jobs: 1 (f=1): [m(1)][3.7%][r=18.8MiB/s,w=19.1MiB/s][r=4822,w=4886 IOPS][eta 03m:52s]
> Jobs: 1 (f=1): [m(1)][4.6%][r=21.6MiB/s,w=20.8MiB/s][r=5537,w=5318 IOPS][eta 03m:50s]
> Jobs: 1 (f=1): [m(1)][5.4%][r=22.2MiB/s,w=22.2MiB/s][r=5691,w=5695 IOPS][eta 03m:48s]
> Jobs: 1 (f=1): [m(1)][6.2%][r=21.4MiB/s,w=20.0MiB/s][r=5474,w=5366 IOPS][eta 03m:46s]
> Jobs: 1 (f=1): [m(1)][7.1%][r=22.4MiB/s,w=22.7MiB/s][r=5722,w=5819 IOPS][eta 03m:44s]
> Jobs: 1 (f=1): [m(1)][7.9%][r=21.2MiB/s,w=21.4MiB/s][r=5423,w=5491 IOPS][eta 03m:42s]
> Jobs: 1 (f=1): [m(1)][8.7%][r=21.5MiB/s,w=21.9MiB/s][r=5502,w=5603 IOPS][eta 03m:40s]
> Jobs: 1 (f=1): [m(1)][9.5%][r=23.3MiB/s,w=22.9MiB/s][r=5958,w=5851 IOPS][eta 03m:38s]
> Jobs: 1 (f=1): [m(1)][10.4%][r=22.6MiB/s,w=22.9MiB/s][r=5790,w=5853 IOPS][eta 03m:36s]
> Jobs: 1 (f=1): [m(1)][11.2%][r=23.3MiB/s,w=23.6MiB/s][r=5964,w=6035 IOPS][eta 03m:34s]
> Jobs: 1 (f=1): [m(1)][12.0%][r=20.6MiB/s,w=20.5MiB/s][r=5269,w=5243 IOPS][eta 03m:32s]
> Jobs: 1 (f=1): [m(1)][12.9%][r=21.1MiB/s,w=20.9MiB/s][r=5405,w=5344 IOPS][eta 03m:30s]
> Jobs: 1 (f=1): [m(1)][13.7%][r=21.1MiB/s,w=20.6MiB/s][r=5397,w=5273 IOPS][eta 03m:28s]
> Jobs: 1 (f=1): [m(1)][14.5%][r=22.2MiB/s,w=21.7MiB/s][r=5683,w=5544 IOPS][eta 03m:26s]
> Jobs: 1 (f=1): [m(1)][15.4%][r=21.1MiB/s,w=21.6MiB/s][r=5392,w=5525 IOPS][eta 03m:24s]
> Jobs: 1 (f=1): [m(1)][16.2%][r=22.2MiB/s,w=22.6MiB/s][r=5688,w=5789 IOPS][eta 03m:22s]
> Jobs: 1 (f=1): [m(1)][17.0%][r=22.1MiB/s,w=21.0MiB/s][r=5667,w=5630 IOPS][eta 03m:20s]
> Jobs: 1 (f=1): [m(1)][17.8%][r=20.6MiB/s,w=21.1MiB/s][r=5275,w=5405 IOPS][eta 03m:18s]
> Jobs: 1 (f=1): [m(1)][18.7%][r=22.6MiB/s,w=22.5MiB/s][r=5781,w=5754 IOPS][eta 03m:16s]
> Jobs: 1 (f=1): [m(1)][19.5%][r=22.0MiB/s,w=22.1MiB/s][r=5644,w=5654 IOPS][eta 03m:14s]
> Jobs: 1 (f=1): [m(1)][20.3%][r=21.4MiB/s,w=22.0MiB/s][r=5485,w=5642 IOPS][eta 03m:12s]
> Jobs: 1 (f=1): [m(1)][21.2%][r=21.8MiB/s,w=22.3MiB/s][r=5588,w=5713 IOPS][eta 03m:10s]
> Jobs: 1 (f=1): [m(1)][22.0%][r=24.1MiB/s,w=23.6MiB/s][r=6162,w=6030 IOPS][eta 03m:08s]
> Jobs: 1 (f=1): [m(1)][22.8%][r=23.2MiB/s,w=22.2MiB/s][r=5943,w=5676 IOPS][eta 03m:06s]
> Jobs: 1 (f=1): [m(1)][23.7%][r=23.4MiB/s,w=22.8MiB/s][r=5980,w=5848 IOPS][eta 03m:04s]
> Jobs: 1 (f=1): [m(1)][24.5%][r=22.8MiB/s,w=22.3MiB/s][r=5844,w=5719 IOPS][eta 03m:02s]
> Jobs: 1 (f=1): [m(1)][25.3%][r=23.6MiB/s,w=22.9MiB/s][r=6038,w=5865 IOPS][eta 03m:00s]
> Jobs: 1 (f=1): [m(1)][26.1%][r=22.7MiB/s,w=22.9MiB/s][r=5809,w=5861 IOPS][eta 02m:58s]
> Jobs: 1 (f=1): [m(1)][27.0%][r=14.3MiB/s,w=14.2MiB/s][r=3662,w=3644 IOPS][eta 02m:56s]
> Jobs: 1 (f=1): [m(1)][27.8%][r=8784KiB/s,w=8436KiB/s][r=2196,w=2109 IOPS][eta 02m:54s]
> Jobs: 1 (f=1): [m(1)][28.6%][r=2962KiB/s,w=3191KiB/s][r=740,w=797 IOPS][eta 02m:52s]     ****
> Jobs: 1 (f=1): [m(1)][29.5%][r=5532KiB/s,w=6072KiB/s][r=1383,w=1518 IOPS][eta 02m:50s]
> Jobs: 1 (f=1): [m(1)][30.3%][r=13.5MiB/s,w=14.0MiB/s][r=3461,w=3586 IOPS][eta 02m:48s]
>
>
>    .. SKIP .. ***
>
> Jobs: 1 (f=1): [m(1)][70.1%][r=14.1MiB/s,w=13.6MiB/s][r=3598,w=3479 IOPS][eta 01m:12s]
> Jobs: 1 (f=1): [m(1)][71.0%][r=6588KiB/s,w=6656KiB/s][r=1647,w=1664 IOPS][eta 01m:10s]
> Jobs: 1 (f=1): [m(1)][71.8%][r=3192KiB/s,w=2892KiB/s][r=798,w=723 IOPS][eta 01m:08s]
> Jobs: 1 (f=1): [m(1)][72.6%][r=3296KiB/s,w=3176KiB/s][r=824,w=794 IOPS][eta 01m:06s]
> Jobs: 1 (f=1): [m(1)][73.4%][r=2640KiB/s,w=2644KiB/s][r=660,w=661 IOPS][eta 01m:04s]
> Jobs: 1 (f=1): [m(1)][74.3%][r=1792KiB/s,w=2008KiB/s][r=448,w=502 IOPS][eta 01m:02s]
> Jobs: 1 (f=1): [m(1)][75.1%][r=12.9MiB/s,w=13.1MiB/s][r=3291,w=3351 IOPS][eta 01m:00s]
> Jobs: 1 (f=1): [m(1)][75.9%][r=14.9MiB/s,w=15.0MiB/s][r=3819,w=3844 IOPS][eta 00m:58s]
>


-- 
Jason
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


