Re: Poor RBD performance as LIO iSCSI target

Hi David,

I've just finished running the 75GB fio test you posted a few days back on
my new test cluster.

The cluster is as follows:

Single server with 3x HDD and 1x SSD
Ubuntu 14.04 with the 3.16.7 kernel
2+1 EC pool on the HDDs beneath a 10G SSD cache pool; the SSD is also
partitioned to provide journals for the HDDs.
150G RBD mapped locally
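
For reference, this is roughly how a setup like that gets built (pool and
image names below are placeholders, not my exact config):

ceph osd erasure-code-profile set ec21 k=2 m=1 ruleset-failure-domain=osd
ceph osd pool create ecpool 64 64 erasure ec21
ceph osd pool create cachepool 64 64
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool
ceph osd pool set cachepool hit_set_type bloom
ceph osd pool set cachepool target_max_bytes 10737418240   # 10G cache
rbd create testrbd --size 153600 --pool ecpool             # 150G image
rbd map ecpool/testrbd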

The fio test seemed to run without any problems. I want to run a few more
tests with different settings to see if I can reproduce your problem. I will
let you know if I find anything.

If there is anything you would like me to try, please let me know.

Nick

-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
David Moreau Simard
Sent: 19 November 2014 10:48
To: Ramakrishna Nishtala (rnishtal)
Cc: ceph-users@xxxxxxxxxxxxxx; Nick Fisk
Subject: Re:  Poor RBD performance as LIO iSCSI target

Rama,

Thanks for your reply.

My end goal is to use iSCSI (with LIO/targetcli) to export rbd block
devices.
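
For what it's worth, the export itself is nothing exotic - roughly the
following via targetcli (the IQN and backstore name are placeholders):

targetcli /backstores/block create name=rbd0 dev=/dev/rbd0
targetcli /iscsi create iqn.2014-11.com.example:rbd0
targetcli /iscsi/iqn.2014-11.com.example:rbd0/tpg1/luns create /backstores/block/rbd0
targetcli /iscsi/iqn.2014-11.com.example:rbd0/tpg1/portals create 0.0.0.0 3260
targetcli saveconfig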

I was encountering issues with iSCSI, which are explained in my previous
emails.
I ended up being able to reproduce the problem at will on various kernel and
OS combinations, even on raw RBD devices - thus ruling out the hypothesis
that it was a problem with iSCSI; it looks like a problem with Ceph itself.
I'm even running 0.88 now and the issue is still there.

I haven't isolated the issue just yet.
My next tests involve disabling the cache tiering.

I do have the client krbd cache as well; I'll try to disable it too if
disabling cache tiering isn't enough.
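
Roughly the plan for that (pool names below are placeholders):

ceph osd tier cache-mode cachepool forward
rados -p cachepool cache-flush-evict-all
ceph osd tier remove-overlay ecpool
ceph osd tier remove ecpool cachepool

And for the client cache, in ceph.conf - note this is the librbd setting,
and the kernel client is a separate code path:

[client]
rbd cache = false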
--
David Moreau Simard


> On Nov 18, 2014, at 8:10 PM, Ramakrishna Nishtala (rnishtal)
> <rnishtal@xxxxxxxxx> wrote:
> 
> Hi Dave,
> Did you say iSCSI only? The tracker issue does not say, though.
> I am on Giant, with both the client and Ceph on RHEL 7, and it seems to
> work OK, unless I am missing something here. RBD on bare metal with
> kmod-rbd and caching disabled.
>  
> [root@compute4 ~]# time fio --name=writefile --size=100G 
> --filesize=100G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 
> --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
> --iodepth=200 --ioengine=libaio
> writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, 
> iodepth=200
> fio-2.1.11
> Starting 1 process
> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/853.0MB/0KB /s] [0/853/0 
> iops] [eta 00m:00s] ...
> Disk stats (read/write):
>   rbd0: ios=184/204800, merge=0/0, ticks=70/16164931, 
> in_queue=16164942, util=99.98%
>  
> real    1m56.175s
> user    0m18.115s
> sys     0m10.430s
>  
> Regards,
> 
> Rama
>  
>  
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf 
> Of David Moreau Simard
> Sent: Tuesday, November 18, 2014 3:49 PM
> To: Nick Fisk
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: Re:  Poor RBD performance as LIO iSCSI target
>  
> Testing without the cache tiering is the next test I want to do when I
> have time.
>  
> When it's hanging, there is no activity at all on the cluster.
> Nothing in "ceph -w", nothing in "ceph osd pool stats".
>  
> I'll provide an update when I have a chance to test without tiering.
> --
> David Moreau Simard
>  
>  
> > On Nov 18, 2014, at 3:28 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
> > 
> > Hi David,
> > 
> > Have you tried on a normal replicated pool with no cache? I've seen
> > a number of threads recently where caching is causing various things to
> > block/hang. It would be interesting to see if this still happens without
> > the caching layer; at least it would rule it out.
> > 
> > Also, is there any sign that as the test passes ~50GB the cache
> > starts flushing to the backing pool, causing slow performance?
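> >
> > If it is the flushing, the thresholds that control when the cache tier
> > starts writing back might be worth a look - something like this (the
> > pool name is a placeholder):
> >
> > ceph osd pool set cachepool cache_target_dirty_ratio 0.4
> > ceph osd pool set cachepool cache_target_full_ratio 0.8
> > ceph osd pool set cachepool target_max_bytes 107374182400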
> > 
> > I am planning a deployment very similar to yours so I am following 
> > this with great interest. I'm hoping to build a single node test 
> > "cluster" shortly, so I might be in a position to work with you on 
> > this issue and hopefully get it resolved.
> > 
> > Nick
> > 
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On 
> > Behalf Of David Moreau Simard
> > Sent: 18 November 2014 19:58
> > To: Mike Christie
> > Cc: ceph-users@xxxxxxxxxxxxxx; Christopher Spearman
> > Subject: Re:  Poor RBD performance as LIO iSCSI target
> > 
> > Thanks guys. I looked at http://tracker.ceph.com/issues/8818 and 
> > chatted with "dis" on #ceph-devel.
> > 
> > I ran a LOT of tests on a LOT of combinations of kernels (sometimes
> > with tunables legacy). I haven't found a magical combination in
> > which the following test does not hang:
> > fio --name=writefile --size=100G --filesize=100G 
> > --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0 
> > --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
> > --iodepth=200 --ioengine=libaio
> > 
> > Whether directly on a mapped rbd device, on a mounted filesystem
> > (over rbd), or exported through iSCSI... nothing.
> > I guess that rules out a potential issue with iSCSI overhead.
> > 
> > Now, something I noticed out of pure luck is that I am unable to 
> > reproduce the issue if I drop the size of the test to 50GB. Tests 
> > will complete in under 2 minutes.
> > 75GB will hang right at the end and take more than 10 minutes.
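> >
> > Next time it hangs I'll dump the blocked task stacks to see where it is
> > stuck (assuming sysrq is enabled; the stacks show up in dmesg):
> >
> > echo w > /proc/sysrq-trigger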
> > 
> > TL;DR of tests:
> > - 3x fio --name=writefile --size=50G --filesize=50G
> > --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0
> > --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
> > --iodepth=200 --ioengine=libaio
> > -- 1m44s, 1m49s, 1m40s
> > 
> > - 3x fio --name=writefile --size=75G --filesize=75G
> > --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0
> > --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
> > --iodepth=200 --ioengine=libaio
> > -- 10m12s, 10m11s, 10m13s
> > 
> > Details of tests here: http://pastebin.com/raw.php?i=3v9wMtYP
> > 
> > Does that ring a bell for you guys?
> > 
> > --
> > David Moreau Simard
> > 
> > 
> >> On Nov 13, 2014, at 3:31 PM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
> >> 
> >> On 11/13/2014 10:17 AM, David Moreau Simard wrote:
> >>> Running into weird issues here as well in a test environment. I don't
> >>> have a solution either, but perhaps we can find some things in common...
> >>> 
> >>> Setup in a nutshell:
> >>> - Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs 
> >>> with separate public/cluster network in 10 Gbps)
> >>> - iSCSI Proxy node (targetcli/LIO): Ubuntu 14.04, Kernel 3.16.7, 
> >>> Ceph
> >>> 0.87-1 (10 Gbps)
> >>> - Client node: Ubuntu 12.04, Kernel 3.11 (10 Gbps)
> >>> 
> >>> Relevant cluster config: writeback cache tiering with NVMe PCI-E
> >>> cards (2 replicas) in front of an erasure-coded pool (k=3,m=2) backed
> >>> by spindles.
> >>> 
> >>> I'm following the instructions here:
> >>> http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-images-san-storage-devices
> >>> No issues with creating and mapping a 100GB RBD image and then creating
> >>> the target.
> >>> 
> >>> I'm interested in finding out the overhead/performance impact of
> >>> re-exporting through iSCSI, so the idea is to run benchmarks.
> >>> Here's a fio test I'm trying to run on the client node on the mounted
> >>> iSCSI device:
> >>> fio --name=writefile --size=100G --filesize=100G 
> >>> --filename=/dev/sdu --bs=1M --nrfiles=1 --direct=1 --sync=0 
> >>> --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
> >>> --iodepth=200 --ioengine=libaio
> >>> 
> >>> The benchmark will eventually hang towards the end of the test for
> >>> several long seconds before completing.
> >>> On the proxy node, the kernel complains with iSCSI portal login
> >>> timeouts: http://pastebin.com/Q49UnTPr and I also see irqbalance
> >>> errors in syslog: http://pastebin.com/AiRTWDwR
> >>> 
> >> 
> >> You are hitting a different issue. German Anders is most likely
> >> correct: you hit the rbd hang. That then caused the iSCSI/SCSI
> >> command to time out, which caused the SCSI error handler to run. In
> >> your logs we see the LIO error handler received a task abort
> >> from the initiator, and that timed out, which caused the escalation
> >> (the iSCSI portal login related messages).
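> >>
> >> If you need the test to survive long enough to debug the underlying rbd
> >> hang, one possible stopgap (assuming the LUN still shows up as /dev/sdu
> >> on the initiator, as in your earlier test) is to raise the SCSI command
> >> timeout so the error handler kicks in later:
> >>
> >> echo 120 > /sys/block/sdu/device/timeout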
> > 

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com