Here you go:-

Erasure Profile
k=2
m=1
plugin=jerasure
ruleset-failure-domain=osd
ruleset-root=hdd
technique=reed_sol_van

Cache Settings
hit_set_type: bloom
hit_set_period: 3600
hit_set_count: 1
target_max_objects: 0
target_max_bytes: 1000000000
cache_target_dirty_ratio: 0.4
cache_target_full_ratio: 0.8
cache_min_flush_age: 0
cache_min_evict_age: 0

Crush Dump
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3

# types
type 0 osd
type 1 host
type 2 chassis
type 3 rack
type 4 row
type 5 pdu
type 6 pod
type 7 room
type 8 datacenter
type 9 region
type 10 root

# buckets
host ceph-test-hdd {
        id -5           # do not change unnecessarily
        # weight 2.730
        alg straw
        hash 0  # rjenkins1
        item osd.1 weight 0.910
        item osd.2 weight 0.910
        item osd.0 weight 0.910
}
root hdd {
        id -3           # do not change unnecessarily
        # weight 2.730
        alg straw
        hash 0  # rjenkins1
        item ceph-test-hdd weight 2.730
}
host ceph-test-ssd {
        id -6           # do not change unnecessarily
        # weight 1.000
        alg straw
        hash 0  # rjenkins1
        item osd.3 weight 1.000
}
root ssd {
        id -4           # do not change unnecessarily
        # weight 1.000
        alg straw
        hash 0  # rjenkins1
        item ceph-test-ssd weight 1.000
}

# rules
rule hdd {
        ruleset 0
        type replicated
        min_size 0
        max_size 10
        step take hdd
        step chooseleaf firstn 0 type osd
        step emit
}
rule ssd {
        ruleset 1
        type replicated
        min_size 0
        max_size 4
        step take ssd
        step chooseleaf firstn 0 type osd
        step emit
}
rule ecpool {
        ruleset 2
        type erasure
        min_size 3
        max_size 20
        step set_chooseleaf_tries 5
        step take hdd
        step chooseleaf indep 0 type osd
        step emit
}
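For anyone trying to reproduce this, roughly the commands that would build an equivalent setup. Pool names, PG counts and the profile name below are placeholders rather than the exact ones used here; only the profile values and cache settings are taken from the dump above:

# EC profile and backing pool on the hdd root (values match the profile above)
ceph osd erasure-code-profile set ecprofile k=2 m=1 plugin=jerasure technique=reed_sol_van ruleset-failure-domain=osd ruleset-root=hdd
ceph osd pool create ecpool 128 128 erasure ecprofile

# replicated cache pool on the ssd crush rule, attached as a writeback tier
ceph osd pool create cachepool 128 128 replicated ssd
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool

# cache tuning, matching the settings listed above
ceph osd pool set cachepool hit_set_type bloom
ceph osd pool set cachepool hit_set_period 3600
ceph osd pool set cachepool hit_set_count 1
ceph osd pool set cachepool target_max_bytes 1000000000
ceph osd pool set cachepool cache_target_dirty_ratio 0.4
ceph osd pool set cachepool cache_target_full_ratio 0.8

The crush map itself can be pulled, edited and reinjected the usual way with ceph osd getcrushmap, crushtool -d / -c and ceph osd setcrushmap.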
-----Original Message-----
From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of David Moreau Simard
Sent: 20 November 2014 20:03
To: Nick Fisk
Cc: ceph-users@xxxxxxxxxxxxxx
Subject: Re: Poor RBD performance as LIO iSCSI target

Nick,

Can you share more details on the configuration you are using? I'll try to duplicate those configurations in my environment and see what happens.

I'm mostly interested in:
- Erasure code profile (k, m, plugin, ruleset-failure-domain)
- Cache tiering pool configuration (ex: hit_set_type, hit_set_period, hit_set_count, target_max_objects, target_max_bytes, cache_target_dirty_ratio, cache_target_full_ratio, cache_min_flush_age, cache_min_evict_age)

The crush rulesets would also be helpful.

Thanks,
--
David Moreau Simard

> On Nov 20, 2014, at 12:43 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>
> Hi David,
>
> I've just finished running the 75GB fio test you posted a few days back on my new test cluster.
>
> The cluster is as follows:
>
> Single server with 3x HDD and 1 SSD
> Ubuntu 14.04 with the 3.16.7 kernel
> 2+1 EC pool on the HDDs below a 10G SSD cache pool. The SSD is also partitioned to provide journals for the HDDs.
> 150G RBD mapped locally
>
> The fio test seemed to run without any problems. I want to run a few more tests with different settings to see if I can reproduce your problem. I will let you know if I find anything.
>
> If there is anything you would like me to try, please let me know.
>
> Nick
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of David Moreau Simard
> Sent: 19 November 2014 10:48
> To: Ramakrishna Nishtala (rnishtal)
> Cc: ceph-users@xxxxxxxxxxxxxx; Nick Fisk
> Subject: Re: Poor RBD performance as LIO iSCSI target
>
> Rama,
>
> Thanks for your reply.
>
> My end goal is to use iSCSI (with LIO/targetcli) to export rbd block devices.
>
> I was encountering issues with iSCSI which are explained in my previous emails.
> I ended up being able to reproduce the problem at will on various kernel and OS combinations, even on raw RBD devices - thus ruling out the hypothesis that the problem was with iSCSI rather than with Ceph.
> I'm even running 0.88 now and the issue is still there.
>
> I haven't isolated the issue just yet.
> My next tests involve disabling the cache tiering.
>
> I do have client krbd cache as well; I'll try to disable it too if cache tiering isn't enough.
> --
> David Moreau Simard
>
>
>> On Nov 18, 2014, at 8:10 PM, Ramakrishna Nishtala (rnishtal) <rnishtal@xxxxxxxxx> wrote:
>>
>> Hi Dave,
>> Did you say iSCSI only? The tracker issue does not say so, though.
>> I am on Giant, with both the client and Ceph on RHEL 7, and it seems to work OK, unless I am missing something here. RBD on bare metal with kmod-rbd and caching disabled.
>>
>> [root@compute4 ~]# time fio --name=writefile --size=100G --filesize=100G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>> writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=200
>> fio-2.1.11
>> Starting 1 process
>> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/853.0MB/0KB /s] [0/853/0 iops] [eta 00m:00s]
>> ...
>> Disk stats (read/write):
>>   rbd0: ios=184/204800, merge=0/0, ticks=70/16164931, in_queue=16164942, util=99.98%
>>
>> real    1m56.175s
>> user    0m18.115s
>> sys     0m10.430s
>>
>> Regards,
>>
>> Rama
>>
>>
>> -----Original Message-----
>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of David Moreau Simard
>> Sent: Tuesday, November 18, 2014 3:49 PM
>> To: Nick Fisk
>> Cc: ceph-users@xxxxxxxxxxxxxx
>> Subject: Re: Poor RBD performance as LIO iSCSI target
>>
>> Testing without the cache tiering is the next test I want to do when I have time.
>>
>> When it's hanging, there is no activity at all on the cluster.
>> Nothing in "ceph -w", nothing in "ceph osd pool stats".
>>
>> I'll provide an update when I have a chance to test without tiering.
>> --
>> David Moreau Simard
>>
>>
>>> On Nov 18, 2014, at 3:28 PM, Nick Fisk <nick@xxxxxxxxxx> wrote:
>>>
>>> Hi David,
>>>
>>> Have you tried on a normal replicated pool with no cache? I've seen a number of threads recently where caching is causing various things to block/hang.
>>> It would be interesting to see if this still happens without the caching layer; at least it would rule it out.
>>>
>>> Also, is there any sign that as the test passes ~50GB the cache might start flushing to the backing pool, causing slow performance?
>>>
>>> I am planning a deployment very similar to yours, so I am following this with great interest. I'm hoping to build a single-node test "cluster" shortly, so I might be in a position to work with you on this issue and hopefully get it resolved.
>>>
>>> Nick
>>>
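For reference, one way to take a writeback tier out of the picture for a test like this is roughly the following, using the placeholder pool names cachepool/ecpool from the sketch further up rather than the real ones:

# stop new writes landing in the cache and flush/evict whatever is dirty
ceph osd tier cache-mode cachepool forward
rados -p cachepool cache-flush-evict-all

# detach the tier so I/O goes straight to the backing pool
ceph osd tier remove-overlay ecpool
ceph osd tier remove ecpool cachepool

The flush/evict step can take a while if the tier is holding a lot of dirty data.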
>>> -----Original Message-----
>>> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of David Moreau Simard
>>> Sent: 18 November 2014 19:58
>>> To: Mike Christie
>>> Cc: ceph-users@xxxxxxxxxxxxxx; Christopher Spearman
>>> Subject: Re: Poor RBD performance as LIO iSCSI target
>>>
>>> Thanks guys. I looked at http://tracker.ceph.com/issues/8818 and chatted with "dis" on #ceph-devel.
>>>
>>> I ran a LOT of tests on a LOT of kernel combinations (sometimes with tunables legacy). I haven't found a magical combination in which the following test does not hang:
>>> fio --name=writefile --size=100G --filesize=100G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>>>
>>> Either directly on a mapped rbd device, on a mounted filesystem (over rbd), or exported through iSCSI - nothing.
>>> I guess that rules out a potential issue with iSCSI overhead.
>>>
>>> Now, something I noticed out of pure luck is that I am unable to reproduce the issue if I drop the size of the test to 50GB. Tests will complete in under 2 minutes.
>>> 75GB will hang right at the end and take more than 10 minutes.
>>>
>>> TL;DR of tests:
>>> - 3x fio --name=writefile --size=50G --filesize=50G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>>> -- 1m44s, 1m49s, 1m40s
>>>
>>> - 3x fio --name=writefile --size=75G --filesize=75G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>>> -- 10m12s, 10m11s, 10m13s
>>>
>>> Details of the tests here: http://pastebin.com/raw.php?i=3v9wMtYP
>>>
>>> Does that ring a bell for you guys?
>>>
>>> --
>>> David Moreau Simard
>>>
>>>
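One way to check whether the 50GB/75GB threshold lines up with the cache tier starting to flush is simply to watch the pools while fio runs, for example (again with a placeholder cache pool name):

# per-pool client and recovery I/O; flush traffic shows up against the backing pool
watch -n 1 'ceph osd pool stats'

# how full the cache pool is getting relative to its target_max_bytes/target_max_objects
watch -n 1 'ceph df'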
>>>> On Nov 13, 2014, at 3:31 PM, Mike Christie <mchristi@xxxxxxxxxx> wrote:
>>>>
>>>> On 11/13/2014 10:17 AM, David Moreau Simard wrote:
>>>>> Running into weird issues here as well in a test environment. I don't have a solution either, but perhaps we can find some things in common...
>>>>>
>>>>> Setup in a nutshell:
>>>>> - Ceph cluster: Ubuntu 14.04, kernel 3.16.7, Ceph 0.87-1 (OSDs with separate public/cluster networks in 10 Gbps)
>>>>> - iSCSI proxy node (targetcli/LIO): Ubuntu 14.04, kernel 3.16.7, Ceph 0.87-1 (10 Gbps)
>>>>> - Client node: Ubuntu 12.04, kernel 3.11 (10 Gbps)
>>>>>
>>>>> Relevant cluster config: writeback cache tiering with NVMe PCI-E cards (2 replicas) in front of an erasure coded pool (k=3, m=2) backed by spindles.
>>>>>
>>>>> I'm following the instructions here:
>>>>> http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-images-san-storage-devices
>>>>> No issues with creating and mapping a 100GB RBD image and then creating the target.
>>>>>
>>>>> I'm interested in finding out the overhead/performance impact of re-exporting through iSCSI, so the idea is to run benchmarks.
>>>>> Here's a fio test I'm trying to run on the client node on the mounted iSCSI device:
>>>>> fio --name=writefile --size=100G --filesize=100G --filename=/dev/sdu --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
>>>>>
>>>>> The benchmark will eventually hang towards the end of the test for several long seconds before completing.
>>>>> On the proxy node, the kernel complains with iSCSI portal login timeouts: http://pastebin.com/Q49UnTPr and I also see irqbalance errors in syslog: http://pastebin.com/AiRTWDwR
>>>>>
>>>>
>>>> You are hitting a different issue. German Anders is most likely correct and you hit the rbd hang. That then caused the iSCSI/SCSI command to time out, which caused the SCSI error handler to run. In your logs we see that the LIO error handler received a task abort from the initiator, and that timed out, which caused the escalation (the iSCSI portal login related messages).
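A closing note on the initiator side: if the backing store can stall for tens of seconds while a cache tier flushes, lengthening the SCSI command timeout and the open-iscsi recovery timeouts can keep a slow flush from escalating into a task abort and session recovery. It does not fix the underlying rbd hang. A sketch, assuming the exported LUN shows up as /dev/sdu on the client as in the test above, and with example values only:

# raise the per-command SCSI timeout (in seconds) for the iSCSI-attached disk
echo 300 > /sys/block/sdu/device/timeout

# /etc/iscsi/iscsid.conf (open-iscsi): give error recovery more headroom
node.session.timeo.replacement_timeout = 180
node.conn[0].timeo.noop_out_interval = 10
node.conn[0].timeo.noop_out_timeout = 30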