Re: Terrible iSCSI tgt RBD performance

Thomas Foster <thomas.foster80@xxxxxxxxx> · Tue, 17 Mar 2015 18:08:20 -0400

Also, what are you getting locally on your filesystem? The specs for an 840 PRO list ~520MB/s, and based on the numbers you stated earlier you aren't getting close to that, so there might be a problem at the server. Once you start seeing better numbers locally, retry your iSCSI targets.
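If it helps, here is a minimal sketch for getting a rough local sequential-write figure to compare against the spec. The function name and the test path in the usage comment are made up for illustration, and dd or fio will give more trustworthy results. It writes incompressible data and includes the final fsync in the timing so the page cache does not inflate the number too much.

```python
import os
import time


def sequential_write_mb_s(path, total_bytes=256 * 1024 * 1024,
                          block_size=1024 * 1024):
    """Write total_bytes to path in block_size chunks and return MB/s.

    The timing includes the final fsync, so data still sitting in the
    page cache is flushed before the clock stops.
    """
    block = os.urandom(block_size)  # incompressible payload
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            chunk = min(block_size, total_bytes - written)
            written += f.write(block[:chunk])
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6  # decimal megabytes per second


# Hypothetical usage, with a path on the SSD-backed filesystem:
# print(sequential_write_mb_s("/mnt/ssd/throughput-test.bin"))
```

A result far below the ~520MB/s spec would point at the local storage stack (MD RAID1, LVM) rather than at tgt or RBD.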
On Mar 17, 2015 6:02 PM, "Nick Fisk" <nick@xxxxxxxxxx> wrote:
Hi Robin,

Just a few things to try:-

1. Increase the number of worker threads for tgt (it's a parameter of tgtd, so modify however it's being started)

2. Disable librbd caching in ceph.conf

3. Do you see the same performance problems exporting a krbd as a block device via tgt?
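
As a rough illustration of why the thread count in point 1 could matter, the sketch below applies Little's law to the random 4 KiB rbd bench-write run quoted below (io_threads 16, 757.74 ops/sec). It assumes the benchmark really kept 16 requests in flight and that per-request latency stays the same at lower depths, neither of which is confirmed here, and it uses write figures while the reported problem is with reads. The function names are illustrative only.

```python
def per_request_latency_s(ops_per_sec, outstanding):
    """Little's law: mean latency = requests in flight / throughput."""
    return outstanding / ops_per_sec


def iops_at_depth(latency_s, outstanding):
    """Throughput expected if latency_s holds with this many requests in flight."""
    return outstanding / latency_s


# Figures from the quoted run: 757.74 ops/sec with 16 I/O threads.
latency = per_request_latency_s(757.74, 16)
print(f"approx latency per request: {latency * 1000:.1f} ms")
for depth in (1, 4, 16):
    print(f"depth {depth:2d}: ~{iops_at_depth(latency, depth):.1f} IOPS")
```

Under those assumptions the latency works out to about 21 ms per request, so a single outstanding request would give roughly 47 IOPS, which is in the same range as the <50 IOPS reported over iSCSI. That would be consistent with tgt issuing very few requests to RBD at a time, but it is only an inference.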

Nick

> -----Original Message-----

> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of

> Robin H. Johnson

> Sent: 17 March 2015 18:25

> To: ceph-users@xxxxxxxxxxxxxx

> Subject:  Terrible iSCSI tgt RBD performance

>

> I'm trying to get better performance out of exporting RBD volumes via tgt for iSCSI consumers...

>

> By terrible, I mean I'm getting <5MB/sec reads, <50IOPS. I'm pretty sure neither RBD nor iSCSI themselves are the problem, as they individually perform well.

>

> iSCSI to RAM-backed: >60MB/sec, >500IOPS

> iSCSI to SSD-backed: >50MB/sec, >300IOPS

> iSCSI to RBD-backed: <5MB/sec, <50IOPS

>

> Cluster:

> 4 nodes (ceph1..4):

> - Supermicro 6027TR-D70RF+ (2U twin systems)

>   - Chassis A: ceph1, ceph2

>   - Chassis B: ceph3, ceph4

> - 2x E5-2650

> - 256GB RAM

> - 4x 4TB Seagate ST4000NM0023 SAS, dedicated to Ceph

> - 2x 512GB Samsung 840 PRO

>   - MD RAID1

>   - LVM

>   - LV: OS on 'root', 20GiB

>   - LV: Ceph Journals, 8GB, one per Ceph disk

> - 2x Bonded 1GbE network

> - 10GbE network:

>   - port1: to switch

>   - port2: direct-connect pairs: ceph1/3 ceph2/4 (vertical between chassis)

> - All 4 nodes run OSPF

>   - ceph1/2; ceph3/4: ~9.8Gbit bandwidth confirmed

>   - ceph1/3; ceph2/4: ~18.2Gbit bandwidth confirmed

> - The nodes also co-house VMs with Ganeti, backed onto the SSDs w/ DRBD;

> - S3 is the main Ceph use-case, and it works well from the VMs.

>

> Direct performance on the nodes is reasonably good, but it would be nice if the random performance were better.

>

> # rbd bench-write XXXXX

> bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...

> elapsed:    36  ops:   246603  ops/sec:  6681.20  bytes/sec: 29090920.91

> # rbd bench-write XXXXX

> bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...

> elapsed:    48  ops:   246585  ops/sec:  5070.70  bytes/sec: 22080207.55

> # rbd bench-write test.libraries.coop --io-pattern rand

> bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern rand ...

> elapsed:   324  ops:   246178  ops/sec:   757.74  bytes/sec: 3305000.99

> # rbd bench-write test.libraries.coop --io-threads 16 --io-pattern rand --io-size 32768

> bench-write  io_size 32768 io_threads 16 bytes 1073741824 pattern rand ...

> elapsed:    86  ops:    30141  ops/sec:   347.39  bytes/sec: 12375512.34

>

> Yes, I know the data below seems small; I have another, older cluster of data that I still have to merge onto this newer hardware.

>

> # ceph -w

>     cluster 401a58ef-5075-49ec-9615-1c2973624252

>      health HEALTH_WARN 6 pgs stuck unclean; recovery 8472/241829 objects degraded (3.503%); mds cluster is degraded; mds ceph1 is laggy

>      monmap e3: 3 mons at {ceph1=10.77.10.41:6789/0,ceph2=10.77.10.42:6789/0,ceph4=10.77.10.44:6789/0}, election epoch 11486, quorum 0,1,2 ceph1,ceph2,ceph4

>      mdsmap e1496661: 1/1/1 up {0=ceph1=up:replay(laggy or crashed)}

>      osdmap e4323895: 16 osds: 16 up, 16 in

>       pgmap v14695205: 481 pgs, 17 pools, 186 GB data, 60761 objects

>             1215 GB used, 58356 GB / 59571 GB avail

>             8472/241829 objects degraded (3.503%)

>                    6 active

>                  475 active+clean

>   client io 67503 B/s rd, 7297 B/s wr, 13 op/s

>

>

> TGT setups:

> Target 1: rbd.XXXXXXXXXXX

>     System information:

>         Driver: iscsi

>         State: ready

>     I_T nexus information:

>         I_T nexus: 11

>             Initiator: iqn.1993-08.org.debian:01:6b14da6a48b6 alias: XXXXXXXXXXXXXXXX

>             Connection: 0

>                 IP Address: 10.77.110.6

>     LUN information:

>         LUN: 0

>             Type: controller

>             SCSI ID: IET     00010000

>             SCSI SN: beaf10

>             Size: 0 MB, Block size: 1

>             Online: Yes

>             Removable media: No

>             Prevent removal: No

>             Readonly: No

>             SWP: No

>             Thin-provisioning: No

>             Backing store type: null

>             Backing store path: None

>             Backing store flags:

>         LUN: 1

>             Type: disk

>             SCSI ID: IET     00010001

>             SCSI SN: beaf11

>             Size: 161061 MB, Block size: 512

>             Online: Yes

>             Removable media: No

>             Prevent removal: No

>             Readonly: No

>             SWP: No

>             Thin-provisioning: No

>             Backing store type: rbd

>             Backing store path: XXXXXXXXXXXXXXXXXXXXXXx

>             Backing store flags:

>     Account information:

>     ACL information:

>         XXXXXXXXXXXXXXXXXXXXXXXXXXXxx

>

> # tgtadm --lld iscsi --mode target --op show --tid 1

> MaxRecvDataSegmentLength=8192

> HeaderDigest=None

> DataDigest=None

> InitialR2T=Yes

> MaxOutstandingR2T=1

> ImmediateData=Yes

> FirstBurstLength=65536

> MaxBurstLength=262144

> DataPDUInOrder=Yes

> DataSequenceInOrder=Yes

> ErrorRecoveryLevel=0

> IFMarker=No

> OFMarker=No

> DefaultTime2Wait=2

> DefaultTime2Retain=20

> OFMarkInt=Reject

> IFMarkInt=Reject

> MaxConnections=1

> RDMAExtensions=Yes

> TargetRecvDataSegmentLength=262144

> InitiatorRecvDataSegmentLength=262144

> MaxOutstandingUnexpectedPDUs=0

> MaxXmitDataSegmentLength=8192

> MaxQueueCmd=128

>

>

> --

> Robin Hugh Johnson

> Gentoo Linux: Developer, Infrastructure Lead

> E-Mail     : robbat2@xxxxxxxxxx

> GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85

> _______________________________________________

> ceph-users mailing list

> ceph-users@xxxxxxxxxxxxxx

> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
