Also, what are you getting locally on your filesystem? Looking at the specs for an 840 Pro (~520 MB/s), and based on the numbers you stated earlier, you aren't getting close to that, so there might be a problem at the server. Once you start seeing better numbers locally, retry your iSCSI targets.
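For a rough local baseline, a dd run along these lines would do. This is only a sketch: the function name, scratch path, and size are placeholders, not anything from your setup. conv=fsync is there so dd flushes to the device before it reports a rate; without it you mostly measure the page cache.

```shell
# Rough sequential-write check with dd (hypothetical helper).
# $1: scratch file on the filesystem under test (removed afterwards)
# $2: amount to write in MiB (defaults to 1024)
seq_write_test() {
    testfile="$1"
    count_mb="${2:-1024}"
    # conv=fsync flushes data to the device before dd reports its rate.
    # dd prints its summary on stderr; keep only the final summary line.
    dd if=/dev/zero of="$testfile" bs=1M count="$count_mb" conv=fsync 2>&1 | tail -n 1
    rm -f "$testfile"
}
```

Call it with a scratch file on the SSD-backed filesystem, e.g. seq_write_test /path/on/ssd/scratch.bin 1024, and compare the reported rate against the drive's spec.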
On Mar 17, 2015 6:02 PM, "Nick Fisk" <nick@xxxxxxxxxx> wrote:
Hi Robin,
Just a few things to try:-
1. Increase the number of worker threads for tgt (it's a parameter of tgtd,
so modify however it's being started)
2. Disable librbd caching in ceph.conf
3. Do you see the same performance problems exporting a krbd as a block
device via tgt?
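
As a sketch of item 2: the fragment below assumes tgtd picks up the default ceph.conf on the tgt host and that the [client] section applies to it. tgtd needs a restart afterwards so the image is reopened with the new setting. For item 1, the relevant tgtd option is -t / --nr_iothreads.

```ini
# ceph.conf on the host running tgtd
# (sketch; placing this under [client] is an assumption about your layout)
[client]
rbd cache = false
```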
Nick
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of
> Robin H. Johnson
> Sent: 17 March 2015 18:25
> To: ceph-users@xxxxxxxxxxxxxx
> Subject: Terrible iSCSI tgt RBD performance
>
> I'm trying to get better performance out of exporting RBD volumes via tgt for
> iSCSI consumers...
>
> By terrible, I mean I'm getting <5MB/sec reads, <50IOPS. I'm pretty sure neither RBD
> nor iSCSI themselves are the problem, as they individually perform well.
>
> iSCSI to RAM-backed: >60MB/sec, >500IOPS
> iSCSI to SSD-backed: >50MB/sec, >300IOPS
> iSCSI to RBD-backed: <5MB/sec, <50IOPS
>
> Cluster:
> 4 nodes (ceph1..4):
> - Supermicro 6027TR-D70RF+ (2U twin systems)
> - Chassis A: ceph1, ceph2
> - Chassis B: ceph3, ceph4
> - 2x E5-2650
> - 256GB RAM
> - 4x 4TB Seagate ST4000NM0023 SAS, dedicated to Ceph
> - 2x 512GB Samsung 840 PRO
> - MD RAID1
> - LVM
> - LV: OS on 'root', 20GiB
> - LV: Ceph Journals, 8GB, one per Ceph disk
> - 2x Bonded 1GbE network
> - 10GbE network:
> - port1: to switch
> - port2: direct-connect pairs: ceph1/3 ceph2/4 (vertical between chassis)
> - All 4 nodes run OSPF
> - ceph1/2; ceph3/4: ~9.8Gbit bandwidth confirmed
> - ceph1/3; ceph2/4: ~18.2Gbit bandwidth confirmed
> - The nodes also co-house VMs with Ganeti, backed onto the SSDs w/ DRBD;
> - S3 is the main Ceph use-case, and it works well from the VMs.
>
> Direct performance on the nodes is reasonably good, but it would be nice if
> the random performance were better.
>
> # rbd bench-write XXXXX
> bench-write io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...
> elapsed: 36 ops: 246603 ops/sec: 6681.20 bytes/sec: 29090920.91
> # rbd bench-write XXXXX
> bench-write io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...
> elapsed: 48 ops: 246585 ops/sec: 5070.70 bytes/sec: 22080207.55
> # rbd bench-write test.libraries.coop --io-pattern rand
> bench-write io_size 4096 io_threads 16 bytes 1073741824 pattern rand ...
> elapsed: 324 ops: 246178 ops/sec: 757.74 bytes/sec: 3305000.99
> # rbd bench-write test.libraries.coop --io-threads 16 --io-pattern rand --io-size 32768
> bench-write io_size 32768 io_threads 16 bytes 1073741824 pattern rand ...
> elapsed: 86 ops: 30141 ops/sec: 347.39 bytes/sec: 12375512.34
>
> Yes, I know the data below seems small; I have another older cluster of data
> that I still have to merge to this newer hardware.
>
> # ceph -w
> cluster 401a58ef-5075-49ec-9615-1c2973624252
> health HEALTH_WARN 6 pgs stuck unclean; recovery 8472/241829 objects degraded (3.503%); mds cluster is degraded; mds ceph1 is laggy
> monmap e3: 3 mons at {ceph1=10.77.10.41:6789/0,ceph2=10.77.10.42:6789/0,ceph4=10.77.10.44:6789/0}, election epoch 11486, quorum 0,1,2 ceph1,ceph2,ceph4
> mdsmap e1496661: 1/1/1 up {0=ceph1=up:replay(laggy or crashed)}
> osdmap e4323895: 16 osds: 16 up, 16 in
> pgmap v14695205: 481 pgs, 17 pools, 186 GB data, 60761 objects
> 1215 GB used, 58356 GB / 59571 GB avail
> 8472/241829 objects degraded (3.503%)
> 6 active
> 475 active+clean
> client io 67503 B/s rd, 7297 B/s wr, 13 op/s
>
>
> TGT setups:
> Target 1: rbd.XXXXXXXXXXX
> System information:
> Driver: iscsi
> State: ready
> I_T nexus information:
> I_T nexus: 11
> Initiator: iqn.1993-08.org.debian:01:6b14da6a48b6 alias: XXXXXXXXXXXXXXXX
> Connection: 0
> IP Address: 10.77.110.6
> LUN information:
> LUN: 0
> Type: controller
> SCSI ID: IET 00010000
> SCSI SN: beaf10
> Size: 0 MB, Block size: 1
> Online: Yes
> Removable media: No
> Prevent removal: No
> Readonly: No
> SWP: No
> Thin-provisioning: No
> Backing store type: null
> Backing store path: None
> Backing store flags:
> LUN: 1
> Type: disk
> SCSI ID: IET 00010001
> SCSI SN: beaf11
> Size: 161061 MB, Block size: 512
> Online: Yes
> Removable media: No
> Prevent removal: No
> Readonly: No
> SWP: No
> Thin-provisioning: No
> Backing store type: rbd
> Backing store path: XXXXXXXXXXXXXXXXXXXXXXx
> Backing store flags:
> Account information:
> ACL information:
> XXXXXXXXXXXXXXXXXXXXXXXXXXXxx
>
> # tgtadm --lld iscsi --mode target --op show --tid 1
> MaxRecvDataSegmentLength=8192
> HeaderDigest=None
> DataDigest=None
> InitialR2T=Yes
> MaxOutstandingR2T=1
> ImmediateData=Yes
> FirstBurstLength=65536
> MaxBurstLength=262144
> DataPDUInOrder=Yes
> DataSequenceInOrder=Yes
> ErrorRecoveryLevel=0
> IFMarker=No
> OFMarker=No
> DefaultTime2Wait=2
> DefaultTime2Retain=20
> OFMarkInt=Reject
> IFMarkInt=Reject
> MaxConnections=1
> RDMAExtensions=Yes
> TargetRecvDataSegmentLength=262144
> InitiatorRecvDataSegmentLength=262144
> MaxOutstandingUnexpectedPDUs=0
> MaxXmitDataSegmentLength=8192
> MaxQueueCmd=128
>
>
> --
> Robin Hugh Johnson
> Gentoo Linux: Developer, Infrastructure Lead
> E-Mail : robbat2@xxxxxxxxxx
> GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com