First, I use lio-utils instead of targetcli, as this is an embedded box
with very limited Python packages built in.

On Wed, Oct 2, 2013 at 5:26 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Wed, 2013-10-02 at 14:07 -0500, Xianghua Xiao wrote:
>> after I changed default_cmdsn_depth to 64, when I use Iometer to do READ
>> only core0 is busy; for WRITE, all cores (12 of them) are equally busy.
>>
>
> Have you been able to isolate the issue down to per session
> performance..?  What happens when the same MD RAID backend is accessed
> across multiple sessions via a different TargetName+TargetPortalGroupTag
> endpoint..?  Does the performance stay the same..?
>
> Also, it would be useful to confirm with a rd_mcp backend to determine
> if it's something related to the fabric (eg: iscsi) or something related
> to the backend itself.
>

I have 12 RAID5 arrays built from 4 SSDs (each SSD has 8 partitions).
Only the first two commands of each key step are shown here:

tcm_node --block iblock_0/my_iblock0 /dev/md0
tcm_node --block iblock_1/my_iblock1 /dev/md1
...
lio_node --addlun iscsi-test0 1 0 lun_my_block iblock_0/my_iblock0
lio_node --addlun iscsi-test1 1 0 lun_my_block iblock_1/my_iblock1
...
lio_node --addnp iscsi-test0 1 172.16.0.1:3260
lio_node --addnp iscsi-test1 1 172.16.0.1:3260
...
lio_node --enabletpg iscsi-test0 1
lio_node --enabletpg iscsi-test1 1
...

After this, the Windows machine sees 12 new disk drives, which I format
as NTFS.

>> I created 12 targets (each has one LUN) for the 12 cores in this case;
>> still, the performance for both READ and WRITE is about 1/3 of what I
>> got with SCST in the past.
>>
>
> Can you send along your rtsadmin/targetcli configuration output in order
> to get an idea of the setup..?  Also, any other information about the
> backend configuration + hardware would be useful as well.
>
> Also, can you give some specifics on the workload in question..?
>

The workload is generated by Iometer: I created a 64KB 100% sequential
WRITE workload and a 128KB 100% sequential READ workload against all 12
iSCSI disks per worker. After that I duplicate the workers, to 4, 8, or
12, for example.

No matter what I try, the performance is roughly 1/3 of what I get with
SCST under similar settings (12 RAID5 iSCSI targets + Iometer). For
example, with SCST I can easily get wire speed (10Gbps) for READ, while
with LIO I can get at most 3.8Gbps.

For READ, core0 is 0% idle during the test while the remaining 11 cores
are about 80% idle each. For WRITE, all 12 cores are about 10% idle.
With SCST, by comparison, the load is always distributed nearly evenly
across all cores for both READ and WRITE under Iometer.

>> is LIO-iSCSI on 3.8.x 'best' for 10/100/1G networks only? Other than
>> the DEFAULT_CMDSN_DEPTH definition, what else could I tune for 10G/40G
>> iSCSI? Again, I am using the same scheduler, fifo_batch,
>> stripe_cache_size, read_ahead_kb, etc. parameters as I used with SCST;
>> the only major difference is LIO vs SCST itself.
>
> If you're on IB/RoCE/iWARP verbs-capable hardware, I'd very much
> recommend checking out the iser-target that is included in >= v3.10
> kernels.

I have to use 3.8.x for now, and am testing iSCSI/LIO at the moment,
before moving to FCoE soon.

Thanks!

>
> --nab
>
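P.S. For completeness, the per-TPG default_cmdsn_depth can also be set
at runtime through configfs rather than by editing the
DEFAULT_CMDSN_DEPTH definition; this is only a minimal sketch, assuming
the stock LIO configfs layout under /sys/kernel/config/target and the
iscsi-test* target names used above (verify the exact attribute path on
your kernel build):

# Sketch only: assumes the standard LIO iSCSI configfs layout and the
# iscsi-test0..iscsi-test11 targets created earlier in this mail.
# The new depth should only apply to sessions that log in after the
# change, so run this before enabling the TPGs / logging in initiators.
for i in $(seq 0 11); do
    echo 64 > /sys/kernel/config/target/iscsi/iscsi-test${i}/tpgt_1/attrib/default_cmdsn_depth
done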