Ming Zhang wrote:
>
> On Fri, 2007-10-19 at 16:30 +0200, BERTRAND Joël wrote:
> > Ming Zhang wrote:
> > > On Fri, 2007-10-19 at 09:48 +0200, BERTRAND Joël wrote:
> > >> Ross S. W. Walker wrote:
> > >>> BERTRAND Joël wrote:
> > >>>> BERTRAND Joël wrote:
> > >>>>> I can format a 1.5 TB volume (mkfs.ext3) several times over iSCSI
> > >>>>> without any trouble. I can read and write on this virtual disk
> > >>>>> without any trouble.
> > >>>>>
> > >>>>> Now, I have configured ietd with:
> > >>>>>
> > >>>>> Lun 0 Sectors=1464725758,Type=nullio
> > >>>>>
> > >>>>> and I run on the initiator side:
> > >>>>>
> > >>>>> Root gershwin:[/dev] > dd if=/dev/zero of=/dev/sdj bs=8192
> > >>>>> 479482+0 records in
> > >>>>> 479482+0 records out
> > >>>>> 3927916544 bytes (3.9 GB) copied, 153.222 seconds, 25.6 MB/s
> > >>>>>
> > >>>>> Root gershwin:[/dev] > dd if=/dev/zero of=/dev/sdj bs=8192
> > >>>>>
> > >>>>> I'm waiting for a crash. None so far as I write these lines.
> > >>>>> I suspect an interaction between raid and iscsi.
> > >>>> I simultaneously ran:
> > >>>>
> > >>>> Root gershwin:[/dev] > dd if=/dev/zero of=/dev/sdj bs=8192
> > >>>> 8397210+0 records in
> > >>>> 8397210+0 records out
> > >>>> 68789944320 bytes (69 GB) copied, 2732.55 seconds, 25.2 MB/s
> > >>>>
> > >>>> and
> > >>>>
> > >>>> Root gershwin:[~] > dd if=/dev/sdj of=/dev/null bs=8192
> > >>>> 739200+0 records in
> > >>>> 739199+0 records out
> > >>>> 6055518208 bytes (6.1 GB) copied, 447.178 seconds, 13.5 MB/s
> > >>>>
> > >>>> without any trouble.
> > >>> The speed can definitely be improved. Look at your network setup
> > >>> and use ping to try and get the network latency to a minimum.
> > >>>
> > >>> # ping -A -s 8192 172.16.24.140
> > >>> ....
> > >>> --- 172.16.24.140 ping statistics ---
> > >>> 14058 packets transmitted, 14057 received, 0% packet loss, time 9988ms
> > >>> rtt min/avg/max/mdev = 0.234/0.268/2.084/0.041 ms, ipg/ewma 0.710/0.260 ms
> > >> gershwin:[~] > ping -A -s 8192 192.168.0.2
> > >> PING 192.168.0.2 (192.168.0.2) 8192(8220) bytes of data.
> > >> 8200 bytes from 192.168.0.2: icmp_seq=1 ttl=64 time=0.693 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=2 ttl=64 time=0.595 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=3 ttl=64 time=0.583 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=4 ttl=64 time=0.589 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=5 ttl=64 time=0.580 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=6 ttl=64 time=0.594 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=7 ttl=64 time=0.580 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=8 ttl=64 time=0.592 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=9 ttl=64 time=0.589 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=10 ttl=64 time=0.571 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=11 ttl=64 time=0.588 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=12 ttl=64 time=0.580 ms
> > >> 8200 bytes from 192.168.0.2: icmp_seq=13 ttl=64 time=0.587 ms
> > >>
> > >> --- 192.168.0.2 ping statistics ---
> > >> 13 packets transmitted, 13 received, 0% packet loss, time 2400ms
> > >> rtt min/avg/max/mdev = 0.571/0.593/0.693/0.044 ms, ipg/ewma 200.022/0.607 ms
> > >> gershwin:[~] >
> > >>
> > >> Both initiator and target are alone on a gigabit NIC (Tigon3). On the
> > >> target server, istd1 takes 100% of a CPU (and only one CPU, even though
> > >> my T1000 can run 32 threads simultaneously). I think the limitation
> > >> comes from istd1.
> > >
> > > usually istdx will not take 100% cpu with 1G network, especially when
> > > using disk as back storage, some kind of profiling work might be helpful
> > > to tell what happened...
> > >
> > > forgot to ask, your sparc64 platform cpu spec.
> >
> > Root gershwin:[/mnt/solaris] > cat /proc/cpuinfo
> > cpu : UltraSparc T1 (Niagara)
> > fpu : UltraSparc T1 integrated FPU
> > prom : OBP 4.23.4 2006/08/04 20:45
> > type : sun4v
> > ncpus probed : 24
> > ncpus active : 24
> > D$ parity tl1 : 0
> > I$ parity tl1 : 0
> >
> > Both servers are built with 1 GHz T1 processors (6 cores, 24 threads).
> >
>
> as Ross pointed out, many io pattern only have 1 outstanding io at any
> time, so there is only one work thread actively to serve it. so it can
> not exploit the multiple core here.
>
>
> you see 100% at nullio or fileio? with disk, most time should spend on
> iowait and cpu utilization should not high at all.

Maybe it has to do with the endianness fix? Look at where the fix was
implemented and whether there is a simpler way to implement it (if that
is the cause).

The network is still slower than expected. I don't know what chipset the
Sparcs use for their interfaces; if it is e1000, then you can set
low-latency interrupt throttling with InterruptThrottleRate=1, which
works well. You can also explore other interface module options around
interrupt throttling or coalescing.

-Ross

-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
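
As a rough illustration of the "one outstanding io" point made in the thread, the sketch below compares the dd rates quoted above with the ceiling you would get if every 8192-byte block had to wait for one full network round trip. This is only a back-of-the-envelope estimate: it assumes the ping round-trip time is a fair stand-in for the per-request iSCSI latency, which nobody in the thread measured, and the function names are made up for this example.

```python
"""Back-of-the-envelope check: single outstanding I/O vs. measured dd rates."""

BLOCK_BYTES = 8192        # dd bs=8192, as used in the thread
RTT_JOEL_S = 0.593e-3     # average rtt from the 192.168.0.2 ping output
RTT_ROSS_S = 0.268e-3     # average rtt from the 172.16.24.140 ping output


def latency_bound_throughput(block_bytes, rtt_seconds):
    """Bytes/s ceiling if each block waits one full round trip.

    Assumption (not from the thread): ping rtt approximates the latency
    of one iSCSI request, and only one request is in flight at a time.
    """
    return block_bytes / rtt_seconds


def observed_throughput(total_bytes, seconds):
    """Bytes/s computed from the byte count and elapsed time dd reports."""
    return total_bytes / seconds


def mb_per_s(bytes_per_second):
    """Convert to decimal megabytes per second, the unit dd prints."""
    return bytes_per_second / 1e6


if __name__ == "__main__":
    read_rate = observed_throughput(6055518208, 447.178)
    write_rate = observed_throughput(68789944320, 2732.55)
    bound_joel = latency_bound_throughput(BLOCK_BYTES, RTT_JOEL_S)
    bound_ross = latency_bound_throughput(BLOCK_BYTES, RTT_ROSS_S)

    print(f"observed read : {mb_per_s(read_rate):.1f} MB/s")
    print(f"observed write: {mb_per_s(write_rate):.1f} MB/s")
    print(f"ceiling at 0.593 ms rtt: {mb_per_s(bound_joel):.1f} MB/s")
    print(f"ceiling at 0.268 ms rtt: {mb_per_s(bound_ross):.1f} MB/s")
```

Under these assumptions the read rate sits close to the one-request-per-round-trip ceiling for the 0.593 ms link, which would be consistent with latency, and not bandwidth, being the limit for reads. The write rate is higher than that ceiling, so the simple model clearly does not describe writes; it may be that more than one write is in flight, but the thread does not confirm this.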