BERTRAND Joël wrote:
> BERTRAND Joël wrote:
>> BERTRAND Joël wrote:
>>> Bill Davidsen wrote:
>>>> Dan Williams wrote:
>>>>> On Fri, 2007-10-19 at 01:04 -0700, BERTRAND Joël wrote:
>>>>>
>>>>>> I have run some dd's (read and write in nullio) for 12 hours
>>>>>> between initiator and target without any disconnection, so the
>>>>>> iSCSI code seems to be robust. Both initiator and target are
>>>>>> alone on a single gigabit ethernet link (without any switch).
>>>>>> I'm investigating...
>>>>>
>>>>> Can you reproduce on 2.6.22?
>>>>>
>>>>> Also, I do not think this is the cause of your failure, but you
>>>>> have CONFIG_DMA_ENGINE=y in your config. Setting this to 'n' will
>>>>> compile out the unneeded checks for offload engines in
>>>>> async_memcpy and async_xor.
>>>>
>>>> Given that offload engines are far less tested code, I think this
>>>> is a very good thing to try!
>>>
>>> I'm trying without CONFIG_DMA_ENGINE=y. istd1 only uses 40% of one
>>> CPU while I rebuild my raid1 array. 1% of this array has now been
>>> resynchronized without any hang.
>>>
>>> Root gershwin:[/usr/scripts] > cat /proc/mdstat
>>> Personalities : [raid1] [raid6] [raid5] [raid4]
>>> md7 : active raid1 sdi1[2] md_d0p1[0]
>>>       1464725632 blocks [2/1] [U_]
>>>       [>....................]  recovery =  1.0% (15705536/1464725632)
>>>       finish=1103.9min speed=21875K/sec
>>
>> Same result...
>>
>> connection2:0: iscsi: detected conn error (1011)
>> session2: iscsi: session recovery timed out after 120 secs
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>> sd 4:0:0:0: scsi: Device offlined - not ready after error recovery
>
> Sorry for this last mail. I have found another problem, but I don't
> know whether this bug comes from the iscsi-target or from raid5
> itself. The iSCSI target is disconnected because the istd1 and
> md_d0_raid5 kernel threads each use 100% of a CPU!
>
> Tasks: 235 total,   6 running, 227 sleeping,   0 stopped,   2 zombie
> Cpu(s):  0.1%us, 12.5%sy,  0.0%ni, 87.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   4139032k total,   218424k used,  3920608k free,    10136k buffers
> Swap:  7815536k total,        0k used,  7815536k free,    64808k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  5824 root      15  -5     0    0    0 R  100  0.0  10:34.25 istd1
>  5599 root      15  -5     0    0    0 R  100  0.0   7:25.43 md_d0_raid5
>
> Regards,
>
> JKB

If you have two iSCSI sessions mirrored, then any failure along either
path will hose the setup. Also, having iSCSI and MD RAID fight over the
same resources in the kernel is a recipe for a race condition. How
about exploring MPIO and DRBD?

-Ross
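As an aside, the resync progress that Joël pasted from /proc/mdstat can be pulled out mechanically. The sketch below parses a sample copied from the thread; on a live system you would read /proc/mdstat itself. The parsing patterns are assumptions based on the mdstat format shown above, not on anything else in the thread.

```shell
#!/bin/sh
# Sketch: extract resync progress and speed from an mdstat-style block.
# The sample below is copied from the thread; replace the here-string
# with `cat /proc/mdstat` output on a real system.
mdstat_sample='md7 : active raid1 sdi1[2] md_d0p1[0]
      1464725632 blocks [2/1] [U_]
      [>....................]  recovery =  1.0% (15705536/1464725632) finish=1103.9min speed=21875K/sec'

# Pull out the completion percentage and the reported speed.
progress=$(printf '%s\n' "$mdstat_sample" | sed -n 's/.*recovery = *\([0-9.]*\)%.*/\1/p')
speed=$(printf '%s\n' "$mdstat_sample" | sed -n 's/.*speed=\([0-9]*\)K\/sec.*/\1/p')

echo "resync: ${progress}% done at ${speed} KB/s"
```

Run in a loop (e.g. under `watch`), this gives a quick way to see whether the resync is still making progress or has stalled the way the thread describes.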
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
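For reference, the 12-hour dd soak test described at the top of the thread can be scripted along these lines. This is only a sketch: the device path, block size, and pass counts are assumptions, not values taken from the thread (the defaults write to /dev/null so the script is safe to try anywhere).

```shell
#!/bin/sh
# Sketch of a dd soak test over an iSCSI link: stream data repeatedly
# and stop on the first error. Point DEV at the iSCSI block device for
# a real test; the defaults are harmless placeholders.
DEV=${DEV:-/dev/null}     # assumed placeholder; use the iSCSI device in real use
SRC=${SRC:-/dev/zero}
COUNT=${COUNT:-256}       # 256 x 1 MiB = 256 MiB per pass
PASSES=${PASSES:-3}       # the thread ran the real test for ~12 hours

i=1
while [ "$i" -le "$PASSES" ]; do
    if ! dd if="$SRC" of="$DEV" bs=1M count="$COUNT" 2>/dev/null; then
        echo "pass $i: dd failed" >&2
        exit 1
    fi
    echo "pass $i: ok"
    i=$((i + 1))
done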