On Fri, 11 Nov 2005, Andrew Morton wrote:

> Begin forwarded message:
>
> Date: Mon, 7 Nov 2005 14:49:17 -0800
> From: bugme-daemon@xxxxxxxxxxxxxxxxxxx
> To: bugme-new@xxxxxxxxxxxxxx
> Subject: [Bugme-new] [Bug 5566] New: scsi_eh_x/scsi_wq_x "zombie" processes in kernel 2.6.13+
>
>
> http://bugzilla.kernel.org/show_bug.cgi?id=5566
>
>            Summary: scsi_eh_x/scsi_wq_x "zombie" processes in kernel 2.6.13+
>     Kernel Version: 2.6.13+
>             Status: NEW
>           Severity: normal
>              Owner: andrew.vasquez@xxxxxxxxxx
>          Submitter: gator@xxxxxxxxxxxxxxx
>
>
> Most recent kernel where this bug did not occur: 2.6.12
> Starting around kernel version 2.6.13, the scsi_eh_x and scsi_wq_x
> processes that are created per scsi host will not terminate if the
> driver for the scsi interface is removed. I don't know whether there
> are any serious problems involved with this, but one thing that is
> definitely annoying, is that the process list fills very quickly when
> modules are loaded/unloaded on demand, because 2 new processes will
> be created every time the driver for a scsi adapter gets loaded.
>
> (I guess, this happens with all scsi host modules - in my case, the
> "culprit" is a qlogic fibre channel driver that gets loaded only when
> needed.)

There appear to be some reference-counting problems here. The task
trace:

scsi_eh_2     S 00000001     0 10399     19         10452   900 (L-TLB)
f5144fa4 f5144f94 00000004 00000001 c19b2560 c0370b15 f5144fa4 00000004
       00000002 00000002 00000001 00000000 f383f31c 00000001 f1d32d54 00000001
       ffffffff c1b3a030 c19b2560 00000001 0000166d 6db5a7a1 00000022 c1ac0540
Call Trace:
 [<c0370b15>] schedule+0x6a5/0xd0d
 [<c02919cd>] scsi_error_handler+0x0/0x12e
 [<c0291a09>] scsi_error_handler+0x3c/0x12e
 [<c012ef91>] kthread+0xa3/0xcd
 [<c012eeee>] kthread+0x0/0xcd
 [<c0101119>] kernel_thread_helper+0x5/0xb

scsi_wq_2     S 00000003     0 10452     19         11613 10399 (L-TLB)
eb196f3c eb196f28 00000004 00000003 c02957b4 f7f1e79c 00000040 ee08cebc
       eccf2540 00000001 00000000 00000000 e8fdd540 f04ac17c f04ac320 e8fdd540
       00000000 c19c2ec0 c19c2560 00000003 000761f8 d13d8b02 00000022 e8fdd540
Call Trace:
 [<c02957b4>] __scsi_scan_target+0xaf/0x12d
 [<c012ae53>] worker_thread+0x147/0x24a
 [<f8835f20>] fc_scsi_scan_rport+0x0/0x40 [scsi_transport_fc]
 [<c0116b80>] default_wake_function+0x0/0x12
 [<c0116b80>] default_wake_function+0x0/0x12
 [<c012ad0c>] worker_thread+0x0/0x24a
 [<c012ef91>] kthread+0xa3/0xcd
 [<c012eeee>] kthread+0x0/0xcd
 [<c0101119>] kernel_thread_helper+0x5/0xb

has the workqueue thread stuck at the final put_device() in
__scsi_scan_target():

	 out_reap:
		/* now determine if the target has any children at all
		 * and if not, nuke it */
		scsi_target_reap(starget);

		put_device(&starget->dev);
	}

I'm still trying to figure out the call paths which take us there
during module unload.

James, any ideas on the ref-counting?

--
av
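
For illustration only, here is a minimal userspace sketch of how an
unbalanced reference count can keep an object (and anything torn down
by its release callback, such as per-host threads) alive after the
driver is gone. This is plain C, not kernel code: toy_dev, toy_get(),
toy_put() and toy_unload() are hypothetical names, and whether a leaked
reference is the actual cause of the bug above is an assumption, not
something the traces confirm.

```c
#include <assert.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's
 * struct device. */
struct toy_dev {
	int refcount;
	int released;	/* set once the release callback has run */
};

/* Release callback: in a real driver this is where teardown
 * (for example stopping per-host threads) would happen. */
static void toy_release(struct toy_dev *d)
{
	d->released = 1;
}

static void toy_get(struct toy_dev *d)
{
	d->refcount++;
}

/* Drop one reference; the release callback runs only on the last put. */
static void toy_put(struct toy_dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		toy_release(d);
}

/* Simulate unload: the creator drops its initial reference after
 * 'leaked_refs' gets that never received a matching put.
 * Returns 1 if the release callback ran, 0 otherwise. */
static int toy_unload(int leaked_refs)
{
	struct toy_dev d = { .refcount = 1, .released = 0 };
	int i;

	for (i = 0; i < leaked_refs; i++)
		toy_get(&d);	/* get with no matching put */

	toy_put(&d);		/* creator's final put at unload */
	return d.released;
}
```

With balanced references, toy_unload(0) runs the release callback; with
a single leaked reference, toy_unload(1) leaves the object unreleased,
which in this toy model corresponds to threads that never terminate.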