http://bugzilla.kernel.org/show_bug.cgi?id=11990

------- Comment #2 from anonymous@xxxxxxxxxxxxxxxxxxxx  2008-11-09 07:22 -------
Reply-To: James.Bottomley@xxxxxxxxxxxxxxxxxxxxx

On Sat, 2008-11-08 at 19:50 -0800, bugme-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11990
>
>            Summary: Kernel hang in spin_unlock_irq from scsi_request_fn from
>                     do_IRQ
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.28-rc3
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@xxxxxxxxxxxxxxx
>         ReportedBy: vandrove@xxxxxxxxxx
>
>
> Latest working kernel version: commit c8d7aa after 2.6.28-rc2
> Earliest failing kernel version: commit 920da6 after 2.6.28-rc2
> Distribution: Debian
> Hardware Environment: sata_sil24, amd64, 2cpu
> Software Environment: 64bit kernel, 32bit userspace, preemptible kernel
> Problem Description:
>
> When I/O is under stress, from time to time CPU1 hangs, most probably due to
> endless stream of interrupts.  Backtrace printed either by kernel's softlockup
> detection or alt-sysrq-p is below (written down; I/O is dead when this
> happens).
>
> _spin_unlock_irq + 0x30 (after sti)
> scsi_request_fn + 0x1b9 (after spin_unlock_irq(shost->host_lock) at
> not_ready:)
> blk_invoke_request_fn
> __blk_runqueue
> scsi_run_queue
> scsi_next_command
> scsi_end_request
> scsi_io_completion
> scsi_finish_command
> scsi_softirq_done
> blk_done_softirq
> __do_softirq
> call_softirq
> do_softirq
> irqexit
> do_IRQ
> ret_from_intr
> <EOI>
> native_safe_halt
> trace_hardirqs_on
> default_idle
> c1e_idle
> cpu_idle
> start_secondary
>
> Steps to reproduce:
>
> It seems to occur under heavy I/O (updatedb, dumping core from ~3GB app), but I
> was not able to trigger it reliably - most reliable is hard resetting box, then
> it occurs in ~80% cases when replaying journals on disks connected to
> sata_sil24 (through PMP, but problem does not seem to occur on 2.6.28-rc2 with
> Jens's PMP patches).

This looks identical to

http://bugzilla.kernel.org/show_bug.cgi?id=11898

Could you see if this refinement of the discussed patches fixes it for
you?

Thanks,

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f5d3b96..e09a661 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -606,6 +606,7 @@ static void scsi_run_queue(struct request_queue *q)
 		}
 
 		list_del_init(&sdev->starved_entry);
+		starved_head = NULL;
 		spin_unlock(shost->host_lock);
 
 		spin_lock(sdev->request_queue->queue_lock);
@@ -620,6 +621,12 @@ static void scsi_run_queue(struct request_queue *q)
 		spin_unlock(sdev->request_queue->queue_lock);
 
 		spin_lock(shost->host_lock);
+		if (unlikely(!list_empty(&sdev->starved_entry)))
+			/*
+			 * sdev got put back on the starved list
+			 * so finish starved handling
+			 */
+			break;
 	}
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
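
For readers who want to see the idea behind the second hunk in isolation, here is a rough user-space sketch. It is a toy model, not kernel code: struct toy_dev, toy_run_queue() and toy_run_starved() are hypothetical names invented for illustration, the starved list is reduced to a flag per device, and the locking is left out entirely. It only assumes what the patch comment states, namely that a device can be put back on the starved list while its queue is being run, and shows how checking for that after the queue run ends the walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct scsi_device. */
struct toy_dev {
	bool starved;	/* models !list_empty(&sdev->starved_entry) */
	bool requeues;	/* assumption: running its queue re-adds it */
};

/* Models running the device queue while the host lock is dropped. */
static void toy_run_queue(struct toy_dev *d)
{
	if (d->requeues)
		d->starved = true;	/* device got put back on the list */
}

/*
 * Walk the starved devices and return the number of loop iterations.
 * The walk is capped at 'limit' iterations so that the variant
 * without the check can be observed instead of spinning forever.
 */
static int toy_run_starved(struct toy_dev *devs, int n,
			   bool check_requeue, int limit)
{
	int iterations = 0;

	assert(devs != NULL && n > 0 && limit > 0);

	for (;;) {
		struct toy_dev *d = NULL;

		/* Pick the first device that is still marked starved. */
		for (int i = 0; i < n; i++) {
			if (devs[i].starved) {
				d = &devs[i];
				break;
			}
		}
		if (d == NULL || iterations >= limit)
			break;

		iterations++;
		d->starved = false;	/* models list_del_init() */
		toy_run_queue(d);

		/* Models the check the patch adds after retaking the lock. */
		if (check_requeue && d->starved)
			break;
	}
	return iterations;
}
```

With three starved devices where only the second one re-adds itself, the model stops after two iterations when check_requeue is true, and runs into the iteration cap when it is false. Whether this matches what the real scsi_run_queue() does on the reporter's hardware is exactly what the requested test would show.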