On 11-04-12 06:30 AM, Bruce Stenning wrote: .. > I am currently inserting tracing into 2.6.38.2 to try to work out what is going > on. From mv_write_main_irq_mask I can see that the IRQ for each port is still > enabled, even when ports stop responding. .. > One thing I noticed was that there is no spinlock around the > mv_save_cached_regs/mv_edma_cfg in mv_hardreset (unlike mv_port_start and > mv_port_stop); why is this? Yeah, I'm suspecting there's a loophole in the logic there somewhere. I dusted off the 6041 reference card I have here, and played with the cables for a while. Managed to get one port to stop responding to hot plug fairly quickly, though I'm not sure how/why. Then I added a debug printk() to mv_write_main_irq_mask(), with no other changes, and that appears to have been enough to change the race timing so that I could no longer produce the problem. Bruce, here's a slightly-ugly patch that should remove all doubt about races in the irq_mask. Please apply it, test with it, and let me know here if the issue goes away. Thanks --- old/drivers/ata/sata_mv.c 2011-04-13 11:03:07.442481426 -0400 +++ linux/drivers/ata/sata_mv.c 2011-04-13 11:07:55.224249621 -0400 @@ -1027,15 +1027,19 @@ static void mv_set_main_irq_mask(struct ata_host *host, u32 disable_bits, u32 enable_bits) { + static spinlock_t irq_mask_lock = __SPIN_LOCK_UNLOCKED(irq_mask_lock); // FIXME: per-host would be nicer struct mv_host_priv *hpriv = host->private_data; u32 old_mask, new_mask; + unsigned long flags; + spin_lock_irqsave(&irq_mask_lock, flags); old_mask = hpriv->main_irq_mask; new_mask = (old_mask & ~disable_bits) | enable_bits; if (new_mask != old_mask) { hpriv->main_irq_mask = new_mask; mv_write_main_irq_mask(new_mask, hpriv); } + spin_unlock_irqrestore(&irq_mask_lock, flags); } static void mv_enable_port_irqs(struct ata_port *ap,
--- old/drivers/ata/sata_mv.c 2011-04-13 11:03:07.442481426 -0400 +++ linux/drivers/ata/sata_mv.c 2011-04-13 11:07:55.224249621 -0400 @@ -1027,15 +1027,19 @@ static void mv_set_main_irq_mask(struct ata_host *host, u32 disable_bits, u32 enable_bits) { + static spinlock_t irq_mask_lock = __SPIN_LOCK_UNLOCKED(irq_mask_lock); // FIXME: per-host would be nicer struct mv_host_priv *hpriv = host->private_data; u32 old_mask, new_mask; + unsigned long flags; + spin_lock_irqsave(&irq_mask_lock, flags); old_mask = hpriv->main_irq_mask; new_mask = (old_mask & ~disable_bits) | enable_bits; if (new_mask != old_mask) { hpriv->main_irq_mask = new_mask; mv_write_main_irq_mask(new_mask, hpriv); } + spin_unlock_irqrestore(&irq_mask_lock, flags); } static void mv_enable_port_irqs(struct ata_port *ap,