On 12-08-07 10:54 AM, Barry J Sturgeon wrote:
> when loading the 3.2.1. kernel onto my supermicro pdsm4 motherboard (with
> marvell 8 port sata controller) i have encountered performance degradation
> due to excessive cpu usage (i.e. three busy kworkers). after some
> investigation i have tracked down the problem to:
>
> +static void mv_wait_for_edma_empty_idle(struct ata_port *ap)
> +{
> +    void __iomem *port_mmio = mv_ap_base(ap);
> +    const u32 empty_idle = (EDMA_STATUS_CACHE_EMPTY | EDMA_STATUS_IDLE);
> +    const int per_loop = 5, timeout = (15 * 1000 / per_loop);
> +    int i;
> +
> +    /*
> +     * wait for the edma engine to finish transactions in progress.
> +     */
> +    for (i = 0; i < timeout; ++i) {
> +        u32 edma_stat = readl(port_mmio + EDMA_STATUS_OFS);
> +        if ((edma_stat & empty_idle) == empty_idle)
> +            break;
> *+        udelay(per_loop);*
> +    }
> +    /* ata_port_printk(ap, KERN_INFO, "%s: %u+ usecs\n", __func__, i); */
> +}
>
> it appears that we always reach the timeout value.
> note that empty_idle = 0xc0 (as you know),
> but most edma_stat values are (1100 or 1000).

Okay, that's weird.

mv_wait_for_edma_empty_idle() is called from mv_stop_edma(), which
happens whenever we transition between sending NCQ commands
(eg. read/write) and non-NCQ commands (eg. flush cache).

The mv_qc_defer() function is supposed to ensure that things are more
or less idle before it even gets to mv_stop_edma(). So we'd expect the
mv_wait_for_edma_empty_idle() function to be quick, which is a big part
of why we're willing to tolerate an inline busy-wait.

But here it seems to be waiting MUCH longer, as if it has to wait for
some IO-in-progress to complete before continuing. Which means that the
mv_qc_defer() function isn't being as effective as it should be.

The idea is, before any new command is queued, libata invokes the defer
function to see if the driver is ready to accept the new command.
The sata_mv driver normally says "yes" (go ahead) for a new NCQ command
whenever it currently has existing NCQ commands in-flight. Otherwise it
always says "no" (please defer) unless completely idle.

That's how mv_qc_defer() is supposed to work, and that should minimize
the time spent inside mv_wait_for_edma_empty_idle() by ensuring the
edma engine is already idle whenever that gets called.

This in turn suggests two possibilities:

(1) perhaps ap->nr_active_links has an accounting glitch somewhere. Or
(2) maybe there's a locking bug somewhere around the invocation of
    mv_qc_defer().

Or something else is wrong. :)

So, focus your attentions on mv_qc_defer().

Also, you could try mounting the filesystem with the "barrier=0" option,
just to see if that makes the problem go away -- confirming the analysis
above.

-- 
Mark Lord
Real-Time Remedies Inc.
mlord@xxxxxxxxx
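
For illustration only, here is a minimal userspace sketch of the defer
policy as described in this message, plus the delay budget implied by the
quoted polling loop. It is not the driver code: struct model_port,
enum model_verdict, model_qc_defer() and model_worst_case_wait_usecs()
are hypothetical names invented for this sketch, and the real
mv_qc_defer() works on libata structures and may check more conditions
than the three modelled here.

```c
/* Userspace model only: none of these names exist in sata_mv.c. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-port state. */
struct model_port {
    unsigned int nr_active;  /* commands currently in flight */
    bool ncq_in_flight;      /* true if the in-flight commands are NCQ */
};

enum model_verdict { MODEL_ACCEPT = 0, MODEL_DEFER = 1 };

/*
 * Defer policy as described in the message:
 *   - completely idle port: accept the new command;
 *   - NCQ already in flight: accept another NCQ command;
 *   - anything else: defer until the port has drained.
 */
enum model_verdict model_qc_defer(const struct model_port *port,
                                  bool new_cmd_is_ncq)
{
    assert(port != NULL);
    if (port->nr_active == 0)
        return MODEL_ACCEPT;
    if (port->ncq_in_flight && new_cmd_is_ncq)
        return MODEL_ACCEPT;
    return MODEL_DEFER;
}

/*
 * Delay budget of the quoted polling loop, counting only the udelay()
 * calls: (15 * 1000 / per_loop) iterations of per_loop usecs each.
 */
int model_worst_case_wait_usecs(void)
{
    const int per_loop = 5, timeout = (15 * 1000 / per_loop);
    return timeout * per_loop;
}
```

With the constants from the quoted patch, a loop that never sees
empty_idle spends 3000 iterations of 5 usecs, i.e. at least 15 msecs of
busy-waiting per call, which would be consistent with the busy kworkers
reported. Under the policy sketched above, a non-NCQ command should only
be accepted once nr_active has dropped to zero, so the wait loop should
normally exit within the first few polls.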