Jeff Garzik wrote:
>>> A spurious SDB FIS updating SActive is bad news, yes. But with a busy
>>> AHCI controller, perhaps sharing PCI interrupts, I think there is
>>> distinct potential for an interrupt to be flagged as spurious when it
>>> is not.
>> You are talking about cases where the controller raises an SDB
>> interrupt without actually having received an SDB FIS, right? That is
>> certainly possible, and in that case I'm very okay with eating the IRQ.
> But I'm taking a higher level view than that, from two angles:
>
> 1) I think it's a waste of time to even worry about this. We should
> just program AHCI to spec, and let the controller and devices talk to
> each other as they will. If there are spurious completions, that should
> show up elsewhere via tag poisoning and/or tag rotation. Or data
> corruption, if nothing else. We'll know, even without the potentially
> questionable spurious interrupt detection code.
>
> 2) Given the factors mentioned above -- a shared IRQ and a busy
> controller -- highly asynchronous conditions combine to create a higher
> probability of seeing events arrive while you're processing other
> events. Overall, I feel that trying to accurately account for
> everything going on in silicon leads to madness and complexity :) Just
> hand things to silicon, and trust that it either accurately accounts
> for SDB FISes, or will be quite obviously broken under stress.
I see your point. But I'm quite worried about this one, because:

* This won't occur often, and even when it occurs at the hardware level,
it might be masked by processing delays in the kernel. But when it blows
up, it will corrupt data silently. We might not see such a report for
months, and even when it strikes, pinpointing the origin will be
extremely difficult.
* The danger of the race condition is not theoretical. I think I can
trigger it with AHCI, because AHCI delegates management of the SActive
register to the driver.
Remember the lost command completion bug in the very early AHCI NCQ
implementation? It was caused by setting SActive after issuing the
command. The command completion SDB FIS was received before the driver
set SActive, so when the IRQ handler kicked in afterward, it saw a bit
which had been set after the completion, couldn't complete the command,
and the command timed out.
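Roughly, the buggy ordering looked like this (an illustration only, not
the exact code that was in the tree; writel(), PORT_SCR_ACT and
PORT_CMD_ISSUE are as in the current snippet quoted below):

	/* buggy ordering: issue first, mark the tag in SActive later.
	 * If the completion SDB FIS arrives in between, the controller
	 * clears a bit which isn't set yet; our later write re-asserts
	 * it and the IRQ handler thinks the command is still
	 * outstanding, so the qc is never completed.
	 */
	writel(1 << qc->tag, port_mmio + PORT_CMD_ISSUE);
	if (qc->tf.protocol == ATA_PROT_NCQ)
		writel(1 << qc->tag, port_mmio + PORT_SCR_ACT);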
This problem is similar except that it's the other way around. From
ahci_qc_issue()...
	if (qc->tf.protocol == ATA_PROT_NCQ)
		writel(1 << qc->tag, port_mmio + PORT_SCR_ACT);
	writel(1 << qc->tag, port_mmio + PORT_CMD_ISSUE);
Let's say we insert mdelay(10) between the SActive setting and the
actual command issuance. That will drastically increase the chance of
receiving a spurious SDB FIS in between. The following sequence is
likely when such an event occurs (a sketch of this instrumentation
follows the sequence):
1. driver sets the SActive bit for tag x
2. disk issues a spurious SDB completion for tag x
3. AHCI receives the SDB FIS, clears the corresponding SActive bit and
raises an interrupt
4. driver issues the command for tag x and completes command issuance
5. IRQ handler kicks in, sees that SActive for tag x is clear, and
completes the corresponding qc
6. we now have data corruption.
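To be explicit, the instrumented issue path I have in mind would look
something like this (a debugging aid only, obviously not something to
merge):

	if (qc->tf.protocol == ATA_PROT_NCQ) {
		writel(1 << qc->tag, port_mmio + PORT_SCR_ACT);
		/* widen the window between marking the tag in SActive
		 * and actually issuing the command, so that a spurious
		 * SDB FIS is much more likely to land in between
		 * (steps 2 and 3 above).
		 */
		mdelay(10);
	}
	writel(1 << qc->tag, port_mmio + PORT_CMD_ISSUE);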
I want to emphasize that the correctness of SActive manipulation is
highly critical for NCQ integrity. It's the thin thread the whole NCQ
protocol hangs on. A spurious SDB FIS amounts to random command
completion without synchronization. This is a serious problem which
requires handling, and I'm pretty sure I'm not being paranoid here.
I think the detection of this problem proves the necessity of spurious
interrupt reporting, at least until the NCQ driver and hardware mature.
We still don't know how drives behave with NCQ commands, and we
definitely need to learn what sorts of quirks they have and how to work
around them. The spurious interrupt handling lets us detect and diagnose
such problems much more easily.
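As an example of what I mean by reporting, the port interrupt handler
can compare what the hardware claims has completed against what we know
we have issued. Something like this (just a sketch with made-up names,
not the actual ahci.c code):

	/* "issued" is the set of tags the driver has issued and not yet
	 * completed; "done" is the set of tags the hardware now reports
	 * as finished (SActive bits which went from 1 to 0).  Anything
	 * in "done" but not in "issued" is a completion we never asked
	 * for, i.e. a spurious one.
	 */
	static void report_spurious_completion(u32 issued, u32 done)
	{
		u32 spurious = done & ~issued;

		if (spurious)
			printk(KERN_WARNING
			       "ahci: spurious NCQ completion, tags 0x%x\n",
			       spurious);
	}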
Am I making any sense? I really wish I could write better. Please give
it some thought even if my writing sucks. :-)
--
tejun