Hello, Mikael. > I did a round of regression tests which identified another > related but different problem. > > In kernels 2.6.24 and 2.6.25 sata_promise would actually recover > from the timeouts, while in kernels 2.6.26 and 2.6.27 it would not. > Before 2.6.26 sata_promise explicitly used sata_std_hardreset, but > in the "make reset related methods proper port operations" commit > (a1efdaba2dbd6fb89e23a87b66d3f4dd92c9f5af), Tejun changed sata_promise > to use the hardreset it now inherits from ata_sff_port_ops, namely > sata_sff_hardreset. This change looks accidental. The main difference > between these two procedures is that the sff version will poll the > legacy status register until the port becomes ready. Hmm... it's quite likely that I've introduced the regression but that commit ain't it (I actually wrote a script to verify the inheritance change doesn't actually change the function table). What used to be sata_std_hardreset() is now sata_sff_hardreset(). The change was made while separating out SFF as [S]ATA as per the standard doesn't have any way to wait for device readiness. The TF-polling is SFF specific and thus moved out to sata_sff_hardreset(). So, in both 2.6.24 and 2.6.25, sata_promise did wait for device readiness after hardreset as does 2.6.26 or any later version. > Changing sata_promise to use sata_std_hardreset in 2.6.26/.27 > makes the EH after the timeouts much more reliable. Not as > tidy as with the previous ->prereset fix, but still working. The only behavior change between 2.6.25 and 2.6.26 as far as sata_promise is concerned is that hardrset is preferred over softreset. Here's what I think is going on. Previously, after a timeout, libata-eh will invoke softreset and if that works all should have been fine whether hardreset actually worked or not. Now, after something goes wrong, libata EH calls hardreset and as hardreset doesn't work properly without the controller reset so it fails. So, the libata core layer change exposed a bug in hardreset, which was one of the reasons why the change was made - hardreset being the last line of defense, using it occasionally doesn't make sense test-coverage-wise. I agree the core layer changes can be quite confusing but they were necessary to keep the core layer scalable. libata now has oodles of different types of low level drivers and things were and still are getting quite treacherous for drivers living on the edge. Anyways, so, please fix hardreset. If it can't wait for device readiness reliably, the right thing to do is to use sata_std_hardreset() which will trigger follow-up softreset to wait for device readiness and classify devices but if adding the controller reset makes the hardreset more reliable, please do so. Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html