On 17.07.2012 22:01, Tejun Heo wrote:
> On Tue, Jul 17, 2012 at 09:39:41PM +0200, Matthias Prager wrote:
>> I could not however reproduce the issue on any other device than a LSI
>> SAS controller (using SATA disks) - on a regular ICH10 using AHCI and a
>> SATA drive I don't see these i/o errors. But since I'm experiencing
>> these issues on two different systems (both with lsi controllers while
>> running vmware-guests on them) and Robert sees them on his
>> (non-virtualized) system with the same lsi controller (9211-8i), I'm
>> inclined to make the following assumptions:
>> Either it is an issue which is limited to this controller and possibly
>> sata disks hanging off it or it is a more general issue with sas
>> controllers and sata disks (again it could well affect sas disks too).
>> Lacking other controllers or sas disks I can't be sure.
>
> So, nothing in the libata stack generates NOT_READY - "initializing
> command required". I suppose it's LSI firmware / driver translating
> TUR to CHECK_POWER_MODE and generating NOT_READY. I don't know what
> SAT says about this but this can't be correct. An ATA device in
> standby mode is ready to process any commands. It should be able to
> come back to full operation on demand as necessary and that's why it
> can be transparently enabled from device side. Eric?
>

While reading the linux-scsi mailing list I stumbled upon
'[Bug 16070] Fail to issue Start/Stop Unit'
<http://marc.info/?l=linux-scsi&m=134278835822649&w=2>
(bugtracker: <https://bugzilla.kernel.org/show_bug.cgi?id=16070>),
which led me to try enabling the 'allow_restart' flag for my disks.

With this workaround a vanilla 3.4.5 kernel does not exhibit the i/o
errors on sleeping sata disks hanging off sas controllers. I'm currently
running one of my systems with the following line added to the init
scripts:

  echo 1 | tee /sys/block/sd?/device/scsi_disk/*/allow_restart >/dev/null

This way I can use the untouched kernel sources and still get around the
i/o errors.
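For anyone who wants to try the same workaround, here is a slightly more
defensive form of that init-script line. It is only a sketch: the
function name 'enable_allow_restart' and its optional sysfs-root
argument are made up for illustration (the argument just makes it
possible to try the loop against a scratch directory), and it uses
'sd*' instead of 'sd?' so that names like sdaa are covered as well.

```shell
#!/bin/sh
# Sketch: set allow_restart=1 for every sd* disk found under a sysfs root.
# The root defaults to /sys; passing another directory is an assumption
# added for illustration so the loop can be tested without real disks.
enable_allow_restart() {
    sysfs_root="${1:-/sys}"
    count=0
    for f in "$sysfs_root"/block/sd*/device/scsi_disk/*/allow_restart; do
        # An unmatched glob is left as a literal string; skip it.
        [ -e "$f" ] || continue
        if echo 1 > "$f"; then
            count=$((count + 1))
        else
            echo "could not write $f" >&2
        fi
    done
    # Print how many attributes were written.
    echo "$count"
}
```

Called without arguments (as root) from an init script it writes to the
real attributes under /sys and prints the number of disks it touched.
The setting does not persist across reboots or device re-detection, so
it has to be reapplied each time.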
But I reckon this is not a real solution. I'm no expert on scsi/sas/ata
internals, so please take the following thoughts with a grain of salt:

As far as I can see (and Tejun confirmed that - I think), Tejun's commit
85ef06d1d252f6a2e73b678591ab71caad4667bb somehow exposes a bug which
lies deeper in the sas/ata code. The 'sas_slave_configure()' function in
'drivers/scsi/libsas/sas_scsi_host.c' sets the 'allow_restart' flag for
sas disks hanging off sas controllers. But if it encounters a sata disk
it calls 'ata_sas_slave_configure()' in 'drivers/ata/libata-scsi.c'
instead and returns without enabling the 'allow_restart' flag.

A simple fix would be to set allow_restart=1 in 'sas_slave_configure()'
after the call to 'ata_sas_slave_configure()' but before returning. Now
I'm not sure this isn't papering over another bug.

Which leads me to my question: What is the correct behavior?

#1 Issuing a separate spin-up command (START UNIT?) prior to sending
   i/o, by setting allow_restart=1 for sata disks on sas controllers

or

#2 Teaching the sas drivers that they do not need spin-up commands and
   can simply start issuing i/o to sata disks

--
Matthias