On Mon, 2010-06-14 at 09:59 +0200, Tejun Heo wrote:
> Hello,
>
> On 06/14/2010 09:53 AM, Andi Kleen wrote:
> > On Mon, Jun 14, 2010 at 09:43:28AM +0200, Tejun Heo wrote:
> >>> sd 11:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >>> sd 11:0:0:0: [sdd] CDB: Write(10): 2a 00 00 e5 f0 08 00 01 00 00
> >>> sd 11:0:0:0: [sdd] Unhandled error code
> >>> sd 11:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >>> sd 11:0:0:0: [sdd] CDB: Write(10): 2a 00 00 e5 f1 08 00 01 00 00
> >>> sd 11:0:0:0: [sdd] Unhandled error code
> >>> sd 11:0:0:0: [sdd] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> >>> sd 11:0:0:0: [sdd] CDB: Write(10): 2a 00 00 e5 f2 08 00 01 00 00
> >>>
> >>> same messages repeating forever, just with CDB changing occasionally.
> >>>
> >>> ....
> >>>
> >>> not stopping until I reset the box.
> >>
> >> Did you have a lot of dirty pages? It looks like upper layer is
> >
> > Yes, there was a dd running.
> >
> >> trying to flush all the dirty buffers and SCSI is a tad bit too
> >> verbose about failing each IO w/ DID_BAD_TARGET thus taking a very
> >
> > A bit too verbose? That's really an euphemism ...
>
> Yeap, of course it was. :-)
>
> > During the CDB: Write loop the console was totally unusable!
> >
> > And I think the fsyncs in syslogd completely made the performance
> > tank.
>
> Console often becomes the bottleneck too when there are a lot of
> kernel messages.
>
> > So basically it was a "reset button only" situation.
> >
> > When the device is gone what's the point in giving a message
> > more than once? Can't the requests just be silently failed in this
> > case?
>
> Yeah, it would be better to somehow summarize those error message
> instead of spitting out all of them.

I don't think we can summarize. However, when things start to go wrong,
it's usually only the first set of errors that are significant, so we
could do a simple ratelimit.

James

---

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 1646fe7..c8c7483 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -896,7 +896,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	case ACTION_FAIL:
 		/* Give up and fail the remainder of the request */
 		scsi_release_buffers(cmd);
-		if (!(req->cmd_flags & REQ_QUIET)) {
+		if (!(req->cmd_flags & REQ_QUIET) && printk_ratelimit()) {
 			if (description)
 				scmd_printk(KERN_INFO, cmd, "%s\n",
 					    description);
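
For anyone unfamiliar with what the printk_ratelimit() gate buys here, the
following is a rough userspace sketch of interval/burst rate limiting. It is
not the kernel implementation: struct ratelimit, ratelimit_ok() and
simulate_failures() are hypothetical names made up for illustration, and time
is passed in as a plain tick counter rather than read from jiffies. The idea
is only that the first "burst" messages in each window get through and the
rest are counted and dropped.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a ratelimit state (not the kernel struct). */
struct ratelimit {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed through per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages let through in the current window */
	int missed;	/* messages suppressed in the current window */
	bool started;	/* false until the first call opens a window */
};

/* Return true if the caller may emit a message at time 'now'. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
	/* Open a new window on first use or once the old one has expired. */
	if (!rl->started || now - rl->begin >= rl->interval) {
		if (rl->missed)
			fprintf(stderr, "%d messages suppressed\n", rl->missed);
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
		rl->started = true;
	}

	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	rl->missed++;
	return false;
}

/*
 * Simulate 'count' failed I/Os arriving one per tick starting at 'start'
 * and return how many of them would actually be logged.
 */
static int simulate_failures(struct ratelimit *rl, long start, int count)
{
	int logged = 0;

	for (int i = 0; i < count; i++)
		if (ratelimit_ok(rl, start + i))
			logged++;

	return logged;
}
```

With interval = 5000 and burst = 10 (chosen here only to mirror a 5 second
window with millisecond ticks), a flood of 1000 failures inside one window
logs the first 10 and drops the rest, which matches the point above that the
first set of errors is usually the significant one. The trade-off is that a
single shared limiter can also swallow unrelated messages that arrive during
the flood.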