On 3/9/20 11:14 AM, Ewan D. Milne wrote:
Large queues of I/O to offline devices that are eventually
submitted when devices are unblocked result in a many repeated
"rejecting I/O to offline device" messages. These messages
can fill up the dmesg buffer in crash dumps so no useful
prior messages remain. In addition, if a serial console
is used, the flood of messages can cause a hard lockup in
the console code.
Introduce a flag indicating the message has already been logged
for the device, and reset the flag when scsi_device_set_state()
changes the device state.
Signed-off-by: Ewan D. Milne <emilne@xxxxxxxxxx>
---
drivers/scsi/scsi_lib.c | 8 ++++++--
include/scsi/scsi_device.h | 2 ++
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 610ee41..d3a6d97 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1240,8 +1240,11 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
* commands. The device must be brought online
* before trying any recovery commands.
*/
- sdev_printk(KERN_ERR, sdev,
- "rejecting I/O to offline device\n");
+ if (!sdev->offline_already) {
+ sdev->offline_already = 1;
+ sdev_printk(KERN_ERR, sdev,
+ "rejecting I/O to offline device\n");
+ }
return BLK_STS_IOERR;
case SDEV_DEL:
/*
@@ -2340,6 +2343,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
break;
}
+ sdev->offline_already = 0;
sdev->sdev_state = state;
return 0;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index f8312a3..72987a0 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -204,6 +204,8 @@ struct scsi_device {
unsigned unmap_limit_for_ws:1; /* Use the UNMAP limit for WRITE SAME */
unsigned rpm_autosuspend:1; /* Enable runtime autosuspend at device
* creation time */
+ unsigned offline_already:1; /* Device offline message logged */
+
atomic_t disk_events_disable_depth; /* disable depth for disk events */
DECLARE_BITMAP(supported_events, SDEV_EVT_MAXBITS); /* supported events */
Bitfields are troublesome in multithreaded software. Has it been
considered to use rate-limiting instead of introducing a new bitfield
member?
Thanks,
Bart.