Re: [PATCH] scsi: core: Rate limit "rejecting I/O" messages

On Wed, 2020-04-08 at 19:10 +0200, Daniel Wagner wrote:
> Prevent excessive logging by rate limiting the "rejecting I/O"
> messages. For example in setups where remote syslog is used the link
> is saturated by those messages when a storage controller/disk
> misbehaves.
> 
> Cc: "James E.J. Bottomley" <jejb@xxxxxxxxxxxxx>
> Cc: "Martin K. Petersen" <martin.petersen@xxxxxxxxxx>
> Signed-off-by: Daniel Wagner <dwagner@xxxxxxx>
> ---
>  drivers/scsi/scsi_lib.c    |  4 ++--
>  include/scsi/scsi_device.h | 10 ++++++++++
>  2 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 47835c4b4ee0..01c35c58c6f3 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1217,7 +1217,7 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>  		 */
>  		if (!sdev->offline_already) {
>  			sdev->offline_already = true;
> -			sdev_printk(KERN_ERR, sdev,
> +			sdev_printk_ratelimited(KERN_ERR, sdev,
>  				    "rejecting I/O to offline device\n");

I would really prefer we not do it this way if at all possible.
It loses information we may need to debug SAN outage problems.

The reason I didn't use ratelimit is that the ratelimit state is
per call site (one static instance for each use of the macro), not
per device.  So this doesn't work right: once one device has used up
the burst, messages for other devices get dropped too.
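
To illustrate the point, here is a minimal userspace sketch, not kernel
code.  The names toy_ratelimit, toy_sdev, toy_sdev_printk_ratelimited and
the burst of 10 are assumptions made up for this example, and the
time-based refill that the real __ratelimit() performs is left out.  It
only shows that a static state inside a macro is shared by every device
that goes through the same call site:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a rate limit state: it allows
 * `burst` messages and then suppresses the rest. The interval-based
 * refill of the real kernel helper is deliberately omitted. */
struct toy_ratelimit {
	int burst;
	int printed;
};

static bool toy_ratelimit_ok(struct toy_ratelimit *rs)
{
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Hypothetical device: `logged` counts messages that actually got out. */
struct toy_sdev {
	const char *name;
	int logged;
};

/* The static state is created once per expansion of the macro, that is,
 * once per call site. It is not tied to the device passed in. */
#define toy_sdev_printk_ratelimited(sdev, msg)				\
do {									\
	static struct toy_ratelimit _rs = { .burst = 10 };		\
	if (toy_ratelimit_ok(&_rs)) {					\
		printf("%s: %s\n", (sdev)->name, (msg));		\
		(sdev)->logged++;					\
	}								\
} while (0)

/* Single call site, shared by all devices. */
static void toy_reject_io(struct toy_sdev *sdev)
{
	toy_sdev_printk_ratelimited(sdev, "rejecting I/O to offline device");
}

/* Device A floods the call site, then device B logs once.
 * Returns how many of B's messages were printed. */
static int toy_flood_then_other_device(void)
{
	struct toy_sdev a = { "devA", 0 };
	struct toy_sdev b = { "devB", 0 };

	for (int i = 0; i < 100; i++)
		toy_reject_io(&a);
	toy_reject_io(&b);
	return b.logged;
}
```

In this sketch the single message from devB is suppressed because devA
already consumed the shared burst, which is the information loss
described above.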

>  		}
>  		return BLK_STS_IOERR;
> @@ -1226,7 +1226,7 @@ scsi_prep_state_check(struct scsi_device *sdev, struct request *req)
>  		 * If the device is fully deleted, we refuse to
>  		 * process any commands as well.
>  		 */
> -		sdev_printk(KERN_ERR, sdev,
> +		sdev_printk_ratelimited(KERN_ERR, sdev,
>  			    "rejecting I/O to dead device\n");

In practice I hardly see this message; do you actually have a case
where this happens?  If so perhaps add another flag similar to
offline_already?
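
A rough userspace sketch of what such a flag could look like, mirroring
the offline_already check quoted above.  The dead_already field, the
flag_sdev struct and flag_reject_dead_io() are hypothetical names for
illustration only; nothing like them is in the patch, and where such a
flag would have to be cleared again is not addressed here:

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, cut-down device with a per-device "already logged" flag,
 * modelled on the offline_already field used in scsi_prep_state_check(). */
struct flag_sdev {
	const char *name;
	bool dead_already;	/* assumption: not an existing field */
};

/* Log the rejection once per device, then stay quiet for that device.
 * Returns true if a message was printed on this call. */
static bool flag_reject_dead_io(struct flag_sdev *sdev)
{
	if (sdev->dead_already)
		return false;

	sdev->dead_already = true;
	printf("%s: rejecting I/O to dead device\n", sdev->name);
	return true;
}
```

Because the state lives in the device rather than at the call site, one
noisy device cannot suppress the first message of another.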

The offline message happens a *lot*: we get a ton of them for each
active device when the queues are unblocked after a target goes away.

-Ewan

>  		return BLK_STS_IOERR;
>  	case SDEV_BLOCK:
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index c3cba2aaf934..8be40b0e1b8f 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -257,6 +257,16 @@ sdev_prefix_printk(const char *, const struct scsi_device *, const char *,
>  #define sdev_printk(l, sdev, fmt, a...)				\
>  	sdev_prefix_printk(l, sdev, NULL, fmt, ##a)
>  
> +#define sdev_printk_ratelimited(l, sdev, fmt, a...)			\
> +({									\
> +	static DEFINE_RATELIMIT_STATE(_rs,				\
> +				      DEFAULT_RATELIMIT_INTERVAL,	\
> +				      DEFAULT_RATELIMIT_BURST);		\
> +									\
> +	if (__ratelimit(&_rs))						\
> +		sdev_prefix_printk(l, sdev, NULL, fmt, ##a);		\
> +})
> +
>  __printf(3, 4) void
>  scmd_printk(const char *, const struct scsi_cmnd *, const char *, ...);
>  



