Hi,

Thank you very much for your reply. I have tried to contact 3PAR about this issue.

Hundreds of these messages flood the log for several seconds, which is a serious problem.
It can be reproduced on the Linux 4.14 kernel, or on the latest version, on a 3PAR storage
Remote Copy Failover platform when the main storage is powered off. How can it be solved?

I have found that my latest patch, 'pg->interval = 2000', is incorrect. 'pg->interval = 2'
may be reasonable: a 2-second delay, then retry. But I have a new idea. I am wondering
whether it is reasonable for the patch to use the function printk_timed_ratelimit(&j, 500).
Can you help me review and commit this patch?

Best regards

Details:
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881332] sd 1:0:3:2: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881382] sd 1:0:3:1: alua: rtpg retry
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881388] sd 1:0:3:1: [alua] Sense Key : Not Ready [current]
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881390] sd 1:0:3:1: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881394] sd 8:0:3:0: alua: rtpg retry
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881400] sd 8:0:3:0: [alua] Sense Key : Not Ready [current]
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881401] sd 8:0:3:0: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881421] sd 1:0:3:2: alua: rtpg retry
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881427] sd 1:0:3:2: [alua] Sense Key : Not Ready [current]
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881428] sd 1:0:3:2: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881481] sd 1:0:3:1: alua: rtpg retry

@@ -576,9 +576,12 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg)
 		err = SCSI_DH_RETRY;
 	if (err == SCSI_DH_RETRY &&
 	    pg->expiry != 0 && time_before(jiffies, pg->expiry)) {
-		sdev_printk(KERN_ERR, sdev, "%s: rtpg retry\n",
+		static unsigned long int j;
+		if (printk_timed_ratelimit(&j, 500)) {
+			sdev_printk(KERN_ERR, sdev, "%s: rtpg retry\n",
 			    ALUA_DH_NAME);
-		scsi_print_sense_hdr(sdev, ALUA_DH_NAME, &sense_hdr);
+			scsi_print_sense_hdr(sdev, ALUA_DH_NAME, &sense_hdr);
+		}
 		kfree(buff);
 		return err;
 	}

-----Original Message-----
From: Dan Carpenter [mailto:dan.carpenter@xxxxxxxxxx]
Sent: June 3, 2019 2:01
To: zhangguanghui (Cloud)
Cc: martin.petersen@xxxxxxxxxx
Subject: Re: scsi_dh_alua: re-initialize pg->interval in alua_rtpg_work

On Sun, Jun 02, 2019 at 07:58:19AM +0000, Zhangguanghui wrote:
> Hi
>
> The messages appear after a 3PAR storage controller shutdown and there are hundreds of them flooding the log.
>
> This is a problem, but I'm not sure if the patch is reasonable. Can you help review the patch? Thanks.
>
> It's possible to use "pg->interval = 0"; this causes intense requeuing of the ALUA work queue for some seconds and floods the kernel log.
>
> The reports pointed out that we should probably re-initialize it for every iteration through the retry loop.
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881332] sd 1:0:3:2: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881382] sd 1:0:3:1: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881388] sd 1:0:3:1: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881390] sd 1:0:3:1: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881394] sd 8:0:3:0: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881400] sd 8:0:3:0: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881401] sd 8:0:3:0: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881421] sd 1:0:3:2: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881427] sd 1:0:3:2: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881428] sd 1:0:3:2: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881481] sd 1:0:3:1: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881487] sd 1:0:3:1: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881489] sd 1:0:3:1: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881496] sd 8:0:3:0: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881501] sd 8:0:3:0: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881504] sd 8:0:3:0: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881526] sd 1:0:3:2: alua: rtpg retry
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881535] sd 1:0:3:2: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:48 cvknode2058 kernel: [ 8451.881539] sd 1:0:3:2: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:49 cvknode2058 kernel: [ 8451.881583] sd 1:0:3:1: alua: rtpg retry
> Jun 2 11:42:49 cvknode2058 kernel: [ 8451.881589] sd 1:0:3:1: [alua] Sense Key : Not Ready [current]
> Jun 2 11:42:49 cvknode2058 kernel: [ 8451.881592] sd 1:0:3:1: [alua] Add. Sense: Logical unit not accessible, asymmetric access state transition
> Jun 2 11:42:49 cvknode2058 kernel: [ 8451.881600] sd 8:0:3:0: alua: rtpg retry
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 0962fd5..6b1060a 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -814,6 +814,7 @@ static void alua_rtpg_work(struct work_struct *work)
>  		if (err == SCSI_DH_RETRY || pg->flags & ALUA_PG_RUN_RTPG) {
>  			pg->flags &= ~ALUA_PG_RUNNING;
>  			pg->flags |= ALUA_PG_RUN_RTPG;
> +			pg->interval = 2000;

pg->interval is the time in seconds so this is a 33 minute delay. That
is too long.

This seems like a hardware problem. Have you tried to contact 3PAR
about this issue?

regards,
dan carpenter