Re: [PATCH v3] scsi: libsas: Fix exp-attached end device cannot be scanned in again after probe failed

Hi, John

On 2024/6/18 23:21, John Garry wrote:
On 18/06/2024 14:10, yangxingui wrote:

We found that the reconnection of an exp-attached end device after a failed
probe is judged as broadcast flutter, as follows:

[78779.654026] sas: broadcast received: 0
[78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10
[78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached
[78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached: 500e004aaaaaaa05 (stp)
[78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found
[78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
[78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
...
[78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[78835.171344] sas: sas_probe_sata: for exp-attached device 500e004aaaaaaa05 returned -19
[78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone
[78835.187487] sas: broadcast received: 0
[78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10
[78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05
[78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached: 500e004aaaaaaa05 (stp)
[78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter
[78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0

The cause of the problem is that the related ex_phy's attached_sas_addr was not cleared after the end device probe failed. To solve this, a function sas_ex_unregister_end_dev() is defined to clear the ex_phy information and unregister the end device after the exp-attached end device probe fails.
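
To make the failure mode concrete, here is a minimal userspace sketch of the flutter check, assuming rediscovery simply compares the address reported by the expander with the address cached for the phy. Every name in it (toy_ex_phy, toy_rediscover(), toy_probe_fail_stale(), toy_probe_fail_clear(), toy_scenario()) is hypothetical and exists only for illustration; this is not the libsas code, which tracks considerably more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_SAS_ADDR_SIZE 8

/* Toy stand-in for per-phy expander state; NOT the libsas struct ex_phy. */
struct toy_ex_phy {
	uint8_t attached_sas_addr[TOY_SAS_ADDR_SIZE];
	bool has_dev;	/* true while a device is registered behind the phy */
};

enum toy_result { TOY_NEW_DEVICE, TOY_FLUTTER };

/* Assumed check: same address as the cached one is treated as flutter. */
static enum toy_result toy_rediscover(struct toy_ex_phy *phy,
				      const uint8_t *reported)
{
	assert(phy && reported);
	if (memcmp(phy->attached_sas_addr, reported, TOY_SAS_ADDR_SIZE) == 0)
		return TOY_FLUTTER;
	memcpy(phy->attached_sas_addr, reported, TOY_SAS_ADDR_SIZE);
	phy->has_dev = true;
	return TOY_NEW_DEVICE;
}

/* Probe failure that drops the device but leaves the cached address stale. */
static void toy_probe_fail_stale(struct toy_ex_phy *phy)
{
	phy->has_dev = false;
}

/* Probe failure that drops the device and also clears the cached address. */
static void toy_probe_fail_clear(struct toy_ex_phy *phy)
{
	phy->has_dev = false;
	memset(phy->attached_sas_addr, 0, TOY_SAS_ADDR_SIZE);
}

/* Attach, fail the probe, then attach again; return the second result. */
static enum toy_result toy_scenario(bool clear_on_fail)
{
	static const uint8_t addr[TOY_SAS_ADDR_SIZE] = {
		0x50, 0x0e, 0x00, 0x4a, 0xaa, 0xaa, 0xaa, 0x05
	};
	struct toy_ex_phy phy = { {0}, false };

	toy_rediscover(&phy, addr);
	if (clear_on_fail)
		toy_probe_fail_clear(&phy);
	else
		toy_probe_fail_stale(&phy);
	return toy_rediscover(&phy, addr);
}
```

In this model, toy_scenario(false) reports flutter on the second attach because the stale address still matches, while toy_scenario(true) lets the device be found again.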

Can you just manually clear the ex_phy's attached_sas_addr at the appropriate point (along with calling sas_unregister_dev())? It seems that we are using a heavy-handed approach in calling sas_unregister_devs_sas_addr(), which does the clearing and much more.

I just tried it and it worked. If we only clear ex_phy's attached_sas_addr, there is no need to call sas_destruct_ports(). We are currently using sas_unregister_devs_sas_addr() which will add the port to sas_port_del_list, so we need to call sas_destruct_ports() separately to delete the port.

Should we also delete the port after the device probe fails?

I'm not sure. Please check it.

sas_fail_probe() would still call sas_unregister_dev(), as required.

And you said that the sas_fail_probe() call would be asynchronous to sas_revalidate_domain(). I actually expected you to have the new call to sas_destruct_ports() at the top of sas_revalidate_domain(), like v2, but it is in sas_probe_devices().

Anyway, please check whether you require this additional call to delete the port.

Sorry, there was something wrong with the previous process description.
The correct sequence is:

1. REVALIDATING DOMAIN
2. new device attached, create port, etc.
4. done REVALIDATING DOMAIN
5. @out, handle parent->port->sas_port_del_list
6. sas_probe_devices()
7. if the device probe fails in step 6 and sas_unregister_devs_sas_addr() is called, phy->port->list is added to parent->port->sas_port_del_list // the port won't be deleted

8. next, REVALIDATING DOMAIN
9. new device attached
10. new port creation fails, as the port already exists.
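
The ordering problem in the steps above can be sketched as a small userspace model, assuming a port that is queued for deletion only goes away when the deletion list is next processed. All names here (toy_port, toy_create_port(), toy_destruct_ports(), toy_pass(), toy_second_attach_succeeds()) are hypothetical; the model only mirrors the order of events described in the list, not the actual libsas implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy port state; NOT the kernel's struct sas_port. */
struct toy_port {
	bool exists;		/* a port is registered for the phy */
	bool on_del_list;	/* queued for deletion, not yet destroyed */
};

/* Process the deletion list: destroy the port if it was queued. */
static void toy_destruct_ports(struct toy_port *p)
{
	if (p->on_del_list) {
		p->exists = false;
		p->on_del_list = false;
	}
}

/* Creating a port fails when one is already registered. */
static bool toy_create_port(struct toy_port *p)
{
	if (p->exists)
		return false;
	p->exists = true;
	return true;
}

/*
 * One revalidation pass followed by the probe.
 * Returns whether the port for the newly attached device was created.
 */
static bool toy_pass(struct toy_port *p, bool probe_fails,
		     bool destruct_after_probe)
{
	bool created = toy_create_port(p);	/* new device attached */

	toy_destruct_ports(p);			/* deletion list handled at @out */
	if (created && probe_fails)		/* probe runs after that point */
		p->on_del_list = true;		/* queued, but nobody processes it */
	if (destruct_after_probe)		/* extra pass after the probe, as in v3 */
		toy_destruct_ports(p);
	return created;
}

/* First attach fails its probe; report whether the second attach works. */
static bool toy_second_attach_succeeds(bool destruct_after_probe)
{
	struct toy_port port = { false, false };
	bool first = toy_pass(&port, true, destruct_after_probe);

	assert(first);
	return toy_pass(&port, false, destruct_after_probe);
}
```

Without the extra deletion pass the second attach fails in this model because the old port is still registered; with it, the port is gone before the next revalidation and creation succeeds.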


So v3 deletes the port at the end of sas_probe_devices(). And if we follow your suggestion and don't use sas_unregister_devs_sas_addr(), then we don't need to call sas_destruct_ports().

I am finding it hard to follow you now.

I'm sorry for that. ^-^

Can you show the complete change which you think that we now require to fix this issue?

Okay, I'll update a new version.

Thanks,
Xingui



