On 2025/3/11 1:45, John Garry wrote:
> Sure, but I am just trying to keep this simple. If you deform and
> reform the port - and so lose and find the disk (which does the itct
> config) - will that solve the problem?

We found that we need to perform lose and find for all devices on the
port, including those attached to the local phy and the remote phy.
This process still requires traversing the phy information of every
device to reset it, and it must also account for the race between
device removal and the current process. It looks similar to the
solution of updating the port id directly, and it has the problem
mentioned above: e.g., during error handling, the recovery state will
last for more than 15 seconds, affecting the performance of other
disks on the same host.

> How do you even detect the port id inconsistency for the device attached
> at the remote phy? For this series, you could detect it at the phy
> up/down handler for the directly attached device - how would it be
> triggered for the remote phy?

When the hardware port ID of the EXP is detected to have changed, a
link reset is performed on the local phys of the EXP in sequence. This
does not trigger the lose and find process for the EXP device, nor for
the disks connected to the remote phys. Therefore, it is necessary to
lose and find all devices separately to update each device's port id
in the itct.

local phy0 --------------------------- disk0
local phy1 --------------------------- disk1
local phy2 --------------------------- disk2
local phy3 --------------------------- disk3
                    _________
local phy4 --------|         |-------- disk4
                   |         |
local phy5 --------|         |-------- disk5
                   |         |
local phy6 --------|   EXP   |-------- disk6
                   |         |
local phy7 --------|         |-------- disk7
                   |_________|
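
To make the point about the remote phys concrete, here is a toy model of
the topology above. It is only a sketch and not driver code: the Device
class, the helper names, and the port id values (4 changing to 5) are all
hypothetical, and the "itct" here is just a cached integer per device. It
shows that a check limited to directly attached devices only notices the
EXP, while the disks behind it also carry the old port id.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    local_phys: tuple   # local phys the device is reached through
    direct: bool        # True if attached to a local phy, False if behind EXP
    itct_port_id: int   # port id cached in the modelled itct entry


def build_topology(phy_port_id):
    """Build the devices from the diagram: 4 direct disks, EXP, 4 remote disks."""
    wide = (4, 5, 6, 7)  # local phys forming the wide port to the EXP
    devices = [Device(f"disk{i}", (i,), True, phy_port_id[i]) for i in range(4)]
    devices.append(Device("EXP", wide, True, phy_port_id[4]))
    devices += [Device(f"disk{i}", wide, False, phy_port_id[4])
                for i in range(4, 8)]
    return devices


def stale_devices(devices, phy_port_id, direct_only=False):
    """Return names of devices whose cached port id differs from the hardware."""
    stale = []
    for dev in devices:
        if direct_only and not dev.direct:
            continue  # models a check that only sees directly attached devices
        if dev.itct_port_id != phy_port_id[dev.local_phys[0]]:
            stale.append(dev.name)
    return stale


def refresh_itct(devices, phy_port_id):
    """Model 'lose and find' for every device: rewrite the cached port id."""
    for dev in devices:
        dev.itct_port_id = phy_port_id[dev.local_phys[0]]


# Hypothetical initial state: phy0-3 are ports 0-3, phy4-7 form port 4.
phy_port_id = {i: i for i in range(4)}
phy_port_id.update({i: 4 for i in range(4, 8)})
devices = build_topology(phy_port_id)

# The hardware port id of the EXP changes (assumed new value: 5).
for phy in (4, 5, 6, 7):
    phy_port_id[phy] = 5

print(stale_devices(devices, phy_port_id, direct_only=True))
# ['EXP']
print(stale_devices(devices, phy_port_id))
# ['EXP', 'disk4', 'disk5', 'disk6', 'disk7']
```

In this model, refresh_itct() has to walk every device on the port, which
is the traversal described above.
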
Thanks,
Xingui