On 07/05/2024 12:17, Li, Eric (Honggang) wrote:
Sure, I don't particularly like it as a fix either. But first I would like to get your setup working
again to verify that only this needs fixing. Indeed something else may be broken since
a1b6fb947f923. In addition, if we need to backport a fix, I would only like to backport a fix for
real known broken topologies, and not theoretical issues not experienced.
But what exact setup do you have? I am confused, as you seem to be talking about your
setup being broken, but then other setup may also being broken, but you don't have access
to another setup. What is the other setup?
This issue was reported by our tester. His setup is
SAS Controller <--> SAS Expander <--> SAS Expander <--> SAS drives
<--> SAS Expander <--> SAS drives
......
When this issue occurred, no SAS drives could be detected.
As a workaround, I've already added those codes removed by the commit a1b6fb947f923, and at least we could detect the SAS drives.
ok, good to know, but it would be good to confirm that just re-adding
the code to set phy_state is enough.
Since our SAS expanders are self-configured, we don't need to explicitly configure the per-PHY routing tables.
So, I am not sure there is any other issue in this workaround as some per-PHY routing tables are not configured.
I think the root cause of this issue is the order of function calls
to
sas_dev_present_in_domain() and sas_ex_join_wide_port().
My concern here is whether or not we still need to configure routing
on the parent SAS expander before calling sas_ex_join_wide_port().
This part of logic is not present previously and I don't have environment to test this.
Back to your question, I believe we do need to join a wideport to downstream expander.
Changing the phy_state to PHY_STATE_DISCOVERED will skip scanning
those PHYs,
I would have thought that re-adding the code removed in a1b6fb947f923 to set
PHY_STATE_DISCOVERED would only affect the first phy scanned in that wideport. Other
phys scanned would have it set through calls to
sas_ex_join_wide_port()
I don't catch your point.
In my understanding, it would affect the rest PHYs in that wide port, not the first one.
The first PHY has been scanned/discovered and added to a port (at that time, it is just a narrow port yet).
Agreed
Call to sas_ex_join_wide_port() makes the rest PHYs associated with that existing port (making it become wideport) and set up sysfs between the PHY and port. > Set PHY_STATE_DISCOVERED would make the rest PHYs not being scanned/discovered again (as this wide port is already scanned).
If you can just confirm that re-adding the code to set phy_state =
DISCOVERED is good enough to see the SAS disks again, then this can be
further discussed.
Thanks,
John