Internal Use - Confidential

>-----Original Message-----
>From: Jason Yan <yanaijie@huawei.com>
>Sent: Thursday, April 25, 2024 10:58 AM
>To: John Garry <john.g.garry@oracle.com>; Li, Eric (Honggang) <Eric.H.Li@Dell.com>;
>james.bottomley@hansenpartnership.com; Martin K . Petersen <martin.petersen@oracle.com>
>Cc: linux-scsi@vger.kernel.org
>Subject: Re: Issue in sas_ex_discover_dev() for multiple level of SAS expanders in a domain
>
>
>[EXTERNAL EMAIL]
>
>On 2024/4/24 18:46, John Garry wrote:
>> On 24/04/2024 09:59, Li, Eric (Honggang) wrote:
>>> Hi,
>>>
>>> There is an issue in the function sas_ex_discover_dev() when I have
>>> multiple SAS expanders chained under one SAS port on the SAS controller.
>>
>> I think typically we can't and so don't test such a setup.
>
>Eric,
>
>I also don't understand why you need such a setup. Can you explain more details of your
>topology?

I believe this is a common setup if you want to support a large number of drives under one SAS port of a SAS controller.

>
>>
>>>
>>> In this function, we first check whether the PHY’s
>>> attached_sas_address is already present in the SAS domain, and then
>>> check if this PHY belongs to an existing port on this SAS expander.
>>> I think this has an issue if this SAS expander uses a wide port
>>> connecting a downstream SAS expander.
>>> This is because if the PHY belongs to an existing port on this SAS
>>> expander, the attached SAS address of this port must already be
>>> present in the domain and it results in disabling that port.
>>> I don’t think that is what we expect.
>>>
>>> In the old release (4.x), at the end of this function, it would make
>>> an additional sas_ex_join_wide_port() call for any possible PHYs that
>>> could be added into the SAS port.
>>> This makes subsequent PHYs (other than the first PHY of that
>>> port) be marked as DISCOVERED so that this function would not be
>>> invoked on those subsequent PHYs (in that port).
>>> But a potential question here is that we didn’t configure the per-PHY
>>> routing table for those PHYs.
>>> As I don’t have such a SAS expander on hand, I am not sure what the
>>> impact is (maybe just a performance/bandwidth impact).
>>> But at least, it didn’t impact the functionality of that port.
>>>
>>> But in v5.3 or later releases, that part of the code was removed (in
>>> commit a1b6fb947f923).
>>
>> Jason, can you please check this?
>
>The removed code is only for races before we serialize the event processing. All PHYs will still
>be scanned one by one and add to the wide port if they have the same address. So are you
>encountering a real issue? If so, can you share the full log?

Yes. We hit this issue when we upgraded the Linux kernel from 4.19.236 to 5.14.21. Full logs are attached.

>
>Thanks,
>Jason
>
>Wishing you all the best!
>
>>
>> Thanks!
>>
>>> And this caused the problem to occur (the downstream port of that SAS
>>> expander was disabled and all downstream SAS devices were removed
>>> from the domain).
>>>
>>> Regards.
>>> Eric Li
>>>
>>> SPE, DellEMC
>>> 3/F KIC 1, 252# Songhu Road, YangPu District, SHANGHAI
>>> +86-21-6036-4384
>>>
>>>
>>
>> .
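
To make the check-ordering concern in the quoted report easier to follow, here is a minimal user-space sketch of it. This is only a toy model under stated assumptions, not libsas code: `toy_phy`, `toy_expander`, `toy_domain`, `discover_phy()` and `scan_and_count()` are hypothetical names invented for illustration, and the "join first" ordering is shown purely as a contrast, not as a proposed patch. It assumes a wide port is simply a set of PHYs on one expander that report the same attached SAS address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PHYS 4
#define MAX_ADDRS 8

/* Hypothetical toy types; they do not mirror the real libsas structures. */
struct toy_phy {
    uint64_t attached_addr; /* address seen on the far end, 0 = nothing attached */
    bool in_port;           /* PHY is a member of a port on this expander */
    bool disabled;          /* PHY was disabled during discovery */
};

struct toy_expander { struct toy_phy phy[NUM_PHYS]; };
struct toy_domain { uint64_t addr[MAX_ADDRS]; int count; };

static bool in_domain(const struct toy_domain *d, uint64_t a)
{
    for (int i = 0; i < d->count; i++)
        if (d->addr[i] == a)
            return true;
    return false;
}

/* True if some enabled PHY on this expander already forms a port to addr. */
static bool port_exists(const struct toy_expander *ex, uint64_t addr)
{
    for (int i = 0; i < NUM_PHYS; i++)
        if (ex->phy[i].in_port && !ex->phy[i].disabled &&
            ex->phy[i].attached_addr == addr)
            return true;
    return false;
}

/* Disable every PHY on this expander that is attached to addr. */
static void disable_port(struct toy_expander *ex, uint64_t addr)
{
    for (int i = 0; i < NUM_PHYS; i++)
        if (ex->phy[i].attached_addr == addr) {
            ex->phy[i].disabled = true;
            ex->phy[i].in_port = false;
        }
}

/* Discover one PHY. domain_check_first selects the order of the two checks. */
static void discover_phy(struct toy_expander *ex, struct toy_domain *d,
                         int id, bool domain_check_first)
{
    assert(id >= 0 && id < NUM_PHYS);
    struct toy_phy *p = &ex->phy[id];

    if (p->attached_addr == 0 || p->disabled)
        return;
    /* Ordering described in the report: domain presence is tested first. */
    if (domain_check_first && in_domain(d, p->attached_addr)) {
        disable_port(ex, p->attached_addr);
        return;
    }
    /* Same address as an existing port on this expander: join the wide port. */
    if (port_exists(ex, p->attached_addr)) {
        p->in_port = true;
        return;
    }
    /* Address known elsewhere in the domain but no local port: treat as duplicate. */
    if (in_domain(d, p->attached_addr)) {
        disable_port(ex, p->attached_addr);
        return;
    }
    /* First PHY to reach a new device: record the address and form the port. */
    assert(d->count < MAX_ADDRS);
    d->addr[d->count++] = p->attached_addr;
    p->in_port = true;
}

/* Scan an expander whose PHYs 0 and 1 both attach to addr (a 2-PHY wide port)
 * and return how many PHYs end up as enabled port members. */
static int scan_and_count(bool domain_check_first, uint64_t addr)
{
    struct toy_expander ex = {0};
    struct toy_domain d = {0};
    int members = 0;

    ex.phy[0].attached_addr = addr;
    ex.phy[1].attached_addr = addr;
    for (int i = 0; i < NUM_PHYS; i++)
        discover_phy(&ex, &d, i, domain_check_first);
    for (int i = 0; i < NUM_PHYS; i++)
        if (ex.phy[i].in_port && !ex.phy[i].disabled)
            members++;
    return members;
}
```

In this model, `scan_and_count(true, addr)` returns 0: the second PHY finds its attached address already in the domain and the whole port is disabled, which matches the symptom described above. `scan_and_count(false, addr)` returns 2 because the second PHY joins the existing port before the domain check is reached.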
Attachment: dmesg.log