RE: Issue in sas_ex_discover_dev() for multiple level of SAS expanders in a domain

John,

I agree that the call to sas_ex_join_wide_port() is not mandatory. In fact, some of the logic here is already similar to that function, so we don't need to do it again.
But just updating phy_state may not be enough. I suppose you still need to add that PHY to the corresponding wide port by calling sas_port_add_phy() and to update phy->port.
Updating only phy_state may keep the port from being disabled in this case, but the other missing pieces of work may cause other issues.
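
To illustrate the difference, here is a minimal user-space sketch. It is not the libsas code: the model_* types and helpers are hypothetical stand-ins that only mirror the three fields under discussion (phy_state, the attached SAS address, and the port pointer). It contrasts setting the state alone with also joining the PHY to the wide port of a sibling PHY that has the same attached address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model types; they only imitate the fields discussed above. */
#define MODEL_SAS_ADDR_SIZE 8
#define MODEL_MAX_PORT_PHYS 8

enum model_phy_state { MODEL_PHY_VACANT, MODEL_PHY_DEVICE_DISCOVERED };

struct model_port {
	int phy_ids[MODEL_MAX_PORT_PHYS];	/* PHYs that are members of this port */
	int num_phys;
};

struct model_ex_phy {
	int phy_id;
	unsigned char attached_sas_addr[MODEL_SAS_ADDR_SIZE];
	enum model_phy_state phy_state;
	struct model_port *port;		/* NULL until the PHY joins a port */
};

/* Option A: only flag the PHY as discovered. Port membership is untouched. */
static void model_mark_discovered(struct model_ex_phy *phy)
{
	phy->phy_state = MODEL_PHY_DEVICE_DISCOVERED;
}

/*
 * Option B: find a sibling PHY with the same attached address that already
 * has a port, add this PHY to that port, record the port on the PHY, and
 * then flag it as discovered. Returns true if the PHY joined a port.
 */
static bool model_join_wide_port(struct model_ex_phy *phys, int num_phys,
				 int phy_id)
{
	assert(phy_id >= 0 && phy_id < num_phys);
	struct model_ex_phy *phy = &phys[phy_id];

	for (int i = 0; i < num_phys; i++) {
		struct model_ex_phy *other = &phys[i];

		if (other == phy || other->port == NULL)
			continue;
		if (memcmp(phy->attached_sas_addr, other->attached_sas_addr,
			   MODEL_SAS_ADDR_SIZE) != 0)
			continue;
		if (other->port->num_phys >= MODEL_MAX_PORT_PHYS)
			return false;

		/* Stands in for sas_port_add_phy() plus the phy->port update. */
		other->port->phy_ids[other->port->num_phys++] = phy->phy_id;
		phy->port = other->port;
		phy->phy_state = MODEL_PHY_DEVICE_DISCOVERED;
		return true;
	}
	return false;
}
```

In this model, after option A the second PHY is flagged as discovered but its port pointer is still NULL and the port still lists one member. After option B both the port and the PHY agree on the membership.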

Eric Li


Internal Use - Confidential
> -----Original Message-----
> From: John Garry <john.g.garry@oracle.com>
> Sent: Wednesday, May 1, 2024 10:24 PM
> To: Li, Eric (Honggang) <Eric.H.Li@Dell.com>; Jason Yan <yanaijie@huawei.com>;
> james.bottomley@hansenpartnership.com; Martin K . Petersen
> <martin.petersen@oracle.com>
> Cc: linux-scsi@vger.kernel.org
> Subject: Re: Issue in sas_ex_discover_dev() for multiple level of SAS expanders in a
> domain
>
>
> [EXTERNAL EMAIL]
>
> On 30/04/2024 15:22, Li, Eric (Honggang) wrote:
> > I suppose you have got the log file I attached.
> > If not, please let me know.
> > Any update about this?
> >
> > Eric LI
>
> So if you revert a1b6fb947f923, but then remove the call to
> sas_ex_join_wide_port() re-added in that revert, is it ok? I am just wondering are
> we just missing the call to set phy_state = PHY_DEVICE_DISCOVERED after v5.3?
>
> Thanks,
> John
>
> >
> >
> > Internal Use - Confidential
> >> -----Original Message-----
> >> From: Li, Eric (Honggang)
> >> Sent: Thursday, April 25, 2024 1:04 PM
> >> To: Jason Yan <yanaijie@huawei.com>; John Garry
> >> <john.g.garry@oracle.com>; james.bottomley@hansenpartnership.com;
> >> Martin K . Petersen <martin.petersen@oracle.com>
> >> Cc: linux-scsi@vger.kernel.org
> >> Subject: RE: Issue in sas_ex_discover_dev() for multiple level of SAS
> >> expanders in a domain
> >>
> >>> -----Original Message-----
> >>> From: Jason Yan <yanaijie@huawei.com>
> >>> Sent: Thursday, April 25, 2024 10:58 AM
> >>> To: John Garry <john.g.garry@oracle.com>; Li, Eric (Honggang)
> >>> <Eric.H.Li@Dell.com>; james.bottomley@hansenpartnership.com; Martin K .
> >>> Petersen <martin.petersen@oracle.com>
> >>> Cc: linux-scsi@vger.kernel.org
> >>> Subject: Re: Issue in sas_ex_discover_dev() for multiple level of
> >>> SAS expanders in a domain
> >>>
> >>>
> >>> [EXTERNAL EMAIL]
> >>>
> >>> On 2024/4/24 18:46, John Garry wrote:
> >>>> On 24/04/2024 09:59, Li, Eric (Honggang) wrote:
> >>>>> Hi,
> >>>>>
> >>>>> There is an issue in the function sas_ex_discover_dev() when I
> >>>>> have multiple SAS expanders chained under one SAS port on SAS
> >>>>> controller.
> >>>>
> >>>> I think typically we can't and so don't test such a setup.
> >>>
> >>> Eric,
> >>>
> >>> I also don't understand why you need such a setup. Can you explain
> >>> more details of your topology?
> >>
> >> I believe this is common setup if you want to support large number of
> >> drives under one SAS port of SAS controller.
> >>
> >>>
> >>>>
> >>>>>
> >>>>> In this function, we first check whether the PHY's
> >>>>> attached_sas_address is already present in the SAS domain, and
> >>>>> then check if this PHY belongs to an existing port on this SAS expander.
> >>>>> I think this has an issue if this SAS expander uses a wide port
> >>>>> to connect to a downstream SAS expander.
> >>>>> This is because if the PHY belongs to an existing port on this SAS
> >>>>> expander, the attached SAS address of this port must already be
> >>>>> present in the domain, and that results in disabling the port.
> >>>>> I don't think that is what we expect.
> >>>>>
> >>>>> In old releases (4.x), at the end of this function, it would make
> >>>>> an additional sas_ex_join_wide_port() call for any PHYs that
> >>>>> could possibly be added to the SAS port.
> >>>>> This marks the subsequent PHYs (other than the first PHY of that
> >>>>> port) as DISCOVERED so that this function is not
> >>>>> invoked on those subsequent PHYs (in that port).
> >>>>> But a potential question here is that we didn't configure the per-PHY
> >>>>> routing table for those PHYs.
> >>>>> As I don't have such a SAS expander on hand, I am not sure what the
> >>>>> impact is (maybe just a performance/bandwidth impact).
> >>>>> But at least it didn't impact the functionality of that port.
> >>>>>
> >>>>> But in v5.3 and later releases, that part of the code was removed (in
> >>>>> commit a1b6fb947f923).
> >>>>
> >>>> Jason, can you please check this?
> >>>
> >>> The removed code is only for races before we serialize the event
> >>> processing. All PHYs will still be scanned one by one and add to the
> >>> wide port if they have the same address. So are you encountering a
> >>> real issue? If so, can you share the full log?
> >>
> >> Yes. We did hit this issue when we upgrade Linux kernel from 4.19.236 to 5.14.21.
> >> Full logs attached.
> >>
> >>>
> >>> Thanks,
> >>> Jason
> >>>
> >>> Wishing you all the best!
> >>>
> >>>>
> >>>> Thanks!
> >>>>
> >>>>> And this caused the problem to occur (the downstream port of that SAS
> >>>>> expander was disabled and all downstream SAS devices were removed
> >>>>> from the domain).
> >>>>>
> >>>>> Regards.
> >>>>> Eric Li
> >>>>>
> >>>>> SPE, DellEMC
> >>>>> 3/F KIC 1, 252# Songhu Road, YangPu District, SHANGHAI
> >>>>> +86-21-6036-4384
> >>>>>
> >>>>>
> >>>>> Internal Use - Confidential
> >>>>
> >>>> .





