Re: [REGRESSION] v6.8 SMC-D issues

On 2024/1/24 22:29, Alexandra Winter wrote:
Hello Wen Gu,

our colleague Matthew reported that SMC-D is failing in certain scenarios on
kernel v6.8 (thx Matt!). He bisected it to
b40584d ("net/smc: compatible with 128-bits extended GID of virtual ISM device")
I think the root cause could also be somewhere else in the SMC-Dv2.1 patchset.

I was able to reproduce the issue on a 6.8.0-rc1 kernel.
I tested iperf over smc-d with:
smc_run iperf3 -s
smc_run iperf3 -c <IP@>

1) Doing an iperf in a single system using 127.0.0.1 as IP@
(System A=iperf client=iperf server)
2) Doing iperf to a remote system (System A=client; System B=iperf server)

The second iperf fails with an error message like:
"iperf3: error - unable to receive cookie at server: Bad file descriptor" on the server.

If I do first 2) (iperf to remote) and then 1) (iperf to local), then the
iperf to local fails.

I can do multiple iperf to the first server without problems.

I ran it on a debug server with KASAN, but got no reports in the Logfile.

I will try to debug further, but wanted to let you all know.

Kind regards
Alexandra

Reported-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>


Hi Alexandra and Matthew,

Thank you very much for the detailed description.

I tried to reproduce this with loopback-ism, after cutting some checks so that the
remote-system handshake can complete. After some debugging I found an elementary mistake
of mine in b40584d ("net/smc: compatible with 128-bits extended GID of virtual ISM device").

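The operator precedence in smcd_lgr_match() is not what I intended: '?:' binds more
loosely than '&&', so the whole '&&' chain becomes the condition of the conditional
expression, and whenever that chain is false the function returns 1. As a result it
will always return 'true' in the remote-system case.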

 static bool smcd_lgr_match(struct smc_link_group *lgr,
-                          struct smcd_dev *smcismdev, u64 peer_gid)
+                          struct smcd_dev *smcismdev,
+                          struct smcd_gid *peer_gid)
 {
-       return lgr->peer_gid == peer_gid && lgr->smcd == smcismdev;
+       return lgr->peer_gid.gid == peer_gid->gid && lgr->smcd == smcismdev &&
+               smc_ism_is_virtual(smcismdev) ?
+               (lgr->peer_gid.gid_ext == peer_gid->gid_ext) : 1;
 }

Could you please try again with this patch to see if this is the root cause?
Really sorry for the inconvenience.

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index da6a8d9c81ea..c6a6ba56c9e3 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -1896,8 +1896,8 @@ static bool smcd_lgr_match(struct smc_link_group *lgr,
                           struct smcd_gid *peer_gid)
 {
        return lgr->peer_gid.gid == peer_gid->gid && lgr->smcd == smcismdev &&
-               smc_ism_is_virtual(smcismdev) ?
-               (lgr->peer_gid.gid_ext == peer_gid->gid_ext) : 1;
+               (smc_ism_is_virtual(smcismdev) ?
+                (lgr->peer_gid.gid_ext == peer_gid->gid_ext) : 1);
 }


Thanks,
Wen Gu



