On 2023/5/31 04:34, Wenjia Zhang wrote:
Hi Wen,
Sorry for the late reply; there was a public holiday here!
I really like the test scenario, thank you for the elaboration and the fixes!
They look good to me.
I asked because the first patch looked very reasonable, but I was wondering why I had not run into this problem
before ;-) and whether the patch would cause any problem when processing the SMCRv1 ADD Link Continuation messages.
After checking the code again, I don't think the patch would cause any problem, because processing the
SMCRv1 ADD Link Continuation messages is about the same RMB.
Hi @Paolo, I would appreciate it if you could give us more time to review and test the patches, because we have to make
sure that they work without problems on our platform (s390), not only on x86.
Thanks
Wenjia
Inspired by your comments, I checked SMCRv1 and found that it has a similar issue in smc_llc_add_link_cont().
The cause and the way to reproduce it are similar to the issue in SMCRv2. I will fix this as well.
[ 361.813390] BUG: kernel NULL pointer dereference, address: 0000000000000014
[ 361.814121] #PF: supervisor read access in kernel mode
[ 361.814646] #PF: error_code(0x0000) - not-present page
[ 361.815160] PGD 0 P4D 0
[ 361.815431] Oops: 0000 [#1] PREEMPT SMP PTI
[ 361.815866] CPU: 5 PID: 48 Comm: kworker/5:0 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #49
[ 361.817952] Workqueue: events smc_llc_add_link_work [smc]
[ 361.818527] RIP: 0010:smc_llc_add_link_cont+0x160/0x270 [smc]
[ 361.820973] RSP: 0018:ffffa737801d3d50 EFLAGS: 00010286
[ 361.821517] RAX: ffff964f82144000 RBX: ffffa737801d3dd8 RCX: 0000000000000000
[ 361.822246] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff964f81370c30
[ 361.822957] RBP: ffffa737801d3dd4 R08: ffff964f81370000 R09: ffffa737801d3db0
[ 361.823678] R10: 0000000000000001 R11: 0000000000000060 R12: ffff964f82e70000
[ 361.824409] R13: ffff964f81370c38 R14: ffffa737801d3dd3 R15: 0000000000000001
[ 361.825119] FS: 0000000000000000(0000) GS:ffff9652bfd40000(0000) knlGS:0000000000000000
[ 361.825934] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 361.826515] CR2: 0000000000000014 CR3: 000000008fa20004 CR4: 00000000003706e0
[ 361.827251] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 361.827989] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 361.828712] Call Trace:
[ 361.828964] <TASK>
[ 361.829182] smc_llc_srv_rkey_exchange+0xa7/0x190 [smc]
[ 361.829726] smc_llc_srv_add_link+0x3ae/0x5a0 [smc]
[ 361.830246] smc_llc_add_link_work+0xb8/0x140 [smc]
[ 361.830752] process_one_work+0x1e5/0x3f0
[ 361.831173] worker_thread+0x4d/0x2f0
[ 361.831531] ? __pfx_worker_thread+0x10/0x10
[ 361.831925] kthread+0xe5/0x120
[ 361.832239] ? __pfx_kthread+0x10/0x10
[ 361.832630] ret_from_fork+0x2c/0x50
[ 361.833004] </TASK>
[ 361.833236] Modules linked in: binfmt_misc(E) smc_diag(E) smc(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E)
mousedev(E) psmouse(E) i2c_piix4(E) pcspkr(E) ip_tables(E) mlx5_ib(E) ib_uverbs(E) ib_core(E) cirrus(E) ata_generic(E)
drm_shmem_helper(E) drm_kms_helper(E) syscopyarea(E) ata_piix(E) sysfillrect(E) crct10dif_pclmul(E) sysimgblt(E)
mlx5_core(E) crc32_pclmul(E) drm(E) virtio_net(E) mlxfw(E) crc32c_intel(E) ghash_clmulni_intel(E) net_failover(E)
psample(E) i2c_core(E) failover(E) pci_hyperv_intf(E) serio_raw(E) libata(E) dm_mirror(E) dm_region_hash(E) dm_log(E)
dm_mod(E)
[ 361.839180] CR2: 0000000000000014
Thanks,
Wen Gu