Re: Divide by zero in iaa_crypto during boot of a kdump kernel

Hi Jerry,

On Tue, 2024-03-19 at 13:51 -0700, Jerry Snitselaar wrote:
> Hi Tom,
> 
> While looking at a different issue on a GNR system I noticed that
> during the boot of the kdump kernel it crashes when probing iaa_crypto
> due to a divide by zero in rebalance_wq_table. The problem is that the
> kdump kernel comes up with a single cpu, and if there are multiple iaa
> devices cpus_per_iaa is going to be calculated to be 0, and then the
> 'if ((cpu % cpus_per_iaa) == 0)' in rebalance_wq_table results in a
> divide by zero. I reproduced it with the 6.8 eln kernel, and so far
> have reproduced it on GNR, EMR, and SRF systems. I'm assuming the same
> will be the case on an SPR system with IAA devices enabled if I can
> find one.
> 

Good catch, I've never tested that before. Thanks for reporting it.

> Should save_iaa_wq return an error if the number of iaa devices is
> greater than the number of cpus?
> 

No, you should still be able to use the driver with just one cpu; maybe
it just always maps to the same device. I'll take a look and come up
with a fix.
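
For illustration only, here is a rough userspace sketch of the arithmetic
described above and one way it could be guarded. It is not the driver
code and not necessarily the shape of the eventual fix; the helper names
cpus_per_iaa_clamped() and cpu_to_iaa() are made up for this example.
The idea is just that nr_cpus / nr_iaa truncates to 0 when there are
fewer cpus than devices, so the divisor is clamped to 1 before it is
used.

```c
/*
 * Hypothetical helper (not a driver function): number of cpus that
 * share one IAA device, clamped so the result is never 0.
 */
static unsigned int cpus_per_iaa_clamped(unsigned int nr_cpus,
					 unsigned int nr_iaa)
{
	unsigned int per;

	if (nr_iaa == 0)
		return 1;	/* no devices: any non-zero divisor is safe */

	per = nr_cpus / nr_iaa;	/* integer division: 1 / 4 == 0 */
	return per ? per : 1;	/* clamp so later '/' and '%' cannot trap */
}

/*
 * Hypothetical helper: map a cpu number to a device index, keeping the
 * index inside [0, nr_iaa - 1] when the division leaves a remainder.
 */
static unsigned int cpu_to_iaa(unsigned int cpu, unsigned int nr_cpus,
			       unsigned int nr_iaa)
{
	unsigned int per = cpus_per_iaa_clamped(nr_cpus, nr_iaa);
	unsigned int idx = cpu / per;

	if (nr_iaa && idx >= nr_iaa)
		idx = nr_iaa - 1;	/* leftover cpus share the last device */

	return idx;
}
```

With nr_cpus = 1 and nr_iaa = 4 (the kdump case), cpus_per_iaa_clamped()
returns 1 instead of 0, and cpu_to_iaa(0, 1, 4) returns 0, so the single
cpu always maps to the first device.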

Tom

> 
>     [   17.242696] idxd: crypto: iaa_crypto now ENABLED
>     [   17.248641] divide error: 0000 [#1] PREEMPT SMP NOPTI
>     [   17.254358] CPU: 0 PID: 396 Comm: systemd-udevd Not tainted 6.8.0-63.eln136.1.x86_64 #1
>     [   17.263399] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.IPC.2780.D02.2311070514 11/07/2023
>     [   17.275266] RIP: 0010:rebalance_wq_table.part.0+0x163/0x220 [iaa_crypto]
>     [   17.282851] Code: 85 c0 74 c1 8b 35 6d ed f3 c2 31 db 48 39 f3 73 4d 48 89 da 4c 89 f7 e8 9b 5a 26 c1 3b 05 55 ed f3 c2 89 c6 73 38 31 d2 89 d8 <f7> 35 9f 76 00 00 83 fa 01 41 83 d5 00 44 89 ef e8 68 f9 ff ff 85
>     [   17.303974] RSP: 0018:ffa0000001147bb0 EFLAGS: 00010246
>     [   17.309895] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000ffffffff
>     [   17.317956] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>     [   17.326016] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
>     [   17.334076] R10: ff1100005bff93c0 R11: 0000000000000003 R12: ffffffff826cbba8
>     [   17.342137] R13: 00000000ffffffff R14: ff1100005bff93c0 R15: ff110000563968e0
>     [   17.350197] FS:  00007f0697de8540(0000) GS:ff1100005ba00000(0000) knlGS:0000000000000000
>     [   17.359333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     [   17.365834] CR2: 000055bf003ad358 CR3: 0000000046632003 CR4: 0000000000f71eb0
>     [   17.373900] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>     [   17.381960] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
>     [   17.390020] PKRU: 55555554
>     [   17.393113] Call Trace:
>     [   17.395905]  <TASK>
>     [   17.398310]  ? die+0x36/0x90
>     [   17.401600]  ? do_trap+0xda/0x100
>     [   17.405373]  ? rebalance_wq_table.part.0+0x163/0x220 [iaa_crypto]
>     [   17.412265]  ? do_error_trap+0x65/0x80
>     [   17.416519]  ? rebalance_wq_table.part.0+0x163/0x220 [iaa_crypto]
>     [   17.423412]  ? exc_divide_error+0x38/0x50
>     [   17.427970]  ? rebalance_wq_table.part.0+0x163/0x220 [iaa_crypto]
>     [   17.434861]  ? asm_exc_divide_error+0x1a/0x20
>     [   17.439805]  ? rebalance_wq_table.part.0+0x163/0x220 [iaa_crypto]
>     [   17.446696]  iaa_crypto_probe+0x117/0x2e0 [iaa_crypto]
>     [   17.452514]  really_probe+0x19b/0x3e0
>     [   17.456674]  ? __pfx___driver_attach+0x10/0x10
>     [   17.461715]  __driver_probe_device+0x78/0x160
>     [   17.466659]  driver_probe_device+0x1f/0xa0
>     [   17.471313]  __driver_attach+0xba/0x1c0
>     [   17.475665]  bus_for_each_dev+0x8c/0xe0
>     [   17.480028]  bus_add_driver+0x116/0x220
>     [   17.484380]  driver_register+0x5c/0x100
>     [   17.488731]  iaa_crypto_init_module+0xe5/0xff0 [iaa_crypto]
>     [   17.495043]  ? __pfx_iaa_crypto_init_module+0x10/0x10 [iaa_crypto]
>     [   17.502032]  do_one_initcall+0x58/0x310
>     [   17.506385]  do_init_module+0x60/0x240
>     [   17.510640]  __do_sys_init_module+0x17a/0x1b0
>     [   17.515587]  do_syscall_64+0x81/0x160
>     [   17.519746]  ? handle_mm_fault+0xdd/0x360
>     [   17.524302]  ? do_user_addr_fault+0x2fe/0x670
>     [   17.529248]  ? exc_page_fault+0x6b/0x150
>     [   17.533697]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
>     [   17.539413] RIP: 0033:0x7f0698a2ef1e
>     [   17.543479] Code: 48 8b 0d 05 af 0e 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d2 ae 0e 00 f7 d8 64 89 01 48
>     [   17.564605] RSP: 002b:00007ffe27da0918 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
>     [   17.573156] RAX: ffffffffffffffda RBX: 000055beffb45cb0 RCX: 00007f0698a2ef1e
>     [   17.581216] RDX: 000055beffb78ba0 RSI: 0000000000026400 RDI: 000055bf00386cf0
>     [   17.589276] RBP: 000055bf00386cf0 R08: 000055beffb70340 R09: 0000000000026010
>     [   17.597337] R10: 0000000000000005 R11: 0000000000000246 R12: 000055beffb78ba0
>     [   17.605397] R13: 000055beffb71110 R14: 0000000000000000 R15: 000055beffb45fe0
>     [   17.613459]  </TASK>
> 
> 
> Regards,
> Jerry
> 





