On Tue, Nov 7, 2017 at 4:38 PM, Yu Chen <yu.chen.surf@xxxxxxxxx> wrote:
> Hi all,
> We are using 4.13.5-100.fc25.x86_64 and a panic was found during
> resume from hibernation, the backtrace is illustrated as below, would
> someone please take a look if this has already been fixed or is this issue still
> in the upstream kernel? thanks!
> [ 114.846213] PM: Using 3 thread(s) for decompression.
> [ 114.846213] PM: Loading and decompressing image data (6555729 pages)...
> [ 115.143169] PM: Image loading progress: 0%
> [ 156.386990] PM: Image loading progress: 10%
> [ 175.114169] PM: Image loading progress: 20%
> [ 185.364073] PM: Image loading progress: 30%
> [ 191.345652] PM: Image loading progress: 40%
> [ 200.655883] PM: Image loading progress: 50%
> [ 220.084360] PM: Image loading progress: 60%
> [ 240.581079] PM: Image loading progress: 70%
> [ 250.406290] general protection fault: 0000 [#1] SMP
> [ 250.411779] Modules linked in: nouveau video mxm_wmi i2c_algo_bit
> drm_kms_helper ttm drm crc32c_intel wmi
> [ 250.422524] CPU: 99 PID: 0 Comm: swapper/99 Not tainted
> 4.13.5-100.fc25.x86_64 #1
> [ 250.430902] Hardware name: Intel Corporation PURLEY/PURLEY, BIOS
> PLYXCRB1.86B.0521.D18.1710241520 10/24/2017
> [ 250.441901] task: ffff97f5827c0000 task.stack: ffffb0e418cdc000
> [ 250.448528] RIP: 0010:bio_integrity_advance+0x1a/0xf0
> [ 250.454182] RSP: 0018:ffff97f58f6c3da8 EFLAGS: 00010202
> [ 250.460024] RAX: db19e5a5b91ff161 RBX: 58b38c0def2b26b8 RCX: 0000000180400021
> [ 250.468008] RDX: 0000000000000000 RSI: 0000000000008000 RDI: ffff97f56eb7fd20
> [ 250.475993] RBP: ffff97f58f6c3db0 R08: ffff97f56d8d3600 R09: 0000000180400021
> [ 250.483976] R10: ffff97f58f6c3c48 R11: 00000000000a8000 R12: 0000000000008000
> [ 250.491961] R13: ffff9739fcdfd400 R14: 00000000000a0000 R15: 0000000000008000
> [ 250.499944] FS: 0000000000000000(0000) GS:ffff97f58f6c0000(0000)
> knlGS:0000000000000000
> [ 250.508997] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 250.515427] CR2: 0000565407552e40 CR3: 00000115b7a67000 CR4: 00000000007406e0
> [ 250.523412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 250.533458] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 250.543500] PKRU: 55555554
> [ 250.548604] Call Trace:
> [ 250.553415] <IRQ>
> [ 250.557729] bio_advance+0x28/0xf0
> [ 250.563598] blk_update_request+0x92/0x2f0
> [ 250.570223] scsi_end_request+0x37/0x1d0
> [ 250.576654] scsi_io_completion+0x20e/0x690
> [ 250.583362] ? rebalance_domains+0x160/0x2b0
> [ 250.590187] scsi_finish_command+0xd9/0x120
> [ 250.596924] scsi_softirq_done+0x125/0x140
> [ 250.603562] blk_done_softirq+0x9e/0xd0
> [ 250.609916] __do_softirq+0x10c/0x2a5
> [ 250.616073] irq_exit+0xff/0x110
> [ 250.621737] smp_call_function_single_interrupt+0x33/0x40
> [ 250.629831] call_function_single_interrupt+0x93/0xa0
> [ 250.637544] RIP: 0010:cpuidle_enter_state+0x126/0x2c0
> [ 250.645263] RSP: 0018:ffffb0e418cdfe60 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffff04
> [ 250.655814] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 000000000000001f
> [ 250.665885] RDX: 0000003a4d5f2d20 RSI: ffffffc820eb310b RDI: 0000000000000000
> [ 250.675956] RBP: ffffb0e418cdfe98 R08: 0000000000000176 R09: 0000000000000018
> [ 250.686018] R10: ffffb0e418cdfe30 R11: 0000000000000094 R12: ffff97f58f6e3b00
> [ 250.696080] R13: ffffffffb1f72a78 R14: 0000003a4d5f2d20 R15: ffffffffb1f72a60
> [ 250.706123] </IRQ>
> [ 250.710547] cpuidle_enter+0x17/0x20
> [ 250.716609] call_cpuidle+0x23/0x40
> [ 250.722550] do_idle+0x18e/0x1e0
> [ 250.728177] cpu_startup_entry+0x73/0x80
> [ 250.734560] start_secondary+0x156/0x190
> [ 250.740930] secondary_startup_64+0x9f/0x9f
> [ 250.747578] Code: 01 79 cc b1 e8 09 16 ce ff 31 c0 eb e6 0f 1f 40
> 00 0f 1f 44 00 00 55 48 89 e5 53 31 db f6 47 16 01 74 04 48 8b 5f 68
> 48 8b 47 08 <48> 8b 80 80 00 00 00 48 8b 90 d0 03 00 00 48 83 ba 48 02
> 00 00
> [ 250.770821] RIP: bio_integrity_advance+0x1a/0xf0 RSP: ffff97f58f6c3da8
> [ 250.780481] ---[ end trace d7b00b76aab34156 ]---
> [ 250.841521] Kernel panic - not syncing: Fatal exception in interrupt
> [ 250.851158] Kernel Offset: 0x30000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 250.912067] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

According to the log, the exception was triggered when trying to access
bio->bi_bdev->bd_disk:

00000000000003b0 <bio_integrity_advance>:
 3b0:   e8 00 00 00 00          callq  3b5 <bio_integrity_advance+0x5>
 ...
 3c2:   48 8b 5f 68             mov    0x68(%rdi),%rbx
 3c6:   48 8b 47 08             mov    0x8(%rdi),%rax

bio->bi_bdev->bd_disk, BOOM!
 3ca:   48 8b 80 80 00 00 00    mov    0x80(%rax),%rax

When the exception was triggered, bio->bi_bdev was:
RAX: db19e5a5b91ff161
Besides, we can see that bio->bi_integrity was:
RBX: 58b38c0def2b26b8
which is also a random value.

So, is it possible that, during resume from hibernation,
1. either the bio had not been initialized yet, i.e. a
   use-before-initialize,
2. or the bio had already been released, causing a use-after-free?

Any ideas here?

thanks,
Yu