On 11/19/2015 11:22 AM, Aaro Koskinen wrote:
I get the below crash when cold booting OCTEON router with USB disk as
rootfs. Bisected to:
commit bf2cf3baa20b0a6cd2d08707ef05dc0e992a8aa0
Author: Bart Van Assche <bart.vanassche@xxxxxxxxxxx>
Date: Fri Sep 18 17:23:42 2015 -0700
scsi: Fix a bdi reregistration race
Reverting the patch makes the board boot fine again.
A.
Waiting for rootfs media to appear... Press ENTER to interrupt.
[ 1.540522] usb 1-1: new high-speed USB device number 2 using ehci-platform
[ 1.699752] usb-storage 1-1:1.0: USB Mass Storage device detected
[ 1.706054] scsi host0: usb-storage 1-1:1.0
[ 2.702105] scsi 0:0:0:0: Direct-Access Ext Hard Disk PQ: 0 ANSI: 5
[ 2.714214] sd 0:0:0:0: [sda] Spinning up disk...
[ 3.720503] ...
[ 6.674040] usb 1-1: USB disconnect, device number 2
[ 6.750508] .ready
[ 6.752558] sd 0:0:0:0: [sda] Read Capacity(10) failed: Result: hostbyte=0x00 driverbyte=0x04
[ 6.761112] sd 0:0:0:0: [sda] Sense not available.
[ 6.765918] sd 0:0:0:0: [sda] Write Protect is off
[ 6.770741] sd 0:0:0:0: [sda] Asking for cache data failed
[ 6.776236] sd 0:0:0:0: [sda] Assuming drive cache: write through
[ 6.782745] ------------[ cut here ]------------
[ 6.787383] WARNING: CPU: 1 PID: 15 at /home/aaro/git/linux/block/genhd.c:626 add_disk+0x41c/0x478()
[ 6.796549] Modules linked in:
[ 6.799624] CPU: 1 PID: 15 Comm: kworker/u4:1 Not tainted 4.4.0-rc1-octeon-los_73f9f-00002-gd81c963 #1
[ 6.808959] Workqueue: events_unbound async_run_entry_fn
[ 6.814296] Stack : 0000000000000001 0000000000000004 ffffffff81760000 0000000000000000
0000000000000001 0000000000000000 0000000000000000 0000000000000000
ffffffff81f3abc8 ffffffff811893f8 0000000000000000 ffffffff81f3a758
0000000000000000 0000000000000002 0000000000000001 ffffffff81f40000
ffffffff816b78f8 80000000330e9000 0000000000000272 0000000000000009
ffffffff813471cc 0000000000000000 80000000330086a0 8000000033008400
80000000330e9000 ffffffff811cea44 800000003314bb68 8000000033008400
80000000330e9000 800000003314ba70 800000003314bb88 ffffffff8135331c
000000000000015f ffffffff813c0900 000000000000006e 0000000000000000
735f756e626f756e ffffffff81124190 0000000000000000 0000000000000000
...
[ 6.879950] Call Trace:
[ 6.882414] [<ffffffff81124190>] show_stack+0x88/0xa8
[ 6.887475] [<ffffffff8135331c>] dump_stack+0x6c/0x90
[ 6.892549] [<ffffffff81141cb4>] warn_slowpath_common+0x94/0xd8
[ 6.898481] [<ffffffff813471cc>] add_disk+0x41c/0x478
[ 6.903552] [<ffffffff81400794>] sd_probe_async+0xfc/0x218
[ 6.909047] [<ffffffff8116373c>] async_run_entry_fn+0x4c/0x120
[ 6.914898] [<ffffffff8115a83c>] process_one_work+0x17c/0x438
[ 6.920663] [<ffffffff8115ac60>] worker_thread+0x168/0x5e0
[ 6.926159] [<ffffffff81160dc4>] kthread+0xd4/0xf0
[ 6.930968] [<ffffffff8111e9d8>] ret_from_kernel_thread+0x14/0x1c
[ 6.937069]
Hello Aaro,
The patch you mentioned changes the device removal code. The above
output shows a warning triggered by the device probing code. That makes
it unlikely that the above warning is caused by my patch. Please
double-check your bisect results.
Thanks,
Bart.