Re: Kernel Null pointer deref after bad btree header

Hi,

Some progress: here are the errors with the most recent vanilla mainline kernel:

[   84.551715] bcache: register_bcache() error /dev/sda4: device already registered
[   84.553188] bcache: register_bcache() error /dev/sdc2: device already registered
[   84.616438] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   84.616440] bad btree header at bucket 85065, block 0, 0 keys
[   84.616442] , disabling caching
[   84.616445] bcache: register_cache() registered cache device sdb2
[   84.616597] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   85.375933]  sdb: sdb1 sdb2 sdb4 < sdb5 >
[   85.416610] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   85.416612] bad btree header at bucket 85065, block 0, 0 keys
[   85.416614] , disabling caching
[   85.416618] bcache: register_cache() registered cache device sdb2
[   85.416624] bcache: register_bcache() error /dev/sdc2: device already registered
[   85.416626] bcache: register_bcache() error /dev/sda4: device already registered
[   85.416796] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   85.488246] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   85.488249] bad btree header at bucket 85065, block 0, 0 keys
[   85.488251] , disabling caching
[   85.488254] bcache: register_cache() registered cache device sdb2
[   85.488429] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   85.560003] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   85.560006] bad btree header at bucket 85065, block 0, 0 keys
[   85.560008] , disabling caching
[   85.560013] bcache: register_cache() registered cache device sdb2
[   85.560017] bcache: register_bcache() error /dev/sda4: device already registered
[   85.560217] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   85.571950] bcache: register_bcache() error /dev/sdc2: device already registered
[   85.580628] bcache: register_bcache() error /dev/sdc2: device already registered
[   85.761969] bcache: register_bcache() error /dev/sda4: device already registered
[   85.792749] bcache: register_bcache() error /dev/sda4: device already registered
[   85.952931] bcache: register_bcache() error /dev/sda4: device already registered
[   85.955640] bcache: register_bcache() error /dev/sda4: device already registered
[   86.072102] bcache: register_bcache() error /dev/sda4: device already registered
[   98.583895] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   98.583898] bad btree header at bucket 85065, block 0, 0 keys
[   98.583902] , disabling caching
[   98.583905] bcache: register_cache() registered cache device sdb2
[   98.584131] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   98.656158] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   98.656161] bad btree header at bucket 85065, block 0, 0 keys
[   98.656166] , disabling caching
[   98.656172] bcache: register_cache() registered cache device sdb2
[   98.656386] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   98.730245] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   98.730249] bad btree header at bucket 85065, block 0, 0 keys
[   98.730253] , disabling caching
[   98.730258] bcache: register_cache() registered cache device sdb2
[   98.730572] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   98.811047] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   98.811051] bad btree header at bucket 85065, block 0, 0 keys
[   98.811055] , disabling caching
[   98.811065] bcache: register_cache() registered cache device sdb2
[   98.811093] bcache: register_bcache() error /dev/sdc2: device already registered
[   98.811277] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   98.848069] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   98.848073] bad btree header at bucket 85065, block 0, 0 keys
[   98.848077] , disabling caching
[   98.848083] bcache: register_cache() registered cache device sdb2
[   98.848327] bcache: cache_set_free() Cache set 1330b5f6-0c13-43ec-b925-2ee2734b135f unregistered
[   98.963716] bcache: register_bcache() error /dev/sdc2: device already registered
[   99.063875] bcache: register_bcache() error /dev/sdc2: device already registered
[   99.073352] bcache: register_bcache() error /dev/sdc2: device already registered
[   99.076450] bcache: register_bcache() error /dev/sdc2: device already registered
[   99.094278] bcache: register_bcache() error /dev/sdc2: device already registered
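To check whether every failure points at the same place, the log can be tallied mechanically. The following is only a minimal sketch: the `summarize` helper and the two regular expressions are illustrative names, not part of bcache or any tool, and `SAMPLE` holds a handful of lines copied from the output above rather than the full log.

```python
import re
from collections import Counter

# A few lines copied from the dmesg output above. In practice the whole
# log would be read from a saved file instead of this short sample.
SAMPLE = """\
[   84.551715] bcache: register_bcache() error /dev/sda4: device already registered
[   84.616438] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   84.616440] bad btree header at bucket 85065, block 0, 0 keys
[   84.616442] , disabling caching
[   85.416610] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
[   85.416612] bad btree header at bucket 85065, block 0, 0 keys
[   85.416614] , disabling caching
[   85.416624] bcache: register_bcache() error /dev/sdc2: device already registered
"""

# Hypothetical patterns written to match the message text seen above.
BAD_HEADER = re.compile(r"bad btree header at bucket (\d+), block (\d+), (\d+) keys")
ALREADY = re.compile(r"register_bcache\(\) error (\S+): device already registered")


def summarize(log_text):
    """Count bad-header reports per (bucket, block) and 'already registered' per device."""
    buckets = Counter()
    already = Counter()
    for line in log_text.splitlines():
        match = BAD_HEADER.search(line)
        if match:
            buckets[(int(match.group(1)), int(match.group(2)))] += 1
            continue
        match = ALREADY.search(line)
        if match:
            already[match.group(1)] += 1
    return buckets, already


buckets, already = summarize(SAMPLE)
print("bad btree header reports per (bucket, block):", dict(buckets))
print("'device already registered' reports per device:", dict(already))
```

If the full log yields a single (bucket, block) key, as the lines above suggest, every registration attempt is failing on the same btree node rather than on scattered locations.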

$ uname -a
Linux rescue 4.16.0 #2 SMP Mon Apr 2 13:20:45 BST 2018 x86_64 GNU/Linux

smartctl does not report any errors on the backing or caching disks, and the system was shut down cleanly.

The only possibly related thing that comes to mind is that a few days ago I hibernated and resumed the system (this is something I normally don't do). Resume worked fine as far as I could tell though.

What's the best way for me to proceed here, both in terms of preserving any debugging information and making the system usable again? I could wipe and re-install from backup, but I'm wondering if maybe there's just some cosmic ray corruption in one byte and the system is otherwise fine. And if not, should I maybe not restore the exact same disk configuration?

Best,
-Nikolaus
--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

             »Time flies like an arrow, fruit flies like a Banana.«

On Mon, 2 Apr 2018, at 10:22, Nikolaus Rath wrote:
> Hello,
>
> This morning, my system refused to boot because it couldn't find the
> root filesystem anymore. The root filesystem is ext4 on LVM on dm-crypt
> on bcache. Booting from a recovery medium, I got the following from
> dmesg:
>
> [   10.346673] e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow
> Control: Rx/Tx
> [  148.588402] bcache: error on 1330b5f6-0c13-43ec-b925-2ee2734b135f:
> bad btree header at bucket 85065, block 0, 0 keys, disabling caching
> [  148.588414] bcache: register_cache() registered cache device sdb2
> [  148.588696] BUG: unable to handle kernel NULL pointer dereference at
> 00000000000009b0
> [  148.588701] IP: [<ffffffffc01f67da>] register_bcache+0x108a/0x17a0
> [bcache]
> [  148.588734] PGD d951d067 PUD 36748067 PMD 0
> [  148.588738] Oops: 0000 [#1] SMP
> [  148.588742] Modules linked in: bnep(E) arc4(E) intel_rapl(E)
> x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E)
> kvm(E) irqbypass(E) crct10dif_pclmul(E) mxm_wmi(E) iwlmvm(E) mac80211(E)
> iTCO_wdt(E) iTCO_vendor_support(E) eeepc_wmi(E) asus_wmi(E)
> sparse_keymap(E) crc32_pclmul(E) iwlwifi(E) uvcvideo(E)
> videobuf2_vmalloc(E) videobuf2_memops(E) videobuf2_v4l2(E)
> videobuf2_core(E) videodev(E) ghash_clmulni_intel(E) media(E) evdev(E)
> cfg80211(E) shpchp(E) hmac(E) drbg(E) ansi_cprng(E) aesni_intel(E)
> aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E)
> btusb(E) btrtl(E) btbcm(E) tpm_tis(E) battery(E) btintel(E)
> snd_hda_codec_realtek(E) snd_hda_codec_generic(E) bluetooth(E) rfkill(E)
> pcspkr(E) serio_raw(E) snd_hda_codec_hdmi(E) snd_hda_intel(E)
> snd_hda_codec(E) mei_me(E)
> [  148.588780]  mei(E) lpc_ich(E) mfd_core(E) tpm(E) i2c_i801(E)
> snd_hda_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) processor(E)
> soundcore(E) video(E) button(E) wmi(E) fuse(E) autofs4(E) ext4(E)
> crc16(E) mbcache(E) jbd2(E) btrfs(E) xor(E) raid6_pq(E) hid_generic(E)
> uas(E) usbhid(E) usb_storage(E) hid(E) dm_mod(E) bcache(E) sg(E)
> sr_mod(E) cdrom(E) sd_mod(E) ahci(E) libahci(E) ehci_pci(E)
> crc32c_intel(E) ehci_hcd(E) libata(E) psmouse(E) usbcore(E)
> usb_common(E) scsi_mod(E) e1000e(E) ptp(E) pps_core(E) fan(E) thermal(E)
> fjes(E)
> [  148.588816] CPU: 3 PID: 1368 Comm: bcache-register Tainted:
> G            E   4.5.0-0.bpo.2-amd64 #1 Debian 4.5.4-1~bpo8+1
> [  148.588818] Hardware name: System manufacturer System Product Name/
> P8Z68-V GEN3, BIOS 3603 11/09/2012
> [  148.588821] task: ffff8804081e2ec0 ti: ffff8800da748000 task.ti:
> ffff8800da748000
> [  148.588822] RIP: 0010:[<ffffffffc01f67da>]  [<ffffffffc01f67da>]
> register_bcache+0x108a/0x17a0 [bcache]
> [  148.588834] RSP: 0018:ffff8800da74bde8  EFLAGS: 00010202
> [  148.588835] RAX: ffff880409a20000 RBX: ffff88040a083000 RCX:
> ffff880409a20000
> [  148.588837] RDX: 0000000000000000 RSI: 0000000000000001 RDI:
> ffffffffc0205180
> [  148.588838] RBP: ffffffffc0205180 R08: ffffffffc0205180 R09:
> 0000002298927e0f
> [  148.588839] R10: ffff8800365a3901 R11: 0000000000000000 R12:
> ffff88040dc036c0
> [  148.588840] R13: ffff880408341ed8 R14: ffffffffc0202932 R15:
> fffffffffffffff0
> [  148.588842] FS:  00007fbe24a74700(0000) GS:ffff88041ecc0000(0000)
> knlGS:0000000000000000
> [  148.588844] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  148.588846] CR2: 00000000000009b0 CR3: 00000000365ad000 CR4:
> 00000000000406e0
> [  148.588847] Stack:
> [  148.588849]  024680ca00000080 ffff88040a0d4c20 000000000000000a
> ffff8800d94f3928
> [  148.588852]  ffffffff811abc55 0000000000000286 0000000000000286
> ffff8800da7a0408
> [  148.588854]  00007fbe24a81000 ffffffff811a12ba 00000000024000c0
> ffff880000000408
> [  148.588857] Call Trace:
> [  148.588868]  [<ffffffff811abc55>] ? page_add_new_anon_rmap+0x95/0xd0
> [  148.588872]  [<ffffffff811a12ba>] ? handle_mm_fault+0x13ca/0x1b90
> [  148.588878]  [<ffffffff81267007>] ? kernfs_fop_write+0x117/0x160
> [  148.588883]  [<ffffffff811eb624>] ? vfs_write+0xa4/0x190
> [  148.588887]  [<ffffffff811ec602>] ? SyS_write+0x52/0xc0
> [  148.588894]  [<ffffffff815ba236>] ? system_call_fast_compare_end+0xc/
> 0x6b
> [  148.588895] Code: 42 d0 48 81 fa b0 51 20 c0 48 89 c1 48 8d 7e d0 49
> 89 f8 0f 84 49 06 00 00 0f b7 b1 34 04 00 00 48 8b 91 40 0c 00 00 85 f6
> 74 39 <4c> 3b a2 b0 09 00 00 0f 84 a2 00 00 00 83 ee 01 48 8d 91 48 0c
> [  148.588920] RIP  [<ffffffffc01f67da>] register_bcache+0x108a/0x17a0
> [bcache]
> [  148.588934]  RSP <ffff8800da74bde8>
> [  148.588935] CR2: 00000000000009b0
> [  148.588946] ---[ end trace 0b7d5d2eba253662 ]---
>
>
> This is the kernel from the rescue system, the kernel on the actual
> system is newer. At the moment, I can't figure out either the version
> number nor the exact error message because the system won't boot.
>
>
> Is there any way to tell if the problem here is with bcache or with one
> of the disks?
>
>
> Best,
> -Nikolaus
>
> --
> GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F
>
>              »Time flies like an arrow, fruit flies like a Banana.«
>
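As a cross-check on the oops quoted above, the faulting address can be reproduced from the `Code:` bytes and the register dump. This is a rough sketch, not a disassembler: `decode_cmp_r64_mem` is a hypothetical helper that handles only the single x86-64 encoding found at the `<4c>` marker (REX.W `3B /r` with a 32-bit displacement and no SIB byte), and the constants are copied by hand from the trace.

```python
# Values copied from the quoted oops.
RDX = 0x0000000000000000
CR2 = 0x00000000000009B0
# Bytes from the "Code:" line, starting at the <4c> marker (faulting instruction).
FAULT_INSN = bytes.fromhex("4c3ba2b0090000")

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_cmp_r64_mem(insn):
    """Decode only 'REX.W 3B /r' with a [base + disp32] operand and no SIB byte."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex & 0xF0 != 0x40 or not rex & 0x08:
        raise ValueError("expected a REX.W prefix")
    if opcode != 0x3B:
        raise ValueError("expected opcode 3B (CMP r64, r/m64)")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("expected [base + disp32] without a SIB byte")
    reg |= (rex & 0x04) << 1  # REX.R supplies bit 3 of the register operand
    rm |= (rex & 0x01) << 3   # REX.B supplies bit 3 of the base register
    disp = int.from_bytes(insn[3:7], "little", signed=True)
    return REG64[reg], REG64[rm], disp


reg, base, disp = decode_cmp_r64_mem(FAULT_INSN)
print(f"cmp {reg}, [{base}+{disp:#x}]")
print(f"{base} + disp = {RDX + disp:#x}, CR2 = {CR2:#x}")
```

The decoded displacement added to RDX (which is zero in the register dump) matches CR2, so the fault is a read at a fixed offset from a NULL pointer held in RDX. The instruction just before it in the `Code:` line, `48 8b 91 40 0c 00 00`, loads RDX from offset 0xc40 of whatever RCX points to; which structure and field that corresponds to cannot be determined from the trace alone.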
--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html