Bcache still unstable for me (memory problems)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I started a btrfs copy from a bcached device to another one (both are in
the 5 to 15TB range).
A minute later, my system that had been up for several days, crashed.

Ultimately, it crashed on this:
bcache: register_cache() error opening sdl2: cannot allocate memory
BUG: unable to handle kernel NULL pointer dereference at 00000000000009b8
IP: [<ffffffffc0516e8d>] cache_set_flush+0x102/0x15c [bcache]

I'm not sure why it tried to register sdl2 then when it was already
registered from boot. Did it time out and had to be registered again?

At this point, things with bcache have been bad enough, that I'm
considering removing it. My system has been very unstable with it :(
Or maybe I can just remove the caching devices, leave the backing ones, 
and run without cache for now, giving me the option to re-add the cache later.

I actually just added 16GB of RAM to the server after adding bcache, and
clearly going from 8 to 24GB made no difference. I'm wondering if there
is a memory leak somewhere?

Just before the system crashed, memory looked like this, which isn't bad:
             total       used       free     shared    buffers     cached
Mem:      24392440   24241440     151000          0         88   20116748
-/+ buffers/cache:    4124604   20267836
Swap:     15616764     244452   15372312

I'm using btrfs on top of dmcrypt on top of bcache.
btrfs on top of dmcrypt was stable, I had been using it for years. Adding bcache is what
caused all my recent problems to start.

Any ideas?

Thanks,
Marc


[290569.458707] Process accounting resumed
[290623.673871] bcache-register: page allocation failure: order:7, mode:0x24080c0
[290623.697143] CPU: 0 PID: 1284 Comm: bcache-register Tainted: G        W       4.4.2-amd64-i915-volpreempt-20160214bc3 #5
[290623.732143] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[290623.762428]  0000000000000000 ffff88001df7bb08 ffffffff8134ae0a 0000000000000001
[290623.787470]  ffff88001df7bba0 ffffffff81124ab6 024080c01df7bc44 024080c000000040
[290623.811775]  0000000700000001 0000000000000007 0000000000000007 0000000000000040
[290623.836443] Call Trace:
[290623.846280]  [<ffffffff8134ae0a>] dump_stack+0x44/0x55
[290623.864123]  [<ffffffff81124ab6>] warn_alloc_failed+0x114/0x12c
[290623.883425]  [<ffffffff811274b8>] __alloc_pages_nodemask+0x7cb/0x84c
[290623.904159]  [<ffffffff8107df80>] ? try_to_wake_up+0x1d9/0x1eb
[290623.923173]  [<ffffffff8115f6d7>] alloc_pages_current+0xa9/0xcd
[290623.942435]  [<ffffffff8112377e>] __get_free_pages+0xe/0x3c
[290623.960621]  [<ffffffffc05192f0>] bch_cache_set_alloc+0x3c7/0x479 [bcache]
[290623.983825]  [<ffffffffc051a5e3>] register_bcache+0x1241/0x1452 [bcache]
[290624.005456]  [<ffffffff8134c853>] kobj_attr_store+0x10/0x1a
[290624.023729]  [<ffffffff811df9df>] sysfs_kf_write+0x39/0x3b
[290624.041739]  [<ffffffff811df2e1>] kernfs_fop_write+0xed/0x130
[290624.060394]  [<ffffffff8117d97d>] __vfs_write+0x26/0xa5
[290624.077452]  [<ffffffff810b2969>] ? current_kernel_time64+0x10/0x36
[290624.097921]  [<ffffffff812d6050>] ? security_file_permission+0x3b/0x42
[290624.119197]  [<ffffffff810915ea>] ? percpu_down_read+0x12/0x41
[290624.138945]  [<ffffffff8117fb61>] ? __sb_start_write+0x2b/0x48
[290624.158291]  [<ffffffff8117dfe8>] vfs_write+0x9d/0xe8
[290624.174919]  [<ffffffff8117e7bd>] SyS_write+0x4d/0x78
[290624.191455]  [<ffffffff810039c3>] do_fast_syscall_32+0xb3/0xf3
[290624.210331]  [<ffffffff816e3e32>] sysenter_flags_fixed+0x8/0x12
[290624.229355] Mem-Info:
[290624.237432] active_anon:607145 inactive_anon:208571 isolated_anon:0
[290624.237432]  active_file:2103407 inactive_file:1565209 isolated_file:0
[290624.237432]  unevictable:1307 dirty:188846 writeback:138775 unstable:0
[290624.237432]  slab_reclaimable:180112 slab_unreclaimable:73373
[290624.237432]  mapped:419640 shmem:466581 pagetables:5778 bounce:0
[290624.237432]  free:988612 free_pcp:2130 free_cma:14
[290624.351715] Node 0 DMA free:15884kB min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15976kB managed:15892kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[290624.483875] lowmem_reserve[]: 0 3201 23800 23800
[290624.500998] Node 0 DMA32 free:581000kB min:2648kB low:3308kB high:3972kB active_anon:229220kB inactive_anon:248156kB active_file:1102468kB inactive_file:832264kB unevictable:948kB isolated(anon):0kB isolated(file):68kB present:3362068kB managed:3283032kB mlocked:68719477684kB dirty:94316kB writeback:65996kB mapped:253400kB shmem:282168kB slab_reclaimable:96344kB slab_unreclaimable:41308kB kernel_stack:2480kB pagetables:2952kB unstable:0kB bounce:0kB free_pcp:4008kB local_pcp:140kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[290624.655741] lowmem_reserve[]: 0 0 20599 20599
[290624.674232] Node 0 Normal free:3368136kB min:17048kB low:21308kB high:25572kB active_anon:2203780kB inactive_anon:586104kB active_file:7311280kB inactive_file:5430068kB unevictable:4280kB isolated(anon):0kB isolated(file):0kB present:21485568kB managed:21093516kB mlocked:481036341432kB dirty:662172kB writeback:472472kB mapped:1425288kB shmem:1584132kB slab_reclaimable:624104kB slab_unreclaimable:251680kB kernel_stack:11696kB pagetables:20180kB unstable:0kB bounce:0kB free_pcp:4304kB local_pcp:588kB free_cma:56kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[290624.834378] lowmem_reserve[]: 0 0 0 0
[290624.848459] Node 0 DMA: 1*4kB (U) 1*8kB (U) 0*16kB 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15884kB
[290624.895688] Node 0 DMA32: 30635*4kB (UME) 23925*8kB (UME) 7478*16kB (UME) 2420*32kB (UME) 794*64kB (UME) 139*128kB (UME) 10*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 582196kB
[290624.947950] Node 0 Normal: 241499*4kB (UME) 161875*8kB (UME) 33424*16kB (UME) 10915*32kB (UME) 2795*64kB (UME) 334*128kB (UME) 15*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3370532kB
[290625.003724] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[290625.030463] 4142937 total pagecache pages
[290625.043969] 5782 pages in swap cache
[290625.056118] Swap cache stats: add 102837, delete 97055, find 627323/635133
[290625.078329] Free swap  = 15372324kB
[290625.090190] Total swap = 15616764kB
[290625.102186] 6215903 pages RAM
[290625.112399] 0 pages HighMem/MovableOnly
[290625.125747] 117793 pages reserved
[290625.137495] 4096 pages cma reserved
[290625.149199] 0 pages hwpoisoned
[290625.159961] bcache: register_cache() error opening sdl2: cannot allocate memory
[290625.183410] BUG: unable to handle kernel NULL pointer dereference at 00000000000009b8
[290625.208648] IP: [<ffffffffc0516e8d>] cache_set_flush+0x102/0x15c [bcache]
[290625.230734] PGD 0
[290625.238446] Oops: 0000 [#1] SMP
[290625.249857] Modules linked in: udp_diag tcp_diag inet_diag veth ip6table_filter ip6_tables ebtable_nat ebtables ppdev lp xt_addrtype tun bridge stp llc autofs4 softdog binfmt_misc ftdi_sio nfsd nfs_acl auth_rpcgss nfs fscache lockd grace sunrpc ipt_REJECT nf_reject_ipv4 xt_conntrack xt_mark xt_nat xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG iptable_mangle iptable_filter bcache lm85 hwmon_vid pl2303 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 ip_tables nf_conntrack_ftp ipt_MASQUERADE x_tables nf_nat_masquerade_ipv4 nf_nat nf_conntrack sg st snd_pcm_oss snd_mixer_oss kvm_intel snd_hda_codec_realtek snd_hda_codec_generic snd_opl3_synth kvm snd_seq_midi_emul snd_hda_intel snd_cmipci gameport snd_hda_codec snd_opl3_lib snd_mpu401_uart snd_seq_midi irqbypass snd_seq_midi_event snd_seq snd_hda_core snd_hwdep snd_rawmidi asix coretemp eeepc_wmi tpm_infineon rc_ati_x10 snd_pcm libphy snd_seq_device asus_wmi snd_timer sparse_keymap rfkill intel_rapl 8250_fintek ati_remote usbnet tpm_tis snd rc_core battery tpm evdev iosf_mbi wmi pcspkr input_leds intel_powerclamp parport_pc processor i2c_i801 xhci_pci x86_pkg_temp_thermal usbserial soundcore parport xhci_hcd lpc_ich e1000e ptp pps_core fuse raid456 multipath mmc_block mmc_core dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_crypt dm_mod async_raid6_recov async_pq async_xor async_memcpy async_tx blowfish_x86_64 blowfish_common ecb xts ansi_cprng drbg aesni_intel ablk_helper lrw gf128mul glue_helper aes_x86_64 crct10dif_pclmul crc32_pclmul crc32c_intel r8169 ehci_pci fjes ehci_hcd thermal mii sata_mv fan cryptd usbcore usb_common sata_sil24 [last unloaded: ftdi_sio]
[290625.713109] CPU: 0 PID: 25842 Comm: kworker/0:0 Tainted: G        W       4.4.2-amd64-i915-volpreempt-20160214bc3 #5
[290625.747597] Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
[290625.776918] Workqueue: events cache_set_flush [bcache]
[290625.794260] task: ffff880467caa0c0 ti: ffff8805af184000 task.ti: ffff8805af184000
[290625.818544] RIP: 0010:[<ffffffffc0516e8d>]  [<ffffffffc0516e8d>] cache_set_flush+0x102/0x15c [bcache]
[290625.848082] RSP: 0018:ffff8805af187e10  EFLAGS: 00010202
[290625.866890] RAX: 0000000000000001 RBX: ffff880153a40ce8 RCX: 0000000000000282
[290625.891136] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[290625.914365] RBP: ffff8805af187e30 R08: 00000000000000fa R09: ffffe8ffffc01900
[290625.937588] R10: ffff880088430838 R11: ffff880088430838 R12: 0000000000000000
[290625.960781] R13: ffff880153a40de8 R14: 0000000000000000 R15: ffffe8ffffc01900
[290625.983951] FS:  0000000000000000(0000) GS:ffff88061e200000(0000) knlGS:0000000000000000
[290626.010918] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[290626.030800] CR2: 00000000000009b8 CR3: 0000000001e0a000 CR4: 00000000001406f0
[290626.053914] Stack:
[290626.061687]  ffff8805d6183280 ffff88061e216040 ffff880153a40ce8 0000000000000000
[290626.085725]  ffff8805af187e78 ffffffff81070fcf ffff8805d61832b0 000000001e216058
[290626.110510]  ffff8805d6183280 ffff88061e216040 ffff8805d61832b0 ffff88061e216058
[290626.135329] Call Trace:
[290626.145108]  [<ffffffff81070fcf>] process_one_work+0x15b/0x260
[290626.164241]  [<ffffffff81071574>] worker_thread+0x1f0/0x29d
[290626.182561]  [<ffffffff81071384>] ? rescuer_thread+0x281/0x281
[290626.201721]  [<ffffffff810758e9>] kthread+0xa5/0xad
[290626.217966]  [<ffffffff81075844>] ? kthread_parkme+0x24/0x24
[290626.236505]  [<ffffffff816e1c0f>] ret_from_fork+0x3f/0x70
[290626.255052]  [<ffffffff81075844>] ? kthread_parkme+0x24/0x24
[290626.274460] Code: 89 1c c1 4d 8b a4 24 f0 01 00 00 49 81 ec f0 01 00 00 eb b4 45 31 e4 0f b7 83 4c f7 ff ff 4a 8b 94 e3 58 ff ff ff 44 39 e0 76 16 <48> 8b ba b8 09 00 00 48 85 ff 74 05 e8 61 e8 b5 c0 49 ff c4 eb
[290626.337443] RIP  [<ffffffffc0516e8d>] cache_set_flush+0x102/0x15c [bcache]
[290626.359833]  RSP <ffff8805af187e10>
[290626.372050] CR2: 00000000000009b8
[290626.383646] ---[ end trace f97d88b8bb456800 ]---
[290626.399977] Kernel panic - not syncing: Fatal exception
[290626.418169] Kernel Offset: disabled
[290626.430278] Rebooting in 20 seconds..
[290646.378903] ACPI MEMORY or I/O RESET_REG.
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901
--
To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux ARM Kernel]     [Linux Filesystem Development]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux