Re: bcache bug / fs freeze on heavy IO

----- Original Message -----
> From: "Kent Overstreet" <kmo@xxxxxxxxxxxxx>
> To: "Thomas Klaube" <thomas@xxxxxxxxxx>
> CC: linux-bcache@xxxxxxxxxxxxxxx
> Sent: Friday, 22 August 2014 11:38:05
> Subject: Re: bcache bug / fs freeze on heavy IO
> 
> there weren't any bcache changes in 3.16 from 3.15, so unless you hit
> this again or someone else reports it I would think you just got
> unlucky.

Hi,

I have hit a similar issue again, this time with kernel 3.13.0-34
(Ubuntu Server 14.04.1 LTS). It also happened during a fio benchmark
on a bcache device:

Aug 26 01:52:06 ubuntu kernel: [18378.656038] BUG: unable to handle kernel NULL pointer dereference at 0000000000000099
Aug 26 01:52:06 ubuntu kernel: [18378.656067] IP: [<ffffffffa0306bb6>] bch_btree_insert_node+0x16/0x2b0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656093] PGD 0 
Aug 26 01:52:06 ubuntu kernel: [18378.656101] Oops: 0000 [#1] SMP 
Aug 26 01:52:06 ubuntu kernel: [18378.656113] Modules linked in: bcache binfmt_misc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul ast ttm crc32_pclmul ghash_clmulni_intel drm_kms_helper aesni_intel aes_x86_64 drm lrw gf128mul glue_helper ablk_helper syscopyarea cryptd sysfillrect sysimgblt lpc_ich shpchp mei_me mei bonding lp parport ipmi_si video mac_hid acpi_pad hid_generic usbhid ses hid enclosure usb_storage megaraid_sas ahci libahci igb e1000e i2c_algo_bit dca ptp pps_core
Aug 26 01:52:06 ubuntu kernel: [18378.656277] CPU: 3 PID: 1770 Comm: bcache_gc Not tainted 3.13.0-34-generic #60-Ubuntu
Aug 26 01:52:06 ubuntu kernel: [18378.656299] Hardware name: Supermicro X10SLM-F/X10SLM-F, BIOS 2.0 04/24/2014
Aug 26 01:52:06 ubuntu kernel: [18378.656319] task: ffff8804045fc7d0 ti: ffff880405b28000 task.ti: ffff880405b28000
Aug 26 01:52:06 ubuntu kernel: [18378.656340] RIP: 0010:[<ffffffffa0306bb6>]  [<ffffffffa0306bb6>] bch_btree_insert_node+0x16/0x2b0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656370] RSP: 0018:ffff880405b297d8  EFLAGS: 00010246
Aug 26 01:52:06 ubuntu kernel: [18378.656385] RAX: ffff8803fe5c0000 RBX: ffff8802f5824400 RCX: 0000000000000000
Aug 26 01:52:06 ubuntu kernel: [18378.656405] RDX: ffff880405b29858 RSI: ffff880405b29dd4 RDI: ffffffffffffffff
Aug 26 01:52:06 ubuntu kernel: [18378.656424] RBP: ffff880405b297f8 R08: 0000000000000000 R09: ffff880405b29880
Aug 26 01:52:06 ubuntu kernel: [18378.656444] R10: 0000000000000001 R11: 000007ffffffffff R12: 0000000000000000
Aug 26 01:52:06 ubuntu kernel: [18378.656464] R13: ffff880405b29858 R14: ffff880405b29828 R15: 0000000000004587
Aug 26 01:52:06 ubuntu kernel: [18378.656484] FS:  0000000000000000(0000) GS:ffff88041fd80000(0000) knlGS:0000000000000000
Aug 26 01:52:06 ubuntu kernel: [18378.656507] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 26 01:52:06 ubuntu kernel: [18378.656524] CR2: 0000000000000099 CR3: 0000000001c0e000 CR4: 00000000001407e0
Aug 26 01:52:06 ubuntu kernel: [18378.656544] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Aug 26 01:52:06 ubuntu kernel: [18378.656564] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Aug 26 01:52:06 ubuntu kernel: [18378.656584] Stack:
Aug 26 01:52:06 ubuntu kernel: [18378.656590]  ffff8802f5824400 ffff880039161800 0000000000000000 ffff880405b29828
Aug 26 01:52:06 ubuntu kernel: [18378.656614]  ffff880405b29910 ffffffffa0306a71 0000000000000000 ffff880405b29ab0
Aug 26 01:52:06 ubuntu kernel: [18378.656638]  000010b71d30b6be ffff880405b29dd4 0000000000000000 ffff8804045fc7d0
Aug 26 01:52:06 ubuntu kernel: [18378.656661] Call Trace:
Aug 26 01:52:06 ubuntu kernel: [18378.656672]  [<ffffffffa0306a71>] btree_split+0x441/0x570 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656692]  [<ffffffff810753d5>] ? del_timer+0x55/0x70
Aug 26 01:52:06 ubuntu kernel: [18378.656709]  [<ffffffff81081f89>] ? try_to_grab_pending+0xa9/0x160
Aug 26 01:52:06 ubuntu kernel: [18378.656728]  [<ffffffffa0306cc1>] bch_btree_insert_node+0x121/0x2b0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656750]  [<ffffffffa030787e>] btree_gc_recurse+0xa2e/0xbb0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656771]  [<ffffffffa0309755>] ? bch_btree_ptr_invalid+0xa5/0xd0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656793]  [<ffffffffa03072d6>] btree_gc_recurse+0x486/0xbb0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656813]  [<ffffffff810a7145>] ? load_balance+0x185/0x890
Aug 26 01:52:06 ubuntu kernel: [18378.656831]  [<ffffffffa0309755>] ? bch_btree_ptr_invalid+0xa5/0xd0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656852]  [<ffffffff8101b7e9>] ? sched_clock+0x9/0x10
Aug 26 01:52:06 ubuntu kernel: [18378.656869]  [<ffffffffa0302380>] ? btree_node_free+0x1d0/0x1d0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656889]  [<ffffffffa0305803>] ? btree_gc_mark_node+0x63/0x210 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656910]  [<ffffffffa0307feb>] bch_btree_gc+0x41b/0x5a0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656930]  [<ffffffff8171fd41>] ? __schedule+0x381/0x7d0
Aug 26 01:52:06 ubuntu kernel: [18378.656948]  [<ffffffffa03081a8>] bch_gc_thread+0x38/0x120 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656967]  [<ffffffffa0308170>] ? bch_btree_gc+0x5a0/0x5a0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.656986]  [<ffffffff8108b3d2>] kthread+0xd2/0xf0
Aug 26 01:52:06 ubuntu kernel: [18378.657608]  [<ffffffff8108b300>] ? kthread_create_on_node+0x1d0/0x1d0
Aug 26 01:52:06 ubuntu kernel: [18378.658237]  [<ffffffff8172c6bc>] ret_from_fork+0x7c/0xb0
Aug 26 01:52:06 ubuntu kernel: [18378.658845]  [<ffffffff8108b300>] ? kthread_create_on_node+0x1d0/0x1d0
Aug 26 01:52:06 ubuntu kernel: [18378.659445] Code: 24 60 e8 5e a1 da e0 eb 8a 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 49 89 d5 41 54 49 89 cc 53 <80> bf 9a 00 00 00 00 48 89 fb 0f 85 6c 02 00 00 4c 8b 8b 80 00 
Aug 26 01:52:06 ubuntu kernel: [18378.660709] RIP  [<ffffffffa0306bb6>] bch_btree_insert_node+0x16/0x2b0 [bcache]
Aug 26 01:52:06 ubuntu kernel: [18378.661333]  RSP <ffff880405b297d8>
Aug 26 01:52:06 ubuntu kernel: [18378.661938] CR2: 0000000000000099
Aug 26 01:52:06 ubuntu kernel: [18378.685807] ---[ end trace c759c6ac8f543aa1 ]---
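
A quick cross-check of the register dump, based only on the values in the
oops above. The faulting bytes "<80> bf 9a 00 00 00 00" decode as
cmpb $0x0,0x9a(%rdi), and RDI is 0xffffffffffffffff, so the access should
land at RDI + 0x9a with 64-bit wraparound. That the first argument to
bch_btree_insert_node was -1 rather than NULL is my reading of the dump,
not something I have confirmed in the source. A minimal sketch of the
arithmetic:

```python
# Values copied from the oops above.
RDI = 0xFFFFFFFFFFFFFFFF   # first-argument register at the time of the fault
CR2 = 0x0000000000000099   # faulting address reported by the kernel
DISPLACEMENT = 0x9A        # disp32 "9a 00 00 00" (little-endian) from the Code: line

MASK64 = (1 << 64) - 1


def effective_address(base, disp):
    """Address of disp(%base) with 64-bit wraparound."""
    return (base + disp) & MASK64


def as_signed64(value):
    """Reinterpret an unsigned 64-bit register value as a signed integer."""
    return value - (1 << 64) if value >> 63 else value


fault = effective_address(RDI, DISPLACEMENT)
print(hex(fault), fault == CR2, as_signed64(RDI))  # 0x99 True -1
```

So the "NULL pointer dereference" at 0x99 is consistent with a pointer
value of -1 plus an offset of 0x9a, not with a true NULL pointer.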

Several fio processes are hanging in D state (uninterruptible sleep),
and kill -9 has no effect on them. The elevator is cfq. Here is the
fio setup:

[rnd]
rw=randrw
ramp_time=30
runtime=36600
time_based
rwmixread=30
size=100g
refill_buffers=1
directory=.
iodepth=64
direct=1
blocksize=4k
numjobs=16
group_reporting
ioengine=libaio
loops=1
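
To put the load from this job file into numbers, here is a small sketch
that parses it with Python's configparser and derives the aggregate queue
depth, the write share, and the run length. The helper name load_job is
made up for illustration; the figures come only from the job file above,
and the in-flight count is an upper bound rather than a measurement.

```python
import configparser

# The job file from above, verbatim.
FIO_JOB = """\
[rnd]
rw=randrw
ramp_time=30
runtime=36600
time_based
rwmixread=30
size=100g
refill_buffers=1
directory=.
iodepth=64
direct=1
blocksize=4k
numjobs=16
group_reporting
ioengine=libaio
loops=1
"""


def load_job(text, section="rnd"):
    """Parse a fio job file; flags such as time_based carry no value."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    return parser[section]


job = load_job(FIO_JOB)
numjobs = job.getint("numjobs")
iodepth = job.getint("iodepth")

# iodepth applies per job, so numjobs clones can queue this many I/Os at once.
max_in_flight = numjobs * iodepth
write_pct = 100 - job.getint("rwmixread")
hours, rest = divmod(job.getint("runtime"), 3600)
minutes = rest // 60

print(max_in_flight, write_pct, hours, minutes)  # 1024 70 10 10
```

That is up to 1024 outstanding 4k requests, 70% of them writes, for a
run of 10 hours and 10 minutes.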

The fio job reads from and writes to preallocated files. It runs in
parallel with a similar fio job (same setup) on a non-bcache device.
The job on the non-bcache device shows no errors: it finishes
successfully after 36600 seconds with reasonable results. There are no
errors in the controller logs and no other errors in dmesg.
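
For anyone who wants to check for the same symptom, here is a rough
sketch of how the hung processes can be listed by reading the state field
from /proc/<pid>/stat. The function names are made up for illustration,
and it only looks at thread-group leaders, not at individual threads
under /proc/<pid>/task.

```python
import glob


def parse_stat(text):
    """Split a /proc/<pid>/stat line into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate the last ')' instead of splitting on whitespace.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    state = text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return sorted (pid, comm) pairs for processes currently in D state."""
    found = []
    for path in glob.glob(f"{proc_root}/[0-9]*/stat"):
        try:
            with open(path) as handle:
                pid, comm, state = parse_stat(handle.read())
        except (OSError, ValueError):
            continue  # process exited or the line was unreadable; skip it
        if state == "D":
            found.append((pid, comm))
    return found if not found else sorted(found)


if __name__ == "__main__":
    for pid, comm in uninterruptible_tasks():
        print(pid, comm)
```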

Any ideas? I can probably reproduce this.

Regards
Thomas Klaube