Re: bcache: btree_split() couldn't split

Hi Zhe and Mariusz,

Based on my understanding of the code, this problem only occurs with
3.14 and older kernels. I believe Kent fixed this bug in v3.15-rc1
with this patch:

commit 0a63b66db566cffdf90182eb6e66fdd4d0479e63
Author: Kent Overstreet <kmo@xxxxxxxxxxxxx>
Date:   Mon Mar 17 17:15:53 2014 -0700

    bcache: Rework btree cache reserve handling

    This changes the bucket allocation reserves to use _real_ reserves - separate
    freelists - instead of watermarks, which if nothing else makes the current code
    saner to reason about and is going to be important in the future when we add
    support for multiple btrees.

    It also adds btree_check_reserve(), which checks (and locks) the reserves for
    both bucket allocation and memory allocation for btree nodes; the old code just
    kinda sorta assumed that since (e.g. for btree node splits) it had the root
    locked and that meant no other threads could try to make use of the same
    reserve; this technically should have been ok for memory allocation (we should
    always have a reserve for memory allocation (the btree node cache is used as a
    reserve and we preallocate it)), but multiple btrees will mean that locking the
    root won't be sufficient anymore, and for the bucket allocation reserve it was
    technically possible for the old code to deadlock.

    Signed-off-by: Kent Overstreet <kmo@xxxxxxxxxxxxx>
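
To illustrate the difference the commit message describes, here is a
minimal user-space sketch of the two schemes. It is not the bcache code:
the type and function names (watermark_pool, reserve_pool, reserve_alloc,
reserve_check) and the two reserve classes are made up for illustration,
and the real reserves are per-device FIFOs protected by locks, which this
sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reserve classes; RESERVE_NONE is the general pool. */
enum reserve { RESERVE_BTREE, RESERVE_NONE, RESERVE_NR };

/*
 * Watermark scheme: one shared free count. A class may allocate only
 * while the count is above its watermark, so the "reserve" is just the
 * part of the shared pool that lower-priority callers may not touch.
 */
struct watermark_pool {
	size_t free;
	size_t watermark[RESERVE_NR];
};

static bool watermark_alloc(struct watermark_pool *p, enum reserve r)
{
	if (p->free <= p->watermark[r])
		return false;
	p->free--;
	return true;
}

/*
 * Separate-freelist scheme: each class owns its own free count, so what
 * is left in a reserve can be read directly from that reserve.
 */
struct reserve_pool {
	size_t free[RESERVE_NR];
};

static bool reserve_alloc(struct reserve_pool *p, enum reserve r)
{
	/* Use the general pool first, so reserves are spent last. */
	if (p->free[RESERVE_NONE] > 0) {
		p->free[RESERVE_NONE]--;
		return true;
	}
	/* Fall back to the caller's own reserve only. */
	if (p->free[r] > 0) {
		p->free[r]--;
		return true;
	}
	return false;
}

/*
 * Up-front check in the spirit of the check described above: confirm
 * that a reserve holds at least n entries before starting an operation
 * (such as a split) that cannot easily back out halfway through.
 */
static bool reserve_check(const struct reserve_pool *p, enum reserve r,
			  size_t n)
{
	return p->free[r] >= n;
}
```

With separate lists, a caller can ask for everything it will need before
it starts, instead of finding out in the middle of a split that the
shared pool has dropped below its watermark.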

On Mon, May 12, 2014 at 4:53 AM, Mariusz Paradowski
<indianin@xxxxxxxxxxxx> wrote:
> Confirmed on kernel 3.14.3 from kernel.org:
>
> May 11 17:43:16 x kernel: ------------[ cut here ]------------
> May 11 17:43:16 x kernel: WARNING: CPU: 3 PID: 376101 at drivers/md/bcache/btree.c:1979 0xffffffffa00d65ab()
> May 11 17:43:16 x kernel: bcache: btree split failed
> May 11 17:43:16 x kernel: Modules linked in: e1000e ptp pps_core microcode firmware_class unix mpt2sas raid_class scsi_transport_sas bcache fuse hid_generic usbhid hid xhci_hcd ehci_pci ehci_hcd usbcore usb_common msr cpuid
> May 11 17:43:16 x kernel: CPU: 3 PID: 376101 Comm: kworker/3:2 Not tainted 3.14.3 #1
> May 11 17:43:16 x kernel: Hardware name:                  /DH87MC, BIOS MCH8710H.86A.0047.2013.0606.1508 06/06/2013
> May 11 17:43:16 x kernel: Workqueue: events 0xffffffffa00e8fa0
> May 11 17:43:16 x kernel: 0000000000000009 ffffffff81303a63 ffff88040c24b988 ffffffff8104c2fd
> May 11 17:43:16 x kernel: ffff8801056f2400 ffff88040c24b9d8 ffff88040c24ba00 ffff88040c24bd10
> May 11 17:43:16 x kernel: ffffffffffffffe4 ffffffff8104c367 ffffffffa00ea33b ffff880400000018
> May 11 17:43:16 x kernel: Call Trace:
> May 11 17:43:16 x kernel: [<ffffffff81303a63>] ? 0xffffffff81303a63
> May 11 17:43:16 x kernel: [<ffffffff8104c2fd>] ? 0xffffffff8104c2fd
> May 11 17:43:16 x kernel: [<ffffffff8104c367>] ? 0xffffffff8104c367
> May 11 17:43:16 x kernel: [<ffffffffa00d65ab>] ? 0xffffffffa00d65ab
> May 11 17:43:16 x kernel: [<ffffffff810752c3>] ? 0xffffffff810752c3
> May 11 17:43:16 x kernel: [<ffffffffa00d669d>] ? 0xffffffffa00d669d
> May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
> May 11 17:43:16 x kernel: [<ffffffffa00d753b>] ? 0xffffffffa00d753b
> May 11 17:43:16 x kernel: [<ffffffffa00d4bce>] ? 0xffffffffa00d4bce
> May 11 17:43:16 x kernel: [<ffffffffa00d12a9>] ? 0xffffffffa00d12a9
> May 11 17:43:16 x kernel: [<ffffffffa00d4975>] ? 0xffffffffa00d4975
> May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
> May 11 17:43:16 x kernel: [<ffffffffa00d4c65>] ? 0xffffffffa00d4c65
> May 11 17:43:16 x kernel: [<ffffffff811bc9c4>] ? 0xffffffff811bc9c4
> May 11 17:43:16 x kernel: [<ffffffffa00d7d2c>] ? 0xffffffffa00d7d2c
> May 11 17:43:16 x kernel: [<ffffffffa00d7520>] ? 0xffffffffa00d7520
> May 11 17:43:16 x kernel: [<ffffffffa00d7e98>] ? 0xffffffffa00d7e98
> May 11 17:43:16 x kernel: [<ffffffff81079110>] ? 0xffffffff81079110
> May 11 17:43:16 x kernel: [<ffffffffa00e914a>] ? 0xffffffffa00e914a
> May 11 17:43:16 x kernel: [<ffffffff81054cb1>] ? 0xffffffff81054cb1
> May 11 17:43:16 x kernel: [<ffffffff81054b9d>] ? 0xffffffff81054b9d
> May 11 17:43:16 x kernel: [<ffffffff81054eaf>] ? 0xffffffff81054eaf
> May 11 17:43:16 x kernel: [<ffffffff8105e9a1>] ? 0xffffffff8105e9a1
> May 11 17:43:16 x kernel: [<ffffffff8105c9f3>] ? 0xffffffff8105c9f3
> May 11 17:43:16 x kernel: [<ffffffff8105f566>] ? 0xffffffff8105f566
> May 11 17:43:16 x kernel: [<ffffffff8105f450>] ? 0xffffffff8105f450
> May 11 17:43:16 x kernel: [<ffffffff81064621>] ? 0xffffffff81064621
> May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
> May 11 17:43:16 x kernel: [<ffffffff8130853c>] ? 0xffffffff8130853c
> May 11 17:43:16 x kernel: [<ffffffff81064560>] ? 0xffffffff81064560
> May 11 17:43:16 x kernel: ---[ end trace 4fa5a49292304c0d ]---
> May 11 17:43:16 x kernel: bcache: bch_btree_insert() error -12
> --
> Mariusz Paradowski
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



