Re: Critical bug on bcache kernel module in Fedora 30

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2019/6/8 5:52 上午, Rolf Fokkens wrote:
> Some potential progress: https://bugzilla.kernel.org/show_bug.cgi?id=203573
> 
> On 5/30/19 2:50 PM, Rolf Fokkens wrote:
>> Not being sure if people here follow the issue at Fedora I'd like to
>> pass a suggestion:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1708315#c27
>>
>> On 5/22/19 1:44 AM, Nix wrote:
>>> On 21 May 2019, Coly Li uttered the following:
>>>> Also I try to analyze the assemble code of bcache, just find out the
>>>> generated assembly code between gcc9 and gcc7 is quite different. For
>>>> gcc9 there is a XXXX.cold part. So far I can not tell where the problem
>>>> is from yet.
>>> This is hot/cold partitioning. You can turn it off with
>>> -fno-reorder-blocks-and-partition and see if that helps things (and if
>>> it doesn't, it should at least make stuff easier to compare).

Hi folks,

>From the above bugzilla.kernel.org links, it seems people are all on the
same/correct track :-)

I don't analyze the assembly code, because I can stably reproduce the
problem: two adjacent keys don't merge. And I find the problem is from
macro PRECEDING_KEY(), which misleading bch_btree_insert_key() when it
calls bch_btree_iter_init(). Misleading means returned 'm' always
indicates the inserting key is the first key in the btree node because
prev in the while-loop is NULL.

For the second adjacent key, obviously prev point should not be NULL, so
the suspicious point is PRECEDING_KEY(), its code is,
437 #define PRECEDING_KEY(_k)                                       \
438 ({                                                              \
439         struct bkey *_ret = NULL;                               \
440                                                                 \
441         if (KEY_INODE(_k) || KEY_OFFSET(_k)) {                  \
442                 _ret = &KEY(KEY_INODE(_k), KEY_OFFSET(_k), 0);  \
443                                                                 \
444                 if (!_ret->low)                                 \
445                         _ret->high--;                           \
446                 _ret->low--;                                    \
447         }                                                       \
448                                                                 \
449         _ret;                                                   \
450 })

The problem is at line 442, KEY() announced a on-stack temporary
variable in struct bkey {high = KEY_INODE(_k), low = KEY_OFFSET(_k), ptr
= NULL}. But its life range is in line 442-447. When address of this
on-stack variable returns from PRECEDING_KEY() and sent into
bch_btree_iter_init(), this stack space of this on-stack variable is
overwritten by stackframe of bch_btree_iter_init().

To confirm the above suspicion, I modify bch_btree_insert_key() to not
use PRECEDING_KEY(), and store the preceding key in a local variable of
bch_btree_insert_key() like this,

@@ -884,27 +894,78 @@ unsigned int bch_btree_insert_key(struct
btree_keys *b, struct bkey *k,
        struct bset *i = bset_tree_last(b)->data;
        struct bkey *m, *prev = NULL;
        struct btree_iter iter;
+       struct bkey preceding_key = ZERO_KEY;
+       struct bkey *preceding_key_p = NULL;

        BUG_ON(b->ops->is_extents && !KEY_SIZE(k));

-       m = bch_btree_iter_init(b, &iter, b->ops->is_extents
-                               ? PRECEDING_KEY(&START_KEY(k))
-                               : PRECEDING_KEY(k));
+
+       if (b->ops->is_extents) {
+               struct bkey tmp_k;
+               tmp_k = START_KEY(k);
+               if (KEY_INODE(&tmp_k) || KEY_OFFSET(&tmp_k)) {
+                       preceding_key = KEY(KEY_INODE(&tmp_k),
KEY_OFFSET(&tmp_k), 0);
+
+                       if (!preceding_key.low)
+                               preceding_key.high--;
+                       preceding_key.low--;
+               }
+               preceding_key_p = &preceding_key;
+       } else {
+               if (KEY_INODE(k) || KEY_OFFSET(k)) {
+                       preceding_key = KEY(KEY_INODE(k), KEY_OFFSET(k), 0);
+                       if (!preceding_key.low)
+                               preceding_key.high--;
+                       preceding_key.low--;
+               }
+               preceding_key_p = &preceding_key;
+       }
+
+       m = bch_btree_iter_init(b, &iter, preceding_key_p);

Now I can see the preceding key is reported and adjacent keys can be
merged. So far my simple fio testing works fine, but I need to go
through all locations where KEY() and PROCEDING_KEY() are used and find
a proper way to fix.

As I guessed this is a bcache bug and triggered by gcc9. I have no clear
idea why it works fine before (maybe depends on some compiling option
?), but definitely this code should be fixed.

Thanks.

-- 

Coly Li



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux ARM Kernel]     [Linux Filesystem Development]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux