Re:Fw:About bcache-check


 



Hi Coly,

    Hmm, on second thought, this problem cannot happen in the discard-disabled case,
because the seq is a random number:

get_random_bytes(&i->seq, sizeof(uint64_t));

So it is practically impossible for the last invalidated bucket and the new bucket to get the same random seq.
But what about the power-cut case?
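
To put a rough number on that, here is a minimal sketch. It assumes every stale seq value left in a bucket is an independent, uniformly random 64-bit number, and the function name is made up for illustration. It only computes a union-bound estimate of an accidental match.

```c
#include <stdint.h>

/*
 * Upper bound (union bound) on the chance that at least one of
 * n_stale leftover seq values equals a freshly randomized 64-bit seq.
 * Assumption: stale values are independent and uniformly distributed.
 */
double sketch_seq_match_bound(uint64_t n_stale)
{
    /* 18446744073709551616.0 is exactly 2^64 as a double */
    return (double)n_stale / 18446744073709551616.0;
}
```

Even with a million stale values the bound stays below 1e-13, so an accidental seq match is not a realistic explanation.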

Yang




From: "杨东升" <dongsheng.yang@xxxxxxxxxxxx>
Sent: 2020-09-16 14:19:46
To: colyli <colyli@xxxxxxx>
Cc: linux-bcache <linux-bcache@xxxxxxxxxxxxxxx>
Subject: Fw:About bcache-check
>Resending with no HTML format  ... ...
>
>
>Hi Coly and all,
>     I found there is an error message in our testing:
>
>
>Sep 27 17:43:00 node-1 kernel: bcache: error on c2914b7e-d665-4ec1-80e1-272755de19ef: unsupported bset version at bucket 58290, block 0, 40818810 keys, disabling caching
>
>
>I checked the code in bch_btree_node_read_done() around this message:
>
>        for (;
>             b->written < btree_blocks(b) && i->seq == b->keys.set[0].data->seq;
>             i = write_block(b)) {
>                err = "unsupported bset version";
>                if (i->version > BCACHE_BSET_VERSION)
>                        goto err;
>
>The problem is that i->seq matches what we expect for this btree_node, but the version is not BCACHE_BSET_VERSION (1).
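
The situation described above can be modeled in user space. This is only a sketch: `struct sketch_bset`, `SKETCH_BSET_VERSION` and `sketch_check_bset()` are hypothetical stand-ins and not the kernel's definitions. It shows how a block whose seq matches but whose version field holds garbage ends up on the error path.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the value BCACHE_BSET_VERSION (1) quoted in the mail. */
#define SKETCH_BSET_VERSION 1u

/* Hypothetical, heavily simplified stand-in for an on-disk bset header. */
struct sketch_bset {
    uint64_t seq;
    uint32_t version;
};

/*
 * Returns NULL when the block is accepted, or when its seq does not match
 * (which simply ends the scan of the node). Returns an error string when
 * the seq matches but the version is newer than what we support.
 */
const char *sketch_check_bset(const struct sketch_bset *i, uint64_t node_seq)
{
    if (i->seq != node_seq)
        return NULL;                        /* not part of this node */
    if (i->version > SKETCH_BSET_VERSION)
        return "unsupported bset version";  /* seq matched, header is bogus */
    return NULL;
}
```

A block with a different seq is silently skipped, so the error can only fire when the seq comparison succeeds first.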
>
>
>
>I think there are two possible causes for this message:
>(1) Cache discard is not enabled.
>      When we allocate a bucket without discard enabled, the bucket may still contain outdated data,
>and it is possible that the value at the location of i->seq equals what we expect
>even though the block is not really a bset at all. In that case version, magic and bset_csum are all unexpected,
>and currently we goto err and stop the cache_set.
>
>
>(2) Power cut.
>       If a power cut happens while we are doing btree_node_write, we could write a partial btree node.
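
The partial-write case can be illustrated with a small user-space sketch. Everything here is assumed for illustration: the helper names are made up, and FNV-1a is used only as a stand-in checksum, not the algorithm behind bset_csum. The idea is that a checksum computed over the intended contents no longer matches once only a prefix of the data has reached the disk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a, used purely as a stand-in checksum for this sketch. */
uint64_t sketch_csum(const unsigned char *data, size_t len)
{
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t k = 0; k < len; k++) {
        h ^= data[k];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/*
 * Simulate a write interrupted after `written` bytes: the first `written`
 * bytes of new_data land on disk, the old contents remain after that.
 */
void sketch_torn_write(unsigned char *disk, const unsigned char *new_data,
                       size_t len, size_t written)
{
    if (written > len)
        written = len;
    memcpy(disk, new_data, written);
}

/* Nonzero when the on-disk bytes match the checksum of the intended data. */
int sketch_is_intact(const unsigned char *disk, size_t len, uint64_t expected)
{
    return sketch_csum(disk, len) == expected;
}
```

A checker could use this kind of test to tell a torn write apart from a block that was written completely.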
> 
>
>But when we hit this kind of problem, we can no longer use the cache device, and there is no tool to recover from it.
>
>I think I can cook up a bcache-check in bcache-tools, something like fsck, to check for this kind of problem
>
>and allow the user to repair it, with a warning that a forced repair is risky.
>
>
>
>Please help to point out if there is something I am missing. 
>
>
>
>Thanx
>Dongsheng
>
>
>





