Re: [RFC PATCH v2 13/16] bcache: fix fifo index swapping condition in btree_flush_write()

On 2019/4/23 3:09 PM, Hannes Reinecke wrote:
> On 4/19/19 6:05 PM, Coly Li wrote:
>> The current journal_max_cmp() and journal_min_cmp() assume that a smaller
>> fifo index indicates an older journal entry, but this is only true when
>> the fifo index is not swapped, i.e. has not wrapped around.
>>
>> The fifo structure journal.pin is implemented as a circular buffer. When
>> the head index reaches the highest location of the circular buffer, it
>> wraps around to 0. Once that happens, a smaller fifo index might be
>> associated with a newer journal entry, so the btree node with the oldest
>> journal entry won't be selected by btree_flush_write() to be flushed out
>> to the cache device. As a result, the oldest journal entries may never
>> get a chance to be written to the cache device, and after a reboot
>> bch_journal_replay() may complain that some journal entries are missing.
>>
>> This patch handles the fifo index wraparound condition properly, so that
>> in btree_flush_write() the btree node with the oldest journal entry can
>> be selected from c->flush_btree correctly.
>>
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Coly Li <colyli@xxxxxxx>
>> ---
>>   drivers/md/bcache/journal.c | 47 +++++++++++++++++++++++++++++++++++++++------
>>   1 file changed, 41 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
>> index bdb6f9cefe48..bc0e01151155 100644
>> --- a/drivers/md/bcache/journal.c
>> +++ b/drivers/md/bcache/journal.c
>> @@ -464,12 +464,47 @@ int bch_journal_replay(struct cache_set *s, struct list_head *list)
>>   }
>>     /* Journalling */
>> -#define journal_max_cmp(l, r) \
>> -    (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) < \
>> -     fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal))
>> -#define journal_min_cmp(l, r) \
>> -    (fifo_idx(&c->journal.pin, btree_current_write(l)->journal) > \
>> -     fifo_idx(&(c)->journal.pin, btree_current_write(r)->journal))
>> +#define journal_max_cmp(l, r)                        \
>> +({                                    \
>> +    int l_idx, r_idx, f_idx, b_idx;                    \
>> +    bool _ret = true;                        \
>> +                                    \
>> +    l_idx = fifo_idx(&c->journal.pin, btree_current_write(l)->journal); \
>> +    r_idx = fifo_idx(&c->journal.pin, btree_current_write(r)->journal); \
>> +    f_idx = c->journal.pin.front;                    \
>> +    b_idx = c->journal.pin.back;                    \
>> +                                    \
>> +    _ret = (l_idx < r_idx);                        \
>> +    /* in case fifo back pointer is swapped */            \
>> +    if (b_idx < f_idx) {                         \
>> +        if (l_idx <= b_idx && r_idx >= f_idx)            \
>> +            _ret = false;                    \
>> +        else if (l_idx >= f_idx && r_idx <= b_idx)        \
>> +            _ret = true;                    \
>> +    }                                \
>> +    _ret;                                \
>> +})
>> +
>> +#define journal_min_cmp(l, r)                        \
>> +({                                    \
>> +    int l_idx, r_idx, f_idx, b_idx;                    \
>> +    bool _ret = true;                        \
>> +                                    \
>> +    l_idx = fifo_idx(&c->journal.pin, btree_current_write(l)->journal); \
>> +    r_idx = fifo_idx(&c->journal.pin, btree_current_write(r)->journal); \
>> +    f_idx = c->journal.pin.front;                    \
>> +    b_idx = c->journal.pin.back;                    \
>> +                                    \
>> +    _ret = (l_idx > r_idx);                        \
>> +    /* in case fifo back pointer is swapped */            \
>> +    if (b_idx < f_idx) {                        \
>> +        if (l_idx <= b_idx && r_idx >= f_idx)            \
>> +            _ret = true;                    \
>> +        else if (l_idx >= f_idx && r_idx <= b_idx)        \
>> +            _ret = false;                    \
>> +    }                                \
>> +    _ret;                                \
>> +})
>>     static void btree_flush_write(struct cache_set *c)
>>   {
>>
> Please make it a proper function.
> This is far too convoluted for being handled via #define, and it would
> avoid cluttering the function namespace with hidden variables.

Hi Hannes,

Sure, let me do it in the next version. Thanks.
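
For illustration only, here is a minimal sketch of how the comparison might look as a proper function. It is not the next version of the patch: struct pin_fifo, journal_idx_older() and journal_idx_newer() are hypothetical names, and the sketch takes plain buffer positions as arguments instead of calling fifo_idx() and btree_current_write(), so that it can be compiled outside the kernel. The logic mirrors the two macros in the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for journal.pin: only the two indices that the
 * comparison needs. Not the kernel's fifo type.
 */
struct pin_fifo {
	size_t front;	/* position of the oldest entry */
	size_t back;	/* position where the next entry will be pushed */
};

/*
 * Return true if the entry at position l_idx is older than the entry at
 * position r_idx. Same logic as journal_max_cmp() in the patch.
 */
bool journal_idx_older(const struct pin_fifo *f, size_t l_idx, size_t r_idx)
{
	/* The back index has wrapped around past the end of the buffer. */
	if (f->back < f->front) {
		/* l is in the wrapped (newer) part, r in the older part. */
		if (l_idx <= f->back && r_idx >= f->front)
			return false;
		/* l is in the older part, r in the wrapped (newer) part. */
		if (l_idx >= f->front && r_idx <= f->back)
			return true;
	}

	/* Both positions are on the same side: a smaller index is older. */
	return l_idx < r_idx;
}

/*
 * Counterpart of journal_min_cmp(): l is newer than r exactly when r is
 * older than l, so no second copy of the logic is needed.
 */
bool journal_idx_newer(const struct pin_fifo *f, size_t l_idx, size_t r_idx)
{
	return journal_idx_older(f, r_idx, l_idx);
}
```

With front = 6 and back = 2 in an 8-slot buffer, journal_idx_older(&f, 6, 0) is true, because slot 6 was filled before the index wrapped around to 0. Typed parameters and ordinary locals also keep l_idx, r_idx and friends out of the caller's scope, which was the concern with the macro version.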


-- 

Coly Li


