On 2020-06-10, Petr Mladek <pmladek@xxxxxxxx> wrote:
>> +static bool data_make_reusable(struct printk_ringbuffer *rb,
>> +			       struct prb_data_ring *data_ring,
>> +			       unsigned long lpos_begin,
>> +			       unsigned long lpos_end,
>> +			       unsigned long *lpos_out)
>> +{
>> +	struct prb_desc_ring *desc_ring = &rb->desc_ring;
>> +	struct prb_data_blk_lpos *blk_lpos;
>> +	struct prb_data_block *blk;
>> +	unsigned long tail_lpos;
>> +	enum desc_state d_state;
>> +	struct prb_desc desc;
>> +	unsigned long id;
>> +
>> +	/*
>> +	 * Using the provided @data_ring, point @blk_lpos to the correct
>> +	 * blk_lpos within the local copy of the descriptor.
>> +	 */
>> +	if (data_ring == &rb->text_data_ring)
>> +		blk_lpos = &desc.text_blk_lpos;
>> +	else
>> +		blk_lpos = &desc.dict_blk_lpos;
>> +
>> +	/* Loop until @lpos_begin has advanced to or beyond @lpos_end. */
>> +	while ((lpos_end - lpos_begin) - 1 < DATA_SIZE(data_ring)) {
>> +		blk = to_block(data_ring, lpos_begin);
>> +		id = READ_ONCE(blk->id); /* LMM(data_make_reusable:A) */
>> +
>> +		/*
>> +		 * Guarantee the block ID is loaded before checking the tail
>> +		 * lpos. The loaded block ID can only be considered valid if
>> +		 * the tail lpos has not overtaken @lpos_begin. This pairs
>> +		 * with data_alloc:A.
>> +		 *
>> +		 * Memory barrier involvement:
>> +		 *
>> +		 * If data_make_reusable:A reads from data_alloc:B, then
>> +		 * data_make_reusable:C reads from data_push_tail:D.
>> +		 *
>> +		 * Relies on:
>> +		 *
>> +		 * MB from data_push_tail:D to data_alloc:B
>> +		 *    matching
>> +		 * RMB from data_make_reusable:A to data_make_reusable:C
>> +		 *
>> +		 * Note: data_push_tail:D and data_alloc:B can be different
>> +		 *       CPUs. However, the data_alloc:B CPU (which performs
>> +		 *       the full memory barrier) must have previously seen
>> +		 *       data_push_tail:D.
>> +		 */
>> +		smp_rmb(); /* LMM(data_make_reusable:B) */
>> +
>> +		tail_lpos = atomic_long_read(&data_ring->tail_lpos
>> +					); /* LMM(data_make_reusable:C) */
>> +
>> +		/*
>> +		 * If @lpos_begin has fallen behind the tail lpos, the read
>> +		 * block ID cannot be trusted. Fast forward @lpos_begin to the
>> +		 * tail lpos and try again.
>> +		 */
>> +		if (lpos_begin - tail_lpos >= DATA_SIZE(data_ring)) {
>> +			lpos_begin = tail_lpos;
>> +			continue;
>> +		}
>> +
>> +		d_state = desc_read(desc_ring, id,
>> +				    &desc); /* LMM(data_make_reusable:D) */
>> +
>> +		switch (d_state) {
>> +		case desc_miss:
>> +			return false;
>> +		case desc_reserved:
>> +			return false;
>> +		case desc_committed:
>> +			/*
>> +			 * This data block is invalid if the descriptor
>> +			 * does not point back to it.
>> +			 */
>
> Here again the comments describe what the check does but not why.
> I would write something like:
>
>	/*
>	 * The block might have already been reused. Make sure that
>	 * the descriptor really points back to the checked lpos. It
>	 * covers both situations: random data might point to a valid
>	 * descriptor just by chance, or the block might have already
>	 * been reused by another descriptor.
>	 */

Originally this check was needed because the descriptor would be read
even if there was a data race when reading the ID from the data block.
Validating the lpos value was a kind of protection against random data
that by chance yielded the ID of a committed/reusable descriptor.

However, after you pointed out that this check was not enough, the code
now re-checks the data tail to make sure that no data race happened. So
it is actually not possible for a descriptor in the committed/reusable
state to point anywhere else: we know the ID is not random garbage or
recycled, so the state can be trusted.

I recommend either removing this sanity check (for committed and
reusable) or at least changing it to:

	WARN_ON_ONCE(blk_lpos->begin != lpos_begin);

Or do you see any possibility of this case?
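With the WARN_ON_ONCE() variant, the two cases would collapse to
something like this (untested sketch, only to illustrate; the
desc_miss and desc_reserved cases stay as they are):

	case desc_committed:
		/*
		 * The tail re-check above guarantees that @id was not
		 * read from an already recycled block, so the
		 * descriptor must point back to @lpos_begin.
		 */
		WARN_ON_ONCE(blk_lpos->begin != lpos_begin);
		desc_make_reusable(desc_ring, id);
		break;
	case desc_reusable:
		WARN_ON_ONCE(blk_lpos->begin != lpos_begin);
		break;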
>> +			if (blk_lpos->begin != lpos_begin)
>> +				return false;
>> +			desc_make_reusable(desc_ring, id);
>> +			break;
>> +		case desc_reusable:
>> +			/*
>> +			 * This data block is invalid if the descriptor
>> +			 * does not point back to it.
>> +			 */
>> +			if (blk_lpos->begin != lpos_begin)
>> +				return false;
>> +			break;
>> +		}
>> +
>> +		/* Advance @lpos_begin to the next data block. */
>> +		lpos_begin = blk_lpos->next;
>> +	}
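As a side note, the barrier pairing described in the big comment above
is a message-passing pattern. Roughly, in my own words (a hand-written
summary, not code from the patch; as the comment notes,
data_push_tail:D and data_alloc:B may even run on different CPUs):

	/* writer side */
	tail_lpos = new_tail;		/* data_push_tail:D */
	/* full MB, provided by data_alloc:A */
	blk->id = id;			/* data_alloc:B */

	/* reader side (data_make_reusable()) */
	id = READ_ONCE(blk->id);	/* data_make_reusable:A */
	smp_rmb();			/* data_make_reusable:B */
	tail = atomic_long_read(&data_ring->tail_lpos);
					/* data_make_reusable:C */

	/*
	 * If data_make_reusable:A reads the value stored by
	 * data_alloc:B, then data_make_reusable:C is guaranteed to
	 * observe the tail moved by data_push_tail:D, so a stale ID
	 * is always caught by the tail re-check.
	 */

John Ogness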