> On 25 Jan 2019, at 21.20, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: > >> On Fri, Jan 25, 2019 at 7:33 PM Javier González <javier@xxxxxxxxxxx> wrote: >> >> >> >>>> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: >>>> >>>>> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@xxxxxxxxxxx> wrote: >>>>> >>>>>> On 1/25/19 3:21 PM, Hans Holmberg wrote: >>>>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@xxxxxxxxxxx> wrote: >>>>>> >>>>>> >>>>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@xxxxxxxxxxx> wrote: >>>>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: >>>>>>>>> >>>>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@xxxxxxxxxxx> wrote: >>>>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore >>>>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to >>>>>>>>>> avoid fake warning. >>>>>>>>> >>>>>>>>> fake -> spurious? >>>>>>>>> >>>>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@xxxxxxxxxxx> >>>>>>>>>> --- >>>>>>>>>> v3: return in case bit >= lm->blk_per_line. >>>>>>>>>> v2: changed according to Javier's comments. >>>>>>>>>> >>>>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++--- >>>>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c >>>>>>>>>> index 6761d2a..02d466e 100644 >>>>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c >>>>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c >>>>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk, >>>>>>>>>> struct nvm_chk_meta *chunk; >>>>>>>>>> struct ppa_addr ppa; >>>>>>>>>> u64 line_wp; >>>>>>>>>> - int pos, i; >>>>>>>>>> + int pos, i, bit; >>>>>>>>> >>>>>>>>> We don't need both bit and i, one of them is enough. >>>>>>>>> >>>>>>>>>> - rlun = &pblk->luns[0]; >>>>>>>>>> + bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line); >>>>>>>>>> + if (bit >= lm->blk_per_line) >>>>>>>>>> + return 0; >>>>>>>>> >>>>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced, >>>>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1 >>>>>>>>> >>>>>>>>> If you change this please document why using a comment, as it might >>>>>>>>> not be obvious >>>>>>>>> >>>>>>>>>> + rlun = &pblk->luns[bit]; >>>>>>>>>> ppa = rlun->bppa; >>>>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); >>>>>>>>>> chunk = &line->chks[pos]; >>>>>>>>> >>>>>>>>>> line_wp = chunk->wp; >>>>>>>>>> >>>>>>>>>> - for (i = 1; i < lm->blk_per_line; i++) { >>>>>>>>>> + for (i = bit + 1; i < lm->blk_per_line; i++) { >>>>>>>>> >>>>>>>>>> rlun = &pblk->luns[i]; >>>>>>>>>> ppa = rlun->bppa; >>>>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); >>>>>>>>>> chunk = &line->chks[pos]; >>>>>>>>> >>>>>>>>> This code is a copy of the code above, it'd be nice to refactor it >>>>>>>>> into a helper function or just do the chunk lookups in one place. >>>>>>>>> >>>>>>>>>> + if (chunk->state & NVM_CHK_ST_OFFLINE) >>>>>>>>>> + continue; >>>>>>>>>> + >>>>>>>>> >>>>>>>>> Since we rely on the block bitmap anyway, we might as well just >>>>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit >>>>>>>>> instead. >>>>>>>>> We do this in lots of other places, see: git grep -n >>>>>>>>> find_next_zero_bit -- drivers/lightnvm >>>>>>>> >>>>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way >>>>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there >>>>>>>> was no state. Now that we have state, it is better to use it instead. In >>>>>>>> fact, we should remove the bock bitmap as we have to check for the state >>>>>>>> either way - note that this aligns also very well with you patches >>>>>>>> removing the other bitmaps. >>>>>>> >>>>>>> These are just nitpicks. >>>>>>> >>>>>>> Relying on two data structures(chunks, block bitmap) to be in sync in >>>>>>> this function in stead of one does not make it more robust imho. >>>>>>> Either or (checking chunks or the block bitmap) is fine by me. >>>>>>> Searching the bitmap is more efficient, so that is what I proposed. >>>>>> >>>>>> chunk log page is the ground truth, so it is more robust. >>>>>> >>>>>> Also, pblk has a long way to start seeing bitmap search vs. integer >>>>>> comparisons in profiling. >>>>> >>>>> Hehe, yeah, but it does not hurt to use the better alternative when available. >>>>> >>>>>>> >>>>>>> If we want to remove the block bitmap, we can do that as a separate patch(set) >>>>>>> I do agree, having two copies of the chunk state is something worth >>>>>>> getting rid of :) >>>>>>> >>>>>> >>>>>> A good start is not adding code using what we want to remove. >>>>> >>>>> Well, I think it's very confusing to use both copies in the same function. >>>>> >>>>> Now we're nitpicking nitpicks :) >>>>> >>>> >>>> I look forward to a patch. Will one of you volunteer a patch? >>> >>> Sure! It'd be easier to Illustrate what I mean with a patch. >> >> Cool. Go ahead - you will see how much cleaner it is using the chunk info. >> >> Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue. > > Second that! I feel bad about the fix not making it in just because of > this squabbling. No squabbling :) We both care about pblk and want to get the code right. If you give that first pass to that patch, it would be great. We can take it from there. Nos enjoy the weekend! Javier.