On Fri, Jan 25, 2019 at 7:33 PM Javier González <javier@xxxxxxxxxxx> wrote: > > > > > On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: > > > >> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@xxxxxxxxxxx> wrote: > >> > >>> On 1/25/19 3:21 PM, Hans Holmberg wrote: > >>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@xxxxxxxxxxx> wrote: > >>>> > >>>> > >>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: > >>>>> > >>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@xxxxxxxxxxx> wrote: > >>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: > >>>>>>> > >>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@xxxxxxxxxxx> wrote: > >>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore > >>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to > >>>>>>>> avoid fake warning. > >>>>>>> > >>>>>>> fake -> spurious? > >>>>>>> > >>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@xxxxxxxxxxx> > >>>>>>>> --- > >>>>>>>> v3: return in case bit >= lm->blk_per_line. > >>>>>>>> v2: changed according to Javier's comments. > >>>>>>>> > >>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++--- > >>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-) > >>>>>>>> > >>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c > >>>>>>>> index 6761d2a..02d466e 100644 > >>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c > >>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c > >>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk, > >>>>>>>> struct nvm_chk_meta *chunk; > >>>>>>>> struct ppa_addr ppa; > >>>>>>>> u64 line_wp; > >>>>>>>> - int pos, i; > >>>>>>>> + int pos, i, bit; > >>>>>>> > >>>>>>> We don't need both bit and i, one of them is enough. > >>>>>>> > >>>>>>>> - rlun = &pblk->luns[0]; > >>>>>>>> + bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line); > >>>>>>>> + if (bit >= lm->blk_per_line) > >>>>>>>> + return 0; > >>>>>>> > >>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced, > >>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1 > >>>>>>> > >>>>>>> If you change this please document why using a comment, as it might > >>>>>>> not be obvious > >>>>>>> > >>>>>>>> + rlun = &pblk->luns[bit]; > >>>>>>>> ppa = rlun->bppa; > >>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); > >>>>>>>> chunk = &line->chks[pos]; > >>>>>>> > >>>>>>>> line_wp = chunk->wp; > >>>>>>>> > >>>>>>>> - for (i = 1; i < lm->blk_per_line; i++) { > >>>>>>>> + for (i = bit + 1; i < lm->blk_per_line; i++) { > >>>>>>> > >>>>>>>> rlun = &pblk->luns[i]; > >>>>>>>> ppa = rlun->bppa; > >>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); > >>>>>>>> chunk = &line->chks[pos]; > >>>>>>> > >>>>>>> This code is a copy of the code above, it'd be nice to refactor it > >>>>>>> into a helper function or just do the chunk lookups in one place. > >>>>>>> > >>>>>>>> + if (chunk->state & NVM_CHK_ST_OFFLINE) > >>>>>>>> + continue; > >>>>>>>> + > >>>>>>> > >>>>>>> Since we rely on the block bitmap anyway, we might as well just > >>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit > >>>>>>> instead. > >>>>>>> We do this in lots of other places, see: git grep -n > >>>>>>> find_next_zero_bit -- drivers/lightnvm > >>>>>> > >>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way > >>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there > >>>>>> was no state. Now that we have state, it is better to use it instead. In > >>>>>> fact, we should remove the bock bitmap as we have to check for the state > >>>>>> either way - note that this aligns also very well with you patches > >>>>>> removing the other bitmaps. > >>>>> > >>>>> These are just nitpicks. > >>>>> > >>>>> Relying on two data structures(chunks, block bitmap) to be in sync in > >>>>> this function in stead of one does not make it more robust imho. > >>>>> Either or (checking chunks or the block bitmap) is fine by me. > >>>>> Searching the bitmap is more efficient, so that is what I proposed. > >>>> > >>>> chunk log page is the ground truth, so it is more robust. > >>>> > >>>> Also, pblk has a long way to start seeing bitmap search vs. integer > >>>> comparisons in profiling. > >>> > >>> Hehe, yeah, but it does not hurt to use the better alternative when available. > >>> > >>>>> > >>>>> If we want to remove the block bitmap, we can do that as a separate patch(set) > >>>>> I do agree, having two copies of the chunk state is something worth > >>>>> getting rid of :) > >>>>> > >>>> > >>>> A good start is not adding code using what we want to remove. > >>> > >>> Well, I think it's very confusing to use both copies in the same function. > >>> > >>> Now we're nitpicking nitpicks :) > >>> > >> > >> I look forward to a patch. Will one of you volunteer a patch? > > > > Sure! It'd be easier to Illustrate what I mean with a patch. > > Cool. Go ahead - you will see how much cleaner it is using the chunk info. > > Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue. Second that! I feel bad about the fix not making it in just because of this squabbling. > Javier.