> On 25 Jan 2019, at 17.46, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: > >> On Fri, Jan 25, 2019 at 3:35 PM Matias Bjørling <mb@xxxxxxxxxxx> wrote: >> >>> On 1/25/19 3:21 PM, Hans Holmberg wrote: >>>> On Fri, Jan 25, 2019 at 2:33 PM Javier González <javier@xxxxxxxxxxx> wrote: >>>> >>>> >>>>> On 25 Jan 2019, at 13.59, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: >>>>> >>>>> On Fri, Jan 25, 2019 at 10:41 AM Javier González <javier@xxxxxxxxxxx> wrote: >>>>>>> On 25 Jan 2019, at 09.47, Hans Holmberg <hans.ml.holmberg@xxxxxxxxxxxxx> wrote: >>>>>>> >>>>>>> On Thu, Jan 24, 2019 at 8:51 PM Zhoujie Wu <zjwu@xxxxxxxxxxx> wrote: >>>>>>>> The write pointer of the bad block could be 0 or undefined, ignore >>>>>>>> the checking of the bad block wp for pblk_line_wp_is_unbalanced to >>>>>>>> avoid fake warning. >>>>>>> >>>>>>> fake -> spurious? >>>>>>> >>>>>>>> Signed-off-by: Zhoujie Wu <zjwu@xxxxxxxxxxx> >>>>>>>> --- >>>>>>>> v3: return in case bit >= lm->blk_per_line. >>>>>>>> v2: changed according to Javier's comments. >>>>>>>> >>>>>>>> drivers/lightnvm/pblk-recovery.c | 12 +++++++++--- >>>>>>>> 1 file changed, 9 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c >>>>>>>> index 6761d2a..02d466e 100644 >>>>>>>> --- a/drivers/lightnvm/pblk-recovery.c >>>>>>>> +++ b/drivers/lightnvm/pblk-recovery.c >>>>>>>> @@ -312,21 +312,27 @@ static int pblk_line_wp_is_unbalanced(struct pblk *pblk, >>>>>>>> struct nvm_chk_meta *chunk; >>>>>>>> struct ppa_addr ppa; >>>>>>>> u64 line_wp; >>>>>>>> - int pos, i; >>>>>>>> + int pos, i, bit; >>>>>>> >>>>>>> We don't need both bit and i, one of them is enough. >>>>>>> >>>>>>>> - rlun = &pblk->luns[0]; >>>>>>>> + bit = find_first_zero_bit(line->blk_bitmap, lm->blk_per_line); >>>>>>>> + if (bit >= lm->blk_per_line) >>>>>>>> + return 0; >>>>>>> >>>>>>> If there is only one non-offline chunk in the line, the wp can't be unbalanced, >>>>>>> so it should be safe to return 0 here if bit >= lm->blk_per_line - 1 >>>>>>> >>>>>>> If you change this please document why using a comment, as it might >>>>>>> not be obvious >>>>>>> >>>>>>>> + rlun = &pblk->luns[bit]; >>>>>>>> ppa = rlun->bppa; >>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); >>>>>>>> chunk = &line->chks[pos]; >>>>>>> >>>>>>>> line_wp = chunk->wp; >>>>>>>> >>>>>>>> - for (i = 1; i < lm->blk_per_line; i++) { >>>>>>>> + for (i = bit + 1; i < lm->blk_per_line; i++) { >>>>>>> >>>>>>>> rlun = &pblk->luns[i]; >>>>>>>> ppa = rlun->bppa; >>>>>>>> pos = pblk_ppa_to_pos(geo, ppa); >>>>>>>> chunk = &line->chks[pos]; >>>>>>> >>>>>>> This code is a copy of the code above, it'd be nice to refactor it >>>>>>> into a helper function or just do the chunk lookups in one place. >>>>>>> >>>>>>>> + if (chunk->state & NVM_CHK_ST_OFFLINE) >>>>>>>> + continue; >>>>>>>> + >>>>>>> >>>>>>> Since we rely on the block bitmap anyway, we might as well just >>>>>>> iterate over the zeroes in the block bitmap using find_next_zero_bit >>>>>>> instead. >>>>>>> We do this in lots of other places, see: git grep -n >>>>>>> find_next_zero_bit -- drivers/lightnvm >>>>>> >>>>>> Hans, I proposed him to use the chunk->state instead. I think it is way >>>>>> more robust. We introduced the block bitmap for OCSSD 1.2, because there >>>>>> was no state. Now that we have state, it is better to use it instead. In >>>>>> fact, we should remove the bock bitmap as we have to check for the state >>>>>> either way - note that this aligns also very well with you patches >>>>>> removing the other bitmaps. >>>>> >>>>> These are just nitpicks. >>>>> >>>>> Relying on two data structures(chunks, block bitmap) to be in sync in >>>>> this function in stead of one does not make it more robust imho. >>>>> Either or (checking chunks or the block bitmap) is fine by me. >>>>> Searching the bitmap is more efficient, so that is what I proposed. >>>> >>>> chunk log page is the ground truth, so it is more robust. >>>> >>>> Also, pblk has a long way to start seeing bitmap search vs. integer >>>> comparisons in profiling. >>> >>> Hehe, yeah, but it does not hurt to use the better alternative when available. >>> >>>>> >>>>> If we want to remove the block bitmap, we can do that as a separate patch(set) >>>>> I do agree, having two copies of the chunk state is something worth >>>>> getting rid of :) >>>>> >>>> >>>> A good start is not adding code using what we want to remove. >>> >>> Well, I think it's very confusing to use both copies in the same function. >>> >>> Now we're nitpicking nitpicks :) >>> >> >> I look forward to a patch. Will one of you volunteer a patch? > > Sure! It'd be easier to Illustrate what I mean with a patch. Cool. Go ahead - you will see how much cleaner it is using the chunk info. Matias : In any case Zhoujie’s fix is orthogonal to this discussion and I think you should pick it up - in one form of bb iteration or another - as it fixes a real issue. Javier.