> On 23 Apr 2017, at 19.59, Matias Bjørling <mb@xxxxxxxxxxx> wrote: > > On 04/22/2017 11:31 AM, Javier González wrote: >>> On 22 Apr 2017, at 11.22, Matias Bjørling <mb@xxxxxxxxxxx> wrote: >>> >>> On 04/22/2017 01:32 AM, Javier González wrote: >>>> When block erases fail, these blocks are marked bad. The number of valid >>>> blocks in the line was not updated, which could cause an infinite loop >>>> on the erase path. >>>> >>>> Fix this atomic counter and, in order to avoid taking an irq lock on the >>>> interrupt context, make the erase counters atomic too. >>> >>> I can't find out where the counters are used in irq context? Can you >>> point me in the right direction? I'll prefer for these counters to go >>> in under the existing line_lock. >> >> This counter is line->blk_in_line, which is used on pblk_mark_bb. This >> is triggered on the erase completion path. Note that we do not want to >> wait until the recovery kicks in on the workqueue since the line might >> start getting recycled straight away if we are close to reaching >> capacity. This is indeed the scenario that triggers the race condition. >> >> Making all erase counters atomic allows us to avoid taking the >> line_lock. Note that the counters do not depend on each other. >> >>>> Also, in the case that a significant number of blocks become bad in a >>>> line, the result is the double shared metadata buffer (emeta) to stop >>>> the pipeline until all metadata is flushed to the media. Increase the >>>> number of metadata lines from 2 to 4 to avoid this case. >>> >>> How does moving to 4 lines solve this case? The way I read it is that >>> it only postpones when this occurs? >> >> The chances of having more than 2 lines falling out of blocks after >> pre-condition are slim. Adding two more lines should be enough. >> >>>> [...] >>>> >>>> -#define PBLK_DATA_LINES 2 >>>> +#define PBLK_DATA_LINES 4 >>> >>> Why this change? I like to keep new features for 4.13. Only bugfixes for 4.12. >> >> This is the 4 lines referred above. I see it as a bug fix since we risk >> stalling the pipeline if a line goes above a certain number of bad >> blocks on initialization, but we can leave it out and fix this when we >> add in-line metadata on the write thread for 4.12 >> >> Thanks, >> Javier > > Okay. It tickles me a bit. However, to make it pretty, some > refactoring is needed, which we won't push for this release. > > Reviewed-by: Matias Bjørling <matias@xxxxxxxxxxxx> Yes, you are right. I think it will become much better as we implement the FTL log and refactor the metadata path. Thanks! Javier
Attachment:
signature.asc
Description: Message signed with OpenPGP