Nice catch Igor, I hadn't thought of that. Nevertheless, here is what
I think:

In the absence of a flush we don't need to enforce ordering so we
don't care about recovering the older gc'ed write. If we completed a
flush after the user write, we should have already invalidated the gc
mapping and hence will not recover it.

Let me know if I am missing something.

On Fri, Apr 26, 2019 at 6:46 AM Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
>
> On 26.04.2019 12:04, Javier González wrote:
> >
> >> On 26 Apr 2019, at 11.11, Igor Konopko <igor.j.konopko@xxxxxxxxx> wrote:
> >>
> >> On 25.04.2019 07:21, Heiner Litz wrote:
> >>> Introduce the capability to manage multiple open lines. Maintain one line
> >>> for user writes (hot) and a second line for gc writes (cold). As user and
> >>> gc writes still utilize a shared ring buffer, in rare cases a multi-sector
> >>> write will contain both gc and user data. This is acceptable, as on a
> >>> tested SSD with minimum write size of 64KB, less than 1% of all writes
> >>> contain both hot and cold sectors.
> >>
> >> Hi Heiner
> >>
> >> Generally I really like this changes, I was thinking about sth similar since a while, so it is very good to see that patch.
> >>
> >> I have a one question related to this patch, since it is not very clear for me - how you ensure the data integrity in following scenarios:
> >> -we have open line X for user data and line Y for GC
> >> -GC writes LBA=N to line Y
> >> -user writes LBA=N to line X
> >> -we have power failure when both line X and Y were not written completely
> >> -during pblk creation we are executing OOB metadata recovery
> >> And here is the question, how we distinguish whether LBA=N from line Y or LBA=N from line X is the valid one?
> >> Line X and Y might have seq_id either descending or ascending - this would create two possible scenarios too.
> >>
> >> Thanks
> >> Igor
> >>
> >
> > You are right, I think this is possible in the current implementation.
> >
> > We need an extra constrain so that we only GC lines above the GC line
> > ID. This way, when we order lines on recovery, we can guarantee
> > consistency. This means potentially that we would need several open
> > lines for GC to avoid padding in case this constrain forces to choose a
> > line with an ID higher than the GC line ID.
> >
> > What do you think?
>
> I'm not sure yet about your approach, I need to think and analyze this a
> little more.
>
> I also believe that probably we need to ensure that current user data
> line seq_id is always above the current GC line seq_id or sth like that.
> We cannot also then GC any data from the lines which are still open, but
> I believe that this is a case even right now.
>
> >
> > Thanks,
> > Javier
> >
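
To make the scenario quoted above concrete, here is a minimal toy model of
it. It is only a sketch and not pblk code: it assumes that recovery replays
the OOB metadata of each line in ascending seq_id order and that the last
replayed mapping for an LBA wins. The names toy_line, toy_recover_l2p and
toy_winner are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_LBAS     16          /* size of the toy L2P table */
#define LINE_EMPTY  (-1)        /* LBA not mapped to any line */
#define MAX_SECS    4           /* sectors per toy line */

/* Hypothetical stand-in for an open line; NOT the real struct pblk_line. */
struct toy_line {
	int id;                     /* line ID */
	uint64_t seq_id;            /* sequence number used to order recovery */
	int nr_secs;                /* sectors written before the power failure */
	uint64_t oob_lba[MAX_SECS]; /* LBA stored in each sector's OOB area */
};

/* qsort comparator: ascending seq_id. */
static int cmp_seq(const void *a, const void *b)
{
	const struct toy_line *la = a, *lb = b;

	return (la->seq_id > lb->seq_id) - (la->seq_id < lb->seq_id);
}

/*
 * Assumed recovery model: replay lines in ascending seq_id order, so the
 * mapping from the line with the highest seq_id overwrites older ones.
 */
void toy_recover_l2p(struct toy_line *lines, size_t nr_lines, int *l2p)
{
	size_t i;
	int s;

	for (i = 0; i < NR_LBAS; i++)
		l2p[i] = LINE_EMPTY;

	qsort(lines, nr_lines, sizeof(*lines), cmp_seq);

	for (i = 0; i < nr_lines; i++) {
		for (s = 0; s < lines[i].nr_secs; s++) {
			assert(lines[i].oob_lba[s] < NR_LBAS);
			l2p[lines[i].oob_lba[s]] = lines[i].id;
		}
	}
}

/*
 * Line X (id 0) holds the newer user write of LBA n, line Y (id 1) holds
 * the older GC copy of the same LBA. Returns the line ID that LBA n maps
 * to after recovery.
 */
int toy_winner(uint64_t n, uint64_t seq_user, uint64_t seq_gc)
{
	struct toy_line lines[2] = {
		{ .id = 0, .seq_id = seq_user, .nr_secs = 1, .oob_lba = { n } },
		{ .id = 1, .seq_id = seq_gc,   .nr_secs = 1, .oob_lba = { n } },
	};
	int l2p[NR_LBAS];

	assert(n < NR_LBAS);
	toy_recover_l2p(lines, 2, l2p);
	return l2p[n];
}
```

Under these assumptions, toy_winner(5, 10, 9) returns 0 (the user write on
line X survives), while toy_winner(5, 9, 10) returns 1 (the stale GC copy on
line Y survives). The second case is the one that the seq_id ordering
constraints discussed in the thread are meant to rule out.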