On Fri, Nov 01, 2019 at 08:40:31AM +1100, Dave Chinner wrote: > On Wed, Oct 30, 2019 at 09:37:11PM +0800, Pingfan Liu wrote: > > xc_cil_lock is not enough to protect the integrity of a trans logging. > > Taking the scenario: > > cpuA cpuB cpuC > > > > xlog_cil_insert_format_items() > > > > spin_lock(&cil->xc_cil_lock) > > link transA's items to xc_cil, > > including item1 > > spin_unlock(&cil->xc_cil_lock) > > xlog_cil_push() fetches transA's item under xc_cil_lock > > issue transB, modify item1 > > xlog_write(), but now, item1 contains content from transB and we have a broken transA > > TL;DR: 1. log vectors. 2. CIL context lock exclusion. > > When CPU A formats the item during commit, it copies all the changes > into a list of log vectors, and that is attached to the log item > and the item is added to the CIL. The item is then unlocked. This is > done with the CIL context lock held excluding CIL pushes. > > When CPU C pushes on the CIL, it detatches the -log vectors- from > the log item and removes the item from the CIL. This is done hold > the CIL context lock, excluding transaction commits from modifying > the CIL log vector list. It then formats the -log vectors- into the > journal by passing them to xlog_write(). It does not use log items > for this, and because the log vector list has been isolated and is > now private to the push context, we don't need to hold any locks > anymore to call xlog_write.... Yes. I failed to realize it. The critical "item->li_lv = NULL" in xlog_cil_push(), which isolates the vectors and free of new modification even after releasing xc_ctx_lock. [...] > > * initialise the new context and attach it to the CIL. Then attach > > @@ -783,6 +767,25 @@ xlog_cil_push( > > up_write(&cil->xc_ctx_lock); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > We don't hold the CIL context lock anymore.... > Doze on it, make a mistaken reverse recognition of the up/down meaning. Thank you for very patient and detailed explain. I get a full understanding now. Regards, Pingfan