On Wed, Oct 30, 2019 at 09:48:25AM -0700, Darrick J. Wong wrote:
> On Wed, Oct 30, 2019 at 09:37:11PM +0800, Pingfan Liu wrote:
> > xc_cil_lock is not enough to protect the integrity of a trans logging.
> > Taking the scenario:
> >   cpuA                              cpuB                          cpuC
> >
> >   xlog_cil_insert_format_items()
> >
> >   spin_lock(&cil->xc_cil_lock)
> >   link transA's items to xc_cil,
> >      including item1
> >   spin_unlock(&cil->xc_cil_lock)
> >                                                                   xlog_cil_push() fetches transA's item under xc_cil_lock
> >                                     issue transB, modify item1
> >                                                                   xlog_write(), but now, item1 contains content from transB and we have a broken transA
> >
> > Survive this race issue by putting under the protection of xc_ctx_lock.
> > Meanwhile the xc_cil_lock can be dropped as xc_ctx_lock does it against
> > xlog_cil_insert_items()
>
> How did you trigger this race?  Is there a test case to reproduce, or
> did you figure this out via code inspection?
>
Via code inspection. The conditions for hitting this bug are hard to meet:
a broken transA must be written to disk, and then the system must fail
before transB is written. Only then will recovery bring us to a broken
context.

Regards,
Pingfan
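
The interleaving described in the quoted patch can be modeled in user space. The following is a minimal single-threaded sketch that replays the three steps in program order; it is not XFS code. All names (`toy_item`, `toy_cil`, `toy_commit`, `toy_replay`) are hypothetical. It assumes, as the patch description does, that the push path holds only a reference to item1 between the fetch and the write. Whether the real CIL code can actually reach this interleaving is the question under discussion in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; these are not XFS types. */
struct toy_item {
    char data[16];           /* live content, rewritten by each transaction */
};

struct toy_cil {
    struct toy_item *item;   /* item linked into the list by transA */
    char snapshot[16];       /* private copy, used only when copy_at_fetch is set */
};

/* A transaction logging new content into the shared item. */
static void toy_commit(struct toy_item *it, const char *content)
{
    assert(strlen(content) < sizeof(it->data));
    strcpy(it->data, content);
}

/*
 * Replay the described interleaving in program order.
 * copy_at_fetch == 0: the push keeps only a pointer to the item.
 * copy_at_fetch != 0: the push copies the content when it fetches the item.
 * On return, out holds what the write step would emit for transA.
 */
static void toy_replay(int copy_at_fetch, char *out, size_t outlen)
{
    struct toy_item item1 = { "" };
    struct toy_cil cil = { 0 };

    /* cpuA: transA logs item1 and links it into the list. */
    toy_commit(&item1, "transA");
    cil.item = &item1;

    /* cpuC: the push fetches transA's items. */
    if (copy_at_fetch)
        memcpy(cil.snapshot, cil.item->data, sizeof(cil.snapshot));

    /* cpuB: transB modifies item1 before the write happens. */
    toy_commit(&item1, "transB");

    /* cpuC: the write step emits whatever the push now sees. */
    snprintf(out, outlen, "%s",
             copy_at_fetch ? cil.snapshot : cil.item->data);
}
```

In this model, `toy_replay(0, ...)` emits transB's content where transA's was expected, while `toy_replay(1, ...)` emits transA's content because the snapshot was taken before transB ran.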