On Wed, May 19, 2021 at 05:15:53PM -0400, Josef Bacik wrote:
> Error injection testing uncovered a pretty severe problem where we could
> end up committing a super that pointed to the wrong tree roots,
> resulting in transid mismatch errors.
>
> The way we commit the transaction is we update the super copy with the
> current generations and bytenrs of the important roots, and then copy
> that into our super_for_commit. Then we allow transactions to continue
> again, we write out the dirty pages for the transaction, and then we
> write the super. If the write out fails we'll bail and skip writing the
> supers.
>
> However since we've allowed a new transaction to start, we can have a
> log attempting to sync at this point, which would be blocked on
> fs_info->tree_log_mutex. Once the commit fails we're allowed to do the
> log tree commit, which uses super_for_commit, which now points at fs
> tree's that were not written out.
>
> Fix this by checking BTRFS_FS_STATE_ERROR once we acquire the
> tree_log_mutex. This way if the transaction commit fails we're sure to
> see this bit set and we can skip writing the super out. This patch
> fixes this specific transid mismatch error I was seeing with this
> particular error path.
>
> cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Added to misc-next, with the suggested comment update. Thanks.
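
For readers following along, here is a rough userspace model of the race and
the check described in the changelog. It is a sketch only, not the kernel
code: struct model_fs, model_commit() and model_log_sync() are hypothetical
names, the fs_error flag stands in for the BTRFS_FS_STATE_ERROR bit, a pthread
mutex stands in for fs_info->tree_log_mutex, and the -EIO return value is an
assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fs_info state (not the btrfs layout). */
struct model_fs {
	pthread_mutex_t tree_log_mutex;      /* models fs_info->tree_log_mutex */
	bool fs_error;                       /* models the BTRFS_FS_STATE_ERROR bit */
	unsigned long super_for_commit_gen;  /* generation staged for the next super write */
	unsigned long on_disk_super_gen;     /* generation of the last super actually written */
};

/*
 * Commit path: stage the super first, then "write" the dirty pages.
 * On failure the error flag is set before the mutex is dropped, and the
 * super write is skipped.
 */
static int model_commit(struct model_fs *fs, unsigned long gen, bool write_fails)
{
	int ret = 0;

	assert(fs != NULL);
	pthread_mutex_lock(&fs->tree_log_mutex);
	fs->super_for_commit_gen = gen;      /* staged before the pages hit disk */
	if (write_fails) {
		fs->fs_error = true;
		ret = -EIO;
	} else {
		fs->on_disk_super_gen = gen;
	}
	pthread_mutex_unlock(&fs->tree_log_mutex);
	return ret;
}

/*
 * Log sync path: blocks on the same mutex. Once it holds the mutex it
 * checks the error flag, so a failed commit is always observed and the
 * stale super_for_commit is never written.
 */
static int model_log_sync(struct model_fs *fs)
{
	int ret = 0;

	assert(fs != NULL);
	pthread_mutex_lock(&fs->tree_log_mutex);
	if (fs->fs_error)
		ret = -EIO;                  /* skip writing the super */
	else
		fs->on_disk_super_gen = fs->super_for_commit_gen;
	pthread_mutex_unlock(&fs->tree_log_mutex);
	return ret;
}
```

In this model, a failed model_commit() followed by model_log_sync() leaves
on_disk_super_gen at its previous value. Without the fs_error check, the log
sync would publish the generation staged by the failed commit, which is the
transid mismatch scenario from the changelog.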