On Thu, Mar 26, 2020 at 10:25:00AM +1100, Dave Chinner wrote: > On Wed, Mar 25, 2020 at 08:33:14AM -0400, Brian Foster wrote: > > On Tue, Mar 24, 2020 at 06:44:52PM +0100, Christoph Hellwig wrote: > > > From: Dave Chinner <dchinner@xxxxxxxxxx> > > > > > > The xlog_write() function iterates over iclogs until it completes > > > writing all the log vectors passed in. The ticket tracks whether > > > a start record has been written or not, so only the first iclog gets > > > a start record. We only ever pass single use tickets to > > > xlog_write() so we only ever need to write a start record once per > > > xlog_write() call. > > > > > > Hence we don't need to store whether we should write a start record > > > in the ticket as the callers provide all the information we need to > > > determine if a start record should be written. For the moment, we > > > have to ensure that we clear the XLOG_TIC_INITED appropriately so > > > the code in xfs_log_done() still works correctly for committing > > > transactions. > > > > > > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx> > > > [hch: use an need_start_rec bool] > > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > > > --- > > > fs/xfs/xfs_log.c | 63 ++++++++++++++++++++++++------------------------ > > > 1 file changed, 32 insertions(+), 31 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c > > > index 2a90a483c2d6..bf071552094a 100644 > > > --- a/fs/xfs/xfs_log.c > > > +++ b/fs/xfs/xfs_log.c > > ... > > > @@ -2372,25 +2359,29 @@ xlog_write( > > > int record_cnt = 0; > > > int data_cnt = 0; > > > int error = 0; > > > + bool need_start_rec = true; > > > > > > *start_lsn = 0; > > > > > > - len = xlog_write_calc_vec_length(ticket, log_vector); > > > > > > /* > > > * Region headers and bytes are already accounted for. > > > * We only need to take into account start records and > > > * split regions in this function. > > > */ > > > - if (ticket->t_flags & XLOG_TIC_INITED) > > > - ticket->t_curr_res -= sizeof(xlog_op_header_t); > > > + if (ticket->t_flags & XLOG_TIC_INITED) { > > > + ticket->t_curr_res -= sizeof(struct xlog_op_header); > > > + ticket->t_flags &= ~XLOG_TIC_INITED; > > > + } > > > > > > /* > > > * Commit record headers need to be accounted for. These > > > * come in as separate writes so are easy to detect. > > > */ > > > - if (flags & (XLOG_COMMIT_TRANS | XLOG_UNMOUNT_TRANS)) > > > - ticket->t_curr_res -= sizeof(xlog_op_header_t); > > > + if (flags & (XLOG_COMMIT_TRANS | XLOG_UNMOUNT_TRANS)) { > > > + ticket->t_curr_res -= sizeof(struct xlog_op_header); > > > + need_start_rec = false; > > > + } > > > > Hmm.. I was asking for a comment update in v1 for this logic change. > > Looking through it again, what happens here if > > xfs_log_write_unmount_record() clears the UNMOUNT_TRANS flag for that > > summary counter check? That looks like a potential behavior change wrt > > to the start record.. > > xfs_log_write_unmount_record() clears the XLOG_TIC_INITED > flag before calling xlog_write(), so the current code never writes > out a start record for the unmount record. i.e. the unmount > record is a single region with the unmount log item in it, and > AFAICT this code does not change the behaviour of the unmount record > write at all. > I'm referring to the UNMOUNT_TRANS flag, not t_flags. With this patch, we actually would write a start record in some cases because need_start_rec is toggled based on the flags parameter and the summary counter check zeroes it. > FWIW, that error injection code looks dodgy - it turns the unmount > record into an XFS_LOG transaction type with an invalid log item > type (0). That probably should be flagged as corruption, not be > silently ignored by recovery. IOWs, I think the error injection code > was wrong to begin with - if we want to ensure the log is dirty at > unmount, we should just skip writing the unmount record altogether. > That may be true... TBH I wasn't totally clear on what that logic was for (it isn't purely error injection). From the commit (f467cad95f5e3) log, the intent appears to be to "skip writing the unmount record," but that doesn't quite describe the behavior. Darrick might want to comment..? If we do revisit this, I'm mainly curious on whether there's a change in recovery behavior between having this specially crafted record vs. just writing nothing. For example, does recovery still set the head/tail based on this record even though we don't mark the log clean? If so, do we care..? Brian > Cheers, > > Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx >