Re: [PATCH 26/45] xfs: log ticket region debug is largely useless

On Tue, Mar 16, 2021 at 10:55:34AM -0400, Brian Foster wrote:
> On Fri, Mar 05, 2021 at 04:11:24PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> > 
> > xlog_tic_add_region() is used to trace the regions being added to a
> > log ticket to provide information in the situation where a ticket
> > reservation overrun occurs. The information gathered is stored in
> > the ticket, and dumped if xlog_print_tic_res() is called.
> > 
> > For a front end struct xfs_trans overrun, the ticket only contains
> > reservation tracking information - the ticket is never handed to the
> > log so has no regions attached to it. The overrun debug information in this
> > case comes from xlog_print_trans(), which walks the items attached
> > to the transaction and dumps their attached formatted log vectors
> > directly. It also dumps the ticket state, but that only contains
> > reservation accounting and nothing else. Hence xlog_print_tic_res()
> > never dumps region or overrun information from this path.
> > 
> > xlog_tic_add_region() is actually called from xlog_write(), which
> > means it is being used to track the regions seen in a
> > CIL checkpoint log vector chain. In looking at CIL behaviour
> > recently, I've seen 32MB checkpoints regularly exceed 250,000
> > regions in the LV chain. The log ticket debug code can track *15*
> > regions. IOWs, if there is a ticket overrun in the CIL code, the
> > ticket region tracking code is going to be completely useless for
> > determining what went wrong. The only thing it can tell us is how
> > much of an overrun occurred, and we really don't need extra debug
> > information in the log ticket to tell us that.
> > 
> > Indeed, the main place we call xlog_tic_add_region() is also adding
> > up the number of regions and the space used so that xlog_write()
> > knows how much will be written to the log. This is exactly the same
> > information that the log ticket is storing once we take away the useless
> > region tracking array. Hence xlog_tic_add_region() is not useful,
> > but can be called 250,000 times a CIL push...
> > 
> > Just strip all that debug "information" out of the log ticket
> > and only have it report reservation space information when an
> > overrun occurs. This also reduces the size of a log ticket down by
> > about 150 bytes...
> > 
> > Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
> > Reviewed-by: Christoph Hellwig <hch@xxxxxx>
> > ---
> >  fs/xfs/xfs_log.c      | 107 +++---------------------------------------
> >  fs/xfs/xfs_log_priv.h |  17 -------
> >  2 files changed, 6 insertions(+), 118 deletions(-)
> > 
> ...
> > diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
> > index 7f601c1c9f45..8ee6a5f74396 100644
> > --- a/fs/xfs/xfs_log_priv.h
> > +++ b/fs/xfs/xfs_log_priv.h
> > @@ -139,16 +139,6 @@ enum xlog_iclog_state {
> >  /* Ticket reservation region accounting */ 
> >  #define XLOG_TIC_LEN_MAX	15
> >  
> 
> This is unused now.

Removed.

> 
> > -/*
> > - * Reservation region
> > - * As would be stored in xfs_log_iovec but without the i_addr which
> > - * we don't care about.
> > - */
> > -typedef struct xlog_res {
> > -	uint	r_len;	/* region length		:4 */
> > -	uint	r_type;	/* region's transaction type	:4 */
> > -} xlog_res_t;
> > -
> >  typedef struct xlog_ticket {
> >  	struct list_head   t_queue;	 /* reserve/write queue */
> >  	struct task_struct *t_task;	 /* task that owns this ticket */
> > @@ -159,13 +149,6 @@ typedef struct xlog_ticket {
> >  	char		   t_ocnt;	 /* original count		 : 1  */
> >  	char		   t_cnt;	 /* current count		 : 1  */
> >  	char		   t_flags;	 /* properties of reservation	 : 1  */
> > -
> > -        /* reservation array fields */
> > -	uint		   t_res_num;                    /* num in array : 4 */
> > -	uint		   t_res_num_ophdrs;		 /* num op hdrs  : 4 */
> 
> I'm curious why we wouldn't want to retain the ophdr count..? That's
> managed separately from the _add_region() bits and provides some info on
> the total number of vectors, etc. Otherwise looks reasonable.

Because we calculate it when we build the lv chain in a push and
only use the value in the checkpoint transaction header. This is now
entirely encapsulated within the CIL push code and is no longer
something xlog_write() needs to know because it doesn't build
on-disk transaction headers. If we need debug to track how long the
lvchain is, then grab it in the CIL code where it is already known...
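
As a rough illustration of "grab it where it is already known", here is a
minimal userspace sketch of counting vectors and bytes in a single walk over
a log vector chain. Every name in it (demo_iovec, demo_log_vec,
demo_chain_totals, demo_count_chain) is a hypothetical stand-in, not the
real xfs structures or the actual CIL push code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one formatted region; not struct xfs_log_iovec. */
struct demo_iovec {
	size_t len;			/* bytes in this region */
};

/* Hypothetical stand-in for one log vector in a singly linked chain. */
struct demo_log_vec {
	struct demo_log_vec *next;	/* next vector in the chain, or NULL */
	int niovecs;			/* number of regions in this vector */
	const struct demo_iovec *iovecs;/* the regions themselves */
};

/* Totals gathered once, by the code that builds the chain. */
struct demo_chain_totals {
	unsigned int num_iovecs;	/* total regions across the chain */
	size_t num_bytes;		/* total payload bytes across the chain */
};

/* Walk the chain once and sum the region count and byte count. */
static struct demo_chain_totals
demo_count_chain(const struct demo_log_vec *lv)
{
	struct demo_chain_totals t = { 0, 0 };

	for (; lv; lv = lv->next) {
		assert(lv->niovecs >= 0);
		for (int i = 0; i < lv->niovecs; i++) {
			t.num_iovecs++;
			t.num_bytes += lv->iovecs[i].len;
		}
	}
	return t;
}
```

The point of the sketch is only that the builder of the chain can produce
both totals in one pass, so a fixed-size per-ticket array adds nothing.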

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


