Re: [PATCH 3/6] xfs: create shadow transaction reservations for computing minimum log size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 25, 2022 at 04:39:05PM -0700, Darrick J. Wong wrote:
> On Sat, Apr 23, 2022 at 08:36:35AM +1000, Dave Chinner wrote:
> > On Thu, Apr 14, 2022 at 03:54:42PM -0700, Darrick J. Wong wrote:
....
> > > @@ -47,18 +48,25 @@ xfs_log_get_max_trans_res(
> > >  	struct xfs_trans_res	*max_resp)
> > >  {
> > >  	struct xfs_trans_res	*resp;
> > > +	struct xfs_trans_res	*start_resp;
> > >  	struct xfs_trans_res	*end_resp;
> > > +	struct xfs_trans_resv	*resv;
> > >  	int			log_space = 0;
> > >  	int			attr_space;
> > >  
> > >  	attr_space = xfs_log_calc_max_attrsetm_res(mp);
> > >  
> > > -	resp = (struct xfs_trans_res *)M_RES(mp);
> > > -	end_resp = (struct xfs_trans_res *)(M_RES(mp) + 1);
> > > -	for (; resp < end_resp; resp++) {
> > > +	resv = kmem_zalloc(sizeof(struct xfs_trans_resv), 0);
> > > +	xfs_trans_resv_calc_logsize(mp, resv);
> > > +
> > > +	start_resp = (struct xfs_trans_res *)resv;
> > > +	end_resp = (struct xfs_trans_res *)(resv + 1);
> > > +	for (resp = start_resp; resp < end_resp; resp++) {
> > >  		int		tmp = resp->tr_logcount > 1 ?
> > >  				      resp->tr_logres * resp->tr_logcount :
> > >  				      resp->tr_logres;
> > > +
> > > +		trace_xfs_trans_resv_calc_logsize(mp, resp - start_resp, resp);
> > >  		if (log_space < tmp) {
> > >  			log_space = tmp;
> > >  			*max_resp = *resp;		/* struct copy */
> > 
> > This took me a while to get my head around. The minimum logsize
> > calculation stuff is all a bit of a mess.
> > 
> > Essentially, we call xfs_log_get_max_trans_res() from two places.
> > One is to calculate the minimum log size, the other is the
> > transaction reservation tracing code done when M_RES(mp) is set up
> > via xfs_trans_trace_reservations().  We don't need the call from
> > xfs_trans_trace_reservations() - it's trivial to scan the list of
> > tracepoints emitted by this function at mount time to find the
> > maximum reservation.
> 
> Here's the thing -- xfs_db also calls xfs_log_get_max_trans_res to
> figure out the transaction reservation that's used to compute the
> minimum log size.  Whenever I get a report about mount failing due to
> minlogsize checks, I can ask the reporter to send me the ftrace output
> from the mount attempt and compare it against what userspace thinks:
> 
> # xfs_db /dev/sde -c logres
> type 0 logres 168184 logcount 5 flags 0x4
> type 1 logres 293760 logcount 5 flags 0x4
> type 2 logres 307936 logcount 2 flags 0x4
> type 3 logres 187760 logcount 2 flags 0x4
> type 4 logres 170616 logcount 2 flags 0x4
> type 5 logres 244720 logcount 3 flags 0x4
> type 6 logres 243568 logcount 2 flags 0x4
> type 7 logres 260352 logcount 2 flags 0x4
> type 8 logres 243568 logcount 3 flags 0x4
> type 9 logres 278648 logcount 2 flags 0x4
> type 10 logres 2168 logcount 0 flags 0x0
> type 11 logres 73728 logcount 2 flags 0x4
> type 12 logres 99960 logcount 2 flags 0x4
> type 13 logres 760 logcount 0 flags 0x0
> type 14 logres 292992 logcount 1 flags 0x4
> type 15 logres 23288 logcount 3 flags 0x4
> type 16 logres 13312 logcount 0 flags 0x0
> type 17 logres 147584 logcount 3 flags 0x4
> type 18 logres 640 logcount 0 flags 0x0
> type 19 logres 94968 logcount 2 flags 0x4
> type 20 logres 4224 logcount 0 flags 0x0
> type 21 logres 6512 logcount 0 flags 0x0
> type 22 logres 232 logcount 1 flags 0x0
> type 23 logres 172407 logcount 5 flags 0x4
> type 24 logres 640 logcount 1 flags 0x0
> type 25 logres 760 logcount 0 flags 0x0
> type 26 logres 243568 logcount 2 flags 0x4
> type 27 logres 170616 logcount 2 flags 0x4
> type -1 logres 547200 logcount 8 flags 0x4
> 
> And this "-1" entry matches the last output of the kernel.

I look at that and thing "xfs_db output is broken" because that last
line cannot be derived from the individual transaction reservations
that are listed. It makes no sense in isolation/without
documentation. :/

> I'd rather
> not lose this tracing facility (which means keeping this function
> non-static) though I will move the tracepoint out of
> xfs_trans_trace_reservations.

You mean "remove only the '-1' tracepoint" from
xfs_trans_trace_reservations()?

> > Hence I think we should start by removing that call to this
> > function, and making this a static function called only from
> > xfs_log_calc_minimum_size().
> > 
> > At this point, we can use an on-stack struct xfs_trans_resv for the
> > calculated values - no need for memory allocation here as we will
> > never be short of stack space in this path.
> 
> ~312 bytes?  That's ~8% of the kernel stack.  I'll see if I run into any
> complaints, though I bet I won't on x64.

What architecture still uses 4kB stacks?  Filesystems have blown
through 4kB stacks without even trying on 32bit systems for years
now.

Regardless, the mount path call chain here is nowhere near deep
enough to be at risk of blowing stacks, and this is at the leaf so
it's largely irrelevant if we put this on the stack...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux