Re: [PATCH 3/4] xfs: add quota-driven speculative preallocation throttling

Dave Chinner <david@xxxxxxxxxxxxx> · Thu, 13 Dec 2012 13:25:30 +1100

On Wed, Dec 05, 2012 at 11:47:57AM -0500, Brian Foster wrote:
> Speculative preallocation currently occurs based on the size of a
> file (8GB max) and is throttled only within 5% of ENOSPC. Enable
> similar throttling as an inode approaches EDQUOT.
> 
> Preallocation is throttled to a quota hard limit and disabled if
> the hard limit is surpassed (noenforce). If a soft limit is also
> specified, it serves as a low watermark to enable throttling and is
> used to adjust the percentage of free quota space a single
> preallocation is allowed to consume (5% by default).
> 
> The algorithm determines the max percentage allowed for each quota
> and calculates the associated raw values. The minimum raw value
> across all quotas applicable to the inode represents the maximum
> size allowed for a preallocation on that inode.
> 
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>

I'm having trouble determining what the algorithm is supposed to be
and what might be bugs in the algorithm....

These are notes, somewhat incoherent, but I'll post them anyway
because I think they convey my concerns and solutions well enough.

Cheers,

Dave.

----

Describe the algorithm to ensure I have it right:

	calculate preallocation size (alloc_blocks)
		uses file size to determine size

	determine ENOSPC throttling reduction (shift)

	calculate maximum quota prealloc amount allowed
		walk each dquot on inode
			check hard limit
				=> over = no prealloc
			check soft limit
				=> none = prealloc unmodified
			check over soft limit
				=> no = prealloc unmodified
			calculate "throttle percentage"
			calculate max prealloc
		use minimum prealloc value returned

	alloc_blocks = MIN(alloc_blocks, quota_alloc_blocks)

	apply ENOSPC throttle (shift)

- the prealloc size is overridden by quota throttling, and then the
  ENOSPC throttle is applied to the result.

- the overall algorithm looks good to me, so my problems are
  with the implementation....

	- it calculates stuff dynamically that could be set once in
	  the struct xfs_dquot on initialisation and whenever limits
	  are changed. i.e. the "percentage to throttle". This
	  should be carried by xfs_dquot as it simplifies
	  the logic here.

	- if there is a hard limit but no soft limit, it *always*
	  throttles preallocation to the default percentage even
	  where there is lots of space available for the full
	  prealloc.

	- it returns a block count based on limits, or -1 for no
	  throttling. I'd prefer a pair of functions - one to check
	  whether throttling is needed, and one to calculate the
	  throttling parameters

		- should use xfs_this_quota_on() to drive quota
		  checks completely inside throttle check. will make
		  the code much cleaner

		- check function can be boolean

		- calc function should return both raw space
		  available and shift values so the global prealloc
		  values can be overridden independently. i.e.
		  allows quota throttling to work even when overall
		  prealloc is less than the maximum quota would
		  allow.

Rough code:

need_throttle(ip, type, alloc_blocks)
{
	if (!xfs_this_quota_on(mp, type))
		return false;

	dq = xfs_inode_dquot(ip, type);

	/* no hard limit, no throttle */
	if (!dq->q_hard_limit)
		return false;

	/* over hard limit, always throttle */
	if (dq->q_res_bcount > dq->q_hard_limit)
		return true;

	/*
	 * Under soft limit, no throttle.
	 *
	 * Note: we always have a soft limit for prealloc,
	 * calculated at dquot instantiation or limit change
	 */
	if (dq->q_res_bcount + alloc_blocks < dq->q_soft_limit)
		return false;

	/* between soft limit and hard limit, need to throttle */
	return true;
}

- needs struct xfs_dquot to be initialised appropriately and quota
  limit changes to update it correctly.

- allows soft limit defaults to be set in memory if they aren't on
  disk. i.e. default throttling values will be no different in
  implementation to on-disk limits.
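
A minimal standalone sketch of the check above with the soft limit
default applied in memory. Everything here is a stand-in for
illustration: struct dq_sketch, dq_sketch_init_limits() and the 95%
default watermark are assumptions, not struct xfs_dquot or anything
taken from the patch, and the xfs_this_quota_on() test is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields the throttle check reads. */
struct dq_sketch {
	uint64_t q_hard_limit;	/* 0 means no hard limit */
	uint64_t q_soft_limit;	/* on-disk value or in-memory default */
	uint64_t q_res_bcount;	/* blocks currently reserved */
};

/*
 * Meant to run at dquot instantiation and on every limit change. If
 * there is a hard limit but no soft limit, derive a soft limit in
 * memory so the check below never special-cases a missing one.
 * The 95% figure is an assumption for illustration only.
 */
static void dq_sketch_init_limits(struct dq_sketch *dq,
				  uint64_t hard, uint64_t soft)
{
	dq->q_hard_limit = hard;
	dq->q_soft_limit = soft;
	if (hard && !soft)
		dq->q_soft_limit = hard - hard / 20;
}

static bool dq_sketch_need_throttle(const struct dq_sketch *dq,
				    uint64_t alloc_blocks)
{
	/* no hard limit, no throttle */
	if (!dq->q_hard_limit)
		return false;

	/* over hard limit, always throttle */
	if (dq->q_res_bcount > dq->q_hard_limit)
		return true;

	/* the prealloc would still end below the soft limit, no throttle */
	if (dq->q_res_bcount + alloc_blocks < dq->q_soft_limit)
		return false;

	/* between soft limit and hard limit, need to throttle */
	return true;
}
```

With a hard limit of 1000 blocks and no on-disk soft limit, the
in-memory soft limit becomes 950, so a 64 block prealloc is left alone
at 100 blocks reserved and throttled at 900.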

calc_throttle(ip, type, qblocks, qshift)
{
	dq = xfs_inode_dquot(ip, type);

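	/* at or over the hard limit: no prealloc, and avoid underflow below */
	if (dq->q_res_bcount >= dq->q_hard_limit) {
		*qblocks = 0;
		*qshift = 0;
		return;
	}

	freesp = dq->q_hard_limit - dq->q_res_bcount;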

	/* default to no throttle when plenty of quota space remains */
	shift = 0;
	if (freesp < dq->q_low_space[XFS_LOWSP_5_PCNT]) {
		shift = 2;
		if (freesp < dq->q_low_space[XFS_LOWSP_4_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_3_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_2_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_1_PCNT])
			shift++;
	}

	/* only overwrite current values if the result is a smaller prealloc */
	if ((freesp >> shift) >= (*qblocks >> *qshift))
		return;

	*qblocks = freesp;
	*qshift = shift;
}

- similar shift table to the ENOSPC code for a logarithmic mapping
  rather than a linear mapping.
- probably doesn't need 5 steps, 3 steps that do shift += 2 is
  probably sufficient and would reduce per-dquot memory overhead.
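
A sketch of what the three-step variant might look like, assuming
thresholds at 5%, 3% and 1% of the hard limit that are precomputed
whenever the limit changes. The names (struct dq_lowsp_sketch,
dq_lowsp_init(), dq_lowsp_calc_throttle(), LOWSP_*) are hypothetical,
and the early return at or over the hard limit is an addition that
follows the "over = no prealloc" rule in the outline.

```c
#include <stdint.h>

enum { LOWSP_5_PCNT, LOWSP_3_PCNT, LOWSP_1_PCNT, LOWSP_MAX };

/* Hypothetical stand-in for the per-dquot throttle state. */
struct dq_lowsp_sketch {
	uint64_t q_hard_limit;
	uint64_t q_res_bcount;
	uint64_t q_low_space[LOWSP_MAX];	/* thresholds in blocks */
};

/* Recompute the thresholds whenever the hard limit is set or changed. */
static void dq_lowsp_init(struct dq_lowsp_sketch *dq, uint64_t hard)
{
	uint64_t one_pcnt = hard / 100;

	dq->q_hard_limit = hard;
	dq->q_low_space[LOWSP_5_PCNT] = one_pcnt * 5;
	dq->q_low_space[LOWSP_3_PCNT] = one_pcnt * 3;
	dq->q_low_space[LOWSP_1_PCNT] = one_pcnt;
}

/* Only meant to be called once the throttle check has said yes. */
static void dq_lowsp_calc_throttle(const struct dq_lowsp_sketch *dq,
				   uint64_t *qblocks, int *qshift)
{
	uint64_t freesp;
	int shift = 0;

	/* at or over the hard limit: no preallocation at all */
	if (dq->q_res_bcount >= dq->q_hard_limit) {
		*qblocks = 0;
		*qshift = 0;
		return;
	}
	freesp = dq->q_hard_limit - dq->q_res_bcount;

	/* three steps of shift += 2 instead of five steps of shift++ */
	if (freesp < dq->q_low_space[LOWSP_5_PCNT]) {
		shift = 2;
		if (freesp < dq->q_low_space[LOWSP_3_PCNT])
			shift += 2;
		if (freesp < dq->q_low_space[LOWSP_1_PCNT])
			shift += 2;
	}

	/* only overwrite current values if the result is a smaller prealloc */
	if ((freesp >> shift) >= (*qblocks >> *qshift))
		return;

	*qblocks = freesp;
	*qshift = shift;
}
```

With a hard limit of 100000 blocks the thresholds are 5000, 3000 and
1000 blocks, so 4000 free blocks gives shift = 2 and 500 free blocks
gives shift = 6.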

xfs_iomap_prealloc_size()
{
.....

	qblocks = alloc_blocks;
	qshift = 0;

	if (need_throttle(ip, XFS_DQ_USER, alloc_blocks))
		calc_throttle(ip, XFS_DQ_USER, &qblocks, &qshift);

	if (need_throttle(ip, XFS_DQ_GROUP, alloc_blocks))
		calc_throttle(ip, XFS_DQ_GROUP, &qblocks, &qshift);

	if (need_throttle(ip, XFS_DQ_PROJ, alloc_blocks))
		calc_throttle(ip, XFS_DQ_PROJ, &qblocks, &qshift);

	/*
	 * The final size of the preallocation is the smaller of the
	 * whole filesystem prealloc size and the quota prealloc
	 * size. i.e. whichever entity has the least space available
	 * for allocation determines the maximum preallocation size.
	 *
	 * The final throttling level is the larger of the ENOSPC
	 * and quota throttles. i.e. whichever is closer to their
	 * respective space limit determines how much we throttle
	 * by.
	 */
	alloc_blocks = MIN(qblocks, alloc_blocks);
	shift = MAX(qshift, shift);
....
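
The final combination step can be checked in isolation with a small
sketch. It assumes the throttle is applied as a right shift of the
block count, as in the ENOSPC code, and prealloc_final_size() is a
hypothetical helper name used only for illustration.

```c
#include <stdint.h>

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/*
 * alloc_blocks/shift: file size based prealloc and ENOSPC throttle.
 * qblocks/qshift:     tightest quota result, or alloc_blocks/0 when
 *                     no quota asked for throttling.
 */
static uint64_t prealloc_final_size(uint64_t alloc_blocks, int shift,
				    uint64_t qblocks, int qshift)
{
	/* whichever entity has the least space caps the size */
	alloc_blocks = min_u64(qblocks, alloc_blocks);

	/* whichever entity is closest to its limit sets the throttle */
	shift = max_int(qshift, shift);

	return alloc_blocks >> shift;
}
```

For example, with alloc_blocks = 1024 and shift = 0 from the
filesystem side and a quota result of 4000 blocks at shift 2, the
size stays at 1024 blocks but the quota shift still applies, giving
1024 >> 2 = 256 blocks. Carrying the shift separately from the raw
space is what lets quota throttling take effect even when the overall
prealloc is already smaller than the quota would allow.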

-- 
Dave Chinner
david@xxxxxxxxxxxxx