On Wed, Dec 05, 2012 at 11:47:57AM -0500, Brian Foster wrote:
> Speculative preallocation currently occurs based on the size of a
> file (8GB max) and is throttled only within 5% of ENOSPC. Enable
> similar throttling as an inode approaches EDQUOT.
>
> Preallocation is throttled to a quota hard limit and disabled if
> the hard limit is surpassed (noenforce). If a soft limit is also
> specified, it serves as a low watermark to enable throttling and is
> used to adjust the percentage of free quota space a single
> preallocation is allowed to consume (5% by default).
>
> The algorithm determines the max percentage allowed for each quota
> and calculates the associated raw values. The minimum raw value
> across all quotas applicable to the inode represents the maximum
> size allowed for a preallocation on that inode.
>
> Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>

I'm having trouble determining what the algorithm is supposed to be
and what might be bugs in the algorithm....

These are notes, somewhat incoherent, but I'll post them anyway
because I think they convey my concerns and solutions well enough.

Cheers,

Dave.

----

Describe the algorithm to ensure I have it right:

	calculate preallocation size (alloc_blocks)
		uses file size to determine size
	determine ENOSPC throttling reduction (shift)
	calculate maximum quota prealloc amount allowed
		walk each dquot on inode
			check hard limit => over = no prealloc
			check soft limit => none = prealloc unmodified
			check over soft limit => no = prealloc unmodified
			calculate "throttle percentage"
			calculate max prealloc
		use minimum prealloc value returned
	alloc_blocks = MIN(alloc_blocks, quota_alloc_blocks)
	apply ENOSPC throttle (shift)

- prealloc size being overridden by quota throttling, and then the
  ENOSPC throttle is applied to that.
- the overall algorithm looks good to me, that means my problems are
  with the implementation....
- it calculates stuff dynamically that could be set once in the
  struct xfs_dquot on initialisation and whenever limits are
  changed. i.e. the "percentage to throttle". This should be carried
  by xfs_dquot as it simplifies the logic here.
- if there is a hard limit but no soft limit, it *always* throttles
  preallocation to the default percentage even where there is lots
  of space available for the full prealloc.
- it returns a block count based on limits, or -1 for no throttling.
  I'd prefer a pair of functions - one to check whether throttling
  is needed, and one to calculate the throttling parameters
    - should use xfs_this_quota_on() to drive quota checks
      completely inside throttle check. will make the code much
      cleaner
    - check function can be boolean
    - calc function should return both raw space available and
      shift values so the global prealloc values can be
      overridden independently. i.e. allows quota throttling to
      work even when overall prealloc is less than the maximum
      quota would allow.

Rough code:

need_throttle(ip, type, alloc_blocks)
{
	if (!xfs_this_quota_on(mp, type))
		return false;

	dq = xfs_inode_dquot(ip, type);

	/* no hard limit, no throttle */
	if (!dq->q_hard_limit)
		return false;

	/* over hard limit, always throttle */
	if (dq->q_res_bcount > dq->q_hard_limit)
		return true;

	/*
	 * Under soft limit, no throttle.
	 *
	 * Note: we always have a soft limit for prealloc,
	 * calculated at dquot instantiation or limit change
	 */
	if (dq->q_res_bcount + alloc_blocks < dq->q_soft_limit)
		return false;

	/* between soft limit and hard limit, need to throttle */
	return true;
}

- needs struct xfs_dquot to be initialised appropriately and quota
  limit changes to handle changes correctly.
- allows soft limit defaults to be set in memory if they aren't on
  disk. i.e. default throttling values will be no different in
  implementation to on-disk limits.
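
For illustration only, the check above can be exercised in user space
with a stand-in for the dquot. This is a minimal sketch, not kernel
code: struct dquot_sketch and its quota_on flag are hypothetical
substitutes for struct xfs_dquot and xfs_this_quota_on(), and the soft
limit is assumed to have been populated already, as the notes above
require.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the fields of struct xfs_dquot that the
 * check reads. The field names mirror the rough code; the struct
 * itself is an assumption made for this sketch.
 */
struct dquot_sketch {
	bool     quota_on;      /* stands in for xfs_this_quota_on(mp, type) */
	uint64_t q_hard_limit;  /* 0 means no hard limit is set */
	uint64_t q_soft_limit;  /* assumed always set when a hard limit exists */
	uint64_t q_res_bcount;  /* blocks currently reserved against the quota */
};

/* Returns true if a preallocation of alloc_blocks should be throttled. */
static bool need_throttle(const struct dquot_sketch *dq, uint64_t alloc_blocks)
{
	/* quota type not enabled, nothing to check */
	if (!dq->quota_on)
		return false;

	/* no hard limit, no throttle */
	if (!dq->q_hard_limit)
		return false;

	/* over hard limit, always throttle */
	if (dq->q_res_bcount > dq->q_hard_limit)
		return true;

	/* reservation plus prealloc stays under the soft limit, no throttle */
	if (dq->q_res_bcount + alloc_blocks < dq->q_soft_limit)
		return false;

	/* between soft limit and hard limit, need to throttle */
	return true;
}
```

With a hard limit of 1000 blocks and a soft limit of 800, a dquot with
100 blocks reserved and a 50 block prealloc is left alone, while the
same prealloc with 790 blocks reserved crosses the soft limit and is
throttled.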

calc_throttle(ip, type, qblocks, qshift)
{
	dq = xfs_inode_dquot(ip, type);

	/* no throttling unless free quota space is below the 5% mark */
	shift = 0;
	freesp = dq->q_hard_limit - dq->q_res_bcount;
	if (freesp < dq->q_low_space[XFS_LOWSP_5_PCNT]) {
		shift = 2;
		if (freesp < dq->q_low_space[XFS_LOWSP_4_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_3_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_2_PCNT])
			shift++;
		if (freesp < dq->q_low_space[XFS_LOWSP_1_PCNT])
			shift++;
	}

	/* only overwrite current values if the result is a smaller prealloc */
	if ((freesp >> shift) >= (*qblocks >> *qshift))
		return;

	*qblocks = freesp;
	*qshift = shift;
}

- similar shift table to the ENOSPC code for a logarithmic mapping
  rather than a linear mapping.
- probably doesn't need 5 steps, 3 steps that do shift += 2 is
  probably sufficient and would reduce per-dquot memory overhead.

xfs_iomap_prealloc_size()
{
	.....

	qblocks = alloc_blocks;
	qshift = 0;

	if (need_throttle(ip, XFS_DQ_USER, alloc_blocks))
		calc_throttle(ip, XFS_DQ_USER, &qblocks, &qshift);
	if (need_throttle(ip, XFS_DQ_GROUP, alloc_blocks))
		calc_throttle(ip, XFS_DQ_GROUP, &qblocks, &qshift);
	if (need_throttle(ip, XFS_DQ_PROJ, alloc_blocks))
		calc_throttle(ip, XFS_DQ_PROJ, &qblocks, &qshift);

	/*
	 * The final size of the preallocation is the smaller of the
	 * whole filesystem prealloc size and the quota prealloc
	 * size. i.e. whichever entity has the least space available
	 * for allocation determines the maximum preallocation size.
	 *
	 * The final throttling level is the larger of the ENOSPC
	 * and quota throttles. i.e. whichever is closer to its
	 * respective space limit determines how much we throttle
	 * by.
	 */
	alloc_blocks = MIN(qblocks, alloc_blocks);
	shift = MAX(qshift, shift);

	....

-- 
Dave Chinner
david@xxxxxxxxxxxxx