On Mon, Feb 02, 2015 at 12:11:33PM -0500, Brian Foster wrote:
> On Mon, Feb 02, 2015 at 08:43:02AM +1100, Dave Chinner wrote:
> > +	/*
> > +	 * Taking blocks away, need to be more accurate the closer we
> > +	 * are to zero.
> > +	 *
> > +	 * batch size is set to a maximum of 1024 blocks - if we are
> > +	 * allocating or freeing extents larger than this then we aren't
> > +	 * going to be hammering the counter lock so a lock per update
> > +	 * is not a problem.
> > +	 *
> 
> IIUC, the batch size determines the point at which the local cpu delta
> is folded back into the global counter (under lock). If we're allocating
> large extents, these will surpass the batch size and result in a locked
> update. Smaller updates are aggregated into the cpu counter and folded
> in at some later time.

Right.

> > +	 * If the counter has a value of less than 2 * max batch size,
> > +	 * then make everything serialise as we are real close to
> > +	 * ENOSPC.
> > +	 */
> > +#define __BATCH	1024
> > +	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> > +				   2 * __BATCH) < 0)
> > +		batch = 1;
> > +	else
> > +		batch = __BATCH;
> > +
> 
> The general approach seems logical. I do wonder whether blocks is the
> right scale as opposed to block count normalized against some fixed I/O
> size (to account for different block sizes).

We allocate in blocks, so the IO size is really irrelevant. The
scalability issue at hand is page-by-page space reservation during
delayed allocation, so really the block size makes less difference
to performance than the page size....

> Also, it seems like speculative preallocation could hide some of the
> overhead here, depending on workload of course. Had that factored into
> your testing?

Yes, somewhat, though I should do some testing using 4k direct IO and
buffered IO with allocsize set appropriately.
> > +	__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
> > +	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> > +				   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
> > +		/* we had space! */
> > +		return 0;
> >  	}
> >  
> > -	mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
> > -	return 0;
> > +	/*
> > +	 * lock up the sb for dipping into reserves before releasing
> > +	 * the space that took us to ENOSPC.
> > +	 */
> > +	spin_lock(&mp->m_sb_lock);
> 
> Can you elaborate on the locking here, why it's needed where it wasn't
> before?

The lock protects the reserved pool. And it was used before, as the
only time we called into this function was with the m_sb_lock held.
This is a bit of a hack because we now call into the function without
the lock held....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs