On Thu, Jan 17, 2013 at 03:41:08PM -0600, Ben Myers wrote:
> Hey Brian,
>
> On Thu, Jan 17, 2013 at 01:11:29PM -0500, Brian Foster wrote:
> > The stack_switch check currently occurs in __xfs_bmapi_allocate,
> > which means the stack switch only occurs when xfs_bmapi_allocate()
> > is called in a loop. Pull the check up before the loop in
> > xfs_bmapi_write() such that the first iteration of the loop has
> > consistent behavior.
> >
> > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
> > ---
> >
> > I was reading through this code and confused myself over whether the stack
> > switch ever actually occurs. Eric and Ben pointed out on irc (simultaneously,
> > I might add) the surrounding loop that I had missed, but it wasn't clear whether
> > the behavior to enable the stack switch after the first iteration was
> > intentional or not. I'm throwing this out there to either fix the issue or seek
> > out an explanation for the existing behavior. Thanks!
>
> To me this looks to be the correct behavior. It might be better to
> just get rid of the XFS_BMAPI_STACK_SWITCH flag entirely. Nice find.

Which would take it back to the original logic, which always switched
stacks, and we know that caused significant metadata performance
degradation in various workloads.

If we want to remove XFS_BMAPI_STACK_SWITCH, then we need to solve
either the stack overrun problem (not possible, AFAICT) or the
metadata performance degradation that results from always pushing
allocation off into workqueues.

So, unfortunately, until we have some other resolution, we're stuck
with it.... :/

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
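
For anyone following along, here is a rough userspace sketch of the
ordering issue Brian describes in the quoted patch. It is not the XFS
code: struct alloc_args, inner_allocate(), allocate(), run() and
BMAPI_STACK_SWITCH are all hypothetical stand-ins, and "switching
stacks" is only counted, not performed. It assumes the wrapper decides
between the direct and deferred path from a stack_switch field, and
that the field is set from the flag either inside the inner helper or
once before the loop.

```c
#include <stdbool.h>

/* Hypothetical stand-in for XFS_BMAPI_STACK_SWITCH; the value is arbitrary. */
#define BMAPI_STACK_SWITCH 0x1u

/* Hypothetical, heavily reduced model of the allocation arguments. */
struct alloc_args {
    unsigned int flags;
    bool stack_switch;  /* true once the flag has been checked */
    int direct;         /* calls that ran on the caller's stack */
    int switched;       /* calls that would have been deferred */
};

/* Inner helper: in the "late" variant the flag check lives here. */
void inner_allocate(struct alloc_args *a, bool check_here)
{
    if (check_here && (a->flags & BMAPI_STACK_SWITCH))
        a->stack_switch = true;
}

/* Wrapper: picks the path from stack_switch as it is on entry. */
void allocate(struct alloc_args *a, bool check_in_inner)
{
    if (a->stack_switch)
        a->switched++;
    else
        a->direct++;
    inner_allocate(a, check_in_inner);
}

/* Caller loop: optionally performs the check once, before iterating. */
struct alloc_args run(unsigned int flags, int iterations,
                      bool check_before_loop)
{
    struct alloc_args a = { .flags = flags };

    if (check_before_loop && (flags & BMAPI_STACK_SWITCH))
        a.stack_switch = true;

    for (int i = 0; i < iterations; i++)
        allocate(&a, !check_before_loop);

    return a;
}
```

With three iterations and the flag set, run(BMAPI_STACK_SWITCH, 3, false)
ends with direct == 1 and switched == 2, because the first call runs
before the helper has had a chance to set the field. With the check
pulled up, run(BMAPI_STACK_SWITCH, 3, true) ends with direct == 0 and
switched == 3.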