On Thu, Jan 17, 2013 at 05:50:39PM -0600, Ben Myers wrote: > Hey Dave, > > On Fri, Jan 18, 2013 at 10:39:22AM +1100, Dave Chinner wrote: > > On Thu, Jan 17, 2013 at 03:41:08PM -0600, Ben Myers wrote: > > > Hey Brian, > > > > > > On Thu, Jan 17, 2013 at 01:11:29PM -0500, Brian Foster wrote: > > > > The stack_switch check currently occurs in __xfs_bmapi_allocate, > > > > which means the stack switch only occurs when xfs_bmapi_allocate() > > > > is called in a loop. Pull the check up before the loop in > > > > xfs_bmapi_write() such that the first iteration of the loop has > > > > consistent behavior. > > > > > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > > > --- > > > > > > > > I was reading through this code and confused myself over whether the stack > > > > switch ever actually occurs. Eric and Ben pointed out on irc (simultaneously, > > > > I might add) the surrounding loop that I had missed, but it wasn't clear whether > > > > the behavior to enable the stack switch after the first iteration was > > > > intentional or not. I'm throwing this out there to either fix the issue or seek > > > > out an explanation for the existing behavior. Thanks! > > > > > > To me this looks to be the correct behavior. It might be better to > > > just get rid of the XFS_BMAPI_STACK_SWITCH flag entirely. Nice find. > > > > Which would take it back to the original logic which always switched > > stacks and we know that caused significant metadata performance > > degradation in various workloads. > > > > If we want to remove XFS_BMAPI_STACK_SWITCH, then we either need to > > solve either the stack overrun problem (not possible, AFAICT) or the > > metadata performance degradation as a result of always pushing > > allocation off into workqueues. So, unfortunately, until we have some > > other resolution, we stuck with it.... :/ > > Yeah, I went off the rails there. I meant to suggest something more along the > lines of getting rid of the stack_switch member of the args structure, since > xfs_bmapi_write is the only caller of xfs_bmapi_allocate. But it didn't come > out that way... Anyway, what we have is just fine. Oh, ok, I see what you mean. The code currently has an extra level of indirection (XFS_BMAPI_STACK_SWITCH -> args->stack_switch). That is in line with all the other XFS_BMAPI_* flags, though. i.e. they are interface flags and not propagated down through the struct xfs_bmalloc arguments. To tell the truth, I've previously considered just passing a flags field down rather than the current single bit variables. I've never done it though, because (IMO) it doesn't really improve the code significantly or reduce the footprint of the structure. Perhaps we should look at that again now that the code has been significantly factored.... Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs