Hey, On Fri, Jan 18, 2013 at 11:15:49AM +1100, Dave Chinner wrote: > On Thu, Jan 17, 2013 at 05:50:39PM -0600, Ben Myers wrote: > > Hey Dave, > > > > On Fri, Jan 18, 2013 at 10:39:22AM +1100, Dave Chinner wrote: > > > On Thu, Jan 17, 2013 at 03:41:08PM -0600, Ben Myers wrote: > > > > Hey Brian, > > > > > > > > On Thu, Jan 17, 2013 at 01:11:29PM -0500, Brian Foster wrote: > > > > > The stack_switch check currently occurs in __xfs_bmapi_allocate, > > > > > which means the stack switch only occurs when xfs_bmapi_allocate() > > > > > is called in a loop. Pull the check up before the loop in > > > > > xfs_bmapi_write() such that the first iteration of the loop has > > > > > consistent behavior. > > > > > > > > > > Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx> > > > > > --- > > > > > > > > > > I was reading through this code and confused myself over whether the stack > > > > > switch ever actually occurs. Eric and Ben pointed out on irc (simultaneously, > > > > > I might add) the surrounding loop that I had missed, but it wasn't clear whether > > > > > the behavior to enable the stack switch after the first iteration was > > > > > intentional or not. I'm throwing this out there to either fix the issue or seek > > > > > out an explanation for the existing behavior. Thanks! > > > > > > > > To me this looks to be the correct behavior. It might be better to > > > > just get rid of the XFS_BMAPI_STACK_SWITCH flag entirely. Nice find. > > > > > > Which would take it back to the original logic which always switched > > > stacks and we know that caused significant metadata performance > > > degradation in various workloads. > > > > > > If we want to remove XFS_BMAPI_STACK_SWITCH, then we either need to > > > solve either the stack overrun problem (not possible, AFAICT) or the > > > metadata performance degradation as a result of always pushing > > > allocation off into workqueues. So, unfortunately, until we have some > > > other resolution, we stuck with it.... :/ > > > > Yeah, I went off the rails there. I meant to suggest something more along the > > lines of getting rid of the stack_switch member of the args structure, since > > xfs_bmapi_write is the only caller of xfs_bmapi_allocate. But it didn't come > > out that way... Anyway, what we have is just fine. > > Oh, ok, I see what you mean. The code currently has an extra level > of indirection (XFS_BMAPI_STACK_SWITCH -> args->stack_switch). Exactly. Thanks. ;) > That > is in line with all the other XFS_BMAPI_* flags, though. i.e. they > are interface flags and not propagated down through the > struct xfs_bmalloc arguments. > > To tell the truth, I've previously considered just passing a flags > field down rather than the current single bit variables. I've never > done it though, because (IMO) it doesn't really improve the code > significantly or reduce the footprint of the structure. Perhaps we > should look at that again now that the code has been significantly > factored.... Some of the args style structures seem pretty sizeable, but I think I'm more concerned about readability than size. It gets to be kind of a muddle in through there copying all those flags around, and in this case we got bitten. Regards, Ben _______________________________________________ xfs mailing list xfs@xxxxxxxxxxx http://oss.sgi.com/mailman/listinfo/xfs