On Mon, Jun 18, 2012 at 12:25:37PM -0600, Andreas Dilger wrote:
> On 2012-06-18, at 6:08 AM, Christoph Hellwig wrote:
> > May saw the release of Linux 3.4, including a decent sized XFS update.
> > Remarkable XFS features in Linux 3.4 include moving over all metadata
> > updates to use transactions, the addition of a work queue for the
> > low-level allocator code to avoid stack overflows due to extreme stack
> > use in the Linux VM/VFS call chain,
>
> This is essentially a workaround for too-small stacks in the kernel,
> which we've had to do at times as well, by doing work in a separate
> thread (with a new stack) and waiting for the results? This is a
> generic problem that any reasonably-complex filesystem will have when
> running under memory pressure on a complex storage stack (e.g. LVM +
> iSCSI), but causes unnecessary context switching.

I've seen no performance issues from the context switching. The
overhead of it is so small as to be unmeasurable in most cases,
because a typical allocation already requires context switches for
contended locks and metadata IO....

> Any thoughts on a better way to handle this, or will there continue
> to be a 4kB stack limit

We were blowing 8k stacks on x86-64 with alarming ease. Even the
flusher threads were overflowing.

> and hack around this with repeated kmalloc
> on callpaths for any struct over a few tens of bytes, implementing
> memory pools all over the place, and "forking" over to other threads
> to continue the stack consumption for another 4kB to work around
> the small stack limit?

I mentioned that we needed to consider 16k stacks at last year's
Kernel Summit, and the response was along the lines of "you've got to
be kidding - fix your broken filesystem". That's the perception you
have to change, and I don't feel like having a 4k stacks battle
again...

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs