On Wed, Jun 25, 2014 at 10:17:56AM -0400, Matthew Wilcox wrote:
> 
> Okay ... but why is it so much worse in 3.15 than 3.14?
> 
> And does ext4 think of "running out of space" as a percentage
> free, or an absolute number of blocks remaining?  From the code in
> ext4_nonda_switch(), it seems to be the former, although maybe excessive
> fragmentation has caused ext4 to think it's running out of space?

When the blocks that were allocated using delayed allocation exceed 50%
of the free space, we initiate writeback.  When delalloc blocks exceed
66% of the free space, we fall back to nodelalloc, which, among other
things, means blocks are allocated for each write system call, and we
also have to add and remove the inode from the orphan inode list so
that if we crash in the middle of the write system call, we don't end
up exposing stale data.

We did have a change to the orphan inode code to improve scalability,
so that could have been a possible cause; but that happened after 3.15,
so that can't be it.  The other possibility is that there's simply a
change in the writeback code that is changing how aggressively we start
writeback when we exceed the 50% threshold, so that we end up switching
into nonda mode more often.

Any chance you can run generic/127 under perf so we can see where we're
spending all of our CPU time?  The other thing I can imagine doing is
to add a tracepoint whenever we drop into nonda mode, so we can see if
that's happening more often under 3.15 versus 3.14.

						- Ted
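
(For illustration only: below is a minimal user-space sketch of the two
thresholds described above, not the actual ext4_nonda_switch() code from
fs/ext4/inode.c.  The function names and plain integer counters are
invented for the example; the real code works on per-cpu cluster
counters and superblock state.)

    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Kick background writeback once delalloc blocks exceed 50% of
     * the free blocks, i.e. free < 2 * delalloc. */
    static bool should_start_writeback(uint64_t free_blocks,
                                       uint64_t delalloc_blocks)
    {
            return delalloc_blocks > 0 && free_blocks < 2 * delalloc_blocks;
    }

    /* Fall back to nodelalloc once delalloc blocks exceed 66% of the
     * free blocks, i.e. 2 * free < 3 * delalloc. */
    static bool should_switch_to_nodelalloc(uint64_t free_blocks,
                                            uint64_t delalloc_blocks)
    {
            return 2 * free_blocks < 3 * delalloc_blocks;
    }

    int main(void)
    {
            uint64_t free_blocks = 1000;
            uint64_t delalloc_blocks;

            for (delalloc_blocks = 400; delalloc_blocks <= 800;
                 delalloc_blocks += 100)
                    printf("delalloc=%llu free=%llu: writeback=%d nodelalloc=%d\n",
                           (unsigned long long)delalloc_blocks,
                           (unsigned long long)free_blocks,
                           should_start_writeback(free_blocks, delalloc_blocks),
                           should_switch_to_nodelalloc(free_blocks, delalloc_blocks));
            return 0;
    }

In this model, with 1000 free blocks, writeback would start once more
than 500 blocks are under delayed allocation, and the nodelalloc
fallback would trigger once more than about 666 are.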
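
(Also for illustration: a rough sketch of what a tracepoint for the
nodelalloc fallback could look like.  The event name ext4_nonda_switch
and its fields are hypothetical, chosen only to show the shape of such
a TRACE_EVENT; this is a fragment that would live in a trace events
header such as include/trace/events/ext4.h, not a complete file.)

    /* Hypothetical tracepoint fired when we drop into nonda mode. */
    TRACE_EVENT(ext4_nonda_switch,

            TP_PROTO(struct super_block *sb, unsigned long long free_clusters,
                     unsigned long long dirty_clusters),

            TP_ARGS(sb, free_clusters, dirty_clusters),

            TP_STRUCT__entry(
                    __field(dev_t, dev)
                    __field(unsigned long long, free_clusters)
                    __field(unsigned long long, dirty_clusters)
            ),

            TP_fast_assign(
                    __entry->dev = sb->s_dev;
                    __entry->free_clusters = free_clusters;
                    __entry->dirty_clusters = dirty_clusters;
            ),

            TP_printk("dev %d,%d free %llu dirty %llu",
                      MAJOR(__entry->dev), MINOR(__entry->dev),
                      __entry->free_clusters, __entry->dirty_clusters)
    );

Counting hits on such an event under 3.14 versus 3.15 would show
directly whether the nodelalloc fallback is happening more often.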