On Fri, Apr 10, 2015 at 12:21 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Apr 09, 2015 at 11:51:17PM -0700, Shrinand Javadekar wrote:
>> Thanks for the reply Dave!
>>
>> >
>> > Oh, right, it's that workqueue we removed in late 2012 (in the 3.7
>> > cycle) because it was redundant. The only remaining fragment of it
>> > is the xfslogd. What kernel are you running?
>>
>> I am running 3.13.0-39-generic on Ubuntu 14.04.
>
> You can't be running that kernel if you are seeing a process called
> xfssyncd in your traces.

I don't see a process called xfssyncd. I started investigating the
30 second pauses by looking for xfs config options in sysctl. We found
the option "fs.xfs.xfssyncd_centisecs", whose documentation[1] says it is
the interval at which the "filesystem flushes metadata out to disk and
runs internal cache cleanup routines".

I tweaked this setting and saw corresponding changes in performance.
Raising the value made the pauses occur at longer intervals; lowering it
made them more frequent.

>
> $ gl -n 1 5889608
> commit 5889608df35783590251cfd440fa5d48f1855179
> Author: Dave Chinner <dchinner@xxxxxxxxxx>
> Date: Mon Oct 8 21:56:05 2012 +1100
>
>     xfs: syncd workqueue is no more
>
>     With the syncd functions moved to the log and/or removed, the syncd
>     workqueue is the only remaining bit left. It is used by the log
>     covering/ail pushing work, as well as by the inode reclaim work.
>
>     Given how cheap workqueues are these days, give the log and inode
>     reclaim work their own work queues and kill the syncd work queue.
>
>     Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
>     Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
>     Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>     Signed-off-by: Ben Myers <bpm@xxxxxxx>
>
> $ git describe --contains 5889608
> for-linus-v3.8-rc1~71
> $
>
> Which, as you can see from the patch, the xfssyncd workqueue was
> removed and they were separated into xfs-reclaim/<dev> and
> xfs-log/<dev> work queues.
>
> So, what exactly are you calling "xfssyncd"? Can you please post
> copies of the output you are seeing that has led you to think this
> kernel thread/workqueue exists in your kernel?
>
>> >> I am seeing a behavior where the system pretty much stalls for ~5
>> >> seconds after every 30 seconds. I see that the # of ios goes up but
>> >> the actual write bandwidth during this 5 second period is very low
>> >> (see attached images). After a fair bit of investigation, we've
>> >> narrowed down the problem to XFS's syncd (fs.xfs.xfssyncd_centisecs).
>> >> This runs at a default interval of 30 seconds.
>> >
>> > It's doing background inode reclaim which, under some circumstances,
>> > involves truncating speculative allocation beyond EOF before reclaim
>> > occurs, which results in transactions and inode writeback. It was
>> > highly inefficient, which is why we replaced it.
>>
>> Oh.. I see. So, this isn't even actual filesystem metadata. And there
>> is no option to turn the speculative allocation on/off?
>
> You can turn it off, but now you're jumping to conclusions that this
> is the cause of your problems. Perhaps you should do some
> tracing/profiling when the system goes through these stalls to see
> what is actually happening? "perf top" and trace-cmd are very useful
> for this sort of investigation...

Let me dig deeper here using "perf top" and see what's running during
these stalls.

>
>> What's the downside of not doing the truncation of the speculative
>> allocation? Does that result in wasted disk space? If so, how much?
>
> Start at:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_Why_do_files_on_XFS_use_more_data_blocks_than_expected.3F
>
> and read the next 4 FAQs...

Thanks!
-Shri

[1] http://www.mjmwired.net/kernel/Documentation/filesystems/xfs.txt#265
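
P.S. For anyone repeating the experiment with the tunable discussed above, here is a minimal Python sketch for reading the current interval. It assumes the sysctl fs.xfs.xfssyncd_centisecs is exposed at /proc/sys/fs/xfs/xfssyncd_centisecs (the usual procfs mapping of sysctl names). The helper names are hypothetical and are not part of any XFS tooling.

```python
from pathlib import Path
from typing import Optional

# Assumed procfs location of the sysctl fs.xfs.xfssyncd_centisecs.
SYSCTL_PATH = Path("/proc/sys/fs/xfs/xfssyncd_centisecs")


def centisecs_to_seconds(centisecs: int) -> float:
    """Convert a value in centiseconds (1/100 s) to seconds."""
    return centisecs / 100.0


def read_interval_seconds(path: Path = SYSCTL_PATH) -> Optional[float]:
    """Return the configured interval in seconds, or None if unreadable."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # XFS not loaded, not Linux, or no permission to read the file.
        return None
    return centisecs_to_seconds(int(raw))


if __name__ == "__main__":
    interval = read_interval_seconds()
    if interval is None:
        print("fs.xfs.xfssyncd_centisecs is not available on this system")
    else:
        print(f"fs.xfs.xfssyncd_centisecs corresponds to {interval:.2f} s")
```

With the 30 second default mentioned in the thread, the stored value would be 3000 centiseconds. This only reports the setting; as Dave points out, it says nothing about what is actually running during the stalls.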