Re: XFS Syncd

Shrinand Javadekar <shrinand@xxxxxxxxxxxxxx> · Fri, 10 Apr 2015 00:29:34 -0700

On Fri, Apr 10, 2015 at 12:21 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Thu, Apr 09, 2015 at 11:51:17PM -0700, Shrinand Javadekar wrote:
>> Thanks for the reply Dave!
>>
>> >
>> > Oh, right, it's that workqueue we removed in late 2012 (in the 3.7
>> > cycle) because it was redundant. The only remaining fragment of it
>> > is the xfslogd. What kernel are you running?
>>
>> I am running 3.13.0-39-generic on Ubuntu 14.04.
>
> You can't be running that kernel if you are seeing a process called
> xfssyncd in your traces.

I don't see a process called xfssyncd. I started investigating the 30
second pauses by looking for xfs config options in sysctl. We found
the option "fs.xfs.xfssyncd_centisecs" whose documentation[1] says it
is the interval at which the "filesystem flushes metadata out to disk
and runs internal cache cleanup routines".

I tweaked this setting and saw corresponding changes in performance:
raising the value made the pauses occur at longer intervals, and
lowering it made them more frequent.
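
For anyone following along, here is a minimal sketch (Python, not taken
from the XFS docs) of how the value can be read and converted to
seconds. It assumes the sysctl is exposed at the usual procfs path
/proc/sys/fs/xfs/xfssyncd_centisecs; the function names are made up
for illustration.

```python
from pathlib import Path

# Assumed procfs location of the sysctl fs.xfs.xfssyncd_centisecs.
SYSCTL_PATH = Path("/proc/sys/fs/xfs/xfssyncd_centisecs")


def centisecs_to_seconds(centisecs: int) -> float:
    """Convert a centisecond count to seconds (100 centisecs = 1 s)."""
    return centisecs / 100.0


def read_interval_seconds(path: Path = SYSCTL_PATH) -> float:
    """Read the sysctl file at `path` and return the interval in seconds."""
    return centisecs_to_seconds(int(path.read_text().strip()))


if __name__ == "__main__":
    # Only works on a machine where the xfs module exposes this sysctl.
    print(f"xfssyncd interval: {read_interval_seconds():.1f} s")
```

A value of 3000 corresponds to the 30 second interval I mentioned
earlier.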

>
> $ gl -n 1 5889608
> commit 5889608df35783590251cfd440fa5d48f1855179
> Author: Dave Chinner <dchinner@xxxxxxxxxx>
> Date:   Mon Oct 8 21:56:05 2012 +1100
>
>     xfs: syncd workqueue is no more
>
>     With the syncd functions moved to the log and/or removed, the syncd
>     workqueue is the only remaining bit left. It is used by the log
>     covering/ail pushing work, as well as by the inode reclaim work.
>
>     Given how cheap workqueues are these days, give the log and inode
>     reclaim work their own work queues and kill the syncd work queue.
>
>     Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
>     Reviewed-by: Mark Tinguely <tinguely@xxxxxxx>
>     Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>     Signed-off-by: Ben Myers <bpm@xxxxxxx>
>
> $ git describe --contains 5889608
> for-linus-v3.8-rc1~71
> $
>
> As you can see from the patch, the xfssyncd workqueue was
> removed and its work was split into the xfs-reclaim/<dev> and
> xfs-log/<dev> work queues.
>
> So, what exactly are you calling "xfssyncd"? Can you please post
> copies of the output you are seeing that has led you to think this
> kernel thread/workqueue exists in your kernel?
>
>> >> I am seeing a behavior where the system pretty much stalls for ~5
>> >> seconds after every 30 seconds. I see that the # of ios goes up but
>> >> the actual write bandwidth during this 5 second period is very low
>> >> (see attached images). After a fair bit of investigation, we've
>> >> narrowed down the problem to XFS's syncd (fs.xfs.xfssyncd_centisecs).
>> >> This runs at a default interval of 30 seconds.
>> >
>> > It's doing background inode reclaim which, under some circumstances,
>> > involves truncating speculative allocation beyond EOF before reclaim
>> > occurs, which results in transactions and inode writeback. It was
>> > highly inefficient, which is why we replaced it.
>>
>> Oh.. I see. So, this isn't even actual filesystem metadata. And there
>> is no option to turn the speculative allocation on/off?
>
> You can turn it off, but now you're jumping to conclusions that this
> is the cause of your problems. Perhaps you should do some
> tracing/profiling when the system goes through these stalls to see
> what is actually happening? "perf top" and trace-cmd are very useful
> for this sort of investigation...

Let me dig deeper here using "perf top" and see what's running during
these stalls.
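
To line the profiling up with the stalls, I'll probably start from
something like the rough sketch below. It takes (timestamp, write
bandwidth) samples, finds the windows where bandwidth drops below a
threshold, and reports the spacing between them. The sample format,
the threshold and the function names are my own assumptions, and the
data in the example is synthetic, not from my system.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, write bandwidth in MB/s)
Window = Tuple[float, float]  # (stall start, first timestamp after the stall)


def find_stalls(samples: List[Sample], threshold_mbps: float) -> List[Window]:
    """Return windows during which bandwidth stayed below the threshold."""
    stalls: List[Window] = []
    start = None
    last_t = None
    for t, mbps in samples:
        if mbps < threshold_mbps:
            if start is None:
                start = t  # first slow sample opens a window
        elif start is not None:
            stalls.append((start, t))  # first fast sample closes it
            start = None
        last_t = t
    if start is not None and last_t is not None:
        stalls.append((start, last_t))  # still stalled at end of data
    return stalls


def stall_periods(stalls: List[Window]) -> List[float]:
    """Time between the starts of consecutive stalls."""
    return [b[0] - a[0] for a, b in zip(stalls, stalls[1:])]


if __name__ == "__main__":
    # Synthetic data: one sample per second, slow for 5 s out of every 30 s.
    samples = [(float(t), 5.0 if 25 <= t % 30 < 30 else 100.0)
               for t in range(70)]
    stalls = find_stalls(samples, threshold_mbps=50.0)
    print("stalls:", stalls)
    print("periods:", stall_periods(stalls))
```

If the spacing keeps tracking the sysctl value, the stall start times
tell me when to capture the "perf top" output.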

>
>> What's the downside of not doing the truncation of the speculative
>> allocation? Does that result in wasted disk space? If so, how much?
>
> Start at:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_Why_do_files_on_XFS_use_more_data_blocks_than_expected.3F
>
> and read the next 4 FAQs...

Thanks!
-Shri

[1] http://www.mjmwired.net/kernel/Documentation/filesystems/xfs.txt#265
