Re: [BUG 4.10-rc7] sb_fdblocks inconsistency in xfs/297 test

Michal Hocko <mhocko@xxxxxxxxxx> · Sat, 11 Feb 2017 07:33:08 +0100

On Sat 11-02-17 14:02:04, Eryu Guan wrote:
> On Fri, Feb 10, 2017 at 10:31:31AM +0100, Michal Hocko wrote:
> > [CC Christoph]
> > 
> > On Fri 10-02-17 09:02:10, Michal Hocko wrote:
> > > On Fri 10-02-17 08:14:18, Michal Hocko wrote:
> > > > On Fri 10-02-17 11:53:48, Eryu Guan wrote:
> > > > > Hi,
> > > > > 
> > > > > I was testing 4.10-rc7 kernel and noticed that xfs_repair reported XFS
> > > > > corruption after fstests xfs/297 test. This didn't happen with 4.10-rc6
> > > > > kernel, and git bisect pointed the first bad commit to
> > > > > 
> > > > > commit d1908f52557b3230fbd63c0429f3b4b748bf2b6d
> > > > > Author: Michal Hocko <mhocko@xxxxxxxx>
> > > > > Date:   Fri Feb 3 13:13:26 2017 -0800
> > > > > 
> > > > >     fs: break out of iomap_file_buffered_write on fatal signals
> > > > > 
> > > > >     Tetsuo has noticed that an OOM stress test which performs large write
> > > > >     requests can cause the full memory reserves depletion.  He has tracked
> > > > >     this down to the following path
> > > > > ....
> > > > > 
> > > > > It's the sb_fdblocks field that xfs_repair reports as inconsistent:
> > > > > ...
> > > > > Phase 2 - using internal log   
> > > > >         - zero log...
> > > > >         - scan filesystem freespace and inode maps...
> > > > > sb_fdblocks 3367765, counted 3367863
> > > > >         - 11:37:41: scanning filesystem freespace - 16 of 16 allocation groups done
> > > > >         - found root inode chunk
> > > > > ...
> > > > > 
> > > > > And it can be reproduced almost 100% with all XFS test configurations
> > > > > (e.g. xfs_4k xfs_2k_reflink), on all test hosts I tried (so I didn't
> > > > > bother pasting my detailed test and host configs, if more info is needed
> > > > > please let me know).
> > > > 
> > > > The patch can lead to short writes when the task is killed. Was there
> > > > any OOM killer triggered during the test? If not who is killing the
> > > > task? I will try to reproduce later today.
> > > 
> > > I have checked both tests and they are killing the test but none of them
> > > seems to be using SIGKILL. The patch should make a difference only for
> > > a fatal signal (aka SIGKILL). Is there any other part that can send
> > > SIGKILL except for the OOM killer?
> 
> No, I'm not aware of any other part of the fstests harness that could
> send SIGKILL.

Hmm, maybe this is a result of group_exit, which sends SIGKILL to the
other threads (zap_other_threads).

[...]
> > So somebody had to send SIGKILL to fsstress. Anyway, I am wondering
> > whether this is really a regression. xfs_file_buffered_aio_write used to
> > call generic_perform_write which does the same thing.
> 
> Maybe it just uncovered some existing bug?

maybe

> Anyway, a reliably reproducible filesystem metadata inconsistency does
> smell like a bug.

Definitely! Unfortunately I am going to disappear for a week. Will be back
on the 20th. Anyway, I believe iomap_file_buffered_write and its callers
_should_ be able to handle short writes. EINTR is not the only way this
can happen. ENOMEM would be another.
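
To illustrate what handling a short write means from the caller's side,
here is a minimal userspace sketch. It is only an analogy, not the kernel
path discussed above, and write_all is a hypothetical helper name, not
something taken from fstests or XFS:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write(2) until all of buf has been
 * written, the write fails, or no further progress is possible.
 * Returns the number of bytes written, or -1 if nothing was written
 * and an error occurred (errno is left as set by write(2)).
 */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t done = 0;

	assert(buf != NULL || len == 0);

	while (done < len) {
		ssize_t n = write(fd, p + done, len - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted before any data moved, retry */
			/* Report partial progress if there was any. */
			return done ? (ssize_t)done : -1;
		}
		if (n == 0)
			break;		/* no progress, avoid spinning forever */
		done += (size_t)n;	/* short write: advance and try again */
	}
	return (ssize_t)done;
}
```

The point is only that a return value smaller than the requested length
is a normal outcome that the caller has to account for, whatever the
underlying reason was.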

-- 
Michal Hocko
SUSE Labs