Re: xfs: garbage file data inclusion bug under memory pressure

Brian Foster <bfoster@xxxxxxxxxx> · Thu, 25 Jul 2019 12:00:27 -0400

On Thu, Jul 25, 2019 at 09:30:01PM +0900, Tetsuo Handa wrote:
> On 2019/07/25 19:53, Brian Foster wrote:
> > This is a known problem. XFS delayed allocation has a window between
> > delalloc to real block conversion and writeback completion where stale
> > data exposure is possible if the writeback doesn't complete (i.e., due
> > to crash, I/O error, etc.). See fstests generic/536 for another
> > reference.  We've batted around potential solutions like using unwritten
> > extents for delalloc allocations, but IIRC we haven't been able to come
> > up with something with suitable performance to this point.
> > 
> > I'm curious why your OOM test results in writeback errors in the first
> > place. Is that generally expected? Does dmesg show any other XFS related
> > events, such as filesystem shutdown for example? I gave it a quick try
> > on a 4GB swapless VM and it doesn't trigger OOM. What's your memory
> > configuration and what does the /tmp filesystem look like ('xfs_info
> > /tmp')?
> 
> Writeback errors should not happen just because of a close-to-OOM situation.
> And there are no other XFS-related events.
> 

Indeed, that is strange.

...
> 
> Kernel config is http://I-love.SAKURA.ne.jp/tmp/config-5.3-rc1 .
> 
> The result below is from a different VM that shows the same problem.
> 
> # xfs_info /tmp
> meta-data=/dev/sda1              isize=256    agcount=4, agsize=16383936 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0        finobt=0 spinodes=0
> data     =                       bsize=4096   blocks=65535744, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal               bsize=4096   blocks=31999, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> 	

I ran your oom-torture.c tool again (without the fs fill step) after
dropping VM RAM to 3GB, and still had to invoke some usemem instances
(from fstests) to consume memory before OOM triggered. I eventually
reproduced oom-torture OOM kills but did not reproduce writeback errors.
I've only run it once, but this is against a virtio vdisk backing
lvm+XFS in the guest. What is your target device here? Is it failing
independently by chance?
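
For anyone comparing reproduction attempts, here is a rough sketch of how a file could be scanned for stale data after a run. It assumes the test wrote a single repeated byte into the file, which may not match what oom-torture.c actually writes, and the function name classify_blocks is hypothetical. The 4096-byte default matches the bsize reported by xfs_info above. Blocks that are neither the expected pattern nor all zeros are counted as possible garbage.

```python
def classify_blocks(path, expected_byte, block_size=4096):
    """Count blocks that match the pattern, are all zeros, or neither.

    expected_byte is a bytes object of length 1 (an assumption about
    what the writer used). "garbage" means the block is neither the
    expected pattern nor zeros, so it may be stale data.
    """
    counts = {"expected": 0, "zero": 0, "garbage": 0}
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            if block == expected_byte * len(block):
                counts["expected"] += 1
            elif block.count(0) == len(block):
                counts["zero"] += 1
            else:
                counts["garbage"] += 1
    return counts
```

A non-zero "garbage" count is only a hint. It does not by itself distinguish stale data exposure from a writer that used a different pattern.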

Brian
