On Nov 24, 2015 16:25, "Jan Kara" <jack@xxxxxxx> wrote:
>
> On Mon 23-11-15 20:02:48, Dmitry Monakhov wrote:
> > After freeze_fs was revoked (from Jan Kara) pages' write-back completion
> > is deferred before unwritten conversion, so explicit flush_unwritten_io()
> > was removed here: c724585b62411
> > But we still may face deferred conversion for aio-dio case
> > # Trivial testcase
> > for ((i=0;i<60;i++));do fsfreeze -f /mnt ;sleep 1;fsfreeze -u /mnt;done &
> > fio --bs=4k --ioengine=libaio --iodepth=128 --size=1g --direct=1 \
> > --runtime=60 --filename=/mnt/file --name=rand-write --rw=randwrite
> > NOTE: Sane testcase should be integrated to xfstests, but it requires
> > changes in common/* code, so let's use this test at the moment.
> >
> > In order to fix this race we have to guard journal transaction with explicit
> > sb_{start,end}_intwrite() as we do with ext4_evict_inode here:8e8ad8a5
>
> Well, this problem seems to suggest that we have the freeze protection for
> AIO writes wrong. We should call file_end_write() from aio_complete() and
> not from aio_run_iocb()...
Yep. That was my first attempt at fixing this issue, but unfortunately the
trick breaks lockdep. The caller does file_start_write() and then returns to
userspace, and lockdep treats such behaviour as a bug (return to userspace
with a lock held).
There are two ways to fix that:
1) add a specific 'long' lock primitive to lockdep
2) let sync_filesystems wait for pending aio-dio
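
To make the race concrete, here is a toy userspace model of it, not kernel
code. Every toy_* name is made up for illustration, and the "freeze" is a
plain counter check rather than the kernel's real freeze protection. It only
shows why dropping protection when submission returns lets a freeze slip in
while the IO is still in flight, and why holding it until completion would
not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a superblock's freeze state (hypothetical, not the kernel struct). */
struct toy_sb {
    int writers;   /* writes currently holding freeze protection */
    bool frozen;
};

/* Models file_start_write(): here we refuse while frozen (the kernel would block). */
static bool toy_start_write(struct toy_sb *sb)
{
    if (sb->frozen)
        return false;
    sb->writers++;
    return true;
}

/* Models file_end_write(): drop freeze protection. */
static void toy_end_write(struct toy_sb *sb)
{
    assert(sb->writers > 0);
    sb->writers--;
}

/* Models "fsfreeze -f": succeeds only when no writer holds protection. */
static bool toy_try_freeze(struct toy_sb *sb)
{
    if (sb->writers > 0)
        return false;
    sb->frozen = true;
    return true;
}

/*
 * Submit one async write, then attempt a freeze while the IO is in flight.
 * Returns true if the freeze got in before the IO completed (the race).
 */
static bool toy_freeze_races_with_aio(bool release_at_completion)
{
    struct toy_sb sb = { 0, false };
    bool froze_in_flight;

    if (!toy_start_write(&sb))
        return false;
    if (!release_at_completion)
        toy_end_write(&sb);     /* protection dropped when submission returns */

    froze_in_flight = toy_try_freeze(&sb);

    if (release_at_completion)
        toy_end_write(&sb);     /* protection dropped from the completion path */
    return froze_in_flight;
}
```

Calling toy_freeze_races_with_aio(false) models the current behaviour and
reports that the freeze won the race; toy_freeze_races_with_aio(true) models
releasing from the completion path and reports that it did not. The model
says nothing about lockdep, which is the obstacle described above.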
> I believe XFS and other filesystems may have
> problems with this as well (CCed). Attached patch (so far only compile
> tested since my test machine is pondering on something else) should fix
> this.
>
> Honza
>
> --
> Jan Kara <jack@xxxxxxxx>
> SUSE Labs, CR