warnings complaining IOMAP_DELALLOC blocks in iomap_dio_actor from generic/446

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

Currently generic/446 could trigger a warning in iomap_dio_actor()
easily, it's complaining about unexpected iomap->type (see the end for
full call trace).

fs/iomap.c: iomap_dio_actor()
 859         default:                                                                                                                                                                          
 860                 WARN_ON_ONCE(1);                                                                                                                                                          
 861                 return -EIO;

It's due to the race between direct read and mmap write pagefault on the
same *sparse* file. 

direct read process			mmap write process
xfs_file_dio_aio_read (take IOLOCK_SHARED)
 iomap_dio_rw
  iomap_apply
   filemap_write_and_wait_range
   invalidate_inode_pages2_range
   iomap_apply
					mmap
					 xfs_filemap_page_mkwrite (take MMAPLOCK_SHARED)
					  iomap_page_mkwrite
					   iomap_apply
					    xfs_file_iomap_begin
					     xfs_file_iomap_begin_delay (take ILOCK_EXCL)
					     (release ILOCK_EXCL)
					     ...
    xfs_file_iomap_begin
    (take ILOCK and read in bmap info)
    iomap_dio_actor
    ...

The dio path and page_mkwrite path are taking different locks so they're
not serialized. So after dio read flushing the file range but before
taking ILOCK, the page faults from mmap write could fault in and update
the file map first with delalloc blocks. Then the dio reader sees this
delalloc block map unexpectedly.

I'm not sure what's the best way to fix it, but a quick hack of
disabling delalloc in the write page fault path could do the work for
me, e.g.

--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -981,7 +981,7 @@ xfs_file_iomap_begin(
        if (XFS_FORCED_SHUTDOWN(mp))
                return -EIO;

-       if (((flags & (IOMAP_WRITE | IOMAP_DIRECT)) == IOMAP_WRITE) &&
+       if (((flags & (IOMAP_WRITE|IOMAP_DIRECT|IOMAP_FAULT)) == IOMAP_WRITE) &&
                        !IS_DAX(inode) && !xfs_get_extsz_hint(ip)) {
                /* Reserve delalloc blocks for regular writeback. */
                return xfs_file_iomap_begin_delay(inode, offset, length, flags,

So that mmap write page fault brings in already allocated blocks, and
dio reader sees non-IOMAP_DELALLOC iomaps.

I know concurrent dio and mmap io are not recommended, so is this
something that doens't need a fix at all, and the test should filter out
the warning instead?

Thanks,
Eryu

[162097.402359] ------------[ cut here ]------------
[162097.402745] WARNING: CPU: 1 PID: 20582 at fs/iomap.c:860 iomap_dio_actor+0xca/0x370
[162097.403392] Modules linked in: rpcsec_gss_krb5 nfsv4(OE) dns_resolver nfs(OE) fscache btrfs xor raid6_pq ppdev i2c_piix4 parport_pc sg virtio_balloon parport i2c_core pcspkr nfsd(OE) auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod ata_generic pata_acpi virtio_scsi ata_piix 8139too libata virtio_pci 8139cp floppy virtio_ring mii serio_raw virtio
[162097.406258] CPU: 1 PID: 20582 Comm: xfs_io Tainted: G        W  OE   4.13.0-rc1 #191
[162097.406898] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[162097.407388] task: ffff880212e40000 task.stack: ffffc900012ac000
[162097.408403] RIP: 0010:iomap_dio_actor+0xca/0x370
[162097.409072] RSP: 0018:ffffc900012afc20 EFLAGS: 00010202
[162097.409533] RAX: 0000000000000002 RBX: ffffc900012afcb0 RCX: 000000000241b000
[162097.410144] RDX: 00000000000001ff RSI: ffffc900012afe30 RDI: ffffc900012afe68
[162097.410720] RBP: ffffc900012afc90 R08: ffffc900012afcb0 R09: ffff880211a65c60
[162097.411324] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[162097.411912] R13: ffffffffa016bb60 R14: ffff880211a65c60 R15: 0000000000000009
[162097.412503] FS:  00007f94e825a740(0000) GS:ffff88021fc80000(0000) knlGS:0000000000000000
[162097.413181] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[162097.413648] CR2: 000000000241a000 CR3: 0000000212c22000 CR4: 00000000000006e0
[162097.414261] Call Trace:
[162097.414478]  ? iomap_dio_zero+0x100/0x100
[162097.414808]  iomap_apply+0xa1/0x110
[162097.415133]  iomap_dio_rw+0x20b/0x390
[162097.415442]  ? iomap_dio_zero+0x100/0x100
[162097.415804]  xfs_file_dio_aio_read+0x6e/0xf0 [xfs]
[162097.416250]  xfs_file_read_iter+0xab/0xc0 [xfs]
[162097.416629]  __vfs_read+0xe0/0x150
[162097.416925]  vfs_read+0x8c/0x130
[162097.417217]  SyS_pread64+0x87/0xb0
[162097.417512]  do_syscall_64+0x67/0x150
[162097.417818]  entry_SYSCALL64_slow_path+0x25/0x25
[162097.418234] RIP: 0033:0x7f94e7e3cf73
[162097.418533] RSP: 002b:00007fff7f52ae90 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[162097.419167] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f94e7e3cf73
[162097.419740] RDX: 0000000000001000 RSI: 000000000241a000 RDI: 0000000000000003
[162097.420339] RBP: 00007fff7f52af50 R08: 0000000000000000 R09: 0000000000000000
[162097.420924] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
[162097.421516] R13: 0000000000020000 R14: 0000000000000000 R15: 0000000000000000
[162097.422124] Code: e1 48 09 c1 48 85 ca 0f 85 82 02 00 00 0f b7 43 18 66 83 f8 03 0f 84 fb 01 00 00 66 83 f8 04 74 35 66 83 f8 01 0f 84 07 02 00 00 <0f> ff 48 c7 c0 fb ff ff ff 48 8b 4d d0 65 48 33 0c 25 28 00 00
[162097.423661] ---[ end trace 927f98a352499782 ]---
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux