Re: XFS and RAID10 with o2 layout

o2 is not the fastest layout, but when the block (chunk) size is big
enough the loss from head movement is small. f2 is better, even at
smaller block sizes.
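
For illustration, a sketch of how those choices are expressed at creation
time (same placeholder device names as in the original report below;
mdadm's -c takes the chunk size in kilobytes):

# mdadm -C -n2 -l10 -po2 -c 4096 /dev/mdX /dev/sdaX /dev/sdbX
# mdadm -C -n2 -l10 -pf2 /dev/mdX /dev/sdaX /dev/sdbX

The first matches the 4096K-chunk offset array shown in the mdstat output
further down; the second creates the same two-disk mirror with the far
layout instead.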




On Thu, Dec 13, 2018 at 02:02:13PM +0100, Sinisa wrote:
> On 12/13/18 1:28 PM, Brian Foster wrote:
> >On Thu, Dec 13, 2018 at 09:21:18AM +0100, Sinisa wrote:
> >>Thanks for a quick reply. Replies are inline...
> >>
> >>On 12.12.2018 15:30, Brian Foster wrote:
> >>>cc linux-raid
> >>>
> >>>On Wed, Dec 12, 2018 at 01:29:49PM +0100, Sinisa wrote:
> >>>>Hello group,
> >>>>
> >>>>I have noticed something strange going on lately, and recently I have
> >>>>come to the conclusion that there is some unwanted interaction between
> >>>>XFS and Linux RAID10 with the "offset" layout.
> >>>>
> >>>>So here is the problem: I create a Linux RAID10 mirror with 2 disks 
> >>>>(HDD or
> >>>>SSD) and "o2" layout (best choice for read and write speed):
> >>>># mdadm -C -n2 -l10 -po2 /dev/mdX /dev/sdaX /dev/sdbX
> >>>># mkfs.xfs /dev/mdX
> >>>># mount /dev/mdX /mnt
> >>>># rsync -avxDPHS / /mnt
> >>>>
> >>>>So we have RAID10 initializing:
> >>>>
> >>>># cat /proc/mdstat
> >>>>Personalities : [raid1] [raid10]
> >>>>md2 : active raid10 sdb3[1] sda3[0]
> >>>>       314433536 blocks super 1.2 4096K chunks 2 offset-copies [2/2] [UU]
> >>>>       [==>..................]  resync = 11.7% (36917568/314433536) finish=8678.2min speed=532K/sec
> >>>>       bitmap: 3/3 pages [12KB], 65536KB chunk
> >>>>
> >>>>but after a few minutes everything stops, as you can see above. Rsync
> >>>>(or any other process writing to that md device) also freezes. If I try
> >>>>to read already copied files - freeze, usually with less than 2GB copied.
> >>>>
> >>>Does the same thing happen without the RAID initialization? E.g., if you
> >>>wait for it to complete or (IIRC) if you create with --assume-clean? I
> >>>assume the init-in-progress state is common with your tests on other
> >>>filesystems?
> >>>
> >>No, if I wait for RAID to finish initializing, or create it with
> >>--assume-clean, everything works just fine.
> >>
> >>Actually, ever since the openSUSE LEAP 15.0 release I have been doing
> >>just that: pausing the installation process until initialization is done,
> >>then letting it go on.
> >>
> >>But recently it so happened that I had to replace one of the disks in a
> >>"live" system (a small file server), and was unable to do that over
> >>multiple tries during work hours because of this problem. When I waited
> >>until the afternoon, when nobody was working/writing, the resync was able
> >>to finish...
> >>
> >So apparently there is some kind of poor interaction here with the
> >internal MD resync code. It's not clear to me whether it's a lockup or an
> >extreme slowdown, but unless anybody else has ideas I'd suggest soliciting
> >feedback from the MD devs (note that you dropped the linux-raid cc) as to
> >why this set of I/O might be blocked in the raid device, and going from
> >there.
> >
> >Brian
> 
> (Sorry, I'm not very much into mailing lists, I have added linux-raid back 
> to cc)
> 
> Today I tried lowering the RAID10 chunk size to 512KB (the default), and
> the only difference was that the freeze appeared much faster.
> I also tried the newest RC kernel, 4.20.0-rc6-2.g91eea17-default (from
> openSUSE kernel/HEAD), with the same results.
> 
> And it is definitely a lockup: I have left it overnight, and once the
> rsync/copy/write stops, it never moves on, and once the RAID sync stops it
> never moves on either...
> 
> I have tried many times to get that dmesg report again, but without
> success (I waited up to 30 minutes). Any help is welcome...
> 
> 
> What I have not tried is another distribution - only openSUSE LEAP and
> Tumbleweed.
> 
> 
> >
> >>>A few more notes below inline to the log..
> >>>
> >>>>Sometimes in dmesg I get kernel messages about "task kworker/2:1:55
> >>>>blocked for more than 480 seconds." (please see the attached dmesg.txt
> >>>>and my reports here: https://bugzilla.opensuse.org/show_bug.cgi?id=1111073),
> >>>>and sometimes nothing at all. When this happens, I can only reboot with
> >>>>SysRq-b or "physically" with the reset/power button.
> >>>>
> >>>>The same thing can happen with the "far" layout, but it seems to me
> >>>>that it does not happen every time (or that often). I might be wrong,
> >>>>because I never use the "far" layout in real life, only for testing.
> >>>>I was unable to reproduce the failure with the "near" layout.
> >>>>
> >>>>Also, with EXT4 or BTRFS and any layout, everything works just as it
> >>>>should; that is, the sync runs until finished, and rsync, cp, or any
> >>>>other writes work just fine at the same time.
> >>>>
> >>>>Let me just add that I first saw this behavior in openSUSE LEAP 15.0
> >>>>(kernel 4.12). In previous versions (up to kernel 4.4) I never had this
> >>>>problem. In the meantime I have tested with kernels up to 4.20-rc and it
> >>>>is the same. Unfortunately I cannot go back to test kernels 4.5 - 4.11
> >>>>to pinpoint the moment the problem first appeared.
> >>>>
> >>>>
> >>>>
> >>>>--
> >>>>Best regards,
> >>>>Siniša Bandin
> >>>>(excuse my English)
> >>>>
> >>>>[ 180.981499] SGI XFS with ACLs, security attributes, no debug enabled
> >>>>[ 181.005019] XFS (md1): Mounting V5 Filesystem
> >>>>[ 181.132076] XFS (md1): Starting recovery (logdev: internal)
> >>>>[ 181.295606] XFS (md1): Ending recovery (logdev: internal)
> >>>>[ 181.804011] XFS (md1): Unmounting Filesystem
> >>>>[ 182.201794] XFS (md127): Mounting V4 Filesystem
> >>>>[ 182.736958] md: recovery of RAID array md127
> >>>>[ 182.915479] XFS (md127): Ending clean mount
> >>>>[ 183.819702] XFS (md127): Unmounting Filesystem
> >>>>[ 184.943831] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> >>>>[ 529.784557] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> >>>>[ 601.789958] md1: detected capacity change from 33284947968 to 0
> >>>>[ 601.789973] md: md1 stopped.
> >>>>[ 602.314112] md0: detected capacity change from 550436864 to 0
> >>>>[ 602.314128] md: md0 stopped.
> >>>>[ 602.745030] md: md127: recovery interrupted.
> >>>>[ 603.131684] md127: detected capacity change from 966229229568 to 0
> >>>>[ 603.132237] md: md127 stopped.
> >>>>[ 603.435808] sda: sda1 sda2
> >>>>[ 603.594074] udevd[5011]: inotify_add_watch(11, /dev/sda2, 10) failed: No such file or directory
> >>>>[ 603.643959] sda:
> >>>>[ 603.844724] sdb: sdb1 sdb2
> >>>>[ 604.255407] sdb: sdb1
> >>>>[ 604.490214] udevd[5050]: inotify_add_watch(11, /dev/sdb1, 10) failed: No such file or directory
> >>>>[ 605.140952] sdb: sdb1
> >>>>[ 605.628686] sdb: sdb1 sdb2
> >>>>[ 606.271192] sdb: sdb1 sdb2 sdb3
> >>>>[ 607.079626] sdb: sdb1 sdb2 sdb3
> >>>>[ 607.611092] sda:
> >>>>[ 608.273201] sda: sda1
> >>>>[ 608.611952] sda: sda1 sda2
> >>>>[ 609.031326] sda: sda1 sda2 sda3
> >>>>[ 609.753140] md/raid10:md1: not clean -- starting background reconstruction
> >>>>[ 609.753145] md/raid10:md1: active with 2 out of 2 devices
> >>>>[ 609.768804] md1: detected capacity change from 0 to 32210157568
> >>>>[ 609.772677] md: resync of RAID array md1
> >>>>[ 614.590107] XFS (md1): Mounting V5 Filesystem
> >>>>[ 615.449035] XFS (md1): Ending clean mount
> >>>>[ 617.678462] md/raid1:md0: not clean -- starting background reconstruction
> >>>>[ 617.678469] md/raid1:md0: active with 2 out of 2 mirrors
> >>>>[ 617.740729] md0: detected capacity change from 0 to 524222464
> >>>>[ 617.747107] md: delaying resync of md0 until md1 has finished (they share one or more physical units)
> >>>What are md0 and md1? Note that I don't see md2 anywhere in this log.
> >>>
> >>Sorry that I did not clarify that immediately; this log was taken
> >>earlier, during installation, when I managed to see it in dmesg.
> >>md0 was /boot (with EXT4), md1 was / with XFS.
> >>
> >>The example of cat /proc/mdstat was taken later, when I brought up the
> >>system (by changing md1 to the "near" layout at install time). So wherever
> >>you see md1 or md2, you can assume they are the same thing: a new
> >>RAID10/o2 being written to during initialization. But the second time
> >>there was nothing in dmesg, so I could not attach that.
> >>
> >>
> >>>>[ 620.037818] EXT4-fs (md0): mounted filesystem with ordered data mode. Opts: (null)
> >>>>[ 1463.754785] INFO: task kworker/0:3:227 blocked for more than 480 seconds.
> >>>>[ 1463.754793] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.754795] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.754799] kworker/0:3 D 0 227 2 0x80000000
> >>>>[ 1463.755000] Workqueue: xfs-eofblocks/md1 xfs_eofblocks_worker [xfs]
> >>>>[ 1463.755005] Call Trace:
> >>>>[ 1463.755025] ? __schedule+0x29a/0x880
> >>>>[ 1463.755032] ? rwsem_down_write_failed+0x197/0x350
> >>>>[ 1463.755038] schedule+0x78/0x110
> >>>>[ 1463.755044] rwsem_down_write_failed+0x197/0x350
> >>>>[ 1463.755055] call_rwsem_down_write_failed+0x13/0x20
> >>>>[ 1463.755061] down_write+0x20/0x30
> >>>So we have a background task blocked on an inode lock.
> >>>
> >>>>[ 1463.755196] xfs_free_eofblocks+0x114/0x1a0 [xfs]
> >>>>[ 1463.755330] xfs_inode_free_eofblocks+0xd3/0x1e0 [xfs]
> >>>>[ 1463.755459] ? xfs_inode_ag_walk_grab+0x5b/0x90 [xfs]
> >>>>[ 1463.755586] xfs_inode_ag_walk.isra.15+0x1aa/0x420 [xfs]
> >>>>[ 1463.755714] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.755727] ? trace_hardirqs_on_thunk+0x1a/0x1c
> >>>>[ 1463.755734] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755738] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755743] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755748] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755752] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755757] ? __switch_to_asm+0x34/0x70
> >>>>[ 1463.755762] ? __switch_to_asm+0x40/0x70
> >>>>[ 1463.755893] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.755900] ? radix_tree_gang_lookup_tag+0xc2/0x140
> >>>>[ 1463.756032] ? __xfs_inode_clear_blocks_tag+0x120/0x120 [xfs]
> >>>>[ 1463.756158] xfs_inode_ag_iterator_tag+0x73/0xb0 [xfs]
> >>>>[ 1463.756288] xfs_eofblocks_worker+0x29/0x40 [xfs]
> >>>>[ 1463.756298] process_one_work+0x1fd/0x420
> >>>>[ 1463.756305] worker_thread+0x2d/0x3d0
> >>>>[ 1463.756311] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.756316] kthread+0x112/0x130
> >>>>[ 1463.756322] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.756329] ret_from_fork+0x3a/0x50
> >>>>[ 1463.756375] INFO: task kworker/u4:0:4615 blocked for more than 480 seconds.
> >>>>[ 1463.756379] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.756380] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.756383] kworker/u4:0 D 0 4615 2 0x80000000
> >>>>[ 1463.756395] Workqueue: writeback wb_workfn (flush-9:1)
> >>>>[ 1463.756400] Call Trace:
> >>>>[ 1463.756409] ? __schedule+0x29a/0x880
> >>>>[ 1463.756420] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.756426] schedule+0x78/0x110
> >>>>[ 1463.756433] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.756440] ? wait_woken+0x80/0x80
> >>>>[ 1463.756448] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.756454] ? wait_woken+0x80/0x80
> >>>>[ 1463.756459] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.756483] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.756492] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.756498] ? wait_woken+0x80/0x80
> >>>>[ 1463.756514] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.756535] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.756544] generic_make_request+0x1c6/0x470
> >>>>[ 1463.756553] submit_bio+0x45/0x140
> >>>Writeback is blocked submitting I/O down in the MD driver.
> >>>
> >>>>[ 1463.756714] xfs_submit_ioend+0x9c/0x1e0 [xfs]
> >>>>[ 1463.756844] xfs_vm_writepages+0x68/0x80 [xfs]
> >>>>[ 1463.756856] do_writepages+0x31/0xb0
> >>>>[ 1463.756865] ? read_hpet+0x126/0x130
> >>>>[ 1463.756873] ? ktime_get+0x36/0xa0
> >>>>[ 1463.756881] __writeback_single_inode+0x3d/0x3e0
> >>>>[ 1463.756889] writeback_sb_inodes+0x1c4/0x430
> >>>>[ 1463.756902] __writeback_inodes_wb+0x5d/0xb0
> >>>>[ 1463.756910] wb_writeback+0x26b/0x310
> >>>>[ 1463.756920] wb_workfn+0x33a/0x410
> >>>>[ 1463.756932] process_one_work+0x1fd/0x420
> >>>>[ 1463.756940] worker_thread+0x2d/0x3d0
> >>>>[ 1463.756946] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.756951] kthread+0x112/0x130
> >>>>[ 1463.756957] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.756965] ret_from_fork+0x3a/0x50
> >>>>[ 1463.756979] INFO: task kworker/0:2:4994 blocked for more than 480 seconds.
> >>>>[ 1463.756982] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.756984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.756987] kworker/0:2 D 0 4994 2 0x80000000
> >>>>[ 1463.757013] Workqueue: md submit_flushes [md_mod]
> >>>>[ 1463.757016] Call Trace:
> >>>>[ 1463.757024] ? __schedule+0x29a/0x880
> >>>>[ 1463.757034] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757039] schedule+0x78/0x110
> >>>>[ 1463.757047] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757054] ? wait_woken+0x80/0x80
> >>>>[ 1463.757062] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.757067] ? wait_woken+0x80/0x80
> >>>>[ 1463.757072] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.757088] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.757095] ? trace_hardirqs_off_thunk+0x1a/0x1c
> >>>>[ 1463.757104] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.757110] ? wait_woken+0x80/0x80
> >>>>[ 1463.757126] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.757132] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.757137] ? finish_task_switch+0x74/0x260
> >>>>[ 1463.757156] submit_flushes+0x21/0x40 [md_mod]
> >>>Some other MD task (?) also blocked submitting a request.
> >>>
> >>>>[ 1463.757163] process_one_work+0x1fd/0x420
> >>>>[ 1463.757170] worker_thread+0x2d/0x3d0
> >>>>[ 1463.757177] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.757181] kthread+0x112/0x130
> >>>>[ 1463.757186] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.757193] ret_from_fork+0x3a/0x50
> >>>>[ 1463.757205] INFO: task md1_resync:5215 blocked for more than 480 seconds.
> >>>>[ 1463.757207] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.757209] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.757212] md1_resync D 0 5215 2 0x80000000
> >>>>[ 1463.757216] Call Trace:
> >>>>[ 1463.757223] ? __schedule+0x29a/0x880
> >>>>[ 1463.757231] ? raise_barrier+0x8d/0x140 [raid10]
> >>>>[ 1463.757236] schedule+0x78/0x110
> >>>>[ 1463.757243] raise_barrier+0x8d/0x140 [raid10]
> >>>>[ 1463.757248] ? wait_woken+0x80/0x80
> >>>>[ 1463.757257] raid10_sync_request+0x1f6/0x1e30 [raid10]
> >>>>[ 1463.757265] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.757284] ? is_mddev_idle+0x125/0x137 [md_mod]
> >>>>[ 1463.757302] md_do_sync.cold.78+0x404/0x969 [md_mod]
> >>>The md1 sync task is blocked, I'm not sure on what.
> >>>
> >>>>[ 1463.757311] ? wait_woken+0x80/0x80
> >>>>[ 1463.757336] ? md_rdev_init+0xb0/0xb0 [md_mod]
> >>>>[ 1463.757351] md_thread+0xe9/0x140 [md_mod]
> >>>>[ 1463.757358] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> >>>>[ 1463.757364] ? __kthread_parkme+0x4c/0x70
> >>>>[ 1463.757369] kthread+0x112/0x130
> >>>>[ 1463.757374] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.757380] ret_from_fork+0x3a/0x50
> >>>>[ 1463.757395] INFO: task xfsaild/md1:5233 blocked for more than 480 seconds.
> >>>>[ 1463.757398] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.757400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.757402] xfsaild/md1 D 0 5233 2 0x80000000
> >>>>[ 1463.757406] Call Trace:
> >>>>[ 1463.757413] ? __schedule+0x29a/0x880
> >>>>[ 1463.757421] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757426] schedule+0x78/0x110
> >>>>[ 1463.757433] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.757438] ? wait_woken+0x80/0x80
> >>>>[ 1463.757446] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.757451] ? wait_woken+0x80/0x80
> >>>>[ 1463.757455] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.757471] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.757477] ? trace_hardirqs_on_thunk+0x1a/0x1c
> >>>>[ 1463.757485] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.757491] ? wait_woken+0x80/0x80
> >>>>[ 1463.757507] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.757527] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.757536] generic_make_request+0x1c6/0x470
> >>>>[ 1463.757544] submit_bio+0x45/0x140
> >>>xfsaild (metadata writeback) is also blocked submitting I/O down in the
> >>>MD driver.
> >>>
> >>>>[ 1463.757552] ? bio_add_page+0x48/0x60
> >>>>[ 1463.757716] _xfs_buf_ioapply+0x2c1/0x450 [xfs]
> >>>>[ 1463.757849] ? xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> >>>>[ 1463.757974] __xfs_buf_submit+0x67/0x270 [xfs]
> >>>>[ 1463.758102] xfs_buf_delwri_submit_buffers+0xec/0x280 [xfs]
> >>>>[ 1463.758232] ? xfsaild+0x294/0x7e0 [xfs]
> >>>>[ 1463.758364] xfsaild+0x294/0x7e0 [xfs]
> >>>>[ 1463.758377] ? _raw_spin_unlock_irqrestore+0x2e/0x60
> >>>>[ 1463.758508] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
> >>>>[ 1463.758514] kthread+0x112/0x130
> >>>>[ 1463.758520] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.758527] ret_from_fork+0x3a/0x50
> >>>>[ 1463.758543] INFO: task rpm:5364 blocked for more than 480 seconds.
> >>>>[ 1463.758546] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.758547] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.758550] rpm D 0 5364 3757 0x00000000
> >>>>[ 1463.758554] Call Trace:
> >>>>[ 1463.758563] ? __schedule+0x29a/0x880
> >>>>[ 1463.758701] ? xlog_wait+0x5c/0x70 [xfs]
> >>>>[ 1463.759821] schedule+0x78/0x110
> >>>>[ 1463.760022] xlog_wait+0x5c/0x70 [xfs]
> >>>>[ 1463.760036] ? wake_up_q+0x70/0x70
> >>>>[ 1463.760167] __xfs_log_force_lsn+0x223/0x230 [xfs]
> >>>>[ 1463.760297] ? xfs_file_fsync+0x196/0x1d0 [xfs]
> >>>>[ 1463.760424] xfs_log_force_lsn+0x93/0x140 [xfs]
> >>>>[ 1463.760552] xfs_file_fsync+0x196/0x1d0 [xfs]
> >>>An fsync is blocked, presumably on XFS log I/O completion.
> >>>
> >>>>[ 1463.760562] ? __sb_end_write+0x36/0x60
> >>>>[ 1463.760571] do_fsync+0x38/0x70
> >>>>[ 1463.760578] __x64_sys_fdatasync+0x13/0x20
> >>>>[ 1463.760585] do_syscall_64+0x60/0x110
> >>>>[ 1463.760594] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>>>[ 1463.760603] RIP: 0033:0x7f9757fae8a4
> >>>>[ 1463.760616] Code: Bad RIP value.
> >>>>[ 1463.760619] RSP: 002b:00007fff74fdb428 EFLAGS: 00000246 ORIG_RAX: 000000000000004b
> >>>>[ 1463.760654] RAX: ffffffffffffffda RBX: 0000000000000064 RCX: 00007f9757fae8a4
> >>>>[ 1463.760657] RDX: 00000000012c4c60 RSI: 00000000012cc130 RDI: 0000000000000004
> >>>>[ 1463.760660] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f9758708c00
> >>>>[ 1463.760662] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000012cc130
> >>>>[ 1463.760665] R13: 000000000123a3a0 R14: 0000000000010830 R15: 0000000000000062
> >>>>[ 1463.760679] INFO: task kworker/0:8:5367 blocked for more than 480 seconds.
> >>>>[ 1463.760683] Not tainted 4.19.5-1-default #1
> >>>>[ 1463.760684] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>>[ 1463.760687] kworker/0:8 D 0 5367 2 0x80000000
> >>>>[ 1463.760718] Workqueue: md submit_flushes [md_mod]
> >>>And that MD submit_flushes thing again.
> >>>
> >>>Not to say there isn't some issue between XFS and MD going on here, but
> >>>I think we might want an MD person to take a look at this and possibly
> >>>provide some insight. From an XFS perspective, this all just looks like
> >>>we're blocked on I/O (via writeback, AIL and log) to a slow device.
> >>>
> >>>Brian
> >>>
> >>>>[ 1463.760721] Call Trace:
> >>>>[ 1463.760731] ? __schedule+0x29a/0x880
> >>>>[ 1463.760741] ? wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.760746] schedule+0x78/0x110
> >>>>[ 1463.760753] wait_barrier+0xdd/0x170 [raid10]
> >>>>[ 1463.760761] ? wait_woken+0x80/0x80
> >>>>[ 1463.760768] raid10_write_request+0xf2/0x900 [raid10]
> >>>>[ 1463.760774] ? wait_woken+0x80/0x80
> >>>>[ 1463.760778] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.760795] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.760801] ? try_to_wake_up+0x44/0x470
> >>>>[ 1463.760810] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.760816] ? wait_woken+0x80/0x80
> >>>>[ 1463.760831] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.760851] md_make_request+0x78/0x190 [md_mod]
> >>>>[ 1463.760860] generic_make_request+0x1c6/0x470
> >>>>[ 1463.760870] raid10_write_request+0x77a/0x900 [raid10]
> >>>>[ 1463.760875] ? wait_woken+0x80/0x80
> >>>>[ 1463.760879] ? mempool_alloc+0x55/0x160
> >>>>[ 1463.760895] ? md_write_start+0xa9/0x270 [md_mod]
> >>>>[ 1463.760904] raid10_make_request+0xc1/0x120 [raid10]
> >>>>[ 1463.760910] ? wait_woken+0x80/0x80
> >>>>[ 1463.760926] md_handle_request+0x121/0x190 [md_mod]
> >>>>[ 1463.760931] ? _raw_spin_unlock_irq+0x22/0x40
> >>>>[ 1463.760936] ? finish_task_switch+0x74/0x260
> >>>>[ 1463.760954] submit_flushes+0x21/0x40 [md_mod]
> >>>>[ 1463.760962] process_one_work+0x1fd/0x420
> >>>>[ 1463.760970] worker_thread+0x2d/0x3d0
> >>>>[ 1463.760976] ? rescuer_thread+0x340/0x340
> >>>>[ 1463.760981] kthread+0x112/0x130
> >>>>[ 1463.760986] ? kthread_create_worker_on_cpu+0x40/0x40
> >>>>[ 1463.760992] ret_from_fork+0x3a/0x50
> 
> -- 
> Srdačan pozdrav/Best regards,
> Siniša Bandin
> 
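
For anyone else hitting this, two workaround sketches that follow from what
reportedly works above (untested here; mdX, sdaX, and sdbX are placeholders).
Either skip the initial resync entirely on a freshly created mirror, which is
only reasonable when the whole device will be written before it is read back:

# mdadm -C -n2 -l10 -po2 --assume-clean /dev/mdX /dev/sdaX /dev/sdbX

or freeze the resync before the heavy writes and let md resume it afterwards
via the standard sync_action knob:

# echo frozen > /sys/block/mdX/md/sync_action
# ... do the writes ...
# echo idle > /sys/block/mdX/md/sync_action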


