Re: xfs_db causing OOM failures while running fstests on 64k blocksize

On Tue, Jun 01, 2021 at 07:30:59PM +0530, riteshh wrote:
> Hello Darrick,
> 
> With 64K blocksize, we are running into OOM failures triggered by xfs_db on
> our test machines. Because of this, we are often unable to get a complete
> fstests report for xfs on Power. The problem reproduces quite easily with
> 64K blocksize.
> 
> I remember you mentioning a solution for this in one of the calls, something
> about patches you may already have in your tree, but also that there could be
> some more work required in this area to fix the issue.

Yeah, we last discussed this on-list in 2018:

https://lore.kernel.org/fstests/151803837117.19313.3481001926133028039.stgit@magnolia/

The patch has stalled ever since, because I don't often have time to run
the full offline fuzz test suite comparing xfs_check and xfs_repair to
make sure there is nothing that makes check complain while repair stays
silent.  It would probably be a good idea to compare scrub and repair
too, but as scrub is still experimental I don't consider that a blocker
for disabling xfs_check.

IIRC I /did/ actually fix the last of the "check complains but repair
doesn't" bugs last summer, but I never got around to re-checking after
all the patches got merged.

If you want to do that, all you have to do is run the dangerous_repair
group tests with check/repair comparison:

# SCRATCH_XFS_FUZZ_CHECK=1 ./check -g dangerous_repair

Though you probably want a way to run each test individually in parallel
on a vm farm somewhere.

> Sorry, I don't completely recollect our discussion; it was something around
> xfs_check and xfs_repair. Could you please help us understand once again why
> this issue happens? (even with plenty of RAM and a small disk)

xfs_check allocates a large amount of memory to track the state of each
block in the filesystem.  Individually.  See init() in db/check.c.

Repair tracks block state by extent, which is why it scales better
unless the fs is severely fragmented.
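
Roughly, the difference looks like this.  (A minimal C sketch with
made-up types, just to show the scaling; this is not the actual data
structure in db/check.c or in repair.)

#include <stdint.h>
#include <stdlib.h>

/*
 * Per-block tracking, check style: one state entry for every block
 * in the filesystem, so memory grows linearly with the size of the
 * fs no matter how the space is actually used.
 */
static uint8_t *alloc_block_map(uint64_t nblocks)
{
	return calloc(nblocks, sizeof(uint8_t));
}

/*
 * Per-extent tracking, repair style: one record per contiguous run
 * of blocks in the same state, so memory grows with the number of
 * extents, i.e. with fragmentation, not with fs size.
 */
struct ext_rec {
	uint64_t	start;	/* first block of the run */
	uint64_t	len;	/* number of blocks in the run */
	uint8_t		state;
};

static struct ext_rec *alloc_extent_map(uint64_t nextents)
{
	return calloc(nextents, sizeof(struct ext_rec));
}

int main(void)
{
	/* 1T filesystem with 4k blocks -> 2^28 blocks */
	uint64_t nblocks = (1ULL << 40) / 4096;

	/* check-style map: ~268M entries even if the fs is empty */
	uint8_t *bmap = alloc_block_map(nblocks);

	/* repair-style map: a mostly-contiguous fs needs only a
	 * handful of records */
	struct ext_rec *emap = alloc_extent_map(128);

	free(bmap);
	free(emap);
	return 0;
}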

> And any known workaround for this?

Don't run xfs_check, or only run fstests with small disks.  (Both suck.)

> Also, would you like some help with this work? Since it is needed for Power
> anyway, someone from our team could also start working on this.

Yes please.

--D

> 
> e.g. log:
> generic/234		[12:08:15][ 5800.944162] run fstests generic/234 at 2021-04-24 12:08:15
> [ 5802.188921] XFS (vdd): Mounting V5 Filesystem
> [ 5802.280632] XFS (vdd): Ending clean mount
> [ 5802.337718] xfs filesystem being mounted at /vdd supports timestamps until 2038 (0x7fffffff)
> [ 5803.069534] XFS (vdc): Mounting V5 Filesystem
> [ 5803.121686] XFS (vdc): Ending clean mount
> [ 5803.123598] XFS (vdc): Quotacheck needed: Please wait.
> [ 5803.166977] XFS (vdc): Quotacheck: Done.
> [ 5803.170940] xfs filesystem being mounted at /vdc supports timestamps until 2038 (0x7fffffff)
> [ 5831.708588] XFS (vdc): Unmounting Filesystem
> [ 5832.530761] XFS (vdd): Unmounting Filesystem
> [ 5854.828260] xfs_db invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
> [ 5854.828717] CPU: 0 PID: 16727 Comm: xfs_db Not tainted 5.12.0-rc8-00043-g8db5efb83fa9 #20
> [ 5854.828868] Call Trace:
> [ 5854.828916] [c0000000204b7750] [c000000000c0ce88] dump_stack+0xec/0x144 (unreliable)
> [ 5854.829145] [c0000000204b7790] [c00000000042a054] dump_header+0x64/0x414
> [ 5854.829306] [c0000000204b7810] [c000000000427cb8] oom_kill_process+0x108/0x350
> [ 5854.829495] [c0000000204b7850] [c000000000429dd4] out_of_memory+0x874/0x9b0
> [ 5854.829647] [c0000000204b78f0] [c0000000004c95b8] __alloc_pages_slowpath.constprop.85+0xe98/0x11e0
> [ 5854.829852] [c0000000204b7ac0] [c0000000004c9c1c] __alloc_pages_nodemask+0x31c/0x500
> [ 5854.830029] [c0000000204b7b50] [c0000000004f77e4] alloc_pages_vma+0x2b4/0x320
> [ 5854.830209] [c0000000204b7bc0] [c0000000004913c4] __handle_mm_fault+0xb14/0x1790
> [ 5854.830389] [c0000000204b7ca0] [c000000000492390] handle_mm_fault+0x350/0x4c0
> [ 5854.830568] [c0000000204b7d00] [c00000000009c914] ___do_page_fault+0x9a4/0xd30
> [ 5854.830750] [c0000000204b7db0] [c00000000009ccd4] __do_page_fault+0x34/0x90
> [ 5854.830901] [c0000000204b7de0] [c0000000000a6268] do_hash_fault+0x48/0x90
> [ 5854.831053] [c0000000204b7e10] [c000000000008994] data_access_common_virt+0x194/0x1f0
> <...>
> [ 5854.833718] --- interrupt: 300
> [ 5854.833797] Mem-Info:
> [ 5854.833866] active_anon:15 inactive_anon:158676 isolated_anon:0
> [ 5854.833866]  active_file:14 inactive_file:0 isolated_file:0
> [ 5854.833866]  unevictable:0 dirty:0 writeback:0
> [ 5854.833866]  slab_reclaimable:741 slab_unreclaimable:1835
> [ 5854.833866]  mapped:101 shmem:493 pagetables:77 bounce:0
> [ 5854.833866]  free:174 free_pcp:2 free_cma:0
> [ 5854.834514] Node 0 active_anon:960kB inactive_anon:10155264kB active_file:896kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:6464kB dirty:0kB writeback:0kB shmem:31552kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB kernel_stack:1280kB pagetables:4928kB all_unreclaimable? yes
> [ 5854.835088] Node 0 Normal free:11136kB min:12800kB low:23104kB high:33408kB reserved_highatomic:0KB active_anon:960kB inactive_anon:10155264kB active_file:896kB inactive_file:0kB unevictable:0kB writepending:0kB present:10485760kB managed:10399232kB mlocked:0kB bounce:0kB free_pcp:128kB local_pcp:128kB free_cma:0kB
> [ 5854.835641] lowmem_reserve[]: 0 0 0
> [ 5854.835787] Node 0 Normal: 94*64kB (UE) 24*128kB (U) 4*256kB (U) 2*512kB (U) 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 11136kB
> [ 5854.836091] 507 total pagecache pages
> [ 5854.836182] 0 pages in swap cache
> [ 5854.836273] Swap cache stats: add 0, delete 0, find 0/0
> [ 5854.836369] Free swap  = 0kB
> [ 5854.836463] Total swap = 0kB
> [ 5854.836554] 163840 pages RAM
> [ 5854.836645] 0 pages HighMem/MovableOnly
> [ 5854.836737] 1352 pages reserved
> [ 5854.836830] 0 pages cma reserved
> [ 5854.836923] 0 pages hwpoisoned
> [ 5854.837034] Tasks state (memory values in pages):
> [ 5854.837151] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
> [ 5854.837355] [   1397]     0  1397      512      117    43008        0             0 systemd-journal
> [ 5854.837755] [   1406]     0  1406      122       26    38912        0             0 blkmapd
> [ 5854.837926] [   1606]     0  1606      337       47    40960        0         -1000 systemd-udevd
> [ 5854.838121] [   2496]   101  2496     1502       74    43008        0             0 systemd-timesyn
> [ 5854.838314] [   2498]   106  2498      176       48    38912        0             0 rpcbind
> [ 5854.838483] [   2503]   104  2503      167       47    38912        0          -900 dbus-daemon
> [ 5854.838676] [   2504]     0  2504      126       18    38912        0             0 kvm-xfstests.bo
> [ 5854.838869] [   2505]     0  2505      344       72    40960        0             0 systemd-logind
> [ 5854.839062] [   2537]     0  2537      190       63    38912        0             0 login
> [ 5854.839231] [   2538]     0  2538       60       10    38912        0             0 agetty
> [ 5854.839400] [   2540]     0  2540      190       63    38912        0             0 login
> [ 5854.839570] [   2541]     0  2541      190       63    38912        0             0 login
> [ 5854.839739] [   2613]     0  2613      132       22    38912        0             0 runtests.sh
> [ 5854.839934] [   3114]     0  3114      144       35    38912        0             0 bash
> [ 5854.840105] [   3115]     0  3115      144       35    38912        0             0 bash
> [ 5854.840274] [   3116]     0  3116      144       35    38912        0             0 bash
> [ 5854.840442] [  15152]     0 15152      182       75    38912        0         -1000 bash
> [ 5854.840610] [  16727]     0 16727   157168   157096  1294336        0         -1000 xfs_db
> 
> -ritesh


