Hi, On 08/11/19 12:48 AM, Tejun Heo wrote: > blk-cgroup has been using blkg_rwstat to track basic IO stats. > Unfortunately, reading recursive stats scales badly as itinvolves > walking all descendants. On systems with a huge number of cgroups > (dead or alive), this can lead to substantial CPU cost when reading IO > stats. > > This patch reimplements basic IO stats using cgroup rstat which uses > more memory but makes recursive stat reading O(# descendants which > have been active since last reading) instead of O(# descendants). > > * blk-cgroup core no longer uses sync/async stats. Introduce new stat > enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}. > > * Add blkg_iostat[_set] which encapsulates byte and io stats, last > values for propagation delta calculation and u64_stats_sync for > correctness on 32bit archs. > > * Update the new percpu stat counters directly and implement > blkcg_rstat_flush() to implement propagation. > > * blkg_print_stat() can now bring the stats up to date by calling > cgroup_rstat_flush() and print them instead of directly summing up > all descendants. > > * It now allocates 96 bytes per cpu. It used to be 40 bytes. > > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx> > Cc: Dan Schatzberg <dschatzberg@xxxxxx> > Cc: Daniel Xu <dlxu@xxxxxx> > --- I bisected a Kernel OOPs issue to this patch on linux-next. Any idea why this is happening? Here is the log: [ 32.033025] 8<--- cut here --- [ 32.036136] Unable to handle kernel paging request at virtual address 2e83803c [ 32.043637] pgd = 75330198 [ 32.046360] [2e83803c] *pgd=00000000 [ 32.050008] Internal error: Oops: 5 [#1] SMP ARM [ 32.054647] Modules linked in: [ 32.057724] CPU: 0 PID: 780 Comm: (systemd) Tainted: G W 5.4.0-rc7-next-20191113 #172 [ 32.066893] Hardware name: Generic AM33XX (Flattened Device Tree) [ 32.073026] PC is at cgroup_rstat_updated+0x30/0xe8 [ 32.077939] LR is at generic_make_request_checks+0x3d4/0x748 [ 32.083621] pc : [<c01e6f50>] lr : [<c04af820>] psr: a0040013 [ 32.089912] sp : ed9b3b78 ip : 2e838000 fp : ed826c00 [ 32.095156] r10: 00001000 r9 : 00000000 r8 : ff7ff428 [ 32.100402] r7 : c0d05148 r6 : c0d0554c r5 : c0c8b9ec r4 : edb26180 [ 32.106954] r3 : 2e838000 r2 : 2e838000 r1 : 00000000 r0 : eda32000 [ 32.113510] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 32.120674] Control: 10c5387d Table: adac0019 DAC: 00000051 [ 32.126444] Process (systemd) (pid: 780, stack limit = 0x5087843c) [ 32.132648] Stack: (0xed9b3b78 to 0xed9b4000) [ 32.137022] 3b60: edb26180 eee19550 [ 32.145237] 3b80: 2e838000 c0d05148 ff7ff428 c04af820 00000004 00000800 0074e7f8 00000000 [ 32.153452] 3ba0: a0040093 c08d1798 00000000 80040093 00002000 00000008 00000007 edb8168c [ 32.161667] 3bc0: 00000000 00000000 ffffe000 71b97da9 00000022 edb26180 c0d05148 00000008 [ 32.169882] 3be0: c0d05148 00000001 00000000 edb26180 00000000 c04b0ad8 00000000 00000000 [ 32.178097] 3c00: edb81a00 ed826c00 ed826cc4 71b97da9 c0de2c7c edb26180 c0d05148 00000008 [ 32.186312] 3c20: 00000001 00000001 00000000 0005fcfd 00000000 c04b0de0 c0de2c88 edb81600 [ 32.194526] 3c40: ed826800 0005fcfd 00000000 c04ce968 00001000 c0d05148 edb26180 efd29a84 [ 32.202741] 3c60: 00000000 00000000 0005fcfd 71b97da9 ed9b3c7b 00001000 00000001 00000001 [ 32.210956] 3c80: 00000001 00000001 00000000 0005fcfd 00000000 c039bdb0 20040013 00000001 [ 32.219170] 3ca0: 00000001 00000000 0005fcfd 00000000 ed9b3cc0 00000001 efd29a84 00000000 [ 32.227385] 3cc0: 00000000 ed9b3e04 edb26180 ec8421b0 00000001 ec842100 0000000c ec8422b8 [ 32.235600] 3ce0: 0005fcfd 00000000 00000fff 00000000 ee2a7b40 00080000 00000000 00112cca [ 32.243814] 3d00: ec8422bc c02983e0 0005fcfd 00000000 00000000 00000001 00000000 00000008 [ 32.252028] 3d20: 0005fcfd 00000000 00000000 eef82400 00000010 00000000 00000004 ed9b3e88 [ 32.260242] 3d40: 00000000 ed9b3d68 00000000 00000003 00000000 c0d05148 60040013 c01837f4 [ 32.268457] 3d60: 00000000 71b97da9 00000000 00000001 00000001 c03783bc ec8422b8 ed9b3e04 [ 32.276671] 3d80: ed9b3e04 00000001 ec8422bc c0378404 00000001 00000000 ec8421b0 c0255360 [ 32.284886] 3da0: eeee0000 ed9d2180 ed9b3da8 ed9b3da8 ed9b3db0 ed9b3db0 00000000 71b97da9 [ 32.293101] 3dc0: 00000000 00000001 00000001 00000000 00000003 ed9b3e04 00000000 00112cca [ 32.301316] 3de0: ec8422bc c025563c 00112cca 00000000 00000000 00000001 ec8422b8 ed9d2180 [ 32.309531] 3e00: ed9b3dfc ed9b3e04 ed9b3e04 71b97da9 ec8422b8 ed9d21e8 ed9d2180 ec8422b8 [ 32.317746] 3e20: 00000000 00000001 ffffffff 00000000 ed9d2180 c0255b8c 00000003 00000001 [ 32.325961] 3e40: ec8421b0 ed9b3f00 00000000 00000000 ec8422b8 c024b73c 00000001 beba6ca0 [ 32.334175] 3e60: c0d05148 00000000 00000000 beba6ca0 ed9b3ee8 ed9d2180 00000051 00000000 [ 32.342389] 3e80: ed9d21e8 00000001 ffffffff 00000fff 000081a4 00000001 000003e8 000003e8 [ 32.350604] 3ea0: 00000000 00000000 00000000 71b97da9 000000d2 ed9d2180 c0d05148 00000000 [ 32.358819] 3ec0: 00000000 ed9b3f78 00001000 00000000 00000000 c02bdb6c 00001000 00020000 [ 32.367033] 3ee0: 0058b9c8 00001000 00000004 00000000 00001000 ed9b3ee0 00000001 00000000 [ 32.375248] 3f00: ed9d2180 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 32.383463] 3f20: 00000000 00000000 00000000 71b97da9 0058b9c8 00000001 00001000 ed9b3f78 [ 32.391678] 3f40: ed9d2180 00000000 00000000 c02bdc78 00000000 eda3ce1c eda3cc00 ed9d2180 [ 32.399893] 3f60: ed9d2180 c0d05148 0058b9c8 00001000 ed9b2000 c02bdf68 00000000 00000000 [ 32.408107] 3f80: 000005e8 71b97da9 005868d8 b6c02f41 000005e8 00000003 c0101204 00000003 [ 32.416322] 3fa0: 00000000 c01011e0 005868d8 b6c02f41 00000007 0058b9c8 00001000 00000000 [ 32.424537] 3fc0: 005868d8 b6c02f41 000005e8 00000003 0000000a beba6e88 00000000 00000000 [ 32.432753] 3fe0: 00000000 beba6d24 b6c037e1 b6c3e4b8 40040030 00000007 00000000 00000000 [ 32.440982] [<c01e6f50>] (cgroup_rstat_updated) from [<c04af820>] (generic_make_request_checks+0x3d4/0 x748) [ 32.450770] [<c04af820>] (generic_make_request_checks) from [<c04b0ad8>] (generic_make_request+0x1c/0x 2e4) [ 32.460468] [<c04b0ad8>] (generic_make_request) from [<c04b0de0>] (submit_bio+0x40/0x1b4) [ 32.468686] [<c04b0de0>] (submit_bio) from [<c039bdb0>] (ext4_mpage_readpages+0x704/0x904) [ 32.476995] [<c039bdb0>] (ext4_mpage_readpages) from [<c0378404>] (ext4_readpages+0x48/0x50) [ 32.485481] [<c0378404>] (ext4_readpages) from [<c0255360>] (read_pages+0x50/0x154) [ 32.493175] [<c0255360>] (read_pages) from [<c025563c>] (__do_page_cache_readahead+0x1d8/0x1f8) [ 32.501914] [<c025563c>] (__do_page_cache_readahead) from [<c0255b8c>] (page_cache_sync_readahead+0xa0 /0xf4) [ 32.511799] [<c0255b8c>] (page_cache_sync_readahead) from [<c024b73c>] (generic_file_read_iter+0x75c/0 xc40) [ 32.521594] [<c024b73c>] (generic_file_read_iter) from [<c02bdb6c>] (__vfs_read+0x138/0x1bc) [ 32.530073] [<c02bdb6c>] (__vfs_read) from [<c02bdc78>] (vfs_read+0x88/0x114) [ 32.537241] [<c02bdc78>] (vfs_read) from [<c02bdf68>] (ksys_read+0x54/0xd0) [ 32.544237] [<c02bdf68>] (ksys_read) from [<c01011e0>] (__sys_trace_return+0x0/0x20) [ 32.552010] Exception stack(0xed9b3fa8 to 0xed9b3ff0) [ 32.557085] 3fa0: 005868d8 b6c02f41 00000007 0058b9c8 00001000 00000000 [ 32.565300] 3fc0: 005868d8 b6c02f41 000005e8 00000003 0000000a beba6e88 00000000 00000000 [ 32.573512] 3fe0: 00000000 beba6d24 b6c037e1 b6c3e4b8 [ 32.578591] Code: ee073fba e7962101 e5903168 e0823003 (e593303c) [ 32.584889] ---[ end trace 08d6b7172e3ff29b ]--- [ 32.797983] 8<--- cut here --- [ 32.801090] Unable to handle kernel paging request at virtual address 2e83803c [ 32.808421] pgd = f285aa90 [ 32.811140] [2e83803c] *pgd=00000000 [ 32.814739] Internal error: Oops: 5 [#2] SMP ARM [ 32.819378] Modules linked in: [ 32.822453] CPU: 0 PID: 527 Comm: login Tainted: G D W 5.4.0-rc7-next-20191113 #172 [ 32.831273] Hardware name: Generic AM33XX (Flattened Device Tree) [ 32.837406] PC is at cgroup_rstat_updated+0x30/0xe8 [ 32.842320] LR is at generic_make_request_checks+0x3d4/0x748 [ 32.848002] pc : [<c01e6f50>] lr : [<c04af820>] psr: a0070013 [ 32.854292] sp : edbdfb78 ip : 2e838000 fp : eda49c00 [ 32.859537] r10: 00001000 r9 : 00000000 r8 : ff7fff60 [ 32.864782] r7 : c0d05148 r6 : c0d0554c r5 : c0c8b9ec r4 : edd8c6c0 [ 32.871335] r3 : 2e838000 r2 : 2e838000 r1 : 00000000 r0 : ed9dec00 [ 32.877891] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 32.885056] Control: 10c5387d Table: adb40019 DAC: 00000051 [ 32.890826] Process login (pid: 527, stack limit = 0x1deade48) [ 32.896681] Stack: (0xedbdfb78 to 0xedbe0000) [ 32.901056] fb60: edd8c6c0 eee19550 [ 32.909271] fb80: 2e838000 c0d05148 ff7fff60 c04af820 c0d0554c c01023dc 0074e7f8 00000000 [ 32.917487] fba0: 0000000a ffff979f 00400100 71b97da9 ee81ba00 ffffe000 00000000 c0d0554c [ 32.925702] fbc0: c0c90e7c 00000000 c0d0554c 71b97da9 00000001 edd8c6c0 c0d05148 00000008 [ 32.933916] fbe0: c0d05148 00000001 00000000 edd8c6c0 00000000 c04b0ad8 00000000 c0101aec [ 32.942130] fc00: 00000000 00000000 00001000 71b97da9 edd8c6c0 edd8c6c0 c0d05148 00000008 [ 32.950345] fc20: 00000001 00000001 00000000 0005fba9 00000000 c04b0de0 c04a8f24 c04a8340 [ 32.958560] fc40: 20070013 ffffffff 00000051 bf000000 00001000 c0d05148 edd8c6c0 efd47fac [ 32.966775] fc60: 00000000 00000000 0005fba9 71b97da9 edbdfc7b 00001000 00000001 00000001 [ 32.974990] fc80: 00000001 00000001 00000000 0005fba9 00000000 c039bdb0 20070013 00000001 [ 32.983204] fca0: 00000001 00000000 0005fba9 00000000 edbdfcc0 00000001 efd47fac 00000000 [ 32.991418] fcc0: 00000000 edbdfe04 edd8c6c0 ec85dd70 00000001 ec85dcc0 0000000c ec85de78 [ 32.999633] fce0: 0005fba9 00000000 00000fff 00000000 ee2a7b40 00080000 00000000 00112cca [ 33.007848] fd00: ec85de7c c02983e0 0005fba9 00000000 00000000 00000001 00000000 00000008 [ 33.016061] fd20: 0005fba9 00000000 00000000 eef82400 00000010 00000000 00000004 edbdfe88 [ 33.024276] fd40: 00000000 edbdfd68 00000000 00000003 00000000 c0d05148 60070013 c01837f4 [ 33.032491] fd60: 00000000 71b97da9 00000000 00000001 00000001 c03783bc ec85de78 edbdfe04 [ 33.040705] fd80: edbdfe04 00000001 ec85de7c c0378404 00000001 00000000 ec85dd70 c0255360 [ 33.048919] fda0: eeee0000 ed952d80 edbdfda8 edbdfda8 edbdfdb0 edbdfdb0 00000000 71b97da9 [ 33.057134] fdc0: 00000000 00000001 00000001 00000000 00000003 edbdfe04 00000000 00112cca [ 33.065348] fde0: ec85de7c c025563c 00112cca 00000000 00000000 00000001 ec85de78 ed952d80 [ 33.073563] fe00: edbdfdfc edbdfe04 edbdfe04 71b97da9 ec85de78 ed952de8 ed952d80 ec85de78 [ 33.081777] fe20: 00000000 00000001 ffffffff 00000000 ed952d80 c0255b8c 00000003 00000001 [ 33.089992] fe40: ec85dd70 edbdff00 00000000 00000000 ec85de78 c024b73c 00000001 00000041 [ 33.098206] fe60: ffffe000 00000000 00000000 00000000 edbdfee8 ed952d80 00000000 00000000 [ 33.106422] fe80: ed952de8 00000001 ffffffff 00000fff edbdfe8c 71b97da9 000003e8 00000004 [ 33.114637] fea0: edbdff70 c0d05148 00000001 71b97da9 edbde000 ed952d80 c0d05148 00000000 [ 33.122852] fec0: 00000000 edbdff78 000003e8 00000000 00000000 c02bdb6c 000003e8 00020000 [ 33.131066] fee0: 000365a8 000003e8 00000004 00000000 000003e8 edbdfee0 00000001 00000000 [ 33.139280] ff00: ed952d80 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 33.147495] ff20: 00000000 00000000 00000000 71b97da9 000365a8 00000001 000003e8 edbdff78 [ 33.155709] ff40: ed952d80 00000000 00000000 c02bdc78 00000000 edd7721c edd77000 ed952d80 [ 33.163923] ff60: ed952d80 c0d05148 000365a8 000003e8 edbde000 c02bdf68 00000000 00000000 [ 33.172137] ff80: 00000000 71b97da9 000003e8 be9bb7ac 00000000 00000003 c0101204 00000003 [ 33.180353] ffa0: 00000000 c01011e0 000003e8 be9bb7ac 00000004 000365a8 000003e8 00000000 [ 33.188567] ffc0: 000003e8 be9bb7ac 00000000 00000003 00000004 000365a8 b6d74d64 00000000 [ 33.196782] ffe0: 00000000 be9bb704 b6f7516c b6ef64b8 60070030 00000004 00000000 00000000 [ 33.205010] [<c01e6f50>] (cgroup_rstat_updated) from [<c04af820>] (generic_make_request_checks+0x3d4/0 x748) [ 33.214800] [<c04af820>] (generic_make_request_checks) from [<c04b0ad8>] (generic_make_request+0x1c/0x 2e4) [ 33.224495] [<c04b0ad8>] (generic_make_request) from [<c04b0de0>] (submit_bio+0x40/0x1b4) [ 33.232714] [<c04b0de0>] (submit_bio) from [<c039bdb0>] (ext4_mpage_readpages+0x704/0x904) [ 33.241023] [<c039bdb0>] (ext4_mpage_readpages) from [<c0378404>] (ext4_readpages+0x48/0x50) [ 33.249509] [<c0378404>] (ext4_readpages) from [<c0255360>] (read_pages+0x50/0x154) [ 33.257203] [<c0255360>] (read_pages) from [<c025563c>] (__do_page_cache_readahead+0x1d8/0x1f8) [ 33.265943] [<c025563c>] (__do_page_cache_readahead) from [<c0255b8c>] (page_cache_sync_readahead+0xa0 /0xf4) [ 33.275826] [<c0255b8c>] (page_cache_sync_readahead) from [<c024b73c>] (generic_file_read_iter+0x75c/0 xc40) [ 33.285621] [<c024b73c>] (generic_file_read_iter) from [<c02bdb6c>] (__vfs_read+0x138/0x1bc) [ 33.294099] [<c02bdb6c>] (__vfs_read) from [<c02bdc78>] (vfs_read+0x88/0x114) [ 33.301268] [<c02bdc78>] (vfs_read) from [<c02bdf68>] (ksys_read+0x54/0xd0) [ 33.308264] [<c02bdf68>] (ksys_read) from [<c01011e0>] (__sys_trace_return+0x0/0x20) [ 33.316038] Exception stack(0xedbdffa8 to 0xedbdfff0) [ 33.321112] ffa0: 000003e8 be9bb7ac 00000004 000365a8 000003e8 00000000 [ 33.329327] ffc0: 000003e8 be9bb7ac 00000000 00000003 00000004 000365a8 b6d74d64 00000000 [ 33.337540] ffe0: 00000000 be9bb704 b6f7516c b6ef64b8 [ 33.342619] Code: ee073fba e7962101 e5903168 e0823003 (e593303c) [ 33.348850] ---[ end trace 08d6b7172e3ff29c ]--- Thanks, Faiz