Re: [PATCH 5/6] blk-cgroup: reimplement basic IO stats using cgroup rstat

Hi,

On 08/11/19 12:48 AM, Tejun Heo wrote:
> blk-cgroup has been using blkg_rwstat to track basic IO stats.
> Unfortunately, reading recursive stats scales badly as it involves
> walking all descendants.  On systems with a huge number of cgroups
> (dead or alive), this can lead to substantial CPU cost when reading IO
> stats.
> 
> This patch reimplements basic IO stats using cgroup rstat which uses
> more memory but makes recursive stat reading O(# descendants which
> have been active since last reading) instead of O(# descendants).
> 
> * blk-cgroup core no longer uses sync/async stats.  Introduce new stat
>   enums - BLKG_IOSTAT_{READ|WRITE|DISCARD}.
> 
> * Add blkg_iostat[_set] which encapsulates byte and io stats, last
>   values for propagation delta calculation and u64_stats_sync for
>   correctness on 32bit archs.
> 
> * Update the new percpu stat counters directly and implement
>   blkcg_rstat_flush() to implement propagation.
> 
> * blkg_print_stat() can now bring the stats up to date by calling
>   cgroup_rstat_flush() and print them instead of directly summing up
>   all descendants.
> 
> * It now allocates 96 bytes per cpu.  It used to be 40 bytes.
> 
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Cc: Dan Schatzberg <dschatzberg@xxxxxx>
> Cc: Daniel Xu <dlxu@xxxxxx>
> ---
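
For readers following the thread, the "last values for propagation delta
calculation" idea in the description above can be modelled in userspace.
This is only a toy sketch in Python, not the kernel code: the Counter and
Group classes, the dirty flag and the method names are all hypothetical,
and the real rstat code tracks updated cgroups per CPU rather than with a
single flag per group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Counter:
    cur: int = 0   # running total
    last: int = 0  # value of cur at the previous propagation

    def take_delta(self) -> int:
        """Return the change since the last call and move the baseline."""
        delta = self.cur - self.last
        self.last = self.cur
        return delta


class Group:
    """Hypothetical stand-in for one cgroup's IO byte counters."""

    def __init__(self, name: str, parent: Optional["Group"] = None,
                 ncpus: int = 2) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Group"] = []
        self.percpu = [Counter() for _ in range(ncpus)]
        self.total = Counter()  # own IO plus flushed descendants
        self.dirty = False      # updated since the last flush?
        if parent is not None:
            parent.children.append(self)

    def account(self, cpu: int, nbytes: int) -> None:
        """Hot path: bump a per-CPU counter and mark the path to the root."""
        self.percpu[cpu].cur += nbytes
        g: Optional[Group] = self
        # A dirty group always has dirty ancestors, so stop at the first one.
        while g is not None and not g.dirty:
            g.dirty = True
            g = g.parent

    def flush(self) -> int:
        """Fold pending deltas upward; return how many groups were processed."""
        if not self.dirty:
            return 0
        visited = 1
        for child in self.children:      # children first, then this group
            visited += child.flush()
        for c in self.percpu:
            self.total.cur += c.take_delta()
        if self.parent is not None:
            self.parent.total.cur += self.total.take_delta()
        self.dirty = False
        return visited


if __name__ == "__main__":
    root = Group("root")
    a, b = Group("a", root), Group("b", root)
    a1 = Group("a1", a)
    a1.account(0, 4096)
    a1.account(1, 512)
    print(root.flush(), root.total.cur)  # 3 4608 (b was never processed)
    print(root.flush())                  # 0 (nothing changed since)
```

In this model a read costs work only for groups that were active since the
previous read, which is the scaling property the description claims; the
price is the extra "last" value kept next to every counter.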

I bisected a kernel oops on linux-next to this patch. Any idea why
this is happening? Here is the log:

[   32.033025] 8<--- cut here ---
[   32.036136] Unable to handle kernel paging request at virtual address 2e83803c
[   32.043637] pgd = 75330198
[   32.046360] [2e83803c] *pgd=00000000
[   32.050008] Internal error: Oops: 5 [#1] SMP ARM
[   32.054647] Modules linked in:
[   32.057724] CPU: 0 PID: 780 Comm: (systemd) Tainted: G        W 5.4.0-rc7-next-20191113 #172
[   32.066893] Hardware name: Generic AM33XX (Flattened Device Tree)
[   32.073026] PC is at cgroup_rstat_updated+0x30/0xe8
[   32.077939] LR is at generic_make_request_checks+0x3d4/0x748
[   32.083621] pc : [<c01e6f50>]    lr : [<c04af820>]    psr: a0040013
[   32.089912] sp : ed9b3b78  ip : 2e838000  fp : ed826c00
[   32.095156] r10: 00001000  r9 : 00000000  r8 : ff7ff428
[   32.100402] r7 : c0d05148  r6 : c0d0554c  r5 : c0c8b9ec  r4 : edb26180
[   32.106954] r3 : 2e838000  r2 : 2e838000  r1 : 00000000  r0 : eda32000
[   32.113510] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   32.120674] Control: 10c5387d  Table: adac0019  DAC: 00000051
[   32.126444] Process (systemd) (pid: 780, stack limit = 0x5087843c)
[   32.132648] Stack: (0xed9b3b78 to 0xed9b4000)
[   32.137022] 3b60:                                                       edb26180 eee19550
[   32.145237] 3b80: 2e838000 c0d05148 ff7ff428 c04af820 00000004 00000800 0074e7f8 00000000
[   32.153452] 3ba0: a0040093 c08d1798 00000000 80040093 00002000 00000008 00000007 edb8168c
[   32.161667] 3bc0: 00000000 00000000 ffffe000 71b97da9 00000022 edb26180 c0d05148 00000008
[   32.169882] 3be0: c0d05148 00000001 00000000 edb26180 00000000 c04b0ad8 00000000 00000000
[   32.178097] 3c00: edb81a00 ed826c00 ed826cc4 71b97da9 c0de2c7c edb26180 c0d05148 00000008
[   32.186312] 3c20: 00000001 00000001 00000000 0005fcfd 00000000 c04b0de0 c0de2c88 edb81600
[   32.194526] 3c40: ed826800 0005fcfd 00000000 c04ce968 00001000 c0d05148 edb26180 efd29a84
[   32.202741] 3c60: 00000000 00000000 0005fcfd 71b97da9 ed9b3c7b 00001000 00000001 00000001
[   32.210956] 3c80: 00000001 00000001 00000000 0005fcfd 00000000 c039bdb0 20040013 00000001
[   32.219170] 3ca0: 00000001 00000000 0005fcfd 00000000 ed9b3cc0 00000001 efd29a84 00000000
[   32.227385] 3cc0: 00000000 ed9b3e04 edb26180 ec8421b0 00000001 ec842100 0000000c ec8422b8
[   32.235600] 3ce0: 0005fcfd 00000000 00000fff 00000000 ee2a7b40 00080000 00000000 00112cca
[   32.243814] 3d00: ec8422bc c02983e0 0005fcfd 00000000 00000000 00000001 00000000 00000008
[   32.252028] 3d20: 0005fcfd 00000000 00000000 eef82400 00000010 00000000 00000004 ed9b3e88
[   32.260242] 3d40: 00000000 ed9b3d68 00000000 00000003 00000000 c0d05148 60040013 c01837f4
[   32.268457] 3d60: 00000000 71b97da9 00000000 00000001 00000001 c03783bc ec8422b8 ed9b3e04
[   32.276671] 3d80: ed9b3e04 00000001 ec8422bc c0378404 00000001 00000000 ec8421b0 c0255360
[   32.284886] 3da0: eeee0000 ed9d2180 ed9b3da8 ed9b3da8 ed9b3db0 ed9b3db0 00000000 71b97da9
[   32.293101] 3dc0: 00000000 00000001 00000001 00000000 00000003 ed9b3e04 00000000 00112cca
[   32.301316] 3de0: ec8422bc c025563c 00112cca 00000000 00000000 00000001 ec8422b8 ed9d2180
[   32.309531] 3e00: ed9b3dfc ed9b3e04 ed9b3e04 71b97da9 ec8422b8 ed9d21e8 ed9d2180 ec8422b8
[   32.317746] 3e20: 00000000 00000001 ffffffff 00000000 ed9d2180 c0255b8c 00000003 00000001
[   32.325961] 3e40: ec8421b0 ed9b3f00 00000000 00000000 ec8422b8 c024b73c 00000001 beba6ca0
[   32.334175] 3e60: c0d05148 00000000 00000000 beba6ca0 ed9b3ee8 ed9d2180 00000051 00000000
[   32.342389] 3e80: ed9d21e8 00000001 ffffffff 00000fff 000081a4 00000001 000003e8 000003e8
[   32.350604] 3ea0: 00000000 00000000 00000000 71b97da9 000000d2 ed9d2180 c0d05148 00000000
[   32.358819] 3ec0: 00000000 ed9b3f78 00001000 00000000 00000000 c02bdb6c 00001000 00020000
[   32.367033] 3ee0: 0058b9c8 00001000 00000004 00000000 00001000 ed9b3ee0 00000001 00000000
[   32.375248] 3f00: ed9d2180 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   32.383463] 3f20: 00000000 00000000 00000000 71b97da9 0058b9c8 00000001 00001000 ed9b3f78
[   32.391678] 3f40: ed9d2180 00000000 00000000 c02bdc78 00000000 eda3ce1c eda3cc00 ed9d2180
[   32.399893] 3f60: ed9d2180 c0d05148 0058b9c8 00001000 ed9b2000 c02bdf68 00000000 00000000
[   32.408107] 3f80: 000005e8 71b97da9 005868d8 b6c02f41 000005e8 00000003 c0101204 00000003
[   32.416322] 3fa0: 00000000 c01011e0 005868d8 b6c02f41 00000007 0058b9c8 00001000 00000000
[   32.424537] 3fc0: 005868d8 b6c02f41 000005e8 00000003 0000000a beba6e88 00000000 00000000
[   32.432753] 3fe0: 00000000 beba6d24 b6c037e1 b6c3e4b8 40040030 00000007 00000000 00000000
[   32.440982] [<c01e6f50>] (cgroup_rstat_updated) from [<c04af820>] (generic_make_request_checks+0x3d4/0x748)
[   32.450770] [<c04af820>] (generic_make_request_checks) from [<c04b0ad8>] (generic_make_request+0x1c/0x2e4)
[   32.460468] [<c04b0ad8>] (generic_make_request) from [<c04b0de0>] (submit_bio+0x40/0x1b4)
[   32.468686] [<c04b0de0>] (submit_bio) from [<c039bdb0>] (ext4_mpage_readpages+0x704/0x904)
[   32.476995] [<c039bdb0>] (ext4_mpage_readpages) from [<c0378404>] (ext4_readpages+0x48/0x50)
[   32.485481] [<c0378404>] (ext4_readpages) from [<c0255360>] (read_pages+0x50/0x154)
[   32.493175] [<c0255360>] (read_pages) from [<c025563c>] (__do_page_cache_readahead+0x1d8/0x1f8)
[   32.501914] [<c025563c>] (__do_page_cache_readahead) from [<c0255b8c>] (page_cache_sync_readahead+0xa0/0xf4)
[   32.511799] [<c0255b8c>] (page_cache_sync_readahead) from [<c024b73c>] (generic_file_read_iter+0x75c/0xc40)
[   32.521594] [<c024b73c>] (generic_file_read_iter) from [<c02bdb6c>] (__vfs_read+0x138/0x1bc)
[   32.530073] [<c02bdb6c>] (__vfs_read) from [<c02bdc78>] (vfs_read+0x88/0x114)
[   32.537241] [<c02bdc78>] (vfs_read) from [<c02bdf68>] (ksys_read+0x54/0xd0)
[   32.544237] [<c02bdf68>] (ksys_read) from [<c01011e0>] (__sys_trace_return+0x0/0x20)
[   32.552010] Exception stack(0xed9b3fa8 to 0xed9b3ff0)
[   32.557085] 3fa0:                   005868d8 b6c02f41 00000007 0058b9c8 00001000 00000000
[   32.565300] 3fc0: 005868d8 b6c02f41 000005e8 00000003 0000000a beba6e88 00000000 00000000
[   32.573512] 3fe0: 00000000 beba6d24 b6c037e1 b6c3e4b8
[   32.578591] Code: ee073fba e7962101 e5903168 e0823003 (e593303c)
[   32.584889] ---[ end trace 08d6b7172e3ff29b ]---
[   32.797983] 8<--- cut here ---
[   32.801090] Unable to handle kernel paging request at virtual address 2e83803c
[   32.808421] pgd = f285aa90
[   32.811140] [2e83803c] *pgd=00000000
[   32.814739] Internal error: Oops: 5 [#2] SMP ARM
[   32.819378] Modules linked in:
[   32.822453] CPU: 0 PID: 527 Comm: login Tainted: G      D W 5.4.0-rc7-next-20191113 #172
[   32.831273] Hardware name: Generic AM33XX (Flattened Device Tree)
[   32.837406] PC is at cgroup_rstat_updated+0x30/0xe8
[   32.842320] LR is at generic_make_request_checks+0x3d4/0x748
[   32.848002] pc : [<c01e6f50>]    lr : [<c04af820>]    psr: a0070013
[   32.854292] sp : edbdfb78  ip : 2e838000  fp : eda49c00
[   32.859537] r10: 00001000  r9 : 00000000  r8 : ff7fff60
[   32.864782] r7 : c0d05148  r6 : c0d0554c  r5 : c0c8b9ec  r4 : edd8c6c0
[   32.871335] r3 : 2e838000  r2 : 2e838000  r1 : 00000000  r0 : ed9dec00
[   32.877891] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   32.885056] Control: 10c5387d  Table: adb40019  DAC: 00000051
[   32.890826] Process login (pid: 527, stack limit = 0x1deade48)
[   32.896681] Stack: (0xedbdfb78 to 0xedbe0000)
[   32.901056] fb60:                                                       edd8c6c0 eee19550
[   32.909271] fb80: 2e838000 c0d05148 ff7fff60 c04af820 c0d0554c c01023dc 0074e7f8 00000000
[   32.917487] fba0: 0000000a ffff979f 00400100 71b97da9 ee81ba00 ffffe000 00000000 c0d0554c
[   32.925702] fbc0: c0c90e7c 00000000 c0d0554c 71b97da9 00000001 edd8c6c0 c0d05148 00000008
[   32.933916] fbe0: c0d05148 00000001 00000000 edd8c6c0 00000000 c04b0ad8 00000000 c0101aec
[   32.942130] fc00: 00000000 00000000 00001000 71b97da9 edd8c6c0 edd8c6c0 c0d05148 00000008
[   32.950345] fc20: 00000001 00000001 00000000 0005fba9 00000000 c04b0de0 c04a8f24 c04a8340
[   32.958560] fc40: 20070013 ffffffff 00000051 bf000000 00001000 c0d05148 edd8c6c0 efd47fac
[   32.966775] fc60: 00000000 00000000 0005fba9 71b97da9 edbdfc7b 00001000 00000001 00000001
[   32.974990] fc80: 00000001 00000001 00000000 0005fba9 00000000 c039bdb0 20070013 00000001
[   32.983204] fca0: 00000001 00000000 0005fba9 00000000 edbdfcc0 00000001 efd47fac 00000000
[   32.991418] fcc0: 00000000 edbdfe04 edd8c6c0 ec85dd70 00000001 ec85dcc0 0000000c ec85de78
[   32.999633] fce0: 0005fba9 00000000 00000fff 00000000 ee2a7b40 00080000 00000000 00112cca
[   33.007848] fd00: ec85de7c c02983e0 0005fba9 00000000 00000000 00000001 00000000 00000008
[   33.016061] fd20: 0005fba9 00000000 00000000 eef82400 00000010 00000000 00000004 edbdfe88
[   33.024276] fd40: 00000000 edbdfd68 00000000 00000003 00000000 c0d05148 60070013 c01837f4
[   33.032491] fd60: 00000000 71b97da9 00000000 00000001 00000001 c03783bc ec85de78 edbdfe04
[   33.040705] fd80: edbdfe04 00000001 ec85de7c c0378404 00000001 00000000 ec85dd70 c0255360
[   33.048919] fda0: eeee0000 ed952d80 edbdfda8 edbdfda8 edbdfdb0 edbdfdb0 00000000 71b97da9
[   33.057134] fdc0: 00000000 00000001 00000001 00000000 00000003 edbdfe04 00000000 00112cca
[   33.065348] fde0: ec85de7c c025563c 00112cca 00000000 00000000 00000001 ec85de78 ed952d80
[   33.073563] fe00: edbdfdfc edbdfe04 edbdfe04 71b97da9 ec85de78 ed952de8 ed952d80 ec85de78
[   33.081777] fe20: 00000000 00000001 ffffffff 00000000 ed952d80 c0255b8c 00000003 00000001
[   33.089992] fe40: ec85dd70 edbdff00 00000000 00000000 ec85de78 c024b73c 00000001 00000041
[   33.098206] fe60: ffffe000 00000000 00000000 00000000 edbdfee8 ed952d80 00000000 00000000
[   33.106422] fe80: ed952de8 00000001 ffffffff 00000fff edbdfe8c 71b97da9 000003e8 00000004
[   33.114637] fea0: edbdff70 c0d05148 00000001 71b97da9 edbde000 ed952d80 c0d05148 00000000
[   33.122852] fec0: 00000000 edbdff78 000003e8 00000000 00000000 c02bdb6c 000003e8 00020000
[   33.131066] fee0: 000365a8 000003e8 00000004 00000000 000003e8 edbdfee0 00000001 00000000
[   33.139280] ff00: ed952d80 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   33.147495] ff20: 00000000 00000000 00000000 71b97da9 000365a8 00000001 000003e8 edbdff78
[   33.155709] ff40: ed952d80 00000000 00000000 c02bdc78 00000000 edd7721c edd77000 ed952d80
[   33.163923] ff60: ed952d80 c0d05148 000365a8 000003e8 edbde000 c02bdf68 00000000 00000000
[   33.172137] ff80: 00000000 71b97da9 000003e8 be9bb7ac 00000000 00000003 c0101204 00000003
[   33.180353] ffa0: 00000000 c01011e0 000003e8 be9bb7ac 00000004 000365a8 000003e8 00000000
[   33.188567] ffc0: 000003e8 be9bb7ac 00000000 00000003 00000004 000365a8 b6d74d64 00000000
[   33.196782] ffe0: 00000000 be9bb704 b6f7516c b6ef64b8 60070030 00000004 00000000 00000000
[   33.205010] [<c01e6f50>] (cgroup_rstat_updated) from [<c04af820>] (generic_make_request_checks+0x3d4/0x748)
[   33.214800] [<c04af820>] (generic_make_request_checks) from [<c04b0ad8>] (generic_make_request+0x1c/0x2e4)
[   33.224495] [<c04b0ad8>] (generic_make_request) from [<c04b0de0>] (submit_bio+0x40/0x1b4)
[   33.232714] [<c04b0de0>] (submit_bio) from [<c039bdb0>] (ext4_mpage_readpages+0x704/0x904)
[   33.241023] [<c039bdb0>] (ext4_mpage_readpages) from [<c0378404>] (ext4_readpages+0x48/0x50)
[   33.249509] [<c0378404>] (ext4_readpages) from [<c0255360>] (read_pages+0x50/0x154)
[   33.257203] [<c0255360>] (read_pages) from [<c025563c>] (__do_page_cache_readahead+0x1d8/0x1f8)
[   33.265943] [<c025563c>] (__do_page_cache_readahead) from [<c0255b8c>] (page_cache_sync_readahead+0xa0/0xf4)
[   33.275826] [<c0255b8c>] (page_cache_sync_readahead) from [<c024b73c>] (generic_file_read_iter+0x75c/0xc40)
[   33.285621] [<c024b73c>] (generic_file_read_iter) from [<c02bdb6c>] (__vfs_read+0x138/0x1bc)
[   33.294099] [<c02bdb6c>] (__vfs_read) from [<c02bdc78>] (vfs_read+0x88/0x114)
[   33.301268] [<c02bdc78>] (vfs_read) from [<c02bdf68>] (ksys_read+0x54/0xd0)
[   33.308264] [<c02bdf68>] (ksys_read) from [<c01011e0>] (__sys_trace_return+0x0/0x20)
[   33.316038] Exception stack(0xedbdffa8 to 0xedbdfff0)
[   33.321112] ffa0:                   000003e8 be9bb7ac 00000004 000365a8 000003e8 00000000
[   33.329327] ffc0: 000003e8 be9bb7ac 00000000 00000003 00000004 000365a8 b6d74d64 00000000
[   33.337540] ffe0: 00000000 be9bb704 b6f7516c b6ef64b8
[   33.342619] Code: ee073fba e7962101 e5903168 e0823003 (e593303c)
[   33.348850] ---[ end trace 08d6b7172e3ff29c ]---
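
A side note on reading the two "Code:" lines, which are identical in both
oopses: the word in parentheses is the faulting instruction. The sketch
below is a hedged illustration in Python, independent of any kernel
tooling. It decodes that word as an A32 immediate-offset LDR and checks
that base register plus offset reproduces the reported fault address. The
function name decode_ldr_imm is made up for this example, and the decoder
deliberately handles only this one encoding.

```python
from typing import Tuple


def decode_ldr_imm(insn: int) -> Tuple[int, int, int]:
    """Decode an A32 'LDR Rt, [Rn, #+/-imm12]' word (offset addressing only).

    Returns (rt, rn, signed_offset) and raises ValueError for anything else.
    """
    # Bits 27..25 == 0b010 select load/store with a 12-bit immediate offset.
    if (insn >> 25) & 0b111 != 0b010:
        raise ValueError("not a load/store immediate encoding")
    p = (insn >> 24) & 1  # 1: offset applied before the access
    u = (insn >> 23) & 1  # 1: add the offset, 0: subtract it
    b = (insn >> 22) & 1  # 1: byte access, 0: word access
    w = (insn >> 21) & 1  # 1: write the address back to Rn
    l = (insn >> 20) & 1  # 1: load, 0: store
    if not (p == 1 and b == 0 and w == 0 and l == 1):
        raise ValueError("not a plain word LDR with offset addressing")
    rn = (insn >> 16) & 0xF
    rt = (insn >> 12) & 0xF
    imm12 = insn & 0xFFF
    return rt, rn, imm12 if u else -imm12


if __name__ == "__main__":
    # Values copied from the log: faulting word and r3 from the register dump.
    rt, rn, off = decode_ldr_imm(0xE593303C)
    regs = {3: 0x2E838000}
    print(f"ldr r{rt}, [r{rn}, #{off:#x}]")  # ldr r3, [r3, #0x3c]
    print(f"{regs[rn] + off:08x}")           # 2e83803c
```

So the access was to r3 + 0x3c with r3 = 2e838000, which matches the
reported address. r2 holds the same value as r3 in both dumps; which
structure field the load was after cannot be told from the log alone.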

Thanks,
Faiz


