[PATCH 00/15] blkcg ref count refactor/cleanup + blkcg avg_lat

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi everyone,

This is a fairly lengthy patchset that aims to cleanup reference
counting for blkcgs and blkgs. There are 4 problems that this patchset
tries to address:
  1. fix blkcg destruction
  2. always associate a bio with a blkg
  3. remove the extra css ref held by bios and utilize the blkg ref
  4. add average latency tracking to blkcg core in io.stat.

First, there is a regression in blkcg destruction where references
weren't properly put causing blkcgs to never be destroyed. Previously,
blkgs were destroyed during offlining of the blkcg. This puts back the
blkcg reference a blkg holds allowing blkcg ref to reach zero. Then,
blkcg_css_free() is called as part of the final cleanup.

To address the first problem, 0001 reverts the broken commit, 0002
delays blkg destruction until writeback has finished, and 0003 closes
the window on a race condition between a css migration and dying, and 
blkg association. This should fix the issue where blkg_get() was getting
called when a blkcg had already begun exiting. If a bio finds itself
here, it will just fall back to root. Oddly enough at one point,
blk-throttle was using policy data from and associating with potentially
different blkgs, thus how this was exposed.

0004 also address a similar problem with task_css(current, ...) where
association tries to get a css of a task that is migrating with the
cgroup dying.

Second, both blk-throttle and blk-iolatency rely on blkg association
to enable their policies. Rather than each policy (and future policies)
implement this logic independently, this consolidates it such that
all bios are tagged with a blkg.

Third, with the addition of always having a blkg reference, the blkcg
can now be referenced through it rather than maintaining an additional
pointer and reference. So let's clean this up.

Finally, it seems rather useful to know on average how well IOs are
doing per cgroup. This adds the average latency statistic to core where
it encompasses IOs from all decendants.

This patchset contains the following 15 patches:

  0001-Revert-blk-throttle-fix-race-between-blkcg_bio_issue.patch
  0002-blkcg-delay-blkg-destruction-until-after-writeback-h.patch
  0003-blkcg-use-tryget-logic-when-associating-a-blkg-with-.patch
  0004-blkcg-fix-ref-count-issue-with-bio_blkcg-using-task_.patch
  0005-blkcg-update-blkg_lookup_create-to-do-locking.patch
  0006-blkcg-always-associate-a-bio-with-a-blkg.patch
  0007-blkcg-consolidate-bio_issue_init-and-blkg-associatio.patch
  0008-blkcg-associate-a-blkg-for-pages-being-evicted-by-sw.patch
  0009-blkcg-associate-writeback-bios-with-a-blkg.patch
  0010-blkcg-remove-bio-bi_css-and-instead-use-bio-bi_blkg.patch
  0011-blkcg-remove-additional-reference-to-the-css.patch
  0012-blkcg-cleanup-and-make-blk_get_rl-use-blkg_lookup_cr.patch
  0013-blkcg-change-blkg-reference-counting-to-use-percpu_r.patch
  0014-blkcg-rename-blkg_try_get-to-blkg_tryget.patch
  0015-blkcg-add-average-latency-tracking-to-blk-cgroup.patch

0001-0003 addresses the regression in the blkcg cleanup path.
0004 fixes a small window race condition with task_css(current, ...).
0005 is a prepatory patch that cleans up blkg lookup create.
0006-0009 associates all bios with a blkg: regular IO, swap, writeback.
0010 removes the extra css pointer in bios.
0011 removes the implicit reference left behind in 0010.
0012 cleans up blk_get_rl making use of the new blkg ref
0013 changes blkg ref counting from atomic to percpu.
0014 renames and makes blkg_try_get consistent with css_tryget.
0015 adds average latency tracking as part of blk-cgroup.

This patchset is on top of axboe#for-4.19/block #b86d865cb1ca.

diffstats below:

Dennis Zhou (Facebook) (15):
  Revert "blk-throttle: fix race between blkcg_bio_issue_check() and
    cgroup_rmdir()"
  blkcg: delay blkg destruction until after writeback has finished
  blkcg: use tryget logic when associating a blkg with a bio
  blkcg: fix ref count issue with bio_blkcg using task_css
  blkcg: update blkg_lookup_create to do locking
  blkcg: always associate a bio with a blkg
  blkcg: consolidate bio_issue_init and blkg association
  blkcg: associate a blkg for pages being evicted by swap
  blkcg: associate writeback bios with a blkg
  blkcg: remove bio->bi_css and instead use bio->bi_blkg
  blkcg: remove additional reference to the css
  blkcg: cleanup and make blk_get_rl use blkg_lookup_create
  blkcg: change blkg reference counting to use percpu_ref
  blkcg: rename blkg_try_get to blkg_tryget
  blkcg: add average latency tracking to blk-cgroup

 Documentation/admin-guide/cgroup-v2.rst |  14 +-
 block/bio.c                             | 187 +++++++++++----
 block/blk-cgroup.c                      | 298 +++++++++++++++++-------
 block/blk-iolatency.c                   |  26 +--
 block/blk-throttle.c                    |  12 +-
 block/bounce.c                          |   4 +-
 drivers/block/loop.c                    |   5 +-
 drivers/md/raid0.c                      |   2 +-
 fs/buffer.c                             |  10 +-
 fs/ext4/page-io.c                       |   2 +-
 include/linux/bio.h                     |  23 +-
 include/linux/blk-cgroup.h              | 167 ++++++++-----
 include/linux/blk_types.h               |   1 -
 include/linux/cgroup.h                  |   2 +
 include/linux/writeback.h               |   5 +-
 kernel/cgroup/cgroup.c                  |   4 +-
 kernel/trace/blktrace.c                 |   4 +-
 mm/backing-dev.c                        |   5 +
 mm/page_io.c                            |   2 +-
 19 files changed, 520 insertions(+), 253 deletions(-)

Thanks,
Dennis



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

  Powered by Linux