Re: [PATCH] block: remove redundant blk-cgroup init from __bio_clone

Dennis Zhou <dennis@xxxxxxxxxx> · Tue, 12 Apr 2022 00:52:13 -0700

On Mon, Apr 11, 2022 at 10:27:54PM -0700, Christoph Hellwig wrote:
> On Mon, Apr 11, 2022 at 01:33:58PM -0400, Mike Snitzer wrote:
> > When bio_{alloc,init}_clone are passed a bdev, bio_init() will call
> > bio_associate_blkg() so the __bio_clone() work to initialize blkcg
> > isn't needed.
> 
> No, unfortunately it isn't as simple as that.  There are bios that do
> not use the default cgroup and thus blkg, e.g. those that come from
> cgroup writeback.

Yeah, I wasn't quite right earlier. But the new API isn't in line with
the original semantics. Cloning the blkg preserves the original bio's
request_queue, which likely differs from that of the bdev passed into
clone. This means an IO might be charged to the wrong device.

So, the blkg combines the who (the blkcg) and the where (the
corresponding request_queue). Before, bios were inited in 2 phases:
    bio_alloc();
    bio_set_dev();

This meant at clone time, we didn't have the where, but the who was
encased in the blkg. So, bio_clone_blkg_association() expected a
subsequent bio_set_dev() call, which calls bio_associate_blkg(). When
the bio already has a blkg, bio_associate_blkg() attempts to reuse its
blkcg while using the new bdev to find the correct blkg.
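
To make the who/where split concrete, here is a toy userspace model of
that flow. It is only a sketch: every toy_* type and function is a
made-up stand-in, not the kernel's structures or signatures, and the
"default" blkcg is collapsed into a single root object for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical, NOT the kernel definitions). */
struct toy_queue { const char *name; };                 /* the "where" */
struct toy_blkcg { const char *name; };                 /* the "who"   */
struct toy_blkg  { struct toy_blkcg *blkcg; struct toy_queue *q; };
struct toy_bdev  { struct toy_queue *q; };
struct toy_bio   { struct toy_bdev *bdev; struct toy_blkg *blkg; };

#define TOY_MAX_BLKGS 8
static struct toy_blkg toy_blkgs[TOY_MAX_BLKGS];
static size_t toy_nr_blkgs;

/* Assumption: one root blkcg plays the role of the default cgroup. */
static struct toy_blkcg toy_root_blkcg = { "root" };

/* Find the blkg for the (blkcg, queue) pair, creating it if needed. */
static struct toy_blkg *toy_blkg_lookup_create(struct toy_blkcg *blkcg,
                                               struct toy_queue *q)
{
    for (size_t i = 0; i < toy_nr_blkgs; i++)
        if (toy_blkgs[i].blkcg == blkcg && toy_blkgs[i].q == q)
            return &toy_blkgs[i];

    assert(toy_nr_blkgs < TOY_MAX_BLKGS);
    toy_blkgs[toy_nr_blkgs].blkcg = blkcg;
    toy_blkgs[toy_nr_blkgs].q = q;
    return &toy_blkgs[toy_nr_blkgs++];
}

/*
 * Models the role of bio_associate_blkg(): keep the "who" if the bio
 * already carries a blkg, otherwise fall back to the default blkcg,
 * then pair it with the queue of the bio's current bdev.
 */
static void toy_associate_blkg(struct toy_bio *bio)
{
    struct toy_blkcg *blkcg = bio->blkg ? bio->blkg->blkcg
                                        : &toy_root_blkcg;

    bio->blkg = toy_blkg_lookup_create(blkcg, bio->bdev->q);
}

/* Models the role of bio_set_dev(): set the "where", then re-associate. */
static void toy_set_dev(struct toy_bio *bio, struct toy_bdev *bdev)
{
    bio->bdev = bdev;
    toy_associate_blkg(bio);
}

/*
 * Models the role of bio_clone_blkg_association(): copy the blkg as-is.
 * The clone still points at the source's queue until toy_set_dev() runs.
 */
static void toy_clone_blkg_association(struct toy_bio *dst,
                                       const struct toy_bio *src)
{
    dst->blkg = src->blkg;
}
```

In this model, a clone of a writeback bio on queue A that goes through
toy_clone_blkg_association() and then toy_set_dev() with a bdev on
queue B ends up with a blkg of (writeback blkcg, queue B). Skipping the
toy_set_dev() step leaves it charged to queue A, and skipping the clone
step loses the writeback blkcg.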

The tricky part seems to be how to seamlessly expose the appropriate
blkcg without being intrusive to the bio_alloc*() APIs.

Regarding the NULL bdev, I think that works as long as we keep the
bio_clone_blkg_association() call to carry the correct blkcg to the
bio_set_dev() call.

Thanks,
Dennis