On Mon, Jul 24, 2023 at 10:34:20AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 12, 2023 at 05:11:00PM -0400, Kent Overstreet wrote:
> > bio_iov_iter_get_pages() trims the IO based on the block size of the
> > block device the IO will be issued to.
> >
> > However, bcachefs is a multi device filesystem; when we're creating the
> > bio we don't yet know which block device the bio will be submitted to -
> > we have to handle the alignment checks elsewhere.
>
> So, we've been trying really hard to always make sure to pass a bdev
> to anything that allocates a bio, mostly due to the fact that we
> actually derive information like the blk-cgroup associations from it.
>
> The whole blk-cgroup stuff is actually a problem for non-trivial
> multi-device setups.  XFS gets away fine because each file just
> sits on either the main or RT device and no user I/O goes to the
> log device, and btrfs papers over it in a weird way by always
> associating with the last added device, which is in many ways gross
> and wrong, but at least satisfies the assumptions made in blk-cgroup.
>
> How do you plan to deal with this?  Because I really don't want folks
> just to go ahead and ignore the issues, we need to actually sort this
> out.

Doing the blk-cgroup association at bio alloc time sounds broken to me,
because of stacking block devices - why was the association not done at
generic_make_request() time?