On Tue, Oct 31, 2023 at 05:22:46AM -0700, Christoph Hellwig wrote:
> On Tue, Oct 31, 2023 at 01:14:42PM +0100, Christian Brauner wrote:
> > What happens in the kernel right now I've mentioned in the mount api
> > conversion patch for btrfs I sent out in June at [1] because I tweaked
> > that behavior. Say I mount both subvolumes:
> >
> >     mount /dev/sda -o subvol=subvol1 /vol1 # sb1@vfsmount1
> >     mount /dev/sda -o subvol=subvol2 /vol2 # sb1@vfsmount2
> >
> > It creates a superblock for /dev/sda. It then creates two vfsmounts: one
> > for subvol1 and one for subvol2. So you end up with two subvolumes on
> > the same superblock.
> >
> > So if you mount a subvolume today then you already get separate
> > vfsmounts. To put it another way: if you start 10,000 containers each
> > using a separate btrfs subvolume then you get 10,000 vfsmounts.
>
> But only if you mount them explicitly, which you don't have to.

Yep, I'm aware.

> > Or is it that you want a separate superblock per subvolume?
>
> Does "you" refer to me here? No, I don't.
>
> > Because only
> > if you allocate a new superblock you'll get clean device number
> > handling, no? Or am I misunderstanding this?
>
> If you allocate a super block you get it for free. If you don't
> you have to manually allocate it and report it in ->getattr.

So this is effectively a request for:

    btrfs subvolume create /mnt/subvol1

to create vfsmounts? IOW,

    mkfs.btrfs /dev/sda
    mount /dev/sda /mnt
    btrfs subvolume create /mnt/subvol1
    btrfs subvolume create /mnt/subvol2

would create two new vfsmounts that are exposed in
/proc/<pid>/mountinfo afterwards?

That might be odd. Because these vfsmounts aren't really mounted, no?
And so you'd be showing potentially hundreds of mounts in
/proc/<pid>/mountinfo that you can't unmount?

And even if you treat them as mounted, what would unmounting mean? I'm
not saying that it's a show stopper but we would need a clear
understanding of what semantics we're after.

My knee-jerk reaction is that if we wanted each btrfs subvolume to be a
vfsmount then we wouldn't want them to show up in /proc/<pid>/mountinfo
_unless_ they're actually mounted.
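
To make the "two vfsmounts on one superblock" case from the explicit
mount example concrete, here is a rough userspace sketch in Python that
groups /proc/<pid>/mountinfo entries by their major:minor field. The
sample lines are made up for illustration (the mount IDs, the 0:35
device number and the subvolid values are assumptions, not output from
a real system), and the helper names parse_mountinfo and
group_by_superblock are hypothetical. It also assumes that mounts
sharing a superblock report the same major:minor in mountinfo.

```python
from collections import defaultdict

# Made-up sample in /proc/<pid>/mountinfo format. All IDs, the 0:35 device
# number and the subvolid values are illustrative assumptions.
SAMPLE = """\
41 29 0:35 /subvol1 /vol1 rw,relatime shared:20 - btrfs /dev/sda rw,subvolid=256,subvol=/subvol1
42 29 0:35 /subvol2 /vol2 rw,relatime shared:21 - btrfs /dev/sda rw,subvolid=257,subvol=/subvol2
"""


def parse_mountinfo(text):
    """Parse mountinfo-formatted text into a list of dicts (hypothetical helper)."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # too short to be a mountinfo line
        try:
            # Optional fields (e.g. "shared:20") end at a lone "-" separator,
            # which can only appear after the six fixed leading fields.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # no separator found, skip the malformed line
        if len(fields) < sep + 3:
            continue  # fs type or mount source missing
        mounts.append({
            "mount_id": int(fields[0]),
            "parent_id": int(fields[1]),
            "dev": fields[2],          # major:minor
            "root": fields[3],         # root of the mount within the filesystem
            "mount_point": fields[4],
            "fstype": fields[sep + 1],
            "source": fields[sep + 2],
        })
    return mounts


def group_by_superblock(mounts):
    """Group mounts by (major:minor, fs type, source) as a stand-in for 'same superblock'."""
    groups = defaultdict(list)
    for m in mounts:
        groups[(m["dev"], m["fstype"], m["source"])].append(m)
    return dict(groups)


if __name__ == "__main__":
    for key, members in group_by_superblock(parse_mountinfo(SAMPLE)).items():
        print(key, "->", [(m["root"], m["mount_point"]) for m in members])
```

To look at a live system instead of the sample, the same functions can
be fed the contents of /proc/self/mountinfo. Subvolumes that were only
created and never mounted would not appear there today, which is the
behavior the questions above are about.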