On 12/13/23 13:28, Kent Overstreet wrote:
> On Wed, Dec 13, 2023 at 08:37:57AM +0100, Donald Buczek wrote:
>> Probably not for the specific applications I mentioned (backup, mirror,
>> accounting). These are intended to run continuously, slowly and unnoticed
>> in the background, so they are memory and i/o throttled via cgroups anyway
>> and one is even using sleep after so-and-so many stat calls to reduce
>> its impact.
>>
>> If they could tell a directory from a snapshot, I would probably stop them
>> from walking into snapshots. And if not, the snapshot id is all that is
>> needed to tell a clone in a snapshot from a hardlink. So these don't really
>> need the filehandle.
>
> Perhaps we should allocate a bit for differentiating a snapshot from a
> non snapshot subvolume?

Are there non-snapshot subvolumes?

From debugfs bcachefs/../btrees, I've got the impression that every
volume starts with a (single) snapshot.

new filesystem:

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 parent 0

snapshots
=========
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent 0 children 0 0 subvol 1 tree 1 depth 0 skiplist 0 0 0

`bcachefs subvolume create /mnt/v`

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 parent 0
u64s 10 type subvolume 0:2:0 len 0 ver 0: root 1207959552 snapshot id 4294967294 parent 0

snapshots
=========
u64s 10 type snapshot 0:4294967294:0 len 0 ver 0: is_subvol 1 deleted 0 parent 0 children 0 0 subvol 2 tree 2 depth 0 skiplist 0 0 0
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent 0 children 0 0 subvol 1 tree 1 depth 0 skiplist 0 0 0

`bcachefs subvolume snapshot /mnt/v /mnt/s`

subvolumes
==========
u64s 10 type subvolume 0:1:0 len 0 ver 0: root 4096 snapshot id 4294967295 parent 0
u64s 10 type subvolume 0:2:0 len 0 ver 0: root 1207959552 snapshot id 4294967292 parent 0
u64s 10 type subvolume 0:3:0 len 0 ver 0: root 1207959552 snapshot id 4294967293 parent 2

snapshots
=========
u64s 10 type snapshot 0:4294967292:0 len 0 ver 0: is_subvol 1 deleted 0 parent 4294967294 children 0 0 subvol 2 tree 2 depth 1 skiplist 4294967294 4294967294 4294967294
u64s 10 type snapshot 0:4294967293:0 len 0 ver 0: is_subvol 1 deleted 0 parent 4294967294 children 0 0 subvol 3 tree 2 depth 1 skiplist 4294967294 4294967294 4294967294
u64s 10 type snapshot 0:4294967294:0 len 0 ver 0: is_subvol 0 deleted 0 parent 0 children 4294967293 4294967292 subvol 0 tree 2 depth 0 skiplist 0 0 0
u64s 10 type snapshot 0:4294967295:0 len 0 ver 0: is_subvol 1 deleted 0 parent 0 children 0 0 subvol 1 tree 1 depth 0 skiplist 0 0 0

Now reading and interpreting the filehandles:

/mnt/. type 177 : 00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00 : inode 0000000000001000 subvolume 00000001 generation 00000000
/mnt/v type 177 : 00 00 00 48 00 00 00 00 02 00 00 00 00 00 00 00 : inode 0000000048000000 subvolume 00000002 generation 00000000
/mnt/s type 177 : 00 00 00 48 00 00 00 00 03 00 00 00 00 00 00 00 : inode 0000000048000000 subvolume 00000003 generation 00000000

So is there really a type difference between the objects created by
`bcachefs subvolume create` and `bcachefs subvolume snapshot`? It appears
that they both point to a volume which points to a snapshot in the
snapshot tree.

Best
  Donald

>> In the thread it was assumed that there are other (unspecified)
>> applications which need the filehandle and currently use name_to_handle_at().
>>
>> I thought it was self-evident that a single syscall to retrieve all
>> information atomically is better than a set of syscalls. Each additional
>> syscall has overhead and you need to be concerned with the data changing
>> between the calls.
>
> All other things being equal, yeah it would be. But things are never
> equal :)
>
> Expanding struct statx is not going to be as easy as hoped, so we need
> to be a bit careful how we use the remaining space, and since as Dave
> pointed out the filehandle isn't needed for checking uniqueness unless
> nlink > 1 it's not really a hotpath in any application I can think of.
>
> (If anyone does know of an application where it might matter, now's the
> time to bring it up!)
>
>> Userspace nfs server as an example of an application, where visible
>> performance is more relevant, was already mentioned by someone else.
>
> I'd love to hear confirmation from someone more intimately familiar with
> NFS, but AFAIK it shouldn't matter there; the filehandle exists to
> support resuming IO or other operations to a file (because the server
> can go away and come back). If all the client did was a stat, there's no
> need for a filehandle - that's not needed until a file is opened.

-- 
Donald Buczek
buczek@xxxxxxxxxxxxx
Tel: +49 30 8413 1433
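
For reference, one way the 16 handle bytes quoted above can be read is
sketched below. The layout (little-endian 64-bit inode, then 32-bit
subvolume, then 32-bit generation) is an assumption inferred only from the
dumps in this mail, not taken from the kernel source, and decode_handle is
a hypothetical helper name.

```python
import struct

def decode_handle(hex_bytes: str):
    """Decode a 16-byte handle dump into (inode, subvolume, generation).

    Assumed layout, inferred from the dumps in this mail:
    little-endian u64 inode, u32 subvolume, u32 generation.
    """
    raw = bytes.fromhex(hex_bytes)  # whitespace between byte pairs is allowed
    if len(raw) != 16:
        raise ValueError("expected 16 handle bytes, got %d" % len(raw))
    return struct.unpack("<QII", raw)

# The three handles from this mail.
handles = {
    "/mnt/.": "00 10 00 00 00 00 00 00 01 00 00 00 00 00 00 00",
    "/mnt/v": "00 00 00 48 00 00 00 00 02 00 00 00 00 00 00 00",
    "/mnt/s": "00 00 00 48 00 00 00 00 03 00 00 00 00 00 00 00",
}

for path, dump in handles.items():
    inode, subvol, gen = decode_handle(dump)
    print("%s inode %d subvolume %d generation %d" % (path, inode, subvol, gen))
```

Under that assumption the decoded inode numbers equal the "root" fields of
the subvolume keys (4096 and 1207959552), and /mnt/v and /mnt/s differ only
in the subvolume field.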