On Mon, Aug 17, 2020 at 12:09:08AM +0100, Matthew Wilcox wrote:
> On Mon, Aug 17, 2020 at 08:56:20AM +1000, Dave Chinner wrote:
> > Indeed, most filesystems will not be able to implement ADS as
> > xattrs. xattrs are implemented as atomically journalled metadata on
> > most filesystems, they cannot be used like a seekable file by
> > userspace at all. If you want ADS to masquerade as an xattr, then
> > you have to graft the entire file IO path onto filesystem xattrs,
> > and that just ain't gonna work without a -lot- of development in
> > every filesystem that wants to support ADS.
> >
> > We've already got a perfectly good presentation layer for user data
> > files that are accessed by file descriptors (i.e. directories
> > containing files), so that should be the presentation layer you seek
> > to extend.
> >
> > IOWs, trying to abuse xattrs for ADS support is a non-starter.
>
> One thing Dave didn't mention is that a directory can have xattrs,
> forks and files (and acls).  So your presentation layer needs to not
> confuse one thing for another.

I'd stop calling these "forks" already, too. The user wants
"alternate data streams", while a "resource fork" is an internal
filesystem implementation detail used to provide ADS
functionality...

e.g. an XFS inode has a "data fork" which contains the extent tree
that points at user data. This is a seekable fork. Directories are
also implemented internally in the data fork as directories are
seekable.

OTOH, the XFS inode has an "attr fork" which contains a key-value
btree which only supports record based operations, i.e. records can
only be atomically updated via transactions. This is not a seekable
data store. xattrs are stored in this data store. The key-value
store supports multiple namespaces (e.g. system vs user) so things
like ACLs and security information can be stored as xattrs and not
be visible as user xattrs.
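As a rough illustration of that difference, here is a toy Python
model of the two access styles. It is only a sketch: DataFork,
AttrFork and their methods are made-up names for this example, not
XFS code or any kernel API, and the "atomic transaction" is reduced
to a single dictionary assignment.

```python
class DataFork:
    """Toy seekable store: reads and writes address arbitrary byte offsets."""

    def __init__(self):
        self._bytes = bytearray()

    def pwrite(self, data: bytes, offset: int) -> int:
        end = offset + len(data)
        if end > len(self._bytes):
            # Writing past the current end grows the store, zero-filling any gap.
            self._bytes.extend(b"\0" * (end - len(self._bytes)))
        self._bytes[offset:end] = data
        return len(data)

    def pread(self, length: int, offset: int) -> bytes:
        return bytes(self._bytes[offset:offset + length])


class AttrFork:
    """Toy key-value store: whole records only, grouped by namespace."""

    def __init__(self):
        self._records = {}

    def set(self, namespace: str, name: str, value: bytes) -> None:
        # The whole record is replaced in one step; there is no offset
        # argument, so a value cannot be partially overwritten.
        self._records[(namespace, name)] = bytes(value)

    def get(self, namespace: str, name: str) -> bytes:
        return self._records[(namespace, name)]

    def list(self, namespace: str) -> list:
        # Only names in the requested namespace are visible.
        return sorted(n for (ns, n) in self._records if ns == namespace)


# Hypothetical usage: the same two writes behave differently in each store.
data = DataFork()
data.pwrite(b"hello world", 0)
data.pwrite(b"HELLO", 0)            # partial in-place overwrite

attrs = AttrFork()
attrs.set("user", "comment", b"hello world")
attrs.set("user", "comment", b"HELLO")   # replaces the whole record
attrs.set("system", "acl", b"acl-blob")  # not listed under "user"
```

The data store keeps the tail of the first write after the partial
overwrite, while the attribute store only ever holds the last
complete value, and the "system" record does not show up when
listing the "user" namespace.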
On the gripping hand, the XFS inode also has a virtual "COW fork"
which is used to track data fork regions that are in the process of
undergoing a copy-on-write operation. This is a shadow extent tree
that tracks the new location of the data until writeback occurs and
then the new location is atomically swapped back into the data fork.
This fork does not get exposed to userspace, nor does it ever end up
on disk - users do not know this fork even exists.

IOWs, historically speaking, a "fork" is something that is used to
implement different storage types and address spaces within an
inode; it's not a feature that is exposed to users and userspace.

To implement ADS, we'd likely consider adding a new physical inode
"ADS fork" which, internally, maps to a separate directory
structure. This provides us with the ADS namespace for each inode
and a mechanism that instantiates a physical inode per ADS. IOWs,
each ADS can be referenced by the VFS natively and independently as
an inode (native "file as a directory" semantics). Hence existing
create/unlink APIs work for managing ADS, readdir() can list all
your ADS, you can keep per-ADS xattrs, etc....

IOWs, with a filesystem inode fork implementation like this for ADS,
all we really need is for the VFS to pass a magic command to
->lookup() to tell us to use the ADS namespace attached to the inode
rather than use the primary inode type/state to perform the
operation.

Hence all the ADS support infrastructure is essentially dentry cache
infrastructure allowing a dentry to be both a file and directory,
and providing the pathname resolution that recognises an ADS
redirection.

Name that however you want - we've got to do an on-disk format
change to support ADS, so we can tell the VFS whether we support ADS
or not. And we have no cares about existing names in the filesystem
conflicting with the ADS pathname identifier because it's a mkfs
time decision. Given that special flags are needed for the openat()
call to resolve an ADS (e.g.
O_ALT), we know if we should parse the ADS identifier as an ADS the
moment it is seen...

> I don't understand why a fork would be permitted to have its own
> permissions.  That makes no sense.  Silly Solaris.

I can't think of a reason why, either, but the above implementation
for XFS would support it if the presentation layer allows it... :)

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx