Re: [RFC PATCH] getting misc stats/attributes via xattr API

On Tue, May 10, 2022 at 02:40:33PM +0200, Christian Brauner wrote:
> On Tue, May 10, 2022 at 10:55:33AM +1000, Dave Chinner wrote:
> > On Mon, May 09, 2022 at 02:48:15PM +0200, Christian Brauner wrote:
> > > On Tue, May 03, 2022 at 02:23:23PM +0200, Miklos Szeredi wrote:
> > >   I really think users would love to have an interface where they can
> > >   get a struct with binary info back.
> > 
> > No. Not for kernel informational interfaces. We have ioctls and
> 
> That feels like semantics. statx is in all sensible readings of the
> words a kernel informational interface.

statx is a special-purpose binary syscall interface for returning
inode-specific information; it's not an abstract, generic
informational interface.

> I'm really looking at this from the perspective of someone who uses
> these interfaces regularly in userspace and a text-based interface for
> very basic information such as detailed information about a mount is
> cumbersome. I know people like to "counter" with "parsing strings is
> easy" but it remains a giant pain for userspace; at least for basic
> info.

As I said in my last reply, you are making all the same arguments
against text-based information interfaces that were made against proc
and sysfs a long, long time ago. They weren't convincing a couple of
decades ago, and they aren't really convincing now. Text-based
key/value data is hard to screw up in the long run; binary
interfaces have a habit of biting hard whenever the contents of
the binary structure need to change...
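
To make that concrete, here is a minimal Python sketch. Everything in it
is an assumption made up for illustration: the key names (mnt_id, fstype,
propagation), the text format and the struct layouts are hypothetical and
do not describe any existing kernel interface. It only shows why a
consumer of key=value text tends to survive a new field, while a consumer
of a fixed binary layout does not unless versioning was designed in.

```python
import struct

def parse_kv(text):
    """Parse 'key=value' lines into a dict, keeping keys we do not know about."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not key=value
            info[key.strip()] = value.strip()
    return info

def old_consumer(info):
    """A consumer written against the v1 output: it only looks up two keys."""
    return int(info["mnt_id"]), info["fstype"]

# Hypothetical text output before and after a new key is added.
V1_TEXT = "mnt_id=42\nfstype=xfs\n"
V2_TEXT = "mnt_id=42\nfstype=xfs\npropagation=shared\n"

# Hypothetical binary layouts: v2 appends one 32-bit field to v1.
V1_LAYOUT = struct.Struct("<IQ")   # u32 mnt_id, u64 flags
V2_LAYOUT = struct.Struct("<IQI")  # same fields plus u32 propagation

def old_binary_consumer(buf):
    """A consumer built against v1: it expects exactly the v1 size."""
    return V1_LAYOUT.unpack(buf)  # raises struct.error on any other size
```

With these definitions, old_consumer(parse_kv(V2_TEXT)) gives the same
answer as it does for V1_TEXT, because the extra key is simply ignored,
whereas old_binary_consumer(V2_LAYOUT.pack(42, 0, 1)) raises struct.error.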

> > >   Imho, xattrs are a bit like a wonky version of streams already (One of
> > >   the reasons I find them quite unpleasant.). Making mount and other
> > >   information retrievable directly through the getxattr() interface will
> > >   turn them into a full-on streams implementation imho. I'd prefer not
> > >   to do that (Which is another reason I'd prefer at least a separate
> > >   system call.).
> > 
> > And that's a total misunderstanding of what xattrs are.
> > 
> > Alternate data streams are just {file,offset} based data streams
> > accessed via the same read/write() mechanisms as the primary data
> > stream.
> 
> That's why I said "wonky". But I'm not going to argue this point. I
> think you by necessity have wider historical context on these things
> that I lack. But I don't find it unreasonable to also see them as an
> additional information channel.
> 
> Sure, they are a generic key=value store for anything _in principle_. In
> practice however xattrs are very much perceived and used as information
> storage on files, a metadata side-channel if you will.

That's how *you* perceive them, not how everyone perceives them.

> All I'm claiming here is that it will confuse the living hell out of
> users if the getxattr() API is suddenly used not just to set and get
> information associated with inodes but also to provide filesystem or
> mount information.

Why would it confuse people? The xattr namespace is already well
known to be hierarchical and context dependent based on the initial
name prefix (user, security, btrfs, trusted, etc). i.e. if you don't
know that the context the xattr acts on is determined by the initial
name prefix, then you need to read the xattr(7) man page again:

Extended attribute namespaces

	Attribute names are null-terminated strings. The
	attribute name is always specified in the fully qualified
	namespace.attribute form, for example, user.mime_type,
	trusted.md5sum, system.posix_acl_access, or
	security.selinux.

	The namespace mechanism is used to define different classes
	of extended attributes. These different classes exist for
	several reasons; for example, the permissions and
	capabilities required for manipulating extended attributes
	of one namespace may differ to another.

	Currently, the security, system, trusted, and user
	extended attribute classes are defined as described below.
	Additional classes may be added in the future.
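
As a rough illustration of the namespace.attribute form described above,
a small Python sketch follows. The helper names (split_xattr_name,
describe) are invented for this example; the set of classes is the four
that the man page excerpt lists, and any other prefix is just reported as
such.

```python
# The four classes named in the xattr(7) excerpt above.
DOCUMENTED_CLASSES = {"security", "system", "trusted", "user"}

def split_xattr_name(name):
    """Split a fully qualified xattr name into (namespace, attribute)."""
    namespace, sep, attribute = name.partition(".")
    if not sep or not namespace or not attribute:
        raise ValueError(f"not in namespace.attribute form: {name!r}")
    return namespace, attribute

def describe(name):
    """Say which class a name belongs to, based only on its prefix."""
    namespace, attribute = split_xattr_name(name)
    if namespace in DOCUMENTED_CLASSES:
        kind = "documented class"
    else:
        kind = "other prefix"
    return f"{namespace}: {attribute} ({kind})"
```

Only the first dot separates the class from the rest, so a name such as
system.posix_acl_access splits into "system" and "posix_acl_access", and
the prefix alone tells the caller what context the attribute acts on.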

> That's a totally different type of information. Sure, it may
> fit well in the key=value scheme because the xattr key=value _interface_
> is generic but that's a very technical argument.

Yet adding a new xattr namespace for a new class of information that
is associated with the mount that the path/inode/fd belongs to is
exactly what the xattr namespaces are intended to allow. And it is
clearly documented that new classes "may be added in the future".

I just don't see where the confusion would come from...

> 
> I'm looking at this from the experience of a user of the API for a
> moment and in code they'd do in one place:
> 
> getxattr('/super/special/binary', "security.capability", ...);
> 
> and then in another place they do:
> 
> getxattr('/path/to/some/mount', "mntns:info", ...);
> 
> that is just flatout confusing.

Why? Both are getting different classes of key/value information
that is specific to the given path. Just because one is on-disk and
the other is ephemeral doesn't make it in any way confusing. This is
exactly what xattr namespaces are intended to support...
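
From the caller's side, the two lookups quoted above can go through the
same code path. A minimal Python sketch, assuming Linux (os.getxattr is
only available there): read_attr is a helper invented for this example,
the paths are the placeholders from the quoted text, and "mntns:info" is
the name used in that example rather than an interface that exists today.

```python
import os

def read_attr(path, name):
    """Return the raw value of one xattr, or None if it cannot be read."""
    getxattr = getattr(os, "getxattr", None)  # only present on Linux
    if getxattr is None:
        return None
    try:
        return getxattr(path, name)
    except OSError:
        # Missing path, no such attribute (ENODATA), unsupported
        # namespace or filesystem (ENOTSUP), and so on.
        return None

# Same call, different class of information, selected by the name alone.
caps = read_attr("/super/special/binary", "security.capability")
info = read_attr("/path/to/some/mount", "mntns:info")  # name from the quoted example
```

The caller does not need to know whether the value came from disk or was
generated on the fly; it gets bytes back or it gets nothing.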

> > Xattrs provide an *atomic key-value object store API*, not an offset
> > based data stream API. They are completely different beasts,
> > intended for completely different purposes. ADS are redundant when you
> > have directories and files, whilst an atomic key-value store is
> > something completely different.
> > 
> > You do realise we have an independent, scalable, ACID compliant
> > key-value object store in every inode in an XFS filesystem, right?
> 
> So far this was a mail with really good background information but I'm
> struggling to make sense of what that last sentence is trying to tell
> me. :)

That people in the past have built large scale data storage
applications that use XFS inodes as key-based object stores, not as
an offset-based data stream. Who needs atomic write() functionality
when you have ACID set and replace operations for named objects?
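
For anyone who hasn't used xattrs this way, a minimal Python sketch of
create and replace on named objects, assuming Linux and a filesystem that
allows user.* xattrs on the target file. The helper names are invented
for this example; the XATTR_CREATE and XATTR_REPLACE flags are the ones
documented in setxattr(2). The atomicity itself comes from the
filesystem, not from anything this code does.

```python
import errno
import os

def create_object(path, name, value):
    """Create a new named object; return False if the name already exists."""
    try:
        os.setxattr(path, name, value, os.XATTR_CREATE)
        return True
    except FileExistsError:  # EEXIST: the attribute is already there
        return False

def replace_object(path, name, value):
    """Replace an existing named object; return False if there is none."""
    try:
        os.setxattr(path, name, value, os.XATTR_REPLACE)
        return True
    except OSError as exc:
        if exc.errno == errno.ENODATA:  # no attribute of that name to replace
            return False
        raise
```

A reader calling os.getxattr() on the same name sees either the old
value or the new one in full, which is the property that makes each
named attribute usable as a small object.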

The reality is that modern filesystems are really just btree based
object stores with high performance transaction engines overlaid
with a POSIX wrapper. And in the case of xattrs, we effectively
expose that btree based key-value database functionality directly to
userspace....

Stop thinking of xattrs as some useless metadata side channel,
and start thinking of them as an atomic object store that stores and
retrieves millions of small (< 1/2 the filesystem block size) named
objects far more space efficiently than a directory structure full of
small files indexed by object hash.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx