Re: size limit of extended attributes

On Tue, May 05, 2015 at 05:43:48PM +0200, Jan Kara wrote:
> On Tue 05-05-15 15:38:11, Björn JACKE wrote:
> > On 2015-05-02 at 01:43 +1000 Dave Chinner sent off:
> > > On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > > > Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> > > > Linux.  But there is a huge size limit - which would help us already a lot.
> > > 
> > > No - the maximum xattr size of 64k is encoded into the on-disk
> > > format of many filesystems and that's not a simple thing to change.
> > 
> > I know ext4 even has a much lower limit. But some filesystems don't have a
> > limit, and there the kernel 64k limit strikes in. The EA size limit of some
> > file systems could also be increased. As there is a real use case for that I
> > guess filesystems would like to be able to support larger EA sizes.
>
>   Yeah, so XFS could support more than 64K in principle if I look correct.
> Supporting it for ext4 would mean ondisk format change - doable but
> requires some non-trivial effort.

In theory, the on-disk attribute format for XFS can support 2^32
bytes for a remote attribute, but the attribute btree itself can't
really support arbitrary length xattrs in its mappings with any
sort of performance or flexibility.

And I really mean that performance thing - remote attributes in XFS
are written *synchronously* in the syscall because we need them
on disk before we commit the transaction that updates all the
metadata that points to them.

> Regarding the API, the issue with larger xattr size is that currently we
> copy whole xattr into kernel memory and process it in one go. Currently
> that's OK but if you want xattrs that have megabytes, it may become an
> effective way to DOS a system. So to support that we'd need to change at
> least the API from VFS into filesystems so that xattrs could be processed
> in smaller chunks. Again doable but quite some work.

Well, it's not just the VFS that would need to support this.
Attributes currently are not designed to be extended or partially
overwritten. Attributes are replaced in whole when they are changed,
and the filesystem implementations reflect that.

Again, going back to crash resiliency, XFS has a 3-step attribute
replacement algorithm to guarantee that a crash during or soon after
the operation will leave you with either the old or new value. It's
designed around userspace providing new attributes in whole, so
something like partial writes or extends will need *significant*
amounts of redesign and rework.
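
As a rough illustration of why whole-value replacement makes the
"old or new, never a mix" guarantee tractable, here is a toy
in-memory model. It is not XFS code: the Entry class, the
"incomplete" flag and the exact breakdown into three steps are
assumptions made for the sketch. It only shows that if each step is
atomic, a crash between any two steps still leaves exactly one
complete value visible.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare entries by identity, not by content
class Entry:
    value: bytes
    incomplete: bool  # hypothetical flag: hidden entries are ignored by lookups


class ToyAttrStore:
    """In-memory stand-in for a single attribute name (not real XFS structures)."""

    def __init__(self, value: bytes):
        self.entries = [Entry(value, incomplete=False)]

    def lookup(self) -> bytes:
        # Only complete entries are visible, whatever state a crash left behind.
        visible = [e.value for e in self.entries if not e.incomplete]
        assert len(visible) == 1, "exactly one value must be visible"
        return visible[0]

    def replace_steps(self, new_value: bytes):
        """Generator that pauses after each step so a caller can 'crash' there."""
        old = self.entries[0]
        # Step 1: write the complete new value, but keep it hidden.
        new = Entry(new_value, incomplete=True)
        self.entries.append(new)
        yield "new value written, still hidden"
        # Step 2: one atomic switch: old becomes hidden, new becomes visible.
        old.incomplete, new.incomplete = True, False
        yield "visibility flipped"
        # Step 3: remove the old value.
        self.entries.remove(old)
        yield "old value removed"


def value_after_crash(steps_completed: int) -> bytes:
    """Run the first N steps of a replacement, then look the attribute up."""
    store = ToyAttrStore(b"old")
    steps = store.replace_steps(b"new")
    for _ in range(steps_completed):
        next(steps)
    return store.lookup()


if __name__ == "__main__":
    for n in range(4):
        print(n, value_after_crash(n))
```

A partial write or an extend has no equivalent of step 1: there is
no complete new value to stage next to the old one, which is why
this scheme does not carry over to them.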

Oh, and what about all of the utilities that you rely on for backups,
restore, copying files, etc.? They all think:

$ grep -R XATTR /usr/include/linux/limits.h |grep MAX
#define XATTR_NAME_MAX   255    /* # chars in an extended attribute name */
#define XATTR_SIZE_MAX 65536    /* size of an extended attribute value (64k) */
#define XATTR_LIST_MAX 65536    /* size of extended attribute namelist (64k) */
$

And so if we change the kernel, we are suddenly creating files that
all our existing tools can't deal with. Now, that means I've got to
update xfs_repair, xfsdump, xfs_restore, xfs_db, xfs_fsr, etc. to
support arbitrarily sized attributes, not to mention the special XFS
ioctl kernel interfaces they use...
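
To make the "tools bake in the limit" point concrete, here is a
minimal sketch of how a backup-style tool might rely on the
constants from the limits.h output quoted above. The check_xattr()
and dump_xattrs() helpers are hypothetical, not taken from any real
utility, and the errno values are my understanding of what
setxattr(2) reports on Linux rather than something stated in this
thread. os.listxattr() and os.getxattr() are Linux-only.

```python
import errno
import os

# Values copied from the linux/limits.h output quoted above.
XATTR_NAME_MAX = 255
XATTR_SIZE_MAX = 65536
XATTR_LIST_MAX = 65536


def check_xattr(name: bytes, value: bytes) -> None:
    """Reject anything today's limits would reject (hypothetical helper)."""
    if len(name) > XATTR_NAME_MAX:
        raise OSError(errno.ERANGE, "xattr name too long")
    if len(value) > XATTR_SIZE_MAX:
        raise OSError(errno.E2BIG, "xattr value too large")


def dump_xattrs(path: str) -> dict:
    """Copy every xattr of `path`, assuming each value fits in 64k."""
    out = {}
    for name in os.listxattr(path):  # Linux-only call
        value = os.getxattr(path, name)
        # A tool written against the current limits treats anything
        # larger as corruption or an error, not as valid data.
        check_xattr(os.fsencode(name), value)
        out[name] = value
    return out
```

A tool structured like this would refuse, or mishandle, a file
carrying a larger attribute until it was updated and redeployed.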

> All in all I don't think this is going to happen unless someone interested
> in this invests significant amount of time to make this happen.

Compared to how much work it is for the file server application to
map file streams to a directory+files on demand, it makes no sense
to invent a completely new xattr API and have to implement it in all
the required supporting infrastructure.
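
For comparison, a minimal sketch of the directory+files mapping,
done entirely in the file server's userspace. The ".streams"
sidecar directory, the layout under it and the function names are
assumptions for illustration only; a real server would also have to
handle rename, unlink, permissions and name escaping.

```python
import os
import tempfile

STREAM_DIR = ".streams"  # hypothetical sidecar directory name


def stream_path(file_path: str, stream: str) -> str:
    """Map (file, stream name) to a regular file in a sidecar directory."""
    if "/" in stream or stream in ("", ".", ".."):
        raise ValueError("invalid stream name")
    parent, base = os.path.split(file_path)
    return os.path.join(parent, STREAM_DIR, base, stream)


def write_stream(file_path: str, stream: str, data: bytes) -> None:
    """Store a stream as an ordinary file, creating directories on demand."""
    target = stream_path(file_path, stream)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)


def read_stream(file_path: str, stream: str) -> bytes:
    """Read a stream back; raises FileNotFoundError if it was never written."""
    with open(stream_path(file_path, stream), "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        doc = os.path.join(d, "report.doc")
        open(doc, "wb").close()
        write_stream(doc, "Summary", b"x" * (1 << 20))  # 1 MiB, well past 64k
        print(len(read_stream(doc, "Summary")))
```

Because the streams are ordinary files, they have no 64k ceiling,
can be partially written or extended, and are picked up by existing
backup and copy tools without any changes.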

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx