Re: common layout xattr

On Jul 15, 2009  15:19 -0700, Sage Weil wrote:
> On Wed, 15 Jul 2009, Andreas Dilger wrote:
> > I'm thinking of using simple ASCII key=value pairs to store basic
> > layout information like chunk size, stripe count, mirror count,
> > RAID type, etc.  Some of them may not be applicable/usable by all
> > filesystems, but having a handful of "well known" keys and values
> > for a common xattr name would at least be better than what we have
> > now (which is nothing).
> > 
> > Something like (not necessarily a firm proposal yet):
> > 
> > trusted.common_layout:
> > chunk_bytes=65536
> > stripe_count=32
> > mirror_count=3
> > raid_type=1+0
> > 
> > Is this something you would be interested to pursue?  I've also discussed
> > this with Panasas, and they had some interest in this as well.  Any GPFS
> > developers watching?
> 
> This sounds like a good idea to me.  I think the main hurdle is going to 
> be defining a generalized layout description that captures the full 
> space of layouts for each file system, and also translates gracefully 
> between them.  IIRC Lustre, for instance, will stripe over $stripe_count 
> objects, while Ceph (and Panasas?) will stripe up to some $max_object_size 
> and then move on to a new set of objects.  Or stagger chunk order in 
> successive stripes, etc.

Well, I don't think we can capture all of the details for every
filesystem, but I'm hoping we can get some of the main parameters
working.  Having additional attributes that are more filesystem
specific is fine too (to a reasonable extent of course).
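
To make the format concrete, here is a minimal sketch of reading and
writing such a value. It assumes the pairs are separated by newlines, as
in the example quoted above; the separator has not been decided, and the
helper names parse_layout and format_layout are only illustrative.

```python
def parse_layout(value: bytes) -> dict:
    """Parse ASCII key=value pairs (assumed one per line) into a dict.

    Values are kept as strings so that keys a filesystem does not
    understand can be stored and returned unchanged.
    """
    layout = {}
    for line in value.decode("ascii").splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        key, sep, val = line.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed layout line: {line!r}")
        layout[key.strip()] = val.strip()
    return layout


def format_layout(layout: dict) -> bytes:
    """Serialize a layout dict back to newline-terminated key=value pairs."""
    return "".join(f"{k}={v}\n" for k, v in layout.items()).encode("ascii")


if __name__ == "__main__":
    raw = b"chunk_bytes=65536\nstripe_count=32\nmirror_count=3\nraid_type=1+0\n"
    print(parse_layout(raw))
```

Keeping the values as plain strings means something like raid_type=1+0
needs no special handling, and each filesystem can convert only the keys
it actually uses.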

For parts of the layout that are generated programmatically, like the
Ceph/Panasas striping order, I don't think that has to be encoded
explicitly into the layout xattr, since I'd assume the pattern is
always the same between files (e.g. use $stripe_count objects until
$max_object_size bytes, then a different set of $stripe_count objects
for $max_object_size bytes).  The fact that Lustre uses the same
$stripe_count objects for the whole file, and would ignore
$max_object_size, is below the level of detail that I'm currently
interested in.  In the reverse direction, I'd assume that Ceph/Panasas
would fill in the value for $max_object_size from a default, as if no
layout were used.
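
The difference between the two striping patterns can be sketched as a
mapping from file offset to object number. This is only an illustration
under my own assumptions, not either filesystem's actual code: chunks
are laid out round-robin, $max_object_size is a per-object limit and a
multiple of the chunk size, and the function name object_index is
hypothetical.

```python
def object_index(offset, chunk_bytes, stripe_count, max_object_size=None):
    """Return the index of the object that holds the byte at `offset`.

    With max_object_size=None the same stripe_count objects are reused
    for the whole file. Otherwise, once every object in the current set
    holds max_object_size bytes, striping moves on to a new set of
    stripe_count objects.
    """
    chunk = offset // chunk_bytes  # which chunk of the file this byte is in
    if max_object_size is None:
        return chunk % stripe_count
    if max_object_size % chunk_bytes:
        raise ValueError("max_object_size must be a multiple of chunk_bytes")
    chunks_per_object = max_object_size // chunk_bytes
    chunks_per_set = stripe_count * chunks_per_object
    object_set = chunk // chunks_per_set   # which set of objects
    chunk_in_set = chunk % chunks_per_set  # position within that set
    return object_set * stripe_count + chunk_in_set % stripe_count


if __name__ == "__main__":
    for n in (0, 5, 8, 9):
        off = n * 65536
        print(n, object_index(off, 65536, 4), object_index(off, 65536, 4, 131072))
```

Both variants are driven by the same chunk_bytes and stripe_count
values, which is why I think those are the parameters worth sharing and
the rest can be left to each filesystem.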


Filesystems are free to ignore parameters they don't like, and/or save them
and return them again when asked (probably with a flag that indicates they
are not currently in use), basically treating them as an opaque user xattr.
This will preserve the settings across an fsX -> fsY -> fsX transfer.
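
A rough sketch of that round trip, assuming the layout has already been
parsed into a dict of strings. The function names and the idea of
returning the unused keys as a separate dict (standing in for the "not
in use" flag) are purely illustrative.

```python
def split_layout(layout, supported):
    """Split a layout into keys this filesystem uses and keys it only stores."""
    active = {k: v for k, v in layout.items() if k in supported}
    preserved = {k: v for k, v in layout.items() if k not in supported}
    return active, preserved


def merge_layout(active, preserved):
    """Rebuild the full layout to hand back when the xattr is read."""
    merged = dict(preserved)
    merged.update(active)
    return merged


if __name__ == "__main__":
    # Layout as written on fsX, which understands max_object_size.
    from_fsx = {
        "chunk_bytes": "65536",
        "stripe_count": "4",
        "max_object_size": "4194304",
    }
    # fsY only understands two of the keys, but keeps the third verbatim.
    active, preserved = split_layout(from_fsx, {"chunk_bytes", "stripe_count"})
    print("in use on fsY:", active)
    print("stored but unused:", preserved)
    # Copying the file back to fsX returns every original setting.
    assert merge_layout(active, preserved) == from_fsx
```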

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
