On 2010-02-26, at 06:52, Karel Zak wrote:
> On Thu, Feb 25, 2010 at 06:13:50PM -0700, Andreas Dilger wrote:
>> One question I had was regarding naming of the TYPE. Currently we are
>> using "zfs" for this, but the current code is only really detecting the
>> volume information and has nothing to do with mountable filesystems.
> The TYPE is used by mount(8) or fsck(8) if the fstype is not explicitly
> defined by the user.
>
> I don't know if anything depends on the TYPE, but I don't see
> /sbin/mount.zfs, so it seems the zfs-fuse guys use something else.
It seems to me that if we expect mount.zfs to mount a single filesystem
from within the pool, then what blkid is detecting on the disk today is
the "volume", so its TYPE should probably not be "zfs". After the ZFS
pool is imported we might identify datasets with TYPE "zfs", but that is
a long way from where the code is today. It probably also makes sense to
change this from BLKID_USAGE_FILESYSTEM to BLKID_USAGE_RAID.
> See for example vmfs.c, where we have "VMFS" (mountable FS) and also
> "VMFS_volume_member" (storage). Both TYPEs are completely independent,
> and you can selectively probe for the FS or for the special volume
> rather than always probing for both. I think this concept is better
> than adding a new identifier (e.g. CONTAINER).
>
> Note, we have the USAGE identifier to specify the kind of TYPE, for
> example raid, filesystem, crypto, etc. This is necessary for udevd and
> some desktop tools. (Try: blkid -p -o udev <device>.)
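
(Noted - so a caller that only cares about volume members can already
filter on USAGE at probe time. Roughly like the following, as I
understand the low-level API; the function names here are the ones in
current util-linux-ng, so treat it as a sketch rather than something to
copy verbatim:)

    #include <stdio.h>
    #include <blkid.h>

    /* Probe one device, but only report superblocks whose USAGE is
     * "raid" (i.e. volume members), the same class udev rules can
     * filter on via ID_FS_USAGE. */
    int probe_raid_members(const char *devname)
    {
            blkid_probe pr = blkid_new_probe_from_filename(devname);
            if (!pr)
                    return -1;

            blkid_probe_enable_superblocks(pr, 1);
            blkid_probe_set_superblocks_flags(pr,
                            BLKID_SUBLKS_TYPE | BLKID_SUBLKS_USAGE);
            /* restrict probing to the "raid" usage class */
            blkid_probe_filter_superblocks_usage(pr, BLKID_FLTR_ONLYIN,
                                                 BLKID_USAGE_RAID);

            if (blkid_do_safeprobe(pr) == 0) {
                    const char *type = NULL, *usage = NULL;

                    blkid_probe_lookup_value(pr, "TYPE", &type, NULL);
                    blkid_probe_lookup_value(pr, "USAGE", &usage, NULL);
                    printf("%s: TYPE=%s USAGE=%s\n", devname, type, usage);
            }
            blkid_free_probe(pr);
            return 0;
    }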
>> Should we rename the TYPE to be "zfs_vdev" (the ZFS equivalent of
>> "lvm2pv") instead of the current "zfs"? It is probably more desirable
>> to keep
> Yes, TYPE="zfs" (mountable FS) and TYPE="zfs_volume_member" make sense.
> (The "_volume_member" is horribly long, but we use it for compatibility
> with the udev world.)
The only TYPE that has "_volume_member" is VMFS. In ZFS terms the
aggregate is called a "pool" and a component member is a "vdev", and I'd
prefer to stick to that if possible, just as MD uses "linux_raid_member"
and LVM uses "LVM2_member" for their component devices. Would
"zfs_pool_member" or simply "zfs_member" be OK? I'm not dead set against
"zfs_volume_member" if there is a real reason for it.
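
To make that concrete, the entry in the patch would then become
something like this (a sketch only; I'm reusing the field names of the
in-tree struct blkid_idinfo, leaving the magic/offset details exactly as
they are in the patch, and the name is whichever member name we settle
on):

    /* sketch only -- modelled on the existing superblock probers
     * such as vmfs.c */
    const struct blkid_idinfo zfs_idinfo =
    {
            .name       = "zfs_member",         /* volume member, not a mountable FS */
            .usage      = BLKID_USAGE_RAID,     /* rather than BLKID_USAGE_FILESYSTEM */
            .probefunc  = probe_zfs,
            /* .magics and .minsz as in the patch, omitted here */
    };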
The other question that has come up is whether the "UUID" reported for a
component device should be the UUID of the component device itself or
that of the whole volume. It seems to be the component device's, so is
there any standard for the UUID/LABEL of the volume? For ZFS it makes
sense to use the pool name as the LABEL, and for LVM it would make sense
to use the VG name.

I'm updating the patch, and will resend based on the feedback here.
>> Have you considered adding CONTAINER or similar identification to the
>> blkid.tab file, so that it is possible to determine that the filesystem
>> with LABEL="home" is on CONTAINER="39u4yr-f5WW-dtD7-jDfr-usGd-pYWf-qy6xKE",
>> which in turn is the UUID of an lvm2pv on /dev/sda2?
> I'd like to avoid this if possible.
The reason I was thinking about this is that if, for example, I want to
mount "LABEL=home", I can see from blkid.tab that this is on a device
called /dev/mapper/vgroot-lvhome, but if that volume is not currently
set up I have no way to know where it is or how to configure it (other
than possibly very weak heuristics based on the device name). If,
instead, it has a CONTAINER which matches the UUID of /dev/sda2, that
device can be probed based on its TYPE. In SAN configurations where
there are many devices that _might_ be available to this node (e.g.
high-availability with multiple servers), configuring every
device/volume that the node can see is probably a bad idea (e.g. the
LVM or MD RAID configuration may have changed).
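
Purely for illustration (blkid.tab obviously has no such attribute
today, and the UUID/TIME/DEVNO values below are placeholders), the cache
entry could look something like:

    <device DEVNO="0xfd02" TIME="..." LABEL="home" UUID="..."
            CONTAINER="39u4yr-f5WW-dtD7-jDfr-usGd-pYWf-qy6xKE"
            TYPE="ext3">/dev/mapper/vgroot-lvhome</device>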
Instead, it makes more sense to do the probing at startup (so the
mappings are available) but configure the volumes on demand when some
filesystem/device within them is being mounted. Storing this
hierarchical dependency in a central place (blkid.tab) makes sense.
That way, udev/blkid can be told "I want to mount LABEL=home"; it
resolves this to CONTAINER={UUID} (e.g. the LV UUID), then blkid locates
$CONTAINER, and if that resolves to a device which is not yet active we
at least know it is TYPE=LVM2_member and can ask lvm2 to probe and
configure the device(s) on which $CONTAINER resides, repeating as
necessary for e.g. MD member sub-devices. A rough sketch follows.
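
In very rough pseudo-code - entirely hypothetical, since blkid has no
CONTAINER tag; lookup_container_of() and activate_container() are
made-up stand-ins for the blkid.tab lookup and the lvm2/mdadm/zfs calls,
and only blkid_evaluate_tag() is a real libblkid function:

    #include <stddef.h>
    #include <blkid.h>

    /* made-up helpers, standing in for the blkid.tab CONTAINER lookup
     * and for the subsystem-specific activation calls */
    const char *lookup_container_of(const char *token, const char *value);
    int activate_container(const char *container_uuid);

    char *resolve_device_for_label(const char *label)
    {
            char *dev = blkid_evaluate_tag("LABEL", label, NULL);

            if (!dev) {
                    /* made-up: look up CONTAINER=<uuid> recorded for
                     * this LABEL in blkid.tab */
                    const char *container = lookup_container_of("LABEL", label);

                    /* made-up: find the container's device by its TYPE
                     * (LVM2_member, linux_raid_member, zfs_member, ...)
                     * and ask lvm2/mdadm/zfs to activate it; if that
                     * device is itself inside another container, the
                     * same resolution would recurse */
                    if (container && activate_container(container) == 0)
                            dev = blkid_evaluate_tag("LABEL", label, NULL);
            }
            return dev;     /* NULL if it could not be resolved; caller frees */
    }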
Note that this discussion is not strictly related to the ZFS case; it's
just something I thought about while looking at the code.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.