On May 28, 2008 10:24 -0700, Joel Becker wrote:
> On Wed, May 28, 2008 at 10:09:52AM -0600, Andreas Dilger wrote:
> > But the problem is that people are error prone in their updating of code,
> > and if the filesystems assume "the VFS has checked all of the flags except
> > this one I don't understand" will likely become incorrect over time as
> > someone will forget, will misunderstand whether the different per-fs codes
> > need to be updated, or some patch will be delayed in a FS maintainer queue
> > while the VFS "acceptance" of the new feature will be included upstream.
>
> This is a specious argument - if it doesn't go upstream, we
> then have the overloaded-flag problem.

I was actually thinking of the opposite case - the VFS part of the new
flag is included upstream (i.e. ioctl_fiemap() allows the new flag), but
the filesystem-specific part is delayed by some maintainer (or lack
thereof).  We've had an ongoing issue with ext4 because we need
EXPORT_SYMBOL(zero_page), but this is not making it through the m68k
maintainer yet the ext4 part of the patch is already upstream and Andrew
complains about it regularly.

> If you're looking for vendor flags, let's just design a space for them.

By no means am I looking for "private" flags or adding support for flags
that don't exist upstream (assuming it is reasonable to get new flags
upstream).  What I'm specifically concerned about is being able to support
new features that are properly accepted upstream in Lustre built against
older vendor kernels.  We are trying to get out of the kernel-patching
days because customers aren't willing to void their kernel or 3rd-party
application support by running a patched kernel on the client.
Since this is a relatively new API, I think several features like
FIEMAP_FLAG_XATTR, FIEMAP_FLAG_METADATA, and maybe a few others will be
added in the next several months, and some vendor will grab one of the
"has FIEMAP, but not all of the flags" kernels and we won't be able to
add newer features on that kernel for possibly several years.

> > The issue is that most users don't have the latest upstream kernel
> > because they are using a vendor kernel that is a few years old, as you
> > likely know, but an updated Lustre or OCFS2 or btrfs should work with
> > the existing vendor kernels.
> >
> > If we wanted to add something like FIEMAP_FLAG_METADATA, if the check
> > was done in the VFS, it would be impossible without patching the client
> > even if it exactly matched the upstream kernel implementation.
>
> First, getting vendor kernels to update a supported flag set
> that is already in mainline is pretty easy.  They are rightly interested
> in following a well-defined interface, which is what Mark's trying to do
> - no filesystems supporting flags that aren't part of the well-defined
> interface.

Reasonably so, yes.  The issue is that everyone is busy, and what may be
a priority for us isn't necessarily for the vendor, and there is another
hurdle trying to get the customer to upgrade the kernels on their
10000-node cluster to add some bits to the compatibility flags.  Being
able to add in e.g. FIEMAP_FLAG_XATTR ourselves is easier.

> But if you are really worried about no kernel updates when you
> install a new fs module, you can still solve it with a generic check.
> Just add /proc/sys/fs/fiemap-flag-mask.  This covers any new flags for
> the generic VFS check.  Alternately, allow filesystems to register their
> flags and then do the VFS check based on that.

If you are suggesting that the filesystems all export their "supported
flags" mask somewhere, and the VFS uses that for a check, then yes I agree
it would be possible to do.
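
To make the two placements of the check concrete, here is a minimal
user-space sketch, not kernel code.  Everything in it is an assumption
made for illustration: the SKETCH_FLAG_* bit values are placeholders
standing in for the FIEMAP_FLAG_* names discussed above (not the real
ABI), and struct sketch_fs, sketch_check_in_vfs() and
sketch_check_in_fs() are hypothetical names that do not exist in the
kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder bits; they only stand in for FIEMAP_FLAG_* (assumed values). */
#define SKETCH_FLAG_SYNC     0x1u
#define SKETCH_FLAG_XATTR    0x2u
#define SKETCH_FLAG_METADATA 0x4u

/* Hypothetical description of a filesystem module. */
struct sketch_fs {
	const char *name;
	uint32_t supported_flags;	/* mask declared by the module itself */
};

/* Mask compiled into an older generic layer: it predates the XATTR flag. */
static const uint32_t sketch_vfs_known_flags = SKETCH_FLAG_SYNC;

/* Check done in the generic layer against its own fixed mask. */
static int sketch_check_in_vfs(const struct sketch_fs *fs, uint32_t flags)
{
	(void)fs;	/* the filesystem is never consulted */
	if (flags & ~sketch_vfs_known_flags)
		return -EINVAL;
	return 0;
}

/* Check done against the mask that the filesystem module provides. */
static int sketch_check_in_fs(const struct sketch_fs *fs, uint32_t flags)
{
	if (flags & ~fs->supported_flags)
		return -EINVAL;
	return 0;
}

/* A newer filesystem module loaded on top of the older generic layer. */
int sketch_demo(void)
{
	struct sketch_fs newer = {
		"newer-fs-module", SKETCH_FLAG_SYNC | SKETCH_FLAG_XATTR
	};
	uint32_t request = SKETCH_FLAG_SYNC | SKETCH_FLAG_XATTR;

	/* The fixed generic mask rejects a flag the module could handle. */
	assert(sketch_check_in_vfs(&newer, request) == -EINVAL);
	/* The module-provided mask accepts it without a kernel update. */
	assert(sketch_check_in_fs(&newer, request) == 0);
	/* Flags the module does not know are still rejected. */
	assert(sketch_check_in_fs(&newer, SKETCH_FLAG_METADATA) == -EINVAL);
	return 0;
}
```

In this sketch, a "register the flags with the VFS" scheme would amount
to the generic layer calling something like sketch_check_in_fs() on the
filesystem's behalf, using the mask the module registered.
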
I don't see a huge benefit of that over just letting the filesystems do
it directly themselves at that point.  Adding a /proc or /sys or /debugfs
tunable for this seems heavyweight, and needs a sysctl or other setting
on each boot - a pain for diskless clients.

It seems backward to me to add arbitrary limits to the API when it was
designed in the first place to be flexible and allow features to be added
easily.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.