Re: [PATCH 5/7] XFS: Unicode case-insensitive lookup implementation

David Chinner <dgc@xxxxxxx> · Thu, 3 Apr 2008 18:31:51 +1000

On Wed, Apr 02, 2008 at 04:25:13PM +1000, Barry Naujok wrote:
> This is the core of the case-insensitive support - supporting and
> enforcing UTF-8 (Unicode) filenames. All filename and user-level
> extended attribute names are checked for UTF-8 compliance and the
> hashes generated are always case-insensitive by utilising the
> Unicode 5.0 standard case-folding table from:
> http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
> 
> As the hash is always case-insensitive, this allows the user to
> mkfs.xfs the filesystem once and enable or disable (default)
> case-insensitive support by a mount option "-o ci". The mount
> option specifies which xfs_nameops.compname function to use.
> 
> Also, the Unicode support is a CONFIG option so users who do
> not required this functionality can CONFIG it to N.
> 
> As the case-folding table is stored on disk, this allows
> backwards and forwards compatibility and languages like Turkic
> to support true case-insensitivity with I and i.
> 
> To create a Unicode filesystem with case-insensitive mount
> support, run:
> # mkfs.xfs -n utf8[=default|turkic] <device>
> 
> The final patches implement NLS support for XFS Unicode.
> 
> Signed-off-by: Barry Naujok <bnaujok@xxxxxxx>
.....

> +config XFS_UNICODE
> +	bool "XFS Unicode support"
> +	depends on XFS_FS
> +	help
> +	  Unicode support enforces UTF-8 filenames and user extended

Not sure that "enforces" is the right word. Perhaps 

	  Unicode support encodes filenames and user extended attribute
	  names in UTF-8 format on disk....

> +	  attribute names. This option is required for filesystems
> +	  mkfs'ed with UTF-8 support. A Unicode filesystem guarantees
> +	  that filenames will be the same regardless of the user's
> +	  locale. For UTF-8 locales, no conversion is required.
> +
> +	  Unicode filesystems also allow the filesystem to be mounted with
> +	  case-insensitive lookup support with the "-o ci" mount option.
> +
> +	  If you don't require UTF-8 enforcement, say N.
                                     ^^^^^^^^^^^ support

> Index: kern_ci/fs/xfs/linux-2.6/xfs_super.c
> ===================================================================
> --- kern_ci.orig/fs/xfs/linux-2.6/xfs_super.c
> +++ kern_ci/fs/xfs/linux-2.6/xfs_super.c
> @@ -46,6 +46,7 @@
>  #include "xfs_acl.h"
>  #include "xfs_attr.h"
>  #include "xfs_buf_item.h"
> +#include "xfs_unicode.h"
>  #include "xfs_utils.h"
>  #include "xfs_vnodeops.h"
>  #include "xfs_vfsops.h"
> @@ -124,6 +125,7 @@ xfs_args_allocate(
>  #define MNTOPT_ATTR2	"attr2"		/* do use attr2 attribute format */
>  #define MNTOPT_NOATTR2	"noattr2"	/* do not use attr2 attribute format */
>  #define MNTOPT_FILESTREAM  "filestreams" /* use filestreams allocator */
> +#define MNTOPT_CILOOKUP	"ci"		/* case-insensitive dir lookup */
>  #define MNTOPT_QUOTA	"quota"		/* disk quotas (user) */
>  #define MNTOPT_NOQUOTA	"noquota"	/* no quotas */
>  #define MNTOPT_USRQUOTA	"usrquota"	/* user quota enabled */
> @@ -318,6 +320,14 @@ xfs_parseargs(
>  			args->flags &= ~XFSMNT_ATTR2;
>  		} else if (!strcmp(this_char, MNTOPT_FILESTREAM)) {
>  			args->flags2 |= XFSMNT2_FILESTREAMS;
> +		} else if (!strcmp(this_char, MNTOPT_CILOOKUP)) {
> +			args->flags2 |= XFSMNT2_CILOOKUP;
> +#ifndef CONFIG_XFS_UNICODE
> +			cmn_err(CE_WARN,
> +				"XFS: %s option requires Unicode support",
> +				this_char);
> +			return EINVAL;
> +#endif

Hmmmm - I don't like having CONFIG_XFS_... stuff in here. Use
a function that is only compiled in when CONFIG_XFS_UNICODE is not
defined for this....

The other way of doing this is to accept the mount option here
and then once we've read the superblock we can abort the mount
if the unicode feature is not present in the superblock. Of course,
if we don't compile in unicode support, that superblock check will
fail....

> @@ -567,7 +578,8 @@ xfs_set_inodeops(
>  		break;
>  	case S_IFDIR:
>  		inode->i_op =
> -			xfs_sb_version_hasoldci(&XFS_I(inode)->i_mount->m_sb) ?
> +			xfs_sb_version_hasoldci(&XFS_I(inode)->i_mount->m_sb) ||
> +			(XFS_I(inode)->i_mount->m_flags & XFS_MOUNT_CILOOKUP) ?
>  				&xfs_dir_ci_inode_operations :
>  				&xfs_dir_inode_operations;

Please turn that into a regular if..else 

Also, from the last patch, use XFS_M(inode->i_sb) instead of
XFS_I(inode)->i_mount....

> @@ -175,6 +176,13 @@ xfs_attr_get(
>  	if (namelen >= MAXNAMELEN)
>  		return(EFAULT);		/* match IRIX behaviour */
>  
> +	/* Enforce UTF-8 only for user attr names */
> +	if (xfs_sb_version_hasunicode(&ip->i_mount->m_sb) &&
> +			(flags & (ATTR_ROOT | ATTR_SECURE)) == 0) {
> +		error = xfs_unicode_validate(name, namelen);
> +		if (error)
> +			return error;
> +	}

I'd wrap this up completely into a helper, so the code here does
not need comments and becomes:

	error = xfs_attr_unicode_validate(ip, flags, name, namelen);
	if (error)
		return error;

>  	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>  		return(EIO);
>  
> @@ -435,6 +443,14 @@ xfs_attr_set(
>  	if (namelen >= MAXNAMELEN)
>  		return EFAULT;		/* match IRIX behaviour */
>  
> +	/* Enforce UTF-8 only for user attr names */
> +	if (xfs_sb_version_hasunicode(&dp->i_mount->m_sb) &&
> +			(flags & (ATTR_ROOT | ATTR_SECURE)) == 0) {
> +		int error = xfs_unicode_validate(name, namelen);
> +		if (error)
> +			return error;
> +	}

ditto.

> +
>  	XFS_STATS_INC(xs_attr_set);
>  
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
> @@ -581,6 +597,14 @@ xfs_attr_remove(
>  	if (namelen >= MAXNAMELEN)
>  		return EFAULT;		/* match IRIX behaviour */
>  
> +	/* Enforce UTF-8 only for user attr names */
> +	if (xfs_sb_version_hasunicode(&dp->i_mount->m_sb) &&
> +			(flags & (ATTR_ROOT | ATTR_SECURE)) == 0) {
> +		int error = xfs_unicode_validate(name, namelen);
> +		if (error)
> +			return error;
> +	}

ditto.

> @@ -52,6 +52,7 @@
>   */
>  static xfs_dahash_t
>  xfs_ascii_ci_hashname(
> +	xfs_inode_t	*inode,
>  	const uchar_t	*name,
>  	int		namelen)
>  {
> @@ -66,6 +67,7 @@ xfs_ascii_ci_hashname(
>  
>  static xfs_dacmp_t
>  xfs_ascii_ci_compname(
> +	xfs_inode_t	*inode,
>  	const uchar_t	*name1,
>  	int		len1,
>  	const uchar_t	*name2,
> @@ -113,8 +115,13 @@ xfs_dir_mount(
>  		(mp->m_dirblksize - (uint)sizeof(xfs_da_node_hdr_t)) /
>  		(uint)sizeof(xfs_da_node_entry_t);
>  	mp->m_dir_magicpct = (mp->m_dirblksize * 37) / 100;
> -	mp->m_dirnameops = xfs_sb_version_hasoldci(&mp->m_sb) ?
> -		&xfs_ascii_ci_nameops : &xfs_default_nameops;
> +
> +	if (xfs_sb_version_hasunicode(&mp->m_sb)) {
> +		mp->m_dirnameops = (mp->m_flags & XFS_MOUNT_CILOOKUP) ?
> +			&xfs_unicode_ci_nameops : &xfs_unicode_nameops;
> +	} else
> +		mp->m_dirnameops = xfs_sb_version_hasoldci(&mp->m_sb) ?
> +			&xfs_ascii_ci_nameops : &xfs_default_nameops;

Can I suggest that if you find xfs_sb_version_hasoldci() at mount,
then we should set mp->m_flags |= XFS_MOUNT_CILOOKUP?

That way we can use a single method of looking up whether CI is active,
and old vs new canbe determined by xfs_sb_version_hasunicode()....

> @@ -241,6 +241,7 @@ typedef struct xfs_fsop_resblks {
>  #define XFS_FSOP_GEOM_FLAGS_ATTR2	0x0400	/* inline attributes rework */
>  #define XFS_FSOP_GEOM_FLAGS_DIRV2CI	0x1000	/* ASCII only CI names */
>  #define XFS_FSOP_GEOM_FLAGS_LAZYSB	0x4000	/* lazy superblock counters */
> +#define XFS_FSOP_GEOM_FLAGS_UNICODE	0x10000	/* unicode filenames */

Can you reformat the list so that eveything lines up correctly
and has leading zeros for the full 32 bit flag space? like:

#define XFS_FSOP_GEOM_FLAGS_ATTR2	0x00000400 /* inline attributes rework */
#define XFS_FSOP_GEOM_FLAGS_DIRV2CI	0x00001000 /* ASCII only CI names */
#define XFS_FSOP_GEOM_FLAGS_LAZYSB	0x00004000 /* lazy superblock counters */
#define XFS_FSOP_GEOM_FLAGS_UNICODE	0x00010000 /* unicode filenames */

>  #include "xfs_utils.h"
> +#include "xfs_unicode.h"
>  
>  STATIC void	xfs_mount_log_sb(xfs_mount_t *, __int64_t);
>  STATIC int	xfs_uuid_mount(xfs_mount_t *);
> @@ -121,6 +122,7 @@ static const struct {
>      { offsetof(xfs_sb_t, sb_logsunit),	 0 },
>      { offsetof(xfs_sb_t, sb_features2),	 0 },
>      { offsetof(xfs_sb_t, sb_bad_features2), 0 },
> +    { offsetof(xfs_sb_t, sb_cftino),	 0 },
>      { sizeof(xfs_sb_t),			 0 }

whitespace.
>  	if (update_flags && !(mp->m_flags & XFS_MOUNT_RDONLY))
> @@ -1229,7 +1245,8 @@ xfs_mountfs(
>  	 * Free up the root inode.
>  	 */
>  	IRELE(rip);
> - error3:
> +	xfs_unicode_free_cft(mp->m_cft);
> +error3:
>  	xfs_log_unmount_dealloc(mp);
>   error2:
>  	for (agno = 0; agno < sbp->sb_agcount; agno++)

Please line up all the other labels while you are there ;)

> --- kern_ci.orig/fs/xfs/xfs_sb.h
> +++ kern_ci/fs/xfs/xfs_sb.h
> @@ -79,10 +79,18 @@ struct xfs_mount;
>  #define XFS_SB_VERSION2_LAZYSBCOUNTBIT	0x00000002	/* Superblk counters */
>  #define XFS_SB_VERSION2_RESERVED4BIT	0x00000004
>  #define XFS_SB_VERSION2_ATTR2BIT	0x00000008	/* Inline attr rework */
> +#define XFS_SB_VERSION2_UNICODEBIT	0x00000020	/* Unicode names */
>  
> -#define	XFS_SB_VERSION2_OKREALFBITS	\
> +#ifdef CONFIG_XFS_UNICODE
> +# define XFS_SB_VERSION2_OKREALFBITS	\
>  	(XFS_SB_VERSION2_LAZYSBCOUNTBIT	| \
> +	 XFS_SB_VERSION2_UNICODEBIT | \
>  	 XFS_SB_VERSION2_ATTR2BIT)
> +#else
> +# define XFS_SB_VERSION2_OKREALFBITS	\
> +	(XFS_SB_VERSION2_LAZYSBCOUNTBIT	| \
> +	 XFS_SB_VERSION2_ATTR2BIT)
> +#endif

Regardless of whether CONFIG_XFS_UNICODE is defined or not, we should
be defining this as a valid bit. What we want is xfs_sb_version_hasunicode()
to say "not supported" when CONFIG_XFS_UNICODE is not defined.

IOWs, if the sb_cftino field is defined in the superblock,
XFS_SB_VERSION2_UNICODEBIT must be defined as well.

> +#ifdef CONFIG_XFS_UNICODE
> +static inline int xfs_sb_version_hasunicode(xfs_sb_t *sbp)
> +{
> +	return (xfs_sb_version_hasmorebits(sbp) &&	\
> +		((sbp)->sb_features2 & XFS_SB_VERSION2_UNICODEBIT));
> +}
> +#else
> +# define xfs_sb_version_hasunicode(sbp)	(0)

static inline int xfs_sb_version_hasunicode(xfs_sb_t *sbp) { return 0; }

> Index: kern_ci/fs/xfs/xfs_unicode.c
> ===================================================================
> --- /dev/null
> +++ kern_ci/fs/xfs/xfs_unicode.c
> @@ -0,0 +1,499 @@
....
> +#include "xfs_inode.h"
> +#include "xfs_btree.h"
> +#include "xfs_ialloc.h"
> +#include "xfs_itable.h"
> +#include "xfs_rtalloc.h"
> +#include "xfs_error.h"
> +#include "xfs_bmap.h"
> +#include "xfs_rw.h"
> +#include "xfs_unicode.h"
> +
> +#define MAX_FOLD_CHARS	4

Why is this set to 4?

> +
> +static inline int
> +xfs_casefold(
> +	const xfs_cft_t *cft,
> +	__uint16_t	c,
> +	__uint16_t	*fc)
> +{
> +	__uint16_t	*table = XFS_CFT_PTR(cft, 0);
> +	__uint16_t	tmp = table[c >> 8];
> +	int		i;
> +
> +	if (!tmp) {
> +		*fc = c;
> +		return 1;
> +	}
> +	tmp = table[tmp + (c & 0xff)];
> +	if ((tmp & 0xf000) != 0xe000) {

lots of magic numbers here. Can you either use #defines that
make a little sense, or wrap it in a function that describes
the comparison?

Can we get a comment explaining how the case folding works?
i.e. the table layout, the transforms needed to find the
entry that matches, what the magic upper and lower bits are, etc.
This code by itself is not understandable or reviewable....

> +		*fc = tmp;
> +		return 1;
> +	}
> +	i = ((tmp >> 10) & 0x3) + 2;
> +	ASSERT(i < cft->num_tables);
> +	table = XFS_CFT_PTR(cft, i - 1) + ((tmp & 0x3ff) * i);

and it only gets worse :/

> +
> +	memcpy(fc, table, sizeof(__uint16_t) * i);
> +
> +	return i;
> +}
> +
> +static inline int
> +xfs_utf8_casefold(
> +	const xfs_cft_t	*cft,
> +	const uchar_t	**name,
> +	int		*namelen,
> +	__uint16_t	*fc)
> +{
> +	wchar_t		uc;
> +
> +	if (*namelen == 0)
> +		return 0;
> +
> +	if (**name & 0x80) {
> +		int	n = utf8_mbtowc(&uc, *name, *namelen);
> +		if (n < 0) {
> +			(*namelen)--;
> +			*fc = *(*name)++;
> +			return 1;
> +		}
> +		*name += n;
> +		*namelen -= n;
> +	} else {
> +		uc = *(*name)++;
> +		(*namelen)--;
> +	}
> +	return xfs_casefold(cft, uc, fc);
> +}

Comments, please.

> +/*
> + * Perform a case-folding case-insensitive string comparison,
> + * returns either XFS_CMP_CASE or XFS_CMP_DIFFERENT.
> + */
> +static xfs_dacmp_t
> +xfs_unicode_casecmp(
> +	xfs_cft_t	*cft,
> +	const uchar_t	*name1,
> +	int		len1,
> +	const uchar_t	*name2,
> +	int		len2)
> +{
> +	__uint16_t	fc1[MAX_FOLD_CHARS], fc2[MAX_FOLD_CHARS];
> +	__uint16_t	*pfc1, *pfc2;

Please put the "p" for pointer at the end of the variable.

> +/*
> + * Unicode Case Fold Table management
> + */
> +
> +struct cft_item {
> +	xfs_cft_t	*table;
> +	int		size;
> +	int		refcount;
> +};
> +
> +static mutex_t		cft_lock;
> +static int		cft_size;
> +static struct cft_item	*cft_list;
> +
> +static xfs_cft_t *
> +add_cft(
> +	xfs_dcft_t	*dcft,
> +	int		size)
> +{
....
> +
> +	cft_list = kmem_realloc(cft_list,
> +			(cft_size + 1) * sizeof(struct cft_item),
> +			cft_size  * sizeof(struct cft_item), KM_SLEEP);

KM_MAYFAIL and handle the allocation error.

> +static void
> +remove_cft(
> +	const xfs_cft_t	*cft)
> +{
> +	int		i;
> +
> +	mutex_lock(&cft_lock);
> +
> +	for (i = 0; i < cft_size; i++) {
> +		if (cft_list[i].table == cft) {
> +			ASSERT(cft_list[i].refcount > 0);
> +			cft_list[i].refcount--;
> +			break;
> +		}
> +	}

What happens when the refcount falls to zero?

We just leave it there consuming memory?

> +
> +	mutex_unlock(&cft_lock);
> +}
> +
> +
> +int
> +xfs_unicode_read_cft(
> +	xfs_mount_t	*mp)
> +{
> +	int		error;
> +	xfs_inode_t	*cftip;
> +	int		size;
> +	int		nfsb;
> +	int             nmap;
> +	xfs_bmbt_irec_t *mapp;
> +	int		n;
> +	int		byte_cnt;
> +	xfs_buf_t	*bp;
> +	char		*table;
> +	xfs_dcft_t	*dcft;
> +
> +	if (mp->m_sb.sb_cftino == NULLFSINO || mp->m_sb.sb_cftino == 0)
> +		return EINVAL;

ENOENT?

> +	error = xfs_iget(mp, NULL, mp->m_sb.sb_cftino, 0, 0, &cftip, 0);
> +	if (error)
> +		return error;
> +	ASSERT(cftip != NULL);
> +
> +	size = cftip->i_d.di_size;
> +	nfsb = cftip->i_d.di_nblocks;
> +
> +	table = vmalloc(size);
> +	if (!table) {
> +		xfs_iput(cftip, 0);
> +		return ENOMEM;
> +	}
> +	dcft = (xfs_dcft_t *)table;
> +
> +	nmap = nfsb;
> +	mapp = kmem_alloc(nfsb * sizeof(xfs_bmbt_irec_t), KM_SLEEP);

I think what you want here is the number of extents, not the number
of blocks for this. How big are these tables?

> +void
> +xfs_unicode_uninit(void)
> +{
> +	int		i;
> +
> +	mutex_lock(&cft_lock);
> +
> +	for (i = 0; i < cft_size; i++) {
> +		ASSERT(cft_list[i].refcount == 0);
> +		vfree(cft_list[i].table);
> +	}
> +	kmem_free(cft_list, cft_size * sizeof(struct cft_item));

Ah, we free unused tables on module unload. This won't ever occur
in many configurations (e.g. XFS on root fs). Perhaps it's better to
free them whenthe refcount falls to zero (otherwise what's the point
of refcounting them?)...

> +#ifdef CONFIG_XFS_UNICODE
> +
> +extern struct xfs_nameops xfs_unicode_nameops;
> +extern struct xfs_nameops xfs_unicode_ci_nameops;
> +
> +void xfs_unicode_init(void);
> +void xfs_unicode_uninit(void);
> +
> +int xfs_unicode_validate(const uchar_t *name, int namelen);
> +
> +int xfs_unicode_read_cft(struct xfs_mount *mp);
> +void xfs_unicode_free_cft(const xfs_cft_t *cft);
> +
> +#else
> +
> +#define xfs_unicode_nameops		xfs_default_nameops
> +#define xfs_unicode_ci_nameops		xfs_default_nameops
> +
> +#define xfs_unicode_init()
> +#define xfs_unicode_uninit()
> +#define xfs_unicode_validate(n,l)	0
> +#define xfs_unicode_read_cft(mp)	(EOPNOTSUPP)
> +#define xfs_unicode_free_cft(cft)

Same as before - static inlines where possible...

> @@ -399,6 +402,19 @@ xfs_finish_flags(
>  			mp->m_qflags |= XFS_OQUOTA_ENFD;
>  	}
>  
> +	if (xfs_sb_version_hasunicode(&mp->m_sb)) {
> +		if (ap->flags2 & XFSMNT2_CILOOKUP)
> +			mp->m_flags |= XFS_MOUNT_CILOOKUP;
> +	} else {
> +		/*
> +		 * Check for mount options which require a Unicode FS
> +		 */
> +		if (ap->flags2 & XFSMNT2_CILOOKUP) {
> +			cmn_err(CE_WARN,
> +	"XFS: can't do case-insensitive mount on non-utf8 filesystem");

Except if it has oldci defined.... ;)

> @@ -1777,6 +1778,12 @@ xfs_lookup(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return XFS_ERROR(EIO);
>  
> +	if (xfs_sb_version_hasunicode(&dp->i_mount->m_sb)) {
> +		error = xfs_unicode_validate(d_name->name, d_name->len);
> +		if (error)
> +			return error;
> +	}

Seen that somewhere before - a wrapper perhaps?

/me snips the next 6 almost identical code fragments. I think a wrapper
that has the sb check inside it is the way to go here...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html