Re: [PATCH e2fsprogs 4/9] mke2fs: Configure encoding during superblock initialization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Theodore Y. Ts'o" <tytso@xxxxxxx> writes:

> On Mon, Oct 15, 2018 at 05:12:15PM -0400, Gabriel Krisman Bertazi wrote:
>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
>> index f05003fc30b9..5ed7b987540e 100644
>> --- a/misc/mke2fs.c
>> +++ b/misc/mke2fs.c
>> @@ -790,6 +790,8 @@ static void parse_extended_opts(struct ext2_super_block *param,
>>  	int	len;
>>  	int	r_usage = 0;
>>  	int	ret;
>> +	int	encoding = -1;
>> +	char 	*encoding_flags = NULL;
>
>     ...
>
>> +	if (ext2fs_has_feature_fname_encoding(param)) {
>> +		param->s_encoding_flags =
>> +			ext4_encoding_map[encoding].default_flags;
>
> This code is assuming that users will specify the encoding via "-E encoding=utf8-10.0"
> and this will set the FNAME_ENCODING flag implicitly.
>
> But consider what happens if the user runs command like this:
>
>     mke2fs -t ext4 -O fname_encoding -E resize=12T
>
> When parse_extended_opts gets called, the variable encoding will still
> be -1, and so we'll end up trying to use a negative array index to
> ext4_encoding_map[] which will be... unfortunate.
>
> As I mentioned in another e-mail, I'm a bit dubious about having
> per-encoding default flags.  Those flags should either global ext4
> code points, or they should be forced to specific values given the
> encoding that is specified.

Normalization and casefold types are too specific to each encoding, to
not be per-encoding.  ASCII has no normalization, for instance.

If I understand you correctly, we should make them ext4 code points to
ensure they don't change in the future.

> We probably also want to have a default encoding if the user just
> specifies "-O fname_encoding".   Say, in /etc/mke2fs.conf:

Right.  That solves the case for -O fname_encoding.  I will do this in v3.

>
> [options]
>     default_encoding = utf8-11.0
>
> Then at some point a few years from now, we might enable
> fname_encoding by default, so we might have in /etc/mke2fs.conf:
> [fs_types]
> 	ext4 = {
> 		features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize,largedir,fname_encoding
> 		inode_size = 256
> 	}
>
> So having a way to specify the default encoding in /etc/mke2fs.conf is
> going to be important.  What will probably happen is two years, we'll
> be up to Unicode 13.0, and we might want to add support for Unicode
> 13.0 in some future kernel version,, say, 5.8.  But then we won't want
> to make utf8-13.0 the default for some amount of time, since if the
> file system is mounted on an older kernel, it won't work; the kernel
> will have to reject mounting a file system with an unknown encoding.
>
> So that's why I always like to make these sorts of configuration
> defaults to be tuneable in /etc/mke2fs.conf.  Different distros will
> have different backwards compatibility policies.  For example, For
> enterprise distros, they might want to wait 7 years before creating
> file systems with utf8-13.0 as the default.  For a community distro,
> they might want to wait 2-3 years.  And for a purpose-built Linux
> gaming Valve box, where the kernel is under the control of the box
> manufacturers, they might want to be super-aggressive about adopting a
> new Unicode encoding, in order to crack that critical Ancient Sanskrit
> market.  :-)

good point!

-- 
Gabriel Krisman Bertazi



[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux