Re: [PATCH 36/38] vfs: Add a sample program for the new mount API [ver #10]

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Mon, 30 Jul 2018 12:49:38 -0700

On Mon, Jul 30, 2018 at 11:59:13AM -0700, Linus Torvalds wrote:
> On Mon, Jul 30, 2018 at 11:38 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > I wasn't proposing putting gettext in the kernel.  I was reacting to
> > Pavel saying "You can't return English strings from the kernel, you have
> > to translate numbers into any language's strings".
> 
> The problem with gettext() is that if you *don't* have the strings
> marked for translation at the source, you're going to have a hard time
> with anything but the simplest fixed strings.
> 
> When the kernel does something like
> 
>      mntinfo("Option %s can not take value %d", opt->name, opt->value);
> 
> a gettext() interface inside the kernel (which really would be nasty)
> would have seen the format string and the values independently and
> would have generated the translation database from that.
> 
> But once the string has been generated, it can now be thousands of
> different strings, and you can't just look them up from a table any
> more.
> 
> Real examples from David's patch-series:
> 
>                 errorf(fc, "%s: Lookup failure for '%s'",
>                        desc->name, param->key);
> 
>                 errorf(fc, "%s: Non-blockdev passed to '%s'",
>                        desc->name, param->key);
> 
> which means that by the time user space sees it, you can't just "look
> up the string". The string will have various random key names etc in
> it.

That's fair, it becomes a good deal more complex than gettext() can
cope with.  Someone sufficiently determined could regex-match the
"Non-blockdev passed to" and translate that, if they really want to
sell a computer system to the Office québécois de la langue française.
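
To make the regex idea concrete, here is a rough userspace sketch using
POSIX regcomp()/regexec().  The helper name translate_non_blockdev() and
the French wording are made up for illustration; nothing like this exists
in David's series, and a real tool would need one pattern per message
template.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch series): recognise the
 * template "<desc>: Non-blockdev passed to '<key>'" in an already
 * formatted kernel message and rebuild it in French.
 * Returns 0 on a match, -1 if the message does not fit the template.
 */
static int translate_non_blockdev(const char *msg, char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[3];	/* whole match, desc name, key */
	int ret = -1;

	if (regcomp(&re, "^(.*): Non-blockdev passed to '(.*)'$",
		    REG_EXTENDED))
		return -1;

	if (regexec(&re, msg, 3, m, 0) == 0) {
		/* Copy the captured substrings into the translated template. */
		snprintf(out, outlen,
			 "%.*s: '%.*s' n'est pas un périphérique bloc",
			 (int)(m[1].rm_eo - m[1].rm_so), msg + m[1].rm_so,
			 (int)(m[2].rm_eo - m[2].rm_so), msg + m[2].rm_so);
		ret = 0;
	}

	regfree(&re);
	return ret;
}
```

Every other message needs its own pattern, and any rewording in the kernel
silently breaks the match, which is rather the point being made above.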

> But the alternative, passing things as format strings and raw data and
> having all the rules for a "good gettext interface", is worse. It gets
> very ugly very quickly.

Yes, catching all the corner cases gets ridiculously hard, particularly
plurals.
https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html#Plural-forms
really makes my brain hurt, and makes me glad I work on simple things
like OS kernels.

> So I really think the best option is "Ignore the problem". The system
> calls will still continue to report the basic error numbers (EINVAL
> etc), and the extended error strings will be just that: extended error
> strings. Ignore them if you can't understand them.
> 
> That said, people have wanted these kinds of extended error
> descriptors forever, and the reason we haven't added them is that it
> generally is more pain than it is necessarily worth. I'm not actually
> at all convinced that has magically changed with the mount
> configuration thing.

I'm not convinced we want to do this either, but if there's anywhere we
do want to do it then mount seems like one of the few places it might be
worth doing.  The reasons that a mount failed are many, and it doesn't
seem like a good idea to introduce a new errno every time a network
filesystem finds a new failure mode.

I do think it might be worth marking the strings specially to indicate
"This is returned to userspace, do not adjust the spelling lightly".
I mean, we haven't even put the 'e' back on creat yet, so we're clearly
willing to live with bad spelling.
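
For what it's worth, the marking could be as cheap as a no-op macro, in
the spirit of the N_()/gettext_noop() idiom userspace already uses.  The
macro name USER_VISIBLE() and the helper below are invented for this
sketch; snprintf() stands in for errorf() so that it builds outside the
kernel.

```c
#include <stdio.h>

/*
 * Hypothetical marker: expands to its argument unchanged.  Its only job
 * is to let grep or a checker script find the format strings that are
 * returned to userspace, so nobody adjusts their spelling lightly.
 */
#define USER_VISIBLE(fmt) fmt

/* Stand-in for an errorf() call site; the name is an assumption. */
static int format_lookup_failure(char *buf, size_t len,
				 const char *desc, const char *key)
{
	return snprintf(buf, len,
			USER_VISIBLE("%s: Lookup failure for '%s'"),
			desc, key);
}
```

The macro costs nothing at runtime; whether anyone would actually police
the marked strings is a separate question.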