Re: [PATCH 3/9] VFS: Introduce a mount context

Jeff Layton <jlayton@xxxxxxxxxx> · Wed, 10 May 2017 09:48:51 -0400

On Wed, 2017-05-10 at 15:30 +0200, Miklos Szeredi wrote:
> On Wed, May 10, 2017 at 3:20 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Wed, 2017-05-10 at 09:05 +0100, David Howells wrote:
> > > Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:
> > > 
> > > > Possible rule of thumb: use it only at the place where the error
> > > > originates and not where errors are just passed on.  This would result
> > > > in at most one report per syscall, normally.
> > > > 
> > 
> > That might be hard to enforce in practice once you get into some
> > complicated layering. What if we have device_mapper setting this along
> > with filesystems too? We need clear rules here.
> 
> If the error originates in the devicemapper, then why would the
> filesystem set it?
> 
> There's always a root cause of an error and that should be where the
> detailed error is set.
> 
> Am I missing something?
> 

I was thinking that you'd need some well-defined way to tell whether the
string should be replaced. If the thing just hangs out across syscalls,
then you don't know when it got put there. Is it a leftover from a
previous syscall or did a lower layer just put it there?

But...maybe I'm making assumptions about how this would work and I
should just wait until there are patches in flight. Getting the lifetime
of these strings right will be crucial though.
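
To make the question concrete, here is a rough userspace model of one
possible rule set, combining "clear on syscall entry" with "only the
originating layer sets the string". This is only a sketch: struct task_err,
err_syscall_enter() and err_set() are made-up names, not anything in the
patchset, and malloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-task slot for a detailed error string. */
struct task_err {
	char *msg;	/* heap copy owned by the slot, or NULL */
};

/* Called at "syscall entry": drop whatever the previous call left behind. */
static void err_syscall_enter(struct task_err *te)
{
	free(te->msg);
	te->msg = NULL;
}

/*
 * Record a detailed error. Only the first caller within a syscall wins,
 * so the layer where the error originates is the one that gets reported.
 * Returns 1 if the message was stored, 0 if it was ignored (a message is
 * already present) or the allocation failed.
 */
static int err_set(struct task_err *te, const char *msg)
{
	size_t len;

	if (te->msg)
		return 0;
	len = strlen(msg) + 1;
	te->msg = malloc(len);
	if (!te->msg)
		return 0;
	memcpy(te->msg, msg, len);
	return 1;
}
```

With rules like these, a non-NULL string can only have come from the
current syscall, so an upper layer never has to guess whether it is stale.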

> > 
> > > > And the static string thing that David implemented is also a very good
> > > > idea, IMO.
> > > 
> > > There is an issue with it: it's fine as long as you keep a ref on the module
> > > that generated it or clear all strings as part of module removal (which the
> > > mount context in this patchset does).  With the NFS mount context I did, I
> > > have to keep a ref on the NFS protocol module as well as the NFS filesystem
> > > module.
> > > 
> > > I'm tempted to make it conditionally copy the string using kvasprintf_const()
> > > - which would also permit format substitution.
> > > 
> > 
> > On balance, I think this is a reasonable way to pass back detailed
> > errors. Up until now, we've mostly relied on just printk'ing them. Now
> > though, a lot of larger machines are running containerized setups. Good
> > luck scraping dmesg for _your_ error in that situation. There may be
> > tons of mounts failing all over the place.
> > 
> > That said, I have some concerns here:
> > 
> > What's the lifetime of these strings? Do they just hang around forever
> > until the process goes away or they're replaced? If this becomes common,
> > then you could easily end up with an extra string allocation per task in
> > some cases. That could add up.
> 
> That's why I liked the static string thing.  It's just one assignment
> and no worries about freeing.  Not sure what to do about modules,
> though.  Can we somehow move the cost of checking the validity to the
> place where the error is retrieved?
> 

That seems a little dangerous, and static strings could be limiting.
Dynamically allocated strings seem like they could be more useful.
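
For illustration, a userspace sketch of a slot that can hold either kind
of string, loosely in the spirit of the conditional copy David mentions
above. It is not how kvasprintf_const() is implemented; struct err_str,
err_str_setf() and err_str_clear() are hypothetical names, and borrowing
the format pointer assumes the caller passes a literal that outlives the
slot (which is exactly the module-lifetime question).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error slot: either borrows a constant or owns a copy. */
struct err_str {
	const char *s;
	int owned;	/* 1 if s was malloc()ed and must be freed */
};

/* Release the string if we own it, then reset the slot. */
static void err_str_clear(struct err_str *e)
{
	if (e->owned)
		free((void *)e->s);
	e->s = NULL;
	e->owned = 0;
}

/*
 * Set the message. A format with no '%' is borrowed as-is (one pointer
 * assignment, nothing to free); anything else is formatted into a
 * freshly allocated buffer. Returns 0 on success, -1 on failure.
 */
static int err_str_setf(struct err_str *e, const char *fmt, ...)
{
	va_list ap, ap2;
	char *buf;
	int n;

	err_str_clear(e);

	if (!strchr(fmt, '%')) {
		e->s = fmt;	/* assumed to outlive the slot */
		return 0;
	}

	va_start(ap, fmt);
	va_copy(ap2, ap);
	n = vsnprintf(NULL, 0, fmt, ap);	/* measure the output */
	va_end(ap);
	if (n < 0) {
		va_end(ap2);
		return -1;
	}
	buf = malloc((size_t)n + 1);
	if (!buf) {
		va_end(ap2);
		return -1;
	}
	vsnprintf(buf, (size_t)n + 1, fmt, ap2);
	va_end(ap2);

	e->s = buf;
	e->owned = 1;
	return 0;
}
```

The borrowed case keeps the "one assignment, no freeing" property, while
the owned case allows format substitution at the cost of an allocation.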

> > 
> > One idea might be to always kfree it on syscall entry, and that might
> > mitigate the problem assuming that not everything is erroring out. Then
> > you could always do some trivial syscall to clear it manually.
> > 
> > There's also the problem of how these should be formatted. Is English ok
> > everywhere? Do we need a facility to allow translating these things?
> 
> Messages in dmesg are in English too.  If necessary userspace will do
> the translation.  I don't think the kernel would need to worry about
> that.

Fair enough. It _is_ still an improvement over dmesg, IMO.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>