Re: [PATCH 08/16] nfsd: escape high characters in binary data

Andy Shevchenko <andriy.shevchenko@xxxxxxxxx> · Wed, 7 Aug 2019 12:00:07 +0300

On Tue, Aug 06, 2019 at 02:50:08PM -0400, J. Bruce Fields wrote:
> On Tue, Aug 06, 2019 at 03:19:31PM +0300, Andy Shevchenko wrote:
> > On Thu, Jun 20, 2019 at 10:51:07AM -0400, J. Bruce Fields wrote:
> > > From: "J. Bruce Fields" <bfields@xxxxxxxxxx>
> > > 
> > > I'm exposing some information about NFS clients in pseudofiles.  I
> > > expect to eventually have simple tools to help read those pseudofiles.
> > > 
> > > But it's also helpful if the raw files are human-readable to the extent
> > > possible.  It aids debugging and makes them usable on systems that don't
> > > have the latest nfs-utils.
> > > 
> > > A minor challenge there is opaque client-generated protocol objects like
> > > state owners and client identifiers.  Some clients generate those to
> > > include handy information in plain ascii.  But they may also include
> > > arbitrary byte sequences.
> > > 
> > > I think the simplest approach is to limit to isprint(c) && isascii(c)
> > > and escape everything else.
> > > 
> > > That means you can just cat the file and get something that looks OK.
> > > Also, I'm trying to keep these files legal YAML, which requires them to
> > > be UTF-8, and this is a simple way to guarantee that.
> > 
> > Two questions:
> > - why can't the original function be extended to cover this case
> >   (using additional flags, maybe)?
> 
> I found the ESCAPE_NP/"only" logic made it a little difficult to extend
> string_escape_mem().

Maybe it needs more thought?
I think it is still possible to extend the existing function rather than to add
workarounds like this one.

> So, I wrote a patch series that removes the string_escape_mem flags that
> aren't used

Have you considered the potential users that can be converted to use
string_escape_mem()?

I know of at least one (it needs to be reworked a bit, and that work is
progressing slowly).

There are potentially others that could be converted using the "unused" flags.

>, simplifies it a bit, then separates the flags into two
> different types: those that select which characters to escape
> (non-printable, non-ascii, whitespace, etc.) and those that choose a
> style of escaping to use (octal, hex, or \\).  That seems to make the
> code a little easier to extend while still covering the cases people
> actually use.  I'll try to get those out this week and you can tell me
> what you think.

Will be glad to help!
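
To make the proposed split concrete, here is a minimal userspace sketch, not
kernel code. It assumes one set of flags that selects which bytes to escape and
a separate value that picks the escape style. All names in it (SEL_*, STYLE_*,
sketch_escape) are hypothetical; it does not use string_escape_mem() or the
existing ESCAPE_* flags, and the real series may well look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "which bytes to escape" flags (can be OR-ed together). */
enum esc_select {
	SEL_NONPRINT  = 1 << 0,	/* anything outside printable ASCII 0x20..0x7e */
	SEL_BACKSLASH = 1 << 1,	/* '\\' itself, so the output stays unambiguous */
};

/* Hypothetical "how to escape" style (exactly one is chosen). */
enum esc_style {
	STYLE_OCTAL,	/* \ooo */
	STYLE_HEX,	/* \xhh */
};

static int needs_escape(unsigned char c, unsigned int select)
{
	if ((select & SEL_NONPRINT) && (c < 0x20 || c > 0x7e))
		return 1;
	if ((select & SEL_BACKSLASH) && c == '\\')
		return 1;
	return 0;
}

/*
 * Escape src[0..len) into dst, which holds osz bytes.  dst is always
 * NUL-terminated when osz > 0, and a token that does not fit completely is
 * not written at all (no partial escapes).  Returns the length the complete
 * output would need, excluding the NUL, in the spirit of snprintf().
 */
static size_t sketch_escape(const unsigned char *src, size_t len,
			    char *dst, size_t osz,
			    unsigned int select, enum esc_style style)
{
	size_t need = 0;	/* bytes the complete output needs */
	size_t used = 0;	/* bytes actually stored in dst */
	int full = 0;
	size_t i;

	assert(src != NULL || len == 0);

	for (i = 0; i < len; i++) {
		char tmp[5];	/* longest token is 4 bytes plus NUL */
		int n;

		if (needs_escape(src[i], select)) {
			n = snprintf(tmp, sizeof(tmp),
				     style == STYLE_HEX ? "\\x%02x" : "\\%03o",
				     (unsigned int)src[i]);
		} else {
			tmp[0] = (char)src[i];
			n = 1;
		}

		/* Store the token only if it fits with room left for the NUL. */
		if (!full && used + (size_t)n < osz) {
			memcpy(dst + used, tmp, (size_t)n);
			used += (size_t)n;
		} else {
			full = 1;
		}
		need += (size_t)n;
	}

	if (osz)
		dst[used] = '\0';
	return need;
}
```

With this shape, the isprint(c) && isascii(c) rule from the patch description
corresponds to passing SEL_NONPRINT, and the octal vs. hex choice is made
independently of it.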

In any case, regarding this one, I would rather it had never appeared, or that
it now goes away in favour of an extension to string_escape_mem().

-- 
With Best Regards,
Andy Shevchenko