Re: [PATCH v2 06/53] CIFS: Add missing unicode handling routines needed by smb2

Jeff Layton <jlayton@xxxxxxxxx> · Thu, 12 Jan 2012 15:48:14 -0500

On Thu, 12 Jan 2012 13:58:50 -0600
Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> wrote:

> On Thu, Jan 12, 2012 at 12:42 PM, Jeff Layton <jlayton@xxxxxxxxx> wrote:
> > On Thu, 12 Jan 2012 11:22:35 -0600
> > Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> wrote:
> >
> >> On Fri, Oct 28, 2011 at 2:54 PM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
> >> > From: Steve French <sfrench@xxxxxxxxxx>
> >> >
> >> > Signed-off-by: Steve French <sfrench@xxxxxxxxxx>
> >> > Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
> >> > ---
> >> >  fs/cifs/cifs_unicode.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++
> >> >  fs/cifs/cifs_unicode.h |    7 +++++
> >> >  2 files changed, 68 insertions(+), 0 deletions(-)
> >> >
> >> > diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
> >> > index 1b2e180..7f09423 100644
> >> > --- a/fs/cifs/cifs_unicode.c
> >> > +++ b/fs/cifs/cifs_unicode.c
> >> > @@ -330,3 +330,64 @@ ctoUCS_out:
> >> >        return i;
> >> >  }
> >> >
> >> > +#ifdef CONFIG_CIFS_SMB2
> >> > +/*
> >> > + * smb2_local_to_ucs2_bytes - how long will a string be after conversion?
> >> > + * @from - pointer to input string
> >> > + * @len - don't go past this many bytes of input string
> >> > + * @codepage - source codepage
> >> > + *
> >> > + * Walk a string and return the number of bytes that the string will
> >> > + * be after being converted from the given codepage to UCS-2, not including
> >> > + * any null termination required. Don't walk past len in the source buffer.
> >> > + */
> >> > +
> >> > +int
> >> > +smb2_local_to_ucs2_bytes(const char *from, int len,
> >> > +                         const struct nls_table *codepage)
> >> > +{
> >> > +       int charlen;
> >> > +       int i;
> >> > +       wchar_t wchar_to;
> >> > +
> >> > +       if (from == NULL)
> >> > +               return 0;
> >> > +       for (i = 0; len && *from; i++, from += charlen, len -= charlen) {
> >> > +               charlen = codepage->char2uni(from, len, &wchar_to);
> >>
> >> Why call function char2uni?  It either returns 1 or EINVAL.
> >> If it returns EINVAL, charlen is set to 1.  So either way charlen
> >> will always be 1.
> >> So why not just call strlen(from) to determine the length of the string?
> >>
> >
> > I thought char2uni returned the width of the character. That's not
> > necessarily going to be 1 in all cases.
> >
> 
> It is very confusing.  Does "width of the character" mean the width of
> the unicode character?  If so, shouldn't it be the same for all unicode
> characters?  Or does it mean the width of the encoding of a unicode
> character?  If so, I do not see any char2uni function anywhere in the
> fs/nls directory that returns the width of the utf16 encoding of the
> unicode character corresponding to a character within the native
> codepage.
> 
> 

I meant the width of the original character. The result is always going
to be 2 bytes, but the original character can be up to 4 bytes long in
the case of UTF-8.

I think Pavel is quite correct to call char2uni this way in order to
determine the length of the resulting string.
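
To make that concrete, here is a rough userspace sketch (not kernel
code, and not what the patch compiles to). utf8_char2uni() is a
hypothetical stand-in for codepage->char2uni() that only understands
UTF-8 and assumes a 16-bit wchar_t; local_to_ucs2_bytes() just mirrors
the loop in the patch. The names and the simplified decoder are my
assumptions for illustration only.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an NLS table's char2uni() method, UTF-8 only.
 * Returns the number of source bytes consumed, or -1 on malformed or
 * truncated input. Simplified: no overlong-encoding checks, and 4-byte
 * sequences are rejected because they do not fit in one 16-bit unit.
 */
static int utf8_char2uni(const unsigned char *s, int len, unsigned short *uni)
{
	unsigned int cp;
	int n, i;

	if (len < 1)
		return -1;
	if (s[0] < 0x80) {
		n = 1;
		cp = s[0];
	} else if ((s[0] & 0xe0) == 0xc0) {
		n = 2;
		cp = s[0] & 0x1f;
	} else if ((s[0] & 0xf0) == 0xe0) {
		n = 3;
		cp = s[0] & 0x0f;
	} else {
		return -1;
	}
	if (n > len)
		return -1;
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return -1;
		cp = (cp << 6) | (s[i] & 0x3f);
	}
	*uni = (unsigned short)cp;
	return n;
}

/* Roughly the loop from smb2_local_to_ucs2_bytes(), minus the nls_table. */
static int local_to_ucs2_bytes(const char *from, int len)
{
	int charlen, i;
	unsigned short wc;

	if (from == NULL)
		return 0;
	for (i = 0; len > 0 && *from; i++, from += charlen, len -= charlen) {
		charlen = utf8_char2uni((const unsigned char *)from, len, &wc);
		/* Failed conversion counts as a single '?' */
		if (charlen < 1)
			charlen = 1;
	}
	return 2 * i;	/* two bytes per converted character */
}
```

With the 6-byte input "h\xc3\xa9llo" (the e-acute takes two bytes in
UTF-8), the loop runs five times and returns 10, whereas 2 * strlen()
would give 12. That difference is why strlen() alone isn't enough here.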

> >> > +               /* Failed conversion defaults to a question mark */
> >> > +               if (charlen < 1)
> >> > +                       charlen = 1;
> >> > +       }
> >> > +       return 2 * i; /* UCS characters are two bytes */
> >> > +}
> >> > +
> >> > +/*
> >> > + * smb2_strndup_to_ucs - copy a string to wire format from the local codepage
> >> > + * @src - source string
> >> > + * @maxlen - don't walk past this many bytes in the source string
> >> > + * @ucs_len - the length of the allocated string in bytes (including null)
> >> > + * @codepage - source codepage
> >> > + *
> >> > + * Take a string, convert it from the local codepage to UCS2 and
> >> > + * put it in a new buffer. Returns a pointer to the new string or NULL on
> >> > + * error.
> >> > + */
> >> > +__le16 *
> >> > +smb2_strndup_to_ucs(const char *src, const int maxlen, int *ucs_len,
> >> > +            const struct nls_table *codepage)
> >> > +{
> >> > +       int len;
> >> > +       __le16 *dst;
> >> > +
> >> > +       len = smb2_local_to_ucs2_bytes(src, maxlen, codepage);
> >> > +       len += 2; /* NULL */
> >> > +       dst = kmalloc(len, GFP_KERNEL);
> >> > +       if (!dst) {
> >> > +               *ucs_len = 0;
> >> > +               return NULL;
> >> > +       }
> >> > +       cifs_strtoUCS(dst, src, maxlen, codepage);
> >> > +       *ucs_len = len;
> >> > +       return dst;
> >> > +}
> >> > +#endif /* CONFIG_CIFS_SMB2 */
> >> > diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
> >> > index 6d02fd5..e00f677 100644
> >> > --- a/fs/cifs/cifs_unicode.h
> >> > +++ b/fs/cifs/cifs_unicode.h
> >> > @@ -380,4 +380,11 @@ UniStrlwr(register wchar_t *upin)
> >> >
> >> >  #endif
> >> >
> >> > +#ifdef CONFIG_CIFS_SMB2
> >> > +extern int smb2_local_to_ucs2_bytes(const char *from, int len,
> >> > +                                   const struct nls_table *codepage);
> >> > +extern __le16 *smb2_strndup_to_ucs(const char *src, const int maxlen,
> >> > +                                  int *ucs_len, const struct nls_table *cp);
> >> > +#endif /* CONFIG_CIFS_SMB2 */
> >> > +
> >> >  #endif /* _CIFS_UNICODE_H */
> >> > --
> >> > 1.7.1
> >> >
> >> > --
> >> > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> >> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >
> > --
> > Jeff Layton <jlayton@xxxxxxxxx>

-- 
Jeff Layton <jlayton@xxxxxxxxx>