Re: [PATCH v2 06/53] CIFS: Add missing unicode handling routines needed by smb2

Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> · Thu, 12 Jan 2012 13:58:50 -0600

On Thu, Jan 12, 2012 at 12:42 PM, Jeff Layton <jlayton@xxxxxxxxx> wrote:
> On Thu, 12 Jan 2012 11:22:35 -0600
> Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx> wrote:
>
>> On Fri, Oct 28, 2011 at 2:54 PM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>> > From: Steve French <sfrench@xxxxxxxxxx>
>> >
>> > Signed-off-by: Steve French <sfrench@xxxxxxxxxx>
>> > Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
>> > ---
>> >  fs/cifs/cifs_unicode.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++
>> >  fs/cifs/cifs_unicode.h |    7 +++++
>> >  2 files changed, 68 insertions(+), 0 deletions(-)
>> >
>> > diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
>> > index 1b2e180..7f09423 100644
>> > --- a/fs/cifs/cifs_unicode.c
>> > +++ b/fs/cifs/cifs_unicode.c
>> > @@ -330,3 +330,64 @@ ctoUCS_out:
>> >        return i;
>> >  }
>> >
>> > +#ifdef CONFIG_CIFS_SMB2
>> > +/*
>> > + * smb2_local_to_ucs2_bytes - how long will a string be after conversion?
>> > + * @from - pointer to input string
>> > + * @maxbytes - don't go past this many bytes of input string
>> > + * @codepage - source codepage
>> > + *
>> > + * Walk a string and return the number of bytes that the string will
>> > + * be after being converted to the given charset, not including any null
>> > + * termination required. Don't walk past maxbytes in the source buffer.
>> > + */
>> > +
>> > +int
>> > +smb2_local_to_ucs2_bytes(const char *from, int len,
>> > +                         const struct nls_table *codepage)
>> > +{
>> > +       int charlen;
>> > +       int i;
>> > +       wchar_t wchar_to;
>> > +
>> > +       if (from == NULL)
>> > +               return 0;
>> > +       for (i = 0; len && *from; i++, from += charlen, len -= charlen) {
>> > +               charlen = codepage->char2uni(from, len, &wchar_to);
>>
>> Why call function char2uni?  It either returns 1 or EINVAL.
>> If it returns EINVAL, charlen is set to 1.  So either way charlen
>> will always be 1.
>> So why not just call strlen(from) to determine the length of the string?
>>
>
> I thought char2uni returned the width of the character. That's not
> necessarily going to be 1 in all cases.
>

It is very confusing.   Does width of the character means width of the
unicode character?  If so, should not it be the same for all unicode
characters?  Or by width of the character, is it meant the width of
the encoding of a unicode character?  If so, I do not see anywhere
in fs/nls direcotry any char2uni function returning width of a
utf16 encoding of a unicode character corrosponding to a character
within the native codepath

>> > +               /* Failed conversion defaults to a question mark */
>> > +               if (charlen < 1)
>> > +                       charlen = 1;
>> > +       }
>> > +       return 2 * i; /* UCS characters are two bytes */
>> > +}
>> > +
>> > +/*
>> > + * smb2_strndup_to_ucs - copy a string to wire format from the local codepage
>> > + * @src - source string
>> > + * @maxlen - don't walk past this many bytes in the source string
>> > + * @ucslen - the length of the allocated string in bytes (including null)
>> > + * @codepage - source codepage
>> > + *
>> > + * Take a string convert it from the local codepage to UCS2 and
>> > + * put it in a new buffer. Returns a pointer to the new string or NULL on
>> > + * error.
>> > + */
>> > +__le16 *
>> > +smb2_strndup_to_ucs(const char *src, const int maxlen, int *ucs_len,
>> > +            const struct nls_table *codepage)
>> > +{
>> > +       int len;
>> > +       __le16 *dst;
>> > +
>> > +       len = smb2_local_to_ucs2_bytes(src, maxlen, codepage);
>> > +       len += 2; /* NULL */
>> > +       dst = kmalloc(len, GFP_KERNEL);
>> > +       if (!dst) {
>> > +               *ucs_len = 0;
>> > +               return NULL;
>> > +       }
>> > +       cifs_strtoUCS(dst, src, maxlen, codepage);
>> > +       *ucs_len = len;
>> > +       return dst;
>> > +}
>> > +#endif /* CONFIG_CIFS_SMB2 */
>> > diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
>> > index 6d02fd5..e00f677 100644
>> > --- a/fs/cifs/cifs_unicode.h
>> > +++ b/fs/cifs/cifs_unicode.h
>> > @@ -380,4 +380,11 @@ UniStrlwr(register wchar_t *upin)
>> >
>> >  #endif
>> >
>> > +#ifdef CONFIG_CIFS_SMB2
>> > +extern int smb2_local_to_ucs2_bytes(const char *from, int len,
>> > +                                   const struct nls_table *codepage);
>> > +extern __le16 *smb2_strndup_to_ucs(const char *src, const int maxlen,
>> > +                                  int *ucs_len, const struct nls_table *cp);
>> > +#endif /* CONFIG_CIFS_SMB2 */
>> > +
>> >  #endif /* _CIFS_UNICODE_H */
>> > --
>> > 1.7.1
>> >
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
> --
> Jeff Layton <jlayton@xxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html