Re: [PATCH v2 06/53] CIFS: Add missing unicode handling routines needed by smb2

Pavel Shilovsky <piastry@xxxxxxxxxxx> · Thu, 12 Jan 2012 14:47:31 +0400

2011/12/30 Jeff Layton <jlayton@xxxxxxxxxx>:
> On Thu, 29 Dec 2011 19:01:24 +0400
> Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>
>> 2011/11/1 Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx>:
>> > On Fri, Oct 28, 2011 at 2:54 PM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
>> >> From: Steve French <sfrench@xxxxxxxxxx>
>> >>
>> >> Signed-off-by: Steve French <sfrench@xxxxxxxxxx>
>> >> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
>> >> ---
>> >>  fs/cifs/cifs_unicode.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++
>> >>  fs/cifs/cifs_unicode.h |    7 +++++
>> >>  2 files changed, 68 insertions(+), 0 deletions(-)
>> >>
>> >> diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
>> >> index 1b2e180..7f09423 100644
>> >> --- a/fs/cifs/cifs_unicode.c
>> >> +++ b/fs/cifs/cifs_unicode.c
>> >> @@ -330,3 +330,64 @@ ctoUCS_out:
>> >>        return i;
>> >>  }
>> >>
>> >> +#ifdef CONFIG_CIFS_SMB2
>> >> +/*
>> >> + * smb2_local_to_ucs2_bytes - how long will a string be after conversion?
>> >> + * @from - pointer to input string
>> >> + * @maxbytes - don't go past this many bytes of input string
>> >> + * @codepage - source codepage
>> >> + *
>> >> + * Walk a string and return the number of bytes that the string will
>> >> + * be after being converted to the given charset, not including any null
>> >> + * termination required. Don't walk past maxbytes in the source buffer.
>> >> + */
>> >> +
>> >> +int
>> >> +smb2_local_to_ucs2_bytes(const char *from, int len,
>> >> +                         const struct nls_table *codepage)
>> >> +{
>> >> +       int charlen;
>> >> +       int i;
>> >> +       wchar_t wchar_to;
>> >> +
>> >> +       if (from == NULL)
>> >> +               return 0;
>
>                ^^^^^^^^
> Are there really cases where you'll pass in a NULL pointer here?

Good catch, thanks!
>
>> >> +       for (i = 0; len && *from; i++, from += charlen, len -= charlen) {
>> >> +               charlen = codepage->char2uni(from, len, &wchar_to);
>                                                                ^^^^
>                                                This may result in an unaligned
>                                                access. You should use
>                                                put_unaligned to place this
>                                                in the string.

We only count the length of new unicode string here - so, don't need
to put it somewhere.

>
>> >> +               /* Failed conversion defaults to a question mark */
>> >> +               if (charlen < 1)
>> >> +                       charlen = 1;
>> >> +       }
>> >> +       return 2 * i; /* UCS characters are two bytes */
>> >> +}
>> >> +
>> >> +/*
>> >> + * smb2_strndup_to_ucs - copy a string to wire format from the local codepage
>> >> + * @src - source string
>> >> + * @maxlen - don't walk past this many bytes in the source string
>> >> + * @ucslen - the length of the allocated string in bytes (including null)
>> >> + * @codepage - source codepage
>> >> + *
>> >> + * Take a string convert it from the local codepage to UCS2 and
>> >> + * put it in a new buffer. Returns a pointer to the new string or NULL on
>> >> + * error.
>> >> + */
>> >> +__le16 *
>> >> +smb2_strndup_to_ucs(const char *src, const int maxlen, int *ucs_len,
>> >> +            const struct nls_table *codepage)
>> >> +{
>> >> +       int len;
>> >> +       __le16 *dst;
>> >> +
>> >> +       len = smb2_local_to_ucs2_bytes(src, maxlen, codepage);
>> >> +       len += 2; /* NULL */
>> >> +       dst = kmalloc(len, GFP_KERNEL);
>> >> +       if (!dst) {
>> >> +               *ucs_len = 0;
>> >> +               return NULL;
>> >> +       }
>> >> +       cifs_strtoUCS(dst, src, maxlen, codepage);
>> >> +       *ucs_len = len;
>> >> +       return dst;
>> >> +}
>> >> +#endif /* CONFIG_CIFS_SMB2 */
>> >> diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
>> >> index 6d02fd5..e00f677 100644
>> >> --- a/fs/cifs/cifs_unicode.h
>> >> +++ b/fs/cifs/cifs_unicode.h
>> >> @@ -380,4 +380,11 @@ UniStrlwr(register wchar_t *upin)
>> >>
>> >>  #endif
>> >>
>> >> +#ifdef CONFIG_CIFS_SMB2
>> >> +extern int smb2_local_to_ucs2_bytes(const char *from, int len,
>> >> +                                   const struct nls_table *codepage);
>> >> +extern __le16 *smb2_strndup_to_ucs(const char *src, const int maxlen,
>> >> +                                  int *ucs_len, const struct nls_table *cp);
>> >> +#endif /* CONFIG_CIFS_SMB2 */
>> >> +
>> >>  #endif /* _CIFS_UNICODE_H */
>> >> --
>> >> 1.7.1
>> >>
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
>> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> >>
>> >
>> > Should these functions be renamed to *utf16* instead of *ucs*
>> > to reflect the unicode encoding used in cifs/smb2 protocol?
>>
>> Seems no problem to rename it.
>>
>> Jeff, Steve, thoughts? If no objections - I will rename *ucs* to *utf16*.
>>
> Yeah, now that I look over the NLS code, it does seem to be converting
> to UTF16 and not to UCS2. You should probably rename the existing *ucs*
> functions in cifs_unicode.c as well.
>
> Also, we should probably settle on a prefix for these that makes sense.
> These are only currently used by smb2 code, but they aren't truly
> smb2-specific. I'd name them cifs_* myself, but maybe something more
> neutral makes sense?
>

I think, cifs_* will be ok.

-- 
Best regards,
Pavel Shilovsky.
--
To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html