Re: [PATCH v2 06/53] CIFS: Add missing unicode handling routines needed by smb2

Jeff Layton <jlayton@xxxxxxxxxx> · Fri, 30 Dec 2011 07:42:08 -0500

On Thu, 29 Dec 2011 19:01:24 +0400
Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:

> 2011/11/1 Shirish Pargaonkar <shirishpargaonkar@xxxxxxxxx>:
> > On Fri, Oct 28, 2011 at 2:54 PM, Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote:
> >> From: Steve French <sfrench@xxxxxxxxxx>
> >>
> >> Signed-off-by: Steve French <sfrench@xxxxxxxxxx>
> >> Signed-off-by: Pavel Shilovsky <piastry@xxxxxxxxxxx>
> >> ---
> >>  fs/cifs/cifs_unicode.c |   61 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>  fs/cifs/cifs_unicode.h |    7 +++++
> >>  2 files changed, 68 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/fs/cifs/cifs_unicode.c b/fs/cifs/cifs_unicode.c
> >> index 1b2e180..7f09423 100644
> >> --- a/fs/cifs/cifs_unicode.c
> >> +++ b/fs/cifs/cifs_unicode.c
> >> @@ -330,3 +330,64 @@ ctoUCS_out:
> >>        return i;
> >>  }
> >>
> >> +#ifdef CONFIG_CIFS_SMB2
> >> +/*
> >> + * smb2_local_to_ucs2_bytes - how long will a string be after conversion?
> >> + * @from - pointer to input string
> >> + * @maxbytes - don't go past this many bytes of input string
> >> + * @codepage - source codepage
> >> + *
> >> + * Walk a string and return the number of bytes that the string will
> >> + * be after being converted to the given charset, not including any null
> >> + * termination required. Don't walk past maxbytes in the source buffer.
> >> + */
> >> +
> >> +int
> >> +smb2_local_to_ucs2_bytes(const char *from, int len,
> >> +                         const struct nls_table *codepage)
> >> +{
> >> +       int charlen;
> >> +       int i;
> >> +       wchar_t wchar_to;
> >> +
> >> +       if (from == NULL)
> >> +               return 0;

		^^^^^^^^
Are there really cases where you'll pass in a NULL pointer here?

> >> +       for (i = 0; len && *from; i++, from += charlen, len -= charlen) {
> >> +               charlen = codepage->char2uni(from, len, &wchar_to);
								^^^^
						This may result	in an unaligned
						access. You should use
						put_unaligned to place this
						in the string.

> >> +               /* Failed conversion defaults to a question mark */
> >> +               if (charlen < 1)
> >> +                       charlen = 1;
> >> +       }
> >> +       return 2 * i; /* UCS characters are two bytes */
> >> +}
> >> +
> >> +/*
> >> + * smb2_strndup_to_ucs - copy a string to wire format from the local codepage
> >> + * @src - source string
> >> + * @maxlen - don't walk past this many bytes in the source string
> >> + * @ucslen - the length of the allocated string in bytes (including null)
> >> + * @codepage - source codepage
> >> + *
> >> + * Take a string convert it from the local codepage to UCS2 and
> >> + * put it in a new buffer. Returns a pointer to the new string or NULL on
> >> + * error.
> >> + */
> >> +__le16 *
> >> +smb2_strndup_to_ucs(const char *src, const int maxlen, int *ucs_len,
> >> +            const struct nls_table *codepage)
> >> +{
> >> +       int len;
> >> +       __le16 *dst;
> >> +
> >> +       len = smb2_local_to_ucs2_bytes(src, maxlen, codepage);
> >> +       len += 2; /* NULL */
> >> +       dst = kmalloc(len, GFP_KERNEL);
> >> +       if (!dst) {
> >> +               *ucs_len = 0;
> >> +               return NULL;
> >> +       }
> >> +       cifs_strtoUCS(dst, src, maxlen, codepage);
> >> +       *ucs_len = len;
> >> +       return dst;
> >> +}
> >> +#endif /* CONFIG_CIFS_SMB2 */
> >> diff --git a/fs/cifs/cifs_unicode.h b/fs/cifs/cifs_unicode.h
> >> index 6d02fd5..e00f677 100644
> >> --- a/fs/cifs/cifs_unicode.h
> >> +++ b/fs/cifs/cifs_unicode.h
> >> @@ -380,4 +380,11 @@ UniStrlwr(register wchar_t *upin)
> >>
> >>  #endif
> >>
> >> +#ifdef CONFIG_CIFS_SMB2
> >> +extern int smb2_local_to_ucs2_bytes(const char *from, int len,
> >> +                                   const struct nls_table *codepage);
> >> +extern __le16 *smb2_strndup_to_ucs(const char *src, const int maxlen,
> >> +                                  int *ucs_len, const struct nls_table *cp);
> >> +#endif /* CONFIG_CIFS_SMB2 */
> >> +
> >>  #endif /* _CIFS_UNICODE_H */
> >> --
> >> 1.7.1
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>
> >
> > Should these functions be renamed to *utf16* instead of *ucs*
> > to reflect the unicode encoding used in cifs/smb2 protocol?
> 
> Seems no problem to rename it.
> 
> Jeff, Steve, thoughts? If no objections - I will rename *ucs* to *utf16*.
> 
Yeah, now that I look over the NLS code, it does seem to be converting
to UTF16 and not to UCS2. You should probably rename the existing *ucs*
functions in cifs_unicode.c as well.

Also, we should probably settle on a prefix for these that makes sense.
These are only currently used by smb2 code, but they aren't truly
smb2-specific. I'd name them cifs_* myself, but maybe something more
neutral makes sense?

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html