From: Ben Keene <seraphire@xxxxxxxxx>

Python 3+ handles strings differently than Python 2.7. Since Python 2
is reaching its end of life, a series of changes are being submitted to
enable Python 3.7+ support. The current code fails basic tests under
Python 3.7.

Change the existing unicode test and add new support functions for
Python 2 / Python 3 support.

Define the following variables:
- isunicode - a boolean variable that states whether the version of
  Python natively supports unicode (true) or not (false). This is true
  for Python 3 and false for Python 2.
- unicode - a type alias for the datatype that holds a unicode string.
  It is assigned to str under Python 3 and to the unicode type under
  Python 2.
- bytes - a type alias for an array of bytes. It is assigned the native
  bytes type under Python 3 and str under Python 2.

Add the following new functions:
- as_string(text) - converts a byte array to a unicode (UTF-8) string
  under Python 3. Under Python 2, this returns the string unchanged.
- as_bytes(text) - converts a unicode string to a byte array under
  Python 3. Under Python 2, this returns the string unchanged.
- to_unicode(text) - converts a text string to Unicode (UTF-8) on both
  Python 2 and Python 3.

Add a new function alias raw_input:
If raw_input does not exist (it was renamed to input in Python 3),
alias input as raw_input.

The as_string() and as_bytes() functions allow the code to be modified
with minimal impact on Python 2 support. Where a string is expected,
as_string() is used to convert ("cast") the incoming bytes to a string
type. Conversely, as_bytes() is used to convert a string to a byte
array type. Since Python 2 overloads the datatype 'str' to serve both
purposes, the Python 2 versions of these functions do not change the
data: str already functions as both a byte array and a string.

basestring is removed since its only references are found in tests that
were changed in the previous change list.

Signed-off-by: Ben Keene <seraphire@xxxxxxxxx>
(cherry picked from commit 7921aeb3136b07643c1a503c2d9d8b5ada620356)
---
 git-p4.py | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index 0f27996393..93dfd0920a 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -32,16 +32,78 @@
     unicode = unicode
 except NameError:
     # 'unicode' is undefined, must be Python 3
-    str = str
+    #
+    # For Python3 which is natively unicode, we will use
+    # unicode for internal information but all P4 Data
+    # will remain in bytes
+    isunicode = True
     unicode = str
     bytes = bytes
-    basestring = (str,bytes)
+
+    def as_string(text):
+        """Return a byte array as a unicode string"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return unicode(text, "utf-8")
+        else:
+            return text
+
+    def as_bytes(text):
+        """Return a Unicode string as a byte array"""
+        if text == None:
+            return None
+        if isinstance(text, bytes):
+            return text
+        else:
+            return bytes(text, "utf-8")
+
+    def to_unicode(text):
+        """Return a byte array as a unicode string"""
+        return as_string(text)
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded string """
+        if isinstance(path, unicode):
+            return path
+        return encodeWithUTF8(path).decode('utf-8')
+
 else:
     # 'unicode' exists, must be Python 2
-    str = str
+    #
+    # We will treat the data as:
+    #   str   -> str
+    #   bytes -> str
+    # So for Python2 these functions are no-ops
+    # and will leave the data in the ambiguous
+    # string/bytes state
+    isunicode = False
     unicode = unicode
     bytes = str
-    basestring = basestring
+
+    def as_string(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def as_bytes(text):
+        """ Return text unaltered (for Python3 support) """
+        return text
+
+    def to_unicode(text):
+        """Return a string as a unicode string"""
+        return text.decode('utf-8')
+
+    def path_as_string(path):
+        """ Converts a path to the UTF8 encoded bytes """
+        return encodeWithUTF8(path)
+
+
+
+# Check for raw_input support
+try:
+    raw_input
+except NameError:
+    raw_input = input
 
 try:
     from subprocess import CalledProcessError
-- 
gitgitgadget
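
For readers who want to try the conversion helpers outside git-p4, here is a minimal standalone sketch of the Python 3 branch only. It is not the patch itself: it uses the equivalent text.decode()/text.encode() calls instead of the unicode/bytes aliases, and the depot path is a made-up example value, not real P4 output.

```python
# Standalone sketch of the Python 3 helpers described in the patch above.
# Assumption: this runs under Python 3 only; the Python 2 branch (no-ops)
# is intentionally left out.

def as_string(text):
    """Decode UTF-8 bytes to str; pass str and None through unchanged."""
    if text is None:
        return None
    if isinstance(text, bytes):
        return text.decode("utf-8")
    return text


def as_bytes(text):
    """Encode str to UTF-8 bytes; pass bytes and None through unchanged."""
    if text is None:
        return None
    if isinstance(text, bytes):
        return text
    return text.encode("utf-8")


# Hypothetical value standing in for data read from a p4 subprocess.
raw = b"//depot/caf\xc3\xa9/file.txt"

path = as_string(raw)        # bytes -> str for internal use
round_trip = as_bytes(path)  # str -> bytes before handing back to p4

assert round_trip == raw
assert as_string(path) is path  # already a str, returned unchanged
```

Both helpers pass through values that are already of the target type, so a call site can apply them without first checking what it received.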