Issue:

The current git-p4.py script does not work with Python 3. I tried to use the P4 integration built into Git and could not get it to run because I have Python 3.8 installed on my computer. It ran once I downgraded Python to version 2.7. However, Python 2 is reaching its end of life.

Submission:

I am submitting a patch for the git-p4.py script that partially supports Python 3.8. The code passes the basic tests (t9800) when run under Python 3, which provides basic functionality.

In an attempt to pass the t9822 P4 path-encoding test, a new parameter for git p4 clone was introduced:

    --encoding Format-identifier

This creates the Git repository following the current functionality; however, before importing the files from P4, it sets the git-p4.pathEncoding option so that any files or paths encoded in non-ASCII/non-UTF-8 formats import correctly.

Technical details:

The script was updated by futurize (https://python-future.org/futurize.html) to support Py2/Py3 syntax. The few references to classes in future were reworked so that future would not be required.

The existing code test for Unicode support was extended to normalize the classes 'unicode' and 'bytes' across Python versions:

* 'unicode' is an alias for 'str' in Py3 and is the unicode class in Py2.
* 'bytes' is bytes in Py3 and an alias for 'str' in Py2.

New coercion methods were written for both Python 2 and Python 3:

* as_string(text) - In Python 3, this decodes a UTF-8 encoded bytes object to a Unicode string.
* as_bytes(text) - In Python 3, this encodes a Unicode string to an array of bytes.
  In Python 2, these two functions do not change the data, since a 'str' object functions in both roles, as a string and as a byte array. This reduces the potential impact on backward compatibility with Python 2.
* to_unicode(text) - Ensures that the supplied data is a Unicode string, treating byte input as UTF-8. This function converts data in both Python 2 and Python 3.
* path_as_string(path) - An extension function that honors the "git-p4.pathEncoding" option when converting a set of bytes or characters to UTF-8. If the str/bytes cannot be decoded as ASCII, it uses the encodeWithUTF8() method to convert the custom-encoded bytes to Unicode in UTF-8.

Generally speaking, information in the script is converted to Unicode as early as possible and converted back to a byte array just before being passed to external programs or files. The exception to this rule is P4 repository file paths. Paths are not converted but left as 'bytes' so that the original file path encoding is preserved. This format is required for commands that interact with the P4 file path. When the file path is used by Git, it is converted with encodeWithUTF8().

Signed-off-by: Ben Keene <seraphire@xxxxxxxxx>

Ben Keene (1):
  Python3 support for t9800 tests. Basic P4/Python3 support

 git-p4.py | 825 +++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 628 insertions(+), 197 deletions(-)

base-commit: d9f6f3b6195a0ca35642561e530798ad1469bd41
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-463%2Fseraphire%2Fseraphire%2Fp4-python3-unicode-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-463/seraphire/seraphire/p4-python3-unicode-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/463

Range-diff vs v2:

 1:  0bca930ff8 < -:  ---------- Cast byte strings to unicode strings in python3
 2:  0435d0e2cb < -:  ---------- FIX: cast as unicode fails when a value is already unicode
 3:  2288690b94 < -:  ---------- FIX: wrap return for read_pipe_lines in ustring() and wrap GitLFS read of the pointer file in ustring()
 -:  ---------- > 1:  02b3843e9f Python3 support for t9800 tests. Basic P4/Python3 support

-- 
gitgitgadget
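
For readers who want a concrete picture of the coercion helpers listed under "Technical details", here is a minimal sketch of how they could behave. It is an illustration only, not the code in the patch: the function bodies are assumptions based on the descriptions above, and the path_encoding parameter is a hypothetical stand-in for reading the git-p4.pathEncoding option from the Git configuration.

```python
import sys

PY3 = sys.version_info[0] >= 3

# Normalize the names 'unicode' and 'bytes' across Python versions.
if PY3:
    unicode = str   # Py3: 'unicode' becomes an alias for 'str'
else:
    bytes = str     # Py2: 'bytes' becomes an alias for 'str'


def as_string(text):
    """Py3: decode UTF-8 bytes to str. Py2: return the data unchanged."""
    if PY3 and isinstance(text, bytes):
        return text.decode("utf-8")
    return text


def as_bytes(text):
    """Py3: encode str to UTF-8 bytes. Py2: return the data unchanged."""
    if PY3 and isinstance(text, str):
        return text.encode("utf-8")
    return text


def to_unicode(text):
    """Return a Unicode string on both Py2 and Py3 (byte input is UTF-8)."""
    if isinstance(text, unicode):
        return text
    return text.decode("utf-8")


def path_as_string(path, path_encoding="utf-8"):
    """Decode a P4 path, falling back to a custom encoding.

    path_encoding is a hypothetical parameter for this sketch; the patch
    is described as consulting git-p4.pathEncoding instead.
    """
    if isinstance(path, unicode):
        return path
    try:
        return path.decode("ascii")
    except UnicodeDecodeError:
        return path.decode(path_encoding)
```

In this sketch a path stored by P4 as cp1252 bytes, such as b"a\xe9", stays as bytes while the script talks to P4 and is decoded only when Git needs it, for example with path_as_string(b"a\xe9", "cp1252").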