Bug: git p4 clone fails if non-Unicode files with +k or +ko attributes are present

Hello,

I would like to report a problem which is preventing me from using
git's Perforce integration.

Problem Command:
git p4 clone //path/to/files/

Steps to reproduce:
* Create/have a configured and running Perforce server and client.
* Create a file with any encoding aside from UTF-8 or UTF-16
containing at least one character outside the ASCII range.
* Adjust the file's type to text and include the "+k" or "+ko" attributes.
  * This can be done in the graphical p4v client by right-clicking the
    file and selecting "Change Filetype".
* Submit that file to Perforce.
* Attempt to run `git p4 clone //path/to/files/`
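
For reference, a file matching the second step could be produced with a
short script like the sketch below. The file name and contents are
arbitrary placeholders, and Latin-1 is just one example of an encoding
other than UTF-8 or UTF-16.

```python
from pathlib import Path


def sample_bytes() -> bytes:
    # In Latin-1, "é" is the single byte 0xE9. Followed by a space, that
    # byte sequence is not valid UTF-8.
    text = "caf\u00e9 $Id$\n"
    return text.encode("latin-1")


def write_sample(path: str) -> None:
    # Write the raw bytes so no implicit re-encoding takes place.
    Path(path).write_bytes(sample_bytes())
```

Calling `write_sample("latin1_keyword.txt")` creates the file, which can
then be added with a text+k or text+ko filetype and submitted.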

Expected behavior:
The Perforce repository is successfully cloned.

Actual behavior:

The clone failed with the following error:
```
Traceback (most recent call last):
  File "/usr/libexec/git-core/git-p4", line 4441, in <module>
    main()
  File "/usr/libexec/git-core/git-p4", line 4435, in main
    if not cmd.run(args):
  File "/usr/libexec/git-core/git-p4", line 4187, in run
    if not P4Sync.run(self, depotPaths):
  File "/usr/libexec/git-core/git-p4", line 4053, in run
    self.importHeadRevision(revision)
  File "/usr/libexec/git-core/git-p4", line 3761, in importHeadRevision
    self.commit(details, self.extractFilesFromCommit(details), self.branch)
  File "/usr/libexec/git-core/git-p4", line 3311, in commit
    self.streamP4Files(files)
  File "/usr/libexec/git-core/git-p4", line 3155, in streamP4Files
    p4CmdList(["-x", "-", "print"],
  File "/usr/libexec/git-core/git-p4", line 784, in p4CmdList
    cb(entry)
  File "/usr/libexec/git-core/git-p4", line 3142, in streamP4FilesCbSelf
    self.streamP4FilesCb(entry)
  File "/usr/libexec/git-core/git-p4", line 3090, in streamP4FilesCb
    self.streamOneP4File(self.stream_file, self.stream_contents)
  File "/usr/libexec/git-core/git-p4", line 3035, in streamOneP4File
    text = ''.join(decode_text_stream(c) for c in contents)
  File "/usr/libexec/git-core/git-p4", line 3035, in <genexpr>
    text = ''.join(decode_text_stream(c) for c in contents)
  File "/usr/libexec/git-core/git-p4", line 184, in decode_text_stream
    return s.decode() if isinstance(s, bytes) else s
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xde in position 6276: invalid continuation byte
```

Failure Analysis:

The problem appears to be partly in `streamOneP4File` and partly in
`p4_keywords_regexp_for_type`.

Perforce allows embedding special keywords in files that are expanded
at checkout, and `streamOneP4File` strips the expansion.
To do so, it decodes the binary data to a string, performs a
regex replace, then encodes the result back to binary data.
Unfortunately, the decode step assumes the file is encoded in UTF-8,
and when this is not the case an uncaught `UnicodeDecodeError` is raised.

Importantly, Perforce also allows injecting those keywords in raw
binary files, so this goes beyond a simple "support more encodings"
issue.
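
One possible direction, shown here only as a sketch and not as the
actual git-p4 code or a tested patch, is to run the regex directly on
the bytes so that no decoding is needed. The names `KEYWORD_RE` and
`strip_keywords` are hypothetical, and the keyword list is an assumption
that may not match what `p4_keywords_regexp_for_type` returns for a
given filetype.

```python
import re

# Assumed keyword list, for illustration only. This is a bytes pattern,
# so it can be applied to file contents in any encoding, including
# raw binary data.
KEYWORD_RE = re.compile(
    rb"\$(Id|Header|Author|Date|DateTime|Change|File|Revision):[^$\n]*\$"
)


def strip_keywords(data: bytes) -> bytes:
    # Collapse an expanded keyword such as b"$Id: //depot/a.txt#3 $"
    # back to its unexpanded form b"$Id$", leaving all other bytes alone.
    return KEYWORD_RE.sub(rb"$\1$", data)
```

With this approach, input such as `b"caf\xde $Id: //depot/a.txt#3 $\n"`
is processed without ever calling `.decode()`, so the byte 0xde that
triggers the error above is passed through unchanged.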

https://community.perforce.com/s/article/3482

[System Info]
git version 2.31.1

[Enabled Hooks]
not run from a git repository - no hooks to show

Sincerely,
Arthur Moore


