Thank you for your feedback. > are there any plans to integrate the parser into connect.c and fetch ? Yes. That was my intention but I was not confident enough to touch connect.c before getting feedback from the community, since it's critical code and it is my first contribution. I do want to merge all URL parsing in git into this one function though, thereby creating a "single point of truth". This is so that if the algorithm is modified the changes are visible to the URL parser builtin as well. > Speaking as a person, who manage to break the parsing of URLs once, > with the good intention to improve things, I need to learn that > test cases are important. Absolutely agree. When adding test cases, I looked at the possibilities enumerated in urls.txt and generated test cases based on those. I also looked at the urlmatch.h test cases. However... > Some work can be seen in t5601-clone.sh ... I did not think to check those. > Especially, when dealing with literal IPv6 addresses, > the ones with [] and the simplified ssh syntax 'myhost:src' > are interesting to test. You're right about that. I shall prepare an updated v2 patchset with more test cases, and also any other changes/improvements requested by maintainers. > And some features using the [] syntax to embedd a port number > inside the simplified ssh syntax had not been documented, > but used in practise, and are now part of the test suite. > See "[myhost:123]:src" in t5601 Indeed, I did not read anything of the sort when I checked it. Would you like me to commit a note to this effect to urls.txt ? > Or is this new tool just a helper, to verify "good" URL's, > and not accepting our legacy parser quirks ? It is my intention that this builtin be able to accept, parse and decompose all types of URLs that git itself can accept. > Then we still should see some IPv6 tests ? I will add them! > Or may be not, as we prefer hostnames these days ? I would have to defer that choice to someone more experienced with the codebase. Please advise on how to proceed. > The RFC 1738 uses the term "scheme" here, and using the very generic > term "protocol" may lead to name clashes later. > Would something like "git_scheme" or so be better ? Scheme does seem like a better word if it's the terminology used by RFCs. I can change that in a new version if necessary. That code is based on the existing connect.c parsing code though. > I think that the "///" version is superflous, it should already > be covered by the "//" version I thought it was a good idea because of existing precedent: my first approach to creating the test cases was to copy the ones from t0110-urlmatch-normalization.sh which did have many cases such as those. Then as I developed the code I came to believe that it was not necessary: I call url_normalize in the url_parse function and url_normalize is already being tested. I think I just forgot to delete those lines. Reading that file over once again, it does have IPv6 address test cases. So I should probably go over it again. Thanks again for the feedback, Matheus