On Sun, Oct 9, 2016 at 1:20 PM, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > On Thu, Oct 6, 2016 at 9:51 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: >> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: >> >>> Change the log formatting function to know about "git describe" output >>> such as "v2.8.0-4-g867ad08", in addition to just plain "867ad08". >>> >>> There are still many valid refnames that we don't link to >>> e.g. v2.10.0-rc1~2^2~1 is also a valid way to refer to >>> v2.8.0-4-g867ad08, but I'm not supporting that with this commit, >>> similarly it's trivially possible to create some refnames like >>> "æ/var-gf6727b0" or which won't be picked up by this regex. >> >> Not a serious counter-proposal or suggestion, and certainly not an >> objection to the patch I am responding to, but I wonder if it is so >> bad if we made the 867ad08 into link when showing v2.8.0-4-g867ad08. >> >> IOW, in addition to \b followed by [0-9a-f]+ followed by \b, if we >> allowed an optional 'g' in front of the hex, e.g. >> >> - $line =~ s{\b([0-9a-fA-F]{7,40})\b}{ >> + $line =~ s{\bg?([0-9a-fA-F]{7,40})\b}{ >> >> wouldn't that be much simpler, covers more cases and sufficient? > > It would be more simpler, maybe that's the right approach. I have a > preference for making the entire reference a link. I think it makes > more sense for the UI. I just ran into an example of a better reason for doing it like my patch is doing, which is that if you have some tag like: deployment-20160928-171914-16-g42e13d8 With my patch the whole thing will be a link to the 42e13d8 commit, but with this suggestion both 20160928 and 42e13d8 would become commit links, the former one would be broken. Of course we could have some code that would detect that the whole \S+ is part of one thing that ends in g<commit>, but the complexity in doing that would be equivalent or more to doing what my patch is doing. >>> There's surely room for improvement here, but I just wanted to address >>> the very common case of sticking "git describe" output into commit >>> messages without trying to link to all possible refnames, that's going >>> to be a rather futile exercise given that this is free text, and it >>> would be prohibitively expensive to look up whether the references in >>> question exist in our repository. >>> >>> There was on-list discussion about how we could do better than this >>> patch. Junio suggested to update parse_commits() to call a new >>> "gitweb--helper" command which would pass each of the revision >>> candidates through "rev-parse --verify --quiet". That would cut down >>> on our false positives (e.g. we'll link to "deadbeef"), and also allow >>> us to be more aggressive in selecting candidate revisions. >>> >>> That may be too expensive to work in practice, or it may >>> not. Investigating that would be a good follow-up to this patch. >>> >>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> >>> --- >>> gitweb/gitweb.perl | 18 ++++++++++++++++-- >>> 1 file changed, 16 insertions(+), 2 deletions(-) >>> >>> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl >>> index 92b5e91..7cf68f0 100755 >>> --- a/gitweb/gitweb.perl >>> +++ b/gitweb/gitweb.perl >>> @@ -2036,10 +2036,24 @@ sub format_log_line_html { >>> my $line = shift; >>> >>> $line = esc_html($line, -nbsp=>1); >>> - $line =~ s{\b([0-9a-fA-F]{7,40})\b}{ >>> + $line =~ s{ >>> + \b >>> + ( >>> + # The output of "git describe", e.g. v2.10.0-297-gf6727b0 >>> + # or hadoop-20160921-113441-20-g094fb7d >>> + (?<!-) # see strbuf_check_tag_ref(). Tags can't start with - >>> + [A-Za-z0-9.-]+ >>> + (?!\.) # refs can't end with ".", see check_refname_format() >>> + -g[0-9a-fA-F]{7,40} >>> + | >>> + # Just a normal looking Git SHA1 >>> + [0-9a-fA-F]{7,40} >>> + ) >>> + \b >>> + }{ >>> $cgi->a({-href => href(action=>"object", hash=>$1), >>> -class => "text"}, $1); >>> - }eg; >>> + }egx; >>> >>> return $line; >>> }