Re: [RFC 0/1] Leading whitespace as a function identification heuristic?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Peff

On 25/09/2020 19:43, Jeff King wrote:
On Fri, Sep 25, 2020 at 10:11:33AM +0100, Phillip Wood wrote:

If I've understood correctly when ...code 2... contains changes that are
themselves indented then we'll pick the wrong function header for example

	void foo() {
		void bar() {
			...code 1...
		}
		for (...) {
			// if this line is changed we pick bar rather
			// than foo because it is the first function
			// header with a smaller indentation than the
			// first changed line.
		}
	}

Unfortunately I suspect code like that is not uncommon and the diff would
regress with this simple heuristic. It might be possible to recalculate the
required indentation as we walk backwards up the file though, so when we hit
the "for" line we reduce the maximum indentation allowed for a match and so
skip "bar" as a function header.

Thanks, that's a great counter-example I hadn't considered.

Yes, I agree that adjusting the desired indentation as we walk backwards
would work. That's assuming indentation is hierarchical, but I think
that's implied by the existence of this feature at all.

Another possible corner case: tabs vs spaces. If I have:

   <space><space><space><space><space><space><space><space>foo
   <tab><tab>bar

which is more indented? Counting isspace(), it is the first one. But
visually, it would _usually_ be the second one. But of course it would
depend on your tabstops.

The above example is obviously stupid and contrived, but I wonder if
there are legitimate confusing cases where people mix tabs and spaces
(e.g., mixed tabs and spaces to align function parameters, etc).

To calculate the indentation for diff --color-moved-ws=allow-indentation-change in fill_es_indent_data() we use the tabwidth whitespace attribute to calculate the width of a tab in spaces

	int tab_width = es->flags & WS_TAB_WIDTH_MASK;
	/* calculate the visual width of indentation */
	while(1) {
		if (s[off] == ' ') {
			width++;
			off++;
		} else if (s[off] == '\t') {
			width += tab_width - (width % tab_width);
			while (s[++off] == '\t')
				width += tab_width;
		} else {
			break;
		}
	}

I think we could probably do something similar here assuming it is the visual width of the indentation that we care about.

Best Wishes

Phillip

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux