Re: Fwd: dependency tree from c parser entities down to token

Konrad Eisele <konrad@xxxxxxxxxxx> · Wed, 09 May 2012 11:48:00 +0200

Christopher Li wrote:
On Mon, May 7, 2012 at 11:38 PM, Konrad Eisele<konrad@xxxxxxxxxxx>  wrote:

Yeah, I think it is better that way. You should implement it yourself first;
it kind of takes too long otherwise :-). Still, I'm
curious how you can trace macro expansion with just 1 callback,
but I'd like to be surprised.

OK, here is the initial version of the preprocessor hook.
I created a branch "unclean-preprocess-hook" for review.

http://git.kernel.org/?p=devel/sparse/chrisl/sparse.git;a=shortlog;h=refs/heads/unclean-preprocess-hook

I ended up using more than one callback, but it is still better than 6.
I also think it is easier for the caller to use, because it receives
the before and after at the same time.

struct preprocess_hook {
	void (*expand_macro)(struct token *macro, struct symbol *sym,
			     struct token **replace, struct token **replace_tail);
	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
			   struct token *orig, struct token *expanded);
};
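
For illustration, a caller might fill in the two callbacks roughly as below. This is only a sketch: the struct layout is copied from above, but count_macro, count_arg, stats and counting_hook are made-up names, struct token and struct symbol are left opaque, and how the hook gets registered with the preprocessor is not shown because that is not described here.

```c
#include <stddef.h>

struct token;   /* opaque in this sketch; the real type lives in sparse */
struct symbol;  /* likewise */

/* Layout as quoted above. */
struct preprocess_hook {
	void (*expand_macro)(struct token *macro, struct symbol *sym,
			     struct token **replace, struct token **replace_tail);
	void (*expand_arg)(struct token *macro, struct symbol *sym, int arg,
			   struct token *orig, struct token *expanded);
};

/* Hypothetical bookkeeping: just count how often each callback fires. */
struct hook_stats {
	int macros;
	int args;
};

struct hook_stats stats;

void count_macro(struct token *macro, struct symbol *sym,
		 struct token **replace, struct token **replace_tail)
{
	(void)macro; (void)sym; (void)replace; (void)replace_tail;
	stats.macros++;
}

void count_arg(struct token *macro, struct symbol *sym, int arg,
	       struct token *orig, struct token *expanded)
{
	(void)macro; (void)sym; (void)arg; (void)orig; (void)expanded;
	stats.args++;
}

/* Hypothetical hook instance a caller would hand to the preprocessor. */
struct preprocess_hook counting_hook = {
	.expand_macro = count_macro,
	.expand_arg   = count_arg,
};
```

A real hook would inspect orig/expanded and the replace list instead of counting.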

I don't think it's practical: if you have an argument that is expanded, then
when using 4 callbacks you get the calls:

  arg-expand-begin(a)
    body-expand-begin(c)
    body-expand-end(d)
  arg-expand-end(b)

When using 2 callbacks you get the calls:

    body-expand(c d)
  arg-expand-begin(a b)

But "a" is the source of both "c" and "d". The goal of all this is to
generate a tree, so you need to know where a token originated. The
tokens in "a" might be duplicated, and in your case you then don't have
enough information to reason about the origin of "c".
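
To make the ordering argument concrete, here is a minimal sketch of what the begin/end nesting buys: while an argument is "open", any body expansion can be attributed to it. The function names are modeled on the call names above, and the integer ids and fixed-size stack are assumptions of the sketch, not anything in sparse.

```c
#include <assert.h>

#define MAX_DEPTH 64

/* Stack of arguments whose expansion is currently in progress. */
static int open_args[MAX_DEPTH];
static int depth;

/* Called before the tokens of an argument are expanded. */
void arg_expand_begin(int arg_id)
{
	assert(depth < MAX_DEPTH);
	open_args[depth++] = arg_id;
}

/* Called after the argument has been fully expanded. */
void arg_expand_end(void)
{
	assert(depth > 0);
	depth--;
}

/*
 * Called when a macro body starts expanding. Returns the id of the
 * enclosing argument, or -1 if the expansion is at top level.
 */
int body_expand_begin(void)
{
	return depth > 0 ? open_args[depth - 1] : -1;
}
```

With a single callback that fires only after the argument is done, the body expansion has already been reported by the time "a" is seen, so this kind of parent lookup is not available.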

I do it like this: inside arg-expand-begin(a) I "dope" all tokens of "a"
by setting token.pos.position (which I understood I can use as I want)
to a unique id (token.pos.stream is my preprocessor stream). When a
token is duplicated in argument replacement etc., token.pos is copied
too, so the duplicates of "a" always retain the information about where
they came from. Then I can regenerate the tree.
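
The doping idea can be sketched in isolation as follows. The toy_pos and toy_token types are simplified stand-ins, not sparse's real struct position and struct token, and dope_tokens, dup_token and origin_of are hypothetical helper names; the only point is that a plain struct copy carries the id along.

```c
#include <stdlib.h>

/* Simplified stand-ins for sparse's position and token structures. */
struct toy_pos {
	unsigned int stream;    /* marks "owned by the hook's stream" */
	unsigned int position;  /* reused here as a unique origin id */
};

struct toy_token {
	struct toy_pos pos;
	const char *text;
	struct toy_token *next;
};

static unsigned int next_id = 1;

/* Tag every token in the list with a fresh id; returns the first id used. */
unsigned int dope_tokens(struct toy_token *list, unsigned int hook_stream)
{
	unsigned int first = next_id;

	for (; list; list = list->next) {
		list->pos.stream = hook_stream;
		list->pos.position = next_id++;
	}
	return first;
}

/* Duplicate a token by struct copy, so pos (and the id) comes along. */
struct toy_token *dup_token(const struct toy_token *tok)
{
	struct toy_token *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;
	*copy = *tok;
	copy->next = NULL;
	return copy;
}

/* Recover the origin id of a token, or 0 if it was never doped. */
unsigned int origin_of(const struct toy_token *tok, unsigned int hook_stream)
{
	return tok->pos.stream == hook_stream ? tok->pos.position : 0;
}
```

Every copy produced during argument replacement then maps back to exactly one doped source token, which is what the tree reconstruction needs.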

I think you need to implement this first so that I can see how it could
be done...

The demo program expands your example macro with the following
results:

<beginning of 't.c'>
#define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
#define D1(d1a0) 4 d1a0 5
#define D2(d2a0) 6 d2a0 7
#define D3(d3a0) 8 d3a0 9
D0(D3(10),11)<end of 't.c'>
arg0 in D3 :10 ->  10
macro D3 inside D0
expand result: 8 10 9<untaint: D3>
arg0 in D0 :D3(10) ->  8 10 9
arg1 in D0 :11 ->  11
macro D0 inside<noident>
expand result: 1 D1(8 10 9) 2 D2(11) 3<untaint: D0>
arg0 in D1 :8 10 9 ->  8 10 9
macro D1 inside D0
expand result: 4 8 10 9 5<untaint: D1>
arg0 in D2 :11 ->  11
macro D2 inside D0
expand result: 6 11 7<untaint: D2>
After preprocessing
1 4 8 10 9 5 2 6 11 7 3
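
The final token sequence can be cross-checked with an ordinary C compiler by stringifying the expansion. The four D* macros are the ones from t.c; STR, XSTR, expanded_text and matches_demo_output are helper names added just for this check.

```c
#include <string.h>

#define D0(d0a0,d0a1) 1 D1(d0a0) 2 D2(d0a1) 3
#define D1(d1a0) 4 d1a0 5
#define D2(d2a0) 6 d2a0 7
#define D3(d3a0) 8 d3a0 9

/* Two levels, so the argument is fully macro-expanded before '#' applies. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* The fully expanded token sequence of D0(D3(10),11), as a string. */
const char *expanded_text(void)
{
	return XSTR(D0(D3(10),11));
}

/* Nonzero if the compiler's expansion matches the demo's last line. */
int matches_demo_output(void)
{
	return strcmp(expanded_text(), "1 4 8 10 9 5 2 6 11 7 3") == 0;
}
```

This only confirms the end result; it says nothing about the intermediate steps that the hook reports.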

A few things. I don't think you need to manipulate the define for
empty-body macros any more. You should be able to find out in the hook
that the macro expands to nothing.

I still haven't fully understood why you need the empty token type. However,
there is the untaint token, which marks the end of a macro expansion. You
might be able to use that as well.

I don't think you can, not without patching preprocess.c. And the patching
would be messier than introducing a dedicated token.
Also: TOKEN_M_EMPTY is only used by the hook, and it is
removed afterwards.

This branch needs cleanup before it can be merged upstream.
Please let me know what I missed.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
