On Tue, Feb 3, 2015 at 10:22 PM, Luc Van Oostenryck <luc.vanoostenryck@xxxxxxxxx> wrote:
>> Are you sure about this behavior? You mean you see "b" as having a string
>> size of 2? I don't understand how this can happen.
>
> But if the macro is used several times:
> ===
> #define BACKSLASH "\\"
> const char a[] = BACKSLASH;
> const char b[] = BACKSLASH;
> const char c[] = "<" BACKSLASH ">";
> ===
>
> then we get:
> ===
> symbol a:
>   char const [addressable] [toplevel] a[0]
>   bit_size = 16
>   val = "\0"
> symbol b:
>   char const [addressable] [toplevel] b[0]
>   bit_size = 16

The value buffer is corrupted, but the bit_size is still 16, which is correct. I just think that in your example it shouldn't corrupt the size; your test case seems to confirm that.

> Is it only with macros that the string structure is shared like this?

That is right. I haven't seen it happen any other way. The tokenizer always constructs a new token and string structure from the C source file. It is the preprocessor's macro expansion that copies and duplicates the token list; the token holds a pointer to the string, which is therefore shared across different invocations of the macro.

> And do we have a way to test whether a string came from a macro?

Not right now, but we can add one.

> A simpler and safer way would be to do the string expansion directly, just after
> a string token is recognized, or even better in the lexer itself.
> Then the string buffer, macro or not, would always directly contain the right values.
> But maybe there were good reasons not to do it this way.

I have a counterexample where that will not work. Let's say:

#define b(a, d) a##d
wchar_t s[] = b(L, "\xabcdabc");

When the lexer processes the escape character, it does not yet know whether the string is wide or not; that can only be determined after macro expansion.

Chris
--
To unsubscribe from this list: send the line "unsubscribe linux-sparse" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html