Re: coccinelle: adjustments for array.cocci?

René Scharfe <l.s.r@xxxxxx> · Sun, 17 Nov 2019 14:40:03 +0100

Am 17.11.19 um 08:56 schrieb Markus Elfring:
>>> @@
>>> expression dst, src, n, E;
>>> @@
>>>  memcpy(dst, src, sizeof(
>>> +                        *(
>>>                            E
>>> -                           [...]
>>> +                         )
>>>                          ) * n
>>>        )
>>
>> That's longer and looks more complicated to me
>
> I point another possibility out to express a change specification
> by the means of the semantic patch language.
> How would you think about such SmPL code if the indentation
> will be reduced?

Whitespace is not what makes the above example more complicated than the
equivalent rule below; separating the pieces of simple expressions does.

>> than what we currently have:
>>   @@
>>   expression dst, src, n, E;
>>   @@
>>     memcpy(dst, src, n * sizeof(
>>   - E[...]
>>   + *(E)
>>     ))

>>> @@
>>> type T;
>>> T *ptr;
>>> T[] arr;
>>> expression E, n;
>>> @@
>>>  memcpy(
>>> (       ptr, E, sizeof(
>>> -                      *(ptr)
>>> +                      T
>>>                       ) * n
>>> |       arr, E, sizeof(
>>> -                      *(arr)
>>> +                      T
>>>                       ) * n
>>> |       E, ptr, sizeof(
>>> -                      *(ptr)
>>> +                      T
>>>                       ) * n
>>> |       E, arr, sizeof(
>>> -                      *(arr)
>>> +                      T
>>>                       ) * n
>>> )
>>>        )
>>
>> This still fails to regenerate two of the changes from 921d49be86
>> (use COPY_ARRAY for copying arrays, 2019-06-15), at least with for me
>> (and Coccinelle 1.0.4).
>
> Would you become keen to find the reasons out for unexpected data processing
> results (also by the software combination “Coccinelle 1.0.8-00004-g842075f7”)
> at this place?

It looks like a bug in Coccinelle to me and I'd like to see it fixed if
that's confirmed, of course.  And I'd like to see Debian pick up a newer
version, preferably containing that fix.  But at least until then our
semantic patches need to work around it.

> But this transformation rule can probably be omitted if the usage
> of SmPL disjunctions will be increased in a subsequent rule, can't it?

Perhaps, but I don't see how.  Do you?

>>> @@
>>> type T;
>>> T* dst_ptr, src_ptr;
>>> T[] dst_arr, src_arr;
>>> expression n, x;
>>> @@
>>> -memcpy
>>> +COPY_ARRAY
>>>        (
>>> (       dst_ptr
>>> |       dst_arr
>>> )
>>>        ,
>>> (       src_ptr
>>> |       src_arr
>>> )
>>> -      , (n) * \( sizeof(T) \| sizeof(*(x)) \)
>>> +      , n
>>>        )
>>
>> That x could be anything -- it's not tied to the element size of source
>> or destination.  Such a transformation might change the meaning of the
>> code, as COPY_ARRAY will use the element size of the destination behind
>> the scenes.  So that doesn't look safe to me.
>
> Would you like to use the SmPL code “*( \( src_ptr \| src_arr \) )” instead?

That leaves out dst_ptr and dst_arr.

And what would it mean to match e.g. this ?

	memcpy(dst_ptr, src_ptr, n * sizeof(*src_arr))

At least the element size would be the same, but I'd rather shy away from
transforming weird cases like this automatically.

>>> @@
>>> type T;
>>> T* dst, src, ptr;
>>> expression n;
>>> @@
>>> (
>>> -memmove
>>> +MOVE_ARRAY
>>>         (dst, src
>>> -                , (n) * \( sizeof(* \( dst \| src \) ) \| sizeof(T) \)
>>> +                , n
>>>         )
>>> |
>>> -ptr = xmalloc((n) * \( sizeof(*ptr) \| sizeof(T) \))
>>> +ALLOC_ARRAY(ptr, n)
>>> );
>>
>> memmove/MOVE_ARRAY and xmalloc/ALLOC_ARRAY are quite different;
>
> These functions provide another programming interface.

Huh, which one specifically?  Here are the signatures of the functions
and macros, for reference:

  void *memmove(void *dest, const void *src, size_t n);
  void *memcpy(void *dest, const void *src, size_t n);

  COPY_ARRAY(dst, src, n)
  MOVE_ARRAY(dst, src, n)

>> why would we want to jam transformations for them into the same rule
>> like this?
>
> Possible nicer run time characteristics by the Coccinelle software.

How much faster is it exactly?

Speedups are good, but I think readability of rules is more important
than coccicheck duration.

>> Handling memmove/MOVE_ARRAY and memcpy/COPY_ARRAY together would make
>> more sense, as they take the same kinds of parameters.
>
> Would you like to adjust the SmPL code in such a design direction?

I can't find any examples in our code base that would be transformed by
a generalized rule.  That reduces my own motivation to tinker with the
existing rules to close to zero.

>>> Now I observe that the placement of space characters can be a coding style
>>> concern at four places for adjusted lines by the generated patch.
>>> Would you like to clarify remaining issues for pretty-printing
>>> in such use cases?
>>
>> Ideally, generated code should adhere to Documentation/CodingGuidelines,
>> so that it can be accepted without requiring hand-editing.
>
> But how does the software situation look like if the original source code
> would contain coding style issues?

The same: Generated code should not add coding style issues.  We can
still use results that need to be polished, but that's a manual step
which reduces the benefits of automation.

> It seems to be possible to specify SmPL code in a way so that even questionable
> code layout would be preserved by an automatic transformation.

That may be acceptable.

René