Am 10.07.22 um 16:45 schrieb Ævar Arnfjörð Bjarmason: > > On Sun, Jul 10 2022, René Scharfe wrote: > >> Some of the rules for using COPY_ARRAY instead of memcpy with sizeof are >> intended to reduce the number of sizeof variants to deal with. They can >> have unintended side effects if only they match, but not the one for the >> COPY_ARRAY conversion at the end. > > Since ab/cocci-unused is marked for "next" it would be really nice to > have this based on top so we can add tests for these transformations > (the topic adds a way to do that). Testing semantic patches sounds like a good idea. We can add tests later, once that feature landed. > > But if you don't feel like doing that this is fine too. > >> diff --git a/contrib/coccinelle/array.cocci b/contrib/coccinelle/array.cocci >> index 9a4f00cb1b..aa75937950 100644 >> --- a/contrib/coccinelle/array.cocci >> +++ b/contrib/coccinelle/array.cocci >> @@ -1,60 +1,58 @@ >> -@@ >> -expression dst, src, n, E; >> -@@ >> - memcpy(dst, src, n * sizeof( >> -- E[...] >> -+ *(E) >> - )) >> - >> -@@ >> -type T; >> -T *ptr; >> -T[] arr; >> -expression E, n; >> -@@ >> -( >> - memcpy(ptr, E, >> -- n * sizeof(*(ptr)) >> -+ n * sizeof(T) >> - ) >> -| >> - memcpy(arr, E, >> -- n * sizeof(*(arr)) >> -+ n * sizeof(T) >> - ) >> -| >> - memcpy(E, ptr, >> -- n * sizeof(*(ptr)) >> -+ n * sizeof(T) >> - ) >> -| >> - memcpy(E, arr, >> -- n * sizeof(*(arr)) >> -+ n * sizeof(T) >> - ) >> -) >> - >> @@ >> type T; >> T *dst_ptr; >> T *src_ptr; >> -T[] dst_arr; >> -T[] src_arr; >> expression n; >> @@ >> -( >> -- memcpy(dst_ptr, src_ptr, (n) * sizeof(T)) >> +- memcpy(dst_ptr, src_ptr, (n) * \( sizeof(T) >> +- \| sizeof(*(dst_ptr)) >> +- \| sizeof(*(src_ptr)) >> +- \| sizeof(dst_ptr[...]) >> +- \| sizeof(src_ptr[...]) >> +- \) ) >> + COPY_ARRAY(dst_ptr, src_ptr, n) >> -| >> -- memcpy(dst_ptr, src_arr, (n) * sizeof(T)) >> + >> +@@ >> +type T; >> +T *dst_ptr; >> +T[] src_arr; >> +expression n; >> +@@ >> +- memcpy(dst_ptr, src_arr, (n) * \( sizeof(T) >> +- \| sizeof(*(dst_ptr)) >> +- \| sizeof(*(src_arr)) >> +- \| sizeof(dst_ptr[...]) >> +- \| sizeof(src_arr[...]) >> +- \) ) >> + COPY_ARRAY(dst_ptr, src_arr, n) >> -| >> -- memcpy(dst_arr, src_ptr, (n) * sizeof(T)) >> + >> +@@ >> +type T; >> +T[] dst_arr; >> +T *src_ptr; >> +expression n; >> +@@ >> +- memcpy(dst_arr, src_ptr, (n) * \( sizeof(T) >> +- \| sizeof(*(dst_arr)) >> +- \| sizeof(*(src_ptr)) >> +- \| sizeof(dst_arr[...]) >> +- \| sizeof(src_ptr[...]) >> +- \) ) >> + COPY_ARRAY(dst_arr, src_ptr, n) >> -| >> -- memcpy(dst_arr, src_arr, (n) * sizeof(T)) >> + >> +@@ >> +type T; >> +T[] dst_arr; >> +T[] src_arr; >> +expression n; >> +@@ >> +- memcpy(dst_arr, src_arr, (n) * \( sizeof(T) >> +- \| sizeof(*(dst_arr)) >> +- \| sizeof(*(src_arr)) >> +- \| sizeof(dst_arr[...]) >> +- \| sizeof(src_arr[...]) >> +- \) ) >> + COPY_ARRAY(dst_arr, src_arr, n) >> -) >> >> @@ >> type T; > > Hrm, this seems like a lot of repetition, it's here in the rules you're > editing already, but these repeated "sizeof" make it a lot more verbose. > > Isn't there a way to avoid this by simply wrapping this across lines, I > didn't test, but I think you can do this sort of thing in the cocci > grammar: > > - memcpy( > - COPY_ARRAY( > ( > dst_arr > | > dst_ptr > ) > , > ( > src_arr > | > src_ptr > ) > , > (n) * > - [your big sizeof alternate here] > ) Hmm, that would match many more combinations, e.g. this one: void f(int *a, int *b, long n, int c[1]) { memcpy(a, b, n * sizeof(*c)); } The elements of a, b and c have the same type, so replacing c with a (which a conversion to COPY_ARRAY would do) would produce the same object code. I can't come up with a plausible scenario like above and where a type change of c down the line would cause problems, but I also can't convince myself that no such thing can exist. Tricky. And I can't get it to format the whitespace around the third argument of COPY_ARRAY nicely in all cases. And it takes 37% longer on my machine. But it sure is more compact. :) @@ type T; T *dst_ptr; T *src_ptr; T[] dst_arr; T[] src_arr; expression n; @@ - memcpy + COPY_ARRAY ( \( dst_ptr \| dst_arr \) , \( src_ptr \| src_arr \) , - (n) * \( sizeof(T) - \| sizeof(*(dst_ptr)) - \| sizeof(*(dst_arr)) - \| sizeof(*(src_ptr)) - \| sizeof(*(src_arr)) - \| sizeof(dst_ptr[...]) - \| sizeof(dst_arr[...]) - \| sizeof(src_ptr[...]) - \| sizeof(src_arr[...]) - \) + n ) > > I.e. you want to preserve whatever we match in the 1st and 2nd > arguments, but only want to munge part of the 3rd argument. The cocci > grammar can "reach into" lines like that, it doesn't need to be limited > to line-based diffs. > > But I didn't try it in this caes, and maybe there's a good reason for > why it can't happen in this case... > > I also wonder if that won't be a lot faster, i.e. if you can condense > this all into one rule it won't need to match this N times, but maybe > the overall complexity of the rules makes it come out to the same thing > in the end...