On 24.10.24 14:41, Christian König wrote:
The merge function initially handled only individual fences and arrays which in turn were created by the merge function. This allowed to create the new array by a simple merge sort based on the fence context number. The problem is now that since the addition of timeline sync objects userspace can create chain containers in basically any fence context order. If those are merged together it can happen that we create really large arrays since the merge sort algorithm doesn't work any more. So put an insert sort behind the merge sort which kicks in when the input fences are not in the expected order. This isn't as efficient as a heap sort, but has better properties for the most common use case. Signed-off-by: Christian König <christian.koenig@xxxxxxx> --- drivers/dma-buf/dma-fence-unwrap.c | 39 ++++++++++++++++++++++++++---- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c index 628af51c81af..d9aa280d9ff6 100644 --- a/drivers/dma-buf/dma-fence-unwrap.c +++ b/drivers/dma-buf/dma-fence-unwrap.c @@ -106,7 +106,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]); count = 0; - do { + while (true) { unsigned int sel; restart: @@ -144,11 +144,40 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, } } - if (tmp) { - array[count++] = dma_fence_get(tmp); - fences[sel] = dma_fence_unwrap_next(&iter[sel]); + if (!tmp) + break; + + /* + * We could use a binary search here, but since the assumption + * is that the main input are already sorted dma_fence_arrays + * just looking from end has a higher chance of finding the + * right location on the first try + */ + + for (i = count; i--;) {
This is broken. The first iteration of this loop will always index out of bounds. What you probably want here is: + for (i = count - 1; count && i--;) { This intentionally overflows for count == 0, but the ++i after the loop undoes that. Maybe it would be worth a comment to point out that's intentional.
+ if (likely(array[i]->context < tmp->context)) + break; + + if (array[i]->context == tmp->context) { + if (dma_fence_is_later(tmp, array[i])) { + dma_fence_put(array[i]); + array[i] = dma_fence_get(tmp); + } + fences[sel] = dma_fence_unwrap_next(&iter[sel]); + goto restart; + } } - } while (tmp); + + ++i; + /* + * Make room for the fence, this should be a nop most of the + * time. + */ + memcpy(&array[i + 1], &array[i], (count - i) * sizeof(*array));
Need memmove here, src and dst alias. I took it for a spin with these things fixed and it seemed to resolve the issue as well. How do you want to proceed? I guess I would be comfortable putting a Reviewed-by and/or Tested-by on a version with these things fixed (with the usual caveat that I'm not a maintainer - I guess the process requires (at least one) reviewer to be?). By the way, I guess you might've had some internal branches where this fix needed to go into quick or something? Usually I'm happy to make a v2 for my patches myself, too ;) Regards, Friedrich
+ array[i] = dma_fence_get(tmp); + fences[sel] = dma_fence_unwrap_next(&iter[sel]); + count++; + }; if (count == 0) { tmp = dma_fence_allocate_private_stub(ktime_get());