> On Nov 15, 2019, at 1:43 AM, Shile Zhang <shile.zhang@xxxxxxxxxxxxxxxxx> wrote: > > > >> On 2019/11/15 17:07, Peter Zijlstra wrote: >>> On Fri, Nov 15, 2019 at 02:47:49PM +0800, Shile Zhang wrote: >>> >>> +#if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED) >>> +/* ORC unwinder only support X86_64 */ >>> +#include <errno.h> >>> +#include <pthread.h> >>> +#include <linux/types.h> >>> + >>> +#define ORC_REG_UNDEFINED 0 >>> +#define ERRSTRING_MAXSZ 256 >>> + >>> +struct orc_entry { >>> + s16 sp_offset; >>> + s16 bp_offset; >>> + unsigned sp_reg:4; >>> + unsigned bp_reg:4; >>> + unsigned type:2; >>> + unsigned end:1; >>> +} __attribute__((packed)); >>> + >>> +struct orctable_info { >>> + size_t orc_size; >>> + size_t orc_ip_size; >>> +} orctable; >> There's ./arch/x86/include/asm/orc_types.h for this. Please don't >> duplicate. objtool uses that same header. > Good catch! Thanks for your kindly reminder! I'll remove it. >>> +/** >>> + * sort - sort an array of elements >>> + * @base: pointer to data to sort >>> + * @num: number of elements >>> + * @size: size of each element >>> + * @cmp_func: pointer to comparison function >>> + * @swap_func: pointer to swap function >>> + * >>> + * This function does a heapsort on the given array. You may provide a >>> + * swap_func function optimized to your element type. >>> + * >>> + * Sorting time is O(n log n) both on average and worst-case. While >>> + * qsort is about 20% faster on average, it suffers from exploitable >>> + * O(n*n) worst-case behavior and extra memory requirements that make >>> + * it less suitable for kernel use. >>> + * >>> + * This code token out of /lib/sort.c. >>> + */ >>> +static void sort(void *base, size_t num, size_t size, >>> + int (*cmp_func)(const void *, const void *), >>> + void (*swap_func)(void *, void *, int size)) >>> +{ >>> + /* pre-scale counters for performance */ >>> + int i = (num/2 - 1) * size, n = num * size, c, r; >>> + >>> + /* heapify */ >>> + for ( ; i >= 0; i -= size) { >>> + for (r = i; r * 2 + size < n; r = c) { >>> + c = r * 2 + size; >>> + if (c < n - size && >>> + cmp_func(base + c, base + c + size) < 0) >>> + c += size; >>> + if (cmp_func(base + r, base + c) >= 0) >>> + break; >>> + swap_func(base + r, base + c, size); >>> + } >>> + } >>> + >>> + /* sort */ >>> + for (i = n - size; i > 0; i -= size) { >>> + swap_func(base, base + i, size); >>> + for (r = 0; r * 2 + size < i; r = c) { >>> + c = r * 2 + size; >>> + if (c < i - size && >>> + cmp_func(base + c, base + c + size) < 0) >>> + c += size; >>> + if (cmp_func(base + r, base + c) >= 0) >>> + break; >>> + swap_func(base + r, base + c, size); >>> + } >>> + } >>> +} >> Do we really need to copy the heapsort implementation? That is, why not >> use libc's qsort() ? This is userspace after all. > > Yes, I think qsort is better choice than copy-paste here. But qsort does not support customized swap func, which is needed for ORC unwind swap two tables together. > I think it's hard to do with qsort, so I used sort same with original orc unwind table sort. One solution is to make an array of indices 0, 1, 2, etc, and sort that using a comparison function that compares i,j by actually comparing source[i], source[j]. (Or use pointers, but indices are probably faster on a 64-bit machine if you can use 32-bit indices.) Then, after sorting, permute the original array using the now-sorted indices. In the case where swapping is expensive, this is actually faster, since it does exactly n moves instead of O(n log n). >