On Mon, Jan 3, 2022 at 8:28 AM Dirk Müller <dmueller@xxxxxxx> wrote:
>
> On Sunday, 2 January 2022 01:03:44 CET Song Liu wrote:
>
> > We need more explanation/documentation about 0 vs. 1 vs. 2 priority.
>
> In the commit message? In the code? This is basically a copy&paste of the
> same concept and code from a few lines below the diff, struct
> raid6_recov_calls, which works the same way and currently has no
> documentation at all.
>
> Want me to add it to both then?

I guess we only need something like:

    .priority = 2   /* avx is always faster than sse */

> > > if ((*algo)->valid && !(*algo)->valid())
> >
> > If the module load time is really critical, maybe we can run all
> > ->valid() calls first and find the highest valid priority. Then, we
> > only run the benchmark for these algorithms.
>
> That's exactly what the code always did. Previously, all x86_64-specific
> implementations (be it SSE1/SSE2/AVX2/AVX512) had the same priority level
> 1, over the default priority level 0 for the implemented-in-C int*.c
> routines. With this change, we have one more level, preferring AVX* over
> the rest, so that we skip testing SSE1/SSE2 (similarly to how the integer
> implementations have always been skipped before).
>
> > Does this make sense?
>
> The valid call is not probing anything by itself. It just iterates over a
> small array of functions and stops executing benchmarks for those that
> have lower priority ranks.
>
> So there aren't really many cycles to win by changing the execution order
> here. I would assume it will actually slow things down, as we would have
> to store the valid() result for the 2nd iteration.

Let's keep this part as-is then.

Thanks,
Song
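
For illustration, here is a minimal, self-contained sketch of the single-pass selection discussed in this thread. It is not the kernel's actual code: `struct algo`, `pick_best()`, `bench_calls`, the demo table and the fixed benchmark scores are all hypothetical names and values made up for this example. It assumes the table is ordered from highest to lowest priority, so that lower-priority entries are skipped before they are ever benchmarked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an algorithm descriptor (not the kernel struct). */
struct algo {
    const char *name;
    int priority;                 /* assumed: 0 = generic C, 1 = SSE, 2 = AVX */
    int (*valid)(void);           /* NULL means "always usable" */
    unsigned long (*bench)(void); /* hypothetical: returns a throughput score */
};

/* Counts how many benchmarks actually ran, to make the skipping visible. */
int bench_calls;

/* Single pass: skip lower-priority entries, then check valid(), then benchmark. */
const struct algo *pick_best(const struct algo *const *algos)
{
    const struct algo *best = NULL;
    unsigned long best_score = 0;

    assert(algos != NULL);
    for (const struct algo *const *a = algos; *a; a++) {
        /* A higher-priority candidate was already chosen: no benchmark. */
        if (best && best->priority > (*a)->priority)
            continue;
        /* Unsupported on this CPU: no benchmark either. */
        if ((*a)->valid && !(*a)->valid())
            continue;

        unsigned long score = (*a)->bench();
        bench_calls++;

        /* Higher priority always wins; equal priority is decided by score. */
        if (!best || (*a)->priority > best->priority || score > best_score) {
            best = *a;
            best_score = score;
        }
    }
    return best;
}

/* Demo data: made-up capabilities and scores, for illustration only. */
static int yes(void) { return 1; }
static int no(void) { return 0; }
static unsigned long score_999(void) { return 999; }
static unsigned long score_250(void) { return 250; }
static unsigned long score_300(void) { return 300; }
static unsigned long score_100(void) { return 100; }

const struct algo avx512_algo = { "avx512", 2, no,   score_999 };
const struct algo avx2_algo   = { "avx2",   2, yes,  score_250 };
const struct algo sse2_algo   = { "sse2",   1, yes,  score_300 };
const struct algo int_algo    = { "int",    0, NULL, score_100 };

/* Ordered from highest to lowest priority, NULL-terminated. */
const struct algo *const demo_algos[] = {
    &avx512_algo, &avx2_algo, &sse2_algo, &int_algo, NULL
};
```

With this demo table, `pick_best(demo_algos)` returns the `avx2` entry after a single benchmark run: `avx512` fails its `valid()` check, and `sse2` and `int` are skipped on priority alone, even though the made-up `sse2` score is higher. Running all `valid()` calls in a separate first pass would pick the same entry, which is why the single pass is kept.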