On Wed, 13 Sep 2023 13:06:10 +0200 Eike Rathke <erack@xxxxxxxxxx> wrote: > Hi, > > On Thursday, 2023-09-07 11:36:33 +0200, Stephan Bergmann wrote: > > > On 9/7/23 08:25, Dan Horák wrote: > > > On Thu, 7 Sep 2023 08:16:28 +0200 > > > Stephan Bergmann <sbergman@xxxxxxxxxx> wrote: > > > > On 9/5/23 18:53, Dan Horák wrote: > > > > > seems the recent change [1] to KahanSum handling causes a test failure > > > > > on ppc64le. > > > > > > > > > > Running scope as unit: -home-sharkcz-projects-libreoffice-workdir-CppunitTest-sc_statistical_functions_test.test:20230905152639:2378561.scope > > > > > [_RUN_____] StatisticalFunctionsTest::testStatisticalFormulasFODS > > > > > Testing load file:///home/sharkcz/projects/libreoffice//sc/qa/unit/data/functions/statistical/fods/KahanSum.fods: > > > > > /home/sharkcz/projects/libreoffice/sc/qa/unit/functions_test.cxx:49:StatisticalFunctionsTest::testStatisticalFormulasFODS > > > > > forced failure > > > > > - Testing file:///home/sharkcz/projects/libreoffice//sc/qa/unit/data/functions/statistical/fods/KahanSum.fods failed, Sheet2.A90 '=SUM(F3:F102)' result: 6.6, expected: 5 > > > > > > > > Interesting; I also saw that failure with my latest local build on macOS > > > > aarch64 against Clang trunk. (But didn't debug it further and wrote it > > > > off as maybe some intermittent Clang trunk bug.) > > > > > > > > > > someone on IRC reported the same issue, also from macOS I believe > > > > So, wild speculation, maybe it's a difference between x86-64 and all other > > architectures, given how that > > <https://gerrit.libreoffice.org/c/core/+/156253/> "Resolves: tdf#156985 > > Treat adding two KahanSum differently" kept failing for (32-bit) > > <https://ci.libreoffice.org/job/gerrit_windows/> up until patch set 7. (And > > where patch set 8 reverted back to the original code for _WIN32, maybe in a > > misguided attempt to fix something that was seen failing on Windows 32-bit > > x86, but not known to (not) fail on Windows x86-64.) > > > > Eike, any thoughts? > > Hum.. seeing there's platform specific code in > sc/inc/arraysumfunctor.hxx executeFast() that for > #if defined(X86_64) || (defined(X86) && defined(_WIN32)) > calls executeSSE2() (implemented in > sc/source/core/tool/arraysumSSE2.cxx) and for others executeUnrolled(), > I wonder if we never have X86 defined and thus the _WIN32 fails as well > with executeUnrolled() and that "unrolled pair-wise" algorithm has > a flaw. > > It would be worth a try to simply call executeFast() only if SC_USE_SSE2 > is defined, so the failing platforms skip executeUnrolled(), here > https://opengrok.libreoffice.org/xref/core/sc/inc/arraysumfunctor.hxx?r=7f15354c#89 > > Please report back if that helps and I'll prepare another patch. I hop I got your idea right and with diff --git a/sc/inc/arraysumfunctor.hxx b/sc/inc/arraysumfunctor.hxx index c261c120addf..d9a5b805db50 100644 --- a/sc/inc/arraysumfunctor.hxx +++ b/sc/inc/arraysumfunctor.hxx @@ -86,7 +86,11 @@ inline KahanSum sumArray(const double* pArray, size_t nSize) { size_t i = 0; const double* pCurrent = pArray; +#if 0 KahanSum fSum = executeFast(i, nSize, pCurrent); +#else + KahanSum fSum = 0.0; +#endif // sum rest of the array for (; i < nSize; ++i) the test passed on aarch64 on master branch (no reverts or such) Dan