Re: recent KahanSum change causes a new test failure on ppc64le

Dan Horák <dan@xxxxxxxx> · Wed, 13 Sep 2023 13:45:36 +0200

On Wed, 13 Sep 2023 13:06:10 +0200
Eike Rathke <erack@xxxxxxxxxx> wrote:

> Hi,
> 
> On Thursday, 2023-09-07 11:36:33 +0200, Stephan Bergmann wrote:
> 
> > On 9/7/23 08:25, Dan Horák wrote:
> > > On Thu, 7 Sep 2023 08:16:28 +0200
> > > Stephan Bergmann <sbergman@xxxxxxxxxx> wrote:
> > > > On 9/5/23 18:53, Dan Horák wrote:
> > > > > seems the recent change [1] to KahanSum handling causes a test failure
> > > > > on ppc64le.
> > > > > 
> > > > > Running scope as unit: -home-sharkcz-projects-libreoffice-workdir-CppunitTest-sc_statistical_functions_test.test:20230905152639:2378561.scope
> > > > > [_RUN_____] StatisticalFunctionsTest::testStatisticalFormulasFODS
> > > > > Testing load file:///home/sharkcz/projects/libreoffice//sc/qa/unit/data/functions/statistical/fods/KahanSum.fods:
> > > > > /home/sharkcz/projects/libreoffice/sc/qa/unit/functions_test.cxx:49:StatisticalFunctionsTest::testStatisticalFormulasFODS
> > > > > forced failure
> > > > > - Testing file:///home/sharkcz/projects/libreoffice//sc/qa/unit/data/functions/statistical/fods/KahanSum.fods failed, Sheet2.A90 '=SUM(F3:F102)' result: 6.6, expected: 5
> > > > 
> > > > Interesting; I also saw that failure with my latest local build on macOS
> > > > aarch64 against Clang trunk.  (But didn't debug it further and wrote it
> > > > off as maybe some intermittent Clang trunk bug.)
> > > > 
> > > 
> > > someone on IRC reported the same issue, also from macOS I believe
> > 
> > So, wild speculation, maybe it's a difference between x86-64 and all other
> > architectures, given how that
> > <https://gerrit.libreoffice.org/c/core/+/156253/> "Resolves: tdf#156985
> > Treat adding two KahanSum differently" kept failing for (32-bit)
> > <https://ci.libreoffice.org/job/gerrit_windows/> up until patch set 7. (And
> > where patch set 8 reverted back to the original code for _WIN32, maybe in a
> > misguided attempt to fix something that was seen failing on Windows 32-bit
> > x86, but not known to (not) fail on Windows x86-64.)
> > 
> > Eike, any thoughts?
> 
> Hum.. seeing there's platform specific code in
> sc/inc/arraysumfunctor.hxx executeFast() that for
> #if defined(X86_64) || (defined(X86) && defined(_WIN32))
> calls executeSSE2() (implemented in
> sc/source/core/tool/arraysumSSE2.cxx) and for others executeUnrolled(),
> I wonder if we never have X86 defined and thus the _WIN32 fails as well
> with executeUnrolled() and that "unrolled pair-wise" algorithm has
> a flaw.
> 
> It would be worth a try to simply call executeFast() only if SC_USE_SSE2
> is defined, so the failing platforms skip executeUnrolled(), here
> https://opengrok.libreoffice.org/xref/core/sc/inc/arraysumfunctor.hxx?r=7f15354c#89
> 
> Please report back if that helps and I'll prepare another patch.

I hop I got your idea right and with

diff --git a/sc/inc/arraysumfunctor.hxx b/sc/inc/arraysumfunctor.hxx
index c261c120addf..d9a5b805db50 100644
--- a/sc/inc/arraysumfunctor.hxx
+++ b/sc/inc/arraysumfunctor.hxx
@@ -86,7 +86,11 @@ inline KahanSum sumArray(const double* pArray, size_t nSize)
 {
     size_t i = 0;
     const double* pCurrent = pArray;
+#if 0
     KahanSum fSum = executeFast(i, nSize, pCurrent);
+#else
+    KahanSum fSum = 0.0;
+#endif
 
     // sum rest of the array
     for (; i < nSize; ++i)

the test passed on aarch64 on master branch (no reverts or such)


		Dan