On Sun, Nov 03, 2024 at 11:02:16AM -0500, Ben Beasley wrote:
> Kevin’s observation about floating-point rounding and runtime dispatch is an excellent one in general.
>
> Those two CPUs should, as far as I can tell, be dispatched to the same SIMD implementations in this case.
>
> Skimming https://github.com/qt/qtbase/blob/v6.8.0/src/gui/painting/qimagescale_sse4.cpp, it looks like a fixed-point implementation that entirely avoids floating-point operations. If there are no bugs, and if I’m not missing something, it should be possible to get identical results regardless of ISA extensions since no rounding is involved.
>
> The fact that the scaling algorithm appears to be integer-based also makes the following sources of irreproducibility less likely, but maybe not impossible:
>
> - Some algorithms compute “left-over” leading and/or trailing data with a scalar algorithm, and in some cases this could make the results depend on alignment of buffers in memory. Besides the fact that this is an integer implementation, at a glance, Qt doesn’t appear to be doing this. It looks like QImage must be aligned and (over-)allocated to allow everything to be done in SIMD, processing some extra pixels outside the image as necessary to make complete vectors.
>
> - SIMD algorithms might operate on input values and combine pixels in a different order than scalar ones, which could result in different rounding for floating-point operations. That shouldn’t matter for an integer algorithm like this, except maybe in cases of wrapping/overflow, which might perhaps be in play here.
>
> Another relevant fact is that the implementation is multi-threaded using a thread pool. If there is anything that depends on the order in which pixels/blocks are computed and combined, this could also result in different outputs, even in different runs on the same machine, and especially on machines with different numbers of cores.

Thanks, those are all good considerations.
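To make the point about evaluation order concrete, here is a minimal sketch in Python. It is not Qt's code: the function names, pixel values, and weights are made up for illustration. It shows that an integer multiply-accumulate with a single final shift gives the same answer in any order, that a floating-point sum does not, and that integer arithmetic which clamps after every step (saturation, assumed here purely as an example) can depend on order too.

```python
import random


def fixed_point_blend(pixels, weights, shift=8):
    """Integer multiply-accumulate followed by a single right shift."""
    acc = 0
    for p, w in zip(pixels, weights):
        acc += p * w          # exact integer arithmetic, no rounding here
    return acc >> shift       # the only lossy step, applied once at the end


def float_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total


def saturating_sum_u8(deltas, start=0):
    """Accumulate signed deltas, clamping to 0..255 after every step."""
    acc = start
    for d in deltas:
        acc = max(0, min(255, acc + d))
    return acc


# Hypothetical sample data: 8 pixel values, equal weights summing to 1 << 8.
pixels = [200, 17, 93, 255, 4, 128, 64, 31]
weights = [32] * 8

# Evaluate the fixed-point blend in 100 random orders and collect the results.
rng = random.Random(0)
order = list(range(len(pixels)))
fixed_point_results = set()
for _ in range(100):
    rng.shuffle(order)
    fixed_point_results.add(
        fixed_point_blend([pixels[i] for i in order], [weights[i] for i in order])
    )

# Same four floating-point terms, two different orders.
float_forward = float_sum([1e16, 1.0, -1e16, 1.0])
float_reordered = float_sum([1e16, -1e16, 1.0, 1.0])

# Same three integer deltas, two different orders, clamped at each step.
sat_forward = saturating_sum_u8([200, 100, -100])
sat_reordered = saturating_sum_u8([200, -100, 100])

print(fixed_point_results)             # {99}
print(float_forward, float_reordered)  # 1.0 2.0
print(sat_forward, sat_reordered)      # 155 200
```

Plain wrapping (modulo 2^n) addition would still be order-independent; in this sketch it is the per-step clamping that breaks it. Whether anything like that happens in the Qt code path is not something this shows.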
I ran the conversion under valgrind just to make sure it's not some trivial memory bug, but valgrind doesn't report anything. This code is executed from Python, so I think it's unlikely that there's some alignment problem or memory use bug. If there were, we'd be seeing it much more. (And if we were reading past the end of a buffer, I'd expect some actual corruption, i.e. random-looking pixel values, not a subtle difference.)

So essentially the problem is that we have an integer algorithm where we don't expect any rounding effects, but we get an effect that looks like rounding error. ¯\_(ツ)_/¯

Zbyszek