Kevin Kofler <kevin.kofler@xxxxxxxxx> writes:

> Dave Love wrote:
>> You'll find Dominik was correct if you try it.
>
> If you are talking about the missing RPM AutoProvides:
> Provides: libblas.so.3()(64bit)
> does wonders.

I mean you need to get the soname right and ensure that you have
everything implemented in the replacement library.

>> Various things have been changed to use openblas on x86 after some of us
>> agitated.
>
> The problem is, "various things" is not enough, we need a plan to ensure ALL
> things use it.

It's not available for them all as far as I know -- there's an rpm macro
which says which ones.  I'm happy if that's wrong now.

>> As far as I remember, there's good reason (apart from the previous
>> (FPC?) vote against it), like the atlas blas and lapack have to go
>> together.
>
> If the ld.so.conf.d override were implemented correctly, it would just work.

Yes, with caveats above.  (My hack just worked for at least a couple of
years on a general-purpose HPC system.)

> The issue with the old state was that there was only an override for
> liblapack installed and not for libblas, which is of course very broken.
>
> But the new approach I am proposing installs only one version of both BLAS
> and LAPACK (the OpenBLAS one), so there cannot possibly be mismatched
> versions (except if you have third-party binaries bundling BLAS and linking
> to the system LAPACK or the other way round, but those are then very broken
> and will also fail on other distributions for the same reason).

"Third party" (user and system) binaries linking non-OB linear algebra
is normal on the sort of systems I work on, though I wish I was allowed
to package system installations.  (Red Hat would have caused chaos on
our systems by introducing backwards-incompatible openmpi if it hadn't
been caught by the local package dependencies.)  Also, OB still has at
least some correctness problems as far as I know, e.g.
<https://github.com/xianyi/OpenBLAS/issues/458>.

Is there actually anything wrong with Debian's tried and tested approach
as long as openblas is preferred where appropriate?

>> It will exceed it on any relevant platform that openblas doesn't
>> support,
>
> Is there even such a platform, the keyword being "relevant"? :-)

Apparently, as above.

>> but where OB doesn't do DYNAMIC_ARCH you still have the problem of needing
>> micro-architecture-specific packages (which seems to be against policy,
>> although atlas does it).
>
> ld.so.conf.d is the only way to build those, if you want to support them. I
> wonder whether non-x86 architectures are even worth investing the effort.

How would relevant hwcaps not help, if they were available, as they
seemed to be on some architectures when I looked some time ago?  (That
used to be important on SPARC for efficiency in crypto libraries, at
least.)

>> On avx512, that's BLIS (+libxsmm), but MKL is still substantially faster.
>
> Well, another technical criterion is that we can really only pick the
> default implementation per architecture (e.g., x86_64), not per
> subarchitecture (e.g., AVX-512), and I think OpenBLAS is still the best
> option for x86_64 overall (also because it supports runtime subarchitecture
> detection and BLIS does not).

Yes it is, but if you can have atlas-sse3, I don't see why you can't
have blis-avx512.  (Dynamism was on the radar for BLIS when I last
looked, and might be worth contributing, but I haven't evaluated it
against OB on anything other than KNL.)

> For AVX-512, I think what we really want is to get AVX-512 optimizations
> into OpenBLAS. It can match MKL performance when it is using the same
> instruction set,

I don't think that is generally true, even for avx < 512, but for the
major operations -- at least dgemm -- it equals or betters MKL for all
x86 I've tried from 10-year-old Opteron through to avx2.  (I think MKL
has more operations optimized.)

> see e.g.
> the graph for Sandy Bridge (AVX):
> https://github.com/xianyi/OpenBLAS/wiki/faq#sandybridge_perf
> Of course, if your CPU supports AVX-512 and your BLAS is only using AVX2 (as
> OpenBLAS currently does), it will not be optimal, no surprise there.

I know how it performs, as above and in the link I posted for something
more recent than SB, but no-one is working on avx512 support as far as I
can tell, and the KNL qua Haswell support that was contributed is worse
than you might expect <https://github.com/xianyi/OpenBLAS/issues/991>.

In case I seem to be arguing against it, let me stress that I think this
needs sanitizing, OB should be the default BLAS on the appropriate
platforms (all, if that works), and someone should be funded to add
KNL/Skylake support to OB.  Is it even possible to revisit this in view
of the previous proposal being rejected?
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx