[Bug 1389016] Review Request: libxsmm - Library for small matrix-matrix multiplications on Intel x86_64 (e.g. for cp2k)

bugzilla@xxxxxxxxxx · Fri, 28 Oct 2016 12:02:46 +0000



https://bugzilla.redhat.com/show_bug.cgi?id=1389016


--- Comment #10 from Hans Pabst <hans.pabst@xxxxxxxxx> ---
> Unfortunately an SSE2 baseline is necessary for packaging.
Well, SSE2 is "nothing" wrt 64-bit since it's already part of the 64-bit ABI.

> There are actually AMD Barcelonas
They support SSE3; no problem!

> they have a variety of sse4a.
which includes SSE3. I have selected SSE3 as a baseline on purpose to not
exclude such systems.

> -- and older!
Hmm, for systems without SSE3 you need to go back pretty far. I doubt that any
of those systems run 64-bit, and even if they do, they are unlikely to be
interested in LIBXSMM. More importantly, the value of the library goes towards
zero since JIT is only supported with AVX and beyond. Also, statically
generating our SMM kernels won't help either since our baseline there is SSE3
as well (inline assembly). There is some value left, but it's more on the edge
of what the library aims to provide.
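
To make the tiers above concrete, here is a rough sketch (not LIBXSMM code) of
how one could classify a machine from its CPU flags. It assumes Linux, where
/proc/cpuinfo lists SSE3 as "pni"; the function names and the tier labels
"jit", "static-sse3" and "fallback" are made up for illustration.

```python
def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the CPU feature flags of the first core as a set (Linux only)."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    # Line looks like "flags : fpu vme ... pni ... avx ..."
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or file not readable
    return set()


def select_code_path(flags):
    """Map CPU flags to a (hypothetical) tier as described in the comment."""
    if "avx" in flags:
        return "jit"          # JIT is only supported with AVX and beyond
    if "pni" in flags or "sse3" in flags:
        return "static-sse3"  # statically generated kernels need SSE3
    return "fallback"         # little value left below the SSE3 baseline


if __name__ == "__main__":
    print(select_code_path(read_cpu_flags()))
```

For example, a Barcelona-class flag set such as {"sse2", "pni", "sse4a"} ends
up in the "static-sse3" tier, i.e. it is not excluded by the SSE3 baseline.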

> I'm still not sure whether OMP=1 is worthwhile.
Sorry, my comment might have been misleading. The library warns if you use
OMP=1 since it's meant to be agnostic wrt threading runtime. The OpenMP
compiler flag is automatically applied only for libxsmmext, which is meant to
keep the OpenMP dependency separate. In the early times of LIBXSMM, OMP=1 (when
applied) meant to use OpenMP synchronization primitives for the code registry
(instead of OS-level primitives or Pthreads). The warning I was mentioning is
related to the latter.

> Yes, but it's silly to use anything else
OpenBLAS is the default I am looking for there (I just learned people would
take it in any case). However, RefLAPACK/BLAS is surprisingly good (I believe
for small matrices it is even better than OpenBLAS). But sure, relying on
OpenBLAS makes much sense.

> I don't understand why rpmlint complains about the reference to dgemm (?)
This is a real dependency on the ?gemm_ symbol. Anyhow, this symbol is still
satisfied by any kind of BLAS (OpenBLAS, RefLAPACK/BLAS, MKL, ATLAS, etc.)
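
If someone wants to check this by hand, a small sketch along these lines may
help. It is only an illustration: it assumes "?" stands for the conventional
BLAS precision prefixes s, d, c, z (which of them LIBXSMM actually references
is not stated here), and the helper names are made up.

```python
import ctypes
import ctypes.util


def gemm_symbols(prefixes="sdcz"):
    """Expand the '?gemm_' wildcard into concrete Fortran-style names."""
    return [p + "gemm_" for p in prefixes]


def provides(lib, symbol):
    """True if the loaded library object exposes the given symbol."""
    try:
        getattr(lib, symbol)  # ctypes raises AttributeError if missing
        return True
    except AttributeError:
        return False


def check_blas(name):
    """Report which ?gemm_ symbols a BLAS library (e.g. 'openblas') exports.

    Returns None if the library cannot be found or loaded on this system.
    """
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    return {s: provides(lib, s) for s in gemm_symbols()}


if __name__ == "__main__":
    for candidate in ("openblas", "blas"):
        print(candidate, check_blas(candidate))
```

Any library for which the dgemm_ entry comes back True would satisfy the
reference that rpmlint is reporting.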

> It was meant to include the samples (which are large).
> Is that worthwhile?
No, it's not worth it. People who want the dev-package would typically need to
copy the sample source code into a writable destination anyway. If the sample
source code then does not compile (or is not easy to get to compile), the
impression of the library will be ruined (which is not your problem :-).

> Small dense or sparse matrix multiplications and convolutions for x86
Sure, go ahead! I am not set on particular wording; e.g., this NA Digest
(http://www.netlib.org/na-digest-html/16/v16n38.html; search for LIBXSMM) says:
"Library for small convolutions (Machine Learning), and small dense or sparse
matrix multiplications."  (which leaves out the mighty x86 ;-)

> Just a pity we're currently missing optimized KNL BLAS.
I guess the MKL community download is somewhat inconvenient?
As a side note, LIBXSMM makes some attempt to cover regular BLAS sizes
as well (status may be here:
https://github.com/hfp/libxsmm/issues/99#issuecomment-255314392). First signs
are the libxsmm_gemm_omp functions and the "blkgemm" sample code. A lot of work
is still left.
