On Tue, Jul 23, 2019 at 12:51:24PM +0800, Wu Hao wrote: > In early partial reconfiguration private feature, it only > supports 32bit data width when writing data to hardware for > PR. 512bit data width PR support is an important optimization > for some specific solutions (e.g. XEON with FPGA integrated), > it allows driver to use AVX512 instruction to improve the > performance of partial reconfiguration. e.g. programming one > 100MB bitstream image via this 512bit data width PR hardware > only takes ~300ms, but 32bit revision requires ~3s per test > result. > > Please note now this optimization is only done on revision 2 > of this PR private feature which is only used in integrated > solution that AVX512 is always supported. This revision 2 > hardware doesn't support 32bit PR. > > Signed-off-by: Ananda Ravuri <ananda.ravuri@xxxxxxxxx> > Signed-off-by: Xu Yilun <yilun.xu@xxxxxxxxx> > Signed-off-by: Wu Hao <hao.wu@xxxxxxxxx> > Acked-by: Alan Tull <atull@xxxxxxxxxx> > Signed-off-by: Moritz Fischer <mdf@xxxxxxxxxx> > --- > v2: remove DRV/MODULE_VERSION modifications > --- > drivers/fpga/dfl-fme-mgr.c | 110 ++++++++++++++++++++++++++++++++++++++------- > drivers/fpga/dfl-fme-pr.c | 43 +++++++++++------- > drivers/fpga/dfl-fme.h | 2 + > drivers/fpga/dfl.h | 5 +++ > 4 files changed, 129 insertions(+), 31 deletions(-) > > diff --git a/drivers/fpga/dfl-fme-mgr.c b/drivers/fpga/dfl-fme-mgr.c > index b3f7eee..46e17f0 100644 > --- a/drivers/fpga/dfl-fme-mgr.c > +++ b/drivers/fpga/dfl-fme-mgr.c > @@ -22,6 +22,7 @@ > #include <linux/io-64-nonatomic-lo-hi.h> > #include <linux/fpga/fpga-mgr.h> > > +#include "dfl.h" > #include "dfl-fme-pr.h" > > /* FME Partial Reconfiguration Sub Feature Register Set */ > @@ -30,6 +31,7 @@ > #define FME_PR_STS 0x10 > #define FME_PR_DATA 0x18 > #define FME_PR_ERR 0x20 > +#define FME_PR_512_DATA 0x40 /* Data Register for 512bit datawidth PR */ > #define FME_PR_INTFC_ID_L 0xA8 > #define FME_PR_INTFC_ID_H 0xB0 > > @@ -67,8 +69,43 @@ > #define PR_WAIT_TIMEOUT 8000000 > #define PR_HOST_STATUS_IDLE 0 > > +#if defined(CONFIG_X86) && defined(CONFIG_AS_AVX512) > + > +#include <linux/cpufeature.h> > +#include <asm/fpu/api.h> > + > +static inline int is_cpu_avx512_enabled(void) > +{ > + return cpu_feature_enabled(X86_FEATURE_AVX512F); > +} That's a very arch specific function, why would a driver ever care about this? > + > +static inline void copy512(const void *src, void __iomem *dst) > +{ > + kernel_fpu_begin(); > + > + asm volatile("vmovdqu64 (%0), %%zmm0;" > + "vmovntdq %%zmm0, (%1);" > + : > + : "r"(src), "r"(dst) > + : "memory"); > + > + kernel_fpu_end(); > +} Shouldn't this be an arch-specific function somewhere? Burying this in a random driver is not ok. Please make this generic for all systems. > +#else > +static inline int is_cpu_avx512_enabled(void) > +{ > + return 0; > +} > + > +static inline void copy512(const void *src, void __iomem *dst) > +{ > + WARN_ON_ONCE(1); Are you trying to get reports from syzbot? :) Please fix this all up. greg k-h