Re: Why the memcpy from a mapped GPU memory is so slow on Intel Bay Trail?

三月! <sunnymarch@xxxxxx> writes:

> Hello!  I'm developing an OpenCL application with Beignet on Ubuntu
> 14.04 x64 Desktop on a Bay Trail E3825.  I found that reading data
> from GPU memory through either drm_intel_gem_bo_map or
> drm_intel_gem_bo_get_subdata takes about 0.002 ~ 0.003 second to fetch
> a 7MiB array, which is not quite satisfying.  Could anybody help solve
> this problem?

GPUs (except in the case of SNB/IVB/HSW, where the CPU is coherent with
the GPU other than the GPU's L1/L2 caches) are extremely slow to read
from, because reads from write-combining memory effectively perform
like uncached reads.  You can get better streaming read performance
using the movntdqa instruction, and you can see an example of code
using it in streaming-load-memcpy.c in mesa (though it looks like that
code is missing an mfence, which iirc is required).
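
To make the movntdqa suggestion concrete, here is a rough sketch of a
streaming-load copy.  It is not the mesa code: the function name
streaming_load_copy is made up for illustration, the placement of the
mfence before the load loop is my assumption about where the fence
belongs, and the GCC/Clang target attribute is only there so it builds
without -msse4.1.  It uses the _mm_stream_load_si128 intrinsic (which
compiles to movntdqa) and needs the source to be 16-byte aligned for
the streaming part, so unaligned head and tail bytes are copied
normally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <smmintrin.h>  /* SSE4.1: _mm_stream_load_si128 (movntdqa) */

/* Hypothetical helper, not part of mesa or libdrm.
 * Copies len bytes from src (ideally a write-combining mapping) to dst. */
__attribute__((target("sse4.1")))
void streaming_load_copy(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: copy byte by byte until the source is 16-byte aligned,
     * since movntdqa requires an aligned memory operand. */
    while (len > 0 && ((uintptr_t)s & 15) != 0) {
        *d++ = *s++;
        len--;
    }

    /* Assumption: fence once before the streaming loads so they are
     * ordered after any earlier memory operations. */
    if (len >= 64)
        _mm_mfence();

    /* Body: four 16-byte streaming loads per iteration (64 bytes),
     * stored with unaligned stores because dst may not be aligned. */
    while (len >= 64) {
        __m128i a = _mm_stream_load_si128((__m128i *)(s + 0));
        __m128i b = _mm_stream_load_si128((__m128i *)(s + 16));
        __m128i c = _mm_stream_load_si128((__m128i *)(s + 32));
        __m128i e = _mm_stream_load_si128((__m128i *)(s + 48));

        _mm_storeu_si128((__m128i *)(d + 0), a);
        _mm_storeu_si128((__m128i *)(d + 16), b);
        _mm_storeu_si128((__m128i *)(d + 32), c);
        _mm_storeu_si128((__m128i *)(d + 48), e);

        s += 64;
        d += 64;
        len -= 64;
    }

    /* Tail: whatever is left after the last full 64-byte chunk. */
    if (len > 0)
        memcpy(d, s, len);
}
```

You would call this with the pointer you get back from mapping the
buffer as src and an ordinary malloc'd buffer as dst.  On normal
write-back memory the streaming load behaves like a regular load, so
any speedup should only show up when src really is a write-combining
mapping.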


_______________________________________________
Intel-gfx mailing list
Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
