Hi All,
I have been working on making LookingGlass
(https://github.com/gnif/LookingGlass/) usable from inside a Linux guest VM,
but I have come up against what seems to be a passthrough performance problem.
Currently I am mapping a block of memory from an IVSHMEM device that contains
a captured frame from a Windows guest. Both the Windows and Linux guests are
running on the same NUMA node with all allocations bound to local RAM.
The frame I am trying to render is 1920x1200 @ 32bpp, which equates to a
little over 8MB.
Copying the shared memory between the guests is no problem at 30-40GB/s, but
as soon as I try to feed the buffer to the video card via glBufferSubData the
transfer rate drops to around 105MB/s, resulting in a frame rate of around
15FPS.
I have tried various methods of making the data available to the card, such
as pinned memory mappings with memcpy directly, glBufferSubData, and copying
the buffer from the shared memory mapping into a local buffer first.
I have also found that if I malloc the buffer instead of taking it from the
shared mapping, I can feed this buffer to the video card without the huge
performance penalty, ie:
static int offset = 0;

/* fill a freshly malloc'd buffer with changing data so the upload
   cannot be skipped, then hand it to GL */
uint8_t *b = malloc(1920 * 1200 * 4);
for (int i = 0; i < 1920 * 1200 * 4; ++i)
    b[i] = i + offset;
++offset;

glBufferSubData(GL_PIXEL_UNPACK_BUFFER, 0, 1920 * 1200 * 4, b);
free(b);
This leads me to believe that the issue lies with the KVM or QEMU memory
layer rather than with the graphics adapter.
Thanks in advance,
Geoffrey McRae