I've spent several weeks analyzing the network performance of Xspice against two test cases. I also crafted a patch which implements an alternate mode for the xf86-video-qxl driver that dramatically improves network performance. The two test cases are simple [1]; a script drives either Libre Office or Microsoft Office 2007. They have a few minutes of typing, a bit of graphics (the test case draws a smiley face), and a little bit of interaction with the UI. For now, I'm focused on measuring network performance; I plan to later do some analysis to try to get at the quality of the user experience as well. Network performance, particularly the total # of packets, has a dramatic effect on Spice performance over a remote network. For a test involving 5 minutes of light use of LibreOffice, the results are as follows: Packets Bytes Xspice 148,428 19,647,168 Tight VNC 19,980 4,724,880 SSH -X 48,894 77,733,200 SSH -CX 21,445 5,096,702 You can see that Xspice network performance is pretty bad. This is partly by design; Spice is designed to replicate the philosophy of the X server, wherein graphics operations themselves - not pixmaps - are transmitted across the wire. Further, Spice as a whole is really built around qemu and Windows qxl running on a local LAN; Xspice is really just an experimental feature, and I'm pretty much the first person to examine it's network flow. However, if I modify the xf86-video-qxl driver to shift modes into a render and send mode, whereby only changed pixmaps are sent across the wire (and that on a periodic basis), I get dramatically better results. That patch is attached; these are the results: Xspice + deferred_fps 14,910 2,971,886 As you can see, with that change, Spice vaults into the lead. The numbers for Office 2007 are somewhat similar, although ssh -CX does not fare nearly as well. (The underlying behaviors are quite different; Office 2007 is surface + render heavy; Libre Office is fill region heavy). Additionally, I spent a lot of energy trying to decide if this mode could be injected into the Spice server itself, where it could potentially benefit the whole stack. I'm grateful to Alon and Soren for helping me privately with that effort. However, in the end, I could not see a path to do this with the kind of tight efficiency that I think is required. I had a test case where an ill behaved X application (fluxbox) would do 27,000 pixel draws; and I just couldn't find a way to run those draws through the Spice server efficiently. My old nemesis, red_worker.c, defeats me again :-/. I also spent some energy experimenting with modes where we kicked the spice server into a video streaming mode. I had only limited success with that. But I did do some analysis that shows that the test case (Libre Office) requires about 3,000,000 bytes to stream a high quality Theora video. Given that the deferred_fps patch matches that, and it is much lighter on the CPU, I've decided to not pursue that approach for now. At any rate, I would appreciate any comments or feedback on the basic approach outlined by the patch. Cheers, Jeremy [1] The tests are repeatable, automated scripts, and can be found here: git://people.freedesktop.org/~jwhite/spice-measure They are not at all user friendly or easy to parse, I'm afraid. They rely on some crude mechanisms (e.g. I have a dedicated machine, and I measure bandwidth by sampling ifconfig results before + after). You can try it by having the gtk spice client connected to an 800x600 system with a minimal window manager, with LibreOffice maximized. Then try running make ./limeasure "spice display 0" and don't blame me if it destroys your system <grin>.
>From 73bd9cb14fa559af40899077d8f1ddf9dfb065b8 Mon Sep 17 00:00:00 2001 From: Jeremy White <jwhite@xxxxxxxxxxxxxxx> Date: Mon, 6 Aug 2012 13:58:04 -0500 Subject: [PATCH] Implement a Deferred Frames mode, in which we defer sending any messages to the Spice server except at a very specific rate, at which time we just send the modified screen. To: wine-patches <wine-patches@xxxxxxxxxx> --- src/qxl.h | 5 ++ src/qxl_driver.c | 40 +++++++++++-- src/qxl_surface.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++---- 3 files changed, 200 insertions(+), 18 deletions(-) diff --git a/src/qxl.h b/src/qxl.h index 1db0581..9ef0b27 100644 --- a/src/qxl.h +++ b/src/qxl.h @@ -123,6 +123,7 @@ enum { OPTION_SPICE_TLS_CIPHERS, OPTION_SPICE_CACERT_FILE, OPTION_SPICE_DH_FILE, + OPTION_SPICE_DEFERRED_FPS, #endif OPTION_COUNT, }; @@ -245,6 +246,8 @@ struct _qxl_screen_t uint32_t bytes_pp; uint8_t *data, *flipped; } guest_primary; + + uint32_t deferred_fps; #endif /* XSPICE */ }; @@ -362,6 +365,8 @@ Bool qxl_surface_put_image (qxl_surface_t *dest, const char *src, int src_pitch); void qxl_surface_unref (surface_cache_t *cache, uint32_t surface_id); + +void surface_frame_ticker (qxl_screen_t *qxl); #if HAS_DEVPRIVATEKEYREC extern DevPrivateKeyRec uxa_pixmap_index; diff --git a/src/qxl_driver.c b/src/qxl_driver.c index e4c477a..accb964 100644 --- a/src/qxl_driver.c +++ b/src/qxl_driver.c @@ -115,6 +115,8 @@ const OptionInfoRec DefaultOptions[] = { "SpiceCacertFile", OPTV_STRING, {0}, FALSE}, { OPTION_SPICE_DH_FILE, "SpiceDhFile", OPTV_STRING, {0}, FALSE}, + { OPTION_SPICE_DEFERRED_FPS, + "DeferredFPS", OPTV_INTEGER, {0}, FALSE}, #endif { -1, NULL, OPTV_NONE, {0}, FALSE } @@ -762,7 +764,7 @@ qxl_create_screen_resources(ScreenPtr pScreen) qxl_surface_kill (surf); set_surface (pPixmap, qxl->primary); - + qxl_surface_set_pixmap (qxl->primary, pPixmap); return TRUE; } @@ -932,11 +934,16 @@ qxl_create_pixmap (ScreenPtr screen, int w, int h, int depth, unsigned usage) { /* ErrorF (" Successfully created surface in video memory\n"); */ - pixmap = fbCreatePixmap (screen, 0, 0, depth, usage); + if (qxl->deferred_fps > 0) + pixmap = fbCreatePixmap (screen, w, h, depth, usage); + else + { + pixmap = fbCreatePixmap (screen, 0, 0, depth, usage); - screen->ModifyPixmapHeader(pixmap, w, h, - -1, -1, -1, - NULL); + screen->ModifyPixmapHeader(pixmap, w, h, + -1, -1, -1, + NULL); + } #if 0 ErrorF ("Create pixmap %p with surface %p\n", pixmap, surface); @@ -1070,10 +1077,19 @@ setup_uxa (qxl_screen_t *qxl, ScreenPtr screen) #ifdef XSPICE +SpiceCoreInterface *core; +SpiceTimer *frames_timer; +void frame_ticker(void *opaque); +void frame_ticker(void *opaque) +{ + qxl_screen_t *qxl = (qxl_screen_t *) opaque; + surface_frame_ticker(qxl); + core->timer_start(frames_timer, 1000 / qxl->deferred_fps); +} + static void spiceqxl_screen_init(ScrnInfoPtr pScrn, qxl_screen_t *qxl) { - SpiceCoreInterface *core; // Init spice if (!qxl->spice_server) { @@ -1084,6 +1100,11 @@ spiceqxl_screen_init(ScrnInfoPtr pScrn, qxl_screen_t *qxl) qxl_add_spice_display_interface(qxl); qxl->worker->start(qxl->worker); qxl->worker_running = TRUE; + if (qxl->deferred_fps) + { + frames_timer = core->timer_add(frame_ticker, qxl); + core->timer_start(frames_timer, 1000 / qxl->deferred_fps); + } } qxl->spice_server = qxl->spice_server; } @@ -1529,6 +1550,13 @@ qxl_pre_init(ScrnInfoPtr pScrn, int flags) qxl->enable_image_cache? "Enabled" : "Disabled"); xf86DrvMsg(scrnIndex, X_INFO, "Fallback Cache: %s\n", qxl->enable_fallback_cache? "Enabled" : "Disabled"); +#ifdef XSPICE + qxl->deferred_fps = get_int_option(qxl->options, OPTION_SPICE_DEFERRED_FPS, "XSPICE_DEFERRED_FPS"); + if (qxl->deferred_fps > 0) + xf86DrvMsg(scrnIndex, X_INFO, "Deferred FPS: %d\n", qxl->deferred_fps); + else + xf86DrvMsg(scrnIndex, X_INFO, "Deferred Frames: Disabled\n"); +#endif if (!qxl_map_memory(qxl, scrnIndex)) goto out; diff --git a/src/qxl_surface.c b/src/qxl_surface.c index 113f09b..17501c2 100644 --- a/src/qxl_surface.c +++ b/src/qxl_surface.c @@ -69,6 +69,8 @@ struct qxl_surface_t uxa_access_t access_type; RegionRec access_region; + RegionRec updated_region; + void * address; void * end; @@ -198,6 +200,8 @@ surface_cache_init (surface_cache_t *cache, qxl_screen_t *qxl) REGION_INIT ( NULL, &(cache->all_surfaces[i].access_region), (BoxPtr)NULL, 0); + REGION_INIT ( + NULL, &(cache->all_surfaces[i].updated_region), (BoxPtr)NULL, 0); cache->all_surfaces[i].access_type = UXA_ACCESS_RO; if (i) /* surface 0 is the primary surface */ @@ -402,6 +406,7 @@ qxl_surface_cache_create_primary (surface_cache_t *cache, surface->evacuated = NULL; REGION_INIT (NULL, &(surface->access_region), (BoxPtr)NULL, 0); + REGION_INIT (NULL, &(surface->updated_region), (BoxPtr)NULL, 0); surface->access_type = UXA_ACCESS_RO; return surface; @@ -640,18 +645,6 @@ retry: surface->address = address; surface->end = (char *)address + stride * height; - cmd = make_surface_cmd (cache, surface->id, QXL_SURFACE_CMD_CREATE); - - cmd->u.surface_create.format = format; - cmd->u.surface_create.width = width; - cmd->u.surface_create.height = height; - cmd->u.surface_create.stride = - stride; - - cmd->u.surface_create.data = - physical_address (qxl, surface->address, qxl->vram_mem_slot); - - push_surface_cmd (cache, cmd); - dev_addr = (uint32_t *)((uint8_t *)surface->address + stride * (height - 1)); surface->dev_image = pixman_image_create_bits ( @@ -664,6 +657,21 @@ retry: n_live++; + if (qxl->deferred_fps > 0 && surface->id != 0) + return surface; + + cmd = make_surface_cmd (cache, surface->id, QXL_SURFACE_CMD_CREATE); + + cmd->u.surface_create.format = format; + cmd->u.surface_create.width = width; + cmd->u.surface_create.height = height; + cmd->u.surface_create.stride = - stride; + + cmd->u.surface_create.data = + physical_address (qxl, surface->address, qxl->vram_mem_slot); + + push_surface_cmd (cache, cmd); + return surface; } @@ -760,6 +768,12 @@ send_destroy (qxl_surface_t *surface) pixman_image_unref (surface->dev_image); if (surface->host_image) pixman_image_unref (surface->host_image); + + if (surface->cache->qxl->deferred_fps > 0) + { + qxl_surface_recycle (surface->cache, surface->id); + return; + } cmd = make_surface_cmd (surface->cache, surface->id, QXL_SURFACE_CMD_DESTROY); @@ -925,6 +939,18 @@ qxl_surface_prepare_access (qxl_surface_t *surface, if (!pScrn->vtSema) return FALSE; + if (surface->cache->qxl->deferred_fps > 0) + { + fbPrepareAccess(pixmap); + if (access == UXA_ACCESS_RW) + { + Bool throwaway_bool; + RegionAppend(&surface->updated_region, region); + RegionValidate(&surface->updated_region, &throwaway_bool); + } + return TRUE; + } + REGION_INIT (NULL, &new, (BoxPtr)NULL, 0); REGION_SUBTRACT (NULL, &new, region, &surface->access_region); @@ -979,6 +1005,65 @@ translate_rect (struct QXLRect *rect) rect->left = rect->top = 0; } +static void upload_one_region(qxl_surface_t *surface, int x1, int y1, int x2, int y2) +{ + struct QXLRect rect; + struct QXLDrawable *drawable; + struct QXLImage *image; + qxl_screen_t *qxl = surface->cache->qxl; + FbBits *data; + int stride; + int bpp; + + rect.left = x1; + rect.right = x2; + rect.top = y1; + rect.bottom = y2; + + drawable = make_drawable (qxl, surface->id, QXL_DRAW_COPY, &rect); + drawable->u.copy.src_area = rect; + translate_rect (&drawable->u.copy.src_area); + drawable->u.copy.rop_descriptor = ROPD_OP_PUT; + drawable->u.copy.scale_mode = 0; + drawable->u.copy.mask.flags = 0; + drawable->u.copy.mask.pos.x = 0; + drawable->u.copy.mask.pos.y = 0; + drawable->u.copy.mask.bitmap = 0; + + fbGetPixmapBitsData(surface->pixmap, data, stride, bpp); + image = qxl_image_create ( + qxl, (const uint8_t *)data, x1, y1, x2 - x1, y2 - y1, stride * sizeof(*data), + surface->bpp == 24 ? 4 : surface->bpp / 8, TRUE); + drawable->u.copy.src_bitmap = + physical_address (qxl, image, qxl->main_mem_slot); + + push_drawable (qxl, drawable); +} + +static void +upload_updated_regions(qxl_surface_t *surface) +{ + int n_boxes; + BoxPtr boxes; + n_boxes = RegionNumRects(&surface->updated_region); + boxes = RegionRects(&surface->updated_region); + + while (n_boxes--) + { + upload_one_region (surface, boxes->x1, boxes->y1, boxes->x2, boxes->y2); + boxes++; + } +} + +void surface_frame_ticker(qxl_screen_t *qxl) +{ + upload_updated_regions(qxl->primary); + RegionUninit(&qxl->primary->updated_region); + RegionInit(&qxl->primary->updated_region, NULL, 0); +} + + + static void real_upload_box (qxl_surface_t *surface, int x1, int y1, int x2, int y2) { @@ -1050,6 +1135,12 @@ qxl_surface_finish_access (qxl_surface_t *surface, PixmapPtr pixmap) int n_boxes; BoxPtr boxes; + if (surface->cache->qxl->deferred_fps > 0) + { + fbFinishAccess(pixmap); + return; + } + n_boxes = REGION_NUM_RECTS (&surface->access_region); boxes = REGION_RECTS (&surface->access_region); @@ -1231,6 +1322,27 @@ qxl_surface_solid (qxl_surface_t *destination, struct QXLRect qrect; uint32_t p; + if (qxl->deferred_fps > 0) + { + FbBits *bits; + int stride; + int bpp; + int xoff, yoff; + struct pixman_box16 box; + RegionPtr region; + Bool throwaway_bool; + fbGetDrawable((DrawablePtr)destination->pixmap, bits, stride, bpp, xoff, yoff); + pixman_fill((uint32_t *) bits, stride, bpp, x1 + xoff, y1 + yoff, x2 - x1, y2 - y1, destination->u.solid_pixel); + fbValidateDrawable(destination->pixmap); + fbFinishAccess(destination->pixmap); + box.x1 = x1; box.x2 = x2; box.y1 = y1; box.y2 = y2; + region = RegionCreate(&box, 1); + RegionAppend(&destination->updated_region, region); + RegionValidate(&destination->updated_region, &throwaway_bool); + RegionUninit(region); + return; + } + qrect.top = y1; qrect.bottom = y2; qrect.left = x1; @@ -1267,6 +1379,27 @@ qxl_surface_copy (qxl_surface_t *dest, struct QXLDrawable *drawable; struct QXLRect qrect; + if (qxl->deferred_fps > 0) + { + struct pixman_box16 box; + RegionPtr region; + Bool throwaway_bool; + GCPtr pgc; + + pgc = GetScratchGC(dest->pixmap->drawable.depth, dest->pixmap->drawable.pScreen); + ValidateGC(&dest->pixmap->drawable, pgc); + fbCopyArea(&dest->u.copy_src->pixmap->drawable, &dest->pixmap->drawable, pgc, src_x1, src_y1, width, height, dest_x1, dest_y1); + FreeScratchGC(pgc); + + box.x1 = dest_x1; box.x2 = dest_x1 + width; box.y1 = dest_y1; box.y2 = dest_y1 + height; + region = RegionCreate(&box, 1); + RegionAppend(&dest->updated_region, region); + RegionValidate(&dest->updated_region, &throwaway_bool); + RegionUninit(region); + + return; + } + #ifdef DEBUG_REGIONS print_region (" copy src", &(dest->u.copy_src->access_region)); print_region (" copy dest", &(dest->access_region)); @@ -1339,6 +1472,22 @@ qxl_surface_put_image (qxl_surface_t *dest, struct QXLRect rect; struct QXLImage *image; + if (qxl->deferred_fps > 0) + { + struct pixman_box16 box; + RegionPtr region; + Bool throwaway_bool; + box.x1 = x; box.x2 = x + width; box.y1 = y; box.y2 = y + height; + region = RegionCreate(&box, 1); + RegionAppend(&dest->updated_region, region); + RegionValidate(&dest->updated_region, &throwaway_bool); + RegionUninit(region); + + /* We can avoid doing the put image ourselves, as the uxa driver + will fall back and do it for us if we return false */ + return FALSE; + } + rect.left = x; rect.right = x + width; rect.top = y; -- 1.7.10.4
_______________________________________________ Spice-devel mailing list Spice-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/spice-devel