We run the VP8 encoder in real-time mode so that it uses only the
minimum amount of time needed to encode each frame. However, by default
it uses only one thread, so for large or complex frames it may run at
less than the source fps. Besides resulting in dropped frames, this
blocks the main server thread for most of the time.

So this patch configures the VP8 encoder to use all of the CPU's
physical cores, resulting in less wall clock time spent in
encode_frame().

Signed-off-by: Francois Gouget <fgouget@xxxxxxxxxxxxxxx>
---

I am resubmitting this patch because I think the reasons for not
applying it last time were wrong. See:
https://lists.freedesktop.org/archives/spice-devel/2016-March/027026.html

Here is an illustration of the impact of threading for the
big_buck_bunny_1080p_h264.mov video:
http://fgouget.free.fr/tmp/Spice-vp8-threads.png
http://fgouget.free.fr/tmp/Spice-vp8-threads.xls

The graph shows the time spent in encode_frame() (taken from the
standard traces) when vp8enc uses 1, 2 or 4 threads.

One can see that the one-thread line spends quite a bit of time above
the 33 ms mark, which corresponds to the 30 fps of the source material.
This means dropped frames. Indeed, the x axis corresponds to the frame
number, and we can clearly see the one-thread line getting out of sync
with the others because it encoded fewer frames.

The two-thread line is much lower and only goes above the 33 ms mark
for a short time.

The four-thread line is a bit lower still, but we can also see
diminishing returns there.

I'll also note that the h264 encoder already uses multiple threads
automatically, so this patch only brings vp8enc in line with it.

 configure.ac               |  4 ++++
 server/gstreamer-encoder.c | 32 ++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 68aed15..6742577 100644
--- a/configure.ac
+++ b/configure.ac
@@ -150,6 +150,10 @@ AC_SUBST([SPICE_PROTOCOL_MIN_VER])
 PKG_CHECK_MODULES([GLIB2], [glib-2.0 >= 2.22 gio-2.0 >= 2.22])
 AS_VAR_APPEND([SPICE_REQUIRES], [" glib-2.0 >= 2.22 gio-2.0 >= 2.22"])
 
+AC_CHECK_LIB(glib-2.0, g_get_num_processors,
+    AC_DEFINE([HAVE_G_GET_NUMPROCESSORS], 1, [Defined if we have g_get_num_processors()]),,
+    $GLIB2_LIBS)
+
 PKG_CHECK_MODULES([GOBJECT2], [gobject-2.0 >= 2.22])
 AS_VAR_APPEND([SPICE_REQUIRES], [" gobject-2.0 >= 2.22"])
 
diff --git a/server/gstreamer-encoder.c b/server/gstreamer-encoder.c
index a101ab6..eb2a28c 100644
--- a/server/gstreamer-encoder.c
+++ b/server/gstreamer-encoder.c
@@ -866,6 +866,27 @@ static GstFlowReturn new_sample(GstAppSink *gstappsink, gpointer video_encoder)
     return GST_FLOW_OK;
 }
 
+static int physical_core_count = 0;
+static int get_physical_core_count(void)
+{
+    if (!physical_core_count) {
+#ifdef HAVE_G_GET_NUMPROCESSORS
+        physical_core_count = g_get_num_processors();
+#elif defined(_SC_NPROCESSORS_ONLN)
+        physical_core_count = sysconf(_SC_NPROCESSORS_ONLN);
+#endif
+        if (system("egrep -l '^flags\\b.*: .*\\bht\\b' /proc/cpuinfo >/dev/null 2>&1") == 0) {
+            /* Hyperthreading is enabled so divide by two to get the number
+             * of physical cores.
+             */
+            physical_core_count = physical_core_count / 2;
+        }
+        if (physical_core_count == 0)
+            physical_core_count = 1;
+    }
+    return physical_core_count;
+}
+
 static const gchar* get_gst_codec_name(SpiceGstEncoder *encoder)
 {
     switch (encoder->base.codec_type)
@@ -887,6 +908,7 @@ static const gchar* get_gst_codec_name(SpiceGstEncoder *encoder)
     }
 }
 
+/* A helper for spice_gst_encoder_encode_frame() */
 static gboolean create_pipeline(SpiceGstEncoder *encoder)
 {
 #ifdef HAVE_GSTREAMER_0_10
@@ -925,11 +947,17 @@ static gboolean create_pipeline(SpiceGstEncoder *encoder)
          *   75% CPU usage while speed simply prioritizes encoding speed.
          * - deadline is supposed to be set in microseconds but in practice
          *   it behaves like a boolean.
+         * - At least up to GStreamer 1.6.2, vp8enc cannot be trusted to pick
+         *   the optimal number of threads. Also exceeding the number of
+         *   physical core really degrades image quality.
+         * - token-parts/token-partitions parallelizes more operations.
          */
+        int threads = get_physical_core_count();
+        int parts = threads < 2 ? 0 : threads < 4 ? 1 : threads < 8 ? 2 : 3;
 #ifdef HAVE_GSTREAMER_0_10
-        gstenc_opts = g_strdup_printf("mode=cbr min-quantizer=10 error-resilient=true max-latency=0 speed=7");
+        gstenc_opts = g_strdup_printf("mode=cbr min-quantizer=10 error-resilient=true max-latency=0 speed=7 threads=%d token-parts=%d", threads, parts);
 #else
-        gstenc_opts = g_strdup_printf("end-usage=cbr min-quantizer=10 error-resilient=default lag-in-frames=0 deadline=1 cpu-used=4");
+        gstenc_opts = g_strdup_printf("end-usage=cbr min-quantizer=10 error-resilient=default lag-in-frames=0 deadline=1 cpu-used=4 threads=%d token-partitions=%d", threads, parts);
 #endif
         break;
     }
-- 
2.10.1

_______________________________________________
Spice-devel mailing list
Spice-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/spice-devel