On Wed, Jul 25, 2018 at 05:44:02PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@xxxxxxxxxx) wrote:
> > On Mon, Jul 23, 2018 at 03:39:18PM +0800, Xiao Guangrong wrote:
> > >
> > >
> > > On 07/23/2018 12:36 PM, Peter Xu wrote:
> > > > On Thu, Jul 19, 2018 at 08:15:15PM +0800, guangrong.xiao@xxxxxxxxx wrote:
> > > > > @@ -1597,6 +1608,24 @@ static void migration_update_rates(RAMState *rs, int64_t end_time)
> > > > >                               rs->xbzrle_cache_miss_prev) / iter_count;
> > > > >          rs->xbzrle_cache_miss_prev = xbzrle_counters.cache_miss;
> > > > >      }
> > > > > +
> > > > > +    if (migrate_use_compression()) {
> > > > > +        uint64_t comp_pages;
> > > > > +
> > > > > +        compression_counters.busy_rate = (double)(compression_counters.busy -
> > > > > +            rs->compress_thread_busy_prev) / iter_count;
> > > >
> > > > Here I'm not sure it's correct...
> > > >
> > > > "iter_count" stands for ramstate.iterations. It's increased per
> > > > ram_find_and_save_block(), so IMHO it might contain multiple guest
> > >
> > > ram_find_and_save_block() returns if a page is successfully posted and
> > > it only posts 1 page out at one time.
> >
> > ram_find_and_save_block() calls ram_save_host_page(), and we should be
> > sending multiple guest pages in ram_save_host_page() if the host page
> > is a huge page?
> >
> > >
> > > > pages. However compression_counters.busy should be per guest page.
> > >
> > >
> > > Actually, it's derived from xbzrle_counters.cache_miss_rate:
> > >     xbzrle_counters.cache_miss_rate = (double)(xbzrle_counters.cache_miss -
> > >         rs->xbzrle_cache_miss_prev) / iter_count;
> >
> > Then this is suspicious to me too...
>
> Actually; I think this isn't totally wrong; iter_count is the *difference* in
> iterations since the last time it was updated:
>
>     uint64_t iter_count = rs->iterations - rs->iterations_prev;
>
>     xbzrle_counters.cache_miss_rate = (double)(xbzrle_counters.cache_miss -
>         rs->xbzrle_cache_miss_prev) / iter_count;
>
> so this is:
>       cache-misses-since-last-update
>       ------------------------------
>        iterations since last-update
>
> so the 'miss_rate' is ~misses / iteration.
> Although that doesn't really correspond to time.

I'm not sure I follow. My concern is that the two counters have
different granularities, which might be problematic:

  - xbzrle_counters.cache_miss is incremented in save_xbzrle_page(),
    so it has per-guest-page granularity

  - RAMState.iterations is incremented once per
    ram_find_and_save_block() call, so it has per-host-page
    granularity

For example, when we migrate a 2M huge page in the guest, we will only
increase RAMState.iterations by 1 (since ram_find_and_save_block()
will be called once), but we might increase xbzrle_counters.cache_miss
2M/4K = 512 times (we'll call save_xbzrle_page() that many times) if
every page misses the cache. Then IMHO the cache miss rate will be
512/1 = 51200% (while it should actually be just 100%).

Regards,

--
Peter Xu
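
P.S. To make the granularity mismatch concrete, here is a minimal
standalone sketch. It is not QEMU code: struct counters,
save_host_page(), rate_percent() and the guest_pages field are
hypothetical stand-ins, and the only assumption taken from the thread
is that the iteration counter is bumped once per host page while the
miss counter is bumped once per guest page.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the counters discussed above. */
struct counters {
    uint64_t iterations;   /* bumped once per host page sent */
    uint64_t cache_miss;   /* bumped once per guest page that misses */
    uint64_t guest_pages;  /* hypothetical per-guest-page denominator */
};

/*
 * Model sending one host page in which every guest page misses the
 * cache (the worst case used in the example above).
 */
static void save_host_page(struct counters *c,
                           uint64_t host_page_size,
                           uint64_t guest_page_size)
{
    uint64_t i, n;

    assert(guest_page_size != 0);
    assert(host_page_size % guest_page_size == 0);

    n = host_page_size / guest_page_size;
    c->iterations++;                 /* per-host-page granularity */
    for (i = 0; i < n; i++) {
        c->cache_miss++;             /* per-guest-page granularity */
        c->guest_pages++;
    }
}

/* Rate in percent; returns 0 when the denominator is zero. */
static double rate_percent(uint64_t events, uint64_t denom)
{
    return denom ? 100.0 * (double)events / (double)denom : 0.0;
}
```

Calling save_host_page(&c, 2 * 1024 * 1024, 4 * 1024) on a zeroed
struct leaves iterations at 1 and cache_miss at 512, so
rate_percent(c.cache_miss, c.iterations) gives 51200, while dividing
by the per-guest-page count gives 100. Which denominator the real code
should use is of course up for discussion; the sketch only shows the
arithmetic.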