On Mon, 10 Oct 2016 21:57:55 +0800
Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:

> On 10/10/2016 08:59 PM, Igor Mammedov wrote:
> > On Sat, 8 Oct 2016 16:34:06 +0800
> > Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:
> >
> >> On 10/03/2016 09:48 PM, Igor Mammedov wrote:
> >>> On Fri, 12 Aug 2016 14:54:02 +0800
> >>> Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:
> >>>
> >>> The general design issue in this series is that the _FIT data are
> >>> regenerated every time inside the _FIT read loop.
> >>>
> >>> The issue here is that if the FIT data don't fit in one page, RFIT
> >>> would be called several times, resulting in nvdimm_dsm_func_read_fit()
> >>> being called N times, and in between these N exits the generated FIT
> >>> data could change (for example if another NVDIMM has been hotplugged
> >>> in between), resulting in a corrupted concatenated FIT buffer being
> >>> returned by the _FIT method.
> >>> So one needs either to block follow-up hotplug or to make sure that
> >>> the FIT data are regenerated/updated only once per _FIT invocation
> >>> and stay immutable until _FIT has completed (maybe an RCU-protected
> >>> buffer).
> >>>
> >>> Regenerating/updating inside QEMU could also be shared with the NFIT
> >>> table creation path, so that both coldplug and hotplug would reuse
> >>> the same data, which are updated only when an NVDIMM is added/removed.
> > Try to take this point into account so that the FIT data aren't
> > regenerated on every RFIT invocation.
>
> Okay.
>
> The reason that the FIT was always regenerated was to avoid saving the
> FIT during live migration. Well, it seems that it is hard to avoid
> now. :)
>
> >> Explicitly syncing up QEMU and OSPM is hard, as OSPM is running in a
> >> VM, which is vulnerable and cannot be expected to always trigger the
> >> correct sequence.
> >>
> >> So how about introducing a generation number which is updated on each
> >> hotplug event, plus a new DSM function reserved for OSPM to read this
> >> number. Then the ACPI code is like this:
> >>
> >> restart:
> >>     old-number = DSM(Read_NUM);
> >>     FIT = DSM(FIT);
> >>     new-number = DSM(Read_NUM);
> >>
> >>     if (old-number != new-number)
> >>         goto restart;
> >>
> >>     return FIT
> > It's a bit complicated.
>
> Hmm, how about this algorithm:
>
> struct fit_data {
>     u64 generation_number;
>     GArray *fit;
> };
>
> struct fit_data *fit;
> u64 current_number;
>
> The update side (hotplug happens):
>
>     new_fit = malloc();
>
>     old_fit = rcu_dereference(fit);
>
>     new_fit->generation_number = old_fit->generation_number + 1;
>     new_fit->fit = ...generate fit...;
>
>     rcu_assign(fit, new_fit);
>     sync_rcu();
>     free(old_fit);
>
> The read side (DSM issued by OSPM):
> {
>     rcu_read_lock();
>
>     fit = rcu_dereference(fit);
>
>     /* if it is the first time issuing RFIT */
>     if (fit-read-position == 0)
>         current_number = fit->generation_number;
>     else if (current_number != fit->generation_number)
>         return FAIL_FIT_CHANGED;
>
>     ... fill the returned info with fit->fit ...;
>
>     rcu_read_unlock();
> }
>
> Of course, @fit and @current_number should be persistent during live
> migration.
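Spelled out, the quoted scheme might look roughly like the sketch below. It is
only an illustration of the generation-number-under-RCU idea: FitData,
fit_update(), fit_read() and the expected_gen argument (standing in for
@current_number) are made-up names, liburcu provides the RCU primitives,
reader threads are assumed to be registered with rcu_register_thread(), and
hotplug-side updates are assumed to be serialized by the caller (e.g. under
the big QEMU lock).

/*
 * Sketch only: generation number + RCU-protected FIT buffer.
 * Names are illustrative, not actual QEMU code.
 */
#include <urcu.h>       /* liburcu: rcu_read_lock(), synchronize_rcu(), ... */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct FitData {
    uint64_t generation;    /* bumped on every NVDIMM hot-add/remove */
    size_t   len;           /* length of the serialized FIT          */
    uint8_t  buf[];         /* serialized NFIT structures            */
} FitData;

static FitData *fit;        /* RCU-protected pointer                 */

/* Update side (hotplug path); updates serialized by the caller.     */
static void fit_update(const uint8_t *data, size_t len)
{
    FitData *old_fit = fit;
    FitData *new_fit = malloc(sizeof(*new_fit) + len);

    new_fit->generation = old_fit ? old_fit->generation + 1 : 1;
    new_fit->len = len;
    memcpy(new_fit->buf, data, len);

    rcu_assign_pointer(fit, new_fit);
    synchronize_rcu();      /* wait for readers still using old_fit  */
    free(old_fit);
}

/* Read side (read-FIT DSM from OSPM).  @expected_gen plays the role of
 * @current_number: recorded on the first call (offset == 0), checked on
 * the following calls.  Returns -1 if the FIT changed mid-transfer.   */
static int fit_read(size_t offset, uint8_t *out, size_t out_len,
                    size_t *copied, uint64_t *expected_gen)
{
    int ret = 0;

    rcu_read_lock();
    FitData *cur = rcu_dereference(fit);

    if (offset == 0) {
        *expected_gen = cur->generation;
    } else if (*expected_gen != cur->generation) {
        ret = -1;                            /* FAIL_FIT_CHANGED      */
        goto out;
    }

    if (offset < cur->len) {
        size_t n = cur->len - offset;
        *copied = n < out_len ? n : out_len;
        memcpy(out, cur->buf + offset, *copied);
    } else {
        *copied = 0;
    }
out:
    rcu_read_unlock();
    return ret;
}

The generation captured at offset == 0 and carried across RFIT calls is what
lets the read side notice a FIT that changed mid-transfer and fail the
request, so the AML loop can restart from offset 0.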
You can drop RCU and @current_number, and @fit can be exactly recreated
on the target side from the "-device nvdimm ..." set of options, which
must include all currently present (including hotplugged) devices from
the source side. It's sufficient to invalidate and restart a DMA
transfer in flight, i.e.:

  pc_dimm_memory_plug() -> regenerate_fit() -> set local flag @fit-dirty

  start:
    QEMU.fit_read() -> if offset == 0: clear @fit-dirty
       (make sure fit_read() won't conflict with regenerate_fit();
        either a plain mutex or RCU would do the job)
    ...
    OSPM.rfit() ... if (@fit-dirty) abort/restart DMA from start

In the migration case, the target would have @fit-dirty set thanks to
pc_dimm_memory_plug() -> regenerate_fit(), and any DMA in flight would
be restarted. That's a little bit inefficient, as source_fit exactly
matches target_fit and the DMA could continue, but it would save us
from having FIT-specific migration state to transfer and maintain.
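For what it's worth, the @fit-dirty flow above could be sketched in C roughly
as follows, with a plain mutex as suggested. nvdimm_plug_cb(),
nvdimm_fit_read() and the assumed nvdimm_build_fit_buffer() helper are
illustrative names only, not the actual QEMU implementation.

/*
 * Sketch only: @fit-dirty flag + plain mutex.
 */
#include <glib.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

GArray *nvdimm_build_fit_buffer(void);   /* assumed: serializes the FIT
                                            from currently present NVDIMMs */

static GMutex  fit_lock;
static GArray *fit_buf;                  /* byte array with the current FIT,
                                            assumed built at machine init   */
static bool    fit_dirty;

/* Hotplug path: pc_dimm_memory_plug() -> regenerate FIT, mark it dirty. */
static void nvdimm_plug_cb(void)
{
    g_mutex_lock(&fit_lock);
    if (fit_buf) {
        g_array_free(fit_buf, TRUE);
    }
    fit_buf = nvdimm_build_fit_buffer();
    fit_dirty = true;
    g_mutex_unlock(&fit_lock);
}

/* Read-FIT DSM handler: offset == 0 starts a fresh transfer and clears
 * the flag; a set flag at offset > 0 means the FIT changed mid-transfer,
 * so the caller reports a status that makes OSPM restart from offset 0. */
static int nvdimm_fit_read(size_t offset, uint8_t *out, size_t out_len,
                           size_t *copied)
{
    int ret = 0;

    g_mutex_lock(&fit_lock);
    if (offset == 0) {
        fit_dirty = false;
    } else if (fit_dirty) {
        ret = -1;                        /* abort, OSPM restarts the read */
        goto unlock;
    }

    if (offset < fit_buf->len) {
        *copied = MIN(out_len, fit_buf->len - offset);
        memcpy(out, fit_buf->data + offset, *copied);
    } else {
        *copied = 0;
    }
unlock:
    g_mutex_unlock(&fit_lock);
    return ret;
}

Because the buffer and the dirty flag are rebuilt on the target by the normal
device-plug path, nothing FIT-specific has to be carried in the migration
stream; the price is one aborted and restarted read after migration or after
a hotplug that races with an in-flight _FIT.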