On Tue, 8 Sep 2015 22:03:01 +0800 Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote: > > > On 09/07/2015 09:40 PM, Igor Mammedov wrote: > > On Sun, 6 Sep 2015 14:07:21 +0800 > > Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote: > > > >> > >> > >> On 09/02/2015 07:31 PM, Igor Mammedov wrote: > >>> On Wed, 2 Sep 2015 18:36:43 +0800 > >>> Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote: > >>> > >>>> > >>>> > >>>> On 09/02/2015 05:58 PM, Igor Mammedov wrote: > >>>>> On Fri, 14 Aug 2015 22:51:59 +0800 > >>>>> Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote: > >>>>> > >>>>>> Introduce "pc-nvdimm" device and it has two parameters: > >>>>> Why do you use prefix "pc-", I suppose we potentially > >>>>> could use this device not only with x86 targets but with > >>>>> other targets as well. > >>>>> I'd just drop 'pc' prefix through out patchset. > >>>> > >>>> Yeah, the prefix is stolen from pc-dimm, will drop this > >>>> prefix as your suggestion. > >>>> > >>>>> > >>>>>> - @file, which is the backed memory file for NVDIMM device > >>>>> Could you try to split device into backend/frontend parts, > >>>>> like it's done with pc-dimm. As I understand it's preferred > >>>>> way to implement this kind of devices. > >>>>> Then you could reuse memory backends that we already have > >>>>> including file backend. > >>>> > >>>> I considered it too and Stefan, Paolo got the some idea in > >>>> V1's review, however: > >>>> > >>>> | However, file-based memory used by NVDIMM is special, it divides the file > >>>> | to two parts, one part is used as PMEM and another part is used to store > >>>> | NVDIMM's configure data. > >>>> | > >>>> | Maybe we can introduce "end-reserved" property to reserve specified size > >>>> | at the end of the file. Or create a new class type based on > >>>> | memory-backend-file (named nvdimm-backend-file) class to hide this magic > >>>> | thing? > >>> I'd go with separate backend/frontend idea. > >>> > >>> Question is if this config area is part backend or frontend? > >> > >> Configdata area is used to store nvdimm device's configuration, normally, it's > >> namespace info. > >> > >> Currently, we chosen configdata located at the end of nvdimm's backend-memory > >> as it's easy to configure / use and configdata is naturally non-volatile and it > >> is like the layout on physical device. > >> > >> However, using two separated backed-memory is okay, for example: > >> -object memory-backend-file,id=mem0,file=/storage/foo > >> -object memory-backend-file,id=mem1,file=/storage/bar > >> -device nvdimm,memdev=mem0,configdata=mem1 > >> then configdata is written to a single backend. > >> > >> Which one is better for you? :) > >> > >>> If we pass-through NVDIMM device do we need to set configdata=true > >>> and QEMU would skip building config structures and use structures > >>> that are already present on passed-through device in that place? > >>> > >> > >> The file specified by @file is something like a normal disk, like /dev/sda/, > >> host process can use whole space on it. If we want to directly pass it to guest, > >> we can specify 'configdata=false'. If we allow guest to 'partition' (create > >> namespace on) it then we use 'configdata=true' to reserve some space to store > >> its partition info (namesapce info). > > As far as I understand currently linux provides to userspace only one interface > > which is block device i.e. /dev/sdX and on top of it userspace can put > > PM/DAX aware filesystem and use files from it. In either cases kernel > > just provides access to separate namespaces and not to a whole NVDIMM which > > includes 'labels area'. Hence /dev/sdX is not passed-though NVDIMM, > > so we could consider it as just a file/storage that could be used by userspace. > > > > Yes, it is. > > > Lets assume that NVDIMM should always have 'labels area'. > > In that case I'd always reserve space for it and > > * format it (build a new one) if backend doesn't have a > > valid labels area dropping configdata parameter along the way On a second glance, qemu probably shouldn't 'format/build' labels in this case and only supply non configured NVDIMM to guest. It probably should be guest responsibility to configure namespaces on bare NVDIMM. Also it would allow to seamlessly introduce separate labels backend as described below to cover 'configdata = false' case. > > * or if backing-file already has valid labels area I'd just use it. > > Yes. > > > > > If you need to make labels area readonly you can introduce 'NVDIMM.readonly_labels' > > option and just use labels backend's without allowing changes writeback. > > IT would be better to make it another series on top of basic NVDIMM implementation > > if there is an actual usecase for it. > > I'd prefer the way that discards not only its label data but also the whole nvdimm device, > that is, open(, RDONLY) + mmap(, MAP_PRIVATE), the idea was raised by Stefan. Yes, we can do this but it's not related to NVDIMM at all, it's backend's feature that also could be (re)used by pc-dimm with file backend. I think it could be done on top. > > The 'configdata = false' in this patchset does not aim at making label readonly, it provides > a way to make the file in a single partition. For example, you create a image in /dev/pmem0, > pass it to guest, then the whole file will appear at /dev/pmem0 in guest, and guest can directly > use the image in that device. Under this case, no file region is reserved and the label data is > build in memory which can not be updated by guest. If we will have 'configdata = false' as in this patchset, one will have to take special care to keep it bug compatible with previous versions as implementation changes/fixes might affect compatibility. I think it is simpler and more robust to always have metadata in backend. That's also would be migration friendly since labels area will stay consistent with data it describes, regardless of QEMU version when we do cross version migration. Later we could introduce a separate option for a separate labels area to cover above described usecase. > > > > > PS: > > Also when you write commit messages, comment and name variables try to use terms from > > relevant spec and mention specs where you describe data structures from them. > > Parts of the names/definitions were stolen from Kernel NVDIMM driver, i will update it to > let them reflect the specs. Thanks, Igor. > > > > > > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html