On Tue, 8 Sep 2015 21:38:17 +0800
Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:

>
> On 09/07/2015 10:11 PM, Igor Mammedov wrote:
> > On Fri, 14 Aug 2015 22:52:01 +0800
> > Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx> wrote:
> >
> >> The parameter @file is used as the backing memory for the NVDIMM; it is
> >> divided into two parts if @dataconfig is true:
> >> - the first part, (0, size - 128K], is used as PMEM (Persistent Memory)
> >> - the 128K at the end of the file is used as the Config Data Area, which
> >>   stores Label namespace data
> >>
> >> @file supports both regular files and block devices, and either kind can
> >> be assigned for test and emulation. In the real world, however, for
> >> performance reasons we would usually use one of these as the NVDIMM
> >> backing file:
> >> - a regular file, created on an NVDIMM device on the host, in a
> >>   filesystem with DAX enabled
> >> - a raw PMEM device on the host, e.g. /dev/pmem0
> >
> > A lot of code in this series could reuse what QEMU already
> > uses for implementing pc-dimm devices.
> >
> > Here are common concepts that could be reused:
> > - On a physical system both DIMM and NVDIMM devices use the same slots.
> >   We could share QEMU's '-m slots' option between both devices. An
> >   alternative to sharing would be to introduce a '-machine nvdimm_slots'
> >   option. And yes, we need to know the number of NVDIMMs to describe them
> >   all in the ACPI table (taking into account possible future hotplug of
> >   NVDIMM devices).
> >   I'd go the same way as on real hardware and make them share the same
> >   slots.
>
> I'd prefer sharing slots between pc-dimm and nvdimm; it makes it easier to
> reuse the logic of slot assignment and plug/unplug.
>
> > - They share the same physical address space and the limits on how much
> >   memory the system can handle. So I'd suggest sharing the existing
> >   '-m maxmem' option and reusing the hotplug_memory address space.
>
> Sounds good to me.
>
> > Essentially what I'm suggesting is to inherit NVDIMM's implementation
> > from pc-dimm, reusing all of its code/backends and just overriding the
> > parts that do memory mapping into the guest's address space to
> > accommodate NVDIMM's requirements.
>
> Good idea!
>
> We have to differentiate pc-dimm and nvdimm in the common code, and nvdimm
> differs from pc-dimm in some respects (for example, it has a reserved
> region and needs to support live migration of label data). How about
> renaming 'pc-nvdimm' to 'memory-device', making it a common device type,
> and then building pc-dimm and nvdimm on top of it?

Sounds good, only I'd call it just 'dimm', as 'memory-device' is too broad.
Also I'd make the base class abstract.

> Something like:
>
> static TypeInfo memory_device_info = {
>     .name   = TYPE_MEM_DEV,
>     .parent = TYPE_DEVICE,
> };
>
> static TypeInfo pc_dimm_info = {
>     .name   = TYPE_PC_DIMM,
>     .parent = TYPE_MEM_DEV,
> };
>
> static TypeInfo nvdimm_info = {
>     .name   = TYPE_NVDIMM,
>     .parent = TYPE_MEM_DEV,
> };
>
> It also makes CONFIG_NVDIMM and CONFIG_HOT_PLUG independent.
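[Editor's sketch] QOM supports abstract base types directly, so making the proposed base class abstract is a one-line flag in its TypeInfo. The fragment below does not compile outside QEMU, and the `DIMMDevice`/`DIMMDeviceClass` struct names are illustrative:

```c
/* Abstract base type: QOM refuses to instantiate it directly, so only
 * the concrete pc-dimm and nvdimm subtypes can be created. */
static const TypeInfo dimm_info = {
    .name          = TYPE_DIMM,
    .parent        = TYPE_DEVICE,
    .instance_size = sizeof(DIMMDevice),
    .class_size    = sizeof(DIMMDeviceClass),
    .abstract      = true,
};
```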
> >
> >> Signed-off-by: Xiao Guangrong <guangrong.xiao@xxxxxxxxxxxxxxx>
> >> ---
> >>  hw/mem/nvdimm/pc-nvdimm.c  | 109 ++++++++++++++++++++++++++++++++++++++++++++-
> >>  include/hw/mem/pc-nvdimm.h |   7 +++
> >>  2 files changed, 115 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/mem/nvdimm/pc-nvdimm.c b/hw/mem/nvdimm/pc-nvdimm.c
> >> index 7a270a8..97710d1 100644
> >> --- a/hw/mem/nvdimm/pc-nvdimm.c
> >> +++ b/hw/mem/nvdimm/pc-nvdimm.c
> >> @@ -22,12 +22,20 @@
> >>   * License along with this library; if not, see <http://www.gnu.org/licenses/>
> >>   */
> >>
> >> +#include <sys/mman.h>
> >> +#include <sys/ioctl.h>
> >> +#include <linux/fs.h>
> >> +
> >> +#include "exec/address-spaces.h"
> >>  #include "hw/mem/pc-nvdimm.h"
> >>
> >> -#define PAGE_SIZE               (1UL << 12)
> >> +#define PAGE_SIZE               (1UL << 12)
> >> +
> >> +#define MIN_CONFIG_DATA_SIZE    (128 << 10)
> >>
> >>  static struct nvdimms_info {
> >>      ram_addr_t current_addr;
> >> +    int device_index;
> >>  } nvdimms_info;
> >>
> >>  /* the address range [offset, ~0ULL) is reserved for NVDIMM. */
> >> @@ -37,6 +45,26 @@ void pc_nvdimm_reserve_range(ram_addr_t offset)
> >>      nvdimms_info.current_addr = offset;
> >>  }
> >>
> >> +static ram_addr_t reserved_range_push(uint64_t size)
> >> +{
> >> +    uint64_t current;
> >> +
> >> +    current = ROUND_UP(nvdimms_info.current_addr, PAGE_SIZE);
> >> +
> >> +    /* do not have enough space? */
> >> +    if (current + size < current) {
> >> +        return 0;
> >> +    }
> >> +
> >> +    nvdimms_info.current_addr = current + size;
> >> +    return current;
> >> +}
> >
> > You can't use all memory above the hotplug_memory area, since
> > we have to tell the guest where the 64-bit PCI window starts,
> > and currently it should start at reserved-memory-end
> > (but it doesn't, due to a bug: I've just posted a fix to qemu-devel,
> > "[PATCH 0/2] pc: fix 64-bit PCI window clashing with memory hotplug region").
>
> Ah, got it, thanks for pointing it out.
> >
> >> +
> >> +static uint32_t new_device_index(void)
> >> +{
> >> +    return nvdimms_info.device_index++;
> >> +}
> >> +
> >>  static char *get_file(Object *obj, Error **errp)
> >>  {
> >>      PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> >> @@ -48,6 +76,11 @@ static void set_file(Object *obj, const char *str, Error **errp)
> >>  {
> >>      PCNVDIMMDevice *nvdimm = PC_NVDIMM(obj);
> >>
> >> +    if (memory_region_size(&nvdimm->mr)) {
> >> +        error_setg(errp, "cannot change property value");
> >> +        return;
> >> +    }
> >> +
> >>      if (nvdimm->file) {
> >>          g_free(nvdimm->file);
> >>      }
> >> @@ -76,13 +109,87 @@ static void pc_nvdimm_init(Object *obj)
> >>                                   set_configdata, NULL);
> >>  }
> >>
> >> +static uint64_t get_file_size(int fd)
> >> +{
> >> +    struct stat stat_buf;
> >> +    uint64_t size;
> >> +
> >> +    if (fstat(fd, &stat_buf) < 0) {
> >> +        return 0;
> >> +    }
> >> +
> >> +    if (S_ISREG(stat_buf.st_mode)) {
> >> +        return stat_buf.st_size;
> >> +    }
> >> +
> >> +    if (S_ISBLK(stat_buf.st_mode) && !ioctl(fd, BLKGETSIZE64, &size)) {
> >> +        return size;
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >
> > All this file stuff I'd leave to the already existing backends like
> > memory-backend-file or even memory-backend-ram, which already do the
> > above and more, allowing persistent and volatile NVDIMMs to be
> > configured without changing NVDIMM frontend code.
>
> The current memory backends use the whole memory size and map it into the
> guest's address space. However, nvdimm needs a reserved region for its
> label data, which is only accessed inside QEMU.
>
> How about introducing two parameters, "reserved_size" and "reserved_addr",
> to TYPE_MEMORY_BACKEND? Then only the memory region
> [0, size - reserved_size) is mapped into the guest, and the remaining part
> is pointed to by "reserved_addr".

It looks like "reserved_size" alone is sufficient, if there aren't any plans
or reasons to keep the label area anywhere other than at the end of the
NVDIMM. Also, keeping the reserved area in the same backend MemoryRegion
means it should automatically be migrated during live migration.
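[Editor's sketch] QEMU's existing MemoryRegion alias mechanism can express this backend split: the guest maps an alias covering only the first `size - reserved_size` bytes, while the trailing label area stays QEMU-internal yet lives in the same RAM block (and so migrates with it). The fragment below does not compile outside QEMU, and the `nvdimm->hostmem`/`nvdimm->pmem_mr` field names are invented for illustration:

```c
/* Guest sees only the first (size - reserved_size) bytes of the backend. */
MemoryRegion *backend_mr = host_memory_backend_get_memory(nvdimm->hostmem,
                                                          errp);
uint64_t label_size = 128 * 1024;                       /* reserved_size */
uint64_t pmem_size = memory_region_size(backend_mr) - label_size;

memory_region_init_alias(&nvdimm->pmem_mr, OBJECT(nvdimm), "nvdimm-pmem",
                         backend_mr, 0, pmem_size);
/* map the alias, not the whole backend region, into the guest */
memory_region_add_subregion(hotplug_mr, addr, &nvdimm->pmem_mr);

/* direct host pointer to the label area for QEMU-internal access */
uint8_t *label_data = (uint8_t *)memory_region_get_ram_ptr(backend_mr)
                      + pmem_size;
```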
To separate out and map the guest-visible data part of the backend's
MemoryRegion you could use memory_region_init_alias().

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html