> -----Original Message----- > From: Dan Williams <dan.j.williams@xxxxxxxxx> > Sent: Wednesday, 1 April 2020 7:48 PM > To: Alastair D'Silva <alastair@xxxxxxxxxxx> > Cc: Aneesh Kumar K . V <aneesh.kumar@xxxxxxxxxxxxx>; Oliver O'Halloran > <oohall@xxxxxxxxx>; Benjamin Herrenschmidt > <benh@xxxxxxxxxxxxxxxxxxx>; Paul Mackerras <paulus@xxxxxxxxx>; Michael > Ellerman <mpe@xxxxxxxxxxxxxx>; Frederic Barrat <fbarrat@xxxxxxxxxxxxx>; > Andrew Donnellan <ajd@xxxxxxxxxxxxx>; Arnd Bergmann > <arnd@xxxxxxxx>; Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>; > Vishal Verma <vishal.l.verma@xxxxxxxxx>; Dave Jiang > <dave.jiang@xxxxxxxxx>; Ira Weiny <ira.weiny@xxxxxxxxx>; Andrew Morton > <akpm@xxxxxxxxxxxxxxxxxxxx>; Mauro Carvalho Chehab > <mchehab+samsung@xxxxxxxxxx>; David S. Miller <davem@xxxxxxxxxxxxx>; > Rob Herring <robh@xxxxxxxxxx>; Anton Blanchard <anton@xxxxxxxxxx>; > Krzysztof Kozlowski <krzk@xxxxxxxxxx>; Mahesh Salgaonkar > <mahesh@xxxxxxxxxxxxxxxxxx>; Madhavan Srinivasan > <maddy@xxxxxxxxxxxxxxxxxx>; Cédric Le Goater <clg@xxxxxxxx>; Anju T > Sudhakar <anju@xxxxxxxxxxxxxxxxxx>; Hari Bathini > <hbathini@xxxxxxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Greg > Kurz <groug@xxxxxxxx>; Nicholas Piggin <npiggin@xxxxxxxxx>; Masahiro > Yamada <yamada.masahiro@xxxxxxxxxxxxx>; Alexey Kardashevskiy > <aik@xxxxxxxxx>; Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>; > linuxppc-dev <linuxppc-dev@xxxxxxxxxxxxxxxx>; linux-nvdimm <linux- > nvdimm@xxxxxxxxxxxx>; Linux MM <linux-mm@xxxxxxxxx> > Subject: Re: [PATCH v4 00/25] Add support for OpenCAPI Persistent Memory > devices > > On Sun, Mar 29, 2020 at 10:23 PM Alastair D'Silva <alastair@xxxxxxxxxxx> > wrote: > > > > This series adds support for OpenCAPI Persistent Memory devices on > > bare metal (arch/powernv), exposing them as nvdimms so that we can > > make use of the existing infrastructure. There already exists a driver > > for the same devices abstracted through PowerVM (arch/pseries): > > arch/powerpc/platforms/pseries/papr_scm.c > > > > These devices are connected via OpenCAPI, and present as LPC (lowest > coherence point) memory to the system, practically, that means that > memory on these cards could be treated as conventional, cache-coherent > memory. > > > > Since the devices are connected via OpenCAPI, they are not enumerated > via ACPI. Instead, OpenCAPI links present as pseudo-PCI bridges, with > devices below them. > > > > This series introduces a driver that exposes the memory on these cards as > nvdimms, with each card getting it's own bus. This is somewhat complicated > by the fact that the cards do not have out of band persistent storage for > metadata, so 1 SECTION_SIZE's (see SPARSEMEM) worth of storage is carved > out of the top of the card storage to implement the ndctl_config_* calls. > > Is it really tied to section-size? Can't that change based on the configured > page-size? It's not clear to me why that would be the choice, but I'll dig into > the implementation. > I had tried using PAGE_SIZE, but ran into problems carving off just 1 page and handing it to the kernel, while leaving the rest as pmem. That was a while ago though, so maybe I should retry it. > > The driver is not responsible for configuring the NPU (NVLink Processing > Unit) BARs to map the LPC memory from the card into the system's physical > address space, instead, it requests this to be done via OPAL calls (typically > implemented by Skiboot). > > Are OPAL calls similar to ACPI DSMs? I.e. methods for the OS to invoke > platform firmware services? What's Skiboot? > Yes, OPAL is the interface to firmware for POWER. Skiboot is the open-source (and only) implementation of OPAL. > > > > The series is structured as follows: > > - Required infrastructure changes & cleanup > > - A minimal driver implementation > > - Implementing additional features within the driver > > Thanks for the intro and the changelog! > > > > > Changelog: > > V4: > > - Rebase on next-20200320 > > Do you have dependencies on other material that's in -next? Otherwise - > next is only a viable development baseline if you are going to merge through > Andrew's tree. > > > - Bump copyright to 2020 > > - Ensure all uapi headers use C89 compatible comments (missed > ocxlpmem.h) > > - Move the driver back to drivers/nvdimm/ocxl, after confirmation > > that this location is desirable > > - Rename ocxl.c to ocxlpmem.c (+ support files) > > - Rename all ocxl_pmem to ocxlpmem > > - Address checkpatch --strict issues > > - "powerpc/powernv: Add OPAL calls for LPC memory alloc/release" > > - Pass base address as __be64 > > - "ocxl: Tally up the LPC memory on a link & allow it to be mapped" > > - Address checkpatch spacing warnings > > - Reword blurb > > - Reword size description for ocxl_link_add_lpc_mem() > > - Add an early exit in ocxl_link_lpc_release() to avoid triggering > > bogus warnings if called after ocxl_link_lpc_map() fails > > - "powerpc/powernv: Add OPAL calls for LPC memory alloc/release" > > - Reword blurb > > - "powerpc/powernv: Map & release OpenCAPI LPC memory" > > - Reword blurb > > - Move minor_idr init from file_init() to ocxlpmem_init() (fixes runtime > error > > in "nvdimm: Add driver for OpenCAPI Persistent Memory") > > - Wrap long lines > > - "nvdimm: Add driver for OpenCAPI Storage Class Memory" > > - Remove '+ 1' workround from serial number->cookie assignment > > - Drop out of memory message for ocxlpmem in probe() > > - Fix leaks of ocxlpmem & ocxlpmem->ocxl_fn in probe() > > - remove struct ocxlpmem_function0, it didn't value add > > - factor out err_unregistered label in probe > > - Address more checkpatch warnings > > - get/put the pci dev on probe/free > > - Drop ocxlpmem_ prefix from static functions > > - Propogate errors up from called functions in probe() > > - Set MODULE_LICENSE to GPLv2 > > - Add myself as module author > > - Call nvdimm_bus_unregister() in remove() to release references > > - Don't call devm_memunmap on metadata_address, the release > handler on > > the device already deals with this > > - "nvdimm/ocxl: Read the capability registers & wait for device ready" > > - Fix mask for read_latency > > - Fold in is_usable logic into timeout to remove error message race > > - propogate bad rc from read_device_metadata > > - "nvdimm/ocxl: Add register addresses & status values to the header" > > - Add comments for register abbreviations where names have been > > expanded > > - Add missing status for blocked on background task > > - Alias defines for firmware update status to show that the duplication > > of values is intentional > > - "nvdimm/ocxl: Register a character device for userspace to interact with" > > - Add lock around minors IDR, delete the cdev before > device_unregister > > - Propogate errors up from called functions in probe() > > - "nvdimm/ocxl: Add support for Admin commands" > > - Fix typo in setup_command_data error message, and drop 'ocxl' from > it > > - Drop vestigial CHI read from admin_command_request > > - Change command ID mismatch message to dev_err, and return an > error > > - Use jiffies to implement admin_command_complete_timeout() > > - Flesh out blurb > > - Create a wrapper to issue the command & wait for timeout > > - "nvdimm/ocxl: Add support for near storage commands" > > - dropped (will submit with the patches for nvdimm overwrite) > > - "nvdimm/ocxl: Implement the Read Error Log command" > > - Remove stray blank line > > - change misplaced goto to an early exit in read_error_log > > - Inline error_log_offset_0x08 > > - Read WWID data as LE rather than host endian > > - Move the include of nvdimm/ocxlpmem.h to ocxl.c > > - Add padding after fwrevision in struct ioctl_ocxl_pmem_error_log > > - Register IOCTL magic > > - Coerce pointers to __u64 in IOCTLs > > - "nvdimm/ocxl: Add controller dump IOCTLs" > > - Coerce pointers to __u64 in IOCTLs > > - Document expected IOCTL usage in blurb > > - Add missing rc check > > - Only populate up to the number of bytes returned by the card, > > and return this length to the caller > > - Add missing header check > > - "nvdimm/ocxl: Add an IOCTL to report controller statistics" > > - Update to match the latest version of the spec > > - Verify that parametr block IDs & lengths match what we expect > > - Use defines for offsets > > - "nvdimm/ocxl: Forward events to userspace" > > - Don't enable NSCRA doorbell > > - return -EBUSY if the event context is already used > > - return -ENODEV if IRQs cannot be mapped > > - Tag IRQ pointers with __iomem > > - Drop ocxlpmem_ prefix from static functions > > - Propogate error from eventfd_ctx_fdget > > - Fix error check in copy_to_user > > - Drop GLOBAL_MMIO_CHI_NSCRA (this should be in the overwrite > patch) > > - Drop unused irq_pgmap > > - Don't redef BIT_ULL > > - "nvdimm/ocxl: Add debug IOCTLs" > > - Eliminate clearing loop (now done in admin_command_execute() > > - Drop dummy IOCTLs if CONFIG_OCXL_PMEM_DEBUG is not set > > - Group debug IOCTLs together & comment that they may not be > available > > - "nvdimm/ocxl: Expose SMART data via ndctl" > > - Drop 'rc = 0; goto out;' > > - Propogate errors from ndctl_smart() > > - "nvdimm/ocxl: Expose the serial number in sysfs" & "nvdimm/ocxl: > Expose the firmware version in sysfs" > > - Squash these 2 patches together > > - Expose data as a DIMM attribute rather than an ocxlpmem > > attribute > > - "nvdimm/ocxl: Add an IOCTL to request controller health & perf data" > > - Reword blurb > > - "nvdimm/ocxl: Implement the heartbeat command" > > - Propogate rc in probe() > > > > V3: > > - Rebase against next/next-20200220 > > - Move driver to arch/powerpc/platforms/powernv, we now expect this > > driver to go upstream via the powerpc tree > > - "nvdimm/ocxl: Implement the Read Error Log command" > > - Fix bad header path > > - "nvdimm/ocxl: Read the capability registers & wait for device ready" > > - Fix overlapping masks between readiness_timeout & > memory_available_timeout > > - "nvdimm: Add driver for OpenCAPI Storage Class Memory" > > - Address minor review comments from Jonathan Cameron > > - Remove attributes > > - Default to module if building LIBNVDIMM > > - Propogate errors up from called functions in probe() > > - "nvdimm/ocxl: Expose SMART data via ndctl" > > - Pack attributes in struct > > - Support different size SMART buffers for compatibility with newer > > ndctls that may want more SMART attribs than we provide > > - Rework to to use ND_CMD_CALL instead of ND_CMD_SMART > > - drop "ocxl: Free detached contexts in ocxl_context_detach_all()" > > - "powerpc: Map & release OpenCAPI LPC memory" > > - Remove 'extern' > > - Only available with CONFIG_MEMORY_HOTPLUG_SPARSE > > - "ocxl: Tally up the LPC memory on a link & allow it to be mapped" > > - Address minor review comments from Jonathan Cameron > > - "ocxl: Add functions to map/unmap LPC memory" > > - Split detected memory message into a separate patch > > - Address minor review comments from Jonathan Cameron > > - Add a comment explaining why unmap_lpc_mem is in > deconfigure_afu > > - "nvdimm/ocxl: Add support for Admin commands" > > - use sizeof(u64) rather than 0x08 when iterating u64s > > - "nvdimm/ocxl: Implement the heartbeat command" > > - Fix typo in blurb > > - Address kernel doc issues > > - Ensure all uapi headers use C89 compatible comments > > - Drop patches for firmware update & overwrite, these will be > > submitted later once patches are available for ndctl > > - Rename SCM to OpenCAPI Persistent Memory > > > > V2: > > - "powerpc: Map & release OpenCAPI LPC memory" > > - Fix #if -> #ifdef > > - use pci_dev_id to get the bdfn > > - use __be64 to hold be data > > - indent check_hotplug_memory_addressable correctly > > - Remove export of check_hotplug_memory_addressable > > - "ocxl: Conditionally bind SCM devices to the generic OCXL driver" > > - Improve patch description and remove redundant default > > - "nvdimm: Add driver for OpenCAPI Storage Class Memory" > > - Mark a few funcs as static as identified by the 0day bot > > - Add OCXL dependancies to OCXL_SCM > > - Use memcpy_mcsafe in scm_ndctl_config_read > > - Rename scm_foo_offset_0x00 to scm_foo_header_parse & add docs > > - Name DIMM attribs "ocxl" rather than "scm" > > - Split out into base + many feature patches > > - "powerpc: Enable OpenCAPI Storage Class Memory driver on bare > metal" > > - Build DEV_DAX & friends as modules > > - "ocxl: Conditionally bind SCM devices to the generic OCXL driver" > > - Patch dropped (easy enough to maintain this out of tree for > development) > > - "ocxl: Tally up the LPC memory on a link & allow it to be mapped" > > - Add a warning if an unmatched lpc_release is called > > - "ocxl: Add functions to map/unmap LPC memory" > > - Use EXPORT_SYMBOL_GPL > > > > > > Alastair D'Silva (25): > > powerpc/powernv: Add OPAL calls for LPC memory alloc/release > > mm/memory_hotplug: Allow check_hotplug_memory_addressable to be > called > > from drivers > > powerpc/powernv: Map & release OpenCAPI LPC memory > > ocxl: Remove unnecessary externs > > ocxl: Address kernel doc errors & warnings > > ocxl: Tally up the LPC memory on a link & allow it to be mapped > > ocxl: Add functions to map/unmap LPC memory > > ocxl: Emit a log message showing how much LPC memory was detected > > ocxl: Save the device serial number in ocxl_fn > > nvdimm: Add driver for OpenCAPI Persistent Memory > > powerpc: Enable the OpenCAPI Persistent Memory driver for > > powernv_defconfig > > nvdimm/ocxl: Add register addresses & status values to the header > > nvdimm/ocxl: Read the capability registers & wait for device ready > > nvdimm/ocxl: Add support for Admin commands > > nvdimm/ocxl: Register a character device for userspace to interact > > with > > nvdimm/ocxl: Implement the Read Error Log command > > nvdimm/ocxl: Add controller dump IOCTLs > > nvdimm/ocxl: Add an IOCTL to report controller statistics > > nvdimm/ocxl: Forward events to userspace > > nvdimm/ocxl: Add an IOCTL to request controller health & perf data > > nvdimm/ocxl: Implement the heartbeat command > > nvdimm/ocxl: Add debug IOCTLs > > nvdimm/ocxl: Expose SMART data via ndctl > > nvdimm/ocxl: Expose the serial number & firmware version in sysfs > > MAINTAINERS: Add myself & nvdimm/ocxl to ocxl > > > > .../userspace-api/ioctl/ioctl-number.rst | 1 + > > MAINTAINERS | 3 + > > arch/powerpc/configs/powernv_defconfig | 5 + > > arch/powerpc/include/asm/opal-api.h | 2 + > > arch/powerpc/include/asm/opal.h | 2 + > > arch/powerpc/include/asm/pnv-ocxl.h | 42 +- > > arch/powerpc/platforms/powernv/ocxl.c | 43 + > > arch/powerpc/platforms/powernv/opal-call.c | 2 + > > drivers/misc/ocxl/config.c | 74 +- > > drivers/misc/ocxl/core.c | 61 + > > drivers/misc/ocxl/link.c | 60 + > > drivers/misc/ocxl/ocxl_internal.h | 45 +- > > drivers/nvdimm/Kconfig | 2 + > > drivers/nvdimm/Makefile | 1 + > > drivers/nvdimm/ocxl/Kconfig | 21 + > > drivers/nvdimm/ocxl/Makefile | 7 + > > drivers/nvdimm/ocxl/main.c | 1975 +++++++++++++++++ > > drivers/nvdimm/ocxl/ocxlpmem.h | 197 ++ > > drivers/nvdimm/ocxl/ocxlpmem_internal.c | 280 +++ > > include/linux/memory_hotplug.h | 5 + > > include/misc/ocxl.h | 122 +- > > include/uapi/linux/ndctl.h | 1 + > > include/uapi/nvdimm/ocxlpmem.h | 127 ++ > > mm/memory_hotplug.c | 4 +- > > 24 files changed, 2983 insertions(+), 99 deletions(-) create mode > > 100644 drivers/nvdimm/ocxl/Kconfig create mode 100644 > > drivers/nvdimm/ocxl/Makefile create mode 100644 > > drivers/nvdimm/ocxl/main.c create mode 100644 > > drivers/nvdimm/ocxl/ocxlpmem.h create mode 100644 > > drivers/nvdimm/ocxl/ocxlpmem_internal.c > > create mode 100644 include/uapi/nvdimm/ocxlpmem.h > > > > -- > > 2.24.1 > > -- Alastair D'Silva mob: 0423 762 819 skype: alastair_dsilva msn: alastair@xxxxxxxxxxx blog: http://alastair.d-silva.org Twitter: @EvilDeece