[GIT PULL] libnvdimm for 4.3

"Williams, Dan J" <dan.j.williams@xxxxxxxxx> · Fri, 4 Sep 2015 00:21:59 +0000

Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.3

...to receive the libnvdimm update and related changes for 4.3.

This update has successfully completed a 0day-kbuild run and has
appeared in a linux-next release.  The changes outside of the typical
drivers/nvdimm/ and drivers/acpi/nfit.[ch] paths are related to the
removal of IORESOURCE_CACHEABLE, the introduction of memremap(), and the
introduction of ZONE_DEVICE + devm_memremap_pages().

This has a minor conflict with a fix that went into v4.2, commit
de4a196c02a2 "nfit, nd_blk: BLK status register is only 32 bits", but
otherwise merges cleanly with mainline.

--

The following changes since commit cbfe8fa6cd672011c755c3cd85c9ffd4e2d10a6f:

  Linux 4.2-rc4 (2015-07-26 12:26:21 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.3

for you to fetch changes up to 004f1afbe199e6ab20805b95aefd83ccd24bc5c7:

  libnvdimm, pmem: direct map legacy pmem by default (2015-08-28 23:40:05 -0400)

----------------------------------------------------------------
libnvdimm for 4.3:

1/ Introduce ZONE_DEVICE and devm_memremap_pages() as a generic
   mechanism for adding device-driver-discovered memory regions to the
   kernel's direct map.  This facility is used by the pmem driver to
   enable pfn_to_page() operations on the page frames returned by DAX
   ('direct_access' in 'struct block_device_operations'). For now, the
   'memmap' allocation for these "device" pages comes from "System
   RAM".  Support for allocating the memmap from device memory will
   arrive in a later kernel.

2/ Introduce memremap() to replace usages of ioremap_cache() and
   ioremap_wt().  memremap() drops the __iomem annotation for these
   mappings to memory that do not have i/o side effects.  The
   replacement of ioremap_cache() with memremap() is limited to the
   pmem driver to ease merging the api change in v4.3.  Completion of
   the conversion is targeted for v4.4.

3/ Similar to the usage of memcpy_to_pmem() + wmb_pmem() in the pmem
   driver, update the VFS DAX implementation and PMEM api to provide
   persistence guarantees for kernel operations on a DAX mapping.

4/ Convert the ACPI NFIT 'BLK' driver to map the block apertures as
   cacheable to improve performance.

5/ Miscellaneous updates and fixes to libnvdimm including support
   for issuing "address range scrub" commands, clarifying the optimal
   'sector size' of pmem devices, a clarification of the usage of the
   ACPI '_STA' (status) property for DIMM devices, and other minor
   fixes.

----------------------------------------------------------------
Christoph Hellwig (4):
      devres: add devm_memremap
      pmem: switch to devm_ allocations
      mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h
      add devm_memremap_pages

Dan Williams (15):
      libnvdimm, btt: sparse fix
      mm: enhance region_is_ram() to region_intersects()
      arch, drivers: don't include <asm/io.h> directly, use <linux/io.h> instead
      cleanup IORESOURCE_CACHEABLE vs ioremap()
      arch: introduce memremap()
      visorbus: switch from ioremap_cache to memremap
      pmem: convert to generic memremap
      libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option
      Merge branch 'pmem-api' into libnvdimm-for-next
      dax: drop size parameter to ->direct_access()
      mm: ZONE_DEVICE for "device memory"
      x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB
      libnvdimm, pfn: 'struct page' provider infrastructure
      libnvdimm, pmem: 'struct page' for pmem
      libnvdimm, pmem: direct map legacy pmem by default

Linda Knippers (1):
      nfit: Don't check _STA on NVDIMM devices

Randy Dunlap (1):
      nvdimm: fix inline function return type warning

Ross Zwisler (7):
      pmem, x86: move x86 PMEM API to new pmem.h header
      pmem: remove layer when calling arch_has_wmb_pmem()
      pmem, x86: clean up conditional pmem includes
      pmem: add copy_from_iter_pmem() and clear_pmem()
      dax: update I/O path to do proper PMEM flushing
      pmem, dax: have direct_access use __pmem annotation
      nd_blk: change aperture mapping from WC to WB

Vishal Verma (6):
      libnvdimm: Update name of the ars_status_record mask field
      libnvdimm: Add DSM support for Address Range Scrub commands
      libnvdimm, pmem: Change pmem physical sector size to PAGE_SIZE
      libnvdimm, btt: clean up internal interfaces
      libnvdimm, btt: consolidate arena validation
      libnvdimm, btt: write and validate parent_uuid

yalin wang (1):
      nvdimm: change to use generic kvfree()

 Documentation/filesystems/Locking              |   3 +-
 MAINTAINERS                                    |   1 +
 arch/arm/include/asm/memory.h                  |   6 -
 arch/arm/mach-clps711x/board-cdb89712.c        |   2 +-
 arch/arm/mach-shmobile/pm-rcar.c               |   2 +-
 arch/arm64/include/asm/memory.h                |   6 -
 arch/ia64/include/asm/io.h                     |   1 +
 arch/ia64/kernel/cyclone.c                     |   2 +-
 arch/ia64/mm/init.c                            |   4 +-
 arch/powerpc/kernel/pci_of_scan.c              |   2 +-
 arch/powerpc/mm/mem.c                          |   4 +-
 arch/powerpc/sysdev/axonram.c                  |   7 +-
 arch/s390/mm/init.c                            |   2 +-
 arch/sh/include/asm/io.h                       |   1 +
 arch/sh/mm/init.c                              |   5 +-
 arch/sparc/kernel/pci.c                        |   3 +-
 arch/tile/mm/init.c                            |   2 +-
 arch/unicore32/include/asm/memory.h            |   6 -
 arch/x86/Kconfig                               |   9 +-
 arch/x86/include/asm/cacheflush.h              |  73 +-----
 arch/x86/include/asm/io.h                      |   6 -
 arch/x86/include/asm/pmem.h                    | 153 +++++++++++
 arch/x86/include/uapi/asm/e820.h               |   2 +-
 arch/x86/kernel/Makefile                       |   2 +-
 arch/x86/kernel/pmem.c                         |  79 +-----
 arch/x86/mm/init_32.c                          |   4 +-
 arch/x86/mm/init_64.c                          |   4 +-
 arch/xtensa/include/asm/io.h                   |   1 +
 drivers/acpi/Kconfig                           |   1 +
 drivers/acpi/nfit.c                            |  79 +++---
 drivers/acpi/nfit.h                            |  17 +-
 drivers/block/brd.c                            |   8 +-
 drivers/isdn/icn/icn.h                         |   2 +-
 drivers/mtd/devices/slram.c                    |   2 +-
 drivers/mtd/nand/diskonchip.c                  |   2 +-
 drivers/mtd/onenand/generic.c                  |   2 +-
 drivers/nvdimm/Kconfig                         |  23 ++
 drivers/nvdimm/Makefile                        |   5 +
 drivers/nvdimm/btt.c                           |  50 +---
 drivers/nvdimm/btt.h                           |   3 +
 drivers/nvdimm/btt_devs.c                      | 215 ++++------------
 drivers/nvdimm/claim.c                         | 201 +++++++++++++++
 drivers/nvdimm/dimm_devs.c                     |   5 +-
 drivers/nvdimm/e820.c                          |  87 +++++++
 drivers/nvdimm/namespace_devs.c                |  89 ++++++-
 drivers/nvdimm/nd-core.h                       |   9 +
 drivers/nvdimm/nd.h                            |  67 ++++-
 drivers/nvdimm/pfn.h                           |  35 +++
 drivers/nvdimm/pfn_devs.c                      | 337 +++++++++++++++++++++++++
 drivers/nvdimm/pmem.c                          | 245 +++++++++++++++---
 drivers/nvdimm/region.c                        |   2 +
 drivers/nvdimm/region_devs.c                   |  20 ++
 drivers/pci/probe.c                            |   3 +-
 drivers/pnp/manager.c                          |   2 -
 drivers/s390/block/dcssblk.c                   |  10 +-
 drivers/scsi/aic94xx/aic94xx_init.c            |   7 +-
 drivers/scsi/arcmsr/arcmsr_hba.c               |   5 +-
 drivers/scsi/mvsas/mv_init.c                   |  15 +-
 drivers/scsi/sun3x_esp.c                       |   2 +-
 drivers/staging/comedi/drivers/ii_pci20kc.c    |   1 +
 drivers/staging/unisys/visorbus/visorchannel.c |  16 +-
 drivers/staging/unisys/visorbus/visorchipset.c |  17 +-
 drivers/tty/serial/8250/8250_core.c            |   2 +-
 drivers/video/fbdev/ocfb.c                     |   1 -
 drivers/video/fbdev/s1d13xxxfb.c               |   3 +-
 drivers/video/fbdev/stifb.c                    |   1 +
 fs/block_dev.c                                 |   4 +-
 fs/dax.c                                       |  62 +++--
 include/asm-generic/memory_model.h             |   6 +
 include/linux/blkdev.h                         |   8 +-
 include/linux/io-mapping.h                     |   2 +-
 include/linux/io.h                             |  33 +++
 include/linux/libnvdimm.h                      |   4 +
 include/linux/memory_hotplug.h                 |   5 +-
 include/linux/mm.h                             |   9 +-
 include/linux/mmzone.h                         |  23 ++
 include/linux/mtd/map.h                        |   2 +-
 include/linux/pmem.h                           | 115 ++++++---
 include/uapi/linux/ndctl.h                     |  12 +-
 include/video/vga.h                            |   2 +-
 kernel/Makefile                                |   2 +
 kernel/memremap.c                              | 190 ++++++++++++++
 kernel/resource.c                              |  61 +++--
 lib/Kconfig                                    |   3 +
 lib/devres.c                                   |  13 +-
 lib/pci_iomap.c                                |   7 +-
 mm/Kconfig                                     |  17 ++
 mm/memory_hotplug.c                            |  14 +-
 mm/page_alloc.c                                |   3 +
 tools/testing/nvdimm/Kbuild                    |  13 +-
 tools/testing/nvdimm/test/iomap.c              |  85 ++++++-
 tools/testing/nvdimm/test/nfit.c               | 209 ++++++++++-----
 92 files changed, 2142 insertions(+), 745 deletions(-)
 create mode 100644 arch/x86/include/asm/pmem.h
 create mode 100644 drivers/nvdimm/claim.c
 create mode 100644 drivers/nvdimm/e820.c
 create mode 100644 drivers/nvdimm/pfn.h
 create mode 100644 drivers/nvdimm/pfn_devs.c
 create mode 100644 kernel/memremap.c

commit 5e32940621eb62064d98f42c9889db71b0368bde
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Sat Jul 11 10:02:46 2015 -0400

    libnvdimm, btt: sparse fix

    Fix:
    drivers/nvdimm/btt.c:635:29: warning: restricted __le64 degrades to integer

    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit ec92777f2ba93c00387b8fe53780c25adc57c744
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Thu Jul 9 13:25:35 2015 -0600

    libnvdimm: Update name of the ars_status_record mask field

    The spec suggests that this is a simple 'length' field, not a mask.
    Update the name accordingly.

    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 39c686b862cdb2049b90e095b6c6c727b2a7ab60
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Thu Jul 9 13:25:36 2015 -0600

    libnvdimm: Add DSM support for Address Range Scrub commands

    Add support for the three ARS DSM commands:
    - Query ARS Capabilities - Queries the firmware to check if a given
      range supports scrub, and if so, which type (persistent vs. volatile)
    - Start ARS - Starts a scrub for a given range/type
    - Query ARS Status - Checks status of a previously started scrub, and
      provides the error logs if any.

    The commands are described by the example DSM spec at:
    http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

    Also add these commands to the nfit_test test framework, and return
    canned data.

    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 6b47496a6fc81816e7edaf8224dfb88e402a05f5
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Thu Jul 23 11:58:48 2015 -0600

    libnvdimm, pmem: Change pmem physical sector size to PAGE_SIZE

    Based on commit c8fa317 ("brd: Request from fdisk 4k alignment") by
    Boaz Harrosh, allow fdisk to create properly aligned partitions for
    DAX.  This will also cause mkfs.ext4 to emit a warning if using a file
    system block size of less than PAGE_SIZE.

    Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
    Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Cc: Matthew Wilcox <matthew.r.wilcox@xxxxxxxxx>
    Cc: Christoph Hellwig <hch@xxxxxx>
    Cc: Elliott, Robert <Elliott@xxxxxx>
    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Acked-by: Boaz Harrosh <boaz@xxxxxxxxxxxxx>
    Acked-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 60e95f43fc8573e81f54b0c1e0bc542c2260d956
Author: Linda Knippers <linda.knippers@xxxxxx>
Date:   Wed Jul 22 16:17:22 2015 -0400

    nfit: Don't check _STA on NVDIMM devices

    The _STA only applies to the root device, not the individual NVDIMMS,
    so don't check here. NVDIMM device state flags are checked elsewhere.

    Signed-off-by: Linda Knippers <linda.knippers@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit f6ef5a2a50816b58e3126206de13d0b9fdf89df5
Author: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Date:   Tue Jul 28 12:27:01 2015 -0700

    nvdimm: fix inline function return type warning

    Fix multiple build warnings when CONFIG_BTT is not enabled:

    In file included from ../drivers/nvdimm/bus.c:29:0:
    ../drivers/nvdimm/nd.h:169:15: warning: return type defaults to 'int' [-Wreturn-type]
     static inline nd_btt_probe(struct nd_namespace_common *ndns, void *drvdata)
                   ^

    Signed-off-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
    Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
    Cc: linux-nvdimm@xxxxxxxxxxxx
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 124fe20d94630b6f173dae5eb815e6e6e350c72d
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 10 23:07:05 2015 -0400

    mm: enhance region_is_ram() to region_intersects()

    region_is_ram() is used to prevent the establishment of aliased mappings
    to physical "System RAM" with incompatible cache settings.  However, it
    uses "-1" to indicate both "unknown" memory ranges (ranges not described
    by platform firmware) and "mixed" ranges (where the parameters describe
    a range that partially overlaps "System RAM").

    Fix this up by explicitly tracking the "unknown" vs "mixed" resource
    cases and returning REGION_INTERSECTS, REGION_MIXED, or REGION_DISJOINT.
    This re-write also adds support for detecting when the requested region
    completely eclipses all of a resource.  Note, the implementation treats
    overlaps between "unknown" and the requested memory type as
    REGION_INTERSECTS.

    Finally, other memory types can be passed in by name; for now the only
    usage is "System RAM".

    Suggested-by: Luis R. Rodriguez <mcgrof@xxxxxxxx>
    Reviewed-by: Toshi Kani <toshi.kani@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
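
The three-way result described above can be sketched in userspace C. This is a simplified model, not the kernel implementation: struct named_range, demo_map and region_intersects_model() are hypothetical names, a flat array stands in for the resource tree, and the enum values are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return codes named after the commit text; the values are illustrative. */
enum { REGION_INTERSECTS, REGION_DISJOINT, REGION_MIXED };

/* Hypothetical stand-in for one entry of the kernel's resource tree. */
struct named_range {
	uint64_t start;
	uint64_t end;		/* inclusive */
	const char *name;
};

/* Sample map: RAM, an undescribed ("unknown") hole, then a reserved range. */
static const struct named_range demo_map[] = {
	{ 0x00000, 0x9ffff, "System RAM" },
	{ 0xf0000, 0xfffff, "reserved" },
};
#define DEMO_MAP_LEN (sizeof(demo_map) / sizeof(demo_map[0]))

static int region_intersects_model(const struct named_range *map, size_t n,
				   uint64_t start, uint64_t size,
				   const char *name)
{
	uint64_t end;
	int type = 0, other = 0;
	size_t i;

	assert(size > 0);
	end = start + size - 1;

	for (i = 0; i < n; i++) {
		/* Skip entries that do not overlap [start, end]. */
		if (start > map[i].end || end < map[i].start)
			continue;
		if (strcmp(map[i].name, name) == 0)
			type++;
		else
			other++;
	}

	/* Only the requested type (plus possibly "unknown" space) was hit. */
	if (other == 0)
		return type ? REGION_INTERSECTS : REGION_DISJOINT;

	/* The requested type and some other described type were both hit. */
	return type ? REGION_MIXED : REGION_DISJOINT;
}
```

In this model a request that covers RAM plus the undescribed hole still reports REGION_INTERSECTS, matching the note about "unknown" ranges, while a request that reaches into the "reserved" entry reports REGION_MIXED.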

commit 2584cf83578c26db144730ef498f4070f82ee3ea
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 10 23:07:05 2015 -0400

    arch, drivers: don't include <asm/io.h> directly, use <linux/io.h> instead

    Preparation for uniform definition of ioremap, ioremap_wc, ioremap_wt,
    and ioremap_cache, tree-wide.

    Acked-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 92b19ff50e8f242392d78b2aacc5b5b672f1796b
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 10 23:07:06 2015 -0400

    cleanup IORESOURCE_CACHEABLE vs ioremap()

    Quoting Arnd:
        I was thinking the opposite approach and basically removing all uses
        of IORESOURCE_CACHEABLE from the kernel. There are only a handful of
        them, and we can probably replace them all with hardcoded
        ioremap_cached() calls in the cases they are actually useful.

    All existing usages of IORESOURCE_CACHEABLE call ioremap() instead of
    ioremap_nocache() if the resource is cacheable, however ioremap() is
    uncached by default. Clearly none of the existing usages care about the
    cacheability. Particularly devm_ioremap_resource() never worked as
    advertised since it always fell back to plain ioremap().

    Clean this up as the new direction we want is to convert
    ioremap_<type>() usages to memremap(..., flags).

    Suggested-by: Arnd Bergmann <arnd@xxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 92281dee825f6d2eb07c441437e4196a44b0861c
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 10 23:07:06 2015 -0400

    arch: introduce memremap()

    Existing users of ioremap_cache() are mapping memory that is known in
    advance to not have i/o side effects.  These users are forced to cast
    away the __iomem annotation, or otherwise neglect to fix the sparse
    errors thrown when dereferencing pointers to this memory.  Provide
    memremap() as a non __iomem annotated ioremap_*() in the case when
    ioremap is otherwise a pointer to cacheable memory. Empirically,
    ioremap_<cacheable-type>() call sites are seeking memory-like semantics
    (e.g.  speculative reads, and prefetching permitted).

    memremap() is a break from the ioremap implementation pattern of adding
    a new memremap_<type>() for each mapping type and having silent
    compatibility fall backs.  Instead, the implementation defines flags
    that are passed to the central memremap() and if a mapping type is not
    supported by an arch memremap returns NULL.

    We introduce a memremap prototype as a trivial wrapper of
    ioremap_cache() and ioremap_wt().  Later, once all ioremap_cache() and
    ioremap_wt() usage has been removed from drivers we teach archs to
    implement arch_memremap() with the ability to strictly enforce the
    mapping type.

    Cc: Arnd Bergmann <arnd@xxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
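
The "flags instead of per-type variants, and no silent fallback" design can be illustrated with a small userspace model. Everything here is an assumption made for illustration: struct arch_model, the backing buffer and memremap_model() are hypothetical, and the flag bit values are not taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names follow the commit text; the bit values are illustrative. */
enum {
	MEMREMAP_WB = 1 << 0,	/* write-back */
	MEMREMAP_WT = 1 << 1,	/* write-through */
};

/* Hypothetical description of which mapping types an arch implements. */
struct arch_model {
	unsigned long supported;
};

static const struct arch_model arch_wb_only = { MEMREMAP_WB };
static const struct arch_model arch_wb_wt = { MEMREMAP_WB | MEMREMAP_WT };

/* Stand-in for the physical range being mapped. */
static unsigned char backing[4096];

static void *memremap_model(const struct arch_model *arch, size_t offset,
			    size_t size, unsigned long flags)
{
	assert(arch != NULL);

	/* Reject empty or out-of-range requests. */
	if (size == 0 || offset > sizeof(backing) ||
	    size > sizeof(backing) - offset)
		return NULL;

	/* No silent fallback: an unsupported mapping type fails outright. */
	if ((flags & arch->supported) == 0)
		return NULL;

	/* Plain pointer, no __iomem: the range has no i/o side effects. */
	return backing + offset;
}
```

A caller on the write-back-only model that asks for MEMREMAP_WT gets NULL and has to decide what to do, rather than receiving a mapping of a different type without knowing it.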

commit 3103dc0304fd9c8ab576977cd98140d4fbac1730
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 10 23:07:06 2015 -0400

    visorbus: switch from ioremap_cache to memremap

    In preparation for deprecating ioremap_cache() convert its usage in
    visorbus to memremap.

    Cc: Benjamin Romer <benjamin.romer@xxxxxxxxxx>
    Cc: David Kershner <david.kershner@xxxxxxxxxx>
    Acked-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit e836a256e8fd579c9d7a3685f22981225a1ca451
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Wed Aug 12 18:42:56 2015 -0400

    pmem: convert to generic memremap

    Kill arch_memremap_pmem() and just let the architecture specify the
    flags to be passed to memremap().  Default to writethrough.

    Suggested-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit fbde1414acc0440024083bf0c391b259bcfc4826
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Wed Jul 29 14:58:07 2015 -0600

    libnvdimm, btt: clean up internal interfaces

    Consolidate the parameters passed to arena_is_valid into just nd_btt,
    and an info block to increase re-usability.

    Similarly, btt_arena_write_layout doesn't need to be passed a uuid, as
    it can be obtained from arena->nd_btt.

    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit ab45e7632717b811e0786e46ca5ad279cb731b66
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Wed Jul 29 14:58:08 2015 -0600

    libnvdimm, btt: consolidate arena validation

    Use arena_is_valid as a common routine for checking the validity of an
    info block from both discover_arenas, and nd_btt_probe.

    As a result, don't check for validity of the BTT's UUID, and lbasize.
    The checksum in the BTT info block guarantees self-consistency, and when
    we're called from nd_btt_probe, we don't have a valid uuid or lbasize
    available to check against.

    Also cleanup to return a bool instead of an int.

    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 6ec689542b5bc516187917d49b112847dfb75b0b
Author: Vishal Verma <vishal.l.verma@xxxxxxxxx>
Date:   Wed Jul 29 14:58:09 2015 -0600

    libnvdimm, btt: write and validate parent_uuid

    When a BTT is instantiated on a namespace it must validate the namespace
    uuid matches the 'parent_uuid' stored in the btt superblock. This
    property enforces that changing the namespace UUID invalidates all
    former BTT instances on that storage. For "IO namespaces" that don't
    have a label or UUID, the parent_uuid is set to zero, and this
    validation is skipped. For such cases, old BTTs have to be invalidated
    by forcing the namespace to raw mode, and overwriting the BTT info
    blocks.

    Based on a patch by Dan Williams <dan.j.williams@xxxxxxxxx>

    Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
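
A minimal sketch of the parent_uuid rule, assuming a trimmed-down info block. struct btt_sb_model, parent_uuid_ok() and the sample UUIDs are hypothetical; the real check lives in the BTT arena validation code and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical, trimmed-down view of the BTT info block. */
struct btt_sb_model {
	uint8_t parent_uuid[UUID_LEN];
};

/* Sample data for illustration only. */
static const uint8_t uuid_a[UUID_LEN] = { 0xaa, 0x01 };
static const uint8_t uuid_b[UUID_LEN] = { 0xbb, 0x02 };
static const struct btt_sb_model sb_on_a = { { 0xaa, 0x01 } };
static const struct btt_sb_model sb_zero = { { 0 } };

/*
 * ns_uuid is NULL for an "IO namespace" that has no label or UUID;
 * in that case the check is skipped, as the commit describes.
 */
static bool parent_uuid_ok(const struct btt_sb_model *sb,
			   const uint8_t *ns_uuid)
{
	assert(sb != NULL);

	if (ns_uuid == NULL)
		return true;

	/* A changed namespace UUID invalidates the old BTT. */
	return memcmp(sb->parent_uuid, ns_uuid, UUID_LEN) == 0;
}
```

Under this model a BTT written while the namespace had uuid_a is rejected once the namespace carries uuid_b, whereas the label-less case always passes and relies on the manual invalidation described above.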

commit 7d3dcf26a6559fa82af3f53e2c8b163cec95fdaf
Author: Christoph Hellwig <hch@xxxxxx>
Date:   Mon Aug 10 23:07:07 2015 -0400

    devres: add devm_memremap

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 708ab62bef1ed3a3cf065a4138bd87f5d083cfeb
Author: Christoph Hellwig <hch@xxxxxx>
Date:   Mon Aug 10 23:07:08 2015 -0400

    pmem: switch to devm_ allocations

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    [djbw: tools/testing/nvdimm/ and memunmap_pmem support]
    Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 7a67832c7e44c20935c5d6f2264035a0f7bf0d8f
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Wed Aug 19 00:34:34 2015 -0400

    libnvdimm, e820: make CONFIG_X86_PMEM_LEGACY a tristate option

    We currently register a platform device for e820 type-12 memory and
    register a nvdimm bus beneath it.  Registering the platform device
    triggers the device-core machinery to probe for a driver, but that
    search currently comes up empty.  Building the nvdimm-bus registration
    into the e820_pmem platform device registration in this way forces
    libnvdimm to be built-in.  Instead, convert the built-in portion of
    CONFIG_X86_PMEM_LEGACY to simply register a platform device and move the
    rest of the logic to the driver for e820_pmem, for the following
    reasons:

    1/ Letting e820_pmem support be a module allows building and testing
       libnvdimm.ko changes without rebooting

    2/ All the normal policy around modules can be applied to e820_pmem
       (unbind to disable and/or blacklisting the module from loading by
       default)

    3/ Moving the driver to a generic location and converting it to scan
       "iomem_resource" rather than "e820.map" means any other architecture can
       take advantage of this simple nvdimm resource discovery mechanism by
       registering a resource named "Persistent Memory (legacy)"

    Cc: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 40603526569b304dd92f720f2f8ab11e828ea145
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:36 2015 -0600

    pmem, x86: move x86 PMEM API to new pmem.h header

    Move the x86 PMEM API implementation out of asm/cacheflush.h and into
    its own header asm/pmem.h.  This will allow members of the PMEM API to
    be more easily identified on this and other architectures.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Suggested-by: Christoph Hellwig <hch@xxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 18279b467a9d89afe44afbc19d768e834dbf4545
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:37 2015 -0600

    pmem: remove layer when calling arch_has_wmb_pmem()

    Prior to this change arch_has_wmb_pmem() was only called by
    arch_has_pmem_api().  Both arch_has_wmb_pmem() and arch_has_pmem_api()
    checked to make sure that CONFIG_ARCH_HAS_PMEM_API was enabled.

    Instead, remove the old arch_has_wmb_pmem() wrapper to be rid of one
    extra layer of indirection and the redundant CONFIG_ARCH_HAS_PMEM_API
    check. Rename __arch_has_wmb_pmem() to arch_has_wmb_pmem() since we no
    longer have a wrapper, and just have arch_has_pmem_api() call the
    architecture specific arch_has_wmb_pmem() directly.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 4a370df5534ef727cba9a9d74bf22e0609f91d6e
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:38 2015 -0600

    pmem, x86: clean up conditional pmem includes

    Prior to this change x86_64 used the pmem defines in
    arch/x86/include/asm/pmem.h, and UM used the default ones at the
    top of include/linux/pmem.h.  The inclusion or exclusion in linux/pmem.h
    was controlled by CONFIG_ARCH_HAS_PMEM_API, but the ones in asm/pmem.h
    were controlled by ARCH_HAS_NOCACHE_UACCESS.

    Instead, control them both with CONFIG_ARCH_HAS_PMEM_API so that it's
    clear that they are related and we don't run into the possibility where
    they are both included or excluded.  Also remove a bunch of stale
    function prototypes meant for UM in asm/pmem.h - these just conflicted
    with the inline defaults in linux/pmem.h and gave compile errors.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 5de490daec8b6354b90d5c9d3e2415b195f5adb6
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:39 2015 -0600

    pmem: add copy_from_iter_pmem() and clear_pmem()

    Add support for two new PMEM APIs, copy_from_iter_pmem() and
    clear_pmem().  copy_from_iter_pmem() is used to copy data from an
    iterator into a PMEM buffer.  clear_pmem() zeros a PMEM memory range.

    Both of these new APIs must be explicitly ordered using a wmb_pmem()
    function call and are implemented in such a way that the wmb_pmem()
    will make the stores to PMEM durable.  Because both APIs are unordered
    they can be called as needed without introducing any unwanted memory
    barriers.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
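
The ordering contract can be pictured with a toy userspace model in which stores are treated as durable only after the barrier. This is a worst-case illustration, not the x86 implementation: the two buffers and every *_model name are assumptions, and real hardware may make stores durable earlier than the barrier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PMEM_SIZE 64

/*
 * Toy model, not kernel code: stores land in 'cpu_view' and are only
 * assumed to have reached 'media' once the barrier has run.
 */
static unsigned char cpu_view[PMEM_SIZE];
static unsigned char media[PMEM_SIZE];

/* Unordered copy into pmem; no durability implied on its own. */
static void memcpy_to_pmem_model(size_t off, const void *src, size_t n)
{
	assert(off <= PMEM_SIZE && n <= PMEM_SIZE - off);
	memcpy(cpu_view + off, src, n);
}

/* Unordered zeroing of a pmem range. */
static void clear_pmem_model(size_t off, size_t n)
{
	assert(off <= PMEM_SIZE && n <= PMEM_SIZE - off);
	memset(cpu_view + off, 0, n);
}

/* Barrier: everything stored so far becomes durable. */
static void wmb_pmem_model(void)
{
	memcpy(media, cpu_view, PMEM_SIZE);
}

/* Write 'x', optionally issue the barrier, report what the media holds. */
static int media_after_write(int use_barrier)
{
	memset(cpu_view, 0, PMEM_SIZE);
	memset(media, 0, PMEM_SIZE);
	memcpy_to_pmem_model(0, "x", 1);
	if (use_barrier)
		wmb_pmem_model();
	return media[0];
}

/* Several unordered operations followed by a single barrier. */
static int batched_stores_durable(void)
{
	memset(cpu_view, 0, PMEM_SIZE);
	memset(media, 0, PMEM_SIZE);
	memcpy_to_pmem_model(0, "xy", 2);
	clear_pmem_model(0, 1);
	wmb_pmem_model();
	return media[0] == 0 && media[1] == 'y';
}
```

The second helper shows why the APIs are left unordered: a caller can batch any number of copies and clears and pay for one barrier at the end.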

commit 2765cfbb342c727c3fd47b165196cb16da158022
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:40 2015 -0600

    dax: update I/O path to do proper PMEM flushing

    Update the DAX I/O path so that all operations that store data (I/O
    writes, zeroing blocks, punching holes, etc.) properly synchronize the
    stores to media using the PMEM API.  This ensures that the data DAX is
    writing is durable on media before the operation completes.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit e2e05394e4a3420dab96f728df4531893494e15d
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Tue Aug 18 13:55:41 2015 -0600

    pmem, dax: have direct_access use __pmem annotation

    Update the annotation for the kaddr pointer returned by direct_access()
    so that it is a __pmem pointer.  This is consistent with the PMEM driver
    and with how this direct_access() pointer is used in the DAX code.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit a06a7576526e10a99ea7721533e7f2df3e26baad
Author: yalin wang <yalin.wang2010@xxxxxxxxx>
Date:   Thu Aug 27 19:35:48 2015 -0400

    nvdimm: change to use generic kvfree()

    Signed-off-by: yalin wang <yalin.wang2010@xxxxxxxxx>
    Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 67a3e8fe90156d41cd480d3dfbb40f3bc007c262
Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
Date:   Thu Aug 27 13:14:20 2015 -0600

    nd_blk: change aperture mapping from WC to WB

    This should result in a pretty sizeable performance gain for reads.  For
    rough comparison I did some simple read testing using PMEM to compare
    reads of write combining (WC) mappings vs write-back (WB).  This was
    done on a random lab machine.

    PMEM reads from a write combining mapping:
    	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
    	100000+0 records in
    	100000+0 records out
    	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

    PMEM reads from a write-back mapping:
    	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
    	1000000+0 records in
    	1000000+0 records out
    	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

    To be able to safely support a write-back aperture I needed to add
    support for the "read flush" _DSM flag, as outlined in the DSM spec:

    http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

    This flag tells the ND BLK driver that it needs to flush the cache lines
    associated with the aperture after the aperture is moved but before any
    new data is read.  This ensures that any stale cache lines from the
    previous contents of the aperture will be discarded from the processor
    cache, and the new data will be read properly from the DIMM.  We know
    that the cache lines are clean and will be discarded without any
    writeback because either a) the previous aperture operation was a read,
    and we never modified the contents of the aperture, or b) the previous
    aperture operation was a write and we must have written back the dirtied
    contents of the aperture to the DIMM before the I/O was completed.

    In order to add support for the "read flush" flag I needed to add a
    generic routine to invalidate cache lines, mmio_flush_range().  This is
    protected by the ARCH_HAS_MMIO_FLUSH Kconfig variable, and is currently
    only supported on x86.

    Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
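
Why a write-back aperture needs the "read flush" step can be shown with a toy model. All names here are hypothetical: a byte array stands in for the DIMM, a single buffer stands in for the processor cache lines covering the aperture, and flush_range_model() plays the role the commit assigns to mmio_flush_range().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DIMM_SIZE	256
#define APERTURE_SIZE	64

static unsigned char dimm[DIMM_SIZE];		/* media contents */
static unsigned char cache[APERTURE_SIZE];	/* cached aperture lines */
static int cache_valid;
static size_t window;	/* DIMM offset the aperture currently exposes */

/* Point the aperture at a different part of the DIMM. */
static void aperture_move(size_t dimm_off)
{
	assert(dimm_off <= DIMM_SIZE - APERTURE_SIZE);
	window = dimm_off;
}

/* Stand-in for the cache invalidation done after moving the aperture. */
static void flush_range_model(void)
{
	cache_valid = 0;
}

/* A cached read: hits return whatever the cache holds, stale or not. */
static unsigned char aperture_read(size_t off)
{
	assert(off < APERTURE_SIZE);
	if (!cache_valid) {
		memcpy(cache, dimm + window, APERTURE_SIZE);
		cache_valid = 1;
	}
	return cache[off];
}

static unsigned char blk_read_model(size_t dimm_off, int read_flush)
{
	assert(dimm_off < DIMM_SIZE);
	aperture_move(dimm_off - dimm_off % APERTURE_SIZE);
	if (read_flush)
		flush_range_model();
	return aperture_read(dimm_off % APERTURE_SIZE);
}

/* Read DIMM offset 0, then offset 64, and return the second result. */
static int demo_second_read(int read_flush)
{
	size_t i;

	for (i = 0; i < DIMM_SIZE; i++)
		dimm[i] = (unsigned char)i;
	cache_valid = 0;
	window = 0;

	(void)blk_read_model(0, read_flush);
	return blk_read_model(64, read_flush);
}
```

Without the flush the second read returns the byte cached from the previous window position; with it the read goes back to the DIMM and sees the new contents.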

commit 4a9bf88a5caa8495b5eb2b738d5fb40924bbc538
Merge: a06a7576526e 67a3e8fe9015
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Thu Aug 27 19:40:26 2015 -0400

    Merge branch 'pmem-api' into libnvdimm-for-next

commit cb389b9c0e00c30c9daf20287f7d91e2466edbb1
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Fri Aug 7 17:41:00 2015 -0400

    dax: drop size parameter to ->direct_access()

    None of the implementations currently use it.  The common
    bdev_direct_access() entry point handles all the size checks before
    calling ->direct_access().

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 012dcef3f058385268630c0003e9b7f8dcafbeb4
Author: Christoph Hellwig <hch@xxxxxx>
Date:   Fri Aug 7 17:41:01 2015 -0400

    mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h

    Three architectures already define these, and we'll need them
    generically soon.
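
    As a rough userspace illustration (a sketch, not the kernel source),
    the two helpers amount to a shift by the page size exponent.  The
    sketch_* names and the 4 KiB page size below are assumptions made for
    the example; the kernel takes PAGE_SHIFT from the architecture.

    ```c
    #include <stdint.h>

    /* Assumption: 4 KiB pages; the real PAGE_SHIFT is arch-defined. */
    #define SKETCH_PAGE_SHIFT 12

    /* Physical address -> page frame number: drop the in-page offset bits. */
    static inline unsigned long sketch_phys_to_pfn(uint64_t paddr)
    {
    	return (unsigned long)(paddr >> SKETCH_PAGE_SHIFT);
    }

    /* Page frame number -> physical address of the first byte of the page. */
    static inline uint64_t sketch_pfn_to_phys(unsigned long pfn)
    {
    	return (uint64_t)pfn << SKETCH_PAGE_SHIFT;
    }
    ```

    Converting an address to a pfn and back yields the page-aligned base
    of that address, since the offset within the page is discarded.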

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 033fbae988fcb67e5077203512181890848b8e90
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Sun Aug 9 15:29:06 2015 -0400

    mm: ZONE_DEVICE for "device memory"

    While pmem is usable as a block device or via DAX mappings to userspace,
    there are several usage scenarios that cannot target pmem due to its
    lack of struct page coverage. In preparation for "hot plugging" pmem
    into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
    separately from the ones that are subject to standard page allocations.
    Importantly "device memory" can be removed at will by userspace
    unbinding the driver of the device.

    Having a separate zone prevents allocation from these pages and
    otherwise marks them as distinct from typical uniform memory.  Device
    memory has different lifetime and performance characteristics than RAM.
    However, since we have run out of ZONES_SHIFT bits this functionality
    currently depends on sacrificing ZONE_DMA.

    Cc: H. Peter Anvin <hpa@xxxxxxxxx>
    Cc: Ingo Molnar <mingo@xxxxxxxxxx>
    Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
    Cc: Rik van Riel <riel@xxxxxxxxxx>
    Cc: Mel Gorman <mgorman@xxxxxxx>
    Cc: Jerome Glisse <j.glisse@xxxxxxxxx>
    [hch: various simplifications in the arch interface]
    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 41e94a851304f7acac840adec4004f8aeee53ad4
Author: Christoph Hellwig <hch@xxxxxx>
Date:   Mon Aug 17 16:00:35 2015 +0200

    add devm_memremap_pages

    This behaves like devm_memremap except that it ensures we have page
    structures available that can back the region.

    Signed-off-by: Christoph Hellwig <hch@xxxxxx>
    [djbw: catch attempts to remap RAM, drop flags]
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 96601adb745186ccbcf5b078d4756f13381ec2af
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 24 18:29:38 2015 -0400

    x86, pmem: clarify that ARCH_HAS_PMEM_API implies PMEM mapped WB

    Given that a write-back (WB) mapping plus non-temporal stores is
    expected to be the most efficient way to access PMEM, update the
    definition of ARCH_HAS_PMEM_API to imply arch support for
    WB-mapped-PMEM.  This is needed as a pre-requisite for adding PMEM to
    the direct map and mapping it with struct page.

    The above clarification for X86_64 means that memcpy_to_pmem() is
    permitted to use the non-temporal arch_memcpy_to_pmem() rather than
    needlessly fall back to default_memcpy_to_pmem() when the pcommit
    instruction is not available.  When arch_memcpy_to_pmem() is not
    guaranteed to flush writes out of cache, i.e. on older X86_32
    implementations where non-temporal stores may just dirty cache,
    ARCH_HAS_PMEM_API is simply disabled.

    The default fallback for persistent memory handling remains.  Namely,
    map it with the WT (write-through) cache-type and hope for the best.

    arch_has_pmem_api() is updated to only indicate whether the arch
    provides the proper helpers to meet the minimum "writes are visible
    outside the cache hierarchy after memcpy_to_pmem() + wmb_pmem()".  Code
    that cares whether wmb_pmem() actually flushes writes to pmem must now
    call arch_has_wmb_pmem() directly.

    Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
    Cc: Ingo Molnar <mingo@xxxxxxxxxx>
    Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
    Reviewed-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    [hch: set ARCH_HAS_PMEM_API=n on x86_32]
    Reviewed-by: Christoph Hellwig <hch@xxxxxx>
    [toshi: x86_32 compile fixes]
    Signed-off-by: Toshi Kani <toshi.kani@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit e1455744b27c9e6115c3508a7b2902157c2c4347
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Thu Jul 30 17:57:47 2015 -0400

    libnvdimm, pfn: 'struct page' provider infrastructure

    Implement the base infrastructure for libnvdimm PFN devices. Similar to
    BTT devices they take a namespace as a backing device and layer
    functionality on top. In this case the functionality is reserving space
    for an array of 'struct page' entries to be handed out through
    pfn_to_page(). For now this is just the basic libnvdimm-device-model for
    configuring the base PFN device.

    As the namespace claiming mechanism for PFN devices is mostly identical
    to BTT devices drivers/nvdimm/claim.c is created to house the common
    bits.

    Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 32ab0a3f51701cb37ab960635254d5f84ec3de0a
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Sat Aug 1 02:16:37 2015 -0400

    libnvdimm, pmem: 'struct page' for pmem

    Enable the pmem driver to handle PFN device instances.  Attaching a pmem
    namespace to a pfn device triggers the driver to allocate and initialize
    struct page entries for pmem.  Memory capacity for this allocation comes
    exclusively from RAM for now which is suitable for low PMEM to RAM
    ratios.  This mechanism will be expanded later for setting an "allocate
    from PMEM" policy.

    Cc: Boaz Harrosh <boaz@xxxxxxxxxxxxx>
    Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
    Cc: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

commit 004f1afbe199e6ab20805b95aefd83ccd24bc5c7
Author: Dan Williams <dan.j.williams@xxxxxxxxx>
Date:   Mon Aug 24 19:20:23 2015 -0400

    libnvdimm, pmem: direct map legacy pmem by default

    The expectation is that the legacy / non-standard pmem discovery method
    (e820 type-12) will only ever be used to describe small quantities of
    persistent memory.  Larger capacities will be described via the ACPI
    NFIT.  When "allocate struct page from pmem" support is added this default
    policy can be overridden by assigning a legacy pmem namespace to a pfn
    device; however, this would only be necessary if a platform used the
    legacy mechanism to define a very large range.

    Cc: Christoph Hellwig <hch@xxxxxx>
    Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
