On 8/12/21 1:43 PM, Dan Williams wrote:
> On Wed, Aug 11, 2021 at 10:04 AM Coly Li <colyli@xxxxxxx> wrote:
>> From: Jianpeng Ma <jianpeng.ma@xxxxxxxxx>
>>
>> This patch define the prototype data structures in memory and
>> initializes the nvm pages allocator.
>>
>> The nvm address space which is managed by this allocator can consist of
>> many nvm namespaces, and some namespaces can compose into one nvm set,
>> like cache set. For this initial implementation, only one set can be
>> supported.
>>
>> The users of this nvm pages allocator need to call register_namespace()
>> to register the nvdimm device (like /dev/pmemX) into this allocator as
>> the instance of struct nvm_namespace.
>>
>> Reported-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
>> Signed-off-by: Jianpeng Ma <jianpeng.ma@xxxxxxxxx>
>> Co-developed-by: Qiaowei Ren <qiaowei.ren@xxxxxxxxx>
>> Signed-off-by: Qiaowei Ren <qiaowei.ren@xxxxxxxxx>
>> Cc: Christoph Hellwig <hch@xxxxxx>
>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>> Cc: Hannes Reinecke <hare@xxxxxxx>
>> Cc: Jens Axboe <axboe@xxxxxxxxx>
>> ---
>>   drivers/md/bcache/Kconfig     |  10 +
>>   drivers/md/bcache/Makefile    |   1 +
>>   drivers/md/bcache/nvm-pages.c | 339 ++++++++++++++++++++++++++++++++++
>>   drivers/md/bcache/nvm-pages.h |  96 ++++++++++
>>   drivers/md/bcache/super.c     |   3 +
>>   5 files changed, 449 insertions(+)
>>   create mode 100644 drivers/md/bcache/nvm-pages.c
>>   create mode 100644 drivers/md/bcache/nvm-pages.h
>>
>> [snipped]
>> +
>> +	err = -EOPNOTSUPP;
>> +	if (!bdev_dax_supported(bdev, ns->page_size)) {
>> +		pr_err("%s don't support DAX\n", bdevname(bdev, buf));
>> +		goto free_ns;
>> +	}
>> +
>> +	err = -EINVAL;
>> +	if (bdev_dax_pgoff(bdev, 0, ns->page_size, &pgoff)) {
>> +		pr_err("invalid offset of %s\n", bdevname(bdev, buf));
>> +		goto free_ns;
>> +	}
>> +
>> +	err = -ENOMEM;
>> +	ns->dax_dev = fs_dax_get_by_bdev(bdev);
>> +	if (!ns->dax_dev) {
>> +		pr_err("can't by dax device by %s\n", bdevname(bdev, buf));
>> +		goto free_ns;
>> +	}
>> +
>> +	err = -EINVAL;
>> +	id = dax_read_lock();
>> +	dax_ret = dax_direct_access(ns->dax_dev, pgoff, ns->pages_total,
>> +				    &ns->base_addr, &ns->start_pfn);
>> +	if (dax_ret <= 0) {
>> +		pr_err("dax_direct_access error\n");
>> +		dax_read_unlock(id);
>> +		goto free_ns;
>> +	}
>> +
>> +	if (dax_ret < ns->pages_total) {
>> +		pr_warn("mapped range %ld is less than ns->pages_total %lu\n",
>> +			dax_ret, ns->pages_total);

Hi Dan,

Many thanks for the information.

> This failure will become a common occurrence with CXL namespaces that
> will have discontiguous range support. It's already the case for
> dax-devices for soft-reserved memory [1]. In the CXL case the
> discontinuity will be 256MB aligned, for the soft-reserved dax-devices
> the discontinuity granularity can be as small as 4K.
>
> [1]: https://elixir.bootlin.com/linux/v5.14-rc5/source/drivers/dax/device.c#L414

Fortunately, the on-media allocation list format already works with
multiple ranges in a namespace. The in-memory struct bch_nvmpg_ns,
however, currently assumes the namespace is a single flat, contiguous
range.

Yes, we need to support multiple ranges in struct bch_nvmpg_ns so that
buddy allocation initialization can skip the discontiguous gaps. It will
be on the to-do list for the next round of work.

Thanks for your comments and hints.

Coly Li
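
For illustration only, here is a minimal user-space sketch of the
multi-range idea discussed above. It is not kernel code and not the
bcache implementation. The names nvmpg_extent, map_fn,
nvmpg_collect_extents() and boundary_map() are hypothetical; map_fn only
models the return contract of dax_direct_access() (number of pages
accessible at pgoff, or a negative value on error). Instead of warning
when the first mapping is shorter than pages_total, the loop keeps
mapping from where the previous extent ended and records each extent.

```c
#include <stddef.h>

/* Hypothetical model of one contiguously mapped extent of a namespace. */
struct nvmpg_extent {
	unsigned long pgoff;	/* first page offset within the namespace */
	unsigned long nr_pages;	/* number of pages in this extent */
};

/*
 * Hypothetical stand-in for dax_direct_access(): returns how many pages
 * are contiguously accessible starting at pgoff (at most nr_pages), or a
 * negative value on error.
 */
typedef long (*map_fn)(unsigned long pgoff, long nr_pages, void *ctx);

/*
 * Walk [0, pages_total) and record every contiguous extent.
 * Returns the number of extents recorded, or -1 if a mapping call fails
 * or the extents array is too small.
 */
static int nvmpg_collect_extents(map_fn map, void *ctx,
				 unsigned long pages_total,
				 struct nvmpg_extent *extents,
				 int max_extents)
{
	unsigned long pgoff = 0;
	int n = 0;

	while (pgoff < pages_total) {
		long got = map(pgoff, (long)(pages_total - pgoff), ctx);

		if (got <= 0)
			return -1;	/* mapping error */
		if (n == max_extents)
			return -1;	/* no room for another extent */

		extents[n].pgoff = pgoff;
		extents[n].nr_pages = (unsigned long)got;
		n++;
		pgoff += (unsigned long)got;	/* resume after this extent */
	}
	return n;
}

/*
 * Hypothetical mock mapper: the space is contiguous only up to the next
 * multiple of *ctx pages, imitating a fixed discontinuity granularity.
 */
static long boundary_map(unsigned long pgoff, long nr_pages, void *ctx)
{
	unsigned long boundary = *(unsigned long *)ctx;
	unsigned long left = boundary - (pgoff % boundary);

	return (long)left < nr_pages ? (long)left : nr_pages;
}
```

With such an extent list, a buddy allocator could be initialized one
extent at a time, so that no free block ever spans a gap between ranges.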