Re: [PATCH v13 02/12] bcache: initialize the nvm pages allocator

Coly Li <colyli@xxxxxxx> · Tue, 28 Dec 2021 13:29:18 +0800

Hi Jens,

Thank you for review the patches. I was in travel for a week when you 
replied the email and after the travel I was sick for one more week. 
Just being able to sit in front of my laptop now.

I reply inline bellow your comments.

On 12/13/21 3:34 AM, Jens Axboe wrote:
On 12/12/21 10:05 AM, Coly Li wrote:

diff --git a/drivers/md/bcache/nvmpg.c b/drivers/md/bcache/nvmpg.c
new file mode 100644
index 000000000000..b654bbbda03e
--- /dev/null
+++ b/drivers/md/bcache/nvmpg.c
@@ -0,0 +1,340 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Nvdimm page-buddy allocator
+ *
+ * Copyright (c) 2021, Intel Corporation.
+ * Copyright (c) 2021, Qiaowei Ren<qiaowei.ren@xxxxxxxxx>.
+ * Copyright (c) 2021, Jianpeng Ma<jianpeng.ma@xxxxxxxxx>.
+ */
+
+#include "bcache.h"
+#include "nvmpg.h"
+
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/dax.h>
+#include <linux/pfn_t.h>
+#include <linux/libnvdimm.h>
+#include <linux/mm_types.h>
+#include <linux/err.h>
+#include <linux/pagemap.h>
+#include <linux/bitmap.h>
+#include <linux/blkdev.h>
+
[snipped]
+	expected_csum = csum_set(sb);
+	if (expected_csum != sb->csum) {
+		pr_info("csum is not match with expected one\n");
"Checksum mismatch"

Copied, it will be fixed.


would be more correct english, should it print the checksums as well?

This is the style following bcache code to report bad check sum, like,
super.c:718:                pr_warn("bad csum reading priorities\n");
journal.c:141:                pr_info("%u: bad csum, %zu bytes, offset 
%u\n",

There might be fine to only notice the bad csum status.

+		goto put_page;
+	}
+
+	if (memcmp(sb->magic, bch_nvmpg_magic, 16)) {
+		pr_info("invalid bch_nvmpg_magic\n");
+		goto put_page;
+	}
+
+	if (sb->sb_offset !=
+	    BCH_NVMPG_OFFSET(sb->this_ns, BCH_NVMPG_SB_OFFSET)) {
+		pr_info("invalid superblock offset 0x%llx\n", sb->sb_offset);
+		goto put_page;
+	}
+
+	r = -EOPNOTSUPP;
+	if (sb->total_ns != 1) {
+		pr_info("multiple name space not supported yet.\n");
+		goto put_page;
+	}
Please use namespace consistently.

Copied, it will be fixed.

+struct bch_nvmpg_ns *bch_register_namespace(const char *dev_path)
+{
+	struct bch_nvmpg_ns *ns = NULL;
+	struct bch_nvmpg_sb *sb = NULL;
+	char buf[BDEVNAME_SIZE];
+	struct block_device *bdev;
+	pgoff_t pgoff;
+	int id, err;
+	char *path;
+	long dax_ret = 0;
+
+	path = kstrndup(dev_path, 512, GFP_KERNEL);
+	if (!path) {
+		pr_err("kstrndup failed\n");
+		return ERR_PTR(-ENOMEM);
+	}
Really don't think we need that piece of information. Same for a lot of
other places, you have a ton of pr_err() stuff that looks mostly like
debugging.

Hmm, I think I need your guidance here. When I review the code, IMHO the 
error message is necessary but don't print them in run time code. Such 
place is in NVDIMM namespace registration code path, therefore I 
accepted it. This is what I was trained for long time....

Could you please give me some hint to judge whether a printk message 
should be avoided? Then I will use it in future code review and my own 
coding work.


+	ns->page_size = sb->page_size;
+	ns->pages_offset = sb->pages_offset;
+	ns->pages_total = sb->pages_total;
+	ns->sb = sb;
+	ns->free = 0;
+	ns->bdev = bdev;
+	ns->set = global_nvmpg_set;
+
+	err = attach_nvmpg_set(ns);
+	if (err < 0)
+		goto free_ns;
+
+	mutex_init(&ns->lock);
+
+	err = init_nvmpg_set_header(ns);
+	if (err < 0)
+		goto free_ns;
Does this error path need to un-attach?

Detach or un-attach is not implemented in this series, because it is 
related to how to flush the whole NVDIMM namespace during un-attach. It 
is planed to posted with the NVDIMM support for Bcache B+tree node.

For current bcache code only with SSD, if a flush request received from 
upper layer, it is synchronously sent to backing device and 
asynchronously sent to cache device with journal flush. But for NVDIMM 
namespace, such operation is to flush last level cache, we have 3 
candidate methods to handle,
1) flush cache for the whole dax mapped linear address range (like 
nvdimm_flush() in libnvdimm)
2) flush cache for the only allocated NVDIMM ranges in the nvmpg 
allocator for selected requesters (uuids).
3) maintain a list of dirty NVDIMM pages, and only flush cache for these 
linear address ranges.

We are still testing the performance and not make decision yet. So the 
detach code path is not posted in this series and we un-attach the 
NVDIMM namespace by unloading bcache kernel module currently.

+int __init bch_nvmpg_init(void)
+{
+	global_nvmpg_set = kzalloc(sizeof(*global_nvmpg_set), GFP_KERNEL);
+	if (!global_nvmpg_set)
+		return -ENOMEM;
+
+	global_nvmpg_set->total_ns = 0;
+	mutex_init(&global_nvmpg_set->lock);
+
+	pr_info("bcache nvm init\n");
Another useless pr debug print, just get rid of it (and others).

Copied, it will be dropped.

+void bch_nvmpg_exit(void)
+{
+	release_nvmpg_set(global_nvmpg_set);
+	pr_info("bcache nvm exit\n");
+}
Ditto

Copied.

Thanks for your comments.

Coly Li