On 12/13/21 4:10 AM, Jens Axboe wrote:
diff --git a/drivers/md/bcache/nvmpg.c b/drivers/md/bcache/nvmpg.c
index b654bbbda03e..2b70ee4a6028 100644
--- a/drivers/md/bcache/nvmpg.c
+++ b/drivers/md/bcache/nvmpg.c
@@ -50,6 +50,36 @@ unsigned long bch_nvmpg_ptr_to_offset(struct bch_nvmpg_ns *ns, void *ptr)
return BCH_NVMPG_OFFSET(ns_id, offset);
}
+static struct page *bch_nvmpg_va_to_pg(void *addr)
+{
+ return virt_to_page(addr);
+}
What's the purpose of this helper?
This is used for the simplified buddy-like allocator. When releasing a
bulk of continuous NVDIMM pages, what we record in nvmpg metadata is the
offset value (combined with NVDIMM namespace ID and the in-namespace
offset). The offset value can be converted into a DAX mapped linear
address of the allocated continuous NVDIMM pages, bch_nvmpg_va_to_pg()
is to find the header page and release the continuous pages into the
page list of specific order in the buddy-like pages allocator.
A typical usage for bch_nvmpg_va_pg() is in bch_nvmpg_alloc_pages(),
after the buddy-like allocator chooses a bulk of continuous pages, what
we have is only the in-namespace offset of the first page, we need to
find the corresponding struct page by,
bch_nvmpg_va_pg(dax-mapped-base-address + header-page-index <<
page-size-bits)
Then we set buddy-like allocator related information into the header page.
+static inline void reserve_nvmpg_pages(struct bch_nvmpg_ns *ns,
+ pgoff_t pgoff, u64 nr)
+{
+ while (nr > 0) {
+ unsigned int num = nr > UINT_MAX ? UINT_MAX : nr;
Surely UINT_MAX isn't anywhere near a valid limit?
Hmm, do you mean whether UINT_MAX is too large, or too small?
The while() loop here I took it as a paranoid oversize handling and no
real effect indeed, so I was fine with it as the first version. The idea
method should be an extent tree to record all the reserved area which
may save a lot of system memory space than bitmap does.
I will suggest Qiaowen and Jianpeng to use extent tree to record the
free and allocated areas from the NVDIMM namespace and drop the bitmap
method now.
@@ -76,10 +110,73 @@ static void release_nvmpg_set(struct bch_nvmpg_set *set)
kfree(set);
}
+static int validate_recs(int ns_id,
+ struct bch_nvmpg_head *head,
+ struct bch_nvmpg_recs *recs)
+{
+ if (memcmp(recs->magic, bch_nvmpg_recs_magic, 16)) {
+ pr_err("Invalid bch_nvmpg_recs magic\n");
+ return -EINVAL;
+ }
+
+ if (memcmp(recs->uuid, head->uuid, 16)) {
+ pr_err("Invalid bch_nvmpg_recs uuid\n");
+ return -EINVAL;
+ }
+
+ if (recs->head_offset !=
+ bch_nvmpg_ptr_to_offset(global_nvmpg_set->ns_tbl[ns_id], head)) {
+ pr_err("Invalid recs head_offset\n");
+ return -EINVAL;
+ }
Same comments here on the frivilous error messaging, other places in
this file too. Check all the other patches as well, please.
This is the error message style we try to follow from bcache code, and
IMHO it is necessary. Any of the above error condition means meta data
might be corrupted, which is critical.
[snipped]
static int attach_nvmpg_set(struct bch_nvmpg_ns *ns)
@@ -200,7 +371,7 @@ struct bch_nvmpg_ns *bch_register_namespace(const char *dev_path)
char buf[BDEVNAME_SIZE];
struct block_device *bdev;
pgoff_t pgoff;
- int id, err;
+ int id, i, err;
char *path;
long dax_ret = 0;
@@ -304,13 +475,48 @@ struct bch_nvmpg_ns *bch_register_namespace(const char *dev_path)
mutex_init(&ns->lock);
+ /*
+ * parameters of bitmap_set/clear are unsigned int.
+ * Given currently size of nvm is far from exceeding this limit,
+ * so only add a WARN_ON message.
+ */
+ WARN_ON(BITS_TO_LONGS(ns->pages_total) > UINT_MAX);
Does this really need to be a WARN_ON()? Looks more like an -EINVAL
condition.
This is because currently the free and allocated areas are recorded by
bitmap during the buddy system initialization. As I said after the
bitmap is switched to an extent tree, such limitation check will
disappear. After Qiaowen and Jianpeng replace the bitmap by extent tree,
people won't see the limitation.
Thanks for your review.
Coly Li