Re: [PATCH v13 03/12] bcache: initialization of the buddy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/13/21 4:10 AM, Jens Axboe wrote:
diff --git a/drivers/md/bcache/nvmpg.c b/drivers/md/bcache/nvmpg.c
index b654bbbda03e..2b70ee4a6028 100644
--- a/drivers/md/bcache/nvmpg.c
+++ b/drivers/md/bcache/nvmpg.c
@@ -50,6 +50,36 @@ unsigned long bch_nvmpg_ptr_to_offset(struct bch_nvmpg_ns *ns, void *ptr)
  	return BCH_NVMPG_OFFSET(ns_id, offset);
  }
+static struct page *bch_nvmpg_va_to_pg(void *addr)
+{
+	return virt_to_page(addr);
+}
What's the purpose of this helper?

This is used for the simplified buddy-like allocator. When releasing a bulk of continuous NVDIMM pages, what we record in nvmpg metadata is the offset value (combined with NVDIMM namespace ID and the in-namespace offset). The offset value can be converted into a DAX mapped linear address of the allocated continuous NVDIMM pages, bch_nvmpg_va_to_pg() is to find the header page and release the continuous pages into the page list of specific order in the buddy-like pages allocator.

A typical usage for bch_nvmpg_va_pg() is in bch_nvmpg_alloc_pages(), after the buddy-like allocator chooses a bulk of continuous pages, what we have is only the in-namespace offset of the first page, we need to find the corresponding struct page by,     bch_nvmpg_va_pg(dax-mapped-base-address + header-page-index << page-size-bits)
Then we set buddy-like allocator related information into the header page.

+static inline void reserve_nvmpg_pages(struct bch_nvmpg_ns *ns,
+				       pgoff_t pgoff, u64 nr)
+{
+	while (nr > 0) {
+		unsigned int num = nr > UINT_MAX ? UINT_MAX : nr;
Surely UINT_MAX isn't anywhere near a valid limit?

Hmm, do you mean whether UINT_MAX is too large, or too small?

The while() loop here I took it as a paranoid oversize handling and no real effect indeed, so I was fine with it as the first version. The idea method should be an extent tree to record all the reserved area which may save a lot of system memory space than bitmap does.

I will suggest Qiaowen and Jianpeng to use extent tree to record the free and allocated areas from the NVDIMM namespace and drop the bitmap method now.


@@ -76,10 +110,73 @@ static void release_nvmpg_set(struct bch_nvmpg_set *set)
  	kfree(set);
  }
+static int validate_recs(int ns_id,
+			 struct bch_nvmpg_head *head,
+			 struct bch_nvmpg_recs *recs)
+{
+	if (memcmp(recs->magic, bch_nvmpg_recs_magic, 16)) {
+		pr_err("Invalid bch_nvmpg_recs magic\n");
+		return -EINVAL;
+	}
+
+	if (memcmp(recs->uuid, head->uuid, 16)) {
+		pr_err("Invalid bch_nvmpg_recs uuid\n");
+		return -EINVAL;
+	}
+
+	if (recs->head_offset !=
+	    bch_nvmpg_ptr_to_offset(global_nvmpg_set->ns_tbl[ns_id], head)) {
+		pr_err("Invalid recs head_offset\n");
+		return -EINVAL;
+	}
Same comments here on the frivilous error messaging, other places in
this file too. Check all the other patches as well, please.

This is the error message style we try to follow from bcache code, and IMHO it is necessary. Any of the above error condition means meta data might be corrupted, which is critical.

[snipped]
  static int attach_nvmpg_set(struct bch_nvmpg_ns *ns)
@@ -200,7 +371,7 @@ struct bch_nvmpg_ns *bch_register_namespace(const char *dev_path)
  	char buf[BDEVNAME_SIZE];
  	struct block_device *bdev;
  	pgoff_t pgoff;
-	int id, err;
+	int id, i, err;
  	char *path;
  	long dax_ret = 0;
@@ -304,13 +475,48 @@ struct bch_nvmpg_ns *bch_register_namespace(const char *dev_path) mutex_init(&ns->lock); + /*
+	 * parameters of bitmap_set/clear are unsigned int.
+	 * Given currently size of nvm is far from exceeding this limit,
+	 * so only add a WARN_ON message.
+	 */
+	WARN_ON(BITS_TO_LONGS(ns->pages_total) > UINT_MAX);
Does this really need to be a WARN_ON()? Looks more like an -EINVAL
condition.

This is because currently the free and allocated areas are recorded by bitmap during the buddy system initialization. As I said after the bitmap is switched to an extent tree, such limitation check will disappear. After Qiaowen and Jianpeng replace the bitmap by extent tree, people won't see the limitation.

Thanks for your review.

Coly Li



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux