Dan Williams <dan.j.williams@xxxxxxxxx> writes:

> At namespace creation time there is the potential for the "expected to
> be zero" fields of a 'pfn' info-block to be filled with indeterminate
> data. While the kernel buffer is zeroed on allocation it is immediately
> overwritten by nd_pfn_validate() filling it with the current contents of
> the on-media info-block location. For fields like, 'flags' and the
> 'padding' it potentially means that future implementations can not rely
> on those fields being zero.
>
> In preparation to stop using the 'start_pad' and 'end_trunc' fields for
> section alignment, arrange for fields that are not explicitly
> initialized to be guaranteed zero. Bump the minor version to indicate it
> is safe to assume the 'padding' and 'flags' are zero. Otherwise, this
> corruption is expected to benign since all other critical fields are
> explicitly initialized.
>
> Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> ---
>  drivers/nvdimm/dax_devs.c |    2 +-
>  drivers/nvdimm/pfn.h      |    1 +
>  drivers/nvdimm/pfn_devs.c |   18 +++++++++++++++---
>  3 files changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/nvdimm/dax_devs.c b/drivers/nvdimm/dax_devs.c
> index 0453f49dc708..326f02ffca81 100644
> --- a/drivers/nvdimm/dax_devs.c
> +++ b/drivers/nvdimm/dax_devs.c
> @@ -126,7 +126,7 @@ int nd_dax_probe(struct device *dev, struct nd_namespace_common *ndns)
>  	nvdimm_bus_unlock(&ndns->dev);
>  	if (!dax_dev)
>  		return -ENOMEM;
> -	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
> +	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
>  	nd_pfn->pfn_sb = pfn_sb;
>  	rc = nd_pfn_validate(nd_pfn, DAX_SIG);
>  	dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>");
> diff --git a/drivers/nvdimm/pfn.h b/drivers/nvdimm/pfn.h
> index dde9853453d3..e901e3a3b04c 100644
> --- a/drivers/nvdimm/pfn.h
> +++ b/drivers/nvdimm/pfn.h
> @@ -36,6 +36,7 @@ struct nd_pfn_sb {
>  	__le32 end_trunc;
>  	/* minor-version-2 record the base alignment of the mapping */
>  	__le32 align;
> +	/* minor-version-3 guarantee the padding and flags are zero */
>  	u8 padding[4000];
>  	__le64 checksum;
>  };
> diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
> index 01f40672507f..a2406253eb70 100644
> --- a/drivers/nvdimm/pfn_devs.c
> +++ b/drivers/nvdimm/pfn_devs.c
> @@ -420,6 +420,15 @@ static int nd_pfn_clear_memmap_errors(struct nd_pfn *nd_pfn)
>  	return 0;
>  }
>
> +/**
> + * nd_pfn_validate - read and validate info-block
> + * @nd_pfn: fsdax namespace runtime state / properties
> + * @sig: 'devdax' or 'fsdax' signature
> + *
> + * Upon return the info-block buffer contents (->pfn_sb) are
> + * indeterminate when validation fails, and a coherent info-block
> + * otherwise.
> + */
>  int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
>  {
>  	u64 checksum, offset;
> @@ -565,7 +574,7 @@ int nd_pfn_probe(struct device *dev, struct nd_namespace_common *ndns)
>  	nvdimm_bus_unlock(&ndns->dev);
>  	if (!pfn_dev)
>  		return -ENOMEM;
> -	pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
> +	pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
>  	nd_pfn = to_nd_pfn(pfn_dev);
>  	nd_pfn->pfn_sb = pfn_sb;
>  	rc = nd_pfn_validate(nd_pfn, PFN_SIG);
> @@ -702,7 +711,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>  	u64 checksum;
>  	int rc;
>
> -	pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
> +	pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
>  	if (!pfn_sb)
>  		return -ENOMEM;
>
> @@ -711,11 +720,14 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>  		sig = DAX_SIG;
>  	else
>  		sig = PFN_SIG;
> +
>  	rc = nd_pfn_validate(nd_pfn, sig);
>  	if (rc != -ENODEV)
>  		return rc;
>
>  	/* no info block, do init */;
> +	memset(pfn_sb, 0, sizeof(*pfn_sb));
> +
>  	nd_region = to_nd_region(nd_pfn->dev.parent);
>  	if (nd_region->ro) {
>  		dev_info(&nd_pfn->dev,
> @@ -768,7 +780,7 @@ static int nd_pfn_init(struct nd_pfn *nd_pfn)
>  	memcpy(pfn_sb->uuid, nd_pfn->uuid, 16);
>  	memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
>  	pfn_sb->version_major = cpu_to_le16(1);
> -	pfn_sb->version_minor = cpu_to_le16(2);
> +	pfn_sb->version_minor = cpu_to_le16(3);
>  	pfn_sb->start_pad = cpu_to_le32(start_pad);
>  	pfn_sb->end_trunc = cpu_to_le32(end_trunc);
>  	pfn_sb->align = cpu_to_le32(nd_pfn->align);
>

How will this minor version 3 be used? If start_pad/end_trunc are no
longer updated in pfn_sb, how will older kernels enable these
namespaces? Do we need a patch like

https://lore.kernel.org/linux-mm/20190604091357.32213-2-aneesh.kumar@xxxxxxxxxxxxx ?

-aneesh
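
For reference, the failure mode the changelog describes can be reproduced
outside the kernel with a rough userspace sketch. Everything below is
hypothetical and simplified: struct sketch_sb is not the real nd_pfn_sb
layout, sketch_validate() only mimics the "copy media into the buffer, then
check the signature" behaviour attributed to nd_pfn_validate(), and
endianness conversion is ignored.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical signature; remaining bytes of the array are zero-filled. */
static const char SKETCH_SIG[16] = "SKETCH_PFN_SIG";

/* Simplified stand-in for an info-block, NOT the real nd_pfn_sb layout. */
struct sketch_sb {
	uint8_t  signature[16];
	uint32_t flags;
	uint16_t version_major;
	uint16_t version_minor;
	uint8_t  padding[64];
};

/*
 * Mimics the behaviour described for nd_pfn_validate(): the buffer is
 * overwritten with the on-media contents before any check is made, so
 * on failure the buffer holds whatever was on media.
 */
static int sketch_validate(struct sketch_sb *sb, const struct sketch_sb *media)
{
	memcpy(sb, media, sizeof(*sb));
	if (memcmp(sb->signature, SKETCH_SIG, sizeof(sb->signature)) != 0)
		return -ENODEV;
	return 0;
}

/* Mimics the patched init path: re-zero after a failed validate. */
static int sketch_init(struct sketch_sb *sb, const struct sketch_sb *media)
{
	int rc;

	assert(sb && media);
	rc = sketch_validate(sb, media);
	if (rc != -ENODEV)
		return rc;

	/* no info block: discard stale media contents before filling fields */
	memset(sb, 0, sizeof(*sb));
	memcpy(sb->signature, SKETCH_SIG, sizeof(sb->signature));
	sb->version_major = 1;
	sb->version_minor = 3; /* "flags and padding are zero" guarantee */
	return 0;
}

/* Returns 1 when the fields that minor version 3 promises are all zero. */
static int sketch_reserved_is_zero(const struct sketch_sb *sb)
{
	size_t i;

	if (sb->flags != 0)
		return 0;
	for (i = 0; i < sizeof(sb->padding); i++)
		if (sb->padding[i] != 0)
			return 0;
	return 1;
}
```

Without the memset() in sketch_init(), a media image filled with garbage
would leave flags and padding non-zero in the freshly initialized block,
regardless of whether the buffer was zeroed at allocation time.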