Hi Xiao,
Thanks for taking care of this.
On 15.10.2021 11:25, Xiao Ni wrote:
The test case is:
1. create one imsm container
2. create a raid5 device from the container
3. unplug two disks
4. mdadm --detail /dev/md126
[root@rhel85 ~]# mdadm -D /dev/md126
/dev/md126:
Container : ��, member 0
The Detail function first gets container name by function
map_dev_preferred. Then it tries to find which disks are
available. In patch db5377883fef(It should be FAILED..)
uses map_dev_preferred to find which disks are under /dev.
But now, the major/minor information comes from kernel space.
map_dev_preferred malloc memory and init a device list when
first be called by Detail. It can't find the device in the
list by the major/minor. It free the memory and reinit the
list.
> The container name now points to an area tha has been freed.
So the containt is a mess.
Container name is collected with 'create' flag set, so it's
name is additionally copied to static memory to prevent
overwrites. Could you verify?
So summarizing: the previously returned value might be lost
in next call.
I looked into code, map_dev_preffered() mainly is used for
determining short-life buffers, which are used only to the next
call (next call overwrites previous result, expect case you fixed).
IMO map_dev_preffered() cannot be trusted now. I see some options:
1 - allocate additional memory and save return value there (caller
needs to free it later).
2 - describe limitations in description of the function to avoid
incorrect usages in the future.
This patch replaces map_dev_preferred with devid2kname. If
the name is NULL, it means the disk is unplug.
Your patch fixes only one place. Please go forward and analyze all
map_dev_preffered() calls (which looks safe to me). Maybe this
function can be replaced at all and we can drop this code in
flavor of devid2kname() or other.
Fixes: db5377883fef (It should be FAILED when raid has)
Signed-off-by: Xiao Ni <xni@xxxxxxxxxx>
Reported-by: Fine Fan <ffan@xxxxxxxxxx>
---
Detail.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/Detail.c b/Detail.c
index d3af0ab..2164de3 100644
--- a/Detail.c
+++ b/Detail.c
@@ -351,11 +351,13 @@ int Detail(char *dev, struct context *c)
avail = xcalloc(array.raid_disks, 1);
for (d = 0; d < array.raid_disks; d++) {
- char *dv, *dv_rep;
- dv = map_dev_preferred(disks[d*2].major,
- disks[d*2].minor, 0, c->prefer);
- dv_rep = map_dev_preferred(disks[d*2+1].major,
- disks[d*2+1].minor, 0, c->prefer);
+ char *dv, *dv_rep = NULL;
+
+ if (!disks[d*2].major && !disks[d*2].minor)
+ continue; > +
+ dv = devid2kname(makedev(disks[d*2].major, disks[d*2].minor));
+ dv_rep = devid2kname(makedev(disks[d*2+1].major, disks[d*2+1].minor));
if ((dv && (disks[d*2].state & (1<<MD_DISK_SYNC))) ||
(dv_rep && (disks[d*2+1].state & (1<<MD_DISK_SYNC)))) {
Yeah, I know that it is used in Detail this way, but please determine
way to replace this ugly [d*2] and [d*2+1].
This whole block should be moved from Detail() code to separate
function, which determines if device or replacement is in sync.
Thanks,
Mariusz