On Mon, 2021-02-01 at 23:26 -0600, Benjamin Marzinski wrote: > On Mon, Feb 01, 2021 at 04:12:34PM +0100, Martin Wilck wrote: > > On Mon, 2021-02-01 at 22:50 +0800, lixiaokeng wrote: > > > > > > > > > > > > > cli_add_path > > > > > ->ev_add_path > > > > > ->add_map_with_path > > > > > ->adopt_paths > > > > > ->pathinfo > > > > > ->filter_property > > > > > ->return PATHINFO_SKIPPED, > > > > > ->pp->mpp is NULL and not be set > > > > > ->return 0 > > > > > > > > This returns 0, but add_map_with_path() has this code to check > > > > whether > > > > the path passed to it was actually added to the new map: > > > > > > > > if (adopt_paths(vecs->pathvec, mpp) || > > > > find_slot(vecs->pathvec, pp) == -1) > > > > goto out; -> return NULL > > > > > > > > So ev_add_path() should have seen a NULL return from > > > > add_map_with_path(), should not have set start_waiter, and > > > > failed. > > > > > > > > > > I'm sorry for a big mistake in my stack. As the code is > > > optimized, > > > pathinfo > > > return PATHINFO_SKIPPED after finish filter_property when I use > > > gdb. > > > It > > > happens acctualy in: > > > 2141 if (pp->bus == SYSFS_BUS_SCSI && > > > 2142 pp->sg_id.proto_id == > > > SCSI_PROTOCOL_USB > > > && > > > 2143 !conf->allow_usb_devices) { > > > 2144 condlog(3, "%s: skip USB device > > > %s", > > > pp->dev, > > > 2145 pp->tgt_node_name); > > > 2146 return PATHINFO_SKIPPED; > > > 2147 } > > > 2148 } > > > > > > Stack: > > > cli_add_path > > > ->ev_add_path > > > ->add_map_with_path > > > ->adopt_paths > > > ->pathinfo > > > ->pp->bus == SYSFS_BUS_SCSI > > > ->return PATHINFO_SKIPPED, > > > ->pp->mpp is NULL and not be set > > > ->return 0 > > > ->mpath_pr_event_handle > > > ->get_be64 //pp->mpp is dereference > > > > > > If you think my patch is ok, I will resend it. > > > > The same argument I made above still holds. "pp" wouldn't have been > > added to mpp, and add_map_with_path() would fail and return NULL. > > Also, if pathinfo() returns PATHINFO_SKIPPED for this device, > > how comes that cli_add_path() called ev_add_path() for it? It > > should > > have returned "blacklisted" instead. > > So, I think the main issue here is that filter_property appears to be > broken. It only filters if uid_attribute is set, but that will never > be > set the first time it's called in pathinfo. This means that it will > pass in the pathinfo call in cli_add_path, and the path will get > stored > in the pathvec. I'd not say filter_property() is broken; rather, its callers. See below. > However, it will fail in the pathinfo call from adopt_paths, so the > path > won't be added to the multipath device. This means adopt paths > doesn't > actually adopt any paths potentially, but that in itself doesn't > cause > it to fail. This check > > if (adopt_paths(vecs->pathvec, mpp) || > find_slot(vecs->pathvec, pp) == -1) > goto out; > > passes, since we only check if the path is on the pathvec, not part > of > the multipath device, and since filter_property let the path past the > first time, it is. So add_map_with_path() will create a multipath > device, but the path won't be added to it, and pp->mpp == NULL. > > So, add_map_with_path() should probably check that we actually > created a > map that included the path that got added. Exactly, this is a bug. Thanks for analyzing this. I'll send a patch asap. > But more importantly, > filter_property shouldn't return different results the when it's > called > the first time. That would have avoid the entire situation. I'm not sure about this. filter_property() returning different results depending on the initialization state of the path is irritating, sure. Yet it makes certain sense that once the information about a path object is more complete, the decision whether or not it should be blacklisted might change. The only way to avoid that is to provide the necessary information in the first place. IMO the only reasonable way to amend this is to call select_getuid() before filter_property(). I'll send a patch for that, too. Determination of getuid/uid_attribute isn't expensive; the only reason why we've been doing it "lazily" is legacy, AFAICS. Note: We bump into a deep design issue with multipath-tools here, the fact that path and multipath properties depend on each other in complex ways, but these dependencies are hidden deeply in the code and need to be explicitly memorized by us developers when we work on the code, which is error-prone. Unfortunately, I see no easy way out of this. Regards Martin -- dm-devel mailing list dm-devel@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/dm-devel