On Wed, Mar 24 2021 at 9:21pm -0400, Zhiqiang Liu <liuzhiqiang26@xxxxxxxxxx> wrote: > > > On 2021/3/22 22:22, Mike Snitzer wrote: > > On Mon, Mar 22 2021 at 4:11am -0400, > > Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote: > > > >> On Sat, Mar 20, 2021 at 03:19:23PM +0800, Zhiqiang Liu wrote: > >>> From: Zhiqiang Liu <liuzhiqiang26@xxxxxxxxxx> > >>> > >>> When we make IO stress test on multipath device, there will > >>> be a metadata err because of wrong path. In the test, we > >>> concurrent execute 'iscsi device login|logout' and > >>> 'multipath -r' command with IO stress on multipath device. > >>> In some case, systemd-udevd may have not time to process > >>> uevents of iscsi device logout|login, and then 'multipath -r' > >>> command triggers multipathd daemon calls ioctl to load table > >>> with incorrect old device info from systemd-udevd. > >>> Then, one iscsi path may be incorrectly attached to another > >>> multipath which has different uuid. Finally, the metadata err > >>> occurs when umounting filesystem to down write metadata on > >>> the iscsi device which is actually not owned by the multipath > >>> device. > >>> > >>> So we need to check whether all pgpaths of one multipath have > >>> the same uuid, if not, we should throw a error. > >>> > >>> Signed-off-by: Zhiqiang Liu <liuzhiqiang26@xxxxxxxxxx> > >>> Signed-off-by: lixiaokeng <lixiaokeng@xxxxxxxxxx> > >>> Signed-off-by: linfeilong <linfeilong@xxxxxxxxxx> > >>> Signed-off-by: Wubo <wubo40@xxxxxxxxxx> > >>> --- > >>> drivers/md/dm-mpath.c | 52 +++++++++++++++++++++++++++++++++++++++++ > >>> drivers/scsi/scsi_lib.c | 1 + > >>> 2 files changed, 53 insertions(+) > >>> > >>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c > >>> index bced42f082b0..f0b995784b53 100644 > >>> --- a/drivers/md/dm-mpath.c > >>> +++ b/drivers/md/dm-mpath.c > >>> @@ -24,6 +24,7 @@ > >>> #include <linux/workqueue.h> > >>> #include <linux/delay.h> > >>> #include <scsi/scsi_dh.h> > >>> +#include <linux/dm-ioctl.h> > >>> #include <linux/atomic.h> > >>> #include <linux/blk-mq.h> > >>> > >>> @@ -1169,6 +1170,45 @@ static int parse_features(struct dm_arg_set *as, struct multipath *m) > >>> return r; > >>> } > >>> > >>> +#define SCSI_VPD_LUN_ID_PREFIX_LEN 4 > >>> +#define MPATH_UUID_PREFIX_LEN 7 > >>> +static int check_pg_uuid(struct priority_group *pg, char *md_uuid) > >>> +{ > >>> + char pgpath_uuid[DM_UUID_LEN] = {0}; > >>> + struct request_queue *q; > >>> + struct pgpath *pgpath; > >>> + struct scsi_device *sdev; > >>> + ssize_t count; > >>> + int r = 0; > >>> + > >>> + list_for_each_entry(pgpath, &pg->pgpaths, list) { > >>> + q = bdev_get_queue(pgpath->path.dev->bdev); > >>> + sdev = scsi_device_from_queue(q); > >> > >> Common dm-multipath code should never poke into scsi internals. This > >> is something for the device handler to check. It probably also won't > >> work for all older devices. > > > > Definitely. > > > > But that aside, userspace (multipathd) _should_ be able to do extra > > validation, _before_ pushing down a new table to the kernel, rather than > > forcing the kernel to do it. > > As your said, it is better to do extra validation in userspace (multipathd). > However, in some cases, the userspace cannot see the real-time present devices > info as Martin (committer of multipath-tools) said. > In addition, the kernel can see right device info in the table at any time, > so the uuid check in kernel can ensure one multipath is composed with paths mapped to > the same device. > > Considering the severity of the wrong path in multipath, I think it worths more > checking. As already said: this should be fixable in userspace. Please work with multipath-tools developers to address this. Mike