On Thu, 2021-02-04 at 19:25 +0800, lixiaokeng wrote:
>
> Hi Martin,
>
> On 2021/1/27 7:11, Martin Wilck wrote:
> > So we can only conclude that (if there's no kernel refcounting bug,
> > which I doubt) either orphan_path()->uninitialize_path() had been
> > called (closing the fd), or that opening the sd device had failed in
> > the first place (in which case the path WWID should have been nulled
> > in pathinfo()). In both cases it makes little sense that the path
> > should still be part of a struct multipath.
>
> I have an idea.
>
> If pp->fd < 0 ("Couldn't open device node"), pathinfo() returns
> PATHINFO_FAILED. Don't close(pp->fd) in orphan_path(). It may solve
> the problem (device with wrong path). I will take some time to test it.

Do you have evidence that the fd had been closed in your error case?
The path in question wasn't orphaned, if I understood correctly. You
said it was still a member of a map. In that case, the fd *must* be open.

> However, I don’t know if there are potential risks. Do you have
> suggestions about this?

Other than resource usage ... users might be irritated because if we do
this and a device is removed and reappears, it will *always* have a
different device node attached. But the device nodes are random today,
anyway.

If we missed a delete event, we might keep this fd open forever,
because a re-added path would never get the same sysfs path again; not
sure if that might hurt in some scenarios. We shouldn't miss delete
events anyway, of course.

So no, at least off the top of my head, I can't think of anything
serious. Famous last words ;-)

We must make sure to close the fd in the free_path() code path, of
course.

Btw, I just double-checked that the kernel really behaves as I thought.
You can run e.g. in python:

    >>> import os
    >>> f=os.open("/dev/sdh", os.O_RDWR|os.O_EXCL)

This will keep an fd to the device open. Now if you delete the device
and re-add it by scanning the scsi host, it will get a new device name
(no longer "sdh").

    echo 1 >/sys/block/sdh/device/delete
    echo - - - >/sys/class/scsi_host/host2/scan

If you close the fd in python and repeat the delete/re-add (and nothing
else happened in the meantime), it will become "sdh" again.

Cheers,
Martin

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
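
For illustration only, here is a toy Python model of the fd lifetime
discussed in this mail: the fd stays open when a path is orphaned and is
closed only when the path is freed. This is not multipath-tools code. The
Path class, the in_map flag and the fd_is_open() helper are hypothetical,
and orphan_path()/free_path() only borrow their names from the C code;
their bodies are assumptions about what the proposal would amount to.

```python
import os


class Path:
    """Toy stand-in for multipath-tools' struct path (hypothetical model)."""

    def __init__(self, devnode, flags=os.O_RDONLY):
        self.devnode = devnode
        # Assumption: models membership in a struct multipath.
        self.in_map = True
        try:
            # For a real SCSI disk one would pass os.O_RDWR | os.O_EXCL,
            # as in the interactive example in the mail.
            self.fd = os.open(devnode, flags)
        except OSError:
            # Mirrors the "Couldn't open device node" case (pp->fd < 0).
            self.fd = -1


def orphan_path(pp):
    """Detach the path from its map but, as proposed, keep the fd open."""
    pp.in_map = False


def free_path(pp):
    """Final teardown: the only place in this model that closes the fd."""
    if pp.fd >= 0:
        os.close(pp.fd)
        pp.fd = -1


def fd_is_open(fd):
    """Return True if fd refers to an open file descriptor."""
    if fd < 0:
        return False
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False
```

With a real disk, pp = Path("/dev/sdh", os.O_RDWR | os.O_EXCL) would hold
the fd open across orphan_path(pp) and release it only in free_path(pp).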