On Wed, Apr 13, 2022 at 07:09:57PM +0800, Yao Hongbo wrote:
>
> On 2022/4/13 5:43 PM, Greg KH wrote:
> > On Wed, Apr 13, 2022 at 05:25:40PM +0800, Yao Hongbo wrote:
> > > On 2022/4/13 4:51 PM, Michael S. Tsirkin wrote:
> > > > On Wed, Apr 13, 2022 at 09:33:17AM +0200, Greg KH wrote:
> > > > > On Wed, Apr 13, 2022 at 03:01:42PM +0800, Yao Hongbo wrote:
> > > > > > If two userspace programs both open the PCI UIO fd, when one
> > > > > > of the programs exits uncleanly, the other will cause IO hang
> > > > > > due to bus-mastering disabled.
> > > > > >
> > > > > > It's a common usage for spdk/dpdk to use UIO. So, introduce refcnt
> > > > > > to avoid such problems.
> > > > > Why do you have multiple userspace programs opening the same device?
> > > > > Shouldn't they coordinate?
> > > > Or to restate, I think the question is, why not open the device
> > > > once and pass the FD around?
> > > Hmm, it will have the same result, no matter whether opening the same
> > > device or passing the FD around.
> > How?  You only open once, and close once.  Where is the multiple closes?
> >
> > > Our expectation is that even if the primary process exits abnormally,
> > > the second process can still send or receive data.
> > Then use the same file descriptor.
> >
> Yes, we can use the same file descriptor.
>
> But since the PCIe bus-master has been disabled by the primary process,
> the secondary process cannot continue to operate.

Really?  With the same file descriptor?  Try it and see.  release should
only be called when the file descriptor is closed.

> > > The impact of disabling pci bus-master is relatively large, and we
> > > should make some restrictions on this behavior.
> > Why?  UIO is "you better really really know what you are doing to use
> > this interface", right?  Just duplicate the fd and pass it around if you
> > must have multiple accesses to the same device.
> >
> > And again, this will be a functional change.  How can you handle your
> > userspace on older kernels if you make this change?
>
> Without this change, our userspace cannot work properly on older kernels.

What change broke your userspace?

> Our userspace only uses the "multi process mode" feature of the spdk.
>
> The SPDK links:
> https://spdk.io/doc/app_overview.html
>
> "Multi process mode
> When --shm-id is specified, the application is started in multi-process
> mode. Applications using the same shm-id share their memory and NVMe
> devices. The first app to start with a given id becomes a primary
> process, with the rest, called secondary processes, only attaching to
> it. When the primary process exits, the secondary ones continue to
> operate, but no new processes can be attached at this point. All
> processes within the same shm-id group must use the same
> --single-file-segments setting."

Please work with the spdk users, I know nothing about that mess, sorry.

greg k-h
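
To illustrate the "open once and pass the FD around" suggestion, here is a
minimal sketch that is not from the thread. It assumes a temporary file as a
stand-in for the UIO device node, and `share_open_file` is a hypothetical
name. A descriptor sent over a Unix socket refers to the same open file
description as the original, so closing one copy is not the last close, and
the driver's release would only be expected to run once every copy is gone.

```python
import os
import socket
import tempfile


def share_open_file():
    """Open a file once, pass the fd over a Unix socket, close the original."""
    # Assumption: a temp file stands in for the device node, which is
    # opened exactly once.
    fd, path = tempfile.mkstemp()
    os.unlink(path)                      # the open fd keeps the file alive
    os.write(fd, b"abcdef")
    os.lseek(fd, 0, os.SEEK_SET)

    # Pass the descriptor over a Unix socket pair (SCM_RIGHTS under the hood).
    # Both ends live in one process here to keep the sketch self-contained;
    # the receiving end would normally be the secondary process.
    left, right = socket.socketpair()
    socket.send_fds(left, [b"x"], [fd])
    _, fds, _, _ = socket.recv_fds(right, 1, 1)
    passed_fd = fds[0]

    first = os.read(fd, 3)               # advances the shared file offset
    os.close(fd)                         # "primary" closes its copy

    # The passed copy is still usable and continues at the shared offset,
    # which shows that both fds refer to one open file description.
    second = os.read(passed_fd, 3)

    os.close(passed_fd)                  # last reference: file is released now
    left.close()
    right.close()
    return first, second
```

`socket.send_fds` and `socket.recv_fds` need Python 3.9 or later on a Unix
system. Whether a given UIO userspace stack can be restructured to share a
descriptor this way is a separate question for that stack.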