On Mon, Jan 06, 2025 at 12:27:57PM -0400, Jason Gunthorpe wrote: > On Mon, Jan 06, 2025 at 01:05:08PM +0100, Simona Vetter wrote: > > On Thu, Jan 02, 2025 at 09:39:51AM -0400, Jason Gunthorpe wrote: > > > On Thu, Dec 19, 2024 at 11:04:54AM +0100, Christian König wrote: > > > > > > > > Based on all the above, I think we can conclude that since dma_buf_put() > > > > > does not directly (or synchronously) call the f_op->release() handler, a > > > > > deadlock is unlikely to occur in the scenario you described. > > > > > > Right, there is no deadlock here, and there is nothing inhernetly > > > wrong with using try_get like this. The locking here is ugly ugly > > > ugly, I forget why, but this was the best I came up with to untangle > > > it without deadlocks or races. > > > > Yeah, imo try_get is perfectly fine pattern. With that sorted my only > > request is to make the try_get specific to the dma_ops, because it only > > works if both ->release and the calling context of try_get follow the same > > rules, which by necessity are exporter specific. > > We already know the fd is a dma_ops one because it is on an internal > list and it could not get there otherwise. > > The pointer cannot become invalid and freed back to the type safe RCU > while on the list, meaning the try_get is safe as is. > > I think Christian's original advice to just open code it is the best > option. Yeah open coding in vfio is imo best, agreed on that. > > In full generality as a dma_buf.c interface it's just busted and there's > > no way to make it work, unless we inflict that locking ugliness on > > absolutely every exporter. > > I'm not sure about that, the struct file code has special logic to > accommodate the type safe RCU trick. I didn't try to digest it fully, > but I expect there are ways to use it safely without the locking on > release. Hm yes, if you use a write barrier when set your file pointer and clear it in release, then you can use get_file_rcu with just rcu_read_lock on the read side. But you have to directly use your own struct file * pointer since it needs to reload that in a loop, you can't use dma_buf->file. At that point you're already massively peeking behind the dma_buf abstraction that just directly using get_file_rcu is imo better. And for anything else, whether it's rcu or plain locks, it's all subsystem specific anyway that I think a dma_buf.c wrapper Not entirely sure, but for the dma_buf_try_get wrapper you have to put a full lock acquisition or rcu_sychnronize into your ->release callback, otherwise I don't think it's safe to use. > > > IIRC it was changed a while back because call chains were getting kind of > > > crazy long with the file release functions and stack overflow was a > > > problem in some cases. > > > > Hm I thought it was also a "oh fuck deadlocks" moment, because usually > > most of the very deep fput calls are for temporary reference and not the > > final one. > > That sounds motivating too, but I've also seen cases of being careful > to unlock before fputting things.. Yeah I think knowledge about this issue was very uneven in different subsystems. -Sima -- Simona Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch