On 09-Dec-18 18:48, Jason Gunthorpe wrote: > On Sun, Dec 09, 2018 at 11:27:41AM +0200, Gal Pressman wrote: >> On 06-Dec-18 22:07, Jason Gunthorpe wrote: >>> On Thu, Dec 06, 2018 at 02:53:36PM -0500, Doug Ledford wrote: >>> >>>> Then we end up having to duplicate the three or so bugfix patches where >>>> we moved roce sgid resolution out of the critical path because it can >>>> sleep. We end up making the code problems we have worse. Better to >>>> redesign this to remove that atomic assumption. Hence why I had them >>>> drop the stuff they had. >>> >>> The resolution is a slightly different issue, that was to do with >>> creating the rdma_ah_attr. Once the attr is created then the create_ah >>> driver callback is non-sleepable.. >>> >>>> If we get to the point of having two link layers that can't comply >>>> with the atomic requirement, then I think it's time to look at >>>> changing the requirement. >>> >>> Sure, and the people who want it changed are responsible :) Don't just >>> sweep the issue under the rung by ignoring the API contract 'because >>> it works for me today' >>> >>> Jason >>> >> >> We're not trying to sweep anything under the rug, we just can't comply with the >> atomic requirement. >> >> We have two options: >> 1. Comply with the atomic requirement, this means we will poll the device for >> completion instead of waiting for an interrupt. >> 2. Remove the atomic requirement. > > You could probably add a 'sleepable' flag to the create_ah > function. EFA could do EOPNOTSUPP if called this way. > > This would be a good improvement overall as some drivers are already > busy looping in create_ah and could avoid that in sleepable contexts.> > Jason > Sounds good to me.