On 9/20/23 11:36, Bart Van Assche wrote:
> On 9/20/23 09:24, Bob Pearson wrote:
>> The verbs APIs do not make real time commitments. If a ULP fails
>> because of response times it is the problem in the ULP not in the
>> verbs provider.
>
> I think there is evidence that the root cause is in the RXE driver. I
> haven't seen any evidence that there would be any issues in any of the
> involved ULP drivers. Am I perhaps missing something?
>
> Bart.

I agree it is definitely possible. But I have also seen the same
behavior in the siw driver, which is completely independent. I have
tried but have not been able to figure out what the ULPs are waiting
for when the hangs occur. If someone who has a good understanding of
the ULPs could catch a hang and figure out what is missing, it would
give a clue as to what is going on.

As mentioned above, at the moment Ubuntu is failing rarely. But it used
to fail reliably (srp/002 about 75% of the time and srp/011 about 99%
of the time). There haven't been any changes to rxe to explain this.

Bob