On 7/31/23 13:32, Jason Gunthorpe wrote:
> On Mon, Jul 31, 2023 at 01:26:23PM -0500, Bob Pearson wrote:
>> On 7/31/23 13:17, Jason Gunthorpe wrote:
>>> On Fri, Jul 21, 2023 at 03:50:22PM -0500, Bob Pearson wrote:
>>>> Network interruptions may cause long delays in the processing of
>>>> send packets during which time the rxe driver may be unloaded.
>>>> This will cause seg faults when the packet is ultimately freed as
>>>> it calls the destructor function in the rxe driver. This has been
>>>> observed in cable pull fail over fail back testing.
>>>
>>> No, module reference counts are only for code that is touching
>>> function pointers.
>>
>> this is exactly the case here. it is the skb destructor function that
>> is carried by the skb.
>
> It can't possibly call it correctly without also having the rxe
> ib_device reference too though??

Nope. This was causing seg faults in testing when there was a long network
hang and the admin tried to reload the rxe driver. The skb code doesn't care
about the ib device at all.

Bob
>
> Jason