On 2024/12/10 21:48, Jason Gunthorpe wrote:
> On Tue, Dec 10, 2024 at 02:24:16PM +0800, Junxian Huang wrote:
>>
>>
>> On 2024/12/10 3:01, Jason Gunthorpe wrote:
>>> On Mon, Oct 14, 2024 at 09:07:31PM +0800, Junxian Huang wrote:
>>>> From: Chengchang Tang <tangchengchang@xxxxxxxxxx>
>>>>
>>>> Mmap reset state to notify userspace about HW reset. The mmaped flag
>>>> hw_ready will be initiated to a non-zero value. When HW is reset,
>>>> the mmap page will be zapped and userspace will get a zero value of
>>>> hw_ready.
>>>
>>> This needs alot more explanation about *why* does userspace need this
>>> information and why is hns unique here.
>>>
>>
>> Our HW cannot flush WQEs by itself unless the driver posts a modify-qp-to-err
>> mailbox. But when the HW is reset, it'll stop handling mailbox too, so the HW
>> becomes unable to produce any more CQEs for the existing WQEs. This will break
>> some users' expectation that they should be able to poll CQEs as many as the
>> number of the posted WQEs in any cases.
>
> But your reset flow partially disassociates the device, when the
> userspace goes back to sleep, or rearms the CQ, it should get a hard
> fail and do a full cleanup without relying on flushing.
>

I'm not sure I got your point. When you said "the userspace goes back to
sleep", did you mean the ibv_get_async_event() API? Are you suggesting that
userspace should call ibv_get_async_event() to monitor async events, and that
when it gets a fatal event, it should stop polling CQs and clean up everything
instead of still waiting for the remaining CQEs?

Thanks,
Junxian

>> We try to notify the reset state to userspace so that we can generate software
>> WCs for the existing WQEs in userspace instead of HW in reset state, which is
>> what this rdma-core PR does:
>
> That doesn't sound right at all. Device disassociation is a hard fail,
> we don't try to elegantly do things like generate completions. The
> device is dead, the queues are gone.
>
> Jason
>
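
To make the question above concrete, here is a rough sketch of the decision
I understand is being suggested. It is only a self-contained model, not
libibverbs code: the enums and next_action() are hypothetical names, and the
event value merely stands in for whatever an application would read with
ibv_get_async_event(). The assumption is that a fatal event overrides any
wait for outstanding CQEs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for async event types; NOT the libibverbs enum. */
enum model_event {
    MODEL_EVENT_NONE,          /* nothing pending */
    MODEL_EVENT_PORT_ERR,      /* a non-fatal event */
    MODEL_EVENT_DEVICE_FATAL   /* device is dead, e.g. after a HW reset */
};

/* What the application does next (hypothetical names). */
enum model_action {
    MODEL_KEEP_POLLING,  /* CQEs are still outstanding, keep polling the CQ */
    MODEL_DONE,          /* every posted WQE has produced a CQE */
    MODEL_TEARDOWN       /* hard fail: stop polling and clean up everything */
};

/*
 * posted: number of WQEs posted so far.
 * polled: number of CQEs polled so far (must not exceed posted).
 */
enum model_action next_action(enum model_event ev, size_t posted, size_t polled)
{
    assert(polled <= posted);

    /* Assumption: a fatal event wins, even if CQEs are still missing. */
    if (ev == MODEL_EVENT_DEVICE_FATAL)
        return MODEL_TEARDOWN;

    if (polled == posted)
        return MODEL_DONE;

    return MODEL_KEEP_POLLING;
}
```

In this model the application never waits for the remaining CQEs once the
fatal event has been seen, which is the behaviour I am asking about.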