On Wed, Feb 19, 2025 at 09:07:36PM +0800, Junxian Huang wrote:
>
>
> On 2025/2/19 20:14, Leon Romanovsky wrote:
> > On Mon, Feb 17, 2025 at 03:01:19PM +0800, Junxian Huang wrote:
> >> When mailboxes for resource(QP/CQ/SRQ) destruction fail, it's unable
> >> to notify HW about the destruction. In this case, driver will still
> >> free the resources, while HW may still access them, thus leading to
> >> a UAF.
> >
> >> This series introduces delay-destruction mechanism to fix such HW UAF,
> >> including the HW CTX and doorbells.
> >
> > And why can't you fix FW instead?
> >
>
> The key is the failure of mailbox, and there are some cases that would
> lead to it, which we don't really consider as FW bugs.
>
> For example, when some random fatal error like RAS error occurs in FW,
> our FW will be reset. Driver's mailbox will fail during the FW reset.

I don't understand this scenario. You said at the beginning that HW can
access host memory and that this triggers a UAF. However, now you are
presenting a case where the driver tries to access the mailbox.

>
> Another case is the mailbox timeout when FW is under heavy load, as it is
> shared by multi-functions.

This is no different from any other mailbox error. FW needs to handle
these cases.

Thanks

>
> Thanks,
> Junxian
>
> > Thanks