> Subject: Re: [EXTERNAL] Re: [PATCH v4 1/1] RDMA/mana_ib: Add EQ
> interrupt support to mana ib driver.
>
> On Wed, Aug 02, 2023 at 04:11:18AM +0000, Ajay Sharma wrote:
> >
> >
> > > On Aug 1, 2023, at 6:46 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > >
> > > On Tue, Aug 01, 2023 at 07:06:57PM +0000, Long Li wrote:
> > >
> > >> The driver interrupt code limits the CPU processing time of each EQ
> > >> by reading a small batch of EQEs in this interrupt. It guarantees
> > >> all the EQs are checked on this CPU, and limits the interrupt
> > >> processing time for any given EQ. In this way, a bad EQ (which is
> > >> stormed by a bad user doing unreasonable re-arming on the CQ) can't
> > >> storm other EQs on this CPU.
> > >
> > > Of course it can, the bad use just creates a million EQs and pushes
> > > a bit of work through them constantly. How is that really any
> > > different from pushing more EQEs into a single EQ?
> > >
> > > And how does your EQ multiplexing work anyhow? Do you poll every EQ
> > > on every interrupt? That itself is a DOS vector.
> >
> > User does not create eqs directly . EQ creation is by product of
> > opening device ie allocating context.
>
> Which is done directly by the user.
>
> > I am not sure if the same
> > process is allowed to open device multiple times
>
> Of course it can.
>
> > of lock implemented. So million eqs are probably far fetched .
>
> Uh, how do you conclude that?
>
> > As for how the eq servicing is done - only those eq’s for which the
> > interrupt is raised are checked. And each eq is tied only once and
> > only to a single interrupt.
>
> So you iterate over a list of EQs in every interrupt?
>
> Allowing userspace to increase the number of EQs on an interrupt is a direct
> DOS vector, no special fussing required.
>
> If you want this to work properly you need to have your HW arrange things so
> there is only ever one EQE in the EQ for a given CQ at any time. Another EQE
> cannot be stuffed by the HW until the kernel reads the first EQE and acks it
> back.
>
> You have almost got this right, the mistake is that userspace is the thing that
> allows the HW to generate a new EQE. If you care about DOS then this is the
> wrong design, the kernel and only the kernel must be able to trigger a new EQE
> for the CQ.
>
> In effect you need two CQ doorbells, a userspace one that re-arms the CQ, and
> a kernel one that allows a CQ that triggered on ARM to generate an EQE.
>
> Thus the kernel can strictly limit the flow of EQEs through the EQs such that an
> EQ can never overflow and a CQ can never consume more than one EQE.
>
> You cannot really fix this hardware problem with a software solution. You will
> always have a DOS at some point.

We'll address the comments and send another patch.

Thanks,
Long
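
For reference, the two-doorbell gating described in the quoted review could be modeled roughly as below. This is only a toy user-space simulation under stated assumptions, not mana_ib code and not a description of the actual hardware: every name here (toy_cq, toy_eq, toy_user_arm, toy_kernel_ack, toy_storm, and so on) is hypothetical, and the EQ is assumed to serve a single CQ so its capacity is 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the mana_ib driver or its HW. */

struct toy_cq {
    bool armed;        /* set by the userspace doorbell (re-arm) */
    bool eqe_enabled;  /* set only by the kernel doorbell */
};

struct toy_eq {
    size_t pending;    /* EQEs currently sitting in the queue */
    size_t capacity;   /* one slot per CQ bound to this EQ */
};

/* Userspace doorbell: re-arms the CQ but cannot cause an EQE by itself. */
void toy_user_arm(struct toy_cq *cq)
{
    cq->armed = true;
}

/* Kernel doorbell: lets the CQ generate exactly one more EQE. */
void toy_kernel_enable_eqe(struct toy_cq *cq)
{
    cq->eqe_enabled = true;
}

/*
 * Simulated hardware completion. An EQE is queued only when the CQ is
 * armed AND the kernel has enabled it. Returns true if an EQE was queued.
 */
bool toy_hw_completion(struct toy_cq *cq, struct toy_eq *eq)
{
    if (!cq->armed || !cq->eqe_enabled)
        return false;
    cq->armed = false;
    cq->eqe_enabled = false;
    assert(eq->pending < eq->capacity);  /* the EQ cannot overflow */
    eq->pending++;
    return true;
}

/* Kernel interrupt path: consume one EQE, then ack by re-enabling the CQ. */
void toy_kernel_ack(struct toy_cq *cq, struct toy_eq *eq)
{
    if (eq->pending == 0)
        return;
    eq->pending--;
    toy_kernel_enable_eqe(cq);
}

/*
 * A misbehaving user re-arms `rearms` times while the kernel never acks.
 * Returns the largest number of EQEs ever pending at once.
 */
size_t toy_storm(size_t rearms)
{
    struct toy_cq cq = { .armed = false, .eqe_enabled = true };
    struct toy_eq eq = { .pending = 0, .capacity = 1 };
    size_t max_pending = 0;

    for (size_t i = 0; i < rearms; i++) {
        toy_user_arm(&cq);
        toy_hw_completion(&cq, &eq);
        if (eq.pending > max_pending)
            max_pending = eq.pending;
    }
    return max_pending;
}
```

In this model, re-arming from userspace any number of times leaves at most one EQE outstanding for the CQ until toy_kernel_ack() runs, which is the property the review asks the hardware to provide.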