On Tue, Feb 06, 2018 at 01:11:41AM +0200, Max Gurtovoy wrote:
>
>
> On 2/5/2018 8:09 PM, Jason Gunthorpe wrote:
> >On Mon, Feb 05, 2018 at 04:29:51PM +0200, Max Gurtovoy wrote:
> >>Currently the async EQ has only 256 entries. It might not be big enough
> >>for the SW to handle all the needed pending events. For example, in case
> >>of many QPs (let's say 1024) connected to an SRQ created using NVMeOF target
> >>and the target goes down, the FW will raise 1024 "last WQE reached" events
> >>and may cause EQ overrun. Increase the EQ to a more reasonable size, beyond
> >>which the FW should be able to delay the event and raise it later on using
> >>an internal backpressure mechanism.
> >
> >If the firmware has an internal backpressure mechanism then why
> >would we get an EQ overrun?
>
> FW backpressure mechanism is WIP, that's why we get the overrun.

Ah, so current HW blows up if EQ is overrun and that can actually be
triggered by ULPs? Yuk

> After consulting with FW team, we conclude that 256 EQ depth is small.
> Do you think it's reasonable to allocate 4k entries (256KB of contig memory)
> for async EQ ?

No idea, ask Saeed?

> >Do we need to block adding too many QPs to a SRQ as well or something
> >like that?
>
> Hard to say. In the storage world, this may lead to a situation that
> initiator X has priority over initiator Y without any good reason (only
> because X was served before Y)..

Well, correctness comes first, so if the device does have to protect
itself from rogue ULPs.. If that means enforcing a goofy limit, then
so be it :(

Presumably someday fixed firmware will remove the limitation?

Jason
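
For reference, the numbers discussed in this thread can be checked with a rough sketch like the one below. It is not driver code. The 64-byte entry size is an assumption inferred from "4k entries (256KB of contig memory)", and the helper names `eq_bytes` and `eq_overflow` are hypothetical. The overflow estimate is a worst case that assumes the SW consumes nothing while the burst arrives.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte EQ entries, inferred from 4096 entries == 256KB. */
#define EQE_SIZE_BYTES 64u

/* Bytes of contiguous memory needed for an EQ with nent entries. */
static inline size_t eq_bytes(uint32_t nent)
{
	return (size_t)nent * EQE_SIZE_BYTES;
}

/*
 * Worst-case number of events that do not fit when `burst` events are
 * raised at once and none are consumed in the meantime (hypothetical
 * helper, for illustration only).
 */
static inline uint32_t eq_overflow(uint32_t nent, uint32_t burst)
{
	return burst > nent ? burst - nent : 0;
}
```

Under these assumptions, eq_overflow(256, 1024) gives 768 events with no free slot for the 1024-QP example, while eq_overflow(4096, 1024) gives 0 at a cost of eq_bytes(4096) = 262144 bytes.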