On Wed, Jan 22, 2025 at 12:05:27AM -0800, Nicolin Chen wrote: > On Tue, Jan 21, 2025 at 01:40:13PM -0800, Nicolin Chen wrote: > > On Tue, Jan 21, 2025 at 05:14:04PM -0400, Jason Gunthorpe wrote: > > > Since we don't hold the spinlock the whole time there is a race where > > > we could pull the overflow off and then another entry could be pushed > > > while we do the copy_to_user. > > > > I see. I'll be careful around that. I can imagine that one tricky > > thing can be to restore the overflow node back to the list when a > > copy_to_user fails.. > > Hmm, it gets trickier because the overflow node is a preallocated > single node per vEVENTQ. We must not only check list_empty for its > restore, but also protect the overflow->header.sequence from races > between atomic_inc and copy_to_user. However, we can't use a mutex > because copy_to_user might DOS... > > A simple solution could be to just duplicate overflow nodes, each > of which contains a different sequence, like a regular vEVENT node. > This certainly changes the uAPI for read(). Though the duplication > of overflow nodes doesn't sound optimal, it's acceptable since the > duplicated nodes would have been regular vEVENT nodes if there was > no overflow (i.e. no extra overhead)? Ah, didn't think clearly last night.. We can't simply add overflow nodes either for rate/memory limit reason that you concerned about in the other email. On the other hand, though certainly not being ideal, indulging the race at the sequence index might not be that harmful, compared to the extreme of the other case.. I'll give another thought today if there's some other way out. Thanks Nicolin