On 4/5/22 1:45 AM, Miklos Szeredi wrote:
> On Sat, 2 Apr 2022 at 03:17, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> On 4/1/22 10:21 AM, Jens Axboe wrote:
>>> On 4/1/22 10:02 AM, Miklos Szeredi wrote:
>>>> On Fri, 1 Apr 2022 at 17:36, Jens Axboe <axboe@xxxxxxxxx> wrote:
>>>>
>>>>> I take it you're continually reusing those slots?
>>>>
>>>> Yes.
>>>>
>>>>> If you have a test
>>>>> case that'd be ideal. Agree that it sounds like we just need an
>>>>> appropriate breather to allow fput/task_work to run. Or it could be the
>>>>> deferral free of the fixed slot.
>>>>
>>>> Adding a breather could make the worst case latency be large. I think
>>>> doing the fput synchronously would be better in general.
>>>
>>> fput() isn't sync, it'll just offload to task_work. There are some
>>> dependencies there that would need to be checked. But we'll find a way
>>> to deal with it.
>>>
>>>> I test this on a VM with 8G of memory and run the following:
>>>>
>>>> ./forkbomb 14 &
>>>> # wait till 16k processes are forked
>>>> for i in `seq 1 100`; do ./procreads u; done
>>>>
>>>> You can compare performance with plain reads (./procreads p), the
>>>> other tests don't work on public kernels.
>>>
>>> OK, I'll check up on this, but probably won't have time to do so before
>>> early next week.
>>
>> Can you try with this patch? It's not complete yet, there's actually a
>> bunch of things we can do to improve the direct descriptor case. But
>> this one is easy enough to pull off, and I think it'll fix your OOM
>> case. Not a proposed patch, but it'll prove the theory.
>
> Sorry for the delay..
>
> Patch works like a charm.

OK good, then it is the issue I suspected. Thanks for testing!

--
Jens Axboe