On Tue, Apr 2, 2019 at 10:53 AM Michael S. Tsirkin <mst@xxxxxxxxxx> wrote:
>
> On Tue, Apr 02, 2019 at 10:45:43AM -0700, Alexander Duyck wrote:
> > We went through this back in the day with
> > networking. Adding more buffers is not the solution. The solution is
> > to have a way to gracefully recover and keep our hinting latency and
> > buffer bloat to a minimum.
>
> That's an interesting approach, I think that things that end up working
> well are NAPI (asynchronous notifications), limited batching, XDP (big
> aligned buffers) and BQL (accounting). Is that your perspective too?
>

Yes, that is kind of what I was getting at.

Basically we could have a kthread running somewhere that goes through
and pulls something like 64M of pages out of the MAX_ORDER - 1
freelist, does what is necessary to isolate them, puts them on a queue
somewhere, kicks the virtio ring, and waits for the response to come
back indicating that the hints have been processed. We would just have
to keep it running until the list doesn't have enough non-"Offline"
memory to fulfill the request. Then we just wait until we again reach
a level necessary to justify waking the thread back up, and repeat.

In my mind it looks a lot like your standard Rx ring, in that we
allocate some fixed number of buffers and wait for hardware to tell us
when the buffers are ready. The only extra complexity is having to add
tracking using the PageType "Offline" bit, which should be cheap when
we are already having to manipulate the "Buddy" PageType anyway. It
would let us get away from having to do the per-cpu queues and the
complicated coordination logic to translate free pages to their buddy.
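
To make the loop described above concrete, here is a rough userspace
model of it in Python. It is only a sketch, not kernel code. The names
FreeList, FakeHost, hinting_pass and should_wake are made up for
illustration, the synchronous process() call stands in for "kick the
virtio ring and wait for the response", and the block size assumes 4K
pages with MAX_ORDER = 11. Returning hinted blocks to the tail of the
list and using one full batch as the wake-up level are also
assumptions.

```python
from collections import deque
from dataclasses import dataclass

PAGE_SIZE = 4096                             # assumption: 4K pages
MAX_ORDER = 11                               # assumption: common default
BLOCK_PAGES = 1 << (MAX_ORDER - 1)           # pages per MAX_ORDER - 1 block
BLOCK_BYTES = PAGE_SIZE * BLOCK_PAGES        # 4 MiB per block
BATCH_BYTES = 64 << 20                       # "something like 64M" per request
BATCH_BLOCKS = BATCH_BYTES // BLOCK_BYTES    # blocks pulled per request


@dataclass
class Block:
    pfn: int
    offline: bool = False    # stands in for the PageType "Offline" bit


class FreeList:
    """Toy model of the MAX_ORDER - 1 freelist (hypothetical)."""

    def __init__(self, nr_blocks):
        self.blocks = deque(Block(pfn=i * BLOCK_PAGES) for i in range(nr_blocks))

    def unhinted(self):
        """Number of blocks on the list that are not marked Offline."""
        return sum(1 for b in self.blocks if not b.offline)

    def isolate(self, count):
        """Take up to `count` non-Offline blocks off the list."""
        taken, kept = [], deque()
        for b in self.blocks:
            if not b.offline and len(taken) < count:
                taken.append(b)
            else:
                kept.append(b)
        self.blocks = kept
        return taken

    def give_back(self, blocks):
        """Mark blocks Offline and return them to the tail of the list."""
        for b in blocks:
            b.offline = True
            self.blocks.append(b)


class FakeHost:
    """Stand-in for the hypervisor side; records which blocks were hinted."""

    def __init__(self):
        self.hinted = []

    def process(self, batch):
        # Models "kick the ring and wait for the response" as one call.
        self.hinted.extend(b.pfn for b in batch)


def should_wake(freelist):
    """Wake the thread only when a full batch of unhinted memory exists."""
    return freelist.unhinted() >= BATCH_BLOCKS


def hinting_pass(freelist, host):
    """Run until the list can no longer fill a request; return request count."""
    requests = 0
    while should_wake(freelist):
        batch = freelist.isolate(BATCH_BLOCKS)
        host.process(batch)
        freelist.give_back(batch)
        requests += 1
    return requests
```

With 40 free blocks in this model, the thread would issue two requests
of 16 blocks each and then go back to sleep with 8 unhinted blocks
left, since that is less than one full batch.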