On Thu, Oct 03, 2019 at 10:42:44AM +0200, Miklos Szeredi wrote: > On Wed, Oct 2, 2019 at 3:27 PM Vivek Goyal <vgoyal@xxxxxxxxxx> wrote: > > > > On Wed, Oct 02, 2019 at 09:40:11AM +0200, Miklos Szeredi wrote: > > > Looking at the ugly retry logic in virtiofs and have some questions. > > > > Hi Miklos, > > > > What are you thinking w.r.t cleanup of retry logic. As of now we put > > requests in a list and retry later with the help of a worker. > > Two things: > > - don't use a timeout for retrying > - try to preserve order of requests submitted vs. order of requests queued Hi Miklos, I understand first point but no the second one. Can you elaborate a bit more that why it is important. Device does not provide any guarantees that requests will be completed in order of submission. If that's the case, then submitter can't assume any order in request completion anyway. > > This is complicated by the fact that we call virtqueue_add_sgs() under > spinlock, which is the reason GFP_ATOMIC is used. However GFP_ATOMIC > can easily fail and that means even if the "indirect_desc" feature is > turned on a request may use several slots of the ring buffer for a > single request. Aha, you are referring to the case where indirect_desc feature is enabled but memory allocation fails, so it falls back to using regular descriptors. > Worst case is that a request has lots of pages, > resulting in total_sgs > vring.num, which is bad, because we'll get > ENOSPC thinking that the operation needs to be retried once some > pending requests are done, but that's not actually the case... Got it. So if we always allocate memory for total_sgs (which could be more than vring.num) and are in a position to wait for memory allocation, then this problem will not be there. Alternate solution probably is to query queue size in the beginning and make sure number of pages attached to a request are less then that. > > So my proposed solution is to turn fsvq->lock into a mutex; call > virtqueue_add_sgs() with whatever gfp flags are used to allocate the > request and add __GFP_NOFAIL for good measure. This means that the > request is guaranteed to use a single slot IF "indirect_desc" is on. > And there should be some way to verify from the virtiofs code that > this feature is indeed on, otherwise things can get messy (as noted > above). Sounds reasonable. So if system is under memory pressure, currently we will fall back to using pre-allocated descriptors instead of waiting for memory to become free. IOW, virtio-fs can continue to make progress. But with GFP_NOFAIL we will wait for memory to be allocated for indirect descriptor and then make progress. It probably is fine, I am just thinking loud. I noticed that all the virtio drivers currently seem to be using GFP_ATOMIC. Interesting... Anyway, it probably is worth trying it. It also solves the problem of retrying as we will block for resources to be allocated and should not get -ENOSPC or -ENOMEM. BTW, I have few cleanup/improvement patches lined up internally to use better method to wait for draining of request. Will post these separately. They probably can go in next merge window. Thanks Vivek _______________________________________________ Virtualization mailing list Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linuxfoundation.org/mailman/listinfo/virtualization