On Sat, May 2, 2020 at 12:48 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
>
> On Fri, May 01, 2020 at 04:14:38PM +0900, Chirantan Ekbote wrote:
> > On Tue, Apr 28, 2020 at 12:20 AM Stefan Hajnoczi <stefanha@xxxxxxxxxx> wrote:
> > > Instead of modifying the guest driver, please implement request
> > > parallelism in your device implementation.
> >
> > Yes, we have tried this already [1][2]. As I mentioned above, having
> > additional threads in the server actually made performance worse. My
> > theory is that when the device only has 2 cpus, having additional
> > threads on the host that need cpu time ends up taking time away from
> > the guest vcpu. We're now looking at switching to io_uring so that we
> > can submit multiple requests from a single thread.
>
> The host has 2 CPUs? How many vCPUs does the guest have? What is the
> physical storage device? What is the host file system?

The host has 2 cpus. The guest has 1 vcpu. The physical storage
device is an internal ssd. The file system is ext4 with directory
encryption.

>
> io_uring's vocabulary is expanding. It can now do openat2(2), close(2),
> statx(2), but not mkdir(2), unlink(2), rename(2), etc.
>
> I guess there are two options:
> 1. Fall back to threads for FUSE operations that cannot yet be done via
>    io_uring.
> 2. Process FUSE operations that cannot be done via io_uring
>    synchronously.
>

I'm hoping that using io_uring for just the reads and writes should
give us a big enough improvement that we can do the rest of the
operations synchronously.

Chirantan
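
For illustration only, here is a rough sketch of the split described above:
reads and writes are queued and submitted as a batch from a single thread,
and everything else is handled synchronously (option 2). It does not use
io_uring at all. The `Dispatcher` class, the `ASYNC_OPS` set and the
operation names are hypothetical stand-ins, and the queue only models the
batching behaviour, not a real submission ring.

```python
from collections import deque

# Assumption: only reads and writes take the asynchronous path, as
# proposed in the message. The names are illustrative, not FUSE opcodes.
ASYNC_OPS = {"read", "write"}


class Dispatcher:
    """Single-threaded dispatcher: queue async-capable ops, run the rest inline."""

    def __init__(self):
        self.pending = deque()  # stands in for a submission queue
        self.completed = []     # (op, path, how) in completion order

    def handle(self, op, path):
        if op in ASYNC_OPS:
            # Queued for batched submission; a real server would prepare
            # a submission queue entry here instead.
            self.pending.append((op, path))
        else:
            # Option 2 from the thread: process the request synchronously.
            self.completed.append((op, path, "sync"))

    def submit(self):
        """Flush all queued requests in one batch; return how many were sent."""
        n = len(self.pending)
        while self.pending:
            op, path = self.pending.popleft()
            self.completed.append((op, path, "async"))
        return n


d = Dispatcher()
for op, path in [("read", "a"), ("mkdir", "b"), ("write", "a"), ("unlink", "c")]:
    d.handle(op, path)
batch = d.submit()
print(batch, d.completed)
```

In this toy model the synchronous operations finish before the queued
reads and writes are flushed, so a real server would probably need to think
about ordering between the two paths.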