On Fri, 3 Sept 2021 at 07:31, JeffleXu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
>
> On 8/17/21 10:08 PM, Miklos Szeredi wrote:
> > On Tue, 17 Aug 2021 at 15:22, JeffleXu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On 8/17/21 8:39 PM, Vivek Goyal wrote:
> >>> On Tue, Aug 17, 2021 at 10:06:53AM +0200, Miklos Szeredi wrote:
> >>>> On Tue, 17 Aug 2021 at 04:22, Jeffle Xu <jefflexu@xxxxxxxxxxxxxxxxx> wrote:
> >>>>>
> >>>>> This patchset adds support of per-file DAX for virtiofs, which is
> >>>>> inspired by Ira Weiny's work on ext4[1] and xfs[2].
> >>>>
> >>>> Can you please explain the background of this change in detail?
> >>>>
> >>>> Why would an admin want to enable DAX for a particular virtiofs file
> >>>> and not for others?
> >>>
> >>> Initially I thought that they needed it because they are downloading
> >>> files on the fly from server. So they don't want to enable dax on the file
> >>> till file is completely downloaded.
> >>
> >> Right, it's our initial requirement.
> >>
> >>> But later I realized that they should
> >>> be able to block in FUSE_SETUPMAPPING call and make sure associated
> >>> file section has been downloaded before returning and solve the problem.
> >>> So that can't be the primary reason.
> >>
> >> Saying we want to access 4KB of one file inside guest, if it goes
> >> through FUSE request routine, then the fuse daemon only need to download
> >> this 4KB from remote server. But if it goes through DAX, then the fuse
> >> daemon need to download the whole DAX window (e.g., 2MB) from remote
> >> server, so called amplification. Maybe we could decrease the DAX window
> >> size, but it's a trade off.
> >
> > That could be achieved with a plain fuse filesystem on the host (which
> > will get 4k READ requests for accesses to mapped area inside guest).
> > Since this can be done selectively for files which are not yet
> > downloaded, the extra layer wouldn't be a performance problem.
> >
> > Is there a reason why that wouldn't work?
>
> I didn't realize this mechanism (working around from user space) before
> sending this patch set.
>
> After learning the virtualization and KVM stuffs, I find that, as Vivek
> Goyal replied in [1], virtiofsd/qemu need to somehow hook the user page
> fault and then download the remained part.
>
> IMHO, this mechanism (as you proposed by implementing a plain fuse
> filesystem on the host) seems a little bit sophisticated so far.

Agree. Let's start with the simplest variant, which is the server
selectively enabling dax.

Thanks,
Miklos