On Fri, Apr 29, 2022 at 8:36 PM Gao Xiang <hsiangkao@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Xin,
>
> On Fri, Apr 29, 2022 at 07:38:48AM +0800, Xin Yin wrote:
> > Hi Jeffle & Xiang,
> >
> > I have tested your "fscache,erofs: fscache-based on-demand read semantics"
> > v9 patch set (https://www.spinics.net/lists/linux-fsdevel/msg216178.html).
> > For now, it works fine with the nydus image-service. After the image data
> > is fully loaded to local storage, it shows a great IO performance gain
> > compared with nydus v5, which is based on FUSE.
>
> Yeah, thanks for your interest and efforts. Actually I'm pretty sure you
> could observe CPU, bandwidth and latency improvements in densely deployed
> scenarios, since our goal is to provide native performance when the data
> is ready, as well as image on-demand read and flexible cache data
> management for end users.
>
> > For 4K random reads, fscache-based erofs gets the same performance as
> > the original local filesystem. But I still saw a performance drop in the
> > 4K sequential read case, and I found the root cause: in
> > erofs_fscache_readahead() we use synchronous IO, which may stall the
> > readahead pipelining.
>
> Yeah, that is a known TODO. In principle, when such part of the data is
> locally available, it will have similar performance (bandwidth, latency,
> CPU loading) to a loop device. But we don't implement asynchronous I/O
> for now, since we need to make the functionality work first, so thanks
> for your patch addressing this.
>
> > I have tried changing to asynchronous IO during the erofs fscache
> > readahead procedure, as netfs does, and then I saw a great performance
> > gain.
> >
> > Here are my test steps and results:
> > - generate a nydus v6 format image, which stores a large file for the IO test.
> > - launch the nydus image-service, and make the image data fully loaded to local storage (ext4).
> > - run fio with the command below:
> > fio -ioengine=psync -bs=4k -size=5G -direct=0 -thread -rw=read -filename=./test_image -name="test" -numjobs=1 -iodepth=16 -runtime=60
>
> Yeah, although I can see what you mean (to test buffered I/O), the
> arguments are still somewhat messy (maybe because we don't support
> fscache-based direct I/O for now; that is another TODO, but with
> low priority).
>
> > v9 patches: 202654 KB/s
> > v9 patches + async readahead patch: 407213 KB/s
> > ext4: 439912 KB/s
>
> May I ask whether such an ext4 image is through a loop device? If not,
> that is reasonable. Anyway, it's not a big problem for now; we could
> optimize it later since it should be exactly the same eventually.
>

This ext4 test is not through a loop device; it is just the same test
file stored on native ext4.

Actually, after further tests, I can see that fscache-based erofs with
the async readahead patch almost achieves native performance in
sequential buffered read cases.

Thanks,
Xin Yin

> And I will drop a message to Jeffle for further review, since we're
> close to another 5-day national holiday.
>
> Thanks again!
> Gao Xiang
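
P.S. In case it helps further review, the direction of my change is
roughly the sketch below. This is only an illustration of the idea, not
the actual patch: struct erofs_fscache_req and all of the helper names
here are made up. The point is that instead of blocking on a completion
after each fscache_read() call, we pass a termination callback and
unlock the folios when the cache I/O actually completes, so readahead
submission can keep pipelining, similar to what netfs does.

#include <linux/fscache.h>
#include <linux/pagemap.h>
#include <linux/refcount.h>
#include <linux/slab.h>
#include <linux/xarray.h>

/* Hypothetical per-readahead request; not the actual patch's struct. */
struct erofs_fscache_req {
	struct address_space *mapping;	/* folios covered by this request */
	loff_t start;			/* file offset of the first byte */
	size_t len;			/* total bytes expected */
	refcount_t ref;			/* one ref per in-flight sub-read */
	int error;
};

/* Unlock (and mark uptodate on success) every folio under the request. */
static void erofs_fscache_req_complete(struct erofs_fscache_req *req)
{
	struct folio *folio;
	pgoff_t start_page = req->start >> PAGE_SHIFT;
	pgoff_t last_page = (req->start + req->len - 1) >> PAGE_SHIFT;
	XA_STATE(xas, &req->mapping->i_pages, start_page);

	rcu_read_lock();
	xas_for_each(&xas, folio, last_page) {
		if (xas_retry(&xas, folio))
			continue;
		if (!req->error)
			folio_mark_uptodate(folio);
		folio_unlock(folio);
	}
	rcu_read_unlock();
}

/* Termination callback, invoked when one cache sub-read finishes. */
static void erofs_fscache_req_end(void *priv, ssize_t transferred_or_error,
				  bool was_async)
{
	struct erofs_fscache_req *req = priv;

	if (IS_ERR_VALUE(transferred_or_error))
		req->error = transferred_or_error;

	/* Only the last completing sub-read unlocks the folios. */
	if (refcount_dec_and_test(&req->ref)) {
		erofs_fscache_req_complete(req);
		kfree(req);
	}
}

/* Submit one sub-read and return immediately instead of waiting. */
static int erofs_fscache_read_async(struct netfs_cache_resources *cres,
				    struct iov_iter *iter, loff_t pstart,
				    struct erofs_fscache_req *req)
{
	refcount_inc(&req->ref);
	return fscache_read(cres, pstart, iter, NETFS_READ_HOLE_FAIL,
			    erofs_fscache_req_end, req);
}

erofs_fscache_readahead() would then allocate the request with
refcount_set(&req->ref, 1), lock and attach the folios, submit all
sub-reads via erofs_fscache_read_async(), and finally drop its own
initial reference, so the folios are unlocked as soon as the last
in-flight sub-read completes, rather than waiting on each read one at
a time.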