On Mon, Dec 16, 2019 at 10:11:12AM -0800, Eric Biggers wrote:
> From: Eric Biggers <ebiggers@xxxxxxxxxx>
>
> When fs-verity verifies data pages, currently it reads each Merkle tree
> page synchronously using read_mapping_page().
>
> Therefore, when the Merkle tree pages aren't already cached, fs-verity
> causes an extra 4 KiB I/O request for every 512 KiB of data (assuming
> that the Merkle tree uses SHA-256 and 4 KiB blocks).  This results in
> more I/O requests and performance loss than is strictly necessary.
>
> Therefore, implement readahead of the Merkle tree pages.
>
> For simplicity, we take advantage of the fact that the kernel already
> does readahead of the file's *data*, just like it does for any other
> file.  Due to this, we don't really need a separate readahead state
> (struct file_ra_state) just for the Merkle tree, but rather we just need
> to piggy-back on the existing data readahead requests.
>
> We also only really need to bother with the first level of the Merkle
> tree, since the usual fan-out factor is 128, so normally over 99% of
> Merkle tree I/O requests are for the first level.
>
> Therefore, make fsverity_verify_bio() enable readahead of the first
> Merkle tree level, for up to 1/4 the number of pages in the bio, when it
> sees that the REQ_RAHEAD flag is set on the bio.  The readahead size is
> then passed down to ->read_merkle_tree_page() for the filesystem to
> (optionally) implement if it sees that the requested page is uncached.
>
> While we're at it, also make build_merkle_tree_level() set the Merkle
> tree readahead size, since it's easy to do there.
>
> However, for now don't set the readahead size in fsverity_verify_page(),
> since currently it's only used to verify holes on ext4 and f2fs, and it
> would need parameters added to know how much to read ahead.
>
> This patch significantly improves fs-verity sequential read performance.
> Some quick benchmarks with 'cat'-ing a 250MB file after dropping caches:
>
> On ARM64 phone (using sha256-ce):
>     Before: 217 MB/s
>     After: 263 MB/s
>     (compare to sha256sum of non-verity file: 357 MB/s)
>
> In an x86_64 VM (using sha256-avx2):
>     Before: 173 MB/s
>     After: 215 MB/s
>     (compare to sha256sum of non-verity file: 223 MB/s)
>
> Signed-off-by: Eric Biggers <ebiggers@xxxxxxxxxx>
> ---
>  fs/ext4/verity.c             | 49 ++++++++++++++++++++++++++++++++++--
>  fs/f2fs/data.c               |  6 ++---
>  fs/f2fs/f2fs.h               |  3 +++
>  fs/f2fs/verity.c             | 49 ++++++++++++++++++++++++++++++++++--
>  fs/verity/enable.c           |  8 +++++-
>  fs/verity/fsverity_private.h |  1 +
>  fs/verity/open.c             |  1 +
>  fs/verity/verify.c           | 34 ++++++++++++++++++++-----
>  include/linux/fsverity.h     |  7 +++++-
>  9 files changed, 143 insertions(+), 15 deletions(-)

Ted and Jaegeuk, have you had a chance to review this patch?  I could use
your Acked-bys on it, since it touches fs/ext4/ and fs/f2fs/.

> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index a034cd0ce0217..8a6b3266bd794 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1881,9 +1881,9 @@ static int f2fs_read_single_page(struct inode *inode, struct page *page,
>  	 * use ->readpage() or do the necessary surgery to decouple ->readpages()
>  	 * from read-ahead.
>  	 */
> -static int f2fs_mpage_readpages(struct address_space *mapping,
> -			struct list_head *pages, struct page *page,
> -			unsigned nr_pages, bool is_readahead)
> +int f2fs_mpage_readpages(struct address_space *mapping,
> +			struct list_head *pages, struct page *page,
> +			unsigned int nr_pages, bool is_readahead)
>  {

FYI, I'm aware that the f2fs compression patch (which is queued in f2fs/dev)
also makes f2fs_mpage_readpages() non-static, but uses slightly different
formatting.  If/when I apply this patch I'll adjust it to match f2fs/dev so
that there's no merge conflict.

- Eric
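P.S. For anyone skimming the thread without the full patch in front of them,
the filesystem side of the new hook ends up with roughly the shape sketched
below.  This is a from-memory sketch, not an excerpt from the patch:
merkle_tree_pos() and merkle_tree_readahead() are illustrative stand-ins for
the per-filesystem helpers (the readahead helper reusing the ->readpages()
machinery is presumably why f2fs_mpage_readpages() goes non-static above).

#include <linux/fs.h>
#include <linux/pagemap.h>

/*
 * Illustrative stand-ins, NOT real kernel API: merkle_tree_pos() would
 * return the byte offset at which the filesystem stores the Merkle tree
 * (e.g. past EOF), and merkle_tree_readahead() would allocate the uncached
 * pages and submit them through the filesystem's ->readpages() path.
 */
static loff_t merkle_tree_pos(struct inode *inode);
static void merkle_tree_readahead(struct address_space *mapping,
				  pgoff_t start_index, unsigned long count);

static struct page *sketch_read_merkle_tree_page(struct inode *inode,
						 pgoff_t index,
						 unsigned long num_ra_pages)
{
	struct page *page;

	/* Translate the tree-relative page index to an index in the file. */
	index += merkle_tree_pos(inode) >> PAGE_SHIFT;

	page = find_get_page_flags(inode->i_mapping, index, FGP_ACCESSED);
	if (!page || !PageUptodate(page)) {
		if (page)
			put_page(page);
		else if (num_ra_pages > 1)
			/*
			 * Cache miss: kick off readahead of the next
			 * num_ra_pages tree pages, so that upcoming
			 * verifications find them already in (or on their
			 * way into) the page cache.
			 */
			merkle_tree_readahead(inode->i_mapping, index,
					      num_ra_pages);
		/* Fall back to a synchronous read of the one page needed. */
		page = read_mapping_page(inode->i_mapping, index, NULL);
	}
	return page;
}

(And on the verify side, "up to 1/4 the number of pages in the bio"
presumably reduces to a shift along the lines of
bio->bi_iter.bi_size >> (PAGE_SHIFT + 2), since bi_size is in bytes.)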